anjakefala commented on PR #40565:
URL: https://github.com/apache/arrow/pull/40565#issuecomment-2008319524

   @pitrou Yes, indeed it does!
   
   I looked at my original code, which was freezing at the `RETURN_NOT_OK(memo_table.GetOrInsert(value, &memo_index));` line.
   
   ```
   inline Status ConvertAsPyObjects(const PandasOptions& options, const ChunkedArray& data,
                                    WrapFunction&& wrap_func, PyObject** out_values) {
     using ArrayType = typename TypeTraits<Type>::ArrayType;
     using Scalar = typename MemoizationTraits<Type>::Scalar;
     std::function<Status(const typename MemoizationTraits<Type>::Scalar&, PyObject**)> WrapFunc;

     if (options.deduplicate_objects) {
       ::arrow::internal::ScalarMemoTable<Scalar> memo_table(options.pool);
       std::vector<PyObject*> unique_values;
       int32_t memo_size = 0;

       WrapFunc = [&](const Scalar& value, PyObject** out_values) {
         int32_t memo_index;
         RETURN_NOT_OK(memo_table.GetOrInsert(value, &memo_index));
         if (memo_index == memo_size) {
           // New entry
           RETURN_NOT_OK(wrap_func(value, out_values));
           unique_values.push_back(*out_values);
           ++memo_size;
         } else {
           // Duplicate entry
           Py_INCREF(unique_values[memo_index]);
           *out_values = unique_values[memo_index];
         }
         return Status::OK();
       };
     } else {
       WrapFunc = [&](const Scalar& value, PyObject** out_values) {
         return wrap_func(value, out_values);
       };
     }

     for (int c = 0; c < data.num_chunks(); c++) {
       const auto& arr = arrow::internal::checked_cast<const ArrayType&>(*data.chunk(c));
       RETURN_NOT_OK(internal::WriteArrayObjects(arr, WrapFunc, out_values));
       out_values += arr.length();
     }
     return Status::OK();
   }
   ```
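
   For readers without the Arrow sources at hand, the deduplication branch above can be sketched with standard-library types only. This is a minimal illustration, not the Arrow implementation: `DedupWrapper`, `ConvertChunks` and `Wrapped` are hypothetical names, `std::unordered_map` stands in for `ScalarMemoTable`, a `std::shared_ptr<std::string>` stands in for a `PyObject*`, and C++17 is assumed.

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <memory>
   #include <string>
   #include <unordered_map>
   #include <vector>

   // Hypothetical stand-in for a wrapped Python object (not a CPython type).
   using Wrapped = std::shared_ptr<std::string>;

   // Hypothetical helper: wraps each distinct value once and reuses the
   // wrapper for every repeat of that value.
   class DedupWrapper {
    public:
     Wrapped Wrap(std::int64_t value) {
       // try_emplace inserts only when the key is absent, so a new key gets
       // the next free index (the `memo_index == memo_size` case above).
       auto [it, inserted] = memo_.try_emplace(value, unique_values_.size());
       if (inserted) {
         // New entry: build the wrapper and remember it.
         unique_values_.push_back(
             std::make_shared<std::string>(std::to_string(value)));
       }
       // Invariant: every memo index refers to an already stored wrapper.
       assert(it->second < unique_values_.size());
       // Copying the shared_ptr plays the role of Py_INCREF in this sketch.
       return unique_values_[it->second];
     }

     std::size_t num_unique() const { return unique_values_.size(); }

    private:
     std::unordered_map<std::int64_t, std::size_t> memo_;
     std::vector<Wrapped> unique_values_;
   };

   // Hypothetical driver: converts every chunk with one shared DedupWrapper,
   // so duplicates are detected across chunk boundaries as well.
   std::vector<Wrapped> ConvertChunks(
       const std::vector<std::vector<std::int64_t>>& chunks,
       DedupWrapper& wrapper) {
     std::vector<Wrapped> out;
     for (const auto& chunk : chunks) {
       for (std::int64_t value : chunk) {
         out.push_back(wrapper.Wrap(value));
       }
     }
     return out;
   }
   ```

   In this sketch the memo table and the vector of unique wrappers are members of `DedupWrapper`, so they live as long as the object that the chunk loop calls into.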
   
   It seems that the main difference is that you created `convert_chunks` and moved the for loop into it. Do you know why that fixed the freezing?
   
   Anyway, this works great for my needs! Thank you!
   
   

