AlenkaF commented on PR #40661:
URL: https://github.com/apache/arrow/pull/40661#issuecomment-2006507006

   Opened a draft PR in case anyone has ideas about what is missing in the C++ 
code. The conversion currently produces a NumPy array or pandas Series with the 
correct length, but every element is the same wrong value:
   
   ```python
   In [1]: import pyarrow as pa
      ...: import pyarrow.compute as pc
      ...: 
      ...: arr = pc.run_end_encode([1, 1, 2, 3, 3, 3, 6])
      ...: arr.to_numpy()
   Out[1]: 
   array([6510615555426900570, 6510615555426900570, 6510615555426900570,
          6510615555426900570, 6510615555426900570, 6510615555426900570,
          6510615555426900570])
   
   In [2]: import pyarrow as pa
      ...: import pyarrow.compute as pc
      ...: 
      ...: arr = pc.run_end_encode([1, 1, 2, 3, 3, 3, 6])
      ...: arr.to_pandas()
   Out[2]: 
   0    6510615555426900570
   1    6510615555426900570
   2    6510615555426900570
   3    6510615555426900570
   4    6510615555426900570
   5    6510615555426900570
   6    6510615555426900570
   dtype: int64
   ```
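
For comparison, here is a minimal pure-Python sketch of what the conversion would be expected to return. It assumes the input above encodes to run ends `[2, 3, 6, 7]` and values `[1, 2, 3, 6]`. The helper name `decode_run_end_encoded` is hypothetical and not part of pyarrow. It only illustrates run-end decoding and is not the C++ implementation in this PR.

```python
def decode_run_end_encoded(run_ends, values):
    """Expand run-end encoded data into a flat list (hypothetical helper)."""
    if len(run_ends) != len(values):
        raise ValueError("run_ends and values must have the same length")
    decoded = []
    start = 0
    for end, value in zip(run_ends, values):
        # Each run covers the logical positions [start, end).
        decoded.extend([value] * (end - start))
        start = end
    return decoded


# Assumed encoding of [1, 1, 2, 3, 3, 3, 6]
expected = decode_run_end_encoded([2, 3, 6, 7], [1, 2, 3, 6])
print(expected)  # [1, 1, 2, 3, 3, 3, 6]
```

Until the conversion handles this type, decoding first with `pc.run_end_decode(arr)` and then calling `to_numpy()` should give these values, assuming a pyarrow version that provides that function.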


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.