teneon opened a new issue #11559:
URL: https://github.com/apache/arrow/issues/11559


   Hi there,
   
   thank you for your amazing library! 
   
   We are passing an _Arrow table_ from Python to Cython so that we can quickly 
iterate over millions of table rows inside Cython. The rows have to be processed 
in sequence, so iterating is the only option. In C++ there is an 
[example](https://arrow.apache.org/docs/cpp/examples/row_columnar_conversion.html)
 which I tested and it works well, but I couldn't get the same methods to work 
in Cython (given my limited knowledge of the matter).
   
   We plan to iterate over the rows in a for loop, but we can't figure out how to 
"access / check / print" the values for each row.
   
   How could we, for example, print a single row from the passed arrow table 
(or multiple rows if you so prefer) of "date, name, age, weight" inside Cython? 
Could you give us an example please?
   
   **python code:**
   ```py
   import pandas as pd
   import pyarrow as pa
   import prophet.cython.arrow.myarrow as myarrow
   
   df = pd.DataFrame({
       'date': pd.date_range(start='2020-01-01 00:00:00', periods=3, freq='1min'),
       'name': ['jack', 'tim', 'frank'],
       'age': [32, 25, 65],
       'weight': [66.46, 84.11, 71.52]
   })
   table = pa.Table.from_pandas(df)
   # This is where the Arrow table is passed from Python to Cython
   myarrow.iterate_table(table)
   ```
   
   **cython code:**
   ```py
   from __future__ import print_function
   cimport pyarrow
   from pyarrow.lib cimport *
   
   def iterate_table(obj):
       cdef int num_columns = 0
       cdef int num_rows = 0
       cdef:
           shared_ptr[CTable] table = pyarrow_unwrap_table(obj)
           shared_ptr[CChunkedArray] array
           shared_ptr[CArray] chunk
           shared_ptr[CArrayData] data
           
       if table.get() == NULL:
           raise TypeError("not a table...")
   
       num_columns = table.get().num_columns()
       num_rows = table.get().num_rows()
       print("num_columns: ",num_columns) # prints 4 as expected
       print("num_rows: ",num_rows) # prints 3 as expected
   
       array = table.get().column(2)
       chunk = array.get().chunk(0)
       data = chunk.get().data()
       print("chunk length: ", chunk.get().length()) # prints 3 as expected
       print("data length: ", data.get().length) # prints 3 as expected
   ```
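
To make the goal concrete, here is a minimal pure-Python sketch of the row-wise access described above. It is only a model of the table → column → chunk layout that the Cython code walks: plain lists stand in for Arrow chunks, `iter_rows` and `ChunkedColumn` are hypothetical names, and nothing here is Arrow API. It assumes all columns have the same total length, while allowing each column to be split into chunks differently.

```python
from typing import Iterator, Sequence, Tuple

# Hypothetical stand-in for a chunked column: a list of chunks,
# where each chunk is a plain list of values (not an Arrow array).
ChunkedColumn = Sequence[Sequence[object]]


def iter_rows(columns: Sequence[ChunkedColumn]) -> Iterator[Tuple[object, ...]]:
    """Yield one tuple per row from columns that are each split into chunks.

    Assumes every column holds the same total number of values.
    """
    # One [chunk index, offset inside that chunk] cursor per column.
    cursors = [[0, 0] for _ in columns]
    num_rows = sum(len(chunk) for chunk in columns[0]) if columns else 0

    for _ in range(num_rows):
        row = []
        for col, cur in zip(columns, cursors):
            # Move to the next chunk once the current one is exhausted
            # (this also skips empty chunks).
            while cur[1] >= len(col[cur[0]]):
                cur[0] += 1
                cur[1] = 0
            row.append(col[cur[0]][cur[1]])
            cur[1] += 1
        yield tuple(row)


# Three columns from the example table, deliberately chunked differently.
names = [["jack"], ["tim", "frank"]]
ages = [[32, 25], [65]]
weights = [[66.46, 84.11, 71.52]]

rows = list(iter_rows([names, ages, weights]))
for row in rows:
    print(row)
```

The per-column cursor is the part that matters: a loop that only ever reads `chunk(0)` works for a freshly converted table, but would miss rows as soon as a column arrives in more than one chunk.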
   best regards,
   Neon

