adrien-grl opened a new issue, #49305:
URL: https://github.com/apache/arrow/issues/49305

   ### Describe the enhancement requested
   
   # Problem / Motivation
   
   In PyArrow, the only way I found to get the total number of rows in a 
Feather file with `pyarrow.ipc.RecordBatchFileReader` is to do 
something along the lines of:
   ```python
   num_rows = sum(reader.get_batch(i).num_rows for i in range(reader.num_record_batches))
   ```
   This is inefficient because every record batch has to be read, whereas 
it seems the rows can be counted directly from the metadata (as in 
`RecordBatchFileReader::CountRows`).
   
   The current approach is impractical when reading from remote file 
systems.
   
   # Proposed solution
   Expose `RecordBatchFileReader::CountRows` in Python.
   
   # References
   
https://github.com/apache/arrow/blob/76f781512330d99a2e308c16f5fba7ededc3e292/cpp/src/arrow/ipc/reader.h#L204
   
   Thank you for reading my suggestion and all the amazing work!!
   
   
   ### Component(s)
   
   Python


