[ https://issues.apache.org/jira/browse/ARROW-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16394287#comment-16394287 ]
Wes McKinney commented on ARROW-2296:
-------------------------------------

This is already contained in the RecordBatch metadata, and does not require reading the whole file:

https://github.com/apache/arrow/blob/master/format/Message.fbs#L50

Does this not satisfy the use case?

> [C++] Add num_rows to file footer
> ---------------------------------
>
>                 Key: ARROW-2296
>                 URL: https://issues.apache.org/jira/browse/ARROW-2296
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Format
>            Reporter: Lawrence Chan
>            Priority: Minor
>
> Maybe I'm overlooking something, but I don't see something on the API surface
> to get the number of rows in an Arrow file without reading all the record
> batches. This is useful when we want to read into contiguous buffers, because
> it allows us to allocate the right sizes up front.
> I'd like to propose that we add `num_rows` as a field in the file footer so
> it's easy to query without reading the whole file.
> Meanwhile, before we get that added to the official format fbs, it would be
> nice to have a method that iterates over the record batch headers and sums up
> the lengths without reading the actual record batch body.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
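
For illustration, the interim approach described in the issue (iterate over the record batch headers, sum their lengths, then allocate once) might look roughly like the sketch below. It is not Arrow API code: `BatchHeader`, `TotalRows`, and `AllocateColumn` are hypothetical names, and the headers are assumed to have been decoded from the file's record batch metadata already. The only detail taken from the format is that each RecordBatch message carries a `length` field holding its row count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the decoded metadata of one record batch.
// `length` mirrors the RecordBatch `length` field in Message.fbs;
// `body_size` is illustrative only and is never read below.
struct BatchHeader {
  int64_t length;     // number of rows in this record batch
  int64_t body_size;  // size of the batch body in bytes (unused here)
};

// Sum the row counts using the headers alone; no batch body is touched.
int64_t TotalRows(const std::vector<BatchHeader>& headers) {
  int64_t total = 0;
  for (const BatchHeader& h : headers) {
    assert(h.length >= 0);  // a negative row count would indicate bad metadata
    total += h.length;
  }
  return total;
}

// Allocate one contiguous buffer sized for every row up front, so that
// batches can later be copied in without any reallocation.
std::vector<double> AllocateColumn(const std::vector<BatchHeader>& headers) {
  return std::vector<double>(static_cast<std::size_t>(TotalRows(headers)));
}
```

With headers reporting 1000, 250, and 0 rows, `TotalRows` yields 1250 and `AllocateColumn` returns a buffer of 1250 elements. How the headers are obtained from a real file is deliberately left out, since that depends on the reader API in use.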