[ 
https://issues.apache.org/jira/browse/ARROW-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317406#comment-17317406
 ] 

David Li commented on ARROW-9697:
---------------------------------

I'm taking a swing at this; a patch will be up once ARROW-11797 lands. Note 
that Joris guessed correctly: the Parquet reader already implements this 
optimization internally. There's no need for a special method, as the Parquet 
reader simply fabricates a batch when it notices you aren't reading any columns.

> [C++][Dataset] num_rows method for Dataset/Scanner
> --------------------------------------------------
>
>                 Key: ARROW-9697
>                 URL: https://issues.apache.org/jira/browse/ARROW-9697
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Neal Richardson
>            Assignee: David Li
>            Priority: Major
>              Labels: dataset
>             Fix For: 4.0.0
>
>
> Something like Scanner::ToTable, except first Project to keep 0 columns, and 
> for each record batch, grab the num_rows. Then sum the resulting vector.
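
To make the idea in the description concrete, here is a minimal sketch of the
"project to 0 columns, then sum num_rows" approach. It deliberately does not use
the Arrow C++ API: FakeBatch, ProjectToZeroColumns, and CountRows are
hypothetical stand-ins invented for illustration, and the real Dataset/Scanner
types and signatures may look quite different.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a record batch; NOT the Arrow C++ API.
struct FakeBatch {
  std::vector<std::string> column_names;  // empty after projecting to 0 columns
  int64_t num_rows;                       // row count survives the projection
};

// Project a batch down to zero columns, keeping only its row count.
// (Assumption: a zero-column batch still knows how many rows it has.)
FakeBatch ProjectToZeroColumns(const FakeBatch& batch) {
  return FakeBatch{{}, batch.num_rows};
}

// Sum num_rows over every batch after the zero-column projection.
int64_t CountRows(const std::vector<FakeBatch>& batches) {
  int64_t total = 0;
  for (const FakeBatch& batch : batches) {
    const FakeBatch projected = ProjectToZeroColumns(batch);
    assert(projected.column_names.empty());  // no column data is carried along
    total += projected.num_rows;
  }
  return total;
}
```

The point of the sketch is only that no column data needs to be materialized
to get the count; each batch contributes just its num_rows to the total.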



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
