[ https://issues.apache.org/jira/browse/ARROW-14805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weston Pace updated ARROW-14805:
--------------------------------
Labels: ORC c++ dataset good-second-issue (was: ORC c++ dataset)
> [C++][Dataset] Support Count function without projections in ORC to avoid
> loading all columns
> ---------------------------------------------------------------------------------------------
>
> Key: ARROW-14805
> URL: https://issues.apache.org/jira/browse/ARROW-14805
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: C++
> Reporter: xiangxiang Shen
> Priority: Major
> Labels: ORC, c++, dataset, good-second-issue
>
> With ORC support in datasets, a count query without projections, such as
> "select count(*) from table", loads all columns. This comes from the
> following ORC library code:
> [https://github.com/apache/orc/blob/22828f79a526069d9629719c9476b7addad91ae6/c%2B%2B/src/Reader.cc#L120-L144]
>
> The Arrow side can improve this, as is already done for Parquet in dataset.
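
To illustrate the idea behind the request, here is a minimal, self-contained sketch. It does not use the Arrow or ORC APIs: OrcFileModel, StripeModel, and every function below are hypothetical names invented for this example. The only assumption carried over from the real format is that ORC file metadata records row counts, so a count can be answered without decoding any column.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of an ORC file. This is NOT the Arrow or
// ORC API; it only mimics the split between footer metadata and column data.
struct StripeModel {
  std::uint64_t num_rows;                          // row count kept in metadata
  std::vector<std::vector<std::int64_t>> columns;  // column data, costly to decode
};

struct OrcFileModel {
  std::vector<StripeModel> stripes;
};

// Slow path: walk every column of every stripe before counting, which is
// what the issue describes for a count without projections.
// `values_touched` accumulates how many column values had to be visited.
std::uint64_t CountRowsByDecoding(const OrcFileModel& file,
                                  std::uint64_t* values_touched) {
  std::uint64_t rows = 0;
  for (const auto& stripe : file.stripes) {
    for (const auto& column : stripe.columns) {
      // Every column in a stripe must hold exactly num_rows values.
      assert(column.size() == stripe.num_rows);
      *values_touched += column.size();
    }
    rows += stripe.num_rows;
  }
  return rows;
}

// Fast path: sum the row counts stored in metadata and never touch a column.
std::uint64_t CountRowsFromMetadata(const OrcFileModel& file) {
  std::uint64_t rows = 0;
  for (const auto& stripe : file.stripes) {
    rows += stripe.num_rows;
  }
  return rows;
}

// Builds a small in-memory example: two stripes, two columns each.
OrcFileModel MakeExampleFile() {
  OrcFileModel file;
  file.stripes.push_back(StripeModel{3, {{1, 2, 3}, {10, 20, 30}}});
  file.stripes.push_back(StripeModel{2, {{4, 5}, {40, 50}}});
  return file;
}
```

Both functions return the same count for MakeExampleFile(), but only the decoding path visits column values. An actual fix in Arrow would presumably take the metadata route inside the ORC dataset format, and its shape may differ from this sketch.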
--
This message was sent by Atlassian Jira
(v8.20.1#820001)