[ https://issues.apache.org/jira/browse/ARROW-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated ARROW-17524:
-----------------------------------
Labels: pull-request-available (was: )
> The ORC reader method ReadStripe does not work when we specify the fields to
> select as a list of integers
> -------------------------------------------------------------------------------------------------------
>
> Key: ARROW-17524
> URL: https://issues.apache.org/jira/browse/ARROW-17524
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Affects Versions: 8.0.1
> Reporter: Louis Calot
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I think there is a bug in the ORC reader: when we specify the indexes of the
> fields that we want to keep, the wrong columns are selected. Looking at the
> code, this seems to be because we call "includeTypes" instead of "include"
> when setting the ORC options. "include" takes top-level field indexes,
> whereas "includeTypes" takes type ids, so the two only agree by accident.
> This is a problem when we want to import an ORC table containing Union
> types, because the import fails with an error even if we try not to import
> those specific fields.
> The definitions of the corresponding ORC methods are here:
> [https://github.com/apache/orc/blob/72220851cbde164a22706f8d47741fd1ad3db190/c%2B%2B/src/Options.hh#L185-L191]
> and
> [https://github.com/apache/orc/blob/72220851cbde164a22706f8d47741fd1ad3db190/c%2B%2B/src/Options.hh#L201-L207]
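To illustrate why the two options are not interchangeable, here is a minimal sketch that models the two selection rules on a toy schema. It does not use the real ORC or Arrow classes: `TypeNode`, `AssignTypeIds`, `SelectByFieldIndex`, `SelectByTypeId` and the example schema are all hypothetical names made up for this illustration. It assumes the semantics described in the ORC header comments linked above: "include" takes 0-based top-level field positions, while "includeTypes" takes type ids numbered in a pre-order traversal with the root as 0.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <string>
#include <vector>

// Toy stand-in for an ORC type tree (hypothetical, NOT the real orc::Type).
struct TypeNode {
  std::string name;
  std::vector<TypeNode> children;
  uint64_t type_id = 0;  // filled in by AssignTypeIds
};

// Label every node in pre-order, starting with the root at 0.
// Returns the next unused id.
uint64_t AssignTypeIds(TypeNode& node, uint64_t next_id = 0) {
  node.type_id = next_id++;
  for (TypeNode& child : node.children) {
    next_id = AssignTypeIds(child, next_id);
  }
  return next_id;
}

// Depth-first search for the node that carries the given type id.
const TypeNode* FindByTypeId(const TypeNode& node, uint64_t id) {
  if (node.type_id == id) return &node;
  for (const TypeNode& child : node.children) {
    if (const TypeNode* hit = FindByTypeId(child, id)) return hit;
  }
  return nullptr;
}

// Models "include": each value is a 0-based top-level field position.
std::vector<std::string> SelectByFieldIndex(const TypeNode& root,
                                            const std::list<uint64_t>& indices) {
  std::vector<std::string> names;
  for (uint64_t i : indices) {
    assert(i < root.children.size());
    names.push_back(root.children[i].name);
  }
  return names;
}

// Models "includeTypes": each value is a pre-order type id.
std::vector<std::string> SelectByTypeId(const TypeNode& root,
                                        const std::list<uint64_t>& ids) {
  std::vector<std::string> names;
  for (uint64_t id : ids) {
    const TypeNode* node = FindByTypeId(root, id);
    assert(node != nullptr);
    names.push_back(node->name);
  }
  return names;
}

TypeNode Leaf(const std::string& name) {
  TypeNode node;
  node.name = name;
  return node;
}

// Hypothetical schema: struct<a:int, b:struct<x:int, y:string>, c:double>
// Pre-order type ids: root=0, a=1, b=2, x=3, y=4, c=5.
TypeNode MakeExampleSchema() {
  TypeNode b = Leaf("b");
  b.children = {Leaf("x"), Leaf("y")};
  TypeNode root = Leaf("root");
  root.children = {Leaf("a"), b, Leaf("c")};
  AssignTypeIds(root);
  return root;
}
```

With this schema, asking for field index 2 should pick "c", but treating the same number as a type id picks the nested struct "b". If "b" were a Union column, that would be consistent with the import failing even though the caller tried to exclude it.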