Louis Calot created ARROW-17524:
-----------------------------------
Summary: The ORC reader method ReadStripe does not work when we
specify the fields to select as a list of integers
Key: ARROW-17524
URL: https://issues.apache.org/jira/browse/ARROW-17524
Project: Apache Arrow
Issue Type: Bug
Components: C++
Affects Versions: 8.0.1
Reporter: Louis Calot
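I think there is a bug in the ORC reader: when we specify the indexes of the
fields that we want to keep, it does not work correctly. Looking at the code, it
seems to be because we call "includeTypes" in lieu of "include" when setting the
ORC options. "include" expects top-level field indexes, whereas "includeTypes"
expects type ids numbered by a pre-order traversal of the schema.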
This can be problematic when we want to import an ORC table containing Union
types, as the import fails with an error even if we try not to import these
specific fields.
The definitions of the corresponding ORC methods are here:
[https://github.com/apache/orc/blob/72220851cbde164a22706f8d47741fd1ad3db190/c%2B%2B/src/Options.hh#L185-L191]
and
[https://github.com/apache/orc/blob/72220851cbde164a22706f8d47741fd1ad3db190/c%2B%2B/src/Options.hh#L201-L207]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)