Louis Calot created ARROW-17524:
-----------------------------------

             Summary: The ORC reader method ReadStripe does not work when we 
specify the fields to select as a list of integers
                 Key: ARROW-17524
                 URL: https://issues.apache.org/jira/browse/ARROW-17524
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
    Affects Versions: 8.0.1
            Reporter: Louis Calot


I think there is a bug in the ORC reader: when we specify the indices of the 
fields that we want to keep, it does not work correctly. Looking at the code, it 
seems to be because we call "includeTypes" in lieu of "include" when setting the 
ORC options.
This is problematic when we want to import an ORC table containing Union 
types, as the import raises an error even if we try not to import these 
specific fields.

The definitions of the corresponding ORC methods are here:
[https://github.com/apache/orc/blob/72220851cbde164a22706f8d47741fd1ad3db190/c%2B%2B/src/Options.hh#L185-L191]

and
[https://github.com/apache/orc/blob/72220851cbde164a22706f8d47741fd1ad3db190/c%2B%2B/src/Options.hh#L201-L207]
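To illustrate why the two calls can disagree, here is a minimal, self-contained sketch. It does not use the ORC or Arrow libraries. It assumes, as I read the linked header comments, that "include" takes top-level field indices (first field is 0) while "includeTypes" takes type ids numbered in a pre-order traversal with the root as 0. The Node struct, the helper functions and the example schema are all hypothetical and exist only for this illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy schema node used only for this sketch; this is NOT orc::Type.
struct Node {
  std::string name;
  std::vector<Node> children;
};

// Walk the tree in pre-order and record each node's name, so that the
// position in the output vector is the node's type id (root = 0).
void AssignTypeIds(const Node& node, std::vector<std::string>* names_by_type_id) {
  names_by_type_id->push_back(node.name);
  for (const Node& child : node.children) {
    AssignTypeIds(child, names_by_type_id);
  }
}

// Interpret the integer as a top-level field index (0 = first field of root).
std::string SelectByFieldIndex(const Node& root, uint64_t index) {
  return root.children.at(index).name;
}

// Interpret the same integer as a pre-order type id instead.
std::string SelectByTypeId(const Node& root, uint64_t type_id) {
  std::vector<std::string> names;
  AssignTypeIds(root, &names);
  return names.at(type_id);
}

// Hypothetical schema:
//   struct<a:int, b:struct<c:int,d:string>, u:uniontype<int,string>, f:double>
// Pre-order type ids: root=0, a=1, b=2, c=3, d=4, u=5, u.0=6, u.1=7, f=8.
Node MakeExampleSchema() {
  return Node{"root",
              {Node{"a", {}},
               Node{"b", {Node{"c", {}}, Node{"d", {}}}},
               Node{"u", {Node{"u.0", {}}, Node{"u.1", {}}}},
               Node{"f", {}}}};
}
```

With this schema, asking for field index 3 is meant to select "f", but treating 3 as a type id selects "c", which is nested inside "b". As soon as the schema has nested types, the two numberings diverge, so a list of field indices passed as type ids can select columns the caller did not ask for.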



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
