[ 
https://issues.apache.org/jira/browse/ARROW-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17524:
-----------------------------------
    Labels: pull-request-available  (was: )

> The ORC reader method ReadStripe does not work when we specify fields to 
> select as a list of integers
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17524
>                 URL: https://issues.apache.org/jira/browse/ARROW-17524
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 8.0.1
>            Reporter: Louis Calot
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I think there is a bug in the ORC reader: when we specify the indexes of the 
> fields that we want to keep, it does not work correctly. Looking at the code, 
> it seems to be because we call "includeTypes" instead of "include" when 
> setting the ORC options.
> This can be a problem when we want to import an ORC table containing Union 
> types, as the import raises an error even if we try not to import 
> these specific fields.
> The definitions of the corresponding ORC methods are here:
> [https://github.com/apache/orc/blob/72220851cbde164a22706f8d47741fd1ad3db190/c%2B%2B/src/Options.hh#L185-L191]
> and
> [https://github.com/apache/orc/blob/72220851cbde164a22706f8d47741fd1ad3db190/c%2B%2B/src/Options.hh#L201-L207]
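To illustrate why the two options can select different columns, here is a minimal, self-contained sketch. It does not use the ORC or Arrow APIs. Node, subtreeSize, selectByFieldIndex, selectByTypeId and exampleSchema are hypothetical names made up for this example, and the schema is invented. The sketch assumes that type IDs are assigned in pre-order with the root struct as 0, and that selecting a type ID pulls in the top-level field that contains it; the real ORC column selection may differ in its details.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Toy schema node for illustration only; not an ORC or Arrow type.
struct Node {
  std::string name;
  std::vector<Node> children;
};

// Number of type IDs consumed by a node and all of its descendants.
uint64_t subtreeSize(const Node& n) {
  uint64_t size = 1;
  for (const Node& c : n.children) size += subtreeSize(c);
  return size;
}

// Interpret the numbers as positions among the root's direct children.
std::set<std::string> selectByFieldIndex(const Node& root,
                                         const std::vector<uint64_t>& indexes) {
  std::set<std::string> out;
  for (uint64_t i : indexes) {
    if (i < root.children.size()) out.insert(root.children[i].name);
  }
  return out;
}

// Interpret the numbers as pre-order type IDs (root = 0) and report which
// top-level fields end up selected. Assumption of this sketch: selecting the
// root selects every field.
std::set<std::string> selectByTypeId(const Node& root,
                                     const std::vector<uint64_t>& ids) {
  std::set<std::string> out;
  uint64_t first = 1;  // ID of the first top-level field
  for (const Node& field : root.children) {
    const uint64_t last = first + subtreeSize(field) - 1;
    for (uint64_t id : ids) {
      if (id == 0 || (id >= first && id <= last)) out.insert(field.name);
    }
    first = last + 1;
  }
  return out;
}

// Invented schema: struct<id, pos:struct<x,y>, tag:union<0,1>, name>.
// Pre-order type IDs: id=1, pos=2 (x=3, y=4), tag=5 (6, 7), name=8.
Node exampleSchema() {
  return Node{"root", {
      Node{"id", {}},
      Node{"pos", {Node{"x", {}}, Node{"y", {}}}},
      Node{"tag", {Node{"0", {}}, Node{"1", {}}}},
      Node{"name", {}},
  }};
}
```

With this toy schema, the list {1, 3} read as field indexes selects "pos" and "name", but read as type IDs it selects "id" and "pos" (type 3 is pos.x). A list containing 0, read as type IDs, selects the root and therefore every field in this model, including the union field "tag" that the caller meant to skip.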



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
