[
https://issues.apache.org/jira/browse/ORC-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428441#comment-15428441
]
Owen O'Malley commented on ORC-92:
----------------------------------
I agree that it would be really useful to be able to specify the lower-level
types by name.
You could even use virtual names for the other complex types:
a map type could use "key" and "value"
a union type could use the tags "0", "1", etc.
an array type could use "value"
You could use the same Reader.include(list<string>) method because the proposed
semantics are the same as the current ones if the names don't have a '.'.
Please file a jira for the enhancement.
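A minimal sketch of how such dotted names might be resolved to column ids is
given here. It is not the ORC C++ API: TypeNode, Kind, assignIds, childName,
resolve and exampleSchema are all hypothetical names invented for illustration,
and the sketch assumes column ids are assigned in pre-order with the root as 0.
A name without a '.' resolves to a top-level field, so the existing behaviour
is unchanged.

```cpp
#include <cstddef>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified type tree; NOT the ORC library's own type class.
enum class Kind { Primitive, Struct, Map, List, Union };

struct TypeNode {
  Kind kind = Kind::Primitive;
  int columnId = -1;                    // filled in by assignIds()
  std::vector<std::string> fieldNames;  // real names, used for Struct only
  std::vector<std::unique_ptr<TypeNode>> children;
};

// Assign column ids in pre-order, root first (assumed numbering scheme).
int assignIds(TypeNode& node, int next = 0) {
  node.columnId = next++;
  for (auto& child : node.children) next = assignIds(*child, next);
  return next;
}

// Name of the i-th child: the struct field name, or a virtual name otherwise.
std::string childName(const TypeNode& node, std::size_t i) {
  switch (node.kind) {
    case Kind::Struct: return node.fieldNames[i];
    case Kind::Map:    return i == 0 ? "key" : "value";
    case Kind::List:   return "value";
    case Kind::Union:  return std::to_string(i);  // union tag "0", "1", ...
    default:           return "";
  }
}

// Resolve a dotted path such as "struct1.long2"; returns -1 if not found.
int resolve(const TypeNode& root, const std::string& path) {
  const TypeNode* current = &root;
  std::istringstream parts(path);
  std::string part;
  while (std::getline(parts, part, '.')) {
    const TypeNode* next = nullptr;
    for (std::size_t i = 0; i < current->children.size(); ++i) {
      if (childName(*current, i) == part) {
        next = current->children[i].get();
        break;
      }
    }
    if (next == nullptr) return -1;
    current = next;
  }
  return current->columnId;
}

std::unique_ptr<TypeNode> make(Kind kind) {
  auto node = std::make_unique<TypeNode>();
  node->kind = kind;
  return node;
}

// Builds struct<int1:int, struct1:struct<int2:int, long2:long>> from the issue.
std::unique_ptr<TypeNode> exampleSchema() {
  auto inner = make(Kind::Struct);
  inner->fieldNames = {"int2", "long2"};
  inner->children.push_back(make(Kind::Primitive));
  inner->children.push_back(make(Kind::Primitive));

  auto root = make(Kind::Struct);
  root->fieldNames = {"int1", "struct1"};
  root->children.push_back(make(Kind::Primitive));
  root->children.push_back(std::move(inner));
  assignIds(*root);
  return root;
}
```

With the schema from the issue, resolve(*exampleSchema(), "struct1.long2")
yields column id 4 under the assumed numbering, while "int1" and "struct1"
still resolve to 1 and 2 as plain top-level names.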
The ColumnSelection enum is an internal detail. The code had gotten messy with
booleans for each potential way of setting which columns should be read, and an
enum seemed like a more natural way to make sure that no more than one boolean
was set.
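To illustrate the design choice, here is a rough sketch of a single enum
replacing one boolean per selection mode. Only the name ColumnSelection comes
from the code under discussion; the enumerators, the SelectionOptions class and
its methods are hypothetical and do not reflect the actual ReaderOptions
implementation.

```cpp
#include <cstdint>
#include <list>
#include <string>
#include <utility>

// Hypothetical enumerators: exactly one selection mode is active at a time.
enum class ColumnSelection { None, FieldIds, Names, TypeIds };

// Hypothetical options holder, for illustration only.
class SelectionOptions {
 public:
  // Select by top-level field id; replaces any earlier selection.
  void includeFieldIds(std::list<uint64_t> ids) {
    mode_ = ColumnSelection::FieldIds;
    ids_ = std::move(ids);
    names_.clear();
  }

  // Select by column name; replaces any earlier selection.
  void includeNames(std::list<std::string> names) {
    mode_ = ColumnSelection::Names;
    names_ = std::move(names);
    ids_.clear();
  }

  // Select by type (column) id; replaces any earlier selection.
  void includeTypeIds(std::list<uint64_t> ids) {
    mode_ = ColumnSelection::TypeIds;
    ids_ = std::move(ids);
    names_.clear();
  }

  ColumnSelection mode() const { return mode_; }
  const std::list<uint64_t>& ids() const { return ids_; }
  const std::list<std::string>& names() const { return names_; }

 private:
  ColumnSelection mode_ = ColumnSelection::None;
  std::list<uint64_t> ids_;
  std::list<std::string> names_;
};
```

Because the mode is a single value, a later include call simply overwrites the
earlier one, and there is no combination of flags that could be set
inconsistently.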
> Support column id and column name selection in ReaderOptions
> ------------------------------------------------------------
>
> Key: ORC-92
> URL: https://issues.apache.org/jira/browse/ORC-92
> Project: Orc
> Issue Type: New Feature
> Components: C++
> Affects Versions: 1.2.0
> Reporter: Chunyang Wen
> Assignee: Chunyang Wen
> Priority: Minor
> Fix For: 1.2.0
>
>
> Currently, in the C++ version of ORC, we can only select by field id or field
> name. This works fine when the data structure is flat, such as
> struct<int1:int, s1:string, list1:array<int>>. But when we have a nested
> structure such as struct<int1:int, struct1:struct<int2:int, long2:long>>, we
> can still only select the fields int1 and struct1. We cannot directly select
> long2. We can select long2 by its column id. This can be achieved by updating
> the include function in ReaderOptions.