[jira] [Commented] (ARROW-257) Add a typeids Vector to Union type

2016-09-22 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514435#comment-15514435
 ] 

Julien Le Dem commented on ARROW-257:
-

The current Java implementation uses the ordinal of the MinorType as the 
type id in the type vector.
However, the Arrow spec defines the type id as the index of the child in the 
Field's children.
This JIRA is a way to reconcile the two.
When the Vector does not use the child index as a type id, it provides the ids 
in the typeIds field (typeIds has the same length as the Field's children).
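
To make the lookup rule concrete, here is a minimal Python sketch of how a 
reader might map a type id from the type vector to a child position. The 
function name child_index_for and its parameters are hypothetical and are not 
part of any Arrow API; the sketch only assumes the behaviour described above 
(child index by default, typeIds as an override with one entry per child).

```python
from typing import Optional, Sequence


def child_index_for(type_id: int,
                    num_children: int,
                    type_ids: Optional[Sequence[int]] = None) -> int:
    """Map a type id found in the type vector to a position in the children.

    Hypothetical helper for illustration only, not an Arrow API.
    """
    if type_ids is None:
        # Default described in the spec: the type id is the child index.
        if not 0 <= type_id < num_children:
            raise ValueError(f"type id {type_id} is out of range")
        return type_id

    # typeIds, when present, carries exactly one id per child.
    if len(type_ids) != num_children:
        raise ValueError("typeIds must have one entry per child")

    for index, declared_id in enumerate(type_ids):
        if declared_id == type_id:
            return index
    raise ValueError(f"type id {type_id} is not declared in typeIds")
```

With typeIds absent, child_index_for(2, 3) returns 2. With typeIds set to 
[4, 7, 9], child_index_for(7, 3, [4, 7, 9]) returns 1, the position of the 
child that carries id 7.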

> Add a typeids Vector to Union type
> --
>
> Key: ARROW-257
> URL: https://issues.apache.org/jira/browse/ARROW-257
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Format
>Reporter: Julien Le Dem
>Assignee: Julien Le Dem
>
> {noformat}
> enum UnionMode:int { Sparse, Dense }
> table Union {
>   mode: UnionMode;
>   typeIds: [Int32]; // optional, describes typeid of each child.
> }
> {noformat}
> The idea is to enable providing an id different from the child offset (the 
> default).
> This enables an optimization where we use predefined ids when constructing 
> the type vector of the union but want the children to be only the actually 
> used types.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ARROW-257) Add a typeids Vector to Union type

2016-09-22 Thread Steven Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514043#comment-15514043
 ] 

Steven Phillips commented on ARROW-257:
---

I don't understand the purpose or benefit of this change. Could you give a 
concrete example of where this would be useful?






[jira] [Commented] (ARROW-257) Add a typeids Vector to Union type

2016-08-15 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420729#comment-15420729
 ] 

Wes McKinney commented on ARROW-257:


So if I understand correctly: suppose we had a union of 50 types, but only 5 of 
them actually occur in the data. Then the typeIds would indicate the indices of 
the observed child types. That makes sense to me.
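
A small Python sketch of that scenario, under stated assumptions: the ids in 
type_vector are made-up sample data drawn from a predefined set of 50 ids 
(0 to 49), and compact_type_ids is a hypothetical helper, not an Arrow 
function. It derives typeIds from the ids that actually occur, so only those 
children need to be materialized.

```python
def compact_type_ids(type_vector):
    """Return the distinct type ids that occur in type_vector, sorted.

    Hypothetical helper for illustration only, not an Arrow API.
    """
    return sorted(set(type_vector))


# Made-up sample data: ids come from a predefined set of 50 (0..49),
# but only five distinct ids are actually used.
type_vector = [3, 17, 3, 42, 8, 17, 29]

# One entry per child that is actually present.
type_ids = compact_type_ids(type_vector)

# Position of each observed id within the (now much shorter) child list.
child_of = {type_id: index for index, type_id in enumerate(type_ids)}

print(type_ids)      # [3, 8, 17, 29, 42]
print(child_of[17])  # 2
```

The type vector keeps the predefined ids unchanged, while the union carries 
5 children instead of 50.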



