[ https://issues.apache.org/jira/browse/ARROW-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048232#comment-16048232 ]
Wes McKinney commented on ARROW-692:
------------------------------------
It would be simpler for the schema to be immutable, and therefore constructed
in a single pass. That makes it easier to support nested subfields.
As an example, see the "list of encoded string" field in
https://gist.github.com/wesm/5100e41173a3b5e53437b7a887e4383a#file-gistfile1-json
where we have:
{code}
{
  "name": "list of encoded string",
  "nullable": true,
  "type": {
    "name": "list"
  },
  "children": [
    {
      "name": "item",
      "nullable": true,
      "type": {
        "name": "utf8"
      },
      "dictionary": {
        "id": 1,
        "indexType": {
          "name": "int",
          "bitWidth": 8,
          "isSigned": true
        },
        "isOrdered": false
      },
      "children": [],
      "typeLayout": {
        "vectors": [
          {
            "type": "VALIDITY",
            "typeBitWidth": 1
          },
          {
            "type": "DATA",
            "typeBitWidth": 8
          }
        ]
      }
    }
  ],
  "typeLayout": {
    "vectors": [
      {
        "type": "VALIDITY",
        "typeBitWidth": 1
      },
      {
        "type": "OFFSET",
        "typeBitWidth": 32
      }
    ]
  }
}
{code}
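To illustrate the single-pass idea, here is a minimal Python sketch (not the Arrow Java or C++ reader). The names Field, DictionaryEncoding and parse_field are hypothetical, and the typeLayout entries are left out of the sample JSON for brevity. Each field's children are parsed first, and the field is then created once as a frozen object, so a nested dictionary-encoded child needs no later mutation of its parent.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DictionaryEncoding:
    # Hypothetical container for the "dictionary" entry of a field.
    id: int
    index_bit_width: int
    index_is_signed: bool
    is_ordered: bool


@dataclass(frozen=True)
class Field:
    # Hypothetical immutable field; children are stored as a tuple.
    name: str
    nullable: bool
    type_name: str
    dictionary: Optional[DictionaryEncoding]
    children: Tuple["Field", ...]


def parse_field(obj: dict) -> Field:
    """Build an immutable Field in one pass, children before parent."""
    dictionary = None
    dict_json = obj.get("dictionary")
    if dict_json is not None:
        index = dict_json["indexType"]
        dictionary = DictionaryEncoding(
            id=dict_json["id"],
            index_bit_width=index["bitWidth"],
            index_is_signed=index["isSigned"],
            is_ordered=dict_json["isOrdered"],
        )
    # Recurse first so the parent is constructed exactly once.
    children = tuple(parse_field(c) for c in obj.get("children", []))
    return Field(
        name=obj["name"],
        nullable=obj["nullable"],
        type_name=obj["type"]["name"],
        dictionary=dictionary,
        children=children,
    )


# Abbreviated copy of the field above (typeLayout omitted).
FIELD_JSON = """
{
  "name": "list of encoded string",
  "nullable": true,
  "type": {"name": "list"},
  "children": [
    {
      "name": "item",
      "nullable": true,
      "type": {"name": "utf8"},
      "dictionary": {
        "id": 1,
        "indexType": {"name": "int", "bitWidth": 8, "isSigned": true},
        "isOrdered": false
      },
      "children": []
    }
  ]
}
"""

field = parse_field(json.loads(FIELD_JSON))
item = field.children[0]
print(field.type_name, item.type_name, item.dictionary.id)
```

Because both classes are frozen, any later attempt to assign to an attribute raises an error, which is the property the comment argues for.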
> Java<->C++ Integration tests for dictionary-encoded vectors
> -----------------------------------------------------------
>
> Key: ARROW-692
> URL: https://issues.apache.org/jira/browse/ARROW-692
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++, Java - Vectors
> Reporter: Wes McKinney
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)