[ 
https://issues.apache.org/jira/browse/ARROW-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045207#comment-16045207
 ] 

Bryan Cutler commented on ARROW-692:
------------------------------------

Updated the sample JSON: the Field schema should be the dictionary type (utf8 
here), not the index type.

[~wesmckinn] and [~julienledem], I have a couple of questions:

1)  The "name" field in the dictionary is meaningless, right?  It's not part of 
the RecordBatch message.  In Arrow Java, when writing, it will be whatever name 
the user gave the dictionary vector when initializing it.  When reading, the 
dictionary vector takes the name of the first Field that has a dictionary 
encoding.  Would it be better to overwrite any name with something standard 
like "DICT#", where # is the dictionary id?

2)  Does it make sense for the dictionary field to be nullable?  In Java, the 
dictionary field's nullable flag will be whatever it is for the first field 
using that encoding.  Should nullable only be allowed to be false, with this 
enforced when setting the dictionary field?  Of course, the encoded index field 
can still be nullable.
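
To make both proposals concrete, here is a rough sketch in Python. It is not 
the Arrow Java API: the Field class and standard_dictionary_field helper are 
hypothetical stand-ins. It only illustrates deriving the dictionary's field 
from the first field that uses it, while forcing the name to "DICT#" and 
nullable to false.

```python
# Hypothetical sketch; none of these names are real Arrow APIs.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Field:
    name: str
    type: str       # e.g. "utf8" for dictionary values, "int32" for indices
    nullable: bool


def standard_dictionary_field(dictionary_id: int, value_field: Field) -> Field:
    """Copy value_field, but with a standardized name and nullable=False."""
    return replace(value_field, name=f"DICT{dictionary_id}", nullable=False)


# The first field that uses dictionary 0. Without normalization, its name
# and nullable flag would carry over to the dictionary vector.
first_user = Field(name="city", type="utf8", nullable=True)
dict_field = standard_dictionary_field(0, first_user)
print(dict_field)
```

With this approach the dictionary's field no longer depends on which field 
happens to reference the dictionary first, and the encoded index field keeps 
its own nullable flag.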



> Java<->C++ Integration tests for dictionary-encoded vectors
> -----------------------------------------------------------
>
>                 Key: ARROW-692
>                 URL: https://issues.apache.org/jira/browse/ARROW-692
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Java - Vectors
>            Reporter: Wes McKinney
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
