[ https://issues.apache.org/jira/browse/ARROW-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Micah Kornfield resolved ARROW-5862.
------------------------------------
Resolution: Fixed
Fix Version/s: 0.15.0
Issue resolved by pull request 4813
[https://github.com/apache/arrow/pull/4813]
> [Java] Provide dictionary builder
> ---------------------------------
>
> Key: ARROW-5862
> URL: https://issues.apache.org/jira/browse/ARROW-5862
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Java
> Reporter: Liya Fan
> Assignee: Liya Fan
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.15.0
>
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> The dictionary builder serves the following scenario, which is frequently
> encountered in practice when dictionary encoding is involved: the dictionary
> values are not known a priori, so they are determined dynamically as new
> data arrive continually.
> In particular, when a new value arrives, it is checked against the
> dictionary. If it is already present, it is simply ignored; otherwise, it is
> added to the dictionary.
>
> When all values have been evaluated, the dictionary can be considered
> complete, and encoding can start.
> Code using a dictionary builder should look like this:
> {{DictionaryBuilder<IntVector> dictionaryBuilder = ...}}
> {{dictionaryBuilder.startBuild();}}
> {{...}}
> {{dictionaryBuilder.addValue(newValue);}}
> {{...}}
> {{dictionaryBuilder.endBuild();}}
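
For illustration, the build-then-encode flow described in the issue can be sketched in plain Java without any Arrow dependency. This is only a minimal sketch under stated assumptions, not the implementation merged in the pull request: the class name SimpleDictionaryBuilder, the encode method, and the choice of a hash map for the membership test are all hypothetical. Only the method names startBuild, addValue, and endBuild are taken from the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the idea in the issue; NOT the Arrow API.
final class SimpleDictionaryBuilder<T> {
    // Maps each distinct value to its position in the dictionary.
    private final Map<T, Integer> indexByValue = new HashMap<>();
    // Distinct values in first-seen order; this is the dictionary itself.
    private final List<T> values = new ArrayList<>();
    private boolean building;

    void startBuild() {
        indexByValue.clear();
        values.clear();
        building = true;
    }

    // Adds the value if it is new; returns its dictionary index either way.
    int addValue(T value) {
        if (!building) {
            throw new IllegalStateException("call startBuild() first");
        }
        Integer existing = indexByValue.get(value);
        if (existing != null) {
            return existing; // already in the dictionary, nothing to add
        }
        int index = values.size();
        indexByValue.put(value, index);
        values.add(value);
        return index;
    }

    // Marks the dictionary as complete and returns a read-only copy of it.
    List<T> endBuild() {
        building = false;
        return Collections.unmodifiableList(new ArrayList<>(values));
    }

    // Encoding is only allowed once the dictionary is complete.
    int[] encode(List<T> data) {
        if (building) {
            throw new IllegalStateException("call endBuild() before encoding");
        }
        int[] indices = new int[data.size()];
        for (int i = 0; i < data.size(); i++) {
            Integer index = indexByValue.get(data.get(i));
            if (index == null) {
                throw new IllegalArgumentException("value not in dictionary: " + data.get(i));
            }
            indices[i] = index;
        }
        return indices;
    }
}

final class DictionaryBuilderDemo {
    public static void main(String[] args) {
        List<String> incoming = Arrays.asList("apple", "banana", "apple", "cherry");

        SimpleDictionaryBuilder<String> builder = new SimpleDictionaryBuilder<>();
        builder.startBuild();
        for (String value : incoming) {
            builder.addValue(value); // duplicates are ignored
        }
        List<String> dictionary = builder.endBuild();

        System.out.println(dictionary);                                // [apple, banana, cherry]
        System.out.println(Arrays.toString(builder.encode(incoming))); // [0, 1, 0, 2]
    }
}
```

The hash map keeps the "is it already in the dictionary" check at expected constant time per value, and the separate list preserves first-seen order so that indices stay stable while the dictionary grows.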
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)