[ 
https://issues.apache.org/jira/browse/ARROW-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5862:
----------------------------------
    Labels: pull-request-available  (was: )

> [Java] Provide dictionary builder
> ---------------------------------
>
>                 Key: ARROW-5862
>                 URL: https://issues.apache.org/jira/browse/ARROW-5862
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Java
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>              Labels: pull-request-available
>
> The dictionary builder serves the following scenario, which is frequently 
> encountered in practice when dictionary encoding is involved: the dictionary 
> values are not known a priori, so they must be determined dynamically as new 
> data arrive.
> In particular, when a new value arrives, it is checked against the 
> dictionary. If it is already present, it is simply ignored; otherwise, it is 
> added to the dictionary.
>  
> When all values have been evaluated, the dictionary can be considered 
> complete, so encoding can start afterward.
> A code snippet using a dictionary builder should look like this:
> {{DictionaryBuilder<IntVector> dictionaryBuilder = ...}}
> {{dictionaryBuilder.startBuild();}}
> {{...}}
> {{dictionaryBuilder.addValue(newValue);}}
> {{...}}
> {{dictionaryBuilder.endBuild();}}
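
To make the proposed workflow concrete, here is a minimal, self-contained sketch of the build phase in plain Java. It is an illustration only and not the Arrow implementation: the class name SimpleDictionaryBuilder, the getDictionary() and encode() methods, and the use of a HashMap plus a List in place of Arrow vectors are all assumptions made for this example. Only startBuild(), addValue() and endBuild() mirror the names in the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of a dictionary builder (NOT the Arrow API).
 * Values are collected between startBuild() and endBuild(); duplicates are ignored.
 */
final class SimpleDictionaryBuilder<T> {
    // Maps each distinct value to its position in the dictionary.
    private final Map<T, Integer> indexByValue = new HashMap<>();
    // Dictionary values in first-seen order.
    private final List<T> values = new ArrayList<>();
    private boolean building;

    /** Resets any previous state and begins collecting values. */
    void startBuild() {
        indexByValue.clear();
        values.clear();
        building = true;
    }

    /** Adds the value if it is not yet in the dictionary; returns its index either way. */
    int addValue(T value) {
        if (!building) {
            throw new IllegalStateException("startBuild() must be called first");
        }
        Integer existing = indexByValue.get(value);
        if (existing != null) {
            return existing; // already present: nothing to add
        }
        int index = values.size();
        indexByValue.put(value, index);
        values.add(value);
        return index;
    }

    /** Marks the dictionary as complete; encoding may start afterward. */
    void endBuild() {
        building = false;
    }

    /** Returns a read-only view of the completed dictionary. */
    List<T> getDictionary() {
        if (building) {
            throw new IllegalStateException("dictionary is still being built");
        }
        return Collections.unmodifiableList(values);
    }

    /** Returns the dictionary index of the value, or -1 if it is absent. */
    int encode(T value) {
        Integer index = indexByValue.get(value);
        return index == null ? -1 : index;
    }
}
```

With this sketch, feeding the values "a", "b", "a" between startBuild() and endBuild() yields a two-entry dictionary in first-seen order, after which encode("b") returns 1. Keeping a hash map alongside the ordered list is one possible design choice: it makes the membership test on each arriving value a constant-time lookup on average.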



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
