Thanks Micah, I'll take the Java-side implementation.

Thanks,
Ji Liu


------------------------------------------------------------------
From:Micah Kornfield <emkornfi...@gmail.com>
Send Time:2019-12-02 (Monday) 09:25
To:dev <dev@arrow.apache.org>
Subject:Re: [Result] [VOTE] Clarifications and forward compatibility changes 
for Dictionary Encoding (second iteration)

I've merged the PR and created ARROW-7283 [1] to track implementation
for the languages currently in the integration tests.


[1] https://issues.apache.org/jira/browse/ARROW-7283

On Wed, Nov 27, 2019 at 1:03 AM Micah Kornfield <emkornfi...@gmail.com>
wrote:

> The vote carries with 3 binding +1 votes, 1 non-binding +1 vote and
> 1 non-binding +.5 vote.
>
> To follow up, I will:
> 1.  Open up JIRAs for work items in reference implementations (C++/Java)
> 2.  Merge the pull request containing the specification changes.
>
> Thanks,
> Micah
>
> On Tue, Nov 26, 2019 at 12:50 AM Sutou Kouhei <k...@clear-code.com> wrote:
>
>> +1 (binding)
>>
>> In <cak7z5t_1bpigahnn13orr2o0qwzo54nb_4zv5eyfn6w8k+o...@mail.gmail.com>
>>   "[VOTE] Clarifications and forward compatibility changes for Dictionary
>> Encoding (second iteration)" on Wed, 20 Nov 2019 20:41:57 -0800,
>>   Micah Kornfield <emkornfi...@gmail.com> wrote:
>>
>> > Hello,
>> > As discussed in [1], I've opened a PR [2] with proposed changes that
>> > clarifies:
>> >
>> > 1.  It is not required that all dictionary batches occur at the beginning
>> > of the IPC stream format (if the first record batch has an all-null
>> > dictionary-encoded column, the null column's dictionary might not be sent
>> > until later in the stream).
>> >
>> > 2.  A second dictionary batch for the same ID that is not a "delta batch"
>> > in an IPC stream indicates the dictionary should be replaced.
>> >
>> > 3.  Clarifies that the file format can only contain 1 "NON-delta"
>> > dictionary batch and multiple "delta" dictionary batches. Dictionary
>> > replacement is not supported in the file format.
>> >
>> > 4.  Adds an enum to the dictionary metadata for possible future changes in
>> > the format in which dictionary batches can be sent (the most likely would
>> > be an array Map<Int, Value>).  An enum is needed as a placeholder to allow
>> > for forward compatibility past release 1.0.0.
>> >
>> > If accepted, there will be work in all implementations to make sure that
>> > they cover the edge cases highlighted, and additional integration testing
>> > will be needed.
>> >
>> > Please vote whether to accept these additions. The vote will be open for
>> > at least 72 hours.
>> >
>> > [ ] +1 Accept these changes to the specification
>> > [ ] +0
>> > [ ] -1 Do not accept the changes because...
>> >
>> > Thanks,
>> > Micah
>> >
>> >
>> > [1] https://lists.apache.org/thread.html/d0f137e9db0abfcfde2ef879ca517a710f620e5be4dd749923d22c37@%3Cdev.arrow.apache.org%3E
>> > [2] https://github.com/apache/arrow/pull/5585
>>
>
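
For anyone picking up the implementation work, the stream-side rules in
points 1 and 2 of the quoted proposal can be sketched roughly as below. This
is only an illustration of the semantics, not Arrow code: DictionaryTracker,
on_dictionary_batch and decode are hypothetical names, dictionaries are
modelled as plain Python lists, and rejecting a delta batch for an ID that
has no dictionary yet is my own assumption rather than something the vote
text states.

```python
from typing import Dict, List, Optional


class DictionaryTracker:
    """Hypothetical reader-side state for an IPC stream (not an Arrow API)."""

    def __init__(self) -> None:
        # Maps dictionary ID -> current dictionary values.
        self._dictionaries: Dict[int, List[str]] = {}

    def on_dictionary_batch(self, dict_id: int, values: List[str],
                            is_delta: bool) -> None:
        if is_delta:
            # A delta batch appends to the existing dictionary.
            if dict_id not in self._dictionaries:
                # Assumption: a delta with nothing to extend is an error.
                raise ValueError(f"delta for unknown dictionary {dict_id}")
            self._dictionaries[dict_id].extend(values)
        else:
            # A non-delta batch for a known ID replaces the dictionary.
            self._dictionaries[dict_id] = list(values)

    def decode(self, dict_id: int,
               indices: List[Optional[int]]) -> List[Optional[str]]:
        dictionary = self._dictionaries.get(dict_id)
        decoded: List[Optional[str]] = []
        for index in indices:
            if index is None:
                # Null slots never need the dictionary, so an all-null
                # column can be read before its dictionary has arrived.
                decoded.append(None)
            elif dictionary is None:
                raise KeyError(f"dictionary {dict_id} not received yet")
            else:
                decoded.append(dictionary[index])
        return decoded


tracker = DictionaryTracker()
all_null = tracker.decode(0, [None, None])          # dictionary not sent yet
tracker.on_dictionary_batch(0, ["a", "b"], is_delta=False)
first = tracker.decode(0, [0, 1, None])
tracker.on_dictionary_batch(0, ["c"], is_delta=True)   # append
after_delta = tracker.decode(0, [2])
tracker.on_dictionary_batch(0, ["x"], is_delta=False)  # replace
after_replace = tracker.decode(0, [0])
```

A file-format reader would differ under point 3: it would presumably reject
the second non-delta batch instead of replacing the dictionary.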
