+1 (binding). Thanks Micah
On Wed, Nov 20, 2019 at 10:42 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> Hello,
> As discussed on [1], I've proposed clarifications in a PR [2] that
> clarifies:
>
> 1. It is not required that all dictionary batches occur at the beginning
> of the IPC stream format (if the first record batch has an all-null
> dictionary-encoded column, the null column's dictionary might not be sent
> until later in the stream).
>
> 2. A second dictionary batch for the same ID that is not a "delta batch"
> in an IPC stream indicates the dictionary should be replaced.
>
> 3. Clarifies that the file format can only contain 1 "NON-delta"
> dictionary batch and multiple "delta" dictionary batches. Dictionary
> replacement is not supported in the file format.
>
> 4. Add an enum to dictionary metadata for possible future changes in what
> format dictionary batches can be sent. (the most likely would be an array
> Map<Int, Value>). An enum is needed as a placeholder to allow for forward
> compatibility past the release 1.0.0.
>
> If accepted there will be work in all implementations to make sure that
> they cover the edge cases highlighted and additional integration testing
> will be needed.
>
> Please vote whether to accept these additions. The vote will be open for at
> least 72 hours.
>
> [ ] +1 Accept these changes to the specification
> [ ] +0
> [ ] -1 Do not accept the changes because...
>
> Thanks,
> Micah
>
>
> [1]
> https://lists.apache.org/thread.html/d0f137e9db0abfcfde2ef879ca517a710f620e5be4dd749923d22c37@%3Cdev.arrow.apache.org%3E
> [2] https://github.com/apache/arrow/pull/5585
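
For implementers, the reader-side behaviour implied by points 1-3 could be sketched roughly as below. This is only an illustration in plain Python, not the Arrow API: `DictionaryBatch`, `apply_dictionary_batch`, and the `file_format` flag are hypothetical names, and rejecting a delta for an ID that has no dictionary yet is an assumption that the proposal text does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DictionaryBatch:
    """Hypothetical stand-in for an IPC dictionary batch message."""
    dict_id: int
    values: List[str]
    is_delta: bool = False


def apply_dictionary_batch(
    state: Dict[int, List[str]],
    batch: DictionaryBatch,
    file_format: bool = False,
) -> None:
    """Update the reader's dictionary state with one dictionary batch."""
    if batch.is_delta:
        # Delta batch: append to the existing dictionary for this ID.
        if batch.dict_id not in state:
            # Assumption: a delta with nothing to extend is an error.
            raise ValueError(f"delta for unknown dictionary {batch.dict_id}")
        state[batch.dict_id] = state[batch.dict_id] + list(batch.values)
    elif batch.dict_id in state and file_format:
        # Point 3: the file format allows only one non-delta batch per ID.
        raise ValueError("dictionary replacement is not supported in files")
    else:
        # Point 1: the first dictionary for an ID may arrive at any time.
        # Point 2: a later non-delta batch in a stream replaces it.
        state[batch.dict_id] = list(batch.values)


# Stream usage: initial dictionary, then a delta, then a replacement.
stream_state: Dict[int, List[str]] = {}
apply_dictionary_batch(stream_state, DictionaryBatch(0, ["a", "b"]))
apply_dictionary_batch(stream_state, DictionaryBatch(0, ["c"], is_delta=True))
after_delta = list(stream_state[0])
apply_dictionary_batch(stream_state, DictionaryBatch(0, ["x"]))
after_replace = list(stream_state[0])
```

Keeping the state as a plain mapping from dictionary ID to values makes the stream and file cases differ only in whether a second non-delta batch is accepted.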