+1 (binding)
On 29/06/2020 at 23:59, Wes McKinney wrote:
> Hi,
>
> As discussed on the mailing list [1], it has been proposed to allow
> the use of unsigned dictionary indices (which is already technically
> possible in our metadata serialization, but not allowed according to
> the language of the columnar specification), with the following
> caveats:
>
> * Unless part of an application's requirements (e.g. if it is
> necessary to store dictionaries with size 128 to 255 more compactly),
> implementations are recommended to prefer signed over unsigned
> integers, with int32 continuing to be the "default" when the indexType
> field of DictionaryEncoding is null
> * uint64 dictionary indices, while permitted, are strongly not
> recommended unless required by an application as they are more
> difficult to work with in some programming languages (e.g. Java) and
> they do not offer the storage size benefits that uint8 and uint16 do.
>
> This change is backwards compatible, but not forward compatible for
> all implementations (for example, C++ will reject unsigned integers).
> Assuming that the V5 MetadataVersion change is accepted, to protect
> against forward compatibility issues such implementations would be
> recommended to not allow unsigned dictionary indices to be serialized
> using V4 MetadataVersion.
>
> A PR with the changes to the columnar specification (possibly subject
> to some clarifying language) is at [2].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Accept changes to allow unsigned integer dictionary indices
> [ ] +0
> [ ] -1 Do not accept because...
>
> [1]:
> https://lists.apache.org/thread.html/r746e0a76c4737a2cf48dec656103677169bebb303240e62ae1c66d35%40%3Cdev.arrow.apache.org%3E
> [2]: https://github.com/apache/arrow/pull/7567
>
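
For illustration only, here is a minimal sketch of how an implementation might apply the recommendations quoted above: prefer signed indices, use an unsigned type only when the application opts in and it saves a width, fall back to int32 when no index type is given, and refuse to write unsigned indices with V4 metadata. The names `choose_index_type` and `can_serialize` and the string labels "V4"/"V5" are hypothetical and are not part of any Arrow library; this is plain Python, not the behaviour of an existing implementation.

```python
# Hypothetical helpers sketching the proposal; not an Arrow API.

DEFAULT_INDEX_TYPE = "int32"  # used when indexType is null, per the proposal


def choose_index_type(dictionary_size, allow_unsigned=False):
    """Pick the narrowest integer type able to index `dictionary_size` entries.

    Signed types are preferred. An unsigned type is returned only when
    `allow_unsigned` is set and no signed type of the same width fits.
    """
    if dictionary_size < 0:
        raise ValueError("dictionary_size must be non-negative")
    max_index = max(dictionary_size - 1, 0)
    for width in (8, 16, 32, 64):
        if max_index <= 2 ** (width - 1) - 1:
            return f"int{width}"
        if allow_unsigned and max_index <= 2 ** width - 1:
            return f"uint{width}"
    raise ValueError("dictionary too large for a 64-bit index")


def can_serialize(index_type, metadata_version):
    """Return False for unsigned indices under V4 metadata, True otherwise."""
    if index_type is None:
        index_type = DEFAULT_INDEX_TYPE
    is_unsigned = index_type.startswith("uint")
    return not (is_unsigned and metadata_version == "V4")
```

With these assumptions, a 200-entry dictionary gets int16 by default and uint8 only when the caller passes `allow_unsigned=True`; uint64 is reached only when the largest index does not fit in int64.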