+1 on the format additions.

The implementations will probably need a bit more review back-and-forth.

Regards

Antoine.

On 28/06/2023 at 21:34, Benjamin Kietzman wrote:
Hello,

I'd like to propose adding Utf8View arrays to the arrow format. Previous discussion in [1], columnar format description in [2], flatbuffers changes in [3].

There are implementations available in both C++[4] and Go[5] which exercise the new type over IPC. Utf8View format demonstrates[6] significant performance benefits over Utf8 in common tasks.

The vote will be open for at least 72 hours.

[ ] +1 add the proposed Utf8View type to the Apache Arrow format
[ ] -1 do not add the proposed Utf8View type to the Apache Arrow format because...

Sincerely,
Ben Kietzman

[1] https://lists.apache.org/thread/w88tpz76ox8h3rxkjl4so6rg3f1rv7wt
[2] https://github.com/apache/arrow/blob/46cf7e67766f0646760acefa4d2d01cdfead2d5d/docs/source/format/Columnar.rst#variable-size-binary-view-layout
[3] https://github.com/apache/arrow/pull/35628/files#diff-0623d567d0260222d5501b4e169141b5070eabc2ec09c3482da453a3346c5bf3
[4] https://github.com/apache/arrow/pull/35628
[5] https://github.com/apache/arrow/pull/35769
[6] https://github.com/apache/arrow/pull/35628#issuecomment-1583218617