[GitHub] [iceberg] findepi commented on a change in pull request #3966: Core: Adds Utility Class for Implementing ZOrdering

GitBox Tue, 08 Feb 2022 04:30:36 -0800


findepi commented on a change in pull request #3966:
URL: https://github.com/apache/iceberg/pull/3966#discussion_r801579852




##########
File path: core/src/main/java/org/apache/iceberg/util/ZOrderByteUtils.java
##########
@@ -108,54 +108,70 @@ private ZOrderByteUtils() {
    * Strings are lexicographically sortable BUT if different byte array 
lengths will
    * ruin the Z-Ordering. (ZOrder requires that a given column contribute the 
same number of bytes every time).
    * This implementation just uses a set size to for all output byte 
representations. Truncating longer strings
-   * and right padding 0 for shorter strings.
+   * and right padding 0 for shorter strings. Requires UTF8 (or ASCII) 
encoding for ordering guarantees to hold.

Review comment:
       Please don't add a comment. Just replace `String.getBytes()` with 
`String.getBytes(java.nio.charset.StandardCharsets#UTF_8)`.
   
   Of course, i don't know the rules of the project, but otherwise i don't see 
why we would want to rely on JVM's default encoding. This is z-order, not 
sorting, but z-order should be consistent with sorting, and varchar value 
ordering is codepoint-based (in the absence of collation which doesn't exist in 
Iceberg world). The UTF-8 encoding produces bytes that, if treated as unsigned, 
lead to the same order as codepoint-based lexicographical ordering. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] findepi commented on a change in pull request #3966: Core: Adds Utility Class for Implementing ZOrdering

Reply via email to