zhuqi-lucas commented on code in PR #7875:
URL: https://github.com/apache/arrow-rs/pull/7875#discussion_r2188050125


##########
arrow-array/src/array/byte_view_array.rs:
##########
@@ -1193,9 +1197,19 @@ mod tests {
             b"abcdefghi",
             b"abcdefghij",
             b"abcdefghijk",
-            b"abcdefghijkl", // 12 bytes, max inline
-            b"bar",
-            b"bar\0", // special case to test length tiebreaker
+            b"abcdefghijkl",
+            // ───────────────────────────────────────────────────────────────────────
+            // This pair verifies that we didn’t accidentally reverse the inline bytes:
+            // without our fix, “backend one” would compare as if it were
+            //    “eno dnekcab”, so “one” might end up sorting _after_ “two”.
+            b"backend one", // special case: tests byte-order reversal bug

Review Comment:
   Good question. The reason is:
   
   The bug caused full byte reversal of the inline string bytes, meaning the 
entire 12-byte segment was reversed before comparison.
   
   For strings like "xyy" and "xyz", which differ only in their last byte, 
reversing the bytes moves this difference to the first byte of the reversed 
string.
   
   Since both strings are reversed before comparison, that differing byte is 
compared first for both of them, so their relative order is preserved.
   
   Thus, even though the byte order is wrong globally (the entire string is 
reversed), "xyy" still compares correctly as less than "xyz" in the reversed 
space, so the test passes.
   
   In other words, differences at the end of short strings don’t expose the 
reversal bug, because reversing the entire string simply moves the difference 
to the front, preserving the relative order.
   
   The bug only becomes apparent for strings that differ in the middle or 
earlier bytes, such as "backend one" vs "backend two", where reversing the 
entire inline data can invert the lexicographical order.
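   
   The reasoning above can be checked with a minimal sketch. It assumes a simplified model of the bug (a plain byte reversal before a lexicographic compare) and is not the actual arrow-rs code path. The helper names `cmp_forward` and `cmp_reversed` and the pair "ab"/"ba" are hypothetical, chosen only for illustration.

```rust
use std::cmp::Ordering;

/// Correct behavior: plain lexicographic comparison of the bytes.
fn cmp_forward(a: &[u8], b: &[u8]) -> Ordering {
    a.cmp(b)
}

/// Simplified model of the bug (an assumption, not the arrow-rs code):
/// compare the bytes as if each string had been reversed first.
fn cmp_reversed(a: &[u8], b: &[u8]) -> Ordering {
    a.iter().rev().cmp(b.iter().rev())
}

fn main() {
    // The strings differ only in the last byte. After reversal that byte
    // is compared first, so the relative order is unchanged.
    assert_eq!(cmp_forward(b"xyy", b"xyz"), Ordering::Less);
    assert_eq!(cmp_reversed(b"xyy", b"xyz"), Ordering::Less);

    // Hypothetical pair that differs in more than one position. The first
    // and last differing bytes disagree, so reversal flips the order.
    assert_eq!(cmp_forward(b"ab", b"ba"), Ordering::Less);
    assert_eq!(cmp_reversed(b"ab", b"ba"), Ordering::Greater);
}
```

   Under this model, a test pair can only expose the reversal if the last differing byte orders the two strings differently than the first differing byte does.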


