zhuqi-lucas commented on code in PR #7875:
URL: https://github.com/apache/arrow-rs/pull/7875#discussion_r2188050125
##########
arrow-array/src/array/byte_view_array.rs:
##########
@@ -1193,9 +1197,19 @@ mod tests {
b"abcdefghi",
b"abcdefghij",
b"abcdefghijk",
- b"abcdefghijkl", // 12 bytes, max inline
- b"bar",
- b"bar\0", // special case to test length tiebreaker
+ b"abcdefghijkl",
+ // ───────────────────────────────────────────────────────────────────────
+ // This pair verifies that we didn’t accidentally reverse the inline bytes:
+ // without our fix, “backend one” would compare as if it were
+ // “eno dnekcab”, so “one” might end up sorting _after_ “two”.
+ b"backend one", // special case: tests byte-order reversal bug
Review Comment:
Good question. Here is why:
The bug fully reversed the inline string bytes, meaning the entire 12-byte
segment was reversed before comparison.
For strings like "xyy" and "xyz", which differ only in their last byte,
reversing the bytes moves this difference to the first byte of the reversed
string.
Both strings are reversed in the same way, so that single differing byte
still decides the comparison and their relative order is preserved.
Thus, even though every byte ends up in the wrong position (the entire string
is reversed), "xyy" still compares as less than "xyz" in the reversed
space, so the test passes.
In other words, a difference confined to the last byte of a short string
doesn’t expose the reversal bug, because reversing the entire string simply
moves that difference to the front, preserving the relative order.
The bug only becomes apparent for strings that differ in more than one byte,
like "backend one" vs "backend two", where reversing the entire inline data
changes which differing byte is compared first and so can invert the
lexicographical order.
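
A minimal sketch of the effect described above, assuming the bug can be modeled as comparing the plain reversed byte strings. `cmp_forward` and `cmp_reversed` are hypothetical helpers written for this illustration, not arrow-rs APIs, and the "ab"/"ba" pair is an illustrative input that does not come from the PR:

```rust
use std::cmp::Ordering;

/// Correct lexicographic comparison, first byte first.
fn cmp_forward(a: &[u8], b: &[u8]) -> Ordering {
    a.cmp(b)
}

/// Hypothetical model of the bug: compare the bytes last byte first,
/// as if both strings had been reversed before comparison.
fn cmp_reversed(a: &[u8], b: &[u8]) -> Ordering {
    a.iter().rev().cmp(b.iter().rev())
}

fn main() {
    // "xyy" vs "xyz" differ in a single byte, so that byte decides the
    // result in both directions and the reversal goes unnoticed.
    assert_eq!(cmp_forward(b"xyy", b"xyz"), Ordering::Less);
    assert_eq!(cmp_reversed(b"xyy", b"xyz"), Ordering::Less);

    // "ab" vs "ba" (illustrative pair) differ in two bytes that disagree,
    // so reading from the end flips the result.
    assert_eq!(cmp_forward(b"ab", b"ba"), Ordering::Less);
    assert_eq!(cmp_reversed(b"ab", b"ba"), Ordering::Greater);
}
```

Under this model a test pair only catches the reversal if the first differing byte read from the front and the first differing byte read from the back order the two strings differently.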