richardstartin commented on code in PR #10044:
URL: https://github.com/apache/pinot/pull/10044#discussion_r1059802361
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/BaseImmutableDictionary.java:
##########
@@ -259,8 +259,7 @@ protected int binarySearch(byte[] value) {
while (low <= high) {
int mid = (low + high) >>> 1;
- byte[] midValue = _valueReader.getBytes(mid, _numBytesPerValue);
- int compareResult = ByteArray.compare(midValue, value);
+ int compareResult = _valueReader.compareUnsignedBytes(mid,
_numBytesPerValue, value);
Review Comment:
The existing logic here appears to be broken for fixed length
`FixedByteValueReaderWriter`: it compares `value` with the _padded_ stored
value retrieved by `ValueReader.getBytes`. This comparison will break ties
using the lengths when mismatches can't be found, meaning the comparison will
never yield zero unless `value` is also padded. I can only find testing for
this method with byte arrays with the same size, so this problem would not
arise in tests, but I can't find where the padding required to make this work
with arbitrary length byte arrays is taking place.
I decided to roll this change back as addressing this problem is bigger than
the scope of the change I wanted to make, and `STRING` columns are much more
common than `BYTES`. I refactored the code to make adding an unsigned
comparison trivial once this has been resolved.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]