salvatorecampagna opened a new pull request, #15817:
URL: https://github.com/apache/lucene/pull/15817

   ## Problem
   
   `NumericFieldStats.decodeLong` (introduced in #15760) only handled 4-byte 
(`IntField`) and 8-byte (`LongField`) packed point values via a switch on 
`packed.length`. Any other width caused an `IllegalArgumentException`, crashing 
the query during `SortedNumericDocValuesRangeQuery.rewrite()`.
   
   `HalfFloatPoint` produces 2-byte packed values. A range query on such a 
field triggered the exception. The original PR only tested `IntField` (4 bytes) 
and `LongField` (8 bytes), so CI did not catch the bug:
   
   ```
   java.lang.IllegalArgumentException: Unsupported packed value length: 2 (expected 8 or 4)
       at org.apache.lucene.search.NumericFieldStats.decodeLong(NumericFieldStats.java:121)
       at org.apache.lucene.search.NumericFieldStats.getStatsFromPoints(NumericFieldStats.java:72)
       at org.apache.lucene.search.NumericFieldStats.getStats(NumericFieldStats.java:58)
       ...
   ```
   
   ## Solution
   
   Replace the switch-based decoder with a generic loop that handles any packed 
value length from 1 to 8 bytes. All Lucene point types use the same encoding: 
big-endian byte order with the sign bit flipped. The loop reads the bytes 
sequentially, re-flips the sign bit on the first byte, and sign-extends the 
result into a `long`. This is allocation-free, unlike 
`NumericUtils.sortableBytesToBigInt`, which copies the array and creates a 
`BigInteger`.
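
The decoding loop described above can be sketched roughly as follows. This is a standalone illustration, not the code from the PR: the class `PackedPointDecoder` and the methods `decodeSortable` and `encodeSortable` are hypothetical names, and the encoder exists only so the demo can round-trip values.

```java
final class PackedPointDecoder {

    /**
     * Decodes a 1..8 byte big-endian value whose sign bit was flipped for sorting.
     * Hypothetical helper; it mirrors the approach described in the PR text.
     */
    static long decodeSortable(byte[] packed) {
        if (packed.length < 1 || packed.length > Long.BYTES) {
            throw new IllegalArgumentException(
                "Unsupported packed value length: " + packed.length);
        }
        // Re-flip the sign bit of the first byte. Keeping it as a signed byte
        // means the widening conversion to long sign-extends the value.
        long value = (byte) (packed[0] ^ 0x80);
        // Append the remaining bytes as unsigned, most significant first.
        for (int i = 1; i < packed.length; i++) {
            value = (value << 8) | (packed[i] & 0xFFL);
        }
        return value;
    }

    /** Inverse of decodeSortable, used only by the demo below (assumption). */
    static byte[] encodeSortable(long value, int width) {
        byte[] packed = new byte[width];
        for (int i = width - 1; i >= 0; i--) {
            packed[i] = (byte) value; // lowest byte goes last (big-endian)
            value >>= 8;
        }
        packed[0] ^= 0x80; // flip the sign bit so bytes sort like signed numbers
        return packed;
    }

    public static void main(String[] args) {
        for (int width = 1; width <= Long.BYTES; width++) {
            long min = -(1L << (8 * width - 1)); // smallest value for this width
            long max = -min - 1;                 // largest value for this width
            for (long v : new long[] {min, 0L, max}) {
                long decoded = decodeSortable(encodeSortable(v, width));
                System.out.println(width + " bytes: " + v + " -> " + decoded);
            }
        }
    }
}
```

No intermediate array or object is allocated in `decodeSortable`, which is the property the PR contrasts with `NumericUtils.sortableBytesToBigInt`.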
   
   For point fields wider than 8 bytes (e.g. `InetAddressPoint` at 16 bytes, 
`BigIntegerPoint` at 16 bytes), `getStatsFromPoints` now returns `null` instead 
of throwing, allowing `getStats` to fall through to the `DocValuesSkipper` 
path. These wider point types are never used with 
`SortedNumericDocValuesRangeQuery` in practice, but the graceful fallback 
avoids unexpected failures.
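
The fallback behaviour can be pictured with a small sketch. It does not use the Lucene API: `StatsFallbackSketch`, `statsFromPoints`, `stats`, and the `Supplier` standing in for the `DocValuesSkipper` path are all hypothetical, and min/max are returned as a two-element `long[]` purely for brevity.

```java
import java.util.function.Supplier;

final class StatsFallbackSketch {

    /** Returns {min, max}, or null when the packed width cannot fit in a long. */
    static long[] statsFromPoints(byte[] minPacked, byte[] maxPacked) {
        if (minPacked == null || maxPacked == null) {
            return null; // no point values available
        }
        if (minPacked.length != maxPacked.length
                || minPacked.length < 1
                || minPacked.length > Long.BYTES) {
            return null; // e.g. 16-byte values: signal "unsupported" instead of throwing
        }
        return new long[] {decode(minPacked), decode(maxPacked)};
    }

    /** Tries the points path first, then the stand-in for the skipper path. */
    static long[] stats(byte[] minPacked, byte[] maxPacked, Supplier<long[]> skipperPath) {
        long[] fromPoints = statsFromPoints(minPacked, maxPacked);
        return fromPoints != null ? fromPoints : skipperPath.get();
    }

    /** Sign-flipped big-endian decode for 1..8 bytes (same idea as the PR describes). */
    private static long decode(byte[] packed) {
        long value = (byte) (packed[0] ^ 0x80);
        for (int i = 1; i < packed.length; i++) {
            value = (value << 8) | (packed[i] & 0xFFL);
        }
        return value;
    }

    public static void main(String[] args) {
        long[] wide = stats(new byte[16], new byte[16], () -> null);
        System.out.println("16-byte points -> " + (wide == null ? "null (fell through)" : "stats"));
    }
}
```

Returning `null` rather than throwing lets the caller decide what to do next, which is what allows the fall-through described above.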
   
   ## Tests
   
   - `TestNumericFieldStats.testGetStatsWithAllByteWidths`: exercises 
`decodeLong` with min, zero, and max values at every byte width from 1 to 8
   - `TestNumericFieldStats.testGetStatsReturnsNullForWidePointValues`: 
verifies graceful `null` return for `InetAddressPoint` (16 bytes)
   - `TestHalfFloatPoint.testNumericFieldStats`: integration test with real 
`HalfFloatPoint` (2 bytes)
   
   ```
   ./gradlew -p lucene/core test --tests "org.apache.lucene.search.TestNumericFieldStats"
   ./gradlew -p lucene/sandbox test --tests "org.apache.lucene.sandbox.document.TestHalfFloatPoint.testNumericFieldStats"
   ```
   
   Follows up on #15760.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

