salvatore-campagna opened a new issue, #15740:
URL: https://github.com/apache/lucene/issues/15740
### Description
Lucene currently has two ways to retrieve the global min/max value of a
numeric field across segments:
- `PointValues.getMinPackedValue()` / `PointValues.getMaxPackedValue()`:
return `null` when no points exist for the field.
- `DocValuesSkipper.globalMinValue()` / `DocValuesSkipper.globalMaxValue()`:
return sentinel values (`Long.MIN_VALUE` or `Long.MAX_VALUE`) when no data
exists or when the skipper is not available for a leaf reader.
These two APIs have different "no data" semantics. `PointValues` returns
`null`, which callers can check for and handle cleanly. `DocValuesSkipper`
returns sentinel values that callers must know about and filter out.
Specifically:
- `globalMinValue()` returns `Long.MAX_VALUE` when no segments have the
field, and `Long.MIN_VALUE` when a leaf reader has the field info but no
skipper.
- `globalMaxValue()` returns `Long.MIN_VALUE` when no segments have the
field, and `Long.MAX_VALUE` when a leaf reader has the field info but no
skipper.
This makes retrieving a field's min/max values error-prone: callers must
first determine which data structure is available, then call the right API,
and then handle that API's "no data" convention. If a caller picks the wrong
API or forgets to filter sentinels, invalid values propagate silently.
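To illustrate the sentinel problem, here is a minimal self-contained model of the `globalMinValue()` convention as described above. It does not use Lucene classes: `Segment`, `globalMin`, and `isSentinel` are hypothetical names invented for this sketch.

```java
import java.util.List;

final class SentinelDemo {
    // Hypothetical stand-in for one segment's state; not a Lucene type.
    // hasField: the segment has field info; hasSkipper: a skipper is available.
    record Segment(boolean hasField, boolean hasSkipper, long min, long max) {}

    // Models the globalMinValue() convention described in this issue.
    static long globalMin(List<Segment> segments) {
        long min = Long.MAX_VALUE; // stays MAX_VALUE if no segment has the field
        for (Segment s : segments) {
            if (!s.hasField()) {
                continue; // segment does not know the field at all
            }
            if (!s.hasSkipper()) {
                return Long.MIN_VALUE; // field info present, but no skipper
            }
            min = Math.min(min, s.min());
        }
        return min;
    }

    // The check every caller has to remember to perform today.
    static boolean isSentinel(long value) {
        return value == Long.MIN_VALUE || value == Long.MAX_VALUE;
    }

    public static void main(String[] args) {
        long noField = globalMin(List.of());
        long noSkipper = globalMin(List.of(new Segment(true, false, 0, 0)));
        System.out.println(noField + " sentinel=" + isSentinel(noField));
        System.out.println(noSkipper + " sentinel=" + isSentinel(noSkipper));
    }
}
```

Note that a filter like `isSentinel` cannot tell a sentinel apart from a field whose real minimum happens to be `Long.MIN_VALUE`, which is one more reason a `null` result is easier to handle.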
### Proposal
Introduce a unified API for retrieving the global min/max value of a numeric
field, abstracting over the underlying data structure. The API should:
1. Return `null` when no data exists, regardless of whether the field uses
BKD trees or doc values skippers.
2. Automatically delegate to whichever data structure is available for the
field.
3. Define clear behavior when both structures are available, and return
`null` when neither is available.
A possible solution:
```java
public record MinMax(long min, long max) {}

// Returns null if values cannot be loaded
public static MinMax getGlobalMinMax(IndexReader reader, String field)
    throws IOException { ... }
```
This would eliminate the need for callers to know which underlying data
structure a field uses and would prevent sentinel values from leaking into
application logic.
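The merging logic such an API would need can be sketched without Lucene types. This is only a model under stated assumptions: the `Segment` record and its `points`/`skipper` fields are hypothetical, points are preferred when both structures exist (the proposal leaves that choice open), and a segment with neither structure is treated as contributing no data.

```java
import java.util.List;

final class GlobalMinMaxSketch {
    record MinMax(long min, long max) {}

    // Hypothetical per-segment view; either source is null when unavailable.
    record Segment(MinMax points, MinMax skipper) {}

    // Returns null when no segment yields values from either structure.
    static MinMax getGlobalMinMax(List<Segment> segments) {
        boolean found = false;
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (Segment s : segments) {
            // Assumption: prefer points when both structures are present.
            MinMax mm = s.points() != null ? s.points() : s.skipper();
            if (mm == null) {
                continue; // assumption: this segment contributes nothing
            }
            found = true;
            min = Math.min(min, mm.min());
            max = Math.max(max, mm.max());
        }
        // An explicit flag, not a sentinel comparison, decides "no data",
        // so Long.MIN_VALUE and Long.MAX_VALUE remain valid field values.
        return found ? new MinMax(min, max) : null;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
            new Segment(new MinMax(5, 9), null),
            new Segment(null, new MinMax(-3, 2)),
            new Segment(null, null));
        System.out.println(getGlobalMinMax(segments));
        System.out.println(getGlobalMinMax(List.of()));
    }
}
```

A real implementation would also have to decide whether a segment that has the field but exposes neither structure should make the whole result `null`, since skipping it could yield an incomplete range.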
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]