epotyom opened a new pull request, #16122:
URL: https://github.com/apache/lucene/pull/16122
Previously, two structurally identical problems were solved by two separate mechanisms:
- **Taxonomy rollup**: `getOrdinalsToRollup()` + `getChildrenOrds(int)`
drove a recursive tree-walk in `CountFacetRecorder` and
`LongAggregationsFacetRecorder`, descending from each dim root through the full
taxonomy subtree regardless of which nodes had hits.
- **Range remapping**: the `pos[]` lookup (elementary interval → user range
position) was baked into each `NonOverlappingLongRangeFacetCutter` leaf
cutter's `nextOrd()`, running once per matching document during collection.
### What this PR does
Adds two default methods to `FacetCutter`:
```java
default boolean needsRemapping() throws IOException { return false; }
default OrdinalIterator remapOrd(int mergedOrd) throws IOException { ... }
```
When `needsRemapping()` returns `true`, recorders iterate over the recorded
ordinals and call `remapOrd()` on each one to obtain its final ordinal(s).
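To illustrate the contract, here is a minimal, self-contained sketch of the reduce-time loop a recorder might run. It does not use the Lucene classes: `RemapSketch`, `remapCounts`, and the `IntFunction<int[]>` stand-in for `remapOrd()` (which in the PR returns an `OrdinalIterator`) are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical stand-in for a recorder's reduce step; not the PR's code.
final class RemapSketch {

    /**
     * Folds counts keyed by recorded ordinal into counts keyed by final ordinal.
     * remapOrd plays the role of FacetCutter#remapOrd, simplified to return int[].
     */
    static Map<Integer, Integer> remapCounts(
            Map<Integer, Integer> recorded,
            boolean needsRemapping,
            IntFunction<int[]> remapOrd) {
        if (!needsRemapping) {
            // Nothing to do: recorded ordinals are already final.
            return new HashMap<>(recorded);
        }
        Map<Integer, Integer> result = new HashMap<>();
        for (Map.Entry<Integer, Integer> e : recorded.entrySet()) {
            // One recorded ordinal may contribute to several final ordinals.
            for (int target : remapOrd.apply(e.getKey())) {
                result.merge(target, e.getValue(), Integer::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> recorded = new HashMap<>();
        recorded.put(5, 2);
        recorded.put(6, 3);
        // Each ordinal keeps its own count and also rolls up into ordinal 0.
        System.out.println(remapCounts(recorded, true, ord -> new int[] {ord, 0}));
    }
}
```

The point of the sketch is that the loop runs once per recorded ordinal, not once per document or per taxonomy node.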
**`TaxonomyFacetsCutter`**: switches from `children`/`siblings` to a
`parents` array walk. `remapOrd(ord)` walks from `ord` up to the dim root,
emitting every ancestor so counts accumulate at each level.
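The parent walk can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: it assumes ordinal 0 is the taxonomy root, that dimension roots have parent 0, and that the walk emits `ord` itself along with its ancestors. `ParentWalkSketch` and `ancestorsUpToDim` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a parents-array walk; not the PR's code.
final class ParentWalkSketch {

    /**
     * Returns ord followed by each ancestor, ending at the dimension root.
     * Assumes ordinal 0 is the taxonomy root and is never emitted.
     */
    static List<Integer> ancestorsUpToDim(int[] parents, int ord) {
        List<Integer> out = new ArrayList<>();
        // Stop once we reach the taxonomy root (0) or an invalid parent (-1).
        for (int cur = ord; cur > 0; cur = parents[cur]) {
            out.add(cur);
        }
        return out;
    }

    public static void main(String[] args) {
        // Layout: 0 = root, 1 = dim A, 2 = A/x, 3 = A/x/y, 4 = dim B
        int[] parents = {-1, 0, 1, 2, 0};
        System.out.println(ancestorsUpToDim(parents, 3));
        System.out.println(ancestorsUpToDim(parents, 4));
    }
}
```

Each call touches at most as many entries as the hierarchy is deep, which is where the depth factor in the cost estimate comes from.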
**`NonOverlappingLongRangeFacetCutter`**: leaf cutters now yield raw
elementary-interval ordinals. `remapOrd()` applies the `pos[]` lookup at reduce
time.
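A rough sketch of applying the lookup at reduce time, over counts that were collected per elementary interval. The `-1` sentinel for intervals that fall outside every user range is an assumption of this example, as are the names `RangeRemapSketch` and `countsPerRange`.

```java
import java.util.Arrays;

// Hypothetical illustration of a reduce-time pos[] lookup; not the PR's code.
final class RangeRemapSketch {

    /**
     * Sums elementary-interval counts into user-range counts.
     * pos[i] is the user range position for interval i, or -1 (assumed
     * sentinel) when the interval belongs to no user range.
     */
    static int[] countsPerRange(int[] intervalCounts, int[] pos, int numRanges) {
        int[] out = new int[numRanges];
        for (int i = 0; i < intervalCounts.length; i++) {
            if (intervalCounts[i] == 0) {
                continue; // interval had no hits, so no lookup is needed
            }
            int range = pos[i];
            if (range >= 0) {
                out[range] += intervalCounts[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] intervalCounts = {4, 0, 7, 1};
        int[] pos = {1, -1, 0, -1};
        System.out.println(Arrays.toString(countsPerRange(intervalCounts, pos, 2)));
    }
}
```

The lookup here runs once per interval with hits, independent of how many documents matched.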
### Performance
- **Taxonomy**: cost is now O(recorded ordinals × hierarchy depth) instead
of O(full taxonomy subtree). Sparse result sets should benefit the most.
- **Ranges**: `pos[]` lookup cost drops from O(matching documents) to
O(distinct elementary intervals with hits).
### Removed from `FacetCutter`
- `getOrdinalsToRollup()`
- `getChildrenOrds(int)`
### New helpers
- `OrdinalIterator.fromSingleOrd(int)` -- one-shot iterator over a single
ordinal; used by `remapOrd` implementations that map 1-to-1.
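For readers unfamiliar with the pattern, a one-shot iterator could look roughly like the sketch below. The `OrdIter` interface and `NO_MORE_ORDS` constant are simplified local stand-ins defined for this example, not the Lucene `OrdinalIterator` itself (whose `nextOrd()` also declares `IOException`), and the actual helper in the PR may be implemented differently.

```java
// Hypothetical stand-in for a single-ordinal iterator; not the PR's code.
final class SingleOrdSketch {

    /** Sentinel returned once the iterator is exhausted (assumed value). */
    static final int NO_MORE_ORDS = -1;

    /** Simplified local stand-in for an ordinal iterator. */
    interface OrdIter {
        int nextOrd();
    }

    /** Returns an iterator that yields ord exactly once, then NO_MORE_ORDS. */
    static OrdIter fromSingleOrd(int ord) {
        return new OrdIter() {
            private boolean consumed = false;

            @Override
            public int nextOrd() {
                if (consumed) {
                    return NO_MORE_ORDS;
                }
                consumed = true;
                return ord;
            }
        };
    }

    public static void main(String[] args) {
        OrdIter it = fromSingleOrd(7);
        for (int ord = it.nextOrd(); ord != NO_MORE_ORDS; ord = it.nextOrd()) {
            System.out.println(ord);
        }
    }
}
```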
### Benchmarks
TBD