xiangfu0 opened a new pull request, #16605:
URL: https://github.com/apache/pinot/pull/16605

   This PR adds a smart distinct count aggregator backed by UltraLogLog (ULL):

- New function: distinctCountSmartULL(expression, 'threshold=...;p=...')
  - Starts with exact set accumulation; promotes to ULL once the threshold is exceeded
  - Parameters:
    - threshold: number of distinct values that triggers promotion (default 100_000; <=0 disables promotion)
    - p: ULL precision parameter p (default CommonConstants.Helix.DEFAULT_ULTRALOGLOG_P)

Implementation details:
- pinot-core: DistinctCountSmartULLAggregationFunction (set→ULL)
- pinot-segment-spi: AggregationFunctionType.DISTINCTCOUNTSMARTULL
- pinot-core: AggregationFunctionFactory wiring
- Planner/runtime:
  - AggregationPlanNode: dictionary-based eligibility
  - NonScanBasedAggregationOperator: dictionary paths for ULL/RAWULL/SmartULL

Tests:
- Query-level tests added, mirroring existing SmartHLL coverage
- Enum recognition updated

Notes:
- Keeps parity with SmartHLL semantics; uses hash4j wyhash for ULL
- Maintains BYTES-serialized merge paths where applicable

After this lands, we can consider multi-value (MV) variants and end-to-end serialized ULL exports where useful.
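
For readers unfamiliar with the set→ULL pattern, the promotion logic described above can be sketched roughly as follows. This is not the code in this PR: the class `SmartDistinctCountSketch`, the helper `LinearCounter`, and the `fromParams` parser are hypothetical names, and a simple linear-counting bitmap stands in for hash4j's UltraLogLog so that the example has no dependencies. The default for `p` is passed in by the caller rather than assumed.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: exact set first, approximate counter once the threshold is exceeded. */
public class SmartDistinctCountSketch {

  /** Hypothetical stand-in for ULL: a linear-counting bitmap with 2^p bits (1 <= p <= 30). */
  static final class LinearCounter {
    private final int p;
    private final BitSet bits;

    LinearCounter(int p) {
      this.p = p;
      this.bits = new BitSet(1 << p);
    }

    void add(Object value) {
      int mixed = value.hashCode() * 0x9E3779B1; // cheap multiplicative mix, not wyhash
      bits.set(mixed >>> (32 - p));              // top p bits select the slot
    }

    long estimate() {
      int m = 1 << p;
      int zeros = m - bits.cardinality();
      if (zeros == 0) {
        return m; // saturated bitmap: fall back to the slot count
      }
      return Math.round(-m * Math.log((double) zeros / m));
    }
  }

  private final int threshold;
  private final int p;
  private Set<Object> exact = new HashSet<>();
  private LinearCounter approx; // null until promotion

  SmartDistinctCountSketch(int threshold, int p) {
    this.threshold = threshold;
    this.p = p;
  }

  /** Parses a 'threshold=...;p=...' string; keys that are missing keep their defaults. */
  static SmartDistinctCountSketch fromParams(String params, int defaultP) {
    int threshold = 100_000; // default stated in the PR description
    int p = defaultP;
    for (String pair : params.split(";")) {
      String[] kv = pair.trim().split("=", 2);
      if (kv.length != 2) {
        continue; // skip empty or malformed entries
      }
      String key = kv[0].trim().toLowerCase();
      int value = Integer.parseInt(kv[1].trim());
      if (key.equals("threshold")) {
        threshold = value;
      } else if (key.equals("p")) {
        p = value;
      }
    }
    return new SmartDistinctCountSketch(threshold, p);
  }

  void add(Object value) {
    if (approx != null) {
      approx.add(value);
      return;
    }
    exact.add(value);
    // threshold <= 0 disables promotion, so the set stays exact.
    if (threshold > 0 && exact.size() > threshold) {
      approx = new LinearCounter(p);
      for (Object v : exact) {
        approx.add(v);
      }
      exact = null; // release the exact set after promotion
    }
  }

  boolean isPromoted() {
    return approx != null;
  }

  int threshold() {
    return threshold;
  }

  long count() {
    return approx != null ? approx.estimate() : exact.size();
  }
}
```

With `threshold=3`, the first three distinct values are counted exactly and the fourth triggers promotion; with `threshold=0` the aggregator stays exact regardless of cardinality.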


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

