xiangfu0 commented on code in PR #18724:
URL: https://github.com/apache/pinot/pull/18724#discussion_r3439643615
##########
pinot-core/src/main/java/org/apache/pinot/core/common/MinionConstants.java:
##########
@@ -151,6 +151,16 @@ public static abstract class MergeTask {
// Tasks can take use of this field to coordinate with the merge task. By
default, segment is safe
// to merge, so existing segments w/o this field can be merged just as
before.
public static final String SEGMENT_ZK_METADATA_SHOULD_NOT_MERGE_KEY =
"shouldNotMerge";
+
+ /// Aggregation types supported by the rollup reducer (see
ValueAggregatorFactory in pinot-core). Used to reject
+ /// parseable but unsupported aggregation types at table config validation
time instead of at task runtime.
+ public static final EnumSet<AggregationFunctionType>
AVAILABLE_CORE_VALUE_AGGREGATORS =
Review Comment:
Done in e95e7df565 — moved `AVAILABLE_CORE_VALUE_AGGREGATORS` out of the
abstract `MergeTask` base into `MergeRollupTask`.
`RealtimeToOfflineSegmentsTask` references it for its own rollup merge path
(noted in the constant's javadoc), since it runs the same rollup aggregation
when configured with rollup merge type.
_🤖 Addressed by [Claude Code](https://claude.com/claude-code)_
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/aggregator/ValueAggregatorFactory.java:
##########
@@ -30,12 +30,19 @@ public class ValueAggregatorFactory {
private ValueAggregatorFactory() {
}
- /**
- * Constructs a ValueAggregator from the given aggregation type.
- *
- * When adding entries to this please add them to the Set in
org.apache.pinot.segment.local.utils.TableConfigUtils
- * named AVAILABLE_CORE_VALUE_AGGREGATORS so that they can be used in
RealtimeToOfflineTask
- */
+ /// Returns `true` if the given aggregation type is order sensitive, i.e. it
picks a value based on the original
+ /// (pre-rounding) time order of the rows instead of combining the values,
and therefore requires the rollup rows
+ /// to be sorted on the hidden original time column within each rollup group.
+ public static boolean isOrderSensitive(AggregationFunctionType
aggregationType) {
Review Comment:
Done in e95e7df565 — renamed `isOrderSensitive` to `requiresTimeOrdering`
and updated all call sites (`RollupReducer`, `SegmentProcessorConfig`,
`MergeTaskUtils`).
_🤖 Addressed by [Claude Code](https://claude.com/claude-code)_
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorConfig.java:
##########
@@ -123,6 +124,22 @@ public Map<String, Map<String, String>>
getAggregationFunctionParameters() {
return _aggregationFunctionParameters;
}
+ /// Returns whether the rollup requires the original (pre-rounding) time
values to be preserved as an extra sort
+ /// field in the intermediate files. This is the case when any of the
configured aggregation types is order
+ /// sensitive (FIRSTWITHTIME/LASTWITHTIME), which picks the value based on
the original time order instead of
+ /// combining the values.
+ public boolean requiresOriginalTimeOrdering() {
+ if (_mergeType != MergeType.ROLLUP || _timeHandlerConfig.getType() !=
TimeHandler.Type.EPOCH) {
Review Comment:
Done in e95e7df565 — changed the guard to `== TimeHandler.Type.NO_OP` so it
is no longer pinned to `EPOCH` and any real time-handler type opts into
original-time ordering. (Note: a future `DATE_TIME` handler will need to
populate the hidden original-time field like `EpochTimeHandler` does; the
`RollupReducer` guard throws loudly if an order-sensitive aggregation runs
without it.)
_🤖 Addressed by [Claude Code](https://claude.com/claude-code)_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]