xiangfu0 commented on code in PR #18724:
URL: https://github.com/apache/pinot/pull/18724#discussion_r3439652826
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/aggregator/ValueAggregatorFactory.java:
##########
@@ -30,12 +30,19 @@ public class ValueAggregatorFactory {
private ValueAggregatorFactory() {
}
- /**
- * Constructs a ValueAggregator from the given aggregation type.
- *
- * When adding entries to this please add them to the Set in
org.apache.pinot.segment.local.utils.TableConfigUtils
- * named AVAILABLE_CORE_VALUE_AGGREGATORS so that they can be used in
RealtimeToOfflineTask
- */
+ /// Returns `true` if the given aggregation type is order sensitive, i.e. it
picks a value based on the original
+ /// (pre-rounding) time order of the rows instead of combining the values,
and therefore requires the rollup rows
+ /// to be sorted on the hidden original time column within each rollup group.
+ public static boolean isOrderSensitive(AggregationFunctionType
aggregationType) {
Review Comment:
Fixed in e95e7df565 — renamed `isOrderSensitive` → `requiresTimeOrdering`
and updated all call sites (`SegmentProcessorConfig`, `RollupReducer`,
`MergeTaskUtils`).
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentProcessorConfig.java:
##########
@@ -123,6 +124,22 @@ public Map<String, Map<String, String>>
getAggregationFunctionParameters() {
return _aggregationFunctionParameters;
}
+ /// Returns whether the rollup requires the original (pre-rounding) time
values to be preserved as an extra sort
+ /// field in the intermediate files. This is the case when any of the
configured aggregation types is order
+ /// sensitive (FIRSTWITHTIME/LASTWITHTIME), which picks the value based on
the original time order instead of
+ /// combining the values.
+ public boolean requiresOriginalTimeOrdering() {
+ if (_mergeType != MergeType.ROLLUP || _timeHandlerConfig.getType() !=
TimeHandler.Type.EPOCH) {
Review Comment:
Fixed in e95e7df565 — `requiresOriginalTimeOrdering()` now guards on `type
== NO_OP` instead of `!= EPOCH`, so original-time ordering stays enabled for
all non-`NO_OP` time-handler types rather than being pinned to `EPOCH`.
##########
pinot-core/src/main/java/org/apache/pinot/core/common/MinionConstants.java:
##########
@@ -151,6 +151,16 @@ public static abstract class MergeTask {
// Tasks can take use of this field to coordinate with the merge task. By
default, segment is safe
// to merge, so existing segments w/o this field can be merged just as
before.
public static final String SEGMENT_ZK_METADATA_SHOULD_NOT_MERGE_KEY =
"shouldNotMerge";
+
+ /// Aggregation types supported by the rollup reducer (see
ValueAggregatorFactory in pinot-core). Used to reject
+ /// parseable but unsupported aggregation types at table config validation
time instead of at task runtime.
+ public static final EnumSet<AggregationFunctionType>
AVAILABLE_CORE_VALUE_AGGREGATORS =
Review Comment:
Fixed in e95e7df565 — moved `AVAILABLE_CORE_VALUE_AGGREGATORS` from the
abstract `MergeTask` base into `MergeRollupTask`.
`RealtimeToOfflineSegmentsTaskGenerator` now references
`MergeRollupTask.AVAILABLE_CORE_VALUE_AGGREGATORS` for its own rollup merge
path (it runs the same rollup aggregation), documented on the constant. Happy
to give RTO a separate copy instead if you would prefer to avoid the
cross-reference.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]