Copilot commented on code in PR #18724:
URL: https://github.com/apache/pinot/pull/18724#discussion_r3417580256
##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java:
##########
@@ -510,6 +511,24 @@ public void validateTaskConfigs(TableConfig tableConfig,
Schema schema, Map<Stri
Preconditions.checkState(columnNames.contains(dimension), "Column
dimension to erase \"" + dimension
+ "\" not found in schema!");
}
+ // check no mis-configured aggregation types
+ for (Map.Entry<String, String> entry : taskConfigs.entrySet()) {
+ if (entry.getKey().endsWith(MergeTask.AGGREGATION_TYPE_KEY_SUFFIX)) {
+ String column = StringUtils.removeEnd(entry.getKey(),
MergeTask.AGGREGATION_TYPE_KEY_SUFFIX);
+ try {
Review Comment:
`validateTaskConfigs` validates aggregation types but does not verify that
the configured column exists in the schema. A typo like
`missingCol.aggregationType=sum` would currently pass validation (and then be
silently ignored at runtime for non-order-sensitive types). Add a schema
existence check for the derived `column` before parsing/allowlisting the
aggregation type.
##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/MergeTaskUtils.java:
##########
@@ -137,6 +139,31 @@ public static Map<String, AggregationFunctionType>
getAggregationTypes(Map<Strin
return aggregationTypes;
}
+ /// Validates the prerequisites for an order sensitive aggregation
(FIRSTWITHTIME/LASTWITHTIME) on the given
+ /// column: the column must be a metric column in the schema, and the table
must have a time column with a DateTime
+ /// spec in the schema so that the reducer can order the values by the
original time. No-op for other aggregation
+ /// types. The given aggregation type must be parseable (see
+ /// [AggregationFunctionType#getAggregationFunctionType(String)]).
+ public static void validateOrderSensitiveAggregation(TableConfig
tableConfig, Schema schema, String column,
+ String aggregationType) {
+ AggregationFunctionType aggregationFunctionType =
AggregationFunctionType.getAggregationFunctionType(
+ aggregationType);
+ if (!ValueAggregatorFactory.isOrderSensitive(aggregationFunctionType)) {
+ return;
+ }
+ FieldSpec fieldSpec = schema.getFieldSpecFor(column);
+ Preconditions.checkState(fieldSpec != null && fieldSpec.getFieldType() ==
FieldSpec.FieldType.METRIC,
+ "Aggregation type: %s on column: %s requires the column to be a metric
column in schema!", aggregationType,
+ column);
Review Comment:
`validateOrderSensitiveAggregation` uses a combined check (`fieldSpec !=
null && fieldSpec.getFieldType() == METRIC`) but the error message only
mentions “metric column”. When the column is missing from the schema, this
message is misleading and makes config debugging harder. Split this into two
`checkState` calls so missing-column and wrong-field-type are reported
accurately.
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/genericrow/GenericRowFileRecordReader.java:
##########
@@ -94,6 +94,12 @@ public int compare(int rowId1, int rowId2) {
return _fileReader.compare(_sortedRowIds[rowId1], _sortedRowIds[rowId2]);
}
+ /// Compares the records at the given row ids. Only compare the values for
the first `numFieldsToCompare` fields.
+ public int compare(int rowId1, int rowId2, int numFieldsToCompare) {
+ assert _sortedRowIds != null;
+ return _fileReader.compare(_sortedRowIds[rowId1], _sortedRowIds[rowId2],
numFieldsToCompare);
+ }
Review Comment:
The new `compare(rowId1, rowId2, numFieldsToCompare)` relies on `assert
_sortedRowIds != null;`. Assertions are typically disabled in production, so
this can devolve into a confusing NPE if the reader is used before sorting.
Prefer an explicit runtime check that throws a clear exception (and consider
aligning `compare(int,int)` similarly).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]