noob-se7en commented on code in PR #18549:
URL: https://github.com/apache/pinot/pull/18549#discussion_r3290888037
##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/SegmentValidationUtils.java:
##########
@@ -57,6 +70,66 @@ public static void validateTimeInterval(SegmentMetadata
segmentMetadata, TableCo
}
}
+ public static void validateUpsertSegmentPartitionMetadata(SegmentMetadata
segmentMetadata, TableConfig tableConfig) {
+ if (tableConfig.getTableType() != TableType.OFFLINE ||
!tableConfig.isUpsertEnabled()) {
+ return;
+ }
+
+ String partitionColumn = getPartitionColumn(tableConfig);
+ if (StringUtils.isEmpty(partitionColumn)) {
+ return;
+ }
+
+ ColumnMetadata columnMetadata =
segmentMetadata.getColumnMetadataFor(partitionColumn);
+ Set<Integer> partitions = columnMetadata != null ?
columnMetadata.getPartitions() : null;
+ if (partitions == null || partitions.size() != 1) {
+ throw new ControllerApplicationException(LOGGER,
+ String.format("Uploaded segment: %s for offline upsert table: %s
must contain exactly one partition id for "
Review Comment:
Pinot convention #14404 avoids `String.format` in exception messages even
outside hot paths, since the pattern gets adopted by callers later. consider
plain concatenation: `"Uploaded segment: " + segmentMetadata.getName() + " for
offline upsert table: " + tableConfig.getTableName() + " must contain exactly
one partition id for column: " + partitionColumn + ", got: " + partitions`.
##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/SegmentValidationUtils.java:
##########
@@ -57,6 +70,66 @@ public static void validateTimeInterval(SegmentMetadata
segmentMetadata, TableCo
}
}
+ public static void validateUpsertSegmentPartitionMetadata(SegmentMetadata
segmentMetadata, TableConfig tableConfig) {
+ if (tableConfig.getTableType() != TableType.OFFLINE ||
!tableConfig.isUpsertEnabled()) {
Review Comment:
the table-type guard skips realtime upsert, but realtime upsert tables
receive uploads via backfill / segment-replace / minion refresh, and a
multi-partition segment in those flows breaks upsert the same way. is realtime
exclusion intentional? extending coverage isn't quite a one-line drop of the
guard since `getPartitionColumn` only reads
`instanceAssignmentConfigMap[OFFLINE]`; realtime resolution needs `[CONSUMING]`
(and likely `[COMPLETED]`) too. worth either widening scope or adding a code
comment justifying the offline-only restriction.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]