LakshSingla commented on code in PR #16175:
URL: https://github.com/apache/druid/pull/16175#discussion_r1545966059

##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -2036,19 +2045,24 @@ private static boolean isTimeBucketedIngestion(final MSQSpec querySpec)
    * Compute shard columns for {@link DimensionRangeShardSpec}. Returns an empty list if range-based sharding
    * is not applicable.
    */
-  private static List<String> computeShardColumns(
+  private static Pair<List<String>, String> computeShardColumns(
       final RowSignature signature,
       final ClusterBy clusterBy,
-      final ColumnMappings columnMappings
+      final ColumnMappings columnMappings,
+      boolean mayHaveMultiValuedClusterByFields
   )
   {
+    if (mayHaveMultiValuedClusterByFields) {
+      // DimensionRangeShardSpec cannot handle multi-valued fields.
+      return Pair.of(Collections.emptyList(), "Cannot use RangeShardSpec, the fields in the CLUSTER BY clause contains a multivalues. Using NumberedShardSpec instead.");

Review Comment:
   nit: grammar
   Also, if it's possible to pinpoint the multi-valued fields without much refactoring, then we can mention them here.
   ```suggestion
         return Pair.of(Collections.emptyList(), "Cannot use RangeShardSpec, the fields in the CLUSTERED BY clause contains multivalues in column [%s]. Using NumberedShardSpec instead.");
   ```

##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -2036,19 +2045,24 @@ private static boolean isTimeBucketedIngestion(final MSQSpec querySpec)
    * Compute shard columns for {@link DimensionRangeShardSpec}. Returns an empty list if range-based sharding
    * is not applicable.
    */
-  private static List<String> computeShardColumns(
+  private static Pair<List<String>, String> computeShardColumns(
       final RowSignature signature,
       final ClusterBy clusterBy,
-      final ColumnMappings columnMappings
+      final ColumnMappings columnMappings,
+      boolean mayHaveMultiValuedClusterByFields
   )
   {
+    if (mayHaveMultiValuedClusterByFields) {
+      // DimensionRangeShardSpec cannot handle multi-valued fields.
+      return Pair.of(Collections.emptyList(), "Cannot use RangeShardSpec, the fields in the CLUSTER BY clause contains a multivalues. Using NumberedShardSpec instead.");
+    }
     final List<KeyColumn> clusterByColumns = clusterBy.getColumns();
     final List<String> shardColumns = new ArrayList<>();
     final boolean boosted = isClusterByBoosted(clusterBy);
     final int numShardColumns = clusterByColumns.size() - clusterBy.getBucketByCount() - (boosted ? 1 : 0);

     if (numShardColumns == 0) {
-      return Collections.emptyList();
+      return Pair.of(Collections.emptyList(), "Cannot use RangeShardSpec, as there are no shardColumns. Using NumberedShardSpec instead.");

Review Comment:
   What happens if the user doesn't supply a CLUSTERED BY clause? In that case, the reason doesn't seem necessary, or it can be reworded.
   ```suggestion
         return Pair.of(Collections.emptyList(), "Using NumberedShardSpec as no columns are supplied in the 'CLUSTERED BY' clause.");
   ```

##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -2057,25 +2071,25 @@ private static List<String> computeShardColumns(
       // DimensionRangeShardSpec only handles ascending order.
       if (column.order() != KeyOrder.ASCENDING) {
-        return Collections.emptyList();
+        return Pair.of(Collections.emptyList(), "Cannot use RangeShardSpec, RangedShardSpec only supports ascending CLUSTER BY keys. Using NumberedShardSpec instead.");
       }

       ColumnType columnType = signature.getColumnType(column.columnName()).orElse(null);

       // DimensionRangeShardSpec only handles strings.
       if (!(ColumnType.STRING.equals(columnType))) {
-        return Collections.emptyList();
+        return Pair.of(Collections.emptyList(), "Cannot use RangeShardSpec, RangedShardSpec only supports string CLUSTER BY keys. Using NumberedShardSpec instead.");
       }

       // DimensionRangeShardSpec only handles columns that appear as-is in the output.
       if (outputColumns.isEmpty()) {
-        return Collections.emptyList();
+        return Pair.of(Collections.emptyList(), "Cannot use RangeShardSpec, RangeShardSpec only supports columns that appear as-is in the output. Using NumberedShardSpec instead.");

Review Comment:
   What does "as-is" mean?

##########
docs/api-reference/sql-ingestion-api.md:
##########
@@ -299,7 +299,7 @@ The response shows an example report for a query.
           },
           "pendingTasks": 0,
           "runningTasks": 2,
-          "segmentLoadStatus": {
+          "segmentLoadWaiterStatus": {

Review Comment:
   This is a correction to the doc, right (and not a backward-incompatible API change)?
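
The first suggestion above leaves the `[%s]` placeholder unfilled, so here is a minimal sketch of how the column names could be substituted into the reason string. It is not the PR's code: the class `ShardColumnsSketch`, the `multiValuedColumns` parameter (a hypothetical replacement for the boolean `mayHaveMultiValuedClusterByFields`), and the use of `Map.Entry` as a stand-in for Druid's `Pair` are all assumptions made so the example runs on its own. The message texts are copied from the suggestions in this review.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical, self-contained stand-in for the logic under review.
class ShardColumnsSketch
{
  /**
   * Returns (shard columns, reason). The list is empty when range-based sharding is not applicable.
   * Map.Entry is used here only as a stand-in for Druid's Pair.
   */
  static Map.Entry<List<String>, String> computeShardColumns(
      final List<String> clusterByColumns,
      final List<String> multiValuedColumns // assumption: the caller can name the offending columns
  )
  {
    if (!multiValuedColumns.isEmpty()) {
      // Fill the [%s] placeholder with the offending column names.
      return new SimpleImmutableEntry<>(
          Collections.<String>emptyList(),
          String.format(
              "Cannot use RangeShardSpec, the fields in the CLUSTERED BY clause contains multivalues in column [%s]. "
              + "Using NumberedShardSpec instead.",
              String.join(", ", multiValuedColumns)
          )
      );
    }

    if (clusterByColumns.isEmpty()) {
      return new SimpleImmutableEntry<>(
          Collections.<String>emptyList(),
          "Using NumberedShardSpec as no columns are supplied in the 'CLUSTERED BY' clause."
      );
    }

    // Range sharding is applicable in this simplified sketch; no fallback reason is needed.
    return new SimpleImmutableEntry<>(clusterByColumns, "");
  }

  public static void main(String[] args)
  {
    Map.Entry<List<String>, String> result =
        computeShardColumns(Arrays.asList("country", "tags"), Arrays.asList("tags"));
    System.out.println(result.getValue());
  }
}
```

Passing the column names rather than a boolean is what makes the message specific; whether that is feasible "without much refactoring" depends on where the multi-value information is detected in the real code.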