LakshSingla commented on code in PR #13352:
URL: https://github.com/apache/druid/pull/13352#discussion_r1021064001
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/QueryValidator.java:
##########
@@ -55,6 +57,15 @@ public static void validateQueryDef(final QueryDefinition
queryDef)
throw new ISE("Number of workers must be greater than 0");
}
}
+
+ // Check if the number of columns in the query's CLUSTERED BY clause donot
exceed the limit
+ ClusterBy queryClusteredBy =
queryDef.getFinalStageDefinition().getClusterBy();
Review Comment:
The cluster by columns in the earlier stages might not have a 1:1
correspondence with the query that the user has written therefore raising a
cluster by error, in that case, shouldn't be actionable for the user IMO. Hence
I only added the limit in the final stage (the original query that the user has
written). Along with the Sequential merge mode on, I think that there should be
enough guard rails in place to prevent an OOM.
However we can add a limit on the cluster by in the other stages if we
rephrase the error message as something like "Enough grouping keys present in
stage [xx], the query might OOM". Those cluster by keys can correspond to
something present in the group by clause for example. WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]