LakshSingla commented on code in PR #16854:
URL: https://github.com/apache/druid/pull/16854#discussion_r1722960087


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/querykit/WindowOperatorQueryKit.java:
##########
@@ -360,4 +354,36 @@ private QueryDefinitionBuilder makeQueryDefinitionBuilder(String queryId, DataSo
     }
     return queryDefBuilder;
   }
+
+  /**
+   * Computes the ClusterBy for the final window stage which may or may not have the partition boosted column,
+   * depending on the {@code segmentGranularity} parameter passed. We don't have to take the CLUSTERED BY
+   * columns into account, as they are handled as {@link org.apache.druid.query.scan.ScanQuery#orderBys}.
+   */
+  private static ClusterBy computeClusterByForFinalWindowStage(Granularity segmentGranularity)

Review Comment:
   GroupBy is different because it also aggregates: when the partitioning is on 
the grouping dimensions, each distinct key collapses to a single row, so the 
chance of ending up with a large partition is negligible (I haven't seen 
anything like that in practice). That holds as long as the post-shuffle spec 
equals the pre-shuffle spec. Large partitions can occur when the two differ, 
for example with GROUP BY + ORDER BY. The window and scan query kits don't 
aggregate, so they do have to worry about partition sizes.
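
   To make the argument above concrete, here is a minimal toy sketch, not Druid 
code. It assumes the simplest possible model: rows are shuffled on a single 
string key and each distinct key value forms its own partition. The class and 
method names (PartitionSkewSketch, rowsPerKeyWithoutAggregation, 
rowsPerKeyWithAggregation, largest) are hypothetical and exist only for this 
illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (an assumption, not Druid's implementation): one partition per distinct shuffle key.
public class PartitionSkewSketch {

  // Scan/window style: every input row survives the shuffle, so a key's
  // partition holds as many rows as the key occurs in the input.
  static Map<String, Integer> rowsPerKeyWithoutAggregation(List<String> keys) {
    Map<String, Integer> rows = new HashMap<>();
    for (String key : keys) {
      rows.merge(key, 1, Integer::sum);
    }
    return rows;
  }

  // Group-by style, shuffling on the grouping key itself: all rows for a key
  // are aggregated into one output row, regardless of how often the key occurs.
  static Map<String, Integer> rowsPerKeyWithAggregation(List<String> keys) {
    Map<String, Integer> rows = new HashMap<>();
    for (String key : keys) {
      rows.put(key, 1);
    }
    return rows;
  }

  // Size of the largest partition in this toy model.
  static int largest(Map<String, Integer> rowsPerKey) {
    int max = 0;
    for (int n : rowsPerKey.values()) {
      max = Math.max(max, n);
    }
    return max;
  }

  public static void main(String[] args) {
    // A skewed input: the key "hot" dominates.
    List<String> keys = List.of("hot", "hot", "hot", "hot", "cold");
    System.out.println("scan/window largest partition: " + largest(rowsPerKeyWithoutAggregation(keys)));
    System.out.println("group by largest partition:    " + largest(rowsPerKeyWithAggregation(keys)));
  }
}
```

   In this model the skewed key inflates the partition only when nothing is 
aggregated, which is the case the window and scan kits have to guard against.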



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

