gianm commented on code in PR #16911:
URL: https://github.com/apache/druid/pull/16911#discussion_r1728013632


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/kernel/ShuffleSpec.java:
##########
@@ -68,4 +70,17 @@ public interface ShuffleSpec
    * @throws IllegalStateException if kind is {@link ShuffleKind#GLOBAL_SORT} 
with more than one target partition
    */
   int partitionCount();
+
+  /**
+   * Limit that can be applied during shuffling. This is provided to enable 
performance optimizations.
+   *
+   * Implementations may apply this limit to each partition individually, or 
may apply it to the entire resultset
+   * (across all partitions). Either approach is valid, so downstream logic 
must handle either one.
+   *
+   * Implementations may also ignore this hint completely, or may apply a 
limit that is somewhat higher than this hint.
+   */
+  default long limitHint()

Review Comment:
   I don't think offset can be pushed down? Limit can be pushed down when 
sorting because if some row is in the top N globally, it must also be in the 
first N rows of whichever partition it appears in.
   
   But with offset, if for example you have `LIMIT N OFFSET M`, we can't push 
down `OFFSET M` (i.e. skip `M` rows). It is possible that some of the first `M` 
rows in partition A still need to appear in the final resultset, perhaps 
because some of them are greater than any of the first `M` rows in the 
globally-sorted result. So the best we can do is push down `LIMIT N + M`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to