Re: [PR] HIVE-29322: Avoid TopNKeyOperator When ReduceSink TopNkey Filtering Provides Better Pruning for ORDER BY LIMIT Queries [hive]

via GitHub Wed, 07 Jan 2026 04:03:56 -0800


zabetak commented on code in PR #6202:
URL: https://github.com/apache/hive/pull/6202#discussion_r2668171523



##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyProcessor.java:
##########
@@ -111,6 +123,50 @@ public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
     return null;
   }
 
+  /**
+   * Returns true if the ReduceSink is only under an ORDER BY + LIMIT plan
+   * and has no GroupBy or Join operators in its upstream ancestry.
+   * This is used to disable TopNKey for pure ORDER BY LIMIT queries where
+   * LIMIT pushdown must take precedence.
+   */
+  public static boolean isOrderByLimitPath(ReduceSinkOperator rs) {

Review Comment:
   Thanks for running the experiments. Ideally, we would have a theoretical
   argument showing whether the TopNKey optimization is useful for simple plan
   branches (no GBY, JOIN, etc.), similar to the analysis we did in the JIRA
   ticket for the ORDER BY LIMIT use case.
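
For context, the upstream-ancestry part of the check described in the Javadoc above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Op`, `Kind`, and `hasNoGroupByOrJoinUpstream` are hypothetical stand-ins, not Hive's operator classes and not the PR's implementation, and the sketch does not cover the "only under an ORDER BY + LIMIT plan" half of the contract.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OrderByLimitPathSketch {

  // Hypothetical operator kinds; Hive models these as separate Operator subclasses.
  enum Kind { TABLE_SCAN, SELECT, FILTER, GROUP_BY, JOIN, REDUCE_SINK }

  // Hypothetical stand-in for a plan node that knows its parent (upstream) operators.
  static final class Op {
    final Kind kind;
    final List<Op> parents = new ArrayList<>();

    Op(Kind kind, Op... parents) {
      this.kind = kind;
      this.parents.addAll(Arrays.asList(parents));
    }
  }

  // Walks every ancestor of rs and returns false as soon as a GROUP_BY or JOIN is found.
  static boolean hasNoGroupByOrJoinUpstream(Op rs) {
    Deque<Op> pending = new ArrayDeque<>(rs.parents);
    Set<Op> seen = new HashSet<>();
    while (!pending.isEmpty()) {
      Op op = pending.pop();
      if (!seen.add(op)) {
        continue; // already visited through another branch
      }
      if (op.kind == Kind.GROUP_BY || op.kind == Kind.JOIN) {
        return false;
      }
      pending.addAll(op.parents);
    }
    return true;
  }

  public static void main(String[] args) {
    Op scan = new Op(Kind.TABLE_SCAN);
    Op simpleRs = new Op(Kind.REDUCE_SINK, new Op(Kind.SELECT, scan));
    Op groupedRs = new Op(Kind.REDUCE_SINK, new Op(Kind.GROUP_BY, scan));
    System.out.println(hasNoGroupByOrJoinUpstream(simpleRs));
    System.out.println(hasNoGroupByOrJoinUpstream(groupedRs));
  }
}
```

A structural check like this only identifies the "simple branch" case; whether TopNKey actually helps on such branches is the open question raised above.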



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


