zabetak commented on code in PR #6202:
URL: https://github.com/apache/hive/pull/6202#discussion_r2668171523
##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyProcessor.java:
##########
@@ -111,6 +123,50 @@ public Object process(Node nd, Stack<Node> stack,
NodeProcessorCtx procCtx,
return null;
}
+ /**
+ * Returns true if the ReduceSink is only under an ORDER BY + LIMIT plan
+ * and has no GroupBy or Join operators in its upstream ancestry.
+ * This is used to disable TopNKey for pure ORDER BY LIMIT queries where
+ * LIMIT pushdown must take precedence.
+ */
+ public static boolean isOrderByLimitPath(ReduceSinkOperator rs) {
Review Comment:
Thanks for running the experiments. Ideally, we would also have a theoretical
argument showing whether TopNKeyOptimization is useful for simple plan
branches (no GBY, JOIN, etc.), similar to the analysis we did in the JIRA
ticket for the ORDER BY LIMIT use case.
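
For reference, the "no GroupBy or Join operators in its upstream ancestry"
condition from the Javadoc above could be checked roughly as in the sketch
below. This is only an illustration, not the code in the PR: `Op`, `Kind`, and
`hasNoGroupByOrJoinUpstream` are hypothetical stand-ins for Hive's operator
classes, and the sketch covers only the upstream walk, not the ORDER BY + LIMIT
detection downstream of the ReduceSink.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OrderByLimitPathSketch {

    /** Hypothetical operator kinds; the real plan uses Hive operator classes. */
    public enum Kind { TABLE_SCAN, SELECT, FILTER, GROUP_BY, JOIN, REDUCE_SINK }

    /** Hypothetical minimal operator node that only knows its parents. */
    public static final class Op {
        final Kind kind;
        final List<Op> parents = new ArrayList<>();

        public Op(Kind kind, Op... parents) {
            this.kind = kind;
            for (Op p : parents) {
                this.parents.add(p);
            }
        }
    }

    /**
     * Walks every ancestor of {@code start} and returns false as soon as a
     * GROUP_BY or JOIN is found. The start operator itself is not inspected.
     */
    public static boolean hasNoGroupByOrJoinUpstream(Op start) {
        Deque<Op> pending = new ArrayDeque<>(start.parents);
        Set<Op> seen = new HashSet<>();
        while (!pending.isEmpty()) {
            Op op = pending.pop();
            if (!seen.add(op)) {
                continue; // already visited through another branch
            }
            if (op.kind == Kind.GROUP_BY || op.kind == Kind.JOIN) {
                return false;
            }
            pending.addAll(op.parents);
        }
        return true;
    }
}
```

A plan branch such as TS -> FIL -> SEL -> RS would pass this check, while
TS -> GBY -> RS would not, so only the former would be treated as a simple
branch in the sense discussed here.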
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]