Re: [PR] HIVE-29322: Avoid TopNKeyOperator When Map-Side LIMIT Pushdown Provides Better Pruning for ORDER BY LIMIT Queries [hive]

via GitHub Tue, 09 Dec 2025 00:08:07 -0800


okumin commented on code in PR #6202:
URL: https://github.com/apache/hive/pull/6202#discussion_r2601529807



##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyProcessor.java:
##########
@@ -71,6 +71,13 @@ public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
       return null;
     }
 
+    // Skip the current optimization when a simple global ORDER BY...LIMIT is present
+    // (topN > -1 and hasOnlyOrderByLimit()).
+    // This plan structure is handled more efficiently by the specialized 'TopN In Reducer' optimization.
+    if (reduceSinkDesc.getTopN() > -1 && reduceSinkDesc.hasOnlyOrderByLimit()) {

Review Comment:
   I'm also curious about the two questions in [the JIRA ticket](https://issues.apache.org/jira/browse/HIVE-29322). I don't have an immediate answer on whether it is optimal to reuse Top-N for the global sort. My impression, however, is that it is suboptimal but might not be that bad.
   Disclaimer: My next response might not be timely because I have to visit a hospital every day.
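
   For context on the trade-off being discussed, here is a minimal sketch of the general top-N pruning idea behind an ORDER BY ... LIMIT n: keep at most n rows in a bounded heap and drop everything else before it is forwarded. This is an illustration only, not Hive's implementation; the class `TopNSketch` and the method `topN` are hypothetical names and do not correspond to `TopNKeyOperator` or to the reduce-sink code touched by this PR.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; not part of Hive.
public class TopNSketch {

    // Returns the n smallest rows under 'order', sorted ascending,
    // which is what ORDER BY <key> LIMIT n would produce.
    public static <T> List<T> topN(Iterable<T> rows, int n, Comparator<T> order) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Max-heap with respect to 'order': the head is the worst row retained so far.
        PriorityQueue<T> heap = new PriorityQueue<>(n, order.reversed());
        for (T row : rows) {
            if (heap.size() < n) {
                heap.add(row);
            } else if (order.compare(row, heap.peek()) < 0) {
                // The new row beats the current worst retained row, so replace it.
                heap.poll();
                heap.add(row);
            }
            // Otherwise the row cannot be in the final result and is pruned here.
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(order);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(5, 1, 4, 2, 3);
        System.out.println(topN(rows, 2, Comparator.<Integer>naturalOrder())); // prints [1, 2]
    }
}
```

   In this sketch memory stays bounded at n rows however large the input is. Applying the same pruning at two different points in a plan would filter the same rows twice, which is one way to read the "suboptimal but not that bad" impression above.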



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

