wangyum commented on a change in pull request #35287:
URL: https://github.com/apache/spark/pull/35287#discussion_r791255798
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -724,6 +724,16 @@ object LimitPushDown extends Rule[LogicalPlan] {
Limit(le, Project(a.output, LocalLimit(le, a.child)))
case Limit(le @ IntegerLiteral(1), p @ Project(_, a: Aggregate)) if
a.groupOnly =>
Limit(le, p.copy(child = Project(a.output, LocalLimit(le, a.child))))
+ // Only push down when the limit number is less than or equal to 5000,
+ // which is the default_limit value of Hue.
+ case Limit(le @ IntegerLiteral(limit), a @ Aggregate(_, _, child))
+ if a.groupOnly && child.maxRowsPerPartition.forall(_ > limit) &&
+ limit <= math.min(conf.topKSortFallbackThreshold, 5000) =>
Review comment:
The maximum threshold is capped at 5000 so that the push-down covers the default row limit that most SQL editors apply, for example Hue's default_limit:
https://github.com/cloudera/hue/blob/release-4.10.0/desktop/conf.dist/hue.ini#L948-L949
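
For illustration, the guard in the diff above can be modeled as a plain predicate. This is a minimal sketch, not the rule itself: `LimitPushDownGuard`, `shouldPushDown`, and `HueDefaultLimit` are hypothetical names, and the plan properties (`groupOnly`, `maxRowsPerPartition`) and the `topKSortFallbackThreshold` setting are passed in as ordinary parameters instead of being read from a `LogicalPlan` or `SQLConf`.

```scala
// Hypothetical stand-alone model of the guard in the diff; not Spark's actual API.
object LimitPushDownGuard {
  // Cap taken from the diff comment: Hue's default_limit value.
  val HueDefaultLimit: Int = 5000

  def shouldPushDown(
      limit: Int,
      groupOnly: Boolean,                // aggregate has grouping keys only
      maxRowsPerPartition: Option[Long], // None means the bound is unknown
      topKSortFallbackThreshold: Int): Boolean = {
    groupOnly &&
      // Push down only if the child may hold more rows per partition than the limit.
      maxRowsPerPartition.forall(_ > limit) &&
      // The effective cap is the smaller of the configured threshold and 5000.
      limit <= math.min(topKSortFallbackThreshold, HueDefaultLimit)
  }
}
```

Under this model, a limit of 5000 passes and 5001 does not when the configured threshold is larger than 5000. A threshold set below 5000 lowers the cap accordingly.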
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]