cloud-fan commented on code in PR #40811:
URL: https://github.com/apache/spark/pull/40811#discussion_r1170717475
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala:
##########
@@ -254,13 +254,20 @@ object SubExprUtils extends PredicateHelper {
* scalar subquery during planning.
*
* Note: `exprId` is used to have a unique name in explain string output.
+ *
+ * `mayHaveCountBug` is whether it's possible for the subquery to evaluate to non-null on
+ * empty input (zero tuples). It is false if the subquery has a GROUP BY clause, because in that
+ * case the subquery yields no row at all on empty input to the GROUP BY, which evaluates to NULL.
+ * It is set in PullupCorrelatedPredicates to true/false, before it is set its value is None.
+ * See constructLeftJoins in RewriteCorrelatedScalarSubquery for more details.
*/
case class ScalarSubquery(
plan: LogicalPlan,
outerAttrs: Seq[Expression] = Seq.empty,
exprId: ExprId = NamedExpression.newExprId,
joinCond: Seq[Expression] = Seq.empty,
- hint: Option[HintInfo] = None)
+ hint: Option[HintInfo] = None,
+ mayHaveCountBug: Option[Boolean] = None)
Review Comment:
How about a name that is easier to understand, such as `hasGlobalAggregate`?
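
For context, here is a minimal sketch of the behavior the doc comment describes, written against plain Scala collections rather than Catalyst. Everything in it (`CountBugSketch`, `outerKeys`, `innerKeys`, `globalCount`, `groupedCount`, `naiveJoin`) is a hypothetical name chosen for illustration, and the "naive join" is a simplified stand-in for a left-outer-join rewrite, not the logic in `constructLeftJoins`. It only shows why a global aggregate can return a non-null value on empty input while a GROUP BY aggregate cannot.

```scala
object CountBugSketch {
  // Hypothetical toy data: keys of the outer table and of the inner table.
  val outerKeys: Seq[Int] = Seq(1, 2)
  val innerKeys: Seq[Int] = Seq(1, 1)

  // Direct evaluation of: SELECT count(*) FROM inner WHERE inner.k = outer.k
  // A global aggregate always produces one row, so empty input gives Some(0).
  def globalCount(k: Int): Option[Long] =
    Some(innerKeys.count(_ == k).toLong)

  // The same subquery with GROUP BY inner.k added.
  // Empty input means no groups, so no row, which evaluates to NULL (None).
  def groupedCount(k: Int): Option[Long] = {
    val matched = innerKeys.filter(_ == k)
    if (matched.isEmpty) None else Some(matched.size.toLong)
  }

  // Simplified decorrelation: aggregate the inner side once per key,
  // then look the key up as a left outer join would.
  // Outer rows without a match get NULL (None).
  val preAggregated: Map[Int, Long] =
    innerKeys.groupBy(identity).map { case (k, rows) => k -> rows.size.toLong }

  def naiveJoin(k: Int): Option[Long] = preAggregated.get(k)

  def main(args: Array[String]): Unit =
    for (k <- outerKeys)
      println(s"k=$k global=${globalCount(k)} grouped=${groupedCount(k)} joined=${naiveJoin(k)}")
}
```

For the unmatched key `2`, `globalCount` gives `Some(0)` while `naiveJoin` gives `None`, which is the mismatch the flag guards against. `groupedCount` already gives `None` there, so the join result agrees with it and no correction is needed. That is the sense in which a name like `hasGlobalAggregate` would describe the same condition.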
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]