jiangzhx commented on PR #5907:
URL: https://github.com/apache/arrow-datafusion/pull/5907#issuecomment-1513437056
> I will take a look. Usually, for uncorrelated scalar subqueries, it will be kept as a subquery, but it needs a count check.
>
> WHERE EXISTS (SELECT b FROM t2 where a>1 )
>
> rewrite to
>
> WHERE ScalarSubQuery(SELECT b FROM t2 where a>1 limit 1)
>
> We need to add a physical SubQueryExec.

Yes, you are right. As you said, in Spark, rewriting `exists` to a `scalar subquery` solves the query requirement for this scenario.
```scala
/**
 * Rewrite a non-correlated EXISTS subquery to use ScalarSubquery:
 *   WHERE EXISTS (SELECT A FROM TABLE B WHERE COL1 > 10)
 * will be rewritten to
 *   WHERE (SELECT 1 FROM (SELECT A FROM TABLE B WHERE COL1 > 10) LIMIT 1) IS NOT NULL
 */
object RewriteNonCorrelatedExists extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan =
    plan.transformAllExpressionsWithPruning(_.containsPattern(EXISTS_SUBQUERY)) {
      case exists: Exists if exists.children.isEmpty =>
        IsNotNull(
          ScalarSubquery(
            plan = Limit(Literal(1), Project(Seq(Alias(Literal(1), "col")()), exists.plan)),
            exprId = exists.exprId,
            hint = exists.hint))
    }
}
```
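
To make the effect concrete, this is roughly what the rewrite looks like in plain SQL for the example query in this thread (a sketch only; the outer `SELECT * FROM t1` is a hypothetical wrapper added here just to make the fragment a complete statement):

```sql
-- Original form of the predicate from the example above
-- (t1 is a hypothetical outer table for illustration):
SELECT * FROM t1 WHERE EXISTS (SELECT b FROM t2 WHERE a > 1);

-- After the Spark-style rewrite: the EXISTS becomes a scalar subquery
-- limited to one row, and the predicate checks it with IS NOT NULL:
SELECT * FROM t1 WHERE (SELECT 1 AS col FROM t2 WHERE a > 1 LIMIT 1) IS NOT NULL;
```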