jiangzhx commented on issue #5791:
URL:
https://github.com/apache/arrow-datafusion/issues/5791#issuecomment-1502692953
i find a way to create scalar subquery when use dataframe.
There are 3 places here that need attention in order to write the correct
code.
1. `.select(vec![count(lit(COUNT_STAR_EXPANSION))])? ` should add after
aggregate, scalar_subquery_to_join look like need this
3. `count(*)` rewrite to `count(lit(COUNT_STAR_EXPANSION))` after #5686
merge ,it's not necessary.
4. ` .into_optimized_plan()` rewrite to `.into_unoptimized_plan()` because
eliminate_projection will remove this. then got scalar_subquery_to_join not work
Maybe it's just me, but writing the correct code using a dataframe is not
easy. We should write a document for this place.
the right way to create scalar subquery with dataframe api.
```
#[tokio::test]
async fn test_count_wildcard_on_where_scalar_subquery() -> Result<()> {
let ctx = create_join_context()?;
ctx.sql("select a,b from t1 where (select count(*) from t2 where t1.a =
t2.a)>0;")
.await?
.explain(false, false)?
.show()
.await?;
let subquery = Expr::ScalarSubquery(datafusion_expr::Subquery {
subquery: Arc::new(
ctx.table("t2")
.await?
.filter(col("t1.a").eq(col("t2.a")))?
.aggregate(vec![], vec![count(lit(COUNT_STAR_EXPANSION))])?
.select(vec![count(lit(COUNT_STAR_EXPANSION))])?
.into_unoptimized_plan(),
),
outer_ref_columns: vec![],
});
ctx.table("t1")
.await?
.filter(subquery.gt(lit(ScalarValue::UInt8(Some(0)))))?
.select(vec![col("t1.a"), col("t1.b")])?
.explain(false, false)?
.show()
.await?;
Ok(())
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]