jiangzhx commented on issue #5791:
URL: 
https://github.com/apache/arrow-datafusion/issues/5791#issuecomment-1502692953

   i find a way to create scalar subquery when use dataframe.
   
   There are 3 places here that need attention in order to write the correct 
code.
   
   1. `.select(vec![count(lit(COUNT_STAR_EXPANSION))])?   `  should add after 
aggregate, scalar_subquery_to_join look like  need this
   3.  `count(*)` rewrite to `count(lit(COUNT_STAR_EXPANSION))`    after #5686 
merge ,it's not necessary.
   4. ` .into_optimized_plan()` rewrite to `.into_unoptimized_plan()`  because 
eliminate_projection will remove this. then got scalar_subquery_to_join not work
   
   
   Maybe it's just me, but writing the correct code using a dataframe is not 
easy. We should write a document for this place. 
   
   
   the right way to create scalar subquery  with dataframe api.
   ```
   
   #[tokio::test]
   async fn test_count_wildcard_on_where_scalar_subquery() -> Result<()> {
       let ctx = create_join_context()?;
       ctx.sql("select a,b from t1 where (select count(*) from t2 where t1.a = 
t2.a)>0;")
           .await?
           .explain(false, false)?
           .show()
           .await?;
   
       let subquery = Expr::ScalarSubquery(datafusion_expr::Subquery {
           subquery: Arc::new(
               ctx.table("t2")
                   .await?
                   .filter(col("t1.a").eq(col("t2.a")))?
                   .aggregate(vec![], vec![count(lit(COUNT_STAR_EXPANSION))])?
                   .select(vec![count(lit(COUNT_STAR_EXPANSION))])?   
                   .into_unoptimized_plan(),
           ),
           outer_ref_columns: vec![],
       });
   
       ctx.table("t1")
           .await?
           .filter(subquery.gt(lit(ScalarValue::UInt8(Some(0)))))?
           .select(vec![col("t1.a"), col("t1.b")])?
           .explain(false, false)?
           .show()
           .await?;
       Ok(())
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to