ygf11 commented on code in PR #4577:
URL: https://github.com/apache/arrow-datafusion/pull/4577#discussion_r1046724249
##########
datafusion/core/tests/sql/joins.rs:
##########
@@ -2400,3 +2399,52 @@ async fn reduce_cross_join_with_cast_expr_join_key() ->
Result<()> {
Ok(())
}
+
+#[tokio::test]
+async fn reduce_cross_join_with_wildcard_and_expr() -> Result<()> {
+ let test_repartition_joins = vec![true, false];
+ for repartition_joins in test_repartition_joins {
+ let ctx = create_join_context("t1_id", "t2_id", repartition_joins)?;
+
+ let sql = "select *,t1.t1_id+11 from t1,t2 where t1.t1_id+11=t2.t2_id";
+ let msg = format!("Creating logical plan for '{}'", sql);
+ let plan = ctx
+ .create_logical_plan(&("explain ".to_owned() + sql))
+ .expect(&msg);
+ let state = ctx.state();
+ let plan = state.optimize(&plan)?;
+
+ let expected = vec![
+ "Explain [plan_type:Utf8, plan:Utf8]",
+ " Projection: t1.t1_id, t1.t1_name, t1.t1_int, t2.t2_id,
t2.t2_name, t2.t2_int, CAST(t1.t1_id AS Int64) + Int64(11) [t1_id:UInt32;N,
t1_name:Utf8;N, t1_int:UInt32;N, t2_id:UInt32;N, t2_name:Utf8;N,
t2_int:UInt32;N, t1.t1_id + Int64(11):Int64;N]",
+ " Projection: t1.t1_id, t1.t1_name, t1.t1_int, t2.t2_id,
t2.t2_name, t2.t2_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N,
t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N]",
+ " Inner Join: t1.t1_id + Int64(11) = CAST(t2.t2_id AS Int64)
[t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N, t1.t1_id + Int64(11):Int64;N,
t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N, CAST(t2.t2_id AS
Int64):Int64;N]",
Review Comment:
> But the condition of join is t1.t1_id + Int64(11) = CAST(t2.t2_id AS Int64).
The reason is that the projection uses two different names for the same expression:
* `expr.display_name()`, here `t1.t1_id + Int64(11)`, is the field name in
its schema, and it is also the name we use to look up the field.
* the full expression name, here `CAST(t1.t1_id AS Int64) + Int64(11)`, is
what gets printed when the projection expression is displayed.
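To make the mismatch concrete, here is a minimal sketch using a toy expression tree. It is not DataFusion's real `Expr` type: the enum and the helpers `schema_name`, `full_name`, and `example_expr` are hypothetical names for illustration. The assumption that the schema name ignores the cast is inferred from the plan output in this test, not taken from the implementation.

```rust
// Toy expression tree for illustration only; NOT DataFusion's `Expr`.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Int64Literal(i64),
    Cast { expr: Box<Expr>, to: String },
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical stand-in for `expr.display_name()`: the name stored in the schema.
/// Assumption: a cast does not change the name, so it is skipped here.
fn schema_name(e: &Expr) -> String {
    match e {
        Expr::Column(c) => c.clone(),
        Expr::Int64Literal(v) => format!("Int64({})", v),
        Expr::Cast { expr, .. } => schema_name(expr),
        Expr::Add(l, r) => format!("{} + {}", schema_name(l), schema_name(r)),
    }
}

/// Hypothetical stand-in for the full expression name shown in the plan text.
/// Unlike `schema_name`, the cast is rendered explicitly.
fn full_name(e: &Expr) -> String {
    match e {
        Expr::Column(c) => c.clone(),
        Expr::Int64Literal(v) => format!("Int64({})", v),
        Expr::Cast { expr, to } => format!("CAST({} AS {})", full_name(expr), to),
        Expr::Add(l, r) => format!("{} + {}", full_name(l), full_name(r)),
    }
}

/// Builds the projected expression from the test: CAST(t1.t1_id AS Int64) + Int64(11).
fn example_expr() -> Expr {
    Expr::Add(
        Box::new(Expr::Cast {
            expr: Box::new(Expr::Column("t1.t1_id".to_string())),
            to: "Int64".to_string(),
        }),
        Box::new(Expr::Int64Literal(11)),
    )
}
```

In this sketch, `schema_name(&example_expr())` gives the field name seen in the schema part of the plan, while `full_name(&example_expr())` gives the text shown in the `Projection:` line, so the two disagree for any expression that contains a cast.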
To close this gap, we can either add an alias or remove these projections
(#4389). I am working on #4588 and plan to submit a PR today or tomorrow.
> Do we need to do type coercion for the exprs in the Join Plan after this pr https://github.com/apache/arrow-datafusion/pull/4353?
No, I think type coercion has already been done in this test case, because
the join keys come from the join filter, which has already been optimized.
https://github.com/apache/arrow-datafusion/blob/b822b0e5e582676eb58faf7fb89adc312dc95174/datafusion/expr/src/utils.rs#L481-L485