adriangb commented on code in PR #20143:
URL: https://github.com/apache/datafusion/pull/20143#discussion_r2765279782
##########
datafusion/sqllogictest/test_files/projection_pushdown.slt:
##########
@@ -1339,7 +1357,240 @@ SELECT id, s['value'] FROM simple_struct ORDER BY id, s['value'];
5 250
#####################
-# Section 12: Cleanup
+# Section 12: Join Tests - get_field Extraction from Join Nodes
+#####################
+
+# Create a second table for join tests
+statement ok
+COPY (
+ SELECT
+ column1 as id,
+ column2 as s
+ FROM VALUES
+ (1, {role: 'admin', level: 10}),
+ (2, {role: 'user', level: 5}),
+ (3, {role: 'guest', level: 1}),
+ (4, {role: 'admin', level: 8}),
+ (5, {role: 'user', level: 3})
+) TO 'test_files/scratch/projection_pushdown/join_right.parquet'
+STORED AS PARQUET;
+
+statement ok
+CREATE EXTERNAL TABLE join_right STORED AS PARQUET
+LOCATION 'test_files/scratch/projection_pushdown/join_right.parquet';
+
+###
+# Test 12.1: Join with get_field in equijoin condition
+# Tests extraction from join ON clause - get_field on each side routed appropriately
+###
+
+query TT
+EXPLAIN SELECT simple_struct.id, join_right.id
+FROM simple_struct
+INNER JOIN join_right ON simple_struct.s['value'] = join_right.s['level'] * 10;
+----
+logical_plan
+01)Projection: simple_struct.id, join_right.id
+02)--Inner Join: get_field(simple_struct.s, Utf8("value")) = get_field(join_right.s, Utf8("level")) * Int64(10)
+03)----TableScan: simple_struct projection=[id, s]
+04)----TableScan: join_right projection=[id, s]
+physical_plan
+01)HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(simple_struct.s[value]@2, join_right.s[level] * Int64(10)@2)], projection=[id@0, id@3]
+02)--DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/projection_pushdown/simple.parquet]]}, projection=[id, s, get_field(s@1, value) as simple_struct.s[value]], file_type=parquet
+03)--DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/projection_pushdown/join_right.parquet]]}, projection=[id, s, get_field(s@1, level) * 10 as join_right.s[level] * Int64(10)], file_type=parquet
Review Comment:
I'm actually surprised this is getting pushed down into the scan here, and I'm
not sure what would cause that. It's not a bad thing, but the next time we
change this we could evaluate whether the aliases should be there.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]