[GitHub] [arrow-datafusion] ygf11 commented on issue #4837: SQL statement (`UNION` + `EXCEPT`) causes panic

GitBox Sun, 08 Jan 2023 19:13:11 -0800


ygf11 commented on issue #4837:
URL: 
https://github.com/apache/arrow-datafusion/issues/4837#issuecomment-1375055044


   I found that the bug is in the `union type coercion` (#3513) and that it 
still exists. 
   We can reproduce it on the master branch:
   ```sql
   ❯ create table table_2(name text, id INT) as  values('Alex',1);
   0 rows in set. Query took 0.002 seconds.
   ❯ create table table_1(name text, id TINYINT) as  values('Alex',1);
   0 rows in set. Query took 0.002 seconds.
   ❯ (
       SELECT * FROM table_1
       EXCEPT
       SELECT * FROM table_2
   )
   UNION ALL
   (
       SELECT * FROM table_2
       EXCEPT
       SELECT * FROM table_1
   );
   SchemaError(FieldNotFound { field: Column { relation: Some("table_2"), name: 
"id" }, valid_fields: Some([Column { relation: Some("table_1"), name: "name" }, 
Column { relation: Some("table_1"), name: "id" }]) })
   ```
   For a union operation, we need to ensure that the data types of the 
corresponding left and right columns are the same.
   This is done in:
   
https://github.com/apache/arrow-datafusion/blob/71b9baecd0a3c881f96e9994d922f3c1b3d61854/datafusion/expr/src/expr_rewriter.rs#L523-L527
   
   But it uses `plan.expressions()` to get the fields (schema) of the input, 
which I think is not correct, because it may return other expressions. For 
example, a `join` returns its predicates, not its output fields. 
   
   To fix this issue, I think we can abandon `plan.expressions()` and use the 
input schema to enumerate the output fields, then create the new plan with 
`Projection::try_new_with_schema`. 
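
The idea can be sketched with a small toy model. Everything below (`Plan`, `Expr`, `Field`, `common_type`, `coerce_from_schema`) is hypothetical and is not DataFusion's API. It also assumes that `EXCEPT` is planned as a join-like node whose expressions are its predicates, and that TINYINT and INT widen to INT. The point is only that the cast expressions are derived from the input's output schema, so predicate columns such as `table_2.id` never leak into the projection.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's API.

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum DataType {
    Int8,  // TINYINT
    Int32, // INT
    Utf8,  // TEXT
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: String,
    name: String,
    data_type: DataType,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { qualifier: String, name: String },
    Cast { expr: Box<Expr>, to: DataType },
    Eq(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone)]
enum Plan {
    Scan { schema: Vec<Field> },
    // Assumption: EXCEPT is modeled as an anti join that outputs only left columns.
    AntiJoin { left: Box<Plan>, right: Box<Plan>, on: Vec<Expr> },
}

impl Plan {
    /// Output fields of this node.
    fn schema(&self) -> Vec<Field> {
        match self {
            Plan::Scan { schema } => schema.clone(),
            Plan::AntiJoin { left, .. } => left.schema(),
        }
    }

    /// Expressions held by this node; for a join these are predicates,
    /// not output fields.
    fn expressions(&self) -> Vec<Expr> {
        match self {
            Plan::Scan { .. } => vec![],
            Plan::AntiJoin { on, .. } => on.clone(),
        }
    }
}

fn field(qualifier: &str, name: &str, data_type: DataType) -> Field {
    Field { qualifier: qualifier.to_string(), name: name.to_string(), data_type }
}

fn col(qualifier: &str, name: &str) -> Expr {
    Expr::Column { qualifier: qualifier.to_string(), name: name.to_string() }
}

/// Assumed coercion rule: equal types pass through, integers widen to the
/// larger type, and text never mixes with integers.
fn common_type(l: DataType, r: DataType) -> Option<DataType> {
    match (l, r) {
        (a, b) if a == b => Some(a),
        (DataType::Utf8, _) | (_, DataType::Utf8) => None,
        (a, b) => Some(a.max(b)),
    }
}

/// Build the projection list from the input's output schema, adding a cast
/// only where the field type differs from the target type.
fn coerce_from_schema(input: &Plan, target: &[DataType]) -> Vec<Expr> {
    input
        .schema()
        .iter()
        .zip(target)
        .map(|(f, t)| {
            let column = col(&f.qualifier, &f.name);
            if f.data_type == *t {
                column
            } else {
                Expr::Cast { expr: Box::new(column), to: *t }
            }
        })
        .collect()
}
```

For the reproduction above, the left branch `table_1 EXCEPT table_2` would produce a projection over `table_1.name` and a cast of `table_1.id`, while its `expressions()` would contain only the join predicate.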
   
   


