phillipleblanc opened a new issue, #15813:
URL: https://github.com/apache/datafusion/issues/15813
### Describe the bug
The DataFusion SQL Unparser does not correctly roundtrip SQL queries that
use `UNION` rather than `UNION ALL`. For example, the following query:
```sql
SELECT col1 FROM footable
UNION
SELECT col1 FROM bartable
```
should produce a result with duplicate rows filtered out. DataFusion handles
this by adding a `LogicalPlan::Distinct` node as the parent of the
`LogicalPlan::Union` node:
```text
Distinct:
  Union:
    TableScan
    TableScan
```
However, this is currently unparsed to the following SQL:
```sql
SELECT col1 FROM footable
UNION ALL
SELECT col1 FROM bartable
```
Executing that SQL returns incorrect results, because duplicate rows are
not filtered out.
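To make the difference concrete, here is a small self-contained sketch in plain Rust. It is not DataFusion code; the row values and the `union_all` / `union_distinct` helpers are made up for illustration. It models `UNION ALL` as concatenation and `UNION` as concatenation followed by deduplication:

```rust
use std::collections::BTreeSet;

/// Models `UNION ALL`: every row from both inputs is kept.
fn union_all(left: &[&str], right: &[&str]) -> Vec<String> {
    left.iter().chain(right.iter()).map(|s| s.to_string()).collect()
}

/// Models `UNION`: rows are concatenated, then duplicates are removed.
/// The sorted output order is an artifact of `BTreeSet`, not of SQL.
fn union_distinct(left: &[&str], right: &[&str]) -> Vec<String> {
    let set: BTreeSet<String> = union_all(left, right).into_iter().collect();
    set.into_iter().collect()
}

fn main() {
    // Hypothetical contents of `col1` in each table.
    let footable = ["a", "b"];
    let bartable = ["b", "c"];
    println!("{:?}", union_all(&footable, &bartable)); // ["a", "b", "b", "c"]
    println!("{:?}", union_distinct(&footable, &bartable)); // ["a", "b", "c"]
}
```

With these inputs the row `b` appears twice under `UNION ALL` and once under `UNION`, which is the discrepancy a consumer of the unparsed SQL would see.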
### To Reproduce
Parse the following query into a DataFusion `LogicalPlan`, then immediately
unparse it. The output uses `UNION ALL` instead of `UNION`:
```sql
SELECT j1_string AS col1, j1_id AS id FROM j1
UNION
SELECT j2_string AS col1, j2_id AS id FROM j2
UNION
SELECT j3_string AS col1, j3_id AS id FROM j3
```
### Expected behavior
When a `LogicalPlan` has a `Distinct` node directly above a `Union` node, the
unparsed SQL should preserve `UNION` rather than emit `UNION ALL`.
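The expected behavior can be sketched with a toy model. The `Plan` enum, `unparse`, `unparse_union`, and `scan` below are hypothetical stand-ins written for this illustration; they are not DataFusion's `LogicalPlan` or its unparser API, and the actual fix may be structured differently. The sketch only shows the rule: a `Distinct` directly above a `Union` becomes `UNION`, while a bare `Union` becomes `UNION ALL`.

```rust
/// Toy stand-in for a logical plan (not DataFusion's `LogicalPlan`).
enum Plan {
    TableScan { table: String, column: String },
    Union(Vec<Plan>),
    Distinct(Box<Plan>),
}

/// Unparse each union input and join them with the given set operator.
fn unparse_union(inputs: &[Plan], op: &str) -> String {
    let parts: Vec<String> = inputs.iter().map(unparse).collect();
    parts.join(&format!(" {op} "))
}

fn unparse(plan: &Plan) -> String {
    match plan {
        Plan::TableScan { table, column } => format!("SELECT {column} FROM {table}"),
        // A bare union keeps duplicate rows.
        Plan::Union(inputs) => unparse_union(inputs, "UNION ALL"),
        Plan::Distinct(inner) => match inner.as_ref() {
            // Distinct directly above a union is a plain `UNION`.
            Plan::Union(inputs) => unparse_union(inputs, "UNION"),
            // Illustrative fallback for any other input; `t` is an arbitrary alias.
            other => format!("SELECT DISTINCT * FROM ({}) AS t", unparse(other)),
        },
    }
}

/// Helper that builds a scan of `col1` from the named table.
fn scan(table: &str) -> Plan {
    Plan::TableScan { table: table.to_string(), column: "col1".to_string() }
}

fn main() {
    let plan = Plan::Distinct(Box::new(Plan::Union(vec![
        scan("footable"),
        scan("bartable"),
    ])));
    // Prints: SELECT col1 FROM footable UNION SELECT col1 FROM bartable
    println!("{}", unparse(&plan));
}
```

Dropping the `Distinct` wrapper from the plan in `main` makes the same function emit `UNION ALL`, so the two plan shapes roundtrip to different SQL.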
### Additional context
I've already fixed this and will submit a PR shortly.