alamb opened a new issue, #2712:
URL: https://github.com/apache/arrow-datafusion/issues/2712
**Describe the bug**
We found this issue while working on IOx. IOx was (accidentally) optimizing
a `LogicalPlan` more than once, and the second optimization pass failed with
an error like `Schema error: Schema contains duplicate unqualified
field name 'IsNull-Column-sys.host'`
**To Reproduce**
Optimize this plan twice:
```
Projection: #cpu_load_short.host, #cpu_load_short.region, #cpu_load_short.value AS value, #cpu_load_short.time
  Sort: #cpu_load_short.host ASC NULLS FIRST, #cpu_load_short.region ASC NULLS FIRST, #cpu_load_short.time ASC NULLS FIRST
    Filter: #cpu_load_short.host IS NULL AND #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") OR NOT #cpu_load_short.host IS NULL AND #cpu_load_short.host = Utf8("server01") OR #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("")
      TableScan: cpu_load_short projection=None
```
After the first call to optimize, the plan looks like this:
```
Projection: #cpu_load_short.host, #cpu_load_short.region, #cpu_load_short.value AS value, #cpu_load_short.time
  Sort: #cpu_load_short.host ASC NULLS FIRST, #cpu_load_short.region ASC NULLS FIRST, #cpu_load_short.time ASC NULLS FIRST
    Projection: #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS BinaryExpr-ORBinaryExpr-=LiteralColumn-cpu_load_short.hostIsNull-Column-cpu_load_short.host, #cpu_load_short.host IS NULL AS IsNull-Column-cpu_load_short.host, #cpu_load_short.host, #cpu_load_short.region, #cpu_load_short.time, #cpu_load_short.value
      Filter: #cpu_load_short.host IS NULL AS cpu_load_short.host IS NULL AND #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS cpu_load_short.host IS NULL OR cpu_load_short.host = Utf8("") OR NOT #cpu_load_short.host IS NULL AS cpu_load_short.host IS NULL AND #cpu_load_short.host = Utf8("server01") OR #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS cpu_load_short.host IS NULL OR cpu_load_short.host = Utf8("")
        TableScan: cpu_load_short projection=Some([host, region, time, value]), partial_filters=[#cpu_load_short.host IS NULL AS cpu_load_short.host IS NULL AND #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS cpu_load_short.host IS NULL OR cpu_load_short.host = Utf8("") OR NOT #cpu_load_short.host IS NULL AS cpu_load_short.host IS NULL AND #cpu_load_short.host = Utf8("server01") OR #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS cpu_load_short.host IS NULL OR cpu_load_short.host = Utf8("")]
```
The next call to `optimize()` then fails with:
```
Schema error: Schema contains duplicate unqualified field name 'IsNull-Column-sys.host'
```
I am working on a self-contained reproducer.
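Until a real reproducer exists, the following toy model may help readers who have not seen this error before. It only illustrates what "duplicate unqualified field name" means: two fields without a table qualifier share one name. This is a sketch under assumptions, not DataFusion's actual schema code; `Field` and `check_unqualified_unique` are hypothetical names invented for this example.

```rust
use std::collections::HashSet;

/// Toy stand-in for a schema field: an optional table qualifier plus a name.
/// (Hypothetical type, not DataFusion's own field type.)
#[derive(Debug, Clone)]
struct Field {
    qualifier: Option<String>,
    name: String,
}

/// Hypothetical check: fail if two fields that have no qualifier share a name.
/// The error text mirrors the message quoted in this issue.
fn check_unqualified_unique(fields: &[Field]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for field in fields.iter().filter(|f| f.qualifier.is_none()) {
        // `insert` returns false when the name was already recorded.
        if !seen.insert(field.name.as_str()) {
            return Err(format!(
                "Schema error: Schema contains duplicate unqualified field name '{}'",
                field.name
            ));
        }
    }
    Ok(())
}
```

In this toy model, a field list that contains the unqualified name `IsNull-Column-sys.host` twice is rejected, while fields that carry distinct qualifiers pass the check.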
**Expected behavior**
The second call to optimize should succeed without error and return a valid plan.
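One way to phrase this expectation as a regression check is a small helper that runs an optimizer twice and returns the result of the second pass. The sketch below is generic on purpose: `optimize_twice` is a hypothetical name, and the `optimize` argument stands in for whatever optimizer entry point is under test. It makes no assumption about DataFusion's actual optimizer signature.

```rust
/// Hypothetical helper: apply `optimize` to `plan`, then apply it again to the
/// result. Returns the outcome of the second pass, or the first error hit.
fn optimize_twice<P, E, F>(plan: &P, optimize: F) -> Result<P, E>
where
    F: Fn(&P) -> Result<P, E>,
{
    // First pass: any failure here is reported as-is.
    let once = optimize(plan)?;
    // Second pass: this is the call that fails in the scenario described above.
    optimize(&once)
}
```

A test built on this helper would assert that the returned value is `Ok` for the plan shown under "To Reproduce".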
**Additional context**
IOx ticket with original issue:
https://github.com/influxdata/influxdb_iox/issues/4800
PR to stop optimizing twice:
https://github.com/influxdata/influxdb_iox/pull/4809