alamb opened a new issue, #2712:
URL: https://github.com/apache/arrow-datafusion/issues/2712

   **Describe the bug**
   
   We found this issue while working on IOx. IOx was (accidentally) optimizing 
a `LogicalPlan` more than once, and the second round of optimization failed 
with an error like `Schema error: Schema contains duplicate unqualified 
field name 'IsNull-Column-sys.host'`.
   
   **To Reproduce**
   Try to optimize this plan twice:
   
   ```
   Projection: #cpu_load_short.host, #cpu_load_short.region, #cpu_load_short.value AS value, #cpu_load_short.time
     Sort: #cpu_load_short.host ASC NULLS FIRST, #cpu_load_short.region ASC NULLS FIRST, #cpu_load_short.time ASC NULLS FIRST
       Filter: #cpu_load_short.host IS NULL AND #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") OR NOT #cpu_load_short.host IS NULL AND #cpu_load_short.host = Utf8("server01") OR #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("")
         TableScan: cpu_load_short projection=None
   ```
   
   After the first call to optimize, the plan looks like this:
   
   ```
   Projection: #cpu_load_short.host, #cpu_load_short.region, #cpu_load_short.value AS value, #cpu_load_short.time
     Sort: #cpu_load_short.host ASC NULLS FIRST, #cpu_load_short.region ASC NULLS FIRST, #cpu_load_short.time ASC NULLS FIRST
       Projection: #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS BinaryExpr-ORBinaryExpr-=LiteralColumn-cpu_load_short.hostIsNull-Column-cpu_load_short.host, #cpu_load_short.host IS NULL AS IsNull-Column-cpu_load_short.host, #cpu_load_short.host, #cpu_load_short.region, #cpu_load_short.time, #cpu_load_short.value
         Filter: #cpu_load_short.host IS NULL AS cpu_load_short.host IS NULL AND #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS cpu_load_short.host IS NULL OR cpu_load_short.host = Utf8("") OR NOT #cpu_load_short.host IS NULL AS cpu_load_short.host IS NULL AND #cpu_load_short.host = Utf8("server01") OR #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS cpu_load_short.host IS NULL OR cpu_load_short.host = Utf8("")
           TableScan: cpu_load_short projection=Some([host, region, time, value]), partial_filters=[#cpu_load_short.host IS NULL AS cpu_load_short.host IS NULL AND #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS cpu_load_short.host IS NULL OR cpu_load_short.host = Utf8("") OR NOT #cpu_load_short.host IS NULL AS cpu_load_short.host IS NULL AND #cpu_load_short.host = Utf8("server01") OR #cpu_load_short.host IS NULL OR #cpu_load_short.host = Utf8("") AS cpu_load_short.host IS NULL OR cpu_load_short.host = Utf8("")]
   ```
   
   The next call to optimize() then errors with:
   
   ```
   Schema error: Schema contains duplicate unqualified field name 'IsNull-Column-sys.host'
   ```
   
   I am working on a self-contained reproducer.
   
   **Expected behavior**
   The second call to optimize should succeed and produce a correct plan.
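
As a rough illustration of that expectation, here is a toy model in plain Rust. It is not DataFusion code: `ToyPlan`, `check_schema`, `naive_pass`, `idempotent_pass`, and `optimize_twice` are hypothetical names, and the sketch assumes (without confirmation from this report) that the failure comes from a pass re-adding an aliased column it already added on the first run.

```rust
use std::collections::HashSet;

/// Toy stand-in for a plan's output schema: a list of unqualified field names.
#[derive(Debug, Clone, PartialEq)]
struct ToyPlan {
    fields: Vec<String>,
}

/// Reject schemas in which the same unqualified name appears twice,
/// mirroring the wording of the error quoted in this report.
fn check_schema(plan: &ToyPlan) -> Result<(), String> {
    let mut seen = HashSet::new();
    for name in &plan.fields {
        if !seen.insert(name.as_str()) {
            return Err(format!(
                "Schema error: Schema contains duplicate unqualified field name '{}'",
                name
            ));
        }
    }
    Ok(())
}

/// Hypothetical non-idempotent pass: always prepends an aliased helper
/// column, even if an earlier run already added it.
fn naive_pass(plan: &ToyPlan) -> Result<ToyPlan, String> {
    let mut fields = vec!["IsNull-Column-host".to_string()];
    fields.extend(plan.fields.iter().cloned());
    let out = ToyPlan { fields };
    check_schema(&out)?;
    Ok(out)
}

/// Hypothetical idempotent variant: leaves the plan unchanged when the
/// helper column is already present.
fn idempotent_pass(plan: &ToyPlan) -> Result<ToyPlan, String> {
    if plan.fields.iter().any(|f| f == "IsNull-Column-host") {
        return Ok(plan.clone());
    }
    naive_pass(plan)
}

/// Apply the same pass twice, which is the scenario described above.
fn optimize_twice(
    pass: fn(&ToyPlan) -> Result<ToyPlan, String>,
    plan: &ToyPlan,
) -> Result<ToyPlan, String> {
    let once = pass(plan)?;
    pass(&once)
}
```

In this toy model, `optimize_twice(naive_pass, &plan)` returns the duplicate-name error on the second application, while `optimize_twice(idempotent_pass, &plan)` returns the same plan as a single application.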
   
   **Additional context**
   IOx ticket with original issue: 
https://github.com/influxdata/influxdb_iox/issues/4800
   PR to stop optimizing twice: 
https://github.com/influxdata/influxdb_iox/pull/4809
   


