[ https://issues.apache.org/jira/browse/DRILL-8182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Turton updated DRILL-8182:
--------------------------------
    Description: 
When a query includes multiple SELECTs against a workbook by using TABLE 
functions to access different sheets, and those sheets contain a column with 
the same name, then the values for that column come from a single sheet for 
both SELECTs.  To reproduce, run the following query against the attachment 
and note that the `Name` values returned from the Products sheet are actually 
the `Name` values from the Customers sheet.

 
{code:java}
with
prod as (
    select Id, Name from TABLE(dfs.tmp.`/Products_Customers_Orders.xlsx` (type => 'excel', sheetName => 'Products'))
)
, cust as (
    select Id, Name from TABLE(dfs.tmp.`/Products_Customers_Orders.xlsx` (type => 'excel', sheetName => 'Customers'))
)
select * from cust join prod on cust.Id = prod.Id;
{code}

  was:
When a query creates multiple scans against a workbook, targeting different 
sheets using TABLE functions then the resulting datasets appear to get mixed 
with one overwriting the other.  To reproduce, run the following query against 
the attachment and note that the value returned from the Products sheet is a 
name from the Customers sheet.

 
{code:java}
with cust as (
    select * from TABLE(dfs.tmp.`/Products_Customers_Orders.xlsx` (type => 'excel', sheetName => 'Customers'))
),
prod as (
    select * from TABLE(dfs.tmp.`/Products_Customers_Orders.xlsx` (type => 'excel', sheetName => 'Products'))
)
select * from cust join prod on cust.Id = prod.Id;
{code}


> Excel format plugin sheet scan overwriting bug
> ----------------------------------------------
>
>                 Key: DRILL-8182
>                 URL: https://issues.apache.org/jira/browse/DRILL-8182
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Other
>    Affects Versions: 1.20.0
>            Reporter: James Turton
>            Assignee: Charles Givre
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: Products_Customers_Orders.xlsx
>
>
> When a query includes multiple SELECTs against a workbook by using TABLE 
> functions to access different sheets, and those sheets contain a column with 
> the same name, then the values for that column come from a single sheet for 
> both SELECTs.  To reproduce, run the following query against the attachment 
> and note that the `Name` values returned from the Products sheet are actually 
> the `Name` values from the Customers sheet.
>  
> {code:java}
> with
> prod as (
>     select Id, Name from TABLE(dfs.tmp.`/Products_Customers_Orders.xlsx` (type => 'excel', sheetName => 'Products'))
> )
> , cust as (
>     select Id, Name from TABLE(dfs.tmp.`/Products_Customers_Orders.xlsx` (type => 'excel', sheetName => 'Customers'))
> )
> select * from cust join prod on cust.Id = prod.Id;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
