[ 
https://issues.apache.org/jira/browse/DRILL-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Turton updated DRILL-8261:
--------------------------------
    Description: The attributes MAP column generated by the XML format plugin 
is currently explicit and present in wildcard selects. However, attributes are 
frequently not present at all in XML data, or are better queried using explicit 
projections of the individual attributes of interest to scalars. The motivating 
example here is an ETL-style query that transforms XML to Parquet using a CTAS 
with a wildcard column spec. This query will fail for XML that has no 
attributes because the Parquet writer cannot write a Parquet schema containing 
the empty struct produced by the attributes map. It is therefore proposed that 
the attributes MAP becomes an implicit column.  (was: The attributes MAP column 
generated by the XML format plugin is currently explicit and present in 
wildcard selects. However, attributes are frequently not present at all in XML 
data, or are better queried using explicit projections of the individual 
attributes of interest to scalars. The motivating example here is an ETL-style 
query that transforms XML to Parquet using a CTAS with a wildcard column spec. 
This query will fail for XML that has no attributes because the Parquet writer 
cannot write a Parquet schema containing an empty struct. It is therefore 
proposed that the attributes MAP becomes an implicit column.)

> Make the XML format plugin's attributes MAP an implicit column
> --------------------------------------------------------------
>
>                 Key: DRILL-8261
>                 URL: https://issues.apache.org/jira/browse/DRILL-8261
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - XML
>    Affects Versions: 1.20.1
>            Reporter: James Turton
>            Assignee: James Turton
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> The attributes MAP column generated by the XML format plugin is currently 
> explicit and present in wildcard selects. However, attributes are frequently 
> not present at all in XML data, or are better queried using explicit 
> projections of the individual attributes of interest to scalars. The 
> motivating example here is an ETL-style query that transforms XML to Parquet 
> using a CTAS with a wildcard column spec. This query will fail for XML that 
> has no attributes because the Parquet writer cannot write a Parquet schema 
> containing the empty struct produced by the attributes map. It is therefore 
> proposed that the attributes MAP becomes an implicit column.
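
The difference between an explicit and an implicit attributes column can be sketched with a toy model. This is an illustration only, not Drill code: the function names, the "title" field and the dict-based row are hypothetical, and the writer check merely stands in for the Parquet writer's rejection of empty structs described above.

```python
# Toy model only: none of these names come from the Drill code base.
ATTRIBUTES = "attributes"


def expand_wildcard(row, implicit_columns=frozenset()):
    """Return the columns a wildcard select would project, skipping implicit ones."""
    return {name: value for name, value in row.items()
            if name not in implicit_columns}


def check_writable(row):
    """Stand-in for a writer that refuses empty structs (assumed behaviour)."""
    for name, value in row.items():
        if isinstance(value, dict) and not value:
            raise ValueError(f"cannot write empty struct column: {name}")
    return row


# A record read from XML that carries no attributes at all.
row = {"title": "Drill", ATTRIBUTES: {}}

# Current behaviour: the explicit attributes map rides along in the wildcard,
# so the empty struct reaches the writer and the write fails.
try:
    check_writable(expand_wildcard(row))
    explicit_ok = True
except ValueError:
    explicit_ok = False

# Proposed behaviour: an implicit column is left out of the wildcard,
# so only the data columns reach the writer.
implicit_result = check_writable(
    expand_wildcard(row, implicit_columns={ATTRIBUTES}))
```

In this model the attributes would still be reachable by naming the column explicitly, which is the usual contract for implicit columns.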



--
This message was sent by Atlassian Jira
(v8.20.10#820010)