cgivre commented on issue #2307:
URL: https://github.com/apache/drill/issues/2307#issuecomment-913701607


   @KendraKrat 
   Thanks for reporting this.  The issue here is that Drill uses a 
streaming reader and doesn't know the schema in advance.   Drill sees the first 
`extra` field and interprets it as an empty `VARCHAR` field with two attributes.  
Then it sees the next field with the same name, `extra`, and the same attributes, 
and has no way to determine the intent of the data. 
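
   To illustrate the ambiguity, here is a minimal Python sketch of a 
schema-less streaming read. It is not Drill's code. The sample document, the 
attribute names `key` and `value`, and the helpers `stream_fields` and 
`naive_row` are all assumptions made up for illustration. Overwriting the 
earlier value is only one possible outcome of the collision, not a claim about 
what Drill does.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: two empty sibling <extra> elements, each with two attributes.
SAMPLE = b"""<row>
  <extra key="a" value="1"/>
  <extra key="b" value="2"/>
</row>"""


def stream_fields(xml_bytes):
    """Yield (name, text, attributes) for each direct child of the root, in document order."""
    depth = 0
    for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end")):
        if event == "start":
            depth += 1
        else:
            if depth == 2:  # an element that is a direct child of the root just closed
                yield elem.tag, (elem.text or ""), dict(elem.attrib)
            depth -= 1


def naive_row(xml_bytes):
    """Build one flat row without a schema, recording every repeated field name."""
    row = {}
    collisions = []
    for name, text, attrs in stream_fields(xml_bytes):
        if name in row:
            # The reader already treated this name as a single scalar field.
            # Nothing in the stream says whether a list was intended.
            collisions.append(name)
        row[name] = {"value": text, "attributes": attrs}
    return row, collisions


row, collisions = naive_row(SAMPLE)
print(collisions)
print(row)
```

By the time the second `extra` arrives, the reader has already committed to a 
scalar field, so it can only overwrite, rename, or reject the new value.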
   
   I would actually argue that this isn't a great way to format XML, but often 
we're stuck with what the data provider gives us, so it's a moot point. 
   
   I've thought about adding `list` support to the XML reader, which would 
partially address this. However, the real fix would be to add provided schema 
and XSD support.  That way you could explicitly tell Drill what to expect in 
terms of schema.
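
   As a rough sketch of why a provided schema would help, the Python below 
declares up front which fields repeat. The `SCHEMA` mapping, the sample 
document, and `read_row` are hypothetical and do not reflect Drill's 
provided-schema syntax or any existing Drill API. The sketch parses the whole 
document with `ET.fromstring` for brevity rather than streaming it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with one scalar field and one repeated field.
SAMPLE = """<row>
  <id>7</id>
  <extra key="a" value="1"/>
  <extra key="b" value="2"/>
</row>"""

# Hypothetical provided schema: field name -> "scalar" or "list".
SCHEMA = {"id": "scalar", "extra": "list"}


def read_row(xml_text, schema):
    """Read one row, using the schema to decide how repeated names are handled."""
    # Pre-create an empty list for every field declared as a list.
    row = {name: [] for name, kind in schema.items() if kind == "list"}
    for child in ET.fromstring(xml_text):
        kind = schema.get(child.tag)
        if kind is None:
            continue  # undeclared fields are skipped in this sketch
        record = {"value": child.text or "", "attributes": dict(child.attrib)}
        if kind == "list":
            row[child.tag].append(record)
        elif child.tag in row:
            # A scalar field appeared twice: the data contradicts the schema.
            raise ValueError(f"field {child.tag!r} is declared scalar but repeats")
        else:
            row[child.tag] = record
    return row


row = read_row(SAMPLE, SCHEMA)
print(row["id"])
print(row["extra"])
```

With the intent stated in advance, both `extra` elements land in one list, and 
an unexpected repeat of a scalar field becomes an explicit error instead of a 
guess.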


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.