cgivre commented on issue #2307: URL: https://github.com/apache/drill/issues/2307#issuecomment-913701607
@KendraKrat Thanks for reporting this. The issue here is that Drill uses a streaming reader and doesn't know the schema in advance. Drill sees the first field and interprets it as an empty `VARCHAR` field with two attributes. Then it sees the next field with the same name, `extra`, and the same attributes, and has no way to determine the intent of the data. I would actually argue that this isn't a great way to format XML, but often we're stuck with what the data provider gives us, so it's a moot point. I've thought about adding `list` support to the XML reader, which would partially address this. However, the real fix would be to add provided schema and XSD support, so that you can explicitly tell Drill what schema to expect.
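To illustrate the ambiguity described above, here is a minimal Python sketch. It is not Drill's code and does not reproduce Drill's actual behavior. The sample document, the element and attribute names other than `extra`, and the functions `read_flat` and `read_with_lists` are all hypothetical. It only shows why a streaming reader that maps each element name to a single column cannot represent two sibling `extra` elements, and how collecting repeated names into a list would preserve both.

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# Hypothetical sample: two sibling elements share the name "extra".
SAMPLE = b"""<row>
  <id>1</id>
  <extra type="a" unit="x"/>
  <extra type="b" unit="y"/>
</row>"""


def read_flat(data):
    """Stream the document, mapping each element name to one scalar column.

    Assumption for illustration: a repeated name simply overwrites the
    earlier value, so only the last <extra> survives.
    """
    record = {}
    for _event, elem in ET.iterparse(BytesIO(data), events=("end",)):
        if elem.tag == "row":
            continue  # skip the enclosing record element
        record[elem.tag] = (elem.text or "").strip()
        for name, value in elem.attrib.items():
            record[f"{elem.tag}_{name}"] = value
    return record


def read_with_lists(data):
    """Stream the same document, collecting every occurrence in a list."""
    record = {}
    for _event, elem in ET.iterparse(BytesIO(data), events=("end",)):
        if elem.tag == "row":
            continue
        entry = {"text": (elem.text or "").strip(), **elem.attrib}
        record.setdefault(elem.tag, []).append(entry)
    return record


if __name__ == "__main__":
    print(read_flat(SAMPLE))
    print(read_with_lists(SAMPLE))
```

The list-based variant keeps both occurrences but has to treat every field as repeated, because the reader cannot know in advance which names will recur. A schema supplied up front removes that guesswork.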
