cgivre commented on pull request #2283:
URL: https://github.com/apache/drill/pull/2283#issuecomment-891377214

> Yes, I think I did grok the unknown schema problem. The thought above, which somehow escaped all the striking out I did to it after thinking a bit more, was to take advantage of the fact that a scalar string can be embedded into a single-element map. The tuple-generating code would need to become aware of when it should do this.
>
> My second comment's comparison of the situation with a JSON property that is first null, then an object, is also a bit dubious because empty XML elements do not represent nulls (from what I read) so much as zero-length strings.
>
> If there is an effort to make querying XML behave in a more similar way to querying equivalent JSON, for some definition of equivalent, it should probably wait for another PR.

I think you're right about that. From what I remember, there is an option for Drill's JSON parser to treat `NaN` and something else as `null`. For XML, I don't know how you'd distinguish between an empty string and `null`.

This was also an issue with some data I was working on. The JSON version used empty strings to denote `null`, and subsequent rows would then contain maps, which caused SchemaChange exceptions. The only way to fix that was to use the `UNION` data type.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
