Kimahriman commented on pull request #32972: URL: https://github.com/apache/spark/pull/32972#issuecomment-864495223
They're mostly different issues. This is more of a semantics thing. If you have two nested structs with the same fields but in a different order, you have to set `allowMissingCol` to true in order for the structs to be sorted, which isn't very intuitive. This is trying to make the `ByName` part apply to nested structs as well, and leave `allowMissingCol` to apply only to columns that are actually missing (possibly nested).

So I do think this idea makes sense, but I don't think the implementation handles multiple levels of nested structs correctly. `addFields` assumes it is adding missing columns, so I think you could end up with a case that adds null nested columns even if `allowMissingCol` is false. I think logic would have to be added to `addFields` to control whether or not it should add null missing columns.
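To make the intended semantics concrete, here is a rough toy model in plain Python. It is not Spark code and not the implementation in this PR. Schemas are modeled as ordered dicts, and `align_by_name`, `allow_missing`, and `MissingFieldError` are hypothetical names made up for illustration. The sketch assumes that reordering by name should always recurse into nested structs, and that the flag only decides whether a missing field (at any depth) is filled in or rejected.

```python
from typing import Any, Dict

# Toy schema: field name -> leaf type name (str) or nested struct (dict).
# Dict insertion order stands in for struct field order.
Schema = Dict[str, Any]


class MissingFieldError(ValueError):
    """Raised when a field is absent on one side and allow_missing is False."""


def align_by_name(target: Schema, source: Schema, allow_missing: bool) -> Schema:
    """Reorder `source` to follow `target`'s field order, at every nesting level.

    Reordering always happens. `allow_missing` only controls whether fields
    present on one side but not the other are tolerated. Leaf type
    compatibility is deliberately not checked in this sketch.
    """
    missing = [name for name in target if name not in source]
    extra = [name for name in source if name not in target]
    if (missing or extra) and not allow_missing:
        raise MissingFieldError(f"missing={missing}, extra={extra}")

    aligned: Schema = {}
    for name, target_type in target.items():
        if name not in source:
            # Only reachable when allow_missing is True: the field would be
            # filled with nulls, so it keeps the target's type.
            aligned[name] = target_type
        elif isinstance(target_type, dict) and isinstance(source[name], dict):
            # Pass the flag down so deeper levels obey it as well.
            aligned[name] = align_by_name(target_type, source[name], allow_missing)
        else:
            aligned[name] = source[name]
    # Fields found only in `source` are appended after the target's fields.
    for name in extra:
        aligned[name] = source[name]
    return aligned


left = {"id": "int", "s": {"a": "int", "b": "string"}}
reordered = {"s": {"b": "string", "a": "int"}, "id": "int"}
partial = {"id": "int", "s": {"b": "string"}}

# Same fields in a different order: aligned without needing allow_missing.
same_fields = align_by_name(left, reordered, allow_missing=False)

# A nested field is actually missing: only accepted when allow_missing is True.
filled = align_by_name(left, partial, allow_missing=True)
```

The point of the sketch is the recursive call: because the flag is threaded through every level, a struct that is missing a field two or three levels down is still rejected when `allow_missing` is false, which is the case I suspect `addFields` does not cover today.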
