houqp commented on pull request #605: URL: https://github.com/apache/arrow-datafusion/pull/605#issuecomment-867081932
Good catch on the regressions. I will get them fixed tonight, with unit tests.

> It seems like a core challenge is in the semantics of the * expansion in select * from ... type queries which varies depending on type of join.

Yeah, that's exactly the problem. Initially, I tried keeping the join columns from both relations (i.e. `f1.foo` and `f2.foo`) in the logical schema. Then I ran into the problem that this breaks the [physical planning invariants](https://github.com/apache/arrow-datafusion/blob/master/docs/specification/invariants.md#the-physical-schema-is-invariant-under-planning). Because a `Using` join produces only a single join column in the output, the physical plan schema can only have a single field for the join column. The schema from the logical plan needs to match that as well in order to honor the invariant.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
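
To illustrate the schema point made in the comment above, here is a minimal, self-contained Rust sketch of how the output schema of a `Using` join differs from that of an `On` join. It is not DataFusion code: `Field`, `JoinConstraint`, and `join_schema` are hypothetical names invented for this example, and keeping the left-hand copy of the join column is an assumption made only for illustration.

```rust
/// Hypothetical stand-in for a relation-qualified schema field.
/// This is NOT DataFusion's real field type.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    relation: String,
    name: String,
}

impl Field {
    fn new(relation: &str, name: &str) -> Self {
        Field {
            relation: relation.to_string(),
            name: name.to_string(),
        }
    }
}

/// Hypothetical join constraint kinds, used only in this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinConstraint {
    On,
    Using,
}

/// Build the output schema of a join (illustrative only).
///
/// - `On`: every field from both sides is kept, so a join column
///   `foo` appears twice (e.g. `f1.foo` and `f2.foo`).
/// - `Using`: the right-hand copy of each join column is dropped,
///   so the output has a single field per join column.
///   (Assumption: the left-hand copy is the one that survives.)
fn join_schema(
    left: &[Field],
    right: &[Field],
    join_columns: &[&str],
    constraint: JoinConstraint,
) -> Vec<Field> {
    let mut out: Vec<Field> = left.to_vec();
    for field in right {
        let is_join_column = join_columns.contains(&field.name.as_str());
        if constraint == JoinConstraint::Using && is_join_column {
            // Skip the duplicate join column from the right side.
            continue;
        }
        out.push(field.clone());
    }
    out
}
```

Under these assumptions, joining `f1(foo, a)` with `f2(foo, b)` on `foo` gives four fields with `On` and three with `Using`. If the logical schema kept both `f1.foo` and `f2.foo` for a `Using` join while the physical plan emitted only one join column, the two schemas would differ in length, which is the mismatch the invariant rules out.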
