chasingegg commented on pull request #35168: URL: https://github.com/apache/spark/pull/35168#issuecomment-1019416563
> I think a complete fix is to let `Union` have fresh attribute IDs for its output, as technically `Union`'s outputs are totally different from its first child's outputs. It's going to be a big change, so we should only do it in the master branch.
>
> Can we open a PR for 3.0 with this surgical fix? Let's make sure the test still passes with this fix in 3.0.

Actually, not all tests pass in the master branch yet. In the pyspark module, there seems to be a `repeat` function that produces two identical columns, like `a, a`. Today `select a` works fine on that, but after this fix it would complain that it is ambiguous. So I'm wondering whether we could remove the pyspark part, or carefully skip the alias operation in cases where the original `Union` already works fine. WDYT @cloud-fan
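To make the `a, a` concern concrete, here is a toy model in plain Python. It is not Spark's resolver: `Attr`, `fresh`, and `resolve` are hypothetical names, and the sketch assumes that name resolution collapses candidates sharing one attribute ID, so it only illustrates why giving each output column a fresh ID could turn a working `select a` into an ambiguity error.

```python
from dataclasses import dataclass
from itertools import count

_ids = count()  # hypothetical global ID generator for this toy model


@dataclass(frozen=True)
class Attr:
    """Toy stand-in for an attribute: a name plus a unique ID."""
    name: str
    attr_id: int


def fresh(name):
    """Create an attribute with a new, never-used ID."""
    return Attr(name, next(_ids))


def resolve(output, name):
    """Resolve a column name against a plan's output.

    Assumption: candidates that are the very same attribute (same ID)
    collapse to one, so only distinct IDs count as ambiguous.
    """
    candidates = {attr for attr in output if attr.name == name}
    if not candidates:
        raise LookupError(f"cannot resolve '{name}'")
    if len(candidates) > 1:
        raise ValueError(f"reference '{name}' is ambiguous")
    return next(iter(candidates))


a = fresh("a")
before_fix = [a, a]  # the same attribute listed twice
after_fix = [fresh(col.name) for col in before_fix]  # each output re-aliased

resolved = resolve(before_fix, "a")  # succeeds: both candidates share one ID

try:
    resolve(after_fix, "a")
    ambiguous_after_fix = False
except ValueError:
    ambiguous_after_fix = True  # two distinct IDs now carry the name 'a'
```

In this model, skipping the re-alias step whenever the output names already resolve cleanly would keep `before_fix` untouched, which is the kind of conditional behaviour asked about above.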
