Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10630#issuecomment-174165179
When resolving the conflicts, I realized the multi-child `Union` might
have duplicate `exprId` values. So far, I have not added a function to
de-duplicate them. If it is needed, it is not trivial work: when a `Union`
has hundreds of children, the current per-pair de-duplication becomes
infeasible, so we would need to rewrite the whole `dedup` function.
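Just to illustrate the kind of change I mean, here is a rough, untested sketch of a single-pass version (the helper name `deduplicateChildren` and the re-aliasing strategy are only illustrative assumptions, not what this PR implements):

```scala
import scala.collection.mutable

import org.apache.spark.sql.catalyst.expressions.{Alias, ExprId}
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project}

// Illustrative single-pass de-duplication: walk the Union's children once,
// keeping a running set of exprIds seen so far, and re-alias any child whose
// output collides with an earlier child. This is one O(n) pass over the
// children instead of comparing every pair.
def deduplicateChildren(children: Seq[LogicalPlan]): Seq[LogicalPlan] = {
  val seen = mutable.HashSet.empty[ExprId]
  children.map { child =>
    val rewritten = if (child.output.exists(a => seen.contains(a.exprId))) {
      // Alias(...) generates a fresh ExprId, so the added Project exposes
      // attributes with the same names/types but non-conflicting ids.
      val newOutput = child.output.map { a =>
        if (seen.contains(a.exprId)) Alias(a, a.name)() else a
      }
      Project(newOutput, child)
    } else {
      child
    }
    rewritten.output.foreach(a => seen += a.exprId)
    rewritten
  }
}
```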
Let me know if we should open a separate PR for it now. So far, unlike
`Intersect`, we have not hit any issue even when duplicate `exprId`
values exist. Thanks!