GitHub user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/6587#issuecomment-114983340
I'm going to voice my objection here again. At the core there is a
fundamental problem: we care about two types of equality, structural
equality (i.e., all of the fields of the two objects are the same) and
reference equality (i.e., the two attributes refer to the same spot in
the input tuple).
I believe that it would be confusing to have `equals` and `hashCode` refer to
anything other than structural equality. We cannot get rid of the name part of
attribute references (or ignore it in equality) because we are case preserving
even when we are case insensitive. So attributes that have different names
*are different*.
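To make the distinction concrete, here is a minimal sketch. `Attr`, `exprId`, and `sameReference` are hypothetical names invented for illustration, not Catalyst's actual classes or methods. It assumes an attribute carries a preserved name plus an id for the input slot it refers to.

```scala
// Hypothetical stand-in for an attribute reference; NOT Catalyst's real class.
// `name` keeps the user's original casing, `exprId` identifies the input slot.
final case class Attr(name: String, exprId: Long) {
  // Reference equality: both attributes point at the same input slot,
  // regardless of how the name is spelled.
  def sameReference(other: Attr): Boolean = exprId == other.exprId
}

object EqualityDemo {
  val a: Attr = Attr("userId", exprId = 1L)
  val b: Attr = Attr("USERID", exprId = 1L) // same slot, different preserved casing
  val c: Attr = Attr("userId", exprId = 2L) // same name, different slot

  def main(args: Array[String]): Unit = {
    // Case-class `==` is structural: every field must match.
    println(a == b)             // false
    println(a.sameReference(b)) // true
    println(a == c)             // false
    println(a.sameReference(c)) // false
  }
}
```

In this sketch `equals` and `hashCode` stay purely structural, and the reference check is a separate, explicitly named method that callers opt into.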
I don't think that it is too big of a burden for developers to watch for
these types of equality and make sure they are applied properly when doing code
review. I do think that large refactorings like this are likely to introduce
regressions.