gaogaotiantian commented on PR #53113:
URL: https://github.com/apache/spark/pull/53113#issuecomment-3550657237

   @zhengruifeng using `__eq__` from `DataType` is much better, but I still 
have concerns. `__dict__` does not contain all of an instance's state, even at the Python level.
   
   For example:
   
   ```python
   from pyspark.sql.types import UserDefinedType
   
   class Vector(UserDefinedType):
       __slots__ = ["dimension"]
   
       def __init__(self, dimension):
           super(Vector, self).__init__()
           self.dimension = dimension
   
   type1 = Vector(3)
   type2 = Vector(2)
   
   
   assert type1.__dict__ == type2.__dict__  # This will be falsely True: `dimension` is stored in a slot, not in `__dict__`
   ```
   
   This is a valid (even encouraged) use of Python classes. I know it might 
not be common, but our code will compare such types incorrectly.
   
   I think the safest way is to force users to write their own `__eq__` and 
`__hash__` methods so they can't blame us for failed code.
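
A minimal sketch of what that could look like on the user side. To keep it runnable without Spark, `UserDefinedType` here is a hypothetical empty stand-in for `pyspark.sql.types.UserDefinedType`, and the particular `__eq__`/`__hash__` bodies are just one reasonable choice, not something this PR prescribes:

```python
class UserDefinedType:
    """Hypothetical stand-in for pyspark.sql.types.UserDefinedType."""


class Vector(UserDefinedType):
    __slots__ = ["dimension"]

    def __init__(self, dimension):
        super().__init__()
        self.dimension = dimension

    def __eq__(self, other):
        # Compare the slot explicitly; __dict__ never sees it.
        if type(other) is not type(self):
            return NotImplemented
        return self.dimension == other.dimension

    def __hash__(self):
        # Defining __eq__ sets __hash__ to None, so define it explicitly
        # and keep it consistent with __eq__.
        return hash((type(self).__name__, self.dimension))


type1 = Vector(3)
type2 = Vector(2)

assert type1.__dict__ == type2.__dict__  # both are empty dicts
assert type1 != type2                    # but the explicit __eq__ tells them apart
assert type1 == Vector(3)
assert hash(type1) == hash(Vector(3))
```

With both methods defined, equality no longer depends on where the attribute is stored.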


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

