jackye1995 commented on code in PR #4953:
URL: https://github.com/apache/iceberg/pull/4953#discussion_r889170956
##########
api/src/main/java/org/apache/iceberg/util/CharSequenceSet.java:
##########
@@ -172,7 +172,7 @@ public boolean equals(Object o) {
@Override
public int hashCode() {
- return Objects.hash(wrapperSet);
Review Comment:
Thanks for the explanation. I see this line in your blog:
> The problem is that although the documentation says hashCode() doesn’t
> provide a consistency guarantee, the Java standard library behaves as if it did
> provide the guarantee. People start relying on it, and since
> backwards-compatibility is rated so highly in the Java community, it will
> probably never ever be changed, even though the documentation would allow it to
> be changed. So the JVM gets the worst of both worlds: a hash table
> implementation that is open to DoS attacks, but also a hash function that can’t
> always safely be used for communication between processes. :(
>
> Therefore…
>
> So what I’d like to ask for is this: if you’re building a distributed
> framework based on the JVM, please don’t use Java’s hashCode() for anything
> that needs to work across different processes. Because it’ll look like it works
> fine when you use it with strings and numbers, and then someday a brave soul
> will use (e.g.) a protocol buffers object, and then spend days banging their
> head against a wall trying to figure out why messages are getting sent to the
> wrong servers.

In that case, it feels like we should really update the errorprone config so
that it does not report a warning for not using hashCode...
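
To make the concern in the quoted passage concrete, here is a minimal sketch (not code from this PR). It assumes a hypothetical helper, `stablePartition`, that hashes the UTF-8 bytes of a key with `java.util.zip.CRC32`. The class name, the helper, and the sample key are all made up for illustration, and CRC32 is just one possible content-based hash.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class StableHashSketch {
  // Hypothetical helper (not part of Iceberg): derives a partition number from
  // the UTF-8 bytes of the key, so the result depends only on the characters,
  // not on the CharSequence implementation or on the JVM instance.
  static int stablePartition(CharSequence key, int numPartitions) {
    CRC32 crc = new CRC32();
    crc.update(key.toString().getBytes(StandardCharsets.UTF_8));
    // getValue() is an unsigned 32-bit value stored in a long, so it is never negative.
    return (int) (crc.getValue() % numPartitions);
  }

  public static void main(String[] args) {
    // Same characters, two different CharSequence implementations.
    CharSequence a = "data-file-1";
    CharSequence b = new StringBuilder("data-file-1");

    // StringBuilder does not override hashCode(), so it falls back to the
    // identity hash, which is unrelated to content and can differ between runs.
    System.out.println("hashCode equal: " + (a.hashCode() == b.hashCode()));

    // The content-based hash agrees for both, in any process.
    System.out.println("stable equal:   "
        + (stablePartition(a, 16) == stablePartition(b, 16)));
  }
}
```

The second comparison holds in every process because it depends only on the bytes of the key; the first depends on which class happens to implement `CharSequence`.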
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]