[
https://issues.apache.org/jira/browse/HADOOP-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978922#action_12978922
]
Doug Cutting commented on HADOOP-6904:
--------------------------------------
About hash sizes: Java hashcodes are designed for hash tables, which
tolerate collisions. Identifying protocols does not tolerate them: a
collision between protocol hashes would cause confusing failures.
Looking at:
http://en.wikipedia.org/wiki/Birthday_problem#Probability_table
With 32-bit hashes, if there are 93 versions of a protocol, then there's a 1
in a million chance (p=10^-6) of a collision, assuming a very good hash
function. Those are probably acceptable odds: a protocol might have 93
variations, but there probably won't be anywhere near a million protocols. If
we used 64-bit hashes, then we could have millions of versions of millions of
protocols before there's a decent chance of a collision. That's a lot better.
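
The figures above can be checked with the usual birthday-problem approximation, p ~= 1 - exp(-n(n-1) / (2 * 2^bits)). The following is a minimal sketch, assuming an ideal (uniform) hash function; the class and method names are made up for illustration and are not part of Hadoop.

```java
final class BirthdayBound {
    /**
     * Approximate probability that at least two of n uniformly random
     * hashes of the given bit width collide (birthday approximation).
     */
    static double collisionProbability(long n, int bits) {
        double pairs = n * (n - 1.0) / 2.0;   // number of distinct pairs
        double space = Math.pow(2.0, bits);   // number of possible hash values
        // 1 - exp(-x), computed with expm1 to stay accurate for tiny x
        return -Math.expm1(-pairs / space);
    }

    public static void main(String[] args) {
        // 93 versions with 32-bit hashes
        System.out.println(collisionProbability(93, 32));
        // roughly six million versions with 64-bit hashes
        System.out.println(collisionProbability(6_100_000L, 64));
    }
}
```

Both calls should come out close to one in a million, which matches the rows of the Wikipedia table for 32 and 64 bits.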
In Avro we use 128-bit MD5 hashes to identify protocols, which is probably
overkill. So I'm okay with 32-bit hashes here, but I'd be happier with 64. A
64-bit string hash is easy to write:
http://stackoverflow.com/questions/1660501/what-is-a-good-64bit-hash-function-in-java-for-textual-strings
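
As an illustration of how little code a 64-bit string hash needs, here is a minimal sketch using FNV-1a over the UTF-8 bytes of the string. This is one possible choice, not necessarily the function from the linked answer and not what the patch uses; the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class Fnv64 {
    private static final long OFFSET_BASIS = 0xcbf29ce484222325L; // FNV-1a 64-bit offset basis
    private static final long PRIME = 0x100000001b3L;             // FNV 64-bit prime

    /** Returns the 64-bit FNV-1a hash of the string's UTF-8 encoding. */
    static long hash(String s) {
        long h = OFFSET_BASIS;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);  // mix in the next byte as an unsigned value
            h *= PRIME;       // long multiplication wraps modulo 2^64
        }
        return h;
    }

    public static void main(String[] args) {
        // Hypothetical input: a protocol name plus a method signature.
        System.out.println(Long.toHexString(hash("ClientProtocol#getBlockLocations(String,long,long)")));
    }
}
```

FNV-1a is not cryptographic, so it only guards against accidental collisions, which is the concern in this comment.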
> A baby step towards inter-version RPC communications
> ----------------------------------------------------
>
> Key: HADOOP-6904
> URL: https://issues.apache.org/jira/browse/HADOOP-6904
> Project: Hadoop Common
> Issue Type: New Feature
> Components: ipc
> Affects Versions: 0.22.0
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: majorMinorVersion.patch, majorMinorVersion1.patch,
> rpcCompatible-trunk.patch, rpcCompatible-trunk1.patch,
> rpcCompatible-trunk2.patch, rpcVersion.patch, rpcVersion1.patch
>
>
> Currently RPC communication in Hadoop is very strict. If a client has a
> different version from that of the server, a VersionMismatched exception is
> thrown and the client cannot connect to the server. This forces us to update
> both client and server all at once if an RPC protocol is changed. But sometimes
> different versions do not mean the client & server are incompatible. It
> would be nice if we could relax this restriction and support
> inter-version communication.
> My idea is that DfsClient catches the VersionMismatched exception when it
> connects to NameNode. It then checks whether the client & the server are
> compatible. If yes, it sets the NameNode version in the dfs client and allows
> the client to continue talking to NameNode. Otherwise, it rethrows the
> VersionMismatch exception.
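
The catch, check, and rethrow flow described in the issue could look roughly like the sketch below. Every name in it (VersionFallbackSketch, VersionMismatchException, Server, negotiate) and the use of a set of compatible versions are assumptions made for illustration; none of it is Hadoop's actual API or the attached patch.

```java
import java.util.Set;

final class VersionFallbackSketch {
    /** Hypothetical stand-in for the exception raised on a version mismatch. */
    static final class VersionMismatchException extends RuntimeException {
        final long serverVersion;

        VersionMismatchException(long serverVersion) {
            super("server speaks version " + serverVersion);
            this.serverVersion = serverVersion;
        }
    }

    /** Hypothetical server handle that rejects clients with a different version. */
    interface Server {
        void connect(long clientVersion);
    }

    /**
     * Tries to connect with the client's own version. On a mismatch, adopts the
     * server's version if it is in the compatible set, otherwise rethrows.
     * Returns the version the client should use from then on.
     */
    static long negotiate(Server server, long clientVersion, Set<Long> compatibleVersions) {
        try {
            server.connect(clientVersion);
            return clientVersion;
        } catch (VersionMismatchException e) {
            if (compatibleVersions.contains(e.serverVersion)) {
                return e.serverVersion; // compatible: keep talking using the server's version
            }
            throw e; // incompatible: surface the original failure
        }
    }
}
```

How compatibility is actually decided (a version list, a major/minor scheme, or protocol hashes as discussed above) is the open design question; the set here is only a placeholder for that check.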