Some additional details... A few years ago, we moved from Protocol Buffers 2.4.1 to 2.5.0. There were some challenges with that upgrade, because 2.5.0 was not backwards-compatible with 2.4.1. We needed to coordinate carefully with projects downstream of Hadoop that receive our protobuf classes through transitive dependency. Here are a few issues with more background:
https://issues.apache.org/jira/browse/HADOOP-9845
https://issues.apache.org/jira/browse/HBASE-8165
https://issues.apache.org/jira/browse/HIVE-5112

It was important to complete this upgrade before Hadoop 2.x came out of beta. After that, we committed to a policy of backwards-compatibility within the 2.x release line. I can't find a statement about whether or not Protocol Buffers 2.6.1 is backwards-compatible with 2.5.0 (both at compile time and on the wire). Do you know the answer? If it's backwards-incompatible, then we wouldn't be able to do this upgrade within Hadoop 2.x, though we could consider it for 3.x (trunk).

In general, we upgrade dependencies when a new release offers a compelling benefit, not solely to keep up with the latest. In the case of 2.5.0, there was a performance benefit. Looking at the release notes for 2.6.0 and 2.6.1, I don't see anything particularly compelling. (That's just my opinion though, and others might disagree.)

https://github.com/google/protobuf/blob/master/CHANGES.txt

BTW, if anyone is curious, it's possible to try a custom build right now linked against 2.6.1. You'd pass -Dprotobuf.version=2.6.1 and -Dprotoc.path=<path to protoc 2.6.1 binary> when you run the mvn command.

--Chris Nauroth

On 5/13/15, 8:59 AM, "Allen Wittenauer" <[email protected]> wrote:

>
>On May 13, 2015, at 5:02 AM, Alan Burlison <[email protected]>
>wrote:
>
>> The current version of Protocol Buffers is 2.6.1 but the current
>>version required by Hadoop is 2.5.0. Is there any reason for this, or
>>should I log a JIRA to get it updated?
>
> The story of protocol buffers is part of a shameful past where Hadoop
>trusted Google. This was a terrible mistake, based upon the last time
>the project upgraded. 2.4->2.5 required some source level, non-backward
>compatible, and completely-avoidable-but-G-made-us-do-it-anyway surgery
>to make work. This also ended up being a flag day for every single
>developer who not only worked with Hadoop but all of the downstream
>projects as well. Big disaster.
>
> The fact that when Google shut down Google Code, they didn't even tag
>previous releases in the github source tree without significant amount
>of pressure from the open source community was just adding insult to
>injury. As a result, I believe the collective opinion is to just flat
>out avoid adding any more Google bits into the system.
>
> See also: guava, which suffers from the same shortsightedness.
>
> At some point, we'll either upgrade, switch to a different protocol
>serialization format, or fork protobuf.
>
