[
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685038#comment-15685038
]
Andrew Wang commented on HDFS-11096:
------------------------------------
I talked offline with [~kasha] about compatibility in general, which was very
helpful. Some notes:
h2. Source and binary compatibility
From the API guidelines:
{quote}
Public-Stable APIs must be deprecated for at least one major release prior to
their removal in a major release.
{quote}
From the ABI guidelines:
{quote}
In particular for MapReduce applications, the developer community will try our
best to provide binary compatibility across major releases e.g.
applications using org.apache.hadoop.mapred.
...
APIs are supported compatibly across hadoop-1.x and hadoop-2.x. See
Compatibility for MapReduce applications between hadoop-1.x and hadoop-2.x for
more details.
{quote}
The intention encoded in these guidelines is that we should strive not to break
API or ABI compatibility in a major release. Regarding Public/Stable APIs, I
think this means we can't remove one in 3.0 unless it was deprecated in 2.2.
There are also other ways of breaking ABI compatibility (e.g. adding a new
abstract method to an interface), which I think should also fall under this
guideline.
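To make this concrete, here's a toy sketch of both cases: a Public/Stable
method deprecated in 2.x so it can be removed in a later major release, and a
new abstract method whose addition breaks implementations compiled against the
old interface even though their source never changes. {{RecordStore}} is a
hypothetical interface, not a real Hadoop API; only the audience/stability
annotations are the real ones from hadoop-annotations.
{code:java}
// Toy example only: RecordStore is a hypothetical Public/Stable interface,
// not an actual Hadoop API.
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

@InterfaceAudience.Public
@InterfaceStability.Stable
public interface RecordStore {
  void put(String key, byte[] value);

  /**
   * Deprecated in a 2.x release; per the guideline it must stay deprecated
   * for at least one major release before removal (deprecate in 2.x,
   * remove no earlier than 3.0).
   */
  @Deprecated
  byte[] fetch(String key);

  // Uncommenting this in 3.0 is source- and binary-incompatible: any
  // RecordStore implementation compiled against 2.x no longer implements
  // the interface, and calls to it fail at runtime with AbstractMethodError.
  // byte[] fetchLatest(String key);
}
{code}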
Since users can bundle MR jars with their application, MR compat is somewhat
less important than HDFS/YARN compatibility.
h2. Wire compatibility
Client/server wire compatibility is important since clients might want to
read/write data or submit jobs across versions.
Server/server compatibility is important for rolling upgrade.
From the compat guide:
{quote}
Compatibility can be broken only at a major release, though breaking
compatibility even at major releases has grave consequences and should be
discussed in the Hadoop community.
{quote}
If we had to prioritize, I think client/server compatibility is the more
important of the two, though based on my audit of the HDFS PBs for alpha1,
server/server also seemed okay.
h2. Discussion
The biggest need here is for testing.
Source compatibility testing is the easiest, and relatively well covered.
Downstream projects have been picking up 3.0.0-alpha1, and here at Cloudera,
we've got all of the CDH projects compiling against alpha1 with posted fixes.
Binary compatibility is more difficult, and it is not covered by Cloudera's
internal testing, since we compile all of CDH as a monolith. JACC does cover
it, though, and I set up [nightly
runs|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-trunk-JACC/] for
trunk on Jenkins.
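As a rough illustration of what JACC checks at much larger scale, the sketch
below probes via reflection whether a method signature that a 2.x-compiled
application links against still resolves on a 3.x classpath. The
{{FileSystem.get(URI, Configuration)}} signature is a real public API;
everything else is just a toy harness, not a substitute for the full report.
{code:java}
import java.lang.reflect.Method;

// Toy version of the kind of signature check JACC performs across whole jars:
// verify that a method an app compiled against 2.x would link to still exists
// with the same signature when the 3.x jars are on the classpath.
public class AbiSmokeCheck {
  public static void main(String[] args) {
    checkMethod("org.apache.hadoop.fs.FileSystem", "get",
        "java.net.URI", "org.apache.hadoop.conf.Configuration");
  }

  static void checkMethod(String className, String methodName,
      String... paramTypeNames) {
    try {
      Class<?> cls = Class.forName(className);
      Class<?>[] params = new Class<?>[paramTypeNames.length];
      for (int i = 0; i < paramTypeNames.length; i++) {
        params[i] = Class.forName(paramTypeNames[i]);
      }
      Method m = cls.getMethod(methodName, params);
      System.out.println("OK: " + m);
    } catch (ClassNotFoundException | NoSuchMethodException e) {
      // A 2.x-compiled caller would instead hit NoClassDefFoundError or
      // NoSuchMethodError at runtime.
      System.out.println("BINARY INCOMPATIBLE: " + className + "#" + methodName);
    }
  }
}
{code}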
Wire compatibility is the most difficult. There's no automated check for PB or
REST compatibility, and setting up cross-version clusters is essentially
impossible in a unit test. This has been a problem even within just the 2.x
line, so there's a real need for better cross-version integration testing.
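For wire compatibility, the closest thing to an automated check would be a
cross-version smoke test along these lines: compile it against the 2.x client
jars and point it at a 3.x cluster (and vice versa). This is only a sketch; the
NameNode URI is a placeholder, and standing up the two versions is exactly the
part that doesn't fit in a unit test.
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Cross-version smoke test sketch: build against the 2.x client jars and run
// it against a 3.x cluster (or the reverse). The NameNode URI is a placeholder.
public class CrossVersionSmokeTest {
  public static void main(String[] args) throws Exception {
    URI nn = URI.create(args.length > 0 ? args[0] : "hdfs://nn.example.com:8020");
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(nn, conf)) {
      Path p = new Path("/tmp/cross-version-smoke.txt");
      // Exercise the basic client/server RPCs: create, write, read, delete.
      try (FSDataOutputStream out = fs.create(p, true)) {
        out.writeUTF("hello from an old client");
      }
      try (FSDataInputStream in = fs.open(p)) {
        System.out.println("read back: " + in.readUTF());
      }
      fs.delete(p, false);
      System.out.println("basic client/server wire compatibility OK against " + nn);
    }
  }
}
{code}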
If you're interested in compatibility, additional input on prioritization and
test strategy would be appreciated.
> Support rolling upgrade between 2.x and 3.x
> -------------------------------------------
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: rolling upgrades
> Affects Versions: 3.0.0-alpha1
> Reporter: Andrew Wang
> Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling
> reasons to break compatibility, let's restore the ability to do rolling
> upgrades to 3.x releases.