On 27 10 08 13:50, "Sanjay Radia" <[EMAIL PROTECTED]> wrote:
> I have merged the various Hadoop 1.0 Compatibility items that have
> been discussed in this thread and categorized and listed them below.
>
>
>
> Hadoop 1.0 Compatibility
> ==================
>
> Standard release numbering:
> - Only bug fixes in dot releases: m.x.y
> - no changes to APIs, disk formats, protocols, configs, etc.
> - new features in major (m.0.0) and minor (m.x.0) releases
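The numbering scheme above implies a simple compatibility predicate. A minimal Java sketch (class and method names are hypothetical, not Hadoop APIs):

```java
// Hypothetical sketch: decide whether a client built against one
// release can talk to a server running another, under the numbering
// scheme above (m = major, x = minor, y = dot/bug-fix release).
public class ReleaseVersion {
    public final int major, minor, dot;

    public ReleaseVersion(int major, int minor, int dot) {
        this.major = major;
        this.minor = minor;
        this.dot = dot;
    }

    // Per the rules below (1 and 3.a): a client is compatible with a
    // server of the same major version and an equal or newer minor
    // version; dot releases never affect compatibility.
    public boolean clientCompatibleWith(ReleaseVersion server) {
        return this.major == server.major && this.minor <= server.minor;
    }
}
```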
>
>
> 1. API Compatibility
> -------------------------
> No need for client recompilation when upgrading across minor releases
> (i.e. from m.x to m.y, where x <= y).
> Classes or methods deprecated in m.x can be removed in (m+1).0.
> Note that this is stronger than what we have been doing in Hadoop 0.x
> releases.
> Motivation: These are the industry-standard compatibility rules for
> major and minor releases.
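The deprecation rule above can be illustrated with a hypothetical library class (not a real Hadoop API): a method deprecated in some m.x release must keep working in every later m.* release and may disappear no earlier than (m+1).0.

```java
// Hypothetical sketch of the deprecation lifecycle in rule 1.
public class FileUtilExample {
    /**
     * Suppose this was deprecated in 1.2. Under rule 1 it must remain
     * (and keep working) in every 1.x release, and may be removed no
     * earlier than 2.0.
     * @deprecated use {@link #copy(String, String)} instead
     */
    @Deprecated
    public boolean copyFile(String src, String dst) {
        return copy(src, dst);  // old method delegates to its replacement
    }

    /** Replacement API introduced in the same minor release. */
    public boolean copy(String src, String dst) {
        // real work elided; this sketch always "succeeds"
        return true;
    }
}
```

Clients compiled against 1.2 thus keep linking (and merely see a deprecation warning at compile time) throughout the 1.x line.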
>
> 2 Data Compatibility
> --------------------------
> 2.a HDFS metadata and data can change across minor or major releases,
> but such changes are transparent to user applications. That is, a
> release upgrade must automatically convert the metadata and data as
> needed. Further, a release upgrade must allow a cluster to roll back
> to the older version and its older disk format.
> Motivation: Users expect file systems to preserve data transparently
> across releases.
>
> 2.a-Stronger
> HDFS metadata and data can change across minor or major releases, but
> such changes are transparent to user applications. That is, a release
> upgrade must automatically convert the metadata and data as needed.
> During *minor* releases, disk format changes have to be backward and
> forward compatible; i.e. an older version of Hadoop can be started on
> a newer version of the disk format. Hence a version rollback is
> simple: just restart the older version of Hadoop. Major releases
> allow more significant changes to the disk format and need only be
> backward compatible; however, a major release upgrade must allow a
> cluster to roll back to the older version and its older disk format.
> Motivation: Minor releases are very easy for an admin to roll back.
>
>
> 2.a-WeakerAutomaticConversion:
> Automatic conversion is supported across a small number of releases.
> If a user wants to jump across multiple releases, he may be forced to
> go through a few intermediate releases to get to the final desired
> release.
>
> 3. Wire Protocol Compatibility
> ----------------------------------------
> We offer no wire compatibility in our 0.x releases today.
> The motivation *isn't* to make our protocols public. Applications
> will not call the protocols directly but through a library (in our
> case the FileSystem class and its implementations). Instead, the
> motivation is that customers run multiple clusters and have apps that
> access data across clusters. Customers cannot be expected to update
> all clusters simultaneously.
>
>
> 3.a Old m.x clients can connect to new m.y servers, where x <= y, but
> the old clients might get reduced functionality or performance. m.x
> clients might not be able to connect to (m+1).z servers.
>
> 3.b New m.y clients must be able to connect to old m.x servers, where
> x < y, but only for old m.x functionality.
> Comment: Generally, old API methods continue to use old rpc methods.
> However, it is legal to have new implementations of old API methods
> call new rpc methods, as long as the library transparently handles
> the fallback case for old servers.
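The fallback case in 3.b can be sketched as client-side logic. All names below are hypothetical, and real Hadoop RPC stubs do not look exactly like this; the point is only the try-new, fall-back-to-old pattern inside an unchanged public API method.

```java
// Hypothetical sketch of 3.b: a new client implementing an *old* API
// method may try a new rpc first, but must transparently fall back to
// the old rpc when talking to an old server.
public class DfsClientSketch {
    public interface Rpc {
        long getLengthV2(String path);  // new rpc, servers >= m.y only
        long getLength(String path);    // old rpc, all m.x servers
    }

    // Thrown by the stub when the server does not know the method.
    public static class UnknownMethodException extends RuntimeException {}

    private final Rpc server;

    public DfsClientSketch(Rpc server) { this.server = server; }

    // Old public API method; callers never see which rpc was used.
    public long getFileLength(String path) {
        try {
            return server.getLengthV2(path);
        } catch (UnknownMethodException e) {
            return server.getLength(path);  // old server: fall back
        }
    }
}
```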
>
> 3.c At any major release transition [i.e. from a release m.x to a
> release (m+1).0], a user should be able to read data from a cluster
> running the old version. (Or shall we generalize this to: from m.x to
> (m+i).z?)
>
> Motivation: Data copying across clusters is a common operation for
> many customers. For example, this is routinely done at Yahoo; another
> use case is HADOOP-4058. Today, http (or hftp) provides a guaranteed
> compatible way of copying data across versions. Clearly one cannot
> force a customer to simultaneously update all its Hadoop clusters to
> a new major release. The above documents this requirement; we can
> satisfy it via the http/hftp mechanism or some other mechanism.
>
> 3.c-Stronger
> Shall we add a stronger requirement for 1.0: wire compatibility
> across major versions? That is, not just for reading but for all
> operations. This can be supported by class loading or other games.
> Note we can wait to provide this when 2.0 happens. If Hadoop
> provided this guarantee, then it would allow customers to partition
> their data across clusters without risking apps breaking across major
> releases due to wire incompatibility issues.
>
> Motivation: Data copying is a compromise. Customers really want to
> run apps across clusters running different versions. (See item 2.)
>
>
> 4. Intra Hadoop Service Compatibility
> --------------------------------------------------
> The HDFS service has multiple components (NN, DN, Balancer) that
> communicate amongst themselves. Similarly, the MapReduce service has
> components (JT and TT) that communicate amongst themselves.
> Currently we require that all the components of a service have the
> same build version and hence talk the same wire protocols.
> This build-version checking prevents rolling upgrades. It has the
> benefit that the admin can ensure that the entire cluster has exactly
> the same build version.
>
> 4.a HDFS and MapReduce require that their respective sub-components
> have the same build version in order to form a cluster.
> [i.e. maintain the current mechanism.]
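The current mechanism in 4.a amounts to a strict equality check at registration time. A minimal sketch (class and method names hypothetical, not the actual Hadoop code):

```java
// Hypothetical sketch of 4.a: a sub-component (e.g. a DN or TT) is
// rejected at registration unless its build version is identical to
// the master's -- which is exactly what blocks rolling upgrades.
public class BuildVersionCheck {
    public static class IncompatibleVersionException extends RuntimeException {
        public IncompatibleVersionException(String msg) { super(msg); }
    }

    private final String masterBuildVersion;

    public BuildVersionCheck(String masterBuildVersion) {
        this.masterBuildVersion = masterBuildVersion;
    }

    // Called when a worker registers with the master (NN or JT).
    public void verifyWorker(String workerBuildVersion) {
        if (!masterBuildVersion.equals(workerBuildVersion)) {
            throw new IncompatibleVersionException(
                "worker build " + workerBuildVersion
                + " != master build " + masterBuildVersion);
        }
    }
}
```

The 4.a-Stronger option below would replace this equality test with a wire-protocol version range check, so that m.x and m.y components could coexist during an upgrade.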
>
> 4.a-Stronger: Intra-service wire-protocol compatibility
> [I am listing this here to document it, but I don't think we are
> ready to take this on for Hadoop 1.0. Alternatively, we could require
> intra-service wire compatibility but check for build version till we
> are ready for rolling upgrades.]
>
> Wire protocols between internal Hadoop components are compatible
> across minor versions.
> Examples are NN-DN, DN-DN, NN-Balancer, etc.
> Old m.x components can talk to new m.y components (x <= y).
> Wire compatibility can break across major versions.
> Motivation: Allow rolling upgrades.
>