[
https://issues.apache.org/jira/browse/HADOOP-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Allen Wittenauer resolved HADOOP-5071.
--------------------------------------
Resolution: Won't Fix
Hadoop 1.0 was released.
> Hadoop 1.0 Compatibility Requirements
> -------------------------------------
>
> Key: HADOOP-5071
> URL: https://issues.apache.org/jira/browse/HADOOP-5071
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Sanjay Radia
> Assignee: Sanjay Radia
>
> The purpose of this Jira is to decide on the Hadoop 1.0 compatibility
> requirements.
> A proposal that was discussed on the email alias [email protected] is
> described below.
> Release terminology used below:
> *Standard release numbering: major, minor, dot releases*
> * Only bug fixes in dot releases: m.x.y
> ** no changes to API, disk format, protocols or config etc. in a dot release
> * new features in major (m.0) and minor (m.x.0) releases
> *Hadoop Compatibility Proposal*
> - *1 API Compatibility*
> No need for client recompilation when upgrading across minor releases (i.e.,
> from m.x to m.y, where x <= y).
> Classes or methods deprecated in m.x can be removed in (m+1).0.
> Note that this is stronger than what we have been doing in Hadoop 0.x
> releases.
> These are fairly standard compatibility rules for major and minor
> releases.
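
The release numbering and API rule quoted above can be sketched as a small version check. This is only an illustration of the proposal as worded; the names `Release`, `parse`, `release_kind` and `client_runs_without_recompile` are hypothetical and are not part of Hadoop.

```python
from typing import NamedTuple


class Release(NamedTuple):
    major: int
    minor: int
    dot: int = 0


def parse(version: str) -> Release:
    """Parse 'm', 'm.x' or 'm.x.y' into a Release, padding missing parts with 0."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return Release(*parts[:3])


def release_kind(r: Release) -> str:
    """Classify a release as 'major' (m.0), 'minor' (m.x.0) or 'dot' (m.x.y)."""
    if r.dot != 0:
        return "dot"
    if r.minor != 0:
        return "minor"
    return "major"


def client_runs_without_recompile(built_against: str, upgraded_to: str) -> bool:
    """Rule 1 as quoted: no recompilation from m.x to m.y when x <= y.

    Dot releases are ignored because the proposal says they carry no API
    changes. Crossing a major version gives no guarantee, since deprecated
    classes or methods may be removed in (m+1).0.
    """
    old, new = parse(built_against), parse(upgraded_to)
    return old.major == new.major and old.minor <= new.minor


if __name__ == "__main__":
    print(release_kind(parse("1.2.3")))
    print(client_runs_without_recompile("1.1", "1.3"))
    print(client_runs_without_recompile("1.3", "2.0"))
```

The check is deliberately conservative: it reports `False` for any major-version change, even though a client that uses no deprecated API might still work.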
> - *2 Data Compatibility*
> -- Motivation: Users expect file systems to preserve data transparently
> across releases.
> -- 2.a HDFS metadata and data can change across minor or major releases, but
> such changes are transparent to user applications. That is, a release upgrade
> must automatically convert the metadata and data as needed. Further, a
> release upgrade must allow a cluster to roll back to the older version and
> its older disk format. (Rollback needs to restore the original data, not any
> updated data.)
> -- 2.a-WeakerAutomaticConversion:
> Automatic conversion is supported across a small number of releases. A user
> who wants to jump across multiple releases may be forced to go through a few
> intermediate releases to get to the final desired release.
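
The weaker conversion rule (2.a-WeakerAutomaticConversion) implies that a long jump becomes a chain of shorter upgrades. A minimal sketch of planning such a chain follows. The proposal does not say how many releases "a small number" is, so `max_jump` is an assumed parameter, and the release list and the function name `plan_upgrade` are hypothetical.

```python
from typing import List


def plan_upgrade(releases: List[str], current: str, target: str,
                 max_jump: int) -> List[str]:
    """Return the releases to step through, from current to target inclusive.

    `releases` is ordered oldest to newest. `max_jump` is the assumed number
    of releases that automatic conversion can span in a single upgrade.
    """
    if max_jump < 1:
        raise ValueError("max_jump must be at least 1")
    start = releases.index(current)
    end = releases.index(target)
    if end < start:
        raise ValueError("target is older than current; that is a rollback")

    path = [current]
    pos = start
    while pos < end:
        # Take the largest supported hop without overshooting the target.
        pos = min(pos + max_jump, end)
        path.append(releases[pos])
    return path


if __name__ == "__main__":
    known = ["1.0", "1.1", "1.2", "1.3", "1.4", "1.5"]  # hypothetical releases
    print(plan_upgrade(known, "1.0", "1.5", max_jump=2))
```

Rollback is deliberately rejected here rather than planned, since the proposal treats it as restoring the original on-disk data rather than as a conversion in the other direction.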
> - *3 Wire Protocol Compatibility*
> We offer no wire compatibility in our 0.x releases today.
> -- Motivation: The motivation *isn't* to make the Hadoop protocols public.
> Applications will not call the protocol directly but through a library (in
> our case FileSystem class and its implementations). Instead the motivation is
> that customers run multiple clusters and have apps that access data across
> clusters. Customers cannot be expected to update all clusters simultaneously.
> -- 3.a Old m.x clients can connect to new m.y servers, where x <= y, but the
> old clients might get reduced functionality or performance. m.x clients might
> not be able to connect to (m+1).z servers.
> -- 3.b New m.y clients must be able to connect to old m.x servers, where
> x < y, but only for old m.x functionality.
> Comment: Generally old API methods continue to use old rpc methods. However,
> it is legal to have new implementations of old API methods call new
> rpc methods, as long as the library transparently handles the fallback case
> for old servers.
> -- 3.c At any major release transition [i.e., from a release m.x to a release
> (m+1).0], a user should be able to read data from the cluster running the old
> version.
> --- Motivation: data copying across clusters is a common operation for many
> customers. For example, this is routinely done at Yahoo; another use case
> is HADOOP-4058. Today, http (or hftp) provides a guaranteed compatible way of
> copying data across versions. Clearly one cannot force a customer to
> simultaneously update all its Hadoop clusters onto a new major release. We
> can satisfy this requirement via the http/hftp mechanism or some other
> mechanism.
> -- 3.c-Stronger
> Shall we add a stronger requirement for 1.0: wire compatibility across
> major versions? That is, not just for reading but for all operations. This can
> be supported by class loading or other games.
> Note we can wait to provide this when 2.0 happens. If Hadoop provided this
> guarantee then it would allow customers to partition their data across
> clusters without risking apps breaking across major releases due to wire
> incompatibility issues.
> --- Motivation: Data copying is a compromise. Customers really want to run
> apps across clusters running different versions.
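
Rules 3.a through 3.c of the quoted proposal can be read as a lookup from (client version, server version) to the level of access the proposal guarantees. The sketch below encodes that reading. The `Access` enum and `wire_access` function are hypothetical names, and treating 3.c as "a client exactly one major version ahead may read" is an interpretation of the text, not something Hadoop implements in this form.

```python
from enum import Enum
from typing import Tuple

Version = Tuple[int, int]  # (major, minor)


class Access(Enum):
    FULL = "full"
    MAY_BE_REDUCED = "connects; functionality or performance may be reduced"  # 3.a
    OLD_FEATURES_ONLY = "connects; limited to the old server's features"      # 3.b
    READ_ONLY = "can read data, e.g. via http/hftp"                           # 3.c
    NOT_GUARANTEED = "no guarantee"


def wire_access(client: Version, server: Version) -> Access:
    """Return the access level the quoted proposal guarantees for this pair."""
    c_major, c_minor = client
    s_major, s_minor = server

    if c_major == s_major:
        if c_minor == s_minor:
            return Access.FULL
        if c_minor < s_minor:
            return Access.MAY_BE_REDUCED     # 3.a: old client, new server
        return Access.OLD_FEATURES_ONLY      # 3.b: new client, old server

    if c_major == s_major + 1:
        return Access.READ_ONLY              # 3.c: read from the old cluster

    # 3.a: m.x clients might not be able to connect to (m+1).z servers,
    # and nothing is promised across more than one major version.
    return Access.NOT_GUARANTEED


if __name__ == "__main__":
    print(wire_access((1, 0), (1, 2)))
    print(wire_access((1, 2), (1, 0)))
    print(wire_access((2, 0), (1, 3)))
```

Under the stronger 3.c-Stronger variant, the `READ_ONLY` case would be upgraded to cover all operations. The sketch keeps the weaker rule because that is the one the proposal states as a requirement.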