I'd endorse Konstantin's suggestion that the only "downgrade" supported is "rollback" (and rollback works regardless of version).
There should be a time-and-version rule for allowed upgrades: "Upgrades to 1.x will be supported from 0.x for x > 17. Upgrades to X.* will be supported from (X-1).*, and also from (X-2).* if (X-1).0 is less than one year old." (A rough sketch of this check appears at the end of this mail.)

For interoperation between versions, I might conspicuously deprecate HFTP/HTTP access to files in 1.0 while making a strong demand for interoperability: "Client applications may read/write data between any two versions, not less than 1.0, that permit upgrade. 1.* need only support HTTP/HFTP for sufficiently relaxed security regimes. Support for HTTP/HFTP may be withdrawn in 2.0."

On 20 10 08 18:50, "Sanjay Radia" <[EMAIL PROTECTED]> wrote:

> The Hadoop 1.0 wiki has a section on compatibility:
> http://wiki.apache.org/hadoop/Release1.0Requirements
>
> Since the wiki is awkward for discussions, I am continuing the
> discussion here. I or someone will update the wiki when agreements
> are reached.
>
> Here is the current list of compatibility requirements on the Hadoop
> 1.0 wiki, for the convenience of this email thread.
> --------
> What does Hadoop 1.0 mean?
>   * Standard release numbering: only bug fixes in 1.x.y releases
>     and new features in 1.x.0 releases.
>   * No need for client recompilation when upgrading from 1.x to 1.y,
>     where x <= y
>       o Can't remove deprecated classes or methods until 2.0
>   * Old 1.x clients can connect to new 1.y servers, where x <= y
>   * New FileSystem clients must be able to call old methods when
>     talking to old servers. This generally will be done by having old
>     methods continue to use old RPC methods. However, it is legal to
>     have new implementations of old methods call new RPC methods, as
>     long as the library transparently handles the fallback case for
>     old servers.
> -----------------
>
> A couple of additional compatibility requirements:
>
> * HDFS metadata and data are preserved across release changes, both
> major and minor. That is, whenever a release is upgraded, the HDFS
> metadata from the old release will be converted automatically as
> needed.
>
> The above has been followed so far in Hadoop; I am just documenting it
> in the 1.0 requirements list.
>
> * In a major release transition [i.e. from a release x.y to a release
> (x+1).0], a user should be able to read data from the cluster running
> the old version. (Or shall we generalize this to: from x.y to
> (x+i).z?)
>
> The motivation: data copying across clusters is a common operation for
> many customers (for example, this is routinely done at Yahoo).
> Today, HTTP (or HFTP) provides a guaranteed compatible way of copying
> data across versions. Clearly one cannot force a customer to
> simultaneously update all its Hadoop clusters to a new major release.
> The above documents this requirement; we can satisfy it via the
> HTTP/HFTP mechanism or some other mechanism.
>
> Question: is one willing to break applications that operate across
> clusters (i.e. an application that accesses data across clusters that
> cross a major release boundary)? I asked the operations team at Yahoo
> that runs our Hadoop clusters. We currently do not have any
> applications that access data across clusters as part of an MR job.
> The reason being that Hadoop routinely breaks wire compatibility
> across releases, and so such apps would be very unreliable. However,
> the copying of data across clusters is crucial and needs to be
> supported.
>
> Shall we add a stronger requirement for 1.0: wire compatibility across
> major versions? This can be supported by class loading or other games.
> Note that we can wait to provide this until 2.0 happens. If Hadoop
> provided this guarantee, then it would allow customers to partition
> their data across clusters without risking apps breaking across major
> releases due to wire incompatibility issues.
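As promised above, here is a minimal sketch of the upgrade-window check I'm proposing. Everything in it is hypothetical: the class and method names are invented for illustration and do not exist in Hadoop; it only encodes the rule and is not an implementation proposal.

    // Hypothetical helper encoding the proposed upgrade-window rule.
    // None of these names exist in Hadoop; this is only a sketch.
    public class UpgradePolicySketch {

      private static final long ONE_YEAR_MS = 365L * 24 * 60 * 60 * 1000;

      /**
       * @param fromMajor          major version of the cluster being upgraded
       * @param fromMinor          minor version of the cluster being upgraded
       * @param toMajor            major version of the target release
       * @param prevDotZeroDateMs  release date of (toMajor - 1).0, epoch millis
       * @param nowMs              current time, epoch millis
       */
      public static boolean upgradeSupported(int fromMajor, int fromMinor, int toMajor,
                                             long prevDotZeroDateMs, long nowMs) {
        // Special case: 0.x -> 1.* is supported only for x > 17.
        if (fromMajor == 0 && toMajor == 1) {
          return fromMinor > 17;
        }
        // X.* upgrades are supported from (X-1).*.
        if (fromMajor == toMajor - 1) {
          return true;
        }
        // ...and from (X-2).*, but only while (X-1).0 is less than one year old.
        if (fromMajor == toMajor - 2) {
          return nowMs - prevDotZeroDateMs < ONE_YEAR_MS;
        }
        return false;
      }
    }

The point of writing it out is only that the rule is mechanical enough to check automatically, e.g. at upgrade time.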

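On the quoted requirement that new FileSystem clients keep working against old servers by falling back to old RPC methods, the pattern I read into it is roughly the following. The interface, method, and exception names are invented for illustration; they are not the actual Hadoop RPC classes.

    // Sketch of the "new client, old server" fallback pattern from the quoted
    // requirement. All names are invented; this is not Hadoop's RPC code.
    public class FallbackClientSketch {

      // Stand-in for a versioned RPC interface to the NameNode.
      interface NamenodeRpc {
        FileStatus[] listStatusBatched(String path);  // "new" RPC, newer servers only
        FileStatus[] listStatus(String path);         // "old" RPC, all 1.* servers
      }

      static class FileStatus { }                     // placeholder type

      private final NamenodeRpc rpc;

      FallbackClientSketch(NamenodeRpc rpc) {
        this.rpc = rpc;
      }

      // Old public API method: prefer the new RPC, but fall back transparently
      // when the server does not know it, so old servers keep working.
      public FileStatus[] listStatus(String path) {
        try {
          return rpc.listStatusBatched(path);
        } catch (UnsupportedOperationException unknownMethod) {  // stand-in for an
          return rpc.listStatus(path);                           // "unknown RPC" error
        }
      }
    }

The important property is that the fallback lives in the library, so application code compiled against an older 1.x keeps working against both old and new servers without recompilation.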