I'd endorse Konstantin's suggestion that the only "downgrade" supported is "rollback" (and rollback works regardless of version).
There should be a time-and-version rule for allowed upgrades: "Upgrades to 1.x will be supported from 0.x for x > 17. Upgrades to X.* will be supported from (X-1).*, and also from (X-2).* if (X-1).0 is less than one year old." (A rough sketch of this check appears at the end of this mail.)

For interoperation between versions, I might conspicuously deprecate HFTP/HTTP access to files in 1.0 while making a strong demand for interoperability: "Client applications may read/write data between any two versions, not less than 1.0, that permit upgrade. 1.* need only support HTTP/HFTP for sufficiently relaxed security regimes. Support for HTTP/HFTP may be withdrawn in 2.0."

On 20 10 08 18:50, "Sanjay Radia" <[EMAIL PROTECTED]> wrote:

> The Hadoop 1.0 wiki has a section on compatibility:
> http://wiki.apache.org/hadoop/Release1.0Requirements
>
> Since the wiki is awkward for discussions, I am continuing the
> discussion here. I or someone will update the wiki when agreements
> are reached.
>
> Here is the current list of compatibility requirements on the Hadoop
> 1.0 wiki, for the convenience of this email thread.
> --------
> What does Hadoop 1.0 mean?
>   * Standard release numbering: only bug fixes in 1.x.y releases
>     and new features in 1.x.0 releases.
>   * No need for client recompilation when upgrading from 1.x to 1.y,
>     where x <= y
>       o Can't remove deprecated classes or methods until 2.0
>   * Old 1.x clients can connect to new 1.y servers, where x <= y
>   * New FileSystem clients must be able to call old methods when
>     talking to old servers. This generally will be done by having old
>     methods continue to use old RPC methods. However, it is legal to
>     have new implementations of old methods call new RPC methods, as
>     long as the library transparently handles the fallback case for
>     old servers.
> -----------------
>
> A couple of additional compatibility requirements:
>
> * HDFS metadata and data are preserved across release changes, both
> major and minor. That is, whenever a release is upgraded, the HDFS
> metadata from the old release will be converted automatically as
> needed.
>
> The above has been followed so far in Hadoop; I am just documenting it
> in the 1.0 requirements list.
>
> * In a major release transition [i.e. from a release x.y to a release
> (x+1).0], a user should be able to read data from the cluster running
> the old version. (Or shall we generalize this to: from x.y to
> (x+i).z?)
>
> The motivation: data copying across clusters is a common operation for
> many customers (for example, this is routinely done at Yahoo).
> Today, HTTP (or HFTP) provides a guaranteed compatible way of copying
> data across versions. Clearly one cannot force a customer to
> simultaneously update all its Hadoop clusters to a new major release.
> The above documents this requirement; we can satisfy it via the
> HTTP/HFTP mechanism or some other mechanism.
>
> Question: is one willing to break applications that operate across
> clusters (i.e. an application that accesses data across clusters that
> cross a major release boundary)? I asked the operations team at Yahoo
> that runs our Hadoop clusters. We currently do not have any
> applications that access data across clusters as part of an MR job.
> The reason being that Hadoop routinely breaks wire compatibility
> across releases, and so such apps would be very unreliable. However,
> the copying of data across clusters is crucial and needs to be
> supported.
>
> Shall we add a stronger requirement for 1.0: wire compatibility across
> major versions? This can be supported by class loading or other games.
> Note that we can wait to provide this until 2.0 happens. If Hadoop
> provided this guarantee, then it would allow customers to partition
> their data across clusters without risking apps breaking across major
> releases due to wire incompatibility issues.
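As promised above, here is a minimal sketch of the upgrade-window check I'm proposing. Everything in it is hypothetical: the class and method names are invented for illustration and do not exist in Hadoop; it only encodes the rule and is not an implementation proposal.

    // Hypothetical helper encoding the proposed upgrade-window rule.
    // None of these names exist in Hadoop; this is only a sketch.
    public class UpgradePolicySketch {

      private static final long ONE_YEAR_MS = 365L * 24 * 60 * 60 * 1000;

      /**
       * @param fromMajor          major version of the cluster being upgraded
       * @param fromMinor          minor version of the cluster being upgraded
       * @param toMajor            major version of the target release
       * @param prevDotZeroDateMs  release date of (toMajor - 1).0, epoch millis
       * @param nowMs              current time, epoch millis
       */
      public static boolean upgradeSupported(int fromMajor, int fromMinor, int toMajor,
                                             long prevDotZeroDateMs, long nowMs) {
        // Special case: 0.x -> 1.* is supported only for x > 17.
        if (fromMajor == 0 && toMajor == 1) {
          return fromMinor > 17;
        }
        // X.* upgrades are supported from (X-1).*.
        if (fromMajor == toMajor - 1) {
          return true;
        }
        // ...and from (X-2).*, but only while (X-1).0 is less than one year old.
        if (fromMajor == toMajor - 2) {
          return nowMs - prevDotZeroDateMs < ONE_YEAR_MS;
        }
        return false;
      }
    }

The point of writing it out is only that the rule is mechanical enough to check automatically, e.g. at upgrade time.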

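On the quoted requirement that new FileSystem clients keep working against old servers by falling back to old RPC methods, the pattern I read into it is roughly the following. The interface, method, and exception names are invented for illustration; they are not the actual Hadoop RPC classes.

    // Sketch of the "new client, old server" fallback pattern from the quoted
    // requirement. All names are invented; this is not Hadoop's RPC code.
    public class FallbackClientSketch {

      // Stand-in for a versioned RPC interface to the NameNode.
      interface NamenodeRpc {
        FileStatus[] listStatusBatched(String path);  // "new" RPC, newer servers only
        FileStatus[] listStatus(String path);         // "old" RPC, all 1.* servers
      }

      static class FileStatus { }                     // placeholder type

      private final NamenodeRpc rpc;

      FallbackClientSketch(NamenodeRpc rpc) {
        this.rpc = rpc;
      }

      // Old public API method: prefer the new RPC, but fall back transparently
      // when the server does not know it, so old servers keep working.
      public FileStatus[] listStatus(String path) {
        try {
          return rpc.listStatusBatched(path);
        } catch (UnsupportedOperationException unknownMethod) {  // stand-in for an
          return rpc.listStatus(path);                           // "unknown RPC" error
        }
      }
    }

The important property is that the fallback lives in the library, so application code compiled against an older 1.x keeps working against both old and new servers without recompilation.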