What about defining compatibility as fully implementing all of the interfaces annotated public/stable for a particular release?
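Hadoop already ships audience and stability annotations in
org.apache.hadoop.classification, so the compatibility surface of a
release is machine-readable. A rough sketch of what I mean (the class
here is hypothetical, just to show the annotations):

    import org.apache.hadoop.classification.InterfaceAudience;
    import org.apache.hadoop.classification.InterfaceStability;

    // Anything tagged with both annotations below is part of the
    // public-stable surface of the release that ships it.
    @InterfaceAudience.Public
    @InterfaceStability.Stable
    public abstract class ExampleService {
      // A derivative work would have to implement every such class and
      // method, with the semantics pinned down by the Apache test suite.
      public abstract void start();
      public abstract void stop();
    }

A checker could then scan the release jars, collect every element
marked Public/Stable, and diff that set against a vendor's build to
decide whether "fully implementing" actually holds.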
Jacob Rideout

On Wed, May 11, 2011 at 4:42 PM, Ian Holsman <[email protected]> wrote:
> For apache (httpd, I'm assuming you mean), we define compatibility as
> adherence to the set of RFCs that define the HTTP protocol.
>
> I'm no expert in this (Roy is, though), but we could attempt to do
> something similar when it comes to the HDFS/MapReduce protocols. I'm
> not sure what benefit there would be in going to an RFC, as opposed to
> documenting the API on our site.
>
>
> On May 12, 2011, at 7:24 AM, Eric Baldeschwieler wrote:
>
>> This is a really interesting topic! I completely agree that we need
>> to get ahead of this.
>>
>> I would be really interested in learning of any experience other
>> Apache projects, such as apache or Tomcat, have with these issues.
>>
>> ---
>> E14 - typing on glass
>>
>> On May 10, 2011, at 6:31 AM, "Steve Loughran" <[email protected]> wrote:
>>
>>>
>>> Back in Jan 2011, I started a discussion about how to define Apache
>>> Hadoop compatibility:
>>> http://mail-archives.apache.org/mod_mbox/hadoop-general/201101.mbox/%[email protected]%3E
>>>
>>> I am now reading the EMC HD "Enterprise Ready" Apache Hadoop datasheet:
>>>
>>> http://www.greenplum.com/sites/default/files/EMC_Greenplum_HD_DS_Final_1.pdf
>>>
>>> It claims that their implementations are 100% compatible, even though
>>> the Enterprise edition uses a C filesystem. It also claims that both
>>> of their software releases contain "Certified Stacks", without
>>> defining what "Certified" means or who does the certification -only
>>> that it is an improvement.
>>>
>>>
>>> I think we should revisit this issue before people with their own
>>> agendas define for us what compatibility with Apache Hadoop means.
>>>
>>>
>>> Licensing
>>> -Use of the Hadoop codebase must follow the Apache License:
>>> http://www.apache.org/licenses/LICENSE-2.0
>>> -plug-in components that are dynamically linked to (filesystems and
>>> schedulers) don't appear to be derivative works, on my reading of the
>>> license.
>>>
>>> Naming
>>> -this is something for branding@apache; they will have their
>>> opinions. The key one is that the name "Apache Hadoop" must be used,
>>> and it's important to make clear that the result is a derivative work.
>>> -I don't think you can claim to have a distribution/fork/version of
>>> Apache Hadoop if you swap out big chunks of it for alternate
>>> filesystems, MR engines, etc. Some description of this is needed:
>>> "Supports the Apache Hadoop MapReduce engine on top of Filesystem XYZ".
>>>
>>> Compatibility
>>> -the definition of the Hadoop interfaces and classes is the Apache
>>> source tree.
>>> -the definition of the semantics of the Hadoop interfaces and classes
>>> is the Apache source tree, including the test classes.
>>> -the verification that the actual semantics of an Apache Hadoop
>>> release are compatible with the expected semantics is that current
>>> and future tests pass.
>>> -bug reports can highlight incompatibility with the expectations of
>>> community users, and once incorporated into tests they form part of
>>> the compatibility testing.
>>> -vendors can claim and even certify their derivative works as
>>> compatible with other versions of their derivative works, but cannot
>>> claim compatibility with Apache Hadoop unless their code passes the
>>> tests and is consistent with the bug reports marked as "by design".
>>> Perhaps we should have tests that verify each of these "by design"
>>> bug reports, to make them more formal.
>>>
>>> Certification
>>> -I have no idea what this means in EMC's case; they just say "Certified".
>>> -As we don't do any certification ourselves, it would seem impossible
>>> for us to certify that any derivative work is compatible.
>>> -It may be best to state that nobody can certify their derivative as
>>> "compatible with Apache Hadoop" unless it passes all current test
>>> suites.
>>> -And we should require that anyone who declares compatibility define
>>> what they mean by it.
>>>
>>> This is a good argument for getting more functional tests out there
>>> -whoever has more functional tests needs to get them into a test
>>> module that can be used to test real deployments.
>>>
>
