What about defining compatibility as fully implementing all the
public-stable annotated interfaces for a particular release?

Jacob Rideout
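[For illustration: a "public-stable surface" of the kind proposed above could be checked mechanically, since the annotations are retained at runtime. This is a minimal sketch only - the two annotations here stand in for Hadoop's real org.apache.hadoop.classification.InterfaceAudience.Public and InterfaceStability.Stable markers, and the interface names (RecordReader, ShuffleInternals) are invented for the example, not taken from the codebase.]

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-ins for Hadoop's InterfaceAudience.Public / InterfaceStability.Stable;
// RUNTIME retention makes them visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@interface Public {}

@Retention(RetentionPolicy.RUNTIME)
@interface Stable {}

@Public @Stable
interface RecordReader {          // hypothetical: in the compatibility surface
    String next();
}

interface ShuffleInternals {      // hypothetical: unannotated, so out of scope
    void spill();
}

public class CompatSurface {
    // Under this definition, a type belongs to the compatibility surface
    // iff it carries both the Public and Stable markers.
    static boolean isPublicStable(Class<?> c) {
        return c.isAnnotationPresent(Public.class)
            && c.isAnnotationPresent(Stable.class);
    }

    public static void main(String[] args) {
        System.out.println(isPublicStable(RecordReader.class));     // true
        System.out.println(isPublicStable(ShuffleInternals.class)); // false
    }
}
```

A release claiming compatibility would then have to implement every type that passes such a check for the reference release, which gives the claim a testable meaning.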

On Wed, May 11, 2011 at 4:42 PM, Ian Holsman <[email protected]> wrote:
> For Apache (httpd, I'm assuming you mean), we define compatibility as
> adherence to the set of RFCs that define the HTTP protocol.
>
> I'm no expert in this (Roy is, though), but we could attempt to do something
> similar when it comes to the HDFS/MapReduce protocols. I'm not sure what
> benefit there would be in going to an RFC, as opposed to documenting the API
> on our site.
>
>
> On May 12, 2011, at 7:24 AM, Eric Baldeschwieler wrote:
>
>> This is a really interesting topic!  I completely agree that we need to get 
>> ahead of this.
>>
>> I would be really interested in learning of any experience other Apache
>> projects, such as httpd or Tomcat, have with these issues.
>>
>> ---
>> E14 - typing on glass
>>
>> On May 10, 2011, at 6:31 AM, "Steve Loughran" <[email protected]> wrote:
>>
>>>
>>> Back in Jan 2011, I started a discussion about how to define Apache
>>> Hadoop Compatibility:
>>> http://mail-archives.apache.org/mod_mbox/hadoop-general/201101.mbox/%[email protected]%3E
>>>
>>> I am now reading the EMC HD "Enterprise-Ready" Apache Hadoop datasheet:
>>>
>>> http://www.greenplum.com/sites/default/files/EMC_Greenplum_HD_DS_Final_1.pdf
>>>
>>> It claims that their implementations are 100% compatible, even though
>>> the Enterprise edition uses a C filesystem. It also claims that both
>>> their software releases contain "Certified Stacks", without defining
>>> what "Certified" means or who does the certification - only that it is
>>> an improvement.
>>>
>>>
>>> I think we should revisit this issue before people with their own
>>> agendas define, on our behalf, what compatibility with Apache Hadoop means.
>>>
>>>
>>> Licensing
>>> -Use of the Hadoop codebase must follow the Apache License:
>>> http://www.apache.org/licenses/LICENSE-2.0
>>> -plug-in components that are dynamically linked to (filesystems and
>>> schedulers) don't appear to be derivative works, on my reading of this.
>>>
>>> Naming
>>> -this is something for branding@apache; they will have their opinions.
>>> The key one is that the name "Apache Hadoop" must get used, and it's
>>> important to make clear that it is a derivative work.
>>> -I don't think you can claim to have a distribution/fork/version of
>>> Apache Hadoop if you swap out big chunks of it for alternate
>>> filesystems, MR engines, etc. Some description of this is needed:
>>> "Supports the Apache Hadoop MapReduce engine on top of Filesystem XYZ".
>>>
>>> Compatibility
>>> -the definition of the Hadoop interfaces and classes is the Apache
>>> source tree.
>>> -the definition of the semantics of the Hadoop interfaces and classes
>>> is the Apache source tree, including the test classes.
>>> -the verification that the actual semantics of an Apache Hadoop
>>> release are compatible with the expected semantics is that current and
>>> future tests pass.
>>> -bug reports can highlight incompatibilities with the expectations of
>>> community users, and once incorporated into tests they form part of the
>>> compatibility testing.
>>> -vendors can claim and even certify their derivative works as
>>> compatible with other versions of their derivative works, but cannot
>>> claim compatibility with Apache Hadoop unless their code passes the
>>> tests and is consistent with the bug reports marked as "by design".
>>> Perhaps we should have tests that verify each of these "by design"
>>> bug reports, to make them more formal.
>>>
>>> Certification
>>> -I have no idea what this means in EMC's case; they just say "Certified".
>>> -As we don't do any certification ourselves, it would seem impossible
>>> for us to certify that any derivative work is compatible.
>>> -It may be best to state that nobody can certify their derivative as
>>> "compatible with Apache Hadoop" unless it passes all current test suites.
>>> -And require that anyone who declares compatibility define what they
>>> mean by this.
>>>
>>> This is a good argument for getting more functional tests out there:
>>> whoever has more functional tests needs to get them into a test module
>>> that can be used to test real deployments.
>>>
>
>
