On Tue, May 10, 2011 at 3:29 AM, Steve Loughran <[email protected]> wrote:
> I think we should revisit this issue before people with their own agendas
> define what compatibility with Apache Hadoop is for us
>

I agree completely. As you point out, this week we've had a flood of
products calling themselves "Hadoop" or "Distribution of Hadoop" that
include only a part of Hadoop. This will dilute Apache's Hadoop trademark
and create consumer confusion.

> Licensing
> -Use of the Hadoop codebase must follow the Apache License
> http://www.apache.org/licenses/LICENSE-2.0
> -plug in components that are dynamically linked to (Filesystems and
> schedulers) don't appear to be derivative works on my reading of this,
>

+1 Plugins are usually considered independent works. Note that the Apache
license does permit commercial closed-source derivative works. A company
could take Hadoop's code, modify it, and sell a binary release as long as
they meet the conditions of the Apache license.

> Naming
> -this is something for branding@apache, they will have their opinions.
> The key one is that the name "Apache Hadoop" must get used, and it's
> important to make clear it is a derivative work.
> -I don't think you can claim to have a Distribution/Fork/Version of Apache
> Hadoop if you swap out big chunks of it for alternate filesystems, MR
> engines, etc. Some description of this is needed
> "Supports the Apache Hadoop MapReduce engine on top of Filesystem XYZ"
>

The Hadoop name is the primary tool that the project has for minimizing
customer confusion. I think we need to create a very clear definition of
what can be called Hadoop and what cannot. Apache gives the PMCs a fair
amount of latitude in picking the policy for their project name, and I
think we need to use it. Given the large number of so-called Hadoop
products that are being released, I believe that we should require "Hadoop"
to mean specifically the Apache Hadoop releases (possibly with a few
critical security patches).
Projects that are derivative works can either be "powered by Apache
Hadoop," or "based on Apache Hadoop."

What do others think?

-- Owen
