Github user trixpan commented on the issue:
https://github.com/apache/nifi/pull/475
@joewitt, as a user / proto-developer I would be happy with any approach
that results in workable binaries covering a particular flavour / supported
platform without having to commit a patch every time I pull from git.
Agree that perhaps settings.xml would be enough, but I generally think the
less we fiddle with wider settings (settings.xml, for example) the better.
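To illustrate the concern, here is a minimal sketch of what the settings.xml route could look like (profile id and repository URL are assumptions, purely for illustration). It works, but it lives outside the repository and affects every build on the machine:
```
<!-- Hypothetical ~/.m2/settings.xml fragment; id and URL are illustrative. -->
<settings>
  <profiles>
    <profile>
      <id>cdh5</id>
      <repositories>
        <repository>
          <id>cloudera-releases</id>
          <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <!-- Applies to every Maven build on this machine, not just NiFi -->
  <activeProfiles>
    <activeProfile>cdh5</activeProfile>
  </activeProfiles>
</settings>
```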
I also thought that a profile is better than creating documentation
articles, because I personally believe that code is generally more likely to be
looked after than documentation.
Maybe it's just me as a total Java and Maven newbie, but nothing beats the
simplicity of `-Phadoop_flavour=cdh5`
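For illustration only, a minimal sketch of what such a profile could look like in the `hadoop-libraries-nar` pom. The profile id, activation property, repository URL and version string below are all assumptions, not the actual PR content:
```
<!-- Hypothetical vendor profile; ids, URLs and versions are illustrative.
     Activate with: mvn -Dhadoop.flavour=cdh5 clean package -->
<profiles>
  <profile>
    <id>cdh5</id>
    <activation>
      <property>
        <name>hadoop.flavour</name>
        <value>cdh5</value>
      </property>
    </activation>
    <properties>
      <!-- Vendor builds embed the vendor name in the version string -->
      <hadoop.version>2.6.0-cdh5.7.0</hadoop.version>
    </properties>
    <repositories>
      <repository>
        <id>cloudera-releases</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
      </repository>
    </repositories>
  </profile>
</profiles>
```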
I fully agree we must ensure licensing is properly taken care of. Having
said that, I have always been under the impression that unlike the GPL, the ASL
does not impose restrictions around *linking* non-ASF code. So hypothetically
speaking I suspect we could even go to the extreme length of releasing
binaries linking to non-ASL code as long as the foreign code licenses are
respected (e.g. ASL software does not exclude GPL licensed code; it is the [GPL
- through its terms - that excludes linking by ASL licensed
code](http://www.apache.org/licenses/GPL-compatibility.html)).
Having said that, given the [presence of MapR hadoop related
code on
github](https://github.com/mapr/hadoop-common/blob/release-2.7.0-mapr-1506/hadoop-hdfs-project/hadoop-hdfs/pom.xml),
I suspect that their hadoop artifacts are released under ASL 2.0, but perhaps
one of theirs, like @tdunning, can help shed some light.
But back to the profile:
The reason I ended up trying the profile approach is the fact that Spark
refers directly to Cloudera's and MapR's repositories in its main
[pom.xml](https://github.com/apache/spark/blob/branch-1.6/pom.xml#L285). This
led me to conclude (perhaps incorrectly) that it would be OK to have a pom
pointing to a particular set of artifacts as long as the binary produced by the
formal release does not break ASF or foreign licensing restrictions.
To be honest, Spark's approach is even simpler than using profiles:
their pom.xml includes all repos enabled by default and [lets the user
specify the hadoop version
as](http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version):
```
# Cloudera CDH 4.2.0 with MapReduce v1
mvn -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -Phadoop-1 -DskipTests clean package
```
Smartly playing with the fact that while vendors must respect the artifact
ID, they tend to distinguish their supported code by embedding their names in
the software version (e.g. 2.0.0-mr1-cdh4.2.0, hadoop-hdfs-2.x-mapr-1506, etc.).
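That is what makes the single-property approach work: the coordinates stay stable across vendors, and only the version property changes. A sketch of the idea (the artifact below is just one example of the pattern):
```
<!-- Coordinates stay stable across vendors; only the version
     property selects which vendor build gets resolved. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <!-- e.g. 2.0.0-mr1-cdh4.2.0 (Cloudera) or 2.7.0-mapr-1506 (MapR) -->
  <version>${hadoop.version}</version>
</dependency>
```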
Yet, given that supporting Spark's approach would require changes to the
main pom.xml, I decided to keep the scope of changes minimal, changing
only the `hadoop-libraries-nar` pom, hence reducing the potential of changes
spilling beyond what is planned/needed.
Hope this makes my way of thinking a bit clearer.
Please let me know your preference and I will be happy to adjust.