[jira] [Commented] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

Sean Busbey (JIRA) Mon, 16 Apr 2018 11:33:27 -0700

    [ https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439846#comment-16439846 ]


Sean Busbey commented on HBASE-20332:
-------------------------------------

To test this out, first do a local install so you can see what the pom and jar 
will look like:
{code}
mvn -Psite-install-step -Prelease install
{code}

Now you can look in your local maven repo for the jar(s) and the poms that a 
client will get (default user repo listed in this example):
{code}
mvn dependency:list -f ~/.m2/repository/org/apache/hbase/hbase-shaded-mapreduce/3.0.0-SNAPSHOT/hbase-shaded-mapreduce-3.0.0-SNAPSHOT.pom
mvn dependency:tree -f ~/.m2/repository/org/apache/hbase/hbase-shaded-mapreduce/3.0.0-SNAPSHOT/hbase-shaded-mapreduce-3.0.0-SNAPSHOT.pom
{code}
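
If you only care about whether hadoop is still being pulled in, something like the following rough sketch can narrow the output down. The two helper functions are hypothetical names made up for this example, and the path assumes the default user repo and the 3.0.0-SNAPSHOT version used above. The jar filter relies on HBase's own classes living under org/apache/hadoop/hbase, so anything else under org/apache/hadoop is treated as bundled hadoop.

```shell
# Hypothetical helper: filter `mvn dependency:list` output (on stdin) down to
# Hadoop artifacts, so the scope in the last field is easy to read.
list_hadoop_deps() {
  grep 'org\.apache\.hadoop:' || true
}

# Hypothetical helper: filter jar entry names (on stdin) down to Hadoop classes
# that are not HBase's own (HBase packages live under org/apache/hadoop/hbase).
list_bundled_hadoop() {
  grep '^org/apache/hadoop/' | grep -v '^org/apache/hadoop/hbase/' || true
}

# Assumed location: default user repo and the snapshot version from the example.
base="$HOME/.m2/repository/org/apache/hbase/hbase-shaded-mapreduce/3.0.0-SNAPSHOT/hbase-shaded-mapreduce-3.0.0-SNAPSHOT"

# Only run against the real artifacts if the local install has produced them.
if [ -f "$base.pom" ] && [ -f "$base.jar" ]; then
  mvn dependency:list -f "$base.pom" | list_hadoop_deps
  jar tf "$base.jar" | list_bundled_hadoop
fi
```

No output from the second command would suggest the jar carries no hadoop classes of its own; treat that as a quick sanity check rather than proof.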

junit shows up because our root parent pom declares it as a dependency. I 
tried a few things to get rid of it, but nothing worked. I think we need to fix 
that generally (i.e. remove the top-level listing of it as a dependency) rather 
than try to do it here.

> shaded mapreduce module shouldn't include hadoop
> ------------------------------------------------
>
>                 Key: HBASE-20332
>                 URL: https://issues.apache.org/jira/browse/HBASE-20332
>             Project: HBase
>          Issue Type: Sub-task
>          Components: mapreduce, shading
>    Affects Versions: 2.0.0
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-20332.0.patch
>
>
> AFAICT, we should just entirely skip including hadoop in our shaded mapreduce 
> module:
> 1) Folks expect to run yarn / mr apps via {{hadoop jar}} / {{yarn jar}}
> 2) those commands include all the needed Hadoop jars in your classpath by 
> default (both client side and in the containers)
> 3) If you try to use "user classpath first" for your job as a workaround 
> (e.g. for some library your application needs that hadoop provides) then our 
> inclusion of *some but not all* hadoop classes causes everything to fall 
> over because of mixing rewritten and non-rewritten hadoop classes
> 4) if you don't use "user classpath first" then all of our 
> non-relocated-but-still-shaded hadoop classes are ignored anyways so we're 
> just wasting space



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
