[
https://issues.apache.org/jira/browse/HADOOP-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15626288#comment-15626288
]
Sangjin Lee commented on HADOOP-11804:
--------------------------------------
Thanks for the work [~busbey]! I just did a quick test with the latest patch.
One high-level concern is maintaining dependencies in the POMs. If a developer
adds a new dependency to a module, how would that propagate to these client
POMs? Would they need to add it to the client POMs as well for the most part?
It wasn't entirely clear to me what the cost of maintenance is. If that is the
only way to keep it clean, that's OK, but it would be great if that cost were
kept to a minimum.
1.
The patch indeed does not apply for me via plain {{git apply}}: it fails on
{{hadoop-client/pom.xml}} and {{hadoop-maven-plugins/pom.xml}}. I used {{git
apply --reject HADOOP-11804.1.patch}} instead.
2.
Once I fixed the git apply issues, I ran {{mvn clean install package -Pdist
-DskipTests -Dmaven.javadoc.skip}} and it failed right away:
{noformat}
[ERROR] The project
org.apache.hadoop:hadoop-client-minicluster:3.0.0-alpha2-SNAPSHOT
(/Users/sjlee/git/hadoop-trunk/hadoop-client-modules/hadoop-client-minicluster/pom.xml)
has 1 error
[ERROR] 'dependencies.dependency.version' for org.mortbay.jetty:jetty:jar
is missing. @ line 266, column 17
{noformat}
I got past it by providing a version for this (chose 6.1.26).
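For reference, the workaround looked roughly like the sketch below. Only the coordinates and the version come from the error above; anything else in the dependency declaration at line 266 (scope, exclusions, and so on) is omitted here.
{code:xml}
<!-- hadoop-client-minicluster/pom.xml: give the jetty dependency an explicit version -->
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty</artifactId>
  <!-- version picked locally just to get past the error -->
  <version>6.1.26</version>
</dependency>
{code}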
3.
The build still fails with a couple of duplicate-class issues. One is what
Andrew reported above. The other is duplicate Jetty classes:
{noformat}
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses failed
with message:
Duplicate classes found:
Found in:
org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-alpha2-SNAPSHOT:compile
org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-alpha2-SNAPSHOT:compile
Duplicate classes:
org/apache/hadoop/shaded/org/eclipse/jetty/io/ssl/SslConnection$2.class
org/apache/hadoop/shaded/org/eclipse/jetty/server/RequestLog.class
org/apache/hadoop/shaded/org/eclipse/jetty/server/ResourceCache$1.class
org/apache/hadoop/shaded/org/eclipse/jetty/util/log/AbstractLogger.class
org/apache/hadoop/shaded/org/eclipse/jetty/util/annotation/Name.class
org/apache/hadoop/shaded/org/eclipse/jetty/util/component/LifeCycle.class
org/apache/hadoop/shaded/org/eclipse/jetty/server/HttpChannel$Commit100Callback.class
org/apache/hadoop/shaded/org/eclipse/jetty/util/ssl/SslContextFactory$1.class
...
{noformat}
4.
Was there a significant difficulty in handling the timeline service v.2? Is it
just the number of new dependencies we’re pulling in, or the fact that there is
an HBase dependency?
5.
Regarding the logging libraries, I agree we probably want to exclude them.
Things like log4j properties and the way slf4j works can cause issues down the
road if shaded.
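As a rough illustration of what excluding them could look like with maven-shade-plugin: this is an assumption on my part about the mechanism, not what the patch currently does, and the coordinates {{org.slf4j:slf4j-api}} and {{log4j:log4j}} should be checked against the actual dependency tree.
{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <excludes>
        <!-- keep the logging libraries out of the shaded jar -->
        <exclude>org.slf4j:slf4j-api</exclude>
        <exclude>log4j:log4j</exclude>
      </excludes>
    </artifactSet>
  </configuration>
</plugin>
{code}
With something like this, the logging jars would stay unshaded and would need to be supplied as ordinary dependencies.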
> POC Hadoop Client w/o transitive dependencies
> ---------------------------------------------
>
> Key: HADOOP-11804
> URL: https://issues.apache.org/jira/browse/HADOOP-11804
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: build
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Attachments: HADOOP-11804.1.patch, HADOOP-11804.2.patch,
> HADOOP-11804.3.patch, HADOOP-11804.4.patch
>
>
> make a hadoop-client-api and hadoop-client-runtime that e.g. HBase can use to
> talk with a Hadoop cluster without seeing any of the implementation
> dependencies.
> see proposal on parent for details.