[
https://issues.apache.org/jira/browse/HADOOP-11656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345354#comment-14345354
]
Steve Loughran commented on HADOOP-11656:
-----------------------------------------
...Long haul interaction is a use case I have to care about (SLIDER-151), as
well as data ingress there's full remote YARN Job submission & interaction with
the deployed app, including the sitatuation where the cluster is isolated and
something (i.e. Apache Knox) is bridging access. Lean resty clients work there,
for a lean that includes Jersey, Jackson, SLF4J, WebHDFS, any YARN equivalent
and serializable representations of all the datatypes.
That is a separate project from "stopping hadoop dependencies causing grief in
apps that are designed to directly hook up to Hadoop HDFS, be a YARN AM, etc":
both have value
> Classpath isolation for downstream clients
> ------------------------------------------
>
> Key: HADOOP-11656
> URL: https://issues.apache.org/jira/browse/HADOOP-11656
> Project: Hadoop Common
> Issue Type: New Feature
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Labels: classloading, classpath, dependencies
>
> Currently, Hadoop exposes downstream clients to a variety of third party
> libraries. As our code base grows and matures we increase the set of
> libraries we rely on. At the same time, as our user base grows we increase
> the likelihood that some downstream project will run into a conflict while
> attempting to use a different version of some library we depend on. This has
> already happened with i.e. Guava several times for HBase, Accumulo, and Spark
> (and I'm sure others).
> While YARN-286 and MAPREDUCE-1700 provided an initial effort, they default to
> off and they don't do anything to help dependency conflicts on the driver
> side or for folks talking to HDFS directly. This should serve as an umbrella
> for changes needed to do things thoroughly on the next major version.
> We should ensure that downstream clients
> 1) can depend on a client artifact for each of HDFS, YARN, and MapReduce that
> doesn't pull in any third party dependencies
> 2) only see our public API classes (or as close to this as feasible) when
> executing user provided code, whether client side in a launcher/driver or on
> the cluster in a container or within MR.
> This provides us with a double benefit: users get less grief when they want
> to run substantially ahead or behind the versions we need and the project is
> freer to change our own dependency versions because they'll no longer be in
> our compatibility promises.
> Project specific task jiras to follow after I get some justifying use cases
> written in the comments.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)