Steve Loughran created HADOOP-13016:
---------------------------------------

             Summary: reinstate hadoop-hdfs as dependency of hadoop-client, 
create hadoop-lean-client for minimal deployments
                 Key: HADOOP-13016
                 URL: https://issues.apache.org/jira/browse/HADOOP-13016
             Project: Hadoop Common
          Issue Type: Improvement
          Components: build
    Affects Versions: 2.8.0
            Reporter: Steve Loughran


The split of hadoop-hdfs into hadoop-hdfs and hadoop-hdfs-client is breaking 
code of mine whose builds declare a dependency on hadoop-client and expect all 
of HDFS to come with it.

I'm hitting this early because I'm building and testing downstream code 
against branch-2; I find myself having to explicitly declare a dependency on 
hadoop-hdfs to make things work again.
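
As a rough sketch of that workaround, assuming a Maven build and a 
hadoop.version property defined elsewhere in the downstream POM (both are 
assumptions for illustration), the extra declaration looks something like:

```xml
<!-- Downstream POM fragment (illustrative only).
     ${hadoop.version} is an assumed property, not defined here. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <!-- Explicit dependency needed since the hadoop-hdfs /
       hadoop-hdfs-client split -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
</dependencies>
```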

We've also seen problems downstream (e.g. Spark) where the move of the s3n 
classes to hadoop-aws has broken code which expects them to be there.

At the same time, I see the merits in a lean, low-dependency client, which 
hadoop-client, with its dependencies, is not today.

I propose:

# reinstate hadoop-hdfs as a dependency of hadoop-client.
# add hadoop-aws as a dependency of hadoop-client, but exclude the amazon-aws 
JARs.
# create hadoop-lean-client for minimal deployments, stripping out all 
extraneous dependencies.
# for hadoop-lean-client, have a compatibility statement of "we will strip out 
anything we can from this, even over point releases". That is, anything that 
can be dropped in future will be.
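
Item 2 might look roughly like this in the hadoop-client POM. This is only a 
sketch: it assumes the AWS SDK JARs all sit under the com.amazonaws group ID, 
and the wildcard exclusion needs Maven 3.2.1 or later.

```xml
<!-- Possible hadoop-client POM fragment (a sketch, not a tested patch) -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <version>${project.version}</version>
  <exclusions>
    <!-- Keep the s3n/s3a classes but leave out the AWS SDK JARs;
         assumes they are all published under com.amazonaws -->
    <exclusion>
      <groupId>com.amazonaws</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Code that actually talks to S3 would then still have to add the AWS SDK 
itself.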

This will give downstream projects a choice: the old POM with everything, or 
the lean POM for new apps.

And, by reinstating hadoop-hdfs, things will build again.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
