[ 
https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197165#comment-15197165
 ] 

Steve Loughran commented on SPARK-7481:
---------------------------------------

...thinking some more about this

How about 

# adding a {{spark-cloud}} module which, initially, does nothing but declare 
the dependencies on {{hadoop-aws}}, {{hadoop-openstack}} and, on Hadoop 2.7+, 
{{hadoop-azure}}. 
# having the spark assembly declare a dependency on this module, but explicitly 
excluding all transitive dependencies other than the hadoop ones (i.e. no 
amazon libs, no extra httpclient ones for openstack (if there are any), nor 
anything azure wants). If someone wants the relevant amazon libs, they need to 
add them explicitly with the {{--jars}} option.

Doing it this way means that if a project depends on {{spark-cloud}} it gets 
all the cloud dependencies that version of spark+hadoop needs.
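
As a rough sketch of the two numbered steps above, the POM entries might look 
something like this. Nothing here exists yet: the {{spark-cloud}} artifact id, 
the use of the {{hadoop.version}} and {{scala.binary.version}} properties, and 
the choice of {{com.amazonaws}} as the group to exclude are all assumptions for 
illustration. The wildcard exclusion needs Maven 3.2.1 or later; on older 
versions each artifact would have to be listed by name.

```xml
<!-- Hypothetical spark-cloud/pom.xml fragment: only declares the object store modules. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-openstack</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <!-- hadoop-azure would go in a separate profile, as it only exists from Hadoop 2.7 on. -->
</dependencies>

<!-- Hypothetical assembly/pom.xml fragment: pull in the module, drop the vendor SDKs. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-cloud_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
  <exclusions>
    <!-- Assumed exclusion: keeps the amazon libs out of the assembly. -->
    <exclusion>
      <groupId>com.amazonaws</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

A downstream project that depends on the module directly, without the 
exclusions, would still get the full transitive set.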

It also provides a placeholder for explicit cloud support, specifically

- output committers that don't rely on rename, and don't assume that directory 
rename or delete is atomic and O(1)
- some optional tests/examples to read/write data. 

The tests would be good not just for spark, but for catching regressions in 
hadoop/aws/azure code.

If people think this is good, assign it to me and I'll look at it in April.

> Add Hadoop 2.6+ profile to pull in object store FS accessors
> ------------------------------------------------------------
>
>                 Key: SPARK-7481
>                 URL: https://issues.apache.org/jira/browse/SPARK-7481
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 1.3.1
>            Reporter: Steve Loughran
>
> To keep the s3n classpath right, and to add s3a, swift & azure, the 
> dependencies of spark in a 2.6+ profile need to add the relevant object store 
> packages (hadoop-aws, hadoop-openstack, hadoop-azure).
> This adds more stuff to the client bundle, but will mean a single spark 
> package can talk to all of the stores.


