[
https://issues.apache.org/jira/browse/SPARK-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568035#comment-14568035
]
Marcelo Vanzin edited comment on SPARK-4048 at 6/1/15 9:22 PM:
---------------------------------------------------------------
Just for completeness:
{{quote}}
Case 1:
Spark uses code from Jar A.
This is same as curator case. Spark should bundle Jar A.
{{quote}}
That's only the case if you do not enable any of the "-provided" profiles.
Again, those profiles exist to *explicitly break* that case, and thus those
using them are expected to know what they're doing.
was (Author: vanzin):
Just for completeness:
{{quote}}
Case 1:
Spark uses code from Jar A.
This is same as curator case. Spark should bundle Jar A.
{{quote}}
That's only the case if you do not enable any of the "*-provided" profiles.
Again, those profiles exist to *explicitly break* that case, and thus those
using them are expected to know what they're doing.
> Enhance and extend hadoop-provided profile
> ------------------------------------------
>
> Key: SPARK-4048
> URL: https://issues.apache.org/jira/browse/SPARK-4048
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 1.2.0
> Reporter: Marcelo Vanzin
> Assignee: Marcelo Vanzin
> Fix For: 1.3.0
>
>
> The hadoop-provided profile is used to not package Hadoop dependencies inside
> the Spark assembly. It works, sort of, but it could use some enhancements. A
> quick list:
> - It doesn't include all things that could be removed from the assembly
> - It doesn't work well when you're publishing artifacts based on it
> (SPARK-3812 fixes this)
> - There are other dependencies that could use similar treatment: Hive, HBase
> (for the examples), Flume, Parquet, maybe others I'm missing at the moment.
> - Unit tests, more specifically, those that use local-cluster mode, do not
> work when the assembly is built with this profile enabled.
> - The scripts to launch Spark jobs do not add needed "provided" jars to the
> classpath when this profile is enabled, leaving it for people to figure that
> out for themselves.
> - The examples assembly duplicates a lot of things in the main assembly.
> Part of this task is selfish since we build internally with this profile and
> we'd like to make it easier for us to merge changes without having to keep
> too many patches on top of upstream. But those feel like good improvements to
> me, regardless.