[ https://issues.apache.org/jira/browse/SPARK-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567752#comment-14567752 ]
Marcelo Vanzin edited comment on SPARK-4048 at 6/1/15 6:37 PM:
---------------------------------------------------------------
That is not a regression. The whole point of "hadoop-provided" is that *you*
have to provide the needed jars. So if a jar is missing, you are failing to
provide it.
was (Author: vanzin):
That is not a regression. The whole point of "hadoop-provided" is that *you*
have to provide the needed jars. So if a jar is missing, you are failing to
provide them.
> Enhance and extend hadoop-provided profile
> ------------------------------------------
>
> Key: SPARK-4048
> URL: https://issues.apache.org/jira/browse/SPARK-4048
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 1.2.0
> Reporter: Marcelo Vanzin
> Assignee: Marcelo Vanzin
> Fix For: 1.3.0
>
>
> The hadoop-provided profile is used to not package Hadoop dependencies inside
> the Spark assembly. It works, sort of, but it could use some enhancements. A
> quick list:
> - It doesn't include all things that could be removed from the assembly
> - It doesn't work well when you're publishing artifacts based on it
>   (SPARK-3812 fixes this)
> - There are other dependencies that could use similar treatment: Hive, HBase
>   (for the examples), Flume, Parquet, maybe others I'm missing at the moment.
> - Unit tests, more specifically, those that use local-cluster mode, do not
>   work when the assembly is built with this profile enabled.
> - The scripts to launch Spark jobs do not add needed "provided" jars to the
>   classpath when this profile is enabled, leaving it for people to figure that
>   out for themselves.
> - The examples assembly duplicates a lot of things in the main assembly.
>
> Part of this task is selfish since we build internally with this profile and
> we'd like to make it easier for us to merge changes without having to keep
> too many patches on top of upstream. But those feel like good improvements to
> me, regardless.