[
https://issues.apache.org/jira/browse/PIG-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172212#comment-14172212
]
Praveen Rachabattuni commented on PIG-4233:
-------------------------------------------
[~rohini] So the idea would be to submit the Pig dependency jars along with the
Pig snapshot jar, leaving the Spark jars to be provided by the cluster. This
can be done either by adding all the jars from the lib directory to the
classpath or by using the legacy jar. Let me know if I'm missing something.
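A minimal sketch of the first option, assuming a POSIX shell; the
SPARK_PIG_EXTRA_CLASSPATH variable below is hypothetical and would still need
to be wired into the pig launcher script:

    # Collect every Pig dependency jar from $PIG_HOME/lib into one classpath
    # string, leaving the Spark jars to be supplied by the cluster itself.
    PIG_DEP_CLASSPATH=""
    for jar in "$PIG_HOME"/lib/*.jar; do
      PIG_DEP_CLASSPATH="$PIG_DEP_CLASSPATH:$jar"
    done
    # Hypothetical hand-off point; not an existing Pig or Spark option.
    export SPARK_PIG_EXTRA_CLASSPATH="${PIG_DEP_CLASSPATH#:}"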
I have created a new jira (PIG-4236) to avoid packaging the Spark jars along
with the Pig dependencies.
> Package pig along with dependencies into a fat jar during job submission to
> Spark cluster
> ----------------------------------------------------------------------------------------
>
> Key: PIG-4233
> URL: https://issues.apache.org/jira/browse/PIG-4233
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: Praveen Rachabattuni
> Assignee: Praveen Rachabattuni
> Attachments: PIG-4233.patch
>
>
> Currently we have a fat jar created in the legacy directory which contains
> pig along with its dependencies.
> We would need to modify build.xml so that the spark dependency jars are also
> included in the legacy fat jar (see the quick check after the steps below).
> Running job on Spark cluster:
> 1. export SPARK_HOME=/path/to/spark
> 2. export SPARK_PIG_JAR=$PIG_HOME/legacy/pig-0.14.0-SNAPSHOT-withouthadoop-h1.jar
> 3. export SPARK_MASTER=spark://localhost:7077
> 4. export HADOOP_HOME=/path/to/hadoop
> 5. Launch the job using ./bin/pig -x spark
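>
> A quick sanity check (a sketch, assuming the jar path from step 2): list the
> fat jar's contents to see whether the spark classes actually made it in:
>
>     # Count the Spark class entries packaged into the legacy fat jar;
>     # a count of zero means the build.xml change did not take effect.
>     jar tf "$SPARK_PIG_JAR" | grep -c 'org/apache/spark/'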