[ https://issues.apache.org/jira/browse/OOZIE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851984#comment-15851984 ]
Andras Piros commented on OOZIE-2787:
-------------------------------------
[~satishsaley] thanks for the patch!
Some observations:
* please add a test case to {{TestSparkMain}} or elsewhere
* please rename the new Maven profile to {{spark-2.1-kafka-1.6.2}} so its name
gives a better idea of what's in there
* I'd extract {{filterJars()}} into a nested class, like {{JarURIFilter}}, for
better testability and SRP. In that case you could pass all the necessary
parameters via the constructor, and have a {{toString()}} method that calls
{{StringUtils.join()}}
* it's OK w/ me if all the JAR files in the current directory are filtered,
assuming they are all application JARs. What about other package types, like
{{.py}} and {{.zip}} files? It may be worth having unit tests for those as well
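
To make the {{JarURIFilter}} suggestion concrete, here is a minimal sketch of what such a class might look like. It is not the code from the patch: the constructor parameters, the {{getFiltered()}} accessor and the match-by-file-name rule are all assumptions made for illustration. It is shown as a stand-alone class rather than a nested one, and it uses the JDK's {{String.join()}} in place of {{StringUtils.join()}} so that it compiles without extra dependencies.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch of the suggested filter class; names and behaviour
 * are assumptions, not taken from OOZIE-2787-1.patch.
 */
final class JarURIFilter {
    private final List<URI> filtered;

    /**
     * @param candidates     URIs that would otherwise be passed via --files
     * @param applicationJar name or path of the application JAR
     */
    JarURIFilter(List<URI> candidates, String applicationJar) {
        String applicationJarName = fileName(applicationJar);
        List<URI> result = new ArrayList<>();
        for (URI uri : candidates) {
            // Assumption: an entry is a duplicate when its last path segment
            // equals the application JAR's file name.
            if (!applicationJarName.equals(fileName(uri.getPath()))) {
                result.add(uri);
            }
        }
        this.filtered = Collections.unmodifiableList(result);
    }

    /** Returns the URIs that remain after the application JAR is dropped. */
    List<URI> getFiltered() {
        return filtered;
    }

    /** Returns the text after the last '/', or "" for a null path. */
    private static String fileName(String path) {
        if (path == null) {
            return "";
        }
        int slash = path.lastIndexOf('/');
        return slash < 0 ? path : path.substring(slash + 1);
    }

    /** Comma-separated form, in the shape a --files value would take. */
    @Override
    public String toString() {
        List<String> parts = new ArrayList<>();
        for (URI uri : filtered) {
            parts.add(uri.toString());
        }
        return String.join(",", parts);
    }
}
```

With this rule, {{.py}} and {{.zip}} entries pass through untouched unless they share the application JAR's file name, which is the kind of behaviour a unit test could pin down.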
> Oozie distributes application jar twice making the spark job fail
> -----------------------------------------------------------------
>
> Key: OOZIE-2787
> URL: https://issues.apache.org/jira/browse/OOZIE-2787
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2787-1.patch
>
>
> Oozie adds the application jar to the list of files to be uploaded to the
> distributed cache. Since the jar then gets added twice, the job fails. This
> is observed from Spark 2.1.0 onwards, which introduces a check for the same
> file being added more than once and fails the job.
> {code}
> --master
> yarn
> --deploy-mode
> cluster
> --name
> oozieSparkStarter
> --class
> ScalaWordCount
> --queue
> default
> --conf
> spark.executor.extraClassPath=$PWD/*
> --conf
> spark.driver.extraClassPath=$PWD/*
> --conf
> spark.executor.extraJavaOptions=-Dlog4j.configuration=spark-log4j.properties
> --conf
> spark.driver.extraJavaOptions=-Dlog4j.configuration=spark-log4j.properties
> --conf
> spark.yarn.security.tokens.hive.enabled=false
> --conf
> spark.yarn.security.tokens.hbase.enabled=false
> --files
> hdfs://mycluster.com/user/saley/oozie/apps/sparkapp/lib/spark-example.jar
> --properties-file
> spark-defaults.conf
> --verbose
> spark-example.jar
> samplefile.txt
> output
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)