José Manuel Abuín Mosquera created SPARK-2753:
-------------------------------------------------
Summary: Is the --archives option in yarn-cluster mode supposed to
uncompress the file?
Key: SPARK-2753
URL: https://issues.apache.org/jira/browse/SPARK-2753
Project: Spark
Issue Type: Bug
Components: YARN
Affects Versions: 1.0.0
Environment: CentOS release 6.5 (64 bits) and Hadoop 2.2.0
Reporter: José Manuel Abuín Mosquera
Hi all,
this is the first issue I have submitted. I googled the problem and searched
through the Spark code before arriving here.
When a .tar.gz or .zip file is passed as the argument to --archives, Spark
uploads it to the distributed cache but does not uncompress it.
According to the documentation, it is supposed to uncompress it. Is this a bug?
The launch command is:
/opt/spark-1.0.1/bin/spark-submit --class ProlnatSpark --master yarn-cluster
--num-executors 32 --driver-library-path /opt/hadoop/hadoop-2.2.0/lib/native/
--driver-memory 390m --executor-memory 890m --executor-cores 1
--archives=Diccionarios.tar.gz --verbose ProlnatSpark.jar
Wikipedias/WikipediaPlain.txt saidaWikipediaSpark
The code in
/yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala and
/yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnableUtil.scala
does not seem to uncompress the files.
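One way to confirm the behaviour from inside a container is to check whether the localized entry is a directory (unpacked) or still a regular archive file. The following is only a minimal diagnostic sketch, not part of Spark or YARN: the helper name archive_state is hypothetical, and it assumes the archive is exposed in the container's working directory under its own file name (Diccionarios.tar.gz here), which may differ on other setups.

```python
import os
import tarfile
import zipfile


def archive_state(path):
    """Classify an entry that was passed to --archives (hypothetical helper).

    Returns:
      "unpacked" if path is a directory (or a symlink to one),
      "packed"   if path is still a regular tar or zip file,
      "other"    if path is a regular file that is not a tar or zip archive,
      "missing"  if nothing exists at path.
    """
    # os.path.isdir follows symlinks, so a link to an unpacked
    # directory is reported as "unpacked".
    if os.path.isdir(path):
        return "unpacked"
    if os.path.isfile(path):
        # tarfile.is_tarfile also recognizes compressed tarballs (.tar.gz).
        if tarfile.is_tarfile(path) or zipfile.is_zipfile(path):
            return "packed"
        return "other"
    return "missing"


if __name__ == "__main__":
    # Assumption: run from the container working directory, where the
    # archive is expected to appear under its original file name.
    print(archive_state("Diccionarios.tar.gz"))
```

If this reports "packed" on the executors, the file was shipped as a plain file rather than as an archive, which would match the behaviour described above.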
I hope this helps, thank you very much :)
--
This message was sent by Atlassian JIRA
(v6.2#6252)