Hi,

I am using Spark 0.8 and Hadoop 1.2.1 on a cluster of 2 nodes, each with 16 CPUs 
and 8 GB of RAM allocated.

I am running into a problem: when I try to save a very large JavaRDD<String> 
that was created using parallelize from a Java List<String>, my job's workers 
fail as follows:

13/11/11 19:23:48 INFO Worker: Executor app-20131111191414-0001/2 finished with 
state FAILED message Command exited with code 1 exitStatus 1

It looks like the Spark driver tries 5 times to launch the executor and then 
decides to kill the process.

Can anyone help me get more information on the reason for the failure, or 
explain what "code 1 exitStatus 1" means here?

Is there any setting or configuration in Spark that would dump more information 
about the error?
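
One place to look is the executor's own stderr file. The sketch below assumes 
the standalone worker writes each executor's stdout and stderr under its work 
directory as <work dir>/<app id>/<executor id>/, and that the work directory 
is SPARK_HOME/work unless configured otherwise. ExecutorLogTail and its 
methods are hypothetical helpers, not part of Spark.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

public class ExecutorLogTail {
    // Assumed layout: <workDir>/<appId>/<executorId>/stderr
    public static Path stderrPath(String workDir, String appId, String executorId) {
        return Paths.get(workDir, appId, executorId, "stderr");
    }

    // Returns the last maxLines lines of the file, or an empty list if it does not exist.
    public static List<String> tail(Path file, int maxLines) throws IOException {
        if (!Files.isRegularFile(file)) {
            return Collections.emptyList();
        }
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        return lines.subList(Math.max(0, lines.size() - maxLines), lines.size());
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: ExecutorLogTail <workDir> <appId> <executorId>");
            return;
        }
        // Print the last 50 lines of the executor's stderr, if present.
        for (String line : tail(stderrPath(args[0], args[1], args[2]), 50)) {
            System.out.println(line);
        }
    }
}
```

For the first failed executor in the logs, the arguments would be something 
like /opt/spark-0.8.0/work app-20131111190659-0000 0, run on the worker host 
(poc3), again assuming the default work directory.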

Here are my logs:

13/11/11 19:14:50 INFO Worker: Asked to launch executor 
app-20131111190659-0000/0 for OMDBQueryService
13/11/11 19:14:50 INFO ExecutorRunner: Launch command: "java" "-cp" 
":/opt/spark-0.8.0/conf:/opt/spark-0.8.0/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.0-incubating-hadoop1.0.4.jar"
 "-Dspark.executor.memory=8g" "-Dspark.local.dir=/tmp/spark" 
"-XX:+UseParallelGC" "-XX:+UseParallelOldGC" "-XX:+DisableExplicitGC" 
"-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Xms512M" "-Xmx512M" 
"org.apache.spark.executor.StandaloneExecutorBackend" 
"akka://spark@poc1:54482/user/StandaloneScheduler" "0" "poc3" "16"
13/11/11 19:16:47 INFO Worker: Executor app-20131111190659-0000/0 finished with 
state FAILED message Command exited with code 1 exitStatus 1
13/11/11 19:16:47 INFO Worker: Asked to launch executor 
app-20131111190659-0000/2 for OMDBQueryService
13/11/11 19:16:47 INFO ExecutorRunner: Launch command: "java" "-cp" 
":/opt/spark-0.8.0/conf:/opt/spark-0.8.0/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.0-incubating-hadoop1.0.4.jar"
 "-Dspark.executor.memory=8g" "-Dspark.local.dir=/tmp/spark" 
"-XX:+UseParallelGC" "-XX:+UseParallelOldGC" "-XX:+DisableExplicitGC" 
"-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Xms512M" "-Xmx512M" 
"org.apache.spark.executor.StandaloneExecutorBackend" 
"akka://spark@poc1:54482/user/StandaloneScheduler" "2" "poc3" "16"
13/11/11 19:16:53 INFO Worker: Executor app-20131111190659-0000/2 finished with 
state FAILED message Command exited with code 1 exitStatus 1
13/11/11 19:16:53 INFO Worker: Asked to launch executor 
app-20131111190659-0000/4 for OMDBQueryService
13/11/11 19:16:53 INFO ExecutorRunner: Launch command: "java" "-cp" 
":/opt/spark-0.8.0/conf:/opt/spark-0.8.0/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.0-incubating-hadoop1.0.4.jar"
 "-Dspark.executor.memory=8g" "-Dspark.local.dir=/tmp/spark" 
"-XX:+UseParallelGC" "-XX:+UseParallelOldGC" "-XX:+DisableExplicitGC" 
"-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Xms512M" "-Xmx512M" 
"org.apache.spark.executor.StandaloneExecutorBackend" 
"akka://spark@poc1:54482/user/StandaloneScheduler" "4" "poc3" "16"
13/11/11 19:17:02 INFO Worker: Executor app-20131111190659-0000/4 finished with 
state FAILED message Command exited with code 1 exitStatus 1
13/11/11 19:17:02 INFO Worker: Asked to launch executor 
app-20131111190659-0000/6 for OMDBQueryService
13/11/11 19:17:02 INFO ExecutorRunner: Launch command: "java" "-cp" 
":/opt/spark-0.8.0/conf:/opt/spark-0.8.0/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.0-incubating-hadoop1.0.4.jar"
 "-Dspark.executor.memory=8g" "-Dspark.local.dir=/tmp/spark" 
"-XX:+UseParallelGC" "-XX:+UseParallelOldGC" "-XX:+DisableExplicitGC" 
"-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Xms512M" "-Xmx512M" 
"org.apache.spark.executor.StandaloneExecutorBackend" 
"akka://spark@poc1:54482/user/StandaloneScheduler" "6" "poc3" "16"
13/11/11 19:17:09 INFO Worker: Executor app-20131111190659-0000/6 finished with 
state FAILED message Command exited with code 1 exitStatus 1
13/11/11 19:17:09 INFO Worker: Asked to launch executor 
app-20131111190659-0000/8 for OMDBQueryService
13/11/11 19:17:09 INFO ExecutorRunner: Launch command: "java" "-cp" 
":/opt/spark-0.8.0/conf:/opt/spark-0.8.0/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.0-incubating-hadoop1.0.4.jar"
 "-Dspark.executor.memory=8g" "-Dspark.local.dir=/tmp/spark" 
"-XX:+UseParallelGC" "-XX:+UseParallelOldGC" "-XX:+DisableExplicitGC" 
"-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Xms512M" "-Xmx512M" 
"org.apache.spark.executor.StandaloneExecutorBackend" 
"akka://spark@poc1:54482/user/StandaloneScheduler" "8" "poc3" "16"
13/11/11 19:17:17 INFO Worker: Executor app-20131111190659-0000/8 finished with 
state FAILED message Command exited with code 1 exitStatus 1
13/11/11 19:17:17 INFO Worker: Asked to launch executor 
app-20131111190659-0000/10 for OMDBQueryService
13/11/11 19:17:17 INFO ExecutorRunner: Launch command: "java" "-cp" 
":/opt/spark-0.8.0/conf:/opt/spark-0.8.0/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.0-incubating-hadoop1.0.4.jar"
 "-Dspark.executor.memory=8g" "-Dspark.local.dir=/tmp/spark" 
"-XX:+UseParallelGC" "-XX:+UseParallelOldGC" "-XX:+DisableExplicitGC" 
"-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Dspark.executor.memory=8g" 
"-Dspark.local.dir=/tmp/spark" "-XX:+UseParallelGC" "-XX:+UseParallelOldGC" 
"-XX:+DisableExplicitGC" "-XX:MaxPermSize=1024m" "-Xms512M" "-Xmx512M" 
"org.apache.spark.executor.StandaloneExecutorBackend" 
"akka://spark@poc1:54482/user/StandaloneScheduler" "10" "poc3" "16"
13/11/11 19:17:20 INFO Worker: Asked to kill executor app-20131111190659-0000/10
13/11/11 19:17:20 INFO ExecutorRunner: Killing process!
13/11/11 19:17:20 INFO ExecutorRunner: Runner thread for executor 
app-20131111190659-0000/10 interrupted
13/11/11 19:17:21 INFO Worker: Executor app-20131111190659-0000/10 finished 
with state KILLED
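
The launch commands above pass both "-Dspark.executor.memory=8g" and 
"-Xms512M" "-Xmx512M". Below is a minimal sketch for pulling those values out 
of a logged launch command so they can be compared side by side. 
LaunchCommandMemory and memorySettings are hypothetical names, and these logs 
do not establish whether the 512M heap is related to the failure.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LaunchCommandMemory {
    // Matches quoted arguments such as "-Dspark.executor.memory=8g", "-Xms512M" or "-Xmx512M".
    private static final Pattern MEMORY_ARG = Pattern.compile(
        "\"-D(spark\\.executor\\.memory)=(\\w+)\"|\"-(Xm[sx])(\\w+)\"");

    // Collects the memory-related settings found in a launch command line.
    // If a setting is repeated, the last occurrence wins.
    public static Map<String, String> memorySettings(String launchCommand) {
        Map<String, String> settings = new LinkedHashMap<>();
        Matcher m = MEMORY_ARG.matcher(launchCommand);
        while (m.find()) {
            boolean isSystemProperty = m.group(1) != null;
            String key = isSystemProperty ? m.group(1) : m.group(3);
            String value = isSystemProperty ? m.group(2) : m.group(4);
            settings.put(key, value);
        }
        return settings;
    }

    public static void main(String[] args) {
        // Shortened copy of the arguments seen in the log above.
        String cmd = "\"java\" \"-Dspark.executor.memory=8g\" "
            + "\"-XX:MaxPermSize=1024m\" \"-Xms512M\" \"-Xmx512M\"";
        System.out.println(memorySettings(cmd));
    }
}
```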
