Re: Uncaught exception in thread delete Spark local dirs

2015-06-27 Thread Ted Yu
Guillermo:

bq. Shell output: Requested user hdfs is not whitelisted and has id
496,which is below the minimum allowed 1000

Are you using a secure cluster?

Can user hdfs be re-created with uid > 1000?
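
If it helps, a quick way to check on a NodeManager host (the config file path
below is only a guess for a CDH-style layout; adjust it to wherever
container-executor.cfg actually lives in your install):

# Numeric uid of the user the containers would run as
id -u hdfs

# Settings the LinuxContainerExecutor enforces on a secure cluster
grep -E 'min.user.id|allowed.system.users|banned.users' \
  /etc/hadoop/conf/container-executor.cfg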

Cheers

Re: Uncaught exception in thread delete Spark local dirs

2015-06-27 Thread Guillermo Ortiz
I changed the variable name and I got the same error.

Re: Uncaught exception in thread delete Spark local dirs

2015-06-27 Thread Tathagata Das
Well, though randomly chosen, SPARK_CLASSPATH is a recognized env variable
that is picked up by spark-submit. It is what was used pre-Spark 1.0, but it
was deprecated after that. Mind renaming that variable and trying it out
again? At least it will eliminate one possible source of problems.
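
For example, something along these lines, where APP_JARS is just an arbitrary
placeholder name and the paths are the ones from your script and submit
command (an untested sketch):

# Same loop, but under a name spark-submit does not treat specially
APP_JARS=""
for lib in /usr/metrics/lib/*.jar; do
  APP_JARS=${APP_JARS:+$APP_JARS,}$lib
done
spark-submit --name "Metrics" --master yarn-client \
  --jars "$APP_JARS" /usr/metrics/ex/metrics-spark.jar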

TD

Re: Uncaught exception in thread delete Spark local dirs

2015-06-27 Thread Guillermo Ortiz
I'm checking the logs in YARN and I found this error as well:

Application application_1434976209271_15614 failed 2 times due to AM
Container for appattempt_1434976209271_15614_02 exited with exitCode:
255


Diagnostics: Exception from container-launch.
Container id: container_1434976209271_15614_02_01
Exit code: 255
Stack trace: ExitCodeException exitCode=255:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:293)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Shell output: Requested user hdfs is not whitelisted and has id 496,which
is below the minimum allowed 1000
Container exited with a non-zero exit code 255
Failing this attempt. Failing the application.
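
If more detail is needed, the full aggregated container logs can be pulled
with the YARN CLI (assuming log aggregation is enabled on the cluster):

yarn logs -applicationId application_1434976209271_15614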

Re: Uncaught exception in thread delete Spark local dirs

2015-06-27 Thread Guillermo Ortiz
Well, SPARK_CLASSPATH is just a random name; the complete script is this:

export HADOOP_CONF_DIR=/etc/hadoop/conf
SPARK_CLASSPATH="file:/usr/metrics/conf/elasticSearch.properties,file:/usr/metrics/conf/redis.properties,/etc/spark/conf.cloudera.spark_on_yarn/yarn-conf/"
for lib in `ls /usr/metrics/lib/*.jar`
do
  if [ -z "$SPARK_CLASSPATH" ]; then
    SPARK_CLASSPATH=$lib
  else
    SPARK_CLASSPATH=$SPARK_CLASSPATH,$lib
  fi
done
spark-submit --name "Metrics"

I need to add all the jars, as you know; maybe SPARK_CLASSPATH was a bad
choice of name.
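
Maybe I could also keep the two .properties files out of the jar list and ship
them with --files instead, something like this (not tested, and whether --files
is the right mechanism depends on how the code loads them; the yarn-conf
directory is left out here):

PROPS="/usr/metrics/conf/elasticSearch.properties,/usr/metrics/conf/redis.properties"
jars=(/usr/metrics/lib/*.jar)
JAR_LIST=$(IFS=,; echo "${jars[*]}")

spark-submit --name "Metrics" \
  --jars "$JAR_LIST" \
  --files "$PROPS" \
  /usr/metrics/ex/metrics-spark.jar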

The code doesn't have any stateful operations, so I guess it's okay that it
doesn't have a checkpoint. I have executed this code hundreds of times in the
Cloudera VM and never got this error.

Re: Uncaught exception in thread delete Spark local dirs

2015-06-27 Thread Tathagata Das
1. You need checkpointing mostly for recovering from driver failures, and
in some cases also for some stateful operations.

2. Could you try not using the SPARK_CLASSPATH environment variable?

TD

Re: Uncaught exception in thread delete Spark local dirs

2015-06-27 Thread Guillermo Ortiz
I don't have any checkpoint in my code. Really, I don't have to save any
state; it's just log processing for a PoC.
I have been testing the code in the Cloudera VM and I never got that
error. Now it's on a real cluster.

The command to execute Spark is:
spark-submit --name "PoC Logs" --master yarn-client --class
com.metrics.MetricsSpark --jars $SPARK_CLASSPATH --executor-memory 1g
/usr/metrics/ex/metrics-spark.jar $1 $2 $3

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

val sparkConf = new SparkConf()
val ssc = new StreamingContext(sparkConf, Seconds(5))
val kafkaParams = Map[String, String]("metadata.broker.list" -> args(0))
val topics = args(1).split("\\,")
val directKafkaStream = KafkaUtils.createDirectStream[String, String,
  StringDecoder, StringDecoder](ssc, kafkaParams, topics.toSet)

directKafkaStream.foreachRDD { rdd =>
  val offsets = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  val documents = rdd.mapPartitionsWithIndex { (i, kafkaEvent) =>
    // ...
  }
}

I understand that I just need a checkpoint if I need to recover the job when
something goes wrong, right?


Re: Uncaught exception in thread delete Spark local dirs

2015-06-27 Thread Tathagata Das
How are you trying to execute the code again? From checkpoints, or
otherwise?
Also cc'ed Hari, who may have a better idea of YARN-related issues.

On Sat, Jun 27, 2015 at 12:35 AM, Guillermo Ortiz 
wrote:

> Hi,
>
> I'm executing a Spark Streaming job with Kafka. The code was working, but
> today I tried to execute it again and I got an exception; I don't know
> what is happening. Right now there are no job executions on YARN.
> How could I fix it?
>
> Exception in thread "main" org.apache.spark.SparkException: Yarn
> application has already ended! It might have been killed or unable to
> launch application master.
> at
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:113)
> at
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
> at
> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:379)
> at
> org.apache.spark.streaming.StreamingContext$.createNewSparkContext(StreamingContext.scala:642)
> at
> org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:75)
> at
> com.produban.metrics.MetricsTransfInternationalSpark$.main(MetricsTransfInternationalSpark.scala:66)
> at
> com.produban.metrics.MetricsTransfInternationalSpark.main(MetricsTransfInternationalSpark.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
> at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
> at
> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> *15/06/27 09:27:09 ERROR Utils: Uncaught exception in thread delete Spark
> local dirs*
> java.lang.NullPointerException
> at org.apache.spark.storage.DiskBlockManager.org
> $apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161)
> at
> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141)
> at
> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
> at
> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
> at
> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
> at
> org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139)
> Exception in thread "delete Spark local dirs"
> java.lang.NullPointerException
> at org.apache.spark.storage.DiskBlockManager.org
> $apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161)
> at
> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141)
> at
> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
> at
> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
> at
> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
> at
> org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139)