Re: Spark job stops after a while.

2016-01-21 Thread Guillermo Ortiz
I'm using 1.5.0 of  Spark confirmed. Less this
jar file:/opt/centralLogs/lib/spark-catalyst_2.10-1.5.1.jar.

I'm going to keep looking for,, Thank you!.

2016-01-21 16:29 GMT+01:00 Ted Yu :

> Maybe this is related (fixed in 1.5.3):
> SPARK-11195 Exception thrown on executor throws ClassNotFoundException on
> driver
>
> FYI
>
> On Thu, Jan 21, 2016 at 7:10 AM, Guillermo Ortiz 
> wrote:
>
>> I'm using CDH 5.5.1 with Spark 1.5.x (I think that it's 1.5.2).
>>
>> I know that the library is here:
>> cloud-user@ose10kafkaelk:/opt/centralLogs/lib$ jar tf
>> elasticsearch-hadoop-2.2.0-beta1.jar | grep
>>  EsHadoopIllegalArgumentException
>> org/elasticsearch/hadoop/EsHadoopIllegalArgumentException.class
>>
>> I have check in SparkUI with the process running
>> http://10.129.96.55:39320/jars/elasticsearch-hadoop-2.2.0-beta1.jar Added
>> By User
>> And spark.jars from SparkUI.
>>
>> .file:/opt/centralLogs/lib/elasticsearch-hadoop-2.2.0-beta1.jar,file:/opt/centralLogs/lib/geronimo-annotation_1.0_spec-1.1.1.jar,
>>
>> I think that in yarn-client although it has the error it doesn't stop the
>> execution, but I don't know why.
>>
>>
>>
>> 2016-01-21 15:55 GMT+01:00 Ted Yu :
>>
>>> Looks like jar containing EsHadoopIllegalArgumentException class wasn't
>>> in the classpath.
>>> Can you double check ?
>>>
>>> Which Spark version are you using ?
>>>
>>> Cheers
>>>
>>> On Thu, Jan 21, 2016 at 6:50 AM, Guillermo Ortiz 
>>> wrote:
>>>
 I'm runing a Spark Streaming process and it stops in a while. It makes
 some process an insert the result in ElasticSeach with its library. After a
 while the process fail.

 I have been checking the logs and I have seen this error
 2016-01-21 14:57:54,388 [sparkDriver-akka.actor.default-dispatcher-17]
 INFO  org.apache.spark.storage.BlockManagerInfo - Added broadcast_2_piece0
 in memory on ose11kafkaelk.novalocal:46913 (size: 6.0 KB, free: 530.3 MB)
 2016-01-21 14:57:54,646 [task-result-getter-1] INFO
  org.apache.spark.scheduler.TaskSetManager - Finished task 0.0 in stage 2.0
 (TID 7) in 397 ms on ose12kafkaelk.novalocal (1/6)
 2016-01-21 14:57:54,647 [task-result-getter-2] INFO
  org.apache.spark.scheduler.TaskSetManager - Finished task 2.0 in stage 2.0
 (TID 10) in 395 ms on ose12kafkaelk.novalocal (2/6)
 2016-01-21 14:57:54,731 [task-result-getter-3] INFO
  org.apache.spark.scheduler.TaskSetManager - Finished task 5.0 in stage 2.0
 (TID 9) in 481 ms on ose11kafkaelk.novalocal (3/6)
 2016-01-21 14:57:54,844 [task-result-getter-1] INFO
  org.apache.spark.scheduler.TaskSetManager - Finished task 4.0 in stage 2.0
 (TID 8) in 595 ms on ose10kafkaelk.novalocal (4/6)
 2016-01-21 14:57:54,850 [task-result-getter-0] WARN
  org.apache.spark.ThrowableSerializationWrapper - Task exception could not
 be deserialized
 java.lang.ClassNotFoundException:
 org.elasticsearch.hadoop.EsHadoopIllegalArgumentException
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)

 I don't know why I'm getting this error because the class
 org.elasticsearch.hadoop.EsHadoopIllegalArgumentException is in the library
 of elasticSearch.

 After this error I get others error and finally Spark ends.
 2016-01-21 14:57:55,012 [JobScheduler] INFO
  org.apache.spark.streaming.scheduler.JobScheduler - Starting job streaming
 job 145338464 ms.0 from job set of time 145338464 ms
 2016-01-21 14:57:55,012 [JobScheduler] ERROR
 org.apache.spark.streaming.scheduler.JobScheduler - Error running job
 streaming job 1453384635000 ms.0
 org.apache.spark.SparkException: Job aborted due to stage failure: Task
 1 in stage 2.0 failed 4 times, most recent failure: Lost task 1.3 in stage
 2.0 (TID 13, ose11kafkaelk.novalocal): UnknownReason
 Driver stacktrace:
 at org.apache.spark.scheduler.DAGScheduler.org
 $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1294)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1282)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1281)
 at
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at
 org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1281)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
 at scala.Option.foreach(Option.scala:236)
 at
 

Re: Spark job stops after a while.

2016-01-21 Thread Ted Yu
Looks like jar containing EsHadoopIllegalArgumentException class wasn't in
the classpath.
Can you double check ?

Which Spark version are you using ?

Cheers

On Thu, Jan 21, 2016 at 6:50 AM, Guillermo Ortiz 
wrote:

> I'm runing a Spark Streaming process and it stops in a while. It makes
> some process an insert the result in ElasticSeach with its library. After a
> while the process fail.
>
> I have been checking the logs and I have seen this error
> 2016-01-21 14:57:54,388 [sparkDriver-akka.actor.default-dispatcher-17]
> INFO  org.apache.spark.storage.BlockManagerInfo - Added broadcast_2_piece0
> in memory on ose11kafkaelk.novalocal:46913 (size: 6.0 KB, free: 530.3 MB)
> 2016-01-21 14:57:54,646 [task-result-getter-1] INFO
>  org.apache.spark.scheduler.TaskSetManager - Finished task 0.0 in stage 2.0
> (TID 7) in 397 ms on ose12kafkaelk.novalocal (1/6)
> 2016-01-21 14:57:54,647 [task-result-getter-2] INFO
>  org.apache.spark.scheduler.TaskSetManager - Finished task 2.0 in stage 2.0
> (TID 10) in 395 ms on ose12kafkaelk.novalocal (2/6)
> 2016-01-21 14:57:54,731 [task-result-getter-3] INFO
>  org.apache.spark.scheduler.TaskSetManager - Finished task 5.0 in stage 2.0
> (TID 9) in 481 ms on ose11kafkaelk.novalocal (3/6)
> 2016-01-21 14:57:54,844 [task-result-getter-1] INFO
>  org.apache.spark.scheduler.TaskSetManager - Finished task 4.0 in stage 2.0
> (TID 8) in 595 ms on ose10kafkaelk.novalocal (4/6)
> 2016-01-21 14:57:54,850 [task-result-getter-0] WARN
>  org.apache.spark.ThrowableSerializationWrapper - Task exception could not
> be deserialized
> java.lang.ClassNotFoundException:
> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
>
> I don't know why I'm getting this error because the class
> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException is in the library
> of elasticSearch.
>
> After this error I get others error and finally Spark ends.
> 2016-01-21 14:57:55,012 [JobScheduler] INFO
>  org.apache.spark.streaming.scheduler.JobScheduler - Starting job streaming
> job 145338464 ms.0 from job set of time 145338464 ms
> 2016-01-21 14:57:55,012 [JobScheduler] ERROR
> org.apache.spark.streaming.scheduler.JobScheduler - Error running job
> streaming job 1453384635000 ms.0
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 1
> in stage 2.0 failed 4 times, most recent failure: Lost task 1.3 in stage
> 2.0 (TID 13, ose11kafkaelk.novalocal): UnknownReason
> Driver stacktrace:
> at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1294)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1282)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1281)
> at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1281)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1507)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1469)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1914)
> at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:67)
> at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:54)
> at org.elasticsearch.spark.rdd.EsSpark$.saveJsonToEs(EsSpark.scala:90)
> at
> org.elasticsearch.spark.package$SparkJsonRDDFunctions.saveJsonToEs(package.scala:44)
> at
> produban.spark.CentralLog$$anonfun$createContext$1.apply(CentralLog.scala:56)
> at
> produban.spark.CentralLog$$anonfun$createContext$1.apply(CentralLog.scala:33)
> at
> org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:631)
> at
> org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:631)
> at
> 

Re: Spark job stops after a while.

2016-01-21 Thread Ted Yu
Maybe this is related (fixed in 1.5.3):
SPARK-11195 Exception thrown on executor throws ClassNotFoundException on
driver

FYI

On Thu, Jan 21, 2016 at 7:10 AM, Guillermo Ortiz 
wrote:

> I'm using CDH 5.5.1 with Spark 1.5.x (I think that it's 1.5.2).
>
> I know that the library is here:
> cloud-user@ose10kafkaelk:/opt/centralLogs/lib$ jar tf
> elasticsearch-hadoop-2.2.0-beta1.jar | grep
>  EsHadoopIllegalArgumentException
> org/elasticsearch/hadoop/EsHadoopIllegalArgumentException.class
>
> I have check in SparkUI with the process running
> http://10.129.96.55:39320/jars/elasticsearch-hadoop-2.2.0-beta1.jar Added
> By User
> And spark.jars from SparkUI.
>
> .file:/opt/centralLogs/lib/elasticsearch-hadoop-2.2.0-beta1.jar,file:/opt/centralLogs/lib/geronimo-annotation_1.0_spec-1.1.1.jar,
>
> I think that in yarn-client although it has the error it doesn't stop the
> execution, but I don't know why.
>
>
>
> 2016-01-21 15:55 GMT+01:00 Ted Yu :
>
>> Looks like jar containing EsHadoopIllegalArgumentException class wasn't
>> in the classpath.
>> Can you double check ?
>>
>> Which Spark version are you using ?
>>
>> Cheers
>>
>> On Thu, Jan 21, 2016 at 6:50 AM, Guillermo Ortiz 
>> wrote:
>>
>>> I'm runing a Spark Streaming process and it stops in a while. It makes
>>> some process an insert the result in ElasticSeach with its library. After a
>>> while the process fail.
>>>
>>> I have been checking the logs and I have seen this error
>>> 2016-01-21 14:57:54,388 [sparkDriver-akka.actor.default-dispatcher-17]
>>> INFO  org.apache.spark.storage.BlockManagerInfo - Added broadcast_2_piece0
>>> in memory on ose11kafkaelk.novalocal:46913 (size: 6.0 KB, free: 530.3 MB)
>>> 2016-01-21 14:57:54,646 [task-result-getter-1] INFO
>>>  org.apache.spark.scheduler.TaskSetManager - Finished task 0.0 in stage 2.0
>>> (TID 7) in 397 ms on ose12kafkaelk.novalocal (1/6)
>>> 2016-01-21 14:57:54,647 [task-result-getter-2] INFO
>>>  org.apache.spark.scheduler.TaskSetManager - Finished task 2.0 in stage 2.0
>>> (TID 10) in 395 ms on ose12kafkaelk.novalocal (2/6)
>>> 2016-01-21 14:57:54,731 [task-result-getter-3] INFO
>>>  org.apache.spark.scheduler.TaskSetManager - Finished task 5.0 in stage 2.0
>>> (TID 9) in 481 ms on ose11kafkaelk.novalocal (3/6)
>>> 2016-01-21 14:57:54,844 [task-result-getter-1] INFO
>>>  org.apache.spark.scheduler.TaskSetManager - Finished task 4.0 in stage 2.0
>>> (TID 8) in 595 ms on ose10kafkaelk.novalocal (4/6)
>>> 2016-01-21 14:57:54,850 [task-result-getter-0] WARN
>>>  org.apache.spark.ThrowableSerializationWrapper - Task exception could not
>>> be deserialized
>>> java.lang.ClassNotFoundException:
>>> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>>
>>> I don't know why I'm getting this error because the class
>>> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException is in the library
>>> of elasticSearch.
>>>
>>> After this error I get others error and finally Spark ends.
>>> 2016-01-21 14:57:55,012 [JobScheduler] INFO
>>>  org.apache.spark.streaming.scheduler.JobScheduler - Starting job streaming
>>> job 145338464 ms.0 from job set of time 145338464 ms
>>> 2016-01-21 14:57:55,012 [JobScheduler] ERROR
>>> org.apache.spark.streaming.scheduler.JobScheduler - Error running job
>>> streaming job 1453384635000 ms.0
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>>> 1 in stage 2.0 failed 4 times, most recent failure: Lost task 1.3 in stage
>>> 2.0 (TID 13, ose11kafkaelk.novalocal): UnknownReason
>>> Driver stacktrace:
>>> at org.apache.spark.scheduler.DAGScheduler.org
>>> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1294)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1282)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1281)
>>> at
>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1281)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>>> at scala.Option.foreach(Option.scala:236)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>>> at
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1507)
>>> at
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1469)
>>> at
>>> 

Re: Spark job stops after a while.

2016-01-21 Thread Guillermo Ortiz
I'm using CDH 5.5.1 with Spark 1.5.x (I think that it's 1.5.2).

I know that the library is here:
cloud-user@ose10kafkaelk:/opt/centralLogs/lib$ jar tf
elasticsearch-hadoop-2.2.0-beta1.jar | grep
 EsHadoopIllegalArgumentException
org/elasticsearch/hadoop/EsHadoopIllegalArgumentException.class

I have check in SparkUI with the process running
http://10.129.96.55:39320/jars/elasticsearch-hadoop-2.2.0-beta1.jar Added
By User
And spark.jars from SparkUI.
.file:/opt/centralLogs/lib/elasticsearch-hadoop-2.2.0-beta1.jar,file:/opt/centralLogs/lib/geronimo-annotation_1.0_spec-1.1.1.jar,

I think that in yarn-client although it has the error it doesn't stop the
execution, but I don't know why.



2016-01-21 15:55 GMT+01:00 Ted Yu :

> Looks like jar containing EsHadoopIllegalArgumentException class wasn't
> in the classpath.
> Can you double check ?
>
> Which Spark version are you using ?
>
> Cheers
>
> On Thu, Jan 21, 2016 at 6:50 AM, Guillermo Ortiz 
> wrote:
>
>> I'm runing a Spark Streaming process and it stops in a while. It makes
>> some process an insert the result in ElasticSeach with its library. After a
>> while the process fail.
>>
>> I have been checking the logs and I have seen this error
>> 2016-01-21 14:57:54,388 [sparkDriver-akka.actor.default-dispatcher-17]
>> INFO  org.apache.spark.storage.BlockManagerInfo - Added broadcast_2_piece0
>> in memory on ose11kafkaelk.novalocal:46913 (size: 6.0 KB, free: 530.3 MB)
>> 2016-01-21 14:57:54,646 [task-result-getter-1] INFO
>>  org.apache.spark.scheduler.TaskSetManager - Finished task 0.0 in stage 2.0
>> (TID 7) in 397 ms on ose12kafkaelk.novalocal (1/6)
>> 2016-01-21 14:57:54,647 [task-result-getter-2] INFO
>>  org.apache.spark.scheduler.TaskSetManager - Finished task 2.0 in stage 2.0
>> (TID 10) in 395 ms on ose12kafkaelk.novalocal (2/6)
>> 2016-01-21 14:57:54,731 [task-result-getter-3] INFO
>>  org.apache.spark.scheduler.TaskSetManager - Finished task 5.0 in stage 2.0
>> (TID 9) in 481 ms on ose11kafkaelk.novalocal (3/6)
>> 2016-01-21 14:57:54,844 [task-result-getter-1] INFO
>>  org.apache.spark.scheduler.TaskSetManager - Finished task 4.0 in stage 2.0
>> (TID 8) in 595 ms on ose10kafkaelk.novalocal (4/6)
>> 2016-01-21 14:57:54,850 [task-result-getter-0] WARN
>>  org.apache.spark.ThrowableSerializationWrapper - Task exception could not
>> be deserialized
>> java.lang.ClassNotFoundException:
>> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> at java.security.AccessController.doPrivileged(Native Method)
>>
>> I don't know why I'm getting this error because the class
>> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException is in the library
>> of elasticSearch.
>>
>> After this error I get others error and finally Spark ends.
>> 2016-01-21 14:57:55,012 [JobScheduler] INFO
>>  org.apache.spark.streaming.scheduler.JobScheduler - Starting job streaming
>> job 145338464 ms.0 from job set of time 145338464 ms
>> 2016-01-21 14:57:55,012 [JobScheduler] ERROR
>> org.apache.spark.streaming.scheduler.JobScheduler - Error running job
>> streaming job 1453384635000 ms.0
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 1
>> in stage 2.0 failed 4 times, most recent failure: Lost task 1.3 in stage
>> 2.0 (TID 13, ose11kafkaelk.novalocal): UnknownReason
>> Driver stacktrace:
>> at org.apache.spark.scheduler.DAGScheduler.org
>> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1294)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1282)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1281)
>> at
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at
>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1281)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>> at scala.Option.foreach(Option.scala:236)
>> at
>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1507)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1469)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>> at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
>> at 

Re: Spark job stops after a while.

2016-01-21 Thread Guillermo Ortiz
I think that it's that bug, because the error is the same.. thanks a lot.

2016-01-21 16:46 GMT+01:00 Guillermo Ortiz :

> I'm using 1.5.0 of  Spark confirmed. Less this
> jar file:/opt/centralLogs/lib/spark-catalyst_2.10-1.5.1.jar.
>
> I'm going to keep looking for,, Thank you!.
>
> 2016-01-21 16:29 GMT+01:00 Ted Yu :
>
>> Maybe this is related (fixed in 1.5.3):
>> SPARK-11195 Exception thrown on executor throws ClassNotFoundException on
>> driver
>>
>> FYI
>>
>> On Thu, Jan 21, 2016 at 7:10 AM, Guillermo Ortiz 
>> wrote:
>>
>>> I'm using CDH 5.5.1 with Spark 1.5.x (I think that it's 1.5.2).
>>>
>>> I know that the library is here:
>>> cloud-user@ose10kafkaelk:/opt/centralLogs/lib$ jar tf
>>> elasticsearch-hadoop-2.2.0-beta1.jar | grep
>>>  EsHadoopIllegalArgumentException
>>> org/elasticsearch/hadoop/EsHadoopIllegalArgumentException.class
>>>
>>> I have check in SparkUI with the process running
>>> http://10.129.96.55:39320/jars/elasticsearch-hadoop-2.2.0-beta1.jar Added
>>> By User
>>> And spark.jars from SparkUI.
>>>
>>> .file:/opt/centralLogs/lib/elasticsearch-hadoop-2.2.0-beta1.jar,file:/opt/centralLogs/lib/geronimo-annotation_1.0_spec-1.1.1.jar,
>>>
>>> I think that in yarn-client although it has the error it doesn't stop
>>> the execution, but I don't know why.
>>>
>>>
>>>
>>> 2016-01-21 15:55 GMT+01:00 Ted Yu :
>>>
 Looks like jar containing EsHadoopIllegalArgumentException class
 wasn't in the classpath.
 Can you double check ?

 Which Spark version are you using ?

 Cheers

 On Thu, Jan 21, 2016 at 6:50 AM, Guillermo Ortiz 
 wrote:

> I'm runing a Spark Streaming process and it stops in a while. It makes
> some process an insert the result in ElasticSeach with its library. After 
> a
> while the process fail.
>
> I have been checking the logs and I have seen this error
> 2016-01-21 14:57:54,388 [sparkDriver-akka.actor.default-dispatcher-17]
> INFO  org.apache.spark.storage.BlockManagerInfo - Added broadcast_2_piece0
> in memory on ose11kafkaelk.novalocal:46913 (size: 6.0 KB, free: 530.3 MB)
> 2016-01-21 14:57:54,646 [task-result-getter-1] INFO
>  org.apache.spark.scheduler.TaskSetManager - Finished task 0.0 in stage 
> 2.0
> (TID 7) in 397 ms on ose12kafkaelk.novalocal (1/6)
> 2016-01-21 14:57:54,647 [task-result-getter-2] INFO
>  org.apache.spark.scheduler.TaskSetManager - Finished task 2.0 in stage 
> 2.0
> (TID 10) in 395 ms on ose12kafkaelk.novalocal (2/6)
> 2016-01-21 14:57:54,731 [task-result-getter-3] INFO
>  org.apache.spark.scheduler.TaskSetManager - Finished task 5.0 in stage 
> 2.0
> (TID 9) in 481 ms on ose11kafkaelk.novalocal (3/6)
> 2016-01-21 14:57:54,844 [task-result-getter-1] INFO
>  org.apache.spark.scheduler.TaskSetManager - Finished task 4.0 in stage 
> 2.0
> (TID 8) in 595 ms on ose10kafkaelk.novalocal (4/6)
> 2016-01-21 14:57:54,850 [task-result-getter-0] WARN
>  org.apache.spark.ThrowableSerializationWrapper - Task exception could not
> be deserialized
> java.lang.ClassNotFoundException:
> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
>
> I don't know why I'm getting this error because the class
> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException is in the 
> library
> of elasticSearch.
>
> After this error I get others error and finally Spark ends.
> 2016-01-21 14:57:55,012 [JobScheduler] INFO
>  org.apache.spark.streaming.scheduler.JobScheduler - Starting job 
> streaming
> job 145338464 ms.0 from job set of time 145338464 ms
> 2016-01-21 14:57:55,012 [JobScheduler] ERROR
> org.apache.spark.streaming.scheduler.JobScheduler - Error running job
> streaming job 1453384635000 ms.0
> org.apache.spark.SparkException: Job aborted due to stage failure:
> Task 1 in stage 2.0 failed 4 times, most recent failure: Lost task 1.3 in
> stage 2.0 (TID 13, ose11kafkaelk.novalocal): UnknownReason
> Driver stacktrace:
> at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1294)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1282)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1281)
> at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at
>