[ 
https://issues.apache.org/jira/browse/SPARK-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190783#comment-15190783
 ] 

yuemeng commented on SPARK-13818:
---------------------------------

The write is done like this (note the original snippet was missing its
closing brace; corrected here):

 stream.foreachRDD { rdd =>
   // build the target index/type path and write this batch to Elasticsearch
   val ep = esPath + getIndexName("") + "/event"
   rdd.saveToEs(ep)
 }

While the streaming job is running normally, we restart Elasticsearch. The 
tasks running at that point fail, but the batch never finishes or fails. In 
the streaming web UI we can see that the job had failed tasks with the error 
(org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cluster state 
volatile; cannot find node backing shards - please check whether your cluster 
is stable), yet the batch stays in "processing" status forever. In my 
opinion, if the job failed because of task failures, the batch's status 
should become finished or failed instead of remaining in processing.
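As a user-side workaround (a sketch only, not a fix for the scheduler issue; it assumes the elasticsearch-hadoop saveToEs API and reuses the esPath/getIndexName helpers from the snippet above, which are not defined here), the write failure can be caught inside foreachRDD and the StreamingContext stopped, so the application terminates instead of leaving the batch in processing forever:

```scala
import org.apache.spark.streaming.StreamingContext
import org.elasticsearch.hadoop.EsHadoopException
import org.elasticsearch.spark._ // adds saveToEs to RDDs

// Sketch only: ssc, stream, esPath and getIndexName come from the
// surrounding application and are assumed here, not defined by this example.
stream.foreachRDD { rdd =>
  val ep = esPath + getIndexName("") + "/event"
  try {
    rdd.saveToEs(ep)
  } catch {
    // EsHadoopException is the root of elasticsearch-hadoop's exceptions,
    // including the EsHadoopIllegalStateException seen during the restart.
    case e: EsHadoopException =>
      // Fail fast: stop the context so the batch is not left "processing"
      // and the resources it holds are released.
      ssc.stop(stopSparkContext = true, stopGracefully = false)
      throw e
  }
}
```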

Would anyone like to check this issue? Thanks.
[~zsxwing], can you help me check this issue?
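To make a stuck batch visible in the logs, one can also register a listener through Spark's public StreamingListener API (a monitoring sketch; it only logs progress, so a batch that starts but never completes, the symptom reported here, shows up as a start with no matching completion):

```scala
import org.apache.spark.streaming.scheduler.{
  StreamingListener, StreamingListenerBatchStarted, StreamingListenerBatchCompleted}

// Monitoring sketch: log every batch start and completion.
class BatchProgressListener extends StreamingListener {
  override def onBatchStarted(started: StreamingListenerBatchStarted): Unit =
    println(s"batch ${started.batchInfo.batchTime} started")

  override def onBatchCompleted(completed: StreamingListenerBatchCompleted): Unit =
    println(s"batch ${completed.batchInfo.batchTime} completed, " +
      s"processing took ${completed.batchInfo.processingDelay.getOrElse(-1L)} ms")
}

// ssc is the application's StreamingContext (assumed, not defined here)
ssc.addStreamingListener(new BatchProgressListener)
```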




> the Spark Streaming batch stays in processing status forever when 
> Elasticsearch is restarted
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-13818
>                 URL: https://issues.apache.org/jira/browse/SPARK-13818
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.3.0, 1.4.0, 1.5.0
>            Reporter: yuemeng
>            Priority: Blocker
>             Fix For: 1.4.2, 1.5.3
>
>
> Using Spark Streaming to write data into Elasticsearch via 
> elasticsearch-hadoop: when we restart the Elasticsearch cluster, tasks in 
> the jobs running at that time fail with the following error:
> Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most 
> recent failure: Lost task 0.3 in stage 4.0 (TID 75, CIS-store02): 
> org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cluster state 
> volatile; cannot find node backing shards - please check whether your cluster 
> is stable
> at org.elasticsearch.hadoop.rest.RestRepository.getWriteTargetPrimaryShards(RestRepository.java:370)
> at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:425)
> at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:393)
> at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
> at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
> at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
> at org.apache.spark.scheduler.Task.run(Task.scala:70)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace:
> and this batch will stay in "processing" status forever, never failing or 
> finishing, which may mean the resources held by this batch are never released.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
