[
https://issues.apache.org/jira/browse/SPARK-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190783#comment-15190783
]
yuemeng commented on SPARK-13818:
---------------------------------
The code looks like:

stream.foreachRDD { rdd =>
  val ep = esPath + getIndexName("") + "/event"
  rdd.saveToEs(ep)
}
While the Spark Streaming job is running normally, we restart Elasticsearch. The tasks running at that point fail, but the batch itself never finishes or fails. In the streaming web UI we can see that the job had tasks fail with the error (org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cluster state volatile; cannot find node backing shards - please check whether your cluster is stable), yet the batch stays in "processing" status forever. In my opinion, if a job fails because of task failures for whatever reason, the batch's status should become finished or failed instead of remaining in processing.
Would anyone like to look into this issue? Thanks.
[~zsxwing], can you help me check this issue?
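The pattern behind a driver-side stopgap can be sketched in plain Scala. This is a hedged sketch of the idea, not a fix for the scheduler bug itself: writeBatch is a hypothetical stand-in for the per-batch rdd.saveToEs(ep) call, and here it simulates the Elasticsearch restart by throwing the same kind of failure.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for rdd.saveToEs(ep); simulates the write
// failing while the Elasticsearch cluster is restarting.
def writeBatch(path: String): Unit =
  throw new IllegalStateException(
    "Cluster state volatile; cannot find node backing shards")

// Wrap the output action so a failed write is observed and can be
// handled (logged, retried, or rethrown) rather than going unnoticed.
def safeWrite(path: String): Either[Throwable, Unit] =
  try Right(writeBatch(path))
  catch { case NonFatal(e) => Left(e) }
```

In the real job this wrapping would sit inside stream.foreachRDD { rdd => ... }; it only makes the failure visible and actionable on the driver, it does not change the batch's status in the streaming UI.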
> the spark streaming batch stays in processing status forever when
> elasticsearch is restarted
> ------------------------------------------------------------------------------------
>
> Key: SPARK-13818
> URL: https://issues.apache.org/jira/browse/SPARK-13818
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 1.3.0, 1.4.0, 1.5.0
> Reporter: yuemeng
> Priority: Blocker
> Fix For: 1.4.2, 1.5.3
>
>
> When using Spark Streaming to write data into Elasticsearch via
> elasticsearch-hadoop, restarting the Elasticsearch cluster causes the
> tasks of the jobs running at that time to fail with the following error:
> Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID 75, CIS-store02): org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cluster state volatile; cannot find node backing shards - please check whether your cluster is stable
> at org.elasticsearch.hadoop.rest.RestRepository.getWriteTargetPrimaryShards(RestRepository.java:370)
> at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:425)
> at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:393)
> at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
> at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
> at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
> at org.apache.spark.scheduler.Task.run(Task.scala:70)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace:
> After this, the batch stays in "processing" status forever, never failing
> or finishing, which may mean the resources held by this batch are never
> released.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]