[ 
https://issues.apache.org/jira/browse/SPARK-21836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yue long updated SPARK-21836:
-----------------------------
    Description: 
When using the package spark-streaming-kafka-0-8 to access Kafka from a Spark 
DStream, many users will hit a "could not find leader" exception if some of the 
Kafka brokers are down. This fails the whole streaming application, as 
[SPARK-18983|https://issues.apache.org/jira/browse/SPARK-18983] describes. The 
failed Kafka brokers may also cause other problems when creating the DStream or 
the batch jobs.

Even though a Kafka broker going down is not a bug in Spark Streaming, we can 
avoid this failure in Spark Streaming, especially because a Kafka cluster is 
not always stable in real production and will substitute another broker for 
the failed one within a very short time. If our streaming application fails 
instantly when one Kafka broker is down, it may take a lot of effort to 
restart it.

Does anyone think we should add some retry logic for when a Kafka broker is 
down? I have implemented and tested this function on Spark 1.6.3 and Spark 
2.1.0. If we implement it, it will reduce the number of Kafka streaming 
failures, which should help streaming users.
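The proposed behavior can be sketched as a generic retry-with-backoff wrapper around the leader lookup, so a transient broker outage is retried instead of failing the job immediately. This is a minimal, self-contained illustration of the idea, not the actual patch; the helper name, parameters, and backoff policy are assumptions for the sketch.

```scala
import scala.util.{Try, Success, Failure}

// Hypothetical helper (not part of Spark): illustrates the kind of retry
// logic proposed for leader lookup in spark-streaming-kafka-0-8.
object RetryUtil {
  // Run `op`, retrying up to `maxRetries` times with a fixed sleep between
  // attempts, so a briefly unavailable broker does not fail the whole job.
  def withRetries[T](maxRetries: Int, backoffMs: Long)(op: => T): T = {
    var attempt = 0
    while (true) {
      Try(op) match {
        case Success(v) => return v
        case Failure(e) =>
          attempt += 1
          if (attempt > maxRetries) throw e // give up after the last retry
          Thread.sleep(backoffMs)
      }
    }
    throw new IllegalStateException("unreachable")
  }
}
```

In the real patch the wrapped `op` would be the "find leader" metadata call; here it can be any block that may throw while a replacement broker is being elected.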

  was:
When using the package spark-streaming-kafka-0-8 to access Kafka from a Spark 
DStream, many users will hit a "could not find leader" exception if some of the 
Kafka brokers are down. This fails the whole streaming application, as 
[SPARK-18983|https://issues.apache.org/jira/browse/SPARK-18983] describes. The 
failed Kafka brokers may also cause other problems when creating the DStream or 
the batch jobs.

Even though a Kafka broker going down is not a bug in Spark Streaming, we can 
avoid this failure in Spark Streaming, especially because a Kafka cluster is 
not always stable in real production.

Does anyone think we should add some retry logic for when a Kafka broker is 
down? I have implemented and tested this function on Spark 1.6.3 and Spark 
2.1.0. If we implement it, it will reduce the number of Kafka streaming 
failures, which should help streaming users.


> [STREAMING] Retry when kafka broker is down in kafka-streaming-0-8
> ------------------------------------------------------------------
>
>                 Key: SPARK-21836
>                 URL: https://issues.apache.org/jira/browse/SPARK-21836
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>    Affects Versions: 1.6.3, 2.1.0
>            Reporter: yue long
>
> When using the package spark-streaming-kafka-0-8 to access Kafka from a 
> Spark DStream, many users will hit a "could not find leader" exception if 
> some of the Kafka brokers are down. This fails the whole streaming 
> application, as [SPARK-18983|https://issues.apache.org/jira/browse/SPARK-18983] 
> describes. The failed Kafka brokers may also cause other problems when 
> creating the DStream or the batch jobs.
> Even though a Kafka broker going down is not a bug in Spark Streaming, we 
> can avoid this failure in Spark Streaming, especially because a Kafka 
> cluster is not always stable in real production and will substitute another 
> broker for the failed one within a very short time. If our streaming 
> application fails instantly when one Kafka broker is down, it may take a 
> lot of effort to restart it.
> Does anyone think we should add some retry logic for when a Kafka broker is 
> down? I have implemented and tested this function on Spark 1.6.3 and Spark 
> 2.1.0. If we implement it, it will reduce the number of Kafka streaming 
> failures, which should help streaming users.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
