[ https://issues.apache.org/jira/browse/SPARK-14737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258956#comment-15258956 ]

Faisal commented on SPARK-14737:
--------------------------------

Should we expect the same behavior if only one broker node is down? For example,
if we have a Kafka cluster of 5 nodes and only one goes down, should Spark keep
running or stop?
We observe that the Spark executors shut down with the following messages in the
log, and then the driver shuts down as well.
{quote}
2016-04-26 16:12:58 INFO  RemoteActorRefProvider$RemotingTerminator:74 - Shutting down remote daemon.
2016-04-26 16:12:58 INFO  RemoteActorRefProvider$RemotingTerminator:74 - Remote daemon shut down; proceeding with flushing remote transports.
{quote}
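
If the executor exits are what ultimately take the driver down, one knob
sometimes raised on YARN is the number of executor failures the application
will tolerate before giving up. A minimal sketch, assuming the failures are
transient; the value below is purely illustrative and untested against this
scenario:
{code}
import org.apache.spark.SparkConf;

// Illustrative hardening: let YARN tolerate more executor failures before
// it kills the application. spark.yarn.max.executor.failures is a real
// setting; the value 32 is an assumption, not a recommendation.
SparkConf conf = new SparkConf()
    .setAppName("MyDataStreamProcessor")
    .set("spark.yarn.max.executor.failures", "32");
{code}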

> Kafka Brokers are down - spark stream should retry
> --------------------------------------------------
>
>                 Key: SPARK-14737
>                 URL: https://issues.apache.org/jira/browse/SPARK-14737
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.3.0
>         Environment: Suse Linux, Cloudera Enterprise 5.4.8 (#7 built by 
> jenkins on 20151023-1205 git: d7dbdf29ac1d57ae9fb19958502d50dcf4e4fffd), 
> kafka_2.10-0.8.2.2
>            Reporter: Faisal
>
> I have a Spark Streaming application that uses the direct stream approach to
> listen to a Kafka topic.
> {code}
> import java.util.HashMap;
> import java.util.HashSet;
>
> import kafka.serializer.StringDecoder;
> import org.apache.spark.streaming.api.java.JavaPairInputDStream;
> import org.apache.spark.streaming.kafka.KafkaUtils;
>
> // Kafka settings for the direct (receiver-less) stream.
> HashMap<String, String> kafkaParams = new HashMap<String, String>();
> kafkaParams.put("metadata.broker.list", "broker1,broker2,broker3");
> kafkaParams.put("auto.offset.reset", "largest");
>
> HashSet<String> topicsSet = new HashSet<String>();
> topicsSet.add("Topic1");
>
> // jssc is the application's JavaStreamingContext.
> JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
>         jssc,
>         String.class,
>         String.class,
>         StringDecoder.class,
>         StringDecoder.class,
>         kafkaParams,
>         topicsSet
> );
> {code}
> I notice that when I stop or shut down the Kafka brokers, my Spark application
> also shuts down.
> Here is the spark-submit command:
> {code}
> spark-submit \
>   --master yarn-cluster \
>   --files /home/siddiquf/spark/log4j-spark.xml \
>   --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-spark.xml" \
>   --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-spark.xml" \
>   --class com.example.MyDataStreamProcessor \
>   myapp.jar
> {code}
> The Spark job is submitted successfully, and I can track the application driver
> and worker/executor nodes.
> Everything works fine; my only concern is that when the Kafka brokers are
> offline or restarted, my application, which is managed by YARN, should not shut
> down, yet it does.
> If this is the expected behavior, how do we handle such a situation with the
> least maintenance? Keep in mind that the Kafka cluster is not part of the
> Hadoop cluster and is managed by a different team, which is why our application
> needs to be resilient enough to survive broker outages.
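> A sketch of one possible mitigation (assumed, not verified to keep the job
> alive through a full outage): let the direct stream retry its offset lookups
> and leader resolution instead of failing the batch immediately. The values
> below are illustrative.
> {code}
> import java.util.HashMap;
> import org.apache.spark.SparkConf;
>
> // Driver-side retries when the direct stream looks up offsets (default 1).
> SparkConf conf = new SparkConf()
>     .set("spark.streaming.kafka.maxRetries", "10");
>
> // Kafka consumer-side backoff before re-resolving a partition leader
> // after a broker drops out.
> HashMap<String, String> kafkaParams = new HashMap<String, String>();
> kafkaParams.put("metadata.broker.list", "broker1,broker2,broker3");
> kafkaParams.put("refresh.leader.backoff.ms", "2000");
> {code}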
> Thanks


