[
https://issues.apache.org/jira/browse/SPARK-14737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278387#comment-15278387
]
Sean Owen commented on SPARK-14737:
-----------------------------------
You could retry the entire app if it fails, in general. That's probably the
best thing to be able to do in any event. You might be able to re-create a
stream within one application and just start it again; I haven't tried that.
I'm not sure what parameter you are referring to. I don't think in general you
can have a 'recover from a fatal error' mechanism, and this is one of them.
Why is the broker totally down -- isn't that the problem? Do you need more
replicas?
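The "retry the entire app" suggestion can be sketched as a wrapper around spark-submit. This is a hypothetical helper, not anything Spark ships; the retry count and 1-second backoff are arbitrary illustrative values:

```shell
#!/bin/sh
# Hypothetical retry helper: run a command up to a maximum number of
# attempts, returning 0 on the first success and 1 if all attempts fail.
retry() {
  max=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt $attempt of $max failed; retrying" >&2
    attempt=$((attempt + 1))
    sleep 1   # illustrative backoff; use a longer delay in practice
  done
  return 1
}

# Example: resubmit the whole application up to 5 times, e.g.
# retry 5 spark-submit --master yarn-cluster \
#   --class com.example.MyDataStreamProcessor myapp.jar
```

In this scenario the wrapper would wrap the full spark-submit command from the issue, so a driver that dies when the brokers go away gets resubmitted once the brokers come back.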
> Kafka Brokers are down - spark stream should retry
> --------------------------------------------------
>
> Key: SPARK-14737
> URL: https://issues.apache.org/jira/browse/SPARK-14737
> Project: Spark
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 1.3.0
> Environment: Suse Linux, Cloudera Enterprise 5.4.8 (#7 built by
> jenkins on 20151023-1205 git: d7dbdf29ac1d57ae9fb19958502d50dcf4e4fffd),
> kafka_2.10-0.8.2.2
> Reporter: Faisal
>
> I have a Spark Streaming application that uses direct streaming, listening to
> a Kafka topic.
> {code}
> HashMap<String, String> kafkaParams = new HashMap<String, String>();
> kafkaParams.put("metadata.broker.list", "broker1,broker2,broker3");
> kafkaParams.put("auto.offset.reset", "largest");
>
> HashSet<String> topicsSet = new HashSet<String>();
> topicsSet.add("Topic1");
>
> JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
>     jssc,
>     String.class,
>     String.class,
>     StringDecoder.class,
>     StringDecoder.class,
>     kafkaParams,
>     topicsSet
> );
> {code}
> I notice that when I stop/shut down the Kafka brokers, my Spark application
> also shuts down.
> Here is the Spark execution script:
> {code}
> spark-submit \
> --master yarn-cluster \
> --files /home/siddiquf/spark/log4j-spark.xml \
> --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-spark.xml" \
> --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-spark.xml" \
> --class com.example.MyDataStreamProcessor \
> myapp.jar
> {code}
> The Spark job submits successfully and I can track the application driver and
> worker/executor nodes.
> Everything works fine; my only concern is that if the Kafka brokers are
> offline or restarted, my application, which is controlled by YARN, should not
> shut down -- but it does. If this is expected behavior, then how should such a
> situation be handled with the least maintenance? Keep in mind that the Kafka
> cluster is not in the Hadoop cluster and is managed by a different team, which
> is why our application needs to be resilient enough.
> Thanks
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)