Faisal created SPARK-14737:
------------------------------
Summary: Kafka Brokers are down - spark stream should retry
Key: SPARK-14737
URL: https://issues.apache.org/jira/browse/SPARK-14737
Project: Spark
Issue Type: Improvement
Components: Streaming
Affects Versions: 1.3.0
Environment: Suse Linux, Cloudera Enterprise 5.4.8 (#7 built by
jenkins on 20151023-1205 git: d7dbdf29ac1d57ae9fb19958502d50dcf4e4fffd),
kafka_2.10-0.8.2.2
Reporter: Faisal
I have spark streaming application that uses direct streaming - listening to
KAFKA topic.
{code}
HashMap<String, String> kafkaParams = new HashMap<String, String>();
kafkaParams.put("metadata.broker.list", "broker1,broker2,broker3");
kafkaParams.put("auto.offset.reset", "largest");
HashSet<String> topicsSet = new HashSet<String>();
topicsSet.add("Topic1");
JavaPairInputDStream<String, String> messages =
KafkaUtils.createDirectStream(
jssc,
String.class,
String.class,
StringDecoder.class,
StringDecoder.class,
kafkaParams,
topicsSet
);
{code}
I notice when i stop/shutdown kafka brokers, my spark application also shutdown.
Here is the spark execution script
{code}
spark-submit \
--master yarn-cluster \
--files /home/siddiquf/spark/log4j-spark.xml
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-spark.xml" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-spark.xml" \
--class com.example.MyDataStreamProcessor \
myapp.jar
{code}
Spark job submitted successfully and i can track the application driver and
worker/executor nodes.
Everything works fine but only concern if kafka borkers are offline or
restarted my application controlled by yarn should not shutdown? but it does.
If this is expected behavior then how to handle such situation with least
maintenance? Keeping in mind Kafka cluster is not in hadoop cluster and managed
by different team that is why requires our application to be resilient enough.
Thanks
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]