Hi, I wonder if anyone has a good example of how to write Spark RDDs into Kafka. Specifically, is there any advantage to sending a list of messages at a time over sending one message at a time?
Sample code for sending one message at a time:

    dStream.foreachRDD { rdd =>
      rdd.collect().foreach { myObj =>
        val keyedMessage: KeyedMessage[String, String] = ??? // convert myObj to a KeyedMessage
        producer.send(keyedMessage)
      }
    }

Sample code for sending a list of messages:

    dStream.foreachRDD { rdd =>
      val messages = new ListBuffer[KeyedMessage[String, String]]()
      rdd.collect().foreach { myObj =>
        val keyedMessage: KeyedMessage[String, String] = ??? // convert myObj to a KeyedMessage
        messages += keyedMessage
      }
      producer.send(messages.toList: _*) // the old Scala producer's send takes varargs
    }

I noticed that in 0.8.2.1 the new KafkaProducer class only sends one ProducerRecord at a time, and I don't know whether there is another way to send a list. So I am really interested in any examples of writing Spark RDDs into Kafka with the latest Kafka library.

Thanks,
Ming Zhao
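For reference, here is a minimal sketch of the per-record pattern with the new (0.8.2+) KafkaProducer. It exposes no list-based send; send() is asynchronous, and batching happens inside the producer, tunable via batch.size and linger.ms. The broker address, topic name, and the toString conversion below are placeholder assumptions, not part of the original post:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Assumed configuration -- the broker address and serializers are placeholders.
val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("batch.size", "16384") // bytes buffered per partition before a batch is sent
props.put("linger.ms", "5")      // wait up to 5 ms to accumulate a fuller batch

val producer = new KafkaProducer[String, String](props)

dStream.foreachRDD { rdd =>
  rdd.collect().foreach { myObj =>
    // "myTopic" and the toString conversion are hypothetical stand-ins
    // for whatever conversion the original ??? performs.
    val record = new ProducerRecord[String, String]("myTopic", myObj.toString)
    producer.send(record) // asynchronous; the producer batches records internally
  }
}

producer.close() // flushes any records still buffered
```

So even with one send() call per record, the new producer should still group records into batches on the wire; explicit list-building in user code may not buy much.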