Hi All,

I am using Spark Streaming with Kafka. I receive messages and, after some minor processing, write them to HDFS. As of now I am using the saveAsTextFiles() / saveAsHadoopFiles() Java methods.
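For reference, my current write path looks roughly like the sketch below (the ZooKeeper quorum, group id, topic name, and HDFS paths are placeholders, and the foreachRDD variant is only something I was considering, not code I have in production):

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class KafkaToHdfs {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        // 1-second batch interval, matching how often I pull from the topic
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

        Map<String, Integer> topics = new HashMap<>();
        topics.put("my-topic", 1); // placeholder topic name

        JavaPairReceiverInputDStream<String, String> messages =
            KafkaUtils.createStream(jssc, "zk-host:2181", "my-group", topics);

        // keep only the message payload
        JavaDStream<String> lines = messages.map(tuple -> tuple._2());

        // What I was considering instead of saveAsTextFiles(): guard each
        // batch with isEmpty() so empty batches do not produce empty files.
        lines.foreachRDD((JavaRDD<String> rdd) -> {
            if (!rdd.isEmpty()) {
                // placeholder output path, one directory per non-empty batch
                rdd.saveAsTextFile("hdfs:///data/out-" + System.currentTimeMillis());
            }
        });

        jssc.start();
        jssc.awaitTermination();
    }
}
```

I am not sure whether the per-batch foreachRDD + isEmpty() guard above is the idiomatic way to do this, or whether there is a built-in/configurable sink that handles it.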
Two questions:

- Is there a default way of writing a stream to Hadoop, similar to the HDFS sink concept in Flume? That is, is there a configurable way in Spark Streaming to write out a DStream after processing?
- How can I check whether a DStream is empty, so that I can skip the HDFS write when no messages are present? I pull from the Kafka topic every 1 second, and when no messages are available Spark sometimes writes empty files to HDFS.

Please suggest. TIA

--
Anish Sneh
"Experience is the best teacher."
+91-99718-55883
http://in.linkedin.com/in/anishsneh