Posting your code would be really helpful in figuring out gotchas.

On Tue, Sep 23, 2014 at 9:19 AM, Matt Narrell <matt.narr...@gmail.com> wrote:

> Hey,
>
> Spark 1.1.0
> Kafka 0.8.1.1
> Hadoop (YARN/HDFS) 2.5.1
>
> I have a five-partition Kafka topic. I can create a single Kafka receiver
> via KafkaUtils.createStream with five threads in the topic map and consume
> messages fine. Sifting through the user list and Google, I see that it's
> possible to split the Kafka receivers among the Spark workers, so that I can
> have a receiver per partition, distributed across the workers rather than
> localized to the driver. I'm following something like this:
> https://github.com/apache/spark/blob/ae58aea2d1435b5bb011e68127e1bcddc2edf5b2/extras/kinesis-asl/src/main/java/org/apache/spark/examples/streaming/JavaKinesisWordCountASL.java#L132
> but for Kafka, obviously. From the Streaming Programming Guide: "Receiving
> multiple data streams can therefore be achieved by creating multiple input
> DStreams and configuring them to receive different partitions of the data
> stream from the source(s)."
>
> However, I'm not able to consume any messages from Kafka after I perform the
> union operation. Again, if I create a single, multi-threaded receiver, I
> can consume messages fine. If I create 5 receivers in a loop and call
> jssc.union(…), I get:
>
> INFO scheduler.ReceiverTracker: Stream 0 received 0 blocks
> INFO scheduler.ReceiverTracker: Stream 1 received 0 blocks
> INFO scheduler.ReceiverTracker: Stream 2 received 0 blocks
> INFO scheduler.ReceiverTracker: Stream 3 received 0 blocks
> INFO scheduler.ReceiverTracker: Stream 4 received 0 blocks
>
> Do I need to do anything to the unioned DStream? Am I going about this
> incorrectly?
>
> Thanks in advance.
>
> Matt
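In the meantime, here is a minimal sketch of the multi-receiver-plus-union pattern described above, assuming the Spark 1.1.0 Java API. The ZooKeeper quorum ("zkhost:2181"), consumer group ("myGroup"), and topic name ("myTopic") are placeholders, not values from your setup:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class MultiReceiverKafka {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("MultiReceiverKafka");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(2000));

    // One consumer thread per receiver; five receivers together cover
    // the five partitions of the topic.
    Map<String, Integer> topicMap = new HashMap<String, Integer>();
    topicMap.put("myTopic", 1);

    int numReceivers = 5;
    List<JavaPairDStream<String, String>> streams =
        new ArrayList<JavaPairDStream<String, String>>(numReceivers);
    for (int i = 0; i < numReceivers; i++) {
      // All receivers share one consumer group, so Kafka's high-level
      // consumer balances the topic's partitions among them.
      streams.add(KafkaUtils.createStream(
          jssc, "zkhost:2181", "myGroup", topicMap));
    }

    // jssc.union takes the first stream plus a list of the rest.
    JavaPairDStream<String, String> unioned =
        jssc.union(streams.get(0), streams.subList(1, streams.size()));

    unioned.print();
    jssc.start();
    jssc.awaitTermination();
  }
}
```

One gotcha worth ruling out when every stream reports "received 0 blocks": each receiver permanently occupies one executor core, so the application needs more total cores than receivers. With five receivers, fewer than six cores leaves nothing free to process the received data.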
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org