Re: Multiple Kafka Receivers and Union

2014-09-25 Thread Tim Smith
Good to know it worked out and thanks for the update. I didn't realize you need to provision for receiver workers + processing workers. One would think a worker would process multiple stages of an app/job and receive is just a stage of the job. On Thu, Sep 25, 2014 at 12:05 PM, Matt Narrell wro

Re: Multiple Kafka Receivers and Union

2014-09-25 Thread Matt Narrell
Additionally, If I dial up/down the number of executor cores, this does what I want. Thanks for the extra eyes! mn On Sep 25, 2014, at 12:34 PM, Matt Narrell wrote: > Tim, > > I think I understand this now. I had a five node Spark cluster and a five > partition topic, and I created five r

Re: Multiple Kafka Receivers and Union

2014-09-25 Thread Matt Narrell
Tim, I think I understand this now. I had a five node Spark cluster and a five partition topic, and I created five receivers. I found this: http://stackoverflow.com/questions/25785581/custom-receiver-stalls-worker-in-spark-streaming Indicating that if I use all my workers as receivers, there

Re: Multiple Kafka Receivers and Union

2014-09-25 Thread Matt Narrell
I suppose I have other problems as I can’t get the Scala example to work either. Puzzling, as I have literally coded like the examples (that are purported to work), but no luck. mn On Sep 24, 2014, at 11:27 AM, Tim Smith wrote: > Maybe differences between JavaPairDStream and JavaPairReceiver

Re: Multiple Kafka Receivers and Union

2014-09-24 Thread Tim Smith
Maybe differences between JavaPairDStream and JavaPairReceiverInputDStream? On Wed, Sep 24, 2014 at 7:46 AM, Matt Narrell wrote: > The part that works is the commented out, single receiver stream below the > loop. It seems that when I call KafkaUtils.createStream more than once, I > don’t rece

Re: Multiple Kafka Receivers and Union

2014-09-24 Thread Matt Narrell
The part that works is the commented out, single receiver stream below the loop. It seems that when I call KafkaUtils.createStream more than once, I don’t receive any messages. I’ll dig through the logs, but at first glance yesterday I didn’t see anything suspect. I’ll have to look closer. m

Re: Multiple Kafka Receivers and Union

2014-09-23 Thread Tim Smith
Maybe post the before-code as in what was the code before you did the loop (that worked)? I had similar situations where reviewing code before (worked) and after (does not work) helped. Also, what helped is the Scala REPL because I can see what are the object types being returned by each statement.

Re: Multiple Kafka Receivers and Union

2014-09-23 Thread Matt Narrell
To my eyes, these are functionally equivalent. I’ll try a Scala approach, but this may cause waves for me upstream (e.g., non-Java) Thanks for looking at this. If anyone else can see a glaring issue in the Java approach that would be appreciated. Thanks, Matt On Sep 23, 2014, at 4:13 PM, Tim

Re: Multiple Kafka Receivers and Union

2014-09-23 Thread Tim Smith
Sorry, I am almost Java illiterate but here's my Scala code to do the equivalent (that I have tested to work): val kInStreams = (1 to 10).map{_ => KafkaUtils.createStream(ssc,zkhost.acme.net:2182,"myGrp",Map("myTopic" -> 1), StorageLevel.MEMORY_AND_DISK_SER) } //Create 10 receivers across the clus

Re: Multiple Kafka Receivers and Union

2014-09-23 Thread Matt Narrell
So, this is scrubbed some for confidentiality, but the meat of it is as follows. Note, that if I substitute the commented section for the loop, I receive messages from the topic. SparkConf sparkConf = new SparkConf(); sparkConf.set("spark.streaming.unpersist", "true"); sparkConf.set("spark.logC

Re: Multiple Kafka Receivers and Union

2014-09-23 Thread Tim Smith
Posting your code would be really helpful in figuring out gotchas. On Tue, Sep 23, 2014 at 9:19 AM, Matt Narrell wrote: > Hey, > > Spark 1.1.0 > Kafka 0.8.1.1 > Hadoop (YARN/HDFS) 2.5.1 > > I have a five partition Kafka topic. I can create a single Kafka receiver > via KafkaUtils.createStream wi

Multiple Kafka Receivers and Union

2014-09-23 Thread Matt Narrell
Hey, Spark 1.1.0 Kafka 0.8.1.1 Hadoop (YARN/HDFS) 2.5.1 I have a five partition Kafka topic. I can create a single Kafka receiver via KafkaUtils.createStream with five threads in the topic map and consume messages fine. Sifting through the user list and Google, I see that its possible to spl