I'm quite interested in this as well. I remember something about a streaming context needing one core per receiver. If that's the case, won't 10 apps require 10 cores? That seems like a waste unless each topic is quite resource-hungry. Would love to hear from the experts :)
Date: Mon, 27 Oct 2014 06:35:29 -0400
From: [email protected]
To: [email protected]
Subject: Re: Which is better? One spark app listening to 10 topics vs. 10 spark apps each listening to 1 topic

On 10/27/2014 05:19 AM, Jianshi Huang wrote:
> Any suggestion? :)
>
> Jianshi
>
> On Thu, Oct 23, 2014 at 3:49 PM, Jianshi Huang <[email protected]> wrote:
>> The Kafka stream has 10 topics and the data rate is quite high (~100K/s per topic).
>>
>> Which configuration do you recommend?
>> - 1 Spark app consuming all Kafka topics
>> - 10 separate Spark apps, each consuming one topic
>>
>> Assuming they have the same resource pool.
>>
>> Cheers,
>> --
>> Jianshi Huang
>> LinkedIn: jianshi
>> Twitter: @jshuang
>> Github & Blog: http://huangjs.github.com/

Do you have time to try and benchmark both? I don't know anything about Kafka, but I would imagine that the performance of both options would be similar. That said, I would recommend having them all run separately: adding a new data stream doesn't require killing a monolithic job, and an error in one stream would affect a monolithic job much worse than it would affect independent jobs.

Regards,
Alec
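For what it's worth, in Spark's receiver-based Kafka model the core count tracks the number of receivers, not the number of apps: each receiver input DStream occupies one task slot while running. So a single app can consume all 10 topics through one receiver (one receiving core, shared pull threads), or through 10 unioned receivers (one per topic, closer to the 10-separate-apps layout). A minimal sketch of both variants, assuming Spark Streaming 1.x with the spark-streaming-kafka artifact on the classpath; the ZooKeeper address, group IDs, and topic names are placeholders, not from the thread:

```scala
// Sketch only -- requires a Spark cluster plus a Kafka/ZooKeeper setup,
// so it will not run standalone.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object MultiTopicSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("all-topics")
    val ssc  = new StreamingContext(conf, Seconds(2))

    val zkQuorum = "zk1:2181"                       // placeholder
    val topics   = (1 to 10).map(i => s"topic-$i")  // placeholder names

    // Variant A: one receiver pulling all 10 topics.
    // Costs one core for receiving; all topics share one receiver.
    val oneReceiver = KafkaUtils.createStream(
      ssc, zkQuorum, "group-a", topics.map(_ -> 1).toMap)

    // Variant B: one receiver per topic, unioned into a single DStream.
    // Costs ~10 receiving cores, but topics are pulled in parallel --
    // roughly the 10-separate-apps layout inside one app.
    val perTopic = topics.map(t =>
      KafkaUtils.createStream(ssc, zkQuorum, "group-b", Map(t -> 1)))
    val unioned = ssc.union(perTopic)

    unioned.count().print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Either way, remember to leave cores beyond the receivers for the actual processing tasks, or the batches will queue up.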
