I believe when you go with option 1, it will distribute the consumers across
your cluster (possibly on 6 machines), but I still don't see a way to control
which partition each one consumes from. If you need a consumer where you can
specify the partition details, you are better off with the low-level consumer:
<https://github.com/dibbhatt/kafka-spark-consumer>
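To make the two options from the original question concrete, here is a minimal sketch using Spark Streaming's receiver-based Kafka API (`org.apache.spark.streaming.kafka.KafkaUtils`); the ZooKeeper address, group id, and topic name are placeholder values, and the batch interval is arbitrary:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("KafkaParallelism")
val ssc  = new StreamingContext(conf, Seconds(2))

// Option 1: six separate receivers. Each receiver is its own task, so the
// cluster may schedule them on different executors; union the streams.
val streams = (1 to 6).map { _ =>
  KafkaUtils.createStream(ssc, "localhost:2181", "my-group", Map("myTopic" -> 1))
}
val unioned = ssc.union(streams)

// Option 2: one receiver whose single JVM runs six consumer threads.
val single = KafkaUtils.createStream(ssc, "localhost:2181", "my-group", Map("myTopic" -> 6))
```

Note that in both cases the high-level consumer decides the thread-to-partition assignment for you; neither form lets you pin a receiver to a specific partition, which is why the low-level consumer linked above exists.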



Thanks
Best Regards

On Tue, Feb 24, 2015 at 9:36 AM, bit1...@163.com <bit1...@163.com> wrote:

> Hi,
> I am experimenting with Spark Streaming and Kafka integration. To read
> messages from Kafka in parallel, there are basically two ways:
> 1. Create many receivers, like (1 to 6).map(_ => KafkaUtils.createStream).
> 2. Specify many threads when calling KafkaUtils.createStream, like val
> topicMap = Map("myTopic" -> 6); this will create one receiver with 6
> reading threads.
>
> My question is which option is better. Option 2 sounds better to me
> because it saves a lot of cores (one receiver occupies one core), but I
> learned from somewhere else that option 1 is better, so I would like to
> hear how you elaborate on this. Thanks.
>
> ------------------------------
> bit1...@163.com
>
