Thank you very much.
Sent from my Meizu MX4 Pro

-------- Original Message --------
From: Cody Koeninger <[email protected]>
Date: Wed, May 13, 23:52
To: hotdog <[email protected]>
Cc: [email protected]
Subject: Re: force the kafka consumer process to different machines

> I assume you're using the receiver-based approach? Have you tried the
> createDirectStream API?
>
> https://spark.apache.org/docs/1.3.0/streaming-kafka-integration.html
>
> If you're sticking with the receiver-based approach, I think your only
> option would be to create more consumer streams and union them. That
> doesn't give you control over where they're run, but it should increase
> the consumer parallelism.
>
> On Wed, May 13, 2015 at 10:33 AM, hotdog <[email protected]> wrote:
>
>> I'm using Spark Streaming integrated with streaming-kafka.
>>
>> My kafka topic has 80 partitions, while my machines have 40 cores. I
>> found that when the job is running, the kafka consumer processes are
>> deployed to only 2 machines, and the bandwidth on those 2 machines
>> becomes very high.
>>
>> I wonder whether there is any way to control where the kafka consumers
>> are dispatched?
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/force-the-kafka-consumer-process-to-different-machines-tp22872.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
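For reference, the two options Cody describes would look roughly like the sketch below against the Spark 1.3 Scala API (broker list, ZooKeeper host, group id, and topic name are placeholder values, not taken from the thread):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("kafka-parallelism-sketch")
val ssc = new StreamingContext(conf, Seconds(10))

// Option 1: direct approach. Spark creates one RDD partition per Kafka
// partition (80 in the thread's case) with no long-running receivers
// pinned to particular machines.
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
val directStream = KafkaUtils.createDirectStream[
  String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("mytopic"))

// Option 2: receiver-based approach. Create several consumer streams and
// union them; Spark still decides receiver placement, but total consumer
// parallelism goes up.
val numStreams = 8
val receivers = (1 to numStreams).map { _ =>
  KafkaUtils.createStream(ssc, "zkhost:2181", "mygroup", Map("mytopic" -> 4))
}
val unioned = ssc.union(receivers)
```

With the direct stream, the consuming tasks are ordinary Spark tasks, so they spread across the cluster like any other stage rather than concentrating on the machines that happen to host receivers.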
