[jira] [Comment Edited] (BEAM-8121) Messages are not distributed per machines when consuming from Kafka topic with 1 partition

TJ (Jira) Mon, 02 Sep 2019 07:04:06 -0700


    [ 
https://issues.apache.org/jira/browse/BEAM-8121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920886#comment-16920886
 ]


TJ edited comment on BEAM-8121 at 9/2/19 2:03 PM:
--------------------------------------------------

I use Dataflow runner. 

I tried Reshuffle.viaRandomKey(), it worked quite poorly and had side effects:
 * Pipeline started to have constant lag (2-3 min) on Reshuffle step.
 * CPU usage increased from 20-30% to almost 80%.  On n1-standard-4 
([https://cloud.google.com/compute/docs/machine-types]).
 * With single machine it was consuming less messages. 1 machine without 
reshuffle step ±1500message/s, 1 machine with reshuffle step ±450message/s. 
When it scaled to 5 it was consuming the same 1500message/s as without the step.

I approached google support and they advised do not use Reshuffle step at all. 
Cause it's deprecated and not stable.
I assume if I would use more powerful machines and the higher number of them as 
5 it would theoretically reach the max throughput, but it would be really 
costly for such small message throughput.

One more thing to mention. I did the throughput test for Kafka Standalone app 
and Dataflow pipeline. They were consuming from the same topic with 1 
partition. Message size 1.3KB. Results:
 Standalone kafka client was consuming 5000 message/s 
 Dataflow pipeline with few empty tasks 3200message/s
 Dataflow pipeline with normal tasks what manipulates data 1500message/s


was (Author: taujan):
I use Dataflow runner. 

I tried Reshuffle.viaRandomKey(), it worked quite poorly and had side effects:
 * Pipeline started to have constant lag (2-3 min) on Reshuffle step.
 * CPU usage increased from 20-30% to almost 80%.  On n1-standard-4 
([https://cloud.google.com/compute/docs/machine-types]).
 * With single machine it was consuming less messages. 1 machine without 
reshuffle step ±1500message/s, 1 machine with reshuffle step ±450message/s. 
When it scaled to 5 it was consuming the same 1500message/s as without the step.

I approached google support and they advised do not use Reshuffle step at all. 
Cause it's deprecated and not stable.

One more thing to mention. I did the throughput test for Kafka Standalone app 
and Dataflow pipeline. They were consuming from the same topic with 1 
partition. Results:
Standalone kafka client was consuming 4500 message/s 
Dataflow pipeline with few empty tasks 3200message/s
Dataflow pipeline with normal tasks what manipulates data 1500message/s

> Messages are not distributed per machines when consuming from Kafka topic 
> with 1 partition
> ------------------------------------------------------------------------------------------
>
>                 Key: BEAM-8121
>                 URL: https://issues.apache.org/jira/browse/BEAM-8121
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-kafka
>    Affects Versions: 2.14.0
>            Reporter: TJ
>            Priority: Major
>
> Messages are consumed from Kafka using KafkaIO. Each kafka topic contains 
> only 1 partition. (That means that messages can be consumed only by one 
> Consumer per 1 consumer group)
> When backlog of topic grows and system scales from 1 to X machines, all the 
> messages seems to be executed on  the same machine on which they are read. 
> Due to that message throughput doesn't increase comparing X machines to 1 
> machine. If one machine was reading 2K messages  per s, X machines will be 
> reading the same amount.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

[jira] [Comment Edited] (BEAM-8121) Messages are not distributed per machines when consuming from Kafka topic with 1 partition

Reply via email to