Re: Kafka Stream tuning.

2018-02-14 Thread Guozhang Wang
Hello Brilly,


If you commit every second (note that the commit interval is in
milliseconds, so 1000 means one second) and each commit takes 23 millis,
you will get about that throughput. The questions are: 1) do you really
need to commit every second, and 2) if you really do, how can you reduce
the commit time? For 2), since you mentioned you only have a simple
filtering application, I'd assume it is stateless, so most of the time
would be spent in the synchronous commit-offset call to the broker.
23 millis does look a bit long for a single RPC, but that depends very
much on your client-server network. If you cannot improve your
client-server RPC latency, then you'd have to consider option 1) and
reduce your commit frequency.
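
As a rough illustration of option 1), the sketch below estimates what share
of a stream thread's time goes to commits. It assumes a simplified model that
is not stated in this thread: the thread processes records for one full
commit interval and then blocks for the whole commit latency. The class and
method names are made up for the example; the 23 ms latency and the 50, 500
and 1000 ms intervals are the values mentioned in this thread.

```java
// Simplified model (an assumption, not Kafka Streams internals): each cycle is
// one commit interval of processing followed by one blocking commit.
class CommitOverhead {
    /** Fraction of wall-clock time spent blocked in commits under this model. */
    static double blockedFraction(double commitLatencyMs, double commitIntervalMs) {
        return commitLatencyMs / (commitIntervalMs + commitLatencyMs);
    }

    public static void main(String[] args) {
        double commitLatencyMs = 23.0; // commit-latency-avg reported in the thread
        for (double intervalMs : new double[] {50.0, 500.0, 1000.0}) {
            System.out.printf("commit.interval.ms=%.0f -> %.1f%% of time in commits%n",
                    intervalMs, 100.0 * blockedFraction(commitLatencyMs, intervalMs));
        }
    }
}
```

Under this model the share of time spent committing shrinks as the interval
grows, which is the motivation for reducing the commit frequency.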


Guozhang





RE: Kafka Stream tuning.

2018-02-13 Thread TSANG, Brilly
I have also checked commit-latency-avg; it's around 23 millis per commit.
That translates to about the same throughput that I'm getting now
(0.04 messages/millis). Does anyone have a benchmark for Kafka Streams'
commit-latency-avg? Is it possible to tune it to be faster? I just want to
verify whether this is the expected latency limit, in which case we will have
to scale horizontally with more partitions and stream processes if the input
throughput is higher.
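
The arithmetic behind this comparison can be sketched as follows. It assumes
that the 0.04 figure corresponds to one record completing per 23 ms commit,
which is an interpretation of the numbers above rather than something
measured; the class and method names are hypothetical.

```java
// Converts a per-record wall-clock cost into a throughput figure.
class ThroughputCheck {
    /** Records per millisecond if each record costs perRecordMs of wall-clock time. */
    static double recordsPerMs(double perRecordMs) {
        return 1.0 / perRecordMs;
    }

    public static void main(String[] args) {
        // 0.05 ms per record is the filter cost quoted in the original post.
        System.out.printf("filter only (0.05 ms/record):   %.3f records/ms%n",
                recordsPerMs(0.05));
        // 23 ms is the reported commit-latency-avg.
        System.out.printf("one commit per record (23 ms):  %.3f records/ms%n",
                recordsPerMs(23.0));
    }
}
```

The second figure is close to the observed 0.04 to 0.05 records per
millisecond, and far below the filter-only figure.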

Another side question: would a custom consumer/publisher be faster than the
default Kafka Streams implementation?

Regards,
Brilly


RE: Kafka Stream tuning.

2018-02-13 Thread TSANG, Brilly
Hey Damian and folks,

I've also tried 1000 and 500, and the performance is exactly the same.
Any other ideas?

Regards,
Brilly




Re: Kafka Stream tuning.

2018-02-13 Thread Damian Guy
Hi Brilly,

My initial guess is that it is the overhead of committing. Commit is
synchronous and you have the commit interval set to 50ms. Perhaps try
increasing it.
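
A minimal sketch of what increasing it could look like. To keep the example
runnable without Kafka on the classpath, it uses the plain string keys behind
the StreamsConfig constants from the original post instead of the constants
themselves. The class name, the application id, the broker address and the
state directory are placeholders, and 1000 ms is just one value to try, not a
recommendation.

```java
import java.util.Properties;

// Builds the same four settings as the original post, with the commit
// interval passed in so different values can be tried.
class CommitIntervalConfig {
    static Properties build(String appId, String bootstrap, String stateDir,
                            long commitIntervalMs) {
        Properties config = new Properties();
        config.put("application.id", appId);        // StreamsConfig.APPLICATION_ID_CONFIG
        config.put("bootstrap.servers", bootstrap); // StreamsConfig.BOOTSTRAP_SERVERS_CONFIG
        config.put("state.dir", stateDir);          // StreamsConfig.STATE_DIR_CONFIG
        // StreamsConfig.COMMIT_INTERVAL_MS_CONFIG; the unit is milliseconds.
        config.put("commit.interval.ms", commitIntervalMs);
        return config;
    }

    public static void main(String[] args) {
        // All argument values here are placeholders for illustration.
        Properties config = build("filter-app", "localhost:9092",
                "/tmp/kafka-streams", 1000L);
        System.out.println(config.get("commit.interval.ms"));
    }
}
```

In a real application the resulting Properties object would be passed to the
streams instance in place of the config shown in the original post.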

Thanks,
Damian

On Tue, 13 Feb 2018 at 07:49 TSANG, Brilly wrote:

> Hi kafka users,
>
> I created a filtering stream with the Processor API; the input topic has
> an input rate of ~5 records per millisecond. The filtering function takes
> 0.05 milliseconds on average to complete, which in the ideal case would
> translate to (1/0.05) = 20 records per millisecond. However, when I
> benchmark the whole process, the stream is only processing 0.05 records
> per millisecond.
>
> Does anyone have any idea how to tune the streaming system to be faster,
> as 0.05 records is very far from the theoretical max of 20? The results
> above are per partition; I have 16 partitions for the input topic, and
> all partitions have similar throughput.
>
> I've only set the streams to have the following config:
> Properties config = new Properties();
> config.put(StreamsConfig.APPLICATION_ID_CONFIG, appId);
> config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
> config.put(StreamsConfig.STATE_DIR_CONFIG, stateDir);
> config.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 50);
>
> I'm not defining a TimestampExtractor, so the default one is used.
>
> Thanks for any help in advance.
>
> Regards,
> Brilly
>
> DISCLAIMER:
> This email and any attachment(s) are intended solely for the person(s)
> named above, and are or may contain information of a proprietary or
> confidential nature. If you are not the intended recipient(s), you should
> delete this message immediately. Any use, disclosure or distribution of
> this message without our prior consent is strictly prohibited.
> This message may be subject to errors, incomplete or late delivery,
> interruption, interception, modification, or may contain viruses. Neither
> Daiwa Capital Markets Hong Kong Limited, its subsidiaries, affiliates nor
> their officers or employees represent or warrant the accuracy or
> completeness, nor accept any responsibility or liability whatsoever for any
> use of or reliance upon, this email or any of the contents hereof. The
> contents of this message are for information purposes only, and subject to
> change without notice.
> This message is not and is not intended to be an offer or solicitation to
> buy or sell any securities or financial products, nor does any
> recommendation, opinion or advice necessarily reflect those of Daiwa
> Capital Markets Hong Kong Limited, its subsidiaries or affiliates.
>