Re: Urgent: Mitigating Slow Consumer Impact and Seeking Open-Source Solutions in Apache Kafka Consumers

2023-09-18 Thread Gowtham S
Hi All

We are also looking for a custom partitioning strategy; it would be helpful
for us too.


Thanks and regards,
Gowtham S


On Mon, 18 Sept 2023 at 12:13, Karthick  wrote:

> Thanks Liu Ron for the suggestion.
>
> Could you please give any pointers/references for a custom partitioning
> strategy? We are currently using murmur hashing on the device's unique id.
> It would be helpful if you could point us to any other strategies.
>
> Thanks and regards
> Karthick.
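
For context on the murmur hashing mentioned above: keying by device id with
Kafka's default partitioner effectively computes, as a sketch,

    partition = toPositive(murmur2(deviceIdBytes)) % numPartitions

so every message from a given device always lands on the same partition. That
is why one very chatty device can overload a single partition no matter how
many partitions the topic has; only changing the key or the partitioner
changes the placement.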
>
> On Mon, Sep 18, 2023 at 9:18 AM liu ron  wrote:
>
>> Hi, Karthick
>>
>> It looks like a data-skew problem. I think one of the easiest and most
>> effective first steps is to increase the number of partitions and see how
>> it works, e.g. try expanding by 100 first.
>>
>> Best,
>> Ron
>>
>> Karthick wrote on Sun, Sep 17, 2023 at 17:03:
>>
>>> Thanks Wei Chen and Giannis for your time.
>>>
>>>
>>> For starters, you need to better size and estimate the required number
>>>> of partitions you will need on the Kafka side in order to process 1000+
>>>> messages/second.
>>>> The number of partitions should also define the maximum parallelism for
>>>> the Flink job reading from Kafka.
>>>
>>> Thanks for the pointer. Could you please guide us on the factors we need
>>> to consider for this?
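
As a rough sizing sketch (the per-partition rates below are assumptions to be
replaced with numbers measured from your own producers and your Flink job):

    required partitions ~= max(target throughput / measured produce rate per partition,
                               target throughput / measured consume rate per partition)

For example, at 1,000+ messages/second, if one Flink consumer subtask can
process roughly 100 messages/second after your business logic, you need at
least 1,000 / 100 = 10 partitions, plus headroom (say 2-3x) for bursts and
future growth. Message size, retention/disk per partition, per-broker
partition limits, and the parallelism you actually plan to run in Flink also
feed into the estimate.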
>>>
>>> use a custom partitioner that spreads those devices to somewhat separate
>>>> partitions.
>>>
>>> Could you suggest a working approach for such a custom partitioner to
>>> distribute the load? It would be helpful.
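
For what such a partitioner can look like, here is a minimal, hypothetical
sketch (the class name, the hard-coded hot-device list, and the reserved
partition split are assumptions for illustration, not something from this
thread; it assumes the record key is the device id and that the topic has more
partitions than the reserved count):

import java.util.Map;
import java.util.Set;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

// Sketch: route known hot devices onto a reserved range of partitions and
// hash everything else (murmur2, like the default partitioner) over the rest.
public class HotDeviceAwarePartitioner implements Partitioner {

    // Hypothetical hot-device list; in practice load this from configuration.
    private static final Set<String> HOT_DEVICES = Set.of("device-17", "device-42");
    private static final int RESERVED_FOR_HOT = 4; // partitions set aside for hot devices

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        int hash = Utils.toPositive(Utils.murmur2(keyBytes));
        if (HOT_DEVICES.contains(key)) {
            // Spread hot devices across the reserved partitions at the top of the range.
            return numPartitions - RESERVED_FOR_HOT + (hash % RESERVED_FOR_HOT);
        }
        // Everything else is hashed over the remaining partitions.
        return hash % (numPartitions - RESERVED_FOR_HOT);
    }

    @Override
    public void configure(Map<String, ?> configs) { }

    @Override
    public void close() { }
}

The producer is then pointed at it via the standard partitioner.class setting,
e.g. props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG,
HotDeviceAwarePartitioner.class.getName()).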
>>>
>>>
>>> What we were doing at that time was to define multiple topics and each
>>>> has a different # of partitions
>>>
>>> Thanks for the suggestion. Is there any calculation for choosing the topic
>>> count, or any formula/factors for determining the number of topics? If so,
>>> please let us know; it would help us choose.
>>>
>>> Thanks and Regards
>>> Karthick.
>>>
>>>
>>>
>>> On Sun, Sep 17, 2023 at 4:04 AM Wei Chen  wrote:
>>>
>>>> Hi Karthick,
>>>> We've experienced a similar issue before. What we did at that time was to
>>>> define multiple topics, each with a different number of partitions, so the
>>>> topics with more partitions get higher parallelism for processing.
>>>> You can further divide the topics into several groups, where each group
>>>> has a similar number of partitions. Each group can then be defined as the
>>>> source of a Flink data stream, so the groups run in parallel with
>>>> different parallelisms.
>>>>
>>>>
>>>>
>>>>
>>>> ------ Original ------
>>>> *From:* Giannis Polyzos 
>>>> *Date:* Sat,Sep 16,2023 11:52 PM
>>>> *To:* Karthick 
>>>> *Cc:* Gowtham S , user >>> >
>>>> *Subject:* Re: Urgent: Mitigating Slow Consumer Impact and Seeking
>>>> Open-Source Solutions in Apache Kafka Consumers
>>>>
>>>> Can you provide some more context on what your Flink job will be doing?
>>>> There might be some things you can do to fix the data skew on the Flink
>>>> side, but first, you want to start with Kafka.
>>>> For starters, you need to better size and estimate the required number
>>>> of partitions you will need on the Kafka side in order to process 1000+
>>>> messages/second.
>>>> The number of partitions should also define the maximum parallelism for
>>>> the Flink job reading from Kafka.
>>>> If you know your "hot devices" in advance, you might want to use a custom
>>>> partitioner that spreads those devices to somewhat separate partitions.
>>>> Overall this is somewhat of a trial-and-error process. You might also
>>>> want to check that these partitions are evenly balanced among your brokers
>>>> and don't cause too much stress on particular brokers.
>>>>
>>>> Best
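
For the balance check mentioned above, the standard Kafka CLI tools are a
quick way to look (broker address, topic, and group names below are
placeholders):

bin/kafka-topics.sh --bootstrap-server broker:9092 --describe --topic device-events
    shows, per partition, which broker is the leader and where replicas sit

bin/kafka-consumer-groups.sh --bootstrap-server broker:9092 --describe --group device-pipeline
    shows per-partition consumer lag (for a Flink source this reflects offsets
    committed on checkpoints, if a group id is configured), which makes hot
    partitions easy to spot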
>>>>
>>>> On Sat, Sep 16, 2023 at 6:03 PM Karthick 
>>>> wrote:
>>>>
>>>>> Hi Gowtham, I agree with you.
>>>>>
>>>>> I'm eager to resolve the issue or gain a better understanding. Your
>>>>> assistance would be greatly appreciated.

Re: Urgent: Mitigating Slow Consumer Impact and Seeking Open-Source Solutions in Apache Kafka Consumers

2023-09-17 Thread Karthick
Thanks Liu Ron for the suggestion.

Could you please give any pointers/references for a custom partitioning
strategy? We are currently using murmur hashing on the device's unique id.
It would be helpful if you could point us to any other strategies.

Thanks and regards
Karthick.

On Mon, Sep 18, 2023 at 9:18 AM liu ron  wrote:

> Hi, Karthick
>
> It looks like a data-skew problem. I think one of the easiest and most
> effective first steps is to increase the number of partitions and see how
> it works, e.g. try expanding by 100 first.
>
> Best,
> Ron
>
> Karthick wrote on Sun, Sep 17, 2023 at 17:03:
>
>> Thanks Wei Chen and Giannis for your time.
>>
>>
>> For starters, you need to better size and estimate the required number of
>>> partitions you will need on the Kafka side in order to process 1000+
>>> messages/second.
>>> The number of partitions should also define the maximum parallelism for
>>> the Flink job reading from Kafka.
>>
>> Thanks for the pointer. Could you please guide us on the factors we need
>> to consider for this?
>>
>> use a custom partitioner that spreads those devices to somewhat separate
>>> partitions.
>>
>> Could you suggest a working approach for such a custom partitioner to
>> distribute the load? It would be helpful.
>>
>>
>> What we were doing at that time was to define multiple topics and each
>>> has a different # of partitions
>>
>> Thanks for the suggestion. Is there any calculation for choosing the topic
>> count, or any formula/factors for determining the number of topics? If so,
>> please let us know; it would help us choose.
>>
>> Thanks and Regards
>> Karthick.
>>
>>
>>
>> On Sun, Sep 17, 2023 at 4:04 AM Wei Chen  wrote:
>>
>>> Hi Karthick,
>>> We've experienced a similar issue before. What we did at that time was to
>>> define multiple topics, each with a different number of partitions, so the
>>> topics with more partitions get higher parallelism for processing.
>>> You can further divide the topics into several groups, where each group
>>> has a similar number of partitions. Each group can then be defined as the
>>> source of a Flink data stream, so the groups run in parallel with
>>> different parallelisms.
>>>
>>>
>>>
>>>
>>> ------ Original ------
>>> *From:* Giannis Polyzos 
>>> *Date:* Sat,Sep 16,2023 11:52 PM
>>> *To:* Karthick 
>>> *Cc:* Gowtham S , user 
>>> *Subject:* Re: Urgent: Mitigating Slow Consumer Impact and Seeking
>>> Open-Source Solutions in Apache Kafka Consumers
>>>
>>> Can you provide some more context on what your Flink job will be doing?
>>> There might be some things you can do to fix the data skew on the Flink
>>> side, but first, you want to start with Kafka.
>>> For starters, you need to better size and estimate the required number
>>> of partitions you will need on the Kafka side in order to process 1000+
>>> messages/second.
>>> The number of partitions should also define the maximum parallelism for
>>> the Flink job reading from Kafka.
>>> If you know your "hot devices" in advance, you might want to use a custom
>>> partitioner that spreads those devices to somewhat separate partitions.
>>> Overall this is somewhat of a trial-and-error process. You might also
>>> want to check that these partitions are evenly balanced among your brokers
>>> and don't cause too much stress on particular brokers.
>>>
>>> Best
>>>
>>> On Sat, Sep 16, 2023 at 6:03 PM Karthick 
>>> wrote:
>>>
>>>> Hi Gowtham, I agree with you.
>>>>
>>>> I'm eager to resolve the issue or gain a better understanding. Your
>>>> assistance would be greatly appreciated.
>>>>
>>>> If there are any additional details or context needed to address my
>>>> query effectively, please let me know, and I'll be happy to provide them.
>>>>
>>>> Thank you in advance for your time and consideration. I look forward to
>>>> hearing from you and benefiting from your expertise.
>>>>
>>>> Thanks and Regards
>>>> Karthick.
>>>>
>>>> On Sat, Sep 16, 2023 at 11:04 AM Gowtham S 
>>>> wrote:
>>>>

Re: Urgent: Mitigating Slow Consumer Impact and Seeking Open-Source Solutions in Apache Kafka Consumers

2023-09-17 Thread liu ron
Hi, Karthick

It looks like a data-skew problem. I think one of the easiest and most
effective first steps is to increase the number of partitions and see how
it works, e.g. try expanding by 100 first.

Best,
Ron
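
For reference, expanding an existing topic's partition count can be done with
the standard tool (topic name and broker address below are placeholders):

bin/kafka-topics.sh --bootstrap-server broker:9092 --alter --topic device-events --partitions 100

One caveat worth keeping in mind: adding partitions changes which partition a
given key hashes to, so per-device ordering is only guaranteed for messages
produced after the change.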

Karthick wrote on Sun, Sep 17, 2023 at 17:03:

> Thanks Wei Chen and Giannis for your time.
>
>
> For starters, you need to better size and estimate the required number of
>> partitions you will need on the Kafka side in order to process 1000+
>> messages/second.
>> The number of partitions should also define the maximum parallelism for
>> the Flink job reading from Kafka.
>
> Thanks for the pointer. Could you please guide us on the factors we need
> to consider for this?
>
> use a custom partitioner that spreads those devices to somewhat separate
>> partitions.
>
> Could you suggest a working approach for such a custom partitioner to
> distribute the load? It would be helpful.
>
>
> What we were doing at that time was to define multiple topics and each has
>> a different # of partitions
>
> Thanks for the suggestion. Is there any calculation for choosing the topic
> count, or any formula/factors for determining the number of topics? If so,
> please let us know; it would help us choose.
>
> Thanks and Regards
> Karthick.
>
>
>
> On Sun, Sep 17, 2023 at 4:04 AM Wei Chen  wrote:
>
>> Hi Karthick,
>> We've experienced a similar issue before. What we did at that time was to
>> define multiple topics, each with a different number of partitions, so the
>> topics with more partitions get higher parallelism for processing.
>> You can further divide the topics into several groups, where each group
>> has a similar number of partitions. Each group can then be defined as the
>> source of a Flink data stream, so the groups run in parallel with
>> different parallelisms.
>>
>>
>>
>>
>> ------ Original ------
>> *From:* Giannis Polyzos 
>> *Date:* Sat,Sep 16,2023 11:52 PM
>> *To:* Karthick 
>> *Cc:* Gowtham S , user 
>> *Subject:* Re: Urgent: Mitigating Slow Consumer Impact and Seeking
>> Open-Source Solutions in Apache Kafka Consumers
>>
>> Can you provide some more context on what your Flink job will be doing?
>> There might be some things you can do to fix the data skew on the Flink
>> side, but first, you want to start with Kafka.
>> For starters, you need to better size and estimate the required number of
>> partitions you will need on the Kafka side in order to process 1000+
>> messages/second.
>> The number of partitions should also define the maximum parallelism for
>> the Flink job reading from Kafka.
>> If you know your "hot devices" in advance, you might want to use a custom
>> partitioner that spreads those devices to somewhat separate partitions.
>> Overall this is somewhat of a trial-and-error process. You might also
>> want to check that these partitions are evenly balanced among your brokers
>> and don't cause too much stress on particular brokers.
>>
>> Best
>>
>> On Sat, Sep 16, 2023 at 6:03 PM Karthick 
>> wrote:
>>
>>> Hi Gowtham, I agree with you.
>>>
>>> I'm eager to resolve the issue or gain a better understanding. Your
>>> assistance would be greatly appreciated.
>>>
>>> If there are any additional details or context needed to address my
>>> query effectively, please let me know, and I'll be happy to provide them.
>>>
>>> Thank you in advance for your time and consideration. I look forward to
>>> hearing from you and benefiting from your expertise.
>>>
>>> Thanks and Regards
>>> Karthick.
>>>
>>> On Sat, Sep 16, 2023 at 11:04 AM Gowtham S 
>>> wrote:
>>>
>>>> Hi Karthick,
>>>>
>>>> This appears to be a common challenge related to a slow-consumer
>>>> situation. Those with relevant experience in addressing such matters
>>>> should be able to help.
>>>>
>>>> Thanks and regards,
>>>> Gowtham S
>>>>
>>>>
>>>> On Fri, 15 Sept 2023 at 23:06, Giannis Polyzos 
>>>> wrote:
>>>>
>>>>> Hi Karthick,
>>>>>
>>>>> At a high level this seems like a data-skew issue where some partitions
>>>>> have way more data than others.
>>>>> What is the number of your devices? How many messages are you processing?

Re: Urgent: Mitigating Slow Consumer Impact and Seeking Open-Source Solutions in Apache Kafka Consumers

2023-09-17 Thread Karthick
Thanks Wei Chen and Giannis for your time.


For starters, you need to better size and estimate the required number of
> partitions you will need on the Kafka side in order to process 1000+
> messages/second.
> The number of partitions should also define the maximum parallelism for
> the Flink job reading from Kafka.

Thanks for the pointer. Could you please guide us on the factors we need to
consider for this?

use a custom partitioner that spreads those devices to somewhat separate
> partitions.

Could you suggest a working approach for such a custom partitioner to
distribute the load? It would be helpful.


What we were doing at that time was to define multiple topics and each has
> a different # of partitions

Thanks for the suggestion. Is there any calculation for choosing the topic
count, or any formula/factors for determining the number of topics? If so,
please let us know; it would help us choose.

Thanks and Regards
Karthick.



On Sun, Sep 17, 2023 at 4:04 AM Wei Chen  wrote:

> Hi Karthick,
> We've experienced a similar issue before. What we did at that time was to
> define multiple topics, each with a different number of partitions, so the
> topics with more partitions get higher parallelism for processing.
> You can further divide the topics into several groups, where each group
> has a similar number of partitions. Each group can then be defined as the
> source of a Flink data stream, so the groups run in parallel with
> different parallelisms.
>
>
>
>
> ------ Original ------
> *From:* Giannis Polyzos 
> *Date:* Sat,Sep 16,2023 11:52 PM
> *To:* Karthick 
> *Cc:* Gowtham S , user 
> *Subject:* Re: Urgent: Mitigating Slow Consumer Impact and Seeking
> Open-Source Solutions in Apache Kafka Consumers
>
> Can you provide some more context on what your Flink job will be doing?
> There might be some things you can do to fix the data skew on the Flink
> side, but first, you want to start with Kafka.
> For starters, you need to better size and estimate the required number of
> partitions you will need on the Kafka side in order to process 1000+
> messages/second.
> The number of partitions should also define the maximum parallelism for
> the Flink job reading from Kafka.
> If you know your "hot devices" in advance, you might want to use a custom
> partitioner that spreads those devices to somewhat separate partitions.
> Overall this is somewhat of a trial-and-error process. You might also want
> to check that these partitions are evenly balanced among your brokers and
> don't cause too much stress on particular brokers.
>
> Best
>
> On Sat, Sep 16, 2023 at 6:03 PM Karthick 
> wrote:
>
>> Hi Gowtham, I agree with you.
>>
>> I'm eager to resolve the issue or gain a better understanding. Your
>> assistance would be greatly appreciated.
>>
>> If there are any additional details or context needed to address my query
>> effectively, please let me know, and I'll be happy to provide them.
>>
>> Thank you in advance for your time and consideration. I look forward to
>> hearing from you and benefiting from your expertise.
>>
>> Thanks and Regards
>> Karthick.
>>
>> On Sat, Sep 16, 2023 at 11:04 AM Gowtham S 
>> wrote:
>>
>>> Hi Karthick,
>>>
>>> This appears to be a common challenge related to a slow-consumer
>>> situation. Those with relevant experience in addressing such matters
>>> should be able to help.
>>>
>>> Thanks and regards,
>>> Gowtham S
>>>
>>>
>>> On Fri, 15 Sept 2023 at 23:06, Giannis Polyzos 
>>> wrote:
>>>
>>>> Hi Karthick,
>>>>
>>>> At a high level this seems like a data-skew issue where some partitions
>>>> have way more data than others.
>>>> What is the number of your devices? How many messages are you
>>>> processing?
>>>> Most of what you shared sounds like you are looking for suggestions
>>>> around load distribution for Kafka, i.e. the number of partitions, how to
>>>> distribute your device data, etc.
>>>> It would also be good to share what your Flink job is doing, as I don't
>>>> see anything mentioned about that. Are you observing backpressure in the
>>>> Flink UI?
>>>>
>>>> Best
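
On the backpressure question, two places to look in a reasonably recent Flink
version (host, port, and ids below are placeholders): the job graph in the web
UI colours backpressured tasks, and the same information is available from the
REST API and task metrics, e.g.

curl http://jobmanager:8081/jobs/<job-id>/vertices/<vertex-id>/backpressure

together with the backPressuredTimeMsPerSecond and busyTimeMsPerSecond task
metrics, which show where each task's time is actually being spent.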
>>>>
>>>> On Fri, Sep 15, 2023 at 3:46 PM Karthick 
>>>> wrote:
>>>>
>>>>> Dear Apache Flink Community,
>>>>>
>>>>>
>

Re: Urgent: Mitigating Slow Consumer Impact and Seeking Open-Source Solutions in Apache Kafka Consumers

2023-09-16 Thread Wei Chen
Hi Karthick,
We’ve experienced the similar issue before. What we were doing at that time was 
to define multiple topics and each has a different # of partitions which means 
some of the topics with more partitions will have the high parallelisms for 
processing.
And you can further divide the topics into several groups and each group should 
have the similar # of partitions. For each group, you can define as the source 
of flink data stream to run them in parallel with different parallelism.













-- Original --
From: Giannis Polyzos