What are the best practices for Kafka MirrorMaker 2? #12375

2022-07-04 Thread Phongtat Yaemsomphong
Currently, I have configured mm2.properties as below:

clusters=DC1,DC2
DC1.bootstrap.servers = broker1-dev1:30011, broker2-dev1:30012,
broker3-dev1:30013
DC2.bootstrap.servers = broker1-dev2:30011, broker2-dev2:30012,
broker3-dev2:30013

DC1.config.storage.replication.factor = 3
DC2.config.storage.replication.factor = 3

DC1.offset.storage.replication.factor = 3
DC2.offset.storage.replication.factor = 3

DC1.status.storage.replication.factor = 3
DC2.status.storage.replication.factor = 3

DC1->DC2.enabled = true
DC2->DC1.enabled = true

offset-syncs.topic.replication.factor = 3
heartbeats.topic.replication.factor = 3
checkpoints.topic.replication.factor = 3

topics = .*
groups = .*

tasks.max = 2
replication.factor = 3
refresh.topics.enabled = true
sync.topic.configs.enabled = true
refresh.topics.interval.seconds = 10

topics.blacklist = .*[-.]internal, .*.replica, __consumer_offsets
groups.blacklist = console-consumer-.*, connect-.*, __.*

DC1->DC2.emit.heartbeats.enabled = true
DC1->DC2.emit.checkpoints.enabled = true

DC2->DC1.emit.heartbeats.enabled = true
DC2->DC1.emit.checkpoints.enabled = true
What is the expected behavior?
My test case is as follows.

1. Create a topic named topicTest1 on DC1. It is replicated to DC2, where it
appears as DC1.topicTest1.
2. Next, I produce a message to topicTest1 on DC1, and the message is synced to
DC1.topicTest1 on DC2.
3. Last, I produce a message to DC1.topicTest1 on DC2, but the message is not
synced back to topicTest1 on DC1.

My question is about step 3: should a message produced to DC1.topicTest1 on DC2
be synced back to topicTest1 on DC1, or not?

If it is not synced, please explain why.
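For reference, a minimal sketch of steps 2 and 3 using the plain Java clients
(bootstrap addresses and topic names are taken from the config above; the
consumer group id is made up). As far as I understand MirrorMaker 2's default
replication policy, a topic that already carries a cluster-alias prefix such as
DC1.topicTest1 is treated as a remote topic and is not replicated again, which
is how MM2 avoids infinite loops in an active/active setup; so records produced
directly to DC1.topicTest1 on DC2 would not be expected to appear in topicTest1
on DC1.

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class Mm2ReplicationCheck {
        public static void main(String[] args) throws Exception {
            // Step 2: produce one record to topicTest1 on DC1.
            Properties p = new Properties();
            p.put("bootstrap.servers", "broker1-dev1:30011");
            p.put("key.serializer", StringSerializer.class.getName());
            p.put("value.serializer", StringSerializer.class.getName());
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
                producer.send(new ProducerRecord<>("topicTest1", "k", "hello from DC1")).get();
            }

            // Then consume the mirrored record from DC1.topicTest1 on DC2.
            Properties c = new Properties();
            c.put("bootstrap.servers", "broker1-dev2:30011");
            c.put("group.id", "mm2-check");          // made-up group id
            c.put("auto.offset.reset", "earliest");
            c.put("key.deserializer", StringDeserializer.class.getName());
            c.put("value.deserializer", StringDeserializer.class.getName());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
                consumer.subscribe(List.of("DC1.topicTest1"));
                for (ConsumerRecord<String, String> r : consumer.poll(Duration.ofSeconds(10))) {
                    System.out.printf("mirrored: %s=%s%n", r.key(), r.value());
                }
            }
            // Step 3: records produced directly to DC1.topicTest1 on DC2 stay there;
            // the default policy does not replicate alias-prefixed (remote) topics back.
        }
    }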


-
Thank you,
Phongtat Yaemsomphong (Fin)
Phone: 088-5288080



"kafka-producer-network-thread" best practices

2020-11-02 Thread Miroslav Tsvetanov
Hello everyone,

I noticed that the kafka-producer-network-thread is much more heavily loaded
than any other thread.
I have a Kafka cluster with 3 brokers and a single kafka-producer-network-thread.
Do I need to scale it, and if so, to how many?
Could anything go wrong if I scale it, and what are the best practices?

Thanks in advance.

Best regards,
Miroslav
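Not an authoritative answer, but some context that may help: in the Java client,
each KafkaProducer instance owns exactly one background I/O thread (the
kafka-producer-network-thread), so there is no setting that adds more threads to
an existing producer. The usual first step is to tune batching on the single
producer; only if it is still saturated do people create additional producer
instances, each with its own network thread. A rough sketch, with made-up pool
size and purely illustrative config values:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ProducerPool {
        private final List<KafkaProducer<String, String>> pool = new ArrayList<>();

        public ProducerPool(String bootstrap, int size) {
            for (int i = 0; i < size; i++) {
                Properties p = new Properties();
                p.put("bootstrap.servers", bootstrap);
                p.put("key.serializer", StringSerializer.class.getName());
                p.put("value.serializer", StringSerializer.class.getName());
                // Batching knobs worth trying before adding more producers:
                p.put("linger.ms", "5");
                p.put("batch.size", "65536");
                p.put("compression.type", "lz4");
                pool.add(new KafkaProducer<>(p)); // each instance has its own network thread
            }
        }

        public void send(String topic, String key, String value) {
            // Spread records across producers (and therefore across network threads).
            int idx = Math.floorMod(key == null ? 0 : key.hashCode(), pool.size());
            pool.get(idx).send(new ProducerRecord<>(topic, key, value));
        }

        public void close() {
            pool.forEach(KafkaProducer::close);
        }
    }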


Re: Steps & best-practices to upgrade Confluent Kafka 4.1x to 5.3x

2020-09-01 Thread Rijo Roy
files) similar configurations are in the leader node. I observed that the
leader node and the other follower node have no files in /var/log/kafka, but
the upgraded ZK follower node that is failing to start does have files written
to /var/log/kafka.

Please help..

Thanks,
Rijo Roy
On 2020/08/19 17:52:34, Rijo Roy  wrote: 
> Sure Manoj!
> 
> Really appreciate your quick response..
> 
> On 2020/08/19 17:40:54,  wrote: 
> > Great .
> > Share your finding  to this group  once you done upgrade Confluent Kafka 
> > 4.1x to 5.3x successfully .
> > 
> > I see many people having  same question here .
> > 
> > On 8/19/20, 10:38 AM, "Rijo Roy"  wrote:
> > 
> > [External]
> > 
> > 
> > Thanks Manoj!
> > 
> > Yeah, the plan is to start with non-prod and validate first before 
> > going to prod.
> > 
> > Thanks & Regards,
> > Rijo Roy
> > 
> > On 2020/08/19 17:33:53,  wrote:
> > > I advise to do it non-prod for validation .
> > > You can backup data log folder if you want but I have'nt see any 
> > issue . but better to backup data if it small .
> > >
> > > Don’t change below value to latest until you done full validation , 
> > once you changed  to latest then you can't rollback .
> > >
> > > inter.broker.protocol.version=2.1.x
> > >
> > > On 8/19/20, 9:52 AM, "Rijo Roy"  wrote:
> > >
> > > [External]
> > >
> > >
> > > Thanks Manoj! Appreciate your help..
> > >
> > > I will follow the steps you pointed out..
> > >
> > > Do you think there is a need to :
> > > 1. backup the data before the rolling upgrade
> > > 2. some kind of datasync that should be considered here.. I don't 
> > think this is required as I am performing an in-place upgrade..
> > >
> > > Thanks & Regards,
> > > Rijo Roy
> > >
> > > On 2020/08/18 20:45:42,  wrote:
> > > > You can follow below steps
> > > >
> > > > 1. set inter.broker.protocol.version=2.1.x  and rolling restart 
> > kafka
> > > > 2. Rolling upgrade the Kafka cluster to 2.5 -
> > > > 3. rolling upgrade ZK cluster
> > > > Validate the kafka .
> > > >
> > > > 4. set inter.broker.protocol.version= new version and rolling 
> > restart the Kafka
> > > >
> > > >
> > > >
> > > > On 8/18/20, 12:54 PM, "Rijo Roy"  wrote:
> > > >
> > > > [External]
> > > >
> > >     >
> > > > Hi,
> > > >
> > > > I am a newbie in Kafka and would greatly appreciate if 
> > someone could help with best-practices and steps to upgrade to v5.3x.
> > > >
> > > > Below is my existing set-up:
> > > > OS version:  Ubuntu 16.04.6 LTS
> > > > ZooKeeper version : 3.4.10
> > > > Kafka version : confluent-kafka-2.11 / 1.1.1-cp2 / v4.1.1
> > > >
> > > > We need to upgrade our OS version to Ubuntu 18.04 LTS whose 
> > minimum requirement is to upgrade Kafka to v5.3x. Could someone please help 
> > me with the best-practices & steps for the upgrade..
> > > >
> > > > Please let me know if you need any more information so that 
> > you could help me.
> > > >
> > > > Appreciate your help!
> > > >
> > > > Thanks & Regards,
> > > > Rijo Roy
> > > >
> > > >
> > > >

Re: Steps & best-practices to upgrade Confluent Kafka 4.1x to 5.3x

2020-08-19 Thread Rijo Roy
Sure Manoj!

Really appreciate your quick response..

On 2020/08/19 17:40:54,  wrote: 
> Great .
> Share your finding  to this group  once you done upgrade Confluent Kafka 4.1x 
> to 5.3x successfully .
> 
> I see many people having  same question here .
> 
> On 8/19/20, 10:38 AM, "Rijo Roy"  wrote:
> 
> [External]
> 
> 
> Thanks Manoj!
> 
> Yeah, the plan is to start with non-prod and validate first before going 
> to prod.
> 
> Thanks & Regards,
> Rijo Roy
> 
> On 2020/08/19 17:33:53,  wrote:
> > I advise to do it non-prod for validation .
> > You can backup data log folder if you want but I have'nt see any issue 
> . but better to backup data if it small .
> >
> > Don’t change below value to latest until you done full validation , 
> once you changed  to latest then you can't rollback .
> >
> > inter.broker.protocol.version=2.1.x
> >
> > On 8/19/20, 9:52 AM, "Rijo Roy"  wrote:
> >
> > [External]
> >
> >
> > Thanks Manoj! Appreciate your help..
> >
> > I will follow the steps you pointed out..
> >
> > Do you think there is a need to :
> > 1. backup the data before the rolling upgrade
> > 2. some kind of datasync that should be considered here.. I don't 
> think this is required as I am performing an in-place upgrade..
> >
> > Thanks & Regards,
> > Rijo Roy
> >
> > On 2020/08/18 20:45:42,  wrote:
> > > You can follow below steps
> > >
> > > 1. set inter.broker.protocol.version=2.1.x  and rolling restart 
> kafka
> > > 2. Rolling upgrade the Kafka cluster to 2.5 -
> > > 3. rolling upgrade ZK cluster
> > > Validate the kafka .
> > >
> > > 4. set inter.broker.protocol.version= new version and rolling 
> restart the Kafka
> > >
> > >
> > >
> > > On 8/18/20, 12:54 PM, "Rijo Roy"  wrote:
> > >
> > > [External]
> > >
> > >
> > > Hi,
> > >
> > > I am a newbie in Kafka and would greatly appreciate if 
> someone could help with best-practices and steps to upgrade to v5.3x.
> > >
> > > Below is my existing set-up:
> > > OS version:  Ubuntu 16.04.6 LTS
> > > ZooKeeper version : 3.4.10
> > > Kafka version : confluent-kafka-2.11 / 1.1.1-cp2 / v4.1.1
> > >
> > > We need to upgrade our OS version to Ubuntu 18.04 LTS whose 
> minimum requirement is to upgrade Kafka to v5.3x. Could someone please help 
> me with the best-practices & steps for the upgrade..
> > >
> > > Please let me know if you need any more information so that 
> you could help me.
> > >
> > > Appreciate your help!
> > >
> > > Thanks & Regards,
> > > Rijo Roy
> > >
> > >
> > >

Re: Steps & best-practices to upgrade Confluent Kafka 4.1x to 5.3x

2020-08-19 Thread Manoj.Agrawal2
Great.
Please share your findings with this group once you have completed the Confluent
Kafka 4.1x to 5.3x upgrade successfully.

I see many people asking the same question here.

On 8/19/20, 10:38 AM, "Rijo Roy"  wrote:

[External]


Thanks Manoj!

Yeah, the plan is to start with non-prod and validate first before going to 
prod.

Thanks & Regards,
Rijo Roy

On 2020/08/19 17:33:53,  wrote:
> I advise to do it non-prod for validation .
> You can backup data log folder if you want but I have'nt see any issue . 
but better to backup data if it small .
>
> Don’t change below value to latest until you done full validation , once 
you changed  to latest then you can't rollback .
>
> inter.broker.protocol.version=2.1.x
>
> On 8/19/20, 9:52 AM, "Rijo Roy"  wrote:
>
> [External]
>
>
> Thanks Manoj! Appreciate your help..
>
> I will follow the steps you pointed out..
>
> Do you think there is a need to :
> 1. backup the data before the rolling upgrade
> 2. some kind of datasync that should be considered here.. I don't 
think this is required as I am performing an in-place upgrade..
>
> Thanks & Regards,
> Rijo Roy
>
> On 2020/08/18 20:45:42,  wrote:
> > You can follow below steps
> >
> > 1. set inter.broker.protocol.version=2.1.x  and rolling restart 
kafka
> > 2. Rolling upgrade the Kafka cluster to 2.5 -
> > 3. rolling upgrade ZK cluster
> > Validate the kafka .
> >
> > 4. set inter.broker.protocol.version= new version and rolling 
restart the Kafka
> >
> >
> >
> > On 8/18/20, 12:54 PM, "Rijo Roy"  wrote:
> >
> > [External]
> >
> >
> > Hi,
> >
> > I am a newbie in Kafka and would greatly appreciate if someone 
could help with best-practices and steps to upgrade to v5.3x.
> >
> > Below is my existing set-up:
> > OS version:  Ubuntu 16.04.6 LTS
> > ZooKeeper version : 3.4.10
> > Kafka version : confluent-kafka-2.11 / 1.1.1-cp2 / v4.1.1
> >
> > We need to upgrade our OS version to Ubuntu 18.04 LTS whose 
minimum requirement is to upgrade Kafka to v5.3x. Could someone please help me 
with the best-practices & steps for the upgrade..
> >
> > Please let me know if you need any more information so that you 
could help me.
> >
> > Appreciate your help!
> >
> > Thanks & Regards,
> > Rijo Roy
> >
> >
> >

Re: Steps & best-practices to upgrade Confluent Kafka 4.1x to 5.3x

2020-08-19 Thread Rijo Roy
Thanks Manoj!

Yeah, the plan is to start with non-prod and validate first before going to 
prod.

Thanks & Regards,
Rijo Roy

On 2020/08/19 17:33:53,  wrote: 
> I advise to do it non-prod for validation .
> You can backup data log folder if you want but I have'nt see any issue . but 
> better to backup data if it small .
> 
> Don’t change below value to latest until you done full validation , once you 
> changed  to latest then you can't rollback .
> 
> inter.broker.protocol.version=2.1.x
> 
> On 8/19/20, 9:52 AM, "Rijo Roy"  wrote:
> 
> [External]
> 
> 
> Thanks Manoj! Appreciate your help..
> 
> I will follow the steps you pointed out..
> 
> Do you think there is a need to :
> 1. backup the data before the rolling upgrade
> 2. some kind of datasync that should be considered here.. I don't think 
> this is required as I am performing an in-place upgrade..
> 
> Thanks & Regards,
> Rijo Roy
> 
> On 2020/08/18 20:45:42,  wrote:
> > You can follow below steps
> >
> > 1. set inter.broker.protocol.version=2.1.x  and rolling restart kafka
> > 2. Rolling upgrade the Kafka cluster to 2.5 -
> > 3. rolling upgrade ZK cluster
> > Validate the kafka .
> >
> > 4. set inter.broker.protocol.version= new version and rolling restart 
> the Kafka
> >
> >
> >
> > On 8/18/20, 12:54 PM, "Rijo Roy"  wrote:
> >
> > [External]
> >
> >
> > Hi,
> >
> > I am a newbie in Kafka and would greatly appreciate if someone 
> could help with best-practices and steps to upgrade to v5.3x.
> >
> > Below is my existing set-up:
> > OS version:  Ubuntu 16.04.6 LTS
> > ZooKeeper version : 3.4.10
> > Kafka version : confluent-kafka-2.11 / 1.1.1-cp2 / v4.1.1
> >
> > We need to upgrade our OS version to Ubuntu 18.04 LTS whose minimum 
> requirement is to upgrade Kafka to v5.3x. Could someone please help me with 
> the best-practices & steps for the upgrade..
> >
> > Please let me know if you need any more information so that you 
> could help me.
> >
> > Appreciate your help!
> >
> > Thanks & Regards,
> > Rijo Roy
> >
> >
> >


Re: Steps & best-practices to upgrade Confluent Kafka 4.1x to 5.3x

2020-08-19 Thread Manoj.Agrawal2
I advise doing it in non-prod first for validation.
You can back up the data log folder if you want, but I haven't seen any issue;
still, it is better to back up the data if it is small.

Don't change the value below to the latest version until you have finished full
validation; once you change it to the latest, you can't roll back.

inter.broker.protocol.version=2.1.x

On 8/19/20, 9:52 AM, "Rijo Roy"  wrote:

[External]


Thanks Manoj! Appreciate your help..

I will follow the steps you pointed out..

Do you think there is a need to :
1. backup the data before the rolling upgrade
2. some kind of datasync that should be considered here.. I don't think 
this is required as I am performing an in-place upgrade..

Thanks & Regards,
Rijo Roy

On 2020/08/18 20:45:42,  wrote:
> You can follow below steps
>
> 1. set inter.broker.protocol.version=2.1.x  and rolling restart kafka
> 2. Rolling upgrade the Kafka cluster to 2.5 -
> 3. rolling upgrade ZK cluster
> Validate the kafka .
>
> 4. set inter.broker.protocol.version= new version and rolling restart the 
Kafka
>
>
>
> On 8/18/20, 12:54 PM, "Rijo Roy"  wrote:
>
> [External]
>
>
> Hi,
>
> I am a newbie in Kafka and would greatly appreciate if someone could 
help with best-practices and steps to upgrade to v5.3x.
>
> Below is my existing set-up:
> OS version:  Ubuntu 16.04.6 LTS
> ZooKeeper version : 3.4.10
> Kafka version : confluent-kafka-2.11 / 1.1.1-cp2 / v4.1.1
>
> We need to upgrade our OS version to Ubuntu 18.04 LTS whose minimum 
requirement is to upgrade Kafka to v5.3x. Could someone please help me with the 
best-practices & steps for the upgrade..
>
> Please let me know if you need any more information so that you could 
help me.
>
> Appreciate your help!
>
> Thanks & Regards,
> Rijo Roy
>
>
>


Re: Steps & best-practices to upgrade Confluent Kafka 4.1x to 5.3x

2020-08-19 Thread Rijo Roy
Thanks Manoj! Appreciate your help.

I will follow the steps you pointed out.

Do you think there is a need for:
1. backing up the data before the rolling upgrade
2. some kind of data sync to be considered here? I don't think this is
required as I am performing an in-place upgrade.

Thanks & Regards,
Rijo Roy

On 2020/08/18 20:45:42,  wrote: 
> You can follow below steps
> 
> 1. set inter.broker.protocol.version=2.1.x  and rolling restart kafka
> 2. Rolling upgrade the Kafka cluster to 2.5 -
> 3. rolling upgrade ZK cluster
> Validate the kafka .
> 
> 4. set inter.broker.protocol.version= new version and rolling restart the 
> Kafka
> 
> 
> 
> On 8/18/20, 12:54 PM, "Rijo Roy"  wrote:
> 
> [External]
> 
> 
> Hi,
> 
> I am a newbie in Kafka and would greatly appreciate if someone could help 
> with best-practices and steps to upgrade to v5.3x.
> 
> Below is my existing set-up:
> OS version:  Ubuntu 16.04.6 LTS
> ZooKeeper version : 3.4.10
> Kafka version : confluent-kafka-2.11 / 1.1.1-cp2 / v4.1.1
> 
> We need to upgrade our OS version to Ubuntu 18.04 LTS whose minimum 
> requirement is to upgrade Kafka to v5.3x. Could someone please help me with 
> the best-practices & steps for the upgrade..
> 
> Please let me know if you need any more information so that you could 
> help me.
> 
> Appreciate your help!
> 
> Thanks & Regards,
> Rijo Roy
> 
> 
> 


Re: Steps & best-practices to upgrade Confluent Kafka 4.1x to 5.3x

2020-08-18 Thread Manoj.Agrawal2
You can follow the steps below:

1. Set inter.broker.protocol.version=2.1.x and do a rolling restart of Kafka.
2. Do a rolling upgrade of the Kafka cluster to 2.5.
3. Do a rolling upgrade of the ZK cluster, then validate Kafka.

4. Set inter.broker.protocol.version to the new version and do a rolling restart
of Kafka.
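For what it is worth, a sketch of the two server.properties phases these steps
imply. The version numbers below are illustrative, not prescriptive: the pinned
value should match whatever the brokers currently run (1.1 for the Kafka 1.1.1
that ships with CP 4.1.1), and it is only moved forward after full validation,
since that change cannot be rolled back.

    # Phase 1: keep the protocol pinned to the currently running version while
    # the binaries are rolled to the new release, one broker at a time.
    inter.broker.protocol.version=1.1
    log.message.format.version=1.1

    # Phase 2: only after every broker runs the new binaries and the cluster
    # has been validated; this change cannot be rolled back.
    # (Kafka 2.3 is the version bundled with Confluent Platform 5.3.x.)
    inter.broker.protocol.version=2.3
    log.message.format.version=2.3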



On 8/18/20, 12:54 PM, "Rijo Roy"  wrote:

[External]


Hi,

I am a newbie in Kafka and would greatly appreciate if someone could help 
with best-practices and steps to upgrade to v5.3x.

Below is my existing set-up:
OS version:  Ubuntu 16.04.6 LTS
ZooKeeper version : 3.4.10
Kafka version : confluent-kafka-2.11 / 1.1.1-cp2 / v4.1.1

We need to upgrade our OS version to Ubuntu 18.04 LTS whose minimum 
requirement is to upgrade Kafka to v5.3x. Could someone please help me with the 
best-practices & steps for the upgrade..

Please let me know if you need any more information so that you could help 
me.

Appreciate your help!

Thanks & Regards,
Rijo Roy





Steps & best-practices to upgrade Confluent Kafka 4.1x to 5.3x

2020-08-18 Thread Rijo Roy
Hi,

I am a newbie to Kafka and would greatly appreciate it if someone could help with
best practices and steps to upgrade to v5.3x.

Below is my existing set-up:
OS version:  Ubuntu 16.04.6 LTS
ZooKeeper version : 3.4.10
Kafka version : confluent-kafka-2.11 / 1.1.1-cp2 / v4.1.1

We need to upgrade our OS to Ubuntu 18.04 LTS, which at minimum requires upgrading
Kafka to v5.3x. Could someone please help me with the best practices and steps for
the upgrade?

Please let me know if you need any more information so that you could help me.

Appreciate your help!

Thanks & Regards,
Rijo Roy



Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-11 Thread Jason Turim
I am not sure off the top of my head, but since the method is on the consumer, my
intuition is that it would pause all the partitions the consumer is reading
from. I think the best thing to do is write a little test-harness app to
verify the behavior.
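If it helps while writing that harness, a tiny sketch (topic and group names are
made up): pause() is partition-scoped, so pausing "everything" usually means
passing consumer.assignment(), and the paused set stays in effect across
subsequent poll() calls until resume() is called (it is not preserved across a
rebalance).

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class PauseAllAssigned {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "pause-test");             // made-up group id
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("some-topic"));   // made-up topic
                ConsumerRecords<String, String> first = consumer.poll(Duration.ofSeconds(1));
                System.out.println("fetched " + first.count() + " records");
                // pause() takes an explicit set of partitions; pass every
                // currently assigned partition to stop fetching entirely.
                consumer.pause(consumer.assignment());
                System.out.println("paused partitions: " + consumer.paused());
                // Later, fetching is switched back on the same way.
                consumer.resume(consumer.assignment());
            }
        }
    }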

On Mon, May 11, 2020 at 7:31 AM Ali Nazemian  wrote:

> Hi Jason,
>
> Thank you for the message. It seems quite interesting. So something I am
> not sure about "pause" and "resume" is it works based on a partition
> allocation. What will happen if more partitions are assigned to a single
> consumer? For example, in the case where we have over-partition a Kafka
> topic to reserve some capacity for scaling up in case of burst traffic.
>
> Regards,
> Ali
>
> On Sat, May 9, 2020 at 11:52 PM Jason Turim  wrote:
>
> > Hi Ali,
> >
> > You may want to look at using the consumer pause / resume api.  Its a
> > mechanism that allows you to poll without retrieving new messages.
> >
> > I employed this strategy to effectively handle highly variable workloads
> by
> > processing them in a background thread.   First pause the consumer when
> > messages were received.  Then start a background thread to process the
> > data, meanwhile the primary thread continues to poll (it won't retrieve
> > data while the consumer is paused).  Finally, when the bg processing is
> > complete, call resume on the consumer to retrieve the next batch of
> > messages.
> >
> > More info here in the section "Detecting Consumer Failures" -
> >
> >
> https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
> >
> > hth,
> > jt
> >
> > On Fri, May 8, 2020, 10:50 PM Chris Toomey  wrote:
> >
> > > What exactly is your understanding of what's happening when you say
> "the
> > > pipeline will be blocked by the group coordinator for up to "
> > > max.poll.interval.ms""? Please explain that.
> > >
> > > There's no universal recipe for "long-running jobs", there's just
> > > particular issues you might be encountering and suggested solutions to
> > > those issues.
> > >
> > >
> > >
> > > On Fri, May 8, 2020 at 7:03 PM Ali Nazemian 
> > wrote:
> > >
> > > > Hi Chris,
> > > >
> > > > I am not sure where I said about the "automatic partition
> > reassignment",
> > > > but what I know here is the side effect of increasing "
> > > > max.poll.interval.ms"
> > > > is if the consumer hangs for whatever reason the pipeline will be
> > blocked
> > > > by the group coordinator for up to "max.poll.interval.ms". So I am
> not
> > > > sure
> > > > if this is because of the automatic partition assignment or something
> > > else.
> > > > What I am looking for is how I can deal with long-running jobs in
> > Apache
> > > > Kafka.
> > > >
> > > > Thanks,
> > > > Ali
> > > >
> > > > On Sat, May 9, 2020 at 4:25 AM Chris Toomey 
> wrote:
> > > >
> > > > > I interpreted your post as saying "when our consumer gets stuck,
> > > Kafka's
> > > > > automatic partition reassignment kicks in and that's problematic
> for
> > > us."
> > > > > Hence I suggested not using the automatic partition assignment,
> which
> > > per
> > > > > my interpretation would address your issue.
> > > > >
> > > > > Chris
> > > > >
> > > > > On Fri, May 8, 2020 at 2:19 AM Ali Nazemian  >
> > > > wrote:
> > > > >
> > > > > > Thanks, Chris. So what is causing the consumer to get stuck is a
> > side
> > > > > > effect of the built-in partition assignment in Kafka and by
> > > overriding
> > > > > that
> > > > > > behaviour I should be able to address the long-running job issue,
> > is
> > > > that
> > > > > > right? Can you please elaborate more on this?
> > > > > >
> > > > > > Regards,
> > > > > > Ali
> > > > > >
> > > > > > On Fri, May 8, 2020 at 1:09 PM Chris Toomey 
> > > wrote:
> > > > > >
> > > > > > > You really have to decide what behavior it is you want when one
> > of
> > > > your
> > > > > > > consumers gets "stuck". If you don't like the way the group
> > > protocol
> > > > > > > dynamically manages topic partition assignments or can't figure
> > out
> > > > an
> > > > > > > appropriate set of configuration settings that achieve your
> goal,
> > > you
> > > > > can
> > > > > > > always elect to not use the group protocol and instead manage
> > topic
> > > > > > > partition assignment yourself. As I just replied to another
> post,
> > > > > > there's a
> > > > > > > nice writeup of this under  "Manual Partition Assignment" in
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
> > > > > > >  .
> > > > > > >
> > > > > > > Chris
> > > > > > >
> > > > > > >
> > > > > > > On Thu, May 7, 2020 at 12:37 AM Ali Nazemian <
> > > alinazem...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > To help understanding my case in more details, the error I
> can
> > > see
> > > > > > > > constantly is the consumer losing heartbeat and hence
> > apparently
> > > > the
> > > > > > > group
> > > > > > > > get rebalanced based on the 

Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-11 Thread Ali Nazemian
Hi Jason,

Thank you for the message. It seems quite interesting. Something I am not sure
about with "pause" and "resume" is that they work based on partition
allocation. What will happen if more partitions are assigned to a single
consumer? For example, in the case where we have over-partitioned a Kafka
topic to reserve some capacity for scaling up in case of burst traffic.

Regards,
Ali

On Sat, May 9, 2020 at 11:52 PM Jason Turim  wrote:

> Hi Ali,
>
> You may want to look at using the consumer pause / resume api.  Its a
> mechanism that allows you to poll without retrieving new messages.
>
> I employed this strategy to effectively handle highly variable workloads by
> processing them in a background thread.   First pause the consumer when
> messages were received.  Then start a background thread to process the
> data, meanwhile the primary thread continues to poll (it won't retrieve
> data while the consumer is paused).  Finally, when the bg processing is
> complete, call resume on the consumer to retrieve the next batch of
> messages.
>
> More info here in the section "Detecting Consumer Failures" -
>
> https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
>
> hth,
> jt
>
> On Fri, May 8, 2020, 10:50 PM Chris Toomey  wrote:
>
> > What exactly is your understanding of what's happening when you say "the
> > pipeline will be blocked by the group coordinator for up to "
> > max.poll.interval.ms""? Please explain that.
> >
> > There's no universal recipe for "long-running jobs", there's just
> > particular issues you might be encountering and suggested solutions to
> > those issues.
> >
> >
> >
> > On Fri, May 8, 2020 at 7:03 PM Ali Nazemian 
> wrote:
> >
> > > Hi Chris,
> > >
> > > I am not sure where I said about the "automatic partition
> reassignment",
> > > but what I know here is the side effect of increasing "
> > > max.poll.interval.ms"
> > > is if the consumer hangs for whatever reason the pipeline will be
> blocked
> > > by the group coordinator for up to "max.poll.interval.ms". So I am not
> > > sure
> > > if this is because of the automatic partition assignment or something
> > else.
> > > What I am looking for is how I can deal with long-running jobs in
> Apache
> > > Kafka.
> > >
> > > Thanks,
> > > Ali
> > >
> > > On Sat, May 9, 2020 at 4:25 AM Chris Toomey  wrote:
> > >
> > > > I interpreted your post as saying "when our consumer gets stuck,
> > Kafka's
> > > > automatic partition reassignment kicks in and that's problematic for
> > us."
> > > > Hence I suggested not using the automatic partition assignment, which
> > per
> > > > my interpretation would address your issue.
> > > >
> > > > Chris
> > > >
> > > > On Fri, May 8, 2020 at 2:19 AM Ali Nazemian 
> > > wrote:
> > > >
> > > > > Thanks, Chris. So what is causing the consumer to get stuck is a
> side
> > > > > effect of the built-in partition assignment in Kafka and by
> > overriding
> > > > that
> > > > > behaviour I should be able to address the long-running job issue,
> is
> > > that
> > > > > right? Can you please elaborate more on this?
> > > > >
> > > > > Regards,
> > > > > Ali
> > > > >
> > > > > On Fri, May 8, 2020 at 1:09 PM Chris Toomey 
> > wrote:
> > > > >
> > > > > > You really have to decide what behavior it is you want when one
> of
> > > your
> > > > > > consumers gets "stuck". If you don't like the way the group
> > protocol
> > > > > > dynamically manages topic partition assignments or can't figure
> out
> > > an
> > > > > > appropriate set of configuration settings that achieve your goal,
> > you
> > > > can
> > > > > > always elect to not use the group protocol and instead manage
> topic
> > > > > > partition assignment yourself. As I just replied to another post,
> > > > > there's a
> > > > > > nice writeup of this under  "Manual Partition Assignment" in
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
> > > > > >  .
> > > > > >
> > > > > > Chris
> > > > > >
> > > > > >
> > > > > > On Thu, May 7, 2020 at 12:37 AM Ali Nazemian <
> > alinazem...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > To help understanding my case in more details, the error I can
> > see
> > > > > > > constantly is the consumer losing heartbeat and hence
> apparently
> > > the
> > > > > > group
> > > > > > > get rebalanced based on the log I can see from Kafka side:
> > > > > > >
> > > > > > > GroupCoordinator 11]: Member
> > > > > > > consumer-3-f46e14b4-5998-4083-b7ec-bed4e3f374eb in group foo
> has
> > > > > failed,
> > > > > > > removing it from the group
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Ali
> > > > > > >
> > > > > > > On Thu, May 7, 2020 at 2:38 PM Ali Nazemian <
> > alinazem...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > With the emerge of using Apache Kafka for event-driven
> > > > architecture,
> > > > > > one
> > > > > > > > thing 

Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-09 Thread Jason Turim
Hi Ali,

You may want to look at using the consumer pause/resume API. It's a
mechanism that allows you to keep polling without retrieving new messages.

I employed this strategy to handle highly variable workloads effectively by
processing them in a background thread. First, pause the consumer when
messages are received. Then start a background thread to process the
data; meanwhile, the primary thread continues to poll (it won't retrieve
data while the consumer is paused). Finally, when the background processing is
complete, call resume on the consumer to retrieve the next batch of
messages.
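A rough sketch of that pattern as I understand it (topic name, group id, and the
processing call are placeholders; error handling and the single-in-flight-batch
assumption are simplified):

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class PauseResumeWorker {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "long-running-jobs");       // made-up group id
            props.put("enable.auto.commit", "false");
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());

            ExecutorService worker = Executors.newSingleThreadExecutor();
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("jobs"));           // made-up topic
                Future<?> inFlight = null;                     // one batch at a time
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    if (!records.isEmpty() && inFlight == null) {
                        // Hand the batch to a background thread and stop fetching more.
                        consumer.pause(consumer.assignment());
                        inFlight = worker.submit(() -> process(records));
                    }
                    // The poll() above keeps the consumer alive in the group while paused.
                    if (inFlight != null && inFlight.isDone()) {
                        consumer.commitSync();                 // commit only after the work finished
                        consumer.resume(consumer.assignment());
                        inFlight = null;
                    }
                }
            }
        }

        // Placeholder for the actual long-running processing; error handling omitted.
        private static void process(ConsumerRecords<String, String> records) {
            records.forEach(r -> System.out.println("processing " + r.value()));
        }
    }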

More info here in the section "Detecting Consumer Failures" -
https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html

hth,
jt

On Fri, May 8, 2020, 10:50 PM Chris Toomey  wrote:

> What exactly is your understanding of what's happening when you say "the
> pipeline will be blocked by the group coordinator for up to "
> max.poll.interval.ms""? Please explain that.
>
> There's no universal recipe for "long-running jobs", there's just
> particular issues you might be encountering and suggested solutions to
> those issues.
>
>
>
> On Fri, May 8, 2020 at 7:03 PM Ali Nazemian  wrote:
>
> > Hi Chris,
> >
> > I am not sure where I said about the "automatic partition reassignment",
> > but what I know here is the side effect of increasing "
> > max.poll.interval.ms"
> > is if the consumer hangs for whatever reason the pipeline will be blocked
> > by the group coordinator for up to "max.poll.interval.ms". So I am not
> > sure
> > if this is because of the automatic partition assignment or something
> else.
> > What I am looking for is how I can deal with long-running jobs in Apache
> > Kafka.
> >
> > Thanks,
> > Ali
> >
> > On Sat, May 9, 2020 at 4:25 AM Chris Toomey  wrote:
> >
> > > I interpreted your post as saying "when our consumer gets stuck,
> Kafka's
> > > automatic partition reassignment kicks in and that's problematic for
> us."
> > > Hence I suggested not using the automatic partition assignment, which
> per
> > > my interpretation would address your issue.
> > >
> > > Chris
> > >
> > > On Fri, May 8, 2020 at 2:19 AM Ali Nazemian 
> > wrote:
> > >
> > > > Thanks, Chris. So what is causing the consumer to get stuck is a side
> > > > effect of the built-in partition assignment in Kafka and by
> overriding
> > > that
> > > > behaviour I should be able to address the long-running job issue, is
> > that
> > > > right? Can you please elaborate more on this?
> > > >
> > > > Regards,
> > > > Ali
> > > >
> > > > On Fri, May 8, 2020 at 1:09 PM Chris Toomey 
> wrote:
> > > >
> > > > > You really have to decide what behavior it is you want when one of
> > your
> > > > > consumers gets "stuck". If you don't like the way the group
> protocol
> > > > > dynamically manages topic partition assignments or can't figure out
> > an
> > > > > appropriate set of configuration settings that achieve your goal,
> you
> > > can
> > > > > always elect to not use the group protocol and instead manage topic
> > > > > partition assignment yourself. As I just replied to another post,
> > > > there's a
> > > > > nice writeup of this under  "Manual Partition Assignment" in
> > > > >
> > > > >
> > > >
> > >
> >
> https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
> > > > >  .
> > > > >
> > > > > Chris
> > > > >
> > > > >
> > > > > On Thu, May 7, 2020 at 12:37 AM Ali Nazemian <
> alinazem...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > To help understanding my case in more details, the error I can
> see
> > > > > > constantly is the consumer losing heartbeat and hence apparently
> > the
> > > > > group
> > > > > > get rebalanced based on the log I can see from Kafka side:
> > > > > >
> > > > > > GroupCoordinator 11]: Member
> > > > > > consumer-3-f46e14b4-5998-4083-b7ec-bed4e3f374eb in group foo has
> > > > failed,
> > > > > > removing it from the group
> > > > > >
> > > > > > Thanks,
> > > > > > Ali
> > > > > >
> > > > > > On Thu, May 7, 2020 at 2:38 PM Ali Nazemian <
> alinazem...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > With the emerge of using Apache Kafka for event-driven
> > > architecture,
> > > > > one
> > > > > > > thing that has become important is how to tune apache Kafka
> > > consumer
> > > > to
> > > > > > > manage long-running jobs. The main issue raises when we set a
> > > > > relatively
> > > > > > > large value for "max.poll.interval.ms". Setting this value
> will,
> > > of
> > > > > > > course, resolve the issue of repetitive rebalance, but creates
> > > > another
> > > > > > > operational issue. I am looking for some sort of golden
> strategy
> > to
> > > > > deal
> > > > > > > with long-running jobs with Apache Kafka.
> > > > > > >
> > > > > > > If the consumer hangs for whatever reason, there is no easy way
> > of
> > > > > > passing
> > > > > > > that stage. It can easily block the pipeline, and you cannot do
> > > 

Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-08 Thread Chris Toomey
What exactly is your understanding of what's happening when you say "the
pipeline will be blocked by the group coordinator for up to "
max.poll.interval.ms""? Please explain that.

There's no universal recipe for "long-running jobs"; there are just
particular issues you might be encountering and suggested solutions to
those issues.



On Fri, May 8, 2020 at 7:03 PM Ali Nazemian  wrote:

> Hi Chris,
>
> I am not sure where I said about the "automatic partition reassignment",
> but what I know here is the side effect of increasing "
> max.poll.interval.ms"
> is if the consumer hangs for whatever reason the pipeline will be blocked
> by the group coordinator for up to "max.poll.interval.ms". So I am not
> sure
> if this is because of the automatic partition assignment or something else.
> What I am looking for is how I can deal with long-running jobs in Apache
> Kafka.
>
> Thanks,
> Ali
>
> On Sat, May 9, 2020 at 4:25 AM Chris Toomey  wrote:
>
> > I interpreted your post as saying "when our consumer gets stuck, Kafka's
> > automatic partition reassignment kicks in and that's problematic for us."
> > Hence I suggested not using the automatic partition assignment, which per
> > my interpretation would address your issue.
> >
> > Chris
> >
> > On Fri, May 8, 2020 at 2:19 AM Ali Nazemian 
> wrote:
> >
> > > Thanks, Chris. So what is causing the consumer to get stuck is a side
> > > effect of the built-in partition assignment in Kafka and by overriding
> > that
> > > behaviour I should be able to address the long-running job issue, is
> that
> > > right? Can you please elaborate more on this?
> > >
> > > Regards,
> > > Ali
> > >
> > > On Fri, May 8, 2020 at 1:09 PM Chris Toomey  wrote:
> > >
> > > > You really have to decide what behavior it is you want when one of
> your
> > > > consumers gets "stuck". If you don't like the way the group protocol
> > > > dynamically manages topic partition assignments or can't figure out
> an
> > > > appropriate set of configuration settings that achieve your goal, you
> > can
> > > > always elect to not use the group protocol and instead manage topic
> > > > partition assignment yourself. As I just replied to another post,
> > > there's a
> > > > nice writeup of this under  "Manual Partition Assignment" in
> > > >
> > > >
> > >
> >
> https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
> > > >  .
> > > >
> > > > Chris
> > > >
> > > >
> > > > On Thu, May 7, 2020 at 12:37 AM Ali Nazemian 
> > > > wrote:
> > > >
> > > > > To help understanding my case in more details, the error I can see
> > > > > constantly is the consumer losing heartbeat and hence apparently
> the
> > > > group
> > > > > get rebalanced based on the log I can see from Kafka side:
> > > > >
> > > > > GroupCoordinator 11]: Member
> > > > > consumer-3-f46e14b4-5998-4083-b7ec-bed4e3f374eb in group foo has
> > > failed,
> > > > > removing it from the group
> > > > >
> > > > > Thanks,
> > > > > Ali
> > > > >
> > > > > On Thu, May 7, 2020 at 2:38 PM Ali Nazemian  >
> > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > With the emerge of using Apache Kafka for event-driven
> > architecture,
> > > > one
> > > > > > thing that has become important is how to tune apache Kafka
> > consumer
> > > to
> > > > > > manage long-running jobs. The main issue raises when we set a
> > > > relatively
> > > > > > large value for "max.poll.interval.ms". Setting this value will,
> > of
> > > > > > course, resolve the issue of repetitive rebalance, but creates
> > > another
> > > > > > operational issue. I am looking for some sort of golden strategy
> to
> > > > deal
> > > > > > with long-running jobs with Apache Kafka.
> > > > > >
> > > > > > If the consumer hangs for whatever reason, there is no easy way
> of
> > > > > passing
> > > > > > that stage. It can easily block the pipeline, and you cannot do
> > much
> > > > > about
> > > > > > it. Therefore, it came to my mind that I am probably missing
> > > something
> > > > > > here. What are the expectations? Is it not valid to use Apache
> > Kafka
> > > > for
> > > > > > long-live jobs? Are there any other parameters need to be set,
> and
> > > the
> > > > > > issue of a consumer being stuck is caused by misconfiguration?
> > > > > >
> > > > > > I can see there are a lot of the same issues have been raised
> > > regarding
> > > > > > "the consumer is stuck" and usually, the answer has been "yeah,
> > > that's
> > > > > > because you have a long-running job, etc.". I have seen different
> > > > > > suggestions:
> > > > > >
> > > > > > - Avoid using long-running jobs. Read the message, submit it into
> > > > another
> > > > > > thread and let the consumer to pass. Obviously this can cause
> data
> > > loss
> > > > > and
> > > > > > it would be a difficult problem to handle. It might be better to
> > > avoid
> > > > > > using Kafka in the first place for these types of requests.
> > > > > >
> > > > > > - Avoid using apache Kafka for long-running 

Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-08 Thread Ali Nazemian
Hi Chris,

I am not sure where I mentioned "automatic partition reassignment", but what I do
know is that the side effect of increasing "max.poll.interval.ms" is that, if the
consumer hangs for whatever reason, the pipeline will be blocked by the group
coordinator for up to "max.poll.interval.ms". So I am not sure whether this is
caused by the automatic partition assignment or something else. What I am looking
for is how to deal with long-running jobs in Apache Kafka.
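For what it is worth, my understanding of the trade-off: since KIP-62 the
consumer has two separate timeouts. session.timeout.ms is enforced through the
background heartbeat thread, while max.poll.interval.ms bounds the gap between
poll() calls, so a consumer that keeps heartbeating but stops polling (e.g. stuck
in processing) is only evicted after max.poll.interval.ms. A sketch of the
relevant settings, with purely illustrative values:

    import java.util.Properties;

    public class LongJobConsumerConfig {
        public static Properties build() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "long-running-jobs");   // made-up group id
            // Detects a crashed or unreachable process quickly via the heartbeat thread.
            props.put("session.timeout.ms", "10000");
            props.put("heartbeat.interval.ms", "3000");
            // Upper bound on the time between poll() calls; a consumer that stops
            // polling (e.g. stuck processing) is removed from the group only after this.
            props.put("max.poll.interval.ms", "300000");
            // Smaller batches shorten the work done per poll() and make the
            // poll interval easier to honour.
            props.put("max.poll.records", "50");
            return props;
        }
    }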

Thanks,
Ali

On Sat, May 9, 2020 at 4:25 AM Chris Toomey  wrote:

> I interpreted your post as saying "when our consumer gets stuck, Kafka's
> automatic partition reassignment kicks in and that's problematic for us."
> Hence I suggested not using the automatic partition assignment, which per
> my interpretation would address your issue.
>
> Chris
>
> On Fri, May 8, 2020 at 2:19 AM Ali Nazemian  wrote:
>
> > Thanks, Chris. So what is causing the consumer to get stuck is a side
> > effect of the built-in partition assignment in Kafka and by overriding
> that
> > behaviour I should be able to address the long-running job issue, is that
> > right? Can you please elaborate more on this?
> >
> > Regards,
> > Ali
> >
> > On Fri, May 8, 2020 at 1:09 PM Chris Toomey  wrote:
> >
> > > You really have to decide what behavior it is you want when one of your
> > > consumers gets "stuck". If you don't like the way the group protocol
> > > dynamically manages topic partition assignments or can't figure out an
> > > appropriate set of configuration settings that achieve your goal, you
> can
> > > always elect to not use the group protocol and instead manage topic
> > > partition assignment yourself. As I just replied to another post,
> > there's a
> > > nice writeup of this under  "Manual Partition Assignment" in
> > >
> > >
> >
> https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
> > >  .
> > >
> > > Chris
> > >
> > >
> > > On Thu, May 7, 2020 at 12:37 AM Ali Nazemian 
> > > wrote:
> > >
> > > > To help understanding my case in more details, the error I can see
> > > > constantly is the consumer losing heartbeat and hence apparently the
> > > group
> > > > get rebalanced based on the log I can see from Kafka side:
> > > >
> > > > GroupCoordinator 11]: Member
> > > > consumer-3-f46e14b4-5998-4083-b7ec-bed4e3f374eb in group foo has
> > failed,
> > > > removing it from the group
> > > >
> > > > Thanks,
> > > > Ali
> > > >
> > > > On Thu, May 7, 2020 at 2:38 PM Ali Nazemian 
> > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > With the emerge of using Apache Kafka for event-driven
> architecture,
> > > one
> > > > > thing that has become important is how to tune apache Kafka
> consumer
> > to
> > > > > manage long-running jobs. The main issue raises when we set a
> > > relatively
> > > > > large value for "max.poll.interval.ms". Setting this value will,
> of
> > > > > course, resolve the issue of repetitive rebalance, but creates
> > another
> > > > > operational issue. I am looking for some sort of golden strategy to
> > > deal
> > > > > with long-running jobs with Apache Kafka.
> > > > >
> > > > > If the consumer hangs for whatever reason, there is no easy way of
> > > > passing
> > > > > that stage. It can easily block the pipeline, and you cannot do
> much
> > > > about
> > > > > it. Therefore, it came to my mind that I am probably missing
> > something
> > > > > here. What are the expectations? Is it not valid to use Apache
> Kafka
> > > for
> > > > > long-live jobs? Are there any other parameters need to be set, and
> > the
> > > > > issue of a consumer being stuck is caused by misconfiguration?
> > > > >
> > > > > I can see there are a lot of the same issues have been raised
> > regarding
> > > > > "the consumer is stuck" and usually, the answer has been "yeah,
> > that's
> > > > > because you have a long-running job, etc.". I have seen different
> > > > > suggestions:
> > > > >
> > > > > - Avoid using long-running jobs. Read the message, submit it into
> > > another
> > > > > thread and let the consumer to pass. Obviously this can cause data
> > loss
> > > > and
> > > > > it would be a difficult problem to handle. It might be better to
> > avoid
> > > > > using Kafka in the first place for these types of requests.
> > > > >
> > > > > - Avoid using apache Kafka for long-running requests
> > > > >
> > > > > - Workaround based approaches like if the consumer is blocked, try
> to
> > > use
> > > > > another consumer group and set the offset to the current value for
> > the
> > > > new
> > > > > consumer group, etc.
> > > > >
> > > > > There might be other suggestions I have missed here, but that is
> not
> > > the
> > > > > point of this email. What I am looking for is what is the best
> > practice
> > > > for
> > > > > dealing with long-running jobs with Apache Kafka. I cannot easily
> > avoid
> > > > > using Kafka because it plays a critical part in our application and
> > > data
> > > > > pipeline. On the other side, 

Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-08 Thread Ali Nazemian
Thanks, Chris. So what is causing the consumer to get stuck is a side
effect of the built-in partition assignment in Kafka, and by overriding that
behaviour I should be able to address the long-running job issue, is that
right? Can you please elaborate more on this?

Regards,
Ali

On Fri, May 8, 2020 at 1:09 PM Chris Toomey  wrote:

> You really have to decide what behavior it is you want when one of your
> consumers gets "stuck". If you don't like the way the group protocol
> dynamically manages topic partition assignments or can't figure out an
> appropriate set of configuration settings that achieve your goal, you can
> always elect to not use the group protocol and instead manage topic
> partition assignment yourself. As I just replied to another post, there's a
> nice writeup of this under  "Manual Partition Assignment" in
>
> https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
>  .
>
> Chris
>
>
> On Thu, May 7, 2020 at 12:37 AM Ali Nazemian 
> wrote:
>
> > To help understanding my case in more details, the error I can see
> > constantly is the consumer losing heartbeat and hence apparently the
> group
> > get rebalanced based on the log I can see from Kafka side:
> >
> > GroupCoordinator 11]: Member
> > consumer-3-f46e14b4-5998-4083-b7ec-bed4e3f374eb in group foo has failed,
> > removing it from the group
> >
> > Thanks,
> > Ali
> >
> > On Thu, May 7, 2020 at 2:38 PM Ali Nazemian 
> wrote:
> >
> > > Hi,
> > >
> > > With the emerge of using Apache Kafka for event-driven architecture,
> one
> > > thing that has become important is how to tune apache Kafka consumer to
> > > manage long-running jobs. The main issue raises when we set a
> relatively
> > > large value for "max.poll.interval.ms". Setting this value will, of
> > > course, resolve the issue of repetitive rebalance, but creates another
> > > operational issue. I am looking for some sort of golden strategy to
> deal
> > > with long-running jobs with Apache Kafka.
> > >
> > > If the consumer hangs for whatever reason, there is no easy way of
> > passing
> > > that stage. It can easily block the pipeline, and you cannot do much
> > about
> > > it. Therefore, it came to my mind that I am probably missing something
> > > here. What are the expectations? Is it not valid to use Apache Kafka
> for
> > > long-live jobs? Are there any other parameters need to be set, and the
> > > issue of a consumer being stuck is caused by misconfiguration?
> > >
> > > I can see there are a lot of the same issues have been raised regarding
> > > "the consumer is stuck" and usually, the answer has been "yeah, that's
> > > because you have a long-running job, etc.". I have seen different
> > > suggestions:
> > >
> > > - Avoid using long-running jobs. Read the message, submit it into
> another
> > > thread and let the consumer to pass. Obviously this can cause data loss
> > and
> > > it would be a difficult problem to handle. It might be better to avoid
> > > using Kafka in the first place for these types of requests.
> > >
> > > - Avoid using apache Kafka for long-running requests
> > >
> > > - Workaround based approaches like if the consumer is blocked, try to
> use
> > > another consumer group and set the offset to the current value for the
> > new
> > > consumer group, etc.
> > >
> > > There might be other suggestions I have missed here, but that is not
> the
> > > point of this email. What I am looking for is what is the best practice
> > for
> > > dealing with long-running jobs with Apache Kafka. I cannot easily avoid
> > > using Kafka because it plays a critical part in our application and
> data
> > > pipeline. On the other side, we have had so many challenges to keep the
> > > long-running jobs stable operationally. So I would appreciate it if
> > someone
> > > can help me to understand what approach can be taken to deal with these
> > > jobs with Apache Kafka as a message broker.
> > >
> > > Thanks,
> > > Ali
> > >
> >
> >
> > --
> > A.Nazemian
> >
>


-- 
A.Nazemian


Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-07 Thread Chris Toomey
You really have to decide what behavior it is you want when one of your
consumers gets "stuck". If you don't like the way the group protocol
dynamically manages topic partition assignments or can't figure out an
appropriate set of configuration settings that achieve your goal, you can
always elect to not use the group protocol and instead manage topic
partition assignment yourself. As I just replied to another post, there's a
nice writeup of this under  "Manual Partition Assignment" in
https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
 .

Chris


On Thu, May 7, 2020 at 12:37 AM Ali Nazemian  wrote:

> To help understanding my case in more details, the error I can see
> constantly is the consumer losing heartbeat and hence apparently the group
> get rebalanced based on the log I can see from Kafka side:
>
> GroupCoordinator 11]: Member
> consumer-3-f46e14b4-5998-4083-b7ec-bed4e3f374eb in group foo has failed,
> removing it from the group
>
> Thanks,
> Ali
>
> On Thu, May 7, 2020 at 2:38 PM Ali Nazemian  wrote:
>
> > Hi,
> >
> > With the emerge of using Apache Kafka for event-driven architecture, one
> > thing that has become important is how to tune apache Kafka consumer to
> > manage long-running jobs. The main issue raises when we set a relatively
> > large value for "max.poll.interval.ms". Setting this value will, of
> > course, resolve the issue of repetitive rebalance, but creates another
> > operational issue. I am looking for some sort of golden strategy to deal
> > with long-running jobs with Apache Kafka.
> >
> > If the consumer hangs for whatever reason, there is no easy way of
> passing
> > that stage. It can easily block the pipeline, and you cannot do much
> about
> > it. Therefore, it came to my mind that I am probably missing something
> > here. What are the expectations? Is it not valid to use Apache Kafka for
> > long-live jobs? Are there any other parameters need to be set, and the
> > issue of a consumer being stuck is caused by misconfiguration?
> >
> > I can see there are a lot of the same issues have been raised regarding
> > "the consumer is stuck" and usually, the answer has been "yeah, that's
> > because you have a long-running job, etc.". I have seen different
> > suggestions:
> >
> > - Avoid using long-running jobs. Read the message, submit it into another
> > thread and let the consumer to pass. Obviously this can cause data loss
> and
> > it would be a difficult problem to handle. It might be better to avoid
> > using Kafka in the first place for these types of requests.
> >
> > - Avoid using apache Kafka for long-running requests
> >
> > - Workaround based approaches like if the consumer is blocked, try to use
> > another consumer group and set the offset to the current value for the
> new
> > consumer group, etc.
> >
> > There might be other suggestions I have missed here, but that is not the
> > point of this email. What I am looking for is what is the best practice
> for
> > dealing with long-running jobs with Apache Kafka. I cannot easily avoid
> > using Kafka because it plays a critical part in our application and data
> > pipeline. On the other side, we have had so many challenges to keep the
> > long-running jobs stable operationally. So I would appreciate it if
> someone
> > can help me to understand what approach can be taken to deal with these
> > jobs with Apache Kafka as a message broker.
> >
> > Thanks,
> > Ali
> >
>
>
> --
> A.Nazemian
>


Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-07 Thread Ali Nazemian
To help with understanding my case in more detail: the error I can see
constantly is the consumer losing its heartbeat, and hence apparently the
group gets rebalanced, based on the log I can see on the Kafka side:

GroupCoordinator 11]: Member
consumer-3-f46e14b4-5998-4083-b7ec-bed4e3f374eb in group foo has failed,
removing it from the group

Thanks,
Ali

On Thu, May 7, 2020 at 2:38 PM Ali Nazemian  wrote:

> Hi,
>
> With the emerge of using Apache Kafka for event-driven architecture, one
> thing that has become important is how to tune apache Kafka consumer to
> manage long-running jobs. The main issue raises when we set a relatively
> large value for "max.poll.interval.ms". Setting this value will, of
> course, resolve the issue of repetitive rebalance, but creates another
> operational issue. I am looking for some sort of golden strategy to deal
> with long-running jobs with Apache Kafka.
>
> If the consumer hangs for whatever reason, there is no easy way of passing
> that stage. It can easily block the pipeline, and you cannot do much about
> it. Therefore, it came to my mind that I am probably missing something
> here. What are the expectations? Is it not valid to use Apache Kafka for
> long-live jobs? Are there any other parameters need to be set, and the
> issue of a consumer being stuck is caused by misconfiguration?
>
> I can see there are a lot of the same issues have been raised regarding
> "the consumer is stuck" and usually, the answer has been "yeah, that's
> because you have a long-running job, etc.". I have seen different
> suggestions:
>
> - Avoid using long-running jobs. Read the message, submit it into another
> thread and let the consumer to pass. Obviously this can cause data loss and
> it would be a difficult problem to handle. It might be better to avoid
> using Kafka in the first place for these types of requests.
>
> - Avoid using apache Kafka for long-running requests
>
> - Workaround based approaches like if the consumer is blocked, try to use
> another consumer group and set the offset to the current value for the new
> consumer group, etc.
>
> There might be other suggestions I have missed here, but that is not the
> point of this email. What I am looking for is what is the best practice for
> dealing with long-running jobs with Apache Kafka. I cannot easily avoid
> using Kafka because it plays a critical part in our application and data
> pipeline. On the other side, we have had so many challenges to keep the
> long-running jobs stable operationally. So I would appreciate it if someone
> can help me to understand what approach can be taken to deal with these
> jobs with Apache Kafka as a message broker.
>
> Thanks,
> Ali
>


-- 
A.Nazemian
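For reference, these are the consumer settings usually involved in that
"Member ... has failed" rebalance loop, sketched as a config builder; the
broker address, the group name ("foo" as in the log above), and the specific
values are assumptions rather than recommendations from this thread:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LongJobConsumerConfigSketch {
    public static Properties build() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "foo");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // Fewer records per poll() means less work between polls, so the
        // max.poll.interval.ms budget is easier to stay inside.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "10");
        // Upper bound on the time between poll() calls before the member is
        // considered failed and its partitions are reassigned.
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "600000"); // 10 minutes
        // Heartbeats are sent from a background thread; session.timeout.ms only
        // covers a dead or hung process, not slow record processing.
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "10000");
        return props;
    }
}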


Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-06 Thread Ali Nazemian
Hi,

With the emergence of Apache Kafka for event-driven architecture, one
thing that has become important is how to tune the Apache Kafka consumer to
manage long-running jobs. The main issue arises when we set a relatively
large value for "max.poll.interval.ms". Setting this value will, of course,
resolve the issue of repetitive rebalances, but it creates another operational
issue. I am looking for some sort of golden strategy to deal with
long-running jobs with Apache Kafka.

If the consumer hangs for whatever reason, there is no easy way of passing
that stage. It can easily block the pipeline, and you cannot do much about
it. Therefore, it came to my mind that I am probably missing something
here. What are the expectations? Is it not valid to use Apache Kafka for
long-lived jobs? Are there any other parameters that need to be set, and is
the issue of a consumer being stuck caused by misconfiguration?

I can see that a lot of the same issues have been raised regarding
"the consumer is stuck", and usually the answer has been "yeah, that's
because you have a long-running job, etc.". I have seen different
suggestions:

- Avoid long-running jobs: read the message, submit it to another
thread and let the consumer move on. Obviously this can cause data loss and
it would be a difficult problem to handle. It might be better to avoid
using Kafka in the first place for these types of requests.

- Avoid using Apache Kafka for long-running requests

- Workaround-based approaches: if the consumer is blocked, try to use
another consumer group and set the offset to the current value for the new
consumer group, etc.

There might be other suggestions I have missed here, but that is not the
point of this email. What I am looking for is the best practice for
dealing with long-running jobs with Apache Kafka. I cannot easily avoid
using Kafka because it plays a critical part in our application and data
pipeline. On the other side, we have had so many challenges keeping the
long-running jobs operationally stable. So I would appreciate it if someone
could help me understand what approach can be taken to deal with these
jobs with Apache Kafka as a message broker.

Thanks,
Ali
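As a rough sketch of the first suggestion above (read the batch, hand it to
another thread, keep the consumer polling), one way to hedge against the
data-loss concern is to pause the partitions and commit only after the worker
finishes. The topic/group names and the process() method here are
hypothetical:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PauseResumeSketch {
    static void process(ConsumerRecords<String, String> batch) { /* long-running work */ }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "long-running-jobs");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        ExecutorService worker = Executors.newSingleThreadExecutor();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("jobs"));
            Future<?> inFlight = null;
            while (true) {
                // Keep calling poll() so the group still sees this member as alive.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                if (!records.isEmpty()) {
                    consumer.pause(consumer.assignment()); // stop fetching while the batch is processed
                    final ConsumerRecords<String, String> batch = records;
                    inFlight = worker.submit(() -> process(batch));
                }
                if (inFlight != null && inFlight.isDone()) {
                    consumer.commitSync();                 // commit only after the work is done
                    consumer.resume(consumer.assignment());
                    inFlight = null;
                }
            }
        }
    }
}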


Re: Best practices for compacting topics with tombstones

2019-07-18 Thread Omar Al-Safi
If I recall correctly, you can set 'delete.retention.ms' in the topic-level
configuration to control how long you want to retain the tombstones in the
topic. By default it is set to 86400000 ms (24 hours); you can set it lower
than this. Regarding the performance, I am not really sure why compaction
would cause a performance hit on your broker, but the questions would be how
much data you hold there, how often you have updates to your topic (records
with the same key), and how often you have tombstones for records
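For illustration, a minimal sketch of setting those topic-level options with a
reasonably recent AdminClient (2.3+); the topic name and the one-hour value
are assumptions, not from this thread:

import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class CompactionConfigSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic1 = new ConfigResource(ConfigResource.Type.TOPIC, "topic1");
            Collection<AlterConfigOp> ops = Arrays.asList(
                // compact the topic and keep tombstones for only one hour once they are eligible
                new AlterConfigOp(new ConfigEntry("cleanup.policy", "compact"), AlterConfigOp.OpType.SET),
                new AlterConfigOp(new ConfigEntry("delete.retention.ms", "3600000"), AlterConfigOp.OpType.SET));
            admin.incrementalAlterConfigs(Collections.singletonMap(topic1, ops)).all().get();
        }
    }
}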

On Wed, 17 Jul 2019 at 22:12, Chris Baumgartner <
chris.baumgart...@fujifilm.com> wrote:

> Hello,
>
> I'm wondering if anyone has advice on configuring compaction. Here is my
> scenario:
>
> A producer writes raw data to topic #1. A stream app reads the data from
> topic #1, processes it, writes the processed data to topic #2, and then
> writes a tombstone record to topic #1.
>
> So, I don't intend for data to be retained very long in topic #1.
>
> Are there any best practices for configuring compaction on topic #1 in this
> case? I don't want to keep the data around very long after it has been
> processed, but I also don't want to cause performance issues by compacting
> too often.
>
> Thanks.
>
> - Chris
>


Best practices for compacting topics with tombstones

2019-07-17 Thread Chris Baumgartner
Hello,

I'm wondering if anyone has advice on configuring compaction. Here is my
scenario:

A producer writes raw data to topic #1. A stream app reads the data from
topic #1, processes it, writes the processed data to topic #2, and then
writes a tombstone record to topic #1.

So, I don't intend for data to be retained very long in topic #1.

Are there any best practices for configuring compaction on topic #1 in this
case? I don't want to keep the data around very long after it has been
processed, but I also don't want to cause performance issues by compacting
too often.

Thanks.

- Chris



Re: Best practices

2018-05-21 Thread Matthias J. Sax
If you specify some bootstrap.servers, after connecting the producer
will learn about all available brokers automatically by fetching
cluster metadata from the first broker it connects to. Thus, in practice it
is usually sufficient to specify 3 to 5 brokers (in case one is down,
the producer can "bootstrap" itself by connecting to any other broker in
the list initially).

Also note that producers do not write to an arbitrary broker: for each
partition there is a dedicated leader, and the producer sends all writes
to the leader (as explained in the blog post I shared in my last reply
--- please read it:
https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/)

For best practices, it's recommended to configure producer retries via
the config parameter `retries` (the default is zero). In this case, if a write
fails, the producer will retry the write internally (potentially to a
different broker in case the leader changed). Only after all retries are
exhausted would you get a callback that indicates the failed write.


-Matthias
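A minimal producer-config sketch of the points above (a short bootstrap list,
internal retries, and waiting for replication via acks); the broker addresses
and values are assumptions:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerRetriesSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // 3 to 5 entries are enough; the rest of the cluster is learned from metadata.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092,broker3:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Retry failed writes internally (e.g. after a leader change) before failing the send.
        props.put(ProducerConfig.RETRIES_CONFIG, "5");
        // Consider a write successful only once it is replicated to all in-sync replicas.
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            producer.flush();
        }
    }
}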

On 5/21/18 5:40 AM, Pavel Sapozhnikov wrote:
> If a process failed to write a message into one broker what is the best
> practice to write to same topic on different broker? Is there one? I should
> be able to get a list of brokers programmatically from zk path /brokers/ids
> ?
> 
> On Sun, May 20, 2018, 3:21 PM Matthias J. Sax <matth...@confluent.io> wrote:
> 
>> You can register a callback for each sent record to learn about
>> successful write or fail:
>>
>>> producer.send(record, callback);
>>
>> For replication, you don't need to send twice. If the replication factor
>> is configured broker side, the broker take care of replication
>> automatically.
>>
>> You can also configure when you want to be informed about successful
>> write: before or after replication.
>>
>> This blog post should help:
>>
>> https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/
>>
>>
>>
>> -Matthias
>>
>> On 5/20/18 11:00 AM, Pavel Sapozhnikov wrote:
>>> Hello
>>>
>>> What are the best practices when it comes to publishing a message into
>>> kafka. When sending a message into Kafka is it possible to know if that
>>> message has successfully been published? If not, what is the best
>> practice
>>> to know when messages are not getting published?
>>>
>>> Second question.
>>>
>>> If I have two kafka brokers and very simplistic one kafka topic
>> replicated
>>> on both. Do I need to send to both brokers. What are the best practices
>> for
>>> that.
>>>
>>
>>
> 





Re: Best practices

2018-05-21 Thread Pavel Sapozhnikov
If a process fails to write a message to one broker, what is the best
practice for writing to the same topic on a different broker? Is there one? I
should be able to get a list of brokers programmatically from the ZK path
/brokers/ids?

On Sun, May 20, 2018, 3:21 PM Matthias J. Sax <matth...@confluent.io> wrote:

> You can register a callback for each sent record to learn about
> successful write or fail:
>
> > producer.send(record, callback);
>
> For replication, you don't need to send twice. If the replication factor
> is configured broker side, the broker take care of replication
> automatically.
>
> You can also configure when you want to be informed about successful
> write: before or after replication.
>
> This blog post should help:
>
> https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/
>
>
>
> -Matthias
>
> On 5/20/18 11:00 AM, Pavel Sapozhnikov wrote:
> > Hello
> >
> > What are the best practices when it comes to publishing a message into
> > kafka. When sending a message into Kafka is it possible to know if that
> > message has successfully been published? If not, what is the best
> practice
> > to know when messages are not getting published?
> >
> > Second question.
> >
> > If I have two kafka brokers and very simplistic one kafka topic
> replicated
> > on both. Do I need to send to both brokers. What are the best practices
> for
> > that.
> >
>
>


Re: Best practices

2018-05-20 Thread Matthias J. Sax
You can register a callback for each sent record to learn about a
successful write or a failure:

> producer.send(record, callback);

For replication, you don't need to send twice. If the replication factor
is configured broker-side, the brokers take care of replication
automatically.

You can also configure when you want to be informed about a successful
write: before or after replication.

This blog post should help:
https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/



-Matthias
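As a small sketch of that callback registration, using the standard Callback
interface; the topic name is an assumption:

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class SendCallbackSketch {
    // Reports, per record, whether the write was acknowledged or ultimately failed.
    static final Callback LOGGING_CALLBACK = new Callback() {
        @Override
        public void onCompletion(RecordMetadata metadata, Exception exception) {
            if (exception != null) {
                System.err.println("Publish failed: " + exception.getMessage());
            } else {
                System.out.println("Published to " + metadata.topic() + "-" + metadata.partition()
                        + " at offset " + metadata.offset());
            }
        }
    };

    static void sendOne(KafkaProducer<String, String> producer, String value) {
        producer.send(new ProducerRecord<>("my-topic", value), LOGGING_CALLBACK);
    }
}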

On 5/20/18 11:00 AM, Pavel Sapozhnikov wrote:
> Hello
> 
> What are the best practices when it comes to publishing a message into
> kafka. When sending a message into Kafka is it possible to know if that
> message has successfully been published? If not, what is the best practice
> to know when messages are not getting published?
> 
> Second question.
> 
> If I have two kafka brokers and very simplistic one kafka topic replicated
> on both. Do I need to send to both brokers. What are the best practices for
> that.
> 





Best practices

2018-05-20 Thread Pavel Sapozhnikov
Hello

What are the best practices when it comes to publishing a message into
Kafka? When sending a message into Kafka, is it possible to know whether that
message has successfully been published? If not, what is the best practice
for knowing when messages are not getting published?

Second question.

If I have two Kafka brokers and, very simply, one Kafka topic replicated
on both, do I need to send to both brokers? What are the best practices for
that?


Re: Best practices Partition Key

2018-01-25 Thread Maria Pilar
Yes, I'm capturing different events from the same entity/resource (create,
update and delete); for that reason I've chosen that option. However, my
question is whether I can improve my solution, if I want to use Kafka as a
datastore, by including the Cassandra partition key of each entity as the
Kafka partition key.

On 25 January 2018 at 16:02, Dmitry Minkovsky  wrote:

> > one entity - one topic, because I need to ensure the properly ordering in
> the events.
>
> This is a great in insight. I discovered that keeping entity-related things
> on one topic is much easier than splitting entity-related things onto
> multiple topics. If you have one topic, replaying that topic is trivial. If
> you have multiple topics, replaying those topics requires careful
> synchronization. In my case, I am doing event capture and I have
> entity-related events on multiple topics. For example, for a user entity I
> have topics `join-requests` and `settings-update-requests`. Having separate
> topics is superficially nicer in terms of consuming them with Kafka
> Streams: you can set up topic-specific serdes. But the benefit you get from
> this is dwarfed by the complexity of then having to synchronize these two
> streams if you want to replay them. Your situation seems simpler though
> because you are not even doing event capture, but just logging complete
> entities out of Cassandra.
>
> > If I will use kafka like a datastore and search throgh the records,
>
> Interactive Queries API makes this very nice.
>
> On Thu, Jan 25, 2018 at 8:47 AM, Maria Pilar  wrote:
>
> > Hi everyone,
> >
> > I´m trying to understand the best practice to define the partition key. I
> > have defined some topics that they are related with entities in cassandra
> > data model, the relationship is one-to-one, one entity - one topic,
> because
> > I need to ensure the properly ordering in the events. I have created one
> > partition for each topic to ensure it as well.
> >
> > If I will use kafka like a datastore and search throgh the records, I
> know
> > that could be a best practice use the partition key of Cassandra (e.g
> > Customer ID) as a partition key in kafka
> >
> > any comment please ??
> >
> > thanks
> >
>


Re: Best practices Partition Key

2018-01-25 Thread Dmitry Minkovsky
> I know that could be a best practice use the partition key of Cassandra
(e.g Customer ID) as a partition key in kafka

Yeah, the Kafka producer will hash that key with murmur2, so all entities
coming out of Cassandra with the same partition key will end up on the same
Kafka partition. Then you can use Kafka Streams Interactive Queries to get
the data.
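For illustration, a keyed send along those lines; the topic name and customer
id are hypothetical, and the default partitioner's murmur2 hash of the key
decides the partition:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String customerId = "customer-42"; // same Cassandra partition key for every event of this entity
            // All records with the same key hash to the same partition, so the
            // per-entity ordering of create/update/delete events is preserved.
            producer.send(new ProducerRecord<>("customer-events", customerId, "{\"type\":\"update\"}"));
            producer.send(new ProducerRecord<>("customer-events", customerId, "{\"type\":\"delete\"}"));
        }
    }
}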

On Thu, Jan 25, 2018 at 10:02 AM, Dmitry Minkovsky 
wrote:

> > one entity - one topic, because I need to ensure the properly ordering
> in the events.
>
> This is a great in insight. I discovered that keeping entity-related
> things on one topic is much easier than splitting entity-related things
> onto multiple topics. If you have one topic, replaying that topic is
> trivial. If you have multiple topics, replaying those topics requires
> careful synchronization. In my case, I am doing event capture and I have
> entity-related events on multiple topics. For example, for a user entity I
> have topics `join-requests` and `settings-update-requests`. Having separate
> topics is superficially nicer in terms of consuming them with Kafka
> Streams: you can set up topic-specific serdes. But the benefit you get from
> this is dwarfed by the complexity of then having to synchronize these two
> streams if you want to replay them. Your situation seems simpler though
> because you are not even doing event capture, but just logging complete
> entities out of Cassandra.
>
> > If I will use kafka like a datastore and search throgh the records,
>
> Interactive Queries API makes this very nice.
>
> On Thu, Jan 25, 2018 at 8:47 AM, Maria Pilar  wrote:
>
>> Hi everyone,
>>
>> I´m trying to understand the best practice to define the partition key. I
>> have defined some topics that they are related with entities in cassandra
>> data model, the relationship is one-to-one, one entity - one topic,
>> because
>> I need to ensure the properly ordering in the events. I have created one
>> partition for each topic to ensure it as well.
>>
>> If I will use kafka like a datastore and search throgh the records, I know
>> that could be a best practice use the partition key of Cassandra (e.g
>> Customer ID) as a partition key in kafka
>>
>> any comment please ??
>>
>> thanks
>>
>
>


Re: Best practices Partition Key

2018-01-25 Thread Dmitry Minkovsky
> one entity - one topic, because I need to ensure the properly ordering in
the events.

This is a great insight. I discovered that keeping entity-related things
on one topic is much easier than splitting entity-related things onto
multiple topics. If you have one topic, replaying that topic is trivial. If
you have multiple topics, replaying those topics requires careful
synchronization. In my case, I am doing event capture and I have
entity-related events on multiple topics. For example, for a user entity I
have topics `join-requests` and `settings-update-requests`. Having separate
topics is superficially nicer in terms of consuming them with Kafka
Streams: you can set up topic-specific serdes. But the benefit you get from
this is dwarfed by the complexity of then having to synchronize these two
streams if you want to replay them. Your situation seems simpler though
because you are not even doing event capture, but just logging complete
entities out of Cassandra.

> If I will use kafka like a datastore and search throgh the records,

Interactive Queries API makes this very nice.

On Thu, Jan 25, 2018 at 8:47 AM, Maria Pilar  wrote:

> Hi everyone,
>
> I´m trying to understand the best practice to define the partition key. I
> have defined some topics that they are related with entities in cassandra
> data model, the relationship is one-to-one, one entity - one topic, because
> I need to ensure the properly ordering in the events. I have created one
> partition for each topic to ensure it as well.
>
> If I will use kafka like a datastore and search throgh the records, I know
> that could be a best practice use the partition key of Cassandra (e.g
> Customer ID) as a partition key in kafka
>
> any comment please ??
>
> thanks
>


Best practices Partition Key

2018-01-25 Thread Maria Pilar
Hi everyone,

I'm trying to understand the best practice for defining the partition key. I
have defined some topics that are related to entities in the Cassandra data
model; the relationship is one-to-one, one entity - one topic, because I need
to ensure proper ordering of the events. I have created one partition for
each topic to ensure it as well.

If I use Kafka as a datastore and search through the records, I know it could
be a best practice to use the Cassandra partition key (e.g. Customer ID) as
the partition key in Kafka.

any comment please ??

thanks


Re: best practices for replication factor / partitions __consumer_offsets

2018-01-24 Thread ??????
We have 3 brokers and set partitions=50 and replication factor=3 for __consumer_offsets.


 
---Original---
From: "Dennis"<daoden...@gmail.com>
Date: 2018/1/24 14:05:53
To: "users"<users@kafka.apache.org>;
Subject: best practices for replication factor / partitions __consumer_offsets


Hi,

Are there any best practices or how to size __consumer_offsets and the
associated replication factor?

Regards,

Dennis O.

best practices for replication factor / partitions __consumer_offsets

2018-01-23 Thread Dennis
Hi,

Are there any best practices or how to size __consumer_offsets and the
associated replication factor?

Regards,

Dennis O.


Re: Best practices - Using kafka (with http server) as source-of-truth

2017-10-30 Thread adslqa cet
can you share your results ?


Re: Kafka streams store migration - best practices

2017-08-01 Thread Damian Guy
No, you don't need to set a listener. I was just mentioning it as an option
if you want to know when the metadata needs refreshing.
On Tue, 1 Aug 2017 at 13:25 Debasish Ghosh <ghosh.debas...@gmail.com> wrote:

> Regarding the last point, do I need to set up the listener ?
>
> All I want is to do a query from the store. For that I need to invoke 
> streams.store()
> first, which can potentially throw an InvalidStateStoreException during
> rebalancing / migration of stores. If I call streams.store() with retries
> till the rebalancing is done or I exceed some max retry count, then I think
> I should good.
>
> Or am I missing something ?
>
> regards.
>
> On Tue, Aug 1, 2017 at 1:10 PM, Damian Guy <damian@gmail.com> wrote:
>
>> Hi,
>>
>> On Tue, 1 Aug 2017 at 08:34 Debasish Ghosh <ghosh.debas...@gmail.com>
>> wrote:
>>
>>> Hi -
>>>
>>> I have a Kafka Streams application that needs to run on multiple
>>> instances.
>>> It fetches metadata from all local stores and has an http query layer for
>>> interactive queries. In some cases when I have new instances deployed,
>>> store migration takes place making the current metadata invalid. Here are
>>> my questions regarding some of the best practices to be followed to
>>> handle
>>> this issue of store migration -
>>>
>>>- When the migration is in process, a query for the metadata may
>>> result
>>>in InvalidStateStoreException - is it a good practice to always have a
>>>retry semantics based query for the metadata ?
>>>
>>
>> Yes. Whenever the application is rebalancing the stores will be
>> unavailable, so retrying is the right thing to do.
>>
>>
>>>- Should I check KafkaStreams.state() and only assume that I have got
>>>the correct metadata when the state() call returns Running. If it
>>>returns Rebalancing, then I should re-query. Is this correct approach
>>> ?
>>>
>>
>> Correct again! If the state is rebalancing, then the metadata (for some
>> stores at least) is going to change, so you should get it again. You can
>> set a StateListener on the KafkaStreams instance to listen to these events.
>>
>>
>>>
>>> regards.
>>>
>>> --
>>> Debasish Ghosh
>>> http://manning.com/ghosh2
>>> http://manning.com/ghosh
>>>
>>> Twttr: @debasishg
>>> Blog: http://debasishg.blogspot.com
>>> Code: http://github.com/debasishg
>>>
>>
>
>
> --
> Debasish Ghosh
> http://manning.com/ghosh2
> http://manning.com/ghosh
>
> Twttr: @debasishg
> Blog: http://debasishg.blogspot.com
> Code: http://github.com/debasishg
>


Re: Kafka streams store migration - best practices

2017-08-01 Thread Damian Guy
Hi,

On Tue, 1 Aug 2017 at 08:34 Debasish Ghosh <ghosh.debas...@gmail.com> wrote:

> Hi -
>
> I have a Kafka Streams application that needs to run on multiple instances.
> It fetches metadata from all local stores and has an http query layer for
> interactive queries. In some cases when I have new instances deployed,
> store migration takes place making the current metadata invalid. Here are
> my questions regarding some of the best practices to be followed to handle
> this issue of store migration -
>
>- When the migration is in process, a query for the metadata may result
>in InvalidStateStoreException - is it a good practice to always have a
>retry semantics based query for the metadata ?
>

Yes. Whenever the application is rebalancing the stores will be
unavailable, so retrying is the right thing to do.


>- Should I check KafkaStreams.state() and only assume that I have got
>the correct metadata when the state() call returns Running. If it
>returns Rebalancing, then I should re-query. Is this correct approach ?
>

Correct again! If the state is rebalancing, then the metadata (for some
stores at least) is going to change, so you should get it again. You can
set a StateListener on the KafkaStreams instance to listen to these events.


>
> regards.
>
> --
> Debasish Ghosh
> http://manning.com/ghosh2
> http://manning.com/ghosh
>
> Twttr: @debasishg
> Blog: http://debasishg.blogspot.com
> Code: http://github.com/debasishg
>


Kafka streams store migration - best practices

2017-08-01 Thread Debasish Ghosh
Hi -

I have a Kafka Streams application that needs to run on multiple instances.
It fetches metadata from all local stores and has an HTTP query layer for
interactive queries. In some cases, when I have new instances deployed,
store migration takes place, making the current metadata invalid. Here are
my questions regarding some of the best practices to be followed to handle
this issue of store migration -

   - When the migration is in progress, a query for the metadata may result
   in InvalidStateStoreException - is it a good practice to always have
   retry semantics for the metadata query?
   - Should I check KafkaStreams.state() and only assume that I have got
   the correct metadata when the state() call returns Running? If it
   returns Rebalancing, then I should re-query. Is this the correct approach?

regards.

-- 
Debasish Ghosh
http://manning.com/ghosh2
http://manning.com/ghosh

Twttr: @debasishg
Blog: http://debasishg.blogspot.com
Code: http://github.com/debasishg
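A minimal sketch of both points (retrying streams.store() on
InvalidStateStoreException and registering a StateListener); the store name
and retry bound are assumptions, and the two-argument store() overload shown
is the pre-2.5 API matching this thread's timeframe:

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.errors.InvalidStateStoreException;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class StoreQuerySketch {
    // Retry until the store is queryable again after a rebalance / store migration.
    static ReadOnlyKeyValueStore<String, Long> waitForStore(KafkaStreams streams) throws InterruptedException {
        for (int attempt = 0; attempt < 30; attempt++) {
            try {
                return streams.store("my-store", QueryableStoreTypes.<String, Long>keyValueStore());
            } catch (InvalidStateStoreException retriable) {
                Thread.sleep(500); // store is migrating or the instance is rebalancing
            }
        }
        throw new IllegalStateException("Store never became queryable");
    }

    static void installListener(KafkaStreams streams) {
        streams.setStateListener((newState, oldState) -> {
            if (newState == KafkaStreams.State.REBALANCING) {
                // metadata (and store locations) are about to change; refresh once RUNNING again
            }
        });
    }
}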


Re: Re-Balancing Kafka topics - Best practices

2017-06-15 Thread Abhimanyu Nagrath
Hi Todd,
I am not able to access the link
https://github.com/linkedin/kafka-toolskafka-assigner



Regards,
Abhimanyu

On Fri, Jun 16, 2017 at 12:26 AM, karan alang <karan.al...@gmail.com> wrote:

> Thanks Todd.. for the detailed reply.
>
> regds,
> Karan Alang
>
> On Tue, Jun 13, 2017 at 3:19 PM, Todd Palino <tpal...@gmail.com> wrote:
>
> > A few things here…
> >
> > 1) auto.leader.rebalance.enable can have serious performance impacts on
> > larger clusters. It’s currently in need of some development work to
> enable
> > it to batch leader elections into smaller groups and back off between
> them,
> > as well as have a better backoff after broker startup. I don’t recommend
> > using it.
> >
> > 2) auto.leader.rebalance.enable is not going to get you what you’re
> looking
> > for. It only changes the leader for a partition to the “optimal” leader
> (I
> > put that in quotes because it’s a pretty dumb algorithm. It’s whichever
> > replica is listed first). It does not move partitions around to assure
> you
> > have a balance of traffic across the cluster.
> >
> > If you want to rebalance partitions, you have a couple options right now:
> > 1) Run kafka-reassign-partitions.sh. It will move all of the partitions
> > around and try and assure an even count on each broker. It does not
> balance
> > traffic, however, (if you have a really busy partition and a really slow
> > partition, it considers them equal).
> > 2) Use an external tool like https://github.com/linkedin/kafka-tools
> > kafka-assigner. This is a script we developed at LinkedIn for doing
> > operations that involve moving partitions around and provides a number of
> > different ways to rebalance traffic.
> >
> > There are other tools available for doing this, but right now it requires
> > something external to the Apache Kafka core.
> >
> > -Todd
> >
> >
> > On Tue, Jun 13, 2017 at 5:30 PM, karan alang <karan.al...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > Fpr Re-balancing Kafka partitions, we can set property ->
> > >
> > >
> > > *auto.leader.rebalance.enable = true in server.properties file.*
> > >
> > > *Is that the recommended way or is it better to reBalance the kafka
> > > partitions manually ?(using *scripts - *kafka-preferred-replica-
> > > election.sh,
> > > *
> > >
> > > *kafka-reassign-partition.sh)*
> > > *One of the blogs mentioned that - it is preferable to Re-balance Kafka
> > > topics manually, since setting   *
> > >
> > > *auto.leader.rebalance.enable = true causes issues.*
> > >
> > > Pls let me know.
> > > Any other best practices wrt. Re-balancing kafka topics ?
> > >
> > > thanks!
> > >
> >
> >
> >
> > --
> > *Todd Palino*
> > Senior Staff Engineer, Site Reliability
> > Data Infrastructure Streaming
> >
> >
> >
> > linkedin.com/in/toddpalino
> >
>


Re: Re-Balancing Kafka topics - Best practices

2017-06-15 Thread karan alang
Thanks Todd.. for the detailed reply.

regds,
Karan Alang

On Tue, Jun 13, 2017 at 3:19 PM, Todd Palino <tpal...@gmail.com> wrote:

> A few things here…
>
> 1) auto.leader.rebalance.enable can have serious performance impacts on
> larger clusters. It’s currently in need of some development work to enable
> it to batch leader elections into smaller groups and back off between them,
> as well as have a better backoff after broker startup. I don’t recommend
> using it.
>
> 2) auto.leader.rebalance.enable is not going to get you what you’re looking
> for. It only changes the leader for a partition to the “optimal” leader (I
> put that in quotes because it’s a pretty dumb algorithm. It’s whichever
> replica is listed first). It does not move partitions around to assure you
> have a balance of traffic across the cluster.
>
> If you want to rebalance partitions, you have a couple options right now:
> 1) Run kafka-reassign-partitions.sh. It will move all of the partitions
> around and try and assure an even count on each broker. It does not balance
> traffic, however, (if you have a really busy partition and a really slow
> partition, it considers them equal).
> 2) Use an external tool like https://github.com/linkedin/kafka-tools
> kafka-assigner. This is a script we developed at LinkedIn for doing
> operations that involve moving partitions around and provides a number of
> different ways to rebalance traffic.
>
> There are other tools available for doing this, but right now it requires
> something external to the Apache Kafka core.
>
> -Todd
>
>
> On Tue, Jun 13, 2017 at 5:30 PM, karan alang <karan.al...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > Fpr Re-balancing Kafka partitions, we can set property ->
> >
> >
> > *auto.leader.rebalance.enable = true in server.properties file.*
> >
> > *Is that the recommended way or is it better to reBalance the kafka
> > partitions manually ?(using *scripts - *kafka-preferred-replica-
> > election.sh,
> > *
> >
> > *kafka-reassign-partition.sh)*
> > *One of the blogs mentioned that - it is preferable to Re-balance Kafka
> > topics manually, since setting   *
> >
> > *auto.leader.rebalance.enable = true causes issues.*
> >
> > Pls let me know.
> > Any other best practices wrt. Re-balancing kafka topics ?
> >
> > thanks!
> >
>
>
>
> --
> *Todd Palino*
> Senior Staff Engineer, Site Reliability
> Data Infrastructure Streaming
>
>
>
> linkedin.com/in/toddpalino
>


Re: Re-Balancing Kafka topics - Best practices

2017-06-13 Thread Todd Palino
A few things here…

1) auto.leader.rebalance.enable can have serious performance impacts on
larger clusters. It’s currently in need of some development work to enable
it to batch leader elections into smaller groups and back off between them,
as well as have a better backoff after broker startup. I don’t recommend
using it.

2) auto.leader.rebalance.enable is not going to get you what you’re looking
for. It only changes the leader for a partition to the “optimal” leader (I
put that in quotes because it’s a pretty dumb algorithm. It’s whichever
replica is listed first). It does not move partitions around to assure you
have a balance of traffic across the cluster.

If you want to rebalance partitions, you have a couple options right now:
1) Run kafka-reassign-partitions.sh. It will move all of the partitions
around and try and assure an even count on each broker. It does not balance
traffic, however, (if you have a really busy partition and a really slow
partition, it considers them equal).
2) Use an external tool like https://github.com/linkedin/kafka-tools
kafka-assigner. This is a script we developed at LinkedIn for doing
operations that involve moving partitions around and provides a number of
different ways to rebalance traffic.

There are other tools available for doing this, but right now it requires
something external to the Apache Kafka core.

-Todd


On Tue, Jun 13, 2017 at 5:30 PM, karan alang <karan.al...@gmail.com> wrote:

> Hi All,
>
> Fpr Re-balancing Kafka partitions, we can set property ->
>
>
> *auto.leader.rebalance.enable = true in server.properties file.*
>
> *Is that the recommended way or is it better to reBalance the kafka
> partitions manually ?(using *scripts - *kafka-preferred-replica-
> election.sh,
> *
>
> *kafka-reassign-partition.sh)*
> *One of the blogs mentioned that - it is preferable to Re-balance Kafka
> topics manually, since setting   *
>
> *auto.leader.rebalance.enable = true causes issues.*
>
> Pls let me know.
> Any other best practices wrt. Re-balancing kafka topics ?
>
> thanks!
>



-- 
*Todd Palino*
Senior Staff Engineer, Site Reliability
Data Infrastructure Streaming



linkedin.com/in/toddpalino


Re-Balancing Kafka topics - Best practices

2017-06-13 Thread karan alang
Hi All,

For re-balancing Kafka partitions, we can set the property
auto.leader.rebalance.enable = true in the server.properties file.

Is that the recommended way, or is it better to rebalance the Kafka
partitions manually (using the scripts kafka-preferred-replica-election.sh
and kafka-reassign-partitions.sh)?

One of the blogs mentioned that it is preferable to re-balance Kafka topics
manually, since setting auto.leader.rebalance.enable = true causes issues.

Please let me know.
Any other best practices w.r.t. re-balancing Kafka topics?

thanks!


Re: best practices to handle large messages

2017-04-06 Thread Vincent Dautremont
Hi,
you might be interested in this presentation
https://www.slideshare.net/JiangjieQin/handle-large-messages-in-apache-kafka-58692297

On Wed, Apr 5, 2017 at 1:27 AM, Mohammad Kargar <mkar...@phemi.com> wrote:

> What are best practices to handle large messages (2.5 MB) in Kafka?
>
> Thanks,
> Mohammad
>



best practices to handle large messages

2017-04-04 Thread Mohammad Kargar
What are best practices to handle large messages (2.5 MB) in Kafka?

Thanks,
Mohammad
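For reference, the client-side size settings usually involved when messages
reach a few MB, sketched below; the values and broker address are
assumptions, and the matching broker/topic settings (message.max.bytes /
max.message.bytes) have to be raised as well:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;

public class LargeMessageConfigSketch {
    static Properties producerProps() {
        Properties p = new Properties();
        p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        p.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, Integer.toString(5 * 1024 * 1024)); // allow ~5 MB requests
        p.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4"); // shrink large payloads on the wire
        return p;
    }

    static Properties consumerProps() {
        Properties c = new Properties();
        c.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        c.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, Integer.toString(5 * 1024 * 1024)); // must fit the largest message
        return c;
    }
}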


Brokers / Best practices to set log.flush.interval.*

2016-09-04 Thread Florian Hussonnois
Hi all,

I would like to know how to configure the following parameters:

log.flush.interval.messages
log.flush.interval.ms
log.flush.scheduler.interval.ms

The Kafka 0.8.x documentation indicates that it is not recommended to set
these parameters, as this can have a major impact on performance.

But since Kafka 0.9.x/0.10.x these cautions are no longer indicated.

In addition, the LinkedIn configuration given on the Apache Kafka website is:
log.flush.interval.ms=1
log.flush.interval.messages=2
log.flush.scheduler.interval.ms=2000

When do we need to set these parameters? What can be the impact if we use
the default settings with a large amount of memory (page cache)?

Thanks,

-- 
Florian


Consumer best practices

2016-05-05 Thread Spico Florin
Hello!
  We are using Kafka 0.9.1. We have created a class CustomKafkaConsumer
whose method receive has the pseudocode below:

public OurClassStructure[] receiveFromKafka() {
    // get the messages from the topic
    ConsumerRecords received = kafkaConsumer.poll(timeout); // org.apache.kafka.clients.consumer.KafkaConsumer

    // transform received into our OurClassStructure[]
}


We would like to use this CustomKafkaConsumer for our needs:
1. get the Kafka messages from a long-lived thread (such as a Storm spout in
the nextTuple method)
2. get the Kafka messages from a Java scheduled thread pool executor at a
one-second frequency

scheduledThreadPoolExecutor.scheduleAtFixedRate(
    () -> {
        OurClassStructure[] km = customKafkaConsumer.receiveFromKafka();
        // doSomething with km
    },
    initialDelay, 1, TimeUnit.SECONDS);


I've seen that the recommended way (in the book Kafka: The Definitive Guide,
chapter 4, and also in this post
http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client)
to consume from Kafka is to have an infinite loop and poll the messages from
that loop.

My questions are:

1. Is our approach with the scheduled consumption a good one?
2. If not, what are the caveats/gotchas of our approach?
3. Should we adhere to the recommended way to consume the messages from
Kafka?

I look forward to your answers.
Regards,
 Florin


Re: Best practices for producer

2015-11-06 Thread Li Tao
For the producer, there is no need to know the ZooKeeper servers to produce
messages.

For the consumer, it used to be necessary to connect to ZooKeeper to consume
messages, because previously the offset values were stored in ZooKeeper, and
the consumer needed to know the offsets to read messages from the Kafka
brokers. The newest version of the consumer has no such limitation; the
offsets are stored in the Kafka brokers.

On Sat, Oct 31, 2015 at 2:10 AM, Gmail - Jai  wrote:

> Hello,
>
> I am working/learning on kafka for producing messages and had a little
> doubt about kafka producer configuration.
> Currently I am using kafka 0.8.0 version and using zk.connect property
> which is not specified in documentation for .8 version but still works on
> this version.
> There is a property broker.list where we can specify broker host and port
> to connect to the brokers.
>
> My question is which way is the best practice to connect to kafka cluster
> with multiple brokers?
> Why was zk.connect dropped?
>
> Appreciate your help!
>
> Thanks
> Jai
>
>


Best practices for producer

2015-10-30 Thread Gmail - Jai
Hello,

I am working on/learning Kafka for producing messages and had a little doubt
about the Kafka producer configuration.
Currently I am using the Kafka 0.8.0 version and the zk.connect property, which
is not specified in the documentation for the 0.8 version but still works with
this version.
There is a property broker.list where we can specify the broker hosts and ports
to connect to the brokers.

My question is: which way is the best practice to connect to a Kafka cluster
with multiple brokers?
Why was zk.connect dropped?

Appreciate your help!

Thanks
Jai   



Kafka Best Practices in AWS

2015-10-26 Thread Jennifer Fountain
We installed four nodes (r3.xlarge) with approx. 100 topics.  Some topics
have a replication factor of 4, some have 3 and some have 2. Each node is
running at about 90% CPU usage and we are seeing performance issues with
message consumption.

Would anyone have any words of wisdom or recommendations for AWS?  Should
we add more nodes? What should the replication factor be for x amount of
nodes?

Thank you!

-- 


Jennifer Fountain
DevOPS


Re: Kafka Best Practices in AWS

2015-10-26 Thread Steve Brandon
Depending on the consumer, I tend to work with c4 types and have found
better success, although we run about 200+ consumers per server on 16-core
servers and see moderate load.  YMMV
On Oct 26, 2015 8:32 AM, "Jennifer Fountain"  wrote:

> We installed four nodes (r3.xlarge) with approx 100 topics.  Some topics
> have a replication factor of 4, some have 3 and some have 2. Each node is
> running about 90% cpu usage and we are seeing performance issues with
> messages consumption.
>
> Would anyone have any words of wisdom or recommendations for AWS?  Should
> we add more nodes? What should be the replication factor for x amount of
> nodes?
>
> Thank you!
>
> --
>
>
> Jennifer Fountain
> DevOPS
>


Re: Best practices - Using kafka (with http server) as source-of-truth

2015-07-30 Thread Prabhjot Bharaj
Hi Ewen,

Thanks for your response. I'll experiment and benchmark it with the normal
proxy and Nginx as well and update the results.

Regards,
prabcs

On Mon, Jul 27, 2015 at 11:10 PM, Ewen Cheslack-Postava e...@confluent.io
wrote:

 Hi Prabhjot,

 Confluent has a REST proxy with docs that may give some guidance:
 http://docs.confluent.io/1.0/kafka-rest/docs/intro.html The new producer
 that it uses is very efficient, so you should be able to get pretty good
 throughput. You take a bit of a hit due to the overhead of sending data
 through a proxy, but with appropriate batching you can get about 2/3 the
 performance as you would get using the Java producer directly.

 There are also a few other proxies you can find here:
 https://cwiki.apache.org/confluence/display/KAFKA/Clients#Clients-HTTPREST

 You can also put nginx (or HAProxy, or a variety of other solutions) in
 front of REST proxies for load balancing, HA, SSL termination, etc. This is
 yet another hop, so it might affect throughput and latency.

 -Ewen

 On Mon, Jul 27, 2015 at 6:55 AM, Prabhjot Bharaj prabhbha...@gmail.com
 wrote:

  Hi Folks,
 
  I would like to understand the best practices when using kafka as the
  source-of-truth, given the fact that I want to pump in data to Kafka
 using
  http methods.
 
  What are the current production configurations for such a use case:-
 
  1. Kafka-http-client - is it scalable the way Nginx is ??
  2. Using Kafka and Nginx together - If anybody has used this, please
  explain
  3. Any other scalable method ?
 
  Regards,
  prabcs
 



 --
 Thanks,
 Ewen




-- 
-
There are only 10 types of people in the world: Those who understand
binary, and those who don't


Best practices - Using kafka (with http server) as source-of-truth

2015-07-27 Thread Prabhjot Bharaj
Hi Folks,

I would like to understand the best practices when using Kafka as the
source of truth, given that I want to pump data into Kafka using HTTP
methods.

What are the current production configurations for such a use case:

1. Kafka HTTP client - is it scalable the way Nginx is?
2. Using Kafka and Nginx together - if anybody has used this, please explain
3. Any other scalable method?

Regards,
prabcs


Re: Best practices - Using kafka (with http server) as source-of-truth

2015-07-27 Thread Ewen Cheslack-Postava
Hi Prabhjot,

Confluent has a REST proxy with docs that may give some guidance:
http://docs.confluent.io/1.0/kafka-rest/docs/intro.html The new producer
that it uses is very efficient, so you should be able to get pretty good
throughput. You take a bit of a hit due to the overhead of sending data
through a proxy, but with appropriate batching you can get about 2/3 the
performance as you would get using the Java producer directly.

There are also a few other proxies you can find here:
https://cwiki.apache.org/confluence/display/KAFKA/Clients#Clients-HTTPREST

You can also put nginx (or HAProxy, or a variety of other solutions) in
front of REST proxies for load balancing, HA, SSL termination, etc. This is
yet another hop, so it might affect throughput and latency.

-Ewen

On Mon, Jul 27, 2015 at 6:55 AM, Prabhjot Bharaj prabhbha...@gmail.com
wrote:

 Hi Folks,

 I would like to understand the best practices when using kafka as the
 source-of-truth, given the fact that I want to pump in data to Kafka using
 http methods.

 What are the current production configurations for such a use case:-

 1. Kafka-http-client - is it scalable the way Nginx is ??
 2. Using Kafka and Nginx together - If anybody has used this, please
 explain
 3. Any other scalable method ?

 Regards,
 prabcs




-- 
Thanks,
Ewen


Re: Best Practices for Java Consumers

2015-06-24 Thread Jeff Holoman
+1 On this idea.

On Tue, Jun 23, 2015 at 5:55 PM, Gwen Shapira gshap...@cloudera.com wrote:

 I don't know of any such resource, but I'll be happy to help
 contribute from my experience.
 I'm sure others would too.

 Do you want to start one?

 Gwen

 On Tue, Jun 23, 2015 at 2:03 PM, Tom McKenzie thomaswmcken...@gmail.com
 wrote:
  Hello
 
  Is there a good reference for best practices on running Java consumers?
  I'm thinking a FAQ format.
 
 - How should we run them?  We are currently running them in Tomcat on
 Ubuntu, are there other approaches using services?  Maybe the service
 wrapper http://wrapper.tanukisoftware.com/doc/english/download.jsp?
 - How should we monitor them?  I going to try
 https://github.com/stealthly/metrics-kafka
 - How do we scale?  I'm guessing we run more servers but we could also
 run more threads?  What are the tradeoffs?
 - Links to code examples
 - How do we make consumers robust?  i.e. Best practices for exceptions
 handling.  We have noticed if our code has a NPE it's stops consuming.
 
  Also if this doesn't exist I would be willing to start this document.  I
  would think it would be near this page
 
 https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
 
  Thanks,
  Tom




-- 
Jeff Holoman
Systems Engineer


Best Practices for Java Consumers

2015-06-23 Thread Tom McKenzie
Hello

Is there a good reference for best practices on running Java consumers?
I'm thinking a FAQ format.

   - How should we run them?  We are currently running them in Tomcat on
   Ubuntu; are there other approaches using services?  Maybe the service
   wrapper http://wrapper.tanukisoftware.com/doc/english/download.jsp?
   - How should we monitor them?  I'm going to try
   https://github.com/stealthly/metrics-kafka
   - How do we scale?  I'm guessing we run more servers, but we could also
   run more threads?  What are the tradeoffs?
   - Links to code examples
   - How do we make consumers robust?  i.e. best practices for exception
   handling.  We have noticed that if our code hits an NPE it stops consuming.

Also if this doesn't exist I would be willing to start this document.  I
would think it would be near this page
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example

Thanks,
Tom
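On the last bullet (an NPE stopping consumption), a common pattern is to keep
the poll loop alive and isolate per-record failures; a rough sketch with the
new Java consumer, where the topic, group, and handle() method are
hypothetical:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class RobustConsumerLoopSketch {
    static void handle(ConsumerRecord<String, String> record) { /* application logic */ }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "java-consumer-best-practices");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(100)) {
                    try {
                        handle(record);
                    } catch (Exception e) {
                        // Log (or send to a dead-letter topic) and keep consuming,
                        // instead of letting one bad record stop the whole loop.
                        System.err.println("Failed to process offset " + record.offset() + ": " + e);
                    }
                }
            }
        }
    }
}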


Re: Best Practices for Java Consumers

2015-06-23 Thread Gwen Shapira
I don't know of any such resource, but I'll be happy to help
contribute from my experience.
I'm sure others would too.

Do you want to start one?

Gwen

On Tue, Jun 23, 2015 at 2:03 PM, Tom McKenzie thomaswmcken...@gmail.com wrote:
 Hello

 Is there a good reference for best practices on running Java consumers?
 I'm thinking a FAQ format.

- How should we run them?  We are currently running them in Tomcat on
Ubuntu, are there other approaches using services?  Maybe the service
wrapper http://wrapper.tanukisoftware.com/doc/english/download.jsp?
- How should we monitor them?  I going to try
https://github.com/stealthly/metrics-kafka
- How do we scale?  I'm guessing we run more servers but we could also
run more threads?  What are the tradeoffs?
- Links to code examples
- How do we make consumers robust?  i.e. Best practices for exceptions
handling.  We have noticed if our code has a NPE it's stops consuming.

 Also if this doesn't exist I would be willing to start this document.  I
 would think it would be near this page
 https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example

 Thanks,
 Tom


Consumer Best Practices

2015-06-22 Thread Tom McKenzie
Hello

Is there a good reference for best practices on running Java consumers?
I'm thinking a FAQ format.

   - How should we run them?  We are currently running them in Tomcat on
   Ubuntu; are there other approaches using services?  Maybe the service
   wrapper http://wrapper.tanukisoftware.com/doc/english/download.jsp?
   - How should we monitor them?  I'm going to try
   https://github.com/stealthly/metrics-kafka
   - How do we scale?  I'm guessing we run more servers, but we could also
   run more threads?  What are the tradeoffs?
   - Links to code examples
   - How do we make consumers robust?  i.e. best practices for exception
   handling.  We have noticed that if our code hits an NPE it stops consuming.

Also if this doesn't exist I would be willing to start this document.  I
would think it would be near this page
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example

Thanks,
Tom


Re: AWS EC2 deployment best practices

2014-09-30 Thread Joe Crobak
I didn't know about KAFKA-1215, thanks. I'm not sure it would fully address
my concern of a producer writing to the partition leader in a different AZ,
though.

To answer your question, I was thinking ephemerals with replication, yes.
With a reservation, it's pretty easy to get e.g. two i2.xlarge for an
amortized cost below a single m2.2xlarge with the same amount of EBS
storage and provisioned IOPs.

On Mon, Sep 29, 2014 at 9:40 PM, Philip O'Toole 
philip.oto...@yahoo.com.invalid wrote:

 If only Kafka had rack awareness...you could run 1 cluster and set up the
 replicas in different AZs.


 https://issues.apache.org/jira/browse/KAFKA-1215

 As for your question about ephemeral versus EBS, I presume you are
 proposing to use ephemeral *with* replicas, right?


 Philip



 -
 http://www.philipotoole.com


 On Monday, September 29, 2014 9:45 PM, Joe Crobak joec...@gmail.com
 wrote:



 We're planning a deploy to AWS EC2, and I was hoping to get some advice on
 best practices. I've seen the Loggly presentation [1], which has some good
 recommendations on instance types and EBS setup. Aside from that, there
 seem to be several options in terms of multi-Availability Zone (AZ)
 deployment. The ones we're considering are:

 1) Treat each AZ as a separate data center. Producers write to the kafka
 cluster in the same AZ. For consumption, two options:
 1a) designate one cluster the master cluster and use mirrormaker. This
 was discussed here [2] where some gotchas related to offset management were
 raised.
 1b) Build consumers to consume from both clusters (e.g. Two camus jobs-one
 for each cluster).

 Pros:
 * if there's a network partition between AZs (or extra latency), the
 consumer(s) will catch up once the event is resolved.
 * If an AZ goes offline, only unprocessed data in that AZ is lost until the
 AZ comes back online. The other AZ is unaffected. (consumer failover is more
 complicated in 1a, it seems).
 Cons:
 * Duplicate infrastructure and either more moving parts (1a) or more
 complicated consumers (1b).
 * It's unclear how this scales if one wants to add a second region to the
 mix.

 2) The second option is to treat AZs as the same data center. In this case,
 there's no guarantee that a writer is writing to a node in the same AZ.

 Pros:
 * Simplified setup-all data is in one place.
 Cons:
 * Harder to design for availability—what if the leader of the partition is
 in a different AZ than the producer and there's a partition between AZs? If
 latency is high or throughput is low between AZs, write throughput suffers
 if `request.required.acks` = -1


 Some other considerations:
 * Zookeeper deploy—the best practice seems to be a 3-node cluster across 3
 AZs, but option 1a/b would let us do separate clusters per AZ.
 * EBS / provisioned IOPs—The Loggly presentation predates Kafka 0.8
 replication. Are folks using ephemeral storage instead of EBS now?
 Provisioned IOPs can get expensive pretty quickly.

 Any suggestions/experience along these lines (or others!) would be greatly
 appreciated. If there's good feedback, I'd be happy to put together a wiki
 page with the details.

 Thanks,
 Joe

 [1] http://search-hadoop.com/m/4TaT4BQRJy
 [2] http://search-hadoop.com/m/4TaT49l0Gh/AWS+availability+zone/v=plain



Re: AWS EC2 deployment best practices

2014-09-30 Thread Philip O'Toole
OK, yeah, speaking from experience I would be comfortable with using the 
ephemeral storage if it's replicated across AZs. More and more EC2 instances 
have local SSDs, so you'll get great IO. Of course, you better monitor your 
instance, and if an instance terminates, you're vulnerable if a second instance 
is lost. It might argue for 3 copies.

As you correctly pointed out in your original e-mail, the Loggly setup predated 
0.8 -- so there was no replication to worry about. We ran 3-broker clusters, 
and put a broker, of each cluster, in a different AZ. This did mean that during 
an AZ failure certain brokers would be unavailable (but the messages were 
still on disk, ready for processing when the AZ came back online), but it did 
mean that there was always some Kafka brokers running somewhere that were 
reachable, and incoming traffic could be sent there. The Producers we wrote 
took care of dealing with this. In other words the pipeline kept moving data.


Of course, in a healthy pipeline, each message was written to ES within a 
matter of seconds, and we had replication there (as outlined in the 
accompanying talk). It all worked very well.


Philip

 
-
http://www.philipotoole.com 


On Tuesday, September 30, 2014 2:49 PM, Joe Crobak joec...@gmail.com wrote:
 


I didn't know about KAFKA-1215, thanks. I'm not sure it would fully address
my concerns about a producer writing to the partition leader in a different AZ,
though.

To answer your question, I was thinking ephemerals with replication, yes.
With a reservation, it's pretty easy to get e.g. two i2.xlarge for an
amortized cost below a single m2.2xlarge with the same amount of EBS
storage and provisioned IOPs.


On Mon, Sep 29, 2014 at 9:40 PM, Philip O'Toole 
philip.oto...@yahoo.com.invalid wrote:

 If only Kafka had rack awareness...you could run 1 cluster and set up the
 replicas in different AZs.


 https://issues.apache.org/jira/browse/KAFKA-1215

 As for your question about ephemeral versus EBS, I presume you are
 proposing to use ephemeral *with* replicas, right?


 Philip



 -
 http://www.philipotoole.com


 On Monday, September 29, 2014 9:45 PM, Joe Crobak joec...@gmail.com
 wrote:



 We're planning a deploy to AWS EC2, and I was hoping to get some advice on
 best practices. I've seen the Loggly presentation [1], which has some good
 recommendations on instance types and EBS setup. Aside from that, there
 seem to be several options in terms of multi-Availability Zone (AZ)
 deployment. The ones we're considering are:

 1) Treat each AZ as a separate data center. Producers write to the kafka
 cluster in the same AZ. For consumption, two options:
 1a) designate one cluster the master cluster and use mirrormaker. This
 was discussed here [2] where some gotchas related to offset management were
 raised.
 1b) Build consumers to consume from both clusters (e.g. Two camus jobs-one
 for each cluster).

 Pros:
 * if there's a network partition between AZs (or extra latency), the
 consumer(s) will catch up once the event is resolved.
 * If an AZ goes offline, only unprocessed data in that AZ is lost until the
 AZ comes back online. The other AZ is unaffected. (consumer failover is more
 complicated in 1a, it seems).
 Cons:
 * Duplicate infrastructure and either more moving parts (1a) or more
 complicated consumers (1b).
 * It's unclear how this scales if one wants to add a second region to the
 mix.

 2) The second option is to treat AZs as the same data center. In this case,
 there's no guarantee that a writer is writing to a node in the same AZ.

 Pros:
 * Simplified setup-all data is in one place.
 Cons:
 * Harder to design for availability—what if the leader of the partition is
 in a different AZ than the producer and there's a partition between AZs? If
 latency is high or throughput is low between AZs, write throughput suffers
 if `request.required.acks` = -1


 Some other considerations:
 * Zookeeper deploy—the best practice seems to be a 3-node cluster across 3
 AZs, but option 1a/b would let us do separate clusters per AZ.
 * EBS / provisioned IOPs—The Loggly presentation predates Kafka 0.8
 replication. Are folks using ephemeral storage instead of EBS now?
 Provisioned IOPs can get expensive pretty quickly.

 Any suggestions/experience along these lines (or others!) would be greatly
 appreciated. If there's good feedback, I'd be happy to put together a wiki
 page with the details.

 Thanks,
 Joe

 [1] http://search-hadoop.com/m/4TaT4BQRJy
 [2] http://search-hadoop.com/m/4TaT49l0Gh/AWS+availability+zone/v=plain


Re: AWS EC2 deployment best practices

2014-09-30 Thread James Cheng
I'm also interested in hearing more about deploying Kafka in AWS.

I was also considering options like your 1a and 2. I ran some calculations and 
one interesting thing I ran across was bandwidth costs between AZs.

In 1a, if you can have your producers and consumers in the same AZ as the 
master, then you won't have to pay any bandwidth costs for your 
producers/consumers. You will have to pay bandwidth costs for the mirror-maker 
traffic between clusters in different AZs.

In 2, if your producers and consumers are writing/reading to different AZs, 
then you are paying bandwidth costs between AZs for both producers and 
consumers. In my cost calculation for a modest size cluster, my bandwidth costs 
were roughly the same as my (EC2 instance + EBS) costs.

An idea for #2 is to deploy your producers and your consumers so that they 
always are deployed in the AZ that contains the partitions they want to 
read/write. Or, said another way, move your partitions to the brokers in the 
same AZs as where your producers/consumers are. I think it's doable, but it 
means that you'd want to write a Kafka client library that is aware of
your AZs, and also manage the cluster partitions in sync with your
producer/consumer deployments.
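
As a rough illustration of the kind of AZ-aware client library described
above (not anything that existed in the 0.8-era clients this thread is
about), here is a sketch that assumes a much later Kafka where brokers
advertise their AZ via broker.rack and the producer's Partitioner callback
receives cluster metadata; the local.az property is made up for the example.

import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.utils.Utils;

public class LocalAzPartitioner implements Partitioner {
    private String localAz;  // this producer's own AZ, e.g. "us-east-1a"

    @Override
    public void configure(Map<String, ?> configs) {
        // "local.az" is an assumed custom property passed through the producer config.
        localAz = String.valueOf(configs.get("local.az"));
    }

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        // Prefer a partition whose current leader sits in our own AZ (its rack).
        for (PartitionInfo p : partitions) {
            if (p.leader() != null && localAz.equals(p.leader().rack())) {
                return p.partition();
            }
        }
        // Otherwise fall back to a plain hash of the key (or partition 0 with no key).
        if (keyBytes == null) {
            return partitions.get(0).partition();
        }
        return Utils.toPositive(Utils.murmur2(keyBytes)) % partitions.size();
    }

    @Override
    public void close() {}
}

Such a class would be wired in through the producer's partitioner.class
setting; whether pinning keyless records to one partition is acceptable
depends entirely on the application. The point is only that the cluster
metadata gives the client enough information to prefer same-AZ leaders.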

With ephemeral disks, I imagine that Kafka would become network bound. In case 
you find it useful, I ran some network performance tests against different EC2 
instances. I only went as far as c3.4xlarge.

https://docs.google.com/spreadsheets/d/1QF-4EO3PQ_YOLbvf6HKpqBTNQ8fyYeRuDMrlDYlK0yQ/pubchart?oid=1634430904&format=interactive

-James

On Sep 30, 2014, at 7:47 AM, Philip O'Toole philip.oto...@yahoo.com.INVALID 
wrote:

 OK, yeah, speaking from experience I would be comfortable with using the 
 ephemeral storage if it's replicated across AZs. More and more EC2 instances 
 have local SSDs, so you'll get great IO. Of course, you better monitor your 
 instance, and if an instance terminates, you're vulnerable if a second 
 instance is lost. It might argue for 3 copies.
 
 As you correctly pointed out in your original e-mail, the Loggly setup 
 predated 0.8 -- so there was no replication to worry about. We ran 3-broker 
 clusters, and put a broker, of each cluster, in a different AZ. This did mean 
 that during an AZ failure certain brokers would be unavailable (but the 
 messages were still on disk, ready for processing when the AZ came back 
 online), but it did mean that there was always some Kafka brokers running 
 somewhere that were reachable, and incoming traffic could be sent there. The 
 Producers we wrote took care of dealing with this. In other words the 
 pipeline kept moving data.
 
 
 Of course, in a healthy pipeline, each message was written to ES within a 
 matter of seconds, and we had replication there (as outlined in the 
 accompanying talk). It all worked very well.
 
 
 Philip
 
 
 -
 http://www.philipotoole.com 
 
 
 On Tuesday, September 30, 2014 2:49 PM, Joe Crobak joec...@gmail.com wrote:
 
 
 
 I didn't know about KAFKA-1215, thanks. I'm not sure it would fully address
 my concerns about a producer writing to the partition leader in a different AZ,
 though.
 
 To answer your question, I was thinking ephemerals with replication, yes.
 With a reservation, it's pretty easy to get e.g. two i2.xlarge for an
 amortized cost below a single m2.2xlarge with the same amount of EBS
 storage and provisioned IOPs.
 
 
 On Mon, Sep 29, 2014 at 9:40 PM, Philip O'Toole 
 philip.oto...@yahoo.com.invalid wrote:
 
 If only Kafka had rack awareness...you could run 1 cluster and set up the
 replicas in different AZs.
 
 
 https://issues.apache.org/jira/browse/KAFKA-1215
 
 As for your question about ephemeral versus EBS, I presume you are
 proposing to use ephemeral *with* replicas, right?
 
 
 Philip
 
 
 
 -
 http://www.philipotoole.com
 
 
 On Monday, September 29, 2014 9:45 PM, Joe Crobak joec...@gmail.com
 wrote:
 
 
 
 We're planning a deploy to AWS EC2, and I was hoping to get some advice on
 best practices. I've seen the Loggly presentation [1], which has some good
 recommendations on instance types and EBS setup. Aside from that, there
 seem to be several options in terms of multi-Availability Zone (AZ)
 deployment. The ones we're considering are:
 
 1) Treat each AZ as a separate data center. Producers write to the kafka
 cluster in the same AZ. For consumption, two options:
 1a) designate one cluster the master cluster and use mirrormaker. This
 was discussed here [2] where some gotchas related to offset management were
 raised.
 1b) Build consumers to consume from both clusters (e.g. Two camus jobs-one
 for each cluster).
 
 Pros:
 * if there's a network partition between AZs (or extra latency), the
 consumer(s) will catch up once the event is resolved.
 * If an AZ goes offline, only unprocessed data in that AZ is lost until the
 AZ comes back online. The other AZ is unaffected

AWS EC2 deployment best practices

2014-09-29 Thread Joe Crobak
We're planning a deploy to AWS EC2, and I was hoping to get some advice on
best practices. I've seen the Loggly presentation [1], which has some good
recommendations on instance types and EBS setup. Aside from that, there
seem to be several options in terms of multi-Availability Zone (AZ)
deployment. The ones we're considering are:

1) Treat each AZ as a separate data center. Producers write to the kafka
cluster in the same AZ. For consumption, two options:
1a) designate one cluster the master cluster and use mirrormaker. This
was discussed here [2] where some gotchas related to offset management were
raised.
1b) Build consumers to consume from both clusters (e.g. Two camus jobs-one
for each cluster).

Pros:
* if there's a network partition between AZs (or extra latency), the
consumer(s) will catch up once the event is resolved.
* If an AZ goes offline, only unprocessed data in that AZ is lost until the
AZ comes back online. The other AZ is unaffected. (consumer failover is more
complicated in 1a, it seems).
Cons:
* Duplicate infrastructure and either more moving parts (1a) or more
complicated consumers (1b).
* It's unclear how this scales if one wants to add a second region to the
mix.

2) The second option is to treat AZs as the same data center. In this case,
there's no guarantee that a writer is writing to a node in the same AZ.

Pros:
* Simplified setup-all data is in one place.
Cons:
* Harder to design for availability—what if the leader of the partition is
in a different AZ than the producer and there's a partition between AZs? If
latency is high or throughput is low between AZs, write throughput suffers
if `request.required.acks` = -1


Some other considerations:
* Zookeeper deploy—the best practice seems to be a 3-node cluster across 3
AZs, but option 1a/b would let us do separate clusters per AZ.
* EBS / provisioned IOPs—The Loggly presentation predates Kafka 0.8
replication. Are folks using ephemeral storage instead of EBS now?
Provisioned IOPs can get expensive pretty quickly.

Any suggestions/experience along these lines (or others!) would be greatly
appreciated. If there's good feedback, I'd be happy to put together a wiki
page with the details.

Thanks,
Joe

[1] http://search-hadoop.com/m/4TaT4BQRJy
[2] http://search-hadoop.com/m/4TaT49l0Gh/AWS+availability+zone/v=plain


Re: AWS EC2 deployment best practices

2014-09-29 Thread Philip O'Toole
If only Kafka had rack awareness...you could run 1 cluster and set up the 
replicas in different AZs.


https://issues.apache.org/jira/browse/KAFKA-1215

As for your question about ephemeral versus EBS, I presume you are proposing to 
use ephemeral *with* replicas, right?


Philip

 
 
-
http://www.philipotoole.com 


On Monday, September 29, 2014 9:45 PM, Joe Crobak joec...@gmail.com wrote:
 


We're planning a deploy to AWS EC2, and I was hoping to get some advice on
best practices. I've seen the Loggly presentation [1], which has some good
recommendations on instance types and EBS setup. Aside from that, there
seem to be several options in terms of multi-Availability Zone (AZ)
deployment. The ones we're considering are:

1) Treat each AZ as a separate data center. Producers write to the kafka
cluster in the same AZ. For consumption, two options:
1a) designate one cluster the master cluster and use mirrormaker. This
was discussed here [2] where some gotchas related to offset management were
raised.
1b) Build consumers to consume from both clusters (e.g. Two camus jobs-one
for each cluster).

Pros:
* if there's a network partition between AZs (or extra latency), the
consumer(s) will catch up once the event is resolved.
* If an AZ goes offline, only unprocessed data in that AZ is lost until the
AZ comes back online. The other AZ is unaffected. (consumer failover is more
complicated in 1a, it seems).
Cons:
* Duplicate infrastructure and either more moving parts (1a) or more
complicated consumers (1b).
* It's unclear how this scales if one wants to add a second region to the
mix.

2) The second option is to treat AZs as the same data center. In this case,
there's no guarantee that a writer is writing to a node in the same AZ.

Pros:
* Simplified setup-all data is in one place.
Cons:
* Harder to design for availability—what if the leader of the partition is
in a different AZ than the producer and there's a partition between AZs? If
latency is high or throughput is low between AZs, write throughput suffers
if `request.required.acks` = -1


Some other considerations:
* Zookeeper deploy—the best practice seems to be a 3-node cluster across 3
AZs, but option 1a/b would let us do separate clusters per AZ.
* EBS / provisioned IOPs—The Loggly presentation predates Kafka 0.8
replication. Are folks using ephemeral storage instead of EBS now?
Provisioned IOPs can get expensive pretty quickly.

Any suggestions/experience along these lines (or others!) would be greatly
appreciated. If there's good feedback, I'd be happy to put together a wiki
page with the details.

Thanks,
Joe

[1] http://search-hadoop.com/m/4TaT4BQRJy
[2] http://search-hadoop.com/m/4TaT49l0Gh/AWS+availability+zone/v=plain

Best practices for what goes in a message?

2014-06-20 Thread Mike Pence
Hello group,

I realize that the content and style of messaging are largely a
domain-specific thing for each implementation, contingent on the scope of the
requirements for the organization implementing Kafka, etc. But surely there
are some things that are essential, things that each Kafka user should
include in their messages as a best practice.

I would appreciate any pointers in that direction.


Re: Best practices for what goes in a message?

2014-06-20 Thread Guozhang Wang
Maybe you can read some wiki pages like

https://cwiki.apache.org/confluence/display/KAFKA/Operations

https://kafka.apache.org/documentation.html#operations

Guozhang


On Fri, Jun 20, 2014 at 8:45 AM, Mike Pence mike.pe...@gmail.com wrote:

 Hello group,

 I realize that the contents and style of messaging is largely a domain
 specific thing for each implementation, contingent on the scope of the
 requirements for the organization implementing Kafka, etc. But surely there
 are some things that are essential, things that each Kafka user should
 include in their messages as a best practice.

 I would appreciate any pointers in that direction.




-- 
-- Guozhang


Event data modelling best practices?

2014-01-27 Thread Vjeran Marcinko
Hi,

This I guess is not just question for Kafka, but for all event driven
systems, but since most of people here deal with events, I would like to
hear some basic suggestions for modelling event messages, or even better,
pointer to some relevant literature/website that deals with this stuff?

Anyway, my questions are something like this...

If I have an event that is spawned when some request is processed, such as:

BankAccountService.credit(long bankAccountId, long amount);

then the event that is triggered is (in some pseudo data structure):

BankAccountCredited {
long bankAccountId;
long amount;
}

1. If I leave just these pieces of data in the event, the consumer would
not be able to reconstruct the state of the bank account (the account's
balance being the piece of state that changed) without having the same
logic present in the event accumulator (which is especially problematic
when code versioning is in place, which is practically always)?

2. Because of the previous code/logic requirement to reconstruct state, I
guess it would be wise to include the piece of account state that changed,
such as adding the balance after the credit request execution:
BankAccountCredited {
long bankAccountId;
long amount;
long balance;
}

3. Another option, which maybe seems better when you consider that many
different events will want to report the state of the account after an
action, is a nested BankAccount data structure, right?

BankAccountCredited {
long amount;
BankAccount {
long id;
long balance;
boolean active;
}
}
We can see that in this case there are also some fields present (active) of
the account entity that were not directly affected by the credit action, but
we have them here because the BankAccount data structure contains all of the
fields; that is OK, right?

4. What if some downstream consumers are interested in all events (a
category) that change the account's balance, meaning the consumer doesn't
care whether the event is BankAccountCredited or BankAccountDebited,
because it is interested in the category of events that can be described as
BankAccountBalanceChanged? Since there is usually no supertyping present
in popular serialization libs (Avro, Thrift...), how do you implement this -
do you subscribe the consumer individually to all topics that contain events
that change the bank account balance, or do you create one topic that
contains all events of that particular category? (The latter approach would
not work because categories don't have to be so straightforward; many events
have a many-to-many relationship to various categories - in Java it would
simply be implemented by using interfaces to mark categories, but here we
don't have that option.) A sketch of the multi-topic subscription option
follows after question 5.

5. What if some action mutates several entities when being processed? Do you
spawn 2 events from the application layer, or do you publish just one which
subsequently, via some real-time processor, triggers spawning 2 separate
ones, one for each entity that was affected?
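
One way to implement the multi-topic subscription mentioned in question 4
is sketched below, assuming each concrete event type is published to its
own topic and using the 0.8-era high-level consumer's topic whitelist; the
topic names, group id, and ZooKeeper address are only illustrative.

import java.util.List;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.consumer.Whitelist;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class BalanceChangedConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181");          // illustrative address
        props.put("group.id", "balance-changed-consumers");  // illustrative group id
        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // One stream over every topic whose name matches the "category" regex,
        // e.g. BankAccountCredited and BankAccountDebited as separate topics.
        List<KafkaStream<byte[], byte[]>> streams = connector.createMessageStreamsByFilter(
                new Whitelist("BankAccountCredited|BankAccountDebited"), 1);

        for (MessageAndMetadata<byte[], byte[]> record : streams.get(0)) {
            // The topic name tells the consumer which concrete event type arrived.
            System.out.println(record.topic() + " @ offset " + record.offset());
        }
    }
}

The regex is the closest thing to a marker interface here: the "category"
lives in the topic naming convention rather than in the serialized type.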

I could probably think of some other questions, but you get the point of
what I'm interested in.

Best regards,
Vjeran




Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-25 Thread Jun Rao
You can't create a topic with the same name without deleting the old one first.

Thanks,

Jun

On Mon, Mar 25, 2013 at 9:58 AM, Scott Clasen sc...@heroku.com wrote:

 Jun Thanks. To clarify, do you mean that clients will have cached broker
 lists or some other data that will make them ignore the new brokers?

 Like so

 topic-1 replication factor 3, on broker-ids 1,2,3
 all brokers 1,2,3 die, and are never coming back.
 delete all kafka data in zookeeper.
 boot 4,5,6, create new topic called topic-1 repl factor 3, brokers 4,5,6

 clients will/will not start sending to topic-1 on 4,5,6?



 On Sun, Mar 24, 2013 at 4:01 PM, Jun Rao jun...@gmail.com wrote:

  If you bring up 3 new brokers with different broker ids, you won't be
 able
  to use them on existing topics until after you have run the partition
  reassignment tool.
 
  Thanks,
 
  Jun
 
  On Fri, Mar 22, 2013 at 9:23 PM, Scott Clasen sc...@heroku.com wrote:
 
   Thanks!
  
Would there be any difference if I instead  deleted all the Kafka data
   from zookeeper and booted 3 instances  with different broker id?
 clients
   with cached broker id lists or any other issue?
  
   Sent from my iPhone
  
   On Mar 22, 2013, at 9:15 PM, Jun Rao jun...@gmail.com wrote:
  
In scenario 2, you can bring up 3 new brokers with the same broker
 id.
   You
won't get the data back. However, new data can be published to and
   consumed
from the new brokers.
   
Thanks,
   
Jun
   
On Fri, Mar 22, 2013 at 2:17 PM, Scott Clasen sc...@heroku.com
  wrote:
   
Thanks Neha-
   
To Clarify...
   
*In scenario = 1 will the new broker get all messages on the other
   brokers
replicated to it?
   
*In Scenario 2 = it is clear that the data is gone, but I still
 need
producers to be able to send and consumers to receive on the same
   topic. In
my testing today I was unable to do that as I kept getting
 errors...so
   if i
was doing the correct steps it seems there is a bug here, basically
  the
second-cluster-topic topic is unusable after all 3 brokers crash,
  and
   3
more are booted to replace them.  Something not quite correct in
   zookeeper?
   
Like so
   
./bin/kafka-reassign-partitions.sh --zookeeper ...
 --path-to-json-file
reassign.json
   
kafka.common.LeaderNotAvailableException: Leader not available for
  topic
second-cluster-topic partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
at
   
   
  
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
   
   
  
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
   
   
  
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at
  scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.List.map(List.scala:45)
at
   
   
  
 
 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
at
  kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
at
 kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
at
   
   
  
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
at
   
   
  
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
at
   
   
  
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
Caused by: kafka.common.LeaderNotAvailableException: No leader
 exists
   for
partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
... 16 more
topic: second-cluster-topic
   
./bin/kafka-preferred-replica-election.sh  --zookeeper...
--path-to-json-file elect.json
   
   
[2013-03-22 10:24:20,706] INFO Created preferred replica
 election
   path
with { partitions:[ { partition:0, topic:first-cluster-topic
  },
   {
partition:0, topic:second-cluster-topic } ], version:1 }
(kafka.admin.PreferredReplicaLeaderElectionCommand$)
   
./bin/kafka-list-topic.sh  --zookeeper ... --topic
  second-cluster-topic
   
2013-03-22 10:24:30,869] ERROR Error while fetching metadata for
   partition
[second-cluster-topic,0] (kafka.admin.AdminUtils$)
kafka.common.LeaderNotAvailableException: Leader not available for
  topic
second-cluster-topic partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
at
   
   
  
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
 

Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-24 Thread Jun Rao
If you bring up 3 new brokers with different broker ids, you won't be able
to use them on existing topics until after you have run the partition
reassignment tool.

Thanks,

Jun

On Fri, Mar 22, 2013 at 9:23 PM, Scott Clasen sc...@heroku.com wrote:

 Thanks!

  Would there be any difference if I instead  deleted all the Kafka data
 from zookeeper and booted 3 instances  with different broker id? clients
 with cached broker id lists or any other issue?

 Sent from my iPhone

 On Mar 22, 2013, at 9:15 PM, Jun Rao jun...@gmail.com wrote:

  In scenario 2, you can bring up 3 new brokers with the same broker id.
 You
  won't get the data back. However, new data can be published to and
 consumed
  from the new brokers.
 
  Thanks,
 
  Jun
 
  On Fri, Mar 22, 2013 at 2:17 PM, Scott Clasen sc...@heroku.com wrote:
 
  Thanks Neha-
 
  To Clarify...
 
  *In scenario = 1 will the new broker get all messages on the other
 brokers
  replicated to it?
 
  *In Scenario 2 = it is clear that the data is gone, but I still need
  producers to be able to send and consumers to receive on the same
 topic. In
  my testing today I was unable to do that as I kept getting errors...so
 if i
  was doing the correct steps it seems there is a bug here, basically the
  second-cluster-topic topic is unusable after all 3 brokers crash, and
 3
  more are booted to replace them.  Something not quite correct in
 zookeeper?
 
  Like so
 
  ./bin/kafka-reassign-partitions.sh --zookeeper ... --path-to-json-file
  reassign.json
 
  kafka.common.LeaderNotAvailableException: Leader not available for topic
  second-cluster-topic partition 0
  at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
  at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
  at
 
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
  at
 
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
  at
 
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
  at scala.collection.immutable.List.foreach(List.scala:45)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
  at scala.collection.immutable.List.map(List.scala:45)
  at
 
 
 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
  at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
  at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
  at
 
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
  at
 
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
  at
 
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
  at scala.collection.immutable.List.foreach(List.scala:45)
  at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
  at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
  Caused by: kafka.common.LeaderNotAvailableException: No leader exists
 for
  partition 0
  at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
  ... 16 more
  topic: second-cluster-topic
 
  ./bin/kafka-preferred-replica-election.sh  --zookeeper...
  --path-to-json-file elect.json
 
 
  [2013-03-22 10:24:20,706] INFO Created preferred replica election
 path
  with { partitions:[ { partition:0, topic:first-cluster-topic },
 {
  partition:0, topic:second-cluster-topic } ], version:1 }
  (kafka.admin.PreferredReplicaLeaderElectionCommand$)
 
  ./bin/kafka-list-topic.sh  --zookeeper ... --topic second-cluster-topic
 
  2013-03-22 10:24:30,869] ERROR Error while fetching metadata for
 partition
  [second-cluster-topic,0] (kafka.admin.AdminUtils$)
  kafka.common.LeaderNotAvailableException: Leader not available for topic
  second-cluster-topic partition 0
  at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
  at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
  at
 
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
  at
 
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
  at
 
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
  at scala.collection.immutable.List.foreach(List.scala:45)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
  at scala.collection.immutable.List.map(List.scala:45)
  at
 
 
 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
  at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
  at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
  at
 
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
  at
 
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
  at
 
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
  at scala.collection.immutable.List.foreach(List.scala:45)
  at 

Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-22 Thread Neha Narkhede
* Scenario 1:  BrokerID 1,2,3   Broker 2 dies.

Here, you can use the reassign-partitions tool: for all partitions that
had a replica on broker 2, move that replica to broker 4.

* Scenario 2: BrokerID 1,2,3 Catastrophic failure 1,2,3 die but ZK still
there.

There is no way to recover any data here since there is nothing
available to consume data from.

Thanks,
Neha

On Fri, Mar 22, 2013 at 10:46 AM, Scott Clasen sc...@heroku.com wrote:
 What would the recommended practice be for the following scenarios?

 Running on EC2, ephemeral disks only for kafka.

 There are 3 kafka servers. The broker ids are always increasing. If a
 broker dies it's never coming back.

 All topics have a replication factor of 3.

 * Scenario 1:  BrokerID 1,2,3   Broker 2 dies.

 Recover by:

 Boot another: BrokerID 4
 ?? run bin/kafka-reassign-partitions.sh   for any topic+partition and
 replace brokerid 2 with brokerid 4
 ?? anything else to do to cause messages to be replicated to 4??

 NOTE: This appears to work but not positive 4 got messages replicated to it.

 * Scenario 2: BrokerID 1,2,3 Catastrophic failure 1,2,3 die but ZK still
 there.

 Messages obviously lost.
 Recover to a functional state by:

 Boot 3 more: 4,5 6
 ?? run bin/kafka-reassign-partitions.sh  for all topics/partitions, swap
 1,2,3 for 4,5,6?
 ?? run bin/kafka-preferred-replica-election.sh for all topics/partitions
 ?? anything else to do to allow producers to start sending successfully??


 NOTE: I had some trouble with scenario 2. Will try to reproduce and open a
 ticket, if in fact my procedures for scenario 2 are correct, and I still
 can't get to a good state.


Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-22 Thread Scott Clasen
Thanks Neha-

To Clarify...

*In scenario 1, will the new broker get all messages on the other brokers
replicated to it?

*In Scenario 2, it is clear that the data is gone, but I still need
producers to be able to send and consumers to receive on the same topic. In
my testing today I was unable to do that, as I kept getting errors...so if I
was doing the correct steps it seems there is a bug here; basically the
second-cluster-topic topic is unusable after all 3 brokers crash and 3
more are booted to replace them.  Something not quite correct in zookeeper?

Like so

./bin/kafka-reassign-partitions.sh --zookeeper ... --path-to-json-file
reassign.json

kafka.common.LeaderNotAvailableException: Leader not available for topic
second-cluster-topic partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.List.map(List.scala:45)
at
kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
at
kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
at
kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
at
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
... 16 more
topic: second-cluster-topic

./bin/kafka-preferred-replica-election.sh  --zookeeper...
--path-to-json-file elect.json


[2013-03-22 10:24:20,706] INFO Created preferred replica election path
with { partitions:[ { partition:0, topic:first-cluster-topic }, {
partition:0, topic:second-cluster-topic } ], version:1 }
(kafka.admin.PreferredReplicaLeaderElectionCommand$)

./bin/kafka-list-topic.sh  --zookeeper ... --topic second-cluster-topic

2013-03-22 10:24:30,869] ERROR Error while fetching metadata for partition
[second-cluster-topic,0] (kafka.admin.AdminUtils$)
kafka.common.LeaderNotAvailableException: Leader not available for topic
second-cluster-topic partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.List.map(List.scala:45)
at
kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
at
kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
at
kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
at
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
... 16 more





On Fri, Mar 22, 2013 at 1:12 PM, Neha Narkhede neha.narkh...@gmail.comwrote:

 * Scenario 1:  BrokerID 1,2,3   Broker 2 dies.

 Here, you can use reassign partitions tool and for all partitions that
 had a replica on broker 2, move it to broker 4

 * Scenario 2: BrokerID 1,2,3 Catastrophic failure 1,2,3 die but ZK still
 there.

 There is no way to recover any data here since there is nothing
 available to consume data from.

 Thanks,
 Neha

 On Fri, Mar 22, 2013 at 10:46 AM, Scott Clasen sc...@heroku.com wrote:
  What would the recommended practice be for the following scenarios?
 
  Running on EC2, ephemeral disks only for kafka.
 
  There are 3 kafka servers. The broker ids are always increasing. If a
  broker dies it's never coming back.
 
  

Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-22 Thread Neha Narkhede
 *In scenario = 1 will the new broker get all messages on the other brokers
 replicated to it?

Yes; until it gets all the messages, the new replica's state is not
reflected in zookeeper.

 *In Scenario 2 = it is clear that the data is gone, but I still need
 producers to be able to send and consumers to receive on the same topic. In
 my testing today I was unable to do that as I kept getting errors...so if i
 was doing the correct steps it seems there is a bug here, basically the
 second-cluster-topic topic is unusable after all 3 brokers crash, and 3
 more are booted to replace them.  Something not quite correct in zookeeper?

The partition reassignment tool is not the right tool to achieve that.
Since the data is gone, it is better to delete all topics so the state is
gone from zookeeper. We don't have the delete topic functionality ready
yet. Once the topics are deleted, you can either let them get auto-created
or create them again.

Thanks,
Neha


On Fri, Mar 22, 2013 at 2:17 PM, Scott Clasen sc...@heroku.com wrote:
 Thanks Neha-

 To Clarify...

 *In scenario = 1 will the new broker get all messages on the other brokers
 replicated to it?

 *In Scenario 2 = it is clear that the data is gone, but I still need
 producers to be able to send and consumers to receive on the same topic. In
 my testing today I was unable to do that as I kept getting errors...so if i
 was doing the correct steps it seems there is a bug here, basically the
 second-cluster-topic topic is unusable after all 3 brokers crash, and 3
 more are booted to replace them.  Something not quite correct in zookeeper?

 Like so

 ./bin/kafka-reassign-partitions.sh --zookeeper ... --path-to-json-file
 reassign.json

 kafka.common.LeaderNotAvailableException: Leader not available for topic
 second-cluster-topic partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
 at scala.collection.immutable.List.map(List.scala:45)
 at
 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
 at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
 at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
 at
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
 at
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
 at
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
 at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
 Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
 partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
 ... 16 more
 topic: second-cluster-topic

 ./bin/kafka-preferred-replica-election.sh  --zookeeper...
 --path-to-json-file elect.json


 [2013-03-22 10:24:20,706] INFO Created preferred replica election path
 with { partitions:[ { partition:0, topic:first-cluster-topic }, {
 partition:0, topic:second-cluster-topic } ], version:1 }
 (kafka.admin.PreferredReplicaLeaderElectionCommand$)

 ./bin/kafka-list-topic.sh  --zookeeper ... --topic second-cluster-topic

 2013-03-22 10:24:30,869] ERROR Error while fetching metadata for partition
 [second-cluster-topic,0] (kafka.admin.AdminUtils$)
 kafka.common.LeaderNotAvailableException: Leader not available for topic
 second-cluster-topic partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
 at scala.collection.immutable.List.map(List.scala:45)
 at
 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
 at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
 at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
 at
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
 at
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
 at
 

Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-22 Thread Jun Rao
In scenario 2, you can bring up 3 new brokers with the same broker id. You
won't get the data back. However, new data can be published to and consumed
from the new brokers.

Thanks,

Jun

On Fri, Mar 22, 2013 at 2:17 PM, Scott Clasen sc...@heroku.com wrote:

 Thanks Neha-

 To Clarify...

 *In scenario = 1 will the new broker get all messages on the other brokers
 replicated to it?

 *In Scenario 2 = it is clear that the data is gone, but I still need
 producers to be able to send and consumers to receive on the same topic. In
 my testing today I was unable to do that as I kept getting errors...so if i
 was doing the correct steps it seems there is a bug here, basically the
 second-cluster-topic topic is unusable after all 3 brokers crash, and 3
 more are booted to replace them.  Something not quite correct in zookeeper?

 Like so

 ./bin/kafka-reassign-partitions.sh --zookeeper ... --path-to-json-file
 reassign.json

 kafka.common.LeaderNotAvailableException: Leader not available for topic
 second-cluster-topic partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
 at

 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at

 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at

 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
 at scala.collection.immutable.List.map(List.scala:45)
 at

 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
 at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
 at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
 at

 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
 at

 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
 at

 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
 at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
 Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
 partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
 ... 16 more
 topic: second-cluster-topic

 ./bin/kafka-preferred-replica-election.sh  --zookeeper...
 --path-to-json-file elect.json


 [2013-03-22 10:24:20,706] INFO Created preferred replica election path
 with { partitions:[ { partition:0, topic:first-cluster-topic }, {
 partition:0, topic:second-cluster-topic } ], version:1 }
 (kafka.admin.PreferredReplicaLeaderElectionCommand$)

 ./bin/kafka-list-topic.sh  --zookeeper ... --topic second-cluster-topic

 2013-03-22 10:24:30,869] ERROR Error while fetching metadata for partition
 [second-cluster-topic,0] (kafka.admin.AdminUtils$)
 kafka.common.LeaderNotAvailableException: Leader not available for topic
 second-cluster-topic partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
 at

 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at

 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at

 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
 at scala.collection.immutable.List.map(List.scala:45)
 at

 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
 at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
 at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
 at

 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
 at

 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
 at

 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
 at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
 Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
 partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
 ... 16 more





 On Fri, Mar 22, 2013 at 1:12 PM, Neha Narkhede neha.narkh...@gmail.com
 wrote:

  * Scenario 1:  BrokerID 1,2,3   Broker 2 dies.
 
  Here, you can use reassign partitions tool and for all partitions that
  had a replica on broker 2, move it to broker 4
 
  * Scenario 2: BrokerID 1,2,3 Catastrophic failure 1,2,3 die but ZK still
  there.
 
  There is no way to recover any data here since there is 

Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-22 Thread Scott Clasen
Thanks! 

 Would there be any difference if I instead deleted all the Kafka data from
zookeeper and booted 3 instances with different broker ids? Clients with
cached broker id lists or any other issue?

Sent from my iPhone

On Mar 22, 2013, at 9:15 PM, Jun Rao jun...@gmail.com wrote:

 In scenario 2, you can bring up 3 new brokers with the same broker id. You
 won't get the data back. However, new data can be published to and consumed
 from the new brokers.
 
 Thanks,
 
 Jun
 
 On Fri, Mar 22, 2013 at 2:17 PM, Scott Clasen sc...@heroku.com wrote:
 
 Thanks Neha-
 
 To Clarify...
 
 *In scenario = 1 will the new broker get all messages on the other brokers
 replicated to it?
 
 *In Scenario 2 = it is clear that the data is gone, but I still need
 producers to be able to send and consumers to receive on the same topic. In
 my testing today I was unable to do that as I kept getting errors...so if i
 was doing the correct steps it seems there is a bug here, basically the
 second-cluster-topic topic is unusable after all 3 brokers crash, and 3
 more are booted to replace them.  Something not quite correct in zookeeper?
 
 Like so
 
 ./bin/kafka-reassign-partitions.sh --zookeeper ... --path-to-json-file
 reassign.json
 
 kafka.common.LeaderNotAvailableException: Leader not available for topic
 second-cluster-topic partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
 at
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
 at scala.collection.immutable.List.map(List.scala:45)
 at
 
 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
 at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
 at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
 at
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
 at
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
 at
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
 at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
 Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
 partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
 ... 16 more
 topic: second-cluster-topic
 
 ./bin/kafka-preferred-replica-election.sh  --zookeeper...
 --path-to-json-file elect.json
 
 
 [2013-03-22 10:24:20,706] INFO Created preferred replica election path
 with { partitions:[ { partition:0, topic:first-cluster-topic }, {
 partition:0, topic:second-cluster-topic } ], version:1 }
 (kafka.admin.PreferredReplicaLeaderElectionCommand$)
 
 ./bin/kafka-list-topic.sh  --zookeeper ... --topic second-cluster-topic
 
 2013-03-22 10:24:30,869] ERROR Error while fetching metadata for partition
 [second-cluster-topic,0] (kafka.admin.AdminUtils$)
 kafka.common.LeaderNotAvailableException: Leader not available for topic
 second-cluster-topic partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
 at
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
 at
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
 at scala.collection.immutable.List.map(List.scala:45)
 at
 
 kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
 at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
 at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
 at
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
 at
 
 kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
 at
 
 scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
 at scala.collection.immutable.List.foreach(List.scala:45)
 at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
 at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
 Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
 partition 0
 at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
 ... 16 more
 
 
 
 
 
 On Fri, Mar 22, 2013 at 1:12 PM, Neha Narkhede neha.narkh...@gmail.com

Re: Best practices for changing partition numbers

2013-01-08 Thread Jun Rao
If you don't have a lot of topics, one thing you can do is to
over-partition a topic.

Also, in 0.7, # of partitions grows with brokers. This is going to change
in 0.8, in which # of partitions is specified at topic creation time and
won't change as brokers change. One needs to use an admin DDL to change #
of partitions.

Thanks,

Jun

On Mon, Jan 7, 2013 at 10:23 PM, David Ross dyr...@klout.com wrote:

 Hello,

 We have found that, for our application, having a number of total
 partitions as a multiple of the number of consumer hosts is beneficial.
 Because of this, whenever we add or remove consumer hosts, we have to
 change the number of partitions in the server config.

 What are best practices for changing the number of partitions? It seems
 like adding partitions is fine but removing partitions would result in data
 loss - am I right? Is that avoidable? Is it preferable to bring in new
 servers with new partitions? Anything else I should keep in mind on this
 issue?


 Thanks!

 David



Re: Best practices for changing partition numbers

2013-01-08 Thread Jun Rao
Reducing # partitions is going to be tricky. The data for those dropped
partitions will just be lost.

Thanks,

Jun

On Tue, Jan 8, 2013 at 1:24 PM, David Ross dyr...@klout.com wrote:

 Yeah that makes sense, but what if we do need to change the number of
 partitions? What if we need to reduce it?

 On Tue, Jan 8, 2013 at 12:42 PM, Jun Rao jun...@gmail.com wrote:

  If you don't have a lot of topics, one thing you can do is to
  over-partition a topic.
 
  Also, in 0.7, # of partitions grows with brokers. This is going to change
  in 0.8, in which # of partitions is specified at topic creation time and
  won't change as brokers change. One needs to use an admin DDL to change #
  of partitions.
 
  Thanks,
 
  Jun
 
  On Mon, Jan 7, 2013 at 10:23 PM, David Ross dyr...@klout.com wrote:
 
   Hello,
  
   We have found that, for our application, having a number of total
   partitions as a multiple of the number of consumer hosts is beneficial.
   Because of this, whenever we add or remove consumer hosts, we have to
   change the number of partitions in the server config.
  
   What are best practices for changing the number of partitions? It seems
   like adding partitions is fine but removing partitions would result in
  data
   loss - am I right? Is that avoidable? Is it preferable to bring in new
   servers with new partitions? Anything else I should keep in mind on
 this
   issue?
  
  
   Thanks!
  
   David
  
 



Best practices for changing partition numbers

2013-01-07 Thread David Ross
Hello,

We have found that, for our application, having a number of total
partitions as a multiple of the number of consumer hosts is beneficial.
Because of this, whenever we add or remove consumer hosts, we have to
change the number of partitions in the server config.

What are best practices for changing the number of partitions? It seems
like adding partitions is fine but removing partitions would result in data
loss - am I right? Is that avoidable? Is it preferable to bring in new
servers with new partitions? Anything else I should keep in mind on this
issue?


Thanks!

David