How to set partition.assignment.strategy at run time?

2018-01-24 Thread Jun MA
Hi all, I have a custom PartitionAssignor class that needs a parameter in its constructor. So instead of specifying partition.assignment.strategy with the class name in the consumer properties, how can I set it up at runtime? I’m using the kafka 0.9 java client. Thanks, Jun
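In the 0.9 Java client, partition.assignment.strategy only accepts class names, so there is no direct way to pass a constructed instance. One workaround, assuming the assignor also implements org.apache.kafka.common.Configurable: the consumer instantiates assignors through its config machinery, which calls configure(Map) with the full consumer properties, so constructor parameters can ride along as extra keys. The class and key names below are hypothetical:

```properties
# consumer properties — the strategy must still be a class name
partition.assignment.strategy=com.example.MyAssignor
# hypothetical extra key, read inside MyAssignor.configure(Map<String,?> configs),
# assuming MyAssignor implements org.apache.kafka.common.Configurable
my.assignor.weight=42
```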

Log compaction failed because offset map doesn't have enough space

2017-05-16 Thread Jun Ma
Hi team, We are having an issue with compacting the __consumer_offsets topic in our cluster. We’re seeing entries in log-cleaner.log saying: [2017-05-16 11:56:28,993] INFO Cleaner 0: Building offset map for log __consumer_offsets-15 for 349 segments in offset range [0, 619265471). (kafka.log.LogCleaner)
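The usual cause of "offset map doesn't have enough space" is that the cleaner's dedupe buffer cannot hold the keys for the segments being grouped. A hedged starting point is to grow the buffer in server.properties; the sizes below are illustrative, not a recommendation:

```properties
# total memory for the cleaner's dedupe buffers, divided among cleaner threads
log.cleaner.dedupe.buffer.size=536870912
log.cleaner.threads=1
# fraction of the dedupe buffer the cleaner will fill before it stops grouping
log.cleaner.io.buffer.load.factor=0.9
```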

Re: Does Kafka producer wait till previous batch returns response before sending next one?

2017-04-30 Thread Jun MA
Does this mean that if the client has retries > 0 and max.in.flight.requests.per.connection > 1, then even if the topic only has one partition, there’s still no guarantee of ordering? Thanks, Jun > On Apr 30, 2017, at 7:57 AM, Hans Jespersen wrote: > > There is a
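The answer is yes: with retries enabled and more than one in-flight request, a failed earlier batch can be retried after a later batch has already succeeded, reordering records even within a single partition. A common mitigation, at some throughput cost (values illustrative):

```properties
# producer config: allow retries but keep per-partition ordering
retries=3
max.in.flight.requests.per.connection=1
```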

Re: Elegant way to failover controller

2017-04-05 Thread Jun MA
> znode in Zookeeper for the cluster. This will cause the current controller > to resign and a new one to be elected, but it’s random as to what broker > that will be. > > A better question here is why do you want to move the controller? > > -Todd > > On Wed, Apr 5, 2017 at 9:09 AM, Ju

Elegant way to failover controller

2017-04-05 Thread Jun MA
Hi, We are running kafka 0.9.0.1 and I’m looking for an elegant way to fail over the controller to another broker. Right now I have to restart the broker to fail it over to other brokers. Is there a way to fail over the controller to a specific broker? Is there a way to fail it over without restarting the
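The approach described in the reply thread is to delete the /controller znode, which forces the current controller to resign and a new election to run; note the new controller is chosen effectively at random, so there is no supported way to target a specific broker. A sketch (the Zookeeper host is a placeholder):

```shell
# forces controller re-election; the new controller cannot be chosen
bin/zookeeper-shell.sh zk-host:2181 delete /controller
```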

Re: ISR churn

2017-03-22 Thread Jun MA
Hi David, I checked our cluster; the producer purgatory size is mostly under 3. But I don’t quite understand this metric, could you please explain it a little bit? Thanks, Jun > On Mar 22, 2017, at 3:07 PM, David Garcia wrote: > > producer purgatory size

Re: ISR churn

2017-03-22 Thread Jun MA
elease for fixing: > > https://issues.apache.org/jira/browse/KAFKA-4674 > <https://issues.apache.org/jira/browse/KAFKA-4674> > > Jun Ma, what exactly did you do to failover the controller to a new broker? > If that works for you, I'd like to try it in our staging clusters.

Recommended number of partitions on each broker

2017-03-01 Thread Jun MA
Hi, I’m curious what the recommended number of partitions on each individual broker is. We have a 3-node cluster running 0.9.0.1; each node has 24 cores, 1.1T ssd, 48G ram, and a 10G NIC. There are about 300 partitions on each broker and the resource usage is pretty low (5% cpu, 50%

Re: Frequently shrink/expand ISR on one broker

2017-02-23 Thread Jun MA
Feb 19, 2017, at 2:13 PM, Jun MA <mj.saber1...@gmail.com> wrote: > > Hi team, > > We are running confluent 0.9.0.1 on a cluster with 6 brokers. These days one > of our broker(broker 1) frequently shrink the ISR and expand it immediately > every about 20 minutes and I c

Re: Question about messages in __consumer_offsets topic

2017-02-22 Thread Jun MA
d > > > > On Wed, Feb 22, 2017 at 3:50 PM, Jun MA <mj.saber1...@gmail.com> wrote: > >> Hi guys, >> >> I’m trying to consume from __consumer_offsets topic to get exact committed >> offset of each consumer. Normally I can see messages like: >>

Question about messages in __consumer_offsets topic

2017-02-22 Thread Jun MA
Hi guys, I’m trying to consume from the __consumer_offsets topic to get the exact committed offset of each consumer. Normally I can see messages like: [eds-els-recopp-jenkins-01-5651,eds-incre-staging-1,0]::[OffsetMetadata[29791925,NO_METADATA],CommitTime 1487090167367,ExpirationTime 1487176567367],
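For reference, the 0.9-era way to render these records from the console (the formatter class moved packages in later releases; host names are placeholders):

```shell
# consumer.properties must contain: exclude.internal.topics=false
bin/kafka-console-consumer.sh --zookeeper zk-host:2181 \
  --topic __consumer_offsets \
  --consumer.config consumer.properties \
  --formatter "kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter"
```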

Re: JMX metrics for replica lag time

2017-02-21 Thread Jun MA
opic=([-.\w]+),partition=([0-9]+) > lag > should be proportional to the maximum batch size of a produce request. > > On Mon, Feb 20, 2017 at 5:43 PM, Jun Ma <mj.saber1...@gmail.com> wrote: > >> Hi Guozhang, >> >> Thanks for your replay. Could you tell me whic

Re: JMX metrics for replica lag time

2017-02-20 Thread Jun Ma
fact even in > 0.10.x they are still the same as stated in: > > https://kafka.apache.org/documentation/#monitoring > > The mechanism for determining which followers have been dropped out of ISR > has changed, but the metrics have not. > > > Guozhang > > > On Su

JMX metrics for replica lag time

2017-02-19 Thread Jun MA
Hi, I’m looking for the JMX metric that represents replica lag time in 0.9.0.1. Based on the documentation, I can only find kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica, which is the max lag in messages between follower and leader replicas. But since in 0.9.0.1 lag in messages

Frequently shrink/expand ISR on one broker

2017-02-19 Thread Jun MA
Hi team, We are running confluent 0.9.0.1 on a cluster with 6 brokers. These days one of our brokers (broker 1) frequently shrinks the ISR and expands it again immediately, roughly every 20 minutes, and I couldn’t find out why. Based on the log, it can kick out any of the other brokers, not just a specific one.

Re: Halting because log truncation is not allowed for topic __consumer_offsets

2016-12-21 Thread Jun MA
Hi Peter, We’ve seen this happen under normal operation in our virtualized environment as well. Our network is not very stable; blips happen pretty frequently. Your explanation sounds reasonable to me, and I’m very interested in your further thoughts on this. In our case, we’re using quorum-based

Re: Halting because log truncation is not allowed for topic __consumer_offsets

2016-12-20 Thread Jun MA
cation > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Generation+rather+than+High+Watermark+for+Truncation>>. > If you can reproduce in a more general scenario we would be very > interested. > > All

Halting because log truncation is not allowed for topic __consumer_offsets

2016-12-18 Thread Jun MA
Would be grateful to hear opinions from experts out there. Thanks in advance > On Dec 17, 2016, at 11:06 AM, Jun MA <mj.saber1...@gmail.com> wrote: > > Hi, > > We saw the following FATAL error in 2 of our brokers (3 in total, the active > controller doesn’t hav

Halting because log truncation is not allowed for topic __consumer_offsets

2016-12-16 Thread Jun MA
Hi, We saw the following FATAL error in 2 of our brokers (3 in total; the active controller doesn’t have it) and they crashed at the same time. [2016-12-16 16:12:47,085] FATAL [ReplicaFetcherThread-0-3], Halting because log truncation is not allowed for topic __consumer_offsets, Current

Re: Consumer group keeps bouncing between generation id 0 and 1

2016-11-19 Thread Jun Ma
given topic, one has to query all the consumer groups and > filter by their subscribed topics. > > > Guozhang > > > > > On Fri, Nov 18, 2016 at 4:19 PM, Jun MA <mj.saber1...@gmail.com> wrote: > > > Hi guys, > > > > I’m using kafka 0.9.0.1 and Java cli

Consumer group keeps bouncing between generation id 0 and 1

2016-11-18 Thread Jun MA
Hi guys, I’m using kafka 0.9.0.1 and the Java client. I saw the following exceptions thrown by my consumer: Caused by: java.lang.IllegalStateException: Correlation id for response (767587) does not match request (767585) at

Lost __consumer_offsets and _schemas topic level config in zookeeper

2016-08-17 Thread Jun MA
Hi, We are using confluent 0.9.0.1 and we’ve noticed recently that we lost the __consumer_offsets and _schemas (used for the schema registry) topic-level configs. We checked config/topics/ in zookeeper and found that there are no __consumer_offsets and _schemas topic nodes. Our server-level config uses cleanup

How to manually bring offline partition back?

2016-07-14 Thread Jun MA
Hi all, We have some partitions that only have one replica, and after the broker failed, the partitions on that broker became unavailable. We set unclean.leader.election.enable=false, so the controller doesn’t bring those partitions back online even after that broker is up. We tried Preferred
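With unclean.leader.election.enable=false, a partition whose returning replica is out of the ISR stays offline until an unclean election is permitted. One hedged option is a temporary per-topic override, accepting possible data loss; 0.9 still supported altering topic configs via kafka-topics.sh (host and topic names are placeholders):

```shell
# temporary per-topic override; trades durability for availability
bin/kafka-topics.sh --zookeeper zk-host:2181 --alter \
  --topic my-topic --config unclean.leader.election.enable=true
```

Once the partition is back online, the override can be removed the same way with --delete-config.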

Re: kafka 0.9 offset unknown after cleanup

2016-05-03 Thread Jun MA
commit any offset for offsets.retention.minutes, kafka will clean up its offsets, which makes sense for my case. Thanks, Jun > On May 3, 2016, at 11:46 AM, Jun MA <mj.saber1...@gmail.com> wrote: > > Thanks for your reply. I checked the offset topic and the cleanup policy is > actu

Re: kafka 0.9 offset unknown after cleanup

2016-05-03 Thread Jun MA
ijs <gerard.kl...@dizzit.com> wrote: > > Looks like it, you need to be sure the offset topic is using compaction, > and the broker is set to enable compaction. > > On Tue, May 3, 2016 at 9:56 AM Jun MA <mj.saber1...@gmail.com> wrote: > >> Hi, >> >

kafka 0.9 offset unknown after cleanup

2016-05-03 Thread Jun MA
Hi, I’m using the 0.9.0.1 new-consumer api. I noticed that after kafka cleans up all old log segments (past the delete.retention time), I get an unknown offset. bin/kafka-consumer-groups.sh --bootstrap-server server:9092 --new-consumer --group testGroup --describe GROUP, TOPIC, PARTITION, CURRENT OFFSET,
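This behavior follows from the broker's offset retention: if a group commits nothing for offsets.retention.minutes, its committed offsets are removed and the group reports an unknown offset. The broker setting can be raised (the 0.9 default is 1440, i.e. 24 hours; the value below is illustrative):

```properties
# keep committed offsets for 7 days instead of the 24-hour default
offsets.retention.minutes=10080
```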