Re: [DISCUSS] KIP-175: Additional '--describe' views for ConsumerGroupCommand

2017-07-06 Thread Jeff Widman
Thanks for the KIP Vahid. I think it'd be useful to have these filters. That said, I also agree with Edo. We don't currently rely on the output, but there's been more than one time when debugging an issue that I notice something amiss when I see all the data at once but if it wasn't present in th

Re: Committing an invalid offset with KafkaConsumer.commitSync

2017-09-03 Thread Jeff Widman
What broker version are you testing with? On Sep 3, 2017 4:14 AM, "Stig Døssing" wrote: > Hi, > > The documentation for KafkaConsumer.commitSync(Map) states that a > KafkaException will be thrown if the committed offset is invalid. I can't > seem to provoke this behavior, so I'd like clarificati

Re: Adding or removing input topics to a Kafka Consumer without downtime

2017-09-05 Thread Jeff Widman
accept. > > I’m curious to hear if anyone are experiencing the same issues, or if > anyone have any thoughts or opinions? Are we doing something wrong, or is > this something that can be solved by the Kafka Consumer client? > > Thanks, > > Håkon > -- *Jeff Widman* jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265) <><

Fix the wiki entry for KIP-62?

2017-10-18 Thread Jeff Widman
time given to consumers for message processing to 60 seconds* should read "5 mins" rather than "60 seconds". I tried to edit it, but I don't have proper permissions. Can someone either fix it or give me the appropriate permissions?

Need help understanding precedence of log.flush.interval.ms versus log.flush.scheduler.interval.ms

2017-11-02 Thread Jeff Widman
e guarantee only that the message will be flushed once *both* timers have expired?

What are reasonable limits for max number of consumer groups per partition and per broker?

2017-11-13 Thread Jeff Widman
sumer groups?

Re: Number of partition on a single broker

2017-12-14 Thread Jeff Widman
I have also seen slower replication across the cluster when partitions per broker are abnormally high, even though the bytes/message throughput isn't that high. Due to legacy reasons, we have a lot of partitions per broker, with only a handful really hot and the others just barely trickling data, o

Re: Usual remedy for "Under Replicated" and "Offline Partitions"

2018-02-02 Thread Jeff Widman
irection... Cheers, Jeff On Fri, Feb 2, 2018 at 11:27 AM, Richard Rodseth wrote: > We have a DataDog integration showing some metrics, and for one of our > clusters the above two > values are > 0 and highlighted in red. > > What's the usual remedy (Confluent Platform,

How to calculate consumer lag in wall-clock time by querying the broker?

2018-02-06 Thread Jeff Widman
ervice. Cheers, Jeff

Re: can't consume from partitions due to KAFKA-3963

2018-03-23 Thread Jeff Widman
You could have had split brain where multiple brokers thought they were the controller. This has been a problematic piece of Kafka for a long time and only in the last release or two have some of the edge cases been cleaned up. To force a controller re-election, remove the "/controller" znode from

How to decommission a broker so the controller doesn't return it in the list of known brokers?

2016-09-08 Thread Jeff Widman
known cluster brokers to the known live brokers in zookeeper. But would not be surprised if something in my mental model is incorrect. This is for Kafka 0.8.2. I am planning to upgrade to 0.10 in the not-too-distant future, so if 0.10 handles this differently, I'm also curious about th

Re: How to decommission a broker so the controller doesn't return it in the list of known brokers?

2016-09-09 Thread Jeff Widman
It looks like this problem is caused by this bug in Kafka 0.8, which was fixed in Kafka 0.9: https://issues.apache.org/jira/browse/KAFKA-972 On Thu, Sep 8, 2016 at 3:55 PM, Jeff Widman wrote: > How do I permanently remove a broker from a Kafka cluster? > > Scenario: > > I have a st

Re: Anyone running Kafka on Kubernetes in production?

2016-09-14 Thread Jeff Widman
ch services > / replication controllers you have) > > Also, how has the performance been for you? I've read a report which said > the performance suffered running kafka as a docker container.

Re: Manually update consumer offset stored in Kafka

2016-10-14 Thread Jeff Widman
I also would like to know this. Is the solution to just use a console producer against the internal topics that store the offsets? On Wed, Oct 12, 2016 at 2:26 PM, Yifan Ying wrote: > Hi, > > In old consumers, we use the following command line tool to manually update > offsets stored in zk: > >

Re: [VOTE] Add REST Server to Apache Kafka

2016-10-25 Thread Jeff Widman
-1 As an end-user, while I like the idea in theory, in practice I don't think it's a good idea just yet. Certainly, it'd be useful, enabling things like https://github.com/Landoop/kafka-topics-ui to work without needing anything outside of Kafka core. But there are already enough things in the e

Re: consumer_offsets partition skew and possibly ignored retention

2016-10-28 Thread Jeff Widman
James, What version did you experience the problem with? On Oct 28, 2016 6:26 PM, "James Brown" wrote: > I was having this problem with one of my __consumer_offsets partitions; I > used reassignment to move the large partition onto a different set of > machines (which forced the cleaner to run t

Cleanup partition offsets that exist for consumer groups but not in broker

2016-11-03 Thread Jeff Widman
We hit an error in some custom monitoring code for our Kafka cluster where the root cause was that zookeeper was storing some partition offsets for consumer groups, but those partitions didn't actually exist on the brokers. Apparently in the past, some colleagues needed to reset a stuck cluster cau

Re: Mysterious timeout

2016-11-04 Thread Jeff Widman
Mike, Did you ever figure this out? We're considering using Kafka on Kubernetes and very interested in how it's going for you. On Thu, Oct 27, 2016 at 8:34 AM, Martin Gainty wrote: > MG>can u write simpleConsumer to determine when lead broker times-out.. > then you'll need to tweak connection s

How to reset the GroupCoordinator's saved session timeout for a group?

2018-05-14 Thread Jeff Widman
? 2) Is there any way to force the GroupCoordinator to clear the session timeout or are we stuck waiting out the 12 hours? (or migrating offsets to a new consumer group name and starting the consumers again under a new name)

Re: Prioritized Topics for Kafka

2019-01-17 Thread Jeff Widman

Re: PR review

2019-06-20 Thread Jeff Widman
om/apache/kafka/pull/6771 > > Could this be reviewed for the new release? This is important for our project. > > Thanks,

Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-19 Thread Jeff Widman
I am also interested in learning how others are handling this. I also support several services where average processing time is 20 seconds per message but p99 time is about 20 minutes, and the stop-the-world rebalancing is very painful. On Fri, Jul 19, 2019, 11:38 AM Raman Gupta wrote:

Is it a bad idea to use periods within a consumer group name? "my-service.topic1_consumer_group"

2016-12-13 Thread Jeff Widman
I vaguely remember reading somewhere that it's a bad idea to use periods within Kafka consumer group names because it can potentially conflict with metric names. I've searched and am not finding anything, so am I just misremembering? It is operationally convenient because zookeeper CLI allows tab

How does max.poll.records behave when a single consumer is consuming multiple topics?

2016-12-14 Thread Jeff Widman
Scenario: Four topics, each with one partition having 1000 messages. A single consumer group subscribed to all four topics. Only two consumer processes within the consumer group. Using the default strategy, each individual consumer will be subscribed to two topic_partitions. Will a single call

Is it possible for consumers within a single consumer group to have different subscriptions?

2016-12-21 Thread Jeff Widman
Searched for a while and not finding a clear answer. Is it possible for consumers within a single consumer group to have different topic subscriptions? If not, and any one of the consumers calls subscribe() with a new topic list, how is that subscription propagated to the other consumers in the group

Re: Is it a bad idea to use periods within a consumer group name? "my-service.topic1_consumer_group"

2016-12-23 Thread Jeff Widman
ssues it is best to use either, but not both.* On Tue, Dec 13, 2016 at 10:08 PM, Praveen wrote: > Not that I know of. We at Flurry have been using periods in our group names > for a while now and haven't encountered any issues b/c of that. > > > > On Tue, Dec 13, 2016 at

Is this a bug or just unintuitive behavior?

2017-01-04 Thread Jeff Widman
I'm seeing consumers miss messages when they subscribe before the topic is actually created. Scenario: 1) kafka 0.10.1.1 cluster with no topics, but supports topic auto-creation as soon as a message is published to the topic 2) consumer subscribes using topic string or a regex pattern.

Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread Jeff Widman
you all messages from the beginning if you don't explicitly consume > from the beginning. > > > > Sent from my iPhone > > > >> On Jan 4, 2017, at 6:53 PM, Jeff Widman wrote: > >> > >> I'm seeing consumers miss messages when they subscribe before th

Is there a performance problem with new broker + old log.message.format.version + new consumer?

2017-01-11 Thread Jeff Widman
We upgraded our Kafka clusters from 0.8.2.1 to 0.10.0.1, but most of our consumers use older libraries that do not support the new message format. So we set the brokers' log.message.format.version to 0.8.2 while we work on upgrading our consumers. In the meantime, I'm worried about a performance p

Re: [VOTE] KIP-106 - Default unclean.leader.election.enabled True => False

2017-01-11 Thread Jeff Widman
+1 nonbinding. We were bit by this in a production environment. On Wed, Jan 11, 2017 at 11:42 AM, Ian Wrigley wrote: > +1 (non-binding) > > > On Jan 11, 2017, at 11:33 AM, Jay Kreps wrote: > > > > +1 > > > > On Wed, Jan 11, 2017 at 10:56 AM, Ben Stopford wrote: > > > >> Looks like there was a

Re: Correlation Id errors for both console producer and consumer

2017-01-17 Thread Jeff Widman
What versions of Kafka and Zookeeper are you using? On Tue, Jan 17, 2017 at 11:57 AM, Zac Harvey wrote: > I have 2 Kafkas backed by 3 ZK nodes. I want to test the Kafka nodes by > running the kafka-console-producer and -consumer locally on each node. > > So I SSH into one of my Kafka brokers usi

Re: Role of Zookeeper in Kafka Rest Proxy

2017-02-17 Thread Jeff Widman
Will this new release use a new consumer? On Feb 16, 2017 11:33 PM, wrote: > You can't integrate 3.1.1 REST Proxy with a secure cluster because it uses > the old consumer API (hence zookeeper dependency). The 3.2 REST Proxy will > allow you to integrate with a secure cluster because it is update

Re: Heartbeats while consuming a message in kafka-python

2017-02-21 Thread Jeff Widman
As far as I understood it, the primary thrust of KIP-62 was making it so heartbeats could be issued outside of the poll() loop, meaning that the session.timeout.ms could be reduced below the length of time it takes a consumer to process a particular batch of messages. Unfortunately, while both lib

Re: Heartbeats while consuming a message in kafka-python

2017-02-21 Thread Jeff Widman
fully supporting KIP-62 is on the roadmap of kafka-python > already, and maybe Magnus (cc'ed) can explain a bit more on the timeline of > it. > > > Guozhang > > > On Tue, Feb 21, 2017 at 12:06 PM, Jeff Widman wrote: > > > As far as I understood it, the p

Re: Creating topic partitions automatically using python

2017-02-23 Thread Jeff Widman
This is probably a better fit for the pykafka issue tracker. AFAIK, there's no public kafka API for creating partitions right now, so pykafka is likely hacking around this by calling out to internal Java APIs, so it will be brittle. On Thu, Feb 23, 2017 at 5:13 AM, VIVEK KUMAR MISHRA 13BIT0066 < v

Re: Question about messages in __consumer_offsets topic

2017-02-23 Thread Jeff Widman
The topic deletion only triggers tombstone on brokers >= 0.10.2, correct? I thought there was an outstanding bug report for this in lower versions... On Wed, Feb 22, 2017 at 6:17 PM, Hans Jespersen wrote: > The __consumer_offsets topic should also get a tombstone message as soon as > a topic is

Re: Recommended number of partitions on each broker

2017-03-02 Thread Jeff Widman
We normally run over 1,000 partitions per broker, and I know of a major company with 30+ kafka clusters that averages 1,100 partitions per broker across all clusters. So 300 shouldn't be an issue as long as the throughput per partition isn't too high. Given that disk and cpu are so low, I'd guess

What is request.timeout in the consumer used for?

2017-03-02 Thread Jeff Widman
In the consumer, what will trigger the request.timeout? Is it just if broker doesn't respond within that period of time? I'm guessing in a healthy cluster, the primary culprit for triggering this is if one of the steps within the consumer group rebalancing taking a long time of inter-broker commu

Re: How to set offset for a consumer in Kafka 0.10.0.X

2017-03-07 Thread Jeff Widman
Offsets for modern kafka consumers are stored in an internal Kafka topic, so they aren't as easy to change as zookeeper. To set a consumer offset, you need a consumer within a consumer group to call commit() with your explicit offset. If needed, you can create a dummy consumer and tell it to join

Re: How to set offset for a consumer in Kafka 0.10.0.X

2017-03-08 Thread Jeff Widman
> The right way to go about getting the offset set to a specific value > (12345678 in this example) for a specific consumer group? > > Regards > -- > Glen Ogilvie > Open Systems Specialists > Level 1, 162 Grafton Road > http://www.oss.co.nz/ > > Ph: +64 9 984

Re: How to set offset for a consumer in Kafka 0.10.0.X

2017-03-08 Thread Jeff Widman
c-partitions, seeks to the desired > offset, and commits. > - Offset tool shuts down > - Consumers then restart and re-join the consumer group, resuming at the > offsets that were last committed for each topic-partition > > On Wed, Mar 8, 2017 at 10:51 AM, Jeff Widman wrote: >
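
The stop, seek, commit, restart flow quoted above can be sketched in Python. This is a minimal sketch under assumptions, not the tool discussed in the thread: `reset_group_offsets` is a hypothetical helper, and `consumer` is assumed to expose `assign`, `seek`, `commit`, and `close` the way kafka-python's `KafkaConsumer` does. It also assumes every real consumer in the group has been stopped first.

```python
from typing import Any, Dict, Hashable


def reset_group_offsets(consumer: Any, desired: Dict[Hashable, int]) -> None:
    """Seek each partition to its desired offset, commit, and shut down.

    `consumer` is assumed to behave like kafka-python's KafkaConsumer,
    created with the target group_id and enable_auto_commit=False.
    `desired` maps partition handles (e.g. kafka.TopicPartition) to offsets.
    Hypothetical helper: run it only while the group's consumers are stopped.
    """
    try:
        for offset in desired.values():
            if offset < 0:
                raise ValueError(f"offset must be non-negative, got {offset}")
        # Manual assignment avoids taking part in the group's rebalancing.
        consumer.assign(list(desired))
        for partition, offset in desired.items():
            consumer.seek(partition, offset)
        # With no arguments, commit() stores the current positions,
        # which are now the offsets we just seeked to.
        consumer.commit()
    finally:
        # Always shut the offset tool's consumer down, even on error.
        consumer.close()
```

With kafka-python this might be called as `reset_group_offsets(KafkaConsumer(bootstrap_servers="localhost:9092", group_id="my-group", enable_auto_commit=False), {TopicPartition("my-topic", 0): 12345678})`; the broker address, group name, and topic name are placeholders.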

Re: ISR churn

2017-03-22 Thread Jeff Widman
To manually failover the controller, just delete the /controller znode in zookeeper. On Wed, Mar 22, 2017 at 11:46 AM, Marcos Juarez wrote: > We're seeing the same exact pattern of ISR shrinking/resizing, mostly on > partitions with the largest volume, with thousands of messages per second. > It

Re: any production deployment of kafka 0.10.2.0

2017-03-26 Thread Jeff Widman
+ Users list On Mar 26, 2017 8:17 AM, "Jianbin Wei" wrote: We are thinking about upgrading our system to 0.10.2.0. Has anybody upgraded his/her system to 0.10.2.0 and any issues? Regards, -- Jianbin

Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Jeff Widman
*As Onur explained, if ZK is down, Kafka can still work, but won't be able to react to actual broker failures until ZK is up again. So if a broker is down in that window, some of the partitions may not be ready for read or write.* We had a production scenario where ZK had a long GC pause and Kafka

Re: Re: ZK and Kafka failover testing

2017-04-19 Thread Jeff Widman
Oops, I linked to the wrong ticket, this is the one we hit: https://issues.apache.org/jira/browse/KAFKA-3042 On Wed, Apr 19, 2017 at 1:45 PM, Jeff Widman wrote: > > > > > > *As Onur explained, if ZK is down, Kafka can still work, but won't be able > to react to actua

Re: more than 1 active controller

2017-04-21 Thread Jeff Widman
Remove the /controller znode in zookeeper and it will force kafka to trigger a new controller re-election. On Fri, Apr 21, 2017 at 1:58 PM, wei wrote: > We noticed we have more than 2 active controllers. How can we fix the > issue? it has been for a few days. > > Thanks, > Wei >
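
The znode removal described above can be sketched in Python. This is a minimal sketch under assumptions: `force_controller_reelection` is a hypothetical helper, and `zk` is assumed to be an already started ZooKeeper client with `get` and `delete` methods like those of the kazoo library's `KazooClient`. The version check is an added precaution, not something taken from the thread.

```python
from typing import Any

CONTROLLER_PATH = "/controller"


def force_controller_reelection(zk: Any) -> bytes:
    """Delete the /controller znode so the brokers elect a new controller.

    `zk` is assumed to be a started client shaped like kazoo's KazooClient.
    Returns the znode's previous contents so the caller can log them.
    """
    data, stat = zk.get(CONTROLLER_PATH)
    # Pass the version we just read so the delete fails, rather than
    # removing a znode that a newer controller wrote in the meantime.
    zk.delete(CONTROLLER_PATH, version=stat.version)
    return data
```

With kazoo this might be used as `zk = KazooClient(hosts="localhost:2181")`, then `zk.start()`, `force_controller_reelection(zk)`, and `zk.stop()`; the host string is a placeholder.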

Why do I need to specify replication factor when creating a topic?

2017-05-11 Thread Jeff Widman
When creating a new topic, why do I need to specify the replication factor and number of partitions? I'd rather that, when omitted, Kafka default to the value set in server.properties. Was this an explicit design decision?

Re: Why do I need to specify replication factor when creating a topic?

2017-05-11 Thread Jeff Widman
ed to the value in server.properties, rather than our code having to figure out whether it's a dev vs production cluster. I'm aware we could hack around this by relying on topic auto-creation, but we'd rather disable that to prevent topics being accidentally created. On Thu, May 11

Re: Why do I need to specify replication factor when creating a topic?

2017-05-12 Thread Jeff Widman
as > I'm aware it will still be 2 calls (1 to get the default configs, another > to create the topics with those configs). > > -Tommy > > > From: Jeff Widman [j...@netskope.com] > Sent: Thursday, May 11, 2017 7:42 PM > To: use