I should mention I was checking status via Burrow's web server.
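
A minimal sketch of polling that status endpoint might look like the following. Assumptions: Burrow's HTTP server on localhost:8100, the "betwave" cluster name from the URL in the thread, and a hypothetical group name; the response shape (an "error" flag plus a nested "status" object) follows Burrow's v2 HTTP API, so verify it against your Burrow version.

```python
import json
import urllib.request


def parse_burrow_status(body):
    """Extract the overall group status (e.g. OK/WARN/ERR) from a Burrow
    v2 consumer-status response, plus any partitions flagged as lagging."""
    if body.get("error"):
        raise RuntimeError(body.get("message", "Burrow returned an error"))
    status = body["status"]
    return status["status"], status.get("partitions", [])


def fetch_burrow_status(cluster, group, host="http://localhost:8100"):
    """Fetch and parse the status for one consumer group.
    'cluster' and 'group' here are placeholders for your deployment."""
    url = "{}/v2/kafka/{}/consumer/{}/status".format(host, cluster, group)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_burrow_status(json.load(resp))
```

Parsing is split from fetching so the status-handling logic can be exercised without a running Burrow instance.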

> On 8 Jul 2016, at 16:56, Tom Dearman <tom.dear...@gmail.com> wrote:
> 
> Todd,
> 
> Thanks for that I am taking a look.
> 
> Is there a bug whereby, if a topic only has a couple of messages, all with 
> the same key, Burrow doesn't return correct info?  I was finding that 
> http://localhost:8100/v2/kafka/betwave/consumer was returning a response 
> with an empty consumer list until I produced another message with a 
> different key, i.e. until at least two partitions had something in them.  I 
> know this is not very production-like, but locally I was only testing with 
> one user, so only one partition was filled.
> 
> Tom
>> On 6 Jul 2016, at 18:08, Todd Palino <tpal...@gmail.com> wrote:
>> 
>> Yeah, I've written dissertations at this point on why MaxLag is flawed. We
>> also used to use the offset checker tool, and later something similar that
>> was a little easier to slot into our monitoring systems. Problems with all
>> of these are why I wrote Burrow (https://github.com/linkedin/Burrow).
>> 
>> For more details, you can also check out my blog post on the release:
>> https://engineering.linkedin.com/apache-kafka/burrow-kafka-consumer-monitoring-reinvented
>> 
>> -Todd
>> 
>> On Wednesday, July 6, 2016, Tom Dearman <tom.dear...@gmail.com> wrote:
>> 
>>> I recently had a problem in production which I believe was a
>>> manifestation of the issue KAFKA-2978 (Topic partition is not sometimes
>>> consumed after rebalancing of consumer group); this is fixed in 0.9.0.1 and
>>> we will upgrade our client soon.  However, it made me realise that I didn’t
>>> have any monitoring set up for this.  The only thing I can find as a metric
>>> is
>>> kafka.consumer:type=ConsumerFetcherManager,name=MaxLag,clientId=([-.\w]+),
>>> which, if I understand correctly, is the max lag of any partition that that
>>> particular consumer is consuming.
>>> 1. If I had been monitoring this, and if my consumer was suffering from
>>> the issue in KAFKA-2978, would I actually have been alerted?  I.e., since
>>> the consumer would think it was consuming correctly, would it not have kept
>>> updating the metric as if nothing were wrong?
>>> 2. There is another way to see offset lag using the command
>>> /usr/bin/kafka-consumer-groups --new-consumer --bootstrap-server
>>> 10.10.1.61:9092 --describe —group consumer_group_name and parsing the
>>> response.  Is it safe or advisable to do this?  I like the fact that it
>>> tells me each partition lag, although it is also not available if no
>>> consumer from the group is currently consuming.
>>> 3. Is there a better way of doing this?
>> 
>> 
>> 
>> -- 
>> *Todd Palino*
>> Staff Site Reliability Engineer
>> Data Infrastructure Streaming
>> 
>> 
>> 
>> linkedin.com/in/toddpalino
> 