Well, I've currently "solved" the monitoring problem on our end by skipping
the JMX attributes and using the 'kafka-rb' Ruby gem to read the latest
offset for the topic. Here is a comparison of the offsets read from the
consumer, JMX, and the Ruby gem:

Storm Consumer offset:
---------------------------------
{"offset"=>2847272176, "partition"=>2, "broker"=>{"host"=>"10.120.x.x",
"port"=>9092}, "topic"=>"mcommits"}

JMX Attributes:
---------------------
attrs: {"CurrentOffset"=>162919578, "Name"=>"mcommits-2",
"NumAppendedMessages"=>7285958, "NumberOfSegments"=>3, "Size"=>1236661577}

OFFSETS command sent from Ruby gem:
------------------------------------------------------------
10.120.x.x:mcommits:2: latest offset: 2847274728

This is the listing of the log directory for this topic:partition:

-rw-r--r-- 1 root root 536871010 2012-11-16 05:36 00000000001610613151.kafka
-rw-r--r-- 1 root root 536870989 2012-11-20 14:28 00000000002147484161.kafka
-rw-r--r-- 1 root root 162919578 2012-11-21 15:48 00000000002684355150.kafka

So, the "latest offset" from the Ruby gem matches what I would expect to
see -- only slightly ahead of the active consumer. AFAICT, the values from
JMX aren't usable to monitor this. Should I file a bug or feature request
to publish an offset in the JMX attributes that matches the 'latest offset'
read from the Kafka server?
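
For reference, the numbers above are consistent with one reading, which I
have not confirmed against the broker source: offsets are byte positions,
each segment file is named after the offset at which it starts, and the JMX
CurrentOffset and Size attributes report the size of the active segment and
the total size of the retained segments, respectively. Under that
assumption, the latest offset can be reconstructed from the directory
listing alone. A minimal Ruby sketch using only the values pasted above
(the helper names base_offset and log_end_offset are made up for
illustration):

```ruby
# Segment listing copied from the directory output above: [file name, size in bytes].
SEGMENTS = [
  ["00000000001610613151.kafka", 536_871_010],
  ["00000000002147484161.kafka", 536_870_989],
  ["00000000002684355150.kafka", 162_919_578],
].freeze

# Values copied from the three readings above.
CONSUMER_OFFSET    = 2_847_272_176 # Storm consumer offset
GEM_LATEST_OFFSET  = 2_847_274_728 # OFFSETS command via the Ruby gem
JMX_CURRENT_OFFSET = 162_919_578   # JMX "CurrentOffset"
JMX_SIZE           = 1_236_661_577 # JMX "Size"

# Assumption: the numeric part of the file name is the byte offset at which
# the segment starts. Parsed explicitly as base 10 because of the leading zeros.
def base_offset(file_name)
  File.basename(file_name, ".kafka").to_i(10)
end

# Assumption: end of log = start offset of the newest segment + its size.
def log_end_offset(segments)
  name, size = segments.max_by { |file_name, _| base_offset(file_name) }
  base_offset(name) + size
end

derived = log_end_offset(SEGMENTS)
puts "derived log end offset:        #{derived}"
puts "matches gem latest offset:     #{derived == GEM_LATEST_OFFSET}"
puts "consumer lag in bytes:         #{derived - CONSUMER_OFFSET}"
puts "CurrentOffset == last segment: #{JMX_CURRENT_OFFSET == SEGMENTS.last[1]}"
puts "Size == sum of segments:       #{JMX_SIZE == SEGMENTS.sum { |_, s| s }}"
```

With these inputs the derived end offset equals the 2847274728 reported by
the gem, and the consumer is 2552 bytes behind it.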

Cheers,

Mike


On Wed, Nov 21, 2012 at 12:08 AM, Jun Rao <jun...@gmail.com> wrote:

> The attribute getCurrentOffset gives the log end offset. It's not
> necessarily the log size though since older segments could be deleted.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 20, 2012 at 1:12 PM, Mike Heffner <m...@librato.com> wrote:
>
> > Jun,
> >
> > Do you have any idea what the JMX attribute values on the beans
> > "kafka:type=kafka.logs.{topic name}-{partition idx}" represent then? It
> > seems like these should correctly represent the current offsets of the
> > producer logs? They appeared to track correctly for a while, but once the
> > log size grew, they seemed to no longer be correct. Is there potentially a
> > bug in these values at large log sizes?
> >
> > I can try the other interface, but it would be nice to know what's wrong
> > with the current JMX values.
> >
> > Thanks,
> >
> > Mike
> >
> >
> > On Tue, Nov 20, 2012 at 12:12 PM, Jun Rao <jun...@gmail.com> wrote:
> >
> > > The tool gets the end offset of the log using getOffsetsBefore and the
> > > consumer offset from ZK. It then calculates the lag.
> > >
> > > We do have a JMX for lag in ZookeeperConsumerConnector. The api is the
> > > following, but you need to provide topic/brokerid/partitionid.
> > >
> > > /**
> > >  *  JMX interface for monitoring consumer
> > >  */
> > > trait ZookeeperConsumerConnectorMBean {
> > >   def getPartOwnerStats: String
> > >   def getConsumerGroup: String
> > >   def getOffsetLag(topic: String, brokerId: Int, partitionId: Int): Long
> > >   def getConsumedOffset(topic: String, brokerId: Int, partitionId: Int): Long
> > >   def getLatestOffset(topic: String, brokerId: Int, partitionId: Int): Long
> > > }
> > >
> > > Thanks
> > >
> > > Jun
> > >
> > > On Tue, Nov 20, 2012 at 8:03 AM, Mike Heffner <m...@librato.com> wrote:
> > >
> > > > I have not tried that yet, I was hoping to use an existing Ruby monitoring
> > > > process that we use to monitor several other existing resources.  I also
> > > > don't want to make changes to the Kafka consumer code, as it's part of a
> > > > bundled package (Storm).
> > > >
> > > > Where does ConsumerOffsetChecker pull its information from? Shouldn't the
> > > > values from JMX match? Guess I might need to look at its source code to
> > > > figure out what it's doing.
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Nov 20, 2012 at 12:34 AM, Jun Rao <jun...@gmail.com> wrote:
> > > >
> > > > > Instead of using jmx, have you tried using ConsumerOffsetChecker to figure
> > > > > out the consumer lag?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Mon, Nov 19, 2012 at 7:10 PM, Mike Heffner <m...@librato.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I am trying to write a custom monitoring script for our Kafka setup and
> > > > > > would like some help understanding how to interpret the JMX attributes.
> > > > > >
> > > > > > In our setup, the consumers are writing their current offset to a path in
> > > > > > ZK. This is the value they are getting back from a call
> > > > > > to SimpleConsumer.getOffsetsBefore(). A snapshot of this value looks like:
> > > > > >
> > > > > > {"offset"=>5338008447, "partition"=>2, "broker"=>{"host"=>"10.x.x.94",
> > > > > > "port"=>9092}, "topic"=>"mcommits"}
> > > > > >
> > > > > > Using the MX4J interface, I poll the
> > > > > > bean "kafka:type=kafka.logs.mcommits-2" on host 10.x.x.94 and get the
> > > > > > attribute values:
> > > > > >
> > > > > > {"CurrentOffset"=>506171524, "Name"=>"mcommits-2",
> > > > > > "NumAppendedMessages"=>10526508, "NumberOfSegments"=>4, "Size"=>2116784530}
> > > > > >
> > > > > > At the time both of these values were snapshotted, this consumer was close
> > > > > > to the end of the log file. In that case, I would expect both offsets to be
> > > > > > fairly similar, however the consumer offset is >> the producer log offset,
> > > > > > which doesn't make sense.
> > > > > >
> > > > > > Clearly there is something I'm not understanding. How do I use the JMX
> > > > > > attributes to calculate how far behind the consumer is from the end of the
> > > > > > log file? In this scenario the consumer offset is >> both the CurrentOffset
> > > > > > value and the Size value. Is there a way of interpreting these values that
> > > > > > I'm not seeing?
> > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mike
> > > > > >
> > > > > > --
> > > > > >
> > > > > >   Mike Heffner <m...@librato.com>
> > > > > >   Librato, Inc.
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > >   Mike Heffner <m...@librato.com>
> > > >   Librato, Inc.
> > > >
> > >
> >
> >
> >
> > --
> >
> >   Mike Heffner <m...@librato.com>
> >   Librato, Inc.
> >
>



-- 

  Mike Heffner <m...@librato.com>
  Librato, Inc.
