ConsumerOffsetChecker gets the end offset of the log using getOffsetsBefore and reads the consumer offset from ZK. It then calculates the lag as the difference between the two.
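As a rough sketch of that subtraction (not the tool's actual code): the object and method names below are made up for illustration, the consumer offset is the one quoted later in this thread, and the log end offset is an assumed value.

```scala
// Hypothetical helper for illustration only; not part of Kafka.
object LagSketch {
  // Lag is the log's end offset minus the offset the consumer has recorded in ZK.
  def lag(logEndOffset: Long, consumerOffset: Long): Long =
    logEndOffset - consumerOffset

  def main(args: Array[String]): Unit = {
    val consumerOffset = 5338008447L // consumer offset from the ZK snapshot in this thread
    val logEndOffset   = 5338010000L // assumed value, standing in for a getOffsetsBefore result
    println(lag(logEndOffset, consumerOffset)) // prints 1553
  }
}
```

Both numbers have to refer to the same topic, broker and partition for the result to mean anything.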
We do have a JMX bean for lag in ZookeeperConsumerConnector. The API is the
following, but you need to provide topic/brokerid/partitionid.

/**
 * JMX interface for monitoring consumer
 */
trait ZookeeperConsumerConnectorMBean {
  def getPartOwnerStats: String
  def getConsumerGroup: String
  def getOffsetLag(topic: String, brokerId: Int, partitionId: Int): Long
  def getConsumedOffset(topic: String, brokerId: Int, partitionId: Int): Long
  def getLatestOffset(topic: String, brokerId: Int, partitionId: Int): Long
}

Thanks,

Jun

On Tue, Nov 20, 2012 at 8:03 AM, Mike Heffner <m...@librato.com> wrote:

> I have not tried that yet, I was hoping to use an existing Ruby monitoring
> process that we use to monitor several other existing resources. I also
> don't want to make changes to the Kafka consumer code, as it's part of a
> bundled package (Storm).
>
> Where does ConsumerOffsetChecker pull its information from? Shouldn't the
> values from JMX match? Guess I might need to look at its source code to
> figure out what it's doing.
>
> On Tue, Nov 20, 2012 at 12:34 AM, Jun Rao <jun...@gmail.com> wrote:
>
> > Instead of using jmx, have you tried using ConsumerOffsetChecker to
> > figure out the consumer lag?
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Nov 19, 2012 at 7:10 PM, Mike Heffner <m...@librato.com> wrote:
> >
> > > Hi,
> > >
> > > I am trying to write a custom monitoring script for our Kafka setup and
> > > would like some help understanding how to interpret the JMX attributes.
> > >
> > > In our setup, the consumers are writing their current offset to a path
> > > in ZK. This is the value they are getting back from a call
> > > to SimpleConsumer.getOffsetsBefore(). A snapshot of this value looks
> > > like:
> > >
> > > {"offset"=>5338008447, "partition"=>2, "broker"=>{"host"=>"10.x.x.94",
> > > "port"=>9092}, "topic"=>"mcommits"}
> > >
> > > Using the MX4J interface, I poll the
> > > bean "kafka:type=kafka.logs.mcommits-2" on host 10.x.x.94 and get the
> > > attribute values:
> > >
> > > {"CurrentOffset"=>506171524, "Name"=>"mcommits-2",
> > > "NumAppendedMessages"=>10526508, "NumberOfSegments"=>4,
> > > "Size"=>2116784530}
> > >
> > > At the time both of these values were snapshotted, this consumer was
> > > close to the end of the log file. In that case, I would expect both
> > > offsets to be fairly similar, however the consumer offset is >> the
> > > producer log offset, which doesn't make sense.
> > >
> > > Clearly there is something I'm not understanding. How do I use the JMX
> > > attributes to calculate how far behind the consumer is from the end of
> > > the log file? In this scenario the consumer offset is >> both the
> > > CurrentOffset value and the Size value. Is there a way of interpreting
> > > these values that I'm not seeing?
> > >
> > > Thanks,
> > >
> > > Mike
> > >
> > > --
> > >
> > > Mike Heffner <m...@librato.com>
> > > Librato, Inc.
>
> --
>
> Mike Heffner <m...@librato.com>
> Librato, Inc.