[ https://issues.apache.org/jira/browse/SAMZA-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278061#comment-14278061 ]
Yan Fang commented on SAMZA-503:
--------------------------------

Had a look at this issue. Here are some points:

1. I guess the %s-%s-messages-behind-high-watermark gauge behaves as designed. It returns the difference between the latest offset of the incoming stream and that of the cache (BrokerProxy).

{code}
metrics.lag(tp).set(hw - nextOffset)
{code}

It is set only when Samza fetches data. Meanwhile, %s-%s-offset-change, %s-%s-bytes-read and %s-%s-messages-read have exactly the same behavior: they are all refreshed only when Samza fetches data, because they are only updated in _moveMessagesToTheirQueue_. So I think that, in the original design, those metrics are meant to measure the changes between the input stream and the cache.

2. If my understanding is correct, what [~theduderog] wants is a gauge that compares the latest offset of the incoming stream with the committed offset of Samza (the offset of the latest processed message). So even

{quote}
In the BrokerProxy, we currently just sleep if we don't need to fetch messages. We could do a fetch simply to get the latest offset, update the gauges, then sleep.
{quote}

does not meet his requirement, because with this change the gauge still shows "the difference between latest offset of the incoming stream and that of the cache (BrokerProxy)". It only refreshes the latest offset of the incoming stream. The BrokerProxy does not have the latest committed offset.

3. So a possible solution is to leave "-messages-behind-high-watermark" as is, but add a new metric that compares the committed offset with the latest offset. I am not sure which class it should go in. If we do not provide it out of the box, it can also be calculated in the Task (in process), because IncomingMessageEnvelope carries the offset information.

Any thoughts?
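To make point 3 a bit more concrete, here is a minimal sketch of the bookkeeping such a metric would need. It is plain Java with no Samza dependency. The class name ProcessedLagTracker and its methods are hypothetical and do not exist in Samza. It also assumes the high watermark is the offset of the next message to be written, so the lag mirrors the "hw - nextOffset" expression above with nextOffset taken as the last processed offset plus one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/**
 * Hypothetical helper (not part of Samza): tracks how far the last
 * processed offset trails the newest offset known for each partition.
 */
class ProcessedLagTracker {
    // Newest known offset per partition (assumed: offset of the next message to be written).
    private final Map<String, Long> highWatermarks = new HashMap<>();
    // Offset of the last message the task finished processing, per partition.
    private final Map<String, Long> processedOffsets = new HashMap<>();

    /** Record the latest high watermark seen for a partition. */
    public void updateHighWatermark(String partition, long highWatermark) {
        highWatermarks.put(partition, highWatermark);
    }

    /** Record the offset of the message that was just processed. */
    public void updateProcessed(String partition, long offset) {
        processedOffsets.put(partition, offset);
    }

    /**
     * Messages still waiting to be processed, or empty if either value
     * has not been observed yet for this partition.
     */
    public OptionalLong lag(String partition) {
        Long hw = highWatermarks.get(partition);
        Long processed = processedOffsets.get(partition);
        if (hw == null || processed == null) {
            return OptionalLong.empty();
        }
        long nextOffset = processed + 1; // next message the task would process
        return OptionalLong.of(Math.max(0L, hw - nextOffset));
    }
}
```

In a task, updateProcessed would be fed from the offset carried by each incoming envelope. The high watermark would still have to come from somewhere that refreshes it independently of fetches, which is the open question in this issue.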
> Lag gauge very slow to update for slow jobs
> -------------------------------------------
>
>                 Key: SAMZA-503
>                 URL: https://issues.apache.org/jira/browse/SAMZA-503
>             Project: Samza
>          Issue Type: Bug
>          Components: metrics
>   Affects Versions: 0.8.0
>         Environment: Mac OS X, Oracle Java 7, ProcessJobFactory
>            Reporter: Roger Hoover
>            Assignee: Yan Fang
>             Fix For: 0.9.0
>
>
> For slow jobs, the
> KafkaSystemConsumerMetrics.%s-%s-messages-behind-high-watermark gauge does
> not get updated very often.
> To reproduce:
> * Create a job that processes one message and sleeps for 5 seconds
> * Create its input topic but do not populate it yet
> * Start the job
> * Load 1000s of messages to its input topic. You can keep adding messages
> with a "wait -n 1 <kafka console producer command>"
> What happens:
> * Run jconsole to view the JMX metrics
> * The %s-%s-messages-behind-high-watermark gauge will stay at 0 for a LONG
> time (~10 minutes?) before finally updating.
> What should happen:
> * The gauge should get updated at a reasonable interval (at least every few
> seconds)
> I think what's happening is that the BrokerProxy only updates the high
> watermark when a consumer is ready for more messages. When the job is so
> slow, this rarely happens, so the metric doesn't get updated.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)