[jira] [Commented] (SAMZA-503) Lag gauge very slow to update for slow jobs

Yan Fang (JIRA) Wed, 14 Jan 2015 17:39:07 -0800

    [ 
https://issues.apache.org/jira/browse/SAMZA-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278061#comment-14278061
 ]


Yan Fang commented on SAMZA-503:
--------------------------------

Had a look at this issue. Here are some points:

1. I think %s-%s-messages-behind-high-watermark behaves as designed. It 
returns the difference between the latest offset of the incoming stream and 
that of the cache (BrokerProxy). 
{code}
metrics.lag(tp).set(hw - nextOffset)
{code} 
It is set only when Samza fetches data. %s-%s-offset-change, 
%s-%s-bytes-read and %s-%s-messages-read behave exactly the same way: they are 
all refreshed only when Samza fetches data, in 
_moveMessagesToTheirQueue_. So I think that, in the original design, these 
metrics measure the changes between the input stream and the cache.
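
To illustrate why a gauge that is written only on fetch goes stale for a slow job, here is a minimal sketch in plain Java. It is not Samza code: the class FetchDrivenLagSketch and its methods are hypothetical stand-ins for the BrokerProxy and its lag gauge, and the numbers are made up.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the consumer-side cache; NOT Samza's BrokerProxy.
class FetchDrivenLagSketch {
    private final AtomicLong lagGauge = new AtomicLong(0);
    private long nextOffset = 0;

    // Mirrors the quoted line: the gauge is written only as part of a fetch.
    void fetch(long highWatermark, int messagesFetched) {
        nextOffset += messagesFetched;
        lagGauge.set(highWatermark - nextOffset);
    }

    long reportedLag() {
        return lagGauge.get();
    }

    long nextOffset() {
        return nextOffset;
    }

    public static void main(String[] args) {
        FetchDrivenLagSketch proxy = new FetchDrivenLagSketch();

        // The topic is still empty when the job starts.
        proxy.fetch(0, 0);

        // A producer appends 1000 messages, but the slow job does not ask
        // for more data, so no fetch happens and the gauge is not touched.
        long highWatermark = 1000;
        System.out.println("reported=" + proxy.reportedLag()
                + " actual=" + (highWatermark - proxy.nextOffset()));
        // prints "reported=0 actual=1000"

        // Only the next fetch refreshes the gauge.
        proxy.fetch(highWatermark, 50);
        System.out.println("reported=" + proxy.reportedLag());
        // prints "reported=950"
    }
}
```

The gauge is correct at the moment of each fetch; it is just not refreshed in between, which matches what the reporter observed.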

2. If my understanding is correct, what [~theduderog] wants is a gauge that 
compares the latest offset of the incoming stream with the committed offset of 
Samza (the offset of the latest processed message). So even
{quote}
  In the BrokerProxy, we currently just sleep if we don't need to fetch 
messages. We could do a fetch simply to get the latest offset, update the 
gauges, then sleep. 
{quote}
does not meet his requirement, because with this change the gauge still 
shows "the difference between latest offset of the incoming stream and that 
of the cache (BrokerProxy)". It only refreshes the latest offset of the 
incoming stream; the BrokerProxy does not have the latest committed offset.

3. So a possible solution is to leave "-messages-behind-high-watermark" as 
is and add a new metric that compares the committed offset with the latest 
offset. I am not sure which class it should go in. If we do not provide it out 
of the box, it can also be calculated in the Task (in process) because 
IncomingMessageEnvelope carries the offset information.
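
As a rough sketch of the task-side calculation, the plain Java class below tracks the last processed offset per partition and derives the lag from it. Everything here is an assumption for illustration: ProcessedLagSketch, recordProcessed and lag are hypothetical names, offsets are assumed to be numeric strings (as with Kafka), partitions are assumed to start at offset 0, and the latest offset is passed in from outside because the envelope only carries the offset of its own message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper a task could call from process(); NOT part of Samza.
class ProcessedLagSketch {
    // Last processed offset per partition, keyed by a caller-chosen name.
    private final Map<String, Long> lastProcessed = new HashMap<>();

    // Call once per processed message with the offset string taken from
    // the incoming envelope. Assumes the offset is a numeric string.
    void recordProcessed(String partition, String offset) {
        lastProcessed.put(partition, Long.parseLong(offset));
    }

    // Messages not yet processed. latestOffset is assumed to be the offset
    // the next produced message will get (the high watermark), and must be
    // supplied by the caller.
    long lag(String partition, long latestOffset) {
        Long processed = lastProcessed.get(partition);
        // Assumption: nothing processed yet means we start from offset 0.
        long next = (processed == null) ? 0L : processed + 1;
        return Math.max(0L, latestOffset - next);
    }

    public static void main(String[] args) {
        ProcessedLagSketch tracker = new ProcessedLagSketch();
        tracker.recordProcessed("input-0", "4"); // offsets 0..4 are done
        System.out.println(tracker.lag("input-0", 1000)); // prints "995"
    }
}
```

Where the latest offset would come from is the open part of this idea; the sketch deliberately leaves it as a parameter.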

Any thoughts?


> Lag gauge very slow to update for slow jobs
> -------------------------------------------
>
>                 Key: SAMZA-503
>                 URL: https://issues.apache.org/jira/browse/SAMZA-503
>             Project: Samza
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.8.0
>         Environment: Mac OS X, Oracle Java 7, ProcessJobFactory
>            Reporter: Roger Hoover
>            Assignee: Yan Fang
>             Fix For: 0.9.0
>
>
> For slow jobs, the 
> KafkaSystemConsumerMetrics.%s-%s-messages-behind-high-watermark gauge does 
> not get updated very often.
> To reproduce:
> * Create a job that processes one message and sleeps for 5 seconds
> * Create its input topic but do not populate it yet
> * Start the job
> * Load 1000s of messages to its input topic.  You can keep adding messages 
> with a "wait -n 1 <kafka console producer command>"
> What happens:
> * Run jconsole to view the JMX metrics
> * The %s-%s-messages-behind-high-watermark gauge will stay at 0 for a LONG 
> time (~10 minutes?) before finally updating.
> What should happen:
> * The gauge should get updated at a reasonable interval (at least every few 
> seconds)
> I think what's happening is that the BrokerProxy only updates the high 
> watermark when a consumer is ready for more messages.  When the job is so 
> slow, this rarely happens, so the metric doesn't get updated. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
