[
https://issues.apache.org/jira/browse/SAMZA-349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118473#comment-14118473
]
Yan Fang commented on SAMZA-349:
--------------------------------
{quote}
Did you benchmark/profile this? I'm a little concerned that if we're keeping
every single timer event within a 30 sec interval (not downsampling), the
memory and CPU overhead could become significant. The reservoir might well end
up containing several million values.
{quote}
* I did not benchmark it. Maybe I should add that.
* Actually, the default is "300s", not "30s". :( Currently the metric calculates
the average time a code block takes within the 300s window. I have the same
concern, though users can set the timer interval themselves. Is there a way
around this problem? I feel the reservoir has to keep all the events in the
time interval.
* What does "downsampling" mean here?
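For illustration, one common reading of "downsampling" in this context is to keep a bounded, uniformly random sample of the timings instead of every event. The sketch below uses reservoir sampling (Algorithm R); it is not the SAMZA-349 patch, the class and method names are hypothetical, and it ignores the time window entirely (a windowed variant would have to be reset or decayed per interval).

```java
// Sketch only: a fixed-size uniform sample (Algorithm R). Hypothetical names,
// not part of the Samza API or the SAMZA-349 patch.
final class DownsamplingReservoirSketch {
  private final long[] samples;
  private final java.util.Random random;
  private long seen = 0; // total number of values offered so far

  DownsamplingReservoirSketch(int capacity, long seed) {
    if (capacity <= 0) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    this.samples = new long[capacity];
    this.random = new java.util.Random(seed);
  }

  // Offer one timing. Memory stays bounded by the capacity, however many
  // events arrive.
  synchronized void update(long value) {
    seen++;
    if (seen <= samples.length) {
      // Still filling the reservoir: store every value.
      samples[(int) (seen - 1)] = value;
    } else {
      // Keep the new value with probability capacity / seen by drawing a
      // slot in [0, seen) and storing only if it falls inside the array.
      long slot = (long) (random.nextDouble() * seen);
      if (slot < samples.length) {
        samples[(int) slot] = value;
      }
    }
  }

  // Number of values currently held (at most the capacity).
  synchronized int size() {
    return (int) Math.min(seen, samples.length);
  }

  // Copy of the values currently held.
  synchronized long[] snapshot() {
    return java.util.Arrays.copyOf(samples, size());
  }
}
```

The trade-off is that means and percentiles computed from such a sample are estimates rather than exact values for the interval.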
{quote}
Did you consider using System.nanoTime() instead of System.currentTimeMillis()?
For many jobs, a call to process() will hopefully take less than a millisecond,
so a millisecond-resolution timer metric would be useless.
{quote}
Yes, this is an option. I chose ms because even hello-samza takes more than
1 ms in my setup. We can always switch to nanoTime if ms turns out to be useless.
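For reference, timing a block with System.nanoTime() could look roughly like the sketch below. The helper names are hypothetical and this is not how the patch measures process(); it only shows that sub-millisecond durations are preserved and can still be reported in ms.

```java
// Sketch only: hypothetical helper, not part of the Samza API.
final class NanoTimerSketch {
  // Runs the block and returns the elapsed time in nanoseconds.
  static long timeNanos(Runnable block) {
    long start = System.nanoTime();
    block.run();
    return System.nanoTime() - start;
  }

  // Converts nanoseconds to fractional milliseconds for reporting.
  static double toMillis(long nanos) {
    return nanos / 1_000_000.0;
  }
}
```

Only differences between two System.nanoTime() readings are meaningful; the absolute value has no relation to wall-clock time.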
{quote}
Were you planning to add percentile metrics? If not, you don't really need a
reservoir and snapshots (eg. the mean can be calculated just with a running sum
and count).
{quote}
Because I am not sure whether we will need percentiles in the future, or other
ways of calculating time (besides a sliding window), I kept these two classes to
make future implementation easier.
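For comparison, the running sum and count approach mentioned in the quote could be sketched as below. The class is hypothetical and assumes the caller resets it at each reporting interval; it keeps constant memory but cannot produce percentiles.

```java
// Sketch only: mean from a running sum and count, no reservoir or snapshot.
// Hypothetical names, not part of the Samza API.
final class RunningMeanSketch {
  private long sum = 0;
  private long count = 0;

  // Record one duration (any consistent unit, e.g. ms).
  synchronized void update(long duration) {
    sum += duration;
    count++;
  }

  // Mean of the recorded durations, or 0.0 if nothing was recorded.
  synchronized double mean() {
    return count == 0 ? 0.0 : (double) sum / count;
  }

  // Start a new interval.
  synchronized void reset() {
    sum = 0;
    count = 0;
  }
}
```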
{quote}
Suggestion: it would be useful to add "utilization" (aka "duty cycle") as a
metric, which is the sum of all the timings divided by the window length. That
can tell you how much idle time there is in the event loop (how much headroom
before the job will start falling behind).
{quote}
Agreed, this seems useful. Opened SAMZA-401 to implement this.
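As a rough sketch of the suggested metric, utilization is the sum of the timings in a window divided by the window length. The class below is hypothetical (it is not the SAMZA-401 implementation) and assumes the caller invokes utilizationAndReset() once per window.

```java
// Sketch only: utilization ("duty cycle") = busy time / window length.
// Hypothetical names, not part of the Samza API.
final class UtilizationSketch {
  private final long windowNanos;
  private long busyNanos = 0;

  UtilizationSketch(long windowNanos) {
    if (windowNanos <= 0) {
      throw new IllegalArgumentException("window must be positive");
    }
    this.windowNanos = windowNanos;
  }

  // Add the duration of one timed call to the current window.
  synchronized void record(long durationNanos) {
    busyNanos += durationNanos;
  }

  // Fraction of the window spent busy; resets the sum for the next window.
  synchronized double utilizationAndReset() {
    double utilization = (double) busyNanos / windowNanos;
    busyNanos = 0;
    return utilization;
  }
}
```

With a single-threaded event loop, a value near 1.0 would suggest little headroom, and 1.0 minus the value approximates the idle fraction.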
> add timer in metrics
> --------------------
>
> Key: SAMZA-349
> URL: https://issues.apache.org/jira/browse/SAMZA-349
> Project: Samza
> Issue Type: Bug
> Reporter: Yan Fang
> Assignee: Yan Fang
> Attachments: SAMZA-349.1.patch, SAMZA-349.2.patch, SAMZA-349.3.patch,
> SAMZA-349.patch, SAMZA-349.patch
>
>
> If my understanding is correct, the metrics we provide are for every 60
> seconds and all counters will be reset every 60 seconds. Currently the
> MetricsSnapshotReporter seems to be missing this implementation. It sends out
> the metrics every 60 seconds but does not reset the counter values.
> {code}
> registry.getGroup(group).foreach {
>   case (name, metric) =>
>     metric.visit(new MetricsVisitor {
>       def counter(counter: Counter) = groupMsg.put(name,
>         counter.getCount: java.lang.Long)
>       def gauge[T](gauge: Gauge[T]) = groupMsg.put(name,
>         gauge.getValue.asInstanceOf[Object])
>     })
> }
> {code}