[jira] [Comment Edited] (OAK-3654) Integrate with Metrics for various stats collection

Chetan Mehrotra (JIRA) Mon, 23 Nov 2015 22:42:07 -0800

    [ 
https://issues.apache.org/jira/browse/OAK-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15023775#comment-15023775
 ]


Chetan Mehrotra edited comment on OAK-3654 at 11/24/15 6:40 AM:
----------------------------------------------------------------

bq. hence when ACCURATE.getTime() is called it will generally be slower than 
just calling System.nanoTime().

Ack

bq.  it explains why System.getCurrentTimeMillis() is faster than 
System.nanoTime() (measured at 13ns vs 18ns).

Yes thats the understanding. Key point here is that Metrics is being called in 
very critical path of read and write and if every such call involves a call to 
System.nanoTime then that raise a bit of concern. 

Now looking back at the change done so far I think there is some extra 
duplicate work now. Currently we maintain a pair of metrics
* SESSION_READ_COUNTER and SESSION_READ_DURATION
* SESSION_WRITE_COUNTER and SESSION_WRITE_DURATION
* QUERY_COUNT and QUERY_DURATION

In current approach for each pair we manage a pair of Meter and Timer. Hence 
any call now involve 4 calls to System.nanoTime() 
# one to determine duration, 
# 1 in Meter like SESSION_READ_COUNTER. Meter does a call for every mark in 
tickIfNecessary - This can be saved by turning it into a NoopMeter with r1716032
# 1 in Meter managed within the Timer like SESSION_READ_DURATION
# 1 in Histrogram (via ExponentiallyDecayingReservoir) being managed within the 
Timer 

Would have to refactor that to avoid the extra call. But with that also it 
would get down to 3

bq. Metrics has considerable production experience. I would avoid making 
fundamental changes unilaterally. If there is a real problem with the Metrics 
code and how they measure time, I would open an issue on their bug tracker to 
validate the issue is real.

Sure. Would try to get some numbers to benchmark.

btw looking at the issue tracker for Metric came across this interesting thread 
on [Mechanical 
Sympathy|https://groups.google.com/forum/#!msg/mechanical-sympathy/I4JfZQ1GYi8] 
which is very much related to latency measurement we are doing here. For very 
fast operation say 1000/sec it would be better to go for sampling to measure 
the timing. 

Also have a look at http://latencyutils.github.io/LatencyUtils/

So may be we should not do measurement in the very critical parts like read and 
write average. Still digging deeper ....


was (Author: chetanm):
bq. hence when ACCURATE.getTime() is called it will generally be slower than 
just calling System.nanoTime().

Ack

bq.  it explains why System.getCurrentTimeMillis() is faster than 
System.nanoTime() (measured at 13ns vs 18ns).

Yes thats the understanding. Key point here is that Metrics is being called in 
very critical path of read and write and if every such call involves a call to 
System.nanoTime then that raise a bit of concern. 

Now looking back at the change done so far I think there is some extra 
duplicate work now. Currently we maintain a pair of metrics
* SESSION_READ_COUNTER and SESSION_READ_DURATION
* SESSION_WRITE_COUNTER and SESSION_WRITE_DURATION
* QUERY_COUNT and QUERY_DURATION

In current approach for each pair we manage a pair of Meter and Timer. Hence 
any call now involve 3 calls to System.nanoTime() 
# one to determine duration, 
# 1 in Meter like SESSION_READ_COUNTER. Meter does a call for every mark in 
tickIfNecessary - This can be saved by turning it into a NoopMeter with r1716032
# 1 in Meter managed within the Timer like SESSION_READ_DURATION

Would have to refactor that to avoid the extra call. But with that also it 
would get down to 2

bq. Metrics has considerable production experience. I would avoid making 
fundamental changes unilaterally. If there is a real problem with the Metrics 
code and how they measure time, I would open an issue on their bug tracker to 
validate the issue is real.

Sure. Would try to get some numbers to benchmark

> Integrate with Metrics for various stats collection 
> ----------------------------------------------------
>
>                 Key: OAK-3654
>                 URL: https://issues.apache.org/jira/browse/OAK-3654
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: core
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.4
>
>         Attachments: OAK-3654-v1.patch, query-stats.png
>
>
> As suggested by [~ianeboston] in OAK-3478 current approach of collecting 
> TimeSeries data is not easily consumable by other monitoring systems. Also 
> just extracting the last moment data and exposing it in simple form would not 
> be useful. 
> Instead of that we should look into using Metrics library [1] for collecting 
> metrics. To avoid having dependency on Metrics API all over in Oak we can 
> come up with minimal interfaces which can be used in Oak and then provide an 
> implementation backed by Metric.
> This task is meant to explore that aspect and come up with proposed changes 
> to see if its feasible to make this change
> * metrics-core ~100KB in size with no dependency
> * ASL Licensee
> [1] http://metrics.dropwizard.io



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (OAK-3654) Integrate with Metrics for various stats collection

Reply via email to