[ 
https://issues.apache.org/jira/browse/OAK-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010908#comment-15010908
 ] 

Ian Boston commented on OAK-3478:
---------------------------------

@Chetan Mehrotra The bean "Oak Repository Statistics-simple" produces the right 
type of stats, but I think the "per second" approach won't work, as the 
sequence of numbers from the existing TimeSeries follows the pattern 
0,0,0,0,0,0,0,761000000,0,0,0,0,0,0,0 etc. If the monitoring tool happens to 
query either side of the 761000000 value, then it gets 0. To get 761000000 it 
has to query at exactly the right time. You could use the per minute 
value, but it would be better not to reinvent the research in this area and 
to look at what others have already proven to work in production.
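
To make the timing problem concrete, here is a minimal sketch that polls the 
series quoted above. The class and method names are hypothetical and are not 
part of Oak; the 5 second polling interval is an assumption chosen for 
illustration.

```java
final class PollingDemo {

    /** Per-second series from the comment: all zeros except a single spike. */
    static long[] spikySeries() {
        long[] series = new long[15];
        series[7] = 761000000L;
        return series;
    }

    /**
     * Sum of the values a monitoring tool would read if it polled the
     * "per second" value every intervalSeconds, starting at offsetSeconds.
     */
    static long sumSeenByPoller(long[] series, int offsetSeconds, int intervalSeconds) {
        long seen = 0;
        for (int t = offsetSeconds; t < series.length; t += intervalSeconds) {
            seen += series[t];
        }
        return seen;
    }

    public static void main(String[] args) {
        long[] series = spikySeries();
        // Try every possible phase of a 5 second polling interval.
        for (int offset = 0; offset < 5; offset++) {
            System.out.println("offset " + offset + "s -> "
                    + sumSeenByPoller(series, offset, 5));
        }
    }
}
```

With a 5 second interval only the poller that starts at offset 2 lands on the 
spike; the other four phases read nothing but zeros.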

Most metrics gathering systems use some form of moving average, e.g. 
http://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average, or a 
pure counter to let the monitoring tool do the work. Pure counters are not very 
good as they tend to hit overflow problems. Moving averages of some form are 
better, provided the windows or reservoirs can be implemented efficiently. 
Since averages are not always that useful when identifying performance 
problems, most metrics tools also provide a live histogram of the metric to 
produce percentiles such as the 50th and the 99.99th. The 99.99th percentile 
becomes important if the operation is performed thousands of times and each 
operation correlates with others, as an operation that is slow 0.001% of the 
time will impact all operations with absolute certainty. Oak has plenty of 
places where this is a characteristic, starting with queues. To produce those 
sorts of stats you may need to implement 
http://www.cs.umd.edu/~samir/498/vitter.pdf. 
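
As a rough illustration of the two techniques mentioned above, the sketch 
below shows an exponentially weighted moving average and a fixed-size 
reservoir that can answer percentile queries. It assumes the linked paper is 
Vitter's reservoir sampling paper and uses its simplest variant (Algorithm R). 
The class names, the smoothing factor and the nearest-rank quantile are 
illustrative choices, not code from Oak or from the Metrics library.

```java
import java.util.Arrays;
import java.util.Random;

final class MetricsSketch {

    /** Exponentially weighted moving average, updated once per fixed interval. */
    static final class Ewma {
        private final double alpha;   // smoothing factor in (0, 1]
        private double average;
        private boolean initialized;

        Ewma(double alpha) {
            this.alpha = alpha;
        }

        /** Feed the value observed during the latest interval. */
        void update(double value) {
            if (!initialized) {
                average = value;
                initialized = true;
            } else {
                average += alpha * (value - average);
            }
        }

        double get() {
            return average;
        }
    }

    /** Fixed-size uniform sample of a stream (reservoir sampling, Algorithm R). */
    static final class Reservoir {
        private final long[] values;
        private final Random random;
        private long count;

        Reservoir(int size, long seed) {
            this.values = new long[size];
            this.random = new Random(seed);
        }

        void update(long value) {
            count++;
            if (count <= values.length) {
                // Fill phase: keep the first 'size' values as they arrive.
                values[(int) (count - 1)] = value;
            } else {
                // Afterwards keep the new value with probability size / count.
                long slot = (long) (random.nextDouble() * count);
                if (slot < values.length) {
                    values[(int) slot] = value;
                }
            }
        }

        /** Nearest-rank quantile of the current sample, with q in [0, 1]. */
        long quantile(double q) {
            int n = (int) Math.min(count, values.length);
            if (n == 0) {
                throw new IllegalStateException("no samples recorded");
            }
            long[] sorted = Arrays.copyOf(values, n);
            Arrays.sort(sorted);
            int rank = (int) Math.ceil(q * n);
            return sorted[Math.max(rank, 1) - 1];
        }
    }
}
```

Unlike the raw per second value, the moving average decays gradually after a 
spike, so a poller that arrives a few intervals late still sees a non-zero 
reading. The reservoir uses constant memory regardless of how many operations 
are recorded, at the cost of percentiles being estimates from a sample.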

The link was taken from Codahale Metrics, which has various implementations 
tested in production. If Oak is not prepared to use a third-party library for 
metrics support, it should learn from what others have used successfully in 
production.


> Provide JMX Beans for Oak that can be monitored by external tooling.
> --------------------------------------------------------------------
>
>                 Key: OAK-3478
>                 URL: https://issues.apache.org/jira/browse/OAK-3478
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, jcr, lucene, query
>    Affects Versions: 1.3.7
>            Reporter: Ian Boston
>            Assignee: Chetan Mehrotra
>             Fix For: 1.3.11
>
>         Attachments: OAK-3478-v1.patch
>
>
> The current JMX beans, while OK in the Sling Web Console, are hard if not 
> impossible to monitor with external tooling, as external tooling will poll 
> for current values, ideally from named attributes containing primitive types. 
> Those values would come from timers, counters or gauges: timers time an 
> operation, counters count an operation, and gauges measure an instantaneous 
> value.
> The request is to provide a small number of JMX beans that can be configured 
> into an external monitoring tool like AppDynamics, Ganglia, NewRelic, Splunk 
> etc., which in turn will provide long term time series and statistics. 
> Primitive values of this form can also be graphed with ease in JConsole, 
> VisualVM etc. An improvement for the Sling Web Console might be to add a 
> console that can maintain a TimeSeries graph of any JMX bean by object name 
> in the same way Ganglia and AppDynamics do; however, that may be duplicating 
> existing functionality.
> The Metrics library could be considered to provide the above functionality 
> for all JMX beans and monitoring, although its footprint at 111K might be 
> considered too big as an additional dependency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
