[ 
https://issues.apache.org/jira/browse/OAK-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320547#comment-15320547
 ] 

Michael Dürig commented on OAK-4445:
------------------------------------

Collecting initial statistics at 
http://svn.apache.org/viewvc?rev=1747392&view=rev

This is by no means done yet, and I left a FIXME in the code to that effect. 
However, the current statistics already allow us to better analyse system 
behaviour in longevity tests. 

The collected data is currently written to the log file at INFO level whenever 
writing a node involves some form of compaction (either explicit, from the 
compactor, or implicit, caused by a deferred compaction):

{noformat}
14:53:26.086 INFO  [pool-1-thread-39] SegmentWriter.java:357 
NodeStats{op=compact, nodeCount=1687, writeOps=1356, deDupNodes=0, 
cacheHits=331, cacheMiss=80, hitRate=80.5352798053528}
{noformat}

* op: always compact as normal write operations are too frequent and are not 
logged.
* nodeCount: total number of nodes in the sub-tree rooted at the written node
* writeOps: number of nodes that actually had to be written as there was no 
de-duplication and no cache hit
* deDupNodes: number of nodes that were de-duplicated as the store already 
contained them
* cacheHits: number of cache hits for a deferred compacted node
* cacheMiss: number of cache misses for a deferred compacted node
* hitRate: percentage of cacheHits wrt. total cache accesses 
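
As a quick sanity check, the hitRate in the sample line above can be reproduced from cacheHits and cacheMiss. The following is a minimal sketch, not the code in SegmentWriter: the class and method names are hypothetical, and returning 0 when there were no cache accesses is an assumption. In the sample line the counters also happen to add up (writeOps + deDupNodes + cacheHits = nodeCount); whether that holds in general is likewise an assumption.

```java
// Hypothetical helper, not part of Oak: re-derives values from the logged counters.
final class NodeStatsCheck {
    private NodeStatsCheck() {}

    /** Percentage of cache hits relative to all cache accesses (hits + misses). */
    static double hitRate(long cacheHits, long cacheMiss) {
        long accesses = cacheHits + cacheMiss;
        // Assumption: report 0 when the cache was never consulted, avoiding division by zero.
        return accesses == 0 ? 0.0 : 100.0 * cacheHits / accesses;
    }

    /** True if every counted node was either written, de-duplicated or served from the cache. */
    static boolean countersAddUp(long nodeCount, long writeOps, long deDupNodes, long cacheHits) {
        return nodeCount == writeOps + deDupNodes + cacheHits;
    }

    public static void main(String[] args) {
        // Counters taken from the sample log line above.
        System.out.println("hitRate=" + hitRate(331, 80));
        System.out.println("countersAddUp=" + countersAddUp(1687, 1356, 0, 331));
    }
}
```

With the sample values this gives a hit rate of roughly 80.54, matching the logged figure.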

In addition, some descriptive statistics for the time (in nanoseconds) of normal 
write operations vs. write operations involving a deferred compaction are 
logged once per GC cycle. These statistics are accumulated *per* GC cycle.

{noformat}
14:53:23.160 INFO  [pool-1-thread-46] SegmentWriter.java:350 Write node stats: 
DescriptiveStatistics:
n: 472
min: 10000.0
max: 5.5568E7
mean: 511688.5593220339
std dev: 4075371.0625786274
median: 13000.0
skewness: 9.870001377872049
kurtosis: 106.2014803156692

14:53:23.165 INFO  [pool-1-thread-46] SegmentWriter.java:351 Compact node 
stats: DescriptiveStatistics:
n: 12
min: 659000.0
max: 4.3388E7
mean: 2.1615916666666668E7
std dev: 1.5782288948279006E7
median: 2.20125E7
skewness: 0.02796112575661888
kurtosis: -1.7958717412611467
{noformat}
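
To illustrate what "accumulated per GC cycle" could look like, here is a minimal plain-Java sketch that records write durations in nanoseconds and summarises them at the end of a cycle. It is not the implementation in SegmentWriter: the class and method names are hypothetical, only a few of the logged figures (n, mean, median) are computed, and resetting after each summary is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not Oak's implementation: collects write durations
// (in nanoseconds) for one GC cycle and summarises them when the cycle ends.
final class CycleTimings {
    private final List<Long> samples = new ArrayList<>();

    /** Records the duration of a single write operation. */
    void record(long nanos) {
        samples.add(nanos);
    }

    long n() {
        return samples.size();
    }

    double mean() {
        if (samples.isEmpty()) return Double.NaN;
        double sum = 0;
        for (long s : samples) sum += s;
        return sum / samples.size();
    }

    double median() {
        if (samples.isEmpty()) return Double.NaN;
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        // Even sample count: average the two middle values.
        return sorted.size() % 2 == 1
                ? sorted.get(mid)
                : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    /** Builds a summary for the finished cycle and starts a fresh one. */
    String summariseAndReset() {
        String summary = "n: " + n() + ", mean: " + mean() + ", median: " + median();
        samples.clear();
        return summary;
    }
}
```

Reporting the median next to the mean matters for data like the write node stats above: a few slow writes pull the mean (about 0.5 ms) far above the median (13 µs), which the high skewness also reflects.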

[~frm], [~volteanu] FYI


> Collect write statistics 
> -------------------------
>
>                 Key: OAK-4445
>                 URL: https://issues.apache.org/jira/browse/OAK-4445
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: compaction, gc, monitoring
>             Fix For: 1.6
>
>
> We should come up with a good set of write statistics to collect, such as the 
> number of records/nodes/properties/bytes. Additionally, those statistics 
> should be collected for normal operation vs. compaction-related operation. 
> This would allow us to more precisely analyse the effect of compaction on the 
> overall system. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
