Hi Alex, thanks for the feedback.

I think the two options that you outlined both assume that there are updates at 
least every second. In that case, as you said, you have the option of pushing 
info to TimeSeries as actual updates happen (which could be more frequent than 
every second) or you could aggregate updates into a number and push the 
aggregate number every second.

However, my point was more about the case when updates arrive less often than 
once per second, say every 2 seconds. In that case, you'd see a TimeSeries like 
100, 0, 40, 0, … and the problem is that it makes you think the actual bundle 
count is 100 and then drops to 0, when it actually means that at time 1 the 
bundle count is 100 and at time 2 it didn't change, i.e. it is still 100.
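
To make the difference concrete, here is a minimal sketch in plain Java. It is 
not the Jackrabbit TimeSeries code; the class and method names are made up for 
illustration, and it assumes that each pushed value (100, then 40) is an 
absolute count rather than a delta.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the Jackrabbit TimeSeries implementation. */
final class GaugeVsResetDemo {

    /**
     * Builds a per-second series from sparse updates.
     * updates[i] is the value pushed during second i, or null if nothing was pushed.
     * With resetEachSecond == true, a second without an update records 0;
     * otherwise it repeats the last pushed value.
     */
    static long[] record(Long[] updates, boolean resetEachSecond) {
        long[] series = new long[updates.length];
        long last = 0;
        for (int i = 0; i < updates.length; i++) {
            if (updates[i] != null) {
                last = updates[i];
                series[i] = last;
            } else {
                series[i] = resetEachSecond ? 0 : last;
            }
        }
        return series;
    }

    public static void main(String[] args) {
        // An update arrives only every 2 seconds.
        Long[] updates = {100L, null, 40L, null};
        System.out.println(Arrays.toString(record(updates, true)));  // [100, 0, 40, 0]
        System.out.println(Arrays.toString(record(updates, false))); // [100, 100, 40, 40]
    }
}
```

The second series is what I'd expect to read for a count or a size: seconds 
without an update repeat the last value instead of dropping to zero.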

This resetting every second is fine for things like SESSION_READ_COUNTER, where 
we actually care about the frequency of session reads, and frequency is by 
definition per second. However, for things like session count, bundle count, and 
bundle size, we care about a number that increases or decreases but is not a 
frequency (i.e. it shouldn't reset every second).

SESSION_COUNT gets this right (it doesn't reset every second), and I do think 
that BUNDLE_COUNTER and BUNDLE_WS_SIZE_COUNTER should do the same because they 
represent numbers, not frequencies. Otherwise, the implementations have to 
push the same values again every second just to avoid getting zeros, and that's 
unnecessary IMO.

-Mete

From: Alex Parvulescu
Date: Thu, 16 Feb 2012 06:15:37 -0800
Subject: Re: RepositoryStatistics question

Hi Mete,

The answer depends on the implementation, and as you can see on the wiki page 
[0] these two counters have not been implemented yet :)

You have 2 options here:

1) is what you listed above: you provide an incremental implementation where 
each time there is a change to the number of bundles, you push that info to the 
counter via the #addAndGet(long delta) method.
In this case, yes we should change to non-incremental.

or

2) similar to BUNDLE_CACHE_SIZE_COUNTER you provide the absolute number each 
time via #set(long newValue), possibly at a lower frequency than the one at 
which the actual updates come in.

So, as I've said it depends on the actual implementation and not having one yet 
means that if you find it easier the other way around (#1) we can switch 
without breaking anything.
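
The two update styles above can be sketched roughly as follows. This assumes 
the counter behaves like a java.util.concurrent.atomic.AtomicLong, which does 
offer addAndGet(long) and set(long); the wrapper class and its method names are 
hypothetical and not part of Jackrabbit.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration only; AtomicLong stands in for the statistics counter. */
final class CounterUpdateStyles {

    /** Option 1: push every change to the number of bundles as a delta. */
    static long applyDeltas(AtomicLong counter, long... deltas) {
        long current = counter.get();
        for (long delta : deltas) {
            current = counter.addAndGet(delta);
        }
        return current;
    }

    /** Option 2: overwrite the counter with the absolute number, e.g. once per interval. */
    static long publishAbsolute(AtomicLong counter, long actualCount) {
        counter.set(actualCount);
        return counter.get();
    }

    public static void main(String[] args) {
        AtomicLong incremental = new AtomicLong();
        // 100 bundles created, then 60 removed, pushed as the changes happen.
        System.out.println(applyDeltas(incremental, 100, -60)); // 40

        AtomicLong absolute = new AtomicLong();
        // The same end state published as a single absolute value.
        System.out.println(publishAbsolute(absolute, 40)); // 40
    }
}
```

Both styles end at the same value; the difference is whether the caller tracks 
the changes or the total.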


best,
alex

[0] http://wiki.apache.org/jackrabbit/Statistics


On Thu, Feb 16, 2012 at 2:20 PM, Mete Atamel wrote:
Hi,

I was looking at some of the values of Type enum in RepositoryStatistics class 
and these two caught my attention:


        BUNDLE_COUNTER(true),

        BUNDLE_WS_SIZE_COUNTER(true),

I'm assuming BUNDLE_COUNTER is the current number of nodes/bundles and 
BUNDLE_WS_SIZE_COUNTER is the total size of all nodes/bundles in the workspace 
(correct me if I'm wrong). The problem is that these enums are initialized with 
true, meaning they will reset their counts to zero every second and this is 
kind of weird.

For example, if 100 nodes are created at time 1 and again at time 3, you'd see 
something like this for BUNDLE_COUNTER: 100, 0, 100, 0, 0, ... But I'd expect to 
see 100, 100, 200, 200, 200, … because BUNDLE_COUNTER sounds like the total 
number of nodes at the current time rather than the new bundles created at the 
current time. If it were named NEW_BUNDLE_COUNT, I'd expect to see 100, 0, 100, 
0, 0, … The same goes for BUNDLE_WS_SIZE_COUNTER.
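
The relation between the two series in this example is just a running total, 
which a small Java sketch can show. The class and method names are made up for 
illustration and are not part of RepositoryStatistics.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of RepositoryStatistics. */
final class RunningTotalDemo {

    /** Turns "new bundles per second" into "total bundles at each second". */
    static long[] runningTotal(long[] newPerSecond) {
        long[] totals = new long[newPerSecond.length];
        long sum = 0;
        for (int i = 0; i < newPerSecond.length; i++) {
            sum += newPerSecond[i];
            totals[i] = sum;
        }
        return totals;
    }

    public static void main(String[] args) {
        // 100 nodes created at time 1 and again at time 3.
        long[] created = {100, 0, 100, 0, 0};
        System.out.println(Arrays.toString(runningTotal(created))); // [100, 100, 200, 200, 200]
    }
}
```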

So, I'm suggesting that we change these enum values to initialize with false 
instead and I'm curious to know what everyone else thinks.

Thanks,
Mete
