On 28 Jun 2009, at 18:50, Paul Davis wrote:
2. Would it be helpful to be able to enable/disable stats completely?
These calculations must add some overhead.
That definitely seems reasonable though I'm not entirely certain how
best to implement this.
I'd opt for a ./configure option --disable-stats and -ifdef() based
conditional code.
Cheers
Jan
--
3. The use of moving averages is great, but as you comment there can be
quite a lot of variability within a given time interval. Moving averages
are generally useful only over time; for example, in making short-term
trading decisions a moving average can help guess the direction of the
next reversion to a mean. In this scenario I would think peak usages
would also be of value. One could maintain min/max stats with respect to
these moving averages, along with a time interval, in order to identify
hot spots.
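To make the min/max suggestion concrete, here is a minimal Python sketch of keeping peaks next to a moving average. It is only an illustration: the WindowStats class and its fixed-size window are hypothetical and are not taken from the patch or from couch_stats_aggregator.

```python
from collections import deque


class WindowStats:
    """Moving average plus min/max over the last `window` samples.

    Hypothetical helper for illustration; not part of CouchDB.
    """

    def __init__(self, window):
        # A bounded deque drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def add(self, value):
        self.samples.append(value)

    def summary(self):
        if not self.samples:
            return None
        return {
            "mean": sum(self.samples) / len(self.samples),
            "min": min(self.samples),
            "max": max(self.samples),
        }


stats = WindowStats(window=3)
for v in [10, 12, 50, 11]:
    stats.add(v)
result = stats.summary()
print(result)
```

With a window of three samples the spike of 50 is still visible in "max" even though the mean smooths it out, which is the kind of hot spot the min/max pair is meant to expose.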
Sounds reasonable. I'm not sure if min/max is more or less proper than
quartiles. Or maybe just different? My stats-fu is less than stellar.
I'll have a closer look and write some tests.
On Jun 27, 2009, at 9:32 PM, Paul Joseph Davis (JIRA) wrote:
Fixing weirdness in couch_stats_aggregator.erl
----------------------------------------------
Key: COUCHDB-396
URL: https://issues.apache.org/jira/browse/COUCHDB-396
Project: CouchDB
Issue Type: Improvement
Components: Database Core, HTTP Interface
Affects Versions: 0.10
Environment: trunk
Reporter: Paul Joseph Davis
Assignee: Paul Joseph Davis
Fix For: 0.10
Attachments: couchdb_stats_aggregator.patch
Looking at adding unit tests to the couch_stats_aggregator module the
other day I realized it was doing some odd calculations. This is a fairly
non-trivial patch so I figured that I'd put it in JIRA and get feedback
before applying. This patch does everything the old version does afaict,
but I'll be adding tests before I consider it complete.
List of major changes:
* The old behavior for stats was to integrate incoming values for a
  time period and then reset the values and start integrating again.
  That seemed a bit odd so I rewrote things to keep the average and
  standard deviation for the last N seconds with approximately 1 sample
  per second.
* Changed request timing calculations [note below]
* Sample periods are configurable in the .ini file. Sample periods of 0
  are a special case and integrate all values from couchdb boot up.
* Sample descriptions are in the configuration files now.
* You can request different time periods for the root stats endpoint.
* Added a sum to the list of statistics
* Simplified some of the external API
The biggest change is in how time for requests is calculated. AFAICT,
the old way was accumulating request timings in the stats collector and
just adding new values as clock ticks went by, as everything else does,
which makes sense in the case of resetting counters every time period.
In the new way I'm keeping a list of the samples in the last time period,
and when I get a clock tick part of the update is to remove the samples
that have passed out of the time period. For a variable like
request_time, keeping every individual timing this way would lead to
unbounded storage.
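The windowing described above (one sample per clock tick, old samples evicted on each tick, period 0 meaning "since boot") can be sketched roughly as follows. This is Python rather than the Erlang of the actual patch, and the SlidingAggregator class, its method names, and the use of the sample standard deviation are all assumptions made for illustration.

```python
import math
from collections import deque


class SlidingAggregator:
    """Keeps (tick, value) samples for the last `period` ticks.

    Hypothetical sketch; a period of 0 never evicts, i.e. it keeps
    everything since start-up.
    """

    def __init__(self, period):
        self.period = period
        self.samples = deque()

    def tick(self, now, value):
        self.samples.append((now, value))
        if self.period > 0:
            # Remove samples that have passed out of the time period.
            while self.samples and self.samples[0][0] <= now - self.period:
                self.samples.popleft()

    def mean(self):
        values = [v for _, v in self.samples]
        return sum(values) / len(values) if values else 0.0

    def stddev(self):
        # Sample standard deviation (n - 1 denominator); an assumption.
        values = [v for _, v in self.samples]
        if len(values) < 2:
            return 0.0
        m = sum(values) / len(values)
        return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


agg = SlidingAggregator(period=3)
forever = SlidingAggregator(period=0)
for now, value in enumerate([1, 2, 3, 4, 5]):
    agg.tick(now, value)
    forever.tick(now, value)
print(agg.mean(), agg.stddev(), forever.mean())
```

Because only one value is stored per tick, the window holds at most `period` entries no matter how many requests arrive, which is what keeps storage bounded.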
The new method is calculating the average time of all requests in a
single clock tick (1s). One thing this loses is when you start having
lots of variability in a single clock tick. I.e., your average request
time is 100ms, but 10% of your requests are taking 500ms. I've read of
people doing the averaging trick but also storing quantile information
as well [1]. There are also algorithms for doing single pass quantile
estimation and the like [2], so it's possible to do those things in O(N)
time. The issue with quantiles is that it'd start breaking the logic of
how the collector and aggregators are set up. As it is now, there's
basically a one event -> one stat constraint. For the time being I went
without quantiles to minimize the impact of the patch.
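A small worked example of what per-tick averaging hides. The numbers are made up for illustration (90 requests at 100ms and 10 at 500ms within one tick), and the quantile here is computed exactly by sorting with the nearest-rank rule, not by the single-pass estimator mentioned above; the function names are hypothetical.

```python
import math


def tick_average(timings_ms):
    """Collapse all request timings from one clock tick into one value."""
    return sum(timings_ms) / len(timings_ms)


def nearest_rank_quantile(timings_ms, q):
    """Exact quantile by sorting (nearest-rank rule), O(N log N)."""
    ordered = sorted(timings_ms)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Assumed sample data for a single 1s tick.
timings = [100] * 90 + [500] * 10

avg = tick_average(timings)
p50 = nearest_rank_quantile(timings, 0.5)
p95 = nearest_rank_quantile(timings, 0.95)
print(avg, p50, p95)
```

The tick average comes out at 140ms, a value no request actually took, while the median stays at 100ms and the 95th percentile shows the 500ms tail. Storing the quantiles would mean more than one stat per event, which is the constraint the patch avoids touching.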
This code will also be on github [3] as I add patches.
[1] http://code.flickr.com/blog/2008/10/27/counting-timing/
[2] http://www.slamb.org/svn/repos/trunk/projects/loadtest/benchtools/stats.py
    (See the QuantileEstimator class)
[3] http://github.com/davisp/couchdb/tree/stats-patch