Federico Sacerdoti wrote:

On Tuesday, September 24, 2002, at 10:49 AM, Steven Wagner wrote:

People have cited disk I/O as a bottleneck. I personally doubt this. If it were true, you'd be seeing random gaps whenever RRD updates came thick and fast (i.e. while all threads were updating RRDs at once), and the failures should be at least a bit more distributed - not just in one cluster.


So we did notice that disk I/O was a bottleneck on the old version of gmetad. We had ~400 hosts, each with ~25 RRDs (incl. summaries) of 12K each, being updated once every 15 seconds. That's 10,000 file updates every 15 seconds, and since their aggregate size was 120MB, the set of files exceeded Linux's filesystem cache.
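
For anyone who wants to plug in their own cluster size, the arithmetic above can be sketched roughly as follows. This is only a back-of-the-envelope helper: the function name `rrd_load` and its parameters are made up for illustration and are not part of gmetad, and it assumes "12K" means 12 KB per file with decimal megabytes.

```python
def rrd_load(hosts, rrds_per_host, rrd_size_kb, interval_s):
    """Rough estimate of the RRD update load (hypothetical helper, not gmetad code)."""
    files = hosts * rrds_per_host          # every RRD is rewritten once per interval
    total_mb = files * rrd_size_kb / 1000  # aggregate size, assuming decimal MB
    updates_per_s = files / interval_s     # average file updates per second
    return files, total_mb, updates_per_s


# Figures quoted in the message: ~400 hosts, ~25 RRDs each, 12K per RRD, 15 s interval.
files, total_mb, updates_per_s = rrd_load(400, 25, 12, 15)
print(f"{files} files, {total_mb:.0f} MB, {updates_per_s:.0f} updates/s on average")
```

Whether the aggregate size actually overflows the filesystem cache depends on how much memory the kernel has free for caching on the gmetad host, which this estimate does not model.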

Erm, actually I meant "I doubt this is the case for you," as the stated number of hosts/clusters seemed small (and besides it wasn't occurring in 2.4.x, which records more metrics per host).

Bad day for me.  :S

