Now that's a big setup.

I think the corruption is a result of the code not correctly handling 
out-of-memory conditions, so if your version isn't experiencing a memory leak 
you shouldn't be affected even if the bug is present in the version you're 
running.  The big problem is the memory leak itself, and I suspect I'll need 
to learn to use valgrind to track it down.
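For the record, a minimal sketch of how one might run rrdcached under valgrind 
to hunt a leak -- assuming the -g (stay in foreground) flag is available in the 
build, and with illustrative paths and listen address:

```shell
# Run rrdcached in the foreground (-g) under valgrind's memcheck so that
# leaked allocations are reported when the daemon exits.
valgrind --leak-check=full --show-leak-kinds=definite \
    --log-file=rrdcached-valgrind.log \
    /usr/bin/rrdcached -g -b /rrds/hosts -j /rrds/journal -l 127.0.0.1:42217
# Exercise it with a representative mix of update/fetch/flush traffic,
# then stop it cleanly (SIGTERM) and inspect rrdcached-valgrind.log.
```

The clean shutdown matters: valgrind can only distinguish "definitely lost" 
blocks from still-reachable ones once the process has exited.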

This points the finger at the newer code (last and info, most likely, since 
create is rarely done) as the cause of any leaks, though there were a couple 
of additional changes between r2092 and r2136 that could be to blame.

I might try out the -a option; we've not used it yet, as it's a new one in 
1.4 trunk.

Steve


________________________________
Steve Shipway
ITS Unix Services Design Lead
University of Auckland, New Zealand
Floor 1, 58 Symonds Street, Auckland
Phone: +64 (0)9 3737599 ext 86487
DDI: +64 (0)9 924 6487
Mobile: +64 (0)21 753 189
Email: [email protected]<mailto:[email protected]>
Please consider the environment before printing this e-mail


From: Thorsten von Eicken [mailto:[email protected]]
Sent: Friday, 22 October 2010 3:13 p.m.
To: Steve Shipway
Cc: kevin brintnall; [email protected]; [email protected]
Subject: Re: [rrd-developers] rrdcached use corrupting RRD files (trunk)

Sadly interesting...
As a separate data point, we're running over 100 rrdcached servers, each 
handling >30k tree nodes and receiving about 3k updates/sec, caching data for 
~1 hour so updating files at ~20 updates/sec. Uptime in months without problem, 
never seen corruption (knock on wood). We're running 1.4 trunk revision r2092 
(randomly picked) on Ubuntu 8.04 (used to run on CentOS 5.2, I believe). We're 
not seeing any memory leak and running stable at 800-900MB virtual / 500-600MB 
rss. We're using TCP sockets and doing updates, fetches and flushes. The 
command line we use is:
/usr/bin/rrdcached -w 3600 -z 3600 -f 7200 -t 2 -a 128 -b /rrds/hosts -B -j 
/rrds/journal -p /var/run/rrdcached/rrdcached.pid -l 10.x.x.x:xxxx
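Broken out for anyone following along -- the same command, with flag meanings 
as I read them from the rrdcached(1) manual (the redacted listen address is 
kept as-is):

```shell
# Same command as above; per-flag meanings per rrdcached(1), as I read them:
#   -w 3600   write values older than 3600 s out to disk
#   -z 3600   randomly delay each write by up to 3600 s to spread the I/O load
#   -f 7200   periodically sweep for and flush values older than 7200 s
#   -t 2      use two write threads
#   -a 128    allocate value slots in chunks of 128 (the option discussed here)
#   -b / -B   base directory for RRD files; -B rejects paths outside it
#   -j        journal directory, for replay after a crash
#   -p        pid file
#   -l        TCP address to listen on
/usr/bin/rrdcached -w 3600 -z 3600 -f 7200 -t 2 -a 128 -b /rrds/hosts -B \
    -j /rrds/journal -p /var/run/rrdcached/rrdcached.pid -l 10.x.x.x:xxxx
```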
I'm not writing this to contradict you, I'm just wondering what could be 
different in your set-up that causes the problems. (Oh, that reminds me that 
the -a 128 made a huge difference for us around memory allocation performance.)
Good luck!
TvE



_______________________________________________
rrd-users mailing list
[email protected]
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users