Hello,

thanks. In the meantime I patched my Debian packages locally.

But I ran into another race condition. rrd_update_r() isn't thread-safe, because the C locale is an application-wide variable. Assume one has rrdlib (A) and some other library (B), and the execution order is as follows:

    (A1) old_locale = setlocale(...)
    (B1) old_locale = setlocale(...)
    (A2) // do some locale-dependent stuff
    (A3) setlocale( old_locale )
    (B2) // do some locale-dependent stuff
    (B3) setlocale( old_locale )

Here, library B can be any library that also sets the global C locale from within a different thread. In the best case some strings are misinterpreted; in the worst case memory gets corrupted :-(

At the moment I work around this with an application-wide mutex that must be locked by any thread that wants to call any library that might change the global C locale. But of course this isn't very nice.

Is there any chance that the rrd_update_r function (and its relatives) will be rewritten? For example, C++ locales are bound to a specific stream and are not global. At the very least there should be a big, BIG warning at http://oss.oetiker.ch/rrdtool/prog/rrdthreads.en.html that the C locale is subject to a race condition.

Regards, Matthias

On Sunday, 15 June 2014, 22:37:01, Tobias Oetiker wrote:
> Hi Matthias,
>
> yes you are right ... we fixed this in master, but not in the 1.4
> branch ... it is now ...
>
> cheers
> tobi
>
> Today Matthias Nagel wrote:
>
> > Hello,
> >
> > I am writing a multi-threaded C++ application that uses rrdlib natively by
> > calling rrd_update_r(). If I compile without optimizations and enable
> > -ggdb everything seems to work fine. As soon as I switch to -O2 and disable
> > -ggdb my application crashes at runtime.
> >
> > If it crashes the output is either
> >
> > *** glibc detected *** rrdtool: <something>
> >
> > or
> >
> > expected timestamp not found in data source from <input>
> >
> > but <input> is not the string that was given to rrd_update_r but unreadable
> > garbage. Obviously, it is a memory corruption problem. Therefore, I ran the
> > application under valgrind and I noticed that the problem comes from
> > inside of the rrdlib.
> > The message is
> >
> > ==11724== Invalid read of size 1
> > ==11724==    at 0x4C2A051: __GI_strcmp (mc_replace_strmem.c:712)
> > ==11724==    by 0x5A4FF7F: setlocale (setlocale.c:210)
> > ==11724==    by 0x505D06B: _rrd_update (rrd_update.c:982)
> > ==11724==  Address 0x9deb0d0 is 0 bytes inside a block of size 12 free'd
> > ==11724==    at 0x4C27D4E: free (vg_replace_malloc.c:427)
> > ==11724==    by 0x5A4FCBD: setname (setlocale.c:173)
> > ==11724==    by 0x5A500B0: setlocale (setlocale.c:417)
> > ==11724==    by 0x505D02D: _rrd_update (rrd_update.c:974)
> >
> > Let's have a look at it:
> >
> > rrd_update.c:973: old_locale = setlocale(LC_NUMERIC, NULL);
> > rrd_update.c:974: setlocale(LC_NUMERIC, "C");
> > rrd_update.c:982: setlocale(LC_NUMERIC, old_locale);
> >
> > The problem is obvious. The variable "old_locale" that is used on the 3rd
> > line was assigned on the 1st line. But the 2nd call to "setlocale" freed
> > the return value of the first call. According to the man pages the return
> > value is a pointer to static memory and is freed/allocated on every call.
> > Actually the 2nd line (974) should be omitted and it should be
> >
> > rrd_update.c:973: old_locale = setlocale(LC_NUMERIC, "C" );
> > rrd_update.c:974: // deleted
> > rrd_update.c:982: setlocale(LC_NUMERIC, old_locale);
> >
> > Why this double call to "setlocale" anyway?
> >
> > Best regards, Matthias

--
Matthias Nagel
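
The invalid read in the valgrind trace quoted above can be avoided by copying the locale name before setlocale() is called a second time. What follows is only a minimal sketch under that assumption, not the code that went into rrdtool; the helper names numeric_locale_push_c() and numeric_locale_pop() are hypothetical. One detail worth keeping in mind: setlocale(LC_NUMERIC, "C") returns the name of the locale in effect after the call, so its return value cannot stand in for the old locale, and the query with NULL is still needed.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: switch LC_NUMERIC to "C" and return a private copy
 * of the previous locale name, or NULL on failure. The caller owns the
 * copy and hands it back to numeric_locale_pop(). */
char *numeric_locale_push_c(void)
{
    /* Query only. The returned pointer may become invalid as soon as
     * setlocale() is called again, so it must be copied first. */
    const char *current = setlocale(LC_NUMERIC, NULL);
    if (current == NULL)
        return NULL;

    size_t len = strlen(current) + 1;
    char *saved = malloc(len);
    if (saved == NULL)
        return NULL;
    memcpy(saved, current, len);

    if (setlocale(LC_NUMERIC, "C") == NULL) {
        free(saved);
        return NULL;
    }
    return saved;
}

/* Hypothetical helper: restore the locale saved by numeric_locale_push_c()
 * and release the copy. Passing NULL is a no-op. */
void numeric_locale_pop(char *saved)
{
    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);
        free(saved);
    }
}
```

A caller would wrap the locale-dependent section between the two helpers. This only removes the use of freed memory; setlocale() still changes process-wide state, so calls from several threads still need external locking.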

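
For the race between threads described at the top of this thread, POSIX.1-2008 provides per-thread locales through newlocale() and uselocale(), which avoid touching the global C locale at all. The following is a sketch assuming a platform that offers these functions (glibc does); parse_double_c_locale() is a hypothetical name used for illustration and is not taken from rrdtool.

```c
#define _POSIX_C_SOURCE 200809L  /* for newlocale(), uselocale(), freelocale() */
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal number with "." as the decimal
 * point, independent of the process-wide locale and without calling
 * setlocale(). Returns 0 on success and -1 on failure. */
int parse_double_c_locale(const char *text, double *out)
{
    /* Build a locale object with LC_NUMERIC set to "C". Categories that
     * are not in the mask default to the POSIX locale. */
    locale_t c_loc = newlocale(LC_NUMERIC_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    /* Install it for the calling thread only; other threads keep
     * whatever locale they were using. */
    locale_t previous = uselocale(c_loc);

    char *end = NULL;
    double value = strtod(text, &end);

    /* Put back what this thread used before, then release the object. */
    uselocale(previous);
    freelocale(c_loc);

    if (end == text)
        return -1;   /* nothing could be parsed */
    *out = value;
    return 0;
}
```

Because uselocale() affects only the calling thread, two libraries doing this in different threads do not interfere with each other, and no application-wide mutex is needed for this section.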