Hello,
thanks. In the meantime I patched my Debian packages locally. But I ran into 
another race condition. rrd_update_r() isn't thread-safe, because the C locale 
is an application-wide variable. Assume one has rrdlib (A) and some other 
library (B) and the execution order is as follows:

(A1) old_locale = setlocale(...)
(B1) old_locale = setlocale(...)
(A2) // do some locale-dependent stuff
(A3) setlocale( old_locale )
(B2) // do some locale-dependent stuff
(B3) setlocale( old_locale )

Here, library B can be any library that also sets the global C locale from a 
different thread. In the best case some strings are misinterpreted, in the 
worst case memory gets corrupted :-( For now, I have written a workaround 
using an application-wide mutex that must be locked by any thread that wants 
to call any library that might change the global C locale. But of course this 
isn't very nice.
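
A minimal sketch of such a workaround, assuming POSIX threads. The names 
locale_lock, call_with_locale_lock and parse_number are hypothetical and only 
serve as illustration; parse_number stands in for a locale-sensitive call such 
as rrd_update_r(). The lock only helps if every thread in the application goes 
through it.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical application-wide lock. Every thread must hold it while it
 * calls into any library that may read or change the global C locale. */
static pthread_mutex_t locale_lock = PTHREAD_MUTEX_INITIALIZER;

typedef int (*locale_fn)(void *arg);

/* Runs fn(arg) while holding the lock and returns fn's result. */
int call_with_locale_lock(locale_fn fn, void *arg)
{
    pthread_mutex_lock(&locale_lock);
    int rc = fn(arg);
    pthread_mutex_unlock(&locale_lock);
    return rc;
}

/* Illustrative locale-sensitive job: strtod() depends on LC_NUMERIC. */
struct parse_job {
    const char *text;
    double value;
};

/* Returns 0 if a number was parsed, -1 otherwise. */
int parse_number(void *arg)
{
    struct parse_job *job = arg;
    char *end = NULL;
    job->value = strtod(job->text, &end);
    return end == job->text ? -1 : 0;
}
```

A caller would fill a struct parse_job and pass it as 
call_with_locale_lock(parse_number, &job). The drawback is the one described 
above: all locale-sensitive calls are serialized.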

Is there any chance that the rrd_update_r function (and relatives) will be 
rewritten? For example, C++ locales are bound to a specific stream and are 
not global. At the very least there should be a big, BIG warning at 
http://oss.oetiker.ch/rrdtool/prog/rrdthreads.en.html that the C locale is 
subject to a race condition.
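
For comparison, plain C on POSIX.1-2008 systems also offers per-thread locales 
through newlocale(), duplocale() and uselocale(). The following is only a 
sketch of how locale-dependent parsing could avoid the global locale; it 
assumes a glibc-like platform, the helper name parse_double_c_numeric is 
hypothetical, and it is not how rrdtool implements anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse text with "C" numeric rules without touching
 * the global locale. Returns 0 on success, -1 on failure. */
int parse_double_c_numeric(const char *text, double *out)
{
    /* Copy the locale currently in effect for this thread
     * (this may be the global locale). */
    locale_t copy = duplocale(uselocale((locale_t)0));
    if (copy == (locale_t)0)
        return -1;

    /* Switch only LC_NUMERIC of the copy to "C". On success
     * newlocale() takes ownership of copy. */
    locale_t c_numeric = newlocale(LC_NUMERIC_MASK, "C", copy);
    if (c_numeric == (locale_t)0) {
        freelocale(copy);
        return -1;
    }

    /* uselocale() affects the calling thread only. */
    locale_t previous = uselocale(c_numeric);
    char *end = NULL;
    double value = strtod(text, &end);
    uselocale(previous);
    freelocale(c_numeric);

    if (end == text)
        return -1;
    *out = value;
    return 0;
}
```

Because no call to setlocale() is involved, other threads never observe the 
temporary "C" locale, so the A1/B1 interleaving shown above cannot occur here.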

Regards, Matthias

On Sunday, 15 June 2014, 22:37:01, Tobias Oetiker wrote:
> Hi Matthias,
> 
> yes you are right ... we fixed this in master, but not in the 1.4
> branch ... it is now ...
> 
> cheers
> tobi
> 
> Today Matthias Nagel wrote:
> 
> > Hello,
> >
> > I am writing a multi-threaded C++ application that uses rrdlib natively by 
> > calling rrd_update_r().  If I compile without optimizations and enable 
> > -ggdb everything seems to work fine. As soon as I switch to -O2 and disable 
> > -ggdb my application crashes at runtime.
> >
> > If it crashes the output is either
> >
> > *** glibc detected *** rrdtool: <something>
> >
> > or
> >
> > expected timestamp not found in data source from <input>
> >
> > but <input> is not the string that was given to rrd_update_r but unreadable 
> > garbage. Obviously, it is a memory corruption problem. Therefore, I ran the 
> > application under valgrind and I noticed that the problem comes from 
> > inside rrdlib. The message is
> >
> >
> > ==11724== Invalid read of size 1
> > ==11724==    at 0x4C2A051: __GI_strcmp (mc_replace_strmem.c:712)
> > ==11724==    by 0x5A4FF7F: setlocale (setlocale.c:210)
> > ==11724==    by 0x505D06B: _rrd_update (rrd_update.c:982)
> > ==11724==  Address 0x9deb0d0 is 0 bytes inside a block of size 12 free'd
> > ==11724==    at 0x4C27D4E: free (vg_replace_malloc.c:427)
> > ==11724==    by 0x5A4FCBD: setname (setlocale.c:173)
> > ==11724==    by 0x5A500B0: setlocale (setlocale.c:417)
> > ==11724==    by 0x505D02D: _rrd_update (rrd_update.c:974)
> >
> > Let's have a look at it:
> >
> > rrd_update.c:973: old_locale = setlocale(LC_NUMERIC, NULL);
> > rrd_update.c:974: setlocale(LC_NUMERIC, "C");
> > rrd_update.c:982: setlocale(LC_NUMERIC, old_locale);
> >
> > The problem is obvious. The variable "old_locale" that is used at the 3rd 
> > line was assigned at the 1st line. But the 2nd call to "setlocale" freed 
> > the return value of the first call. According to the man pages the return 
> > value is a pointer to static memory and freed/allocated on every call. 
> > Actually the 2nd line (974) should be omitted and it should be
> >
> > rrd_update.c:973: old_locale = setlocale(LC_NUMERIC, "C" );
> > rrd_update.c:974: // deleted
> > rrd_update.c:982: setlocale(LC_NUMERIC, old_locale);
> >
> > Why this double call to "setlocale" anyway?
> >
> > Best regards, Matthias
> >
> >
> 
> 

-- 
Matthias Nagel
Parkstraße 27
76131 Karlsruhe

Festnetz: +49-721-96869289
Mobil: +49-151-15998774
e-Mail: [email protected]
ICQ: 499797758
Skype: nagmat84 
