Re: Thread issue - Possible fix
Marcelo Ferreira [EMAIL PROTECTED] wrote: I tested the 20020203 snapshot with the _r() changes. I ran then server without the -s option (radiusd -p 1812 ) and it cored dump: sigh These problems are a real pain to track down. If I run it with -s (radiusd -s -p 1812) works fine. Yeah, it's probably a thread issue. The rest of the thread-unsafe functions should be replaced with the thread-safe version, but I haven't had time to do that yet. As always, patches are welcome. btw.. did you have (extra) time to revise the rlm_radutmp patch that I sent No, not yet. Alan DeKok. - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Thread issue - Possible fix
Miquel van Smoorenburg [EMAIL PROTECTED] wrote: Well, you should be able to link against thread-safe variants of those functions, right? I mean, strtok() is obviously not thread safe, but localtime() should be. Nope. Log into a *BSD box. The thread-safe variant of localtime() is localtime_r(). Localtime returns a pointer to static data, which can be over-written in subsequent calls. I've *never* understood why people designed functions which returned pointers to static data. They don't even scale on *one* process, much less multi-threaded. e.g. Try to keep two different pointers to data returned by two subsequent calls to localtime(), for two different dates. You can't. You get the same pointer both times. So the second call over-writes the data from the first call. So the only sane way to use that function even in a process with one thread of execution, is to copy the results to another buffer. But if that's true, why the heck didn't localtime just take a pointer to the place to store the returned data? But I digress... It's probably just a matter of linking against the right library. libc_r instead of libc, most likely. libc_r usually has the foo_r() functions. libc() doesn't. libc_r usually *also* has the unsafe foo() functions, so people who want to shoot themselves in the foot can do so. I think Linux turns on thread-safety if you define -DREENTRANT and link with -pthreads. Other systems might need other magic incantations. Like FreeBSD, and probably Solaris. glibc does *insane* things to be thread-safe while returning pointers to static data. That's why Linux doesn't have the foo_r() functions. It's also why glibc is so much larger and more complicated than the *BSD C libraries. They're trying to be smart cute, instead of writing simple, clean, code. Don't just substitute all library functions for their _r equivalents, just find out how to link against the thread-safe libc variant. The server has been linking that way for a while, and it isn't good enough. People other than the glibc hackers run away screaming when asked to make thread-safe versions of localtime(), etc. The simplest thing to do is to *always* use the localtime_r() functions, and to have a trivial wrapper around localtime() using 'memcpy' to get the same effect. I've made that change in the server for ctime_r() and for localtime_r(). If it *creates* problems, I'd love to know. IMHO, *all* of the functions returning pointers to static data should just go away. They're badly designed. Alan DeKok. - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Thread issue - Possible fix
I have been experiencing crashes as well under high accounting load (Solaris 7 , mysql accounting). It appears that the use of non thread-safe library functions are responsible for this. I have replaced all the localtime() and ctime() calls with their POSIX thread safe counterparts (localtime_r() and ctime_r()) and radiusd has been running perfectly for the last 2 hours under constant heavy load from radclient (1000+ req/sec) whereas before it would crash within a few minutes. There are a few other instances of non thread-safe library functions in the source (gmtime, rand and strtok), but they are used mainly at startup so they should not affect the stability, although to be correct, I think they should be replaced with the *_r versions as well. I don't know how applicable these changes are to other OS's (Linux), but perhaps someone running FR on linux could try this and see if it helps. Eddie - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Thread issue - Possible fix
can you send the diff of all files that you fix ? []s Marcelo Ferreira Canbras TVA Cabo Ltda Canbras Acesso - STA Phone: +5511-4993-8728 - Original Message - From: Eddie Stassen [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Friday, February 01, 2002 2:25 PM Subject: Re: Thread issue - Possible fix I have been experiencing crashes as well under high accounting load (Solaris 7 , mysql accounting). It appears that the use of non thread-safe library functions are responsible for this. I have replaced all the localtime() and ctime() calls with their POSIX thread safe counterparts (localtime_r() and ctime_r()) and radiusd has been running perfectly for the last 2 hours under constant heavy load from radclient (1000+ req/sec) whereas before it would crash within a few minutes. There are a few other instances of non thread-safe library functions in the source (gmtime, rand and strtok), but they are used mainly at startup so they should not affect the stability, although to be correct, I think they should be replaced with the *_r versions as well. I don't know how applicable these changes are to other OS's (Linux), but perhaps someone running FR on linux could try this and see if it helps. Eddie - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Thread issue - Possible fix
At 12:33 PM 02/02/01 -0200, you wrote: can you send the diff of all files that you fix ? []s I don't have the time right now to make up the diffs (its Friday afternoon and 30 deg C outside :)), but I changed all the localtime ctime references in the following files: src/main/xlat.c src/lib/valuepair.c src/main/log.c src/modules/rlm_detail/rlm_detail.c You have to change the 'struct tm *TM' declarations to 'struct tm TM' and the 'TM = localtime(time)' calls to 'localtime_r(time,TM)'. Similar for ctime (you may have to create some temp variables though). I am not too familiar with linux, but I suspect you may need some extra #defines to enable the POSIX functions. Regards, Eddie Marcelo Ferreira Canbras TVA Cabo Ltda Canbras Acesso - STA Phone: +5511-4993-8728 - Original Message - From: Eddie Stassen [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Friday, February 01, 2002 2:25 PM Subject: Re: Thread issue - Possible fix I have been experiencing crashes as well under high accounting load (Solaris 7 , mysql accounting). It appears that the use of non thread-safe library functions are responsible for this. I have replaced all the localtime() and ctime() calls with their POSIX thread safe counterparts (localtime_r() and ctime_r()) and radiusd has been running perfectly for the last 2 hours under constant heavy load from radclient (1000+ req/sec) whereas before it would crash within a few minutes. There are a few other instances of non thread-safe library functions in the source (gmtime, rand and strtok), but they are used mainly at startup so they should not affect the stability, although to be correct, I think they should be replaced with the *_r versions as well. I don't know how applicable these changes are to other OS's (Linux), but perhaps someone running FR on linux could try this and see if it helps. Eddie - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Thread issue - Possible fix
Eddie Stassen [EMAIL PROTECTED] wrote: It appears that the use of non thread-safe library functions are responsible for this. Uh, yeah. Why didn't I think of that. Sorry... I don't know how applicable these changes are to other OS's (Linux), but perhaps someone running FR on linux could try this and see if it helps. The changes MUSt be made for safety. I'll take a look at doing some of the work over the next few days. Alan DeKok. - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Thread issue - Possible fix
On Fri, Feb 01, 2002 at 10:52:49AM -0500, [EMAIL PROTECTED] wrote: Eddie Stassen [EMAIL PROTECTED] wrote: It appears that the use of non thread-safe library functions are responsible for this. [...] The changes MUSt be made for safety. I'll take a look at doing some of the work over the next few days. yay! Maybe this will solve the solaris proxy problems that seem to still be present? Sounds like it's time for 0.5. :-) /fc - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Thread issue - Possible fix
In article [EMAIL PROTECTED], [EMAIL PROTECTED] wrote: Eddie Stassen [EMAIL PROTECTED] wrote: It appears that the use of non thread-safe library functions are responsible for this. Uh, yeah. Why didn't I think of that. Sorry... Well, you should be able to link against thread-safe variants of those functions, right? I mean, strtok() is obviously not thread safe, but localtime() should be. It's probably just a matter of linking against the right library. libc_r instead of libc, most likely. I think Linux turns on thread-safety if you define -DREENTRANT and link with -pthreads. Other systems might need other magic incantations. Like FreeBSD, and probably Solaris. The changes MUSt be made for safety. I'll take a look at doing some of the work over the next few days. Don't just substitute all library functions for their _r equivalents, just find out how to link against the thread-safe libc variant. Mike. - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html