Re: MUPDATE database problems -- the importance of thread safety

Wesley Craig Wed, 17 Jun 2009 11:17:48 -0700

Please open a report in bugzilla and mark it as a "blocker". Thanks for finding the issue.


:wes


On 17 Jun 2009, at 09:44, Michael Bacon wrote:

It turns out that this was an issue with mupdate being a multi-threaded daemon, and in a critical place in the non-blocking prot code (in prot_flush_internal()), the behavior relies on the value of errno. If it's EAGAIN, the write will try again, otherwise it sets s->error and quits. Naturally, being a global variable normally, errno doesn't work terribly well in multi-threaded code unless the necessary thread safety switch is passed to the compiler. Hence, when thread #5 was getting a -1 from the write(2) system call, it was reading errno as 0, rather than EAGAIN as it should have been.
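The retry-or-fail decision described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the actual prot_flush_internal() code: the function name flush_once, the enum, and the log message are hypothetical. It copies errno immediately after write(2) fails and logs the value before giving up, so a bogus "errno 0" would at least show up on the server.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical outcome codes; these names are not taken from Cyrus. */
enum flush_result { FLUSH_DONE, FLUSH_RETRY, FLUSH_ERROR };

/*
 * Attempt a single write on a (possibly non-blocking) descriptor.
 * On failure, errno is copied into *saved_errno right away, before any
 * other library call has a chance to overwrite it.
 */
static enum flush_result flush_once(int fd, const char *buf, size_t len,
                                    size_t *written, int *saved_errno)
{
    assert(buf != NULL && written != NULL && saved_errno != NULL);

    ssize_t n = write(fd, buf, len);
    if (n >= 0) {
        *written = (size_t)n;
        *saved_errno = 0;
        /* A short write means the caller should come back for the rest. */
        return (size_t)n == len ? FLUSH_DONE : FLUSH_RETRY;
    }

    *saved_errno = errno;   /* capture immediately after the failed call */
    *written = 0;

    /* Transient conditions: nothing was written, try again later. */
    if (*saved_errno == EAGAIN || *saved_errno == EWOULDBLOCK ||
        *saved_errno == EINTR)
        return FLUSH_RETRY;

    /* Anything else is treated as fatal, but is reported first. */
    fprintf(stderr, "write(%d) failed: %s (errno %d)\n",
            fd, strerror(*saved_errno), *saved_errno);
    return FLUSH_ERROR;
}
```

A caller would loop while the result is FLUSH_RETRY, advancing the buffer by *written each time. If errno is not per-thread, the fatal branch is where a thread would end up with a saved value of 0 instead of EAGAIN, and the log line would show it.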
The solution, should anyone else run into this, is as simple as recompiling with the thread safety switch. (In the case of Sun's SPro, it's -mt. I think it's -mthread for gcc, but I'm not sure.) Maddening that the fix was that simple, as I spent two solid weeks hunting for the dratted bug.
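For anyone checking their own build: gcc's usual switch for this is -pthread. One rough way to see whether a thread safety switch reached the compiler is to test for a macro such switches commonly define. The sketch below assumes the switch defines _REENTRANT, which Sun's documentation describes for -mt and which gcc's -pthread does on many platforms. The function names are hypothetical. On systems whose C library always provides a per-thread errno, the macro may be absent without any harm, so treat the result as a hint rather than proof.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if this translation unit was compiled
 * with _REENTRANT defined (assumed here to be a side effect of the
 * compiler's thread safety switch), 0 otherwise.
 */
static int built_with_reentrant(void)
{
#ifdef _REENTRANT
    return 1;
#else
    return 0;
#endif
}

/* Print a one-line report, e.g. at daemon startup. */
static void report_thread_flags(FILE *out)
{
    fprintf(out, "_REENTRANT %s\n",
            built_with_reentrant() ? "defined" : "not defined");
}
```

Calling report_thread_flags(stderr) early in a multi-threaded daemon would give a quick indication of how it was compiled.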
I have two requests to the CVS maintainers out there. First, the below patch to current CVS isn't terribly comprehensive, and doesn't narrow it down from about a dozen places s->error could be set, but at least would have given SOME kind of indication on the server that something had gone wrong, and might have saved me about a week of hunting.
Secondly, I am very weak in the ways of autoconf, but it strikes me that since Cyrus now builds mupdate as multithreaded by default (good decision, IMO), autoconf should make some attempt to figure out what thread safety switch is appropriate and add it to CFLAGS.
