Hello,

Well, after a zillion hours of non-trivial debugging, I finally figured out 
why we were getting comm line connection problems on the FreeBSD machine 
during the 2drive-incremental-2disk test.  It turns out that on your system, 
pthread_cond_timedwait() gets spurious returns (i.e. with a 0 status), which 
essentially simulated a connection being made but no authorization, so the 
job was cancelled.

It is interesting because I have never seen this problem on any other system, 
though I seem to recall some such reports.  Anyway, the pthread_cond_timedwait 
documentation permits spurious returns, so I've modified the code to 
specifically test the conditions on which it is waiting.

Please note, the FreeBSD man page for pthread_cond_timedwait is just plain 
wrong as it fails to indicate that spurious normal returns can occur -- in 
fact, it implies just the opposite.

I'll leave the tests running for a good while, since the only proof I can 
have that it is fixed is to run the tests a *very* long time without a 
failure.  With the bug, a test fails about once every 20 minutes (hence a lot 
of debugging time).

After testing it here (about an hour), I'll commit the fix to SVN.

Regards,

Kern

_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
