On Wednesday 30 August 2006 00:59, Bill Moran wrote:
> In response to "Dan Langille" <[EMAIL PROTECTED]>:
> 
> > On 29 Aug 2006 at 18:20, Bill Moran wrote:
> > 
> > > In response to Kern Sibbald <[EMAIL PROTECTED]>:
> > > 
> > > > On Tuesday 29 August 2006 21:13, Bill Moran wrote:
> > > > > 
> > > > > I'm reposting in case my first post was missed.
> > > > 
> > > > Since we haven't seen this behavior on any other systems, at the
> > > > moment, I consider it most likely a FreeBSD version 6.0 Operating
> > > > System pthreads bug or some unknown pthreads incompatibility with
> > > > that version of the libraries on that OS.
> > > > 
> > > > In the traceback, the only thing I see is that a thread is waiting
> > > > on a pthread_cond_wait() call and it should not be.  That indicates
> > > > to me that the broadcast (or probably a pthread_cond_signal()) was
> > > > lost.
> > > 
> > > Thanks, Kern.  I'll follow up with the FreeBSD hackers and see if I can
> > > track it down.
> > 
> > Bill: Is FreeBSD 6.1-STABLE an option?
> 
> Only with good reason.  This is the primary production backup system for
> the office.  It's already got 300G worth of file volumes, and it backs
> up over 100G to tape on the weekends, and ~500M each time it does
> incrementals.
> 
> Missing a night's worth of backups won't end the world, but corrupting
> a filesystem, or anything like that would be very bad :(
> 
> My plan at this point:
> 1) Upgrading to RELENG_6_1 tonight
> 2) See if the problem repeats this Saturday
> 3) If it does, grab another backtrace and put it out on freebsd-hackers@
>    to see if anyone has any suggestions.

When you do that, you might mention that Bacula has occasionally had
problems with recursively locking pthread mutexes, which crashes on FreeBSD
but not on Linux.  To the best of my knowledge all of those are now fixed,
and that is not the problem in any case.  The other point is that the
rwlock() routines have been running for about 5 years now and have *never*
had any errors or failures on *any* system.  They implement recursive (for
the same process) mutexes.  It is a rwlock() that is blocked.  This
*should* be impossible unless some other thread holds a rwlock() -- in
which case that thread should be somewhere in the database code, and I saw
no such thing, so I assume (for the moment) that the blocked rwlock() did
not correctly get the wakeup call from the thread that generated it (i.e.
it was lost by the lib or OS).
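
For anyone following along, here is a minimal sketch of what "recursively
locking a mutex" means in POSIX terms.  It is not Bacula code; the helper
name relock_result() is made up for illustration.  Relocking a default
mutex from the same thread is undefined, which is presumably why the
outcome differs between platforms, so the sketch only uses the two mutex
types whose relock behavior POSIX does define: an error-checking mutex
reports the relock as EDEADLK, and a recursive mutex allows it.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose the pthread_mutexattr_settype() types */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper: create a mutex of the given type, lock it twice
 * from the calling thread, and return the result of the second lock.
 */
int relock_result(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, type);
    assert(rc == 0);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&m);        /* first lock: always succeeds */
    assert(rc == 0);

    rc = pthread_mutex_lock(&m);        /* second lock by the same thread */
    if (rc == 0)
        pthread_mutex_unlock(&m);       /* undo the recursive acquisition */

    pthread_mutex_unlock(&m);           /* undo the first acquisition */
    pthread_mutex_destroy(&m);
    return rc;
}
```

Calling relock_result(PTHREAD_MUTEX_ERRORCHECK) should return EDEADLK and
relock_result(PTHREAD_MUTEX_RECURSIVE) should return 0.  Building the
daemon's mutexes as error-checking ones in a debug build is one way to
turn a silent recursive lock into a visible error code, assuming the
locking code checks return values.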

> 
> ... although I'm open to suggestions.
> 
> -- 
> Bill Moran
> Collaborative Fusion Inc.
> 
> _______________________________________________
> Bacula-devel mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/bacula-devel
> 

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
