Since we still have an open bug, please add this to the bug report.

Kern

On Saturday 20 March 2010 00:06:01 Hugh Brown wrote:
> Kern Sibbald wrote:
> > OK, I think the solution is for Hugh to:
> > 1. Figure out why his alert command is broken
> > 2. Create a script with a timer
> > 3. Disable the alert
>
> Here's what I've done:
>
> -- Ran backups, no change; got a hang.  Restarted sd and director.
>
> -- Commented out the "Alert" sections in bacula-sd for the two tape
> drives.  Ran backups, no hang.
>
> -- Changed the Alert section to:
>
>       Alert Command = "sh -c '/etc/bacula/alert_debugging.pl %c'"
>
> which was a very simple perl script (attached).  I reran backups and
> got a hang.  Here's what was logged:
>
>       Mar 19 15:30:24 agnatha hugh[29410]: Parent here...waiting patiently
>       Mar 19 15:30:24 agnatha hugh[29409]: About to run /usr/sbin/smartctl -H -l error -q errorsonly -d scsi /dev/changer
>       Mar 19 15:30:24 agnatha hugh[29414]: Done: exit status 0
>
> These were the same entries as seen before, since a bunch of jobs ran
> and finished before the hang, but now I've got the two processes
> again:
>
>       bacula   28827  1.1  0.0 248588  6220 ?        Ssl  15:24   0:18 /usr/sbin/bacula-sd -u bacula -g disk -c /etc/bacula/bacula-sd.conf
>       bacula   29422  0.0  0.0 258816  4492 ?        S    15:30   0:00 /usr/sbin/bacula-sd -u bacula -g disk -c /etc/bacula/bacula-sd.conf
>
> I ran btraceback on both (attached).  I can't find any mention of
> PID 29422 (the child) in the traceback for 28827 (the parent).  The
> child appears to be hung at closelog(), which matches what I had in
> the original traceback I sent with my first message to the list.  I
> ran kill -6 on both, but only the parent produced a lock dump
> (attached).  If I should add these files to the bug, let me know.
>
> The bacula I'm running now is compiled from the sourceforge SRPM, but
> with the changes detailed in bug #1527:
>
> -- the patch removing the "debug_list_volumes" line
> -- enabling the lock manager in the args to configure
> -- and enabling developer mode in version.h
>
> I thought maybe that last one was causing problems, since it results
> in stdout not being closed, but I was having this problem with the
> stock (though still locally-compiled) SRPM.
>
> I'm thoroughly confused.  For the weekend I'll just be removing the
> alert section and letting things run.
>
> --
> Hugh Brown, Systems Manager
> The Centre for High-Throughput Biology
> [email protected]
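
For step 2 of the list above ("Create a script with a timer"), one possible shape is a small sh wrapper that runs the same smartctl command seen in the log under a time limit. This is only a sketch under assumptions: the function name run_with_timer and the variable ALERT_TIMEOUT are made up for illustration and are not Bacula settings, and it relies on timeout(1) from GNU coreutils being installed. It has not been tested against the hang described here.

```shell
#!/bin/sh
# Hypothetical alert wrapper: run a command, but give up after a time limit.
# ALERT_TIMEOUT and run_with_timer are illustrative names, not Bacula settings.

ALERT_TIMEOUT="${ALERT_TIMEOUT:-30}"   # seconds; arbitrary default

run_with_timer() {
    # timeout(1) sends TERM to the command once the limit passes and
    # then exits with status 124; otherwise it returns the command's status.
    timeout "$ALERT_TIMEOUT" "$@"
}

# Only probe a device when one was passed as the first argument
# (the Alert Command quoted above passes %c to its script).
if [ "$#" -ge 1 ]; then
    rc=0
    run_with_timer /usr/sbin/smartctl -H -l error -q errorsonly -d scsi "$1" || rc=$?
    echo "smartctl exit status: $rc" >&2
fi
```

If this were saved under a hypothetical path such as /etc/bacula/alert_with_timer.sh, it could be substituted for alert_debugging.pl in the Alert Command line quoted above. An exit status of 124 would then indicate that the timer fired rather than that smartctl returned on its own.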



_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel