>>>>> On Mon, 07 Dec 2009 14:30:41 +0800, Jim Barber said:
>
> Hi all.
>
> I have a problem where every weekend (or more frequently) my storage daemon
> crashes.
> The crash is random, but is happening either while running VirtualFull jobs
> or Copy jobs.
> So far it hasn't crashed during regular incremental backups.
>
> I am running version 3.0.3 of the Bacula software.
>
> First of all I tried adding a '-d 200' to the arguments that start bacula-sd.
> This produced a lot of messages, but nothing unusual that I can see prior to
> the crash.
> The last few lines in this log look like so:
>
> vc-sd: mac.c:241-468 before write JobId=468 FI=363302 SessId=1 Strm=MD5 len=16
> vc-sd: mac.c:241-468 before write JobId=468 FI=363303 SessId=1 Strm=UATTR len=104
> vc-sd: mac.c:241-468 before write JobId=468 FI=363304 SessId=1 Strm=UATTR len=122
> vc-sd: mac.c:241-468 before write JobId=468 FI=363305 SessId=1 Strm=UATTR len=77
> vc-sd: mac.c:241-468 before write JobId=468 FI=363305 SessId=1 Strm=DATA len=4496
> vc-sd: mac.c:241-468 before write JobId=468 FI=363305 SessId=1 Strm=MD5 len=16
>
> So next I have been trying to get the btraceback program running.
>
> I am using Debian packages (self built based on the 3.0.2 Debian sources).
> These run the storage daemon under the bacula:tape user:group.
> So I modified the btraceback program to use sudo to run gdb.
> I also configured sudo to allow the bacula user to do so without being
> prompted for a password.
> I then modified the Debian sources so that packages with debugging symbols
> are produced.
>
> If I become the bacula user and run a test like so:
>
>   /usr/sbin/btraceback /usr/sbin/bacula-sd $PID
>
> Where: $PID = the process ID of the bacula-sd process,
> then I get an email showing debugging information.
> So as far as I can tell the btraceback program should be working.
>
> I had another crash of the storage daemon after making the changes and no
> email was sent.
> Nor was a bacula-sd.9103.traceback file produced.
> So I can't send any useful information to try and track down why the storage
> daemon is so unstable.
>
> It was also unstable when using the 3.0.2 Debian package, so I don't
> think it is my rebuild that is causing the issue.
> Although I feel 3.0.3 is more stable than 3.0.2 was, I still can't get a
> complete week's cycle working without a crash.
>
> The /etc/init.d/bacula-sd script defines the PATH to be,
>   PATH=/sbin:/bin:/usr/sbin:/usr/bin
> So /usr/sbin is in the PATH and so I'd imagine the program should be able to
> find the traceback program.
>
> Any ideas how I can get some useful information from the crash?

Try doing it interactively by attaching gdb to the bacula-sd process before it
crashes (run gdb /path/to/bacula-sd and then use gdb's attach command). Then
use the commands in btraceback.gdb when it crashes.

__Martin

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
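
For anyone following the suggestion above, here is a rough sketch of what the interactive session might look like. It is an illustration, not a tested Bacula procedure. The helper name attach_cmd is made up for this example, the location of btraceback.gdb is left as a placeholder because it depends on how the package was built, and the use of sudo is an assumption carried over from the original post, where the daemon runs as the bacula user. The helper only prints the gdb command line and does not start gdb itself.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the gdb command line that attaches
# to an already running daemon. "gdb EXE PID" is shorthand for starting
# "gdb EXE" and then typing "attach PID" at the (gdb) prompt.
# usage: attach_cmd /usr/sbin/bacula-sd 1234
attach_cmd() {
    printf 'gdb %s %s\n' "$1" "$2"
}

# Example (assumes pidof is available and that sudo may run gdb):
#   sudo $(attach_cmd /usr/sbin/bacula-sd "$(pidof bacula-sd)")
#
# Once attached, the daemon is paused. At the (gdb) prompt:
#   (gdb) continue                  # let bacula-sd keep running
#   ... wait for the crash; gdb stops and reports the signal ...
#   (gdb) bt                        # backtrace of the thread that crashed
#   (gdb) thread apply all bt       # backtraces of every thread
#   (gdb) source /path/to/btraceback.gdb   # placeholder path, or type its commands by hand
```

If gdb stops on a signal that is not the crash itself, typing continue again lets the daemon carry on. Leaving the session inside screen or a similar tool may help, since the crash can take days to show up.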