On Sunday 06 December 2009 20:42:19 Jesper Krogh wrote:
> Kern Sibbald wrote:
> >> cwd is: /
> >> $ cd /mnt/backup
> >> cwd is: /mnt/backup/
> >> $ mark cache
> >> 2,872,501 files marked.
> >> $ done
> >> Bootstrap records written to /var/lib/bacula/bacula-dir.restore.1.bsr
> >
> > At that point, as far as I know, there is no more significant work for
> > the Director to do.  It just passes off the bootstrap file, which has
> > already been written, and then lets the FD and SD do their thing.
>
> Sorry for not being precise enough in the first round. It is:
> "after I type done" but "before the next line is written" there is 2.5
> hours. So I guess it is building the bootstrap file?

Yes, depending on how you have Bacula configured and how many files there are, 
building a bootstrap file can take a lot of time.  I imagine that most of the 
time is spent in the catalog, but I have never actually measured it.

The main factors are:
1. How many total files there are.
2. How many JobIds are involved.
3. The number of JobMedia records, which depends on the number of JobIds and 
the value set for Maximum File Size in the Storage daemon.
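
To get a feel for the third factor, here is a rough back-of-the-envelope sketch.
It assumes (my simplification, not something measured) that a job produces about
one JobMedia record per Maximum File Size worth of data it writes; interleaved
jobs and volume changes will add more. The function name estimate_jobmedia is
made up for illustration and is not part of Bacula.

```shell
#!/bin/sh
# Rough estimate only: assumes about one JobMedia record per
# "Maximum File Size" chunk of data written by a job.
# estimate_jobmedia is a hypothetical helper, not a Bacula command.
estimate_jobmedia() {
    job_bytes="$1"       # total bytes written by the job
    max_file_size="$2"   # Maximum File Size from the SD config, in bytes
    # Integer ceiling division: a partial chunk still needs a record.
    echo $(( (job_bytes + max_file_size - 1) / max_file_size ))
}

# Example: a 500 GB job with Maximum File Size set to 1 GB.
estimate_jobmedia 500000000000 1000000000
```

Multiply that by the number of JobIds in the restore to see how quickly the
record count can grow when Maximum File Size is small.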

>
> >> Just after done, the system waited for around 2.5 hours before getting
> >> onto the actual restore. Seen from the system side it was pure cpu-load,
> >> having one thread sitting at 100% CPU and absolutely no database-activity
> >> and a decent (not growing) memory usage (~512MB).
> >>
> >> Most of the time it actually never got to done but somehow the thread
> >> taking care of the job just got killed (a watchdog timeout perhaps?)
> >
> > Unless you have set some maximum runtime, the thread should not be
> > killed.
>
> Strange, I've seen this repeatedly, but it is not severe enough to get
> the director to terminate, just the thread.. nothing else.

Well, it could either be that your DB engine is very busy or that there are 
really huge numbers of files and/or JobMedia records.

>
> >> I'm still on Bacula 2.4, so just let me know if this has been looked
> >> into in 3.0.
> >
> > I recommend that you duplicate the problem then trap the Director with
> > the debugger and find out what it is doing (i.e. where it is spending its
> > time). This sounds odd, though it is possible I am overlooking something.
>
> Is there a short guide on how to do this somewhere?

The Kaboom chapter of the manual tells you how to run the Director under the 
debugger.  You can also attach to the Director while it is running, using:

  cd <bacula-binary-directory>
  gdb bacula-dir <pid-of-director>
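
If you would rather not sit in an interactive session, a minimal sketch along
these lines should dump a backtrace of every thread in one shot, so you can see
where the busy thread is spending its time. It relies on gdb's standard -batch,
-ex and -p options; the helper name dump_director_stacks and the DRY_RUN switch
are my own inventions for illustration, and I have not run this against a
Director.

```shell
#!/bin/sh
# Sketch: print backtraces for all threads of a running process.
# dump_director_stacks and DRY_RUN are hypothetical, not part of Bacula.
dump_director_stacks() {
    pid="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show the command that would be run.
        echo "gdb -batch -ex 'thread apply all bt' -p $pid"
    else
        # Attach, print every thread's stack, then detach and exit.
        gdb -batch -ex "thread apply all bt" -p "$pid"
    fi
}
```

Call it with the Director's pid, for example
dump_director_stacks <pid-of-director>, a few times a minute or so apart. If
the same function keeps showing up at the top of the 100% CPU thread, that is
the place to look. You normally need to be root or the user the Director runs
as for the attach to work.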

Kern



_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel