On 11 Jul 2007 at 11:15, Chris Morris wrote:

> Since I've introduced FreeBSD snapshots into my Bacula plan, I've 
> started getting random backup failures.  A server will fail one day, and 
> back up just fine the next.  A server will back up fine one day and fail 
> the next.  ...all with no changes to the Bacula configuration.
> 
> Below, I've posted pertinent portions of configuration files and message 
> logs.  Please let me know if I need to supply any further information to 
> help troubleshoot this down.

You say the jobs fail.  What exactly is the failure?  What error 
message does the Bacula job report?

> 
> bacula-dir.conf pertinent portions only...sensitive information removed 
> with:  <*REMOVED*>
> 
>     JobDefs {
>       Name = "BSD"
>       Type = Backup
>       FileSet = "defaultBSD"
>       Storage = storage01
>       Messages = Standard
>       Pool = Default
>       ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup"
>       ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup"
>       ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var"
>       ClientRunBeforeJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot mount /usr:autogen_bkup /mnt/usr"
>       ClientRunAfterJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot umount /mnt/var"
>       ClientRunAfterJob = "/usr/local/bin/sudo /usr/local/sbin/snapshot umount /mnt/usr"

I suggest creating scripts on the client, and moving these commands 
into those scripts.  It makes the JobDefs easier to read.  Sure, you 
have to copy stuff to the client, but I think that's cleaner.

YMMV.
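
As a rough sketch of what I mean (untested against your setup; the 
script itself, its name, and the SNAPSHOT override variable are my own 
invention, only the snapshot commands are taken from your JobDefs):

```shell
#!/bin/sh
# Hypothetical client-side wrapper for the snapshot commands in the JobDefs.
# Call with "before" from ClientRunBeforeJob and "after" from ClientRunAfterJob.

# Command prefix copied from the original JobDefs. Override it for a dry run,
# e.g. SNAPSHOT=echo. It is expanded unquoted on purpose so that
# "sudo /path/to/snapshot" splits into separate words.
: "${SNAPSHOT:=/usr/local/bin/sudo /usr/local/sbin/snapshot}"

# Create both snapshots, then mount them, in the same order as the JobDefs.
# Stop at the first failure so the caller sees a non-zero exit status.
snap_before() {
    for fs in var usr; do
        $SNAPSHOT make -g1 "/$fs:autogen_bkup" || return 1
    done
    for fs in var usr; do
        $SNAPSHOT mount "/$fs:autogen_bkup" "/mnt/$fs" || return 1
    done
}

# Unmount both snapshots. Keep going if one fails, but report the failure.
snap_after() {
    rc=0
    for fs in var usr; do
        $SNAPSHOT umount "/mnt/$fs" || rc=1
    done
    return $rc
}

case "${1:-}" in
    before) snap_before ;;
    after)  snap_after ;;
esac
```

If that were installed as, say, /usr/local/sbin/bacula-snapshot.sh (a 
made-up path), the JobDefs would shrink to one ClientRunBeforeJob line 
calling it with "before" and one ClientRunAfterJob line calling it with 
"after".  A side benefit is that a failed snapshot step gives a 
non-zero exit status, which should show up in the job report rather 
than only in /var/log/messages.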

>       Priority = 10
>     }
> 
>     Typical job, as they are all nearly identical:
> 
>     Job {
>       Name = "app06_BSD"
>       Client = "app06-fd"
>       Schedule = "MonCycle"
>       JobDefs = "BSD"
>       Write Bootstrap = "/var/db/bacula/app06.bsr"
>     }
> 
>     Typical client, as they are all nearly identical:
> 
>     Client {
>       Name = app06-fd
>       Address = app06
>       FDPort = 9102
>       Catalog = MyCatalog
>       Password = "<*REMOVED*>"      # password for FileDaemon
>       File Retention = 30 days                   # 30 days
>       Job Retention = 6 months                 # six months
>       AutoPrune = yes                               # Prune expired Jobs/Files
>     }
> 
>     My primary FileSet resource:
> 
>     FileSet {
>       Name = "defaultBSD"
>       Include {
>         Options {
>           signature = MD5
>           compression = GZIP
>         }
>         File = /
>         File = /mnt/usr
>         File = /mnt/var
>       }
>       Exclude {
>         File = /proc
>         File = /tmp
>         File = /.journal
>         File = /.fsck
>       }
>     }
> 
> Finally, I get the same message from my /var/log/messages file at every 
> failure.  The lines before and after this have nothing to do with the 
> backup.
> 
>     Jul 11 08:19:07 app11 sudo:   <*REMOVED*> : TTY=unknown ; PWD=/usr/local/etc/rc.d ; USER=root ; COMMAND=/usr/local/sbin/snapshot make -g1 /var:autogen_bkup
>     Jul 11 08:21:34 app11 kernel: fsync: giving up on dirty
>     Jul 11 08:21:34 app11 kernel: 0xffffff005b07cd90: tag devfs, type VCHR
>     Jul 11 08:21:34 app11 kernel: usecount 1, writecount 0, refcount 604 mountedhere 0xffffff011f1ba200
>     Jul 11 08:21:34 app11 kernel: flags ()
>     Jul 11 08:21:34 app11 kernel: v_object 0xffffff005d3a0000 ref 0 pages 8572
>     Jul 11 08:21:34 app11 kernel: lock type devfs: EXCL (count 1) by thread 0xffffff00abc57980 (pid 46772)
>     Jul 11 08:21:34 app11 kernel: dev da0s1d
>     Jul 11 08:22:03 app11 sudo:   <*REMOVED*> : TTY=unknown ; PWD=/usr/local/etc/rc.d ; USER=root ; COMMAND=/usr/local/sbin/snapshot make -g1 /usr:autogen_bkup
>     Jul 11 08:24:13 app11 sudo:   <*REMOVED*> : TTY=unknown ; PWD=/usr/local/etc/rc.d ; USER=root ; COMMAND=/usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var
>     Jul 11 08:24:13 app11 kernel: g_vfs_done():md0[READ(offset=65536, length=8192)]error = 5

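This looks like an OS issue, not a Bacula issue.  The kernel messages 
("fsync: giving up on dirty" and the g_vfs_done read error, where 
error = 5 is EIO) are logged while the snapshot is being made and 
mounted, before Bacula has read any data.  I suggest following up on 
the FreeBSD mailing lists.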

-- 
Dan Langille - http://www.langille.org/



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
