Re: [Bacula-users] Random Backup Failures

2007-07-11 Thread Dan Langille
On 11 Jul 2007 at 11:15, Chris Morris wrote:

 Since I've introduced FreeBSD snapshots into my Bacula plan, I've 
 started getting random backup failures.  A server will fail one day, and 
 back up just fine the next.  A server will back up fine one day and fail 
 the next.  ...all with no changes to the Bacula configuration.
 
 Below, I've posted pertinent portions of configuration files and message 
 logs.  Please let me know if I need to supply any further information to 
 help troubleshoot this.

You say the jobs fail.  What is the failure?  Error message?

 
 bacula-dir.conf pertinent portions only...sensitive information removed 
 with:  *REMOVED*
 
 JobDefs {
   Name = BSD
   Type = Backup
   FileSet = defaultBSD
   Storage = storage01
   Messages = Standard
   Pool = Default
   ClientRunBeforeJob = /usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup
   ClientRunBeforeJob = /usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup
   ClientRunBeforeJob = /usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var
   ClientRunBeforeJob = /usr/local/bin/sudo /usr/local/sbin/snapshot mount /usr:autogen_bkup /mnt/usr
   ClientRunAfterJob = /usr/local/bin/sudo /usr/local/sbin/snapshot umount /mnt/var
   ClientRunAfterJob = /usr/local/bin/sudo /usr/local/sbin/snapshot umount /mnt/usr

I suggest creating scripts on the client, and moving these commands 
into those scripts.  It makes the JobDefs easier to read.  Sure, you 
have to copy stuff to the client, but I think that's cleaner.

YMMV.
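
As a rough sketch of that suggestion, the six directives could move into
one client-side script along these lines.  The script name, its install
path, and the before/after arguments are assumptions for illustration;
only the snapshot commands come from the JobDefs quoted above.

```shell
#!/bin/sh
# Hypothetical client-side wrapper, e.g. /usr/local/sbin/bacula-snap
# (name and path are assumptions, not part of Bacula or the snapshot tool).

SUDO="${SUDO:-/usr/local/bin/sudo}"
SNAPSHOT="${SNAPSHOT:-/usr/local/sbin/snapshot}"

# Run the snapshot tool through sudo, as the original directives do.
snap() {
    "$SUDO" "$SNAPSHOT" "$@"
}

# Create both snapshots, then mount them.  The && chain stops at the
# first step that returns a non-zero status, so a failed step is never
# followed by the next one.
before_job() {
    snap make -g1 /var:autogen_bkup &&
    snap make -g1 /usr:autogen_bkup &&
    snap mount /var:autogen_bkup /mnt/var &&
    snap mount /usr:autogen_bkup /mnt/usr
}

# Unmount both snapshots.  The second umount is attempted even if the
# first fails; the function reports failure if either one failed.
after_job() {
    rc=0
    snap umount /mnt/var || rc=1
    snap umount /mnt/usr || rc=1
    return "$rc"
}

case "${1:-}" in
    before) before_job ;;
    after)  after_job ;;
    "")     : ;;   # no argument: only define the functions
    *)      echo "usage: $0 before|after" >&2; exit 64 ;;
esac
```

With something like this in place, the JobDefs would need only one
ClientRunBeforeJob line calling the script with "before" and one
ClientRunAfterJob line calling it with "after".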

   Priority = 10
 }
 
 Typical job, as they are all nearly identical:
 
 Job {
   Name = app06_BSD
   Client = app06-fd
   Schedule = MonCycle
   JobDefs = BSD
   Write Bootstrap = /var/db/bacula/app06.bsr
 }
 
 Typical client, as they are all nearly identical:
 
 Client {
   Name = app06-fd
   Address = app06
   FDPort = 9102
   Catalog = MyCatalog
   Password = *REMOVED*  # password for FileDaemon
   File Retention = 30 days   # 30 days
   Job Retention = 6 months # six months
   AutoPrune = yes   # Prune expired Jobs/Files
 }
 
 My primary FileSet resource:
 
 FileSet {
   Name = defaultBSD
   Include {
     Options {
       signature = MD5
       compression = GZIP
     }
     File = /
     File = /mnt/usr
     File = /mnt/var
   }
   Exclude {
     File = /proc
     File = /tmp
     File = /.journal
     File = /.fsck
   }
 }
 
 Finally, I get the same message from my /var/log/messages file at every 
 failure.  The lines before and after this have nothing to do with the 
 backup.
 
 Jul 11 08:19:07 app11 sudo:   *REMOVED* : TTY=unknown ; PWD=/usr/local/etc/rc.d ; USER=root ; COMMAND=/usr/local/sbin/snapshot make -g1 /var:autogen_bkup
 Jul 11 08:21:34 app11 kernel: fsync: giving up on dirty
 Jul 11 08:21:34 app11 kernel: 0xff005b07cd90: tag devfs, type VCHR
 Jul 11 08:21:34 app11 kernel: usecount 1, writecount 0, refcount 604 mountedhere 0xff011f1ba200
 Jul 11 08:21:34 app11 kernel: flags ()
 Jul 11 08:21:34 app11 kernel: v_object 0xff005d3a ref 0 pages 8572
 Jul 11 08:21:34 app11 kernel: lock type devfs: EXCL (count 1) by thread 0xff00abc57980 (pid 46772)
 Jul 11 08:21:34 app11 kernel: dev da0s1d
 Jul 11 08:22:03 app11 sudo:   *REMOVED* : TTY=unknown ; PWD=/usr/local/etc/rc.d ; USER=root ; COMMAND=/usr/local/sbin/snapshot make -g1 /usr:autogen_bkup
 Jul 11 08:24:13 app11 sudo:   *REMOVED* : TTY=unknown ; PWD=/usr/local/etc/rc.d ; USER=root ; COMMAND=/usr/local/sbin/snapshot mount /var:autogen_bkup /mnt/var
 Jul 11 08:24:13 app11 kernel: g_vfs_done():md0[READ(offset=65536, length=8192)]error = 5

This looks like an OS issue, not a Bacula issue.  I suggest following 
up on the FreeBSD mailing lists.

-- 
Dan Langille - http://www.langille.org/



-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Random Backup Failures

2007-07-11 Thread Chris Morris
Dan Langille wrote:

 You say the jobs fail.  What is the failure?  Error message?

   
Below, I've pasted in a failure notification email that Bacula 
automatically sends.

   11-Jul 07:18 admin01-dir: Start Backup JobId 255, 
Job=app11_BSD.2007-07-10_23.35.04
   11-Jul 01:19 app11-fd: DIR and FD clocks differ by -21547 seconds, FD 
automatically adjusting.
   11-Jul 01:19 app11-fd: ClientRunBeforeJob: run command 
/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup
   11-Jul 01:22 app11-fd: ClientRunBeforeJob: mount: 
/var/.snap/autogen_bkup.0: Resource temporarily unavailable
   11-Jul 01:22 app11-fd: ClientRunBeforeJob: run command 
/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: run command 
/usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup 
/mnt/var
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: mount: /dev/md0: 
Input/output error
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: snapshot:ERROR: unable to 
mount /dev/md0 under /mnt/var
   11-Jul 01:24 app11-fd: app11_BSD.2007-07-10_23.35.04 Error: 
Runscript: ClientRunBeforeJob returned non-zero status=1. ERR=Child 
exited with code 1
   11-Jul 07:23 admin01-dir: app11_BSD.2007-07-10_23.35.04 Fatal error: 
Bad response to ClientRunBeforeJob command: wanted 2000 OK RunBefore
   , got 2905 Bad RunBeforeJob command.

   11-Jul 07:23 admin01-dir: app11_BSD.2007-07-10_23.35.04 Error: Bacula 
2.0.3 (06Mar07): 11-Jul-2007 07:23:21
  JobId:                  255
  Job:                    app11_BSD.2007-07-10_23.35.04
  Backup Level:           Differential, since=2007-07-10 07:32:06
  Client:                 app11-fd 2.0.3 (06Mar07) amd64-portbld-freebsd6.2,freebsd,6.2-RC1
  FileSet:                defaultBSD 2007-07-09 23:05:00
  Pool:                   Default (From Job resource)
  Storage:                storage01 (From Job resource)
  Scheduled time:         10-Jul-2007 23:35:03
  Start time:             11-Jul-2007 07:18:09
  End time:               11-Jul-2007 07:23:21
  Elapsed time:           5 mins 12 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Volume name(s):
  Volume Session Id:      95
  Volume Session Time:    1183747854
  Last Volume Bytes:      403,356,186,192 (403.3 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:
  SD termination status:  OK
  Termination:            *** Backup Error ***

-- 
S i x  F e e t  U p  |  Nowhere to go but open source
Silicon Valley: +1 (650) 401-8579 x609
Midwest: +1 (317) 861-5948 x609
Toll-Free: 1-866-SIX-FEET
mailto:[EMAIL PROTECTED]
http://www.sixfeetup.com  |  Zope/Plone Custom Development







Re: [Bacula-users] Random Backup Failures

2007-07-11 Thread John Drescher

On 7/11/07, Chris Morris [EMAIL PROTECTED] wrote:


Dan Langille wrote:

 You say the jobs fail.  What is the failure?  Error message?


Below, I've pasted in a failure notification email that Bacula
automatically sends.

   11-Jul 07:18 admin01-dir: Start Backup JobId 255,
Job=app11_BSD.2007-07-10_23.35.04
   11-Jul 01:19 app11-fd: DIR and FD clocks differ by -21547 seconds, FD
automatically adjusting.
   11-Jul 01:19 app11-fd: ClientRunBeforeJob: run command
/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /var:autogen_bkup
   11-Jul 01:22 app11-fd: ClientRunBeforeJob: mount:
/var/.snap/autogen_bkup.0: Resource temporarily unavailable
   11-Jul 01:22 app11-fd: ClientRunBeforeJob: run command
/usr/local/bin/sudo /usr/local/sbin/snapshot make -g1 /usr:autogen_bkup
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: run command
/usr/local/bin/sudo /usr/local/sbin/snapshot mount /var:autogen_bkup
/mnt/var
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: mount: /dev/md0:
Input/output error
   11-Jul 01:24 app11-fd: ClientRunBeforeJob: snapshot:ERROR: unable to
mount /dev/md0 under /mnt/var
   11-Jul 01:24 app11-fd: app11_BSD.2007-07-10_23.35.04 Error:
Runscript: ClientRunBeforeJob returned non-zero status=1. ERR=Child
exited with code 1
   11-Jul 07:23 admin01-dir: app11_BSD.2007-07-10_23.35.04 Fatal error:
Bad response to ClientRunBeforeJob command: wanted 2000 OK RunBefore
   , got 2905 Bad RunBeforeJob command.




This says your ClientRunBeforeJob has failed because it could not perform the
mount of /dev/md0 on /mnt/var. Have you checked into that? Is bacula-fd
running as user bacula? Possibly this is a permissions issue.

John


Re: [Bacula-users] Random Backup Failures

2007-07-11 Thread Chris Morris
For the benefit of list readers and those who may still be trying to 
assist me with troubleshooting:

I have finally been able to duplicate the errors on my own terms.  I 
could never duplicate this before, because I would sit at a terminal 
window and make the snapshot, mount the snapshot, browse the snapshot, 
and umount the snapshot in that order.  Finally, I decided to try to 
*behave* more like I expect a script to work. 

I opened two terminal windows.  In one, I started the snapshot 
generation process.  In the other, I tried to mount the snapshot before 
it was finished.  It, of course, didn't work.  More importantly, 
however, I got the exact errors that I would randomly get in my 
automated overnight backups.

Now that I can reproduce the error, implementing a fix is 
trivial.  Many thanks to those who provided assistance with this 
matter.
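
One way to keep a script from mounting a snapshot that is not ready yet
is to retry the mount a few times with a pause in between.  This is only
a sketch under that assumption, not necessarily the fix used here; the
retry helper, the attempt count, and the delay are hypothetical.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds between tries.  Returns 0 on the first success, 1 if every
# attempt failed.
retry() {
    attempts=$1 delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example call (commented out so that nothing runs when this is sourced):
# retry 5 30 /usr/local/bin/sudo /usr/local/sbin/snapshot \
#     mount /var:autogen_bkup /mnt/var
```

Because the helper returns non-zero once the attempts are exhausted, a
ClientRunBeforeJob script built on it would still report a real failure
to Bacula instead of hiding it.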

Thank you,

Chris Morris

-- 
S i x  F e e t  U p  |  Nowhere to go but open source
Silicon Valley: +1 (650) 401-8579 x609
Midwest: +1 (317) 861-5948 x609
Toll-Free: 1-866-SIX-FEET
mailto:[EMAIL PROTECTED]
http://www.sixfeetup.com  |  Zope/Plone Custom Development


