Re: [Bacula-users] 25-hour backup job

2006-09-20 Thread Bill Moran
In response to David Hatcher [EMAIL PROTECTED]:
 Hi folks,
 
 New user here. I find the following peculiar, and I wonder whether one
 big backup job takes longer to complete than several consecutive
 smaller jobs. Here's my story...
 
 Server = bacula-fd Version: 1.38.9, OS=Linux Fedora Core 4
 Client = labssrv-fd Version: 1.38.4, OS=Windows NT 4.0
 
 For my first full backup of my client (labssrv), I set up the
 bacula-dir.conf file (see attached) to back up everything on the C, F, G,
 and H drives.  The summary is below; in particular, the job took 13
 hours to back up 173 GB and resulted in 430 non-fatal FD errors, most of
 which were permission errors.  This sounds fairly reasonable to me.
 
   JobId:                  1150
   Job:                    Labssrv.2006-09-08_19.00.03
   Backup Level:           Full
   Client:                 labssrv-fd Windows NT 4.0,MVS,NT 4.0.1381
   FileSet:                Labssrv FileSet 2006-09-08 22:52:20
   Pool:                   Weekly
   Storage:                SDLT
   Scheduled time:         08-Sep-2006 19:00:02
   Start time:             08-Sep-2006 22:52:23
   End time:               09-Sep-2006 12:39:33
   Elapsed time:           13 hours 47 mins 10 secs
   Priority:               11
   FD Files Written:       296,222
   SD Files Written:       296,222
   FD Bytes Written:       173,119,126,494 (173.1 GB)
   SD Bytes Written:       173,182,590,769 (173.1 GB)
   Rate:                   3488.2 KB/s
   Software Compression:   None
   Volume name(s):         000103|77
   Volume Session Id:      5
   Volume Session Time:    1157757233
   Last Volume Bytes:      25,817,913,037 (25.81 GB)
   Non-fatal FD errors:    430
   SD Errors:              0
   FD termination status:  OK
   SD termination status:  OK
   Termination:            Backup OK -- with warnings
 
 
 Then I went through the details of the permission errors and granted
 access to the respective files and directories on my labssrv machine. In
 addition, I excluded some old archived data that doesn't need to be
 backed up.  The new summary follows.  The job took 25 hours to
 back up 188 GB and resulted in 0 non-fatal FD errors.  I'm surprised it
 took 12 additional hours to back up 15 GB more data, as if there's an
 exponential problem somewhere.  This doesn't make sense to me.  I'm
 thinking about splitting this job into two separate jobs (job one for
 drives C and F, job two for drives G and H) to see if it will complete
 in under 25 hours.  The other clients that I back up finish fairly
 quickly, although those backups are typically less than 100 GB.
 
   JobId:                  1225
   Job:                    Labssrv.2006-09-15_19.00.03
   Backup Level:           Full
   Client:                 labssrv-fd Windows NT 4.0,MVS,NT 4.0.1381
   FileSet:                Labssrv FileSet 2006-09-15 22:50:24
   Pool:                   Weekly
   Storage:                SDLT
   Scheduled time:         15-Sep-2006 19:00:02
   Start time:             15-Sep-2006 22:50:27
   End time:               16-Sep-2006 23:34:34
   Elapsed time:           1 day 44 mins 7 secs
   Priority:               11
   FD Files Written:       309,962
   SD Files Written:       309,962
   FD Bytes Written:       187,971,236,795 (187.9 GB)
   SD Bytes Written:       188,037,056,495 (188.0 GB)
   Rate:                   2110.9 KB/s
   Software Compression:   None
   Volume name(s):         82|000100
   Volume Session Id:      5
   Volume Session Time:    1158352255
   Last Volume Bytes:      39,196,049,239 (39.19 GB)
   Non-fatal FD errors:    0
   SD Errors:              0
   FD termination status:  OK
   SD termination status:  OK
   Termination:            Backup OK
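
As a rough cross-check of the two summaries above, the reported rates can
be reproduced from the FD byte counts and elapsed times. This is only a
sketch: the names JOBS, rate_kb_per_s and marginal_rate_kb_per_s are made
up for illustration, and it assumes the "KB/s" in the summaries means
1000 bytes per second, which is what makes the figures line up. The
"marginal" figure is crude, because the second run covered a different
set of files (some excluded, some newly readable), not the first set plus
15 GB.

```python
from datetime import timedelta

# Figures copied from the two job summaries above.
JOBS = {
    1150: {"fd_bytes": 173_119_126_494,
           "elapsed": timedelta(hours=13, minutes=47, seconds=10)},
    1225: {"fd_bytes": 187_971_236_795,
           "elapsed": timedelta(days=1, minutes=44, seconds=7)},
}


def rate_kb_per_s(job):
    """Average throughput of one job, assuming 1 KB = 1000 bytes."""
    return job["fd_bytes"] / job["elapsed"].total_seconds() / 1000


def marginal_rate_kb_per_s(first, second):
    """Throughput implied if only the extra bytes caused the extra time."""
    extra_bytes = second["fd_bytes"] - first["fd_bytes"]
    extra_secs = (second["elapsed"] - first["elapsed"]).total_seconds()
    return extra_bytes / extra_secs / 1000


if __name__ == "__main__":
    for job_id, job in JOBS.items():
        print(f"JobId {job_id}: {rate_kb_per_s(job):.1f} KB/s")
    extra = JOBS[1225]["elapsed"] - JOBS[1150]["elapsed"]
    print(f"extra elapsed time: {extra}")
    rate = marginal_rate_kb_per_s(JOBS[1150], JOBS[1225])
    print(f"implied marginal rate: {rate:.1f} KB/s")
```

Under those assumptions the two averages match the Rate lines (3488.2
and 2110.9 KB/s). The second job ran just under 11 hours longer for
about 14.9 GB more data, which works out to roughly 377 KB/s for the
additional portion, so a long stall somewhere seems at least as likely
as a uniformly slower run.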

Lots of information missing here -- difficult to help much without
some additional diagnosis.

What DB are you using?  Is it possible that you've pushed the DB
server past some limit where performance starts to degrade, i.e.
created enough records that inserts have become expensive?

If you monitor CPU and I/O usage during the backup, where is the
holdup, and which program (DB server? director? storage daemon?
file daemon?) is using that resource?
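
One way to get a first answer on the Linux server side is to compare two
snapshots of /proc/stat and see how much of the interval the CPUs spent
waiting on I/O. The sketch below is an illustration, not a Bacula tool:
the function names are invented, it only covers the Linux machine (not
the Windows NT client), and it reports a system-wide figure, so a tool
such as top or vmstat is still needed to tie the load to a particular
daemon.

```python
import time


def cpu_times(stat_text):
    """Return the aggregate CPU counters from /proc/stat text as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Keep user, nice, system, idle, iowait, irq, softirq, steal;
            # later guest fields are already counted inside user/nice.
            return [int(value) for value in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_share(before, after):
    """Fraction of CPU time spent in iowait between two snapshots."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    total = sum(deltas)
    return deltas[4] / total if total else 0.0  # index 4 is iowait


def read_stat(path="/proc/stat"):
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    first = read_stat()
    time.sleep(10)  # sampling interval, chosen arbitrarily
    second = read_stat()
    print(f"iowait share over 10 s: {iowait_share(first, second):.1%}")
```

A consistently high iowait share while the job runs would point toward
disk or tape I/O on the server (catalog database or storage daemon); a
low one suggests looking at the client or the network instead.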

Is it possible that you hit something on the DLT tape that caused
it to have to rewind or do a bunch of seeking or something else
that put the whole job in wait mode for a long time?

What is the nature of the data in that directory?  Is it possible
that the FD is contending for access to those files and is spending
a lot of time waiting for them to free up when it tries to grab
them?

Mostly guesses here, but hopefully something will be helpful.

-- 
Bill Moran
Collaborative Fusion Inc.

Re: [Bacula-users] 25-hour backup job

2006-09-20 Thread David Hatcher
Thanks for the response, Bill.  I'll look into some of the points you
mentioned.

Best regards,
David

 


[Bacula-users] 25-hour backup job

2006-09-19 Thread David Hatcher
Hi folks,

New user here. I find the following peculiar, and I wonder whether one
big backup job takes longer to complete than several consecutive
smaller jobs. Here's my story...

Server = bacula-fd Version: 1.38.9, OS=Linux Fedora Core 4
Client = labssrv-fd Version: 1.38.4, OS=Windows NT 4.0

For my first full backup of my client (labssrv), I set up the
bacula-dir.conf file (see attached) to back up everything on the C, F, G,
and H drives.  The summary is below; in particular, the job took 13
hours to back up 173 GB and resulted in 430 non-fatal FD errors, most of
which were permission errors.  This sounds fairly reasonable to me.

  JobId:                  1150
  Job:                    Labssrv.2006-09-08_19.00.03
  Backup Level:           Full
  Client:                 labssrv-fd Windows NT 4.0,MVS,NT 4.0.1381
  FileSet:                Labssrv FileSet 2006-09-08 22:52:20
  Pool:                   Weekly
  Storage:                SDLT
  Scheduled time:         08-Sep-2006 19:00:02
  Start time:             08-Sep-2006 22:52:23
  End time:               09-Sep-2006 12:39:33
  Elapsed time:           13 hours 47 mins 10 secs
  Priority:               11
  FD Files Written:       296,222
  SD Files Written:       296,222
  FD Bytes Written:       173,119,126,494 (173.1 GB)
  SD Bytes Written:       173,182,590,769 (173.1 GB)
  Rate:                   3488.2 KB/s
  Software Compression:   None
  Volume name(s):         000103|77
  Volume Session Id:      5
  Volume Session Time:    1157757233
  Last Volume Bytes:      25,817,913,037 (25.81 GB)
  Non-fatal FD errors:    430
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK -- with warnings


Then I went through the details of the permission errors and granted
access to the respective files and directories on my labssrv machine. In
addition, I excluded some old archived data that doesn't need to be
backed up.  The new summary follows.  The job took 25 hours to
back up 188 GB and resulted in 0 non-fatal FD errors.  I'm surprised it
took 12 additional hours to back up 15 GB more data, as if there's an
exponential problem somewhere.  This doesn't make sense to me.  I'm
thinking about splitting this job into two separate jobs (job one for
drives C and F, job two for drives G and H) to see if it will complete
in under 25 hours.  The other clients that I back up finish fairly
quickly, although those backups are typically less than 100 GB.

  JobId:                  1225
  Job:                    Labssrv.2006-09-15_19.00.03
  Backup Level:           Full
  Client:                 labssrv-fd Windows NT 4.0,MVS,NT 4.0.1381
  FileSet:                Labssrv FileSet 2006-09-15 22:50:24
  Pool:                   Weekly
  Storage:                SDLT
  Scheduled time:         15-Sep-2006 19:00:02
  Start time:             15-Sep-2006 22:50:27
  End time:               16-Sep-2006 23:34:34
  Elapsed time:           1 day 44 mins 7 secs
  Priority:               11
  FD Files Written:       309,962
  SD Files Written:       309,962
  FD Bytes Written:       187,971,236,795 (187.9 GB)
  SD Bytes Written:       188,037,056,495 (188.0 GB)
  Rate:                   2110.9 KB/s
  Software Compression:   None
  Volume name(s):         82|000100
  Volume Session Id:      5
  Volume Session Time:    1158352255
  Last Volume Bytes:      39,196,049,239 (39.19 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK


Thanks!
Dave



bacula-dir.conf
Description: bacula-dir.conf
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users