Re: [Bacula-users] 25-hour backup job
In response to David Hatcher [EMAIL PROTECTED]:

> Hi folks,
>
> New user here. I find the following peculiar and wonder if big backup jobs take longer to complete than running several consecutive smaller jobs? Here's my story...
>
> Server = bacula-fd Version: 1.38.9, OS=Linux Fedora Core 4
> Client = labssrv-fd Version: 1.38.4, OS=Windows NT 4.0
>
> For my first full backup on my client (labssrv), I set up the bacula-dir.conf file (see attached) to back up everything on the C, F, G, and H drives. Below is the summary; in particular, the job took 13 hours to back up 173 GB and resulted in 430 non-fatal FD errors, most of which were permission errors. This sounds fairly reasonable to me.
>
>   JobId:                  1150
>   Job:                    Labssrv.2006-09-08_19.00.03
>   Backup Level:           Full
>   Client:                 labssrv-fd Windows NT 4.0,MVS,NT 4.0.1381
>   FileSet:                Labssrv FileSet 2006-09-08 22:52:20
>   Pool:                   Weekly
>   Storage:                SDLT
>   Scheduled time:         08-Sep-2006 19:00:02
>   Start time:             08-Sep-2006 22:52:23
>   End time:               09-Sep-2006 12:39:33
>   Elapsed time:           13 hours 47 mins 10 secs
>   Priority:               11
>   FD Files Written:       296,222
>   SD Files Written:       296,222
>   FD Bytes Written:       173,119,126,494 (173.1 GB)
>   SD Bytes Written:       173,182,590,769 (173.1 GB)
>   Rate:                   3488.2 KB/s
>   Software Compression:   None
>   Volume name(s):         000103|77
>   Volume Session Id:      5
>   Volume Session Time:    1157757233
>   Last Volume Bytes:      25,817,913,037 (25.81 GB)
>   Non-fatal FD errors:    430
>   SD Errors:              0
>   FD termination status:  OK
>   SD termination status:  OK
>   Termination:            Backup OK -- with warnings
>
> Then I went through the details of the permission errors and granted access to the respective files and directories on my labssrv machine. In addition, I excluded some old archived data that doesn't need to be backed up. Following is the new summary. The job took 25 hours to back up 188 GB and resulted in 0 non-fatal FD errors.
>
> I'm surprised it took 12 additional hours to back up 15 GB of data, as if there's an exponential problem somewhere. This doesn't make sense to me. I'm thinking about splitting this job into two separate jobs (job one for drives C and F, job two for drives G and H) to see if it will complete in under 25 hours. I have other clients that I back up that run fairly quick backup jobs, although the backups are typically less than 100 GB.
>
>   JobId:                  1225
>   Job:                    Labssrv.2006-09-15_19.00.03
>   Backup Level:           Full
>   Client:                 labssrv-fd Windows NT 4.0,MVS,NT 4.0.1381
>   FileSet:                Labssrv FileSet 2006-09-15 22:50:24
>   Pool:                   Weekly
>   Storage:                SDLT
>   Scheduled time:         15-Sep-2006 19:00:02
>   Start time:             15-Sep-2006 22:50:27
>   End time:               16-Sep-2006 23:34:34
>   Elapsed time:           1 day 44 mins 7 secs
>   Priority:               11
>   FD Files Written:       309,962
>   SD Files Written:       309,962
>   FD Bytes Written:       187,971,236,795 (187.9 GB)
>   SD Bytes Written:       188,037,056,495 (188.0 GB)
>   Rate:                   2110.9 KB/s
>   Software Compression:   None
>   Volume name(s):         82|000100
>   Volume Session Id:      5
>   Volume Session Time:    1158352255
>   Last Volume Bytes:      39,196,049,239 (39.19 GB)
>   Non-fatal FD errors:    0
>   SD Errors:              0
>   FD termination status:  OK
>   SD termination status:  OK
>   Termination:            Backup OK

Lots of information missing here -- difficult to help much without some additional diagnosis.

What DB are you using? Is it possible that you've pushed the DB server past some limit where performance starts to degrade, i.e. created enough records that inserts have become expensive?

If you monitor CPU and IO usage during the backup, where is the holdup, and which program (DB server? director? storage daemon? file daemon?) is using that resource?

Is it possible that you hit something on the DLT tape that caused it to rewind, do a bunch of seeking, or something else that put the whole job in wait mode for a long time?

What is the nature of the data in that directory? Is it possible that the FD is contending for access to those files and is spending a lot of time waiting for them to free up when it tries to grab them?

Mostly guesses here, but hopefully something will be helpful.

--
Bill Moran
Collaborative Fusion Inc.
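One way to frame the diagnosis is to ask how much of the extra run time the extra data could account for by itself. The sketch below uses only the byte counts and elapsed times from the two job summaries in this thread. It assumes, purely for illustration, that the second job would have run at the first job's average throughput; the function and variable names are hypothetical.

```python
# Figures copied from the two job summaries in this thread.
FIRST_BYTES = 173_119_126_494                 # FD Bytes Written, JobId 1150
FIRST_SECONDS = 13 * 3600 + 47 * 60 + 10      # 13 hours 47 mins 10 secs
SECOND_BYTES = 187_971_236_795                # FD Bytes Written, JobId 1225
SECOND_SECONDS = 24 * 3600 + 44 * 60 + 7      # 1 day 44 mins 7 secs


def projected_seconds(new_bytes, ref_bytes, ref_seconds):
    """Time new_bytes would take at the reference job's average throughput.

    Assumption: throughput stays constant between runs.
    """
    return new_bytes * ref_seconds / ref_bytes


expected = projected_seconds(SECOND_BYTES, FIRST_BYTES, FIRST_SECONDS)
unexplained = SECOND_SECONDS - expected

print(f"expected at the first job's rate: {expected / 3600:.1f} h")
print(f"actual elapsed time:              {SECOND_SECONDS / 3600:.1f} h")
print(f"not explained by data volume:     {unexplained / 3600:.1f} h")
```

Under that assumption the second job should have needed roughly 15 hours, so close to 10 hours of the run are not explained by the additional data. That points at a drop in throughput rather than at job size, which is what the questions above about the database, the tape drive, and file contention try to narrow down.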
Re: [Bacula-users] 25-hour backup job
Thanks for the response, Bill. I'll look into some of the points you mentioned.

Best regards,
David

-----Original Message-----
From: Bill Moran [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 20, 2006 6:53 AM
To: David Hatcher
Cc: Bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-users] 25-hour backup job
[Bacula-users] 25-hour backup job
Hi folks,

New user here. I find the following peculiar and wonder if big backup jobs take longer to complete than running several consecutive smaller jobs? Here's my story...

Server = bacula-fd Version: 1.38.9, OS=Linux Fedora Core 4
Client = labssrv-fd Version: 1.38.4, OS=Windows NT 4.0

For my first full backup on my client (labssrv), I set up the bacula-dir.conf file (see attached) to back up everything on the C, F, G, and H drives. Below is the summary; in particular, the job took 13 hours to back up 173 GB and resulted in 430 non-fatal FD errors, most of which were permission errors. This sounds fairly reasonable to me.

  JobId:                  1150
  Job:                    Labssrv.2006-09-08_19.00.03
  Backup Level:           Full
  Client:                 labssrv-fd Windows NT 4.0,MVS,NT 4.0.1381
  FileSet:                Labssrv FileSet 2006-09-08 22:52:20
  Pool:                   Weekly
  Storage:                SDLT
  Scheduled time:         08-Sep-2006 19:00:02
  Start time:             08-Sep-2006 22:52:23
  End time:               09-Sep-2006 12:39:33
  Elapsed time:           13 hours 47 mins 10 secs
  Priority:               11
  FD Files Written:       296,222
  SD Files Written:       296,222
  FD Bytes Written:       173,119,126,494 (173.1 GB)
  SD Bytes Written:       173,182,590,769 (173.1 GB)
  Rate:                   3488.2 KB/s
  Software Compression:   None
  Volume name(s):         000103|77
  Volume Session Id:      5
  Volume Session Time:    1157757233
  Last Volume Bytes:      25,817,913,037 (25.81 GB)
  Non-fatal FD errors:    430
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK -- with warnings

Then I went through the details of the permission errors and granted access to the respective files and directories on my labssrv machine. In addition, I excluded some old archived data that doesn't need to be backed up. Following is the new summary. The job took 25 hours to back up 188 GB and resulted in 0 non-fatal FD errors.

I'm surprised it took 12 additional hours to back up 15 GB of data, as if there's an exponential problem somewhere. This doesn't make sense to me. I'm thinking about splitting this job into two separate jobs (job one for drives C and F, job two for drives G and H) to see if it will complete in under 25 hours. I have other clients that I back up that run fairly quick backup jobs, although the backups are typically less than 100 GB.

  JobId:                  1225
  Job:                    Labssrv.2006-09-15_19.00.03
  Backup Level:           Full
  Client:                 labssrv-fd Windows NT 4.0,MVS,NT 4.0.1381
  FileSet:                Labssrv FileSet 2006-09-15 22:50:24
  Pool:                   Weekly
  Storage:                SDLT
  Scheduled time:         15-Sep-2006 19:00:02
  Start time:             15-Sep-2006 22:50:27
  End time:               16-Sep-2006 23:34:34
  Elapsed time:           1 day 44 mins 7 secs
  Priority:               11
  FD Files Written:       309,962
  SD Files Written:       309,962
  FD Bytes Written:       187,971,236,795 (187.9 GB)
  SD Bytes Written:       188,037,056,495 (188.0 GB)
  Rate:                   2110.9 KB/s
  Software Compression:   None
  Volume name(s):         82|000100
  Volume Session Id:      5
  Volume Session Time:    1158352255
  Last Volume Bytes:      39,196,049,239 (39.19 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK

Thanks!
Dave

Attachment: bacula-dir.conf

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
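The figures in the two summaries can be cross-checked with a few lines of Python. This is a minimal sketch, not part of Bacula: the helper names are hypothetical, and the convention that 1 KB = 1000 bytes is an assumption inferred from the numbers in the reports.

```python
import re

# Seconds per unit, for the "Elapsed time" wording used in the summaries.
UNIT_SECONDS = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}


def elapsed_seconds(text):
    """Convert e.g. '1 day 44 mins 7 secs' to a number of seconds."""
    pairs = re.findall(r"(\d+)\s+(day|hour|min|sec)s?", text)
    return sum(int(count) * UNIT_SECONDS[unit] for count, unit in pairs)


def rate_kb_per_s(byte_count, seconds):
    """Average throughput in KB/s, assuming 1 KB = 1000 bytes."""
    return byte_count / seconds / 1000


# JobId -> (Elapsed time, FD Bytes Written), copied from the summaries.
jobs = {
    1150: ("13 hours 47 mins 10 secs", 173_119_126_494),
    1225: ("1 day 44 mins 7 secs", 187_971_236_795),
}

for job_id, (elapsed, fd_bytes) in jobs.items():
    secs = elapsed_seconds(elapsed)
    print(job_id, secs, round(rate_kb_per_s(fd_bytes, secs), 1))

gap_hours = (elapsed_seconds(jobs[1225][0]) - elapsed_seconds(jobs[1150][0])) / 3600
print(f"extra elapsed time: {gap_hours:.2f} h")
```

Dividing the FD byte counts by the elapsed times reproduces the reported rates of 3488.2 and 2110.9 KB/s, so the Rate field appears to be the average over the whole run. The gap between the two runs comes out closer to 11 hours than 12, for about 14.9 GB of additional data.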