Re: [Bacula-users] Win64 bacula-fd times out frequently during RunBeforeNow
On 05/11/2012 01:36 AM, Igor Novgorodov wrote:
> Hi.
>
> Among my backup clients there's a Win2k8R2 virtual machine with MSSQL on it that executes a dump script before making a backup. The script runs for ~30 min, and from time to time (20-30% chance) the backup job fails with:
>
> 11-May 04:43 backup-dir JobId 10787: Fatal error: Socket error on RunBeforeNow command: ERR=Connection timed out
>
> I've got Heartbeat Interval = 60 set in the director globally and in the Storage sections. Maybe I missed some other place to put the heartbeat in?
>
> Thank you.

Igor,

You will usually need to set Heartbeat Interval on the client as well.

Regards,
Avery Ceo
Systems Administrator
Enterprise Hosting
678.317.9019 x7602

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
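As a rough illustration of the advice above, the sketch below embeds a hypothetical bacula-fd.conf (the resource names, password, and working directory are placeholders, not taken from the thread) with Heartbeat Interval set in the FileDaemon resource, plus a small helper that reports client resources lacking the directive. The helper assumes simple, non-nested resources and is not a full Bacula config parser.

```python
import re

# Hypothetical bacula-fd.conf contents; every name and path is a placeholder.
SAMPLE_FD_CONF = """
Director {
  Name = backup-dir
  Password = "not-a-real-password"
}

FileDaemon {
  Name = win2k8-fd
  WorkingDirectory = "C:/bacula/working"
  Heartbeat Interval = 60
}
"""


def resources_without_heartbeat(conf_text, wanted=("filedaemon", "client")):
    """Return the wanted resource types that lack a Heartbeat Interval."""
    missing = []
    # Matches simple "Type { ... }" resources without nested braces.
    for match in re.finditer(r"(?m)^\s*(\w+)\s*\{([^{}]*)\}", conf_text):
        rtype, body = match.group(1).lower(), match.group(2)
        if rtype not in wanted:
            continue
        # Directive names are compared without case or whitespace.
        keys = {
            "".join(line.split("=", 1)[0].split()).lower()
            for line in body.splitlines()
            if "=" in line
        }
        if "heartbeatinterval" not in keys:
            missing.append(rtype)
    return missing


print(resources_without_heartbeat(SAMPLE_FD_CONF))
```

The same check could be pointed at the director and storage daemon configs by passing other resource types in `wanted`; which resources accept the directive should be confirmed against the manual for the Bacula version in use.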
Re: [Bacula-users] VSS reporting files corrupted or unreadable
On Tue, 2011-11-08 at 01:13 +, Joseph L. Casale wrote:
> > Yes, MySQL is running on this server. To the best of my knowledge, no VSS writer exists for MySQL.
>
> MySQL? How the heck is Windows VSS supposed to quiesce MySQL? Methinks you are SOL with that approach; you'd probably have better luck using a run-before script and a MySQL-supported method...

I don't expect MySQL to be in a quiesced state. The client is responsible for scheduling mysqldumps. What I do expect is that the inconsistent database files be read rather than causing the remainder of the backup to fail.

If it were our server rather than a client's, then running a mysqldump in the run-before script is exactly how I would handle consistent database backups, but that still doesn't solve the problem of the unreadable snapshots.

--
Regards,
Avery Ceo
Systems Administrator
Enterprise Hosting
Re: [Bacula-users] VSS reporting files corrupted or unreadable
On Tue, 2011-11-08 at 10:17 +1100, James Harper wrote:
> Are there any other messages in the event logs about the vss snapshot process?

I didn't see anything else significant, but I will look again in a few hours.

> What you describe would indicate that the problem is Windows rather than Bacula. Can you try the following suggestions, in no particular order:
>
> While the backup is running, does the command 'vssadmin list writers' show any writers with errors? The way Bacula uses VSS is fairly simplistic and doesn't involve the writers, but it does give a 'crash consistent' copy of the drive. A writer in an error status would be an indication of a problem though.

Didn't notice any, but will retest.

> Is the MySQL very heavily used? VSS likes to try and find a period of 'idle time' to do its work. Whatever happens, the outcome should never be a corrupt snapshot but maybe you've discovered a bug. Is it possible to make MySQL idle (or stop it altogether but ideally you'd test with the files still in use) and see if the problem persists?

Most of our backups are of VMs, so we cloned a production machine for our test and put it on its own VLAN. With no outside traffic, the database should be pretty quiet. I do not have the DB password to flush_tables_with_read_lock, but I should be able to stop the DB.

> Do a chkdsk /f on the drive where the database is. Probably best to do it on reboot rather than force a dismount of the drive. Obviously the volume is working well enough but there could be some latent corruption or something that only comes out in the snapshot.

Already tried this one.

> Create a snapshot manually. This guy blogs about how to do it http://blogs.msdn.com/b/adioltean/archive/2005/01/20/357836.aspx and map it to a drive letter. The vshadow tool that he talks about is part of the VSS SDK... newer versions are available but I assume this version http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=23490 might do the trick on 2003.
>
> Once you've created the snapshot, see if you can access the files just by copying them to somewhere else. At least then you'll know if you have a general VSS problem or if it is specific to Bacula.

Definitely a promising approach. I will let you know what I find.

> Good luck!
>
> James

--
Regards,
Avery Ceo
Systems Administrator
Enterprise Hosting
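The "try to copy the files out of the snapshot" test suggested above can be approximated by simply reading every file in full and recording which ones fail. The following is a minimal sketch, assuming the snapshot has already been mapped to a drive letter or directory; the function name and the `X:\` path in the usage note are hypothetical.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Read every file under root in full; return (path, error) for failures."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    # Read to end of file so errors in any block surface.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures
```

For example, `find_unreadable_files("X:\\")` could be run against the mapped snapshot and again against the live volume. Files that fail only on the snapshot would point at VSS rather than at Bacula.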
Re: [Bacula-users] VSS reporting files corrupted or unreadable
The answers will make more sense if I rearrange the questions:

On Tue, 2011-11-08 at 08:07 -0500, Avery Ceo wrote:
> On Tue, 2011-11-08 at 10:17 +1100, James Harper wrote:
> > Create a snapshot manually. This guy blogs about how to do it http://blogs.msdn.com/b/adioltean/archive/2005/01/20/357836.aspx and map it to a drive letter. The vshadow tool that he talks about is part of the VSS SDK... newer versions are available but I assume this version http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=23490 might do the trick on 2003.
> >
> > Once you've created the snapshot, see if you can access the files just by copying them to somewhere else. At least then you'll know if you have a general VSS problem or if it is specific to Bacula.
>
> Definitely a promising approach. I will let you know what I find.

No question now - this is a VSS issue with Windows, not Bacula. I ran his other script to copy individual files out of the VSS instead of mapping a persistent snapshot, first against one of the .rar files and then against one of the .MYD files - both came up corrupt.

> > Is the MySQL very heavily used? VSS likes to try and find a period of 'idle time' to do its work. Whatever happens, the outcome should never be a corrupt snapshot but maybe you've discovered a bug. Is it possible to make MySQL idle (or stop it altogether but ideally you'd test with the files still in use) and see if the problem persists?

I stopped the MySQL service and reran the copy against both files. Both reported as corrupt.

> > Are there any other messages in the event logs about the vss snapshot process?
>
> I didn't see anything else significant, but I will look again in a few hours.

Only the standard messages about the service starting and stopping.

> > While the backup is running, does the command 'vssadmin list writers' show any writers with errors? The way Bacula uses VSS is fairly simplistic and doesn't involve the writers, but it does give a 'crash consistent' copy of the drive. A writer in an error status would be an indication of a problem though.
>
> Didn't notice any, but will retest.

I modified the script to pause and hold the shadow copy in place. No writers report errors.

> > Do a chkdsk /f on the drive where the database is. Probably best to do it on reboot rather than force a dismount of the drive. Obviously the volume is working well enough but there could be some latent corruption or something that only comes out in the snapshot.
>
> Already tried this one.

Ran it again anyway, and chkdsk found no errors.

Boy, this is a stumper.

> > Good luck!
> >
> > James

--
Regards,
Avery Ceo
Systems Administrator
Enterprise Hosting
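The "vssadmin list writers" check discussed above is easy to repeat on every test run if the output is scanned automatically. Here is a minimal sketch that picks out writers whose "Last error" is anything other than "No error". The sample text only approximates the general shape of the command's output, and the failing writer in it is invented for illustration.

```python
# Illustrative text in the general shape of `vssadmin list writers` output;
# the writers shown here are examples, not data from the thread.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'Example Writer'
   State: [8] Failed
   Last error: Non-retryable error
"""


def writers_with_errors(output):
    """Return (writer name, last error) pairs for writers reporting an error."""
    problems = []
    name = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Writer name:"):
            name = line.split(":", 1)[1].strip().strip("'")
        elif line.startswith("Last error:") and name is not None:
            error = line.split(":", 1)[1].strip()
            if error.lower() != "no error":
                problems.append((name, error))
    return problems


print(writers_with_errors(SAMPLE_OUTPUT))
```

On the Windows host itself, the text could be captured with `subprocess.run(["vssadmin", "list", "writers"], capture_output=True, text=True).stdout`, typically from an elevated prompt, while the shadow copy is being held in place.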
[Bacula-users] VSS reporting files corrupted or unreadable
We have been running Bacula on Windows with VSS disabled, but would like to turn it on for disaster recovery purposes. In our tests on a fully patched version of Windows Server 2003, we are running into a problem where files that can be backed up without VSS are being reported as corrupted or unreadable when VSS is enabled. I posted a sample log to http://pastebin.com/NRYgmRQy

This does appear to be a Windows VSS problem rather than an issue with the Bacula client, as the system logs report an ntfs error when it occurs. A chkdsk did not help.

Has anybody seen this issue before? Any suggestions on how to resolve it?

Client OS: Microsoft Windows Server 2003, Standard Edition (32 bit)
Bacula version: 5.0.3

--
Regards,
Avery Ceo
Systems Administrator
Enterprise Hosting
Re: [Bacula-users] VSS reporting files corrupted or unreadable
On Mon, 2011-11-07 at 21:59 +0400, Konstantin Khomoutov wrote:
> On Mon, 07 Nov 2011 11:40:18 -0500 Avery Ceo a...@enterprisehostinginc.com wrote:
> > We have been running Bacula on Windows with VSS disabled, but would like to turn it on for disaster recovery purposes. In our tests on a fully patched version of Windows Server 2003, we are running into a problem where files that can be backed up without VSS are being reported as corrupted or unreadable when VSS is enabled. I posted a sample log to http://pastebin.com/NRYgmRQy
> >
> > This does appear to be a Windows VSS problem rather than an issue with the Bacula client, as the system logs report an ntfs error when it occurs. A chkdsk did not help.
> >
> > Has anybody seen this issue before? Any suggestions on how to resolve it?
> >
> > Client OS: Microsoft Windows Server 2003, Standard Edition (32 bit)
> > Bacula version: 5.0.3
>
> Do you run any sort of antivirus software on the server? I have an identical setup for a bunch of servers and have never seen anything like what you described.

This server is not running antivirus. vssadmin also showed no shadow copies other than Bacula's when I ran the test that generated that sample log.

It does seem to be an unusual issue. Searching online, I was only able to find references to anything similar on pre-SP1 machines.

--
Regards,
Avery Ceo
Systems Administrator
Enterprise Hosting
Re: [Bacula-users] VSS reporting files corrupted or unreadable
On Mon, 2011-11-07 at 18:52 +, Joseph L. Casale wrote:
> > Has anybody seen this issue before? Any suggestions on how to resolve it?
>
> So, when you enable a writer in VSS, if for example you only manually snapshot one drive while components under the control of that writer exist on another, VSS will exclude the writer. I see you're trying to back up DBs for SQL? I don't have a lot of experience with Bacula's VSS implementation, but you might need to set inclusions on all drives that have *ANY* SQL data on them, even if it's an empty file.
>
> HTH,
> jlc

Yes, MySQL is running on this server. To the best of my knowledge, no VSS writer exists for MySQL. The current table state may not have been flushed to disk, but attempts to back up the files should not cause errors. Also, the first file that errors is actually a RAR archive and not one of the tables, so I think there is something else going on here.

In my initial tests I was snapshotting and backing up all drives, and seeing the same errors. The errors are only on the second (data) drive, and I thought it might be related to the time lag from when the snapshot was taken to when those files were hit. The log I linked to was generated from the test I ran on just the problem drive to decrease the time between snapshot and read. The errors started with the same file on both tests.

--
Regards,
Avery Ceo
Systems Administrator
Enterprise Hosting