Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
> > So is there a way for me to replace that disk without invalidating the entire backup? Running on Linux with an external SATA / FW enclosure, bacula version 3.0.2
>
> If there is a way to temporarily freeze a job (possibly using signals) to prevent it from writing anything, then using LVM to vgextend; pvmove; vgreduce; vgremove that physical disk could be a viable solution.

IMHO, once a job is running, the only way to stop it is to cancel it. There is no freeze. Sometime in the future there will be the ability to restart a failed job; see the Projects file:

Item 1: Ability to restart failed jobs

If you are not a coder, you could use http://bacula.org/en/?page=makedonation to help get it done.

- Thomas

--
The Next 800 Companies to Lead America's Growth: New Video Whitepaper
David G. Thomson, author of the best-selling book Blueprint to a Billion shares his insights and actions to help propel your business during the next growth cycle. Listen Now!
http://p.sf.net/sfu/SAP-dev2dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On Thu, Nov 4, 2010 at 11:14 PM, Thomas Mueller tho...@chaschperli.ch wrote:
> > > So is there a way for me to replace that disk without invalidating the entire backup? Running on Linux with an external SATA / FW enclosure, bacula version 3.0.2
> >
> > If there is a way to temporarily freeze a job (possibly using signals) to prevent it from writing anything, then using LVM to vgextend; pvmove; vgreduce; vgremove that physical disk could be a viable solution.
>
> IMHO, once a job is running, the only way to stop it is to cancel it. There is no freeze. Sometime in the future there will be the ability to restart a failed job; see the Projects file:
>
> Item 1: Ability to restart failed jobs
>
> If you are not a coder, you could use http://bacula.org/en/?page=makedonation to help get it done.
>
> - Thomas

That seems to be the general consensus. Thank you.

Just to clear it up for some people who seem to be confused about the situation: I am not facing data loss. The backup medium (a single hard drive) has gone bad. I did not want to cancel a ~8.8 TB run with ~400 GB left on account of a single bad 1 TB hard drive, but with no recourse, that is what I had to do.

Anyway, thank you all for your help in this matter.
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On Fri, Nov 5, 2010 at 10:35 AM, Mark Luntzel m...@luntzel.com wrote:
> On Thu, Nov 4, 2010 at 11:14 PM, Thomas Mueller tho...@chaschperli.ch wrote:
> > > > So is there a way for me to replace that disk without invalidating the entire backup? Running on Linux with an external SATA / FW enclosure, bacula version 3.0.2
> > >
> > > If there is a way to temporarily freeze a job (possibly using signals) to prevent it from writing anything, then using LVM to vgextend; pvmove; vgreduce; vgremove that physical disk could be a viable solution.
> >
> > IMHO, once a job is running, the only way to stop it is to cancel it. There is no freeze. Sometime in the future there will be the ability to restart a failed job; see the Projects file:
> >
> > Item 1: Ability to restart failed jobs
> >
> > If you are not a coder, you could use http://bacula.org/en/?page=makedonation to help get it done.
> >
> > - Thomas
>
> That seems to be the general consensus. Thank you.

I caution that this method may not work as you expect. Being a programmer myself, I know that nothing gets done quickly, and thus nothing gets done cheaply. I would expect that paying a professional developer (one who is not doing this voluntarily) to write the code for a feature like this would likely cost about 10 thousand US dollars.

> Just to clear it up for some people who seem to be confused about the situation: I am not facing data loss. The backup medium (a single hard drive) has gone bad. I did not want to cancel a ~8.8 TB run with ~400 GB left on account of a single bad 1 TB hard drive, but with no recourse, that is what I had to do.
>
> Anyway, thank you all for your help in this matter.

I understood the problem. I wanted you to try letting the job complete on the bad drive and fix the situation after the job completed. This would only be possible if the drive was mostly functioning.

John
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On 11/05/10 10:35, Mark Luntzel wrote:
> Just to clear it up for some people who seem to be confused about the situation: I am not facing data loss. The backup medium (a single hard drive) has gone bad. I did not want to cancel a ~8.8 TB run with ~400 GB left on account of a single bad 1 TB hard drive, but with no recourse, that is what I had to do.

I think the real key thing to take away from this is: when you are talking about datasets of this size, YOU MUST HAVE REDUNDANCY in your storage. Otherwise it's like playing Russian roulette with a revolver having an unknown number of loaded chambers.

--
Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
ala...@caerllewys.net ala...@metrocast.net p...@co.ordinate.org
Renaissance Man, Unix ronin, Perl hacker, Free Stater
It's not the years, it's the mileage.
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
> > > Item 1: Ability to restart failed jobs
> > >
> > > If you are not a coder, you could use http://bacula.org/en/?page=makedonation to help get it done.
> > >
> > > - Thomas
> >
> > That seems to be the general consensus. Thank you.
>
> I caution that this method may not work as you expect. Being a programmer myself, I know that nothing gets done quickly, and thus nothing gets done cheaply. I would expect that paying a professional developer (one who is not doing this voluntarily) to write the code for a feature like this would likely cost about 10 thousand US dollars.

I did not say it would be done quickly or cheaply, but it will help to get it done sometime.

- Thomas
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On Wed, Nov 03, 2010 at 10:34:44AM -0700, Mark Luntzel wrote:
> The answer is probably no but...
>
> I can hear the disk currently being written to making bad noises, and the speed is extremely slow. Bad disk for sure, about to fail. This is at the end of a multi-terabyte backup, with just about 500 GB left, and I would REALLY hate to think there is no way out but to start over completely.
>
> So is there a way for me to replace that disk without invalidating the entire backup? Running on Linux with an external SATA / FW enclosure, bacula version 3.0.2

The clear consensus appears to be no. However, I wonder if it would be theoretically possible to suspend every Bacula process with a SIGSTOP, make a bit-image copy of the failing disk, hot-swap it, restore the image, and then wake up all the Bacula processes with SIGCONT? Perhaps the major difficulty would end up being convincing Linux to ignore the physical disk-swap event, but that might depend on how the external enclosure is set up and configured?

--
Henry Yen, Aegis Information Systems, Inc.
Senior Systems Programmer, Hicksville, New York
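The SIGSTOP/SIGCONT idea above can be sketched in a few lines of Python. This is only an illustration of the signal mechanics, not a tested Bacula procedure: how you collect the PIDs of the Bacula daemons is left to you, and the `demo` helper is a hypothetical stand-in that uses a throwaway child process. Stopping a process does not flush data already sitting in the kernel's page cache, and daemons that talk to each other over the network may hit timeouts while one side is stopped, so treat this as a thought experiment.

```python
import os
import signal
import subprocess
import sys


def suspend(pids):
    """Send SIGSTOP to each PID; a process cannot catch or ignore this signal."""
    for pid in pids:
        os.kill(pid, signal.SIGSTOP)


def resume(pids):
    """Send SIGCONT to each PID so the stopped processes continue running."""
    for pid in pids:
        os.kill(pid, signal.SIGCONT)


def demo():
    """Stop and resume a throwaway child process; return (stopped, continued)."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        suspend([child.pid])
        # WUNTRACED makes waitpid report a child that has been stopped.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        resume([child.pid])
        # WCONTINUED makes waitpid report a child that has been resumed.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        continued = os.WIFCONTINUED(status)
        return stopped, continued
    finally:
        child.kill()
        child.wait()
```

In a real attempt the disk imaging and swap would happen between `suspend(...)` and `resume(...)`, which is exactly the part the thread considers risky.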
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On Fri, Nov 5, 2010 at 3:09 PM, Thomas Mueller tho...@chaschperli.ch wrote:
> > > > Item 1: Ability to restart failed jobs
> > > >
> > > > If you are not a coder, you could use http://bacula.org/en/?page=makedonation to help get it done.
> > > >
> > > > - Thomas
> > >
> > > That seems to be the general consensus. Thank you.
> >
> > I caution that this method may not work as you expect. Being a programmer myself, I know that nothing gets done quickly, and thus nothing gets done cheaply. I would expect that paying a professional developer (one who is not doing this voluntarily) to write the code for a feature like this would likely cost about 10 thousand US dollars.
>
> I did not say it would be done quickly or cheaply, but it will help to get it done sometime.

I just wanted to mention that because in open source I see so many projects set up a bounty or collect donations for some similar development, and then the users get frustrated when the total is $350 and no work gets done.

John
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On 11/05/2010 04:18 PM, Phil Stracchino wrote:
> On 11/05/10 10:35, Mark Luntzel wrote:
> > Just to clear it up for some people who seem to be confused about the situation: I am not facing data loss. The backup medium (a single hard drive) has gone bad. I did not want to cancel a ~8.8 TB run with ~400 GB left on account of a single bad 1 TB hard drive, but with no recourse, that is what I had to do.
>
> I think the real key thing to take away from this is: when you are talking about datasets of this size, YOU MUST HAVE REDUNDANCY in your storage. Otherwise it's like playing Russian roulette with a revolver having an unknown number of loaded chambers.

Phil, I would say playing Russian roulette without a missing bullet :-)

--
Bruno Friedmann (irc:tigerfoot)
Ioda-Net Sàrl www.ioda-net.ch
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On Fri, Nov 5, 2010 at 4:49 PM, Bruno Friedmann br...@ioda-net.ch wrote:
> On 11/05/2010 04:18 PM, Phil Stracchino wrote:
> > On 11/05/10 10:35, Mark Luntzel wrote:
> > > Just to clear it up for some people who seem to be confused about the situation: I am not facing data loss. The backup medium (a single hard drive) has gone bad. I did not want to cancel a ~8.8 TB run with ~400 GB left on account of a single bad 1 TB hard drive, but with no recourse, that is what I had to do.
> >
> > I think the real key thing to take away from this is: when you are talking about datasets of this size, YOU MUST HAVE REDUNDANCY in your storage. Otherwise it's like playing Russian roulette with a revolver having an unknown number of loaded chambers.
>
> Phil, I would say playing Russian roulette without a missing bullet :-)

I do agree that a single disk is no good for storing your only backup. However, I would add that a single RAID array containing all of your backups can also destroy your data. I recommend against that as well. But the same can be said for a tape. The moral of the story is to have at least 2 copies of whatever data cannot be lost, on more than one storage device (disk, RAID, or tape).

John
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On 11/05/10 15:01, Henry Yen wrote:
> The clear consensus appears to be no. However, I wonder if it would be theoretically possible to suspend every Bacula process with a SIGSTOP, make a bit-image copy of the failing disk, hot-swap it, restore the image, and then wake up all the Bacula processes with SIGCONT? Perhaps the major difficulty would end up being convincing Linux to ignore the physical disk-swap event, but that might depend on how the external enclosure is set up and configured?

...So what you have to ask yourself is, do you feel lucky?

--
Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
ala...@caerllewys.net ala...@metrocast.net p...@co.ordinate.org
Renaissance Man, Unix ronin, Perl hacker, Free Stater
It's not the years, it's the mileage.
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On 11/05/10 16:49, Bruno Friedmann wrote:
> On 11/05/2010 04:18 PM, Phil Stracchino wrote:
> > I think the real key thing to take away from this is: when you are talking about datasets of this size, YOU MUST HAVE REDUNDANCY in your storage. Otherwise it's like playing Russian roulette with a revolver having an unknown number of loaded chambers.
>
> Phil, I would say playing Russian roulette without a missing bullet :-)

Well, that too. :) On a dataset that size with no redundancy, it's not a question of *if* there's going to be a BANG at some point; it's only a question of when.

--
Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
ala...@caerllewys.net ala...@metrocast.net p...@co.ordinate.org
Renaissance Man, Unix ronin, Perl hacker, Free Stater
It's not the years, it's the mileage.
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On Wed, Nov 3, 2010 at 6:34 PM, Mark Luntzel m...@luntzel.com wrote:
> The answer is probably no but...
>
> I can hear the disk currently being written to making bad noises, and the speed is extremely slow. Bad disk for sure, about to fail. This is at the end of a multi-terabyte backup, with just about 500 GB left, and I would REALLY hate to think there is no way out but to start over completely.
>
> So is there a way for me to replace that disk without invalidating the entire backup? Running on Linux with an external SATA / FW enclosure, bacula version 3.0.2

If there is a way to temporarily freeze a job (possibly using signals) to prevent it from writing anything, then using LVM to vgextend; pvmove; vgreduce; vgremove that physical disk could be a viable solution. Of course, you would have to:

1. Use LVM.
2. Have as many free physical extents available somewhere else in the system as are in use on the failing disk.

However, odds are that some extents are unreadable, and I'm not sure exactly what happens when a pvmove fails to read an extent. Anyone experienced with this particular case?
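The LVM sequence proposed in this message can be written down as an ordered command plan. The sketch below only builds and prints the commands; it does not run them. The volume group name and device paths are hypothetical placeholders. Note one deliberate difference from the message: `vgremove` deletes an entire volume group, so the last step here uses `pvremove`, which is the usual way to retire a single emptied physical volume. As the message says, how `pvmove` behaves when it hits an unreadable extent is an open question, and this sketch does not answer it.

```python
import shlex


def lvm_evacuation_plan(vg, failing_pv, new_pv):
    """Return the commands, in order, that move all extents off failing_pv."""
    return [
        ["pvcreate", new_pv],            # initialise the replacement disk as a PV
        ["vgextend", vg, new_pv],        # add the new PV to the volume group
        ["pvmove", failing_pv, new_pv],  # migrate allocated extents off the bad disk
        ["vgreduce", vg, failing_pv],    # drop the emptied PV from the volume group
        ["pvremove", failing_pv],        # wipe the LVM label from the old disk
    ]


def print_plan(vg, failing_pv, new_pv):
    """Print the plan as shell-quoted command lines for review."""
    for command in lvm_evacuation_plan(vg, failing_pv, new_pv):
        print(shlex.join(command))
```

Calling `print_plan("backupvg", "/dev/sdb1", "/dev/sdc1")` (all three names invented for the example) lists the five commands so they can be reviewed before anyone runs them by hand.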
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On 11/03/10 13:34, Mark Luntzel wrote:
> The answer is probably no but...
>
> I can hear the disk currently being written to making bad noises, and the speed is extremely slow. Bad disk for sure, about to fail. This is at the end of a multi-terabyte backup, with just about 500 GB left, and I would REALLY hate to think there is no way out but to start over completely.
>
> So is there a way for me to replace that disk without invalidating the entire backup? Running on Linux with an external SATA / FW enclosure, bacula version 3.0.2

If it's mirrored, or part of any RAID1 or higher configuration, you can swap out the failed disk and rebuild the mirror. If it is a single, unmirrored disk ... sorry, but if it's failed, you're SOL.

Check the disk with smartmontools or similar before you assume it's bad, and check your kernel logs for write failures. Is it possible it's just thrashing?

--
Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
ala...@caerllewys.net ala...@metrocast.net p...@co.ordinate.org
Renaissance Man, Unix ronin, Perl hacker, Free Stater
It's not the years, it's the mileage.
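Checking the kernel log for failures on one device, as suggested above, could look roughly like the following. This is a minimal sketch under stated assumptions: it assumes the relevant kernel messages contain the phrase "I/O error" together with the device name, and the exact wording varies between kernel versions, so the pattern may need adjusting. The function names are made up for this example.

```python
import re
import subprocess

# Assumption: block-layer failures in the kernel log include the phrase "I/O error".
IO_ERROR = re.compile(r"I/O error", re.IGNORECASE)


def io_error_lines(log_lines, device):
    """Return log lines that mention an I/O error on `device` or one of its partitions."""
    dev = re.compile(r"\b" + re.escape(device) + r"\d*\b")
    return [
        line.rstrip("\n")
        for line in log_lines
        if IO_ERROR.search(line) and dev.search(line)
    ]


def scan_dmesg(device):
    """Run `dmesg` (which may need root on some systems) and filter its output."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return io_error_lines(result.stdout.splitlines(), device)
```

An empty result from `scan_dmesg("sdb")` (device name chosen for illustration) does not prove the disk is healthy; it only means no matching messages were found.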
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
On Wed, Nov 3, 2010 at 2:50 PM, Phil Stracchino ala...@metrocast.net wrote:
> On 11/03/10 13:34, Mark Luntzel wrote:
> > The answer is probably no but...
> >
> > I can hear the disk currently being written to making bad noises, and the speed is extremely slow. Bad disk for sure, about to fail. This is at the end of a multi-terabyte backup, with just about 500 GB left, and I would REALLY hate to think there is no way out but to start over completely.
> >
> > So is there a way for me to replace that disk without invalidating the entire backup? Running on Linux with an external SATA / FW enclosure, bacula version 3.0.2
>
> If it's mirrored, or part of any RAID1 or higher configuration, you can swap out the failed disk and rebuild the mirror. If it is a single, unmirrored disk ... sorry, but if it's failed, you're SOL.
>
> Check the disk with smartmontools or similar before you assume it's bad, and check your kernel logs for write failures. Is it possible it's just thrashing?

Just to clear up confusion, I am talking about the disk being *written to*. It is not in a RAID; it is just in an external SATA/FW enclosure. Since I cannot yet (AFAIK) send SMART commands across the USB/FW bus, I had to settle for xfs_check, which caused the disk to make bad clunking noises again and output btree block errors. This could just indicate filesystem corruption, but the question is moot for the purposes of this particular job.

I hit up the IRC channel and, after discussing the issue with the fine folks there, came to the conclusion that I am indeed SOL. Maybe it's already on a list of nice-to-haves, but I sure would have liked not to have had to invalidate the whole job on account of one bad volume.

Thanks for taking the time to reply, sincerely!

-- Mark
Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?
As a one-time disk rescuer, let me mention that 'dd' is your friend.

Mehma
===

On 11/3/10 3:16 PM, Mark Luntzel wrote:
> On Wed, Nov 3, 2010 at 2:50 PM, Phil Stracchino ala...@metrocast.net wrote:
> > On 11/03/10 13:34, Mark Luntzel wrote:
> > > The answer is probably no but...
> > >
> > > I can hear the disk currently being written to making bad noises, and the speed is extremely slow. Bad disk for sure, about to fail. This is at the end of a multi-terabyte backup, with just about 500 GB left, and I would REALLY hate to think there is no way out but to start over completely.
> > >
> > > So is there a way for me to replace that disk without invalidating the entire backup? Running on Linux with an external SATA / FW enclosure, bacula version 3.0.2
> >
> > If it's mirrored, or part of any RAID1 or higher configuration, you can swap out the failed disk and rebuild the mirror. If it is a single, unmirrored disk ... sorry, but if it's failed, you're SOL.
> >
> > Check the disk with smartmontools or similar before you assume it's bad, and check your kernel logs for write failures. Is it possible it's just thrashing?
>
> Just to clear up confusion, I am talking about the disk being *written to*. It is not in a RAID; it is just in an external SATA/FW enclosure. Since I cannot yet (AFAIK) send SMART commands across the USB/FW bus, I had to settle for xfs_check, which caused the disk to make bad clunking noises again and output btree block errors. This could just indicate filesystem corruption, but the question is moot for the purposes of this particular job.
>
> I hit up the IRC channel and, after discussing the issue with the fine folks there, came to the conclusion that I am indeed SOL. Maybe it's already on a list of nice-to-haves, but I sure would have liked not to have had to invalidate the whole job on account of one bad volume.
>
> Thanks for taking the time to reply, sincerely!
>
> -- Mark
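The `dd` suggestion in this message usually means imaging the failing disk while skipping over unreadable blocks, which `dd` does with `conv=noerror,sync`. The Python sketch below illustrates the same idea and is not a replacement for `dd` or for a purpose-built tool such as GNU ddrescue: it copies block by block and zero-fills any block whose read raises an error, so that later data keeps its original offset. The function name, the default block size, and the injectable `read_block` parameter (there only so the error path can be exercised) are all choices made for this example.

```python
import os


def rescue_copy(src_path, dst_path, block_size=64 * 1024, read_block=os.pread):
    """Copy src to dst block by block, zero-filling blocks that cannot be read.

    Returns the list of byte offsets whose blocks were replaced with zeros.
    """
    bad_offsets = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        size = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < size:
                want = min(block_size, size - offset)
                try:
                    data = read_block(src, want, offset)
                except OSError:
                    # Unreadable block: record it and write zeros in its place.
                    bad_offsets.append(offset)
                    data = b"\0" * want
                if not data:
                    break  # unexpected end of input
                dst.write(data)
                offset += len(data)
    finally:
        os.close(src)
    return bad_offsets
```

Every read stresses a dying disk further, so a single pass with a large block size is the gentlest option; the returned offsets show which regions of the image are not trustworthy.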