Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread Thomas Mueller

 So is there a way for me to replace that disk without invalidating the
 entire backup? Running on Linux with an external SATA / FW enclosure,
 bacula version 3.0.2
 
 If there is a way to temporarily freeze a job (possible using signals)
 to prevent it from writing anything while using LVM to attempt to
 vgextend; pvmove; vgreduce; vgremove that physical disk could be a
 viable solution.

IMHO, once a job is running, the only way to stop it is to cancel it.
There is no freeze.

Sometime in the future there will be the ability to restart a failed job;
see the Projects file:

Item  1: Ability to restart failed jobs

If you are not a coder, you could use
http://bacula.org/en/?page=makedonation to help get it done.

- Thomas


--
The Next 800 Companies to Lead America's Growth: New Video Whitepaper
David G. Thomson, author of the best-selling book Blueprint to a 
Billion shares his insights and actions to help propel your 
business during the next growth cycle. Listen Now!
http://p.sf.net/sfu/SAP-dev2dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread Mark Luntzel
On Thu, Nov 4, 2010 at 11:14 PM, Thomas Mueller tho...@chaschperli.ch wrote:

 So is there a way for me to replace that disk without invalidating the
 entire backup? Running on Linux with an external SATA / FW enclosure,
 bacula version 3.0.2

 If there is a way to temporarily freeze a job (possible using signals)
 to prevent it from writing anything while using LVM to attempt to
 vgextend; pvmove; vgreduce; vgremove that physical disk could be a
 viable solution.

 IMHO once a job is running, the only way to stop it is to cancel it.
 there is no freeze.

 sometime in the future there will be the ability to restart a failed job,
 see the Projects file:

 Item  1: Ability to restart failed jobs

 If you are not a coder, you could use
 http://bacula.org/en/?page=makedonation to help getting it done.

 - Thomas



That seems to be the general consensus. Thank you.

Just to clear it up for some people who seem to be confused about the
situation: I am not facing a data loss. The backup medium (a single
hard drive) has gone bad. I did not want to cancel a ~8.8T run with
~400GB left on account of a single bad 1TB hard drive, but with no
recourse that is what I had to do.

Anyway, thank you all for your help in this matter.



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread John Drescher
On Fri, Nov 5, 2010 at 10:35 AM, Mark Luntzel m...@luntzel.com wrote:
 On Thu, Nov 4, 2010 at 11:14 PM, Thomas Mueller tho...@chaschperli.ch wrote:

 So is there a way for me to replace that disk without invalidating the
 entire backup? Running on Linux with an external SATA / FW enclosure,
 bacula version 3.0.2

 If there is a way to temporarily freeze a job (possible using signals)
 to prevent it from writing anything while using LVM to attempt to
 vgextend; pvmove; vgreduce; vgremove that physical disk could be a
 viable solution.

 IMHO once a job is running, the only way to stop it is to cancel it.
 there is no freeze.

 sometime in the future there will be the ability to restart a failed job,
 see the Projects file:

 Item  1: Ability to restart failed jobs

 If you are not a coder, you could use
 http://bacula.org/en/?page=makedonation to help getting it done.

 - Thomas
 That seems to be the general consensus. Thank you.


I caution that this method may not work as you expect. Being a
programmer myself, I know that nothing gets done quickly, and thus
nothing gets done cheaply. I would expect that a feature like this, if
you paid a professional developer (who is not doing this voluntarily)
to write the code, would likely cost 10 thousand dollars US.


 Just to clear it up for some people who seem to be confused about the
 situation: I am not facing a data loss. The backup medium (a single
 hard drive) has gone bad. I did not want to cancel a ~8.8T run with
 ~400GB left on account of a single bad 1TB hard drive, but with no
 recourse that is what I had to do.

 Anyway, thank you all for your help in this matter.


I understood the problem. I wanted you to try letting the job complete
on the bad drive and fix the situation after the job completed. This
would only be possible if the drive was mostly functioning.

John



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread Phil Stracchino
On 11/05/10 10:35, Mark Luntzel wrote:
 Just to clear it up for some people who seem to be confused about the
 situation: I am not facing a data loss. The backup medium (a single
 hard drive) has gone bad. I did not want to cancel a ~8.8T run with
 ~400GB left on account of a single bad 1TB hard drive, but with no
 recourse that is what I had to do.

I think the real key thing to take away from this is, when you are
talking datasets of this size, YOU MUST HAVE REDUNDANCY in your storage.
 Otherwise it's like playing Russian roulette with a revolver having an
unknown number of loaded chambers.


-- 
  Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
  ala...@caerllewys.net   ala...@metrocast.net   p...@co.ordinate.org
 Renaissance Man, Unix ronin, Perl hacker, Free Stater
 It's not the years, it's the mileage.



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread Thomas Mueller

 Item  1: Ability to restart failed jobs

 If you are not a coder, you could use
 http://bacula.org/en/?page=makedonation to help getting it done.

 - Thomas
 That seems to be the general consensus. Thank you.


 I caution that this method may not work as you expect. Being a
 programmer myself, I know that nothing gets done quickly and thus
 cheaply. I would expect a feature like this if you paid a professional
 developer (who is not doing this voluntarily) to write code for the task
 would likely cost  10 thousand dollars US.
 
 

I did not say it would be done quickly or cheaply, but it will help to
get it done sometime.

- Thomas




Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread Henry Yen
On Wed, Nov 03, 2010 at 10:34:44AM -0700, Mark Luntzel wrote:
 The answer is probably no but...
 
 I can hear the disk currently being written to making bad noises, and
 the speed is extremely slow. Bad disk for sure, about to fail. This is
 at the end of a multi-Terabyte backup, just about 500 gig left and I
 would REALLY hate to think there is no way out but to start over
 completely.
 
 So is there a way for me to replace that disk without invalidating the
 entire backup? Running on Linux with an external SATA / FW enclosure,
 bacula version 3.0.2

The clear consensus appears to be no.  However, I wonder if it would
be theoretically possible to suspend every Bacula process with a SIGSTOP,
bit-image copy the failing disk, hot swap it, restore the image, then
wake up all the Bacula processes with SIGCONT?  Perhaps the major
difficulty would end up being convincing Linux to ignore the physical
disk-swap event, but that might depend on how the external enclosure
is set up and configured?
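
A minimal sketch of just the signal part of this idea, assuming a Linux
system with /proc. A throwaway `sleep` stands in for the Bacula daemons
so the script is harmless to run; on a real system one would presumably
target bacula-dir, bacula-sd and bacula-fd instead (for example with
`pkill -STOP -x bacula-sd`). The helper name proc_state is made up for
this illustration.

```shell
#!/bin/sh
# Sketch: freeze and thaw a process with SIGSTOP / SIGCONT (Linux, reads /proc).
# A throwaway `sleep` stands in for the Bacula daemons so this is safe to run.

proc_state() {
    # The third field of /proc/PID/stat is the state letter (T = stopped).
    read -r _pid _comm state _rest < "/proc/$1/stat"
    echo "$state"
}

sleep 30 &
pid=$!

kill -STOP "$pid"      # SIGSTOP cannot be caught or ignored by the target
sleep 1
stopped_state=$(proc_state "$pid")

# ... the disk would be imaged and swapped here ...

kill -CONT "$pid"      # resume exactly where the process was suspended
sleep 1
resumed_state=$(proc_state "$pid")

kill "$pid"            # clean up the stand-in process
echo "stopped=$stopped_state resumed=$resumed_state"
```

Note that stopping the processes does not stop the kernel from flushing
data it has already buffered, so a `sync` before imaging the disk would
be prudent, and daemons on other hosts may time out while their peer is
frozen.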

--
Henry Yen   Aegis Information Systems, Inc.
Senior Systems Programmer   Hicksville, New York



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread John Drescher
On Fri, Nov 5, 2010 at 3:09 PM, Thomas Mueller tho...@chaschperli.ch wrote:

 Item  1: Ability to restart failed jobs

 If you are not a coder, you could use
 http://bacula.org/en/?page=makedonation to help getting it done.

 - Thomas
 That seems to be the general consensus. Thank you.


 I caution that this method may not work as you expect. Being a
 programmer myself, I know that nothing gets done quickly and thus
 cheaply. I would expect a feature like this if you paid a professional
 developer (who is not doing this voluntarily) to write code for the task
 would likely cost  10 thousand dollars US.



 i did not say it will be done quickly or cheap. but it will help to get
 it done sometime.


I just wanted to mention that because in open source I see so many
projects set up a bounty or donation fund for some similar development,
and then the users get frustrated when the total is $350 and no work
gets done.

John



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread Bruno Friedmann
On 11/05/2010 04:18 PM, Phil Stracchino wrote:
 On 11/05/10 10:35, Mark Luntzel wrote:
 Just to clear it up for some people who seem to be confused about the
 situation: I am not facing a data loss. The backup medium (a single
 hard drive) has gone bad. I did not want to cancel a ~8.8T run with
 ~400GB left on account of a single bad 1TB hard drive, but with no
 recourse that is what I had to do.
 
 I think the real key thing to take away from this is, when you are
 talking datasets of this size, YOU MUST HAVE REDUNDANCY in your storage.
  Otherwise it's like playing Russian roulette with a revolver having an
 unknown number of loaded chambers.
 
 

Phil, I would have said playing Russian roulette without a missing bullet :-)

-- 

Bruno Friedmann (irc:tigerfoot)
Ioda-Net Sàrl www.ioda-net.ch



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread John Drescher
On Fri, Nov 5, 2010 at 4:49 PM, Bruno Friedmann br...@ioda-net.ch wrote:
 On 11/05/2010 04:18 PM, Phil Stracchino wrote:
 On 11/05/10 10:35, Mark Luntzel wrote:
 Just to clear it up for some people who seem to be confused about the
 situation: I am not facing a data loss. The backup medium (a single
 hard drive) has gone bad. I did not want to cancel a ~8.8T run with
 ~400GB left on account of a single bad 1TB hard drive, but with no
 recourse that is what I had to do.

 I think the real key thing to take away from this is, when you are
 talking datasets of this size, YOU MUST HAVE REDUNDANCY in your storage.
  Otherwise it's like playing Russian roulette with a revolver having an
 unknown number of loaded chambers.



 Phil, I would said playing Russian roulette without a missing ball :-)


I do agree that a single disk is no good for storing your only backup.
However, I would add that a single RAID array containing all of your
backups can also lose your data. I recommend against that as well. But
the same can be said for a tape. The moral of the story is to have at
least 2 copies of whatever data cannot be lost, on more than one
storage device (disk, RAID or tape).

John




Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread Phil Stracchino
On 11/05/10 15:01, Henry Yen wrote:
 The clear consensus appears to be no.  However, I wonder if it would
 be theoretically possible to suspend every bacula process with a SIGSTOP,
 bitimage copy the failing disk, hot swap it, restore the image, then
 wake up all the bacula processes with SIGCONT?  Perhaps the major
 difficulty would end up being convincing linux to ignore the physical
 disk-swap event, but that might depend on how the external enclosure
 is set up and configured?


...So what you have to ask yourself is, do you feel lucky?


-- 
  Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
  ala...@caerllewys.net   ala...@metrocast.net   p...@co.ordinate.org
 Renaissance Man, Unix ronin, Perl hacker, Free Stater
 It's not the years, it's the mileage.



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-05 Thread Phil Stracchino
On 11/05/10 16:49, Bruno Friedmann wrote:
 On 11/05/2010 04:18 PM, Phil Stracchino wrote:
 I think the real key thing to take away from this is, when you are
 talking datasets of this size, YOU MUST HAVE REDUNDANCY in your storage.
  Otherwise it's like playing Russian roulette with a revolver having an
 unknown number of loaded chambers.
 
 Phil, I would said playing Russian roulette without a missing ball :-)

Well, that too.  :)  On a dataset that size with no redundancy, it's not
a question of *if* there's going to be a BANG at some point; it's only a
question of when.


-- 
  Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
  ala...@caerllewys.net   ala...@metrocast.net   p...@co.ordinate.org
 Renaissance Man, Unix ronin, Perl hacker, Free Stater
 It's not the years, it's the mileage.



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-04 Thread Mikael Fridh
On Wed, Nov 3, 2010 at 6:34 PM, Mark Luntzel m...@luntzel.com wrote:
 The answer is probably no but...

 I can hear the disk currently being written to making bad noises, and
 the speed is extremely slow. Bad disk for sure, about to fail. This is
 at the end of a multi-Terabyte backup, just about 500 gig left and I
 would REALLY hate to think there is no way out but to start over
 completely.

 So is there a way for me to replace that disk without invalidating the
 entire backup? Running on Linux with an external SATA / FW enclosure,
 bacula version 3.0.2

If there is a way to temporarily freeze a job (possibly using signals)
to prevent it from writing anything, then using LVM to attempt to
vgextend; pvmove; vgreduce; vgremove that physical disk could be a
viable solution.

Of course you would have to:
1. Use LVM.
2. Have at least as many free physical extents available somewhere else
in the system as the failing disk holds.

However, odds are some extents are unreadable and I'm not sure what
exactly happens when a pvmove fails to read an extent. Anyone
experienced with this particular case?
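
A rough sketch of the LVM sequence described above, assuming the backup
filesystem already lives on a logical volume. The names vg_backup,
/dev/sdX1 and /dev/sdY1 are placeholders, not values from this thread,
and the `run` and `migrate_pv` helpers are made up for illustration. The
last step uses pvremove here, since vgremove would remove the whole
volume group rather than just the old disk. By default the script only
prints the commands.

```shell
#!/bin/sh
# Sketch: migrate all extents off a failing physical volume (PV).
# VG, OLD_PV and NEW_PV are placeholders; override them via the environment.
# Commands are only printed unless DRY_RUN=0 is set.

VG=${VG:-vg_backup}
OLD_PV=${OLD_PV:-/dev/sdX1}
NEW_PV=${NEW_PV:-/dev/sdY1}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

migrate_pv() {
    run pvcreate "$NEW_PV"            # label the replacement disk as a PV
    run vgextend "$VG" "$NEW_PV"      # add it to the volume group
    run pvmove "$OLD_PV" "$NEW_PV"    # move allocated extents off the old PV
    run vgreduce "$VG" "$OLD_PV"      # drop the old PV from the volume group
    run pvremove "$OLD_PV"            # wipe the LVM label from the old disk
}

migrate_pv
```

pvmove can normally run while the logical volume is mounted and in use,
so freezing the job may not be strictly required. How it behaves when
the source disk returns read errors is exactly the open question above.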



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-03 Thread Phil Stracchino
On 11/03/10 13:34, Mark Luntzel wrote:
 The answer is probably no but...
 
 I can hear the disk currently being written to making bad noises, and
 the speed is extremely slow. Bad disk for sure, about to fail. This is
 at the end of a multi-Terabyte backup, just about 500 gig left and I
 would REALLY hate to think there is no way out but to start over
 completely.
 
 So is there a way for me to replace that disk without invalidating the
 entire backup? Running on Linux with an external SATA / FW enclosure,
 bacula version 3.0.2

If it's mirrored, or part of any RAID1 or higher configuration, you can
swap out the failed disk and rebuild the mirror.  If it is a single,
unmirrored disk ... sorry, but if it's failed, you're SOL.

Check the disk with smartmontools or similar before you assume it's bad,
and check your kernel logs for write failures.  Is it possible it's just
thrashing?
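
A small sketch of what those checks might look like, assuming
smartmontools is installed. /dev/sdX is a placeholder device name, the
helper names are made up for illustration, and the grep patterns are
only examples of kernel messages that tend to accompany a failing disk,
not an exhaustive list. The smartctl commands are printed rather than
executed because they need root and a real device.

```shell
#!/bin/sh
# Sketch: gather evidence before declaring a disk dead.
# DISK is a placeholder device name.
DISK=${DISK:-/dev/sdX}

# Filter kernel log text on stdin down to lines that suggest I/O trouble.
# The patterns are examples only.
disk_errors() {
    grep -i -E 'i/o error|medium error|unrecovered read' || true
}

# Print the checks instead of running them.
show_checks() {
    echo "smartctl -H $DISK          # overall health verdict"
    echo "smartctl -A $DISK          # attributes such as reallocated sectors"
    echo "smartctl -d sat -a $DISK   # some USB bridges need SAT pass-through"
    echo "dmesg | disk_errors        # kernel-side read/write failures"
}

show_checks
```

Whether `-d sat` works depends on the bridge chip in the enclosure, so a
failure to read SMART data through an external case does not by itself
say anything about the disk.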


-- 
  Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
  ala...@caerllewys.net   ala...@metrocast.net   p...@co.ordinate.org
 Renaissance Man, Unix ronin, Perl hacker, Free Stater
 It's not the years, it's the mileage.



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-03 Thread Mark Luntzel
On Wed, Nov 3, 2010 at 2:50 PM, Phil Stracchino ala...@metrocast.net wrote:
 On 11/03/10 13:34, Mark Luntzel wrote:
 The answer is probably no but...

 I can hear the disk currently being written to making bad noises, and
 the speed is extremely slow. Bad disk for sure, about to fail. This is
 at the end of a multi-Terabyte backup, just about 500 gig left and I
 would REALLY hate to think there is no way out but to start over
 completely.

 So is there a way for me to replace that disk without invalidating the
 entire backup? Running on Linux with an external SATA / FW enclosure,
 bacula version 3.0.2

 If it's mirrored, or part of any RAID1 or higher configuration, you can
 swap out the failed disk and rebuild the mirror.  If it is a single,
 unmirrored disk ... sorry, but if it's failed, you're SOL.
 Check the disk with smartmontools or similar before you assume it's bad,
 and check your kernel logs for write failures.  Is it possible it's just
 thrashing?

Just to clear up confusion, I am talking about the disk being *written
to*. It is not in a RAID, it is just in an external SATA/FW enclosure.
Since I cannot yet (AFAIK) send SMART commands across the USB/FW bus,
I had to settle for xfs_check, which caused the disk to again make
bad clunking noises and output btree block errors. This could just
indicate filesystem corruption, but the question is moot for the
purposes of this particular job.

I hit the IRC channel up, and after discussing the issue with the fine
folks there, came to the conclusion that I am indeed SOL. Maybe it's
already on a list of nice-to-haves, but I sure would have liked
to not have had to invalidate the whole job on account of one bad
volume.

Thanks for taking the time to reply, sincerely!

-- Mark



Re: [Bacula-users] Bad disk noises - replace the volume without restarting the backup?

2010-11-03 Thread Mehma Sarja
As a one-time disk rescuer, let me mention that 'dd' is your friend.
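
A minimal sketch of the kind of dd invocation usually meant here,
assuming GNU or BSD dd. The function name rescue_copy is made up for
illustration, and the demonstration runs on scratch files rather than
on real devices.

```shell
#!/bin/sh
# Sketch: copy a failing disk with dd, continuing past read errors.
#   conv=noerror  keep going after a read error instead of aborting
#   conv=sync     pad short or failed reads with zeros so offsets stay aligned
rescue_copy() {
    # $1 = source (e.g. the failing disk), $2 = destination (e.g. the new disk)
    dd if="$1" of="$2" bs=4096 conv=noerror,sync
}

# Harmless demonstration on scratch files; real use would name block devices.
src=$(mktemp)
dst=$(mktemp)
trap 'rm -f "$src" "$dst"' EXIT
printf 'backup volume data\n' > "$src"
rescue_copy "$src" "$dst"
```

Because of the padding, a copy of a regular file is rounded up to a
multiple of the block size. A small block size limits how much data is
zeroed around each unreadable sector, at the cost of speed.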

Mehma
===
On 11/3/10 3:16 PM, Mark Luntzel wrote:
 On Wed, Nov 3, 2010 at 2:50 PM, Phil Stracchino ala...@metrocast.net wrote:

 On 11/03/10 13:34, Mark Luntzel wrote:
  
 The answer is probably no but...

 I can hear the disk currently being written to making bad noises, and
 the speed is extremely slow. Bad disk for sure, about to fail. This is
 at the end of a multi-Terabyte backup, just about 500 gig left and I
 would REALLY hate to think there is no way out but to start over
 completely.

 So is there a way for me to replace that disk without invalidating the
 entire backup? Running on Linux with an external SATA / FW enclosure,
 bacula version 3.0.2

 If it's mirrored, or part of any RAID1 or higher configuration, you can
 swap out the failed disk and rebuild the mirror.  If it is a single,
 unmirrored disk ... sorry, but if it's failed, you're SOL.
 Check the disk with smartmontools or similar before you assume it's bad,
 and check your kernel logs for write failures.  Is it possible it's just
 thrashing?
  
 Just to clear up confusion, I am talking about the disk being *written
 to*. It is not in a RAID, it is just in an external SATA/FW enclosure.
 Since I cannot yet (AFAIK) send SMART commands across the USB/FW bus,
 I had to settle with xfs_check. Which caused the disk to again make
 bad clunking noises and output btree block errors. This could just
 indicate filesystem corruption, but the question is moot for the
 purposes of this particular job.

 I hit the IRC channel up, and after discussing the issue with the fine
 folks there, came to the conclusion that I am indeed SOL. Maybe its on
 a list of would be nice to have already, but I sure would have liked
 to not have had to invalidate the whole job on account of one bad
 volume.

 Thanks for taking the time to reply, sincerely!

 -- Mark



