Re: [BackupPC-users] RAID and offsite

2011-05-04 Thread Jim Wilcoxson
Holger Parplies wbppc at parplies.de writes:
...
 
 On the other hand, on the computers where it matters (servers, BackupPC), RAID
 1 has been running for years without a real problem (I *have* seen RAID 
 members
 dropped from an array without understandable reasons, but, mostly, re-adding
 them simply worked; more importantly, there was no interruption of service).

In a prior life, I used Linux software RAID 1 (mirroring) in a startup.  We were
dumb and poor, and built servers using mail-order parts, including hard drives.

For interesting reading, look up 'RAID vibration error'.  At the time, this was
a fairly new phenomenon, and even the disk manufacturers didn't understand why
there were very high failure rates of hard drives in RAID configurations.  Most
consumer systems had 1 hard drive, and most enterprise systems used SCSI for
the higher performance.  Both worked fine.

But when companies started using vanilla IDE hard drives in RAID configurations
at very busy sites with a lot of concurrent IO, things broke.  We had a failure
rate of about 10% on Maxtor hard drives, with behavior that has been mentioned
here: the drive fails with a read error, is taken out of the array, but can be
re-added with no problem.  Then later, the drive fails again, at a different
address.  Repeat until you have no more hair to pull out. :)

The problem is that when 2 or more drives are in close proximity and are
accessed concurrently, drive 1 can be sitting over a track waiting for a sector
to rotate around, and a seek in drive 2 can knock drive 1 off track.  This
leads to a read error in drive 1.  If it is part of a RAID, the usual OS
response is to fail the drive since it is mirrored.  If the drive were NOT
mirrored, the OS usually just retries the seek+read, and is successful.  So a
RAID 1 configuration makes this failure mode surface more often.

The drive makers soon realized what was happening and came out with enterprise
class IDE drives.  These have a feature to monitor and adjust the head position
following a seek to ensure the head stays on track.

Once we switched to SCSI drives, this problem went away.

Jim
--
HashBackup: easy Unix onsite/offsite backup
www.hashbackup.com




Re: [BackupPC-users] RAID and offsite

2011-04-29 Thread Holger Parplies
Hi,

Les Mikesell wrote on 2011-04-28 23:15:52 -0500 [Re: [BackupPC-users] RAID and 
offsite]:
 On 4/28/11 9:50 PM, Holger Parplies wrote:
  I'm sure that's a point where we'll all disagree with each other :-).
 
  Personally, I wouldn't use a common set of disks for normal backup operation
  and offsite backups. [...]
 
 I don't think there is anything predictable about disk failure. Handling
 them is probably bad.  Normal (even heavy) use doesn't seem to matter
 unless maybe they overheat.

well, age does matter at *some* point, as does heat. Unless you proactively
replace the disks before that point is reached, they will likely all be old
when the first one fails. Sure, if the first disk fails after a few months,
the others will likely be ok (though I've had a set of 15 identical disks
where about 10 failed within the first 2 years).

  [...] I think it brought up the *wrong* (i.e. faulty) disk of the mirror and
  failed on an fsck. [...]
 
 Grub doesn't know about raid and just happens to work with raid1 because it 
 treats the disk as a single drive.

What's more, grub doesn't know about fsck.

grub found and booted a kernel. The kernel then decided that its root FS on
/dev/md0 consisted of the wrong mirror (or maybe its LVM PV on /dev/md1;
probably both). grub and the BIOS have no part in that decision.

I can see that the remaining drive may fail to boot (which it didn't), but I
*can't* see why an array should be started in degraded mode on the *defective*
mirror when both are present.

 And back in IDE days, a drive failure usually locked the controller which
 might have had another drive on the same cable.

Totally unrelated, but yes. SATA in my case anyway.

  I *have* seen RAID members dropped from an array without understandable
  reasons, but, mostly, re-adding them simply worked [...]
 
 I've seen that too.  I think retries are much more aggressive on single
 disks or the last one left in a raid than on the mirror.

Yes, but a retry needs a read error first. Are retries on single disks always
logged or only on failure?

Or perhaps I should ask this: are retries uncommon enough to warrant failing
array members, yet common enough that a disk that has produced one can still
be trustworthy? How do you handle disks where you see that happen? Replace or
retry?

  [...] there are no guarantees your specific software/kernel/driver/hardware
  combination will not trigger some unknown (or unfixed ;-) bug.
 
 I had a machine with a couple of 4-year uptime runs (a red hat 7.3) where 
 several of the scsi drives failed and were hot-swapped and re-synced with no 
 surprises.  So unless something has broken in the software recently, I mostly 
 trust it.

You mean, your RH 7.3 machine had all software/kernel/driver/hardware
combinations that there are?

Like I said, I've seen (and heard of) strange occurrences, yet, like you, I
mostly trust the software, simply out of lack of choice. I *can't* verify its
correct operation; I could only try to reproduce incorrect operation, were I
to notice it. When something strange happens, I mostly attribute it to user
errors, bugs in file system code, hardware errors (memory or power supply).
RAID software errors are last on my mind. In any case, the benefits seem to
outweigh the doubts.

Yet there remain these few strange occurrences, which may or may not be
RAID-related. On average, every few thousand years, a CPU will randomly
compute an incorrect result for some operation for whatever reason. That is
unlikely enough that any single one of us is extremely unlikely to ever be
affected. But there are enough computers around that it does happen on a daily
basis. Most of the time, the effect is probably benign (random mouse movement,
one incorrect sample in an audio stream, another Windoze bluescreen, whatever).
It might as well be RAID weirdness in one case. Or the RAID weirdness may be
the result of an obscure bug. Complex software *does* contain bugs, you know.

  It *would* help to understand how RAID event counts and the Linux RAID
  implementation in general work. Has anyone got any pointers to good
  documentation?
 
 I've never seen it get this wrong when auto-assembling at reboot (and I move 
 disks around frequently and sometimes clone machines by splitting the mirrors 
 into different machines), but it shouldn't matter in the BPC scenario because 
 you are always manually telling it which partition to add to an already
 running array.

That doesn't exactly answer my question, but I'll take it as a "no, I don't".

Yes, I *did* mention that, I believe, but if your 2 TB resync doesn't complete
before reboot/power failure, then you specifically *don't* have a rebuild
initiated by an 'mdadm --add'; after reboot, you have an auto-assembly (I also
mentioned that). And, agreed, I've also never *seen* it get this wrong when
auto-assembling at reboot (well, except for once, but let's even ignore that).
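
For what it's worth, the event counts themselves are easy to inspect; a
minimal sketch, with md members sda1/sdb1 as placeholders:

    mdadm --examine /dev/sda1 | grep -i events   # per-member event counter
    mdadm --examine /dev/sdb1 | grep -i events

Roughly, assembly trusts the members with the highest event count and treats
members that are far behind as stale.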

My point is that auto-assembly normally takes two (or more) mirrors

Re: [BackupPC-users] RAID and offsite

2011-04-29 Thread Les Mikesell
On 4/29/2011 1:48 AM, Holger Parplies wrote:

 well, age does matter at *some* point, as does heat. Unless you proactively
 replace the disks before that point is reached, they will likely all be old
 when the first one fails. Sure, if the first disk fails after a few months,
 the others will likely be ok (though I've had a set of 15 identical disks
 where about 10 failed within the first 2 years).

I think of it much like light bulbs.  All you know is that they don't 
last forever. Manufacturing batches are probably the most critical 
difference and it's not something you can control.  Anyway, the old rule 
about data is that if something is important you should have at least 3 
copies and don't let the person who destroyed the first 2 touch the last 
one.

 [...] I think it brought up the *wrong* (i.e. faulty) disk of the mirror and
 failed on an fsck. [...]

 Grub doesn't know about raid and just happens to work with raid1 because it
 treats the disk as a single drive.

 What's more, grub doesn't know about fsck.

 grub found and booted a kernel. The kernel then decided that its root FS on
 /dev/md0 consisted of the wrong mirror (or maybe its LVM PV on /dev/md1;
 probably both). grub and the BIOS have no part in that decision.

Sort of... Grub itself is loaded by the BIOS, which may (or may not)
automatically fail over to the alternate disk.  Then it loads the kernel and 
initrd from the disk it was configured to use (but which might not be in 
the same position now).  These can potentially be out of date if one 
copy had been kicked out of the raid and you didn't notice.  But that 
probably wasn't the problem.  The kernel takes over at that point, 
re-detects the drives, assembles the raids, and then looks at the file 
systems.

 I can see that the remaining drive may fail to boot (which it didn't), but I
 *can't* see why an array should be started in degraded mode on the *defective*
 mirror when both are present.

That's going to depend on what broke in the first place. If it went down 
cleanly and both drives work at startup, they should have been assembled 
together.  If you crashed, the raid assembly will be looking at one 
place for the uuid and event counts, where the file system cleanness 
check happens later and looks in a different place.  So the raid 
assembly choice can't have anything to do with the correctness of the 
file system on it.  And just to make things more complicated, I've seen 
cases where bad RAM caused very intermittent problems that included 
differences between the mirror instances that lingered and re-appeared 
randomly after the RAM was fixed.

 I *have* seen RAID members dropped from an array without understandable
 reasons, but, mostly, re-adding them simply worked [...]

 I've seen that too.  I think retries are much more aggressive on single
 disks or the last one left in a raid than on the mirror.

 Yes, but a retry needs a read error first. Are retries on single disks always
 logged or only on failure?

I've seen this with single partitions out of several on the same disk, 
so I don't think it is actually seen as a hardware-level error.  Maybe 
it is just a timeout while the disk does a soft recovery.

 Or perhaps I should ask this: are retries uncommon enough to warrant failing
 array members, yet common enough that a disk that has produced one can still
 be trustworthy? How do you handle disks where you see that happen? Replace or
 retry?

Not sure there's a generic answer. I've replaced drives and not had it 
happen again in some cases.  In at least one case, it did keep happening 
on the swap partition and eventually I stopped adding it back. Much, 
much later the server failed in a way that looked like it was the 
on-board scsi controller.


 [...] there are no guarantees your specific software/kernel/driver/hardware
 combination will not trigger some unknown (or unfixed ;-) bug.

 I had a machine with a couple of 4-year uptime runs (a red hat 7.3) where
 several of the scsi drives failed and were hot-swapped and re-synced with no
 surprises.  So unless something has broken in the software recently, I mostly
 trust it.

 You mean, your RH 7.3 machine had all software/kernel/driver/hardware
 combinations that there are?

No, I mean that the bugs in the software raid1 layer have long been 
ironed out and I expect it to protect against other problems to a 
greater extent than contributing to them.  The physical hard drive 
itself remains as the most likely failure point anyway. And you can 
assume that most of the related software/drivers generally worked or you 
wouldn't have data on the drive to lose.

 Like I said, I've seen (and heard of) strange occurrences, yet, like you, I
 mostly trust the software, simply out of lack of choice. I *can't* verify its
 correct operation;

Yes you can - there is an option to mdadm to verify that the mirrors are 
identical (and fix if they aren't), and the underlying filesystem is 
close enough that you can mount either member partition 
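
For reference, one common way to run that verify on Linux md is the kernel's
sysfs interface; a minimal sketch, assuming the array is md0:

    echo check  > /sys/block/md0/md/sync_action   # compare the mirror halves
    cat /sys/block/md0/md/mismatch_cnt            # sectors that differed
    echo repair > /sys/block/md0/md/sync_action   # rewrite mismatched blocks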

Re: [BackupPC-users] RAID and offsite

2011-04-28 Thread martin f krafft
also sprach Les Mikesell lesmikes...@gmail.com [2011.04.27.1507 +0200]:
  I think a safer alternative would be to do what the OP proposes
  -- that way you always have one safe copy not part of the RAID
  in case something messes up

 You need that anyway, although starting from scratch it doesn't
 matter since you have nothing to lose.  When you were trying to
 convert an old 2-member set to 3, you should have failed/removed
 one of the existing drives (at least logically) until you had
 added the first external and then rotated it offsite, after which
 you would always have the working spare.  There always is a danger
 that the current sync will hit an error on the source leaving the
 whole set unusable, although this should be greatly reduced when
 the source is a 2 member mirror.

There's also DRBD, which provides RAID1-over-the-network. It seems
like an excellent way to maintain an offsite copy of the backup
topdir.
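
For illustration, a minimal sketch of a DRBD resource definition; the
hostnames, devices and addresses below are placeholders, not a tested setup:

    resource r0 {
      protocol C;               # synchronous; protocol A (async) is more
                                # typical over a WAN link
      on backuppc1 {
        device    /dev/drbd0;   # the replicated block device you mount
        disk      /dev/sdb1;    # local backing partition
        address   10.0.0.1:7789;
        meta-disk internal;
      }
      on backuppc2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7789;
        meta-disk internal;
      }
    }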

-- 
martin | http://madduck.net/ | http://two.sentenc.es/
 
security here. yes, ma'am. yes. groucho glasses. yes, we're on it.
 c'mon, guys. somebody gave an aardvark a nose-cut: somebody who
 can't deal with deconstructionist humor. code blue.
  -- http://azure.humbug.org.au/~aj/armadillos.txt
 
spamtraps: madduck.bo...@madduck.net




Re: [BackupPC-users] RAID and offsite

2011-04-28 Thread Les Mikesell
On 4/28/11 2:37 AM, martin f krafft wrote:
 also sprach Les Mikeselllesmikes...@gmail.com  [2011.04.27.1507 +0200]:
 I think a safer alternative would be to do what the OP proposes
 -- that way you always have one safe copy not part of the RAID
 in case something messes up

 You need that anyway, although starting from scratch it doesn't
 matter since you have nothing to lose.  When you were trying to
 convert an old 2-member set to 3, you should have failed/removed
 one of the existing drives (at least logically) until you had
 added the first external and then rotated it offsite, after which
 you would always have the working spare.  There always is a danger
 that the current sync will hit an error on the source leaving the
 whole set unusable, although this should be greatly reduced when
 the source is a 2 member mirror.

 There's also DRBD, which provides RAID1-over-the-network. It seems
 like an excellent way to maintain an offsite copy of the backup
 topdir.

It seems like a good idea, but I don't think anyone has reported what kind of 
bandwidth it needs or what happens if the server crashes mid-write.  It doesn't 
quite give you the safety net of a completely isolated copy that you get with a 
rotating set of disks.

-- 
   Les Mikesell
 lesmikes...@gmail.com





Re: [BackupPC-users] RAID and offsite

2011-04-28 Thread Holger Parplies
Hi,

Michael Conner wrote on 2011-04-27 10:27:18 -0500 [Re: [BackupPC-users] RAID 
and offsite]:
 On Apr 26, 2011, at 12:08 PM, Les Mikesell wrote:
  On 4/26/2011 11:38 AM, Michael Conner wrote:
  [...]
  Someone used a RAID 1 setup but only put in the second disk periodically,
  then removed it for offsite storage. I have three 2T drives, so was
  considering something similar where I would keep a normal 2-disk RAID 1
  setup but periodically remove one disk and replace it with a prior
  offsite disk.

just to summarize what has been posted so far:

1.) Having an *additional* disk (i.e. 3-disk RAID 1 with 2 permanent and 1
offsite member) protects you against single disk failures during rebuild.
Other failures (software, hardware, controller, lightning, etc.) can still
do harm, so it is still not perfect, but I think there is no disagreement
on that the additional RAID member does add protection against one very
real failure scenario.

2.) You really need more than one offsite disk, if you are taking offsite
seriously. I.e. bringing the disk on-site, failing one RAID member, adding
the previous offsite disk, and then taking the new offsite disk off-site
will temporarily have all disks on-site. That may or may not be of concern
for you, but it is worth emphasizing.
On the other hand, first failing one RAID member, taking it off-site, then
bringing in the other disk and adding it, will leave you with a degraded
RAID for a considerable amount of time (and may not work for you, depending
on how often you want to resync).

With just 4 disks, you can have both a permanent 2-way RAID 1 (3 members, one
only connected for resync) and one copy always offsite. Normally, you keep
both offsite disks offsite, and bring them in alternately to resync.

  [...]
  But, note that even though you don't technically have to stop/unmount 
  the raid while doing the sync, realistically it doesn't perform well 
  enough to do backups at the same time. I use a cron job to start the 
  sync very early in the morning so it will complete before backups would 
  start.

How do you schedule the sync? (Or are you just talking about hot-adding the
disk via cron?)

 All my sata drives are external internals. That is, they are connected to a
 PCI SATA controller but since there are no bays to install them in the
 computer chassis, I just run the cables outside through a PCI slot bar.
 Still have to figure out a long-term housing solution. At least they
 are easy to access.

I don't think eSATA has any real disadvantages over SATA performance-wise.
Sure, you have external cabling and one or more separate power supplies as
additional points of failure. But if you have that anyway, you might as well
use standard cables that somewhat facilitate handling. Or buy a computer
chassis that will accommodate your drives (and use eSATA for the offsite
drive(s)).

 So I would be ok doing something like this:
 Stop BPC process
 Unmount raid array (md0 made up of sda1 and sdb1)
 Use mdadm to remove sdb1 from the array

Assuming you want to remount your file system and restart BackupPC, you can do
so at this point (or later). As Les said, your performance may vary :).

 Take off the sdb drive, attach offsite one in its place

Assuming your kernel/SATA-driver/SATA-chipset can handle hotswapping ...
otherwise you'd need to reboot here.

 Use mdadm to add sdb1 to md0 and reconstruct
 
 Maybe cycle through whether I remove sda or sdb so all drives get used
 about the same amount over time.

I'm sure that's a point where we'll all disagree with each other :-).

Personally, I wouldn't use a common set of disks for normal backup operation
and offsite backups. BackupPC puts considerable wear on its pool disks. At
some point in time, you'll either have failing disks or proactively want to
replace disks before they start failing. Are you sure you want to think about
failing pool disks and failing offsite backup disks at the same time (i.e.
correlated)? I assume, failing pool disks are one of the things you want to
protect against with offsite backups. So why use backup media that are likely
to begin failing just when you'll need them?

 My main concerns were: can I remount and use md0 while it is rebuilding and
 that there is no danger of the array rebuilding to the state of the newly
 attached drive (I'm very paranoid).

I can understand that. I used RAID 1 in one of my computers (root FS, system,
data) for a time simply for the purpose of gaining experience with RAID 1. I
didn't notice much (except for the noise of the additional disk) until one
disk had some sort of problem. I don't remember the details, but I recall that
I had expected the computer to boot unattended (well, the 'reboot' was
manual ... or was it actually a crash that triggered the problem?), which it
didn't. I think it brought up the *wrong* (i.e. faulty) disk of the mirror and
failed on an fsck. Physically removing the faulty disk

Re: [BackupPC-users] RAID and offsite

2011-04-28 Thread Holger Parplies
Hi,

Les Mikesell wrote on 2011-04-27 20:36:13 -0500 [Re: [BackupPC-users] RAID and 
offsite]:
 On 4/27/11 7:10 PM, Chris Parsons wrote:
  On 28/04/2011 6:52 AM, Les Mikesell wrote:
  I've forgotten the original context, but if it is setting up a new
  system you don't have much to lose in the initial sync - and by the time
  you do, you should have another copy already stored offsite.
 
  In this case, why involve the complexities of RAID at all. Just use
  individual disks, each with their own pool and rotate them. If a disk
  fails, you only lose that pool. It avoids all the complexities of raid
  - and the danger of raid corruption. I don't see any point in involving
  raid until you need to span pools over more than one disk.
 
 Most backups do double duty. One use is for complete system/disaster
 recovery, one is for when you realize the next day that you deleted
 something you need.   Backuppc is particularly good for the latter, more
 frequent occurrence, but if you've just swapped an old disk back you won't
 have access to the copy you are most likely to need.  You'll also be
 copying more than necessary with older reference copies, but that is less
 likely to be a real problem.

Aside from this very important point, some minor reasons spring to mind ...

1.) n-way RAID 1 gives you a theoretical increase of *read* throughput by a
factor of n. Aside from that, it can save you some head seeks, further
speeding up operation. As it happens, BackupPC prefers reading from the
pool over writing to it, when it has a choice (actually, it prefers
decompression over compression, but the result is the same).
So, at least in theory, RAID 1 can speed up your backups.

2.) Although we usually forget it, RAID is about *uninterrupted* operation.
If your disk dies, your server doesn't go down. With RAID, you *might*
be able to go and buy a replacement disk, plug it into the computer,
and resync the data without any of your users ever noticing anything
(except for the slowdown due to the resync, but they'll probably just
complain "my internet is broken" ;-).

3.) If a disk fails, and you only lose that pool, that pool may contain
backups you vitally need. Though the other pools probably contain backups
close by, that may not be good enough. While you can't avoid losing
young backups (from after the last resync), you *can* avoid losing older
backups.

But, if these points are not important in your case, using independent pools
may very well be an option. As a variation, you could even have offsite
BackupPC servers doing backups alternately (server 1 on day 1, server 2 on
day 2, server 3 on day 3, server 1 on day 4, etc.) if your backup clients
(or network bandwidth) can't take the impact of more than one backup per
day. That way, you would have all of the backup history online (though
spread over several servers), and a disastrous event at any one site
would leave you the remaining pools.
Bandwidth permitting, of course.

Regards,
Holger



Re: [BackupPC-users] RAID and offsite

2011-04-28 Thread Les Mikesell
On 4/28/11 9:50 PM, Holger Parplies wrote:

 [...]
 But, note that even though you don't technically have to stop/unmount
 the raid while doing the sync, realistically it doesn't perform well
 enough to do backups at the same time. I use a cron job to start the
 sync very early in the morning so it will complete before backups would
 start.

 How do you schedule the sync? (Or are you just talking about hot-adding the
 disk via cron?)

I have trayless hot-swap SATA bays and physically put the disk in the day 
before, then have an 'mdadm --add ... ' command in cron at about 3 am when the 
backups are predictably complete.  The disk is recognized automatically when 
inserted but isn't used until the mdadm command adds it.  Normally I break the 
raid and remove it at the end of the day, but it doesn't really hurt to leave it
in as long as the sync completes before the nightly runs start.
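
Concretely, the cron entry might look something like this (a sketch in
/etc/crontab format; device names and timing are placeholders):

    # re-add the swapped-in member once the nightly backups are done
    0 3 * * *  root  /sbin/mdadm --add /dev/md0 /dev/sdb1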

 Take off the sdb drive, attach offsite one in its place

 Assuming your kernel/SATA-driver/SATA-chipset can handle hotswapping ...
 otherwise you'd need to reboot here.

Most do - although I do have a Paradise card that doesn't.

 Use mdadm to add sdb1 to md0 and reconstruct

 Maybe cycle through whether I remove sda or sdb so all drives get used
 about the same amount over time.

 I'm sure that's a point where we'll all disagree with each other :-).

 Personally, I wouldn't use a common set of disks for normal backup operation
 and offsite backups. BackupPC puts considerable wear on its pool disks. At
 some point in time, you'll either have failing disks or proactively want to
 replace disks before they start failing. Are you sure you want to think about
 failing pool disks and failing offsite backup disks at the same time (i.e.
 correlated)? I assume, failing pool disks are one of the things you want to
 protect against with offsite backups. So why use backup media that are likely
 to begin failing just when you'll need them?

I don't think there is anything predictable about disk failure. Handling them is
probably bad.  Normal (even heavy) use doesn't seem to matter unless maybe they
overheat.

 My main concerns were: can I remount and use md0 while it is rebuilding and
 that there is no danger of the array rebuilding to the state of the newly
 attached drive (I'm very paranoid).

 I can understand that. I used RAID 1 in one of my computers (root FS, system,
 data) for a time simply for the purpose of gaining experience with RAID 1. I
 didn't notice much (except for the noise of the additional disk) until one
 disk had some sort of problem. I don't remember the details, but I recall that
 I had expected the computer to boot unattendedly (well, the 'reboot' was
 manual ... or was it actually a crash that triggered the problem?), which it
 didn't. I think it brought up the *wrong* (i.e. faulty) disk of the mirror and
 failed on an fsck. Physically removing the faulty disk corrected the 
 problem.
 Somewhat disappointing. What's more, *both* disks are now working flawlessly
 in separate computers, so I'm really clueless what the problem was in the
 first place. Sounds like a software error, much like in Jeffrey's case.

Grub doesn't know about raid and just happens to work with raid1 because it 
treats the disk as a single drive.  What happens when booting the 2nd member 
depends on how your bios treats the drive, whether bios and grub agree on the 
device identifier after booting, and whether it was what you expected when you 
installed grub on it (when you still had a working primary drive).  And back in 
IDE days, a drive failure usually locked the controller which might have had 
another drive on the same cable.

 On the other hand, on the computers where it matters (servers, BackupPC), RAID
 1 has been running for years without a real problem (I *have* seen RAID 
 members
 dropped from an array without understandable reasons, but, mostly, re-adding
 them simply worked; more importantly, there was no interruption of service).

I've seen that too.  I think retries are much more aggressive on single disks
or the last one left in a raid than on the mirror.

 I guess that simply means: test it before you rely on it working. Many people
 are using Linux RAID 1 in production environments, so it appears to work well
 enough, but there are no guarantees your specific
 software/kernel/driver/hardware combination will not trigger some unknown (or
 unfixed ;-) bug.

I had a machine with a couple of 4-year uptime runs (a red hat 7.3) where 
several of the scsi drives failed and were hot-swapped and re-synced with no 
surprises.  So unless something has broken in the software recently, I mostly 
trust it.

 It *would* help to understand how RAID event counts and the Linux RAID
 implementation in general work. Has anyone got any pointers to good
 documentation?

I've never seen it get this wrong when auto-assembling at reboot (and I move 
disks around frequently and sometimes clone machines by splitting the mirrors 
into 

Re: [BackupPC-users] RAID and offsite

2011-04-28 Thread Adam Goryachev

On 29/04/11 12:50, Holger Parplies wrote:
 I'm sure that's a point where we'll all disagree with each other :-).

No doubt :)

 Personally, I wouldn't use a common set of disks for normal backup operation
 and offsite backups. BackupPC puts considerable wear on its pool disks. At
 some point in time, you'll either have failing disks or proactively want to
 replace disks before they start failing. Are you sure you want to think about
 failing pool disks and failing offsite backup disks at the same time (i.e.
 correlated)? I assume, failing pool disks are one of the things you want to
 protect against with offsite backups. So why use backup media that are likely
 to begin failing just when you'll need them?

I strongly suggest everyone read the Google paper on hard disk drive failures
('Failure Trends in a Large Disk Drive Population'), and methods for
predicting them...

One of the issues they noted was that age wasn't a factor after the
first few months, and before about 5 years or more (I forget the exact
numbers...)

Also, usage wasn't a great factor, though all their drives are powered
on all the time :)

I've read parts of it a few times, and think most people here would find
it interesting reading if you have the time... It touches on temperature
of drives as well.

Regards,
Adam

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Adam Goryachev

On 27/04/11 15:44, Jeffrey J. Kosowsky wrote:
 Les Mikesell wrote at about 12:08:22 -0500 on Tuesday, April 26,
 2011:
 On 4/26/2011 11:38 AM, Michael Conner wrote:
 However, another idea intrigued me that I saw in an earlier
 posting. Someone used a RAID 1 setup but only put in the second
 disk periodically, then removed it for offsite storage. I have
 three 2T drives, so was considering something similar where I
 would keep a normal 2-disk RAID 1 setup but periodically remove
 one disk and replace it with a prior offsite disk.
 It is working for me, but I use a 3-member RAID1 where 2 are always
  connected and the 3rd is rotated out periodically.  This isn't
 really necessary but when I was first trying it with one internal,
 one external drive the internal one failed, corrupting the attached
 external, and it was something of a hassle to rebuild from the
 remaining offsite external.
 I did it that way where the 3rd 'backup' drive was mounted via USB
 and had a *catastrophic* failure where something went wrong with the
 3rd drive causing all three RAID1 members to become corrupted. I'm
 not sure exactly what but I ended up losing 2 years of backups.

I suppose everyone has had their own experience with all these things,
but at the end of the day, this is a pure risk/benefit analysis

 I think a safer alternative would be to do what the OP proposes -- 
 that way you always have one safe copy not part of the RAID in case 
 something messes up

In other words, you are increasing the risk that your one remaining HDD will
fail during the time that you are syncing the latest removable drive you
just plugged in. You are potentially decreasing your risk of corrupting
your RAID array due to some unknown hardware issue with a previously
used USB HDD.

As opposed to decreasing your risk to a single disk failure during a
RAID1 resync (when adding the third disk), and increasing the risk of
array corruption due to some weird code problem that you may have
experienced some time in the past...

Not to minimise the issues you had, just wanted to remind everyone to
properly analyse the risks with the different options they select.

Regards,
Adam

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Jeffrey J. Kosowsky
Adam Goryachev wrote at about 16:08:56 +1000 on Wednesday, April 27, 2011:
  On 27/04/11 15:44, Jeffrey J. Kosowsky wrote:
   Les Mikesell wrote at about 12:08:22 -0500 on Tuesday, April 26,
   2011:
   On 4/26/2011 11:38 AM, Michael Conner wrote:
   However, another idea intrigued me that I saw in an earlier
   posting. Someone used a RAID 1 setup but only put in the second
   disk periodically, then removed it for offsite storage. I have
   three 2T drives, so was considering something similar where I
   would keep a normal 2-disk RAID 1 setup but periodically remove
   one disk and replace it with a prior offsite disk.
   It is working for me, but I use a 3-member RAID1 where 2 are always
connected and the 3rd is rotated out periodically.  This isn't
   really necessary but when I was first trying it with one internal,
   one external drive the internal one failed, corrupting the attached
   external, and it was something of a hassle to rebuild from the
   remaining offsite external.
   I did it that way where the 3rd 'backup' drive was mounted via USB
   and had a *catastrophic* failure where something went wrong with the
   3rd drive causing all three RAID1 members to become corrupted. I'm
   not sure exactly what but I ended up losing 2 years of backups.
  
  I suppose everyone has had their own experience with all these things,
  but at the end of the day, this is a pure risk/benefit analysis
  
   I think a safer alternative would be to do what the OP proposes -- 
   that way you always have one safe copy not part of the RAID in case 
   something messes up
  
  In other words, you are increasing the risk of your one remaining HDD to
  fail during the time that you are syncing the latest removable drive you
  just plugged in. You are potentially decreasing your risk of corrupting
  your RAID array due to some unknown hardware issue with a previously
  used USB HDD.
  

But you still have a removed spare that was up to date as of the time
you removed it, just prior to inserting your backup drive. So, at most
you lost a few hours of a concurrently running backuppc process, and if
you halted backuppc prior to syncing (as per Les's approach -- which
is a good idea to prevent disk thrashing anyway) then you haven't lost
anything, assuming you have nothing else running on that disk.

  As opposed to decreasing your risk to a single disk failure during a
  RAID1 resync (when adding the third disk), and increasing the risk of
  array corruption due to some weird code problem that you may have
  experienced some time in the past...
  
  Not to minimise the issues you had, just wanted to remind everyone to
  properly analyse the risks with the different options they select.

I still think that losing all 3 (which however unlikely is still
possible) is way, way, way worse than potentially losing 1-2 out of 3
and still having a spare to recover (carefully) from. And my case can
occur if you lose a disk controller or if there is a transient or if
you do something stupid and overwrite the disk, etc...



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Les Mikesell
On 4/27/11 12:44 AM, Jeffrey J. Kosowsky wrote:
 Les Mikesell wrote at about 12:08:22 -0500 on Tuesday, April 26, 2011:
 On 4/26/2011 11:38 AM, Michael Conner wrote:
   I installed BPC a few weeks ago and have been doing testing and setup 
 since then and have things working pretty well on several linux, windows, and 
 mac clients (ultimately there will be about 15 clients). The server is a Dell 
 2400 with a 160gb ide drive, Centos 5.6, BPC 3.1 installed with yum from the 
 testing repos. I've added a sata controller and two 2t drives as a raid 1 
 setup, which is what I'll use for real backups. I can't boot off the sata 
 drives, so I boot from the ide drive and put topdir on the satas.
 
   I've done some searching on offsite backups as I would like to 
 maintain at least a recent copy offsite as disaster protection. DD has been 
 too slow for these large drives (I would have to leave it going overnight 
 with no backups running). I may go with periodic archives using the BPC 
 archive function.
 
   However, another idea intrigued me that I saw in an earlier posting. 
 Someone used a RAID 1 setup but only put in the second disk periodically, 
 then removed it for offsite storage. I have three 2T drives, so was 
 considering something similar where I would keep a normal 2-disk RAID 1 setup 
 but periodically remove one disk and replace it with a prior offsite disk.
 
   Not being particularly experienced in all this, I was hoping someone 
 on the list could offer advice on whether this was a good ideal or not and 
 potential pitfalls.
   
 It is working for me, but I use a 3-member RAID1 where 2 are always
 connected and the 3rd is rotated out periodically.  This isn't really
 necessary but when I was first trying it with one internal, one external
 drive the internal one failed, corrupting the attached external, and it
 was something of a hassle to rebuild from the remaining offsite external.

 I did it that way where the 3rd 'backup' drive was mounted via USB and
 had a *catastrophic* failure where something went wrong with the 3rd
 drive causing all three RAID1 members to become corrupted. I'm not sure
 exactly what but I ended up losing 2 years of backups.

I think that's similar to saying "lightning struck my house while using RAID, so
I'll never use it again."  It's actually very difficult to get mdadm to do 
something wrong even if you try since it warns you if you try to use partitions 
that are busy elsewhere or the wrong size.  I wouldn't recommend using USB in 
production though, and would guess the failure was more likely related to 
hardware/drivers than the software raid layer.   Firewire/esata are OK for 
external enclosures and swappable internal sata bays are even nicer.

 I think a safer alternative would be to do what the OP proposes --
 that way you always have one safe copy not part of the RAID in case
 something messes up

You need that anyway, although starting from scratch it doesn't matter since you
have nothing to lose.  When you were trying to convert an old 2-member set to 3,
you should have failed/removed one of the existing drives (at least logically)
until you had added the first external and then rotated it offsite, after which
you would always have the working spare.  There always is a danger that the
current sync will hit an error on the source leaving the whole set unusable,
although this should be greatly reduced when the source is a 2 member mirror.

-- 
   Les Mikesell
 lesmikes...@gmail.com



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Adam Goryachev

Discussing the differences between using a 2 member RAID1 where one
device is periodically removed and rotated offsite, and a 3 member RAID1
where one device is periodically removed and rotated offsite.

On 27/04/11 22:47, Jeffrey J. Kosowsky wrote:
 I still think that losing all 3 (which however unlikely is still
 possible) is way, way, way, worse than potentially losing 1-2 out of 3
 and still having a spare to recover (carefully) from. And my case can
 occur if you lose a disk controller or if there is a transient or if
 you do something stupid and overwrite the disk, etc...

I'm not sure why you think you would lose 3 out of 3? Sure, with the 3
member RAID1 setup, you need one additional drive compared to your 2
device RAID1 setup.

ie, 2 RAID1 devices are always installed, another 3 devices are rotated,
with one offsite, one onsite, and the third in one of (transit, offsite,
onsite). Thus, in your 'worst case' of the three member RAID1 going bad,
then you still have the device you just failed out of the RAID1, as well
as the final offsite RAID1 member. Thus you need to lose both RAID1
devices during a resync before you need to rely on the device you just
removed, or finally the offsite copy.

Alternatively, with a max 2 member RAID1 device, you only need to lose
one drive during a resync to have the same result.

BTW interface failure, controller failure, etc are all of equal risk in
either case, so IMHO, that can be ignored as far as determining the
cost/benefit of these two particular cases (but of course, should also
be considered somewhere else).

Regards,
Adam

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Michael Conner

On Apr 26, 2011, at 12:08 PM, Les Mikesell wrote:

 On 4/26/2011 11:38 AM, Michael Conner wrote:
 I installed BPC a few weeks ago and have been doing testing and setup since 
 then and have things working pretty well on several linux, windows, and mac 
 clients (ultimately there will be about 15 clients). The server is a Dell 
 2400 with a 160gb ide drive, Centos 5.6, BPC 3.1 installed with yum from the 
 testing repos. I've added a sata controller and two 2t drives as a raid 1 
 setup, which is what I'll use for real backups. I can't boot off the sata 
 drives, so I boot from the ide drive and put topdir on the satas.
 
 I've done some searching on offsite backups as I would like to maintain at 
 least a recent copy offsite as disaster protection. DD has been too slow for 
 these large drives (I would have to leave it going overnight with no backups 
 running). I may go with periodic archives using the BPC archive function.
 
 However, another idea intrigued me that I saw in an earlier posting. Someone 
 used a RAID 1 setup but only put in the second disk periodically, then 
 removed it for offsite storage. I have three 2T drives, so was considering 
 something similar where I would keep a normal 2-disk RAID 1 setup but 
 periodically remove one disk and replace it with a prior offsite disk.
 
 Not being particularly experienced in all this, I was hoping someone on the 
 list could offer advice on whether this was a good ideal or not and 
 potential pitfalls.
 
 It is working for me, but I use a 3-member RAID1 where 2 are always 
 connected and the 3rd is rotated out periodically.  This isn't really 
 necessary but when I was first trying it with one internal, one external 
 drive the internal one failed, corrupting the attached external, and it 
 was something of a hassle to rebuild from the remaining offsite external.
 
 But, note that even though you don't technically have to stop/unmount 
 the raid while doing the sync, realistically it doesn't perform well 
 enough to do backups at the same time. I use a cron job to start the 
 sync very early in the morning so it will complete before backups would 
 start.
 
 -- 
   Les Mikesell
 lesmikes...@gmail.com

All my sata drives are external internals. That is, they are connected to a PCI 
SATA controller but since there are no bays to install them in the computer 
chassis, I just run the cables outside through a PCI slot bar. Still have to 
figure out a long-term housing solution. At least they are easy to access.

So I would be ok doing something like this:
Stop BPC process
Unmount raid array (md0 made up of sda1 and sdb1)
Use mdadm to remove sdb1 from the array
Take off the sdb drive, attach offsite one in its place
Use mdadm to add sdb1 to md0 and reconstruct
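
In mdadm terms, that sequence might look roughly like this (a sketch using
the device names above; not a tested script):

    service backuppc stop                     # stop BPC so the pool is idle
    umount /dev/md0                           # after stopping BPC
    mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
    # ...physically swap in the offsite drive, partitioned like sda...
    mdadm /dev/md0 --add /dev/sdb1            # resync copies sda1 onto sdb1
    cat /proc/mdstat                          # watch the rebuild progress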

Maybe cycle through whether I remove sda or sdb so all drives get used about 
the same amount over time.

My main concerns were: can I remount and use md0 while it is rebuilding, and 
is there any danger of the array rebuilding to the state of the newly 
attached drive (I'm very paranoid)? I assume that as long as I use mdadm to 
remove and add sdb, it will use sda as the base (or vice versa).

Mike






Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Jeffrey J. Kosowsky
Adam Goryachev wrote at about 00:14:53 +1000 on Thursday, April 28, 2011:
  Discussing the differences between using a 2 member RAID1 where one
  device is periodically removed and rotated offsite to a 3 member RAID1
  where one device is periodically removed and rotated offsite.
  
  On 27/04/11 22:47, Jeffrey J. Kosowsky wrote:
   I still think that losing all 3 (which however unlikely is still
   possible) is way, way, way, worse than potentially losing 1-2 out of 3
   and still having a spare to recover (carefully) from. And my case can
   occur if you lose a disk controller or if there is a transient or if
   you do something stupid and overwrite the disk, etc...
  
  I'm not sure why you think you would lose 3 out of 3? Sure, with the 3
  member RAID1 setup, you need one additional drive compared to your 2
  device RAID1 setup.
  
  ie, 2 RAID1 devices are always installed, another 3 devices are rotated,
  with one offsite, one onsite, and the third in one of (transit, offsite,
  onsite).
Well, this is the first anyone has mentioned of another 3 devices being
rotated. Of course, if you are allowed to posit lots of other offline
backups then by definition you risk less even if all of your current
RAID devices fail. The OP however seemed to imply a *single* 3rd
device, not 3 other devices...

   Thus, if your 'worst case' of the three member RAID1 going bad,
  then you still have the device you just failed out of the RAID1, as well
  as the final offsite RAID1 member. 

Well if you 'failed' one device then you have essentially the case
that I am advocating for! i.e., leave one of the original copies in a
non-writable state so that you don't inadvertently lose everything.
That being said, physically removing the drive, if possible, is even
safer than just failing it in software. 

I think what people don't fully realize is that no matter how good
your software RAID is, a hardware problem can easily result in an
error propagating across all live RAID1 copies... as I discovered,
this is a very real and painful case...


  Thus you need to lose both RAID1
  devices during a resync before you need to rely on the device you just
  removed, or finally the offsite copy.

Which is *precisely* what I was proposing except that in addition to
failing the device, I suggested removing it physically for extra
security (again assuming you don't have 3 other backup devices
rotating around offsite :P)

 
  Alternatively, with a max 2 member RAID1 device, you only need to lose
  one drive during a resync to have the same result.

No one talked about that case...

  BTW interface failure, controller failure, etc are all of equal risk in
  either case, so IMHO, that can be ignored as far as determining the
  cost/benefit of these two particular cases (but of course, should also
  be considered somewhere else).

But if a device is physically removed and failed then it is
independently safe unless something physically happens on your site
that destroys both the removed disk and the active RAID1 disks.



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Les Mikesell
On 4/27/2011 3:18 PM, Jeffrey J. Kosowsky wrote:

 ie, 2 RAID1 devices are always installed, another 3 devices are rotated,
 with one offsite, one onsite, and the third in one of (transit, offsite,
 onsite).
 Well, this is the first anyone mentioned about another 3 devices are
 rotated. Of course, if you are allowed to posit lots of other offline
 backups then be definition you risk less even if all of your current
 RAID devices fail. The OP however seemed to imply a *single* 3rd
 device not 3 other devices...

Rotating some number (>1) of devices in that 3rd slot seems like a 
reasonable assumption. The point is to have an offsite copy.  You can't 
do that if your only extra instance is in the same building, hooked to 
the same computer.

 I think what people don't fully realize is that no matter how good
 your software RAID is, a hardware problem can easily result in an
 error propagating across all live RAID1 copies... as I discovered,
 this is a very real and painful case...

They don't realize it because it shouldn't be related to RAID operation 
except that a read failure on the source also leaves the sync target 
corrupted.   That said, any of the site disasters that your offsite copy 
is intended to protect against can happen any time your copies are all 
in the same location.  So don't do that.  And the same applies for 
anyone trying to do live network copies of the archive to a single 
remote instance.  Most of the ways you could do that will leave your 
copy corrupted if the source dies midway through the operation.

 Thus you need to lose both RAID1
 devices during a resync before you need to rely on the device you just
 removed, or finally the offsite copy.

 Which is *precisely* what I was proposing except that in addition to
 failing the device, I suggested removing it physically for extra
 security (again assuming you don't have 3 other backup devices
 rotating around offsite :P)

No, you really do need additional copies in the rotation so they are 
never all in the same place when the fire/flood/earthquake/lightning (or 
raid-weirdness) happens.

-- 
   Les Mikesell
lesmikes...@gmail.com



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Jeffrey J. Kosowsky
Les Mikesell wrote at about 15:48:29 -0500 on Wednesday, April 27, 2011:
  On 4/27/2011 3:18 PM, Jeffrey J. Kosowsky wrote:
   Which is *precisely* what I was proposing except that in addition to
   failing the device, I suggested removing it physically for extra
   security (again assuming you don't have 3 other backup devices
   rotating around offsite :P)
  
  No, you really do need additional copies in the rotation so they are 
  never all in the same place when the fire/flood/earthquake/lightning (or 
  raid-weirdness) happens.
  

I agree! But that wasn't the case the OP was referring to.



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Les Mikesell
On 4/27/2011 3:54 PM, Jeffrey J. Kosowsky wrote:
 Les Mikesell wrote at about 15:48:29 -0500 on Wednesday, April 27, 2011:
 On 4/27/2011 3:18 PM, Jeffrey J. Kosowsky wrote:
   Which is *precisely* what I was proposing except that in addition to
   failing the device, I suggested removing it physically for extra
   security (again assuming you don't have 3 other backup devices
   rotating around offsite :P)
   
 No, you really do need additional copies in the rotation so they are
 never all in the same place when the fire/flood/earthquake/lightning (or
 raid-weirdness) happens.
   

 I agree! But that wasn't the case the OP was referring to.

I've forgotten the original context, but if it is setting up a new 
system you don't have much to lose in the initial sync - and by the time 
you do, you should have another copy already stored offsite.

-- 
   Les Mikesell
lesmikes...@gmail.com



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Chris Parsons
On 28/04/2011 6:52 AM, Les Mikesell wrote:
  that wasn't the case the OP was referring to.
 I've forgotten the original context, but if it is setting up a new
 system you don't have much to lose in the initial sync - and by the time
 you do, you should have another copy already stored offsite.

In this case, why involve the complexities of RAID at all? Just use
individual disks, each with its own pool, and rotate them. If a disk
fails, you only lose that pool. That avoids the complexity of RAID -
and the danger of RAID corruption. I don't see any point in involving
RAID until you need to span a pool over more than one disk.
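
A minimal sketch of the swapping mechanics for that scheme, assuming
ext3 and a made-up filesystem label, so that whichever disk is inserted
mounts at the same TOPDIR:

    # Give every rotated disk the same label (run once per disk):
    e2label /dev/sdc1 bpcpool
    # /etc/fstab entry -- noauto so boot doesn't stall when no pool
    # disk happens to be inserted:
    #   LABEL=bpcpool  /var/lib/backuppc  ext3  noauto,defaults  0 0
    # Then whichever disk is in the slot mounts identically:
    mount LABEL=bpcpool /var/lib/backuppc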

-- 
Chris Parsons
System / Network Administrator
Petrosys Pty Ltd
Level 4 North, 191 Pulteney Street
Adelaide SA 5000 AUSTRALIA
Ph: +61 8 8227 2799 | Direct: +61 8 8418 1922 | Fax: +61 8 8227 2626
www.petrosys.com.au



Re: [BackupPC-users] RAID and offsite

2011-04-27 Thread Les Mikesell
On 4/27/11 7:10 PM, Chris Parsons wrote:
 On 28/04/2011 6:52 AM, Les Mikesell wrote:
   that wasn't the case the OP was referring to.
 I've forgotten the original context, but if it is setting up a new
 system you don't have much to lose in the initial sync - and by the time
 you do, you should have another copy already stored offsite.


 In this case, why involve the complexities of RAID at all? Just use
 individual disks, each with its own pool, and rotate them. If a disk
 fails, you only lose that pool. That avoids the complexity of RAID - and
 the danger of RAID corruption. I don't see any point in involving RAID
 until you need to span a pool over more than one disk.

Most backups do double duty. One use is complete system/disaster recovery; 
the other is for when you realize the next day that you deleted something you 
need.  BackupPC is particularly good for the latter, more frequent occurrence, 
but if you've just swapped an old disk back in, you won't have access to the 
copy you are most likely to need.  You'll also be copying more than necessary 
with older reference copies, but that is less likely to be a real problem.

-- 
Les Mikesell
 lesmikes...@gmail.com




[BackupPC-users] RAID and offsite

2011-04-26 Thread Michael Conner
I installed BPC a few weeks ago and have been doing testing and setup since 
then and have things working pretty well on several Linux, Windows, and Mac 
clients (ultimately there will be about 15 clients). The server is a Dell 2400 
with a 160GB IDE drive, CentOS 5.6, and BPC 3.1 installed with yum from the 
testing repos. I've added a SATA controller and two 2T drives as a RAID 1 
setup, which is what I'll use for real backups. I can't boot off the SATA 
drives, so I boot from the IDE drive and put topdir on the SATAs.

I've done some searching on offsite backups as I would like to maintain at 
least a recent copy offsite as disaster protection. dd has been too slow for 
these large drives (I would have to leave it going overnight with no backups 
running). I may go with periodic archives using the BPC archive function.
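
(One possible mitigation -- a minimal sketch with placeholder device
names -- is simply a larger block size, since dd defaults to 512-byte
blocks:)

    # dd's 512-byte default block size is painfully slow on 2T drives;
    # a multi-megabyte block size is usually several times faster.
    # Device names are placeholders -- triple-check them before running!
    dd if=/dev/sdb of=/dev/sdc bs=4M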

However, another idea intrigued me that I saw in an earlier posting. Someone 
used a RAID 1 setup but only put in the second disk periodically, then removed 
it for offsite storage. I have three 2T drives, so was considering something 
similar where I would keep a normal 2-disk RAID 1 setup but periodically remove 
one disk and replace it with a prior offsite disk. 

Not being particularly experienced in all this, I was hoping someone on the 
list could offer advice on whether this was a good idea or not, and potential 
pitfalls.

Thanks

Mike




Re: [BackupPC-users] RAID and offsite

2011-04-26 Thread Les Mikesell
On 4/26/2011 11:38 AM, Michael Conner wrote:
 I installed BPC a few weeks ago and have been doing testing and setup since 
 then and have things working pretty well on several Linux, Windows, and Mac 
 clients (ultimately there will be about 15 clients). The server is a Dell 
 2400 with a 160GB IDE drive, CentOS 5.6, and BPC 3.1 installed with yum from 
 the testing repos. I've added a SATA controller and two 2T drives as a RAID 1 
 setup, which is what I'll use for real backups. I can't boot off the SATA 
 drives, so I boot from the IDE drive and put topdir on the SATAs.

 I've done some searching on offsite backups as I would like to maintain at 
 least a recent copy offsite as disaster protection. dd has been too slow for 
 these large drives (I would have to leave it going overnight with no backups 
 running). I may go with periodic archives using the BPC archive function.

 However, another idea intrigued me that I saw in an earlier posting. Someone 
 used a RAID 1 setup but only put in the second disk periodically, then 
 removed it for offsite storage. I have three 2T drives, so was considering 
 something similar where I would keep a normal 2-disk RAID 1 setup but 
 periodically remove one disk and replace it with a prior offsite disk.

 Not being particularly experienced in all this, I was hoping someone on the 
 list could offer advice on whether this was a good idea or not, and potential 
 pitfalls.

It is working for me, but I use a 3-member RAID1 where 2 are always 
connected and the 3rd is rotated out periodically.  This isn't really 
necessary, but when I was first trying it with one internal and one external 
drive, the internal one failed, corrupting the attached external, and it 
was something of a hassle to rebuild from the remaining offsite external.

But note that even though you don't technically have to stop/unmount 
the RAID while doing the sync, realistically it doesn't perform well 
enough to do backups at the same time. I use a cron job to start the 
sync very early in the morning so it will complete before backups would 
start.
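
A minimal sketch of those mechanics with Linux md, assuming a 3-member
array /dev/md0 with /dev/sdd1 as the rotated slot (all names here are
placeholders, not anyone's actual setup):

    # One-time setup: a 3-way mirror, two resident members plus one
    # that gets rotated offsite.
    mdadm --create /dev/md0 --level=1 --raid-devices=3 \
          /dev/sdb1 /dev/sdc1 /dev/sdd1

    # Rotate out: mark the third member failed, remove it from the
    # array, then pull the disk and take it offsite.
    mdadm /dev/md0 --fail /dev/sdd1 --remove /dev/sdd1

    # Rotate in the disk returning from offsite; md resyncs it from
    # the live members.  Kicking this off from cron keeps the resync
    # clear of the backup window, e.g. in /etc/crontab:
    #   0 4 * * *  root  mdadm /dev/md0 --add /dev/sdd1
    mdadm /dev/md0 --add /dev/sdd1

    # Watch the resync progress:
    cat /proc/mdstat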

-- 
   Les Mikesell
 lesmikes...@gmail.com



Re: [BackupPC-users] RAID and offsite

2011-04-26 Thread Jeffrey J. Kosowsky
Les Mikesell wrote at about 12:08:22 -0500 on Tuesday, April 26, 2011:
  On 4/26/2011 11:38 AM, Michael Conner wrote:
   I installed BPC a few weeks ago and have been doing testing and setup 
   since then and have things working pretty well on several Linux, Windows, 
   and Mac clients (ultimately there will be about 15 clients). The server is 
   a Dell 2400 with a 160GB IDE drive, CentOS 5.6, and BPC 3.1 installed with 
   yum from the testing repos. I've added a SATA controller and two 2T drives 
   as a RAID 1 setup, which is what I'll use for real backups. I can't boot 
   off the SATA drives, so I boot from the IDE drive and put topdir on the 
   SATAs.
  
   I've done some searching on offsite backups as I would like to maintain at 
   least a recent copy offsite as disaster protection. dd has been too slow 
   for these large drives (I would have to leave it going overnight with no 
   backups running). I may go with periodic archives using the BPC archive 
   function.
  
   However, another idea intrigued me that I saw in an earlier posting. 
   Someone used a RAID 1 setup but only put in the second disk periodically, 
   then removed it for offsite storage. I have three 2T drives, so was 
   considering something similar where I would keep a normal 2-disk RAID 1 
   setup but periodically remove one disk and replace it with a prior offsite 
   disk.
  
   Not being particularly experienced in all this, I was hoping someone on 
   the list could offer advice on whether this was a good idea or not, and 
   potential pitfalls.
  
  It is working for me, but I use a 3-member RAID1 where 2 are always 
  connected and the 3rd is rotated out periodically.  This isn't really 
  necessary, but when I was first trying it with one internal and one external 
  drive, the internal one failed, corrupting the attached external, and it 
  was something of a hassle to rebuild from the remaining offsite external.

I did it that way, with the 3rd 'backup' drive mounted via USB, and had
a *catastrophic* failure: something went wrong with the 3rd drive that
caused all three RAID1 members to become corrupted. I'm not sure exactly
what happened, but I ended up losing 2 years of backups.

I think a safer alternative would be to do what the OP proposes --
that way you always have one safe copy not part of the RAID in case
something messes up.
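
Whichever way you go, one cheap safeguard -- sketched here assuming the
array is /dev/md0 -- is to have md verify the mirror members against
each other before trusting a freshly resynced copy:

    # Ask md to compare the members block-for-block:
    echo check > /sys/block/md0/md/sync_action
    # Watch /proc/mdstat until the check completes, then:
    cat /sys/block/md0/md/mismatch_cnt    # 0 means the copies agree
    mdadm --detail /dev/md0               # members should show 'active sync'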
