Duncan wrote:

I'd blame that on your choice of RAID (and ultimately on the defective hardware, but it wouldn't have been as bad on RAID-1 or RAID-6), more than on what was running on top of it.

Agree - RAID-6 would have helped in this particular circumstance (assuming I didn't lose more than one drive). The non-server hardware was still a big issue, though. I'm not sure I'd ever go with RAID-6 for personal use - that's a lot of money tied up in drives that add no usable capacity.
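
To put rough numbers on it (drive counts and sizes here are just illustrative):

    def usable_tb(drives, size_tb, parity_drives):
        # Usable capacity once the parity overhead is taken out.
        return (drives - parity_drives) * size_tb

    for n in (4, 6, 8):
        print(f"{n} x 1TB:",
              f"RAID-5 {usable_tb(n, 1, 1):.0f}TB,",
              f"RAID-6 {usable_tb(n, 1, 2):.0f}TB usable")

At four drives, RAID-6 gives up a third of the usable space RAID-5 would give me, and the gap only narrows at array sizes bigger than I'd run at home.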

What I'd guess happened is that the dirty/degraded crash happened while the set of stripes that also held the LVM2 record was being written, although it wasn't necessarily the LVM data itself being written - just something that happened to be in the same stripe set, so the parity covering it had to be rewritten as well. It's also possible the hardware error you mentioned was affecting the reliability of what the spindle returned even when it didn't cause resets. In that case, even if the data was on a different stripe, the resulting parity written could end up invalid, thus playing havoc with a recovery.

Sounds likely. I think the LVM2 metadata got corrupted. I'm a big fan of zfs and btrfs (once they're production-ready) precisely because they try to address the RAID stripe problem with copy-on-write right down to the physical level.
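
To make the stripe problem concrete, here's a toy model of the RAID-5 write hole - everything simulated in memory, so treat it as a sketch, not as what md literally does:

    from functools import reduce

    def parity(chunks):
        # RAID-5 parity is the bytewise XOR of a stripe's data chunks.
        return bytes(reduce(lambda a, b: a ^ b, block) for block in zip(*chunks))

    # A healthy 3-data-disk stripe with correct parity.
    stripe = [b"AAAA", b"BBBB", b"CCCC"]
    par = parity(stripe)

    # An in-place update of chunk 0 lands on disk...
    stripe[0] = b"XXXX"
    # ...but the box dies before the matching parity write:
    # par = parity(stripe)   # <- never happens (dirty shutdown)

    # Later, disk 1 fails and gets "recovered" from the stale parity:
    recovered = parity([stripe[0], stripe[2], par])
    print(recovered == b"BBBB")   # False - silent garbage, not the old data

A copy-on-write filesystem sidesteps this by never updating a live stripe in place: new data and new checksums go to fresh locations, and only then does a single atomic pointer update make them current.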

data=ordered is the middle ground and I believe what ext3 has always defaulted to, and what reiserfs has defaulted to for years.

Yup - using data=ordered. From a metadata-integrity standpoint I believe this has been shown to be equivalent to data=journal. As you point out, once LVM was hosed that didn't help much.
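
Roughly, the difference between the two modes looks like this (a conceptual sketch in Python, not ext3's actual internals):

    class FS:
        def __init__(self):
            self.journal = []   # replayed after a crash once committed
            self.disk = {}      # final on-disk locations

        def write_ordered(self, block, data, metadata):
            # data=ordered: file data reaches its final location *before*
            # the metadata transaction commits, so committed metadata never
            # points at unwritten data blocks.
            self.disk[block] = data
            self.journal.append(metadata)

        def write_journaled(self, block, data, metadata):
            # data=journal: data goes through the journal too (i.e. it is
            # written twice). The metadata guarantee is the same; the extra
            # cost buys crash-atomicity for the data contents themselves.
            self.journal.append((block, data, metadata))
            self.disk[block] = data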

Lucky, or more appropriately, wise you! There aren't many folks who back up to a normally-offline external device that regularly. Honestly, I don't.

Yeah - I've learned that lesson over time the hard way. I can't back up everything (at least not without a big investment), but I do use dar and par2 to back up everything important. I just create a dar backup weekly, and then run a script on a laptop to copy the data offline. I don't back up anything that requires snapshots (I use a cron job to do a mysql export separately and back that up), so that works fine for me. This is really just my high-value data - when my system was hosed I had to reinstall from stage3, but I had all my /etc config files, so getting up and running didn't take a huge amount of effort.

However, I did learn the hard way that some programs store their actual config files in /var and symlink them into /etc - be sure to catch those in your backups! My samba domain controller's SID ended up changing, which was a headache since my usernames lost their old permissions on all my XP workstations. Granted, this is a house with all of four users, which helped with the cleanup.
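
For what it's worth, the weekly job is conceptually something like the following - the paths, archive layout, and 10% redundancy figure are placeholders here, not my actual script:

    #!/usr/bin/env python3
    import datetime
    import subprocess

    stamp = datetime.date.today().isoformat()
    basename = f"/backup/weekly-{stamp}"    # dar appends .1.dar etc.

    # Dump MySQL separately, since dar can't snapshot a live database.
    with open(f"{basename}.sql", "w") as dump:
        subprocess.run(["mysqldump", "--all-databases"],
                       stdout=dump, check=True)

    # Archive the high-value trees. Note var/lib/samba: some packages keep
    # their real config under /var and only symlink it into /etc.
    subprocess.run(["dar", "-c", basename, "-R", "/",
                    "-g", "home", "-g", "etc", "-g", "var/lib/samba"],
                   check=True)

    # par2 recovery data, so a few corrupted blocks on the offline copy
    # don't cost the whole archive.
    subprocess.run(["par2", "create", "-r10", f"{basename}.1.dar"],
                   check=True)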

So... I guess that's something else I can add to my list now, for the next time I set up a new disk set or whatever. To the everything-portage-touches-on-root that I explained in the other replies, and the RAID-6 that I had already chosen over RAID-5, I can now add killing the LVM2 used in my current setup.


If you have RAID-6 I'm not sure it's worth worrying about getting rid of LVM2. At least, assuming you don't start having multiple drive failures (a real possibility with desktop hardware, with all the drives sharing the same power cords, interfaces, etc.).

If you want to think really long term, take a look at btrfs. It looks like it aims to be everything that zfs is, minus zfs's GPL-incompatible license. It's definitely not ready for prime time, but the proposed feature set looks better than zfs's. I don't like the inability to reshape zfs - you can add more arrays to your system, but you can't add one drive to an existing array (online or offline). Btrfs aims to be able to do this. Again, it is completely experimental at this point - don't use it except to try it out.

It will be possible to migrate ext3/4 directly in-place to btrfs, and even reverse the migration (minus any changes made since - it essentially snapshots the existing data). The only limitation is that if you delete files you won't get the space back until you give up the ability to migrate back to ext3, since the snapshot still holds the old data.
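
That space issue is easy to picture as reference counting (an illustrative model, not btrfs's actual extent trees):

    refcount = {}

    def reference(blocks):
        for b in blocks:
            refcount[b] = refcount.get(b, 0) + 1

    ext3_data = ["blk1", "blk2", "blk3"]  # data as laid out by ext3
    reference(ext3_data)                  # the preserved ext3 snapshot
    reference(ext3_data)                  # the live btrfs view of the same blocks

    # Deleting a file from the live filesystem drops only its own reference:
    refcount["blk2"] -= 1
    print(refcount["blk2"])   # 1 - the snapshot still pins the block on disk

Only dropping the rollback snapshot releases the last references and lets the allocator reuse the space.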
