RE: FAQ
Here's one more update of the FAQ. Assuming not too many objections, I'll send it to Jacob, and see if I can contact the list owner and get a footer onto this list.

Greg

Linux-RAID FAQ
Gregory Leblanc
gleblanc (at) cu-portland.edu

Revision History
Revision v0.03  7 August 2000  Revised by: gml
    Added a request to use a wget-type program to fetch the patch. Tried to make things look a little bit better, failed miserably.
Revision v0.02  4 August 2000  Revised by: gml
    Revised the "How do I patch?" and the "What does /proc/mdstat look like?" questions.

This is a FAQ for the Linux-RAID mailing list, hosted on vger.rutgers.edu. It's intended as a supplement to the existing Linux-RAID HOWTO, to cover questions that keep occurring on the mailing list. PLEASE read this document before you post to the list.
_________________________________________________________________

1. General
   1.1. Where can I find archives for the linux-raid mailing list?
2. Kernel
   2.1. I'm running [insert your linux distribution here]. Do I need to patch my kernel to make RAID work?
   2.2. How can I tell if I need to patch my kernel?
   2.3. Where can I get the latest RAID patches for my kernel?
   2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?

1. General

1.1. Where can I find archives for the linux-raid mailing list?

My favorite archives are at Geocrawler: http://www.geocrawler.com/lists/3/Linux/57/0/
Other archives are available at http://marc.theaimsgroup.com/?l=linux-raid&r=1&w=2
Another archive site is http://www.mail-archive.com/linux-raid@vger.rutgers.edu/

2. Kernel

2.1. I'm running [insert your linux distribution here]. Do I need to patch my kernel to make RAID work?

Well, the short answer is, it depends. Distributions that are keeping up to date have the RAID patches included in their kernels. The kernel that RedHat distributes does, as do some others. If you download a 2.2.x kernel from ftp.kernel.org, then you will need to patch your kernel.

2.2. How can I tell if I need to patch my kernel?

The easiest way is to check what's in /proc/mdstat. Here's a sample from a 2.2.x kernel, with the RAID patches applied.

    [gleblanc@grego1 gleblanc]$ cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid5] [translucent]
    read_ahead not set
    unused devices: <none>

If the contents of /proc/mdstat look like the above, then you don't need to patch your kernel.

Here's a sample from a 2.2.x kernel, without the RAID patches applied.

    [root@finch root]$ cat /proc/mdstat
    Personalities : [1 linear] [2 raid0] [3 raid1] [4 raid5]
    read_ahead not set
    md0 : inactive
    md1 : inactive
    md2 : inactive
    md3 : inactive

If your /proc/mdstat looks like this one, then you need to patch your kernel.

2.3. Where can I get the latest RAID patches for my kernel?

The patches for the 2.2.x kernels up to, and including, 2.2.13 are available from ftp.kernel.org. Use the kernel patch that most closely matches your kernel revision. For example, the 2.2.11 patch can also be used on 2.2.12 and 2.2.13.

The patches for 2.2.14 and later kernels are at http://people.redhat.com/mingo/raid-patches/. Use the patch that matches your kernel exactly; these patches haven't worked on other kernel revisions yet. Please use something like wget/curl/lftp to retrieve this patch, as it's easier on the server than using a client like Netscape. Downloading patches with Lynx has been unsuccessful for me; wget may be the easiest way.

2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?

First, unpack the kernel into some directory; generally people use /usr/src/linux. Change to this directory, and type

    patch -p1 < /path/to/raid-version.patch

On my RedHat 6.2 system, I decompressed the 2.2.16 kernel into /usr/src/linux-2.2.16. From /usr/src/linux-2.2.16, I type in

    patch -p1 < /home/gleblanc/raid-2.2.16-A0

Then I rebuild the kernel using make menuconfig and the related build steps.
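The check in FAQ 2.2 can be sketched as a small shell function. This is only an illustration built from the two samples quoted in the FAQ: it assumes an unpatched 2.2.x kernel always numbers its personalities ("[1 linear]") and a patched one never does. The name mdstat_status is made up for this sketch, and the "no-md" case (no /proc/mdstat at all) is an assumption added here, not something the FAQ covers.

```shell
#!/bin/sh
# Sketch only: classify an mdstat-style file by the format of its
# "Personalities" line, following the two samples in FAQ 2.2.
# mdstat_status is a hypothetical helper name, not a standard tool.
mdstat_status() {
    # $1: file to inspect; defaults to the live /proc/mdstat
    file=${1:-/proc/mdstat}
    if [ ! -r "$file" ]; then
        # Assumption: no readable mdstat means no md driver in this kernel.
        echo "no-md"
    elif grep -q '^Personalities : .*\[[0-9][0-9]* ' "$file"; then
        # Numbered entries such as "[1 linear]": the unpatched sample.
        echo "unpatched"
    else
        # Plain entries such as "[linear]": the patched sample.
        echo "patched"
    fi
}
```

Running mdstat_status with no argument reads /proc/mdstat; passing a file name lets you try it on a saved copy from another machine.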
RE: FAQ update
> -----Original Message-----
> From: James Manning [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, August 05, 2000 6:08 AM
> To: Linux Raid list (E-mail)
> Subject: Re: FAQ update
>
> [Luca Berra]
> > > The patches for 2.2.14 and later kernels are at http://people.redhat.com/mingo/raid-patches/. Use the right patch for your kernel, these patches haven't worked on other kernel revisions yet.
> >
> > i'd add: dont use netscape to fetch patches from mingo's site, it hurts
> > use lynx/wget/curl/lftp
>
> Yes, *please* *please* *please*

I need some clarification on this. I couldn't make lynx work, it chopped off long lines or something. wget works, I've never heard of the other two. Why exactly is NetScrape bad? That server load thing sounds fishy to me...

Greg
Re: FAQ update
On Mon, Aug 07, 2000 at 08:47:47AM -0700, Gregory Leblanc wrote:
> > -----Original Message-----
> > From: James Manning [mailto:[EMAIL PROTECTED]]
> > Sent: Saturday, August 05, 2000 6:08 AM
> > To: Linux Raid list (E-mail)
> > Subject: Re: FAQ update
> >
> > [Luca Berra]
> > > > The patches for 2.2.14 and later kernels are at http://people.redhat.com/mingo/raid-patches/. Use the right patch for your kernel, these patches haven't worked on other kernel revisions yet.
> > >
> > > i'd add: dont use netscape to fetch patches from mingo's site, it hurts
> > > use lynx/wget/curl/lftp
> >
> > Yes, *please* *please* *please*
>
> I need some clarification on this. I couldn't make lynx work, it chopped off long lines or something. wget works, I've never heard of the other two. Why exactly is NetScrape bad? That server load thing sounds fishy to me...
> Greg

ok, i'll clarify

NutScrape may not work for the same reason lynx failed for you: the redhat server says the file is text/plain, so both netscape and lynx fail if you view the file and then save it to a local file. If you Shift-click on netscape or press 'd' on lynx it should work.

i don't give a damn about the load on redhat http server, but i don't like receiving tons of mails saying that the patch from mingo site fails for them.

L.

P.S. someone could suggest mingo gzips the blasted patches :

--
Luca Berra -- [EMAIL PROTECTED]
Communication Media Services S.r.l.
Re: FAQ update
On Fri, Aug 04, 2000 at 01:47:23PM -0700, Gregory Leblanc wrote:
> Here's a new version, with a couple of changes. What other questions get asked all the time?
> Greg
>
> The patches for 2.2.14 and later kernels are at http://people.redhat.com/mingo/raid-patches/. Use the right patch for your kernel, these patches haven't worked on other kernel revisions yet.

i'd add: dont use netscape to fetch patches from mingo's site, it hurts
use lynx/wget/curl/lftp

--
Luca Berra -- [EMAIL PROTECTED]
Communication Media Services S.r.l.
Re: FAQ update
Luca Berra wrote:
> i'd add: dont use netscape to fetch patches from mingo's site, it hurts
> use lynx/wget/curl/lftp

Works fine for me.

--
Edward Schernau, mailto:[EMAIL PROTECTED]
Network Architect http://www.schernau.com
RC5-64#: 243249 e-gold acct #:131897
Re: FAQ update
[Luca Berra]
> > The patches for 2.2.14 and later kernels are at http://people.redhat.com/mingo/raid-patches/. Use the right patch for your kernel, these patches haven't worked on other kernel revisions yet.
>
> i'd add: dont use netscape to fetch patches from mingo's site, it hurts
> use lynx/wget/curl/lftp

Yes, *please* *please* *please*

--
James Manning [EMAIL PROTECTED]
GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4
Re: FAQ update
Yo Luca!

On Sat, 5 Aug 2000, Edward Schernau wrote:
> > i'd add: dont use netscape to fetch patches from mingo's site, it hurts
> > use lynx/wget/curl/lftp
>
> Works fine for me.

We are not worried about you. We are worried about mingo's FTP server. If you access an FTP server with Netscape it puts a much greater load on the server than if you access it with a plain old ftp client.

RGDS
GARY
---
Gary E. Miller Rellim 20340 Empire Ave, Suite E-3, Bend, OR 97701
[EMAIL PROTECTED] Tel:+1(541)382-8588 Fax: +1(541)382-8676
Re: FAQ
On Fri, Aug 04, 2000 at 09:48:18AM +0530, Abhishek Khaitan wrote:
> Can't we use bunzip2 instead of playing with tar? And after bunzip2, try tar -x kernel-2.2.16.tar ?

The usual suggestion is:

    bzip2 -dc filename.tar.bz2 | tar -xf -

s/bzip2/gzip/ or s/bzip2/uncompress/ as necessary

--
Randomly Generated Tagline:
If you remove stricture from a large Perl program currently, you're just installing delayed bugs, whereas with this feature, you're installing an instant bug that's easily fixed. Whoopee.
    -- Larry Wall in [EMAIL PROTECTED]
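The "substitute the decompressor as necessary" advice can be sketched as a pair of shell functions. This is a hedged illustration, not part of the FAQ: the names decompressor_for and unpack_tarball are invented here, and the choice by file suffix is an assumption about how the tarballs are named.

```shell
#!/bin/sh
# Sketch: pick the decompressor from the tarball's suffix and stream it
# into a plain "tar -xf -", so no tar-specific compression switch is needed.
# decompressor_for and unpack_tarball are hypothetical helper names.
decompressor_for() {
    case "$1" in
        *.tar.bz2)      echo "bzip2 -dc" ;;
        *.tar.gz|*.tgz) echo "gzip -dc" ;;
        *.tar.Z)        echo "uncompress -c" ;;
        *.tar)          echo "cat" ;;       # already uncompressed
        *)              return 1 ;;         # not a tarball we recognise
    esac
}

unpack_tarball() {
    # $1: tarball path; extracts into the current directory
    cmd=$(decompressor_for "$1") || {
        echo "unknown archive type: $1" >&2
        return 1
    }
    # $cmd is left unquoted on purpose so "bzip2 -dc" splits into two words.
    $cmd "$1" | tar -xf -
}
```

For example, unpack_tarball linux-2.2.16.tar.bz2 runs the same pipeline as the one-liner above, and the same call works unchanged for a .tar.gz download.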
Re: FAQ - a suggestion
How about just putting something in like: "Uncompressing the patch is beyond the scope of this document." -- Edward Schernau,mailto:[EMAIL PROTECTED] Network Architect http://www.schernau.com RC5-64#: 243249 e-gold acct #:131897
Re: FAQ
On 08/04/2000 09:54 -0400, Theo Van Dinter wrote:
> The usual suggestion is:
>
>     bzip2 -dc filename.tar.bz2 | tar -xf -

or use bzcat, which is exactly the same as bzip2 -dc...

--
+----------------------+------------------------------+
| Tim Walberg          | [EMAIL PROTECTED]            |
| 828 Marshall Ct.     | www.concentric.net/~twalberg |
| Palatine, IL 60074   |                              |
+----------------------+------------------------------+
Re: FAQ
Tim Walberg wrote:
> On 08/04/2000 09:54 -0400, Theo Van Dinter wrote:
> > The usual suggestion is:
> >
> >     bzip2 -dc filename.tar.bz2 | tar -xf -
>
> or use bzcat, which is exactly the same as bzip2 -dc...

most versions of tar now support either -I or -y for bzip2 (un)compression

--
Mathieu Arnold
Re: FAQ
Gregory Leblanc wrote:
> <snip>
> 2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?
>
> Put the downloaded kernel in /usr/src. Change to this directory, and move any directory called linux to something else. Then, type tar -Ixvf kernel-2.2.16.tar.bz2, replacing kernel-2.2.16.tar.bz2 with your kernel. Then cd to /usr/src/linux, and run patch -p1 < raid-2.2.16-A0. Then compile the kernel as usual.

My tar cannot read bz2-compressed archives unless used with --use-compress-program=bzip2, so that line should probably read "bzcat kernel-2.2.16.tar.bz2 | tar xf -". Also, the only tar I saw that knows bzip2 is slackware's, and it uses the '-y' switch for that. I never saw the '-I' switch for tar and my 'info tar' does not list it.

Bottom line: Your tar is too customized to be in a FAQ.

Marc

--
Marc Mutz [EMAIL PROTECTED] http://marc.mutz.com/Encryption-HOWTO/
University of Bielefeld, Dep. of Mathematics / Dep. of Physics
PGP-keyID's: 0xd46ce9ab (RSA), 0x7ae55b9e (DSS/DH)
Re: FAQ
Marc Mutz wrote:
> My tar cannot use bz2-compressed unless used with --use-compress-program=bzip2. so that line should probably read "bzcat kernel-2.2.16.tar.bz2 | tar xf -". Also the only tar I saw that knows bzip2 is slackware's and it uses the '-y' switch for that. I never saw the '-I' switch for tar and my 'info tar' does not list it.
>
> Bottomline: Your tar is too customized to be in a FAQ.

How about both options? The tar that comes with RH6.2 does this just fine...

Ed

--
Edward Schernau, mailto:[EMAIL PROTECTED]
Network Architect http://www.schernau.com
RC5-64#: 243249 e-gold acct #:131897
Re: FAQ
Hello Marc,

On Thu, 3 Aug 2000, Marc Mutz wrote:
> Gregory Leblanc wrote:
> > <snip>
> > 2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?
> >
> > Put the downloaded kernel in /usr/src. Change to this directory, and move any directory called linux to something else. Then, type tar -Ixvf kernel-2.2.16.tar.bz2, replacing kernel-2.2.16.tar.bz2 with your kernel. Then cd to /usr/src/linux, and run patch -p1 < raid-2.2.16-A0. Then compile the kernel as usual.
>
> My tar cannot use bz2-compressed unless used with
...snip...
> Your tar is too customized to be in a FAQ.

Unless you want to provide a URL to the modified sources? OR just go to ftp.gnu.org, grab the original, and stick to just its available options. Just my unneeded opinion.

JimL

+---------------------+--------------------+---------------+
| James W. Laferriere | System Techniques  | Give me VMS   |
| NetworkEngineer     | 25416 22nd So      | Give me Linux |
| [EMAIL PROTECTED]   | DesMoines WA 98198 | only on AXP   |
+---------------------+--------------------+---------------+
Re: FAQ
[Marc Mutz]
> > 2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?
> >
> > Put the downloaded kernel in /usr/src. Change to this directory, and move any directory called linux to something else. Then, type tar -Ixvf kernel-2.2.16.tar.bz2, replacing kernel-2.2.16.tar.bz2 with your kernel. Then cd to /usr/src/linux, and run patch -p1 < raid-2.2.16-A0. Then compile the kernel as usual.
>
> Your tar is too customized to be in a FAQ.

there is no bzip2 standard in gnu tar, so let's be intelligent and avoid the issue by going with the .gz tarball as a recommendation. -z is standard.

Also, none of the tarballs will start with "kernel-" but "linux-" anyway, so that needs fixing.

Also, I'd add "/path/to/" before the raid in the patch command, since otherwise we'd need to tell them to move the patch over to that directory (pedantic, yes, but still)

oh, and "move any directory called linux to something else" seems to miss the possibility of a symlink, where renaming the symlink would be kind of pointless. Whether tar would just kill the symlink at extract time anyway is worth a check.

--
James Manning [EMAIL PROTECTED]
GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4
Re: FAQ
On Thu, Aug 03, 2000 at 01:34:33PM -0400, James Manning wrote:
> there is no bzip2 standard in gnu tar, so let's be intelligent and avoid the issue by going with the .gz tarball as a recommendation. -z is standard.

from the info page from gnu tar 1.13.17:

    `--bzip2'
    `-I'
         This option tells `tar' to read or write archives through `bzip2'.

--
Luca Berra -- [EMAIL PROTECTED]
Communication Media Services S.r.l.
Re: FAQ
[Luca Berra]
> from the info page from gnu tar 1.13.17:
>
>     `--bzip2'
>     `-I'
>          This option tells `tar' to read or write archives through `bzip2'.

As mentioned previously, this is a distro-specific hack. I have it in my tar as well, but trusting it to be part of core GNU tar just because it works on your system is silly.

version 1.13 is the latest at ftp://ftp.gnu.org/pub/gnu/tar/ and specifically mentions the bzip2 situation in its NEWS file:

+++
* An interim GNU tar alpha had new --bzip2 and --ending-file options, but they have been removed to maintain compatibility with paxutils. Please try --use=bzip2 instead of --bzip2.
+++

Checking the ChangeLog shows bzip2 support added 1999-02-01 (in the form of -y, --bzip2, and --bunzip2) and then removed 1999-06-16.

In any case, it certainly is true that we can trust -z to be around on any standard Linux install, and as such it is the correct answer to this thread.

--
James Manning [EMAIL PROTECTED]
GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4
RE: FAQ
> -----Original Message-----
> From: James Manning [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, August 03, 2000 10:35 AM
> To: [EMAIL PROTECTED]
> Subject: Re: FAQ
>
> [Marc Mutz]
> > > 2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?
> > >
> > > Put the downloaded kernel in /usr/src. Change to this directory, and move any directory called linux to something else. Then, type tar -Ixvf kernel-2.2.16.tar.bz2, replacing kernel-2.2.16.tar.bz2 with your kernel. Then cd to /usr/src/linux, and run patch -p1 < raid-2.2.16-A0. Then compile the kernel as usual.
> >
> > Your tar is too customized to be in a FAQ.
>
> there is no bzip2 standard in gnu tar, so let's be intelligent and avoid the issue by going with the .gz tarball as a recommendation. -z is standard.

It's going to be changed to the POSIX tar and GNU gzip invoked separately, because everybody felt the need to bitch, and because people aren't smart enough to not send me two copies of the message. :-)

> Also, none of the tarballs will start with "kernel-" but "linux-" anyway, so that needs fixing.
>
> Also, I'd add "/path/to/" before the raid in the patch command, since otherwise we'd need to tell them to move the patch over to that directory (pedantic, yes, but still)

ok, cool, I'll fix those.

> oh, and "move any directory called linux to something else" seems to miss the possibility of a symlink, where renaming the symlink would be kind of pointless. Whether tar would just kill the symlink at extract time anyway is worth a check.

Tar likes to clobber things when I give it half a chance. I'll mention the symlink a bit more, although perhaps I should just tell people that they're expected to be familiar with downloading, unpacking, and building kernels before they read this document.

Greg
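One possible way to handle the symlink case raised above, sketched as a shell function. This is an illustration of the concern only, not wording from the FAQ: set_aside_linux is a made-up name, and the policy shown (drop a symlink, rename a real directory to linux.old.PID) is an assumption about what "move it to something else" should mean.

```shell
#!/bin/sh
# Sketch: get an existing "linux" entry out of the way before unpacking
# a kernel tarball into the same directory.
# set_aside_linux is a hypothetical helper name.
set_aside_linux() {
    # $1: directory the tarball will be unpacked in (for example /usr/src)
    dir=${1:-.}
    if [ -L "$dir/linux" ]; then
        # A symlink: remove only the link; the tree it points to is untouched.
        rm "$dir/linux"
    elif [ -d "$dir/linux" ]; then
        # A real directory: rename it so tar cannot write into the old tree.
        mv "$dir/linux" "$dir/linux.old.$$"
    fi
}
```

The -L test has to come first, because -d follows symlinks and would otherwise treat a link to a directory as the directory itself.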
RE: FAQ
Can't we use bunzip2 instead of playing with tar? And after bunzip2, try tar -x kernel-2.2.16.tar ?

> -----Original Message-----
> From: James Manning [SMTP:[EMAIL PROTECTED]]
> Sent: Thursday, August 03, 2000 10:35 AM
> To: [EMAIL PROTECTED]
> Subject: Re: FAQ
>
> [Marc Mutz]
> > > 2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?
> > >
> > > Put the downloaded kernel in /usr/src. Change to this directory, and move any directory called linux to something else. Then, type tar -Ixvf kernel-2.2.16.tar.bz2, replacing kernel-2.2.16.tar.bz2 with your kernel. Then cd to /usr/src/linux, and run patch -p1 < raid-2.2.16-A0. Then compile the kernel as usual.
> >
> > Your tar is too customized to be in a FAQ.
>
> there is no bzip2 standard in gnu tar, so let's be intelligent and avoid the issue by going with the .gz tarball as a recommendation. -z is standard.
>
> Also, none of the tarballs will start with "kernel-" but "linux-" anyway, so that needs fixing.
>
> Also, I'd add "/path/to/" before the raid in the patch command, since otherwise we'd need to tell them to move the patch over to that directory (pedantic, yes, but still)
>
> oh, and "move any directory called linux to something else" seems to miss the possibility of a symlink, where renaming the symlink would be kind of pointless. Whether tar would just kill the symlink at extract time anyway is worth a check.
>
> --
> James Manning [EMAIL PROTECTED]
> GPG Key fingerprint = B913 2FBD 14A9 CE18 B2B7 9C8E A0BF B026 EEBB F6E4
Re: FAQ
Can we get the list administrator to add a footer to each message that has the URL of one of the archives? It will cut down on the questions like "...where is the FAQ?"

-ilia

Gregory Leblanc wrote:
> Here's a quickie FAQ, it's very incomplete, but I wanted to get some feedback on what I've got right now. Thanks,
> Greg
>
> Linux-RAID FAQ
> Gregory Leblanc
> gleblanc (at) cu-portland.edu
>
> Revision History
> Revision v0.01  31 July 2000  Revised by: gml
>     Initial draft of this FAQ.
>
> This is a FAQ for the Linux-RAID mailing list, hosted on vger.rutgers.edu. It's intended as a supplement to the existing Linux-RAID HOWTO, to cover questions that keep occurring on the mailing list. PLEASE read this document before your post to the list.
> _________________________________________________________________
>
> 1. General
>    1.1. Where can I find archives for the linux-raid mailing list?
> 2. Kernel
>    2.1. I'm running the DooDad Linux Distribution. Do I need to patch my kernel to make RAID work?
>    2.2. How can I tell if I need to patch my kernel?
>    2.3. Where can I get the latest RAID patches for my kernel?
>    2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?
>
> 1. General
>
> 1.1. Where can I find archives for the linux-raid mailing list?
>
> My favorite archives are at Geocrawler.
> Other archives are available at http://marc.theaimsgroup.com/?l=linux-raid&r=1&w=2
> Another archive site is http://www.mail-archive.com/linux-raid@vger.rutgers.edu/.
>
> 2. Kernel
>
> 2.1. I'm running the DooDad Linux Distribution. Do I need to patch my kernel to make RAID work?
>
> Well, the short answer is, it depends. Distributions that are keeping up to date have the RAID patches included in their kernels. The kernel that RedHat distributes, as do some others. If you download a 2.2.x kernel from ftp.kernel.org, then you will need to patch your kernel.
>
> 2.2. How can I tell if I need to patch my kernel?
>
> The easiest way is to check what's in /proc/mdstat. Here's a sample from a 2.2.x kernel, with the RAID patches applied.
>
>     [gleblanc@grego1 gleblanc]$ cat /proc/mdstat
>     Personalities : [linear] [raid0] [raid1] [raid5] [translucent]
>     read_ahead not set
>     unused devices: <none>
>     [gleblanc@grego1 gleblanc]$
>
> If the contents of /proc/mdstat looks like the above, then you don't need to patch your kernel.
>
> I'll get a copy of something from an UN-patched 2.2.x kernel and put it here shortly. If your /proc/mdstat looks like this one, then you need to patch your kernel.
>
> 2.3. Where can I get the latest RAID patches for my kernel?
>
> The patches for the 2.2.x kernels up to, and including, 2.2.13 are available from ftp.kernel.org. Use the kernel patch that most closely matches your kernel revision. For example, the 2.2.11 patch can also be used on 2.2.12 and 2.2.13.
>
> The patches for 2.2.14 and later kernels are at http://people.redhat.com/mingo/raid-patches/. Use the right patch for your kernel, these patches haven't worked on other kernel revisions yet.
>
> 2.4. How do I apply the patch to a kernel that I just downloaded from ftp.kernel.org?
>
> Put the downloaded kernel in /usr/src. Change to this directory, and move any directory called linux to something else. Then, type tar -Ixvf kernel-2.2.16.tar.bz2, replacing kernel-2.2.16.tar.bz2 with your kernel. Then cd to /usr/src/linux, and run patch -p1 < raid-2.2.16-A0. Then compile the kernel as usual.

--
Ilia Baldine                        | [EMAIL PROTECTED]
Network Research Engineer,          | ph#:(919)248-1847
Advanced Networking Research, MCNC  | FAX:(919)248-1455
"I used to think the brain was the most important part of the body, but then I realized who was telling me that." -Emo Philips
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?
Hi,

Chris Wedgwood writes:
> > This may affect data which was not being written at the time of the crash. Only raid 5 is affected.
>
> Long term -- if you journal to something outside the RAID5 array (ie. to raid-1 protected log disks) then you should be safe against this type of failure?

Indeed. The jfs journaling layer in ext3 is a completely generic block device journaling layer which could be used for such a purpose (and raid/LVM journaling is one of the reasons it was designed this way).

--Stephen
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?
Hi,

Benno Senoner writes:
> wow, really good idea to journal to a RAID1 array !
>
> do you think it is possible to do the following:
>
> - N disks holding a soft RAID5 array.
> - reserve a small partition on at least 2 disks of the array to hold a RAID1 array.
> - keep the journal on this partition.

Yes. My jfs code will eventually support this. The main thing it is missing right now is the ability to journal multiple devices to a single journal: the on-disk structure is already designed with that in mind but the code does not yet support it.

--Stephen
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?
Chris Wedgwood wrote:
> > In the power+disk failure case, there is a very narrow window in which parity may be incorrect, so loss of the disk may result in inability to correctly restore the lost data.
>
> For some people, this very narrow window may still be a problem. Especially when you consider the case of a disk failing because of a power surge -- which also kills a drive.
>
> > This may affect data which was not being written at the time of the crash. Only raid 5 is affected.
>
> Long term -- if you journal to something outside the RAID5 array (ie. to raid-1 protected log disks) then you should be safe against this type of failure?
>
> -cw

wow, really good idea to journal to a RAID1 array !

do you think it is possible to do the following:

- N disks holding a soft RAID5 array.
- reserve a small partition on at least 2 disks of the array to hold a RAID1 array.
- keep the journal on this partition.

do you think that this will be possible ? is ext3 / reiserfs capable of keeping the journal on a different partition than the one holding the FS ?

That would really be great !

Benno.
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?
Ingo,

I can fairly regularly generate corruption (data or ext2 filesystem) on a busy RAID-5 by adding a spare drive to a degraded array and letting it build the parity. Could the problem be from the bad (illegal) buffer interactions you mentioned, or are there other areas that need fixing as well? I have been looking into this issue for a long time with no resolution. Since you may be aware of possible problem areas: any ideas, code or encouragement is greatly welcome.

Lance.

Ingo Molnar wrote:
> On Wed, 12 Jan 2000, Gadi Oxman wrote:
> > As far as I know, we took care not to poke into the buffer cache to find clean buffers -- in raid5.c, the only code which does a find_buffer() is:
>
> yep, this is still the case. (Sorry Stephen, my bad.) We will have these problems once we try to eliminate the current copying overhead. Nevertheless there are bad (illegal) interactions between the RAID code and the buffer cache, i'm cleaning up this for 2.3 right now. Especially the reconstruction code is a rathole. Unfortunately blocking reconstruction if b_count == 0 is not acceptable because several filesystems (such as ext2fs) keep metadata caches around (eg. the block group descriptors in the ext2fs case) which have b_count == 1 for a longer time.
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power fai
Hi,

On Wed, 12 Jan 2000 11:28:28 MET-1, "Petr Vandrovec" [EMAIL PROTECTED] said:
> I did not follow this thread (on -fsdevel) too close (and I never looked into RAID code, so I should shut up), but... can you confirm that after buffer with data is finally marked dirty, parity is recomputed anyway? So that window is really small and same problems occurs every moment when you wrote data, but did not wrote parity yet?

Yes, that's what I said.

--Stephen
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?
Hi,

On Wed, 12 Jan 2000 22:09:35 +0100, Benno Senoner [EMAIL PROTECTED] said:
> Sorry for my ignorance I got a little confused by this post: Ingo said we are 100% journal-safe, you said the contrary,

Raid resync is safe in the presence of journaling. Journaling is not safe in the presence of raid resync.

> can you or Ingo please explain us in which situation (power-loss) running linux-raid+ journaled FS we risk a corrupted filesystem ?

Please read my previous reply on the subject (the one that started off with "I'm tired of answering the same question a million times so here's a definitive answer"). Basically, there will always be a small risk of data loss if power-down is accompanied by loss of a disk (it's a double-failure); and the current implementation of raid resync means that journaling will be broken by the raid1 or raid5 resync code after a reboot on a journaled filesystem (ext3 is likely to panic, reiserfs will not but will still get its IO ordering requirements messed up by the resync).

> After the reboot if all disk remain intact physically, will we only lose the data that was being written, or is there a possibility to end up in a corrupted filesystem which could more damages in future ?

In the power+disk failure case, there is a very narrow window in which parity may be incorrect, so loss of the disk may result in inability to correctly restore the lost data. This may affect data which was not being written at the time of the crash. Only raid 5 is affected.

--Stephen
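The power+disk double failure described above can be shown with a toy calculation. This is a deliberately tiny model and not the md driver's code: one stripe, three data blocks, one parity block, each block reduced to a single small integer. The variable names are invented for the sketch.

```shell
#!/bin/sh
# Toy model: one RAID-5 stripe with three data blocks and one parity block.
# Parity is the XOR of the data blocks.
d0=5; d1=9; d2=6
parity=$(( d0 ^ d1 ^ d2 ))

# Normal case: the disk holding d1 is lost, and d1 is rebuilt from the
# surviving data blocks plus parity. This gives back the original 9.
rebuilt_ok=$(( d0 ^ d2 ^ parity ))

# Crash case: d0 is rewritten on disk, but power fails before the matching
# parity update reaches the parity disk.
d0=7
# parity still holds the value computed from the old d0

# The disk holding d1 now dies as well. d1 was not being written at the
# time of the crash, yet rebuilding it from stale parity gives a wrong value.
rebuilt_bad=$(( d0 ^ d2 ^ parity ))

echo "d1 rebuilt before crash: $rebuilt_ok"
echo "d1 rebuilt after crash:  $rebuilt_bad"
```

The second result differs from 9 even though d1 itself was never touched, which is the sense in which data "not being written at the time of the crash" can be affected.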
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?
----- Original Message -----
From: "Benno Senoner" [EMAIL PROTECTED]
To: "Stephen C. Tweedie" [EMAIL PROTECTED]
Cc: "Linux RAID" [EMAIL PROTECTED]; [EMAIL PROTECTED]; "Ingo Molnar" [EMAIL PROTECTED]
Sent: Tuesday, January 11, 2000 1:17 PM
Subject: Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?

-- much snippage here

> The problem is that power outages are unpredictable even in presence of UPSes therefore it is important to have some protection against power losses.
>
> regards, Benno.

I run an MGE UPS on my RH6.1 box running RAID 1. They have software for Linux that communicates with the UPS and performs an orderly system shutdown if the box goes on battery and stays on battery for a given (user selectable) length of time. I have tested and verified that this actually works, it's a Good Thing(tm). I did have to cut one pin on the standard RS-232 cable that came with the UPS for use on the Linux box, and download the software and install (scripted, easy...)

bwilling
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
James Manning wrote:
> [ Tuesday, January 11, 2000 ] Benno Senoner wrote:
> > The problem is that power outages are unpredictable even in presence of UPSes therefore it is important to have some protection against power losses.
>
> I gotta ask: dying power supply? cord getting ripped out? Most ppl run serial lines (of course :) and with powerd they get nice shutdowns :)
>
> Just wanna make sure I'm understanding you...
>
> James
> --
> Miscellaneous Engineer --- IBM Netfinity Performance Development

yep, obviously the UPS has a serial line to shut down the machine nicely before a failure, but it happened to me that the serial cable was disconnected and the power outage lasted SEVERAL hours during a weekend, where no one was in the machine room (of an ISP). you know murphy's law ... :-)

But I am mainly interested in the power-failure-protection in the case where you want to set up a workstation with a reliable disk array (soft raid5), and do not always have a UPS handy. you will lose the file that was being written, but the important thing is that the disk array remains in a safe state, just like a single disk + journaled FS.

Stephen Tweedie said that this is possible (by fixing the remaining races in the RAID code). if these problems get fixed sometime, then our fears of a corrupted soft-RAID array in the case of a power failure on a machine without UPS will completely go away.

cheers, Benno.
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power fai
On 11 Jan 00 at 22:24, Stephen C. Tweedie wrote:
> The race I'm concerned about could occur when the raid driver wants to compute parity for a stripe and finds some of the blocks are present, and clean, in the buffer cache. Raid assumes that those buffers represent what is on disk, naturally enough. So, it uses them to calculate parity without rereading all of the disk blocks in the stripe.
>
> The trouble is that the standard practice in the kernel, when modifying a buffer, is to make the change and _then_ mark the buffer dirty. If you hit that window, then the raid driver will find a buffer which doesn't match what is on disk, and will compute parity from that buffer rather than from the on-disk contents.

Hi Stephen,

I did not follow this thread (on -fsdevel) too close (and I never looked into RAID code, so I should shut up), but... can you confirm that after buffer with data is finally marked dirty, parity is recomputed anyway? So that window is really small and same problems occurs every moment when you wrote data, but did not wrote parity yet?

Thanks,
Petr Vandrovec
[EMAIL PROTECTED]
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?
On Wed, 12 Jan 2000, Gadi Oxman wrote:
> As far as I know, we took care not to poke into the buffer cache to find clean buffers -- in raid5.c, the only code which does a find_buffer() is:

yep, this is still the case. (Sorry Stephen, my bad.) We will have these problems once we try to eliminate the current copying overhead. Nevertheless there are bad (illegal) interactions between the RAID code and the buffer cache, i'm cleaning up this for 2.3 right now. Especially the reconstruction code is a rathole. Unfortunately blocking reconstruction if b_count == 0 is not acceptable because several filesystems (such as ext2fs) keep metadata caches around (eg. the block group descriptors in the ext2fs case) which have b_count == 1 for a longer time.

If both power and a disk fails at once then we still might get local corruption for partially written RAID5 stripes. If either power or a disk fails, then the Linux RAID5 code is safe wrt. journalling, because it behaves like an ordinary disk. We are '100% journal-safe' if power fails during resync. We are also 100% journal-safe if power fails during reconstruction of failed disk or in degraded mode.

the 2.3 buffer-cache enhancements i wrote ensure that 'cache snooping' and adding to the buffer-cache can be done safely by 'external' cache managers. I also added means to do atomic IO operations which in fact are several underlying IO operations - without the need of allocating a separate bh. The RAID code uses these facilities now.

Ingo
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
"Stephen C. Tweedie" wrote:

Hi, On Tue, 11 Jan 2000 15:03:03 +0100, mauelsha [EMAIL PROTECTED] said:

THIS IS EXPECTED. RAID-5 isn't proof against multiple failures, and the only way you can get bitten by this failure mode is to have a system failure and a disk failure at the same time.

To try to avoid this kind of problem some brands do have additional logging (to disk which is slow for sure or to NVRAM) in place, which enables them to at least recognize the fault to avoid the reconstruction of invalid data or even enables them to recover the data by using redundant copies of it in NVRAM + logging information what could be written to the disks and what not.

Absolutely: the only way to avoid it is to make the data+parity updates atomic, either in NVRAM or via transactions. I'm not aware of any software RAID solutions which do such logging at the moment: do you know of any?

AFAIK Veritas only does the first part of what I mentioned above (recognition of invalid on-disk data). They do logging by default for RAID5 volumes and optionally also for RAID1 volumes. In the RAID5 (with logging) case they can figure out if an n-1 disk write took place and can rebuild the data. In case an n-m (1 < m < n) write took place, they can therefore at least recognize the disaster ;-) In the RAID1 (with logging) scenario they are able to recognize which of the n mirrors hold current data and which ones don't, so they can deliver the current data to the user and try to make the other mirrors consistent. But because it's a software solution without any NVRAM support, they can't handle the data redundancy case.

Heinz
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
Hi,

On Wed, 12 Jan 2000 00:12:55 +0200 (IST), Gadi Oxman [EMAIL PROTECTED] said:

Stephen, I'm afraid that there are some misconceptions about the RAID-5 code.

I don't think so --- I've been through this with Ingo --- but I appreciate your feedback since I'm getting inconsistent advice here!

Please let me explain...

In an early pre-release version of the RAID code (more than two years ago?), which didn't protect against that race, we indeed saw locked buffers changing under us from the point in which we computed the parity till the point in which they were actually written to the disk, leading to a corrupted parity.

That is not the race. The race has nothing at all to do with buffers changing while they are being used for parity: that's a different problem, long ago fixed by copying the buffers. The race I'm concerned about could occur when the raid driver wants to compute parity for a stripe and finds some of the blocks are present, and clean, in the buffer cache. Raid assumes that those buffers represent what is on disk, naturally enough. So, it uses them to calculate parity without rereading all of the disk blocks in the stripe. The trouble is that the standard practice in the kernel, when modifying a buffer, is to make the change and _then_ mark the buffer dirty. If you hit that window, then the raid driver will find a buffer which doesn't match what is on disk, and will compute parity from that buffer rather than from the on-disk contents.

1. n dirty blocks are scheduled for a stripe write.

That's not the race. The problem occurs when only one single dirty block is scheduled for a write, and we need to find the contents of the rest of the stripe to compute parity.

Point (2) is also incorrect; we have taken care *not* to peek into the buffer cache to find clean buffers and use them for parity calculations. We make no such assumptions.

Not according to Ingo --- can we get a definitive answer on this, please?

Many thanks, Stephen
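For anyone who finds the window easier to see in code, here is a minimal sketch of the race as described above. It is a toy model, not kernel code: `Buffer`, `parity_trusting_clean_cache` and the byte values are all made up for illustration, and the routine deliberately does the thing under discussion (trusting a clean cached buffer), which per the replies in this thread is something raid5.c is said not to do.

```python
from dataclasses import dataclass
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID-5 parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


@dataclass
class Buffer:
    """Hypothetical stand-in for a buffer-cache entry."""
    data: bytes
    dirty: bool = False


def parity_trusting_clean_cache(disk, cache):
    """Hypothetical parity routine: uses a cached buffer whenever it is clean,
    on the assumption that a clean buffer matches the disk."""
    blocks = []
    for i, on_disk in enumerate(disk):
        buf = cache.get(i)
        if buf is not None and not buf.dirty:
            blocks.append(buf.data)   # assumed identical to the disk contents
        else:
            blocks.append(on_disk)    # simplification: fall back to the disk
    return xor_blocks(blocks)


disk = [b"\x10\x20", b"\x30\x40", b"\x50\x60"]
cache = {1: Buffer(disk[1])}

# The filesystem changes the buffer first...
cache[1].data = b"\xff\xff"
# ...and parity is computed inside the window, before the dirty flag is set.
parity_in_window = parity_trusting_clean_cache(disk, cache)
cache[1].dirty = True

parity_of_disk_contents = xor_blocks(disk)
print(parity_in_window != parity_of_disk_contents)
```

The parity computed inside the window reflects data that has not reached the disk yet, which is exactly the mismatch being described; once the dirty flag is set, the toy routine no longer trusts the buffer.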
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
Perhaps I am confused. How is it that a power outage while attached to the UPS becomes "unpredictable"? We run a Dell PowerEdge 2300/400 using Linux software raid and the system monitors its own UPS. When a power failure occurs the system will bring itself down to a minimal state (runlevel 1) after the batteries are below 50% .. and once below 15% it will shut down, which turns off the UPS. When power comes back on the UPS fires up and the system resumes as normal. Admittedly this won't prevent issues like god reaching out and slapping my system via lightning or something, nor will it resolve issues where someone decides to grab the power cable and swing around on it, severing the connection from the UPS to the system .. but for the most part it has thus far proven to be a fairly decent configuration.

Benno Senoner wrote:

"Stephen C. Tweedie" wrote:

(...) 3) The soft-raid background rebuild code reads and writes through the buffer cache with no synchronisation at all with other fs activity. After a crash, this background rebuild code will kill the write-ordering attempts of any journalling filesystem. This affects both ext3 and reiserfs, under both RAID-1 and RAID-5. Interaction 3) needs a bit more work from the raid core to fix, but it's still not that hard to do.

So, can any of these problems affect other, non-journaled filesystems too?

Yes, 1) can: throughout the kernel there are places where buffers are modified before the dirty bits are set. In such places we will always mark the buffers dirty soon, so the window in which an incorrect parity can be calculated is _very_ narrow (almost non-existent on non-SMP machines), and the window in which it will persist on disk is also very small. This is not a problem. It is just another example of a race window which exists already with _all_ non-battery-backed RAID-5 systems (both software and hardware): even with perfect parity calculations, it is simply impossible to guarantee that an entire stripe update on RAID-5 completes in a single, atomic operation. If you write a single data block and its parity block to the RAID array, then on an unexpected reboot you will always have some risk that the parity will have been written, but not the data. On a reboot, if you lose a disk then you can reconstruct it incorrectly due to the bogus parity.

THIS IS EXPECTED. RAID-5 isn't proof against multiple failures, and the only way you can get bitten by this failure mode is to have a system failure and a disk failure at the same time.

--Stephen

Thank you very much for these clear explanations.

Last doubt: :-) Assume all RAID code - FS interaction problems get fixed. Since a linux soft-RAID5 box has no battery backup, does this mean that we will lose data ONLY if there is a power failure AND a subsequent disk failure? If we lose the power and then after reboot all disks remain intact, can the RAID layer reconstruct all information in a safe way? The problem is that power outages are unpredictable even in the presence of UPSes, therefore it is important to have some protection against power losses.

regards, Benno.
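The "parity written, data not written" case discussed above can be worked through with plain XOR. This is only a sketch under stated assumptions: a three-data-disk stripe with invented block contents, and a resync that simply recomputes parity from whatever data is on the disks. None of the names come from the md driver.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID-5 parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


old_data = [b"AAAA", b"BBBB", b"CCCC"]
old_parity = xor_blocks(old_data)

new_block = b"ZZZZ"   # intended update to data block 0
new_parity = xor_blocks([new_block, old_data[1], old_data[2]])

# Unexpected reboot: the parity write reached its disk, the data write did not.
disk_data = list(old_data)
disk_parity = new_parity

# Case 1: all disks survive. Recomputing parity from the data blocks gives a
# consistent stripe again; only the interrupted write itself is lost.
resynced_parity = xor_blocks(disk_data)

# Case 2: the disk holding block 1 is lost as well, and is rebuilt from the
# surviving data blocks plus the stale parity.
rebuilt_block1 = xor_blocks([disk_data[0], disk_data[2], disk_parity])

print(resynced_parity == old_parity)   # stripe consistent again
print(rebuilt_block1 == old_data[1])   # rebuilt block is not the original
```

In this toy model the single failure (case 1) costs only the write that was in flight, while the double failure (case 2) silently rebuilds a block that was never part of the interrupted write, which is the failure mode called EXPECTED above.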
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
"Stephen C. Tweedie" wrote:

Ideally, what I'd like to see the reconstruction code do is to:

* lock a stripe
* read a new copy of that stripe locally
* recalc parity and write back whatever disks are necessary for the stripe
* unlock the stripe

so that the data never goes through the buffer cache at all, but that the stripe is locked with respect to other IOs going on below the level of ll_rw_block (remember there may be IOs coming in to ll_rw_block which are not from the buffer cache, eg. swap or journal IOs).

We are '100% journal-safe' if power fails during resync.

Except for the fact that resync isn't remotely journal-safe in the first place, yes. :-)

--Stephen

Sorry for my ignorance, I got a little confused by this post: Ingo said we are 100% journal-safe, you said the contrary. Can you or Ingo please explain to us in which situation (power loss) we risk a corrupted filesystem when running linux-raid + a journaled FS?

I am interested in what happens if the power goes down while you write heavily to an ext3/reiserfs (journaled FS) on a soft-raid5 array. After the reboot, if all disks remain physically intact, will we only lose the data that was being written, or is there a possibility of ending up with a corrupted filesystem which could cause more damage in the future? (Or do we need to wait for the raid code in 2.3?)

Sorry for re-asking that question, but I am still confused.

regards, Benno.
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
Hi, On Tue, 11 Jan 2000 16:41:55 -0600, "Mark Ferrell" [EMAIL PROTECTED] said: Perhaps I am confused. How is it that a power outage while attached to the UPS becomes "unpredictable"? One of the most common ways to get an outage while on a UPS is somebody tripping over, or otherwise removing, the cable between the UPS and the computer. How exactly is that predictable? Just because you reduce the risk of unexpected power outage doesn't mean we can ignore the possibility. --Stephen
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
Hi,

On Wed, 12 Jan 2000 07:21:17 -0500 (EST), Ingo Molnar [EMAIL PROTECTED] said:

On Wed, 12 Jan 2000, Gadi Oxman wrote:

As far as I know, we took care not to poke into the buffer cache to find clean buffers -- in raid5.c, the only code which does a find_buffer() is:

yep, this is still the case.

OK, that's good to know.

Especially the reconstruction code is a rathole. Unfortunately blocking reconstruction if b_count == 0 is not acceptable because several filesystems (such as ext2fs) keep metadata caches around (eg. the block group descriptors in the ext2fs case) which have b_count == 1 for a longer time.

That's not a problem: we don't need reconstruction to interact with the buffer cache at all. Ideally, what I'd like to see the reconstruction code do is to:

* lock a stripe
* read a new copy of that stripe locally
* recalc parity and write back whatever disks are necessary for the stripe
* unlock the stripe

so that the data never goes through the buffer cache at all, but that the stripe is locked with respect to other IOs going on below the level of ll_rw_block (remember there may be IOs coming in to ll_rw_block which are not from the buffer cache, eg. swap or journal IOs).

We are '100% journal-safe' if power fails during resync.

Except for the fact that resync isn't remotely journal-safe in the first place, yes. :-)

--Stephen
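The four reconstruction steps listed above might look roughly like this as a user-space toy. To be clear about the assumptions: `Stripe`, `resync_stripe` and `write_block` are hypothetical names, a `threading.Lock` stands in for whatever per-stripe locking the driver would use below ll_rw_block, and the "disks" are just Python lists.

```python
import threading
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID-5 parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class Stripe:
    """Hypothetical in-memory stand-in for one RAID-5 stripe."""

    def __init__(self, data, parity):
        self.data = list(data)
        self.parity = parity
        self.lock = threading.Lock()   # serialises resync against other IO


def resync_stripe(stripe):
    """Resync one stripe; return True if the parity had to be rewritten."""
    with stripe.lock:                             # lock the stripe
        local = [bytes(b) for b in stripe.data]   # private copy, no shared cache
        parity = xor_blocks(local)                # recalc parity
        stale = parity != stripe.parity
        if stale:
            stripe.parity = parity                # write back only what is needed
    return stale                                  # lock was released on leaving "with"


def write_block(stripe, index, block):
    """Ordinary write path; takes the same lock, so it cannot interleave."""
    with stripe.lock:
        stripe.data[index] = block
        stripe.parity = xor_blocks(stripe.data)


stripe = Stripe([b"\x01\x01", b"\x02\x02", b"\x04\x04"], parity=b"\x00\x00")
first = resync_stripe(stripe)    # parity was stale, gets rewritten
second = resync_stripe(stripe)   # nothing left to fix
write_block(stripe, 0, b"\x08\x08")
print(first, second, stripe.parity.hex())
```

The point of the sketch is only the ordering: the resync works on its own copy of the stripe and every other writer has to take the same lock, so nothing can slip in between the read and the parity write-back.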
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
"Stephen C. Tweedie" wrote:

(...) 3) The soft-raid background rebuild code reads and writes through the buffer cache with no synchronisation at all with other fs activity. After a crash, this background rebuild code will kill the write-ordering attempts of any journalling filesystem. This affects both ext3 and reiserfs, under both RAID-1 and RAID-5. Interaction 3) needs a bit more work from the raid core to fix, but it's still not that hard to do.

So, can any of these problems affect other, non-journaled filesystems too?

Yes, 1) can: throughout the kernel there are places where buffers are modified before the dirty bits are set. In such places we will always mark the buffers dirty soon, so the window in which an incorrect parity can be calculated is _very_ narrow (almost non-existent on non-SMP machines), and the window in which it will persist on disk is also very small. This is not a problem. It is just another example of a race window which exists already with _all_ non-battery-backed RAID-5 systems (both software and hardware): even with perfect parity calculations, it is simply impossible to guarantee that an entire stripe update on RAID-5 completes in a single, atomic operation. If you write a single data block and its parity block to the RAID array, then on an unexpected reboot you will always have some risk that the parity will have been written, but not the data. On a reboot, if you lose a disk then you can reconstruct it incorrectly due to the bogus parity.

THIS IS EXPECTED. RAID-5 isn't proof against multiple failures, and the only way you can get bitten by this failure mode is to have a system failure and a disk failure at the same time.

--Stephen

Thank you very much for these clear explanations.

Last doubt: :-) Assume all RAID code - FS interaction problems get fixed. Since a linux soft-RAID5 box has no battery backup, does this mean that we will lose data ONLY if there is a power failure AND a subsequent disk failure? If we lose the power and then after reboot all disks remain intact, can the RAID layer reconstruct all information in a safe way? The problem is that power outages are unpredictable even in the presence of UPSes, therefore it is important to have some protection against power losses.

regards, Benno.
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
"Stephen C. Tweedie" wrote: Hi, This is a FAQ: I've answered it several times, but in different places, SNIP THIS IS EXPECTED. RAID-5 isn't proof against multiple failures, and the only way you can get bitten by this failure mode is to have a system failure and a disk failure at the same time. To try to avoid this kind of problem some brands do have additional logging (to disk which is slow for sure or to NVRAM) in place, which enables them to at least recognize the fault to avoid the reconstruction of invalid data or even enables them to recover the data by using redundant copies of it in NVRAM + logging information what could be written to the disks and what not. Heinz
Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?
Hi, On Tue, 11 Jan 2000 15:03:03 +0100, mauelsha [EMAIL PROTECTED] said: THIS IS EXPECTED. RAID-5 isn't proof against multiple failures, and the only way you can get bitten by this failure mode is to have a system failure and a disk failure at the same time. To try to avoid this kind of problem some brands do have additional logging (to disk which is slow for sure or to NVRAM) in place, which enables them to at least recognize the fault to avoid the reconstruction of invalid data or even enables them to recover the data by using redundant copies of it in NVRAM + logging information what could be written to the disks and what not. Absolutely: the only way to avoid it is to make the data+parity updates atomic, either in NVRAM or via transactions. I'm not aware of any software RAID solutions which do such logging at the moment: do you know of any? --Stephen
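To make the logging idea above a bit more concrete, here is a minimal sketch of a write-intent log that at least lets you recognize which stripes are suspect after a crash. Everything in it is assumed for illustration: `IntentLog`, `logged_write` and `make_stripe` are hypothetical names, the log is an in-memory set standing in for a disk or NVRAM log, and it is not meant to describe how any existing product or the Linux RAID code works.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID-5 parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class IntentLog:
    """Hypothetical write-intent log: records stripes with writes in flight."""

    def __init__(self):
        self.in_flight = set()

    def begin(self, stripe_no):
        # Would have to be made durable *before* any data or parity is written.
        self.in_flight.add(stripe_no)

    def commit(self, stripe_no):
        # Cleared only after data and parity have both reached the disks.
        self.in_flight.discard(stripe_no)

    def suspect_stripes(self):
        # After a crash, the parity of these stripes cannot be trusted.
        return sorted(self.in_flight)


def make_stripe(blocks):
    data = list(blocks)
    return {"data": data, "parity": xor_blocks(data)}


def logged_write(log, stripes, stripe_no, index, new_block, crash_before_parity=False):
    """Update one data block plus parity, bracketed by log entries."""
    log.begin(stripe_no)
    stripe = stripes[stripe_no]
    stripe["data"][index] = new_block
    if crash_before_parity:
        return   # simulated power failure: parity is stale, log entry remains
    stripe["parity"] = xor_blocks(stripe["data"])
    log.commit(stripe_no)


stripes = {0: make_stripe([b"\x01\x02", b"\x03\x04"]),
           1: make_stripe([b"\x05\x06", b"\x07\x08"])}
log = IntentLog()
logged_write(log, stripes, 0, 0, b"\xaa\xbb")                            # completes
logged_write(log, stripes, 1, 1, b"\xcc\xdd", crash_before_parity=True)  # interrupted
print(log.suspect_stripes())
```

Note that this only covers the "recognize the fault" half of the discussion: the log tells you stripe 1 must not be used for reconstruction until it has been resynced, but it keeps no copy of the data, so it cannot recover anything by itself.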
Re: FAQ and archive
Is there a mailing list archive available? How about an FAQ?

While many archives are available, I use http://www.mail-archive.com/linux-raid@vger.rutgers.edu/

Since it's a searchable archive, I tend to use that instead of looking at any FAQs, as documentation appears to be lagging development by a good margin (which is fine; I'd rather have working code than well-documented uselessness :)

HTH, James Manning -- Miscellaneous Engineer --- IBM Netfinity Performance Development
Re: FAQ
Bruno Prior wrote:

It strikes me that this list desperately needs a FAQ. I'm off on holiday for the next two weeks, but unless someone else wants to volunteer, I'm willing to put one together when I get back. If people would like me to do this, I would welcome suggestions for questions to go in the FAQ.

Whoever volunteers: The first answer should summarize which version of {md,raid}tools works with which kernel patched with{,out} patch XY. Can't think of a question for that, though.

IMO it is very necessary to clear the fog that has laid itself across raid-with-linux in the last weeks or so.

Marc -- Marc Mutz [EMAIL PROTECTED] http://marc.mutz.com/ University of Bielefeld, Dep. of Mathematics / Dep. of Physics PGP-keyID's: 0xd46ce9ab (RSA), 0x7ae55b9e (DSS/DH)
Re: FAQ
On Fri, 9 Jul 1999, Marc Mutz wrote:

Bruno Prior wrote:

It strikes me that this list desperately needs a FAQ. I'm off on holiday for the next two weeks, but unless someone else wants to volunteer, I'm willing to put one together when I get back. If people would like me to do this, I would welcome suggestions for questions to go in the FAQ.

Whoever volunteers: The first answer should summarize which version of {md,raid}tools works with which kernel patched with{,out} patch XY. Can't think of a question for that, though.

Hrm. "I have kernel X-Version and I'm trying to use raidtools Y-Version, but it isn't working. What's up with that?" Maybe also include a few more subquestions like "I'm using the version of raidtools that came with my Linux distro, but it doesn't seem to be working".

IMO it is very necessary to clear the fog that has laid itself across raid-with-linux in the last weeks or so.

Having inherited the responsibility of maintaining a kernel RPM after 2.2.8 and discovering that the software raid stuff no longer worked, I'd have to agree. ;-)

-- Kelley Spoon [EMAIL PROTECTED] Sic Semper Tyrannis.
Re: FAQ
These questions are from the point of view of 0.90 or higher (i.e. RH 6.0).

- How do you recover a RAID1 or a RAID5 with a bad disk when you have no spares, i.e. how do you hotremove and hotadd? Please go through it step by step because many paths seem to lead to hangs.
- How do you recover a RAID1 or a RAID5 when you do have a spare? Does it or can it work automatically?
- How do you keep your raidtab file sync'd with your actual RAID when persistent-superblock is 1? Is there a translator for the numerical values, i.e. for parity-algorithm, found in /proc/mdstat?

Thanks, Larry Dickson Land-5 Corporation

At 12:06 PM 7/9/99 +0100, you wrote:

It strikes me that this list desperately needs a FAQ. I'm off on holiday for the next two weeks, but unless someone else wants to volunteer, I'm willing to put one together when I get back. If people would like me to do this, I would welcome suggestions for questions to go in the FAQ.

Cheers, Bruno Prior [EMAIL PROTECTED]