Re: [PATCH md 2 of 4] Fix raid6 problem

2005-02-02 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:A. James Lewis [EMAIL PROTECTED] In newsgroup: linux.dev.raid Sorry for the delay in replying. I've been using RAID6 in a real-life situation with 2.6.9 + patch for 2 months now, with 1.15TB of storage, and I have had more than 1 drive

Re: [PATCH md 2 of 4] Fix raid6 problem

2005-02-03 Thread H. Peter Anvin
Lars Marowsky-Bree wrote: On 2005-02-03T08:39:41, H. Peter Anvin [EMAIL PROTECTED] wrote: Yes, right now there is no RAID5-RAID6 conversion tool that I know of. Hm. One of the checksums is identical, as is the disk layout of the data, no? No, the layout is different. -hpa

Re: [PATCH md 2 of 4] Fix raid6 problem

2005-02-03 Thread H. Peter Anvin
Guy wrote: Would you say that the 2.6 Kernel is suitable for storing mission-critical data, then? Sure. I'd trust 2.6 over 2.4 at this point. I ask because I have read about a lot of problems with data corruption and oops on this list and the SCSI list. But in most or all cases the 2.4 Kernel

[PATCH] RAID Kconfig cleanups, remove experimental tag from RAID-6

2005-02-08 Thread H. Peter Anvin
This patch removes the experimental tag from RAID-6 (unfortunately the damage is already done...:-|) and cleans up a few more things in the Kconfig file. Signed-Off-By: H. Peter Anvin [EMAIL PROTECTED] Index: linux-2.5/drivers/md/Kconfig

Re: Forcing a more random uuid (random seed bug)

2005-02-22 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:Niccolo Rigacci [EMAIL PROTECTED] In newsgroup: linux.dev.raid I get /dev/md5, /dev/md6, /dev/md7 and /dev/md8 all with the same UUID! It seems that there is a bug in mdadm: when generating the UUID for a volume, the random() function is
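
The failure mode is easy to reproduce: random() starts from the same built-in default seed whenever it is not preceded by srandom(), so every invocation of a short-lived tool draws the identical sequence. An illustrative fragment (not the actual mdadm source):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	unsigned int uuid[4];
	int i;

	/* No srandom() call: random() starts from the same default
	 * seed, so this program prints the same "random" UUID on
	 * every single run. */
	for (i = 0; i < 4; i++)
		uuid[i] = random();

	printf("%08x:%08x:%08x:%08x\n", uuid[0], uuid[1], uuid[2], uuid[3]);
	return 0;
}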

Re: Forcing a more random uuid (random seed bug)

2005-02-22 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:[EMAIL PROTECTED] In newsgroup: linux.dev.raid +if ((my_fd = open("/dev/random", O_RDONLY)) != -1) { Please use /dev/urandom for such applications. /dev/random is the highest-quality generator, but will block if entropy isn't available.
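
A minimal sketch of that advice (the helper name is hypothetical): read the seed material from /dev/urandom, which never blocks, and fall back gracefully if the device cannot be opened.

#include <fcntl.h>
#include <unistd.h>

/* Fill buf with len bytes from /dev/urandom; returns 0 on success,
 * -1 on failure (caller can then fall back to srandom()/random()). */
static int get_urandom_bytes(void *buf, size_t len)
{
	int fd = open("/dev/urandom", O_RDONLY);
	ssize_t n;

	if (fd == -1)
		return -1;
	n = read(fd, buf, len);
	close(fd);
	return n == (ssize_t)len ? 0 : -1;
}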

Re: EVMS or md?

2005-04-04 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:David Kewley [EMAIL PROTECTED] In newsgroup: linux.dev.raid Mike Tran wrote on Monday 04 April 2005 12:28: We (EVMS team) intended to support RAID6 last year. But as we all remember RAID6 was not stable then. I may write a plugin to support

Re: Adding Reed-Solomon Personality to MD, need help/advice

2005-12-29 Thread H. Peter Anvin
Jeff Breidenbach wrote: The fundamental problem is that generic RS requires table lookups even in the common case, whereas RAID-6 uses shortcuts to substantially speed up the computation in the common case. If one wanted to support a typical 8-bit RS code (which supports a max of 256 drives,
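
The shortcut in question: RAID-6 computes the Q syndrome by Horner's rule, so the only multiplication the common case ever needs is by the generator {02} -- a shift plus a conditional XOR with 0x1d in GF(2^8), with no table lookups at all -- whereas a generic RS encoder goes through log/exp table lookups per byte. A rough scalar sketch (the kernel code is unrolled and vectorized; this is only the idea):

#include <stdint.h>
#include <stddef.h>

/* Multiply by x ({02}) in GF(2^8) modulo x^8+x^4+x^3+x^2+1. */
static inline uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/* Q = D_0 + g*D_1 + ... + g^(n-1)*D_{n-1}, evaluated by Horner's rule. */
static void gen_q(uint8_t *q, uint8_t *const *data, int ndisks, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		uint8_t w = data[ndisks - 1][i];
		for (int d = ndisks - 2; d >= 0; d--)
			w = gf_mul2(w) ^ data[d][i];
		q[i] = w;
	}
}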

Re: Exporting which partitions to md-configure

2006-01-30 Thread H. Peter Anvin
Neil Brown wrote: On Monday January 30, [EMAIL PROTECTED] wrote: Any feeling how best to do that? My current thinking is to export a flags entry in addition to the current ones, presumably based on struct parsed_partitions->parts[].flags (fs/partitions/check.h), which seems to be what causes

Re: Exporting which partitions to md-configure

2006-01-30 Thread H. Peter Anvin
Kyle Moffett wrote: Well, for an MSDOS partition table, you would look for '253', for a Mac partition table you could look for something like 'Linux_RAID' or similar (just arbitrarily define some name beginning with the Linux_ prefix), etc. This means that the partition table type would

Re: Exporting which partitions to md-configure

2006-01-30 Thread H. Peter Anvin
Neil Brown wrote: Well, grepping through fs/partitions/*.c, the 'flags' thing is set by efi.c, msdos.c, sgi.c and sun.c. Of these, efi.c compares something against PARTITION_LINUX_RAID_GUID, and msdos.c, sgi.c and sun.c compare something against LINUX_RAID_PARTITION. The former would look like
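
Roughly what those comparisons boil down to, as a sketch with hypothetical helper names; the GUID is the standard Linux RAID partition type GUID (A19D880F-05FC-4D3B-A006-743F0F84911E), stored mixed-endian on disk:

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINUX_RAID_PARTITION 0xfd	/* MSDOS/SGI/Sun partition type */

static const uint8_t linux_raid_guid[16] = {
	0x0f, 0x88, 0x9d, 0xa1, 0xfc, 0x05, 0x3b, 0x4d,
	0xa0, 0x06, 0x74, 0x3f, 0x0f, 0x84, 0x91, 0x1e,
};

static bool msdos_part_is_raid(uint8_t sys_ind)
{
	return sys_ind == LINUX_RAID_PARTITION;
}

static bool efi_part_is_raid(const uint8_t type_guid[16])
{
	return memcmp(type_guid, linux_raid_guid, sizeof linux_raid_guid) == 0;
}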

Re: Exporting which partitions to md-configure

2006-01-30 Thread H. Peter Anvin
Neil Brown wrote: Mac partition tables don't currently support autodetect (as far as I can tell). Let's keep it that way. For now I guess I'll just take the code from init/do_mounts_md.c; we can worry about ripping the RAID_AUTORUN code out of the kernel later. -hpa

Re: [klibc] Re: Exporting which partitions to md-configure

2006-02-06 Thread H. Peter Anvin
Neil Brown wrote: What constitutes 'a piece of data'? A bit? a byte? I would say that msdos:fd is one piece of data. The 'fd' is useless without the 'msdos'. The 'msdos' is, I guess, not completely useless with the fd. I would lean towards the composite, but I wouldn't fight a

Re: [klibc] Re: Exporting which partitions to md-configure

2006-02-07 Thread H. Peter Anvin
Luca Berra wrote: This, in fact, is *EXACTLY* what we're talking about; it does require autoassemble. Why do we care about the partition types at all? The reason is that since the md superblock is at the end, it doesn't get automatically wiped if the partition is used as a raw filesystem,

Re: [klibc] Re: Exporting which partitions to md-configure

2006-02-07 Thread H. Peter Anvin
Luca Berra wrote: making it harder for the user is a good thing, but please not at the expense of usability. What's the usability problem? -hpa

Re: [PATCH 005 of 11] md: Merge raid5 and raid6 code

2006-04-30 Thread H. Peter Anvin
. This will probably be cleaned up later. Cc: H. Peter Anvin [EMAIL PROTECTED] Signed-off-by: Neil Brown [EMAIL PROTECTED] Wonderful! Thank you for doing this :) -hpa

Re: Problem with large devices > 2TB

2006-05-14 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:Jim Klimov [EMAIL PROTECTED] In newsgroup: linux.dev.raid Since the new parted worked ok (older one didn't), we were happy until we tried a reboot. During the device initialization and after it the system only recognises the 6 or 7

Re: And then there was Bryce...

2006-06-08 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:John Stoffel [EMAIL PROTECTED] In newsgroup: linux.dev.raid The problem is more likely that your /etc/mdadm/mdadm.conf file is specifying exactly which partitions to use, instead of just doing something like the following: DEVICE partitions

Re: And then there was Bryce...

2006-06-08 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:Henrik Holst [EMAIL PROTECTED] In newsgroup: linux.dev.raid The same happened to me with eth0-2. I _could_ not for my life understand why I didn't get the internet connection to work. But then I realized that eth0 and eth1 had been swapped after I

Re: which CPU for XOR?

2006-06-09 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:Dexter Filmore [EMAIL PROTECTED] In newsgroup: linux.dev.raid What type of operation is XOR anyway? Should be ALU, right? So - what CPU is best for software raid? One with high integer processing power? Something with massive wide vector
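
XOR is indeed a plain ALU (or vector-unit) operation; RAID-5 parity is just the bytewise XOR of all the data blocks, as in the minimal scalar sketch below. The kernel's XOR template code benchmarks several routines (integer, MMX, SSE, ...) at boot and picks the fastest, so raw integer throughput is not the whole story.

#include <stdint.h>
#include <stddef.h>

/* parity[i] = blocks[0][i] ^ blocks[1][i] ^ ... ^ blocks[nblocks-1][i] */
static void xor_parity(uint8_t *parity, uint8_t *const *blocks,
		       int nblocks, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		uint8_t p = 0;
		for (int b = 0; b < nblocks; b++)
			p ^= blocks[b][i];
		parity[i] = p;
	}
}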

Re: raid6

2006-06-16 Thread H. Peter Anvin
Followup to: [EMAIL PROTECTED] By author:=?GB2312?B?uPDQ69fK?= [EMAIL PROTECTED] In newsgroup: linux.dev.raid I am confronted with a big problem with the raid6 algorithm: recently I have been studying the raid6 code of Linux 2.6 that you contributed. Unfortunately I cannot understand

Re: Ok to go ahead with this setup?

2006-06-22 Thread H. Peter Anvin
Molle Bestefich wrote: Christian Pernegger wrote: Intel SE7230NH1-E mainboard Pentium D 930 HPA recently said that x86_64 CPUs have better RAID5 performance. Actually, anything with SSE2 should be OK. -hpa

Re: Ok to go ahead with this setup?

2006-06-22 Thread H. Peter Anvin
Molle Bestefich wrote: Christian Pernegger wrote: Anything specific wrong with the Maxtors? No. I've used Maxtor for a long time and I'm generally happy with them. They break now and then, but their online warranty system is great. I've also been treated kindly by their help desk - talked

Re: Curious code in autostart_array

2006-06-22 Thread H. Peter Anvin
Pete Zaitcev wrote: Hi, guys: My copy of 2.6.17-rc5 has the following code in autostart_array(): mdp_disk_t *desc = sb->disks + i; dev_t dev = MKDEV(desc->major, desc->minor); if (!dev) continue; if (dev ==

Re: Multiple raids on one machine?

2006-06-25 Thread H. Peter Anvin
Chris Allen wrote: 2. Partition the raw disks into four partitions and make /dev/md0,md1,md2,md3. But am I heading for problems here? Is there going to be a big performance hit with four raid5 arrays on the same machine? Am I likely to have dataloss problems if my machine crashes? 2

Re: Linux: Why software RAID?

2006-08-23 Thread H. Peter Anvin
Chris Friesen wrote: Jeff Garzik wrote: But anyway, to help answer the question of hardware vs. software RAID, I wrote up a page: http://linux.yyz.us/why-software-raid.html Just curious...with these guys (http://www.bigfootnetworks.com/KillerOverview.aspx) putting linux on a PCI NIC

Re: strange raid6 assembly problem

2006-08-24 Thread H. Peter Anvin
Mickael Marchand wrote: so basically I don't really know what to do with my sdf3 at the moment and fear to reboot again :o) maybe a --re-add /dev/sdf3 could work here? but will it survive a reboot? At this point, for whatever reason, your kernel doesn't see /dev/sdf3 as part of the array.

Re: [md] RAID6: clean up CPUID and FPU enter/exit code

2007-02-08 Thread H. Peter Anvin
My apologies for the screwed-up 'To:' line in the previous email... I did -s `head -1 file` instead of -s "`head -1 file`" by mistake [:^O -hpa (who is going to bed now...)

Re: PATA/SATA Disk Reliability paper

2007-02-19 Thread H. Peter Anvin
Richard Scobie wrote: Thought this paper may be of interest. A study done by Google on over 100,000 drives they have/had in service. http://labs.google.com/papers/disk_failures.pdf Bastards: Failure rates are known to be highly correlated with drive models, manufacturers and vintages [18].

Re: end to end error recovery musings

2007-02-23 Thread H. Peter Anvin
Ric Wheeler wrote: We still have the following challenges: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector (triggering the error

Re: end to end error recovery musings

2007-02-23 Thread H. Peter Anvin
Andreas Dilger wrote: And clearing this list when the sector is overwritten, as it will almost certainly be relocated at the disk level. Certainly if the overwrite is successful. -hpa

Re: end to end error recovery musings

2007-02-26 Thread H. Peter Anvin
Theodore Tso wrote: In any case, the reason why I bring this up is that it would be really nice if there was a way with a single laptop drive to be able to do snapshots and background fsck's without having to use initrd's with device mapper. This is a major part of why I've been trying to

Re: end to end error recovery musings

2007-02-28 Thread H. Peter Anvin
James Bottomley wrote: On Wed, 2007-02-28 at 12:42 -0500, Martin K. Petersen wrote: 4104. It's 8 bytes per hardware sector. At least for T10... Er ... that won't look good to the 512 ATA compatibility remapping ... Well, in that case you'd only see 8x512 data bytes, no metadata...
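
For reference, the 8 bytes per hardware sector being discussed is the T10 DIF tuple, which is what stretches a 4096-byte sector to the 4104 bytes mentioned above (or 512 to 520). Sketched as a C struct; the field layout follows the T10 spec (the tuple is big-endian on the wire):

#include <stdint.h>

struct dif_tuple {
	uint16_t guard_tag;	/* CRC of the sector's data bytes */
	uint16_t app_tag;	/* application-defined */
	uint32_t ref_tag;	/* typically the low 32 bits of the LBA */
} __attribute__((packed));	/* 8 bytes total */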

Re: RAID1, hot-swap and boot integrity

2007-03-04 Thread H. Peter Anvin
Mike Accetta wrote: I've been considering trying something like having the re-sync algorithm on a whole disk array defer the copy for sector 0 to the very end of the re-sync operation. Assuming the BIOS makes at least a minimal consistency check on sector 0 before electing to boot from the

Re: RAID1, hot-swap and boot integrity

2007-03-05 Thread H. Peter Anvin
Mike Accetta wrote: I wonder if having the MBR typically outside of the array and the relative newness of partitioned arrays are related? When I was considering how to architect the RAID1 layout it seemed like a partitioned array on the entire disk worked most naturally. It's one way to do

Re: RAID1, hot-swap and boot integrity

2007-03-07 Thread H. Peter Anvin
Mike Accetta wrote: I gathered the impression somewhere, perhaps incorrectly, that the active flag was a function of the boot block, not the BIOS. We use Grub in the MBR and don't even have an active flag set in the partition table. The system still boots. The active flag is indeed an MBR

Re: mismatch_cnt questions

2007-03-07 Thread H. Peter Anvin
H. Peter Anvin wrote: Eyal Lebedinsky wrote: Neil Brown wrote: [trim Q re how resync fixes data] For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy and writing it over all other copies. For raid5 we assume the data is correct and update the parity. Can raid6 identify

Re: mismatch_cnt questions

2007-03-08 Thread H. Peter Anvin
Bill Davidsen wrote: When last I looked at Hamming code, and that would be 1989 or 1990, I believe that I learned that the number of Hamming bits needed to cover N data bits was 1+log2(N), which for 512 bytes would be 1+12, and fit into a 16 bit field nicely. I don't know that I would go
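
That recollection checks out for a single-error-correcting Hamming code: r check bits cover m data bits when 2^r >= m + r + 1, and a 512-byte sector is m = 4096 bits:

\[
2^{13} = 8192 \;\ge\; 4096 + 13 + 1 = 4110
\qquad\Longrightarrow\qquad
r = 13 = 1 + \log_2 4096 ,
\]

which fits a 16-bit field with room to spare (SECDED would need one bit more).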

Re: Reshaping raid0/10

2007-03-10 Thread H. Peter Anvin
Neil Brown wrote: If I wanted to reshape a raid0, I would just morph it into a raid4 with a missing parity drive, then use the raid5 code to restripe it. Then morph it back to regular raid0. Wow, that made my brain hurt. Given the fact that we're going to have to do this on kernel.org soon,

Re: mismatch_cnt questions

2007-03-13 Thread H. Peter Anvin
Andre Noll wrote: On 00:21, H. Peter Anvin wrote: I have just updated the paper at: http://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf ... with this information (in slightly different notation and with a bit more detail.) There's a typo in the new section: s/By assumption, X_z

Re: mkinitrd and RAID6 on FC5

2007-04-23 Thread H. Peter Anvin
Guy Watkins wrote: Is this a REDHAT only problem/bug? If so, since bugzilla.redhat.com gets ignored, where do I complain? Yes, this is Redhat only, and as far as I know, it was fixed a long time ago. I suspect you need to make sure you upgrade your entire system, especially mkinitrd, not

Re: mkinitrd and RAID6 on FC5

2007-04-23 Thread H. Peter Anvin
Guy Watkins wrote: I tried to update/upgrade and no updates are available for mkinitrd. Do you know what version has the fix? The bugzilla was never closed, so it seems it has not been fixed. My version: mkinitrd.i386 5.0.32-2 installed I guess Red

Re: Please revert 5b479c91da90eef605f851508744bfe8269591a0 (md partition rescan)

2007-05-10 Thread H. Peter Anvin
Satyam Sharma wrote: On 5/10/07, Xavier Bestel [EMAIL PROTECTED] wrote: On Thu, 2007-05-10 at 16:51 +0200, Jan Engelhardt wrote: (But Andrew never saw your email, I suspect: [EMAIL PROTECTED] is probably some strange mixup of Andrew Morton and Andi Kleen in your mind ;) What do the

Re: [PATCH] [mdadm] Add klibc support to mdadm.h

2007-10-02 Thread H. Peter Anvin
maximilian attems wrote: klibc still misses a lot of functionality to let mdadm link against; this small step helps to get to the real trouble.. :) Signed-off-by: maximilian attems [EMAIL PROTECTED] --- mdadm.h | 9 ++++++++- 1 files changed, 8 insertions(+), 1 deletions(-) diff --git

[PATCH] raid6: clean up the style of mktables.c and its output

2007-10-26 Thread H. Peter Anvin
Make both mktables.c and its output CodingStyle compliant. Update the copyright notice. Signed-off-by: H. Peter Anvin [EMAIL PROTECTED] --- drivers/md/mktables.c | 166 +++-- 1 files changed, 79 insertions(+), 87 deletions(-) diff --git a/drivers/md

[PATCH] raid6: clean up the style of raid6test/test.c

2007-10-26 Thread H. Peter Anvin
Clean up the coding style in raid6test/test.c. Break it apart into subfunctions to make the code more readable. Signed-off-by: H. Peter Anvin [EMAIL PROTECTED] --- drivers/md/raid6test/test.c | 117 +-- 1 files changed, 69 insertions(+), 48 deletions

Re: switching root fs '/' to boot from RAID1 with grub

2007-11-01 Thread H. Peter Anvin
Doug Ledford wrote:
device /dev/sda (hd0)
root (hd0,0)
install --stage2=/boot/grub/stage2 /boot/grub/stage1 (hd0) /boot/grub/e2fs_stage1_5 p /boot/grub/stage2 /boot/grub/menu.lst
device /dev/hdc (hd0)
root (hd0,0)
install --stage2=/boot/grub/stage2 /boot/grub/stage1 (hd0)

Re: switching root fs '/' to boot from RAID1 with grub

2007-11-01 Thread H. Peter Anvin
Doug Ledford wrote: Correct, and that's what you want. The alternative is that if the BIOS can see the first disk but it's broken and can't be used, and if you have the boot sector on the second disk set to read from BIOS disk 0x81 because you ASSuMEd the first disk would be broken but still

Re: switching root fs '/' to boot from RAID1 with grub

2007-11-03 Thread H. Peter Anvin
Bill Davidsen wrote: Depends how bad the drive is. Just to align the thread on this - if the boot sector is bad, the BIOS on newer boxes will skip to the next one. But if it is good, and you boot into garbage - could be Windows.. does it crash? Right, if the drive is dead almost

Re: switching root fs '/' to boot from RAID1 with grub

2007-11-04 Thread H. Peter Anvin
Bill Davidsen wrote: I don't understand your point, unless there's a Linux bootloader in the BIOS it will boot whatever 512 bytes are in sector 0. So if that's crap it doesn't matter what it would do if it was valid, some other bytes came off the drive instead. Maybe Windows, since there

Re: [md-raid6-accel PATCH 01/12] async_tx: PQXOR implementation

2007-12-27 Thread H. Peter Anvin
Yuri Tikhonov wrote: This patch implements support for the asynchronous computation of RAID-6 syndromes. It provides an API to compute RAID-6 syndromes asynchronously in a format conforming to async_tx interfaces. The async_pxor and async_pqxor_zero_sum functions are very similar to async_xor

On the subject of RAID-6 corruption recovery

2007-12-27 Thread H. Peter Anvin
I got a private email a while ago from Thiemo Nagel claiming that some of the conclusions in my RAID-6 paper were incorrect. This was combined with a proof which was plain wrong, and could easily be disproven using basic entropy accounting (i.e. how much information is around to play with.)

Re: On the subject of RAID-6 corruption recovery

2007-12-28 Thread H. Peter Anvin
Bill Davidsen wrote: H. Peter Anvin wrote: I got a private email a while ago from Thiemo Nagel claiming that some of the conclusions in my RAID-6 paper were incorrect. This was combined with a proof which was plain wrong, and could easily be disproven using basic entropy accounting (i.e. how

Re: On the subject of RAID-6 corruption recovery

2008-01-04 Thread H. Peter Anvin
Thiemo Nagel wrote: Inverting your argument, that means when we don't see z >= n or inconsistent z numbers, multidisc corruption can be excluded statistically. For errors occurring on the level of hard disk blocks (signature: most bytes of the block have D errors, all with same z), the

Re: On the subject of RAID-6 corruption recovery

2008-01-04 Thread H. Peter Anvin
Thiemo Nagel wrote: For errors occurring on the level of hard disk blocks (signature: most bytes of the block have D errors, all with same z), the probability for multidisc corruption to go undetected is ((n-1)/256)**512. This might pose a problem in the limiting case of n=255; however, for
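
The machinery behind these signatures, as derived in the RAID-6 paper: recompute the syndromes P' and Q' from the data as read. If exactly one data drive z is corrupted by an error E != 0, then, addition in GF(2^8) being XOR,

\[
P \oplus P' = E , \qquad Q \oplus Q' = g^{z} E
\qquad\Longrightarrow\qquad
z = \log_g \frac{Q \oplus Q'}{P \oplus P'} ,
\]

so a computed z that is >= n, or that varies from byte to byte within a block, is precisely the signature of corruption that no single data drive can explain.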

Re: On the subject of RAID-6 corruption recovery

2008-01-04 Thread H. Peter Anvin
Thiemo Nagel wrote: That's why I was asking about the generator. Theoretically, this situation might be countered by using a (pseudo-)random pattern of generators for the different bytes of a sector, though I'm not sure whether it is worth the effort. Changing the generator is

Re: On the subject of RAID-6 corruption recovery

2008-01-07 Thread H. Peter Anvin
Mattias Wadenstein wrote: On Mon, 7 Jan 2008, Thiemo Nagel wrote: What you call pathological cases are very common when it comes to real-world data. It is not at all unusual to find sectors filled with only a constant (usually zero, but not always), in which case your **512 becomes **1. Of