Re: Got raid10 assembled wrong - how to fix?

2008-02-13 Thread Michael Tokarev
George Spelvin wrote: I just discovered (the hard way, sigh, but not too much data loss) that a 4-drive RAID 10 array had the mirroring set up incorrectly. Given 4 drives A, B, C and D, I had intended to mirror A-C and B-D, so that I could split the mirror and run on either (A,B) or (C,D).

Re: transferring RAID-1 drives via sneakernet

2008-02-13 Thread Michael Tokarev
Jeff Breidenbach wrote: It's not a RAID issue, but make sure you don't have any duplicate volume names. According to Murphy's Law, if there are two / volumes, the wrong one will be chosen upon your next reboot. Thanks for the tip. Since I'm not using volumes or LVM at all, I should be safe

Re: Deleting mdadm RAID arrays

2008-02-05 Thread Michael Tokarev
Janek Kozicki wrote: Marcin Krol said: (by the date of Tue, 5 Feb 2008 11:42:19 +0100) 2. How can I delete that damn array so it doesn't hang my server up in a loop? dd if=/dev/zero of=/dev/sdb1 bs=1M count=10 This works provided the superblocks are at the beginning of the component
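The dd approach above only works when the superblock sits at the beginning of the component (formats 1.1/1.2). A sketch that works regardless of superblock format, with placeholder device names:

```shell
# Placeholder devices: /dev/md0 is the stuck array, /dev/sdb1 a component.
mdadm --stop /dev/md0              # release the members first
# --zero-superblock locates and erases the md superblock whatever the
# format: 0.90/1.0 live near the end of the device, 1.1/1.2 near the start.
mdadm --zero-superblock /dev/sdb1
```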

Re: Deleting mdadm RAID arrays

2008-02-05 Thread Michael Tokarev
Moshe Yudkowsky wrote: Michael Tokarev wrote: Janek Kozicki wrote: Marcin Krol said: (by the date of Tue, 5 Feb 2008 11:42:19 +0100) 2. How can I delete that damn array so it doesn't hang my server up in a loop? dd if=/dev/zero of=/dev/sdb1 bs=1M count=10 This works provided

Re: Auto generation of mdadm.conf

2008-02-05 Thread Michael Tokarev
Janek Kozicki wrote: Michael Tokarev said: (by the date of Tue, 05 Feb 2008 16:52:18 +0300) Janek Kozicki wrote: I'm not using mdadm.conf at all. That's wrong, as you need at least something to identify the array components. I was afraid of that ;-) So, is that a correct way
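For reference, the usual way to generate the array-identification lines mdadm needs is a standard mdadm invocation; review the output before committing it:

```shell
# Print ARRAY lines for all currently assembled arrays.
mdadm --detail --scan
# After reviewing the output, append it to the config file:
mdadm --detail --scan >> /etc/mdadm.conf
```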

Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-05 Thread Michael Tokarev
Linda Walsh wrote: Michael Tokarev wrote: Unfortunately a UPS does not *really* help here. Because unless it has a control program which properly shuts the system down on the loss of input power, and the battery really has the capacity to power the system while it's shutting down (anyone tested

Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-04 Thread Michael Tokarev
Moshe Yudkowsky wrote: [] But that's *exactly* what I have -- well, 5GB -- and which failed. I've modified /etc/fstab system to use data=journal (even on root, which I thought wasn't supposed to work without a grub option!) and I can power-cycle the system and bring it up reliably afterwards.

Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-04 Thread Michael Tokarev
Moshe Yudkowsky wrote: [] If I'm reading the man pages, Wikis, READMEs and mailing lists correctly -- not necessarily the case -- the ext3 file system uses the equivalent of data=journal as a default. ext3 defaults to data=ordered, not data=journal. ext2 doesn't have journal at all. The

Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-04 Thread Michael Tokarev
Eric Sandeen wrote: Moshe Yudkowsky wrote: So if I understand you correctly, you're stating that currently the most reliable fs in its default configuration, in terms of protection against power-loss scenarios, is XFS? I wouldn't go that far without some real-world poweroff testing, because

Re: In this partition scheme, grub does not find md information?

2008-02-04 Thread Michael Tokarev
John Stoffel wrote: [] C'mon, how many of you are programmed to believe that 1.2 is better than 1.0? But when they're not different, just different placements, then it's confusing. Speaking of the more-is-better thing... There were quite a few bugs fixed in recent months wrt version 1

Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-04 Thread Michael Tokarev
Eric Sandeen wrote: [] http://oss.sgi.com/projects/xfs/faq.html#nulls and note that recent fixes have been made in this area (also noted in the faq) Also - the above all assumes that when a drive says it's written/flushed data, that it truly has. Modern write-caching drives can wreak

Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-03 Thread Michael Tokarev
Moshe Yudkowsky wrote: I've been reading the draft and checking it against my experience. Because of local power fluctuations, I've just accidentally checked my system: My system does *not* survive a power hit. This has happened twice already today. I've got /boot and a few other pieces in

Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-03 Thread Michael Tokarev
Moshe Yudkowsky wrote: Michael Tokarev wrote: Speaking of repairs. As I already mentioned, I always use small (256M..1G) raid1 array for my root partition, including /boot, /bin, /etc, /sbin, /lib and so on (/usr, /home, /var are on their own filesystems). And I had the following

Re: In this partition scheme, grub does not find md information?

2008-01-30 Thread Michael Tokarev
Moshe Yudkowsky wrote: [] Mr. Tokarev wrote: By the way, on all our systems I use small (256Mb for small-software systems, sometimes 512M, but 1G should be sufficient) partition for a root filesystem (/etc, /bin, /sbin, /lib, and /boot), and put it on a raid1 on all... ... doing [it] this

Re: WRONG INFO (was Re: In this partition scheme, grub does not find md information?)

2008-01-30 Thread Michael Tokarev
Peter Rabbitson wrote: Moshe Yudkowsky wrote: over the other. For example, I've now learned that if I want to set up a RAID1 /boot, it must actually be 1.2 or grub won't be able to read it. (I would therefore argue that if the new version ever becomes default, then the default sub-version

Re: In this partition scheme, grub does not find md information?

2008-01-30 Thread Michael Tokarev
Keld Jørn Simonsen wrote: [] Ugh. 2-drive raid10 is effectively just a raid1. I.e, mirroring without any striping. (Or, backwards, striping without mirroring). uhm, well, I did not understand: (Or, backwards, striping without mirroring). I don't think a 2 drive vanilla raid10 will do

Re: In this partition scheme, grub does not find md information?

2008-01-29 Thread Michael Tokarev
Moshe Yudkowsky wrote: Peter Rabbitson wrote: It is exactly what the names implies - a new kind of RAID :) The setup you describe is not RAID10 it is RAID1+0. As far as how linux RAID10 works - here is an excellent article:

Re: In this partition scheme, grub does not find md information?

2008-01-29 Thread Michael Tokarev
Keld Jørn Simonsen wrote: On Tue, Jan 29, 2008 at 09:57:48AM -0600, Moshe Yudkowsky wrote: In my 4 drive system, I'm clearly not getting 1+0's ability to use grub out of the RAID10. I expect it's because I used 1.2 superblocks (why not use the latest, I said, foolishly...) and therefore the

Re: In this partition scheme, grub does not find md information?

2008-01-29 Thread Michael Tokarev
Peter Rabbitson wrote: [] However if you want to be so anal about names and specifications: md raid 10 is not a _full_ 1+0 implementation. Consider the textbook scenario with 4 drives: (A mirroring B) striped with (C mirroring D) When only drives A and C are present, md raid 10 with near
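The two layouts under discussion can be sketched with mdadm like this (device names are placeholders): classic RAID1+0 is built as nested arrays, while md raid10 "near" is a single array with a similar on-disk layout but, as noted above, different behavior when half the drives are missing.

```shell
# Textbook RAID1+0: two RAID1 pairs with RAID0 striped on top.
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md3 --level=0 --raid-devices=2 /dev/md1 /dev/md2

# md raid10, near-2 layout: one array over the same four drives.
mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=4 \
    /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
```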

Re: In this partition scheme, grub does not find md information?

2008-01-29 Thread Michael Tokarev
Keld Jørn Simonsen wrote: On Tue, Jan 29, 2008 at 06:13:41PM +0300, Michael Tokarev wrote: Linux raid10 MODULE (which implements that standard raid10 LEVEL in full) adds some quite.. unusual extensions to that standard raid10 LEVEL. The resulting layout is also called raid10 in linux (ie

Re: In this partition scheme, grub does not find md information?

2008-01-29 Thread Michael Tokarev
Moshe Yudkowsky wrote: Michael Tokarev wrote: There are more-or-less standard raid LEVELS, including raid10 (which is the same as raid1+0, or a stripe on top of mirrors - note it does not mean 4 drives, you can use 6 - stripe over 3 mirrors each of 2 components; or the reverse - stripe

Re: In this partition scheme, grub does not find md information?

2008-01-29 Thread Michael Tokarev
Peter Rabbitson wrote: Michael Tokarev wrote: Raid10 IS RAID1+0 ;) It's just that linux raid10 driver can utilize more.. interesting ways to lay out the data. This is misleading, and adds to the confusion existing even before linux raid10. When you say raid10 in the hardware raid world

Re: In this partition scheme, grub does not find md information?

2008-01-29 Thread Michael Tokarev
Peter Rabbitson wrote: Moshe Yudkowsky wrote: One of the puzzling things about this is that I conceive of RAID10 as two RAID1 pairs, with RAID0 on top of to join them into a large drive. However, when I use --level=10 to create my md drive, I cannot find out which two pairs are the RAID1's:

Re: Fwd: Error on /dev/sda, but takes down RAID-1

2008-01-23 Thread Michael Tokarev
Martin Seebach wrote: Hi, I'm not sure this is completely linux-raid related, but I can't figure out where to start: A few days ago, my server died. I was able to log in and salvage this content of dmesg: http://pastebin.com/m4af616df I talked to my hosting-people and they said

Re: Last ditch plea on remote double raid5 disk failure

2007-12-31 Thread Michael Tokarev
Neil Brown wrote: On Monday December 31, [EMAIL PROTECTED] wrote: I'm hoping that if I can get raid5 to continue despite the errors, I can bring back up enough of the server to continue, a bit like the remount-ro option in ext2/ext3. If not, oh well... Sorry, but it is oh well. Speaking

Re: Linux RAID Partition Offset 63 cylinders / 30% performance hit?

2007-12-29 Thread Michael Tokarev
Justin Piszcz wrote: [] Good to know/have it confirmed by someone else, the alignment does not matter with Linux/SW RAID. Alignment matters when one partitions Linux/SW raid array. If the inside partitions will not be aligned on a stripe boundary, esp. in the worst case when the filesystem
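The alignment check itself is simple arithmetic: a partition inside the array is stripe-aligned when its start offset is a multiple of chunk size times the number of data disks. A shell sketch with illustrative numbers:

```shell
# Illustrative numbers: 64k chunk, 4-disk raid5 => 3 data disks.
chunk_kb=64
ndata=3
stripe_sectors=$(( chunk_kb * 2 * ndata ))   # 512-byte sectors per full stripe
part_start=384                               # partition start, in sectors
if [ $(( part_start % stripe_sectors )) -eq 0 ]; then
    echo aligned
else
    echo "misaligned by $(( part_start % stripe_sectors )) sectors"
fi
```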

Re: raid10: unfair disk load?

2007-12-23 Thread Michael Tokarev
maobo wrote: Hi all, Yes, Raid10 read balance is the shortest position time first, also considering the sequential access condition. But in my tests its performance is really poor compared to raid0. Single-stream write performance of raid0, raid1 and raid10 should be of similar level (with raid5 and

Re: raid10: unfair disk load?

2007-12-21 Thread Michael Tokarev
Michael Tokarev wrote: I just noticed that with Linux software RAID10, disk usage isn't equal at all, that is, most reads are done from the first part of mirror(s) only. Attached (disk-hour.png) is a little graph demonstrating this (please don't blame me for poor choice of colors

Re: raid10: unfair disk load?

2007-12-21 Thread Michael Tokarev
Janek Kozicki wrote: Michael Tokarev said: (by the date of Fri, 21 Dec 2007 14:53:38 +0300) I just noticed that with Linux software RAID10, disk usage isn't equal at all, that is, most reads are done from the first part of mirror(s) only. what's your kernel version? I recall that recently

Re: [ERROR] scsi.c: In function 'scsi_get_serial_number_page'

2007-12-19 Thread Michael Tokarev
Thierry Iceta wrote: Hi I would like to use raidtools-1.00.3 on Rhel5 distribution but I got this error Use mdadm instead. Raidtools is dangerous/unsafe, and has been unmaintained for a long time already. /mjt - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of

external bitmaps.. and more

2007-12-06 Thread Michael Tokarev
I come across a situation where external MD bitmaps aren't usable on any standard linux distribution unless special (non-trivial) actions are taken. First is a small buglet in mdadm, or two. It's not possible to specify --bitmap= in assemble command line - the option seems to be ignored. But
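For context, the invocations in question look roughly like this (bitmap file path and devices are placeholders); per the report, the --bitmap= option on the assemble command line is silently ignored:

```shell
# Attach an external (file-backed) bitmap to a running array:
mdadm --grow /dev/md1 --bitmap=/var/lib/md1.bitmap
# On assemble, the bitmap file must be named again -- this is the
# option the report says gets ignored:
mdadm --assemble /dev/md1 --bitmap=/var/lib/md1.bitmap /dev/sda1 /dev/sdb1
```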

Re: assemble vs create an array.......

2007-12-06 Thread Michael Tokarev
[Cc'd to xfs list as it contains something related] Dragos wrote: Thank you. I want to make sure I understand. [Some background for XFS list. The talk is about a broken linux software raid (the reason for breakage isn't relevant anymore). The OP seems to lost the order of drives in his

Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)

2007-12-02 Thread Michael Tokarev
Justin Piszcz said: (by the date of Sun, 2 Dec 2007 04:11:59 -0500 (EST)) The badblocks did not do anything; however, when I built a software raid 5 and then performed a dd: /usr/bin/time dd if=/dev/zero of=fill_disk bs=1M I saw this somewhere along the way: [42332.936706] ata5.00:

Re: assemble vs create an array.......

2007-11-30 Thread Michael Tokarev
Bryce wrote: [] mdadm -C -l5 -n5 -c128 /dev/md0 /dev/sdf1 /dev/sde1 /dev/sdg1 /dev/sdc1 /dev/sdd1 ... IF you don't have the configuration printout, then you're left with exhaustive brute force searching of the combinations You're missing a very important point -- --assume-clean option. For
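The brute-force search that --assume-clean makes safe can be sketched like this (chunk size and device order are just one candidate, taken from the quoted command):

```shell
# Recreate with one candidate device order; --assume-clean prevents
# the initial resync from rewriting (and destroying) data/parity.
mdadm --create /dev/md0 --assume-clean --level=5 --raid-devices=5 \
    --chunk=128 /dev/sdf1 /dev/sde1 /dev/sdg1 /dev/sdc1 /dev/sdd1
fsck -n /dev/md0     # read-only check: does an intact filesystem appear?
mdadm --stop /dev/md0
# ...repeat with the next permutation until fsck is happy.
```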

Re: man mdadm - suggested correction.

2007-11-05 Thread Michael Tokarev
Janek Kozicki wrote: [] Can you please add do the manual under 'SEE ALSO' a reference to /usr/share/doc/mdadm ? /usr/share/doc/mdadm is Debian-specific (well.. not sure it's really Debian (or something derived from it) -- some other distros may use the same naming scheme, too). Other

Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-04 Thread Michael Tokarev
Justin Piszcz wrote: # ps auxww | grep D USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 273 0.0 0.0 0 0 ? D Oct21 14:40 [pdflush] root 274 0.0 0.0 0 0 ? D Oct21 13:00 [pdflush] After several days/weeks,

Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-04 Thread Michael Tokarev
Justin Piszcz wrote: On Sun, 4 Nov 2007, Michael Tokarev wrote: [] The next time you come across something like that, do a SysRq-T dump and post that. It shows a stack trace of all processes - and in particular, where exactly each task is stuck. Yes I got it before I rebooted, ran

Re: Time to deprecate old RAID formats?

2007-10-22 Thread Michael Tokarev
John Stoffel wrote: Michael == Michael Tokarev [EMAIL PROTECTED] writes: If you are going to mirror an existing filesystem, then by definition you have a second disk or partition available for the purpose. So you would merely setup the new RAID1, in degraded mode, using the new partition

Re: Software RAID when it works and when it doesn't

2007-10-20 Thread Michael Tokarev
Justin Piszcz wrote: [] - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Justin, forgive me please, but can you learn to trim the original messages when

Re: Time to deprecate old RAID formats?

2007-10-20 Thread Michael Tokarev
Doug Ledford wrote: [] 1.0, 1.1, and 1.2 are the same format, just in different positions on the disk. Of the three, the 1.1 format is the safest to use since it won't allow you to accidentally have some sort of metadata between the beginning of the disk and the raid superblock (such as an

Re: Time to deprecate old RAID formats?

2007-10-20 Thread Michael Tokarev
John Stoffel wrote: Michael == Michael Tokarev [EMAIL PROTECTED] writes: [] Michael Well, I strongly, completely disagree. You described a Michael real-world situation, and that's unfortunate, BUT: for at Michael least raid1, there ARE cases, pretty valid ones, when one Michael NEEDS

Re: Time to deprecate old RAID formats?

2007-10-20 Thread Michael Tokarev
Justin Piszcz wrote: On Fri, 19 Oct 2007, Doug Ledford wrote: On Fri, 2007-10-19 at 13:05 -0400, Justin Piszcz wrote: [] Got it, so for RAID1 it would make sense if LILO supported it (the later versions of the md superblock) Lilo doesn't know anything about the superblock format,

Re: very degraded RAID5, or increasing capacity by adding discs

2007-10-09 Thread Michael Tokarev
Neil Brown wrote: On Tuesday October 9, [EMAIL PROTECTED] wrote: [] o During this reshape time, errors may be fatal to the whole array - while mdadm do have a sense of critical section, but the whole procedure isn't as much tested as the rest of raid code, I for one will not rely on it,

Re: very degraded RAID5, or increasing capacity by adding discs

2007-10-08 Thread Michael Tokarev
Janek Kozicki wrote: Hello, Recently I started to use mdadm and I'm very impressed by its capabilities. I have raid0 (250+250 GB) on my workstation. And I want to have raid5 (4*500 = 1500 GB) on my backup machine. Hmm. Are you sure you need that much space on the backup, to start with?

Re: Journalling filesystem corruption fixed in between?

2007-10-03 Thread Michael Tokarev
Rustedt, Florian wrote: Hello list, some folks reported severe filesystem-crashes with ext3 and reiserfs on mdraid level 1 and 5. I guess much more strong evidience and details are needed. Without any additional information I for one can only make a (not-so-pleasant) guess about those some

Re: problem killing raid 5

2007-10-01 Thread Michael Tokarev
Daniel Santos wrote: I retried rebuilding the array once again from scratch, and this time checked the syslog messages. The reconstruction process is getting stuck at a disk block that it can't read. I double checked the block number by repeating the array creation, and did a bad block scan.

Re: problem killing raid 5

2007-10-01 Thread Michael Tokarev
Patrik Jonsson wrote: Michael Tokarev wrote: [] But in any case, md should not stall - be it during reconstruction or not. For this, I can't comment - to me it smells like a bug somewhere (md layer? error handling in driver? something else?) which should be found and fixed

Re: Backups w/ rsync

2007-09-29 Thread Michael Tokarev
Dean S. Messing wrote: Michael Tokarev writes: [] : the procedure is something like this: : : cd /backups : rm -rf tmp/ : cp -al $yesterday tmp/ : rsync -r --delete -t ... /filesystem tmp : mv tmp $today : : That is, link the previous backup to temp (which takes no space

Re: Help: very slow software RAID 5.

2007-09-20 Thread Michael Tokarev
Dean S. Messing wrote: [] [] That's what attracted me to RAID 0 --- which seems to have no downside EXCEPT safety :-). So I'm not sure I'll ever figure out the right tuning. I'm at the point of abandoning RAID entirely and just putting the three disks together as a big LV and being done

Re: SWAP file on a RAID-10 array possible?

2007-08-15 Thread Michael Tokarev
Tomas France wrote: Thanks for the answer, David! I kind of think RAID-10 is a very good choice for a swap file. For now I will need to setup the swap file on a simple RAID-1 array anyway, I just need to be prepared when it's time to add more disks and transform the whole thing into

Re: A raid in a raid.

2007-07-21 Thread Michael Tokarev
mullaly wrote: [] All works well until a system reboot. md2 appears to be brought up before md0 and md1 which causes the raid to start without two of its drives. Is there any way to fix this? How about listing the arrays in proper order in mdadm.conf ? /mjt

Re: 3ware 9650 tips

2007-07-13 Thread Michael Tokarev
Joshua Baker-LePain wrote: [] Yep, hardware RAID -- I need the hot swappability (which, AFAIK, is still an issue with md). Just out of curiosity - what do you mean by swappability ? For many years we've been using linux software raid, and we had no problems with swappability of the component drives (in

RFC: dealing with bad blocks: another view

2007-06-13 Thread Michael Tokarev
Now MD subsystem does a very good job at trying to recover a bad block on a disk, by re-writing its content (to force drive to reallocate the block in question) and verifying it's written ok. But I wonder if it's worth the effort to go further than that. Now, md can use bitmaps. And a bitmap

Re: Recovery of software RAID5 using FC6 rescue?

2007-05-09 Thread Michael Tokarev
Nix wrote: On 8 May 2007, Michael Tokarev told this: BTW, for such recovery purposes, I use initrd (initramfs really, but does not matter) with a normal (but tiny) set of commands inside, thanks to busybox. So everything can be done without any help from external recovery CD. Very handy

Re: No such device on --remove

2007-05-09 Thread Michael Tokarev
Bernd Schubert wrote: Benjamin Schieder wrote: [EMAIL PROTECTED]:~# mdadm /dev/md/2 -r /dev/hdh5 mdadm: hot remove failed for /dev/hdh5: No such device md1 and md2 are supposed to be raid5 arrays. You are probably using udev, aren't you? Somehow there's presently no /dev/hdh5, but to

Re: removed disk md-device

2007-05-09 Thread Michael Tokarev
Bernd Schubert wrote: Hi, we are presently running into a hotplug/linux-raid problem. Let's assume a hard disk entirely fails or a stupid human being pulls it out of the system. Several partitions of the very same hard disk are also part of linux software raid. Also, /dev is managed by

Re: Swapping out for larger disks

2007-05-08 Thread Michael Tokarev
Brad Campbell wrote: [] It occurs though that the superblocks would be in the wrong place for the new drives and I'm wondering if the kernel or mdadm might not find them. I once had a similar issue. And wrote a tiny program (a hack, sort of), to read or write md superblock from/to a component

Re: No such device on --remove

2007-05-08 Thread Michael Tokarev
Benjamin Schieder wrote: Hi list. md2 : inactive hdh5[4](S) hdg5[1] hde5[3] hdf5[2] 11983872 blocks [EMAIL PROTECTED]:~# mdadm -R /dev/md/2 mdadm: failed to run array /dev/md/2: Input/output error [EMAIL PROTECTED]:~# mdadm /dev/md/ 0 1 2 3 4 5 [EMAIL PROTECTED]:~# mdadm

Re: Recovery of software RAID5 using FC6 rescue?

2007-05-08 Thread Michael Tokarev
Mark A. O'Neil wrote: Hello, I hope this is the appropriate forum for this request if not please direct me to the correct one. I have a system running FC6, 2.6.20-1.2925, software RAID5 and a power outage seems to have borked the file structure on the RAID. Boot shows the following

Re: s2disk and raid

2007-04-04 Thread Michael Tokarev
Neil Brown wrote: On Tuesday April 3, [EMAIL PROTECTED] wrote: [] After the power cycle the kernel boots, devices are discovered, among which the ones holding raid. Then we try to find the device that holds swap in case of resume and / in case of a normal boot. Now comes a crucial point. The

Re: Swap initialised as an md?

2007-03-23 Thread Michael Tokarev
Bill Davidsen wrote: [] If you use RAID0 on an array it will be faster (usually) than just partitions, but any process with swapped pages will crash if you lose either drive. With RAID1 operation will be more reliable but no faster. If you use RAID10 the array will be faster and more reliable,

Re: Raid 10 Problems?

2007-03-08 Thread Michael Tokarev
Jan Engelhardt wrote: [] The other thing is, the bitmap is supposed to be written out at intervals, not at every write, so the extra head movement for bitmap updates should be really low, and not making the tar -xjf process slower by half a minute. Is there a way to tweak the

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Michael Tokarev
Jason Rainforest wrote: I tried doing a check, found a mismatch_cnt of 8 (7*250Gb SW RAID5, multiple controllers on Linux 2.6.19.2, SMP x86-64 on Athlon64 X2 4200 +). I then ordered a resync. The mismatch_cnt returned to 0 at the start of As pointed out later it was repair, not resync.

Re: Move superblock on partition resize?

2007-02-07 Thread Michael Tokarev
) for exactly this purpose. /mjt /* mdsuper: read or write a linux software raid superblock (version 0.90) * from or to a given device. * * GPL. * Written by Michael Tokarev ([EMAIL PROTECTED]) */ #define _GNU_SOURCE #include <sys/types.h> #include <stdio.h> #include <unistd.h> #include <errno.h> #include

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-23 Thread Michael Tokarev
Justin Piszcz wrote: [] Is this a bug that can or will be fixed or should I disable pre-emption on critical and/or server machines? Disabling pre-emption on critical and/or server machines seems to be a good idea in the first place. IMHO anyway.. ;) /mjt

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-23 Thread Michael Tokarev
Justin Piszcz wrote: On Tue, 23 Jan 2007, Michael Tokarev wrote: Disabling pre-emption on critical and/or server machines seems to be a good idea in the first place. IMHO anyway.. ;) So bottom line is make sure not to use preemption on servers or else you will get weird spinlock

Re: raid5 software vs hardware: parity calculations?

2007-01-15 Thread Michael Tokarev
dean gaudet wrote: [] if this is for a database or fs requiring lots of small writes then raid5/6 are generally a mistake... raid10 is the only way to get performance. (hw raid5/6 with nvram support can help a bit in this area, but you just can't beat raid10 if you need lots of writes/s.)

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Michael Tokarev
Justin Piszcz wrote: Using 4 raptor 150s: Without the tweaks, I get 111MB/s write and 87MB/s read. With the tweaks, 195MB/s write and 211MB/s read. Using kernel 2.6.19.1. Without the tweaks and with the tweaks: # Stripe tests: echo 8192 > /sys/block/md3/md/stripe_cache_size # DD
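The stripe-cache tweak quoted in that post, as runnable commands (md3 is that poster's array; substitute your own; the setting does not persist across reboot):

```shell
# Show the current raid5/6 stripe cache size (in pages, per device):
cat /sys/block/md3/md/stripe_cache_size
# Enlarge it; a bigger cache helps large sequential writes at the cost of RAM:
echo 8192 > /sys/block/md3/md/stripe_cache_size
```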

Re: RAID1 root and swap and initrd

2006-12-21 Thread Michael Tokarev
[A late follow-up] Bill Davidsen wrote: Michael Tokarev wrote: Andre Majorel wrote: [] Thanks Jurriaan and Gordon. I think I may still be f*cked, however. The Lilo doc says you can't use raid-extra-boot=mbr-only if boot= does not point to a raid device. Which it doesn't because in my

Re: RAID1 root and swap and initrd

2006-12-16 Thread Michael Tokarev
Andre Majorel wrote: [] Thanks Jurriaan and Gordon. I think I may still be f*cked, however. The Lilo doc says you can't use raid-extra-boot=mbr-only if boot= does not point to a raid device. Which it doesn't because in my setup, boot=/dev/sda. Using boot=/dev/md5 would solve the

Re: RAID1 root and swap and initrd

2006-12-16 Thread Michael Tokarev
Andre Majorel wrote: [] So just move it to sda1 (or sda2, sda3) from sda5 Problem is, the disks are entirely used by an extended partition. There's nowhere to move sd?5 to. You're using raid, so you've at least two disk drives. remove one component off all your raid devices (second disk),

Re: why not make everything partitionable?

2006-11-15 Thread Michael Tokarev
martin f krafft wrote: Hi folks, you cannot create partitions within partitions, but you can well use whole disks for a filesystem without any partitions. It's usually better to have a partition table in place, at least on x86. Just to stop possible confusion - be it from kernel, or from

Re: [PATCH 001 of 6] md: Send online/offline uevents when an md array starts/stops.

2006-11-09 Thread Michael Tokarev
Neil Brown wrote: [/dev/mdx...] (much like how /dev/ptmx is used to create /dev/pts/N entries.) [] I have the following patch sitting in my patch queue (since about March). It does what you suggest via /sys/module/md-mod/parameters/MAGIC_FILE which is the only md-specific part of the /sys

Re: [PATCH 001 of 6] md: Send online/offline uevents when an md array starts/stops.

2006-11-09 Thread Michael Tokarev
Michael Tokarev wrote: Neil Brown wrote: [/dev/mdx...] [] And in any case, we have the semantic that opening an md device-file creates the device, and we cannot get rid of that semantic without a lot of warning and a lot of pain. And adding a new semantic isn't really going to help. I

mdadm: bitmaps not supported by this kernel?

2006-10-25 Thread Michael Tokarev
Another 32/64 bits issue, it seems. Running 2.6.18.1 x86-64 kernel and mdadm 2.5.3 (32 bit). # mdadm -G /dev/md1 --bitmap=internal mdadm: bitmaps not supported by this kernel. # mdadm -G /dev/md1 --bitmap=none mdadm: bitmaps not supported by this kernel. etc. Recompiling mdadm in 64bit mode

Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.

2006-10-12 Thread Michael Tokarev
Neil Brown wrote: [] Fix count of degraded drives in raid10. Signed-off-by: Neil Brown [EMAIL PROTECTED] --- .prev/drivers/md/raid10.c 2006-10-09 14:18:00.0 +1000 +++ ./drivers/md/raid10.c 2006-10-05 20:10:07.0 +1000 @@ -2079,7 +2079,7 @@ static int run(mddev_t *mddev)

Re: avoiding the initial resync on --create

2006-10-11 Thread Michael Tokarev
Doug Ledford wrote: On Mon, 2006-10-09 at 15:10 -0400, Rob Bray wrote: [] Probably the best thing to do would be on create of the array, setup a large all 0 block of mem and repeatedly write that to all blocks in the array devices except parity blocks and use a large all 1 block for that.

Re: Simulating Drive Failure on Mirrored OS drive

2006-10-02 Thread Michael Tokarev
andy liebman wrote: Read up on the md-faulty device. Got any links to this? As I said, we know how to set the device as faulty, but I'm not convinced this is a good simulation of a drive that fails (times out, becomes unresponsive, etc.) Note that 'set device as faulty' is NOT the same

Re: [BUG/PATCH] md bitmap broken on big endian machines

2006-09-29 Thread Michael Tokarev
Paul Clements wrote: Michael Tokarev wrote: Neil Brown wrote: ffs is closer, but takes an 'int' and we have a 'unsigned long'. So use ffz(~X) to convert a chunksize into a chunkshift. So we don't use ffs(int) for an unsigned value because of int vs unsigned int, but we use ffz

Re: [BUG/PATCH] md bitmap broken on big endian machines

2006-09-28 Thread Michael Tokarev
Neil Brown wrote: [] Use ffz instead of find_first_set to convert multiplier to shift. From: Paul Clements [EMAIL PROTECTED] find_first_set doesn't find the least-significant bit on bigendian machines, so it is really wrong to use it. ffs is closer, but takes an 'int' and we have a

proactive-raid-disk-replacement

2006-09-08 Thread Michael Tokarev
Recently Dean Gaudet, in thread titled 'Feature Request/Suggestion - Drive Linking', mentioned his document, http://arctic.org/~dean/proactive-raid5-disk-replacement.txt I've read it, and have some umm.. concerns. Here's why: mdadm -Gb internal --bitmap-chunk=1024 /dev/md4 mdadm /dev/md4
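For readers without that document at hand, the idea being critiqued goes roughly like this (devices hypothetical; a sketch of the quoted commands, not a recommendation):

```shell
# Add an internal write-intent bitmap so a later re-add resyncs only
# the blocks that changed while the member was out:
mdadm --grow /dev/md4 --bitmap=internal --bitmap-chunk=1024
# Take the suspect drive out of the array:
mdadm /dev/md4 --fail /dev/sde1 --remove /dev/sde1
# ...copy the old drive onto its replacement outside the array...
# Re-add; the bitmap keeps the resync short:
mdadm /dev/md4 --add /dev/sde1
```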

Re: RAID5 fill up?

2006-09-08 Thread Michael Tokarev
Lars Schimmer wrote: Hi! I've got a software RAID5 with 6 250GB HDs. Now I changed one disk after another to a 400GB HD and resynced the raid5 after each change. Now the RAID5 has got 6 400GB HDs and still uses only 6*250GB space. How can I grow the md0 device to use 6*400GB? mdadm --grow
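Growing an array after replacing every member with bigger disks is typically a two-step job (a sketch, not necessarily the exact answer truncated above; resize2fs assumes ext2/3 on top):

```shell
# Let each member use its full new capacity; the array grows to match:
mdadm --grow /dev/md0 --size=max
# Then enlarge the filesystem on top (other filesystems have their own tools):
resize2fs /dev/md0
```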

Re: proactive-raid-disk-replacement

2006-09-08 Thread Michael Tokarev
dean gaudet wrote: On Fri, 8 Sep 2006, Michael Tokarev wrote: Recently Dean Gaudet, in thread titled 'Feature Request/Suggestion - Drive Linking', mentioned his document, http://arctic.org/~dean/proactive-raid5-disk-replacement.txt I've read it, and have some umm.. concerns. Here's why

Re: Feature Request/Suggestion - Drive Linking

2006-09-03 Thread Michael Tokarev
Tuomas Leikola wrote: [] Here's an alternate description. On first 'unrecoverable' error, the disk is marked as FAILING, which means that a spare is immediately taken into use to replace the failing one. The disk is not kicked, and readable blocks can still be used to rebuild other blocks

spurious dots in dmesg when reconstructing arrays

2006-08-17 Thread Michael Tokarev
A long time ago I noticied pretty bad formatting of dmesg text in md array reconstruction output, but never bothered to ask. So here it goes. Example dmesg (RAID conf printout sections omitted): md: bind<sdb1> RAID1 conf printout: ..6md: syncing RAID array md1 md: minimum _guaranteed_

Re: [bug?] raid1 integrity checking is broken on 2.6.18-rc4

2006-08-12 Thread Michael Tokarev
Justin Piszcz wrote: Is there a doc for all of the options you can echo into sync_action? I'm assuming mdadm does these as well and echo is just another way to work with the array? How about the obvious, Documentation/md.txt? And no, mdadm does not perform or trigger data integrity
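The sync_action interface referred to above can be driven entirely from the shell; this is a hedged sketch with `md0` as an example array (the accepted values are documented in Documentation/md.txt):

```shell
# Show the current action (idle, resync, recover, check, repair):
cat /sys/block/md0/md/sync_action

# Start a read-only integrity scan of the whole array:
echo check > /sys/block/md0/md/sync_action

# After it finishes, see how many sectors failed to match:
cat /sys/block/md0/md/mismatch_cnt

# Abort a running scan:
echo idle > /sys/block/md0/md/sync_action
```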

Re: modifying degraded raid 1 then re-adding other members is bad

2006-08-08 Thread Michael Tokarev
Neil Brown wrote: On Tuesday August 8, [EMAIL PROTECTED] wrote: Assume I have a fully-functional raid 1 between two disks, one hot-pluggable and the other fixed. If I unplug the hot-pluggable disk and reboot, the array will come up degraded, as intended. If I then modify a lot of the data

Re: Converting Ext3 to Ext3 under RAID 1

2006-08-03 Thread Michael Tokarev
Paul Clements wrote: Is 16 blocks a large enough area? Maybe. The superblock will be between 64KB and 128KB from the end of the partition. This depends on the size of the partition: SB_LOC = PART_SIZE - 64K - (PART_SIZE & (64K-1)) So, by 16 blocks, I assume you mean 16 filesystem
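The formula above can be restated in shell arithmetic: the 0.90 superblock sits at the last 64KB-aligned boundary that leaves at least 64KB before the end of the partition, which is why its distance from the end is always between 64KB and 128KB. The partition size below is an arbitrary example:

```shell
# Example partition size in bytes:
part_size=$((1000000 * 1024))

# SB_LOC = PART_SIZE - 64K - (PART_SIZE & (64K-1)), i.e. round the
# size down to a 64KB multiple and step back one more 64KB unit:
sb_loc=$(( part_size - 65536 - (part_size & 65535) ))

echo "superblock at byte offset $sb_loc"
echo "distance from end: $(( part_size - sb_loc )) bytes"
```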

Re: let md auto-detect 128+ raid members, fix potential race condition

2006-08-01 Thread Michael Tokarev
Alexandre Oliva wrote: [] If mdadm can indeed scan all partitions to bring up all raid devices in them, like nash's raidautorun does, great. I'll give that a try, Never, ever, try to do that (again). Mdadm (or vgscan, or whatever) should NOT assemble ALL arrays found, but only those which it
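The safe alternative to assembling everything found is to name the arrays a host actually owns. A hedged example (the UUID is a placeholder, and paths vary by distribution):

```shell
# /etc/mdadm/mdadm.conf -- list this host's arrays explicitly:
#
#   DEVICE partitions
#   ARRAY /dev/md0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
#
# Then assembly only touches arrays the config names:
mdadm --assemble --scan
```

This way a foreign disk plugged in for data recovery cannot cause its arrays to be started (or half-started) by accident.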

Re: [PATCH 006 of 9] md: Remove the working_disks and failed_disks from raid5 state data.

2006-07-31 Thread Michael Tokarev
NeilBrown wrote: They are not needed. conf->failed_disks is the same as mddev->degraded By the way, `failed_disks' is more understandable than `degraded' in this context. Degraded usually refers to the state of the array in question, when failed_disks > 0. That is to say: I'd rename degraded back

Re: Grub vs Lilo

2006-07-26 Thread Michael Tokarev
Jason Lunz wrote: [EMAIL PROTECTED] said: Wondering if anyone can comment on an easy way to get grub to update all components in a raid1 array. I have a raid1 /boot with a raid10 /root and have previously used lilo with the raid-extra-boot option to install to boot sectors of all component
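The usual workaround with grub legacy (era-appropriate for this thread) is to run its setup against each member disk in turn, so that whichever disk the BIOS picks can boot. A hedged sketch; the disk names and partition numbers are examples:

```shell
# Install grub legacy's boot sector on both raid1 members, mapping
# each disk to (hd0) in turn so the embedded stage2 pointers refer
# to that same disk:
grub --batch <<EOF
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
EOF
```

This mirrors what lilo's raid-extra-boot option does automatically, as mentioned in the preview above.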

Re: Grub vs Lilo

2006-07-26 Thread Michael Tokarev
Bernd Rieke wrote: Michael Tokarev wrote on 26.07.2006 20:00: . . The thing with all this "my RAID devices works, it is really simple!" thing is: for too many people it indeed works, so they think it's a good and correct way. But it works up to the actual failure, which, in most setups

Re: [PATCH] enable auto=yes by default when using udev

2006-07-04 Thread Michael Tokarev
Neil Brown wrote: On Monday July 3, [EMAIL PROTECTED] wrote: Hello, the following patch aims at solving an issue that is confusing a lot of users. when using udev, device files are created only when devices are registered with the kernel, and md devices are registered only when started.

Re: New FAQ entry? (was IBM xSeries stop responding during RAID1 reconstruction)

2006-06-21 Thread Michael Tokarev
Niccolo Rigacci wrote: [] From the command line you can see which schedulers are supported and change it on the fly (remember to do it for each RAID disk): # cat /sys/block/hda/queue/scheduler noop [anticipatory] deadline cfq # echo cfq > /sys/block/hda/queue/scheduler Otherwise

Re: problems with raid=noautodetect

2006-05-29 Thread Michael Tokarev
Neil Brown wrote: On Friday May 26, [EMAIL PROTECTED] wrote: [] If we assume there is a list of devices provided by a (possibly default) 'DEVICE' line, then DEVICEFILTER !pattern1 !pattern2 pattern3 pattern4 could mean that any device in that list which matches pattern 1 or 2 is

Re: linear writes to raid5

2006-04-20 Thread Michael Tokarev
Neil Brown wrote: On Tuesday April 18, [EMAIL PROTECTED] wrote: [] I mean, merging bios into larger requests makes a lot of sense between a filesystem and md levels, but it makes a lot less sense to do that between md and physical (fsvo physical anyway) disks. This seems completely backwards

Re: linear writes to raid5

2006-04-18 Thread Michael Tokarev
Neil Brown wrote: [] raid5 shouldn't need to merge small requests into large requests. That is what the 'elevator' or io_scheduler algorithms are for. They already merge multiple bio's into larger 'requests'. If they aren't doing that, then something needs to be fixed. It is certainly

Re: accessing mirrired lvm on shared storage

2006-04-18 Thread Michael Tokarev
Neil Brown wrote: [] Very cool... that would be extremely nice to have. Any estimate on when you might get to this? I'm working on it, but there are lots of distractions Neil, is there anything you're NOT working on? ;) Sorry just can't resist... ;) /mjt - To unsubscribe from this

Terrible slow write speed to MegaRAID SCSI array

2006-03-18 Thread Michael Tokarev
We've installed an LSI Logic MegaRaid SCSI 320-1 card on our server (used only temporarily to move data to larger disks, but that's not the point), and measured linear write performance, just to know how much time it would take to copy our (somewhat large) data to the new array. And to my
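A linear-write measurement of the kind described above is typically taken with dd; this is a hedged sketch (file name and size are examples), with conv=fsync so the reported throughput includes flushing to the disks rather than just filling the page cache:

```shell
# Write a test file sequentially and fsync before dd reports the
# elapsed time and throughput (printed on stderr):
dd if=/dev/zero of=/tmp/mdtest.img bs=1M count=64 conv=fsync

# Clean up:
rm -f /tmp/mdtest.img
```

On a real array one would write a file several times the machine's RAM size (or write to the block device directly) so caching cannot flatter the number.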

Re: [PATCH 000 of 5] md: Introduction

2006-01-17 Thread Michael Tokarev
NeilBrown wrote: Greetings. In line with the principle of release early, following are 5 patches against md in 2.6.latest which implement reshaping of a raid5 array. By this I mean adding 1 or more drives to the array and then re-laying out all of the data. Neil, is this online
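The reshape these patches implement is driven from userspace roughly as follows (a hedged sketch; device names are examples, and this assumes a kernel and mdadm new enough to support raid5 reshape):

```shell
# Add a spare to the running array:
mdadm /dev/md0 --add /dev/sde1

# Ask md to re-lay the data across one more member; the array stays
# online while data is restriped:
mdadm --grow /dev/md0 --raid-devices=5

# Progress is visible in /proc/mdstat; once the reshape finishes,
# grow the filesystem to use the new capacity.
cat /proc/mdstat
```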
