Re: How to auto rebuild array?
Hi Rui,

> Is that a RAID1 or RAID5?

It's a RAID5, but the problem also happens on RAID1.

> Can you give the output of these commands?
>   mdadm --misc -D /dev/md3
>   mdadm --misc -E /dev/hda4
>   mdadm --misc -E /dev/hdb4
>   mdadm --misc -E /dev/hde4
>   mdadm --misc -E /dev/hdf4
>
> One other thing. Are you sure that all your raid partitions are marked 0xfd?

They are 0xfd partitions with persistent superblocks.

> Also attach your mdadm.conf/raidconf file, if you have any...

I don't use an /etc/mdadm/mdadm.conf file; the raid is started during kernel
boot with raid autodetect. That's when the 'kicking non-fresh drive' message
occurs. Once this has happened, 'mdadm --detail' no longer shows the kicked
disk in the list of array disks, and the mdadm daemon only gets the name of
the array that's degraded, not the name of the kicked device :(

As a fix I made a script that runs after reboot; it hot-adds the kicked
disks back to the array. It seems to fix the problem.

Regards,
Bart

#! /bin/bash
# Hot-add any device whose superblock points at an array it is no
# longer part of ('mismatch' in mdadm --query output).
DEVLIST=`ls /dev/hd??`
for dev in $DEVLIST; do
    result=`mdadm --query $dev | grep mismatch`
    if [ -n "$result" ]; then
        raid=/dev/`echo $result | awk 'BEGIN {FS="[ .]"} {print $9}'`
        echo $raid needs $dev added
        mdadm --add $raid $dev
    fi
done

> I have the problem that after a power failure I get the message:
>
> Jul 12 15:29:17 kernel: md: created md3
> Jul 12 15:29:17 kernel: md: bind<hda4>
> Jul 12 15:29:17 kernel: md: bind<hdb4>
> Jul 12 15:29:17 kernel: md: bind<hde4>
> Jul 12 15:29:17 kernel: md: bind<hdf4>
> Jul 12 15:29:17 kernel: md: running: <hdf4><hde4><hdb4><hda4>
> Jul 12 15:29:17 kernel: md: kicking non-fresh hde4 from array!
> Jul 12 15:29:17 kernel: md: unbind<hde4>
>
> I understand that hde4 is not 'fresh' and the array needs to be rebuilt,
> but I can only do that with 'mdadm --add /dev/md3 /dev/hde4'. I would
> like to have it turned into a hot-spare, in which case a rebuild would
> start automatically. This application runs unattended, so there is
> nobody there to enter mdadm commands. How can I make the rebuild start
> automatically (like a hardware raid card does)?
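For the unattended-rebuild question: with a hot spare actually present in the
array, md starts reconstruction by itself; what mdadm adds on top is monitor
mode, which mails on events and, given a spare-group, moves a spare from a
healthy array to a degraded one. It will not re-add a kicked non-fresh disk -
that is what the script above works around. A minimal sketch (the UUID is a
placeholder; handle-md-event is a hypothetical notification script):

    # /etc/mdadm/mdadm.conf
    DEVICE /dev/hd*
    ARRAY /dev/md3 UUID=... spare-group=local
    MAILADDR root
    PROGRAM /usr/local/sbin/handle-md-event

    # started once at boot:
    mdadm --monitor --scan --daemonise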
Re: disk failed, operator error: Now can't use RAID
On Wednesday July 13, [EMAIL PROTECTED] wrote:
> I would very much appreciate suggestions on how to get the raid
> running again.

Remove the
  devices=/dev/hde1,/dev/sdd1,/dev/sdc1,/dev/sdb1,/dev/sda1
line from mdadm.conf (it is wrong and unneeded). Then

  mdadm -S /dev/md0    # just to be sure
  mdadm -A /dev/md0 -f /dev/sd[abcd]1 /dev/hd[eg]1

and see if that works.

NeilBrown
Re: Oops when starting md multipath on a 2.4 kernel
Mike Tran wrote:
> James Pearson wrote:
>> We have an existing system running a 2.4.27-based kernel that uses md
>> multipath and external fibre channel arrays. We need to add more
>> internal disks to the system, which means the external drives change
>> device names. When I tried to start the md multipath device using
>> mdadm, the kernel Oops'd. Removing the new internal disks and going
>> back to the original setup, I can start the multipath device - as this
>> machine is in production, I can't do any more tests.
>>
>> However, I can reproduce the problem on a test system by creating an
>> md multipath device on an external SCSI disk, using /dev/sda1,
>> stopping the multipath device, rmmod'ing the SCSI driver, plugging in
>> a couple of USB storage devices which become /dev/sda and /dev/sdb,
>> and then modprobing the SCSI driver, so the original /dev/sda1 is now
>> /dev/sdc1. When I run 'mdadm -A -s', I get the following Oops:
>>
>> [events: 0004]
>> md: bind<sdc1,1>
>> md: sdc1's event counter: 0004
>> md0: former device sda1 is unavailable, removing from array!
>> md: unbind<sdc1,0>
>> md: export_rdev(sdc1)
>> md: RAID level -4 does not need chunksize! Continuing anyway.
>> md: multipath personality registered as nr 7
>> md0: max total readahead window set to 124k
>> md0: 1 data-disks, max readahead per data-disk: 124k
>> Unable to handle kernel NULL pointer dereference at virtual address 0040
>> printing eip: e096527e
>> *pde =
>> Oops:
>> CPU:    0
>> EIP:    0010:[<e096527e>]    Not tainted
>> EFLAGS: 00010246
>> eax: deb62a94   ebx:            ecx: dd65b400   edx:
>> esi: 001c       edi: deb62a94   ebp:            esp: dd5fbdbc
>> ds: 0018   es: 0018   ss: 0018
>> Process mdadm (pid: 1389, stackpage=dd5fb000)
>> Stack: dd4c4000 dfa96000 c035ad00 0286 dd4c4000 deb62a94 dd5fbe5c dd4c6000
>>        c02a6e10 dd65b400 c035ef1f 007c 000a 0002 2e2e c0118b49
>>        2e2e 2e2e 0286
>> Call Trace: [c02a6e10] [c0118b49] [c0118cc4] [c024a88c] [c024abb6]
>>        [c0118cc4] [c024907e] [c024b6f2] [c024c60c] [c014a326] [c013c483]
>>        [c013ca18] [c01375ac] [c013ca63] [c01439b6] [c01087c7]
>> Code: 8b 45 40 85 c0 0f 84 c2 01 00 00 6a 00 ff b4 24 cc 00 00 00
>>
>> Running it through ksymoops (which echoes the same oops; using
>> defaults from ksymoops -t elf32-i386 -a i386) gives:
>>
>> EIP; e096527e <[multipath]multipath_run+2be/6c0>   <=====
>> Trace; c02a6e10 <vsnprintf+2e0/450>
>> Trace; c0118b49 <call_console_drivers+e9/f0>
>> Trace; c0118cc4 <printk+104/110>
>> Trace; c024a88c <device_size_calculation+19c/1f0>
>> Trace; c024abb6 <do_md_run+2d6/360>
>> Trace; c0118cc4 <printk+104/110>
>> Trace; c024907e <bind_rdev_to_array+9e/b0>
>> Trace; c024b6f2 <add_new_disk+132/290>
>> Trace; c024c60c <md_ioctl+6fc/790>
>> Trace; c014a326 <iput+236/240>
>> Trace; c013c483 <bdput+93/a0>
>> Trace; c013ca18 <blkdev_put+98/a0>
>> Trace; c01375ac <fput+bc/e0>
>> Trace; c013ca63 <blkdev_ioctl+23/30>
>> Trace; c01439b6 <sys_ioctl+216/230>
>> Trace; c01087c7 <system_call+33/38>
>>
>> Code;  e096527e <[multipath]multipath_run+2be/6c0>
>> <_EIP>:
>> Code;  e096527e <[multipath]multipath_run+2be/6c0>   <=====
>>    0:   8b 45 40                  mov    0x40(%ebp),%eax   <=====
>> Code;  e0965281 <[multipath]multipath_run+2c1/6c0>
>>    3:   85 c0                     test   %eax,%eax
>> Code;  e0965283 <[multipath]multipath_run+2c3/6c0>
>>    5:   0f 84 c2 01 00 00         je     1cd <_EIP+0x1cd> e096544b <[multipath]multipath_run+48b/6c0>
>> Code;  e0965289 <[multipath]multipath_run+2c9/6c0>
>>    b:   6a 00                     push   $0x0
>> Code;  e096528b <[multipath]multipath_run+2cb/6c0>
>>    d:   ff b4 24 cc 00 00 00      pushl  0xcc(%esp,1)
>>
>> My /etc/mdadm.conf contains:
>>
>> DEVICE /dev/sd?1
>> ARRAY /dev/md0 level=multipath num-devices=1 UUID=277e4ba5:6c23c087:e17c877c:da642955
>>
>> Should md multipath be able to handle changes like this with the
>> underlying devices?
>>
>> Thanks
>>
>> James Pearson
>
> Hi James,
>
> My co-worker and I just happened to run into this problem a few days
> ago, so I would like to share with you what we know. The device
> major/minor numbers no longer match the values recorded in the
> descriptor array in the md superblock. Because of the exception made
> in the current code, the descriptor entries are removed and
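For what it's worth, assembly can be made independent of device names by
letting mdadm scan every partition the kernel knows about and match arrays
by superblock UUID; a sketch, reusing the UUID from James's config above
(whether this avoids the 2.4 multipath oops is a separate question - the
crash happens after the stale device has already been dropped):

    # /etc/mdadm.conf
    DEVICE partitions
    ARRAY /dev/md0 level=multipath num-devices=1 UUID=277e4ba5:6c23c087:e17c877c:da642955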
Re: Oops when starting md multipath on a 2.4 kernel
On 2005-07-14T11:09:32, James Pearson [EMAIL PROTECTED] wrote:
> Thanks - that patch applies OK to more recent 2.4 kernels and appears
> to 'fix' this problem. However, if you have a cut down patch that
> fixes just this problem, then I would appreciate it if you could make
> it available.

There's a bugfix needed for 2.4 md multipath which prevents guaranteed
data corruption on failover too. I don't have time to redo the diffs
against 2.4 proper, but

-	bh->b_rdev = bh->b_dev;
-	bh->b_rsector = bh->b_blocknr;

are probably the two most important changes to multipath.c:multipathd().
The patch in the SLES8 2.4 kernel is
patches.common/md-multipath-retry-handling - there's also some locking
fixes etc in there.

The problem is that our kernel has deviated so much from 2.4, and active
development is now focused on DM mpath in 2.6, that pulling out smaller
chunks and feeding them upstream on 2.4 just isn't worth it :-(

Sincerely,
    Lars Marowsky-Brée [EMAIL PROTECTED]

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

"Ignorance more frequently begets confidence than does knowledge"
    -- Charles Darwin
Re: RAID-5 streaming read performance
On Wed, 2005-07-13 at 23:58 -0400, Dan Christensen wrote:
> David Greaves [EMAIL PROTECTED] writes:
>> In my setup I get
>>   component partitions, e.g. /dev/sda7:  39MB/s
>>   raid device /dev/md2:                  31MB/s
>>   lvm device /dev/main/media:            53MB/s
>> (oldish system - but note that the lvm device is *much* faster)
>>
>> Did you test component device and raid device speed using the
>> read-ahead settings tuned for lvm reads? If so, that's not a fair
>> comparison. :-)
>>
>> For your entertainment you may like to try this to 'tune' your
>> readahead - it's OK to use so long as you're not recording:
>
> Thanks, I played around with that a lot. I tuned readahead to optimize
> lvm device reads, and this improved things greatly. It turns out the
> default lvm settings had readahead set to 0! By tuning things, I could
> get my read speed up to 59MB/s. This is with raw device readahead 256,
> md device readahead 1024 and lvm readahead 2048. (The speed was most
> sensitive to the last one, but did seem to depend on the other ones a
> bit too.)
>
> I separately tuned the raid device read speed. To maximize this, I
> needed to set the raw device readahead to 1024 and the raid device
> readahead to 4096. This brought my raid read speed from 59MB/s to
> 78MB/s. Better! (But note that this now makes the lvm read speed look
> bad.) My raw device read speed is independent of the readahead
> setting, as long as it is at least 256. The speed is about 58MB/s.
>
> Summary:
>   raw device:  58MB/s
>   raid device: 78MB/s
>   lvm device:  59MB/s
>
> raid still isn't achieving the 106MB/s that I can get with parallel
> direct reads, but at least it's getting closer.
>
> As a simple test, I wrote a program like dd that reads and discards
> 64k chunks of data from a device, but which skips 1 out of every four
> chunks (simulating skipping parity blocks). It's not surprising that
> this program can only read from a raw device at about 75% of the rate
> of dd, since kernel readahead is probably causing the skipped blocks
> to be read anyway (or maybe because the disk head has to pass over
> those sections of the disk anyway).
>
> I then ran four copies of this program in parallel, reading from the
> raw devices that make up my raid partition. And, like md, they only
> achieved about 78MB/s. This is very close to 75% of 106MB/s. Again,
> not surprising, since I need to have raw device readahead turned on
> for this to be efficient at all, so 25% of the chunks that pass
> through the controller are ignored.
>
> But I still don't understand why the md layer can't do better. If I
> turn off readahead on the raw devices, and keep it for the raid
> device, then parity blocks should never be requested, so they
> shouldn't use any bus/controller bandwidth. And even if each drive is
> only working at 75% efficiency, the four drives should still be able
> to saturate the bus/controller. So I can't figure out what's going on
> here.

On reads, I don't think MD will read the parity at all; but since the
parity is spread across all the disks, there might be a seek involved.
You might want to try raid4 to see what happens as well.

> Is there a way for me to simulate readahead in userspace, i.e. can I
> do lots of sequential asynchronous reads in parallel?
>
> Also, is there a way to disable caching of reads? Having to clear the
> cache by reading 900M each time slows down testing. I guess I could
> reboot with mem=100M, but it'd be nice to disable/enable caching on
> the fly. Hmm, maybe I can just run something like memtest which locks
> a bunch of ram...

After you run your code, check /proc/meminfo - the cached value might be
much lower than you expected. My feeling is that the Linux page cache
discards all cached pages for a file once its last file handle is
closed.

> Thanks for all of the help so far!
>
> Dan
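For reference, the readahead settings discussed above are per block device
and can be set with blockdev; a sketch using Dan's final raid-tuning values
(device names borrowed from David's example above):

    # readahead is given in 512-byte sectors
    blockdev --setra 1024 /dev/sda    # each raw component disk
    blockdev --setra 4096 /dev/md2    # the md array itself
    blockdev --getra /dev/md2         # verify the setting took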
Re: RAID-5 streaming read performance
My problem here - this only applies to sdX, not mdX. Please ignore this.

Ming

On Thu, 2005-07-14 at 08:30 -0400, Ming Zhang wrote:
>> Also, is there a way to disable caching of reads? Having to clear the
>> cache by reading 900M each time slows down testing. I guess I could
>> reboot with mem=100M, but it'd be nice to disable/enable caching on
>> the fly. Hmm, maybe I can just run something like memtest which locks
>> a bunch of ram...
>
> after you run your code, check /proc/meminfo - the cached value might
> be much lower than you expected. my feeling is that the linux page
> cache will discard all cached pages once the last file handle is
> closed.
Re: RAID-5 streaming read performance
Mark Hahn [EMAIL PROTECTED] writes:

>> Is there a way for me to simulate readahead in userspace, i.e. can I
>> do lots of sequential asynchronous reads in parallel?
>
> there is async IO, but I don't think this is going to help you much.
>
>> Also, is there a way to disable caching of reads? Having to clear
>
> yes: O_DIRECT.

That might disable caching of reads, but it also disables readahead, so
unless I manually use aio to simulate readahead, this isn't going to
solve my problem, which is having to clear the cache before each test
to get relevant results.

I'm really surprised there isn't something in /proc you can use to
clear or disable the cache. Would be very useful for benchmarking!

Dan
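For reference, a minimal sketch of the O_DIRECT approach Mark mentions
(illustrative only; O_DIRECT requires aligned buffers, and the exact
alignment requirement varies by kernel and filesystem):

    /* read a device with O_DIRECT so the page cache is bypassed */
    #define _GNU_SOURCE         /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        const size_t bufsize = 64 * 1024;   /* 64k chunks, as in Dan's test */
        void *buf;
        ssize_t n;
        long long total = 0;

        if (argc < 2) { fprintf(stderr, "usage: %s <device>\n", argv[0]); return 1; }

        int fd = open(argv[1], O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        /* 4096-byte alignment covers common sector and page sizes */
        if (posix_memalign(&buf, 4096, bufsize)) {
            fprintf(stderr, "posix_memalign failed\n");
            return 1;
        }

        while ((n = read(fd, buf, bufsize)) > 0)
            total += n;                     /* discard the data; only measuring */

        printf("read %lld bytes, uncached\n", total);
        free(buf);
        close(fd);
        return 0;
    }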
Re: disk failed, operator error: Now can't use RAID
On 7/14/05, Neil Brown [EMAIL PROTECTED] wrote:
> On Wednesday July 13, [EMAIL PROTECTED] wrote:
>> I would very much appreciate suggestions on how to get the raid
>> running again.
>
> Remove the
>   devices=/dev/hde1,/dev/sdd1,/dev/sdc1,/dev/sdb1,/dev/sda1
> line from mdadm.conf (it is wrong and unneeded). Then
>
>   mdadm -S /dev/md0    # just to be sure
>   mdadm -A /dev/md0 -f /dev/sd[abcd]1 /dev/hd[eg]1
>
> and see if that works.

Yes, thanks! Results are:

oak:~# mdadm -S /dev/md0
oak:~# mdadm -A /dev/md0 -f /dev/sd[abcd]1 /dev/hd[eg]1
mdadm: forcing event count in /dev/sda1(0) from 1271893 upto 2816178
mdadm: /dev/md0 has been started with 4 drives (out of 5) and 1 spare.
oak:~# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sda1[0] hde1[5] sdd1[3] sdc1[2] sdb1[1]
      781433344 blocks level 5, 32k chunk, algorithm 2 [5/4] [UUUU_]
      [>....................]  recovery =  0.1% (389320/195358336) finish=280.4min speed=11585K/sec
unused devices: <none>
oak:~#

Now... after this is through rebuilding, I need to replace the failed
drive (creating one partition and setting it to type 0xFD, Linux raid
autodetect). What's the best way to get this in service with one drive
as a spare? Can I convert my current spare (/dev/hde1) to a regular
disk and add the new disk as a spare? Or should I add the new disk as
an active drive - and if so, will it be rebuilt and the spare
(/dev/hde1) be relegated back to being a spare?

thanks again,

hank

--
Beautiful Sunny Winfield, Illinois
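For reference, the usual behaviour here (a sketch, not a guarantee for this
particular setup): once the recovery finishes, hde1 stays in as a full
member and the array shows [5/5]; a disk added to an already-complete array
simply becomes a spare. /dev/hdg1 below is a hypothetical name for whatever
the replacement disk's partition ends up being called:

    mdadm --add /dev/md0 /dev/hdg1   # array already complete, so this joins as a spare
    mdadm --detail /dev/md0          # should now report 5 active devices, 1 spare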
Re: RAID-5 streaming read performance
> i also want a way to clear part of the whole page cache by file id. :)

understandably, kernel developers don't prioritize this sort of
not-useful-for-normal-work feature.

> i also want a way to tell the cache distribution, how many for file A
> and B,

you should probably try mmaping the file and using mincore. come to
think of it, mmap+madvise might be a sensible way to flush pages
corresponding to a particular file, as well.

> I'm really surprised there isn't something in /proc you can use to
> clear or disable the cache. Would be very useful for benchmarking!

I assume you noticed blockdev --flushbufs, no? it works for me (ie, a
small, repeated streaming read of a disk device will show pagecache
speed). I think the problem is that it's difficult to dissociate
readahead, writebehind and normal lru-ish caching. there was quite a
flurry of activity around 2.4.10 related to this, and it left a bad
taste in everyone's mouth. I think the main conclusion was that too
much fanciness results in a fragile, more subtle and
difficult-to-maintain system that performs better, true, but over a
narrower range of workloads.

regards, mark hahn
sharcnet/mcmaster.
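Mark's mmap+mincore suggestion looks roughly like this; a sketch (minimal
error handling) that reports how much of a given file is resident in the
page cache. Flushing would be the same mapping plus
madvise(map, len, MADV_DONTNEED), though whether that actually drops the
backing page-cache pages depends on the kernel:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

        int fd = open(argv[1], O_RDONLY);
        struct stat st;
        if (fd < 0 || fstat(fd, &st) < 0 || st.st_size == 0) {
            perror(argv[1]);
            return 1;
        }

        long pagesize = sysconf(_SC_PAGESIZE);
        size_t pages = (st.st_size + pagesize - 1) / pagesize;

        void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }

        unsigned char *vec = malloc(pages);
        if (!vec || mincore(map, st.st_size, vec) < 0) { perror("mincore"); return 1; }

        size_t resident = 0;
        for (size_t i = 0; i < pages; i++)
            resident += vec[i] & 1;      /* low bit set => page is resident */

        printf("%zu of %zu pages resident in page cache\n", resident, pages);

        munmap(map, st.st_size);
        free(vec);
        close(fd);
        return 0;
    }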
Raid5 Failure
Hello -

I'm currently stuck in a moderately awkward predicament. I have a
28-disk software RAID5; at the time I created it I was using EVMS -
this was because mdadm 1.x didn't support superblock v1 and mdadm 2.x
wouldn't compile on my system. Everything was working great until I hit
an unusual kernel error:

Jun 20 02:55:07 abyss last message repeated 33 times
Jun 20 02:55:07 abyss kernel: KERNEL: assertion (flags & MSG_PEEK) failed at net/ipv4/tcp.c (1294)

I used to get this error randomly; a reboot would resolve it - the
final fix was to update the kernel. The reason I even noticed the error
this time was that I was attempting to access my RAID and some of the
data wouldn't come up. I did a cat /proc/mdstat and it said 13 of the
28 devices were failed. I checked /var/log/kernel and the above message
was spamming the log repeatedly. Upon reboot, I fired up EVMSGui to
remount the raid - and I received the following error messages:

Jul 14 20:17:46 abyss _3_ Engine: engine_ioctl_object: ioctl to object md/md0 failed with error code 19: No such device
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sda is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdb is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdc is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdd is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sde is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdf is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdg is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdh is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdi is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdj is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdk is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdl is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Object sdm is out of date.
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_analyze_volume: Found 13 stale objects in region md/md0.
Jul 14 20:17:47 abyss _0_ MDRaid5RegMgr: sb1_analyze_sb: MD region md/md0 is corrupt
Jul 14 20:17:47 abyss _3_ MDRaid5RegMgr: md_fix_dev_major_minor: MD region md/md0 is corrupt.
Jul 14 20:17:47 abyss _0_ Engine: plugin_user_message: Message is: MDRaid5RegMgr: Region md/md0 : MD superblocks found in object(s) [sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm ] are not valid. [sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm ] will not be activated and should be removed from the region.
Jul 14 20:17:47 abyss _0_ Engine: plugin_user_message: Message is: MDRaid5RegMgr: RAID5 region md/md0 is corrupt. The number of raid disks for a fully functional array is 28. The number of active disks is 15.
Jul 14 20:17:47 abyss _2_ MDRaid5RegMgr: raid5_read: MD Object md/md0 is corrupt, data is suspect
Jul 14 20:17:47 abyss _2_ MDRaid5RegMgr: raid5_read: MD Object md/md0 is corrupt, data is suspect

I realize this is not the EVMS mailing list; I tried there earlier,
with no success in resolving this issue (I've been swamped at work).
Today I tried mdadm 2.0-devel-2. It compiled w/o issue. I did a
mdadm --misc -Q /dev/sdm:

-([EMAIL PROTECTED])-(~/mdadm-2.0-devel-2)-
# ./mdadm --misc -Q /dev/sdm
/dev/sdm: is not an md array
/dev/sdm: device 134639616 in 28 device undetected raid5 md-1. Use mdadm --examine for more detail.

-([EMAIL PROTECTED])-(~/mdadm-2.0-devel-2)-
# ./mdadm --misc -E /dev/sdm
/dev/sdm:
          Magic : a92b4efc
        Version : 01.00
     Array UUID : 4e2b6b0a8e:92e91c0c:018a4bf0:9bb74d
           Name : md/md0
  Creation Time : Wed Dec 31 19:00:00 1969
     Raid Level : raid5
   Raid Devices : 28
    Device Size : 143374592 (68.37 GiB 73.41 GB)
   Super Offset : 143374632 sectors
          State : clean
    Device UUID : 4e2b6b0a8e:92e91c0c:018a4bf0:9bb74d
    Update Time : Sun Jun 19 14:49:52 2005
       Checksum : 296bf133 - correct
         Events : 172758
         Layout : left-asymmetric
     Chunk Size : 128K
    Array State : Uuuu

After which, I checked on /dev/sdn.

-([EMAIL PROTECTED])-(~/mdadm-2.0-devel-2)-
# ./mdadm --misc -Q /dev/sdn
/dev/sdn: is not an md array
/dev/sdn: device 134639616 in 28 device undetected raid5 md-1. Use mdadm --examine for more detail.

-([EMAIL PROTECTED])-(~/mdadm-2.0-devel-2)-
# ./mdadm --misc -E /dev/sdn
/dev/sdn:
          Magic : a92b4efc
        Version : 01.00
     Array UUID : 4e2b6b0a8e:92e91c0c:018a4bf0:9bb74d
           Name : md/md0
  Creation Time : Wed Dec 31 19:00:00 1969
     Raid Level : raid5
   Raid Devices : 28
    Device Size :
Re: Raid5 Failure
On Thursday July 14, [EMAIL PROTECTED] wrote:
> It looks like the first 'segment of discs' sda-sdm are all marked
> clean; while sdn-sdab are marked active. What can I do to resolve this
> issue? Any assistance would be greatly appreciated.

Apply the following patch to mdadm-2.0-devel2 (it fixes a few bugs and
particularly makes --assemble work) then try:

  mdadm -A /dev/md0 /dev/sd[a-z] /dev/sda[ab]

Just list all 28 SCSI devices; I'm not sure what their names are.

This will quite probably fail. If it does, try again with --force.

NeilBrown

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./Assemble.c |   13 ++++++++++++-
 ./Query.c    |   33 +++++++++++++++++++--------------
 ./mdadm.h    |    2 +-
 ./super0.c   |    1 +
 ./super1.c   |    4 ++--
 5 files changed, 35 insertions(+), 18 deletions(-)

diff ./Assemble.c~current~ ./Assemble.c
--- ./Assemble.c~current~	2005-07-15 10:13:04.000000000 +1000
+++ ./Assemble.c	2005-07-15 10:37:59.000000000 +1000
@@ -473,6 +473,7 @@ int Assemble(struct supertype *st, char
 		if (!devices[j].uptodate)
 			continue;
 		info.disk.number = i;
+		info.disk.raid_disk = i;
 		info.disk.state = desired_state;

 		if (devices[j].uptodate &&
@@ -526,7 +527,17 @@ int Assemble(struct supertype *st, char

 	/* Almost ready to actually *do* something */
 	if (!old_linux) {
-		if (ioctl(mdfd, SET_ARRAY_INFO, NULL) != 0) {
+		int rv;
+		if ((vers % 100) >= 1) { /* can use different versions */
+			mdu_array_info_t inf;
+			memset(&inf, 0, sizeof(inf));
+			inf.major_version = st->ss->major;
+			inf.minor_version = st->minor_version;
+			rv = ioctl(mdfd, SET_ARRAY_INFO, &inf);
+		} else
+			rv = ioctl(mdfd, SET_ARRAY_INFO, NULL);
+
+		if (rv) {
 			fprintf(stderr, Name ": SET_ARRAY_INFO failed for %s: %s\n",
 				mddev, strerror(errno));
 			return 1;

diff ./Query.c~current~ ./Query.c
--- ./Query.c~current~	2005-07-07 09:19:53.000000000 +1000
+++ ./Query.c	2005-07-15 11:38:18.000000000 +1000
@@ -105,26 +105,31 @@ int Query(char *dev)
 	if (superror == 0) {
 		/* array might be active... */
 		st->ss->getinfo_super(&info, super);
-		mddev = get_md_name(info.array.md_minor);
-		disc.number = info.disk.number;
-		activity = "undetected";
-		if (mddev && (fd = open(mddev, O_RDONLY)) >= 0) {
-			if (md_get_version(fd) >= 9000 &&
-			    ioctl(fd, GET_ARRAY_INFO, &array) >= 0) {
-				if (ioctl(fd, GET_DISK_INFO, &disc) >= 0 &&
-				    makedev((unsigned)disc.major,(unsigned)disc.minor) == stb.st_rdev)
-					activity = "active";
-				else
-					activity = "mismatch";
+		if (st->ss->major == 0) {
+			mddev = get_md_name(info.array.md_minor);
+			disc.number = info.disk.number;
+			activity = "undetected";
+			if (mddev && (fd = open(mddev, O_RDONLY)) >= 0) {
+				if (md_get_version(fd) >= 9000 &&
+				    ioctl(fd, GET_ARRAY_INFO, &array) >= 0) {
+					if (ioctl(fd, GET_DISK_INFO, &disc) >= 0 &&
+					    makedev((unsigned)disc.major,(unsigned)disc.minor) == stb.st_rdev)
+						activity = "active";
+					else
+						activity = "mismatch";
+				}
+				close(fd);
 			}
-			close(fd);
+		} else {
+			activity = "unknown";
+			mddev = "array";
 		}
-		printf("%s: device %d in %d device %s %s md%d.  Use mdadm --examine for more detail.\n",
+		printf("%s: device %d in %d device %s %s %s.  Use mdadm --examine for more detail.\n",
 		       dev, info.disk.number,
 		       info.array.raid_disks,
 		       activity,
 		       map_num(pers, info.array.level),
-		       info.array.md_minor);
+		       mddev);
 	}
 	return 0;
 }

diff ./mdadm.h~current~ ./mdadm.h
--- ./mdadm.h~current~	2005-07-07 09:19:53.000000000 +1000
+++ ./mdadm.h	2005-07-15 10:15:51.000000000 +1000
@@ -73,7 +73,7 @@ struct mdinfo {
 	mdu_array_info_t	array;
 	mdu_disk_info_t		disk;
Re: Re[2]: Bugreport mdadm-2.0-devel-1
On Saturday July 9, [EMAIL PROTECTED] wrote:
> On Thursday July 7, [EMAIL PROTECTED] wrote:
>> Hi Neil!
>> Thanks much for your help, array creation using devel-2 just works;
>> however, the array can't be assembled again after it's stopped :(
>
> Hmm, yeh, nor it can :-(
> I'm not sure when I'll have time to look at this (I'm on leave at the
> moment with family visiting and such) but I'll definitely get back to
> you by Thursday if not before.

Sorry for the delay. The following patch against -devel2 should fix
these problems. If (when?) you get more, please let me know.

NeilBrown

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./Assemble.c |   13 ++++++++++++-
 ./Query.c    |   33 +++++++++++++++++++--------------
 ./mdadm.h    |    2 +-
 ./super0.c   |    1 +
 ./super1.c   |    4 ++--
 5 files changed, 35 insertions(+), 18 deletions(-)

diff ./Assemble.c~current~ ./Assemble.c
--- ./Assemble.c~current~	2005-07-15 10:13:04.000000000 +1000
+++ ./Assemble.c	2005-07-15 10:37:59.000000000 +1000
@@ -473,6 +473,7 @@ int Assemble(struct supertype *st, char
 		if (!devices[j].uptodate)
 			continue;
 		info.disk.number = i;
+		info.disk.raid_disk = i;
 		info.disk.state = desired_state;

 		if (devices[j].uptodate &&
@@ -526,7 +527,17 @@ int Assemble(struct supertype *st, char

 	/* Almost ready to actually *do* something */
 	if (!old_linux) {
-		if (ioctl(mdfd, SET_ARRAY_INFO, NULL) != 0) {
+		int rv;
+		if ((vers % 100) >= 1) { /* can use different versions */
+			mdu_array_info_t inf;
+			memset(&inf, 0, sizeof(inf));
+			inf.major_version = st->ss->major;
+			inf.minor_version = st->minor_version;
+			rv = ioctl(mdfd, SET_ARRAY_INFO, &inf);
+		} else
+			rv = ioctl(mdfd, SET_ARRAY_INFO, NULL);
+
+		if (rv) {
 			fprintf(stderr, Name ": SET_ARRAY_INFO failed for %s: %s\n",
 				mddev, strerror(errno));
 			return 1;

diff ./Query.c~current~ ./Query.c
--- ./Query.c~current~	2005-07-07 09:19:53.0000000000 +1000
+++ ./Query.c	2005-07-15 11:38:18.000000000 +1000
@@ -105,26 +105,31 @@ int Query(char *dev)
 	if (superror == 0) {
 		/* array might be active... */
 		st->ss->getinfo_super(&info, super);
-		mddev = get_md_name(info.array.md_minor);
-		disc.number = info.disk.number;
-		activity = "undetected";
-		if (mddev && (fd = open(mddev, O_RDONLY)) >= 0) {
-			if (md_get_version(fd) >= 9000 &&
-			    ioctl(fd, GET_ARRAY_INFO, &array) >= 0) {
-				if (ioctl(fd, GET_DISK_INFO, &disc) >= 0 &&
-				    makedev((unsigned)disc.major,(unsigned)disc.minor) == stb.st_rdev)
-					activity = "active";
-				else
-					activity = "mismatch";
+		if (st->ss->major == 0) {
+			mddev = get_md_name(info.array.md_minor);
+			disc.number = info.disk.number;
+			activity = "undetected";
+			if (mddev && (fd = open(mddev, O_RDONLY)) >= 0) {
+				if (md_get_version(fd) >= 9000 &&
+				    ioctl(fd, GET_ARRAY_INFO, &array) >= 0) {
+					if (ioctl(fd, GET_DISK_INFO, &disc) >= 0 &&
+					    makedev((unsigned)disc.major,(unsigned)disc.minor) == stb.st_rdev)
+						activity = "active";
+					else
+						activity = "mismatch";
+				}
+				close(fd);
 			}
-			close(fd);
+		} else {
+			activity = "unknown";
+			mddev = "array";
 		}
-		printf("%s: device %d in %d device %s %s md%d.  Use mdadm --examine for more detail.\n",
+		printf("%s: device %d in %d device %s %s %s.  Use mdadm --examine for more detail.\n",
 		       dev, info.disk.number,
 		       info.array.raid_disks,
 		       activity,
 		       map_num(pers, info.array.level),
-		       info.array.md_minor);
+		       mddev);
 	}
 	return 0;
 }

diff ./mdadm.h~current~ ./mdadm.h
--- ./mdadm.h~current~	2005-07-07 09:19:53.000000000 +1000
+++ ./mdadm.h	2005-07-15 10:15:51.000000000 +1000
@@ -73,7 +73,7 @@ struct mdinfo {
 	mdu_array_info_t	array;
Re: RAID-5 streaming read performance
Ming Zhang [EMAIL PROTECTED] writes:

> On Thu, 2005-07-14 at 19:29 -0400, Mark Hahn wrote:
>>> i also want a way to clear part of the whole page cache by file id. :)
>>
>> understandably, kernel developers don't prioritize this sort of
>> not-useful-for-normal-work feature.
>
> agree.

Clearing just part of the page cache sounds too complicated to be worth
it, but clearing it all seems reasonable; some kernel developers spend
time doing benchmarks too!

>> Dan Christensen wrote:
>>> I'm really surprised there isn't something in /proc you can use to
>>> clear or disable the cache. Would be very useful for benchmarking!
>>
>> I assume you noticed blockdev --flushbufs, no? it works for me

I had tried this and noticed that it didn't work for files on a
filesystem. But it does seem to work for block devices. That's great,
thanks. I didn't realize the cache was so complicated; it can be
retained for files but not for the block device underlying those files!

> a test i did shows that even if you have sda and sdb forming a raid0,
> the page cache for sda and sdb will not be used by raid0. kind of
> funny.

I thought I had noticed raid devices making use of cache from
underlying devices, but a test I just did agrees with your result, for
both RAID-1 and RAID-5. Again, this seems odd - shouldn't the raid
layer take advantage of a block that's already in RAM?

I guess this won't matter in practice, since you usually don't read
from both a raid device and an underlying device.

Dan
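The test being described looks something like this (a sketch; device names
are illustrative, and blockdev --flushbufs starts each run cold, as Mark
suggested):

    blockdev --flushbufs /dev/sda
    blockdev --flushbufs /dev/md0
    time dd if=/dev/sda of=/dev/null bs=1M count=256   # warms sda's page cache
    time dd if=/dev/md0 of=/dev/null bs=1M count=256   # still runs at disk speed:
                                                       # md does not reuse sda's cached pages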
Re: Raid5 Failure
Neil -

You are the man; the array went w/o force - and is rebuilding now!

-([EMAIL PROTECTED])-(/)-
# mdadm --detail /dev/md0
/dev/md0:
        Version : 01.00.01
  Creation Time : Wed Dec 31 19:00:00 1969
     Raid Level : raid5
     Array Size : 1935556992 (1845.89 GiB 1982.01 GB)
    Device Size : 71687296 (68.37 GiB 73.41 GB)
   Raid Devices : 28
  Total Devices : 28
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Thu Jul 14 22:07:18 2005
          State : active, resyncing
 Active Devices : 28
Working Devices : 28
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-asymmetric
     Chunk Size : 128K
 Rebuild Status : 0% complete
           UUID : 4e2b6b0a8e:92e91c0c:018a4bf0:9bb74d
         Events : 172760

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/evms/.nodes/sda
       1       8       16        1      active sync   /dev/evms/.nodes/sdb
       2       8       32        2      active sync   /dev/evms/.nodes/sdc
       3       8       48        3      active sync   /dev/evms/.nodes/sdd
       4       8       64        4      active sync   /dev/evms/.nodes/sde
       5       8       80        5      active sync   /dev/evms/.nodes/sdf
       6       8       96        6      active sync   /dev/evms/.nodes/sdg
       7       8      112        7      active sync   /dev/evms/.nodes/sdh
       8       8      128        8      active sync   /dev/evms/.nodes/sdi
       9       8      144        9      active sync   /dev/evms/.nodes/sdj
      10       8      160       10      active sync   /dev/evms/.nodes/sdk
      11       8      176       11      active sync   /dev/evms/.nodes/sdl
      12       8      192       12      active sync   /dev/evms/.nodes/sdm
      13       8      208       13      active sync   /dev/evms/.nodes/sdn
      14       8      224       14      active sync   /dev/evms/.nodes/sdo
      15       8      240       15      active sync   /dev/evms/.nodes/sdp
      16      65        0       16      active sync   /dev/evms/.nodes/sdq
      17      65       16       17      active sync   /dev/evms/.nodes/sdr
      18      65       32       18      active sync   /dev/evms/.nodes/sds
      19      65       48       19      active sync   /dev/evms/.nodes/sdt
      20      65       64       20      active sync   /dev/evms/.nodes/sdu
      21      65       80       21      active sync   /dev/evms/.nodes/sdv
      22      65       96       22      active sync   /dev/evms/.nodes/sdw
      23      65      112       23      active sync   /dev/evms/.nodes/sdx
      24      65      128       24      active sync   /dev/evms/.nodes/sdy
      25      65      144       25      active sync   /dev/evms/.nodes/sdz
      26      65      160       26      active sync   /dev/evms/.nodes/sdaa
      27      65      176       27      active sync   /dev/evms/.nodes/sdab

--
David M. Strang

----- Original Message -----
From: Neil Brown
To: David M. Strang
Cc: linux-raid@vger.kernel.org
Sent: Thursday, July 14, 2005 9:43 PM
Subject: Re: Raid5 Failure

On Thursday July 14, [EMAIL PROTECTED] wrote:
> It looks like the first 'segment of discs' sda-sdm are all marked
> clean; while sdn-sdab are marked active. What can I do to resolve this
> issue? Any assistance would be greatly appreciated.

Apply the following patch to mdadm-2.0-devel2 (it fixes a few bugs and
particularly makes --assemble work) then try:

  mdadm -A /dev/md0 /dev/sd[a-z] /dev/sda[ab]

Just list all 28 SCSI devices; I'm not sure what their names are.

This will quite probably fail.
If it does, try again with --force.

NeilBrown

[patch snipped]