Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-30 Thread Neil Brown
On Monday May 28, [EMAIL PROTECTED] wrote: There are two things I'm not sure you covered. First, disks which don't support flush but do have a cache dirty status bit you can poll at times like shutdown. If there are no drivers which support these, it can be ignored. There are really

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-30 Thread Neil Brown
On Monday May 28, [EMAIL PROTECTED] wrote: On Mon, May 28, 2007 at 12:57:53PM +1000, Neil Brown wrote: What exactly do you want to know, and why do you care? If someone explicitly mounts -o barrier and the underlying device cannot do it, then we want to issue a warning or reject the mount

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-30 Thread Neil Brown
On Monday May 28, [EMAIL PROTECTED] wrote: Neil Brown writes: [...] Thus the general sequence might be: a/ issue all preceding writes. b/ issue the commit write with BIO_RW_BARRIER c/ wait for the commit to complete. If it was successful - done

Re: Md corruption using RAID10 on linux-2.6.21

2007-05-30 Thread Neil Brown
On Wednesday May 30, [EMAIL PROTECTED] wrote: Neil, I sent the scripts to you. Any update on this issue? Sorry, I got distracted. Your scripts are way more complicated than needed. Most of the logic in there is already in mdadm. mdadm --assemble /dev/md_d0 --run --uuid=$BOOTUUID
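
A minimal sketch of the assembly step referenced above, assuming $BOOTUUID holds the array's UUID (the UUID value shown is an example only):

    # Assemble the partitionable array by UUID and start it even if a
    # member is missing (--run); the UUID below is a placeholder.
    BOOTUUID=6b8b4567:327b23c6:643c9869:66334873
    mdadm --assemble /dev/md_d0 --run --uuid=$BOOTUUID
    cat /proc/mdstat        # confirm md_d0 is active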

Re: ANNOUNCE: mdadm 2.6.2 - A tool for managing Soft RAID under Linux

2007-05-29 Thread Neil Brown
On Tuesday May 29, [EMAIL PROTECTED] wrote: Hello Neil, On Monday, 21. May 2007, you wrote: I am pleased to announce the availability of mdadm version 2.6.2 Thanks for releasing mdadm 2.6.2. It contains a fix for --test I was looking for right at the moment :-) mdadm fails to

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-27 Thread Neil Brown
Thanks everyone for your input. There were some very valuable observations in the various emails. I will try to pull most of it together and bring out what seem to be the important points. 1/ A BIO_RW_BARRIER request should never fail with -EOPNOTSUP. This is certainly a very attractive

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-27 Thread Neil Brown
On Friday May 25, [EMAIL PROTECTED] wrote: 2007/5/25, Neil Brown [EMAIL PROTECTED]: - Are there other bits that we could handle better? BIO_RW_FAILFAST? BIO_RW_SYNC? What exactly do they mean? BIO_RW_FAILFAST: means low-level driver shouldn't do much (or no) error recovery. Mainly

[RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-25 Thread Neil Brown
This mail is about an issue that has been of concern to me for quite a while and I think it is (well past) time to air it more widely and try to come to a resolution. This issue is how write barriers (the block-device kind, not the memory-barrier kind) should be handled by the various layers.

Re: not resyncing after power cut.

2007-05-22 Thread Neil Brown
On Monday May 21, [EMAIL PROTECTED] wrote: Can someone shed some light on this for me please? Sounds like it could be a kernel bug. What version (exactly) are you running? NeilBrown

ANNOUNCE: mdadm 2.6.2 - A tool for managing Soft RAID under Linux

2007-05-20 Thread Neil Brown
I am pleased to announce the availability of mdadm version 2.6.2 It is available at the usual places: http://www.cse.unsw.edu.au/~neilb/source/mdadm/ and (with countrycode=xx) http://www.${countrycode}kernel.org/pub/linux/utils/raid/mdadm/ and via git at git://neil.brown.name/mdadm

RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-17 Thread Neil Brown
in KB). chunk itself will not overflow (without triggering a BUG). So change 'chunk' to be 'sector_t', and get rid of the 'BUG' as it becomes impossible to hit. Cc: Jeff Zheng [EMAIL PROTECTED] Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/raid0.c |3 +-- 1

RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Neil Brown
On Thursday May 17, [EMAIL PROTECTED] wrote: I tried the patch, same problem show up, but no bug_on report Is there any other things I can do? What is the nature of the corruption? Is it data in a file that is wrong when you read it back, or does the filesystem metadata get corrupted? Can

RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Neil Brown
On Wednesday May 16, [EMAIL PROTECTED] wrote: On Thu, 17 May 2007, Neil Brown wrote: On Thursday May 17, [EMAIL PROTECTED] wrote: The only difference of any significance between the working and non-working configurations is that in the non-working, the component devices are larger

RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Neil Brown
is used. So I'm quite certain this bug will cause exactly the problems experienced!! Jeff, can you try this patch? Don't bother about the other tests I mentioned, just try this one. Thanks. NeilBrown Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/raid0.c

Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Neil Brown
On Tuesday May 15, [EMAIL PROTECTED] wrote: Now, is there a way I can synchronize the contents of RAID-10, or /dev/md10, with the contents of RAID-5, or /dev/sdr, when /dev/sdr is bigger than /dev/md10, and /dev/md10 has to be synchronized on /dev/sdr, not the other way around (I would expand

Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Neil Brown
On Tuesday May 15, [EMAIL PROTECTED] wrote: Neil Brown schrieb: (...) An external bitmap means that if the link goes down, it keeps track of which blocks are in sync and which aren't, and when the link comes back up you re-add the missing device and the rebuild continues where

Re: how to synchronize two devices (RAID-1, but not really?)

2007-05-15 Thread Neil Brown
On Tuesday May 15, [EMAIL PROTECTED] wrote: Interesting lecture, thanks a lot. I'm gonna take that path. :-) I have one more question, though. I want to migrate: - from 4x400GB HDD RAID-10 - to 4x400GB HDD RAID-5 Obviously, I need 8 disks for that, but I have only 6. So my idea
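
The truncated message does not show the plan, but one common way to do this kind of migration with too few disks is to build the new array degraded, copy, then complete it; a hedged sketch with example device names, not necessarily the poster's idea:

    # Create the raid5 with one slot deliberately left 'missing'
    mdadm --create /dev/md5 --level=5 --raid-devices=4 \
          /dev/sde1 /dev/sdf1 /dev/sdg1 missing
    # ... copy the data from the old raid10, then retire it and donate a disk
    mdadm --stop /dev/md10
    mdadm /dev/md5 --add /dev/sda1   # rebuild restores full redundancy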

Re: /proc/mdstat showing a device missing from array

2007-05-14 Thread Neil Brown
On Monday May 14, [EMAIL PROTECTED] wrote: Quoting Neil Brown [EMAIL PROTECTED]: A raid5 is always created with one missing device and one spare. This is because recovery onto a spare is faster than resync of a brand new array. This is unclear to me. Do you mean that is how mdadm

Re: /proc/mdstat showing a device missing from array

2007-05-13 Thread Neil Brown
On Sunday May 13, [EMAIL PROTECTED] wrote: Hello all, I just set up a new raid 5 array of 4 750G disks and am having a strange experience where /proc/mdstat is showing that one device is missing from the array. The output from a --detail shows that there is some unnamed device that has

Re: removed disk md-device

2007-05-11 Thread Neil Brown
On Thursday May 10, [EMAIL PROTECTED] wrote: No, I haven't, but it is getting near the top of my list. I have just committed a change to the mdadm .git so that mdadm /dev/md4 --fail detached will fail any components of /dev/md4 that appear to be detached (open returns -ENXIO). and mdadm
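
The preview is cut off, but the natural companion to the new --fail keyword is removing the failed members afterwards; a hedged sketch of how the pair is typically used:

    # Fail every member of /dev/md4 whose device node has disappeared
    # (open() returns -ENXIO), then remove the failed members.
    mdadm /dev/md4 --fail detached
    mdadm /dev/md4 --remove detached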

Re: removed disk md-device

2007-05-10 Thread Neil Brown
On Wednesday May 9, [EMAIL PROTECTED] wrote: Neil Brown [EMAIL PROTECTED] [2007.04.02.0953 +0200]: Hmmm... this is somewhat awkward. You could argue that udev should be taught to remove the device from the array before removing the device from /dev. But I'm not convinced that you always

Re: [PATCH 002 of 2] md: Improve the is_mddev_idle test

2007-05-10 Thread Neil Brown
On Thursday May 10, [EMAIL PROTECTED] wrote: On Thu, 10 May 2007 16:22:31 +1000 NeilBrown [EMAIL PROTECTED] wrote: The test currently looks for any (non-fuzz) difference, either positive or negative. This clearly is not needed. Any non-sync activity will cause the total sectors to grow

Re: [PATCH 002 of 2] md: Improve the is_mddev_idle test

2007-05-10 Thread Neil Brown
On Thursday May 10, [EMAIL PROTECTED] wrote: On May 10 2007 16:22, NeilBrown wrote: diff .prev/drivers/md/md.c ./drivers/md/md.c --- .prev/drivers/md/md.c 2007-05-10 15:51:54.0 +1000 +++ ./drivers/md/md.c 2007-05-10 16:05:10.0 +1000 @@ -5095,7 +5095,7 @@ static

Re: removed disk md-device

2007-05-10 Thread Neil Brown
On Thursday May 10, [EMAIL PROTECTED] wrote: Neil Brown wrote: On Wednesday May 9, [EMAIL PROTECTED] wrote: Neil Brown [EMAIL PROTECTED] [2007.04.02.0953 +0200]: Hmmm... this is somewhat awkward. You could argue that udev should be taught to remove the device from the array before

Re: Linux MD Raid Bug(?) w/Kernel sync_speed_min Option

2007-05-09 Thread Neil Brown
On Tuesday May 8, [EMAIL PROTECTED] wrote: Neil, awesome patch-- what are the chances of it getting merged into 2.6.22? Probably. I want to think it through a bit more - to make sure I can write a coherent and correct changelog entry. NeilBrown

Please revert 5b479c91da90eef605f851508744bfe8269591a0 (md partition rescan)

2007-05-09 Thread Neil Brown
Hi Linus, Could you please revert 5b479c91da90eef605f851508744bfe8269591a0 It causes an oops when auto-detecting raid arrays, and it doesn't seem easy to fix. The array may not be 'open' when do_md_run is called, so bdev->bd_disk might be NULL, so bd_set_size can oops. I cannot really open

Re: Linux MD Raid Bug(?) w/Kernel sync_speed_min Option

2007-05-08 Thread Neil Brown
. This patch might help though. Let me know if it does what you expect. Thanks, NeilBrown Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/md.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff .prev/drivers/md/md.c ./drivers/md/md.c --- .prev/drivers/md/md.c
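
For context, the knobs this thread is about live in sysfs; a short sketch of inspecting and setting them (md0 is an assumed array name, values in KB/s):

    cat /sys/block/md0/md/sync_speed          # current resync throughput
    echo 50000  > /sys/block/md0/md/sync_speed_min
    echo 200000 > /sys/block/md0/md/sync_speed_max
    # system-wide defaults live in /proc/sys/dev/raid/speed_limit_{min,max}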

Re: Raid1 replaced with raid10?

2007-05-07 Thread Neil Brown
On Monday May 7, [EMAIL PROTECTED] wrote: Neil Brown wrote: On Friday May 4, [EMAIL PROTECTED] wrote: Peter Rabbitson wrote: Hi, I asked this question back in march but received no answers, so here it goes again. Is it safe to replace raid1 with raid10 where the amount of disks

Re: Partitioned arrays initially missing from /proc/partitions

2007-05-07 Thread Neil Brown
the array is opened, which is later than some people would like. So use blkdev_ioctl to do the rescan immediately that the array has been assembled. This means we can remove all the ->change infrastructure as it was only used to trigger a partition rescan. Signed-off-by: Neil Brown [EMAIL PROTECTED
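
A partition rescan can also be triggered from userspace (blockdev --rereadpt issues the BLKRRPART ioctl), which was the common workaround before a kernel-side fix; a sketch with example member devices:

    mdadm --assemble /dev/md_d0 /dev/sdb1 /dev/sdc1
    blockdev --rereadpt /dev/md_d0        # force a partition-table re-read
    grep md_d0 /proc/partitions           # partitions should now be listed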

Re: what does md do if it finds an inconsistency?

2007-05-06 Thread Neil Brown
On Sunday May 6, [EMAIL PROTECTED] wrote: On Sun, 06 May 2007, martin f krafft wrote: Maybe the ideal way would be to have mdadm --monitor send an email on mismatch_count > 0 or a cronjob that regularly sends reminders, until the admin logs in and runs e.g. /usr/share/mdadm/repairarray. You
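
mdadm --monitor does not itself watch mismatch_cnt (that is the gap being discussed), so the proposal amounts to combining the monitor with a cron check; a hedged sketch in which md0 and the mail address are assumptions:

    # daemonised monitor for ordinary events (fail, degraded, spare, ...)
    mdadm --monitor --scan --daemonise --mail=root@localhost
    # crude cron job for the mismatch case proposed above
    [ "$(cat /sys/block/md0/md/mismatch_cnt)" -gt 0 ] && \
        echo "md0 has a non-zero mismatch_cnt" | mail -s "md0 mismatch" root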

Re: Raid1 replaced with raid10?

2007-05-06 Thread Neil Brown
On Friday May 4, [EMAIL PROTECTED] wrote: Peter Rabbitson wrote: Hi, I asked this question back in march but received no answers, so here it goes again. Is it safe to replace raid1 with raid10 where the amount of disks is equal to the amount of far/near/offset copies? I understand it

RE: RAID6 question

2007-05-06 Thread Neil Brown
On Friday May 4, [EMAIL PROTECTED] wrote: } -Original Message- } From: [EMAIL PROTECTED] [mailto:linux-raid- } [EMAIL PROTECTED] On Behalf Of Guy Watkins } Sent: Saturday, April 28, 2007 8:52 PM } To: linux-raid@vger.kernel.org } Subject: RAID6 question } } I read in processor.com

Re: attempt to access beyond end of device on RAID0 with mdadm

2007-05-06 Thread Neil Brown
On Wednesday May 2, [EMAIL PROTECTED] wrote: Hello, I am using Debian 3.1 (sarge) with 2.6.17.13-gen64-smp kernel. I have storage of 53 TB (6 units with 9.5 T). All of them are striped (Raid0) using mdadm 1.9.0. On this unit have created with LVM2 Volume Group and 3 logical Volumes. One

Re: filesystem corruption with md raid6

2007-04-27 Thread Neil Brown
On Thursday April 26, [EMAIL PROTECTED] wrote: have a system with 12 SATA disks attached via SAS. When copying into the array during re-sync I get filesystem errors and corruption for raid6 but not for raid5. This problem is repeatable. I actually have 2 separate 12 disk arrays and get the

Re: Multiple disk failure, but slot numbers are corrupt and preventing assembly.

2007-04-26 Thread Neil Brown
On Tuesday April 24, [EMAIL PROTECTED] wrote: David, thanks for all the advice so far. On 4/24/07, David Greaves [EMAIL PROTECTED] wrote: Essentially all --create does is create superblocks with the data you want (eg slot numbers). It does not touch other 'on disk data'. It is

Re: Partitioned arrays initially missing from /proc/partitions

2007-04-24 Thread Neil Brown
On Tuesday April 24, [EMAIL PROTECTED] wrote: Neil Brown wrote: This problem is very hard to solve inside the kernel. The partitions will not be visible until the array is opened *after* it has been created. Making the partitions visible before that would be possible, but would be very

Re: Partitioned arrays initially missing from /proc/partitions

2007-04-24 Thread Neil Brown
On Tuesday April 24, [EMAIL PROTECTED] wrote: Neil Brown wrote: This problem is very hard to solve inside the kernel. The partitions will not be visible until the array is opened *after* it has been created. Making the partitions visible before that would be possible, but would be very

Re: Partitioned arrays initially missing from /proc/partitions

2007-04-23 Thread Neil Brown
This problem is very hard to solve inside the kernel. The partitions will not be visible until the array is opened *after* it has been created. Making the partitions visible before that would be possible, but would be very easy. I think the best solution is Mike's solution which is to simply

Re: raid5 write performance

2007-04-19 Thread Neil Brown
On Thursday April 19, [EMAIL PROTECTED] wrote: Neil Hello I have been doing some thinking. I feel we should take a different path here. In my tests I actually accumulate the user's buffers and when ready I submit them, an elevator like algorithm. The main problem is the amount of IO's

Re: Recovering a raid5 array with strange event count

2007-04-13 Thread Neil Brown
On Friday April 13, [EMAIL PROTECTED] wrote: Dear All, I have an 8-drive raid-5 array running under 2.6.11. This morning it bombed out, and when I brought it up again, two drives had incorrect event counts: sda1: 0.8258715 sdb1: 0.8258715 sdc1: 0.8258715 sdd1: 0.8258715 sde1:
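
Not necessarily the advice in the truncated reply, but the usual recovery path when event counts disagree after a crash is a forced assembly; a hedged sketch with example device names:

    # Assemble with the freshest superblocks, ignoring small event-count
    # differences; check the result before mounting anything read-write.
    mdadm --assemble --force /dev/md0 /dev/sd[a-h]1
    cat /proc/mdstat
    mdadm --detail /dev/md0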

Re: Manually hacking superblocks

2007-04-13 Thread Neil Brown
On Friday April 13, [EMAIL PROTECTED] wrote: Lasse Kärkkäinen wrote: disk 0, o:1, dev:sdc1 disk 1, o:1, dev:sde1 disk 2, o:1, dev:sdg1 disk 3, o:1, dev:sdi1 disk 4, o:1, dev:sdh1 disk 5, o:1, dev:sdf1 disk 6, o:1, dev:sdd1 I gather that I need a way to alter the superblocks

RE: [PATCH RFC 3/4] md: writeback caching policy for raid5 [experimental]

2007-04-12 Thread Neil Brown
On Wednesday April 11, [EMAIL PROTECTED] wrote: From: Mark Hahn [mailto:[EMAIL PROTECTED] In its current implementation write-back mode acknowledges writes before they have reached non-volatile media. which is basically normal for unix, no? I am referring to when bi_end_io is

Re: LINEAR RAID, little help

2007-04-12 Thread Neil Brown
On Thursday April 12, [EMAIL PROTECTED] wrote: I forgot to ask something, sorry. With RAID0 and all the odd size drives I have I'd get a lot of left over unused space. Is there anyway to make use of this slack? md/raid0 makes use of all available space (modulo chunk size). To quote from man
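
A small illustration of the point above, with made-up device names of different sizes:

    # raid0 over unequal members: md uses all of each device (modulo the
    # chunk size), striping across whichever devices still have space.
    mdadm --create /dev/md0 --level=0 --chunk=64 --raid-devices=3 \
          /dev/sdb1 /dev/sdc1 /dev/sdd1
    mdadm --detail /dev/md0 | grep 'Array Size'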

Re: LINEAR RAID, little help

2007-04-10 Thread Neil Brown
On Saturday April 7, [EMAIL PROTECTED] wrote: Gavin McCullagh wrote: I must admit I've never used linear raid. May I ask what made you choose it over say raid-0? Er, I went with Linear as reading around people seemed to recommend this for odd sized drives (my old drives are 80's,

Re: RAID5 superblocks partly messed up after degradation

2007-04-10 Thread Neil Brown
On Monday April 9, [EMAIL PROTECTED] wrote: Hello, hopefully someone can help me. I'll see what I can do :-) The 4 x 300 RAID can not be assembled anymore. mdadm --assemble --verbose --no-degraded /dev/md5 /dev/hdc1 /dev/sdb1 /dev/sdc1 /dev/sdd1 mdadm: looking for devices for

Re: raid6 rebuild

2007-04-10 Thread Neil Brown
On Thursday April 5, [EMAIL PROTECTED] wrote: On 4/5/07, Lennert Buytenhek [EMAIL PROTECTED] wrote: On Thu, Apr 05, 2007 at 09:54:14AM -0400, Bill Davidsen wrote: I confess, I would feel safer with my data if the rebuild started over, I would like to be sure that when it (finally)

Re: RAID1 out of memory error, was Re: 2.6.21-rc5-mm4

2007-04-10 Thread Neil Brown
calculation for size of filemap_attr array in md/bitmap. If 'num_pages' were ever 1 more than a multiple of 8 (32bit platforms) or of 16 (64 bit platforms), filemap_attr would be allocated one 'unsigned long' shorter than required. We need a round-up in there. Signed-off-by: Neil Brown [EMAIL

Re: parity check for read?

2007-04-03 Thread Neil Brown
On Tuesday April 3, [EMAIL PROTECTED] wrote: Hi, Is parity calculation and validation for read operations supported? I guess what you are asking is: With raid5, I would like the drive to handle a read request by reading all the blocks in the stripe and checking the parity. If the
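
md's raid5 does not check parity on ordinary reads; the nearest existing facility (not necessarily what the truncated reply suggests) is the background consistency check, e.g.:

    # Assumed array name md0.
    echo check > /sys/block/md0/md/sync_action
    cat /sys/block/md0/md/sync_action      # shows 'check' while it runs
    cat /sys/block/md0/md/mismatch_cnt     # inspect once it finishes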

Re: md126-7

2007-04-03 Thread Neil Brown
On Tuesday April 3, [EMAIL PROTECTED] wrote: Hi everyone, I've been using the RAID subsystem quite a bit over the years. This week, for the first time, I created a RAID6 array over loop devices that point to files. To my surprise the device showing in /proc/mdstat for this new array is

Re: s2disk and raid

2007-04-03 Thread Neil Brown
On Tuesday April 3, [EMAIL PROTECTED] wrote: Hi, I've got a bugreport [0] from a user trying to use raid and uswsusp. He's using initramfs-tools available in debian. I'll describe the problem and my analysis, maybe you can comment on what you think. A warning: I only have a casual

Re: [PATCH] md: Avoid a deadlock when removing a device from an md array via sysfs.

2007-04-02 Thread Neil Brown
. Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/md.c |3 +++ 1 file changed, 3 insertions(+) diff .prev/drivers/md/md.c ./drivers/md/md.c --- .prev/drivers/md/md.c 2007-04-02 17:38:46.0 +1000 +++ ./drivers/md/md.c 2007-04-02 18:49:24.0

Re: mismatch_cnt worries

2007-04-02 Thread Neil Brown
On Monday April 2, [EMAIL PROTECTED] wrote: Neil's post here suggests either this is all normal or I'm seriously up the creek. http://www.mail-archive.com/linux-raid@vger.kernel.org/msg07349.html My questions: 1. Should I be worried or is this normal? If so can you explain why

Re: raidtools to mdadm

2007-04-01 Thread Neil Brown
On Sunday April 1, [EMAIL PROTECTED] wrote: as best i can tell i am using the correct commands for what i want but i pretty much get nothing but errors: [EMAIL PROTECTED]:/media# mdadm --build /dev/md1 --chunk=128 --level=0 --raid-devices=2 /dev/sda /dev/sdb mdadm: error opening /dev/md1:
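
The error text is cut off above, but "error opening /dev/mdN" at build or create time is very often just a missing device node; a hedged sketch of two ways around that particular cause:

    # Let mdadm create the node itself ...
    mdadm --build /dev/md1 --auto=yes --chunk=128 --level=0 \
          --raid-devices=2 /dev/sda /dev/sdb
    # ... or create it by hand (md1 is block major 9, minor 1)
    mknod /dev/md1 b 9 1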

Re: mdadm: RUN_ARRAY failed: Cannot allocate memory

2007-03-29 Thread Neil Brown
test for whether level supports bitmap to correct place. We need to check for internal-consistency of superblock in load_super. validate_super is for inter-device consistency. Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/md.c | 42

Re: is this raid5 OK ?

2007-03-29 Thread Neil Brown
On Thursday March 29, [EMAIL PROTECTED] wrote: hi, I manually created my first raid5 on 4 400 GB pata harddisks: [EMAIL PROTECTED] ~]# mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 --spare-devices=0 /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1 mdadm: layout defaults to

Re: md bitmaps on 2.6.16.y

2007-03-28 Thread Neil Brown
On Monday March 26, [EMAIL PROTECTED] wrote: I have been testing an md RAID 1 created using mdadm 2.6.1 on a 2.6.16.35 kernel, and the initial results are encouraging. (The 2.6.16.y kernel is a long-term stable branch, unlike most stable kernels, which are maintained only until the next

Re: LILO 22.6.1-9.3 not compatible with SW RAID1 metadata = 1.0

2007-03-28 Thread Neil Brown
On Monday March 26, [EMAIL PROTECTED] wrote: Neil, Using: Debian Etch. I picked this up via http://anti.teamidiot.de/nei/2006/10/softraid_lilo/ via google cache. Basically, LILO will not even run correctly if the metadata is not 0.90. After I had done that, LILO ran successfully for

Re: Software RAID (non-preempt) server blocking question. (2.6.20.4)

2007-03-28 Thread Neil Brown
On Tuesday March 27, [EMAIL PROTECTED] wrote: I ran a check on my SW RAID devices this morning. However, when I did so, I had a few lftp sessions open pulling files. After I executed the check, the lftp processes entered 'D' state and I could do 'nothing' in the process until the check

Re: addingdrive to raid 1

2007-03-26 Thread Neil Brown
On Sunday March 25, [EMAIL PROTECTED] wrote: Hi All, Long time Linux software raid user, first real problem. I built md2 with 1 missing drive and am now trying to add a drive (hda4) The partition that is hda4 was previously part of another raid array md1 that no longer exists. I have
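
One step that usually avoids confusion in exactly this situation (a partition still carrying a superblock from a defunct array) is to wipe that superblock before adding; hda4 is the device from the message:

    mdadm --zero-superblock /dev/hda4     # erase the stale md1 superblock
    mdadm /dev/md2 --add /dev/hda4
    cat /proc/mdstat                      # recovery onto hda4 should start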

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Neil Brown
On Thursday March 22, [EMAIL PROTECTED] wrote: On Thu, Mar 22 2007, [EMAIL PROTECTED] wrote: 3 (I think) separate instances of this, each involving raid5. Is your array degraded or fully operational? Ding! A drive fell out the other day, which is why the problems only appeared

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Neil Brown
On Thursday March 22, [EMAIL PROTECTED] wrote: Not a cfq failure, but I have been able to reproduce a different oops at array stop time while i/o's were pending. I have not dug into it enough to suggest a patch, but I wonder if it is somehow related to the cfq failure since it involves

Re: Another report of a raid6 array being maintaind by _raid5 in ps .

2007-03-21 Thread Neil Brown
is raid456.ko, how about this? There are lots of error messages that say 'raid5' too NeilBrown Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/raid5.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c

Re: raid6 array , part id 'fd' not assembling at boot .

2007-03-18 Thread Neil Brown
On Saturday March 17, [EMAIL PROTECTED] wrote: Neil Brown wrote: In-kernel auto-assembly using partition type 0xFD only works for metadata=0.90. This is deliberate. Don't use 0xFD partitions. Use mdadm to assemble your array, either via an initrd or (if it doesn't hold the root

Re: Data corruption on software raid.

2007-03-18 Thread Neil Brown
On Sunday March 18, [EMAIL PROTECTED] wrote: This may be due to a characteristic of RAID1, which I believe Neil described when discussing check failures in using RAID1 for swap. In some cases, the data is being written from a user buffer, which is changing. and the RAID software does two

Re: Data corruption on software raid.

2007-03-18 Thread Neil Brown
On Sunday March 18, [EMAIL PROTECTED] wrote: Hello! Long story. Get some coke. And a painful story! See also http://bugzilla.kernel.org/show_bug.cgi?id=8180 It also involves a Silicon Image PCI/SATA controller, though a different model. But then I have an SI PCI SATA controller that has

Re: [Linux-usb-users] Failed reads from RAID-0 array (from newbie who has read the FAQ)

2007-03-18 Thread Neil Brown
On Sunday March 18, [EMAIL PROTECTED] wrote: cp -rv /mnt/* fs2d2/ At this point, the process hangs. So I ran: echo t > /proc/sysrq-trigger; dmesg > dmesg-5-hungread.log Unfortunately (as you say) the whole trace doesn't fit. Could you try compiling the kernel with a larger value for
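
For the buffer-size problem mentioned above, two hedged options: ask dmesg to read more of the ring buffer, or boot with a larger buffer.

    echo t > /proc/sysrq-trigger
    dmesg -s 1000000 > dmesg-hungread.log   # request up to ~1MB of the buffer
    # If the kernel ring buffer itself is too small, boot with e.g.
    # log_buf_len=1M on the kernel command line, or collect the full
    # trace from klogd's output under /var/log/ instead.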

Re: mdadm file system type check

2007-03-16 Thread Neil Brown
On Friday March 16, [EMAIL PROTECTED] wrote: To whom it may concern, It seems mdadm does not check, warn, abort, or etc if a partition has an incorrect file system type. This has come up for me on a few occasions while building servers with software raid. On one occasion I had a

Re: raid6 array , part id 'fd' not assembling at boot .

2007-03-16 Thread Neil Brown
On Friday March 16, [EMAIL PROTECTED] wrote: Hello All , I am having a dickens of a time with preparing this system to replace my present one . I created a raid6 array over 6 147GB scsi drives . steps I followed were . fdisk /dev/sd[c-h] ( one at a time of course )

Re: mdadm file system type check

2007-03-16 Thread Neil Brown
On Friday March 16, [EMAIL PROTECTED] wrote: Instead of passing along an interpretation, here are some IRC log snippets that pertain from #gentoo-dev @ freenode.net kingtaco|work: livecd ~ # mdadm --create --level=1 --raid-devices=2 /dev/md0 /dev/sda1 /dev/sdb1 kingtaco|work: mdadm:

Re: Failed reads from RAID-0 array (from newbie who has read the FAQ)

2007-03-16 Thread Neil Brown
On Friday March 16, [EMAIL PROTECTED] wrote: I'm not a Linux newbie (I've even written a couple of books and done some very light device driver work), but I'm completely new to the software raid subsystem. I'm doing something rather oddball. I'm making an array of USB flash drives and

Re: sw raid0 read bottleneck

2007-03-13 Thread Neil Brown
On Tuesday March 13, [EMAIL PROTECTED] wrote: On Tue, 13 Mar 2007, Tomka Gergely wrote: On Tue, 13 Mar 2007, Justin Piszcz wrote: Have you tried increasing your readahead values for the md device? Yes. No real change. According to my humble mental image, readahead not a too

Re: Reshaping raid0/10

2007-03-11 Thread Neil Brown
On Saturday March 10, [EMAIL PROTECTED] wrote: Neil Brown wrote: If I wanted to reshape a raid0, I would just morph it into a raid4 with a missing parity drive, then use the raid5 code to restripe it. Then morph it back to regular raid0. Wow, that made my brain hurt. Given

Re: Help with chunksize on raid10 -p o3 array

2007-03-11 Thread Neil Brown
On Tuesday March 6, [EMAIL PROTECTED] wrote: Hi, I have been trying to figure out the best chunk size for raid10 before migrating my server to it (currently raid1). I am looking at 3 offset stripes, as I want to have two drive failure redundancy, and offset striping is said to have the
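
For reference, the layout under discussion can be created (and re-created with different chunk sizes for benchmarking) roughly like this; device names and the 256KB chunk are examples:

    # raid10 with 3 offset copies ('o3') over 4 drives
    mdadm --create /dev/md1 --level=10 --layout=o3 --chunk=256 \
          --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1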

Re: mismatch_cnt questions - how about raid10?

2007-03-11 Thread Neil Brown
On Tuesday March 6, [EMAIL PROTECTED] wrote: I see. So basically for those of us who want to run swap on raid 1 or 10, and at the same time want to rely on mismatch_cnt for early problem detection, the only option is to create a separate md device just for the swap. Is this about

Re: Changing partition types safe? raidtools to mdadm migration

2007-03-09 Thread Neil Brown
On Thursday March 8, [EMAIL PROTECTED] wrote: So I should be safe in just removing the raidtools package and installing mdadm? Possibly. A significant difference is the way arrays are assembled at boot time. raidtools depended on raidstart (which never worked reliably and doesn't work at
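
A hedged sketch of the usual migration steps, so assembly at boot no longer relies on raidstart:

    # Record the existing arrays so mdadm (or the distro init scripts)
    # can assemble them by UUID at boot.
    echo 'DEVICE partitions' > /etc/mdadm.conf
    mdadm --examine --scan >> /etc/mdadm.conf
    mdadm --assemble --scan            # test assembly by hand first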

Re: mismatch_cnt questions - how about raid10?

2007-03-06 Thread Neil Brown
On Tuesday March 6, [EMAIL PROTECTED] wrote: Neil Brown wrote: When we write to a raid1, the data is DMAed from memory out to each device independently, so if the memory changes between the two (or more) DMA operations, you will get inconsistency between the devices. Does this apply

Re: Replace drive in RAID5 without losing redundancy?

2007-03-05 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote: Is it possible to mark a disk as to be replaced by an existing spare, then migrate to the spare disk and kick the old disk _after_ migration has been done? Or not even kick - but mark as new spare. No, this is not possible yet. You can get nearly

Re: no journaling and loops on softraid?

2007-03-05 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote: http://gentoo-wiki.com/HOWTO_Gentoo_Install_on_Software_RAID#Data_Scrubbing Warning: Be aware that the combination of RAID5 and loop-devices will most likely cause severe filesystem damage, especially when using ext3 and ReiserFS. Some users

Re: mismatch_cnt questions

2007-03-05 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote: Neil Brown wrote: [trim Q re how resync fixes data] For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy and writing it over all other copies. For raid5 we assume the data is correct and update the parity. Can raid6 identify

Re: high mismatch count after scrub

2007-03-05 Thread Neil Brown
On Tuesday March 6, [EMAIL PROTECTED] wrote: So I practiced what I learned today and scrubbed the array. Getting this: xerxes:/sys/block/md0/md# cat mismatch_cnt 147248 4x 250GB Samsung sATA, smartctl says all fine. Need to worry? If you have a swap file on this array, then that
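
A short sketch of the two follow-ups usually suggested in these threads: confirm whether swap lives on the array, and optionally rewrite the inconsistent copies (md0 is an assumed name):

    swapon -s                                   # is any swap on /dev/md0*?
    echo repair > /sys/block/md0/md/sync_action # rewrite mismatched copies
    # run another 'check' later and re-read mismatch_cnt to confirm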

Re: mismatch_cnt questions

2007-03-04 Thread Neil Brown
On Sunday March 4, [EMAIL PROTECTED] wrote: Hello, these questions apparently got buried in another thread, so here goes again ... I have a mismatch_cnt of 384 on a 2-way mirror. The box runs 2.6.17.4 and can't really be rebooted or have its kernel updated easily 1) Where does the

Re: mismatch_cnt questions

2007-03-04 Thread Neil Brown
On Sunday March 4, [EMAIL PROTECTED] wrote: Hey, that was quick ... thanks! 1) Where does the mismatch come from? The box hasn't been down since the creation of the array. Do you have swap on the mirror at all? As a matter of fact I do, /dev/md0_p2 is a swap partition. I

Re: mismatch_cnt questions

2007-03-04 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote: Neil Brown wrote: On Sunday March 4, [EMAIL PROTECTED] wrote: I have a mismatch_cnt of 384 on a 2-way mirror. [trim] 3) Is the repair sync action safe to use on the above kernel? Any other methods / additional steps for fixing this? repair

Re: Growing a raid 6 array

2007-03-01 Thread Neil Brown
On Thursday March 1, [EMAIL PROTECTED] wrote: You can only grow a RAID5 array in Linux as of 2.6.20 AFAIK. There are two dimensions for growth. You can increase the amount of each device that is used, or you can increase the number of devices. You are correct that increasing the number of
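
The two growth dimensions described above map onto mdadm --grow roughly as follows (device names are assumed; kernel support for growing the device count varies by RAID level and kernel version):

    mdadm --grow /dev/md0 --size=max          # use more of each member device
    mdadm --add  /dev/md0 /dev/sdf1           # add a new member ...
    mdadm --grow /dev/md0 --raid-devices=6    # ... then reshape across it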

Re: [PATCH] md: Fix for raid6 reshape.

2007-03-01 Thread Neil Brown
On Thursday March 1, [EMAIL PROTECTED] wrote: On Fri, 2 Mar 2007 15:56:55 +1100 NeilBrown [EMAIL PROTECTED] wrote: - conf->expand_progress = (sector_nr + i)*(conf->raid_disks-1); + conf->expand_progress = (sector_nr + i) * new_data_disks); ahem. It wasn't like that when I tested it,

RE: DMRAID feature direction?

2007-02-28 Thread Neil Brown
On Wednesday February 28, [EMAIL PROTECTED] wrote: Thanks for the archive link, very interesting discussions with EMD... What was the final outcome with EMD? Is it still a valid project? We would like to start helping with RAID feature enhancements, but we need to maintain support vendor

Re: Linux Software RAID Bitmap Question

2007-02-27 Thread Neil Brown
On Tuesday February 27, [EMAIL PROTECTED] wrote: Neil Brown wrote: When md finds a bad block (read failure) it either fixes it (by successfully over-writing the correct data) or fails the drive. The count of the times that this has happened is available via /sys/block/mdX/md/errors

Re: Linux Software RAID a bit of a weakness?

2007-02-26 Thread Neil Brown
On Monday February 26, [EMAIL PROTECTED] wrote: On 2/26/07, Colin Simpson [EMAIL PROTECTED] wrote: If I say, dd if=/dev/sda2 of=/dev/null where /dev/sda2 is a component of an active md device. Will the RAID subsystem get upset that someone else is fiddling with the disk (even in

Re: Linux Software RAID Bitmap Question

2007-02-25 Thread Neil Brown
On Sunday February 25, [EMAIL PROTECTED] wrote: Anyone have a good explanation for the use of bitmaps? Anyone on the list use them? http://gentoo-wiki.com/HOWTO_Gentoo_Install_on_Software_RAID#Data_Scrubbing Provides an explanation on that page. I believe Neil stated that using
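
For anyone wanting to try them, write-intent bitmaps can be added to (and removed from) an existing array; a brief sketch with md0 assumed:

    mdadm --grow /dev/md0 --bitmap=internal        # bitmap near the superblock
    mdadm --grow /dev/md0 --bitmap=none            # remove it again
    # external variant; the file must not live on the array itself
    mdadm --grow /dev/md0 --bitmap=/var/md0.bitmap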

Re: nonzero mismatch_cnt with no earlier error

2007-02-25 Thread Neil Brown
On Saturday February 24, [EMAIL PROTECTED] wrote: But is this not a good opportunity to repair the bad stripe for a very low cost (no complete resync required)? In this case, 'md' knew nothing about an error. The SCSI layer detected something and thought it had fixed it itself. Nothing for md

Re: trouble creating array

2007-02-25 Thread Neil Brown
On Sunday February 25, [EMAIL PROTECTED] wrote: Any ideas how to find out what has it open? I can happily write all over the disk with dd... I can create and delete the partition, and it's all good... I will try deleting the sd{b,c}1 partitions, reboot, and see what happens. ls -l
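
A few hedged ways to see what is holding a component device open (sdb1 is an example name):

    fuser -v /dev/sdb1          # userspace processes with it open
    lsof /dev/sdb1
    cat /proc/mdstat            # a half-assembled md array may claim it
    dmsetup table | grep sdb    # ... or a device-mapper / LVM mapping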

Re: end to end error recovery musings

2007-02-25 Thread Neil Brown
On Friday February 23, [EMAIL PROTECTED] wrote: On Fri, Feb 23, 2007 at 05:37:23PM -0700, Andreas Dilger wrote: Probably the only sane thing to do is to remember the bad sectors and avoid attempting reading them; that would mean marking automatic versus explicitly requested requests to

Re: Reshaping raid0/10

2007-02-23 Thread Neil Brown
On Friday February 23, [EMAIL PROTECTED] wrote: On Feb 22 2007 06:59, Neil Brown wrote: On Wednesday February 21, [EMAIL PROTECTED] wrote: are there any plans to support reshaping on raid0 and raid10? No concrete plans. It largely depends on time and motivation. I expect

Re: Linux Software RAID a bit of a weakness?

2007-02-23 Thread Neil Brown
On Friday February 23, [EMAIL PROTECTED] wrote: Hi, We had a small server here that was configured with a RAID 1 mirror, using two IDE disks. Last week one of the drives failed in this. So we replaced the drive and set the array to rebuild. The good disk then found a bad block and the

Re: Reshaping raid0/10

2007-02-21 Thread Neil Brown
On Wednesday February 21, [EMAIL PROTECTED] wrote: Hello, are there any plans to support reshaping on raid0 and raid10? No concrete plans. It largely depends on time and motivation. I expect that the various flavours of raid5/raid6 reshape will come first. Then probably converting

Re: [PATCH 006 of 6] md: Add support for reshape of a raid6

2007-02-21 Thread Neil Brown
On Wednesday February 21, [EMAIL PROTECTED] wrote: On Tue, 20 Feb 2007 17:35:16 +1100 NeilBrown [EMAIL PROTECTED] wrote: + for (i = conf->raid_disks ; i-- ; ) { That statement should be dragged out, shot, stomped on then ceremonially incinerated. An experiment in lateral

ANNOUNCE: mdadm 2.6.1 - A tool for managing Soft RAID under Linux

2007-02-21 Thread Neil Brown
I am pleased to announce the availability of mdadm version 2.6.1 It is available at the usual places: http://www.cse.unsw.edu.au/~neilb/source/mdadm/ and (with countrycode=xx) http://www.${countrycode}kernel.org/pub/linux/utils/raid/mdadm/ and via git at git://neil.brown.name/mdadm

Re: mdadm --grow failed

2007-02-18 Thread Neil Brown
On Sunday February 18, [EMAIL PROTECTED] wrote: I'm not sure how the grow operation is performed but to me it seems that there is no fault tolerance during the operation so any failure will cause a corrupt array. My 2c would be that if any drive fails during a grow operation that the

Re: mdadm --grow failed

2007-02-17 Thread Neil Brown
On Saturday February 17, [EMAIL PROTECTED] wrote: Is my array destroyed? Seeing as the sda disk wasn't completely synced I'm wondering what it was using to resync the array when sdc went offline. I've got a bad feeling about this :| I can understand your bad feeling... What happened there

Re: 2.6.20: reproducible hard lockup with RAID-5 resync

2007-02-16 Thread Neil Brown
On Thursday February 15, [EMAIL PROTECTED] wrote: I think I have found an easily-reproducible bug in Linux 2.6.20. I have already applied the "Fix various bugs with aligned reads in RAID5" patch, and that had no effect. It appears to be related to the resync process, and makes the system lock
