Re: nonzero mismatch_cnt with no earlier error

2007-02-25 Thread Christian Pernegger
Sorry to hijack the thread a little but I just noticed that the mismatch_cnt for my mirror is at 256. I'd always thought the monthly check done by the mdadm Debian package does repair as well - apparently it doesn't. So I guess I should run repair but I'm wondering ... - is it safe / bugfree
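
For reference, a minimal sketch of how the check/repair actions are usually driven through sysfs (the device name md0 here is just an example, not taken from the thread):

  # read the current mismatch count
  cat /sys/block/md0/md/mismatch_cnt
  # the periodic cron job normally only requests a read-only check
  echo check > /sys/block/md0/md/sync_action
  # an explicit repair pass has to be requested separately
  echo repair > /sys/block/md0/md/sync_action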

Re: Still can't get md arrays that were started from an initrd to shutdown

2006-07-18 Thread Christian Pernegger
With LVM you have to stop LVM before you can stop the arrays... I wouldn't be surprised if EVMS has the same issue... AFAIK there's no counterpart to evms_activate. Besides, I'm no longer using EVMS, I just included it in my testing since this issue bit me there first. Thanks, Christian - To
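
A minimal sketch of the shutdown ordering being described, assuming a volume group named vg0 sits on /dev/md0 (both names are illustrative):

  # deactivate LVM first, otherwise the array stays 'in use'
  vgchange -an vg0
  # only then can the array itself be stopped
  mdadm --stop /dev/md0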

Re: [PATCH] enable auto=yes by default when using udev

2006-07-18 Thread Christian Pernegger
I think I'm leaning towards auto-creating names if they look like standard names (or are listed in mdadm.conf?), but requiring auto=whatever to create anything else. The auto= option has the disadvantage that it is different for partitionable and regular arrays -- is there no way to detect from
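
For context, the auto= behaviour can also be configured in mdadm.conf instead of on the command line; a hedged sketch, with placeholder UUIDs and device names:

  # /etc/mdadm/mdadm.conf
  CREATE auto=yes                         # create missing /dev/md* nodes as needed
  ARRAY /dev/md0 UUID=<uuid>              # regular array
  ARRAY /dev/md_d0 UUID=<uuid> auto=part  # partitionable array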

Tuning the I/O scheduler for md?

2006-07-17 Thread Christian Pernegger
Based on various googled comments I have selected 'deadline' as the elevator for the disks comprising my md arrays, with no further tuning yet ... not so stellar :( Basically concurrent reads (even just 2, even worse with 1 read + 1 write) don't work too well. Example: RAID1: I bulk-move some
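
A sketch of the elevator selection in question, assuming one of the member disks is sda (repeat for each member):

  # the active scheduler is shown in brackets
  cat /sys/block/sda/queue/scheduler
  # switch the disk to deadline at runtime
  echo deadline > /sys/block/sda/queue/scheduler
  # or set it for all disks at boot via the kernel command line: elevator=deadline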

Still can't get md arrays that were started from an initrd to shutdown

2006-07-17 Thread Christian Pernegger
[This is a bit of a repost, because I'm slightly desperate :)] I'm still having problems with some md arrays not shutting down cleanly on halt / reboot. The problem seems to affect only arrays that are started via an initrd, even if they do not have the root filesystem on them. That's all

Re: Problem with 3xRAID1 to RAID 0

2006-07-12 Thread Christian Pernegger
Are there any actual bonuses to making RAIDs on whole raw disks? Not if you're using regular md devices. For partitionable md arrays using partitions seems a little strange to me, since you then have partitions on a partition. That'd probably make it difficult to just mount a single member of

Re: ICH7 sata-ahci + software raid warning

2006-07-11 Thread Christian Pernegger
For ICH5 and all other controllers (non-AHCI) of course, I've always seen md mark it faulty on a bad disk/sector/etc. ICH7 (ahci) does not. At least not for a whole disk dying, don't know about bad sectors. I thought you were just trying to mark it faulty manually for the purpose of rebuilding
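
For comparison, failing a member by hand works independently of what the controller reports; a minimal sketch, assuming the array is /dev/md0 and the member /dev/sdb1:

  mdadm /dev/md0 --fail /dev/sdb1     # mark the member faulty manually
  mdadm /dev/md0 --remove /dev/sdb1   # then take it out of the array
  cat /proc/mdstat                    # verify the array state afterwards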

Re: Can't get md array to shut down cleanly

2006-07-10 Thread Christian Pernegger
Nope, EVMS is not the culprit. I installed the test system from scratch, EVMS nowhere in sight -- it now boots successfully from a partitionable md array, courtesy of a yaird-generated initrd I adapted for the purpose. Yay! Or not. I get the "md: md_d0 still in use." error again :( This is with

ICH7 sata-ahci + software raid warning

2006-07-10 Thread Christian Pernegger
I'm (still) trying to set up an md array on the ICH7 SATA controller of an Intel SE7230NH1-E with 4 WD5000YS disks. On this controller (in ahci mode) I have not yet managed to get a disk marked as failed. - a bad cable just led to hangs and timeouts - pulling the power on one of the SATA drives

Test feedback 2.6.17.4+libata-tj-stable (EH, hotplug)

2006-07-10 Thread Christian Pernegger
I finally got around to testing 2.6.17.4 with libata-tj-stable-20060710. Hardware: ICH7R in ahci mode + WD5000YS's. EH: much, much better. Before the patch it seemed like errors were only printed to dmesg but never handed up to any layer above. Now md actually fails the disk when I pull the

Re: Can't get md array to shut down cleanly

2006-07-07 Thread Christian Pernegger
Good morning! That patch was against latest -mm. For earlier kernels you want to test 'ro'. Ok. Was using stock 2.6.17. Done unmounting local file systems. md: md0 stopped. md: unbind sdf. md: export_rdev(sdf). [last two lines for each disk.] Stopping RAID arrays ... done (1

Re: Can't get md array to shut down cleanly

2006-07-07 Thread Christian Pernegger
It seems like it really isn't an md issue -- when I remove everything to do with evms (userspace tools + initrd hooks) everything works fine. I took your patch back out and put a few printks in there ... Without evms the active counter is 1 in an idle state, i.e. after the box has finished

Re: Strange intermittant errors + RAID doesn't fail the disk.

2006-07-06 Thread Christian Pernegger
I suggest you find a SATA related mailing list to post this to (Look in the MAINTAINERS file maybe) or post it to linux-kernel. linux-ide couldn't help much, aside from recommending a bleeding-edge patchset which should fix a lot of SATA issues: http://home-tj.org/files/libata-tj-stable/

Can't get md array to shut down cleanly

2006-07-06 Thread Christian Pernegger
Still more problems ... :( My md raid5 still does not always shut down cleanly. The last few lines of the shutdown sequence are always as follows: [...] Will now halt. md: stopping all md devices. md: md0 still in use. Synchronizing SCSI cache for disk /dev/sdd: Synchronizing SCSI cache for

Re: Can't get md array to shut down cleanly

2006-07-06 Thread Christian Pernegger
Maybe your shutdown script is doing halt -h? Halting the disk immediately without letting the RAID settle to a clean state could be the cause? I'm using Debian as well and my halt script has the fragment you posted. Besides, shouldn't the array be marked clean at this point: md: stopping

Re: Can't get md array to shut down cleanly

2006-07-06 Thread Christian Pernegger
I get these messages too on Debian Unstable, but since enabling the bitmaps on my devices, resyncing is so fast that I don't even notice it on booting. Bitmaps are great, but the speed of the rebuild is not the problem. The box doesn't have hotswap bays, so I have to shut it down to replace a
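
For reference, a write-intent bitmap can be added to an existing array after the fact; a sketch, assuming /dev/md0:

  # add an internal write-intent bitmap so an interrupted resync restarts quickly
  mdadm --grow --bitmap=internal /dev/md0
  # it can be removed again later
  mdadm --grow --bitmap=none /dev/md0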

Re: Strange intermittant errors + RAID doesn't fail the disk.

2006-07-06 Thread Christian Pernegger
md is very dependent on the driver doing the right thing. It doesn't do any timeouts or anything like that - it assumes the driver will. md simply trusts the return status from the drive, and fails a drive if and only if a write to the drive is reported as failing (if a read fails, md tries to

Re: Can't get md array to shut down cleanly

2006-07-06 Thread Christian Pernegger
How are you shutting down the machine? Is something sending SIGKILL to all processes? First SIGTERM, then SIGKILL, yes. You could try the following patch. I think it should be safe. Hmm, it said chunk failed, so I replaced the line by hand. That didn't want to compile because mode

Re: Strange intermittant errors + RAID doesn't fail the disk.

2006-07-01 Thread Christian Pernegger
Looks very much like a problem with the SATA controller. If the repeat loop you have shown there is an infinite loop, then presumably some failure is not being handled properly. I agree, even though the AHCI driver was supposed to be stable. The loop is not quite infinite btw., it does time out

RAID resync after every boot?

2006-06-29 Thread Christian Pernegger
Yesterday evening I initialized a new RAID5, waited for completion and shut down the machine. Yet when I restarted it this morning it immediately began with a resync -- it seems that it wants to resync on every boot ... This is a new Debian testing installation, array was created with EVMS and
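
One way to check whether the array was actually marked clean before the reboot is to look at the array state and the member superblocks; a sketch, assuming /dev/md0 with member /dev/sda1:

  mdadm --detail /dev/md0 | grep -i state      # should report 'clean' after a proper shutdown
  mdadm --examine /dev/sda1 | grep -i -e state -e 'update time'
  cat /proc/mdstat                             # shows whether a resync is currently running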

Re: RAID resync after every boot?

2006-06-29 Thread Christian Pernegger
[...] it seems that it wants to resync on every boot ... Update: - During shutdown I got a few errors on the console regarding arrays that couldn't be stopped, because the mdadm package in Debian tries to shut down all arrays, even if it is set not to autostart any -- will file bug. - The

Re: Ok to go ahead with this setup?

2006-06-28 Thread Christian Pernegger
The MaxLine III's (7V300F0) with VA111630/670 firmware currently time out on a weekly or less basis.. I have just one 7V300F0, so no idea how it behaves in a RAID. It's been fine apart from the fact that my VIA southbridge SATA controller doesn't even detect it ... :( (Anyone else notice

Re: I need a PCI V2.1 4 port SATA card

2006-06-28 Thread Christian Pernegger
My current 15 drive RAID-6 server is built around a KT600 board with an AMD Sempron processor and 4 SATA150TX4 cards. It does the job but it's not the fastest thing around (takes about 10 hours to do a check of the array or about 15 to do a rebuild). What kind of enclosure do you have this

Re: Ok to go ahead with this setup?

2006-06-28 Thread Christian Pernegger
Here's a tentative setup: Intel SE7230NH1-E mainboard Pentium D 930 2x1GB Crucial 533 DDR2 ECC Intel SC5295-E enclosure The above components have finally arrived ... and I was shocked to see that the case's drive bays do not have their own fan, nor can I think of anywhere to put one.

Re: Is shrinking raid5 possible?

2006-06-26 Thread Christian Pernegger
This is shrinking an array by removing drives. We were talking about shrinking an array by reducing the size of drives - a very different thing. Yes I know - I just wanted to get this in as an alternative shrinking semantic. As for reducing the RAID (partition) size on the individual drives I

Re: Is shrinking raid5 possible?

2006-06-23 Thread Christian Pernegger
Why would you ever want to reduce the size of a raid5 in this way? A feature that would have been useful to me a few times is the ability to shrink an array by whole disks. Example: 8x 300 GB disks = 2100 GB usable capacity (after parity); shrink the file system, remove 2 disks = 6x 300 GB disks -- 1500 GB usable
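
For what it's worth, later mdadm/kernel releases did gain support for reshaping to fewer devices (it was not available at the time of this thread); a hedged sketch of that interface, assuming an 8-disk /dev/md0 being reduced to 6 disks:

  # shrink the filesystem first, then cap the array size, then reduce the device count
  mdadm --grow /dev/md0 --array-size=<new size>
  mdadm --grow /dev/md0 --raid-devices=6 --backup-file=/root/md0-reshape.bak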

Re: Ok to go ahead with this setup?

2006-06-22 Thread Christian Pernegger
Pentium D 930 HPA recently said that x86_64 CPUs have better RAID5 performance. Good to know. I did intend to use Debian-amd64 anyway. Is it a NAS kind of device? Yes, mostly. It also runs a caching NNTP proxy and drives our networked audio players :) Personal file server describes it