Re: Can't get drives containing spare devices to spindown
On Thursday June 22, [EMAIL PROTECTED] wrote:
> Thanks Neil for your quick reply. Would it be possible to elaborate a
> bit on the problem and the solution? I guess I won't be on 2.6.18 for
> some time...

When an array has been idle (no writes) for a short time (20 or 200 ms, depending on which kernel you are running), the array is flagged as 'clean', so that a crash/power failure at that point will not require a full resync. The 'clean' flag is stored in all superblocks, including those on the spares, so every change in activity status causes writes to all devices. Even fairly quiet filesystems see occasional updates (updating atime on files, or syncing the journal), and that causes all devices to be touched.

The fix is twofold:

1/ Don't set the 'dirty' flag on spares - there really is no need.

However, whenever the dirty bit is changed, the 'events' count is updated, so doing only that would let the spares fall way behind the main devices in their 'events' count, and they would no longer be treated as part of the array. So:

2/ When clearing the dirty flag (and nothing else has happened), decrement the events count rather than incrementing it.

Together, these mean that simple dirty/clean transitions do not touch the spares.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
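[Editor's note: the two changes Neil describes can be illustrated with a toy model. This is purely illustrative Python, not the kernel code; the dict fields and the initial event count are invented for the sketch.]

```python
# Toy model of md superblock event counting, following Neil's description:
# each device's superblock carries an 'events' counter, and a device whose
# counter falls far behind the others is no longer treated as part of the
# array. Fix 1: spares never record the dirty flag (so a dirty/clean
# transition writes nothing to them). Fix 2: clearing 'dirty' with nothing
# else changed decrements the events count instead of incrementing it.

def transition(sb, dirty):
    """Apply a clean/dirty transition to one superblock (a dict).

    Returns True if the device's superblock had to be written."""
    if sb["spare"]:
        # Fix 1: spares skip the dirty flag entirely -- no write needed.
        return False
    if not dirty and sb["dirty"]:
        sb["events"] -= 1   # Fix 2: clean transition decrements
    else:
        sb["events"] += 1   # any other change increments as before
    sb["dirty"] = dirty
    return True

active = {"spare": False, "dirty": False, "events": 10}
spare = {"spare": True, "dirty": False, "events": 10}

for _ in range(1000):            # many idle dirty->clean cycles
    transition(active, True)     # a write arrives, array marked dirty
    transition(spare, True)
    transition(active, False)    # idle timeout, array marked clean
    transition(spare, False)

# The active device's count oscillates instead of racing ahead, so the
# never-written spare stays within one event of it.
print(active["events"], spare["events"])   # -> 10 10
```

With both fixes the spare is never touched by dirty/clean cycles, yet its event count never drifts out of range; with fix 1 alone, the active devices' counts would climb by two per cycle and the spares would quickly be left behind.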
Re: Can't get drives containing spare devices to spindown
Neil Brown wrote:
> On Thursday June 22, [EMAIL PROTECTED] wrote:
>> Marc L. de Bruin wrote:
>>> Situation: /dev/md0, type raid1, containing 2 active devices
>>> (/dev/hda1 and /dev/hdc1) and 2 spare devices (/dev/hde1 and /dev/hdg1).
>>>
>>> Those two spare 'partitions' are the only partitions on those disks,
>>> and therefore I'd like to spin down those disks using hdparm for
>>> obvious reasons (noise, heat). Specifically, 'hdparm -S <value>' sets
>>> the standby (spindown) timeout for a drive; the value is used by the
>>> drive to determine how long to wait (with no disk activity) before
>>> turning off the spindle motor to save power.
>>>
>>> However, it turns out that md actually sort-of prevents those spare
>>> disks from spinning down. I can get them off for about 3 to 4 seconds,
>>> after which they immediately spin up again. Removing the spare devices
>>> from /dev/md0 (mdadm /dev/md0 --remove /dev/hd[eg]1) actually solves
>>> this, but I have no intention of actually removing those devices.
>>>
>>> How can I make sure that I'm actually able to spin down those two
>>> spare drives?
>
> This is fixed in current -mm kernels and the fix should be in 2.6.18.
>
> NeilBrown

Thanks Neil for your quick reply. Would it be possible to elaborate a bit on the problem and the solution? I guess I won't be on 2.6.18 for some time...

Marc.
Re: Can't get drives containing spare devices to spindown
On Thursday June 22, [EMAIL PROTECTED] wrote:
> Marc L. de Bruin wrote:
>
>> Situation: /dev/md0, type raid1, containing 2 active devices
>> (/dev/hda1 and /dev/hdc1) and 2 spare devices (/dev/hde1 and /dev/hdg1).
>>
>> Those two spare 'partitions' are the only partitions on those disks,
>> and therefore I'd like to spin down those disks using hdparm for
>> obvious reasons (noise, heat). Specifically, 'hdparm -S <value>' sets
>> the standby (spindown) timeout for a drive; the value is used by the
>> drive to determine how long to wait (with no disk activity) before
>> turning off the spindle motor to save power.
>>
>> However, it turns out that md actually sort-of prevents those spare
>> disks from spinning down. I can get them off for about 3 to 4 seconds,
>> after which they immediately spin up again. Removing the spare devices
>> from /dev/md0 (mdadm /dev/md0 --remove /dev/hd[eg]1) actually solves
>> this, but I have no intention of actually removing those devices.
>>
>> How can I make sure that I'm actually able to spin down those two
>> spare drives?

This is fixed in current -mm kernels and the fix should be in 2.6.18.

NeilBrown
Re: Can't get drives containing spare devices to spindown
Marc L. de Bruin wrote:
> Situation: /dev/md0, type raid1, containing 2 active devices
> (/dev/hda1 and /dev/hdc1) and 2 spare devices (/dev/hde1 and /dev/hdg1).
>
> Those two spare 'partitions' are the only partitions on those disks,
> and therefore I'd like to spin down those disks using hdparm for
> obvious reasons (noise, heat). Specifically, 'hdparm -S <value>' sets
> the standby (spindown) timeout for a drive; the value is used by the
> drive to determine how long to wait (with no disk activity) before
> turning off the spindle motor to save power.
>
> However, it turns out that md actually sort-of prevents those spare
> disks from spinning down. I can get them off for about 3 to 4 seconds,
> after which they immediately spin up again. Removing the spare devices
> from /dev/md0 (mdadm /dev/md0 --remove /dev/hd[eg]1) actually solves
> this, but I have no intention of actually removing those devices.
>
> How can I make sure that I'm actually able to spin down those two
> spare drives?

I'm replying to myself here, which seems pointless, but AFAIK I got no reply and I still believe this is an interesting issue. :-) Also, I have some extra info.

After doing some research, it seems that the busy-ness of the filesystem matters too. For example, if I create a /dev/md1 on /dev/hdb1 and /dev/hdd1 with two spares on /dev/hdf1 and /dev/hdh1, put a filesystem on /dev/md1, mount it, put the spare drives to sleep (hdparm -S 5 /dev/hd[fh]1), and leave that filesystem alone completely, every few minutes, for no reason obvious to me, those spare drives will spin up. I can only think of one reason: the md subsystem has to put some meta-info (hashes?) about /dev/md1 on the spare drives. If I use the filesystem on /dev/md1 more intensively, 'every few minutes' seems to become 'every 15 or so seconds'.

I may be completely wrong here (I'm no md guru), but maybe someone can confirm this behaviour? And if so, is there a way to control it? And if not, what could happen here?

For the original problem I can think of a workaround: remove the spare drives from the array, get them to spin down, and use the mdadm monitor feature to trigger a script on a 'Fail' event which adds a spare back to the array and removes any spin-down timeout from that spare. However, although this sort-of fixes the problem, there is still a short period of time during which the raid1 array is not protected. If the script fails for whatever reason, the raid1 array might not be protected for a long time. Also, from an architectural point of view, this is really bad and should not be needed.

Thanks again for your time,

Marc.
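[Editor's note: Marc's workaround could look something like the handler below, invoked via `mdadm --monitor --program`, which passes the event name, the md device, and optionally a component device as arguments. The device names, the event-name list, and the whole approach are illustrative assumptions, not a tested setup.]

```python
#!/usr/bin/env python
# Hypothetical event handler for: mdadm --monitor --program /path/to/this
# mdadm invokes it as: handler <event> <md-device> [<component-device>]
# On a failure event, wake the parked spares, disable their spindown
# timeout again, and add them back to the array. Device names are examples.
import subprocess
import sys

SPARES = ["/dev/hde1", "/dev/hdg1"]   # assumption: the parked spare drives

def should_activate_spare(event):
    """Only a failure/degradation event warrants waking a spare."""
    return event in ("Fail", "FailSpare", "DegradedArray")

def handle(event, md_device):
    """Return the list of commands to run for this event (possibly empty)."""
    if not should_activate_spare(event):
        return []
    cmds = []
    for dev in SPARES:
        cmds.append(["hdparm", "-S", "0", dev])        # disable spindown timeout
        cmds.append(["mdadm", md_device, "--add", dev])  # re-add as spare
    return cmds

if __name__ == "__main__" and len(sys.argv) >= 3:
    event, md_device = sys.argv[1], sys.argv[2]
    for cmd in handle(event, md_device):
        subprocess.call(cmd)
```

As Marc notes, this leaves a window where the array runs without a ready spare, and a failing script widens it; it is a sketch of the workaround, not a substitute for the kernel fix.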
Re: New FAQ entry? (was IBM xSeries stop responding during RAID1 reconstruction)
Mark Hahn wrote:
>> There's much easier/simpler way to set default scheduler. As
>
> personally, I don't see any point to worrying about the default,
> compile-time or boot time:
>
> for f in `find /sys/block/* -name scheduler`; do echo cfq > $f; done

I agree -- if you're talking about changing the I/O scheduler for the duration of a resync, you should take this approach rather than changing kernels or rebooting.

--Gil
Re: New FAQ entry? (was IBM xSeries stop responding during RAID1 reconstruction)
> There's much easier/simpler way to set default scheduler. As

Personally, I don't see any point to worrying about the default, compile-time or boot time:

for f in `find /sys/block/* -name scheduler`; do echo cfq > $f; done
[-mm patch] drivers/md/md.c: make code static
This patch makes needlessly global code static.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
 drivers/md/md.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.17-mm1-full/drivers/md/md.c.old	2006-06-21 22:59:44.0 +0200
+++ linux-2.6.17-mm1-full/drivers/md/md.c	2006-06-21 23:00:02.0 +0200
@@ -175,7 +175,7 @@
 /* Alternate version that can be called from interrupts
  * when calling sysfs_notify isn't needed.
  */
-void md_new_event_inintr(mddev_t *mddev)
+static void md_new_event_inintr(mddev_t *mddev)
 {
 	atomic_inc(&md_event_count);
 	wake_up(&md_event_waiters);
@@ -2309,7 +2309,7 @@
  */
 enum array_state { clear, inactive, suspended, readonly, read_auto, clean, active,
 		   write_pending, active_idle, bad_word};
-char *array_states[] = {
+static char *array_states[] = {
 	"clear", "inactive", "suspended", "readonly", "read-auto", "clean", "active",
 	"write-pending", "active-idle", NULL };
Re: bitmap status question
David Greaves wrote:
> How do I interpret:
>
>   bitmap: 0/117 pages [0KB], 1024KB chunk
>
> in the mdstat output? What does it mean when it's, e.g., 23/117?

This refers to the in-memory bitmap (basically a cache of what's in the on-disk bitmap -- it allows bitmap operations to be more efficient). If it's 23/117, that means 23 of the 117 pages of the in-memory bitmap are allocated. The pages are allocated on demand, and get freed when they're empty (all zeroes).

The in-memory bitmap uses 16 bits for each bitmap chunk to count all ongoing writes to the chunk, so it's actually up to 16 times larger than the on-disk bitmap.

-- Paul
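[Editor's note: as a back-of-the-envelope illustration of the sizing Paul describes, assuming 16 bits per chunk in memory, on-demand allocation in 4 KiB pages, and a hypothetical array size. This is my own arithmetic, not taken from md's source, so the page count for any real array may differ.]

```python
# Rough sizing of md's in-memory write-intent bitmap: 16 bits (2 bytes)
# per bitmap chunk, allocated on demand in page-sized blocks.
# PAGE_SIZE 4096 is an assumption (the common x86 default).

PAGE_SIZE = 4096          # bytes per page (assumed)
BITS_PER_CHUNK_MEM = 16   # in-memory counter per chunk, per the post

def bitmap_pages(array_kib, chunk_kib):
    """Pages needed for the in-memory bitmap of an array of array_kib KiB."""
    chunks = -(-array_kib // chunk_kib)            # ceiling division
    mem_bytes = chunks * BITS_PER_CHUNK_MEM // 8
    return -(-mem_bytes // PAGE_SIZE)

# e.g. a hypothetical 120 GiB array with the 1024 KiB chunk from the
# mdstat line above:
array_kib = 120 * 1024 * 1024
print(bitmap_pages(array_kib, 1024))   # -> 60
```

The "23/117" figure then just says how many of those pages currently hold nonzero counters; the rest have been freed back because no writes are in flight for their chunks.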
Re: New FAQ entry? (was IBM xSeries stop responding during RAID1 reconstruction)
Niccolo Rigacci wrote:
[]
> From the command line you can see which schedulers are supported
> and change it on the fly (remember to do it for each RAID disk):
>
>   # cat /sys/block/hda/queue/scheduler
>   noop [anticipatory] deadline cfq
>   # echo cfq > /sys/block/hda/queue/scheduler
>
> Otherwise you can recompile your kernel and set CFQ as the
> default I/O scheduler (CONFIG_DEFAULT_CFQ=y in Block layer, IO
> Schedulers, Default I/O scheduler).

There's a much easier/simpler way to set the default scheduler. As someone suggested, RTFM Documentation/kernel-parameters.txt. Passing elevator=cfq (or whatever) will do the trick, much more simply than a kernel recompile.

/mjt
Re: New FAQ entry? (was IBM xSeries stop responding during RAID1 reconstruction)
OK :)

David

Niccolo Rigacci wrote:
> Thanks to the several guys on this list, I have solved my problem and
> elaborated this; could it be a new FAQ entry?
>
> Q: Sometimes when a RAID volume is resyncing, the system seems to
> lock up: all disk activity is blocked until the resync is done.
>
> A: This is not strictly related to Linux RAID; it is a problem of the
> Linux kernel and the disk subsystem: under no circumstances should a
> process get all the disk resources, preventing others from accessing
> them.
>
> You can control the max speed at which RAID reconstruction is done by
> setting it, say at 5 MB/s:
>
>   echo 5000 > /proc/sys/dev/raid/speed_limit_max
>
> This is just a workaround: you have to determine, by trial and error,
> the max speed that does not lock up your system, and you cannot
> predict what the disk load will be in the future when the RAID is
> resyncing for some reason.
>
> Starting from version 2.6, the Linux kernel offers several choices of
> I/O scheduler. The default is the anticipatory scheduler, which seems
> to be sub-optimal under resync load. If your kernel has the CFQ
> scheduler compiled in, use it during resync.
>
> From the command line you can see which schedulers are supported
> and change it on the fly (remember to do it for each RAID disk):
>
>   # cat /sys/block/hda/queue/scheduler
>   noop [anticipatory] deadline cfq
>   # echo cfq > /sys/block/hda/queue/scheduler
>
> Otherwise you can recompile your kernel and set CFQ as the
> default I/O scheduler (CONFIG_DEFAULT_CFQ=y in Block layer, IO
> Schedulers, Default I/O scheduler).
New FAQ entry? (was IBM xSeries stop responding during RAID1 reconstruction)
Thanks to the several guys on this list, I have solved my problem and elaborated this; could it be a new FAQ entry?

Q: Sometimes when a RAID volume is resyncing, the system seems to lock up: all disk activity is blocked until the resync is done.

A: This is not strictly related to Linux RAID; it is a problem of the Linux kernel and the disk subsystem: under no circumstances should a process get all the disk resources, preventing others from accessing them.

You can control the max speed at which RAID reconstruction is done by setting it, say at 5 MB/s:

  echo 5000 > /proc/sys/dev/raid/speed_limit_max

This is just a workaround: you have to determine, by trial and error, the max speed that does not lock up your system, and you cannot predict what the disk load will be in the future when the RAID is resyncing for some reason.

Starting from version 2.6, the Linux kernel offers several choices of I/O scheduler. The default is the anticipatory scheduler, which seems to be sub-optimal under resync load. If your kernel has the CFQ scheduler compiled in, use it during resync.

From the command line you can see which schedulers are supported and change it on the fly (remember to do it for each RAID disk):

  # cat /sys/block/hda/queue/scheduler
  noop [anticipatory] deadline cfq
  # echo cfq > /sys/block/hda/queue/scheduler

Otherwise you can recompile your kernel and set CFQ as the default I/O scheduler (CONFIG_DEFAULT_CFQ=y in Block layer, IO Schedulers, Default I/O scheduler).

--
Niccolo Rigacci
Firenze - Italy

Iraq, peace mission: 38475 dead - www.iraqbodycount.net
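[Editor's note: if you want to script the scheduler switch across all RAID disks, the one-line sysfs format shown above (the active scheduler in square brackets) is easy to parse. A minimal sketch, assuming only that one-line format:]

```python
# Parse the one-line contents of /sys/block/<dev>/queue/scheduler, where
# the active scheduler is the bracketed entry, e.g.
#   "noop [anticipatory] deadline cfq"

def parse_schedulers(line):
    """Return (available_schedulers, current_scheduler)."""
    available = []
    current = None
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]   # strip the brackets marking the active one
            current = name
        available.append(name)
    return available, current

avail, cur = parse_schedulers("noop [anticipatory] deadline cfq")
print(avail)   # -> ['noop', 'anticipatory', 'deadline', 'cfq']
print(cur)     # -> 'anticipatory'
```

A script would read each /sys/block/*/queue/scheduler file, check that "cfq" is in the available list, and write "cfq" back to the same file (as the shell one-liner elsewhere in this thread does).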