Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)
On Monday May 29, [EMAIL PROTECTED] wrote:
> On Mon, May 29, 2006 at 12:08:25PM +1000, Neil Brown wrote:
> > On Sunday May 28, [EMAIL PROTECTED] wrote:
> >
> > Thanks for the patches.  They are greatly appreciated.
>
> You're welcome
>
> > > - mdadm-2.3.1-kernel-byteswap-include-fix.patch
> > >   reverts a change introduced with mdadm 2.3.1 for redhat compatibility
> > >   asm/byteorder.h is an architecture dependent file and does more
> > >   stuff than a call to the linux/byteorder/XXX_endian.h
> > >   the fact that not calling asm/byteorder.h does not define
> > >   __BYTEORDER_HAS_U64__ is just an example of issues that might arise.
> > >   if redhat is broken it should be worked around differently than
> > >   breaking mdadm.
> >
> > I don't understand the problem here.  What exactly breaks with the
> > code currently in 2.5?  mdadm doesn't need __BYTEORDER_HAS_U64__, so
> > why does not having it defined break anything?
> > The comment from the patch says:
> >    not including asm/byteorder.h will not define __BYTEORDER_HAS_U64__
> >    causing __fswab64 to be undefined and failure compiling mdadm on
> >    big_endian architectures like PPC
> >
> > But again, mdadm doesn't use __fswab64
> > More details please.
>
> you use __cpu_to_le64 (ie in super0.c line 987)
>         bms->sync_size = __cpu_to_le64(size);
>
> which in byteorder/big_endian.h is defined as
> #define __cpu_to_le64(x) ((__force __le64)__swab64((x)))
>
> but __swab64 is defined in byteorder/swab.h (included by
> byteorder/big_endian.h) as
>
> #if defined(__GNUC__) && (__GNUC__ >= 2) && defined(__OPTIMIZE__)
> #  define __swab64(x) \
> (__builtin_constant_p((__u64)(x)) ? \
>  ___swab64((x)) : \
>  __fswab64((x)))
> #else
> #  define __swab64(x) __fswab64(x)
> #endif /* OPTIMIZE */

Grrr..

Thanks for the details.  I think I'll just give up and do it myself.
e.g.

short swap16(short in)
{
	int i;
	short out=0;
	for (i=0; i<4; i++) {
		out = out<<8 | (in&255);
		in = in >> 8;
	}
	return out;
}

I don't need top performance and at least this should be portable...
NeilBrown - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
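For readers following the thread, here is a minimal sketch of the kind of portable, header-independent byte swapping described in the message above. It is not the code that went into mdadm; the names swap16, swap64 and put_le64 are hypothetical. The sketch assumes unsigned fixed-width types from <stdint.h>, which avoids sign extension on the right shift, and it loops once per byte of the value (two iterations for 16 bits, eight for 64 bits).

```c
#include <stdint.h>

/* Hypothetical helper names; not taken from mdadm. */

/* Reverse the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t in)
{
	return (uint16_t)((in << 8) | (in >> 8));
}

/* Reverse the eight bytes of a 64-bit value, one byte per iteration. */
uint64_t swap64(uint64_t in)
{
	uint64_t out = 0;
	unsigned i;

	for (i = 0; i < sizeof in; i++) {
		out = (out << 8) | (in & 0xff);
		in >>= 8;
	}
	return out;
}

/* Store v in little-endian order regardless of host byte order:
 * the least-significant byte is written first. */
void put_le64(unsigned char buf[8], uint64_t v)
{
	unsigned i;

	for (i = 0; i < 8; i++)
		buf[i] = (unsigned char)(v >> (8 * i));
}
```

Writing bytes out explicitly, as put_le64 does, sidesteps the question of whether the host is big- or little-endian, so no __BYTEORDER_HAS_U64__ or __fswab64 definitions are needed at all.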
Re: problems with raid=noautodetect
Neil Brown wrote:
> On Friday May 26, [EMAIL PROTECTED] wrote:
[]
> If we assume there is a list of devices provided by a (possibly
> default) 'DEVICE' line, then
>    DEVICEFILTER !pattern1 !pattern2 pattern3 pattern4
> could mean that any device in that list which matches pattern 1 or 2
> is immediately discarded, any remaining device that matches pattern
> 3 or 4 is included, and the remainder are discarded.
>
> The rule could be that the default is to include any devices that
> don't match a !pattern, unless there is a pattern without a '!', in
> which case the default is to reject non-accepted patterns.
> Is that straightforward enough, or do I need an
>    order allow,deny
> like apache has?

I'd suggest the following.

All the other devices are included or excluded from the list of
devices to consider based on the last component in the DEVICE line.
Ie. if it ends up at !dev, all the rest of devices are included.
If it ends up at dev (w/o !), all the rest are excluded.  If memory
serves me right, it's how squid ACLs work.

There's no need to introduce new keyword.  Given this rule, a line

   DEVICE a b c

will do exactly as it does now.  Line

   DEVICE a b c !d

is somewhat redundant - it's the same as

   DEVICE !d

Ie, if the list ends up at !stuff, append `partitions' (or *) to it.
Of course mixing !s and !!s is useful, like to say use all sda* but
not sda1:

   DEVICE !sda1 sda*

(and nothing else).

And the default is to have `DEVICE partitions'.

The only possible issue I see here is that with udev, it's possible
to use, say, /dev/disk/by-id/*-like stuff (don't remember exact
directory layout) -- symlinked to /dev/sd* according to the disk
serial number or something like that -- for this to work, mdadm
needs to use glob() internally.

/mjt
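The rule proposed in this message can be sketched in a few lines of C. This is only an illustration under stated assumptions, not mdadm code: device_allowed is a hypothetical name, patterns are matched with POSIX fnmatch(), and the first matching word is assumed to decide (the thread compares the rule to squid ACLs but does not spell out the evaluation order). An empty list is rejected here; a real caller would presumably substitute the default `DEVICE partitions' first.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical helper, not part of mdadm: decide whether `dev` is accepted
 * by the words of a DEVICE line under the proposed rule.
 * Returns 1 to include the device, 0 to exclude it. */
int device_allowed(const char *dev, const char *const *words, size_t n)
{
	size_t i;

	if (n == 0)
		return 0;	/* assumption: caller supplies the default list */

	for (i = 0; i < n; i++) {
		int negated = (words[i][0] == '!');
		const char *pat = negated ? words[i] + 1 : words[i];

		if (fnmatch(pat, dev, 0) == 0)
			return !negated;	/* assumption: first match decides */
	}

	/* No word matched, so the last word sets the default:
	 * a trailing "!dev" includes the rest, a trailing "dev" excludes it. */
	return words[n - 1][0] == '!';
}
```

With the words { "!sda1", "sda*" }, this accepts sda2, rejects sda1, and rejects everything else, which matches the `DEVICE !sda1 sda*' example in the message.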
Re: problems with raid=noautodetect
On Mon, May 29, 2006 at 12:38:25PM +0400, Michael Tokarev wrote:
> Neil Brown wrote:
> > On Friday May 26, [EMAIL PROTECTED] wrote:
> I'd suggest the following.
>
> All the other devices are included or excluded from the list of
> devices to consider based on the last component in the DEVICE line.
> Ie. if it ends up at !dev, all the rest of devices are included.
> If it ends up at dev (w/o !), all the rest are excluded.  If memory
> serves me right, it's how squid ACLs works.
>
> There's no need to introduce new keyword.  Given this rule, a line

as i said the new keyword is to warn on configurations that do not
account for changing device-ids, and if we change the syntax a new
keyword would make it clearer. In case the user tries to use a new
configuration on an old mdadm.

> The only possible issue I see here is that with udev, it's possible
> to use, say, /dev/disk/by-id/*-like stuff (don't remember exact
> directory layout) -- symlinked to /dev/sd* according to the disk
> serial number or something like that -- for this to work, mdadm
> needs to use glob() internally.

uhm
i think that we would better translate any device found on a DEVICE (or
DEVICEFILTER) line to the corresponding major/minor number and blacklist
based on that.
nothing prevents someone to have an udev rule that creates a device
node, instead of symlinking.

L.

--
Luca Berra -- [EMAIL PROTECTED]
        Communication Media Services S.r.l.
 /\
 \ /     ASCII RIBBON CAMPAIGN
  X        AGAINST HTML MAIL
 / \
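The major/minor idea from this message could look roughly like the following on Linux. This is a sketch under assumptions rather than anything from mdadm: block_dev_id and dev_blacklisted are hypothetical names, and the major() and minor() macros are taken from glibc's <sys/sysmacros.h>. Because stat() follows symlinks, a /dev/disk/by-id/ link and a separately created device node for the same disk resolve to the same device number, which is the point being made.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helpers, not part of mdadm. */

/* Resolve `path` to its device number. stat() follows symlinks, so a
 * symlink and the node it points at give the same result.
 * Returns 0 on success, -1 if the path is missing or not a block device. */
int block_dev_id(const char *path, dev_t *out)
{
	struct stat st;

	if (stat(path, &st) != 0 || !S_ISBLK(st.st_mode))
		return -1;
	*out = st.st_rdev;
	return 0;
}

/* Return 1 if `dev` has the same major/minor as any entry in `list`. */
int dev_blacklisted(dev_t dev, const dev_t *list, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (major(list[i]) == major(dev) && minor(list[i]) == minor(dev))
			return 1;
	return 0;
}
```

A caller would resolve every name on the DEVICE (or DEVICEFILTER) line once with block_dev_id, then test each candidate device against the resulting list instead of comparing path strings.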
Re: raid5 hang on get_active_stripe
On Sun, 28 May 2006, Neil Brown wrote:

> The following patch adds some more tracing to raid5, and might fix a
> subtle bug in ll_rw_blk, though it is an incredible long shot that
> this could be affecting raid5 (if it is, I'll have to assume there is
> another bug somewhere).   It certainly doesn't break ll_rw_blk.
> Whether it actually fixes something I'm not sure.
>
> If you could try with these on top of the previous patches I'd really
> appreciate it.
>
> When you read from /stripe_cache_active, it should trigger a
> (cryptic) kernel message within the next 15 seconds.  If I could get
> the contents of that file and the kernel messages, that should help.

got the hang again... attached is the dmesg with the cryptic messages.  i
didn't think to grab the task dump this time though.

hope there's a clue in this one :)  but send me another patch if you need
more data.

-dean

neemlark:/sys/block/md4/md# cat stripe_cache_size
256
neemlark:/sys/block/md4/md# cat stripe_cache_active
251 0 preread plugged bitlist=0 delaylist=251
neemlark:/sys/block/md4/md# cat stripe_cache_active
251 0 preread plugged bitlist=0 delaylist=251
neemlark:/sys/block/md4/md# echo 512 > stripe_cache_size
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 292 preread not plugged bitlist=0 delaylist=32
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 292 preread not plugged bitlist=0 delaylist=32
neemlark:/sys/block/md4/md# cat stripe_cache_active
445 0 preread not plugged bitlist=0 delaylist=73
neemlark:/sys/block/md4/md# cat stripe_cache_active
480 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
413 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
13 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
493 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
487 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
405 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 1 preread not plugged bitlist=0 delaylist=28
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 84 preread not plugged bitlist=0 delaylist=69
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 69 preread not plugged bitlist=0 delaylist=56
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 41 preread not plugged bitlist=0 delaylist=38
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 10 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
453 3 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
480 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 14 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
477 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
476 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
486 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
480 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
384 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
387 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
462 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
480 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
448 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
501 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
476 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
416 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
386 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
434 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
406 0 preread not plugged bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
447 0 preread not plugged bitlist=0
[PATCH] md: Fix badness in sysfs_notify caused by md_new_event
Here is a patch for a bug in 2.6.17-rc that just came to light.  It
should get into 2.6.17 if possible and so is in 'obviously correct'
form rather than correct final fix form (see comments).

The patch was actually generated against -rc4-mm3, but applies to
-rc4-git9 with a moderate offset for one of the chunks (no fuzz).

Thanks,
NeilBrown

### Comments for Changeset

If an error is reported by a drive in a RAID array (which is done via
bi_end_io - in interrupt context), we call md_error and md_new_event
which calls sysfs_notify.  However sysfs_notify grabs a mutex and so
cannot be called in interrupt context.

This patch just creates a variant of md_new_event which avoids the
sysfs call, and uses that.  A better fix for later is to arrange for
the event to be called from user-context.

Note: avoiding the sysfs call isn't a problem as an error will not,
by itself, modify the sync_action attribute.  (We do still need to
wake_up(&md_event_waiters) as an error by itself will modify
/proc/mdstat).

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c |   11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2006-05-30 15:14:20.0 +1000
+++ ./drivers/md/md.c	2006-05-30 15:23:26.0 +1000
@@ -172,6 +172,15 @@ void md_new_event(mddev_t *mddev)
 }
 EXPORT_SYMBOL_GPL(md_new_event);
 
+/* Alternate version that can be called from interrupts
+ * when calling sysfs_notify isn't needed.
+ */
+void md_new_event_inintr(mddev_t *mddev)
+{
+	atomic_inc(&md_event_count);
+	wake_up(&md_event_waiters);
+}
+
 /*
  * Enables to iterate over all existing md arrays
  * all_mddevs_lock protects this list.
@@ -4268,7 +4277,7 @@ void md_error(mddev_t *mddev, mdk_rdev_t
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 	md_wakeup_thread(mddev->thread);
-	md_new_event(mddev);
+	md_new_event_inintr(mddev);
 }
 
 /* seq_file implementation /proc/mdstat */
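The changeset comments mention that a better long-term fix is to have the event raised from user context. As a rough illustration of that deferral pattern only, here is a userspace model using C11 atomics. It is not kernel code and does not reflect how md.c was eventually changed; every name in it (new_event_inintr, flush_deferred_notify, slow_notify and the variables) is hypothetical. The "interrupt" path touches only atomics, while the notification that may sleep runs later from an ordinary context.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in md.c. */
static atomic_int event_count;		/* stands in for md_event_count */
static atomic_bool notify_pending;	/* set from the "interrupt" path */
static int notifications_sent;		/* counts calls to the slow notifier */

/* Stands in for a call that may sleep (sysfs_notify takes a mutex),
 * so it must only run from process context. */
static void slow_notify(void)
{
	notifications_sent++;
}

/* Safe in "interrupt" context: atomic operations only, nothing sleeps. */
void new_event_inintr(void)
{
	atomic_fetch_add(&event_count, 1);
	atomic_store(&notify_pending, true);
}

/* Run later from process context: performs the deferred notification
 * once, however many events were recorded in the meantime. */
void flush_deferred_notify(void)
{
	if (atomic_exchange(&notify_pending, false))
		slow_notify();
}
```

Several events recorded before a flush collapse into a single notification, which is acceptable for a "something changed, go and look" style of signal.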