Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)
On Monday May 29, [EMAIL PROTECTED] wrote: > On Mon, May 29, 2006 at 12:08:25PM +1000, Neil Brown wrote: > >On Sunday May 28, [EMAIL PROTECTED] wrote: > >Thanks for the patches. They are greatly appreciated. > You're welcome > >> > >> - mdadm-2.3.1-kernel-byteswap-include-fix.patch > >> reverts a change introduced with mdadm 2.3.1 for redhat compatibility > >> asm/byteorder.h is an architecture dependent file and does more > >> stuff than a call to the linux/byteorder/XXX_endian.h > >> the fact that not calling asm/byteorder.h does not define > >> __BYTEORDER_HAS_U64__ is just an example of issues that might arise. > >> if redhat is broken it should be worked around differently than breaking > >> mdadm. > > > >I don't understand the problem here. What exactly breaks with the > >code currently in 2.5? mdadm doesn't need __BYTEORDER_HAS_U64__, so > >why does not having id defined break anything? > >The coomment from the patch says: > > > not including asm/byteorder.h will not define __BYTEORDER_HAS_U64__ > > > causing __fswab64 to be undefined and failure compiling mdadm on > > > big_endian architectures like PPC > > > >But again, mdadm doesn't use __fswab64 > >More details please. > you use __cpu_to_le64 (ie in super0.c line 987) > > bms->sync_size = __cpu_to_le64(size); > > which in byteorder/big_endian.h is defined as > > #define __cpu_to_le64(x) ((__force __le64)__swab64((x))) > > but __swab64 is defined in byteorder/swab.h (included by > byteorder/big_endian.h) as > > #if defined(__GNUC__) && (__GNUC__ >= 2) && defined(__OPTIMIZE__) > # define __swab64(x) \ > (__builtin_constant_p((__u64)(x)) ? \ > ___swab64((x)) : \ > __fswab64((x))) > #else > # define __swab64(x) __fswab64(x) > #endif /* OPTIMIZE */ Grrr.. Thanks for the details. I think I'll just give up and do it myself. e.g. short swap16(short in) { int i; short out=0; for (i=0; i<4; i++) { out = out<<8 | (in&255); in = in >> 8; } return out; } I don't need top performance and at least this should be portable... NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)
On Mon, May 29, 2006 at 12:08:25PM +1000, Neil Brown wrote: On Sunday May 28, [EMAIL PROTECTED] wrote: Thanks for the patches. They are greatly appreciated. You're welcome - mdadm-2.3.1-kernel-byteswap-include-fix.patch reverts a change introduced with mdadm 2.3.1 for redhat compatibility asm/byteorder.h is an architecture dependent file and does more stuff than a call to the linux/byteorder/XXX_endian.h the fact that not calling asm/byteorder.h does not define __BYTEORDER_HAS_U64__ is just an example of issues that might arise. if redhat is broken it should be worked around differently than breaking mdadm. I don't understand the problem here. What exactly breaks with the code currently in 2.5? mdadm doesn't need __BYTEORDER_HAS_U64__, so why does not having id defined break anything? The coomment from the patch says: > not including asm/byteorder.h will not define __BYTEORDER_HAS_U64__ > causing __fswab64 to be undefined and failure compiling mdadm on > big_endian architectures like PPC But again, mdadm doesn't use __fswab64 More details please. you use __cpu_to_le64 (ie in super0.c line 987) bms->sync_size = __cpu_to_le64(size); which in byteorder/big_endian.h is defined as #define __cpu_to_le64(x) ((__force __le64)__swab64((x))) but __swab64 is defined in byteorder/swab.h (included by byteorder/big_endian.h) as #if defined(__GNUC__) && (__GNUC__ >= 2) && defined(__OPTIMIZE__) # define __swab64(x) \ (__builtin_constant_p((__u64)(x)) ? \ ___swab64((x)) : \ __fswab64((x))) #else # define __swab64(x) __fswab64(x) #endif /* OPTIMIZE */ and __fswab64 is defined further into byteorder/swab.h as #ifdef __BYTEORDER_HAS_U64__ static __inline__ __attribute_const__ __u64 __fswab64(__u64 x) . #endif /* __BYTEORDER_HAS_U64__ */ so building mdadm on a ppc (or i suppose a sparc) will break now, if you look at /usr/src/linux/asm-*/byteorder.h you will notice they are very different files, this makes me believe it is not a good idea bypassing asm/byteorder.h And no, just defining __BYTEORDER_HAS_U64__ will break on 32bit big-endian cpus (and if i do not misread it might just compile and give incorrect results) - mdadm-2.4-snprintf.patch this is self commenting, just an error in the snprintf call I wonder how that snuck in... There was an odd extra tab in the patch, but no-matter. I changed it to use 'sizeof(buf)' to be consistent with other uses of snprint. Thanks. yes, that would be better. - mdadm-2.4-strict-aliasing.patch fix for another srict-aliasing problem, you can typecast a reference to a void pointer to anything, you cannot typecast a reference to a struct. Why can't I typecast a reference to a struct??? It seems very unfair... that's strict-aliasing optimization for you, i do agree it _is_ unfair. However I have no problem with the patch. Applied. Thanks. I should really change it to use 'list.h' type lists from the linux kernel. hopefull redhat would not object :) - mdadm-2.5-mdassemble.patch pass CFLAGS to mdassemble build, enabling -Wall -Werror showed some issues also fixed by the patch. yep, thanks. - mdadm-2.5-rand.patch Posix dictates rand() versus bsd random() function, and dietlibc deprecated random(), so switch to srand()/rand() and make everybody happy. Everybody? 'man 3 rand' tells me: Do not use this function in applications intended to be portable when good randomness is needed. Admittedly mdadm doesn't need to be portable - it only needs to run on Linux. But this line in the man page bothers me. I guess -Drandom=rand -Dsrandom=srand might work no. stdlib.h doesn't like that. 'random' returns 'long int' while rand returns 'int'. Interestingly 'random_r' returns 'int' as does 'rand_r'. #ifdef __dietlibc__ #include /* dietlibc has deprecated random and srandom!! */ #define random rand #define srandom srand #endif in mdadm.h. Will that do you? yes, mdassemble will build, so it is ok for me. - mdadm-2.5-unused.patch glibc 2.4 is pedantic on ignoring return values from fprintf, fwrite and write, so now we check the rval and actually do something with it. in the Grow.c case i only print a warning, since i don't think we can do anithing in case we fail invalidating those superblocks (is should never happen, but then...) Ok, thanks. You can see these patches at http://neil.brown.name/cgi-bin/gitweb.cgi?p=mdadm Thanks. -- Luca Berra -- [EMAIL PROTECTED] Communication Media & Services S.r.l. /"\ \ / ASCII RIBBON CAMPAIGN XAGAINST HTML MAIL / \ - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RAID5 kicks non-fresh drives
On Friday May 26, [EMAIL PROTECTED] wrote: > On Thu, 25 May 2006, Craig Hollabaugh wrote: > > > That did it! I set the partition FS Types from 'Linux' to 'Linux raid > > autodetect' after my last re-sync completed. Manually stopped and > > started the array. Things looked good, so I crossed my fingers and > > rebooted. The kernel found all the drives and all is happy here in > > Colorado. > > Would it make sense for the raid code to somehow warn in the log when a > device in a raid set doesn't have "Linux raid autodetect" partition type? > If this was in "dmesg", would you have spotted the problem before? Maybe. Unfortunately md doesn't really have direct access to information on partition types. The way it gets access for auto-detect is an ugly hack which I would rather not make any further use of. Maybe mdadm could be more helpful here. e.g. when you create, assemble, or 'detail' an array it could report any inconsistencies in the partition types, and when you --add a device which is isn't a Raid-autodetect partition to an array the currently comprises such partitions it could give a warning. I had thought that 'libblkid' would help with that, but having looked at the doco, it appears not Maybe I use libparted... or maybe borrow code out of kpartx. There don't seem to be any easy options ;-( Thanks for the suggestion (and if anyone has some good partition hacking code...) NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: problems with raid=noautodetect
On Mon, May 29, 2006 at 02:21:09PM +1000, Neil Brown wrote: 3) introduce "DEVICEFILTER" or similar keyword with the same meaning at the actual "DEVICE" keyboard If it has the same meaning, why not leave it called 'DEVICE'??? the idea was to warn people that write DEVICE /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 ARRAY /dev/md0 ... that it might break since disk naming is not guaranteed to be constant. However, there is at least the beginnings of a good idea here. If we assume there is a list of devices provided by a (possibly default) 'DEVICE' line, then DEVICEFILTER !pattern1 !pattern2 pattern3 pattern4 could mean that any device in that list which matches pattern 1 or 2 is immediately discarded, and remaining device that matches patterns 3 or 4 are included, and the remainder are discard. The rule could be that the default is to include any devices that don't match a !pattern, unless there is a pattern without a '!', in which case the default is to reject non-accepted patterns. Is that straight forward enough, or do I need an order allow,deny like apache has? I think that documenting the feature would be enough L. -- Luca Berra -- [EMAIL PROTECTED] Communication Media & Services S.r.l. /"\ \ / ASCII RIBBON CAMPAIGN XAGAINST HTML MAIL / \ - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RAID5 kicks non-fresh drives
On Friday May 26, [EMAIL PROTECTED] wrote: > > I had no idea about this particular configuration requirement. None of > > just to be clear: it's not a requirement. if you want the very nice > auto-assembling behavior, you need to designate the auto-assemblable > partitions. but you can assemble "manually" without 0xfd partitions > (even if that's in an initrd, for instance.) > > I think the current situation is good, since there is some danger of > going too far. for instance, testing each partition to see whether > it contains a valid superblock would be pretty crazy, right? I'm curious: why exactly do you say that? Doing the reads themselves cannot be a problem as the kernel already reads the partition table from each devices. Reading superblocks is no big deal. If you don't like the idea of assembling everything that was found, how is that different from. requiring > either the "auto-assemble-me" partition type, or explicit partitions > given in a config file is a happy medium... assembling everything that was found which had an 'auto-assemble-me' flag? That flag, in common usage, contains almost zero information more than the existence of the raid superblock. Am I missing something? My opinion: the "auto-assemble-me" partition type is not a happy medium. The superblock containing the hostname (as supported by mdadm-2.5) is (I hope). NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: problems with raid=noautodetect
On Friday May 26, [EMAIL PROTECTED] wrote: > On Tue, May 23, 2006 at 08:39:26AM +1000, Neil Brown wrote: > >Presumably you have a 'DEVICE' line in mdadm.conf too? What is it. > >My first guess is that it isn't listing /dev/sdd? somehow. > > Neil, > i am seeing a lot of people that fall in this same error, and i would > propose a way of avoiding this problem > > 1) make "DEVICE partitions" the default if no device line is specified. As you note, we think alike on this :-) > 2) deprecate the "DEVICE" keyword issuing a warning when it is found in > the configuration file Not sure I'm so keen on that, at least not in the near term. > 3) introduce "DEVICEFILTER" or similar keyword with the same meaning at > the actual "DEVICE" keyboard If it has the same meaning, why not leave it called 'DEVICE'??? However, there is at least the beginnings of a good idea here. If we assume there is a list of devices provided by a (possibly default) 'DEVICE' line, then DEVICEFILTER !pattern1 !pattern2 pattern3 pattern4 could mean that any device in that list which matches pattern 1 or 2 is immediately discarded, and remaining device that matches patterns 3 or 4 are included, and the remainder are discard. The rule could be that the default is to include any devices that don't match a !pattern, unless there is a pattern without a '!', in which case the default is to reject non-accepted patterns. Is that straight forward enough, or do I need an order allow,deny like apache has? Thanks for the suggestion. NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch] install a static build
On Sunday May 28, [EMAIL PROTECTED] wrote: > Hello Luca, > > > maybe you better add an install-static target. > > you're right, that would be a cleaner approach. I've don so, and while > doing so added install-tcc, install-ulibc, install-klibc too. > > And while I'm busy in the Makefile anyway I've made a third patch > which adds the uninstall: target too. > > -- > ---> Dirk Jagdmann > > http://cubic.org/~doj > -> http://llg.cubic.org thanks for these. They are now in my git tree: http://neil.brown.name/cgi-bin/gitweb.cgi?p=mdadm git://neil.brown.name/mdadm They claim me as their author I'm afraid.. I'll have to fix my scripts to get it right next time :-( NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)
On Sunday May 28, [EMAIL PROTECTED] wrote: > On Fri, May 26, 2006 at 04:33:08PM +1000, Neil Brown wrote: > > > > > >I am pleased to announce the availability of > > mdadm version 2.5 > > > > hello, > i tried rebuilding mdadm 2.5 on current mandriva cooker, which uses > gcc-4.1.1, glibc-2.4 and dietlibc 0.29 and found the following issues > addressed by patches attacched to this message > I would be glad if you could review these patches and include them in > upcoming mdadm releases. Thanks for the patches. They are greatly appreciated. > > - mdadm-2.3.1-kernel-byteswap-include-fix.patch > reverts a change introduced with mdadm 2.3.1 for redhat compatibility > asm/byteorder.h is an architecture dependent file and does more > stuff than a call to the linux/byteorder/XXX_endian.h > the fact that not calling asm/byteorder.h does not define > __BYTEORDER_HAS_U64__ is just an example of issues that might arise. > if redhat is broken it should be worked around differently than breaking > mdadm. I don't understand the problem here. What exactly breaks with the code currently in 2.5? mdadm doesn't need __BYTEORDER_HAS_U64__, so why does not having id defined break anything? The coomment from the patch says: > not including asm/byteorder.h will not define __BYTEORDER_HAS_U64__ > causing __fswab64 to be undefined and failure compiling mdadm on > big_endian architectures like PPC But again, mdadm doesn't use __fswab64 More details please. > > - mdadm-2.4-snprintf.patch > this is self commenting, just an error in the snprintf call I wonder how that snuck in... There was an odd extra tab in the patch, but no-matter. I changed it to use 'sizeof(buf)' to be consistent with other uses of snprint. Thanks. > > - mdadm-2.4-strict-aliasing.patch > fix for another srict-aliasing problem, you can typecast a reference to a > void pointer to anything, you cannot typecast a reference to a > struct. Why can't I typecast a reference to a struct??? It seems very unfair... However I have no problem with the patch. Applied. Thanks. I should really change it to use 'list.h' type lists from the linux kernel. > > - mdadm-2.5-mdassemble.patch > pass CFLAGS to mdassemble build, enabling -Wall -Werror showed some > issues also fixed by the patch. yep, thanks. > > - mdadm-2.5-rand.patch > Posix dictates rand() versus bsd random() function, and dietlibc > deprecated random(), so switch to srand()/rand() and make everybody > happy. Everybody? 'man 3 rand' tells me: Do not use this function in applications intended to be portable when good randomness is needed. Admittedly mdadm doesn't need to be portable - it only needs to run on Linux. But this line in the man page bothers me. I guess -Drandom=rand -Dsrandom=srand might work no. stdlib.h doesn't like that. 'random' returns 'long int' while rand returns 'int'. Interestingly 'random_r' returns 'int' as does 'rand_r'. #ifdef __dietlibc__ #include /* dietlibc has deprecated random and srandom!! */ #define random rand #define srandom srand #endif in mdadm.h. Will that do you? > > - mdadm-2.5-unused.patch > glibc 2.4 is pedantic on ignoring return values from fprintf, fwrite and > write, so now we check the rval and actually do something with it. > in the Grow.c case i only print a warning, since i don't think we can do > anithing in case we fail invalidating those superblocks (is should never > happen, but then...) Ok, thanks. You can see these patches at http://neil.brown.name/cgi-bin/gitweb.cgi?p=mdadm more welcome :-) Thanks again, NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)
On Sun, 28 May 2006, Luca Berra wrote: > dietlibc rand() and random() are the same function. > but random will throw a warning saying it is deprecated. that's terribly obnoxious... it's never going to be deprecated, there are only approximately a bazillion programs using random(). -dean - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)
On Sun, May 28, 2006 at 10:08:19AM -0700, dean gaudet wrote: On Sun, 28 May 2006, Luca Berra wrote: - mdadm-2.5-rand.patch Posix dictates rand() versus bsd random() function, and dietlibc deprecated random(), so switch to srand()/rand() and make everybody happy. fwiw... lots of rand()s tend to suck... and RAND_MAX may not be large enough for this use. glibc rand() is the same as random(). do you know the fact that glibc rand() is the same implementation as random() was one of the reason i believe we could switch to rand() if dietlibc's rand() is good enough? dietlibc rand() and random() are the same function. but random will throw a warning saying it is deprecated. L. -- Luca Berra -- [EMAIL PROTECTED] Communication Media & Services S.r.l. /"\ \ / ASCII RIBBON CAMPAIGN XAGAINST HTML MAIL / \ - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)
On Sun, 28 May 2006, Luca Berra wrote: > - mdadm-2.5-rand.patch > Posix dictates rand() versus bsd random() function, and dietlibc > deprecated random(), so switch to srand()/rand() and make everybody > happy. fwiw... lots of rand()s tend to suck... and RAND_MAX may not be large enough for this use. glibc rand() is the same as random(). do you know if dietlibc's rand() is good enough? -dean - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] mdadm 2.5 (Was: ANNOUNCE: mdadm 2.5 - A tool for managing Soft RAID under Linux)
On Fri, May 26, 2006 at 04:33:08PM +1000, Neil Brown wrote: I am pleased to announce the availability of mdadm version 2.5 hello, i tried rebuilding mdadm 2.5 on current mandriva cooker, which uses gcc-4.1.1, glibc-2.4 and dietlibc 0.29 and found the following issues addressed by patches attacched to this message I would be glad if you could review these patches and include them in upcoming mdadm releases. - mdadm-2.3.1-kernel-byteswap-include-fix.patch reverts a change introduced with mdadm 2.3.1 for redhat compatibility asm/byteorder.h is an architecture dependent file and does more stuff than a call to the linux/byteorder/XXX_endian.h the fact that not calling asm/byteorder.h does not define __BYTEORDER_HAS_U64__ is just an example of issues that might arise. if redhat is broken it should be worked around differently than breaking mdadm. - mdadm-2.4-snprintf.patch this is self commenting, just an error in the snprintf call - mdadm-2.4-strict-aliasing.patch fix for another srict-aliasing problem, you can typecast a reference to a void pointer to anything, you cannot typecast a reference to a struct. - mdadm-2.5-mdassemble.patch pass CFLAGS to mdassemble build, enabling -Wall -Werror showed some issues also fixed by the patch. - mdadm-2.5-rand.patch Posix dictates rand() versus bsd random() function, and dietlibc deprecated random(), so switch to srand()/rand() and make everybody happy. - mdadm-2.5-unused.patch glibc 2.4 is pedantic on ignoring return values from fprintf, fwrite and write, so now we check the rval and actually do something with it. in the Grow.c case i only print a warning, since i don't think we can do anithing in case we fail invalidating those superblocks (is should never happen, but then...) Regards, L. -- Luca Berra -- [EMAIL PROTECTED] Communication Media & Services S.r.l. /"\ \ / ASCII RIBBON CAMPAIGN XAGAINST HTML MAIL / \ * Sat Feb 18 2006 Christiaan Welvaart <[EMAIL PROTECTED]> not including asm/byteorder.h will not define __BYTEORDER_HAS_U64__ causing __fswab64 to be undefined and failure compiling mdadm on big_endian architectures like PPC --- mdadm-2.3.1/mdadm.h.bak 2006-02-06 04:52:12.0 +0100 +++ mdadm-2.3.1/mdadm.h 2006-02-18 03:51:59.786926267 +0100 @@ -72,16 +72,7 @@ #include "bitmap.h" #include -/* #include "asm/byteorder.h" Redhat don't like this so... */ -#if __BYTE_ORDER == __LITTLE_ENDIAN -# include -#elif __BYTE_ORDER == __BIG_ENDIAN -# include -#elif __BYTE_ORDER == __PDP_ENDIAN -# include -#else -# error "unknown endianness." -#endif +#include * Sat May 27 2006 Luca Berra <[EMAIL PROTECTED]> snprintf size should be at most the size of the buffer --- mdadm-2.4/util.c.snprintf 2006-05-27 13:53:18.0 +0200 +++ mdadm-2.4/util.c2006-05-27 13:53:38.0 +0200 @@ -439,7 +439,7 @@ } if (create && !std && !nonstd) { static char buf[30]; - snprintf(buf, 1024, "%d:%d", major, minor); + snprintf(buf, 30, "%d:%d", major, minor); nonstd = buf; } * Sat May 27 2006 Luca Berra <[EMAIL PROTECTED]> This is to avoid gcc warnings when building with strict-aliasing optimization --- mdadm-2.4/dlink.h.alias 2006-05-26 21:05:07.0 +0200 +++ mdadm-2.4/dlink.h 2006-05-27 12:32:58.0 +0200 @@ -4,16 +4,16 @@ struct __dl_head { -struct __dl_head * dh_prev; -struct __dl_head * dh_next; +void * dh_prev; +void * dh_next; }; #definedl_alloc(size) ((void*)(((char*)calloc(1,(size)+sizeof(struct __dl_head)))+sizeof(struct __dl_head))) #definedl_new(t) ((t*)dl_alloc(sizeof(t))) #definedl_newv(t,n)((t*)dl_alloc(sizeof(t)*n)) -#define dl_next(p) *((void**)&(((struct __dl_head*)(p))[-1].dh_next)) -#define dl_prev(p) *((void**)&(((struct __dl_head*)(p))[-1].dh_prev)) +#define dl_next(p) *(&(((struct __dl_head*)(p))[-1].dh_next)) +#define dl_prev(p) *(&(((struct __dl_head*)(p))[-1].dh_prev)) void *dl_head(void); char *dl_strdup(char *); * Sat May 27 2006 Luca Berra <[EMAIL PROTECTED]> add CFLAGS to mdassemble build and fix a couple of non-returning functions --- mdadm-2.5/mdadm.h.bluca 2006-05-27 14:25:53.0 +0200 +++ mdadm-2.5/mdadm.h 2006-05-27 15:20:37.0 +0200 @@ -44,10 +44,8 @@ #include #include #include -#ifdef __dietlibc__NONO -int strncmp(const char *s1, const char *s2, size_t n) __THROW __pure__; -char *strncpy(char *dest, const char *src, size_t n) __THROW; -#include +#ifdef __dietlibc__ +#include #endif --- mdadm-2.5/mdassemble.c.bluca2006-05-27 15:11:02.0 +0200 +++ mdadm-2.5/mdassemble.c 2006-05-27 15:15:24.0 +0200 @@ -54,7 +54,7 @@ }; #ifndef MDASSEMBLE_AUTO -/* from mdadm.c */ +/* from mdopen.c */ int open_mddev(char *dev, int autof/*unused */) { int mdfd = open(dev, O_RDWR, 0); @@ -79,7 +79,7 @@ int verbose = 0; in
Re: raid5 hang on get_active_stripe
On Saturday May 27, [EMAIL PROTECTED] wrote: > On Sat, 27 May 2006, Neil Brown wrote: > > > Thanks. This narrows it down quite a bit... too much infact: I can > > now say for sure that this cannot possible happen :-) > > > > 2/ The message.gz you sent earlier with the > > echo t > /proc/sysrq-trigger > > trace in it didn't contain information about md4_raid5 - the > > got another hang again this morning... full dmesg output attached. > Thanks. Nothing surprising there, which maybe is a surprise itself... I'm still somewhat stumped by this. But given that it is nicely repeatable, I'm sure we can get there... The following patch adds some more tracing to raid5, and might fix a subtle bug in ll_rw_blk, though it is an incredible long shot that this could be affecting raid5 (if it is, I'll have to assume there is another bug somewhere). It certainly doesn't break ll_rw_blk. Whether it actually fixes something I'm not sure. If you could try with these on top of the previous patches I'd really appreciate it. When you read from /stripe_cache_active, it should trigger a (cryptic) kernel message within the next 15 seconds. If I could get the contents of that file and the kernel messages, that should help. Thanks heaps, NeilBrown Signed-off-by: Neil Brown <[EMAIL PROTECTED]> ### Diffstat output ./block/ll_rw_blk.c |4 ++-- ./drivers/md/raid5.c | 18 ++ 2 files changed, 20 insertions(+), 2 deletions(-) diff ./block/ll_rw_blk.c~current~ ./block/ll_rw_blk.c --- ./block/ll_rw_blk.c~current~2006-05-28 21:54:23.0 +1000 +++ ./block/ll_rw_blk.c 2006-05-28 21:55:17.0 +1000 @@ -874,7 +874,7 @@ static void __blk_queue_free_tags(reques } q->queue_tags = NULL; - q->queue_flags &= ~(1 << QUEUE_FLAG_QUEUED); + clear_bit(QUEUE_FLAG_QUEUED, &q->queue_flags); } /** @@ -963,7 +963,7 @@ int blk_queue_init_tags(request_queue_t * assign it, all done */ q->queue_tags = tags; - q->queue_flags |= (1 << QUEUE_FLAG_QUEUED); + set_bit(QUEUE_FLAG_QUEUED, &q->queue_flags); return 0; fail: kfree(tags); diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c --- ./drivers/md/raid5.c~current~ 2006-05-27 09:17:10.0 +1000 +++ ./drivers/md/raid5.c2006-05-28 21:56:56.0 +1000 @@ -1701,13 +1701,20 @@ static sector_t sync_request(mddev_t *md * During the scan, completed stripes are saved for us by the interrupt * handler, so that they will not have to wait for our next wakeup. */ +static unsigned long trigger; + static void raid5d (mddev_t *mddev) { struct stripe_head *sh; raid5_conf_t *conf = mddev_to_conf(mddev); int handled; + int trace = 0; PRINTK("+++ raid5d active\n"); + if (test_and_clear_bit(0, &trigger)) + trace = 1; + if (trace) + printk("raid5d runs\n"); md_check_recovery(mddev); @@ -1725,6 +1732,13 @@ static void raid5d (mddev_t *mddev) activate_bit_delay(conf); } + if (trace) + printk(" le=%d, pas=%d, bqp=%d le=%d\n", + list_empty(&conf->handle_list), + atomic_read(&conf->preread_active_stripes), + blk_queue_plugged(mddev->queue), + list_empty(&conf->delayed_list)); + if (list_empty(&conf->handle_list) && atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD && !blk_queue_plugged(mddev->queue) && @@ -1756,6 +1770,8 @@ static void raid5d (mddev_t *mddev) unplug_slaves(mddev); PRINTK("--- raid5d inactive\n"); + if (trace) + printk("raid5d done\n"); } static ssize_t @@ -1813,6 +1829,7 @@ stripe_cache_active_show(mddev_t *mddev, struct list_head *l; n = sprintf(page, "%d\n", atomic_read(&conf->active_stripes)); n += sprintf(page+n, "%d preread\n", atomic_read(&conf->preread_active_stripes)); + n += sprintf(page+n, "%splugged\n", blk_queue_plugged(mddev->queue)?"":"not "); spin_lock_irq(&conf->device_lock); c1=0; list_for_each(l, &conf->bitmap_list) @@ -1822,6 +1839,7 @@ stripe_cache_active_show(mddev_t *mddev, c2++; spin_unlock_irq(&conf->device_lock); n += sprintf(page+n, "bitlist=%d delaylist=%d\n", c1, c2); + trigger = 0x; return n; } else return 0; - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
echo check > .../sync_action displays "resync" in mdstat
A little nit-picking... # echo check > /sys/block/md0/md/sync_action # cat /proc/mdstat Personalities : [linear] [raid0] [raid1] [raid5] [raid4] md1 : active raid1 sda7[1] hda7[0] 6253248 blocks [2/2] [UU] md2 : active raid5 sdd1[2] sdc1[3] sdb1[1] hdc1[0] hdb1[4] 976783616 blocks level 5, 64k chunk, algorithm 2 [5/5] [U] md0 : active raid1 sda6[1] hda6[0] 109464896 blocks [2/2] [UU] [>] resync = 2.7% (3025664/109464896) finish=41.8min speed=42425K/sec It says "resync"... is it resyncing or just checking ? - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html