Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
On Mon, Apr 16 2018, James Simmons wrote: >> James, >> >> If I understand correctly, you're saying you want to be able to build >> without debug support...? I'm not convinced that building a client without >> debug support is interesting or useful. In fact, I think it would be >> harmful, and we shouldn't open up the possibility - this is switchable debug >> with very low overhead when not actually "on". It would be really awful to >> get a problem on a running system and discover there's no debug support - >> that you can't even enable debug without a reinstall. >> >> If I've understood you correctly, then I would want to see proof of a >> significant performance cost when debug is built but *off* before agreeing >> to even exposing this option. (I know it's a choice they'd have to make, >> but if it's not really useful with a side order of potentially harmful, we >> shouldn't even give people the choice.) > > I'm not saying add the option today but this is more for the long game. > While the Intel lustre developers deeply love lustre's debugging > infrastructure I see a future where something better will come along to > replace it. When that day comes we will have a period where both > debugging infrastructurs will exist and some deployers of lustre will > want to turn off the old debugging infrastructure and just use the new. > That is what I have in mind. A switch to flip between options. My position on this is that lustre's debugging infrastructure (in mainline) *will* be changed to use something that the rest of the kernel can and does use. Quite possibly that "something" will first be enhanced so that it is as powerful and useful as what lustre has. I suspect this will partly be pr_debug(), partly WARN_ON(), partly trace points. But I'm not very familiar with tracepoints or with lustre debugging yet so this is far from certain. pr_debug() and tracepoints can be compiled out, but only kernel-wide. There is no reason for lustre to be special there. WARN_ON() and BUG_ON() cannot be compiled out, but BUG_ON() must only be used when proceeding is unarguably worse than crashing the machine. In recent years a lot of BUG_ON()s have been removed or changed to warnings. We need to maintain that attitude. I don't like the idea of have two parallel debuging infrastructures that you can choose between - it encourages confusion and brings no benefits. Thanks, NeilBrown signature.asc Description: PGP signature
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
On Mon, Apr 16 2018, James Simmons wrote: >> James, >> >> If I understand correctly, you're saying you want to be able to build >> without debug support...? I'm not convinced that building a client without >> debug support is interesting or useful. In fact, I think it would be >> harmful, and we shouldn't open up the possibility - this is switchable debug >> with very low overhead when not actually "on". It would be really awful to >> get a problem on a running system and discover there's no debug support - >> that you can't even enable debug without a reinstall. >> >> If I've understood you correctly, then I would want to see proof of a >> significant performance cost when debug is built but *off* before agreeing >> to even exposing this option. (I know it's a choice they'd have to make, >> but if it's not really useful with a side order of potentially harmful, we >> shouldn't even give people the choice.) > > I'm not saying add the option today but this is more for the long game. > While the Intel lustre developers deeply love lustre's debugging > infrastructure I see a future where something better will come along to > replace it. When that day comes we will have a period where both > debugging infrastructurs will exist and some deployers of lustre will > want to turn off the old debugging infrastructure and just use the new. > That is what I have in mind. A switch to flip between options. My position on this is that lustre's debugging infrastructure (in mainline) *will* be changed to use something that the rest of the kernel can and does use. Quite possibly that "something" will first be enhanced so that it is as powerful and useful as what lustre has. I suspect this will partly be pr_debug(), partly WARN_ON(), partly trace points. But I'm not very familiar with tracepoints or with lustre debugging yet so this is far from certain. pr_debug() and tracepoints can be compiled out, but only kernel-wide. There is no reason for lustre to be special there. WARN_ON() and BUG_ON() cannot be compiled out, but BUG_ON() must only be used when proceeding is unarguably worse than crashing the machine. In recent years a lot of BUG_ON()s have been removed or changed to warnings. We need to maintain that attitude. I don't like the idea of have two parallel debuging infrastructures that you can choose between - it encourages confusion and brings no benefits. Thanks, NeilBrown signature.asc Description: PGP signature
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
On Apr 16, 2018, at 16:48, Doug Oucharekwrote: > >> >> On Apr 16, 2018, at 3:42 PM, James Simmons wrote: >> >> >>> James, >>> >>> If I understand correctly, you're saying you want to be able to build >>> without debug support...? I'm not convinced that building a client without >>> debug support is interesting or useful. In fact, I think it would be >>> harmful, and we shouldn't open up the possibility - this is switchable >>> debug with very low overhead when not actually "on". It would be really >>> awful to get a problem on a running system and discover there's no debug >>> support - that you can't even enable debug without a reinstall. >>> >>> If I've understood you correctly, then I would want to see proof of a >>> significant performance cost when debug is built but *off* before agreeing >>> to even exposing this option. (I know it's a choice they'd have to make, >>> but if it's not really useful with a side order of potentially harmful, we >>> shouldn't even give people the choice.) >> >> I'm not saying add the option today but this is more for the long game. >> While the Intel lustre developers deeply love lustre's debugging >> infrastructure I see a future where something better will come along to >> replace it. When that day comes we will have a period where both >> debugging infrastructurs will exist and some deployers of lustre will >> want to turn off the old debugging infrastructure and just use the new. >> That is what I have in mind. A switch to flip between options. > > Yes please!! An option for users which says “no, you do not have the right > to panic my system via LASSERT whenever you like” would be a blessing. Note that LASSERT() itself does not panic the system, unless you configure it with panic_on_lbug=1. Otherwise, it just blocks that thread (though this can also have an impact on other threads if you are holding locks at that time). That said, the LASSERT() should not be hit unless there is bad code, data corruption, or the LASSERT() itself is incorrect (essentially bad code also). So "whenever you like" is "whenever the system is about to corrupt your data", and people are not very forgiving if a filesystem corrupts their data... Cheers, Andreas -- Andreas Dilger Lustre Principal Architect Intel Corporation
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
On Apr 16, 2018, at 16:48, Doug Oucharek wrote: > >> >> On Apr 16, 2018, at 3:42 PM, James Simmons wrote: >> >> >>> James, >>> >>> If I understand correctly, you're saying you want to be able to build >>> without debug support...? I'm not convinced that building a client without >>> debug support is interesting or useful. In fact, I think it would be >>> harmful, and we shouldn't open up the possibility - this is switchable >>> debug with very low overhead when not actually "on". It would be really >>> awful to get a problem on a running system and discover there's no debug >>> support - that you can't even enable debug without a reinstall. >>> >>> If I've understood you correctly, then I would want to see proof of a >>> significant performance cost when debug is built but *off* before agreeing >>> to even exposing this option. (I know it's a choice they'd have to make, >>> but if it's not really useful with a side order of potentially harmful, we >>> shouldn't even give people the choice.) >> >> I'm not saying add the option today but this is more for the long game. >> While the Intel lustre developers deeply love lustre's debugging >> infrastructure I see a future where something better will come along to >> replace it. When that day comes we will have a period where both >> debugging infrastructurs will exist and some deployers of lustre will >> want to turn off the old debugging infrastructure and just use the new. >> That is what I have in mind. A switch to flip between options. > > Yes please!! An option for users which says “no, you do not have the right > to panic my system via LASSERT whenever you like” would be a blessing. Note that LASSERT() itself does not panic the system, unless you configure it with panic_on_lbug=1. Otherwise, it just blocks that thread (though this can also have an impact on other threads if you are holding locks at that time). That said, the LASSERT() should not be hit unless there is bad code, data corruption, or the LASSERT() itself is incorrect (essentially bad code also). So "whenever you like" is "whenever the system is about to corrupt your data", and people are not very forgiving if a filesystem corrupts their data... Cheers, Andreas -- Andreas Dilger Lustre Principal Architect Intel Corporation
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
> On Apr 16, 2018, at 3:42 PM, James Simmonswrote: > > >> James, >> >> If I understand correctly, you're saying you want to be able to build >> without debug support...? I'm not convinced that building a client without >> debug support is interesting or useful. In fact, I think it would be >> harmful, and we shouldn't open up the possibility - this is switchable debug >> with very low overhead when not actually "on". It would be really awful to >> get a problem on a running system and discover there's no debug support - >> that you can't even enable debug without a reinstall. >> >> If I've understood you correctly, then I would want to see proof of a >> significant performance cost when debug is built but *off* before agreeing >> to even exposing this option. (I know it's a choice they'd have to make, >> but if it's not really useful with a side order of potentially harmful, we >> shouldn't even give people the choice.) > > I'm not saying add the option today but this is more for the long game. > While the Intel lustre developers deeply love lustre's debugging > infrastructure I see a future where something better will come along to > replace it. When that day comes we will have a period where both > debugging infrastructurs will exist and some deployers of lustre will > want to turn off the old debugging infrastructure and just use the new. > That is what I have in mind. A switch to flip between options. Yes please!! An option for users which says “no, you do not have the right to panic my system via LASSERT whenever you like” would be a blessing. Doug > >> - Patrick >> >> On 4/15/18, 10:49 PM, "lustre-devel on behalf of James Simmons" >> >> wrote: >> >> >>> CDEBUG_STACK() and CHECK_STACK() are macros to help with >>> debugging, so move them from >>> drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h >>> to >>> drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> >>> This seems a more fitting location, and is a step towards >>> removing linux/libcfs.h and simplifying the include file structure. >> >>Nak. Currently the lustre client always enables debugging but that >>shouldn't be the case. What we do need is the able to turn off the >>crazy debugging stuff. In the development branch of lustre it is >>done with CDEBUG_ENABLED. We need something like that in Kconfig >>much like we have CONFIG_LUSTRE_DEBUG_EXPENSIVE_CHECK. Since we like >>to be able to turn that off this should be moved to just after >>LIBCFS_DEBUG_MSG_DATA_DECL. Then from CHECK_STACK down to CWARN() >>it can be build out. When CDEBUG_ENABLED is disabled CDEBUG_LIMIT >>would be empty. >> >>> Signed-off-by: NeilBrown >>> --- >>> .../lustre/include/linux/libcfs/libcfs_debug.h | 32 >>> >>> .../lustre/include/linux/libcfs/linux/libcfs.h | 31 >>> --- >>> 2 files changed, 32 insertions(+), 31 deletions(-) >>> >>> diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> index 9290a19429e7..0dc7b91efe7c 100644 >>> --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> @@ -62,6 +62,38 @@ int libcfs_debug_str2mask(int *mask, const char *str, >>> int is_subsys); >>> extern unsigned int libcfs_catastrophe; >>> extern unsigned int libcfs_panic_on_lbug; >>> >>> +/* Enable debug-checks on stack size - except on x86_64 */ >>> +#if !defined(__x86_64__) >>> +# ifdef __ia64__ >>> +# define CDEBUG_STACK() (THREAD_SIZE - \ >>> + ((unsigned long)__builtin_dwarf_cfa() & \ >>> + (THREAD_SIZE - 1))) >>> +# else >>> +# define CDEBUG_STACK() (THREAD_SIZE - \ >>> + ((unsigned long)__builtin_frame_address(0) & \ >>> + (THREAD_SIZE - 1))) >>> +# endif /* __ia64__ */ >>> + >>> +#define __CHECK_STACK(msgdata, mask, cdls) \ >>> +do { \ >>> + if (unlikely(CDEBUG_STACK() > libcfs_stack)) {\ >>> + LIBCFS_DEBUG_MSG_DATA_INIT(msgdata, D_WARNING, NULL); \ >>> + libcfs_stack = CDEBUG_STACK();\ >>> + libcfs_debug_msg(msgdata, \ >>> +"maximum lustre stack %lu\n",\ >>> +CDEBUG_STACK()); \ >>> + (msgdata)->msg_mask = mask; \ >>> + (msgdata)->msg_cdls = cdls; \ >>> + dump_stack(); \ >>> + /*panic("LBUG");*/
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
> On Apr 16, 2018, at 3:42 PM, James Simmons wrote: > > >> James, >> >> If I understand correctly, you're saying you want to be able to build >> without debug support...? I'm not convinced that building a client without >> debug support is interesting or useful. In fact, I think it would be >> harmful, and we shouldn't open up the possibility - this is switchable debug >> with very low overhead when not actually "on". It would be really awful to >> get a problem on a running system and discover there's no debug support - >> that you can't even enable debug without a reinstall. >> >> If I've understood you correctly, then I would want to see proof of a >> significant performance cost when debug is built but *off* before agreeing >> to even exposing this option. (I know it's a choice they'd have to make, >> but if it's not really useful with a side order of potentially harmful, we >> shouldn't even give people the choice.) > > I'm not saying add the option today but this is more for the long game. > While the Intel lustre developers deeply love lustre's debugging > infrastructure I see a future where something better will come along to > replace it. When that day comes we will have a period where both > debugging infrastructurs will exist and some deployers of lustre will > want to turn off the old debugging infrastructure and just use the new. > That is what I have in mind. A switch to flip between options. Yes please!! An option for users which says “no, you do not have the right to panic my system via LASSERT whenever you like” would be a blessing. Doug > >> - Patrick >> >> On 4/15/18, 10:49 PM, "lustre-devel on behalf of James Simmons" >> >> wrote: >> >> >>> CDEBUG_STACK() and CHECK_STACK() are macros to help with >>> debugging, so move them from >>> drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h >>> to >>> drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> >>> This seems a more fitting location, and is a step towards >>> removing linux/libcfs.h and simplifying the include file structure. >> >>Nak. Currently the lustre client always enables debugging but that >>shouldn't be the case. What we do need is the able to turn off the >>crazy debugging stuff. In the development branch of lustre it is >>done with CDEBUG_ENABLED. We need something like that in Kconfig >>much like we have CONFIG_LUSTRE_DEBUG_EXPENSIVE_CHECK. Since we like >>to be able to turn that off this should be moved to just after >>LIBCFS_DEBUG_MSG_DATA_DECL. Then from CHECK_STACK down to CWARN() >>it can be build out. When CDEBUG_ENABLED is disabled CDEBUG_LIMIT >>would be empty. >> >>> Signed-off-by: NeilBrown >>> --- >>> .../lustre/include/linux/libcfs/libcfs_debug.h | 32 >>> >>> .../lustre/include/linux/libcfs/linux/libcfs.h | 31 >>> --- >>> 2 files changed, 32 insertions(+), 31 deletions(-) >>> >>> diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> index 9290a19429e7..0dc7b91efe7c 100644 >>> --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h >>> @@ -62,6 +62,38 @@ int libcfs_debug_str2mask(int *mask, const char *str, >>> int is_subsys); >>> extern unsigned int libcfs_catastrophe; >>> extern unsigned int libcfs_panic_on_lbug; >>> >>> +/* Enable debug-checks on stack size - except on x86_64 */ >>> +#if !defined(__x86_64__) >>> +# ifdef __ia64__ >>> +# define CDEBUG_STACK() (THREAD_SIZE - \ >>> + ((unsigned long)__builtin_dwarf_cfa() & \ >>> + (THREAD_SIZE - 1))) >>> +# else >>> +# define CDEBUG_STACK() (THREAD_SIZE - \ >>> + ((unsigned long)__builtin_frame_address(0) & \ >>> + (THREAD_SIZE - 1))) >>> +# endif /* __ia64__ */ >>> + >>> +#define __CHECK_STACK(msgdata, mask, cdls) \ >>> +do { \ >>> + if (unlikely(CDEBUG_STACK() > libcfs_stack)) {\ >>> + LIBCFS_DEBUG_MSG_DATA_INIT(msgdata, D_WARNING, NULL); \ >>> + libcfs_stack = CDEBUG_STACK();\ >>> + libcfs_debug_msg(msgdata, \ >>> +"maximum lustre stack %lu\n",\ >>> +CDEBUG_STACK()); \ >>> + (msgdata)->msg_mask = mask; \ >>> + (msgdata)->msg_cdls = cdls; \ >>> + dump_stack(); \ >>> + /*panic("LBUG");*/\ >>> + } \ >>> +} while (0) >>>
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
> James, > > If I understand correctly, you're saying you want to be able to build without > debug support...? I'm not convinced that building a client without debug > support is interesting or useful. In fact, I think it would be harmful, and > we shouldn't open up the possibility - this is switchable debug with very low > overhead when not actually "on". It would be really awful to get a problem > on a running system and discover there's no debug support - that you can't > even enable debug without a reinstall. > > If I've understood you correctly, then I would want to see proof of a > significant performance cost when debug is built but *off* before agreeing to > even exposing this option. (I know it's a choice they'd have to make, but if > it's not really useful with a side order of potentially harmful, we shouldn't > even give people the choice.) I'm not saying add the option today but this is more for the long game. While the Intel lustre developers deeply love lustre's debugging infrastructure I see a future where something better will come along to replace it. When that day comes we will have a period where both debugging infrastructurs will exist and some deployers of lustre will want to turn off the old debugging infrastructure and just use the new. That is what I have in mind. A switch to flip between options. > - Patrick > > On 4/15/18, 10:49 PM, "lustre-devel on behalf of James Simmons" >> wrote: > > > > CDEBUG_STACK() and CHECK_STACK() are macros to help with > > debugging, so move them from > >drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > > to > >drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > > > This seems a more fitting location, and is a step towards > > removing linux/libcfs.h and simplifying the include file structure. > > Nak. Currently the lustre client always enables debugging but that > shouldn't be the case. What we do need is the able to turn off the > crazy debugging stuff. In the development branch of lustre it is > done with CDEBUG_ENABLED. We need something like that in Kconfig > much like we have CONFIG_LUSTRE_DEBUG_EXPENSIVE_CHECK. Since we like > to be able to turn that off this should be moved to just after > LIBCFS_DEBUG_MSG_DATA_DECL. Then from CHECK_STACK down to CWARN() > it can be build out. When CDEBUG_ENABLED is disabled CDEBUG_LIMIT > would be empty. > > > Signed-off-by: NeilBrown > > --- > > .../lustre/include/linux/libcfs/libcfs_debug.h | 32 > > > .../lustre/include/linux/libcfs/linux/libcfs.h | 31 > --- > > 2 files changed, 32 insertions(+), 31 deletions(-) > > > > diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > index 9290a19429e7..0dc7b91efe7c 100644 > > --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > @@ -62,6 +62,38 @@ int libcfs_debug_str2mask(int *mask, const char > *str, int is_subsys); > > extern unsigned int libcfs_catastrophe; > > extern unsigned int libcfs_panic_on_lbug; > > > > +/* Enable debug-checks on stack size - except on x86_64 */ > > +#if !defined(__x86_64__) > > +# ifdef __ia64__ > > +# define CDEBUG_STACK() (THREAD_SIZE - > \ > > + ((unsigned long)__builtin_dwarf_cfa() & > \ > > + (THREAD_SIZE - 1))) > > +# else > > +# define CDEBUG_STACK() (THREAD_SIZE - > \ > > + ((unsigned long)__builtin_frame_address(0) & > \ > > + (THREAD_SIZE - 1))) > > +# endif /* __ia64__ */ > > + > > +#define __CHECK_STACK(msgdata, mask, cdls) \ > > +do { \ > > + if (unlikely(CDEBUG_STACK() > libcfs_stack)) {\ > > + LIBCFS_DEBUG_MSG_DATA_INIT(msgdata, D_WARNING, NULL); > \ > > + libcfs_stack = CDEBUG_STACK();\ > > + libcfs_debug_msg(msgdata, \ > > +"maximum lustre stack %lu\n",\ > > +CDEBUG_STACK()); \ > > + (msgdata)->msg_mask = mask; \ > > + (msgdata)->msg_cdls = cdls; \ > > + dump_stack(); \ > > + /*panic("LBUG");*/
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
> James, > > If I understand correctly, you're saying you want to be able to build without > debug support...? I'm not convinced that building a client without debug > support is interesting or useful. In fact, I think it would be harmful, and > we shouldn't open up the possibility - this is switchable debug with very low > overhead when not actually "on". It would be really awful to get a problem > on a running system and discover there's no debug support - that you can't > even enable debug without a reinstall. > > If I've understood you correctly, then I would want to see proof of a > significant performance cost when debug is built but *off* before agreeing to > even exposing this option. (I know it's a choice they'd have to make, but if > it's not really useful with a side order of potentially harmful, we shouldn't > even give people the choice.) I'm not saying add the option today but this is more for the long game. While the Intel lustre developers deeply love lustre's debugging infrastructure I see a future where something better will come along to replace it. When that day comes we will have a period where both debugging infrastructurs will exist and some deployers of lustre will want to turn off the old debugging infrastructure and just use the new. That is what I have in mind. A switch to flip between options. > - Patrick > > On 4/15/18, 10:49 PM, "lustre-devel on behalf of James Simmons" > > wrote: > > > > CDEBUG_STACK() and CHECK_STACK() are macros to help with > > debugging, so move them from > >drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > > to > >drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > > > This seems a more fitting location, and is a step towards > > removing linux/libcfs.h and simplifying the include file structure. > > Nak. Currently the lustre client always enables debugging but that > shouldn't be the case. What we do need is the able to turn off the > crazy debugging stuff. In the development branch of lustre it is > done with CDEBUG_ENABLED. We need something like that in Kconfig > much like we have CONFIG_LUSTRE_DEBUG_EXPENSIVE_CHECK. Since we like > to be able to turn that off this should be moved to just after > LIBCFS_DEBUG_MSG_DATA_DECL. Then from CHECK_STACK down to CWARN() > it can be build out. When CDEBUG_ENABLED is disabled CDEBUG_LIMIT > would be empty. > > > Signed-off-by: NeilBrown > > --- > > .../lustre/include/linux/libcfs/libcfs_debug.h | 32 > > > .../lustre/include/linux/libcfs/linux/libcfs.h | 31 > --- > > 2 files changed, 32 insertions(+), 31 deletions(-) > > > > diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > index 9290a19429e7..0dc7b91efe7c 100644 > > --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > @@ -62,6 +62,38 @@ int libcfs_debug_str2mask(int *mask, const char > *str, int is_subsys); > > extern unsigned int libcfs_catastrophe; > > extern unsigned int libcfs_panic_on_lbug; > > > > +/* Enable debug-checks on stack size - except on x86_64 */ > > +#if !defined(__x86_64__) > > +# ifdef __ia64__ > > +# define CDEBUG_STACK() (THREAD_SIZE - > \ > > + ((unsigned long)__builtin_dwarf_cfa() & > \ > > + (THREAD_SIZE - 1))) > > +# else > > +# define CDEBUG_STACK() (THREAD_SIZE - > \ > > + ((unsigned long)__builtin_frame_address(0) & > \ > > + (THREAD_SIZE - 1))) > > +# endif /* __ia64__ */ > > + > > +#define __CHECK_STACK(msgdata, mask, cdls) \ > > +do { \ > > + if (unlikely(CDEBUG_STACK() > libcfs_stack)) {\ > > + LIBCFS_DEBUG_MSG_DATA_INIT(msgdata, D_WARNING, NULL); > \ > > + libcfs_stack = CDEBUG_STACK();\ > > + libcfs_debug_msg(msgdata, \ > > +"maximum lustre stack %lu\n",\ > > +CDEBUG_STACK()); \ > > + (msgdata)->msg_mask = mask; \ > > + (msgdata)->msg_cdls = cdls; \ > > + dump_stack(); \ > > + /*panic("LBUG");*/ > \ > > + }
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
James, If I understand correctly, you're saying you want to be able to build without debug support...? I'm not convinced that building a client without debug support is interesting or useful. In fact, I think it would be harmful, and we shouldn't open up the possibility - this is switchable debug with very low overhead when not actually "on". It would be really awful to get a problem on a running system and discover there's no debug support - that you can't even enable debug without a reinstall. If I've understood you correctly, then I would want to see proof of a significant performance cost when debug is built but *off* before agreeing to even exposing this option. (I know it's a choice they'd have to make, but if it's not really useful with a side order of potentially harmful, we shouldn't even give people the choice.) - Patrick On 4/15/18, 10:49 PM, "lustre-devel on behalf of James Simmons"wrote: > CDEBUG_STACK() and CHECK_STACK() are macros to help with > debugging, so move them from >drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > to >drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > This seems a more fitting location, and is a step towards > removing linux/libcfs.h and simplifying the include file structure. Nak. Currently the lustre client always enables debugging but that shouldn't be the case. What we do need is the able to turn off the crazy debugging stuff. In the development branch of lustre it is done with CDEBUG_ENABLED. We need something like that in Kconfig much like we have CONFIG_LUSTRE_DEBUG_EXPENSIVE_CHECK. Since we like to be able to turn that off this should be moved to just after LIBCFS_DEBUG_MSG_DATA_DECL. Then from CHECK_STACK down to CWARN() it can be build out. When CDEBUG_ENABLED is disabled CDEBUG_LIMIT would be empty. > Signed-off-by: NeilBrown > --- > .../lustre/include/linux/libcfs/libcfs_debug.h | 32 > .../lustre/include/linux/libcfs/linux/libcfs.h | 31 --- > 2 files changed, 32 insertions(+), 31 deletions(-) > > diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > index 9290a19429e7..0dc7b91efe7c 100644 > --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > @@ -62,6 +62,38 @@ int libcfs_debug_str2mask(int *mask, const char *str, int is_subsys); > extern unsigned int libcfs_catastrophe; > extern unsigned int libcfs_panic_on_lbug; > > +/* Enable debug-checks on stack size - except on x86_64 */ > +#if !defined(__x86_64__) > +# ifdef __ia64__ > +# define CDEBUG_STACK() (THREAD_SIZE - \ > + ((unsigned long)__builtin_dwarf_cfa() & \ > +(THREAD_SIZE - 1))) > +# else > +# define CDEBUG_STACK() (THREAD_SIZE - \ > + ((unsigned long)__builtin_frame_address(0) & \ > +(THREAD_SIZE - 1))) > +# endif /* __ia64__ */ > + > +#define __CHECK_STACK(msgdata, mask, cdls) \ > +do { \ > + if (unlikely(CDEBUG_STACK() > libcfs_stack)) {\ > + LIBCFS_DEBUG_MSG_DATA_INIT(msgdata, D_WARNING, NULL); \ > + libcfs_stack = CDEBUG_STACK();\ > + libcfs_debug_msg(msgdata, \ > + "maximum lustre stack %lu\n",\ > + CDEBUG_STACK()); \ > + (msgdata)->msg_mask = mask; \ > + (msgdata)->msg_cdls = cdls; \ > + dump_stack(); \ > + /*panic("LBUG");*/\ > + } \ > +} while (0) > +#define CFS_CHECK_STACK(msgdata, mask, cdls) __CHECK_STACK(msgdata, mask, cdls) > +#else /* __x86_64__ */ > +#define CFS_CHECK_STACK(msgdata, mask, cdls) do {} while (0) > +#define CDEBUG_STACK() (0L) > +#endif /* __x86_64__ */ > + > #ifndef DEBUG_SUBSYSTEM > # define DEBUG_SUBSYSTEM S_UNDEFINED > #endif > diff --git a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > index 07d3cb2217d1..83aec9c7698f 100644 > --- a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > +++
Re: [lustre-devel] [PATCH 1/6] staging: lustre: move stack-check macros to libcfs_debug.h
James, If I understand correctly, you're saying you want to be able to build without debug support...? I'm not convinced that building a client without debug support is interesting or useful. In fact, I think it would be harmful, and we shouldn't open up the possibility - this is switchable debug with very low overhead when not actually "on". It would be really awful to get a problem on a running system and discover there's no debug support - that you can't even enable debug without a reinstall. If I've understood you correctly, then I would want to see proof of a significant performance cost when debug is built but *off* before agreeing to even exposing this option. (I know it's a choice they'd have to make, but if it's not really useful with a side order of potentially harmful, we shouldn't even give people the choice.) - Patrick On 4/15/18, 10:49 PM, "lustre-devel on behalf of James Simmons" wrote: > CDEBUG_STACK() and CHECK_STACK() are macros to help with > debugging, so move them from >drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > to >drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > > This seems a more fitting location, and is a step towards > removing linux/libcfs.h and simplifying the include file structure. Nak. Currently the lustre client always enables debugging but that shouldn't be the case. What we do need is the able to turn off the crazy debugging stuff. In the development branch of lustre it is done with CDEBUG_ENABLED. We need something like that in Kconfig much like we have CONFIG_LUSTRE_DEBUG_EXPENSIVE_CHECK. Since we like to be able to turn that off this should be moved to just after LIBCFS_DEBUG_MSG_DATA_DECL. Then from CHECK_STACK down to CWARN() it can be build out. When CDEBUG_ENABLED is disabled CDEBUG_LIMIT would be empty. > Signed-off-by: NeilBrown > --- > .../lustre/include/linux/libcfs/libcfs_debug.h | 32 > .../lustre/include/linux/libcfs/linux/libcfs.h | 31 --- > 2 files changed, 32 insertions(+), 31 deletions(-) > > diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > index 9290a19429e7..0dc7b91efe7c 100644 > --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h > @@ -62,6 +62,38 @@ int libcfs_debug_str2mask(int *mask, const char *str, int is_subsys); > extern unsigned int libcfs_catastrophe; > extern unsigned int libcfs_panic_on_lbug; > > +/* Enable debug-checks on stack size - except on x86_64 */ > +#if !defined(__x86_64__) > +# ifdef __ia64__ > +# define CDEBUG_STACK() (THREAD_SIZE - \ > + ((unsigned long)__builtin_dwarf_cfa() & \ > +(THREAD_SIZE - 1))) > +# else > +# define CDEBUG_STACK() (THREAD_SIZE - \ > + ((unsigned long)__builtin_frame_address(0) & \ > +(THREAD_SIZE - 1))) > +# endif /* __ia64__ */ > + > +#define __CHECK_STACK(msgdata, mask, cdls) \ > +do { \ > + if (unlikely(CDEBUG_STACK() > libcfs_stack)) {\ > + LIBCFS_DEBUG_MSG_DATA_INIT(msgdata, D_WARNING, NULL); \ > + libcfs_stack = CDEBUG_STACK();\ > + libcfs_debug_msg(msgdata, \ > + "maximum lustre stack %lu\n",\ > + CDEBUG_STACK()); \ > + (msgdata)->msg_mask = mask; \ > + (msgdata)->msg_cdls = cdls; \ > + dump_stack(); \ > + /*panic("LBUG");*/\ > + } \ > +} while (0) > +#define CFS_CHECK_STACK(msgdata, mask, cdls) __CHECK_STACK(msgdata, mask, cdls) > +#else /* __x86_64__ */ > +#define CFS_CHECK_STACK(msgdata, mask, cdls) do {} while (0) > +#define CDEBUG_STACK() (0L) > +#endif /* __x86_64__ */ > + > #ifndef DEBUG_SUBSYSTEM > # define DEBUG_SUBSYSTEM S_UNDEFINED > #endif > diff --git a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > index 07d3cb2217d1..83aec9c7698f 100644 > --- a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > +++ b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h > @@ -80,35 +80,4 @@ > #include > #include