Re: kref refcnt and false positives
> This makes my Maple board very unhappy -- it triggers a WARN_ON() in > kref_get() lots of times... Maybe the refounting in prom.c is broken ? I'll have a look. Ben. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
This makes my Maple board very unhappy -- it triggers a WARN_ON() in kref_get() lots of times... Maybe the refounting in prom.c is broken ? I'll have a look. Ben. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Thu, 2006-12-21 at 09:01 +, Linux Kernel Mailing List wrote: > Gitweb: > http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f334b60b43a0927f4ab1187cbdb4582f5227c3b1 > Commit: f334b60b43a0927f4ab1187cbdb4582f5227c3b1 > Parent: f238085415c56618e042252894f2fcc971add645 > Author: Venkatesh Pallipadi <[EMAIL PROTECTED]> > AuthorDate: Tue Dec 19 13:01:29 2006 -0800 > Committer: Greg Kroah-Hartman <[EMAIL PROTECTED]> > CommitDate: Wed Dec 20 10:56:43 2006 -0800 > > kref refcnt and false positives > > With WARN_ON addition to kobject_init() > [ > http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/dont-use/broken-out/gregkh-driver-kobject-warn.patch > ] > > I started seeing following WARNING on CPU offline followed by online on my > x86_64 system. > > WARNING at lib/kobject.c:172 kobject_init() > <...> > This is a false positive as mce.c is unregistering/registering sysfs > interfaces cleanly on hotplug. > > kref_put() and conditional decrement of refcnt seems to be the root cause > for this and the patch below resolves the issue for me. > > Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]> > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> > Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> > --- > lib/kref.c |7 +-- > 1 files changed, 1 insertions(+), 6 deletions(-) > > diff --git a/lib/kref.c b/lib/kref.c > index 4a467fa..0d07cc3 100644 > --- a/lib/kref.c > +++ b/lib/kref.c > @@ -52,12 +52,7 @@ int kref_put(struct kref *kref, void (*release)(struct > kref *kref)) > WARN_ON(release == NULL); > WARN_ON(release == (void (*)(struct kref *))kfree); > > - /* > - * if current count is one, we are the last user and can release object > - * right now, avoiding an atomic operation on 'refcount' > - */ > - if ((atomic_read(>refcount) == 1) || > - (atomic_dec_and_test(>refcount))) { > + if (atomic_dec_and_test(>refcount)) { > release(kref); > return 1; > } This makes my Maple board very unhappy -- it triggers a WARN_ON() in kref_get() lots of times... time_init: decrementer frequency = 175.00 MHz time_init: processor frequency = 1400.00 MHz [ cut here ] Badness at lib/kref.c:32 Call Trace: [C050F440] [C000F4C8] .show_stack+0x68/0x1b0 (unreliable) [C050F4E0] [C0189708] .report_bug+0x94/0xe8 [C050F570] [C0021A98] .program_check_exception+0x18c/0x5d0 [C050F640] [C0004774] program_check_common+0xf4/0x100 --- Exception: 700 at .kref_get+0xc/0x24 LR = .of_node_get+0x20/0x3c [C050F930] [C050F9D0] init_thread_union+0x39d0/0x4000 (unreliabl e) [C050F9B0] [C0020F1C] .of_get_parent+0x38/0x64 [C050FA40] [C001B750] .of_translate_address+0xf0/0x38c [C050FB50] [C001BA28] .__of_address_to_resource+0x3c/0xe0 [C050FBF0] [C001BB14] .of_address_to_resource+0x48/0x68 [C050FC90] [C045D240] .maple_get_boot_time+0x40/0x12c [C050FD70] [C001F884] .get_boot_time+0x3c/0xb8 [C050FE10] [C0454040] .time_init+0x274/0x450 [C050FEF0] [C044B74C] .start_kernel+0x188/0x2bc [C050FF90] [C00084C8] .start_here_common+0x54/0x8c Maple: Found RTC at IO 0x900 Console: colour dummy device 80x25 Dentry cache hash table entries: 65536 (order: 7, 524288 bytes) Inode-cache hash table entries: 32768 (order: 6, 262144 bytes) Memory: 501628k/524288k available (4608k kernel code, 21932k reserved, 552k data , 354k bss, 212k init) Calibrating delay loop... 349.18 BogoMIPS (lpj=698368) Mount-cache hash table entries: 256 [ cut here ] Badness at lib/kref.c:32 Call Trace: [C050F7F0] [C000F4C8] .show_stack+0x68/0x1b0 (unreliable)
Re: kref refcnt and false positives
On Thu, 2006-12-21 at 09:01 +, Linux Kernel Mailing List wrote: Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f334b60b43a0927f4ab1187cbdb4582f5227c3b1 Commit: f334b60b43a0927f4ab1187cbdb4582f5227c3b1 Parent: f238085415c56618e042252894f2fcc971add645 Author: Venkatesh Pallipadi [EMAIL PROTECTED] AuthorDate: Tue Dec 19 13:01:29 2006 -0800 Committer: Greg Kroah-Hartman [EMAIL PROTECTED] CommitDate: Wed Dec 20 10:56:43 2006 -0800 kref refcnt and false positives With WARN_ON addition to kobject_init() [ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/dont-use/broken-out/gregkh-driver-kobject-warn.patch ] I started seeing following WARNING on CPU offline followed by online on my x86_64 system. WARNING at lib/kobject.c:172 kobject_init() ... This is a false positive as mce.c is unregistering/registering sysfs interfaces cleanly on hotplug. kref_put() and conditional decrement of refcnt seems to be the root cause for this and the patch below resolves the issue for me. Signed-off-by: Venkatesh Pallipadi [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] Signed-off-by: Greg Kroah-Hartman [EMAIL PROTECTED] --- lib/kref.c |7 +-- 1 files changed, 1 insertions(+), 6 deletions(-) diff --git a/lib/kref.c b/lib/kref.c index 4a467fa..0d07cc3 100644 --- a/lib/kref.c +++ b/lib/kref.c @@ -52,12 +52,7 @@ int kref_put(struct kref *kref, void (*release)(struct kref *kref)) WARN_ON(release == NULL); WARN_ON(release == (void (*)(struct kref *))kfree); - /* - * if current count is one, we are the last user and can release object - * right now, avoiding an atomic operation on 'refcount' - */ - if ((atomic_read(kref-refcount) == 1) || - (atomic_dec_and_test(kref-refcount))) { + if (atomic_dec_and_test(kref-refcount)) { release(kref); return 1; } This makes my Maple board very unhappy -- it triggers a WARN_ON() in kref_get() lots of times... time_init: decrementer frequency = 175.00 MHz time_init: processor frequency = 1400.00 MHz [ cut here ] Badness at lib/kref.c:32 Call Trace: [C050F440] [C000F4C8] .show_stack+0x68/0x1b0 (unreliable) [C050F4E0] [C0189708] .report_bug+0x94/0xe8 [C050F570] [C0021A98] .program_check_exception+0x18c/0x5d0 [C050F640] [C0004774] program_check_common+0xf4/0x100 --- Exception: 700 at .kref_get+0xc/0x24 LR = .of_node_get+0x20/0x3c [C050F930] [C050F9D0] init_thread_union+0x39d0/0x4000 (unreliabl e) [C050F9B0] [C0020F1C] .of_get_parent+0x38/0x64 [C050FA40] [C001B750] .of_translate_address+0xf0/0x38c [C050FB50] [C001BA28] .__of_address_to_resource+0x3c/0xe0 [C050FBF0] [C001BB14] .of_address_to_resource+0x48/0x68 [C050FC90] [C045D240] .maple_get_boot_time+0x40/0x12c [C050FD70] [C001F884] .get_boot_time+0x3c/0xb8 [C050FE10] [C0454040] .time_init+0x274/0x450 [C050FEF0] [C044B74C] .start_kernel+0x188/0x2bc [C050FF90] [C00084C8] .start_here_common+0x54/0x8c Maple: Found RTC at IO 0x900 Console: colour dummy device 80x25 Dentry cache hash table entries: 65536 (order: 7, 524288 bytes) Inode-cache hash table entries: 32768 (order: 6, 262144 bytes) Memory: 501628k/524288k available (4608k kernel code, 21932k reserved, 552k data , 354k bss, 212k init) Calibrating delay loop... 349.18 BogoMIPS (lpj=698368) Mount-cache hash table entries: 256 [ cut here ] Badness at lib/kref.c:32 Call Trace: [C050F7F0] [C000F4C8] .show_stack+0x68/0x1b0 (unreliable) [C050F890] [C0189708]
Re: kref refcnt and false positives
Andrew Morton <[EMAIL PROTECTED]> writes: > Guys, we have about 100 reports of weirdo > crashes, smashes, bashes and splats in the kref code. The last thing we > need is some obscure, tricksy little optimisation which leads legitimate > uses of the API to mysteriously fail. > > If we are allocating and freeing kref-counted objects at a sufficiently > high frequency for this thing to make a difference then we should fix that > instead of trying to suck faster. Agreed. Correct code maintenance certainly trumps performance. For the same reason someone reusing the data structure shouldn't assume the kref code left it in any particular state. So both sides should be liberal in what they accept. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Thu, 14 Dec 2006 17:19:55 -0700 [EMAIL PROTECTED] (Eric W. Biederman) wrote: > "Pallipadi, Venkatesh" <[EMAIL PROTECTED]> writes: > > >>But I believe Venkatesh problem comes from its release() > >>function : It is > >>supposed to free the object. > >>If not, it should properly setup it so that further uses are OK. > >> > >>ie doing in release(kref) > >>atomic_set(>count, 0); > >> > > > > Agreed that setting kref refcnt to 0 in release will solve the probloem. > > But, once the optimization code is removed, we don't need to set it to > > zero as release will only be called after the count reaches zero anyway. > > The primary point of the optimization is to not write allocate a cache line > unnecessarily. I don't know it's value, but it can have one especially > on big way SMP machines. Guys, we have about 100 reports of weirdo crashes, smashes, bashes and splats in the kref code. The last thing we need is some obscure, tricksy little optimisation which leads legitimate uses of the API to mysteriously fail. If we are allocating and freeing kref-counted objects at a sufficiently high frequency for this thing to make a difference then we should fix that instead of trying to suck faster. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
"Pallipadi, Venkatesh" <[EMAIL PROTECTED]> writes: >>But I believe Venkatesh problem comes from its release() >>function : It is >>supposed to free the object. >>If not, it should properly setup it so that further uses are OK. >> >>ie doing in release(kref) >>atomic_set(>count, 0); >> > > Agreed that setting kref refcnt to 0 in release will solve the probloem. > But, once the optimization code is removed, we don't need to set it to > zero as release will only be called after the count reaches zero anyway. The primary point of the optimization is to not write allocate a cache line unnecessarily. I don't know it's value, but it can have one especially on big way SMP machines. If the optimization is not performed setting the value to 0 immediately there after has not real cost as your cpu has the dirty cache line already. If the optimization is performed you still have to dirty the cache line but at least you don't have to allocate it. How that compares to the branch mispredict in cost I don't know, except that cache line misses are the only operation that is generally more expensive than branch misses. So I see no virtue in avoiding the atomic_set(>count, 0) if you are about to immediately reuse the data structure. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: kref refcnt and false positives
>-Original Message- >From: Eric Dumazet [mailto:[EMAIL PROTECTED] >Sent: Wednesday, December 13, 2006 11:57 PM >To: Andrew Morton >Cc: Greg KH; Pallipadi, Venkatesh; Arjan; linux-kernel; Eric >W. Biederman >Subject: Re: kref refcnt and false positives > > >I agree this 'optimization' is not "good" (I was the guy who >suggested it >http://lkml.org/lkml/2006/1/30/4 ) > >After Eric Biederman message >(http://lkml.org/lkml/2006/1/30/292) I remember >adding some stat counters and telling Greg to not put the >patch in because >kref_put() was mostly called with refcount=1. But the patch >did its way. I >*did* ask Greg to revert it, but cannot find this mail >archived somewhere... > >But I believe Venkatesh problem comes from its release() >function : It is >supposed to free the object. >If not, it should properly setup it so that further uses are OK. > >ie doing in release(kref) >atomic_set(>count, 0); > Agreed that setting kref refcnt to 0 in release will solve the probloem. But, once the optimization code is removed, we don't need to set it to zero as release will only be called after the count reaches zero anyway. Thanks, Venki - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
Andrew Morton a écrit : On Wed, 13 Dec 2006 16:12:46 -0800 Greg KH <[EMAIL PROTECTED]> wrote: Original comment seemed to indicate that this conditional thing was performance related. Is it really? If not, we should consider the below patch. Yes, it's a performance gain and I don't see how this patch would change the above warning. I suspect it's a false optimisation. int kref_put(struct kref *kref, void (*release)(struct kref *kref)) { WARN_ON(release == NULL); WARN_ON(release == (void (*)(struct kref *))kfree); /* * if current count is one, we are the last user and can release object * right now, avoiding an atomic operation on 'refcount' */ if ((atomic_read(>refcount) == 1) || (atomic_dec_and_test(>refcount))) { release(kref); return 1; } return 0; } The only time we avoid the atomic_dec_and_test() is when the object is about to be freed. ie: once in its entire lifetime. And freeing the object is part of an expensive (and rare) operation anyway. otoh, we've gone and added a test-n-branch to the common case: those cases where the object will not be freed. I agree this 'optimization' is not "good" (I was the guy who suggested it http://lkml.org/lkml/2006/1/30/4 ) After Eric Biederman message (http://lkml.org/lkml/2006/1/30/292) I remember adding some stat counters and telling Greg to not put the patch in because kref_put() was mostly called with refcount=1. But the patch did its way. I *did* ask Greg to revert it, but cannot find this mail archived somewhere... But I believe Venkatesh problem comes from its release() function : It is supposed to free the object. If not, it should properly setup it so that further uses are OK. ie doing in release(kref) atomic_set(>count, 0); Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
Andrew Morton a écrit : On Wed, 13 Dec 2006 16:12:46 -0800 Greg KH [EMAIL PROTECTED] wrote: Original comment seemed to indicate that this conditional thing was performance related. Is it really? If not, we should consider the below patch. Yes, it's a performance gain and I don't see how this patch would change the above warning. I suspect it's a false optimisation. int kref_put(struct kref *kref, void (*release)(struct kref *kref)) { WARN_ON(release == NULL); WARN_ON(release == (void (*)(struct kref *))kfree); /* * if current count is one, we are the last user and can release object * right now, avoiding an atomic operation on 'refcount' */ if ((atomic_read(kref-refcount) == 1) || (atomic_dec_and_test(kref-refcount))) { release(kref); return 1; } return 0; } The only time we avoid the atomic_dec_and_test() is when the object is about to be freed. ie: once in its entire lifetime. And freeing the object is part of an expensive (and rare) operation anyway. otoh, we've gone and added a test-n-branch to the common case: those cases where the object will not be freed. I agree this 'optimization' is not good (I was the guy who suggested it http://lkml.org/lkml/2006/1/30/4 ) After Eric Biederman message (http://lkml.org/lkml/2006/1/30/292) I remember adding some stat counters and telling Greg to not put the patch in because kref_put() was mostly called with refcount=1. But the patch did its way. I *did* ask Greg to revert it, but cannot find this mail archived somewhere... But I believe Venkatesh problem comes from its release() function : It is supposed to free the object. If not, it should properly setup it so that further uses are OK. ie doing in release(kref) atomic_set(kref-count, 0); Eric - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: kref refcnt and false positives
-Original Message- From: Eric Dumazet [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 13, 2006 11:57 PM To: Andrew Morton Cc: Greg KH; Pallipadi, Venkatesh; Arjan; linux-kernel; Eric W. Biederman Subject: Re: kref refcnt and false positives I agree this 'optimization' is not good (I was the guy who suggested it http://lkml.org/lkml/2006/1/30/4 ) After Eric Biederman message (http://lkml.org/lkml/2006/1/30/292) I remember adding some stat counters and telling Greg to not put the patch in because kref_put() was mostly called with refcount=1. But the patch did its way. I *did* ask Greg to revert it, but cannot find this mail archived somewhere... But I believe Venkatesh problem comes from its release() function : It is supposed to free the object. If not, it should properly setup it so that further uses are OK. ie doing in release(kref) atomic_set(kref-count, 0); Agreed that setting kref refcnt to 0 in release will solve the probloem. But, once the optimization code is removed, we don't need to set it to zero as release will only be called after the count reaches zero anyway. Thanks, Venki - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
Pallipadi, Venkatesh [EMAIL PROTECTED] writes: But I believe Venkatesh problem comes from its release() function : It is supposed to free the object. If not, it should properly setup it so that further uses are OK. ie doing in release(kref) atomic_set(kref-count, 0); Agreed that setting kref refcnt to 0 in release will solve the probloem. But, once the optimization code is removed, we don't need to set it to zero as release will only be called after the count reaches zero anyway. The primary point of the optimization is to not write allocate a cache line unnecessarily. I don't know it's value, but it can have one especially on big way SMP machines. If the optimization is not performed setting the value to 0 immediately there after has not real cost as your cpu has the dirty cache line already. If the optimization is performed you still have to dirty the cache line but at least you don't have to allocate it. How that compares to the branch mispredict in cost I don't know, except that cache line misses are the only operation that is generally more expensive than branch misses. So I see no virtue in avoiding the atomic_set(kref-count, 0) if you are about to immediately reuse the data structure. Eric - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Thu, 14 Dec 2006 17:19:55 -0700 [EMAIL PROTECTED] (Eric W. Biederman) wrote: Pallipadi, Venkatesh [EMAIL PROTECTED] writes: But I believe Venkatesh problem comes from its release() function : It is supposed to free the object. If not, it should properly setup it so that further uses are OK. ie doing in release(kref) atomic_set(kref-count, 0); Agreed that setting kref refcnt to 0 in release will solve the probloem. But, once the optimization code is removed, we don't need to set it to zero as release will only be called after the count reaches zero anyway. The primary point of the optimization is to not write allocate a cache line unnecessarily. I don't know it's value, but it can have one especially on big way SMP machines. Guys, we have about 100 reports of weirdo crashes, smashes, bashes and splats in the kref code. The last thing we need is some obscure, tricksy little optimisation which leads legitimate uses of the API to mysteriously fail. If we are allocating and freeing kref-counted objects at a sufficiently high frequency for this thing to make a difference then we should fix that instead of trying to suck faster. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
Andrew Morton [EMAIL PROTECTED] writes: Guys, we have about 100 reports of weirdo crashes, smashes, bashes and splats in the kref code. The last thing we need is some obscure, tricksy little optimisation which leads legitimate uses of the API to mysteriously fail. If we are allocating and freeing kref-counted objects at a sufficiently high frequency for this thing to make a difference then we should fix that instead of trying to suck faster. Agreed. Correct code maintenance certainly trumps performance. For the same reason someone reusing the data structure shouldn't assume the kref code left it in any particular state. So both sides should be liberal in what they accept. Eric - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Wed, Dec 13, 2006 at 04:12:46PM -0800, Greg KH wrote: > On Wed, Dec 13, 2006 at 03:34:08PM -0800, Venkatesh Pallipadi wrote: > > > > With WARN_ON addition to kobject_init() > > [ > > http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/dont-use/broken-out/gregkh-driver-kobject-warn.patch > > ] > > > > I started seeing following WARNING on CPU offline followed by online on my > > x86_64 system. > > > > WARNING at lib/kobject.c:172 kobject_init() > > > > This is a false positive as mce.c is unregistering/registering sysfs > > interfaces cleanly on hotplug. > > The warning above tends to look like this is not true. > > > kref_put() and conditional decrement of refcnt seems to be the root cause > > for this and the patch below resolves the issue for me. > > Why? > > Are you properly initializing your kref to null before you register it > with the driver core? Or is it a static object? > Yes. arch/x86_64/kernel/mce.c is calling sysdev_register and sysdev_unregister for cpu hot_add and hot_remove respectively. The problem is that due to the perf optimization, refcnt remains at 1 even after unregister (as it never gets decremented to 0 in kref_put()). So, when we try to register next time for same sysdev (The object is percpu, so previously set refcnt will remain 1), init sees that refcnt is already 1 and print out the WARNing. > > Original comment seemed to indicate that this conditional thing was > > performance related. Is it really? If not, we should consider the below > > patch. > > Yes, it's a performance gain and I don't see how this patch would change > the above warning. > In that case, I think we should change the WARN_ON in question to check for > 1. Thanks, Venki - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Wed, 13 Dec 2006 16:12:46 -0800 Greg KH <[EMAIL PROTECTED]> wrote: > > Original comment seemed to indicate that this conditional thing was > > performance related. Is it really? If not, we should consider the below > > patch. > > Yes, it's a performance gain and I don't see how this patch would change > the above warning. I suspect it's a false optimisation. int kref_put(struct kref *kref, void (*release)(struct kref *kref)) { WARN_ON(release == NULL); WARN_ON(release == (void (*)(struct kref *))kfree); /* * if current count is one, we are the last user and can release object * right now, avoiding an atomic operation on 'refcount' */ if ((atomic_read(>refcount) == 1) || (atomic_dec_and_test(>refcount))) { release(kref); return 1; } return 0; } The only time we avoid the atomic_dec_and_test() is when the object is about to be freed. ie: once in its entire lifetime. And freeing the object is part of an expensive (and rare) operation anyway. otoh, we've gone and added a test-n-branch to the common case: those cases where the object will not be freed. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Wed, Dec 13, 2006 at 03:34:08PM -0800, Venkatesh Pallipadi wrote: > > With WARN_ON addition to kobject_init() > [ > http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/dont-use/broken-out/gregkh-driver-kobject-warn.patch > ] > > I started seeing following WARNING on CPU offline followed by online on my > x86_64 system. > > WARNING at lib/kobject.c:172 kobject_init() > > Call Trace: > [] dump_trace+0xaa/0x3ef > [] show_trace+0x3a/0x50 > [] dump_stack+0x15/0x17 > [] kobject_init+0x3f/0x8a > [] kobject_register+0x1a/0x3e > [] sysdev_register+0x5b/0xf9 > [] mce_create_device+0x77/0xf4 > [] mce_cpu_callback+0x3a/0xe5 > [] notifier_call_chain+0x26/0x3b > [] raw_notifier_call_chain+0x9/0xb > [] _cpu_up+0xb4/0xdc > [] cpu_up+0x2b/0x42 > [] store_online+0x4a/0x72 > [] sysdev_store+0x24/0x26 > [] sysfs_write_file+0xcf/0xfc > [] vfs_write+0xae/0x154 > [] sys_write+0x47/0x6f > [] system_call+0x7e/0x83 > DWARF2 unwinder stuck at system_call+0x7e/0x83 > Leftover inexact backtrace: > > This is a false positive as mce.c is unregistering/registering sysfs > interfaces cleanly on hotplug. The warning above tends to look like this is not true. > kref_put() and conditional decrement of refcnt seems to be the root cause > for this and the patch below resolves the issue for me. Why? Are you properly initializing your kref to null before you register it with the driver core? Or is it a static object? > Original comment seemed to indicate that this conditional thing was > performance related. Is it really? If not, we should consider the below patch. Yes, it's a performance gain and I don't see how this patch would change the above warning. thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Wed, Dec 13, 2006 at 03:34:08PM -0800, Venkatesh Pallipadi wrote: With WARN_ON addition to kobject_init() [ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/dont-use/broken-out/gregkh-driver-kobject-warn.patch ] I started seeing following WARNING on CPU offline followed by online on my x86_64 system. WARNING at lib/kobject.c:172 kobject_init() Call Trace: [8020ab45] dump_trace+0xaa/0x3ef [8020aec4] show_trace+0x3a/0x50 [8020b0f6] dump_stack+0x15/0x17 [80350abc] kobject_init+0x3f/0x8a [80350be1] kobject_register+0x1a/0x3e [803bbd89] sysdev_register+0x5b/0xf9 [80211d0b] mce_create_device+0x77/0xf4 [80211dc2] mce_cpu_callback+0x3a/0xe5 [805632fd] notifier_call_chain+0x26/0x3b [8023f6f3] raw_notifier_call_chain+0x9/0xb [802519bf] _cpu_up+0xb4/0xdc [80251a12] cpu_up+0x2b/0x42 [803bef00] store_online+0x4a/0x72 [803bb6ce] sysdev_store+0x24/0x26 [802baaa2] sysfs_write_file+0xcf/0xfc [8027fc6f] vfs_write+0xae/0x154 [80280418] sys_write+0x47/0x6f [8020963e] system_call+0x7e/0x83 DWARF2 unwinder stuck at system_call+0x7e/0x83 Leftover inexact backtrace: This is a false positive as mce.c is unregistering/registering sysfs interfaces cleanly on hotplug. The warning above tends to look like this is not true. kref_put() and conditional decrement of refcnt seems to be the root cause for this and the patch below resolves the issue for me. Why? Are you properly initializing your kref to null before you register it with the driver core? Or is it a static object? Original comment seemed to indicate that this conditional thing was performance related. Is it really? If not, we should consider the below patch. Yes, it's a performance gain and I don't see how this patch would change the above warning. thanks, greg k-h - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Wed, 13 Dec 2006 16:12:46 -0800 Greg KH [EMAIL PROTECTED] wrote: Original comment seemed to indicate that this conditional thing was performance related. Is it really? If not, we should consider the below patch. Yes, it's a performance gain and I don't see how this patch would change the above warning. I suspect it's a false optimisation. int kref_put(struct kref *kref, void (*release)(struct kref *kref)) { WARN_ON(release == NULL); WARN_ON(release == (void (*)(struct kref *))kfree); /* * if current count is one, we are the last user and can release object * right now, avoiding an atomic operation on 'refcount' */ if ((atomic_read(kref-refcount) == 1) || (atomic_dec_and_test(kref-refcount))) { release(kref); return 1; } return 0; } The only time we avoid the atomic_dec_and_test() is when the object is about to be freed. ie: once in its entire lifetime. And freeing the object is part of an expensive (and rare) operation anyway. otoh, we've gone and added a test-n-branch to the common case: those cases where the object will not be freed. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kref refcnt and false positives
On Wed, Dec 13, 2006 at 04:12:46PM -0800, Greg KH wrote: On Wed, Dec 13, 2006 at 03:34:08PM -0800, Venkatesh Pallipadi wrote: With WARN_ON addition to kobject_init() [ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/dont-use/broken-out/gregkh-driver-kobject-warn.patch ] I started seeing following WARNING on CPU offline followed by online on my x86_64 system. WARNING at lib/kobject.c:172 kobject_init() This is a false positive as mce.c is unregistering/registering sysfs interfaces cleanly on hotplug. The warning above tends to look like this is not true. kref_put() and conditional decrement of refcnt seems to be the root cause for this and the patch below resolves the issue for me. Why? Are you properly initializing your kref to null before you register it with the driver core? Or is it a static object? Yes. arch/x86_64/kernel/mce.c is calling sysdev_register and sysdev_unregister for cpu hot_add and hot_remove respectively. The problem is that due to the perf optimization, refcnt remains at 1 even after unregister (as it never gets decremented to 0 in kref_put()). So, when we try to register next time for same sysdev (The object is percpu, so previously set refcnt will remain 1), init sees that refcnt is already 1 and print out the WARNing. Original comment seemed to indicate that this conditional thing was performance related. Is it really? If not, we should consider the below patch. Yes, it's a performance gain and I don't see how this patch would change the above warning. In that case, I think we should change the WARN_ON in question to check for 1. Thanks, Venki - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/