Re: Debugging function of localcount
On Wed, Nov 15, 2017 at 3:13 AM, Taylor R Campbell wrote:
>> Date: Mon, 13 Nov 2017 13:59:32 +0900
>> From: Ryota Ozaki
>>
>> I'm sorry for not replying to this thread. I'm back.
>>
>> I've revised the patch per your suggestions:
>> http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised.diff
>
> Looks better, thanks!
>
> Couple nits:
>
> 1. Still need to volatile-qualify lc_refcnt.
> 2. Either use unsigned and atomic_cas_uint, or uint32_t and
>    atomic_cas_32, but not a mixture.

Thanks, revised:
http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised2.diff

> Random thought: What about a per-LWP counter for diagnostics, with
> lwp_getspecific/setspecific? No interprocessor overhead that way.
> Could even assert when an lwp exits, or when a softint returns, that
> no localcounts were leaked.
>
> (This slightly changes the semantics of localcount so that it cannot
> be passed from lwp to lwp (i.e., can't acquire in one and release in
> the other; must acquire in both and release in both), but I don't
> really see a problem with that.)

NACK, because current uses of localcount in IPsec depend on lwp
migration being permitted: a localcount reference is held across
opencrypto processing, which may use hardware offloading and resume
its processing in a dedicated softint.

Thanks,
  ozaki-r
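The type-consistency nit above (use either unsigned with atomic_cas_uint, or uint32_t with atomic_cas_32, never a mixture) can be sketched in userspace C11, with atomic_compare_exchange_weak standing in for the kernel's atomic_cas_32. This is an illustrative sketch, not the actual patch; the function name is hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * One consistent 32-bit type end to end: uint32_t values compared and
 * swapped with a 32-bit CAS. In the kernel the field would be a
 * volatile uint32_t operated on with atomic_cas_32.
 */
static bool
debug_refcnt_try_inc(_Atomic uint32_t *refcnt)
{
	uint32_t ov, nv;

	ov = atomic_load(refcnt);
	do {
		if (ov == UINT32_MAX)
			return false;	/* would overflow; kernel would panic() */
		nv = ov + 1;
		/* On failure, ov is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(refcnt, &ov, nv));
	return true;
}
```

The CAS loop lets the increment detect overflow before publishing the new value, which a plain atomic increment cannot do.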
Re: Debugging function of localcount
On Tue, 14 Nov 2017, Taylor R Campbell wrote:
>> Date: Mon, 13 Nov 2017 13:59:32 +0900
>> From: Ryota Ozaki
>>
>> I'm sorry for not replying to this thread. I'm back.
>>
>> I've revised the patch per your suggestions:
>> http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised.diff
>
> Looks better, thanks!
>
> Couple nits:
>
> 1. Still need to volatile-qualify lc_refcnt.
> 2. Either use unsigned and atomic_cas_uint, or uint32_t and
>    atomic_cas_32, but not a mixture.
>
> Random thought: What about a per-LWP counter for diagnostics, with
> lwp_getspecific/setspecific? No interprocessor overhead that way.
> Could even assert when an lwp exits, or when a softint returns, that
> no localcounts were leaked.
>
> (This slightly changes the semantics of localcount so that it cannot
> be passed from lwp to lwp (i.e., can't acquire in one and release in
> the other; must acquire in both and release in both), but I don't
> really see a problem with that.)

It seems to me that there's really no good reason to prohibit migration
from one lwp to another; the only apparent benefit of having such a
restriction is the ability to verify that the restriction isn't being
violated.

If we're changing the semantics to disallow this, we should be sure to
update the man page, too. (Actually, we should update the man page in
any case, to document whether such migration between lwp's is permitted
or disallowed.)

+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+
Re: Debugging function of localcount
> Date: Mon, 13 Nov 2017 13:59:32 +0900
> From: Ryota Ozaki
>
> I'm sorry for not replying to this thread. I'm back.
>
> I've revised the patch per your suggestions:
> http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised.diff

Looks better, thanks!

Couple nits:

1. Still need to volatile-qualify lc_refcnt.
2. Either use unsigned and atomic_cas_uint, or uint32_t and
   atomic_cas_32, but not a mixture.

Random thought: What about a per-LWP counter for diagnostics, with
lwp_getspecific/setspecific? No interprocessor overhead that way.
Could even assert when an lwp exits, or when a softint returns, that
no localcounts were leaked.

(This slightly changes the semantics of localcount so that it cannot
be passed from lwp to lwp (i.e., can't acquire in one and release in
the other; must acquire in both and release in both), but I don't
really see a problem with that.)
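The per-LWP counter idea above can be sketched in userspace C11, with a thread-local variable standing in for lwp_getspecific/setspecific. All names here are hypothetical, not kernel API; this is only an illustration of the proposed semantics:

```c
#include <assert.h>

/*
 * Per-thread diagnostic count of localcount references held.
 * In the kernel this would live in lwp-specific data; no
 * interprocessor synchronization is needed because each lwp
 * only ever touches its own counter.
 */
static _Thread_local int lwp_localcount_held;

static void
lwp_localcount_acquire(void)
{
	lwp_localcount_held++;
}

static void
lwp_localcount_release(void)
{
	lwp_localcount_held--;
}

/*
 * What the proposed lwp-exit/softint-return assertion would check:
 * every acquire on this lwp was matched by a release on the same lwp,
 * i.e. no localcount reference leaked (or migrated to another lwp).
 */
static void
lwp_localcount_assert_drained(void)
{
	assert(lwp_localcount_held == 0);
}
```

Note that this is exactly the semantic change discussed above: a reference acquired on one lwp and released on another would leave one counter at +1 and the other at -1, so both lwps would trip the drain assertion.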
Re: Debugging function of localcount
On Tue, Aug 8, 2017 at 3:16 AM, Taylor R Campbell wrote:
>> Date: Mon, 7 Aug 2017 13:08:35 +0900
>> From: Ryota Ozaki
>>
>> Anyway I've brought another (stupid) approach that uses an atomic
>> counter:
>> http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.diff
>>
>> It counts up a counter with atomic instructions along with localcount
>> if LOCKDEBUG is enabled. Of course it defeats the benefit of
>> localcount, but it provides an accurate view of the counter.
>>
>> Do you think that it works and is useful?
>
> That looks plausible. I'd put it under DEBUG && LOCKDEBUG, and
> require localcount_debug_refcnt to be used only in KDASSERT, to avoid
> adding yet more contention to plain LOCKDEBUG, and to make it safe to
> write, e.g.,
>
>	KDASSERT(localcount_debug_refcnt(lc) >= 2).
>
> Otherwise, under, e.g., DIAGNOSTIC && !LOCKDEBUG,
>
>	KASSERT(localcount_debug_refcnt(lc) >= 2)
>
> would always fail, since localcount_debug_refcnt always returns 0.
>
> Tiny nit: I suggest you always use uint32_t for this, and use
> atomic_dec_32_nv(&lc->lc_refcnt) == UINT32_MAX to detect underflow,
> rather than conversion to int32_t. Maybe check-and-panic on overflow
> too: if atomic_inc_32_nv(&lc->lc_refcnt) == 0, panic.
>
> Problem: The ABI is different with and without LOCKDEBUG. The
> structure member can't be conditional. I don't see a problem with
> always adding another 32-bit (or even 64-bit) member to struct
> localcount -- they already require O(nobj * ncpu) memory, so an
> extra handful of bytes per object is inconsequential -- so I'd just
> make it unconditional.
>
> No need for #include in localcount.h. lc_refcnt member
> should be volatile-qualified.

I'm sorry for not replying to this thread. I'm back.

I've revised the patch per your suggestions:
http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised.diff

Thanks,
  ozaki-r
Re: Debugging function of localcount
> Date: Mon, 7 Aug 2017 13:08:35 +0900
> From: Ryota Ozaki
>
> Anyway I've brought another (stupid) approach that uses an atomic
> counter:
> http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.diff
>
> It counts up a counter with atomic instructions along with localcount
> if LOCKDEBUG is enabled. Of course it defeats the benefit of
> localcount, but it provides an accurate view of the counter.
>
> Do you think that it works and is useful?

That looks plausible. I'd put it under DEBUG && LOCKDEBUG, and
require localcount_debug_refcnt to be used only in KDASSERT, to avoid
adding yet more contention to plain LOCKDEBUG, and to make it safe to
write, e.g.,

	KDASSERT(localcount_debug_refcnt(lc) >= 2).

Otherwise, under, e.g., DIAGNOSTIC && !LOCKDEBUG,

	KASSERT(localcount_debug_refcnt(lc) >= 2)

would always fail, since localcount_debug_refcnt always returns 0.

Tiny nit: I suggest you always use uint32_t for this, and use
atomic_dec_32_nv(&lc->lc_refcnt) == UINT32_MAX to detect underflow,
rather than conversion to int32_t. Maybe check-and-panic on overflow
too: if atomic_inc_32_nv(&lc->lc_refcnt) == 0, panic.

Problem: The ABI is different with and without LOCKDEBUG. The
structure member can't be conditional. I don't see a problem with
always adding another 32-bit (or even 64-bit) member to struct
localcount -- they already require O(nobj * ncpu) memory, so an
extra handful of bytes per object is inconsequential -- so I'd just
make it unconditional.

No need for #include in localcount.h. lc_refcnt member
should be volatile-qualified.
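The suggested overflow/underflow checks can be sketched in userspace C11, with atomic_fetch_add/atomic_fetch_sub standing in for the kernel's atomic_inc_32_nv/atomic_dec_32_nv and abort() standing in for panic(). The struct and function names are illustrative, not the actual patch:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for struct localcount with the unconditional
 * debug member. In the kernel the member would be volatile uint32_t;
 * here _Atomic provides the equivalent guarantees.
 */
struct localcount_sketch {
	_Atomic uint32_t lc_refcnt;
};

/* Analogue of atomic_inc_32_nv: add 1, return the new value. */
static uint32_t
debug_refcnt_acquire(struct localcount_sketch *lc)
{
	uint32_t nv = atomic_fetch_add(&lc->lc_refcnt, 1) + 1;

	if (nv == 0)		/* wrapped past UINT32_MAX: overflow */
		abort();	/* kernel would panic() here */
	return nv;
}

/* Analogue of atomic_dec_32_nv: subtract 1, return the new value. */
static uint32_t
debug_refcnt_release(struct localcount_sketch *lc)
{
	uint32_t nv = atomic_fetch_sub(&lc->lc_refcnt, 1) - 1;

	if (nv == UINT32_MAX)	/* wrapped below zero: underflow */
		abort();	/* kernel would panic() here */
	return nv;
}
```

Because the counter is unsigned, underflow shows up as a wrap to UINT32_MAX rather than a negative value, which is why the check compares against UINT32_MAX instead of converting to int32_t.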
Re: Debugging function of localcount
On Thu, Jul 27, 2017 at 1:49 AM, Taylor R Campbell wrote:
>> Date: Wed, 26 Jul 2017 18:41:36 +0900
>> From: Ryota Ozaki
>>
>> Oops. I thought we could use percpu_foreach in softint, but
>> I was wrong. So localcount_debug_refcnt can be used only
>> in thread contexts. And the above assertion shouldn't be
>> there. The proposal isn't so worthwhile ;-<
>
> You also can't access one CPU's count from another CPU -- there's no
> memory ordering, so you can easily get wildly inconsistent answers.
>
> Maybe if you did a cross-call to count it up, that would work, but it
> sounds like for the cases you had in mind, you can't do a cross-call,
> and I'm not sure the view is guaranteed to be stable enough: maybe one
> thread decrements the count on CPU 0, then you count 'em up, then the
> thread migrates to CPU 1 and increments the count there. (It is not a
> theorem that something like this might violate your assertion, just a
> conjecture!)

Yes, so I expected it to be used only in very limited environments,
like ATF tests using rump kernels.

Anyway I've brought another (stupid) approach that uses an atomic
counter:
http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.diff

It counts up a counter with atomic instructions along with localcount
if LOCKDEBUG is enabled. Of course it defeats the benefit of
localcount, but it provides an accurate view of the counter.

Do you think that it works and is useful?

  ozaki-r
Re: Debugging function of localcount
> Date: Wed, 26 Jul 2017 18:41:36 +0900
> From: Ryota Ozaki
>
> Oops. I thought we could use percpu_foreach in softint, but
> I was wrong. So localcount_debug_refcnt can be used only
> in thread contexts. And the above assertion shouldn't be
> there. The proposal isn't so worthwhile ;-<

You also can't access one CPU's count from another CPU -- there's no
memory ordering, so you can easily get wildly inconsistent answers.

Maybe if you did a cross-call to count it up, that would work, but it
sounds like for the cases you had in mind, you can't do a cross-call,
and I'm not sure the view is guaranteed to be stable enough: maybe one
thread decrements the count on CPU 0, then you count 'em up, then the
thread migrates to CPU 1 and increments the count there. (It is not a
theorem that something like this might violate your assertion, just a
conjecture!)
Re: Debugging function of localcount
On Wed, Jul 26, 2017 at 1:34 PM, Ryota Ozaki wrote:
> (resending, please ignore if duplicated)
>
> Hi,
>
> I propose a debugging function for localcount,
> localcount_debug_refcnt, which returns the total
> reference count of a specified localcount.
>
> The result of localcount_debug_refcnt isn't guaranteed
> to be accurate; however, it's still useful for
> debugging purposes, for example debugging ATF tests
> using rump kernels.
>
> Here is a patch:
> http://www.netbsd.org/~ozaki-r/localcount_debug_refcnt.diff
>
> I added the following assertion in localcount_release:
>
>	KDASSERT(localcount_debug_refcnt(lc) >= 0);
>
> IIUC it's a proper constraint of localcount even if
> the result of localcount_debug_refcnt isn't accurate.
>
> Any comments or suggestions?

Oops. I thought we could use percpu_foreach in softint, but
I was wrong. So localcount_debug_refcnt can be used only
in thread contexts. And the above assertion shouldn't be
there. The proposal isn't so worthwhile ;-<

  ozaki-r
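The inaccuracy discussed throughout this thread comes from summing independent per-CPU counters. A minimal sketch of that summation, with a plain array standing in for the percpu data walked by percpu_foreach (names and CPU count hypothetical):

```c
#include <stdint.h>

#define NCPU_SKETCH 4	/* illustrative CPU count */

/*
 * Per-CPU counts may individually go negative: localcount only
 * requires that the total across all CPUs be the true reference
 * count, so a reference acquired on one CPU may be released on
 * another. Summing the entries reconstructs the total -- but only
 * if nothing moves while we walk them, which is why the real
 * localcount_debug_refcnt is restricted to thread context and is
 * still not guaranteed to be accurate.
 */
static int64_t
debug_refcnt_sum(const int64_t percpu_cnt[NCPU_SKETCH])
{
	int64_t total = 0;

	for (int i = 0; i < NCPU_SKETCH; i++)
		total += percpu_cnt[i];
	return total;
}
```

For example, a single reference acquired on CPU 0 and released on CPU 2 leaves the per-CPU counts at { +1, 0, -1, 0 }: the total is correctly 0 even though individual entries are nonzero, and a scan that observes the -1 before the +1 is written sees a transiently wrong total.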