Re: Debugging function of localcount

2017-11-14 Thread Ryota Ozaki
On Wed, Nov 15, 2017 at 3:13 AM, Taylor R Campbell
 wrote:
>> Date: Mon, 13 Nov 2017 13:59:32 +0900
>> From: Ryota Ozaki 
>>
>> I'm sorry for not replying to this thread. I'm back.
>>
>> I've revised the patch per your suggestions:
>>   http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised.diff
>
> Looks better, thanks!
>
> Couple nits:
>
> 1. Still need to volatile-qualify lc_refcnt.
> 2. Either use unsigned and atomic_cas_uint, or uint32_t and
>atomic_cas_32, but not a mixture.

Thanks, revised:
http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised2.diff

>
> Random thought: What about a per-LWP counter for diagnostics, with
> lwp_getspecific/setspecific?  No interprocessor overhead that way.
> Could even assert when an lwp exits, or when a softint returns, that
> no localcounts were leaked.
>
> (This slightly changes the semantics of localcount so that it cannot
> be passed from lwp to lwp (i.e., can't acquire in one and release in
> the other; must acquire in both and release in both), but I don't
> really see a problem with that.)

NACK: the current uses of localcount in IPsec depend on lwp migration
being permitted; a localcount reference is held across opencrypto
processing, which may use hardware offloading and resume in a
dedicated softint.

Thanks,
  ozaki-r


Re: Debugging function of localcount

2017-11-14 Thread Paul Goyette

On Tue, 14 Nov 2017, Taylor R Campbell wrote:


Date: Mon, 13 Nov 2017 13:59:32 +0900
From: Ryota Ozaki 

I'm sorry for not replying to this thread. I'm back.

I've revised the patch per your suggestions:
  http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised.diff


Looks better, thanks!

Couple nits:

1. Still need to volatile-qualify lc_refcnt.
2. Either use unsigned and atomic_cas_uint, or uint32_t and
  atomic_cas_32, but not a mixture.

Random thought: What about a per-LWP counter for diagnostics, with
lwp_getspecific/setspecific?  No interprocessor overhead that way.
Could even assert when an lwp exits, or when a softint returns, that
no localcounts were leaked.

(This slightly changes the semantics of localcount so that it cannot
be passed from lwp to lwp (i.e., can't acquire in one and release in
the other; must acquire in both and release in both), but I don't
really see a problem with that.)


It seems to me that there's really no good reason to prohibit migration 
from one lwp to another;  the only apparent benefit of having such a 
restriction is the ability to verify that the restriction isn't being 
violated.


If we're changing the semantics to disallow this, we should be sure to 
update the man-page, too.  (Actually, we should update the man page in 
any case, to document whether such migration between lwp's is permitted 
or disallowed.)



+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+


Re: Debugging function of localcount

2017-11-14 Thread Taylor R Campbell
> Date: Mon, 13 Nov 2017 13:59:32 +0900
> From: Ryota Ozaki 
> 
> I'm sorry for not replying to this thread. I'm back.
> 
> I've revised the patch per your suggestions:
>   http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised.diff

Looks better, thanks!

Couple nits:

1. Still need to volatile-qualify lc_refcnt.
2. Either use unsigned and atomic_cas_uint, or uint32_t and
   atomic_cas_32, but not a mixture.

Random thought: What about a per-LWP counter for diagnostics, with
lwp_getspecific/setspecific?  No interprocessor overhead that way.
Could even assert when an lwp exits, or when a softint returns, that
no localcounts were leaked.

(This slightly changes the semantics of localcount so that it cannot
be passed from lwp to lwp (i.e., can't acquire in one and release in
the other; must acquire in both and release in both), but I don't
really see a problem with that.)


Re: Debugging function of localcount

2017-11-12 Thread Ryota Ozaki
On Tue, Aug 8, 2017 at 3:16 AM, Taylor R Campbell
 wrote:
>> Date: Mon, 7 Aug 2017 13:08:35 +0900
>> From: Ryota Ozaki 
>>
>> Anyway I've brought another (stupid) approach that uses an atomic
>> counter:
>>   http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.diff
>>
>> It counts up a counter with atomic instructions alongside localcount
>> if LOCKDEBUG is enabled. Of course this defeats the benefit of
>> localcount, but it provides an accurate view of the counter.
>>
>> Do you think that it works and is useful?
>
> That looks plausible.  I'd put it under DEBUG && LOCKDEBUG, and
> require localcount_debug_refcnt to be used only in KDASSERT, to avoid
> adding yet more contention to plain LOCKDEBUG, and to make it safe to
> write, e.g.,
>
> KDASSERT(localcount_debug_refcnt(lc) >= 2).
>
> Otherwise, under, e.g., DIAGNOSTIC && !LOCKDEBUG,
>
> KASSERT(localcount_debug_refcnt(lc) >= 2)
>
> would always fail, since localcount_debug_refcnt always returns 0.
>
> Tiny nit: I suggest you always use uint32_t for this, and use
> atomic_dec_32_nv(&lc->lc_refcnt) == UINT32_MAX to detect underflow,
> rather than conversion to int32_t.  Maybe check-and-panic on overflow
> too: if atomic_inc_32_nv(&lc->lc_refcnt) == 0, panic.
>
> Problem: The ABI is different with and without LOCKDEBUG.  The
> structure member can't be conditional.  I don't see a problem with
> always adding another 32-bit (or even 64-bit) member to
> struct localcount -- they already require O(nobj * ncpu) memory, so an
> extra handful of bytes per object is inconsequential -- so I'd just
> make it unconditional.
>
> No need for #include  in localcount.h.  lc_refcnt member
> should be volatile-qualified.

I'm sorry for not replying to this thread. I'm back.

I've revised the patch per your suggestions:
  http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.revised.diff

Thanks,
  ozaki-r


Re: Debugging function of localcount

2017-08-07 Thread Taylor R Campbell
> Date: Mon, 7 Aug 2017 13:08:35 +0900
> From: Ryota Ozaki 
> 
> Anyway I've brought another (stupid) approach that uses an atomic
> counter:
>   http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.diff
> 
> It counts up a counter with atomic instructions alongside localcount
> if LOCKDEBUG is enabled. Of course this defeats the benefit of
> localcount, but it provides an accurate view of the counter.
> 
> Do you think that it works and is useful?

That looks plausible.  I'd put it under DEBUG && LOCKDEBUG, and
require localcount_debug_refcnt to be used only in KDASSERT, to avoid
adding yet more contention to plain LOCKDEBUG, and to make it safe to
write, e.g.,

KDASSERT(localcount_debug_refcnt(lc) >= 2).

Otherwise, under, e.g., DIAGNOSTIC && !LOCKDEBUG,

KASSERT(localcount_debug_refcnt(lc) >= 2)

would always fail, since localcount_debug_refcnt always returns 0.

Tiny nit: I suggest you always use uint32_t for this, and use
atomic_dec_32_nv(&lc->lc_refcnt) == UINT32_MAX to detect underflow,
rather than conversion to int32_t.  Maybe check-and-panic on overflow
too: if atomic_inc_32_nv(&lc->lc_refcnt) == 0, panic.

Problem: The ABI is different with and without LOCKDEBUG.  The
structure member can't be conditional.  I don't see a problem with
always adding another 32-bit (or even 64-bit) member to
struct localcount -- they already require O(nobj * ncpu) memory, so an
extra handful of bytes per object is inconsequential -- so I'd just
make it unconditional.

No need for #include  in localcount.h.  lc_refcnt member
should be volatile-qualified.


Re: Debugging function of localcount

2017-08-06 Thread Ryota Ozaki
On Thu, Jul 27, 2017 at 1:49 AM, Taylor R Campbell
 wrote:
>> Date: Wed, 26 Jul 2017 18:41:36 +0900
>> From: Ryota Ozaki 
>>
>> Oops. I thought we could use percpu_foreach in a softint, but
>> I was wrong. So localcount_debug_refcnt can be used only
>> in thread contexts. And the above assertion shouldn't be
>> there. The proposal isn't so worthwhile ;-<
>
> You also can't access one CPU's count from another CPU -- there's no
> memory ordering, so you can easily get wildly inconsistent answers.
>
> Maybe if you did a cross-call to count it up, that would work, but it
> sounds like for the cases you had in mind, you can't do a cross-call,
> and I'm not sure the view is guaranteed to be stable enough: maybe one
> thread decrements the count on CPU 0, then you count 'em up, then the
> thread migrates to CPU 1 and increments the count there.  (It is not a
> theorem that something like this might violate your assertion, just a
> conjecture!)

Yes, so I expected it to be used only in very limited environments,
like ATF tests using rump kernels.

Anyway I've brought another (stupid) approach that uses an atomic
counter:
  http://www.netbsd.org/~ozaki-r/localcount_debug_atomic.diff

It counts up a counter with atomic instructions alongside localcount
if LOCKDEBUG is enabled. Of course this defeats the benefit of
localcount, but it provides an accurate view of the counter.

Do you think that it works and is useful?

  ozaki-r


Re: Debugging function of localcount

2017-07-26 Thread Taylor R Campbell
> Date: Wed, 26 Jul 2017 18:41:36 +0900
> From: Ryota Ozaki 
> 
> Oops. I thought we could use percpu_foreach in a softint, but
> I was wrong. So localcount_debug_refcnt can be used only
> in thread contexts. And the above assertion shouldn't be
> there. The proposal isn't so worthwhile ;-<

You also can't access one CPU's count from another CPU -- there's no
memory ordering, so you can easily get wildly inconsistent answers.

Maybe if you did a cross-call to count it up, that would work, but it
sounds like for the cases you had in mind, you can't do a cross-call,
and I'm not sure the view is guaranteed to be stable enough: maybe one
thread decrements the count on CPU 0, then you count 'em up, then the
thread migrates to CPU 1 and increments the count there.  (It is not a
theorem that something like this might violate your assertion, just a
conjecture!)


Re: Debugging function of localcount

2017-07-26 Thread Ryota Ozaki
On Wed, Jul 26, 2017 at 1:34 PM, Ryota Ozaki  wrote:
> (resending, please ignore if duplicated)
>
> Hi,
>
> I propose a debugging function of localcount,
> localcount_debug_refcnt, which returns a total
> reference count of a specified localcount.
>
> A result of localcount_debug_refcnt isn't guaranteed
> to be accurate; however, it's still useful for
> debugging purposes, for example in ATF tests
> using rump kernels.
>
> Here is a patch:
>   http://www.netbsd.org/~ozaki-r/localcount_debug_refcnt.diff
>
> I added the following assertion in localcount_release:
>   KDASSERT(localcount_debug_refcnt(lc) >= 0);
>
> IIUC it's a proper constraint of localcount even if
> a result of localcount_debug_refcnt isn't accurate.
>
> Any comments or suggestions?

Oops. I thought we could use percpu_foreach in a softint, but
I was wrong. So localcount_debug_refcnt can be used only
in thread contexts. And the above assertion shouldn't be
there. The proposal isn't so worthwhile ;-<

  ozaki-r