Re: hppa MI mutex

2018-04-03 Thread Mark Kettenis
> Date: Tue, 3 Apr 2018 12:36:18 +0200
> From: Martin Pieuchot 
> 
> On 03/04/18(Tue) 12:06, Mark Kettenis wrote:
> > > Date: Tue, 3 Apr 2018 11:16:45 +0200
> > > From: Martin Pieuchot 
> > > 
> > > Here's a diff to switch hppa to the MI mutex implementation.  I'm
> > > looking for testers, as I don't own such hardware.
> > > 
> > > Note that our MI mutex implementation relies on atomic_cas_ptr(9).  On
> > > hppa all the calls to this function are serialized on a single lock. I
> > > don't believe it will introduce contention at this point since the
> > > KERNEL_LOCK() is still the problem.  However if it becomes a bottleneck
> > > we could use multiple locks in a small hash table.
> > 
> > I don't have such hardware either, but I'm not in favour of doing
> > this.  PA-RISC is "challenged" but at least the locking primitive it
> > supplies is suitable for implementing spin locks.  Emulating CAS with
> > spin locks to implement a spin lock feels so wrong.  There are also
> > some concerns about guarantees of forward progress in such an
> > implementation.
> 
> Well I would prefer if we could solve these concerns because we are already
> using atomic operations.
> 
> So what are the options?
> 
> 1. Keep hppa mutex MD and start diverging with the MI mutex
> 
> 2. Add another MI layer of abstraction on top of the atomic*(9) API
>because of hppa
> 
> 3. Emulate the atomic*(9) API as well as we can on hppa
> 
> I picked 3 because I don't want to add more complexity for an arch
> which is not widely used.  Who care if it feels wrong to emulate a CAS
> with a hw spinlock?  I believed that what we need is locking primitives
> that we can debug and analyse.  If hppa can benefit from that and help us
> find bugs, that's even better :)

Option 1. doesn't rule this out.  Having most (but not all)
architectures using the MI implementation is still a win.  The hppa
mutex code is already written in C so it would be fairly easy do
adjust it to provide the equivalent diagnostics.  I can give that a go
if you want me to.



Re: hppa MI mutex

2018-04-03 Thread Martin Pieuchot
On 03/04/18(Tue) 12:06, Mark Kettenis wrote:
> > Date: Tue, 3 Apr 2018 11:16:45 +0200
> > From: Martin Pieuchot 
> > 
> > Here's a diff to switch hppa to the MI mutex implementation.  I'm
> > looking for testers, as I don't own such hardware.
> > 
> > Note that our MI mutex implementation relies on atomic_cas_ptr(9).  On
> > hppa all the calls to this function are serialized on a single lock. I
> > don't believe it will introduce contention at this point since the
> > KERNEL_LOCK() is still the problem.  However if it becomes a bottleneck
> > we could use multiple locks in a small hash table.
> 
> I don't have such hardware either, but I'm not in favour of doing
> this.  PA-RISC is "challenged" but at least the locking primitive it
> supplies is suitable for implementing spin locks.  Emulating CAS with
> spin locks to implement a spin lock feels so wrong.  There are also
> some concerns about guarantees of forward progress in such an
> implementation.

Well I would prefer if we could solve these concerns because we are already
using atomic operations.

So what are the options?

1. Keep hppa mutex MD and start diverging with the MI mutex

2. Add another MI layer of abstraction on top of the atomic*(9) API
   because of hppa

3. Emulate the atomic*(9) API as well as we can on hppa

I picked 3 because I don't want to add more complexity for an arch
which is not widely used.  Who care if it feels wrong to emulate a CAS
with a hw spinlock?  I believed that what we need is locking primitives
that we can debug and analyse.  If hppa can benefit from that and help us
find bugs, that's even better :)

Do you see another alternative?



Re: hppa MI mutex

2018-04-03 Thread Mark Kettenis
> Date: Tue, 3 Apr 2018 11:16:45 +0200
> From: Martin Pieuchot 
> 
> Here's a diff to switch hppa to the MI mutex implementation.  I'm
> looking for testers, as I don't own such hardware.
> 
> Note that our MI mutex implementation relies on atomic_cas_ptr(9).  On
> hppa all the calls to this function are serialized on a single lock. I
> don't believe it will introduce contention at this point since the
> KERNEL_LOCK() is still the problem.  However if it becomes a bottleneck
> we could use multiple locks in a small hash table.

I don't have such hardware either, but I'm not in favour of doing
this.  PA-RISC is "challenged" but at least the locking primitive it
supplies is suitable for implementing spin locks.  Emulating CAS with
spin locks to implement a spin lock feels so wrong.  There are also
some concerns about guarantees of forward progress in such an
implementation.

> Index: arch/hppa/conf/files.hppa
> ===
> RCS file: /cvs/src/sys/arch/hppa/conf/files.hppa,v
> retrieving revision 1.97
> diff -u -p -r1.97 files.hppa
> --- arch/hppa/conf/files.hppa 5 Jun 2017 18:59:07 -   1.97
> +++ arch/hppa/conf/files.hppa 12 Feb 2018 10:08:57 -
> @@ -298,7 +298,6 @@ file  arch/hppa/hppa/fpu.c
>  file arch/hppa/hppa/ipi.cmultiprocessor
>  file arch/hppa/hppa/lock_machdep.c   multiprocessor
>  file arch/hppa/hppa/machdep.c
> -file arch/hppa/hppa/mutex.c
>  file arch/hppa/hppa/pmap.c
>  file arch/hppa/hppa/process_machdep.c
>  file arch/hppa/hppa/sys_machdep.c
> Index: arch/hppa/hppa/genassym.cf
> ===
> RCS file: /cvs/src/sys/arch/hppa/hppa/genassym.cf,v
> retrieving revision 1.47
> diff -u -p -r1.47 genassym.cf
> --- arch/hppa/hppa/genassym.cf9 Feb 2015 08:20:13 -   1.47
> +++ arch/hppa/hppa/genassym.cf12 Feb 2018 10:15:56 -
> @@ -45,7 +45,7 @@ include 
>  include 
>  include 
>  include 
> -include 
> +include 
>  include 
>  include 
>  include 
> Index: arch/hppa/hppa/ipi.c
> ===
> RCS file: /cvs/src/sys/arch/hppa/hppa/ipi.c,v
> retrieving revision 1.5
> diff -u -p -r1.5 ipi.c
> --- arch/hppa/hppa/ipi.c  8 Sep 2017 05:36:51 -   1.5
> +++ arch/hppa/hppa/ipi.c  12 Feb 2018 10:16:01 -
> @@ -25,7 +25,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  
>  void hppa_ipi_nop(void);
> Index: arch/hppa/hppa/mutex.c
> ===
> RCS file: arch/hppa/hppa/mutex.c
> diff -N arch/hppa/hppa/mutex.c
> --- arch/hppa/hppa/mutex.c20 Apr 2017 13:57:29 -  1.16
> +++ /dev/null 1 Jan 1970 00:00:00 -
> @@ -1,156 +0,0 @@
> -/*   $OpenBSD: mutex.c,v 1.16 2017/04/20 13:57:29 visa Exp $ */
> -
> -/*
> - * Copyright (c) 2004 Artur Grabowski 
> - * All rights reserved. 
> - *
> - * Redistribution and use in source and binary forms, with or without 
> - * modification, are permitted provided that the following conditions 
> - * are met: 
> - *
> - * 1. Redistributions of source code must retain the above copyright 
> - *notice, this list of conditions and the following disclaimer. 
> - * 2. The name of the author may not be used to endorse or promote products
> - *derived from this software without specific prior written permission. 
> - *
> - * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
> - * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
> - * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> - * THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
> - * EXEMPLARY, OR CONSEQUENTIAL  DAMAGES (INCLUDING, BUT NOT LIMITED TO,
> - * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR 
> PROFITS;
> - * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
> - * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
> - * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
> - * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
> - */
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -
> -#include 
> -
> -#include 
> -
> -int __mtx_enter_try(struct mutex *);
> -
> -#ifdef MULTIPROCESSOR
> -/* Note: lock must be 16-byte aligned. */
> -#define __mtx_lock(mtx) ((int *)(((vaddr_t)mtx->mtx_lock + 0xf) & ~0xf))
> -#endif
> -
> -void
> -__mtx_init(struct mutex *mtx, int wantipl)
> -{
> -#ifdef MULTIPROCESSOR
> - mtx->mtx_lock[0] = 1;
> - mtx->mtx_lock[1] = 1;
> - mtx->mtx_lock[2] = 1;
> - mtx->mtx_lock[3] = 1;
> -#endif
> - mtx->mtx_wantipl = wantipl;
> - mtx->mtx_oldipl = IPL_NONE;
> - mtx->mtx_owner = NULL;
> -}
> -
> -#ifdef MULTIPROCESSOR
> -void
> -__mtx_enter(struct mutex *mtx)
> -{
> - while (__mtx_enter_try(mtx) == 0)
> - 

hppa MI mutex

2018-04-03 Thread Martin Pieuchot
Here's a diff to switch hppa to the MI mutex implementation.  I'm
looking for testers, as I don't own such hardware.

Note that our MI mutex implementation relies on atomic_cas_ptr(9).  On
hppa all the calls to this function are serialized on a single lock. I
don't believe it will introduce contention at this point since the
KERNEL_LOCK() is still the problem.  However if it becomes a bottleneck
we could use multiple locks in a small hash table.


Index: arch/hppa/conf/files.hppa
===
RCS file: /cvs/src/sys/arch/hppa/conf/files.hppa,v
retrieving revision 1.97
diff -u -p -r1.97 files.hppa
--- arch/hppa/conf/files.hppa   5 Jun 2017 18:59:07 -   1.97
+++ arch/hppa/conf/files.hppa   12 Feb 2018 10:08:57 -
@@ -298,7 +298,6 @@ filearch/hppa/hppa/fpu.c
 file   arch/hppa/hppa/ipi.cmultiprocessor
 file   arch/hppa/hppa/lock_machdep.c   multiprocessor
 file   arch/hppa/hppa/machdep.c
-file   arch/hppa/hppa/mutex.c
 file   arch/hppa/hppa/pmap.c
 file   arch/hppa/hppa/process_machdep.c
 file   arch/hppa/hppa/sys_machdep.c
Index: arch/hppa/hppa/genassym.cf
===
RCS file: /cvs/src/sys/arch/hppa/hppa/genassym.cf,v
retrieving revision 1.47
diff -u -p -r1.47 genassym.cf
--- arch/hppa/hppa/genassym.cf  9 Feb 2015 08:20:13 -   1.47
+++ arch/hppa/hppa/genassym.cf  12 Feb 2018 10:15:56 -
@@ -45,7 +45,7 @@ include 
 include 
 include 
 include 
-include 
+include 
 include 
 include 
 include 
Index: arch/hppa/hppa/ipi.c
===
RCS file: /cvs/src/sys/arch/hppa/hppa/ipi.c,v
retrieving revision 1.5
diff -u -p -r1.5 ipi.c
--- arch/hppa/hppa/ipi.c8 Sep 2017 05:36:51 -   1.5
+++ arch/hppa/hppa/ipi.c12 Feb 2018 10:16:01 -
@@ -25,7 +25,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 void hppa_ipi_nop(void);
Index: arch/hppa/hppa/mutex.c
===
RCS file: arch/hppa/hppa/mutex.c
diff -N arch/hppa/hppa/mutex.c
--- arch/hppa/hppa/mutex.c  20 Apr 2017 13:57:29 -  1.16
+++ /dev/null   1 Jan 1970 00:00:00 -
@@ -1,156 +0,0 @@
-/* $OpenBSD: mutex.c,v 1.16 2017/04/20 13:57:29 visa Exp $ */
-
-/*
- * Copyright (c) 2004 Artur Grabowski 
- * All rights reserved. 
- *
- * Redistribution and use in source and binary forms, with or without 
- * modification, are permitted provided that the following conditions 
- * are met: 
- *
- * 1. Redistributions of source code must retain the above copyright 
- *notice, this list of conditions and the following disclaimer. 
- * 2. The name of the author may not be used to endorse or promote products
- *derived from this software without specific prior written permission. 
- *
- * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
- * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
- * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
- * THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL  DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
- * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
- * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
- * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
- * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
- */
-
-#include 
-#include 
-#include 
-#include 
-
-#include 
-
-#include 
-
-int __mtx_enter_try(struct mutex *);
-
-#ifdef MULTIPROCESSOR
-/* Note: lock must be 16-byte aligned. */
-#define __mtx_lock(mtx) ((int *)(((vaddr_t)mtx->mtx_lock + 0xf) & ~0xf))
-#endif
-
-void
-__mtx_init(struct mutex *mtx, int wantipl)
-{
-#ifdef MULTIPROCESSOR
-   mtx->mtx_lock[0] = 1;
-   mtx->mtx_lock[1] = 1;
-   mtx->mtx_lock[2] = 1;
-   mtx->mtx_lock[3] = 1;
-#endif
-   mtx->mtx_wantipl = wantipl;
-   mtx->mtx_oldipl = IPL_NONE;
-   mtx->mtx_owner = NULL;
-}
-
-#ifdef MULTIPROCESSOR
-void
-__mtx_enter(struct mutex *mtx)
-{
-   while (__mtx_enter_try(mtx) == 0)
-   ;
-}
-
-int
-__mtx_enter_try(struct mutex *mtx)
-{
-   struct cpu_info *ci = curcpu();
-   volatile int *lock = __mtx_lock(mtx);
-   int ret;
-   int s;
-
-   if (mtx->mtx_wantipl != IPL_NONE)
-   s = splraise(mtx->mtx_wantipl);
-
-#ifdef DIAGNOSTIC
-   if (__predict_false(mtx->mtx_owner == ci))
-   panic("mtx %p: locking against myself", mtx);
-#endif
-
-   asm volatile (
-   "ldcws  0(%2), %0"
-   : "=" (ret), "+m" (lock)
-   : "r" (lock)
-   );
-
-   if (ret) {
-   membar_enter();
-   mtx->mtx_owner = ci;
-