Re: Unlock top part of uvm_fault()

2021-04-29 Thread Scott Bennett
On Thu, 22 Apr 2021 17:06:00 -0400, Scott Bennett  
wrote:
> On Thu, 22 Apr 2021 15:38:53 +0200, Martin Pieuchot  wrote:
> > Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> > both amd64 and sparc64.  That means the kernel lock will only be taken
> > for lower faults and some amap/anon code will now run without it.
>
> Hi Martin,
> 
> I'd be willing to test this on octeon. However, with my poor little EdgeRouter
> Lite, I would only be able to test building a few packages.
> 
> Would this be the correct diff for octeon?
> 
> diff 5fe2fdbf987b2faadb61c6a00cf1dd30ab3e7fa6 /usr/src
> blob - 6530ef75203fbea02a4f1aaca62e4e78a81f4cdf
> file + sys/arch/mips64/mips64/trap.c
> --- sys/arch/mips64/mips64/trap.c
> +++ sys/arch/mips64/mips64/trap.c
> @@ -340,9 +340,7 @@ itsa(struct trapframe *trapframe, struct cpu_info *ci,
>   va = trunc_page((vaddr_t)trapframe->badvaddr);
>   onfault = pcb->pcb_onfault;
>   pcb->pcb_onfault = 0;
> - KERNEL_LOCK();
>   rv = uvm_fault(kernel_map, va, 0, access_type);
> - KERNEL_UNLOCK();
>   pcb->pcb_onfault = onfault;
>   if (rv == 0)
>   return;
> @@ -421,9 +419,7 @@ fault_common_no_miss:
>  
>   onfault = pcb->pcb_onfault;
>   pcb->pcb_onfault = 0;
> - KERNEL_LOCK();
>   rv = uvm_fault(map, va, 0, access_type);
> - KERNEL_UNLOCK();
>   pcb->pcb_onfault = onfault;
>  
>   /*
> 

I tested the above by doing a kernel build and building one package
(lang/tcl/8.6). The system was stable, but the build times were virtually
the same. I used a snapshot from April 27 and applied my diff to sources
checked out on April 27.

kernel build times:

Before diff:
erl$ time make -j 2
   63m31.25s real   113m05.96s user11m52.75s system

After diff:
erl$ time make -j 2
   63m09.83s real   112m49.24s user11m33.18s system


lang/tcl/8.6 build times:

Before diff:
erl$ time make build
   39m04.35s real34m35.00s user 3m43.20s system

After diff:
erl$ time make build
   39m04.12s real34m34.70s user 3m41.03s system



Re: Unlock top part of uvm_fault()

2021-04-29 Thread Kurt Mosiejczuk
On Thu, Apr 22, 2021 at 03:38:53PM +0200, Martin Pieuchot wrote:
> Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> both amd64 and sparc64.  That means the kernel lock will only be taken
> for lower faults and some amap/anon code will now run without it.

> I'd be interested to have this tested and see how much does that impact
> the build time of packages.

> We should be able to do the switch on an arch-by-arch basis.  It's
> easier for me to develop & debug on these two architectures so I started
> with them.  If you want to unlock another architecture and report back,
> I'd be glad.

Sorry for the delay on starting such a build, but one just finished for me
overnight. I don't know that it made much difference in time to build, but
it completed the full build just fine.

(It was 3 hours quicker, but it also built less packages, in particular
because all the ruby 2.6 packages are gone)

--Kurt

> Thanks,
> Martin
> 
> Index: arch/amd64/amd64/trap.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
> retrieving revision 1.87
> diff -u -p -r1.87 trap.c
> --- arch/amd64/amd64/trap.c   22 Oct 2020 13:41:51 -  1.87
> +++ arch/amd64/amd64/trap.c   22 Apr 2021 13:06:54 -
> @@ -176,10 +176,7 @@ upageflttrap(struct trapframe *frame, ui
>   union sigval sv;
>   int signal, sicode, error;
>  
> - KERNEL_LOCK();
>   error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
> - KERNEL_UNLOCK();
> -
>   if (error == 0) {
>   uvm_grow(p, va);
>   return 1;
> @@ -261,9 +258,7 @@ kpageflttrap(struct trapframe *frame, ui
>   if (curcpu()->ci_inatomic == 0 || map == kernel_map) {
>   onfault = pcb->pcb_onfault;
>   pcb->pcb_onfault = NULL;
> - KERNEL_LOCK();
>   error = uvm_fault(map, va, 0, access_type);
> - KERNEL_UNLOCK();
>   pcb->pcb_onfault = onfault;
>  
>   if (error == 0 && map != kernel_map)
> Index: arch/sparc64/sparc64/trap.c
> ===
> RCS file: /cvs/src/sys/arch/sparc64/sparc64/trap.c,v
> retrieving revision 1.108
> diff -u -p -r1.108 trap.c
> --- arch/sparc64/sparc64/trap.c   11 Mar 2021 11:17:00 -  1.108
> +++ arch/sparc64/sparc64/trap.c   22 Apr 2021 13:06:54 -
> @@ -773,10 +773,7 @@ data_access_fault(struct trapframe64 *tf
>   if (!(addr & TLB_TAG_ACCESS_CTX)) {
>   /* CTXT == NUCLEUS */
>  
> - KERNEL_LOCK();
>   error = uvm_fault(kernel_map, va, 0, access_type);
> - KERNEL_UNLOCK();
> -
>   if (error == 0)
>   return;
>   goto kfault;
> @@ -792,9 +789,7 @@ data_access_fault(struct trapframe64 *tf
>  
>   onfault = (vaddr_t)p->p_addr->u_pcb.pcb_onfault;
>   p->p_addr->u_pcb.pcb_onfault = NULL;
> - KERNEL_LOCK();
>   error = uvm_fault(>p_vmspace->vm_map, (vaddr_t)va, 0, access_type);
> - KERNEL_UNLOCK();
>   p->p_addr->u_pcb.pcb_onfault = (void *)onfault;
>  
>   /*
> @@ -959,9 +954,7 @@ text_access_fault(struct trapframe64 *tf
>   uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial))
>   goto out;
>  
> - KERNEL_LOCK();
>   error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
> - KERNEL_UNLOCK();
>  
>   /*
>* If this was a stack access we keep track of the maximum
> @@ -1055,9 +1048,7 @@ text_access_error(struct trapframe64 *tf
>   uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial))
>   goto out;
>  
> - KERNEL_LOCK();
>   error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
> - KERNEL_UNLOCK();
>  
>   /*
>* If this was a stack access we keep track of the maximum



Re: Unlock top part of uvm_fault()

2021-04-27 Thread George Koehler
On Thu, 22 Apr 2021 15:38:53 +0200
Martin Pieuchot  wrote:

> Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> both amd64 and sparc64.  That means the kernel lock will only be taken
> for lower faults and some amap/anon code will now run without it.

I made a similar diff, below, for macppc and powerpc64.  In my small
trials, the build times were about the same on my 4-core powerpc64 and
slightly faster on my 2-core macppc.

My macppc kernel had a problem: it froze at boot, but the problem went
away after I reordered the kernel.  In my guess, this unlocking diff
isn't the cause of the problem, because the freeze was too early
(before copyright); the other cpu wouldn't have entered the kernel, so
the kernel lock wouldn't prevent the freeze.

My powerpc64 trial was "make -j4" in src/gnu/usr.bin/clang on 4-core
POWER9 in Talos II.  Before diff:
  117m45.96s real   361m58.87s user68m44.66s system
After diff:
  115m24.53s real   362m02.65s user58m40.99s system
I see that system time went down by 10 minutes, but real time is about
the same (down by 2 minutes).

My macppc trial was "dpb devel/cmake" on 2-cpu Power Mac G5.
Before diff:03:44:50
After diff: 03:33:44
Real time went down by 11 minutes.  Maybe the unlocking diff works
better on a 2-core machine, or maybe the unlocking diff does nothing
and I saved 11 minutes by random luck.

--George

Index: arch/powerpc/powerpc/trap.c
===
RCS file: /cvs/src/sys/arch/powerpc/powerpc/trap.c,v
retrieving revision 1.119
diff -u -p -r1.119 trap.c
--- arch/powerpc/powerpc/trap.c 11 Mar 2021 11:16:59 -  1.119
+++ arch/powerpc/powerpc/trap.c 28 Apr 2021 02:49:44 -
@@ -282,9 +282,7 @@ trap(struct trapframe *frame)
else
ftype = PROT_READ;
 
-   KERNEL_LOCK();
error = uvm_fault(map, trunc_page(va), 0, ftype);
-   KERNEL_UNLOCK();
 
if (error == 0)
return;
@@ -318,10 +316,8 @@ trap(struct trapframe *frame)
} else
vftype = ftype = PROT_READ;
 
-   KERNEL_LOCK();
error = uvm_fault(>p_vmspace->vm_map,
trunc_page(frame->dar), 0, ftype);
-   KERNEL_UNLOCK();
 
if (error == 0) {
uvm_grow(p, frame->dar);
@@ -343,10 +339,8 @@ trap(struct trapframe *frame)
 
ftype = PROT_READ | PROT_EXEC;
 
-   KERNEL_LOCK();
error = uvm_fault(>p_vmspace->vm_map,
trunc_page(frame->srr0), 0, ftype);
-   KERNEL_UNLOCK();
 
if (error == 0) {
uvm_grow(p, frame->srr0);
Index: arch/powerpc64/powerpc64/trap.c
===
RCS file: /cvs/src/sys/arch/powerpc64/powerpc64/trap.c,v
retrieving revision 1.49
diff -u -p -r1.49 trap.c
--- arch/powerpc64/powerpc64/trap.c 9 Jan 2021 13:14:02 -   1.49
+++ arch/powerpc64/powerpc64/trap.c 24 Apr 2021 03:21:02 -
@@ -126,9 +126,7 @@ trap(struct trapframe *frame)
access_type = PROT_READ | PROT_WRITE;
else
access_type = PROT_READ;
-   KERNEL_LOCK();
error = uvm_fault(map, trunc_page(va), 0, access_type);
-   KERNEL_UNLOCK();
if (error == 0)
return;
 
@@ -237,9 +235,7 @@ trap(struct trapframe *frame)
access_type = PROT_READ | PROT_WRITE;
else
access_type = PROT_READ;
-   KERNEL_LOCK();
error = uvm_fault(map, trunc_page(va), 0, access_type);
-   KERNEL_UNLOCK();
if (error == 0)
uvm_grow(p, va);
 
@@ -284,9 +280,7 @@ trap(struct trapframe *frame)
map = >p_vmspace->vm_map;
va = frame->srr0;
access_type = PROT_READ | PROT_EXEC;
-   KERNEL_LOCK();
error = uvm_fault(map, trunc_page(va), 0, access_type);
-   KERNEL_UNLOCK();
if (error == 0)
uvm_grow(p, va);
 



Re: Unlock top part of uvm_fault()

2021-04-26 Thread Alexander Bluhm
On Mon, Apr 26, 2021 at 12:57:33PM +0200, Mark Kettenis wrote:
> > I did measure kernel build time make -j 4 on a 4 core machine.
> > System time goes down by 7.7%, real time stays the same.
> 
> That basically suggests that contention of the kernel lock isn't the
> major bottleneck for this workload.

Kernel lock is a bottleneck, but the diff does not solve it completely.

> > http://bluhm.genua.de/files/kstack-make.svg
> > http://bluhm.genua.de/files/kstack-make-uvm.svg

In current we have 25% usertrap -> upageflttrap -> kernel_lock.
With diff we have 22% usertrap -> upageflttrap -> uvm_fault -> kernel_lock.

So it pushes the problem down the call stack.

bluhm



Re: Unlock top part of uvm_fault()

2021-04-26 Thread Mark Kettenis
> Date: Mon, 26 Apr 2021 12:43:27 +0200
> From: Alexander Bluhm 
> 
> On Thu, Apr 22, 2021 at 03:38:53PM +0200, Martin Pieuchot wrote:
> > Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> > both amd64 and sparc64.  That means the kernel lock will only be taken
> > for lower faults and some amap/anon code will now run without it.
> > 
> > I'd be interested to have this tested and see how much does that impact
> > the build time of packages.
> 
> I did measure kernel build time make -j 4 on a 4 core machine.
> System time goes down by 7.7%, real time stays the same.

That basically suggests that contention of the kernel lock isn't the
major bottleneck for this workload.  But less spinning is still good
since it will reduce the power consumption of modern CPUs.

> http://bluhm.genua.de/perform/results/2021-04-25T17%3A18%3A32Z/gnuplot/make.html
> left column current, righ column with uvm unlock diff
> 
> http://bluhm.genua.de/perform/results/2021-04-25T17%3A18%3A32Z/perform.html
> see row time -lp make -CGENERIC.MP -j4 -s
> 
> http://bluhm.genua.de/files/kstack-make.svg
> http://bluhm.genua.de/files/kstack-make-uvm.svg
> I cannot see much difference here.
> 
> bluhm
> 
> 



Re: Unlock top part of uvm_fault()

2021-04-26 Thread Alexander Bluhm
On Thu, Apr 22, 2021 at 03:38:53PM +0200, Martin Pieuchot wrote:
> Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> both amd64 and sparc64.  That means the kernel lock will only be taken
> for lower faults and some amap/anon code will now run without it.
> 
> I'd be interested to have this tested and see how much does that impact
> the build time of packages.

I did measure kernel build time make -j 4 on a 4 core machine.
System time goes down by 7.7%, real time stays the same.

http://bluhm.genua.de/perform/results/2021-04-25T17%3A18%3A32Z/gnuplot/make.html
left column current, righ column with uvm unlock diff

http://bluhm.genua.de/perform/results/2021-04-25T17%3A18%3A32Z/perform.html
see row time -lp make -CGENERIC.MP -j4 -s

http://bluhm.genua.de/files/kstack-make.svg
http://bluhm.genua.de/files/kstack-make-uvm.svg
I cannot see much difference here.

bluhm



Re: Unlock top part of uvm_fault()

2021-04-24 Thread Christian Weisgerber
Christian Weisgerber:

> > Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> > both amd64 and sparc64.  That means the kernel lock will only be taken
> > for lower faults and some amap/anon code will now run without it.
> 
> I ran an amd64 bulk build with this diff on top of 6.9.

PS: 4 machines, 4 cores each (Xeon CPU E3-1270 v6, HTT disabled)

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: Unlock top part of uvm_fault()

2021-04-24 Thread Stuart Henderson
On 2021/04/22 15:38, Martin Pieuchot wrote:
> Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> both amd64 and sparc64.  That means the kernel lock will only be taken
> for lower faults and some amap/anon code will now run without it.
> 
> I'd be interested to have this tested and see how much does that impact
> the build time of packages.
> 
> We should be able to do the switch on an arch-by-arch basis.  It's
> easier for me to develop & debug on these two architectures so I started
> with them.  If you want to unlock another architecture and report back,
> I'd be glad.

i386 with the below diff has survived a ports bulk build; machines are
quad core but I only run builds on 3 concurrent to keep a cap on RAM use.

Hard to say if there's any change in build time as they're a bit variable
anyway; seems to be not much change either way. Perhaps this is more
important with larger numbers of cores though.

Index: arch/i386/i386/trap.c
===
RCS file: /cvs/src/sys/arch/i386/i386/trap.c,v
retrieving revision 1.151
diff -u -p -r1.151 trap.c
--- arch/i386/i386/trap.c   27 Oct 2020 19:17:12 -  1.151
+++ arch/i386/i386/trap.c   22 Apr 2021 20:20:58 -
@@ -126,10 +126,7 @@ upageflttrap(struct trapframe *frame, ui
union sigval sv;
int signal, sicode, error;
 
-   KERNEL_LOCK();
error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
-   KERNEL_UNLOCK();
-
if (error == 0) {
uvm_grow(p, va);
return 1;
@@ -203,9 +200,7 @@ kpageflttrap(struct trapframe *frame, ui
if (curcpu()->ci_inatomic == 0 || map == kernel_map) {
onfault = pcb->pcb_onfault;
pcb->pcb_onfault = NULL;
-   KERNEL_LOCK();
error = uvm_fault(map, va, 0, access_type);
-   KERNEL_UNLOCK();
pcb->pcb_onfault = onfault;
 
if (error == 0 && map != kernel_map)



Re: Unlock top part of uvm_fault()

2021-04-23 Thread Christian Weisgerber
Martin Pieuchot:

> Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> both amd64 and sparc64.  That means the kernel lock will only be taken
> for lower faults and some amap/anon code will now run without it.
> 
> I'd be interested to have this tested and see how much does that impact
> the build time of packages.

I ran an amd64 bulk build with this diff on top of 6.9.

That succeeded.  No crashes, no ill effects.
The impact on the build time was neglibible.  From 24 hours 20-something
minutes down to 24 hours 12 minutes (1 sample).

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: Unlock top part of uvm_fault()

2021-04-23 Thread Janne Johansson
Den tors 22 apr. 2021 kl 23:13 skrev Scott Bennett :
> > Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> > both amd64 and sparc64.  That means the kernel lock will only be taken
> > for lower faults and some amap/anon code will now run without it.
> >
> > I'd be interested to have this tested and see how much does that impact
> > the build time of packages.
> >
> > We should be able to do the switch on an arch-by-arch basis.  It's
> > easier for me to develop & debug on these two architectures so I started
> > with them.  If you want to unlock another architecture and report back,
> > I'd be glad.
>
> I'd be willing to test this on octeon. However, with my poor little EdgeRouter
> Lite, I would only be able to test building a few packages.
>
> Would this be the correct diff for octeon?

Yes, there are a few more uvm_fault()s but those are in the parts
relating to FPU emulation which I also didn't unlock but I am running
exactly this on a few of my octeons too, without noticed issues.


> pcb->pcb_onfault = 0;
> -   KERNEL_LOCK();
> rv = uvm_fault(kernel_map, va, 0, access_type);
> -   KERNEL_UNLOCK();
> pcb->pcb_onfault = onfault;


> pcb->pcb_onfault = 0;
> -   KERNEL_LOCK();
> rv = uvm_fault(map, va, 0, access_type);
> -   KERNEL_UNLOCK();
> pcb->pcb_onfault = onfault;

-- 
May the most significant bit of your life be positive.



Re: Unlock top part of uvm_fault()

2021-04-22 Thread Scott Bennett
On Thu, 22 Apr 2021 15:38:53 +0200, Martin Pieuchot  wrote:
> Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> both amd64 and sparc64.  That means the kernel lock will only be taken
> for lower faults and some amap/anon code will now run without it.
> 
> I'd be interested to have this tested and see how much does that impact
> the build time of packages.
> 
> We should be able to do the switch on an arch-by-arch basis.  It's
> easier for me to develop & debug on these two architectures so I started
> with them.  If you want to unlock another architecture and report back,
> I'd be glad.
> 
> Thanks,
> Martin

Hi Martin,

I'd be willing to test this on octeon. However, with my poor little EdgeRouter
Lite, I would only be able to test building a few packages.

Would this be the correct diff for octeon?

diff 5fe2fdbf987b2faadb61c6a00cf1dd30ab3e7fa6 /usr/src
blob - 6530ef75203fbea02a4f1aaca62e4e78a81f4cdf
file + sys/arch/mips64/mips64/trap.c
--- sys/arch/mips64/mips64/trap.c
+++ sys/arch/mips64/mips64/trap.c
@@ -340,9 +340,7 @@ itsa(struct trapframe *trapframe, struct cpu_info *ci,
va = trunc_page((vaddr_t)trapframe->badvaddr);
onfault = pcb->pcb_onfault;
pcb->pcb_onfault = 0;
-   KERNEL_LOCK();
rv = uvm_fault(kernel_map, va, 0, access_type);
-   KERNEL_UNLOCK();
pcb->pcb_onfault = onfault;
if (rv == 0)
return;
@@ -421,9 +419,7 @@ fault_common_no_miss:
 
onfault = pcb->pcb_onfault;
pcb->pcb_onfault = 0;
-   KERNEL_LOCK();
rv = uvm_fault(map, va, 0, access_type);
-   KERNEL_UNLOCK();
pcb->pcb_onfault = onfault;
 
/*



Re: Unlock top part of uvm_fault()

2021-04-22 Thread Alexander Bluhm
On Thu, Apr 22, 2021 at 03:38:53PM +0200, Martin Pieuchot wrote:
> Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> both amd64 and sparc64.  That means the kernel lock will only be taken
> for lower faults and some amap/anon code will now run without it.
> 
> I'd be interested to have this tested and see how much does that impact
> the build time of packages.

Full regress test on amd64 successful.

http://bluhm.genua.de/regress/results/regress-ot6.html
When you click on custom, you see the diff that was tested.

bluhm

> We should be able to do the switch on an arch-by-arch basis.  It's
> easier for me to develop & debug on these two architectures so I started
> with them.  If you want to unlock another architecture and report back,
> I'd be glad.
> 
> Thanks,
> Martin
> 
> Index: arch/amd64/amd64/trap.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
> retrieving revision 1.87
> diff -u -p -r1.87 trap.c
> --- arch/amd64/amd64/trap.c   22 Oct 2020 13:41:51 -  1.87
> +++ arch/amd64/amd64/trap.c   22 Apr 2021 13:06:54 -
> @@ -176,10 +176,7 @@ upageflttrap(struct trapframe *frame, ui
>   union sigval sv;
>   int signal, sicode, error;
>  
> - KERNEL_LOCK();
>   error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
> - KERNEL_UNLOCK();
> -
>   if (error == 0) {
>   uvm_grow(p, va);
>   return 1;
> @@ -261,9 +258,7 @@ kpageflttrap(struct trapframe *frame, ui
>   if (curcpu()->ci_inatomic == 0 || map == kernel_map) {
>   onfault = pcb->pcb_onfault;
>   pcb->pcb_onfault = NULL;
> - KERNEL_LOCK();
>   error = uvm_fault(map, va, 0, access_type);
> - KERNEL_UNLOCK();
>   pcb->pcb_onfault = onfault;
>  
>   if (error == 0 && map != kernel_map)
> Index: arch/sparc64/sparc64/trap.c
> ===
> RCS file: /cvs/src/sys/arch/sparc64/sparc64/trap.c,v
> retrieving revision 1.108
> diff -u -p -r1.108 trap.c
> --- arch/sparc64/sparc64/trap.c   11 Mar 2021 11:17:00 -  1.108
> +++ arch/sparc64/sparc64/trap.c   22 Apr 2021 13:06:54 -
> @@ -773,10 +773,7 @@ data_access_fault(struct trapframe64 *tf
>   if (!(addr & TLB_TAG_ACCESS_CTX)) {
>   /* CTXT == NUCLEUS */
>  
> - KERNEL_LOCK();
>   error = uvm_fault(kernel_map, va, 0, access_type);
> - KERNEL_UNLOCK();
> -
>   if (error == 0)
>   return;
>   goto kfault;
> @@ -792,9 +789,7 @@ data_access_fault(struct trapframe64 *tf
>  
>   onfault = (vaddr_t)p->p_addr->u_pcb.pcb_onfault;
>   p->p_addr->u_pcb.pcb_onfault = NULL;
> - KERNEL_LOCK();
>   error = uvm_fault(>p_vmspace->vm_map, (vaddr_t)va, 0, access_type);
> - KERNEL_UNLOCK();
>   p->p_addr->u_pcb.pcb_onfault = (void *)onfault;
>  
>   /*
> @@ -959,9 +954,7 @@ text_access_fault(struct trapframe64 *tf
>   uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial))
>   goto out;
>  
> - KERNEL_LOCK();
>   error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
> - KERNEL_UNLOCK();
>  
>   /*
>* If this was a stack access we keep track of the maximum
> @@ -1055,9 +1048,7 @@ text_access_error(struct trapframe64 *tf
>   uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial))
>   goto out;
>  
> - KERNEL_LOCK();
>   error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
> - KERNEL_UNLOCK();
>  
>   /*
>* If this was a stack access we keep track of the maximum



Unlock top part of uvm_fault()

2021-04-22 Thread Martin Pieuchot
Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
both amd64 and sparc64.  That means the kernel lock will only be taken
for lower faults and some amap/anon code will now run without it.

I'd be interested to have this tested and see how much does that impact
the build time of packages.

We should be able to do the switch on an arch-by-arch basis.  It's
easier for me to develop & debug on these two architectures so I started
with them.  If you want to unlock another architecture and report back,
I'd be glad.

Thanks,
Martin

Index: arch/amd64/amd64/trap.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
retrieving revision 1.87
diff -u -p -r1.87 trap.c
--- arch/amd64/amd64/trap.c 22 Oct 2020 13:41:51 -  1.87
+++ arch/amd64/amd64/trap.c 22 Apr 2021 13:06:54 -
@@ -176,10 +176,7 @@ upageflttrap(struct trapframe *frame, ui
union sigval sv;
int signal, sicode, error;
 
-   KERNEL_LOCK();
error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
-   KERNEL_UNLOCK();
-
if (error == 0) {
uvm_grow(p, va);
return 1;
@@ -261,9 +258,7 @@ kpageflttrap(struct trapframe *frame, ui
if (curcpu()->ci_inatomic == 0 || map == kernel_map) {
onfault = pcb->pcb_onfault;
pcb->pcb_onfault = NULL;
-   KERNEL_LOCK();
error = uvm_fault(map, va, 0, access_type);
-   KERNEL_UNLOCK();
pcb->pcb_onfault = onfault;
 
if (error == 0 && map != kernel_map)
Index: arch/sparc64/sparc64/trap.c
===
RCS file: /cvs/src/sys/arch/sparc64/sparc64/trap.c,v
retrieving revision 1.108
diff -u -p -r1.108 trap.c
--- arch/sparc64/sparc64/trap.c 11 Mar 2021 11:17:00 -  1.108
+++ arch/sparc64/sparc64/trap.c 22 Apr 2021 13:06:54 -
@@ -773,10 +773,7 @@ data_access_fault(struct trapframe64 *tf
if (!(addr & TLB_TAG_ACCESS_CTX)) {
/* CTXT == NUCLEUS */
 
-   KERNEL_LOCK();
error = uvm_fault(kernel_map, va, 0, access_type);
-   KERNEL_UNLOCK();
-
if (error == 0)
return;
goto kfault;
@@ -792,9 +789,7 @@ data_access_fault(struct trapframe64 *tf
 
onfault = (vaddr_t)p->p_addr->u_pcb.pcb_onfault;
p->p_addr->u_pcb.pcb_onfault = NULL;
-   KERNEL_LOCK();
error = uvm_fault(>p_vmspace->vm_map, (vaddr_t)va, 0, access_type);
-   KERNEL_UNLOCK();
p->p_addr->u_pcb.pcb_onfault = (void *)onfault;
 
/*
@@ -959,9 +954,7 @@ text_access_fault(struct trapframe64 *tf
uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial))
goto out;
 
-   KERNEL_LOCK();
error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
-   KERNEL_UNLOCK();
 
/*
 * If this was a stack access we keep track of the maximum
@@ -1055,9 +1048,7 @@ text_access_error(struct trapframe64 *tf
uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial))
goto out;
 
-   KERNEL_LOCK();
error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type);
-   KERNEL_UNLOCK();
 
/*
 * If this was a stack access we keep track of the maximum