Re: Get rid of UVM_VNODE_CANPERSIST

2022-11-15 Thread Mike Larkin
On Tue, Nov 15, 2022 at 02:31:27PM +0100, Martin Pieuchot wrote:
> UVM vnode objects include a reference count to keep track of the number
> of processes that have the corresponding pages mapped in their VM space.
>
> When the last process referencing a given library or executable dies,
> the reaper will munmap this object on its behalf.  When this happens it
> doesn't free the associated pages to speed-up possible re-use of the
> file.  Instead the pages are placed on the inactive list but stay ready
> to be pmap_enter()'d without requiring I/O as soon as a newly process
> needs to access them.
>
> The mechanism to keep pages populated, known as UVM_VNODE_CANPERSIST,
> doesn't work well with swapping [0].  For some reason when the page daemon
> wants to free pages on the inactive list it tries to flush the pages to
> disk and panic(9) because it needs a valid reference to the vnode to do
> so.
>
> This indicates that the mechanism described above, which seems to work
> fine for RO mappings, is currently buggy in more complex situations.
> Flushing the pages when the last reference of the UVM object is dropped
> also doesn't seem to be enough as bluhm@ reported [1].
>
> The diff below, which has already be committed and reverted, gets rid of
> the UVM_VNODE_CANPERSIST logic.  I'd like to commit it again now that
> the arm64 caching bug has been found and fixed.
>
> Getting rid of this logic means more I/O will be generated and pages
> might have a faster reuse cycle.  I'm aware this might introduce a small
> slowdown, however I believe we should work towards loading files from the
> buffer cache to save I/O cycles instead of having another layer of cache.
> Such work isn't trivial and making sure the vnode <-> UVM relation is
> simple and well understood is the first step in this direction.
>
> I'd appreciate if the diff below could be tested on many architectures,
> include the offending rpi4.
>

arm64 (rpi4): full make build, no issues
arm64 (rpi3): let "make build" run for a few hours then ^C (it would probably
  take days and I didn't feel like waiting)
arm64 (sopine): let "make build" run for a few hours then ^C (same as rpi3)
riscv64 (unmatched):  full make build, no issues
powerpc64 (talos): full make build, no issues
i386 (ESXi VM): full make build, no issues
octeon (rhino): full make build, no issues

Hope this helps.

-ml

> Comments?  Oks?
>
> [0] https://marc.info/?l=openbsd-bugs&m=164846737707559&w=2
> [1] https://marc.info/?l=openbsd-bugs&m=166843373415030&w=2
>
> Index: uvm/uvm_vnode.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
> retrieving revision 1.130
> diff -u -p -r1.130 uvm_vnode.c
> --- uvm/uvm_vnode.c   20 Oct 2022 13:31:52 -  1.130
> +++ uvm/uvm_vnode.c   15 Nov 2022 13:28:28 -
> @@ -161,11 +161,8 @@ uvn_attach(struct vnode *vp, vm_prot_t a
>* add it to the writeable list, and then return.
>*/
>   if (uvn->u_flags & UVM_VNODE_VALID) {   /* already active? */
> + KASSERT(uvn->u_obj.uo_refs > 0);
>
> - /* regain vref if we were persisting */
> - if (uvn->u_obj.uo_refs == 0) {
> - vref(vp);
> - }
>   uvn->u_obj.uo_refs++;   /* bump uvn ref! */
>
>   /* check for new writeable uvn */
> @@ -235,14 +232,14 @@ uvn_attach(struct vnode *vp, vm_prot_t a
>   KASSERT(uvn->u_obj.uo_refs == 0);
>   uvn->u_obj.uo_refs++;
>   oldflags = uvn->u_flags;
> - uvn->u_flags = UVM_VNODE_VALID|UVM_VNODE_CANPERSIST;
> + uvn->u_flags = UVM_VNODE_VALID;
>   uvn->u_nio = 0;
>   uvn->u_size = used_vnode_size;
>
>   /*
>* add a reference to the vnode.   this reference will stay as long
>* as there is a valid mapping of the vnode.   dropped when the
> -  * reference count goes to zero [and we either free or persist].
> +  * reference count goes to zero.
>*/
>   vref(vp);
>
> @@ -323,16 +320,6 @@ uvn_detach(struct uvm_object *uobj)
>*/
>   vp->v_flag &= ~VTEXT;
>
> - /*
> -  * we just dropped the last reference to the uvn.   see if we can
> -  * let it "stick around".
> -  */
> - if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
> - /* won't block */
> - uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
> - goto out;
> - }
> -
>   /* its a goner! */
>   uvn->u_flags |= UVM_VNODE_DYING;
>
> @@ -382,7 +369,6 @@ uvn_detach(struct uvm_object *uobj)
>   /* wake up any sleepers */
>   if (oldflags & UVM_VNODE_WANTED)
>   wakeup(uvn);
> -out:
>   rw_exit(uobj->vmobjlock);
>
>   /* drop our reference to the vnode. */
> @@ -498,8 +484,8 @@ uvm_vnp_terminate(struct vnode *vp)
>   }
>
>   /*
> -  * done.   now we free the uvn if its reference count is zero
> -  * (true if we are zapping a persisting uvn).   however, if we are
> +  * don

Re: Patch for getopt_long.c

2022-11-15 Thread Jeremie Courreges-Anglas
On Tue, Nov 15 2022, Mathias Bavay  wrote:
> Hi,
>
> Please find here attached a patch with minor touch up on 
> src/lib/libc/stdlib/getopt_long.c :
>
>     * Added an SPDX license identifier;

As Theo answered, we don't use those identifiers (except sometimes in
external source code added to the base system).

>    * moved the declarations of two variables in order to reduce their scope;

We sometimes do this but 1. some people don't like much the idea 2. this
change isn't worth a change/commit on its own (IMHO).

Also, only unified diffs please (-u).  See 
https://www.openbsd.org/faq/faq5.html#Diff

> All the best,
>
> Mathias Bavay
>
> 1d0
> < // SPDX-License-Identifier: BSD-2-Clause
> 133c132
> < int cyclelen, i, j, ncycle, nnonopts, nopts;
> ---
>>  int cstart, cyclelen, i, j, ncycle, nnonopts, nopts, pos;
> 145,146c144,145
> < const int cstart = panonopt_end+i;
> < int pos = cstart;
> ---
>>  cstart = panonopt_end+i;
>>  pos = cstart;
>

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: fix Ipv6 link local address assignment

2022-11-15 Thread Klemens Nanni
On Tue, Nov 15, 2022 at 07:33:07PM +0100, Florian Obser wrote:
> That comment is not helpful, it just restates what the code does. The
> gibberish we had before was also not helpful. Maybe just drop the
> comment?
> 
> Either way, OK

I concur.



Re: fix Ipv6 link local address assignment

2022-11-15 Thread Florian Obser
On 2022-11-15 19:21 +01, Claudio Jeker  wrote:
> My last commit to in6_ifattach() broke a few regress tests.
> The problem is that 'ifconfig tun0 inet6 eui64' no longer works.
> Now I thought it would if called explicitly but no.
> So lets peddal back a bit and assign link-local addresses on all interface
> but wg(4). For mpe(4) this does not really matter and it does help other
> point-to-point interfaces where the system somewhat expects a link-local
> addr to be present.
>
> With this all regress tests pass again.
> -- 
> :wq Claudio
>
> Index: netinet6/in6_ifattach.c
> ===
> RCS file: /cvs/src/sys/netinet6/in6_ifattach.c,v
> retrieving revision 1.120
> diff -u -p -r1.120 in6_ifattach.c
> --- netinet6/in6_ifattach.c   14 Nov 2022 17:12:55 -  1.120
> +++ netinet6/in6_ifattach.c   15 Nov 2022 18:14:10 -
> @@ -389,12 +389,11 @@ in6_ifattach(struct ifnet *ifp)
>   return (error);
>   }
>  
> - /*
> -  * Only interfaces that need the ND cache should automatically
> -  * assign a link local address.
> -  */
> - if (!nd6_need_cache(ifp))
> + /* For wg(4) skip assignment of link local address. */

That comment is not helpful, it just restates what the code does. The
gibberish we had before was also not helpful. Maybe just drop the
comment?

Either way, OK

> + switch (ifp->if_type) {
> + case IFT_WIREGUARD:
>   return (0);
> + }
>  
>   /* Assign a link-local address, if there's none. */
>   if (in6ifa_ifpforlinklocal(ifp, 0) == NULL) {
>

-- 
I'm not entirely sure you are real.



fix Ipv6 link local address assignment

2022-11-15 Thread Claudio Jeker
My last commit to in6_ifattach() broke a few regress tests.
The problem is that 'ifconfig tun0 inet6 eui64' no longer works.
Now I thought it would if called explicitly but no.
So lets peddal back a bit and assign link-local addresses on all interface
but wg(4). For mpe(4) this does not really matter and it does help other
point-to-point interfaces where the system somewhat expects a link-local
addr to be present.

With this all regress tests pass again.
-- 
:wq Claudio

Index: netinet6/in6_ifattach.c
===
RCS file: /cvs/src/sys/netinet6/in6_ifattach.c,v
retrieving revision 1.120
diff -u -p -r1.120 in6_ifattach.c
--- netinet6/in6_ifattach.c 14 Nov 2022 17:12:55 -  1.120
+++ netinet6/in6_ifattach.c 15 Nov 2022 18:14:10 -
@@ -389,12 +389,11 @@ in6_ifattach(struct ifnet *ifp)
return (error);
}
 
-   /*
-* Only interfaces that need the ND cache should automatically
-* assign a link local address.
-*/
-   if (!nd6_need_cache(ifp))
+   /* For wg(4) skip assignment of link local address. */
+   switch (ifp->if_type) {
+   case IFT_WIREGUARD:
return (0);
+   }
 
/* Assign a link-local address, if there's none. */
if (in6ifa_ifpforlinklocal(ifp, 0) == NULL) {



Re: Patch for getopt_long.c

2022-11-15 Thread Theo de Raadt
Mathias Bavay  wrote:

> Hi,
> 
> Please find here attached a patch with minor touch up on 
> src/lib/libc/stdlib/getopt_long.c :
> 
>     * Added an SPDX license identifier;

No.



Patch for getopt_long.c

2022-11-15 Thread Mathias Bavay
Hi,

Please find here attached a patch with minor touch up on 
src/lib/libc/stdlib/getopt_long.c :

    * Added an SPDX license identifier;

   * moved the declarations of two variables in order to reduce their scope;


All the best,

Mathias Bavay
1d0
< // SPDX-License-Identifier: BSD-2-Clause
133c132
< 	int cyclelen, i, j, ncycle, nnonopts, nopts;
---
> 	int cstart, cyclelen, i, j, ncycle, nnonopts, nopts, pos;
145,146c144,145
< 		const int cstart = panonopt_end+i;
< 		int pos = cstart;
---
> 		cstart = panonopt_end+i;
> 		pos = cstart;


Get rid of UVM_VNODE_CANPERSIST

2022-11-15 Thread Martin Pieuchot
UVM vnode objects include a reference count to keep track of the number
of processes that have the corresponding pages mapped in their VM space.

When the last process referencing a given library or executable dies,
the reaper will munmap this object on its behalf.  When this happens it
doesn't free the associated pages to speed-up possible re-use of the
file.  Instead the pages are placed on the inactive list but stay ready
to be pmap_enter()'d without requiring I/O as soon as a newly process
needs to access them.

The mechanism to keep pages populated, known as UVM_VNODE_CANPERSIST,
doesn't work well with swapping [0].  For some reason when the page daemon
wants to free pages on the inactive list it tries to flush the pages to
disk and panic(9) because it needs a valid reference to the vnode to do
so.

This indicates that the mechanism described above, which seems to work
fine for RO mappings, is currently buggy in more complex situations.
Flushing the pages when the last reference of the UVM object is dropped
also doesn't seem to be enough as bluhm@ reported [1].

The diff below, which has already be committed and reverted, gets rid of
the UVM_VNODE_CANPERSIST logic.  I'd like to commit it again now that
the arm64 caching bug has been found and fixed.

Getting rid of this logic means more I/O will be generated and pages
might have a faster reuse cycle.  I'm aware this might introduce a small
slowdown, however I believe we should work towards loading files from the
buffer cache to save I/O cycles instead of having another layer of cache.
Such work isn't trivial and making sure the vnode <-> UVM relation is
simple and well understood is the first step in this direction.

I'd appreciate if the diff below could be tested on many architectures,
include the offending rpi4.

Comments?  Oks?

[0] https://marc.info/?l=openbsd-bugs&m=164846737707559&w=2 
[1] https://marc.info/?l=openbsd-bugs&m=166843373415030&w=2

Index: uvm/uvm_vnode.c
===
RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
retrieving revision 1.130
diff -u -p -r1.130 uvm_vnode.c
--- uvm/uvm_vnode.c 20 Oct 2022 13:31:52 -  1.130
+++ uvm/uvm_vnode.c 15 Nov 2022 13:28:28 -
@@ -161,11 +161,8 @@ uvn_attach(struct vnode *vp, vm_prot_t a
 * add it to the writeable list, and then return.
 */
if (uvn->u_flags & UVM_VNODE_VALID) {   /* already active? */
+   KASSERT(uvn->u_obj.uo_refs > 0);
 
-   /* regain vref if we were persisting */
-   if (uvn->u_obj.uo_refs == 0) {
-   vref(vp);
-   }
uvn->u_obj.uo_refs++;   /* bump uvn ref! */
 
/* check for new writeable uvn */
@@ -235,14 +232,14 @@ uvn_attach(struct vnode *vp, vm_prot_t a
KASSERT(uvn->u_obj.uo_refs == 0);
uvn->u_obj.uo_refs++;
oldflags = uvn->u_flags;
-   uvn->u_flags = UVM_VNODE_VALID|UVM_VNODE_CANPERSIST;
+   uvn->u_flags = UVM_VNODE_VALID;
uvn->u_nio = 0;
uvn->u_size = used_vnode_size;
 
/*
 * add a reference to the vnode.   this reference will stay as long
 * as there is a valid mapping of the vnode.   dropped when the
-* reference count goes to zero [and we either free or persist].
+* reference count goes to zero.
 */
vref(vp);
 
@@ -323,16 +320,6 @@ uvn_detach(struct uvm_object *uobj)
 */
vp->v_flag &= ~VTEXT;
 
-   /*
-* we just dropped the last reference to the uvn.   see if we can
-* let it "stick around".
-*/
-   if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
-   /* won't block */
-   uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
-   goto out;
-   }
-
/* its a goner! */
uvn->u_flags |= UVM_VNODE_DYING;
 
@@ -382,7 +369,6 @@ uvn_detach(struct uvm_object *uobj)
/* wake up any sleepers */
if (oldflags & UVM_VNODE_WANTED)
wakeup(uvn);
-out:
rw_exit(uobj->vmobjlock);
 
/* drop our reference to the vnode. */
@@ -498,8 +484,8 @@ uvm_vnp_terminate(struct vnode *vp)
}
 
/*
-* done.   now we free the uvn if its reference count is zero
-* (true if we are zapping a persisting uvn).   however, if we are
+* done.   now we free the uvn if its reference count is zero.
+* however, if we are
 * terminating a uvn with active mappings we let it live ... future
 * calls down to the vnode layer will fail.
 */
@@ -507,14 +493,14 @@ uvm_vnp_terminate(struct vnode *vp)
if (uvn->u_obj.uo_refs) {
/*
 * uvn must live on it is dead-vnode state until all references
-* are gone.   restore flags.clear CANPERSIST state.
+* are gone.   restore flags.
 */
uvn->u_flags &= ~(UVM_VNODE_DYING|U