Re: Fewer pool_get() in kqueue_register()

2020-08-19 Thread Visa Hankala
On Wed, Aug 19, 2020 at 12:10:12PM +0200, Martin Pieuchot wrote:
> On 18/08/20(Tue) 15:30, Visa Hankala wrote:
> > On Tue, Aug 18, 2020 at 11:04:47AM +0200, Martin Pieuchot wrote:
> > > Diff below changes the order of operations in kqueue_register() to get
> > > rid of an unnecessary pool_get().  When an event is already present on
> > > the list, try to acquire it first.  Note that knote_acquire() may sleep,
> > > in which case the list might have changed, so the lookup always has to
> > > begin from the start.
> > > 
> > > This will help with lazy removal of knotes in poll/select.  In that
> > > scenario EV_ADD is almost always done with a knote already on the list.
> > 
> > Some of the overhead could be absorbed by using a pool cache, as shown
> > in the diff below. However, I am not suggesting that the cache should
> > be taken into use right now. The frequency of knote pool usage is
> > relatively low currently; there are other pools that would benefit more
> > from caching.
> 
> Agreed, this is a nice idea to revisit.  Do you have a way to measure
> which pools could benefit from caching?

I think systat(1)'s pool view gives a starting point.

> > A related question is what implications the increased use of the pool
> > cache feature would have under memory pressure.
> 
> Do you have a suggestion on how to measure this as well?  Could dt(4)
> probes or evtcount() help us?

I do not have a specific tool in mind, just a vague question: "Can the
system start to perform poorly more easily than before?" However, only
experimentation will tell.



Re: Fewer pool_get() in kqueue_register()

2020-08-19 Thread Martin Pieuchot
On 18/08/20(Tue) 15:30, Visa Hankala wrote:
> On Tue, Aug 18, 2020 at 11:04:47AM +0200, Martin Pieuchot wrote:
> > Diff below changes the order of operations in kqueue_register() to get
> > rid of an unnecessary pool_get().  When an event is already present on
> > the list, try to acquire it first.  Note that knote_acquire() may sleep,
> > in which case the list might have changed, so the lookup always has to
> > begin from the start.
> > 
> > This will help with lazy removal of knotes in poll/select.  In that
> > scenario EV_ADD is almost always done with a knote already on the list.
> 
> Some of the overhead could be absorbed by using a pool cache, as shown
> in the diff below. However, I am not suggesting that the cache should
> be taken into use right now. The frequency of knote pool usage is
> relatively low currently; there are other pools that would benefit more
> from caching.

Agreed, this is a nice idea to revisit.  Do you have a way to measure
which pools could benefit from caching?

I'm also not in a hurry.  That said, I'd like to be able to re-use
descriptors that are already on the kqueue.  For per-thread kqueues there
is no possible race.  And since we'll need some sort of serialization to
unlock kevent(2), this could be built on top of it.

> A related question is what implications the increased use of the pool
> cache feature would have under memory pressure.

Do you have a suggestion on how to measure this as well?  Could dt(4)
probes or evtcount() help us?

> Index: kern/init_main.c
> ===
> RCS file: src/sys/kern/init_main.c,v
> retrieving revision 1.300
> diff -u -p -r1.300 init_main.c
> --- kern/init_main.c  16 Jun 2020 05:09:29 -  1.300
> +++ kern/init_main.c  18 Aug 2020 15:09:38 -
> @@ -71,6 +71,7 @@
>  #include 
>  #endif
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -148,7 +149,6 @@ void  crypto_init(void);
>  void db_ctf_init(void);
>  void prof_init(void);
>  void init_exec(void);
> -void kqueue_init(void);
>  void futex_init(void);
>  void taskq_init(void);
>  void timeout_proc_init(void);
> @@ -431,7 +431,9 @@ main(void *framep)
>   prof_init();
>  #endif
>  
> - mbcpuinit();/* enable per cpu mbuf data */
> + /* Enable per cpu data. */
> + mbcpuinit();
> + kqueue_init_percpu();
>  
>   /* init exec and emul */
>   init_exec();
> Index: kern/kern_event.c
> ===
> RCS file: src/sys/kern/kern_event.c,v
> retrieving revision 1.142
> diff -u -p -r1.142 kern_event.c
> --- kern/kern_event.c 12 Aug 2020 13:49:24 -  1.142
> +++ kern/kern_event.c 18 Aug 2020 15:09:38 -
> @@ -205,6 +205,12 @@ kqueue_init(void)
>   PR_WAITOK, "knotepl", NULL);
>  }
>  
> +void
> +kqueue_init_percpu(void)
> +{
> + pool_cache_init(&knote_pool);
> +}
> +
>  int
>  filt_fileattach(struct knote *kn)
>  {
> Index: sys/event.h
> ===
> RCS file: src/sys/sys/event.h,v
> retrieving revision 1.44
> diff -u -p -r1.44 event.h
> --- sys/event.h   22 Jun 2020 13:14:32 -  1.44
> +++ sys/event.h   18 Aug 2020 15:09:38 -
> @@ -210,6 +210,8 @@ extern void   knote_activate(struct knote 
>  extern void  knote_remove(struct proc *p, struct knlist *list);
>  extern void  knote_fdclose(struct proc *p, int fd);
>  extern void  knote_processexit(struct proc *);
> +extern void  kqueue_init(void);
> +extern void  kqueue_init_percpu(void);
>  extern int   kqueue_register(struct kqueue *kq,
>   struct kevent *kev, struct proc *p);
>  extern int   filt_seltrue(struct knote *kn, long hint);



Re: Fewer pool_get() in kqueue_register()

2020-08-18 Thread Visa Hankala
On Tue, Aug 18, 2020 at 11:04:47AM +0200, Martin Pieuchot wrote:
> Diff below changes the order of operations in kqueue_register() to get
> rid of an unnecessary pool_get().  When an event is already present on
> the list, try to acquire it first.  Note that knote_acquire() may sleep,
> in which case the list might have changed, so the lookup always has to
> begin from the start.
> 
> This will help with lazy removal of knotes in poll/select.  In that
> scenario EV_ADD is almost always done with a knote already on the list.

Some of the overhead could be absorbed by using a pool cache, as shown
in the diff below. However, I am not suggesting that the cache should
be taken into use right now. The frequency of knote pool usage is
relatively low currently; there are other pools that would benefit more
from caching.

A related question is what implications the increased use of the pool
cache feature would have under memory pressure.
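
In case the per-CPU caching mechanism is unfamiliar, here is a minimal
sketch of enabling it on a pool.  It assumes the pool(9) API of this era;
"struct example" and the example_* names are made up for illustration,
and the real hook-up for the knote pool is in the diff below.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/pool.h>

struct example {
        int     e_val;
};

struct pool example_pool;

void
example_init(void)
{
        pool_init(&example_pool, sizeof(struct example), 0, IPL_NONE,
            PR_WAITOK, "examplepl", NULL);
}

void
example_init_percpu(void)
{
        /*
         * Must run once per-CPU infrastructure is up, hence the
         * separate *_init_percpu() hook called from main().  After
         * this, pool_get()/pool_put() on the pool are served from
         * per-CPU item lists when possible and only fall back to the
         * shared, lock-protected pool on a miss.
         */
        pool_cache_init(&example_pool);
}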

Index: kern/init_main.c
===
RCS file: src/sys/kern/init_main.c,v
retrieving revision 1.300
diff -u -p -r1.300 init_main.c
--- kern/init_main.c16 Jun 2020 05:09:29 -  1.300
+++ kern/init_main.c18 Aug 2020 15:09:38 -
@@ -71,6 +71,7 @@
 #include 
 #endif
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -148,7 +149,6 @@ void  crypto_init(void);
 void   db_ctf_init(void);
 void   prof_init(void);
 void   init_exec(void);
-void   kqueue_init(void);
 void   futex_init(void);
 void   taskq_init(void);
 void   timeout_proc_init(void);
@@ -431,7 +431,9 @@ main(void *framep)
prof_init();
 #endif
 
-   mbcpuinit();/* enable per cpu mbuf data */
+   /* Enable per cpu data. */
+   mbcpuinit();
+   kqueue_init_percpu();
 
/* init exec and emul */
init_exec();
Index: kern/kern_event.c
===
RCS file: src/sys/kern/kern_event.c,v
retrieving revision 1.142
diff -u -p -r1.142 kern_event.c
--- kern/kern_event.c   12 Aug 2020 13:49:24 -  1.142
+++ kern/kern_event.c   18 Aug 2020 15:09:38 -
@@ -205,6 +205,12 @@ kqueue_init(void)
PR_WAITOK, "knotepl", NULL);
 }
 
+void
+kqueue_init_percpu(void)
+{
+   pool_cache_init(&knote_pool);
+}
+
 int
 filt_fileattach(struct knote *kn)
 {
Index: sys/event.h
===
RCS file: src/sys/sys/event.h,v
retrieving revision 1.44
diff -u -p -r1.44 event.h
--- sys/event.h 22 Jun 2020 13:14:32 -  1.44
+++ sys/event.h 18 Aug 2020 15:09:38 -
@@ -210,6 +210,8 @@ extern void knote_activate(struct knote 
 extern voidknote_remove(struct proc *p, struct knlist *list);
 extern voidknote_fdclose(struct proc *p, int fd);
 extern voidknote_processexit(struct proc *);
+extern voidkqueue_init(void);
+extern voidkqueue_init_percpu(void);
 extern int kqueue_register(struct kqueue *kq,
struct kevent *kev, struct proc *p);
 extern int filt_seltrue(struct knote *kn, long hint);



Re: Fewer pool_get() in kqueue_register()

2020-08-18 Thread Martin Pieuchot
On 18/08/20(Tue) 11:22, Mark Kettenis wrote:
> > Date: Tue, 18 Aug 2020 11:04:47 +0200
> > From: Martin Pieuchot 
> > 
> > Diff below changes the order of operations in kqueue_register() to get
> > rid of an unnecessary pool_get().  When an event is already present on
> > the list, try to acquire it first.  Note that knote_acquire() may sleep,
> > in which case the list might have changed, so the lookup always has to
> > begin from the start.
> > 
> > This will help with lazy removal of knotes in poll/select.  In that
> > scenario EV_ADD is almost always done with a knote already on the list.
> > 
> > ok?
> 
> But pool_get() may sleep as well.  In my experience it is better to do
> the resource allocation up front and release afterwards if it turned
> out you didn't need the resource.  That's what the current code does.

There's indeed a possible race when multiple threads try to register the
same event on the same kqueue: once the lookup finds nothing, the
pool_get() done afterwards can sleep, so two threads can both conclude
the knote is missing and both insert one, leading to a double insert.
Thanks!

> > Index: kern/kern_event.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_event.c,v
> > retrieving revision 1.142
> > diff -u -p -r1.142 kern_event.c
> > --- kern/kern_event.c   12 Aug 2020 13:49:24 -  1.142
> > +++ kern/kern_event.c   18 Aug 2020 08:58:27 -
> > @@ -696,7 +696,7 @@ kqueue_register(struct kqueue *kq, struc
> > struct filedesc *fdp = kq->kq_fdp;
> > const struct filterops *fops = NULL;
> > struct file *fp = NULL;
> > -   struct knote *kn = NULL, *newkn = NULL;
> > +   struct knote *kn, *newkn = NULL;
> > struct knlist *list = NULL;
> > int s, error = 0;
> >  
> > @@ -721,22 +721,12 @@ kqueue_register(struct kqueue *kq, struc
> > return (EBADF);
> > }
> >  
> > -   if (kev->flags & EV_ADD)
> > -   newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
> > -
> >  again:
> > +   kn = NULL;
> > if (fops->f_flags & FILTEROP_ISFD) {
> > -   if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
> > -   error = EBADF;
> > -   goto done;
> > -   }
> > -   if (kev->flags & EV_ADD)
> > -   kqueue_expand_list(kq, kev->ident);
> > if (kev->ident < kq->kq_knlistsize)
> > list = &kq->kq_knlist[kev->ident];
> > } else {
> > -   if (kev->flags & EV_ADD)
> > -   kqueue_expand_hash(kq);
> > if (kq->kq_knhashmask != 0) {
> > list = &kq->kq_knhash[
> > KN_HASH((u_long)kev->ident, kq->kq_knhashmask)];
> > @@ -749,10 +739,6 @@ again:
> > s = splhigh();
> > if (!knote_acquire(kn)) {
> > splx(s);
> > -   if (fp != NULL) {
> > -   FRELE(fp, p);
> > -   fp = NULL;
> > -   }
> > goto again;
> > }
> > splx(s);
> > @@ -760,6 +746,21 @@ again:
> > }
> > }
> > }
> > +
> > +   if (kev->flags & EV_ADD && kn == NULL) {
> > +   newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
> > +   if (fops->f_flags & FILTEROP_ISFD) {
> > +   if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
> > +   error = EBADF;
> > +   goto done;
> > +   }
> > +   kqueue_expand_list(kq, kev->ident);
> > +   } else {
> > +   kqueue_expand_hash(kq);
> > +   }
> > +
> > +   }
> > +
> > KASSERT(kn == NULL || (kn->kn_status & KN_PROCESSING) != 0);
> >  
> > if (kn == NULL && ((kev->flags & EV_ADD) == 0)) {
> > 
> > 



Re: Fewer pool_get() in kqueue_register()

2020-08-18 Thread Mark Kettenis
> Date: Tue, 18 Aug 2020 11:04:47 +0200
> From: Martin Pieuchot 
> 
> Diff below changes the order of operations in kqueue_register() to get
> rid of an unnecessary pool_get().  When an event is already present on
> the list, try to acquire it first.  Note that knote_acquire() may sleep,
> in which case the list might have changed, so the lookup always has to
> begin from the start.
> 
> This will help with lazy removal of knotes in poll/select.  In that
> scenario EV_ADD is almost always done with a knote already on the list.
> 
> ok?

But pool_get() may sleep as well.  In my experience it is better to do
the resource allocation up front and release afterwards if it turned
out you didn't need the resource.  That's what the current code does.
I don't fully understand how the code works, but your change looks
wrong to me.
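
As a condensed sketch of the allocate-up-front shape described above
(knote_lookup() and knote_insert() are hypothetical stand-ins for the
real lookup and insertion code; only the allocation ordering matters
here):

static int
register_alloc_first_sketch(struct kqueue *kq, struct kevent *kev)
{
        struct knote *kn, *newkn = NULL;

        if (kev->flags & EV_ADD)
                newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);

        /* Any sleep above cannot invalidate a lookup not yet done. */
        kn = knote_lookup(kq, kev);             /* hypothetical helper */
        if (kn == NULL && (kev->flags & EV_ADD)) {
                kn = newkn;                     /* consume the spare */
                newkn = NULL;
                knote_insert(kq, kn);           /* hypothetical helper */
        }

        if (newkn != NULL)
                pool_put(&knote_pool, newkn);   /* spare was not needed */
        return (0);
}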

> Index: kern/kern_event.c
> ===
> RCS file: /cvs/src/sys/kern/kern_event.c,v
> retrieving revision 1.142
> diff -u -p -r1.142 kern_event.c
> --- kern/kern_event.c 12 Aug 2020 13:49:24 -  1.142
> +++ kern/kern_event.c 18 Aug 2020 08:58:27 -
> @@ -696,7 +696,7 @@ kqueue_register(struct kqueue *kq, struc
>   struct filedesc *fdp = kq->kq_fdp;
>   const struct filterops *fops = NULL;
>   struct file *fp = NULL;
> - struct knote *kn = NULL, *newkn = NULL;
> + struct knote *kn, *newkn = NULL;
>   struct knlist *list = NULL;
>   int s, error = 0;
>  
> @@ -721,22 +721,12 @@ kqueue_register(struct kqueue *kq, struc
>   return (EBADF);
>   }
>  
> - if (kev->flags & EV_ADD)
> - newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
> -
>  again:
> + kn = NULL;
>   if (fops->f_flags & FILTEROP_ISFD) {
> - if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
> - error = EBADF;
> - goto done;
> - }
> - if (kev->flags & EV_ADD)
> - kqueue_expand_list(kq, kev->ident);
>   if (kev->ident < kq->kq_knlistsize)
>   list = &kq->kq_knlist[kev->ident];
>   } else {
> - if (kev->flags & EV_ADD)
> - kqueue_expand_hash(kq);
>   if (kq->kq_knhashmask != 0) {
>   list = &kq->kq_knhash[
>   KN_HASH((u_long)kev->ident, kq->kq_knhashmask)];
> @@ -749,10 +739,6 @@ again:
>   s = splhigh();
>   if (!knote_acquire(kn)) {
>   splx(s);
> - if (fp != NULL) {
> - FRELE(fp, p);
> - fp = NULL;
> - }
>   goto again;
>   }
>   splx(s);
> @@ -760,6 +746,21 @@ again:
>   }
>   }
>   }
> +
> + if (kev->flags & EV_ADD && kn == NULL) {
> + newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
> + if (fops->f_flags & FILTEROP_ISFD) {
> + if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
> + error = EBADF;
> + goto done;
> + }
> + kqueue_expand_list(kq, kev->ident);
> + } else {
> + kqueue_expand_hash(kq);
> + }
> +
> + }
> +
>   KASSERT(kn == NULL || (kn->kn_status & KN_PROCESSING) != 0);
>  
>   if (kn == NULL && ((kev->flags & EV_ADD) == 0)) {
> 
> 



Fewer pool_get() in kqueue_register()

2020-08-18 Thread Martin Pieuchot
Diff below changes the order of operations in kqueue_register() to get
rid of an unnecessary pool_get().  When an event is already present on
the list, try to acquire it first.  Note that knote_acquire() may sleep,
in which case the list might have changed, so the lookup always has to
begin from the start.

This will help with lazy removal of knotes in poll/select.  In that
scenario EV_ADD is almost always done with a knote already on the list.

ok?
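
A heavily condensed sketch of the proposed ordering, for readers
skimming the diff (knote_lookup() is a hypothetical stand-in for the
list walk; the rest mirrors the diff below):

static struct knote *
register_lookup_first_sketch(struct kqueue *kq, struct kevent *kev)
{
        struct knote *kn;

again:
        kn = knote_lookup(kq, kev);             /* hypothetical helper */
        if (kn != NULL) {
                if (!knote_acquire(kn))         /* slept: list may have changed */
                        goto again;
                return (kn);                    /* reuse the existing knote */
        }
        if ((kev->flags & EV_ADD) == 0)
                return (NULL);                  /* nothing to add */
        /* Only now, with no knote found, pay for an allocation. */
        return (pool_get(&knote_pool, PR_WAITOK | PR_ZERO));
}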

Index: kern/kern_event.c
===
RCS file: /cvs/src/sys/kern/kern_event.c,v
retrieving revision 1.142
diff -u -p -r1.142 kern_event.c
--- kern/kern_event.c   12 Aug 2020 13:49:24 -  1.142
+++ kern/kern_event.c   18 Aug 2020 08:58:27 -
@@ -696,7 +696,7 @@ kqueue_register(struct kqueue *kq, struc
struct filedesc *fdp = kq->kq_fdp;
const struct filterops *fops = NULL;
struct file *fp = NULL;
-   struct knote *kn = NULL, *newkn = NULL;
+   struct knote *kn, *newkn = NULL;
struct knlist *list = NULL;
int s, error = 0;
 
@@ -721,22 +721,12 @@ kqueue_register(struct kqueue *kq, struc
return (EBADF);
}
 
-   if (kev->flags & EV_ADD)
-   newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
-
 again:
+   kn = NULL;
if (fops->f_flags & FILTEROP_ISFD) {
-   if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
-   error = EBADF;
-   goto done;
-   }
-   if (kev->flags & EV_ADD)
-   kqueue_expand_list(kq, kev->ident);
if (kev->ident < kq->kq_knlistsize)
list = &kq->kq_knlist[kev->ident];
} else {
-   if (kev->flags & EV_ADD)
-   kqueue_expand_hash(kq);
if (kq->kq_knhashmask != 0) {
list = &kq->kq_knhash[
KN_HASH((u_long)kev->ident, kq->kq_knhashmask)];
@@ -749,10 +739,6 @@ again:
s = splhigh();
if (!knote_acquire(kn)) {
splx(s);
-   if (fp != NULL) {
-   FRELE(fp, p);
-   fp = NULL;
-   }
goto again;
}
splx(s);
@@ -760,6 +746,21 @@ again:
}
}
}
+
+   if (kev->flags & EV_ADD && kn == NULL) {
+   newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
+   if (fops->f_flags & FILTEROP_ISFD) {
+   if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
+   error = EBADF;
+   goto done;
+   }
+   kqueue_expand_list(kq, kev->ident);
+   } else {
+   kqueue_expand_hash(kq);
+   }
+
+   }
+
KASSERT(kn == NULL || (kn->kn_status & KN_PROCESSING) != 0);
 
if (kn == NULL && ((kev->flags & EV_ADD) == 0)) {