Re: Fewer pool_get() in kqueue_register()
On Wed, Aug 19, 2020 at 12:10:12PM +0200, Martin Pieuchot wrote:
> On 18/08/20(Tue) 15:30, Visa Hankala wrote:
> > On Tue, Aug 18, 2020 at 11:04:47AM +0200, Martin Pieuchot wrote:
> > > Diff below changes the order of operations in kqueue_register() to
> > > get rid of an unnecessary pool_get().  When an event is already
> > > present on the list, try to acquire it first.  Note that
> > > knote_acquire() may sleep, in which case the list might have
> > > changed, so the lookup always has to begin from the start.
> > >
> > > This will help with lazy removal of knotes in poll/select.  In this
> > > scenario, EV_ADD is generally done with a knote already on the list.
> >
> > Some of the overhead could be absorbed by using a pool cache, as shown
> > in the diff below.  However, I am not suggesting that the cache should
> > be taken into use right now.  The frequency of knote pool usage is
> > relatively low at the moment; there are other pools that would benefit
> > more from caching.
>
> Agreed, this is a nice idea to revisit.  Do you have a way to measure
> which pools could benefit from caches?

I think systat(1)'s pool view gives a starting point.

> > A related question is what implications the increased use of the pool
> > cache feature would have under memory pressure.
>
> Do you have a suggestion on how to measure this as well?  Could dt(4)
> probes or evcount(9) help us?

I do not have a specific tool in mind, just a vague question: "Can the
system start to perform poorly more easily than before?"  However, only
experimentation will tell.
Re: Fewer pool_get() in kqueue_register()
On 18/08/20(Tue) 15:30, Visa Hankala wrote:
> On Tue, Aug 18, 2020 at 11:04:47AM +0200, Martin Pieuchot wrote:
> > Diff below changes the order of operations in kqueue_register() to
> > get rid of an unnecessary pool_get().  When an event is already
> > present on the list, try to acquire it first.  Note that
> > knote_acquire() may sleep, in which case the list might have changed,
> > so the lookup always has to begin from the start.
> >
> > This will help with lazy removal of knotes in poll/select.  In this
> > scenario, EV_ADD is generally done with a knote already on the list.
>
> Some of the overhead could be absorbed by using a pool cache, as shown
> in the diff below.  However, I am not suggesting that the cache should
> be taken into use right now.  The frequency of knote pool usage is
> relatively low at the moment; there are other pools that would benefit
> more from caching.

Agreed, this is a nice idea to revisit.  Do you have a way to measure
which pools could benefit from caches?

I'm also not in a hurry.  That said, I'd like to be able to re-use
descriptors that are already on the kqueue.  For a per-thread kqueue
there's no possible race.  And since we'll need some sort of
serialization to unlock kevent(2), this could be built on top of it.

> A related question is what implications the increased use of the pool
> cache feature would have under memory pressure.

Do you have a suggestion on how to measure this as well?  Could dt(4)
probes or evcount(9) help us?
> Index: kern/init_main.c
> ===================================================================
> RCS file: src/sys/kern/init_main.c,v
> retrieving revision 1.300
> diff -u -p -r1.300 init_main.c
> --- kern/init_main.c	16 Jun 2020 05:09:29 -0000	1.300
> +++ kern/init_main.c	18 Aug 2020 15:09:38 -0000
> @@ -71,6 +71,7 @@
>  #include
>  #endif
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -148,7 +149,6 @@ void	crypto_init(void);
>  void	db_ctf_init(void);
>  void	prof_init(void);
>  void	init_exec(void);
> -void	kqueue_init(void);
>  void	futex_init(void);
>  void	taskq_init(void);
>  void	timeout_proc_init(void);
> @@ -431,7 +431,9 @@ main(void *framep)
>  	prof_init();
>  #endif
>  
> -	mbcpuinit();	/* enable per cpu mbuf data */
> +	/* Enable per cpu data. */
> +	mbcpuinit();
> +	kqueue_init_percpu();
>  
>  	/* init exec and emul */
>  	init_exec();
> Index: kern/kern_event.c
> ===================================================================
> RCS file: src/sys/kern/kern_event.c,v
> retrieving revision 1.142
> diff -u -p -r1.142 kern_event.c
> --- kern/kern_event.c	12 Aug 2020 13:49:24 -0000	1.142
> +++ kern/kern_event.c	18 Aug 2020 15:09:38 -0000
> @@ -205,6 +205,12 @@ kqueue_init(void)
>  	    PR_WAITOK, "knotepl", NULL);
>  }
>  
> +void
> +kqueue_init_percpu(void)
> +{
> +	pool_cache_init(&knote_pool);
> +}
> +
>  int
>  filt_fileattach(struct knote *kn)
>  {
> Index: sys/event.h
> ===================================================================
> RCS file: src/sys/sys/event.h,v
> retrieving revision 1.44
> diff -u -p -r1.44 event.h
> --- sys/event.h	22 Jun 2020 13:14:32 -0000	1.44
> +++ sys/event.h	18 Aug 2020 15:09:38 -0000
> @@ -210,6 +210,8 @@ extern void	knote_activate(struct knote
>  extern void	knote_remove(struct proc *p, struct knlist *list);
>  extern void	knote_fdclose(struct proc *p, int fd);
>  extern void	knote_processexit(struct proc *);
> +extern void	kqueue_init(void);
> +extern void	kqueue_init_percpu(void);
>  extern int	kqueue_register(struct kqueue *kq,
>  		    struct kevent *kev, struct proc *p);
>  extern int	filt_seltrue(struct knote *kn, long hint);
Re: Fewer pool_get() in kqueue_register()
On Tue, Aug 18, 2020 at 11:04:47AM +0200, Martin Pieuchot wrote:
> Diff below changes the order of operations in kqueue_register() to get
> rid of an unnecessary pool_get().  When an event is already present on
> the list, try to acquire it first.  Note that knote_acquire() may
> sleep, in which case the list might have changed, so the lookup always
> has to begin from the start.
>
> This will help with lazy removal of knotes in poll/select.  In this
> scenario, EV_ADD is generally done with a knote already on the list.

Some of the overhead could be absorbed by using a pool cache, as shown
in the diff below.  However, I am not suggesting that the cache should
be taken into use right now.  The frequency of knote pool usage is
relatively low at the moment; there are other pools that would benefit
more from caching.

A related question is what implications the increased use of the pool
cache feature would have under memory pressure.

Index: kern/init_main.c
===================================================================
RCS file: src/sys/kern/init_main.c,v
retrieving revision 1.300
diff -u -p -r1.300 init_main.c
--- kern/init_main.c	16 Jun 2020 05:09:29 -0000	1.300
+++ kern/init_main.c	18 Aug 2020 15:09:38 -0000
@@ -71,6 +71,7 @@
 #include
 #endif
 #include
+#include
 #include
 #include
 #include
@@ -148,7 +149,6 @@ void	crypto_init(void);
 void	db_ctf_init(void);
 void	prof_init(void);
 void	init_exec(void);
-void	kqueue_init(void);
 void	futex_init(void);
 void	taskq_init(void);
 void	timeout_proc_init(void);
@@ -431,7 +431,9 @@ main(void *framep)
 	prof_init();
 #endif
 
-	mbcpuinit();	/* enable per cpu mbuf data */
+	/* Enable per cpu data. */
+	mbcpuinit();
+	kqueue_init_percpu();
 
 	/* init exec and emul */
 	init_exec();
Index: kern/kern_event.c
===================================================================
RCS file: src/sys/kern/kern_event.c,v
retrieving revision 1.142
diff -u -p -r1.142 kern_event.c
--- kern/kern_event.c	12 Aug 2020 13:49:24 -0000	1.142
+++ kern/kern_event.c	18 Aug 2020 15:09:38 -0000
@@ -205,6 +205,12 @@ kqueue_init(void)
 	    PR_WAITOK, "knotepl", NULL);
 }
 
+void
+kqueue_init_percpu(void)
+{
+	pool_cache_init(&knote_pool);
+}
+
 int
 filt_fileattach(struct knote *kn)
 {
Index: sys/event.h
===================================================================
RCS file: src/sys/sys/event.h,v
retrieving revision 1.44
diff -u -p -r1.44 event.h
--- sys/event.h	22 Jun 2020 13:14:32 -0000	1.44
+++ sys/event.h	18 Aug 2020 15:09:38 -0000
@@ -210,6 +210,8 @@ extern void	knote_activate(struct knote
 extern void	knote_remove(struct proc *p, struct knlist *list);
 extern void	knote_fdclose(struct proc *p, int fd);
 extern void	knote_processexit(struct proc *);
+extern void	kqueue_init(void);
+extern void	kqueue_init_percpu(void);
 extern int	kqueue_register(struct kqueue *kq,
 		    struct kevent *kev, struct proc *p);
 extern int	filt_seltrue(struct knote *kn, long hint);
Re: Fewer pool_get() in kqueue_register()
On 18/08/20(Tue) 11:22, Mark Kettenis wrote:
> > Date: Tue, 18 Aug 2020 11:04:47 +0200
> > From: Martin Pieuchot
> >
> > Diff below changes the order of operations in kqueue_register() to
> > get rid of an unnecessary pool_get().  When an event is already
> > present on the list, try to acquire it first.  Note that
> > knote_acquire() may sleep, in which case the list might have
> > changed, so the lookup always has to begin from the start.
> >
> > This will help with lazy removal of knotes in poll/select.  In this
> > scenario, EV_ADD is generally done with a knote already on the list.
> >
> > ok?
>
> But pool_get() may sleep as well.  In my experience it is better to do
> the resource allocation up front and release it afterwards if it turns
> out you didn't need the resource.  That's what the current code does.

There's indeed a possible race when multiple threads try to register
the same event on the same kqueue; this would lead to a double insert.
Thanks!

> > Index: kern/kern_event.c
> > ===================================================================
> > RCS file: /cvs/src/sys/kern/kern_event.c,v
> > retrieving revision 1.142
> > diff -u -p -r1.142 kern_event.c
> > --- kern/kern_event.c	12 Aug 2020 13:49:24 -0000	1.142
> > +++ kern/kern_event.c	18 Aug 2020 08:58:27 -0000
> > @@ -696,7 +696,7 @@ kqueue_register(struct kqueue *kq, struc
> >  	struct filedesc *fdp = kq->kq_fdp;
> >  	const struct filterops *fops = NULL;
> >  	struct file *fp = NULL;
> > -	struct knote *kn = NULL, *newkn = NULL;
> > +	struct knote *kn, *newkn = NULL;
> >  	struct knlist *list = NULL;
> >  	int s, error = 0;
> > 
> > @@ -721,22 +721,12 @@ kqueue_register(struct kqueue *kq, struc
> >  		return (EBADF);
> >  	}
> > 
> > -	if (kev->flags & EV_ADD)
> > -		newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
> > -
> >  again:
> > +	kn = NULL;
> >  	if (fops->f_flags & FILTEROP_ISFD) {
> > -		if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
> > -			error = EBADF;
> > -			goto done;
> > -		}
> > -		if (kev->flags & EV_ADD)
> > -			kqueue_expand_list(kq, kev->ident);
> >  		if (kev->ident < kq->kq_knlistsize)
> >  			list = &kq->kq_knlist[kev->ident];
> >  	} else {
> > -		if (kev->flags & EV_ADD)
> > -			kqueue_expand_hash(kq);
> >  		if (kq->kq_knhashmask != 0) {
> >  			list = &kq->kq_knhash[
> >  			    KN_HASH((u_long)kev->ident, kq->kq_knhashmask)];
> > @@ -749,10 +739,6 @@ again:
> >  				s = splhigh();
> >  				if (!knote_acquire(kn)) {
> >  					splx(s);
> > -					if (fp != NULL) {
> > -						FRELE(fp, p);
> > -						fp = NULL;
> > -					}
> >  					goto again;
> >  				}
> >  				splx(s);
> > @@ -760,6 +746,21 @@ again:
> >  			}
> >  		}
> >  	}
> > +
> > +	if (kev->flags & EV_ADD && kn == NULL) {
> > +		newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
> > +		if (fops->f_flags & FILTEROP_ISFD) {
> > +			if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
> > +				error = EBADF;
> > +				goto done;
> > +			}
> > +			kqueue_expand_list(kq, kev->ident);
> > +		} else {
> > +			kqueue_expand_hash(kq);
> > +		}
> > +
> > +	}
> > +
> >  	KASSERT(kn == NULL || (kn->kn_status & KN_PROCESSING) != 0);
> > 
> >  	if (kn == NULL && ((kev->flags & EV_ADD) == 0)) {
Re: Fewer pool_get() in kqueue_register()
> Date: Tue, 18 Aug 2020 11:04:47 +0200
> From: Martin Pieuchot
>
> Diff below changes the order of operations in kqueue_register() to
> get rid of an unnecessary pool_get().  When an event is already
> present on the list, try to acquire it first.  Note that
> knote_acquire() may sleep, in which case the list might have changed,
> so the lookup always has to begin from the start.
>
> This will help with lazy removal of knotes in poll/select.  In this
> scenario, EV_ADD is generally done with a knote already on the list.
>
> ok?

But pool_get() may sleep as well.  In my experience it is better to do
the resource allocation up front and release it afterwards if it turns
out you didn't need the resource.  That's what the current code does.

I don't fully understand how the code works, but your change looks
wrong to me.

> Index: kern/kern_event.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_event.c,v
> retrieving revision 1.142
> diff -u -p -r1.142 kern_event.c
> --- kern/kern_event.c	12 Aug 2020 13:49:24 -0000	1.142
> +++ kern/kern_event.c	18 Aug 2020 08:58:27 -0000
> @@ -696,7 +696,7 @@ kqueue_register(struct kqueue *kq, struc
>  	struct filedesc *fdp = kq->kq_fdp;
>  	const struct filterops *fops = NULL;
>  	struct file *fp = NULL;
> -	struct knote *kn = NULL, *newkn = NULL;
> +	struct knote *kn, *newkn = NULL;
>  	struct knlist *list = NULL;
>  	int s, error = 0;
> 
> @@ -721,22 +721,12 @@ kqueue_register(struct kqueue *kq, struc
>  		return (EBADF);
>  	}
> 
> -	if (kev->flags & EV_ADD)
> -		newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
> -
>  again:
> +	kn = NULL;
>  	if (fops->f_flags & FILTEROP_ISFD) {
> -		if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
> -			error = EBADF;
> -			goto done;
> -		}
> -		if (kev->flags & EV_ADD)
> -			kqueue_expand_list(kq, kev->ident);
>  		if (kev->ident < kq->kq_knlistsize)
>  			list = &kq->kq_knlist[kev->ident];
>  	} else {
> -		if (kev->flags & EV_ADD)
> -			kqueue_expand_hash(kq);
>  		if (kq->kq_knhashmask != 0) {
>  			list = &kq->kq_knhash[
>  			    KN_HASH((u_long)kev->ident, kq->kq_knhashmask)];
> @@ -749,10 +739,6 @@ again:
>  				s = splhigh();
>  				if (!knote_acquire(kn)) {
>  					splx(s);
> -					if (fp != NULL) {
> -						FRELE(fp, p);
> -						fp = NULL;
> -					}
>  					goto again;
>  				}
>  				splx(s);
> @@ -760,6 +746,21 @@ again:
>  			}
>  		}
>  	}
> +
> +	if (kev->flags & EV_ADD && kn == NULL) {
> +		newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
> +		if (fops->f_flags & FILTEROP_ISFD) {
> +			if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
> +				error = EBADF;
> +				goto done;
> +			}
> +			kqueue_expand_list(kq, kev->ident);
> +		} else {
> +			kqueue_expand_hash(kq);
> +		}
> +
> +	}
> +
>  	KASSERT(kn == NULL || (kn->kn_status & KN_PROCESSING) != 0);
> 
>  	if (kn == NULL && ((kev->flags & EV_ADD) == 0)) {
Fewer pool_get() in kqueue_register()
Diff below changes the order of operations in kqueue_register() to get
rid of an unnecessary pool_get().  When an event is already present on
the list, try to acquire it first.  Note that knote_acquire() may
sleep, in which case the list might have changed, so the lookup always
has to begin from the start.

This will help with lazy removal of knotes in poll/select.  In this
scenario, EV_ADD is generally done with a knote already on the list.

ok?

Index: kern/kern_event.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_event.c,v
retrieving revision 1.142
diff -u -p -r1.142 kern_event.c
--- kern/kern_event.c	12 Aug 2020 13:49:24 -0000	1.142
+++ kern/kern_event.c	18 Aug 2020 08:58:27 -0000
@@ -696,7 +696,7 @@ kqueue_register(struct kqueue *kq, struc
 	struct filedesc *fdp = kq->kq_fdp;
 	const struct filterops *fops = NULL;
 	struct file *fp = NULL;
-	struct knote *kn = NULL, *newkn = NULL;
+	struct knote *kn, *newkn = NULL;
 	struct knlist *list = NULL;
 	int s, error = 0;
 
@@ -721,22 +721,12 @@ kqueue_register(struct kqueue *kq, struc
 		return (EBADF);
 	}
 
-	if (kev->flags & EV_ADD)
-		newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
-
 again:
+	kn = NULL;
 	if (fops->f_flags & FILTEROP_ISFD) {
-		if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
-			error = EBADF;
-			goto done;
-		}
-		if (kev->flags & EV_ADD)
-			kqueue_expand_list(kq, kev->ident);
 		if (kev->ident < kq->kq_knlistsize)
 			list = &kq->kq_knlist[kev->ident];
 	} else {
-		if (kev->flags & EV_ADD)
-			kqueue_expand_hash(kq);
 		if (kq->kq_knhashmask != 0) {
 			list = &kq->kq_knhash[
 			    KN_HASH((u_long)kev->ident, kq->kq_knhashmask)];
@@ -749,10 +739,6 @@ again:
 				s = splhigh();
 				if (!knote_acquire(kn)) {
 					splx(s);
-					if (fp != NULL) {
-						FRELE(fp, p);
-						fp = NULL;
-					}
 					goto again;
 				}
 				splx(s);
@@ -760,6 +746,21 @@ again:
 			}
 		}
 	}
+
+	if (kev->flags & EV_ADD && kn == NULL) {
+		newkn = pool_get(&knote_pool, PR_WAITOK | PR_ZERO);
+		if (fops->f_flags & FILTEROP_ISFD) {
+			if ((fp = fd_getfile(fdp, kev->ident)) == NULL) {
+				error = EBADF;
+				goto done;
+			}
+			kqueue_expand_list(kq, kev->ident);
+		} else {
+			kqueue_expand_hash(kq);
+		}
+
+	}
+
 	KASSERT(kn == NULL || (kn->kn_status & KN_PROCESSING) != 0);
 
 	if (kn == NULL && ((kev->flags & EV_ADD) == 0)) {