On Sat, Oct 03, 2020 at 09:09:00AM +0200, Martin Pieuchot wrote: > On 02/10/20(Fri) 19:09, Scott Cheloha wrote: > > On Fri, Oct 02, 2020 at 12:19:35PM +0200, Martin Pieuchot wrote: > > > @@ -635,12 +642,39 @@ sys_kevent(struct proc *p, void *v, regi > > > goto done; > > > } > > > > > > + > > > KQREF(kq); > > > FRELE(fp, p); > > > - error = kqueue_scan(kq, SCARG(uap, nevents), SCARG(uap, eventlist), > > > - tsp, kev, p, &n); > > > + /* > > > + * Collect as many events as we can. The timeout on successive > > > + * loops is disabled (kqueue_scan() becomes non-blocking). > > > + */ > > > + total = 0; > > > + error = 0; > > > + kqueue_scan_setup(&scan, kq); > > > + while ((n = SCARG(uap, nevents) - total) > 0) { > > > + if (n > nitems(kev)) > > > + n = nitems(kev); > > > + ready = kqueue_scan(&scan, n, kev, tsp, p, &error); > > > + if (ready == 0) > > > + break; > > > + error = copyout(kev, SCARG(uap, eventlist) + total, > > > + sizeof(struct kevent) * ready); > > > +#ifdef KTRACE > > > + if (KTRPOINT(p, KTR_STRUCT)) > > > + ktrevent(p, kev, ready); > > > +#endif > > > + total += ready; > > > + if (error || ready < n) > > > + break; > > > + tsp = &ts; /* successive loops non-blocking */ > > > + timespecclear(tsp); > > > > Here, this. Why do we force a non-blocking loop the second time? > > If there's a second time that implies the first time already reported > some events so there's already something to return to userland. In that > case we just want to gather the events that were not collected the first > time and not sleep again.
Okay, now I see it, thank you for the explanation. > > > + } > > > + kqueue_scan_finish(&scan); > > > KQRELE(kq); > > > - *retval = n; > > > + if (error == EWOULDBLOCK) > > > + error = 0; > > > + *retval = total; > > > return (error); > > > > > > done: > > > @@ -894,24 +928,22 @@ kqueue_sleep(struct kqueue *kq, struct t > > > return (error); > > > } > > > > > > +/* > > > + * Scan the kqueue, blocking if necessary until the target time is > > > reached. > > > + * If tsp is NULL we block indefinitely. If tsp->ts_secs/nsecs are both > > > + * 0 we do not block at all. > > > + */ > > > int > > > -kqueue_scan(struct kqueue *kq, int maxevents, struct kevent *ulistp, > > > - struct timespec *tsp, struct kevent *kev, struct proc *p, int > > > *retval) > > > +kqueue_scan(struct kqueue_scan_state *scan, int maxevents, > > > + struct kevent *kevp, struct timespec *tsp, struct proc *p, int > > > *errorp) > > > { > > > - struct kevent *kevp; > > > - struct knote mend, mstart, *kn; > > > - int s, count, nkev, error = 0; > > > - > > > - nkev = 0; > > > - kevp = kev; > > > + struct knote *kn; > > > + struct kqueue *kq = scan->kqs_kq; > > > + int s, count, nkev = 0, error = 0; > > > > > > count = maxevents; > > > if (count == 0) > > > goto done; > > > - > > > - memset(&mstart, 0, sizeof(mstart)); > > > - memset(&mend, 0, sizeof(mend)); > > > - > > > retry: > > > KASSERT(count == maxevents); > > > KASSERT(nkev == 0); > > > @@ -923,7 +955,8 @@ retry: > > > > > > s = splhigh(); > > > if (kq->kq_count == 0) { > > > - if (tsp != NULL && !timespecisset(tsp)) { > > > + if ((tsp != NULL && !timespecisset(tsp)) || > > > + scan->kqs_nevent != 0) { > > > splx(s); > > > error = 0; > > > goto done; > > > @@ -931,7 +964,7 @@ retry: > > > kq->kq_state |= KQ_SLEEP; > > > error = kqueue_sleep(kq, tsp); > > > splx(s); > > > - if (error == 0 || error == EWOULDBLOCK) > > > + if (error == 0) > > > goto retry; > > > > Why wouldn't we want to retry in the EWOULDBLOCK case? > > You have a check for > > > > tsp != NULL && !timespecisset(tsp) > > > > e.g., when you time out. > > I don't recall why or even if there was a reason. I'll change it back, > thanks. Cool. > > > + > > > + /* > > > + * The poll/select family of syscalls has been designed to > > > + * block when file descriptors are not available, even if > > > + * there's nothing to wait for. > > > + */ > > > + if (nevents == 0) { > > > + uint64_t nsecs = INFSLP; > > > + > > > + if (timeout != NULL) > > > + nsecs = MAX(1, MIN(TIMESPEC_TO_NSEC(timeout), MAXTSLP)); > > > + > > > + error = tsleep_nsec(&p->p_kq, PSOCK | PCATCH, "kqsel", nsecs); > > > + /* select is not restarted after signals... */ > > > + if (error == ERESTART) > > > + error = EINTR; > > > + } > > > > Aside: can the new logic (below) not handle the case where > > nevents == 0? Like, what happens if we go into kqueue_scan() > > with count == 0? > > First like of kqueue_scan() does: > > if (count == 0) > goto done > > So there's no sleep. I opted for the explicit sleep rather than the > obfuscated one ;) I'm a bit torn on this. On the one hand, I like being overt about special cases. On the other hand, I worry that this will introduce bugs. For instance, I just noticed that you have changed the behavior in the zero timeout case. select(2) and pselect(2) are not supposed to block if the timeout is zero, e.g. code like: struct timeval tv; timerclear(&tv); select(0, NULL, NULL, NULL, &tv); should not block. But your new code blocks. In theory there are no other bugs... > > > + /* Collect at most `nevents' possibly waiting in kqueue_scan() */ > > > + kqueue_scan_setup(&scan, p->p_kq); > > > + while (nevents > 0) { > > > + struct kevent kev[KQ_NEVENTS]; > > > + int i, ready, count; > > > + > > > + /* Maxium number of events per iteration */ > > > + count = MIN(nitems(kev), nevents); > > > + ready = kqueue_scan(&scan, count, kev, timeout, p, &error); > > > +#ifdef KTRACE > > > + if (KTRPOINT(p, KTR_STRUCT)) > > > + ktrevent(p, kev, ready); > > > +#endif > > > + /* Convert back events that are ready. */ > > > + for (i = 0; i < ready; i++) > > > + *retval += pselcollect(p, &kev[i], pobits); > > > + > > > + /* > > > + * Stop if there was an error or if we had enough > > > + * place to collect all events that were ready. > > > + */ > > > + if (error || ready < count) > > > + break; > > > + > > > + timeout = &ts; /* successive loops non-blocking */ > > > + timespecclear(timeout); > > > > ... again, it looks like we now truncate instead of resleeping. > > Yes, that's on purpose. Maybe you can suggest a comment to explain this > behavior that seems surprising? After thinking about it a bit I think it is less surprising than I first thought. What a surprise :) I'm trying to communicate: kev is a fixed-length array. The caller's event array might be longer than kev. In that case, we may need to call kqueue_scan() multiple times to capture all events. But we only want to block in kqueue_scan() on the 1st iteration because we necessarily captured events during the first call. Here's my best attempt: /* * There may be more ready events. We will try to capture them during * the next iteration. However, we have already captured other events, * so we do not want to block in kqueue_scan() if it turns out that there * aren't any. */ ... meh.