Hi,
The syncer thread is one of the last users of the lbolt (lightning
bolt!) sleep channel.
If we add a syncer-specific sleep channel (syncer_chan) and do a bit
of time math, we can replicate the current behavior and remove another
lbolt user.
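The new loop body boils down to this (excerpted from the diff below):

	start = getnsecuptime();

	/* ... push out dirty vnodes ... */

	elapsed = getnsecuptime() - start;
	if (elapsed < SEC_TO_NSEC(1)) {
		tsleep_nsec(&syncer_chan, PPAUSE, "syncer",
		    SEC_TO_NSEC(1) - elapsed);
	}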
This isn't a perfect recreation of the current behavior: in this
version the sleep period will drift if processing takes longer than 1
second. I think that's good enough. If people want a perfect
recreation of the current behavior we *can* do it, but it requires
more code and I don't think it's worth it.
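For the record, a drift-free version would look something like the
following untested sketch, which chases an absolute deadline instead
of remeasuring from "now" on every iteration:

	uint64_t deadline, now;

	deadline = getnsecuptime() + SEC_TO_NSEC(1);
	for (;;) {
		/* ... push out dirty vnodes ... */

		now = getnsecuptime();
		if (now < deadline) {
			tsleep_nsec(&syncer_chan, PPAUSE, "syncer",
			    deadline - now);
		}
		/* Catch up in whole seconds if we overran the deadline. */
		do {
			deadline += SEC_TO_NSEC(1);
		} while (deadline <= getnsecuptime());
	}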
This also fixes two problems in the current code. Neither is a huge
bug, but both jumped out at me because they make the syncer's
behavior less deterministic:
- The current code uses gettime(9), which will jump and screw up your
  measurement if someone calls settimeofday(2). The new code uses the
  uptime clock, which is monotonic and stable.

- The current code uses gettime(9), which has a resolution of 1
  second. Measuring a 1 second interval with an interface whose own
  resolution is 1 second is crude and error-prone. The new code uses
  getnsecuptime(), which has a resolution of roughly 1/hz. Much
  better.
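To put numbers on that second problem: gettime(9) only ticks once per
second, so the old check

	if (gettime() == starttime)
		tsleep_nsec(&lbolt, PPAUSE, "syncer", INFSLP);

skips the sleep entirely whenever the work happens to straddle a
second boundary, even if the work only took a few milliseconds, and
can sleep for almost a full extra second when it doesn't. The
effective period swings anywhere from ~0 to ~2 seconds depending on
where in the second the iteration starts.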
I vaguely recall beck@ trying to do something with this in the recent
past, so CC beck@.
Thoughts? ok?
Index: vfs_sync.c
===================================================================
RCS file: /cvs/src/sys/kern/vfs_sync.c,v
retrieving revision 1.64
diff -u -p -r1.64 vfs_sync.c
--- vfs_sync.c 24 Jun 2020 22:03:41 -0000 1.64
+++ vfs_sync.c 12 Dec 2020 19:29:11 -0000
@@ -48,6 +48,7 @@
 #include <sys/vnode.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
+#include <sys/time.h>
 #include <sys/kernel.h>
 #include <sys/sched.h>
 
@@ -73,6 +74,7 @@ LIST_HEAD(synclist, vnode);
 static struct synclist *syncer_workitem_pending;
 
 struct proc *syncerproc;
+int syncer_chan;
 
 /*
  * The workitem queue.
@@ -129,6 +131,15 @@ vn_syncer_add_to_worklist(struct vnode *
 	splx(s);
 }
 
+uint64_t
+getnsecuptime(void)
+{
+	struct timespec now;
+
+	getnanouptime(&now);
+	return TIMESPEC_TO_NSEC(&now);
+}
+
 /*
  * System filesystem synchronizer daemon.
  */
@@ -138,11 +149,11 @@ syncer_thread(void *arg)
 	struct proc *p = curproc;
 	struct synclist *slp;
 	struct vnode *vp;
-	time_t starttime;
+	uint64_t elapsed, start;
 	int s;
 
 	for (;;) {
-		starttime = gettime();
+		start = getnsecuptime();
 
 		/*
		 * Push files whose dirty time has expired.
@@ -220,6 +231,7 @@ syncer_thread(void *arg)
 			rushjob -= 1;
 			continue;
 		}
+
 		/*
 		 * If it has taken us less than a second to process the
 		 * current work, then wait. Otherwise start right over
@@ -228,8 +240,11 @@ syncer_thread(void *arg)
 		 * matter as we are just trying to generally pace the
 		 * filesystem activity.
 		 */
-		if (gettime() == starttime)
-			tsleep_nsec(&lbolt, PPAUSE, "syncer", INFSLP);
+		elapsed = getnsecuptime() - start;
+		if (elapsed < SEC_TO_NSEC(1)) {
+			tsleep_nsec(&syncer_chan, PPAUSE, "syncer",
+			    SEC_TO_NSEC(1) - elapsed);
+		}
 	}
 }
 
@@ -242,7 +257,7 @@ int
 speedup_syncer(void)
 {
 	if (syncerproc)
-		wakeup_proc(syncerproc, &lbolt);
+		wakeup_proc(syncerproc, &syncer_chan);
 	if (rushjob < syncdelay / 2) {
 		rushjob += 1;
 		stat_rush_requests += 1;