On Sat, Dec 12, 2020 at 01:32:13PM -0600, Scott Cheloha wrote:
> Hi,
>
> The syncer thread is one of the last users of the lbolt (lightning
> bolt!) sleep channel.
>
> If we add a syncer-specific sleep channel (syncer_chan) and do a bit
> of time math we can replicate the current behavior and remove another
> lbolt user.
>
> This isn't a perfect recreation of the current behavior. In this
> version the sleep period will drift if processing takes longer than 1
> second. I think it's good enough. If people are concerned about a
> perfect recreation of the current behavior we *can* do it, but it will
> require more code. I don't think it's worth it.
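
To make the drift concrete (a userland sketch with invented names, not code from the diff): the patched loop sleeps only for whatever is left of the one-second period, so an iteration that overruns the period skips the sleep entirely and every later wakeup shifts with it, instead of staying aligned to a global boundary the way lbolt wakeups were.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Sketch of the patch's relative-timeout math: sleep for the
 * remainder of the 1-second period, or not at all if the work
 * already consumed it.  An overrun is never "paid back", so the
 * wakeup phase drifts rather than snapping back to a boundary.
 */
uint64_t
remaining_sleep(uint64_t start_ns, uint64_t now_ns)
{
	uint64_t elapsed = now_ns - start_ns;

	if (elapsed < NSEC_PER_SEC)
		return NSEC_PER_SEC - elapsed;
	return 0;	/* overran the period: start over immediately */
}
```

With 0.25 s of work this yields a 0.75 s sleep; with 1.5 s of work it yields no sleep at all, and the next period simply starts 0.5 s late.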
>
> This also fixes two problems in the current code. They aren't huge
> bugs, but they did jump out as potential problems because they make
> the syncer's behavior less deterministic:
>
> - The current code uses gettime(9), which will jump and screw up your
> measurement if someone calls settimeofday(2). The new code uses the
> uptime clock, which is monotonic and stable.
>
> - The current code uses gettime(9), which has a resolution of 1
> second. Measuring a 1 second timeout with an interface whose
> resolution is also 1 second is crude and error-prone. The new code
> uses getnsecuptime(), which has a resolution of roughly 1/hz.
> Much better.
>
> I vaguely recall beck@ trying to do something with this in the recent
> past, so CC beck@.
1-ish month bump.
Index: vfs_sync.c
===================================================================
RCS file: /cvs/src/sys/kern/vfs_sync.c,v
retrieving revision 1.64
diff -u -p -r1.64 vfs_sync.c
--- vfs_sync.c	24 Jun 2020 22:03:41 -0000	1.64
+++ vfs_sync.c	25 Dec 2020 17:09:00 -0000
@@ -48,6 +48,7 @@
#include
#include
#include
+#include
#include
#include
@@ -73,6 +74,7 @@ LIST_HEAD(synclist, vnode);
 static struct synclist *syncer_workitem_pending;
 struct proc *syncerproc;
+int syncer_chan;
 
 /*
  * The workitem queue.
@@ -130,6 +132,19 @@ vn_syncer_add_to_worklist(struct vnode *
 }
 
 /*
+ * TODO Move getnsecuptime() to kern_tc.c and document it when we have
+ * more users in the kernel.
+ */
+static uint64_t
+getnsecuptime(void)
+{
+	struct timespec now;
+
+	getnanouptime(&now);
+	return TIMESPEC_TO_NSEC(&now);
+}
+
+/*
  * System filesystem synchronizer daemon.
  */
 void
@@ -138,11 +153,11 @@ syncer_thread(void *arg)
 	struct proc *p = curproc;
 	struct synclist *slp;
 	struct vnode *vp;
-	time_t starttime;
+	uint64_t elapsed, start;
 	int s;
 
 	for (;;) {
-		starttime = gettime();
+		start = getnsecuptime();
 
 		/*
 		 * Push files whose dirty time has expired.
@@ -220,6 +235,7 @@ syncer_thread(void *arg)
 			rushjob -= 1;
 			continue;
 		}
+
 		/*
 		 * If it has taken us less than a second to process the
 		 * current work, then wait. Otherwise start right over
@@ -228,8 +244,11 @@ syncer_thread(void *arg)
 		 * matter as we are just trying to generally pace the
 		 * filesystem activity.
 		 */
-		if (gettime() == starttime)
-			tsleep_nsec(&lbolt, PPAUSE, "syncer", INFSLP);
+		elapsed = getnsecuptime() - start;
+		if (elapsed < SEC_TO_NSEC(1)) {
+			tsleep_nsec(&syncer_chan, PPAUSE, "syncer",
+			    SEC_TO_NSEC(1) - elapsed);
+		}
 	}
 }
 
@@ -242,7 +261,7 @@ int
 speedup_syncer(void)
 {
 	if (syncerproc)
-		wakeup_proc(syncerproc, &lbolt);
+		wakeup_proc(syncerproc, &syncer_chan);
 	if (rushjob < syncdelay / 2) {
 		rushjob += 1;
 		stat_rush_requests += 1;