Re: syncer_thread: sleep without lbolt

2021-01-07 Thread Scott Cheloha
On Sat, Dec 12, 2020 at 01:32:13PM -0600, Scott Cheloha wrote:
> Hi,
> 
> The syncer thread is one of the last users of the lbolt (lightning
> bolt!) sleep channel.
> 
> If we add a syncer-specific sleep channel (syncer_chan) and do a bit
> of time math we can replicate the current behavior and remove another
> lbolt user.
> 
> This isn't a perfect recreation of the current behavior.  In this
> version the sleep period will drift if processing takes longer than 1
> second.  I think it's good enough.  If people are concerned about a
> perfect recreation of the current behavior we *can* do it, but it will
> require more code.  I don't think it's worth it.
> 
> This also fixes two problems in the current code.  They aren't huge
> bugs, but they did jump out as potential problems because they make
> the syncer's behavior less deterministic:
> 
> - The current code uses gettime(9), which will jump and screw up your
>   measurement if someone calls settimeofday(2).  The new code uses the
>   uptime clock, which is monotonic and stable.
> 
> - The current code uses gettime(9), which has a resolution of 1
>   second.  Measuring a 1 second timeout with an interface whose
>   resolution is also 1 second is crude and error-prone.  The new code
>   uses getnsecuptime(), which has a resolution of roughly 1/hz.
>   Much better.
> 
> I vaguely recall beck@ trying to do something with this in the recent
> past, so CC beck@.

1-ish month bump.

Index: vfs_sync.c
===================================================================
RCS file: /cvs/src/sys/kern/vfs_sync.c,v
retrieving revision 1.64
diff -u -p -r1.64 vfs_sync.c
--- vfs_sync.c  24 Jun 2020 22:03:41 -0000  1.64
+++ vfs_sync.c  25 Dec 2020 17:09:00 -0000
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include <sys/time.h>
 
 #include 
 #include 
@@ -73,6 +74,7 @@ LIST_HEAD(synclist, vnode);
 static struct synclist *syncer_workitem_pending;
 
 struct proc *syncerproc;
+int syncer_chan;
 
 /*
  * The workitem queue.
@@ -130,6 +132,19 @@ vn_syncer_add_to_worklist(struct vnode *
 }
 
 /*
+ * TODO Move getnsecuptime() to kern_tc.c and document it when we have
+ * more users in the kernel.
+ */
+static uint64_t
+getnsecuptime(void)
+{
+   struct timespec now;
+
+   getnanouptime(&now);
+   return TIMESPEC_TO_NSEC(&now);
+}
+
+/*
  * System filesystem synchronizer daemon.
  */
 void
@@ -138,11 +153,11 @@ syncer_thread(void *arg)
struct proc *p = curproc;
struct synclist *slp;
struct vnode *vp;
-   time_t starttime;
+   uint64_t elapsed, start;
int s;
 
for (;;) {
-   starttime = gettime();
+   start = getnsecuptime();
 
/*
 * Push files whose dirty time has expired.
@@ -220,6 +235,7 @@ syncer_thread(void *arg)
rushjob -= 1;
continue;
}
+
/*
 * If it has taken us less than a second to process the
 * current work, then wait. Otherwise start right over
@@ -228,8 +244,11 @@ syncer_thread(void *arg)
 * matter as we are just trying to generally pace the
 * filesystem activity.
 */
-   if (gettime() == starttime)
-   tsleep_nsec(&lbolt, PPAUSE, "syncer", INFSLP);
+   elapsed = getnsecuptime() - start;
+   if (elapsed < SEC_TO_NSEC(1)) {
+   tsleep_nsec(&syncer_chan, PPAUSE, "syncer",
+   SEC_TO_NSEC(1) - elapsed);
+   }
}
 }
 
@@ -242,7 +261,7 @@ int
 speedup_syncer(void)
 {
if (syncerproc)
-   wakeup_proc(syncerproc, &lbolt);
+   wakeup_proc(syncerproc, &syncer_chan);
if (rushjob < syncdelay / 2) {
rushjob += 1;
stat_rush_requests += 1;



syncer_thread: sleep without lbolt

2020-12-12 Thread Scott Cheloha
Hi,

The syncer thread is one of the last users of the lbolt (lightning
bolt!) sleep channel.

If we add a syncer-specific sleep channel (syncer_chan) and do a bit
of time math we can replicate the current behavior and remove another
lbolt user.

This isn't a perfect recreation of the current behavior.  In this
version the sleep period will drift if processing takes longer than 1
second.  I think it's good enough.  If people are concerned about a
perfect recreation of the current behavior we *can* do it, but it will
require more code.  I don't think it's worth it.
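
For the record, here is roughly what the extra code for a non-drifting
version might look like (untested sketch, not in the diff below): keep
an absolute deadline on the uptime clock and bump it by exactly one
second per iteration, so a slow pass doesn't push later wakeups back.

	uint64_t deadline, now;

	deadline = getnsecuptime() + SEC_TO_NSEC(1);
	for (;;) {
		/* ... push files whose dirty time has expired ... */

		now = getnsecuptime();
		if (now < deadline) {
			tsleep_nsec(&syncer_chan, PPAUSE, "syncer",
			    deadline - now);
		}
		/* Keep wakeups on a fixed 1-second grid. */
		do {
			deadline += SEC_TO_NSEC(1);
		} while (deadline <= getnsecuptime());
	}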

This also fixes two problems in the current code.  They aren't huge
bugs, but they did jump out as potential problems because they make
the syncer's behavior less deterministic:

- The current code uses gettime(9), which will jump and screw up your
  measurement if someone calls settimeofday(2).  The new code uses the
  uptime clock, which is monotonic and stable.

- The current code uses gettime(9), which has a resolution of 1
  second.  Measuring a 1 second timeout with an interface whose
  resolution is also 1 second is crude and error-prone (see the
  example below).  The new code
  uses getnsecuptime(), which has a resolution of roughly 1/hz.
  Much better.
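
To make that concrete: say the loop starts when the real time is
41.99 seconds and the pass takes 20ms of work (numbers made up, just
to show the granularity problem):

	starttime = gettime();		/* returns 41 (real time 41.99s) */
	/* ... 20ms of work ... */
	gettime() == starttime		/* 42 != 41: we skip the sleep */

So a 20ms pass is treated as if it took a full second and the syncer
starts the next pass immediately.  With getnsecuptime() the same pass
measures ~20ms elapsed and sleeps for the remaining ~980ms.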

I vaguely recall beck@ trying to do something with this in the recent
past, so CC beck@.

Thoughts?  ok?

Index: vfs_sync.c
===================================================================
RCS file: /cvs/src/sys/kern/vfs_sync.c,v
retrieving revision 1.64
diff -u -p -r1.64 vfs_sync.c
--- vfs_sync.c  24 Jun 2020 22:03:41 -0000  1.64
+++ vfs_sync.c  12 Dec 2020 19:29:11 -0000
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include <sys/time.h>
 
 #include 
 #include 
@@ -73,6 +74,7 @@ LIST_HEAD(synclist, vnode);
 static struct synclist *syncer_workitem_pending;
 
 struct proc *syncerproc;
+int syncer_chan;
 
 /*
  * The workitem queue.
@@ -129,6 +131,15 @@ vn_syncer_add_to_worklist(struct vnode *
splx(s);
 }
 
+uint64_t
+getnsecuptime(void)
+{
+   struct timespec now;
+
+   getnanouptime(&now);
+   return TIMESPEC_TO_NSEC(&now);
+}
+
 /*
  * System filesystem synchronizer daemon.
  */
@@ -138,11 +149,11 @@ syncer_thread(void *arg)
struct proc *p = curproc;
struct synclist *slp;
struct vnode *vp;
-   time_t starttime;
+   uint64_t elapsed, start;
int s;
 
for (;;) {
-   starttime = gettime();
+   start = getnsecuptime();
 
/*
 * Push files whose dirty time has expired.
@@ -220,6 +231,7 @@ syncer_thread(void *arg)
rushjob -= 1;
continue;
}
+
/*
 * If it has taken us less than a second to process the
 * current work, then wait. Otherwise start right over
@@ -228,8 +240,11 @@ syncer_thread(void *arg)
 * matter as we are just trying to generally pace the
 * filesystem activity.
 */
-   if (gettime() == starttime)
-   tsleep_nsec(&lbolt, PPAUSE, "syncer", INFSLP);
+   elapsed = getnsecuptime() - start;
+   if (elapsed < SEC_TO_NSEC(1)) {
+   tsleep_nsec(&syncer_chan, PPAUSE, "syncer",
+   SEC_TO_NSEC(1) - elapsed);
+   }
}
 }
 
@@ -242,7 +257,7 @@ int
 speedup_syncer(void)
 {
if (syncerproc)
-   wakeup_proc(syncerproc, &lbolt);
+   wakeup_proc(syncerproc, &syncer_chan);
if (rushjob < syncdelay / 2) {
rushjob += 1;
stat_rush_requests += 1;