On Tue, Jun 05, 2018 at 03:57:48PM -0600, Todd C. Miller wrote:
> On Tue, 05 Jun 2018 16:30:14 -0500, Scott Cheloha wrote:
> 
> > Because the system clock can jump around, process start times
> > should be recorded as offsets from boot and only converted to
> > a time of day on request.
> >
> > This keeps ps(1)'s etime accurate and ensures processes are
> > correctly sorted by age in ps(1) and pgrep(1).
> 
> Looks good.
> 
> > I don't think we're losing much by declining to preserve the
> > time of day when the process forked.
> >
> > FreeBSD has made this change, though NetBSD has not.
> >
> > The attached patch isn't perfect, as there remains a small window
> > after kvm_getprocs is called where the system clock can jump again.
> > To circumvent that we'd need to compute the elapsed real time for
> > the process in the kernel, or pass out ps_start alongside the derived
> > time of day and let userspace compute the elapsed realtime with
> > the CLOCK_BOOTTIME clock.  Unsure how people feel about extending
> > kinfo_proc, though.
> 
> I really don't this is worth worrying about.  In the current state
> of things, if the system clock jumps the process start time as
> reported by ps will be incorrect anyway.  This seems like an
> improvement.
>
> > This change doesn't require any userspace recompilation for inspecting
> > a live kernel: on reboot -lkvm binaries behave as expected.
> 
> Great.  I was worried that consumers of p_ustart_{sec,usec} would
> need to change but I can see that is not the case.

Update: here is a completed patch.  This now adds bits to libkvm that
derive the process starting time in the coredump case.  It doesn't take
subseconds into account: for that we'd need to expose the timehands
definition to userspace.  So I left a comment marking the way forward
for the intrepid.

For the coredump case to work obviously libkvm itself will need to be
recompiled, as will statically linked binaries like ps(1), otherwise
you'll get zeroes for your process starting times and ps(1) will claim
all the processes on your cored system started in ~1970.  The interface
to libkvm is entirely unchanged, so I'm pretty sure this doesn't
require a version bump.

ps(1)'s etime column will still be incorrect in the coredump case because
it computes the elapsed time using the active kernel's time.  There is a
similar problem with the uptime line in w(1).  The attached patch itself
doesn't break these outputs, though, so I'll fix them in a separate patch.

The only port using p_ustart_{sec,usec} in a way that might be broken by
this change is sudo.  I talked to todd and he said that this change wouldn't
cause any problems, though.

ok?

--
Scott Cheloha

Index: lib/libkvm/kvm_proc2.c
===================================================================
RCS file: /cvs/src/lib/libkvm/kvm_proc2.c,v
retrieving revision 1.27
diff -u -p -r1.27 kvm_proc2.c
--- lib/libkvm/kvm_proc2.c      7 Nov 2016 00:26:33 -0000       1.27
+++ lib/libkvm/kvm_proc2.c      23 Jun 2018 00:12:14 -0000
@@ -119,13 +119,35 @@ kvm_proclist(kvm_t *kd, int op, int arg,
        struct sigacts sa, *sap;
        struct vmspace vm, *vmp;
        struct plimit limits, *limp;
+       struct nlist nl[3];
        pid_t parent_pid, leader_pid;
-       int cnt = 0;
-       int dothreads = 0;
+       time_t time_second, time_uptime;
+       int cnt, dothreads, i;
 
+       cnt = 0;
        dothreads = op & KERN_PROC_SHOW_THREADS;
        op &= ~KERN_PROC_SHOW_THREADS;
 
+       /* Determine the victim's age and time of death. */
+       memset(&nl, 0, sizeof(nl));
+       nl[0].n_name = "time_second";
+       nl[1].n_name = "time_uptime";
+       nl[2].n_name = NULL;
+       if (kvm_nlist(kd, nl) != 0) {
+               for (i = 0; nl[i].n_type != 0; i++)
+                       continue;
+               _kvm_err(kd, kd->program, "%s: no such symbol", nl[i].n_name);
+               return (-1);
+       }
+       if (KREAD(kd, nl[0].n_value, &time_second)) {
+               _kvm_err(kd, kd->program, "can't read %s", nl[0].n_name);
+               return (-1);
+       }
+       if (KREAD(kd, nl[1].n_value, &time_uptime)) {
+               _kvm_err(kd, kd->program, "can't read %s", nl[1].n_name);
+               return (-1);
+       }
+
        /*
         * Modelled on sysctl_doproc() in sys/kern/kern_sysctl.c
         */
@@ -288,6 +310,19 @@ kvm_proclist(kvm_t *kd, int op, int arg,
                        kp.p_tpgid = -1;
                        kp.p_tdev = NODEV;
                }
+
+               /*
+                * Derive the starting time for the process.
+                *
+                * XXX This ignores subseconds.  One solution would be
+                * to expose the timehands definition to userspace and
+                * grab th_offset/th_nanotime from the active timehands.
+                * Pending that, just pretend every process forked at
+                * the beginning of a nearby second.
+                */
+               kp.p_ustart_sec =
+                   time_second - time_uptime + process.ps_start.tv_sec;
+               kp.p_ustart_usec = 0;   /* XXX */
 
                /* update %cpu for all threads */
                if (dothreads) {
Index: sys/kern/init_main.c
===================================================================
RCS file: /cvs/src/sys/kern/init_main.c,v
retrieving revision 1.277
diff -u -p -r1.277 init_main.c
--- sys/kern/init_main.c        28 Apr 2018 03:13:04 -0000      1.277
+++ sys/kern/init_main.c        23 Jun 2018 00:12:14 -0000
@@ -512,7 +512,7 @@ main(void *framep)
         */
        nanotime(&boottime);
        LIST_FOREACH(pr, &allprocess, ps_list) {
-               pr->ps_start = boottime;
+               getnanouptime(&pr->ps_start);
                TAILQ_FOREACH(p, &pr->ps_threads, p_thr_link) {
                        nanouptime(&p->p_cpu->ci_schedstate.spc_runtime);
                        timespecclear(&p->p_rtime);
Index: sys/kern/kern_acct.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_acct.c,v
retrieving revision 1.36
diff -u -p -r1.36 kern_acct.c
--- sys/kern/kern_acct.c        28 Apr 2018 03:13:04 -0000      1.36
+++ sys/kern/kern_acct.c        23 Jun 2018 00:12:14 -0000
@@ -184,8 +184,8 @@ acct_process(struct proc *p)
        acct.ac_stime = encode_comp_t(st.tv_sec, st.tv_nsec);
 
        /* (3) The elapsed time the command ran (and its starting time) */
-       acct.ac_btime = pr->ps_start.tv_sec;
-       getnanotime(&tmp);
+       getnanouptime(&tmp);
+       acct.ac_btime = time_second - (tmp.tv_sec - pr->ps_start.tv_sec);
        timespecsub(&tmp, &pr->ps_start, &tmp);
        acct.ac_etime = encode_comp_t(tmp.tv_sec, tmp.tv_nsec);
 
Index: sys/kern/kern_fork.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_fork.c,v
retrieving revision 1.203
diff -u -p -r1.203 kern_fork.c
--- sys/kern/kern_fork.c        17 Jun 2018 08:22:02 -0000      1.203
+++ sys/kern/kern_fork.c        23 Jun 2018 00:12:14 -0000
@@ -452,7 +452,7 @@ fork1(struct proc *curp, int flags, void
        /*
         * For new processes, set accounting bits and mark as complete.
         */
-       getnanotime(&pr->ps_start);
+       getnanouptime(&pr->ps_start);
        pr->ps_acflag = AFORK;
        atomic_clearbits_int(&pr->ps_flags, PS_EMBRYO);
 
Index: sys/kern/kern_sysctl.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_sysctl.c,v
retrieving revision 1.343
diff -u -p -r1.343 kern_sysctl.c
--- sys/kern/kern_sysctl.c      20 Jun 2018 10:52:49 -0000      1.343
+++ sys/kern/kern_sysctl.c      23 Jun 2018 00:12:14 -0000
@@ -1602,6 +1602,7 @@ fill_kproc(struct process *pr, struct ki
        struct session *s = pr->ps_session;
        struct tty *tp;
        struct vmspace *vm = pr->ps_vmspace;
+       struct timespec elapsed, now, started, uptime;
        struct timespec ut, st;
        int isthread;
 
@@ -1633,6 +1634,12 @@ fill_kproc(struct process *pr, struct ki
        if ((pr->ps_flags & PS_ZOMBIE) == 0) {
                if ((pr->ps_flags & PS_EMBRYO) == 0 && vm != NULL)
                        ki->p_vm_rssize = vm_resident_count(vm);
+               getnanouptime(&uptime);
+               timespecsub(&uptime, &pr->ps_start, &elapsed);
+               getnanotime(&now);
+               timespecsub(&now, &elapsed, &started);
+               ki->p_ustart_sec = started.tv_sec;
+               ki->p_ustart_usec = started.tv_nsec / 1000;
                calctsru(isthread ? &p->p_tu : &pr->ps_tu, &ut, &st, NULL);
                ki->p_uutime_sec = ut.tv_sec;
                ki->p_uutime_usec = ut.tv_nsec/1000;
Index: sys/sys/proc.h
===================================================================
RCS file: /cvs/src/sys/sys/proc.h,v
retrieving revision 1.249
diff -u -p -r1.249 proc.h
--- sys/sys/proc.h      17 Jun 2018 08:22:02 -0000      1.249
+++ sys/sys/proc.h      23 Jun 2018 00:12:14 -0000
@@ -230,7 +230,7 @@ struct process {
 #define ps_endcopy     ps_refcnt
        int     ps_refcnt;              /* Number of references. */
 
-       struct  timespec ps_start;      /* starting time. */
+       struct  timespec ps_start;      /* starting boot offset. */
        struct  timeout ps_realit_to;   /* real-time itimer trampoline. */
 };
 
Index: sys/sys/sysctl.h
===================================================================
RCS file: /cvs/src/sys/sys/sysctl.h,v
retrieving revision 1.178
diff -u -p -r1.178 sysctl.h
--- sys/sys/sysctl.h    19 Jun 2018 19:29:52 -0000      1.178
+++ sys/sys/sysctl.h    23 Jun 2018 00:12:14 -0000
@@ -516,7 +516,8 @@ struct kinfo_vmentry {
  *     sa - source struct sigacts
  * There are some members that are not handled by these macros
  * because they're too painful to generalize: p_ppid, p_sid, p_tdev,
- * p_tpgid, p_tsess, p_vm_rssize, p_u[us]time_{sec,usec}, p_cpuid
+ * p_tpgid, p_tsess, p_vm_rssize, p_u[us]time_{sec,usec}, p_cpuid,
+ * p_ustart_{sec,usec}
  */
 
 #define PTRTOINT64(_x) ((u_int64_t)(u_long)(_x))
@@ -629,9 +630,6 @@ do {                                                        
                \
                struct timeval tv;                                      \
                                                                        \
                (kp)->p_uvalid = 1;                                     \
-                                                                       \
-               (kp)->p_ustart_sec = (pr)->ps_start.tv_sec;             \
-               (kp)->p_ustart_usec = (pr)->ps_start.tv_nsec/1000;      \
                                                                        \
                (kp)->p_uru_maxrss = (p)->p_ru.ru_maxrss;               \
                (kp)->p_uru_ixrss = (p)->p_ru.ru_ixrss;                 \

Reply via email to