On 02/06/19(Sun) 22:03, Mark Kettenis wrote:
> > Date: Sat, 1 Jun 2019 18:55:20 -0300
> > From: Martin Pieuchot <[email protected]>
> [...] 
> > - First change is to stop calling tsleep(9) at PUSER.  That makes
> >   it clear that all "sleeping priorities" are smaller than PUSER.
> >   That's important to understand for the diff below.  `p_priority'
> >   is currently a placeholder for the "sleeping priority" and the
> >   "runnqueue priority".  Both fields are separated by this diff.
> 
> Separating out the fields is a good idea.  The current way priorities
> are recorded is just confusing.  The use of PUSER vs. PWAIT seems to
> be fairly arbitrary, so that is probably not a big issue.  Except
> maybe for the single-threded signal stuff.  Would be good to get
> guenther@'s thoughts on this bit.
> 
> The PUSER -> PWAIT change isn't really necessary is it?  It just makes
> it easier for you to understand what;s going on when looking at the
> queues.

The problem becomes easier to understand with this change.  They
is currently two places where `p_priority' is updated iff it was
previously >= PUSER.  These two places are schedclock() and schedcpu()
they both "duplicate" the same logic to update `p_priority' just after
having recalculated `p_usrpri'.  So with it we can say that this code
do no apply to the new `p_slpprio' because it is always < PUSER.

Now the question is why does `p_priority' exist and why it some times
reflect `p_usrpri'?  The only places where mi_switch() is called without
updating `p_priority' nor putting the caller on a queue are related to
signals.  To exit the SSTOP state, setrunnable() is called on a thread. 
Since setrunnable() is shared between SSLEEP and SSTOP threads it is not
obvious that `p_priority' is the "sleeping priority" in the first case
and 'p_usrpri' in the second case.

The only exception to this logic are stopped threads with nice(1) value
that have a `p_priority' < PUSER.  In this specific case my diff, that
passes `p_usrpri' as argument to setrunnable(), introduces a change in
behavior.  However I doubt it matters since such threads are generally
exceptions and the only ones with a priority < PUSER.

This can be observed by stopping sndiod(8).

> > [...]
> > - `p_estcpu' `p_usrpri' and `p_slptime' which represent the "priority"
> >   of a thread are now updated while holding a per-thread mutex.  As a
> >   result schedclock() and donice() no longer takes the SCHED_LOCK(),
> >   and schedcpu() almost never take it.
> 
> Need to look closer at how this works.

`p_slptime' could be removed if we want to simplify the logic.  I don't
see the point of deferring the calculation of sleeping/stopped threads.  

> 
> > - With this diff top(1) and ps(1) will report the "real" `p_usrpi' value
> >   when displaying priorities.  This is helpful to understand what's
> >   happening:
> 
> Do you intend to remove that bit before committing this?

I don't know.  Which "priority" should we export then?  The sleep/run
priority?  This would be the most backward-compatible change.  However
the priority that really matters *is* `p_usrpri'.  With this diff it
becomes so much easier to understand what's happening...

Reply via email to