On 23/08/23(Wed) 18:52, Scott Cheloha wrote:
> This is the next patch in the clock interrupt reorganization series.

Thanks for your diff.  I'm sorry but it is really hard for me to help
review this diff because there is still no man page for this API+subsystem.

Can we start with that please?

> This patch moves the entry points for the interval and profile dt(4)
> providers from the hardclock(9) to a dedicated clock interrupt
> callback, dt_prov_profile_intr(), in dev/dt/dt_prov_profile.c.
> 
> - To preserve current behavior, (1) both provider entrypoints have
>   been moved into a single callback function, (2) the interrupt runs at
>   the same frequency as the hardclock, and (3) the interrupt is
>   staggered to co-occur with the hardclock on a given CPU.

The only behavior that needs to be preserved is the output of dumping
stacks.  That means DT_FA_PROFILE and DT_FA_STATIC certainly needs to
be adapted with this change.  You can figure that out by looking at the
output of /usr/src/share/btrace/kprofile.bt without and with this diff.

Please generate a FlameGraph to make sure they're still the same.

Apart from that I'd prefer if we could skip the mechanical change and
go straight to what dt(4) needs.  Otherwise we will have to re-design
everything.   If you don't want to do this work, then leave it and tell
me what you need and what is your plan so I can help you and do it
myself.

dt(4) needs a way to schedule two different kind of periodic timeouts
with the higher precision possible.  It is currently plugged to hardclock
because there is nothing better.

The current code assumes the periodic entry points are external to dt(4).
This diff moves them in the middle of dt(4) but keeps the existing flow
which makes the code very convoluted. 
A starting point to understand the added complexity it so see that the
DT_ENTER() macro are no longer needed if we move the entry points inside
dt(4).

The first periodic timeout is dt_prov_interval_enter().  It could be
implemented as a per-PCB timeout_add_nsec(9).  The drawback of this
approach is that it uses too much code in the kernel which is a problem
when instrumenting the kernel itself.  Every subsystem used by dt(4) is
impossible to instrument with btrace(8).

The second periodic timeout it dt_prov_profile_enter().  It is similar
to the previous one and has to run on every CPU.

Both are currently bound to tick, but we want per-PCB time resolution.
We can get rid of `dp_nticks' and `dp_maxtick' if we control when the
timeouts fires.

> - Each dt_pcb gets a provider-specific "dp_clockintr" pointer.  If the
>   PCB's implementing provider needs a clock interrupt to do its work, it
>   stores the handle in dp_clockintr.  The handle, if any, is established
>   during dtpv_alloc() and disestablished during dtpv_dealloc().

Sorry, but as I said I don't understand what is a clockintr.  How does it
fit in the kernel and how is it supposed to be used?

Why have it per PCB and not per provider or for the whole driver instead?
Per-PCB implies that if I run 3 different profiling on a 32 CPU machines
I now have 96 different clockintr.  Is it what we want?  If so, why not
simply use timeout(9)?  What's the difference?

>   One alternative is to start running the clock interrupts when they
>   are allocated in dtpv_alloc() and stop them when they are freed in
>   dtpv_dealloc().  This is wasteful, though: the PCBs are not recording
>   yet, so the interrupts won't perform any useful work until the
>   controlling process enables recording.
> 
>   An additional pair of provider hooks, e.g. "dtpv_record_start" and
>   "dtpv_record_stop", might resolve this.

Another alternative would be to have a single hardclock-like handler for
dt(4).  Sorry, I don't understand the big picture behind clockintr, so I
can't help.  

> - We haven't needed to destroy clock interrupts yet, so the
>   clockintr_disestablish() function used in this patch is new.

So maybe that's fishy.  Are they supposed to be destroyed?  What is the
goal of this API?

Reply via email to