Hey folks ...
The other day I checked in a new kernel API - threadpool(9) - an API for
scheduling jobs on shared pools of threads. This new API was written by Taylor
Campbell, but had been languishing as a WIP draft for some years now. As it
happens, I have a need for the (forthcoming, still being debugged) task(9) API,
also written by Taylor, that's built on top of threadpool(9), so I decided to
whip it into shape.
threadpool(9) basically makes it easy to create jobs that run in thread
context, but that don't necessarily need to have a thread waiting around all
the time to do work. Kernel threads are created as needed, and the threads
will idle out and exit after a period of time if there is no work for them to
do. Thread pools are a shared system resource, and you can use either unbound
pools (which have no particular CPU affinity) or pools that are bound to a
specific CPU (in the per-CPU case, each CPU gets its own private pool for each
priority).
The pools themselves are also created on demand, only when something
requests one. When requesting a reference to a pool, the caller specifies
the priority that the threads should run at: either PRI_NONE (the default
timesharing priority) or a specific priority up to MAXPRI_KERNEL_RT.
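
To make the "created on demand, one pool per priority" idea concrete, here is a small userland model. It is only a sketch of the bookkeeping, not the kernel implementation: every name (model_pool_get, model_pool_put, MODEL_PRI_NONE, MODEL_PRI_MAX) and the priority range are made up for illustration, and freeing the pool when the last reference is dropped is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical priority range for this model; not the kernel's values. */
#define MODEL_PRI_NONE  (-1)
#define MODEL_PRI_MAX   7
#define MODEL_NPOOLS    (MODEL_PRI_MAX - MODEL_PRI_NONE + 1)

struct model_pool {
    int      pri;     /* priority this pool's threads would run at */
    unsigned refcnt;  /* number of outstanding references */
};

/* One slot per priority; NULL until somebody asks for that pool. */
static struct model_pool *model_pools[MODEL_NPOOLS];

/* Return 0 and a referenced pool, creating it on first use; -1 on error. */
int
model_pool_get(struct model_pool **poolp, int pri)
{
    if (pri < MODEL_PRI_NONE || pri > MODEL_PRI_MAX)
        return -1;

    struct model_pool **slot = &model_pools[pri - MODEL_PRI_NONE];
    if (*slot == NULL) {
        /* First request at this priority: create the pool now. */
        *slot = calloc(1, sizeof(**slot));
        if (*slot == NULL)
            return -1;
        (*slot)->pri = pri;
    }
    (*slot)->refcnt++;
    *poolp = *slot;
    return 0;
}

/* Drop a reference; the model frees the pool when the last one goes away. */
void
model_pool_put(struct model_pool *pool)
{
    assert(pool != NULL && pool->refcnt > 0);
    if (--pool->refcnt == 0) {
        model_pools[pool->pri - MODEL_PRI_NONE] = NULL;
        free(pool);
    }
}
```

Two callers asking for the same priority get the same pool back, which is the "shared system resource" property described above.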
The threadpool(9) work abstraction is built around the concept of a "job",
threadpool_job_t. This is an opaque structure that the caller needs to
allocate storage for. A job can be scheduled on a pool from any context,
including hard interrupt context up to and including IPL_VM. The job runs
to completion; it can take an arbitrarily long time and sleep for an
arbitrarily long time, and additional threads will be created in the pool on
demand for other jobs. Note: there is no hard limit on the number of threads a
pool will create. Once scheduled, a job cannot be scheduled again until it has
completed, and the job function must announce completion by calling
threadpool_job_done(). Job cancellation is possible if the job has not yet
run, but once a job is running, cancellation must wait for it to complete.
Among other things, this provides a deterministic way to ensure that a job is
not running. More information on job lifecycle and cancellation semantics can
be found in the man page.
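
The lifecycle rules above can be pictured as a tiny state machine. The following is a userland toy model, not the kernel code: the model_job_* names and the three states are invented for illustration, and it deliberately ignores locking and the finer points of when a finished job may be rescheduled (the man page is the authority there).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states for the model. */
enum model_job_state { JOB_IDLE, JOB_QUEUED, JOB_RUNNING };

struct model_job {
    enum model_job_state state;
};

/* Schedule the job; returns false if it is already scheduled or running. */
bool
model_job_schedule(struct model_job *job)
{
    if (job->state != JOB_IDLE)
        return false;
    job->state = JOB_QUEUED;
    return true;
}

/* A pool thread picks the job up and starts running its function. */
void
model_job_start(struct model_job *job)
{
    assert(job->state == JOB_QUEUED);
    job->state = JOB_RUNNING;
}

/* The job function announces completion; the job may be scheduled again. */
void
model_job_done(struct model_job *job)
{
    assert(job->state == JOB_RUNNING);
    job->state = JOB_IDLE;
}

/*
 * Try to cancel without waiting.  Returns true if the job is known not
 * to be running afterwards, false if it is running and the caller
 * would have to wait for it to complete.
 */
bool
model_job_cancel_async(struct model_job *job)
{
    if (job->state == JOB_RUNNING)
        return false;
    job->state = JOB_IDLE;
    return true;
}
```

In this model a queued job can be cancelled outright, while cancelling a running job reports that the caller has to wait, which matches the description above.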
Job functions provided by the caller are passed a pointer to the
threadpool_job_t corresponding to the work they're doing. It is expected that
this threadpool_job_t is embedded in the caller's state structure, and this
state can be recovered by using the "container-of" access pattern, e.g.:

    struct my_job_state {
        kmutex_t mutex;
        int some_counter;
        threadpool_job_t the_job;
    };

    threadpool_t *unbound_lopri_pool;

    ...

    void
    my_job_setup_routine(struct my_job_state *state)
    {
        ...
        /* Get a reference to the shared unbound pool at PRI_NONE. */
        error = threadpool_get(&unbound_lopri_pool, PRI_NONE);
        ...
        /* state->mutex serves as the job's interlock. */
        threadpool_job_init(&state->the_job, my_job_function, &state->mutex);
        ...
    }

    void
    my_interrupt_handler(struct my_job_state *state)
    {
        ...
        mutex_enter(&state->mutex);
        ...
        threadpool_schedule_job(unbound_lopri_pool, &state->the_job);
        ...
        mutex_exit(&state->mutex);
        ...
    }

    void
    my_job_function(threadpool_job_t *job)
    {
        struct my_job_state *state =
            container_of(job, struct my_job_state, the_job);

        /* state->mutex is unlocked upon entering job */
        /* do whatever needs to be done. */

        /* threadpool_job_done() expects the job's interlock to be held. */
        mutex_enter(&state->mutex);
        threadpool_job_done(job);
        mutex_exit(&state->mutex);

        /* thread we ran on will be idle'd and available for future jobs until
           timing out and exiting. */
    }

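
If the "container-of" pattern is unfamiliar, here is a self-contained userland sketch of just that part. It assumes nothing about the kernel: demo_container_of, struct demo_job, struct demo_job_state and demo_run_job are stand-ins made up for this example, with demo_job playing the role of the opaque threadpool_job_t.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the opaque job structure; its contents are made up. */
struct demo_job {
    void (*fn)(struct demo_job *);
};

struct demo_job_state {
    int some_counter;
    struct demo_job the_job;    /* embedded, not pointed to */
};

/* Job function: it only receives the job pointer and recovers the state. */
void
demo_job_function(struct demo_job *job)
{
    struct demo_job_state *state =
        demo_container_of(job, struct demo_job_state, the_job);

    state->some_counter++;
}

/* Stand-in for a pool thread invoking the job's function. */
void
demo_run_job(struct demo_job *job)
{
    assert(job != NULL && job->fn != NULL);
    job->fn(job);
}
```

Because the job is embedded in the state structure, no separate "argument" pointer has to be stored or passed; subtracting the member offset from the job's address is enough to get back to the enclosing structure.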
In addition to being the foundation for task(9) (and eventually workqueue(9) --
Taylor already has a prototype implementation of that API on top of
threadpool(9)), this API could be very useful for some of the other uses of
ephemeral or mostly-idle kernel threads. The two that jumped to mind
immediately were scsibus_discover_thread and scsipi_completion_thread; each
of those could easily be converted to jobs running on an unbound PRI_NONE
thread pool. atabusconfig_thread and atabus_thread are other obvious
candidates. I actually think those 4 examples are really good cases for
using threadpool(9) directly, because they run infrequently but can
potentially run arbitrarily long, and are thus not suitable for task(9). If
someone wants to tackle those, I'd be happy to answer questions about how to
use threadpool(9) in those situations (or perhaps I'll just do it so that
there's a concrete in-tree example of how to use the API that's not as complex
as task(9) is).
Anyway, feel free to reach out if you have questions! And kudos to Taylor for
the great work... it didn't require much effort to get the
never-even-compile-tested draft up and running.
-- thorpej