threadpool(9) -- an API for scheduling jobs on shared pools of threads

Jason Thorpe Wed, 26 Dec 2018 08:11:32 -0800

Hey folks ...

The other day I checked in a new kernel API - threadpool(9) - an API for 
scheduling jobs on shared pools of threads.  This new API was written by Taylor 
Campbell, but had been languishing as a WIP draft for some years now.  As it 
happens, I have a need for the (forthcoming, still being debugged) task(9) API, 
also written by Taylor, that's built on top of threadpool(9), so I decided to 
whip it into shape.


threadpool(9) basically makes it easy to create jobs that run in thread 
context, but that don't necessarily need to have a thread waiting around all 
the time to do work.  Kernel threads are created as-needed, and the threads 
will idle out and exit after a period of time if there is no work for them to 
do.  Thread pools are a shared system resource, and you can use unbound pools 
(that have no particular CPU affinity) or pools that are bound to a specific 
CPU (in the case of per-CPU pools, each CPU gets its own private pool for each 
priority).

The pools themselves are also created on-demand, only when requested by 
something else.  When requesting a reference to a pool, the caller specifies 
the priority that the threads should run at: PRI_NONE (the default timesharing 
priority) or up to MAXPRI_KERNEL_RT.
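
To make the "created on-demand, shared by priority" idea concrete, here is a rough userspace sketch.  It is a toy model, not the kernel implementation: toy_pool, toy_pool_get(), toy_pool_put() and TOY_NPRI are all names made up for illustration, and freeing a pool when its last reference goes away is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NPRI 4	/* hypothetical number of priority levels */

/* Hypothetical stand-in for a shared pool; not the kernel's threadpool_t. */
struct toy_pool {
	int pri;
	unsigned refcnt;
};

static struct toy_pool *toy_pools[TOY_NPRI];	/* all NULL until requested */

/* Return the shared pool for pri, creating it on first request. */
static struct toy_pool *
toy_pool_get(int pri)
{
	if (pri < 0 || pri >= TOY_NPRI)
		return NULL;
	if (toy_pools[pri] == NULL) {
		struct toy_pool *p = calloc(1, sizeof(*p));
		if (p == NULL)
			return NULL;
		p->pri = pri;
		toy_pools[pri] = p;
	}
	toy_pools[pri]->refcnt++;
	return toy_pools[pri];
}

/* Drop a reference; in this sketch the pool is freed with its last user. */
static void
toy_pool_put(struct toy_pool *p)
{
	assert(p != NULL && p->refcnt > 0);
	if (--p->refcnt == 0) {
		toy_pools[p->pri] = NULL;
		free(p);
	}
}
```

Two callers asking for the same priority get the same pool back, and a priority nobody has asked for costs nothing.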

The threadpool(9) work abstraction is built around the concept of a "job", 
threadpool_job_t.  This is an opaque structure that the caller needs to 
allocate storage for.  A job can be scheduled on a pool from any context, 
including hard interrupt context up to and including IPL_VM.  The job will run 
until completion, and can take an arbitrarily long time, and sleep an 
arbitrarily long time; additional threads will be created in the pool for other 
jobs on-demand.  Note: there is no hard limit on the number of threads a pool 
will create.  Once scheduled, a job cannot be scheduled again until it has 
completed, at which point it needs to notify the system of this fact by calling 
threadpool_job_done().  Job cancellation is possible if the job has not yet 
run, but once a job is running, cancellation must wait for it to complete.  
Among other things, this provides a deterministic way to ensure that a job is 
not running.  More information on job lifecycle and cancellation semantics can 
be found in the man page.
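
The lifecycle rules above can be pictured as a small state machine.  The following is only a userspace toy model under my own assumptions: the toy_* names and the three states are hypothetical, there is no locking, and the real semantics are the ones in the man page.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not the kernel API. */
enum job_state { JOB_IDLE, JOB_SCHEDULED, JOB_RUNNING };

struct toy_job {
	enum job_state state;
};

/* A job that is already pending or running is not scheduled again. */
static bool
toy_schedule(struct toy_job *job)
{
	if (job->state != JOB_IDLE)
		return false;
	job->state = JOB_SCHEDULED;
	return true;
}

/* A pool thread picks the job up and starts running it. */
static void
toy_start(struct toy_job *job)
{
	assert(job->state == JOB_SCHEDULED);
	job->state = JOB_RUNNING;
}

/* The job reports completion; only now can it be scheduled again. */
static void
toy_done(struct toy_job *job)
{
	assert(job->state == JOB_RUNNING);
	job->state = JOB_IDLE;
}

/*
 * Non-blocking cancel: succeeds if the job has not started running.
 * Returns false for a running job, where the caller has to wait for
 * completion instead.
 */
static bool
toy_cancel(struct toy_job *job)
{
	if (job->state == JOB_RUNNING)
		return false;
	job->state = JOB_IDLE;
	return true;
}
```

Once a cancel succeeds, or once a running job has reported completion, the job is known not to be running, which is the deterministic guarantee mentioned above.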

Job functions provided by the caller are passed a pointer to the 
threadpool_job_t corresponding to the work they're doing.  It is expected that 
this threadpool_job_t is embedded in the caller's state structure, and this 
state can be recovered by using the "container-of" access pattern, e.g.:

struct my_job_state {
        kmutex_t mutex;
        int some_counter;
        threadpool_job_t the_job;
};

threadpool_t *unbound_lopri_pool;

...

void
my_job_setup_routine(struct my_job_state *state)
{
        ...
        error = threadpool_get(&unbound_lopri_pool, PRI_NONE);
        ...
        /* The trailing argument is a printf-style job name; "myjob" is illustrative. */
        threadpool_job_init(&state->the_job, my_job_function, &state->mutex,
            "myjob");
        ...
}

void
my_interrupt_handler(struct my_job_state *state)
{
        ...
        mutex_enter(&state->mutex);
        ...
        threadpool_schedule_job(unbound_lopri_pool, &state->the_job);
        ...
        mutex_exit(&state->mutex);
        ...
}

void
my_job_function(threadpool_job_t *job)
{
        struct my_job_state *state =
            container_of(job, struct my_job_state, the_job);

        /* state->mutex is unlocked upon entering job */

        /* do whatever needs to be done. */

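        /* threadpool_job_done() expects the job's interlock to be held. */
        mutex_enter(&state->mutex);
        threadpool_job_done(job);
        mutex_exit(&state->mutex);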

        /*
         * The thread we ran on goes idle and stays available for future
         * jobs until it times out and exits.
         */
}
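
If the "container-of" pattern is unfamiliar, here is a self-contained userspace sketch of just that part.  The macro below is a simple stand-in for the kernel's container_of(), and struct fake_job is a made-up placeholder for the opaque threadpool_job_t; neither is the real definition.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); illustrative only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical placeholder for the opaque threadpool_job_t. */
struct fake_job {
	int placeholder;
};

struct my_job_state {
	int some_counter;
	struct fake_job the_job;	/* embedded in the state, not a pointer */
};

/* A job function only receives a pointer to the embedded job... */
static int
job_function(struct fake_job *job)
{
	/* ...and steps back from it to the enclosing state structure. */
	struct my_job_state *state =
	    container_of(job, struct my_job_state, the_job);

	return ++state->some_counter;
}
```

This only works because the job is embedded in the state structure; if the structure held a pointer to a separately allocated job, there would be nothing to subtract an offset from.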

In addition to the foundation for task(9) (and eventually workqueue(9) -- 
Taylor already has a prototype implementation of that API on top of 
threadpool(9)), this API could be very useful for some of the other uses of 
ephemeral or mostly-idle kernel threads... the two that jumped into my mind 
immediately were scsibus_discover_thread and scsipi_completion_thread ... each 
of those could be easily converted to jobs running on an unbound PRI_NONE 
thread pool.  atabusconfig_thread and atabus_thread are other obvious 
candidates.  I actually think those four threads are great examples of where 
to use threadpool(9) directly: their work is infrequent but potentially 
arbitrarily long-running, and thus not suitable for task(9).  If 
someone wants to tackle those, I'd be happy to answer questions about how to 
use threadpool(9) in those situations (or perhaps I'll just do it so that 
there's a concrete in-tree example of how to use the API that's not as complex 
as task(9) is).

Anyway, feel free to reach out if you have questions!  And kudos to Taylor for 
the great work... it didn't require much effort to get the 
never-even-compile-tested draft up and running.

-- thorpej
