Hi - 

I'm working on some 2LS changes to support c++ threads, and I figured
you all might be interested in the details.  Standard disclaimer: all
of this is subject to change, etc.

A bit of history: the original 2LS, before there was even the uthread
library, was pthreads.  I wrote it against the vcore interface to get a
sense for what a 2LS needed to do.  Back in 2011, I drew a line down
the middle and extracted things that were generic to any thread-based
2LS (compared to a 2LS that doesn't even have threads, which
conceptually was an option).  The common 2LS code was named
'uthreads.'  (See b16299dc282b ("Pulled code specific to all 2LS out of
pthread.c") for details).  This line between uthreads and specific 2LSs
needs to move (and we have moved it over the years).

At the time, it wasn't clear what the "gateway" API would be to the
2LSs.  Should apps call uthread_create() to get a thread, or should
they call a 2LS-specific function, e.g. pthread_create()?  The app
knows about its 2LS, in theory, and not all 2LSs will need pthread's
interface.  For a modern example, the VMM has multiple types of
threads, including the guest thread and its controller buddy.

For the most part, we settled on letting the 2LS manage its own API
(e.g. pthreads), since the specific 2LS rides on top of uthreads.
Uthreads would provide an API that the 2LS uses.  For example,
applications don't call uthread_yield().  They call pthread_yield(),
which uses uthread_yield().  There is no uthread_exit(); threads exit via
pthread_exit(), which also calls uthread_yield().  (uthread_yield() is
the approved solution for cooperatively switching from uthread context
to vcore context, complete with a 'yield callback' that says what to do
after the thread cooperatively yielded).
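
To make the yield-callback idea concrete, here is a rough sketch in C.
All of the sketch_* names are made up for illustration and are not the
real parlib/pthread code.  There is no real context switch here: the
stand-in for uthread_yield() just runs the callback directly, and the
stand-in for pthread_exit() returns to its caller, which the real one
never does.

```c
#include <stddef.h>

enum sketch_state { SK_RUNNING, SK_RUNNABLE, SK_EXITED };

/* Hypothetical thread struct; the real uthread carries much more. */
struct sketch_uthread {
	enum sketch_state state;
};

/* Says what to do with the thread once it is off the core. */
typedef void (*sketch_yield_cb)(struct sketch_uthread *uth, void *arg);

/* The thread currently running on this (pretend) vcore. */
static struct sketch_uthread *sketch_current;

/* Stand-in for uthread_yield().  A real version saves the caller's
 * context and switches to vcore context before running the callback. */
static void sketch_uthread_yield(sketch_yield_cb cb, void *arg)
{
	struct sketch_uthread *uth = sketch_current;

	if (!uth)
		return;
	sketch_current = NULL;	/* the vcore no longer runs a uthread */
	cb(uth, arg);
}

/* Yield callback: the thread wants to run again later. */
static void sketch_requeue_cb(struct sketch_uthread *uth, void *arg)
{
	(void)arg;
	uth->state = SK_RUNNABLE;
}

/* Exit callback: the thread is done and must never be rescheduled. */
static void sketch_exit_cb(struct sketch_uthread *uth, void *arg)
{
	(void)arg;
	uth->state = SK_EXITED;
}

/* The 2LS-facing API: both entry points are thin wrappers that differ
 * only in the callback they hand to the common yield path. */
void sketch_pthread_yield(void)
{
	sketch_uthread_yield(sketch_requeue_cb, NULL);
}

void sketch_pthread_exit(void)
{
	sketch_uthread_yield(sketch_exit_cb, NULL);
}
```

The point of the shape is that the 2LS decides policy in the callback,
while the common code owns the switch out of uthread context.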

That line has worked alright, and if not, we figured we could always
change it.  It looks like it's time to change it.

Fast forward six years, and things have changed a little.  On occasion,
we've needed ways for user libraries to make 2LS-specific calls, but
without knowing the name of the 2LS.  The best example is the uthread
mutexes, e.g. uth_mutex_lock().  This is a front-end function that
either calls a 2LS op for a mutex or uses a default implementation.
Any code can call uth_mutex_lock().
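
Roughly, that front-end pattern looks like the sketch below.  The
sketch_* types and names are hypothetical, not the actual parlib
definitions, and the "locking" is just a flag so that the dispatch is
easy to see; a real default would block the calling uthread when the
mutex is contended.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mutex; the flags exist only to show which path ran. */
struct sketch_mutex {
	bool locked;
	bool via_2ls;
};

/* Hypothetical table of optional 2LS overrides.  NULL = use default. */
struct sketch_2ls_ops {
	void (*mutex_lock)(struct sketch_mutex *m);
};

/* Set by whichever 2LS the application linked in. */
static struct sketch_2ls_ops *sketch_ops;

/* Default implementation, used when the 2LS has no opinion. */
static void sketch_default_mutex_lock(struct sketch_mutex *m)
{
	m->locked = true;
}

/* Front end: any library can call this without naming the 2LS. */
void sketch_uth_mutex_lock(struct sketch_mutex *m)
{
	if (sketch_ops && sketch_ops->mutex_lock) {
		sketch_ops->mutex_lock(m);
		return;
	}
	sketch_default_mutex_lock(m);
}

/* Example 2LS that supplies its own mutex op. */
static void sketch_example_mutex_lock(struct sketch_mutex *m)
{
	m->locked = true;
	m->via_2ls = true;
}

struct sketch_2ls_ops sketch_example_2ls = {
	.mutex_lock = sketch_example_mutex_lock,
};
```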

Now we have another use case: gcc.  Gcc's c++ implementation requires a
threading model.  We've been telling it to use POSIX threads.  See
stage 3 of the gcc build in the toolchain's Makefile.  That was
actually buggy, and C++ apps using shared_ptrs thought they were single
threaded - resulting in a rather brutal bug.  There's a minimal
interface in gcc for threading, which we mostly support already: DTLS,
mutexes, etc.  However, apparently that's not enough, and any c++
app of consequence (say, using std::mutex) requires the full interface
in gcc-4.9.2/libgcc/gthr.h.  This includes thread creation, joining,
detaching, condvars, etc.  Here you go:

/* If this file is compiled with threads support, it must
       #define __GTHREADS 1
   to indicate that threads support is present.  Also it has define
   function
     int __gthread_active_p ()
   that returns 1 if thread system is active, 0 if not.

   The threads interface must define the following types:
     __gthread_key_t
     __gthread_once_t
     __gthread_mutex_t
     __gthread_recursive_mutex_t

   The threads interface must define the following macros:

     __GTHREAD_ONCE_INIT
                to initialize __gthread_once_t
     __GTHREAD_MUTEX_INIT
                to initialize __gthread_mutex_t to get a fast
                non-recursive mutex.
     __GTHREAD_MUTEX_INIT_FUNCTION
                to initialize __gthread_mutex_t to get a fast
                non-recursive mutex.
                Define this to a function which looks like this:
                  void __GTHREAD_MUTEX_INIT_FUNCTION (__gthread_mutex_t *)
                Some systems can't initialize a mutex without a
                function call.  Don't define __GTHREAD_MUTEX_INIT in this case.
     __GTHREAD_RECURSIVE_MUTEX_INIT
     __GTHREAD_RECURSIVE_MUTEX_INIT_FUNCTION
                as above, but for a recursive mutex.

   The threads interface must define the following static functions:

     int __gthread_once (__gthread_once_t *once, void (*func) ())

     int __gthread_key_create (__gthread_key_t *keyp, void (*dtor) (void *))
     int __gthread_key_delete (__gthread_key_t key)

     void *__gthread_getspecific (__gthread_key_t key)
     int __gthread_setspecific (__gthread_key_t key, const void *ptr)

     int __gthread_mutex_destroy (__gthread_mutex_t *mutex);
     int __gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t *mutex);

     int __gthread_mutex_lock (__gthread_mutex_t *mutex);
     int __gthread_mutex_trylock (__gthread_mutex_t *mutex);
     int __gthread_mutex_unlock (__gthread_mutex_t *mutex);

     int __gthread_recursive_mutex_lock (__gthread_recursive_mutex_t *mutex);
     int __gthread_recursive_mutex_trylock (__gthread_recursive_mutex_t *mutex);
     int __gthread_recursive_mutex_unlock (__gthread_recursive_mutex_t *mutex);

   The following are supported in POSIX threads only. They are required to
   fix a deadlock in static initialization inside libsupc++. The header file
   gthr-posix.h defines a symbol __GTHREAD_HAS_COND to signify that these extra
   features are supported.

   Types:
     __gthread_cond_t

   Macros:
     __GTHREAD_COND_INIT
     __GTHREAD_COND_INIT_FUNCTION

   Interface:
     int __gthread_cond_broadcast (__gthread_cond_t *cond);
     int __gthread_cond_wait (__gthread_cond_t *cond, __gthread_mutex_t *mutex);
     int __gthread_cond_wait_recursive (__gthread_cond_t *cond,
                                        __gthread_recursive_mutex_t *mutex);

   All functions returning int should return zero on success or the error
   number.  If the operation is not supported, -1 is returned.

   If the following are also defined, you should
     #define __GTHREADS_CXX0X 1
   to enable the c++0x thread library.

   Types:
     __gthread_t
     __gthread_time_t

   Interface:
     int __gthread_create (__gthread_t *thread, void *(*func) (void*),
                           void *args);
     int __gthread_join (__gthread_t thread, void **value_ptr);
     int __gthread_detach (__gthread_t thread);
     int __gthread_equal (__gthread_t t1, __gthread_t t2);
     __gthread_t __gthread_self (void);
     int __gthread_yield (void);

     int __gthread_mutex_timedlock (__gthread_mutex_t *m,
                                    const __gthread_time_t *abs_timeout);
     int __gthread_recursive_mutex_timedlock (__gthread_recursive_mutex_t *m,
                                          const __gthread_time_t *abs_time);

     int __gthread_cond_signal (__gthread_cond_t *cond);
     int __gthread_cond_timedwait (__gthread_cond_t *cond,
                                   __gthread_mutex_t *mutex,
                                   const __gthread_time_t *abs_timeout);

*/


So it looks like we'll need a 2LS-independent interface to access those
functions.  And that will also require that uthreads knows how to ask a
specific 2LS for something - e.g. "create a thread."  For example, with
the VMM 2LS, "create a thread" would be a task thread, not a guest or
controller thread.

On the plus side, all 2LSs will get join/detach/equal/etc for free, if
they want to use the generic interface.  This basically moves the line
between uthreads and the specific 2LSs up higher (uthread below, 2LS
above; so more 2LS functionality would move into uthreads).

So I'm picturing a set of generic thread functions in uthread.c
(e.g. gth_create) that will call into an extended set of 2LS
operations.  The 'g' is for generic, and also it matches gcc's usage
(where I imagine the 'g' is for gcc).  We'll see.  (More on this below).
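
As a sketch of what I mean, with invented names throughout (the op
table, the thread struct, and the VMM-like example are all assumptions,
not existing code): the generic create asks the 2LS for a thread, and
the 2LS decides what kind of thread that is.  Return values follow the
gthr.h convention quoted above: 0 on success, an error number on
failure, -1 if unsupported.  The sketch only hands back a thread
object; it does not schedule or run anything.

```c
#include <errno.h>
#include <stddef.h>

enum sketch_kind { SK_GUEST, SK_CTLR, SK_TASK };

/* Hypothetical thread object owned by the 2LS. */
struct sketch_thread {
	enum sketch_kind kind;
	void *(*func)(void *);
	void *arg;
};

/* Hypothetical extended 2LS op table. */
struct sketch_2ls_ops {
	struct sketch_thread *(*thread_create)(void *(*func)(void *),
	                                       void *arg);
};

struct sketch_2ls_ops *sketch_ops;

/* Generic create: 2LS-independent entry point. */
int sketch_gth_create(struct sketch_thread **thread,
                      void *(*func)(void *), void *arg)
{
	struct sketch_thread *t;

	if (!sketch_ops || !sketch_ops->thread_create)
		return -1;	/* operation not supported */
	t = sketch_ops->thread_create(func, arg);
	if (!t)
		return EAGAIN;	/* 2LS is out of resources */
	*thread = t;
	return 0;
}

/* Generic equal: needs no 2LS op at all. */
int sketch_gth_equal(struct sketch_thread *t1, struct sketch_thread *t2)
{
	return t1 == t2;
}

/* Example VMM-like 2LS: a generic "create" always yields a task
 * thread, never a guest or controller thread. */
#define SKETCH_MAX_TASKS 4
static struct sketch_thread sketch_tasks[SKETCH_MAX_TASKS];
static size_t sketch_nr_tasks;

static struct sketch_thread *sketch_vmm_thread_create(void *(*func)(void *),
                                                      void *arg)
{
	struct sketch_thread *t;

	if (sketch_nr_tasks == SKETCH_MAX_TASKS)
		return NULL;
	t = &sketch_tasks[sketch_nr_tasks++];
	t->kind = SK_TASK;
	t->func = func;
	t->arg = arg;
	return t;
}

struct sketch_2ls_ops sketch_vmm_ops = {
	.thread_create = sketch_vmm_thread_create,
};
```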


There's another related item that's been on my radar for a while now:
libraries not written by us that want to use threads.  Imagine the VMM
(or some other 2LS-using app) wants to use a library from Linux.  Odds
are, that library was written to use pthreads.  The pthreads API, love
it or hate it, is the de facto way that libraries access threads, and we
would like it to be available regardless of the 2LS that is in use.
Currently, this is not possible, since there can be only one 2LS, and
pthreads is a 2LS.

This means that pthreads would need to no longer be a 2LS.  The 2LS
currently known as pthreads will become something like bthreads (basic)
or dthreads (default).  For $100 you can pick the letter.  The
pthread.h interface will then call into the generic uthread interface.

There are a couple issues.  First, we could make the generic interface
(e.g. gth_create) just be called pthreads.  The downside is that we are
stuck with pthread's interface.  If we split them, we can have a more
expressive gthread interface.  Additionally, we can have gthreads
handle *less* than pthreads (which might be important), if we otherwise
deal with the remainder of pthreads.

Which brings us to the other issue: the extra stuff associated with
pthreads that isn't needed by gcc.  For starters, we have futex.h and
semaphore.h.  Those are pretty minor and can be adapted to work with any
2LS.  That'd be a good change.  

The bigger issue is all the extra pthreads crap, like
pthread_attr_{get,set}sched{whatever}().  Do we really want every 2LS
to have to respond to the list of pthread scheduling hints (e.g.
SCHED_FIFO)?  And what if we wanted to have more hints?  Pushing that
much crap "below the line" seems to me like it'd hurt the ability to
innovate at the 2LS - we'd be forcing it to do whatever POSIX says it
should do.

An alternative is to support the basic pthreads interface (i.e.,
anything that gcc needs or that isn't invasive), and then let specific
2LSs implement the rest of the pthreads interface as needed.  It'll be up
to those 2LSs if they want to bother.  My hunch is that any library
using pthread-specific scheduling code isn't going to be too helpful
when working with a specific 2LS, since pthreads/Linux/POSIX
assumptions are not valid.

So concretely, I'm currently planning:
- parlib_once() function, like pthread_once.  (unrelated to threading,
  actually).  This will do the work of run_once(), and
  userspace's run_once() will be replaced by parlib_once()

- Change mutex alloc/initialization.  Right now, they are
  allocated/initialized by the 2LS, then used.  This doesn't work with
  pthreads (though it does work with gcc), since pthreads wants a static
  initializer (PTHREAD_MUTEX_INITIALIZER).  The new version will
  still have an allocation/init 2LS op, but calling it is controlled
  via a 'once' variable.  The 2LS-specific allocation/init is useful in
  case the 2LS wants to use its own data structure for the waiters.
  Say, a priority heap instead of a TAILQ.  As Krste says,
  "synchronization is scheduling".

- Change the mutex/cond var 2LS ops.  Right now, they have ops for lock
  and unlock.  The new ones will be for dealing with a thread that
  blocked on an object (the object allocated above) and for picking a
  thread from the object to wake up.  This will be the same interface
  used for both mutexes/sems and CVs: "a thread blocked on a sync
  object" and "it's time to wake thread(s) from an object."  The 2LS
  manages the object, and uthreads manages the vcore context business.

  I might be able to repurpose the existing thread_has_blocked 2LS op
  for this.  The one difficult holdout is the event code, which also
  uses it.  I'll come up with something that works for all cases.  I
  think the trick will be to have the 2LS wakeup op be about choosing a
  thread, not about waking it up.  That way it can fit with the evq
  blockon code (see user/parlib/event.c L 668).  (There are other
  complicated issues with the evq herd wakeup.  For now, I can use the
  "wake all" 2LS op).

- Under the hood, make the mutexes actually be semaphores of value 1.

- Add timeouts to condvars and mutexes.  This will require another 2LS
  op for "wake a particular thread from a sync object."  But otherwise,
  the timeout and whatnot is handled in the uthread layer, which is
  nice.  (Each 2LS doesn't have to reimplement the same features, which
  is the point of uthread.c).

- Add gth_create, join, detach, equal, self, and yield functions at the
  uthread level, with callbacks to the 2LS sched_ops where needed.
  Some of these, such as join and detach, can just come from pthread.c
  and might not require a 2LS op.  Or maybe call them 'uthread' still,
  and change the existing uthread_yield.

- At this point, gcc should be able to use the akaros threading model,
  since that should be the entire interface from
  gcc-4.9.2/libgcc/gthr.h.

- Generically support futexes and semaphores.  As with all pthread
  stuff, we might not need or want all of semaphore.h, and can leave
  the remnants behind.  (Currently, they just error out).

- Split pthreads into the guts of the scheduler (bthreads, or whatever)
  and the interface.  Functions that are part of the required gcc
  interface will call directly to uthread code.  The others will be a
  bunch of weak functions in pthread.c that pretend to work.  Specific
  2LSs can override them.  bthreads (or whatever) can use the existing
  pthreads code for it.
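
For the mutex alloc/init item in the list above, here is roughly how a
static initializer and a 'once'-guarded 2LS allocation could fit
together.  Everything named sketch_* is hypothetical: this is not
parlib_once() or the real mutex layout, and the spin in the once is a
placeholder for whatever blocking or yielding the real code would do.
It uses C11 atomics.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

enum { SK_ONCE_NEW = 0, SK_ONCE_RUNNING, SK_ONCE_DONE };

struct sketch_once {
	atomic_int state;
};
#define SKETCH_ONCE_INIT { SK_ONCE_NEW }

/* Runs fn(arg) exactly once per 'once' variable.  Late callers wait
 * until the first caller has finished. */
void sketch_once(struct sketch_once *once, void (*fn)(void *), void *arg)
{
	int expected = SK_ONCE_NEW;

	if (atomic_load_explicit(&once->state, memory_order_acquire)
	    == SK_ONCE_DONE)
		return;
	if (atomic_compare_exchange_strong(&once->state, &expected,
	                                   SK_ONCE_RUNNING)) {
		fn(arg);
		atomic_store_explicit(&once->state, SK_ONCE_DONE,
		                      memory_order_release);
		return;
	}
	while (atomic_load_explicit(&once->state, memory_order_acquire)
	       != SK_ONCE_DONE)
		;	/* placeholder: real code would yield, not spin */
}

/* Stand-in for the 2LS's waiter structure (TAILQ, priority heap...). */
struct sketch_waitq {
	int nr_waiters;
};

struct sketch_mutex {
	struct sketch_once once;
	struct sketch_waitq *waitq;	/* allocated lazily by the 2LS */
};
/* Usable like PTHREAD_MUTEX_INITIALIZER: no function call needed. */
#define SKETCH_MUTEX_INITIALIZER { SKETCH_ONCE_INIT, NULL }

int sketch_nr_allocs;	/* counts 2LS allocations, for illustration */

/* Stand-in for the 2LS alloc/init op. */
static struct sketch_waitq *sketch_2ls_waitq_alloc(void)
{
	sketch_nr_allocs++;
	return calloc(1, sizeof(struct sketch_waitq));
}

static void sketch_mutex_init_cb(void *arg)
{
	struct sketch_mutex *m = arg;

	m->waitq = sketch_2ls_waitq_alloc();
}

/* Called at the top of lock/unlock before touching the waiters. */
void sketch_mutex_ensure_init(struct sketch_mutex *m)
{
	sketch_once(&m->once, sketch_mutex_init_cb, m);
}
```

The 2LS still picks the data structure, but callers only ever see the
static initializer.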

Anyway, that's my current plan.  I imagine it'll change as I dig in
more.  The minimum that we need to do is gcc's gthr.h interface, so I
might hold off on the pthread stuff until we get a better feel for
specific library code that is trying to use the pthreads API.  Another
option, after all, is to just change the various libraries, which is
what we did with virtio (though that case was closely coupled with the
VMM).

Barret
