First, thanks to Andrew for referencing my paper. Although it was
written for the great Solaris 9 thread model U-turn, a lot of what it
says still applies today (except that the thread library has now been
folded into libc).
If you're using the DTrace syscall provider you will only see lwp_park.
However, if you're looking at stack traces, truss(1) output or the
OpenSolaris source you may see that the one syscall entry point actually
implements five private syscalls: lwp_park, lwp_unpark, lwp_unpark_all,
lwp_unpark_cancel and lwp_park with schedctl support. Of these, only the
first three are of real interest.
The lwp_park syscalls are designed for the simplest and quickest thread
sleep/wakeup mechanism with intra-process scope. They are used to
implement PTHREAD_PROCESS_PRIVATE (aka USYNC_THREAD) mutexes, condvars,
semaphores and rwlocks.
For inter-process synchronisation (i.e. PTHREAD_PROCESS_SHARED (aka
USYNC_PROCESS) mutexes, condvars, semaphores and rwlocks) you will see
object specific syscalls (e.g. lwp_mutex_lock, lwp_cond_broadcast,
lwp_sema_post, lwp_rwlock_unlock).
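
To make the two scopes concrete, here is a minimal sketch of how an application picks one when it creates a mutex. It uses only the standard POSIX attribute calls. The helper name init_scoped_mutex is made up for this example, and a PTHREAD_PROCESS_SHARED mutex is only useful if it lives in memory that the cooperating processes actually share (not shown here).

```c
#include <pthread.h>

/*
 * Hypothetical helper: initialise a mutex with the requested scope.
 * pshared is PTHREAD_PROCESS_PRIVATE or PTHREAD_PROCESS_SHARED.
 * Returns 0 on success, otherwise an errno value.
 */
int init_scoped_mutex(pthread_mutex_t *m, int pshared)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_setpshared(&attr, pshared);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

A mutex created with PTHREAD_PROCESS_PRIVATE is the kind that would be expected to show up as lwp_park activity when contended, while one created with PTHREAD_PROCESS_SHARED would go through the object specific syscalls.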
PTHREAD_PROCESS_PRIVATE synchronisation uses user-level sleep queues
(implemented in libc). When a thread needs to sleep on a queue, it calls
lwp_park(). When another thread wants to wake just one thread from a
user-level sleep queue it uses lwp_unpark(). To wake all sleepers it
uses lwp_unpark_all(). For PTHREAD_PROCESS_PRIVATE synchronisation the
kernel has no notion of the type of synchronisation object being waited
for; it just provides the simple park/unpark mechanism.
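
The division of labour described above can be sketched roughly as follows. This is an illustration only, not libc's implementation: the types and function names (waiter_t, sleepq_t, park, unpark and the sleepq_* functions) are all hypothetical, and because the lwp_park syscalls are private, park() and unpark() are emulated here with an ordinary mutex and condvar per waiter. The point is the shape: the queue lives in user space, and the "kernel" part only knows how to block and release a specific thread.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical types; libc's real sleep queues are organised differently. */
typedef struct waiter {
    pthread_mutex_t lock;
    pthread_cond_t cv;
    int unparked;               /* set by the waker */
    struct waiter *next;
} waiter_t;

typedef struct {
    pthread_mutex_t lock;       /* protects head, tail and length */
    waiter_t *head, *tail;
    int length;
} sleepq_t;

/* Stand-in for lwp_park(): block until another thread unparks us. */
static void park(waiter_t *w) {
    pthread_mutex_lock(&w->lock);
    while (!w->unparked)
        pthread_cond_wait(&w->cv, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

/* Stand-in for lwp_unpark(): release one specific parked thread. */
static void unpark(waiter_t *w) {
    pthread_mutex_lock(&w->lock);
    w->unparked = 1;
    pthread_cond_signal(&w->cv);
    pthread_mutex_unlock(&w->lock);
}

int sleepq_length(sleepq_t *q) {
    pthread_mutex_lock(&q->lock);
    int n = q->length;
    pthread_mutex_unlock(&q->lock);
    return n;
}

/* Append the caller to the user-level queue, then park until woken. */
void sleepq_sleep(sleepq_t *q) {
    waiter_t w = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, NULL };
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = &w; else q->head = &w;
    q->tail = &w;
    q->length++;
    pthread_mutex_unlock(&q->lock);
    park(&w);
}

/* Wake the longest sleeper; returns the number woken (0 or 1). */
int sleepq_wake_one(sleepq_t *q) {
    pthread_mutex_lock(&q->lock);
    waiter_t *w = q->head;
    if (w) {
        q->head = w->next;
        if (!q->head) q->tail = NULL;
        q->length--;
    }
    pthread_mutex_unlock(&q->lock);
    if (w) unpark(w);
    return w != NULL;
}

/* Detach the whole queue and wake every sleeper; returns the count. */
int sleepq_wake_all(sleepq_t *q) {
    pthread_mutex_lock(&q->lock);
    waiter_t *w = q->head;
    q->head = q->tail = NULL;
    q->length = 0;
    pthread_mutex_unlock(&q->lock);
    int n = 0;
    while (w) {
        waiter_t *next = w->next;   /* read before w's owner can return */
        unpark(w);
        w = next;
        n++;
    }
    return n;
}

static void *sleeper(void *arg) { sleepq_sleep(arg); return NULL; }

/* Park up to 16 threads, wake one, then the rest; returns total woken. */
int sleepq_demo(int n) {
    sleepq_t q = { PTHREAD_MUTEX_INITIALIZER, NULL, NULL, 0 };
    pthread_t tid[16];
    int created = 0;
    if (n > 16) n = 16;
    while (created < n && pthread_create(&tid[created], NULL, sleeper, &q) == 0)
        created++;
    while (sleepq_length(&q) < created)   /* wait until all are queued */
        sched_yield();
    int woken = sleepq_wake_one(&q);
    woken += sleepq_wake_all(&q);
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    return woken;
}
```

Calling sleepq_demo(4) parks four threads, wakes the first with sleepq_wake_one() and the remaining three with sleepq_wake_all(). Note that the waker never needs to know whether the queue belongs to a mutex, a condvar or anything else.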
PTHREAD_PROCESS_SHARED synchronisation uses kernel sleep queues (known
as LWP wait channels) because only the kernel has visibility across LWPs
in multiple processes. For PTHREAD_PROCESS_SHARED synchronisation the
kernel has to deal with object specific semantics, and hence requires
many more syscalls.
By analysing lwp_* syscalls, together with their associated stack
traces, one can understand a lot about an application's scalability and
the quality of its implementation. If you see a lot of lwp_park activity
around mutexes, you know that you have potentially limiting mutex
contention (i.e. that a mutex couldn't be acquired with a simple
instruction or after a short spin). If you see excessive lwp_park
activity around condvars, you probably have poor workflow between the
application's various threads.
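
If tracing is not an option, contention can also be approximated from inside the application. The sketch below is one assumed way of doing that, not something taken from libc or DTrace: counted_mutex_t and its functions are hypothetical names, and the wrapper simply records how often a first pthread_mutex_trylock() attempt found the mutex busy. That overstates parking somewhat, since a busy mutex may still be acquired after a short spin without the thread ever sleeping.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical wrapper: a mutex plus two counters guarded by that mutex. */
typedef struct {
    pthread_mutex_t lock;
    unsigned long acquisitions;  /* total successful locks */
    unsigned long contended;     /* locks that found the mutex already held */
} counted_mutex_t;

int counted_init(counted_mutex_t *cm)
{
    cm->acquisitions = 0;
    cm->contended = 0;
    return pthread_mutex_init(&cm->lock, NULL);
}

/*
 * Lock the mutex, noting whether the first attempt failed.
 * Returns 0 on success, otherwise an errno value.
 */
int counted_lock(counted_mutex_t *cm)
{
    int rc = pthread_mutex_trylock(&cm->lock);
    int was_busy = (rc == EBUSY);

    if (was_busy)
        rc = pthread_mutex_lock(&cm->lock);  /* may spin, then sleep */
    if (rc != 0)
        return rc;

    cm->acquisitions++;          /* safe: we hold the lock */
    if (was_busy)
        cm->contended++;
    return 0;
}

int counted_unlock(counted_mutex_t *cm)
{
    return pthread_mutex_unlock(&cm->lock);
}
```

A high ratio of contended to acquisitions for a given mutex is a hint that it deserves a closer look with stack traces.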
Of course, it can take a lot of experience and a trained eye to really
understand what is going on and to take steps to improve application
scalability. For such situations you might consider hiring in some
expert help :)
Phil Harman
Harman Holistix - focusing on the detail and the big picture
Our holistic services include: performance health checks, system tuning,
DTrace training, coding advice, developer assassinations
http://blogs.sun.com/pgdh (mothballed)
http://harmanholistix.com/mt (current)
http://linkedin.com/in/philharman
On 03/02/2010 20:55, Andrew Gabriel wrote:
Yes. Lots of info in
http://www.sun.com/software/whitepapers/solaris9/multithread.pdf
Jim Mauro wrote:
It's used to put threads to sleep that are blocking on user locks
(at least that's my recollection).
Run "prstat -Lmp .
Thanks,
/jim
Dtrace Email wrote:
Hi, when doing dtrace on an application, __lwp_park() seems to be
taking a lot of time. What does it really do? Is it waiting for
threads?
Thanks,
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org