I'm looking at a hang/stall while dtrace'ing on heavily loaded systems.
I've got the following scenario:
Kernel Thread A is waiting to acquire the dtrace_meta_lock
Kernel Thread B owns the dtrace_meta_lock, and is waiting on the
dtrace_provider_lock
Kernel Thread C owns the dtrace_provider_lock, and is executing
normally.
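
To make the wait chain above concrete, here is a minimal sketch in C that
follows "thread waits on lock, lock is owned by thread" links until it
reaches a thread that is not blocked. Everything in it (struct wait_graph,
chain_root, the integer indices) is hypothetical and made up for
illustration; it is not DTrace or kernel code, and it assumes each thread
waits on at most one lock and each lock has at most one owner.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinels: "thread is not blocked" / "lock has no owner". */
enum { NO_LOCK = -1, NO_THREAD = -1 };

/* Hypothetical snapshot of who waits on what and who owns what. */
struct wait_graph {
    const int *waits_on; /* waits_on[t]: lock thread t is blocked on, or NO_LOCK */
    const int *owner;    /* owner[l]: thread holding lock l, or NO_THREAD */
    int nthreads;        /* number of entries in waits_on */
};

/*
 * Follow the wait chain starting at thread `start`.
 * Returns the thread at the end of the chain (the one everybody is
 * ultimately waiting for), or NO_THREAD if the chain loops back on
 * itself, which would be a true deadlock.
 */
int chain_root(const struct wait_graph *g, int start)
{
    assert(g != NULL);
    int t = start;
    /* A cycle-free chain visits at most nthreads distinct threads. */
    for (int steps = 0; steps <= g->nthreads; steps++) {
        int l = g->waits_on[t];
        if (l == NO_LOCK)
            return t;            /* t is runnable: end of the chain */
        int o = g->owner[l];
        if (o == NO_THREAD)
            return t;            /* lock is free: t can make progress */
        t = o;                   /* t is blocked behind the owner */
    }
    return NO_THREAD;            /* walked too far: the chain is a cycle */
}
```

With threads A, B, C numbered 0, 1, 2 and dtrace_meta_lock and
dtrace_provider_lock numbered 0 and 1, the scenario above is
waits_on = {0, 1, NO_LOCK} and owner = {1, 2}; chain_root() starting from
A ends at C. In this model that makes it a stall behind C rather than a
cycle.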
However, I remembered this helpful comment from dtrace.c:
* The lock ordering between these three locks is dtrace_meta_lock before
* dtrace_provider_lock before dtrace_lock. (In particular, there are
* several places where dtrace_provider_lock is held by the framework as it
* calls into the providers -- which then call back into the framework,
* grabbing dtrace_lock.)
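
One way to read the ordering rule in that comment is as a rank per lock,
where a thread may only acquire a lock whose rank is higher than every
lock it already holds. The sketch below checks that rule in C. It is only
an illustration under that assumption: the names lock_rank, held_set,
order_ok and acquire are hypothetical, and this is not how dtrace.c
enforces (or checks) its ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks, following the order stated in the dtrace.c comment. */
enum lock_rank {
    RANK_META = 0,      /* dtrace_meta_lock */
    RANK_PROVIDER = 1,  /* dtrace_provider_lock */
    RANK_DTRACE = 2,    /* dtrace_lock */
    RANK_COUNT
};

/* Hypothetical per-thread record of which locks are currently held. */
struct held_set {
    bool held[RANK_COUNT];
};

/*
 * Acquiring `next` respects the order only if the thread holds no lock
 * of the same or a later rank.
 */
bool order_ok(const struct held_set *h, enum lock_rank next)
{
    assert(h != 0);
    for (int r = next; r < RANK_COUNT; r++) {
        if (h->held[r])
            return false;
    }
    return true;
}

/* Record the acquisition if it respects the order; report whether it did. */
bool acquire(struct held_set *h, enum lock_rank next)
{
    if (!order_ok(h, next))
        return false;
    h->held[next] = true;
    return true;
}
```

Under this reading, a path that takes only dtrace_provider_lock and then
dtrace_lock passes the check, because skipping an earlier lock does not
reverse the order. A path that already holds dtrace_provider_lock and then
asks for dtrace_meta_lock is the kind that fails.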
Kernel Thread C is calling into dtrace_unregister from
fasttrap_cleanup_pid_cb().
As best I can tell, fasttrap_cleanup_pid_cb() never takes any of the
dtrace locks. It does call dtrace_unregister at line 338, though.
The dtrace_unregister function immediately takes the dtrace_provider_lock
and the dtrace_lock, without taking the dtrace_meta_lock.
Am I seeing an exception to the rules above, or the first signs of a
potential lock ordering issue?
James M
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org