I'm looking at a hang/stall while dtrace'ing on heavily loaded systems.

        I've got the following scenario:

Kernel Thread A is waiting to acquire the dtrace_meta_lock.
Kernel Thread B owns the dtrace_meta_lock, and is waiting on the dtrace_provider_lock.
Kernel Thread C owns the dtrace_provider_lock, and is executing normally.

        However, I remembered this helpful comment from dtrace.c:
* The lock ordering between these three locks is dtrace_meta_lock before
* dtrace_provider_lock before dtrace_lock.  (In particular, there are
* several places where dtrace_provider_lock is held by the framework as it
* calls into the providers -- which then call back into the framework,
* grabbing dtrace_lock.)

Kernel Thread C is calling into dtrace_unregister() from fasttrap_cleanup_pid_cb().
As best I can tell, fasttrap_cleanup_pid_cb() never takes any of the dtrace locks
itself. It does call dtrace_unregister() at line 338, though.

The dtrace_unregister() function immediately takes the dtrace_provider_lock and the
dtrace_lock, without taking the dtrace_meta_lock first.
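
To check my reading of the ordering rule, here is a small user-space sketch of how
I understand it. This is not code from dtrace.c: the rank numbers, the enum names
and the order_ok() helper are all made up for illustration, and I am assuming the
rule only forbids taking an earlier lock while a later one is already held.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical ranks standing in for the three locks named in the
 * dtrace.c comment. A lower rank must be acquired before a higher one.
 */
enum lock_rank {
	RANK_META = 1,		/* stands in for dtrace_meta_lock */
	RANK_PROVIDER = 2,	/* stands in for dtrace_provider_lock */
	RANK_DTRACE = 3		/* stands in for dtrace_lock */
};

/*
 * Returns true if one thread acquiring the locks in the given sequence,
 * without releasing any in between, never takes a lock whose rank is
 * lower than or equal to one it already holds. Skipping a rank is
 * allowed; going backwards or re-taking the same lock is not.
 */
static bool
order_ok(const enum lock_rank *seq, size_t n)
{
	int highest_held = 0;	/* 0 means no lock is held yet */

	for (size_t i = 0; i < n; i++) {
		if ((int)seq[i] <= highest_held)
			return (false);
		highest_held = (int)seq[i];
	}
	return (true);
}
```

Under that assumption, the sequence { RANK_PROVIDER, RANK_DTRACE } passes the check,
while { RANK_PROVIDER, RANK_META } would not. Whether the real framework treats
skipping the meta lock as acceptable is exactly what I am unsure about.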

Am I seeing an exception to the rules above, or the first signs of a potential lock ordering
issue?

        James M

        
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org
