Hi James,
This doesn't sound like a lock ordering issue or an exception to the
stated lock ordering rules. In particular, there's no need to take the
meta lock before dtrace or provider; rather, the meta lock must be taken
before either of those locks if it is to be taken at all.
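To make the rule concrete, here is a minimal user-space sketch using pthreads. It is not the dtrace.c implementation: the rank enum, ordered_lock(), ordered_unlock(), and the path_* helpers are hypothetical names invented for illustration. It only shows that skipping the meta lock is consistent with the ordering, while taking it after the provider lock is not.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical ranks mirroring the documented order: lower rank is taken first. */
enum lock_rank { RANK_META = 0, RANK_PROVIDER = 1, RANK_DTRACE = 2, RANK_COUNT };

/* Stand-ins for dtrace_meta_lock, dtrace_provider_lock, and dtrace_lock. */
static pthread_mutex_t locks[RANK_COUNT] = {
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER
};

/* Bitmask of the ranks currently held by the calling thread. */
static _Thread_local unsigned held_mask;

/*
 * Acquire the lock of rank r. Returns 0 on success, or -1 (without
 * acquiring anything) if this thread already holds a lock of equal or
 * higher rank, which would violate the ordering.
 */
static int ordered_lock(enum lock_rank r)
{
    if ((held_mask >> r) != 0)
        return -1;
    pthread_mutex_lock(&locks[r]);
    held_mask |= 1u << r;
    return 0;
}

/* Release the lock of rank r; the caller must hold it. */
static void ordered_unlock(enum lock_rank r)
{
    assert(held_mask & (1u << r));
    held_mask &= ~(1u << r);
    pthread_mutex_unlock(&locks[r]);
}

/* Release everything this thread holds, highest rank first. */
static void unlock_all(void)
{
    for (int r = RANK_COUNT - 1; r >= 0; r--) {
        if (held_mask & (1u << r))
            ordered_unlock((enum lock_rank)r);
    }
}

/* Provider then dtrace, skipping meta: allowed. Returns 1 if both succeed. */
static int path_without_meta(void)
{
    int ok = ordered_lock(RANK_PROVIDER) == 0 &&
             ordered_lock(RANK_DTRACE) == 0;
    unlock_all();
    return ok;
}

/* Meta, then provider, then dtrace: allowed. Returns 1 if all succeed. */
static int path_full(void)
{
    int ok = ordered_lock(RANK_META) == 0 &&
             ordered_lock(RANK_PROVIDER) == 0 &&
             ordered_lock(RANK_DTRACE) == 0;
    unlock_all();
    return ok;
}

/* Provider first, then meta: refused. Returns the result of the meta attempt. */
static int path_meta_after_provider(void)
{
    int rc;

    ordered_lock(RANK_PROVIDER);
    rc = ordered_lock(RANK_META);
    unlock_all();
    return rc;
}
```

In this sketch, path_without_meta() and path_full() both succeed, while path_meta_after_provider() is refused at the meta lock attempt.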
Is there a situation in which the lock ordering has been invalidated?
Do you have information about why thread B is being starved out of using
the provider lock? fasttrap_cleanup_pid_cb() should only be called
frequently if there's a high degree of turnover for pid and USDT probes.
Adam
On Mar 25, 2009, at 2:46 PM, James McIlree wrote:
I'm looking at a hang/stall while dtrace'ing on heavily loaded systems.
I've got the following scenario:
Kernel Thread A is waiting to acquire the dtrace_meta_lock.
Kernel Thread B owns the dtrace_meta_lock, and is waiting on the
dtrace_provider_lock.
Kernel Thread C owns the dtrace_provider_lock, and is executing normally.
However, I remembered this helpful comment from dtrace.c:
* The lock ordering between these three locks is dtrace_meta_lock before
* dtrace_provider_lock before dtrace_lock. (In particular, there are
* several places where dtrace_provider_lock is held by the framework as it
* calls into the providers -- which then call back into the framework,
* grabbing dtrace_lock.)
Kernel Thread C is calling into dtrace_unregister from
fasttrap_cleanup_pid_cb().
As best I can tell, fasttrap_cleanup_pid_cb() never takes any of the
dtrace locks. It does call dtrace_unregister at line 338, though.
The dtrace_unregister function immediately takes the provider_lock and
the dtrace_lock, without taking the meta lock.
Am I seeing an exception to the rules above, or the first signs of a
potential lock ordering issue?
James M
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org
--
Adam Leventhal, Fishworks http://blogs.sun.com/ahl