Hey Bryan,

This is great stuff, and -- as you say -- something that we've all wanted for a long, long time. The code looks great. The only addition I'd ask about is whether you've considered adding a case with bufpolicy=ring and/or anonymous tracing.

I had a few questions:

Can you explain the purpose of dtrace_unregister_defunct_reap? Is the idea to try for providers that were recently defunct and then give up after a minute?

Should the fasttrap cleanup stuff use a taskq rather than its timeout stuff (not asking you to do it of course)?

Can you talk a little about the approach you took? Obviously if you're using speculative tracing or bufpolicy=ring or anonymous tracing it limits the efficacy of what you've done. Would it have been possible to disable the ECBs without destroying them? I assume that's horribly facile, but I'd be interested to understand.

dtrace_buffer_t structures should be cache-size aligned since they're per-CPU structures, yes? Would that be worth noting explicitly? And why did you elect to have padding in two places rather than just 7 64-bit values at the end?

Thanks; again, this is really great.

Adam

On Sat, Jul 2, 2011 at 11:27 AM, Bryan Cantrill <br...@joyent.com> wrote:
> All,
>
> A longstanding problem that we have had is that enablings on defunct
> providers (e.g., USDT probes on dead processes) are not reaped: the
> probes will exist as long as there exists an enabling for them. When
> processes are turning over frequently (or when enablings are
> long-running), this can clog up the probe space to the point that
> DTrace probe creation will silently fail (an absolutely maddening
> failure mode). This has been hit several times over the years (we
> were nailed by it on our build machines at Fishworks) -- so when Theo
> Schlossnagle mentioned to me that he was getting killed by this
> problem in an environment with rapidly turning over Postgres
> processes, I was embarrassed that I hadn't tackled it earlier. As it
> turns out, it was a tad thorny for locking reasons, but a patch for
> this problem is attached. We have integrated this into our bits at
> Joyent (internal ticket is OS-454, "enablings on defunct providers
> prevent providers from unregistering"), so you'll see this show up
> soon at http://github.com/joyent/illumos-joyent -- but I wanted to
> give everyone here a heads-up.
>
> Anyway, patch is attached, with my thanks to Adam for a helpful
> discussion on fasttrap's asynchronous provider retiring mechanics.
> Note that Adam hasn't (yet) reviewed this, and its integration
> upstream should wait until he's had a chance to look it over. Please
> let me know if you have any questions or comments!
>
> Thanks,
> Bryan
>
> _______________________________________________
> Developer mailing list
> develo...@lists.illumos.org
> http://lists.illumos.org/m/listinfo/developer
>
>

--
Adam Leventhal, Delphix
http://dtrace.org/blogs/ahl

275 Middlefield Road, Suite 50
Menlo Park, CA 94025
http://www.delphix.com
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org