On Sat, Apr 02, 2011 at 07:46:31AM +0100, Chris Wilson wrote:
>
> But perhaps we do need to reconsider the performance aspect. intel_gpu_top
> samples the ring HEAD and TAIL at around 10KHz and forcing gt-wake is
> about 50 microseconds... I hope I'm mistaken, because even batched that is
> doomed. Ben, do you mind checking that thought experiment with a little
> hard fact?

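For reference, the arithmetic behind that thought experiment can be sketched as below. This is only a back-of-the-envelope model: it takes the quoted ~10KHz sampling rate and ~50 microsecond wake cost at face value (neither is measured here), and the function name and the samples_per_wake batching parameter are hypothetical, not anything in i915 or intel_gpu_top.

```c
#include <stdint.h>

/*
 * Microseconds spent forcing the GT awake per second of wall time,
 * assuming every batch of `samples_per_wake` register reads pays for
 * exactly one wake of `wake_cost_us` microseconds.
 *
 * Hypothetical helper for illustration only; samples_per_wake must be
 * nonzero.
 */
uint64_t forcewake_us_per_second(uint64_t sample_hz,
                                 uint64_t wake_cost_us,
                                 uint64_t samples_per_wake)
{
    /* Round up: a partial batch still needs a full wake. */
    uint64_t wakes = (sample_hz + samples_per_wake - 1) / samples_per_wake;

    return wakes * wake_cost_us;
}
```

With the quoted figures and no batching, forcewake_us_per_second(10000, 50, 1) gives 500000, i.e. half of every second spent waking the GT. Batching ten samples per wake brings that to 50000 microseconds per second. The model ignores any POSTING_READ or udelay cost beyond the 50 microsecond figure.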
I can get some numbers for it... but I'd like to further the discussion a
bit since I have to go to the office to get an SNB, and typing is easier than
doing that right now :).

I too think we might be doomed. At the very least we have the POSTING_READ,
which is expensive. The udelays may or may not actually occur (I'll find
out).

Let's not forget too that we do fix other tools, not just intel_gpu_top, and
those tools don't poll, and don't care about timing (granted they're mostly
less interesting too).

I think what we really need is to try to defer the forcewake_put, as well as
something like the last patch I sent <[email protected]> to remove the need to
protect force_wake_get with struct_mutex, and keep a refcount.

I'd rather get the current stuff accepted, port the tools, and then make
improvements to the performance. However, if you feel the order must be
another way, I can work with that.

> -Chris
>

Thanks.
Ben
_______________________________________________
Intel-gfx mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
