On 12/02/18 11:08, Chris Wilson wrote:
Quoting Lionel Landwerlin (2018-02-12 11:00:10)
On 12/02/18 10:41, Chris Wilson wrote:
Quoting Lionel Landwerlin (2018-02-12 10:37:52)
On 09/02/18 20:53, Chris Wilson wrote:
Quoting Lionel Landwerlin (2018-02-09 17:47:44)
From the i915/perf point of view, I'm fine with this change.
The pinning of the hw_id when monitoring a single context (with OA)
doesn't break the existing userspace (I can only think of Mesa).
I'm also trying to build up a system wide monitoring feature in GPUTop
with a timeline display. This change makes it a bit more challenging.
But this isn't really an expected feature, it's just nice to have.
What I'm thinking of would be to keep a circular buffer of requests in
the order they're submitted to an engine.
Then the i915 perf driver could correlate between the context-switch
tagged reports coming from OA and the requests submitted.
Much like the OA buffer, this circular buffer could overflow at which
point we signal the application using the i915 perf driver and it'll
most likely close the driver and try again.
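The circular buffer described above could look roughly like the following. This is a minimal userspace sketch, not i915 code: `req_ring`, `req_entry`, `REQ_RING_SIZE` and the function names are all hypothetical, and the fixed capacity and latched overflow flag are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REQ_RING_SIZE 8u /* hypothetical capacity; must be a power of two */

/* Hypothetical record of one request, stored in submission order. */
struct req_entry {
	uint32_t hw_id; /* ID the OA reports would be tagged with */
	uint32_t seqno; /* position in the engine's submission order */
};

struct req_ring {
	struct req_entry slots[REQ_RING_SIZE];
	uint32_t head;   /* next slot to write (submission side) */
	uint32_t tail;   /* next slot to read (perf side) */
	bool overflowed; /* latched once a submission had to be dropped */
};

static void req_ring_init(struct req_ring *rb)
{
	assert(rb);
	rb->head = 0;
	rb->tail = 0;
	rb->overflowed = false;
}

/*
 * Record a submission. If the reader is a full ring behind, the entry is
 * dropped and the overflow flag is latched so the reader can be signalled.
 */
static bool req_ring_push(struct req_ring *rb, uint32_t hw_id, uint32_t seqno)
{
	if (rb->head - rb->tail == REQ_RING_SIZE) {
		rb->overflowed = true;
		return false;
	}
	rb->slots[rb->head & (REQ_RING_SIZE - 1)] =
		(struct req_entry){ .hw_id = hw_id, .seqno = seqno };
	rb->head++;
	return true;
}

/* Read back the oldest unread submission; false if the ring is empty. */
static bool req_ring_pop(struct req_ring *rb, struct req_entry *out)
{
	if (rb->head == rb->tail)
		return false;
	*out = rb->slots[rb->tail & (REQ_RING_SIZE - 1)];
	rb->tail++;
	return true;
}
```

As with the OA buffer, a reader that sees `overflowed` set can no longer trust the correlation and would most likely close the stream and start again.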
I would need to have the hw_id added to the requests. Does that sound
You already add hw_id to the tracepoint. For the requests, it is just
req->ctx->hw_id, valid from submission to retirement. Hmm, the
tracepoint is broken (use-after-free in ctx->hw_id). What value do you
want for HSW? (This patch will assign all legacy submission to HW ID 0.)
But aiui, for HSW oa you want lrca not HW ID. So both the use-after-free
and alternative ids suggest storing it on the request directly.
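To illustrate the "store it on the request directly" idea: the sketch below copies the identifier into the request at submission, so that later tracing never dereferences the context. It is a simplified userspace model, not the i915 structures; `struct ctx`, `struct request` and the three helpers are hypothetical names, and the snapshot field could just as well hold an lrca-style value on platforms that do not report a HW ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GPU context. */
struct ctx {
	uint32_t hw_id;
};

/* Hypothetical stand-in for a request. */
struct request {
	struct ctx *ctx; /* only valid from submission to retirement */
	uint32_t hw_id;  /* snapshot taken at submission */
};

/* Copy the ID while the context is known to be alive. */
static void request_submit(struct request *rq, struct ctx *ctx)
{
	assert(rq && ctx);
	rq->ctx = ctx;
	rq->hw_id = ctx->hw_id;
}

/* After retirement the context pointer must no longer be followed. */
static void request_retire(struct request *rq)
{
	assert(rq);
	rq->ctx = NULL;
}

/* Tracing reads the snapshot, never rq->ctx->hw_id. */
static uint32_t request_trace_id(const struct request *rq)
{
	assert(rq);
	return rq->hw_id;
}
```

With the snapshot in place, a trace emitted after retirement still reports the ID that was current at submission, even if the context has since been freed or given a different ID.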
HSW doesn't have a hw_id field in the OA reports.
I'll have to come up with something slightly different.
How do you get the HW ID out, via the tracepoint right?
It's in the OA reports.
I mean how do you correlate the HW ID with userspace?
Right now, with tracepoints. But as I wrote above, I don't think it's
reliable (because of the timestamp correlation).
My idea was to let the i915 perf driver do it for userspace by attaching
the pid/tid to the drm_i915_perf_record.
On submission i915 would put the request into a circular buffer and i915
perf would read back the tail of that buffer to match what is coming out
of the OA unit.
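A rough sketch of that matching step, assuming each OA report carries a hw_id and each recorded submission carries the submitter's pid. Everything here is hypothetical (`submitted_req`, `oa_report`, `perf_record`, `correlate`), a plain array with `tail`/`head` indices stands in for the circular buffer, and the real drm_i915_perf_record layout is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry recorded at submission time. */
struct submitted_req {
	uint32_t hw_id;
	int32_t pid;
};

/* Hypothetical, heavily reduced view of a context-switch tagged OA report. */
struct oa_report {
	uint32_t hw_id;
	uint64_t timestamp;
};

/* Hypothetical record handed to userspace, annotated with the pid. */
struct perf_record {
	uint64_t timestamp;
	uint32_t hw_id;
	int32_t pid;
};

/*
 * Scan the unread submissions reqs[*tail .. head) in order for the first one
 * whose hw_id matches the report. On a match, fill in the record and move
 * *tail up to the matching entry: older entries are discarded, but the match
 * itself is kept because the same context may produce further reports.
 * On no match, *tail is left untouched.
 */
static bool correlate(const struct submitted_req *reqs, size_t head,
		      size_t *tail, const struct oa_report *rep,
		      struct perf_record *out)
{
	assert(reqs && tail && rep && out);
	for (size_t i = *tail; i < head; i++) {
		if (reqs[i].hw_id == rep->hw_id) {
			out->timestamp = rep->timestamp;
			out->hw_id = rep->hw_id;
			out->pid = reqs[i].pid;
			*tail = i;
			return true;
		}
	}
	return false;
}
```

Because the match relies on submission order rather than on comparing timestamps from two clocks, it avoids the timestamp correlation problem mentioned above, at the cost of needing every submission to be recorded.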
And I read that as doing the tracing entirely in userspace. Inside the
kernel we should be able to do a much better job of knowing what
requests/contexts are active, it may just be we take an extra ref and/or
context pin for perf tracking.
Then I must be failing to communicate :)
What you wrote is precisely what I would like to do :)
Intel-gfx mailing list