On Mon, 18 Dec 2023 22:07:38 -0800, Umesh Nerlige Ramappa wrote:
>
> On Mon, Dec 18, 2023 at 09:48:39PM -0800, Dixit, Ashutosh wrote:
> > On Mon, 18 Dec 2023 21:28:33 -0800, Dixit, Ashutosh wrote:
> >>
> >> On Mon, 18 Dec 2023 16:05:43 -0800, Umesh Nerlige Ramappa wrote:
> >> >
> >>
> >> Hi Umesh,
> >>
> >> > On XEHP platforms user is not able to find MMIO triggered reports in the
> >> > OA buffer since i915 squashes the context ID fields. These context ID
> >> > fields hold the MMIO trigger markers.
> >> >
> >> > Update logic to not squash the context ID fields of MMIO triggered
> >> > reports.
> >> >
> >> > Fixes: cba94bbcff08 ("drm/i915/perf: Determine context valid in OA 
> >> > reports")
> >> > Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.rama...@intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/i915_perf.c | 39 ++++++++++++++++++++++++++++----
> >> >  1 file changed, 34 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_perf.c 
> >> > b/drivers/gpu/drm/i915/i915_perf.c
> >> > index 7b1c8de2f9cb..2d695818f006 100644
> >> > --- a/drivers/gpu/drm/i915/i915_perf.c
> >> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> >> > @@ -772,10 +772,6 @@ static int gen8_append_oa_reports(struct 
> >> > i915_perf_stream *stream,
> >> >           * The reason field includes flags identifying what
> >> >           * triggered this specific report (mostly timer
> >> >           * triggered or e.g. due to a context switch).
> >> > -                 *
> >> > -                 * In MMIO triggered reports, some platforms do not set 
> >> > the
> >> > -                 * reason bit in this field and it is valid to have a 
> >> > reason
> >> > -                 * field of zero.
> >> >           */
> >> >          reason = oa_report_reason(stream, report);
> >> >          ctx_id = oa_context_id(stream, report32);
> >> > @@ -787,8 +783,41 @@ static int gen8_append_oa_reports(struct 
> >> > i915_perf_stream *stream,
> >> >           *
> >> >           * Note: that we don't clear the valid_ctx_bit so userspace can
> >> >           * understand that the ID has been squashed by the kernel.
> >> > +                 *
> >> > +                 * Update:
> >> > +                 *
> >> > +                 * On XEHP platforms the behavior of context id valid 
> >> > bit has
> >> > +                 * changed compared to prior platforms. To describe 
> >> > this, we
> >> > +                 * define a few terms:
> >> > +                 *
> >> > +                 * context-switch-report: This is a report with the 
> >> > reason type
> >> > +                 * being context-switch. It is generated when a context 
> >> > switches
> >> > +                 * out.
> >> > +                 *
> >> > +                 * context-valid-bit: A bit that is set in the report 
> >> > ID field
> >> > +                 * to indicate that a valid context has been loaded.
> >> > +                 *
> >> > +                 * gpu-idle: A condition characterized by a
> >> > +                 * context-switch-report with context-valid-bit set to 
> >> > 0.
> >> > +                 *
> >> > +                 * On prior platforms, context-id-valid bit is set to 0 
> >> > only
> >> > +                 * when GPU goes idle. In all other reports, it is set 
> >> > to 1.
> >> > +                 *
> >> > +                 * On XEHP platforms, context-valid-bit is set to 1 in 
> >> > a context
> >> > +                 * switch report if a new context switched in. For all 
> >> > other
> >> > +                 * reports it is set to 0.
> >> > +                 *
> >> > +                 * This change in behavior causes an issue with MMIO 
> >> > triggered
> >> > +                 * reports. MMIO triggered reports have the markers in 
> >> > the
> >> > +                 * context ID field and the context-valid-bit is 0. The 
> >> > logic
> >> > +                 * below to squash the context ID would render the 
> >> > report
> >> > +                 * useless since the user will not be able to find it 
> >> > in the OA
> >> > +                 * buffer. Since MMIO triggered reports exist only on 
> >> > XEHP,
> >> > +                 * we should avoid squashing these for XEHP platforms.
> >>
> >> Hmm I am wondering if this is over-information and this comment should be
> >> made brief.
> >
> > Let me try: "For Gen's >= 12.50, the context id valid bit is reset when a
> > context switches out, but the context id is still valid. Because of this we
> > cannot squash the context id in this case".
> >
> > So this should affect both the regular as well as MMIO triggered cases
> > afaiu.
> >
> > Anyway, please do what you think is right with the comment. I just thought
> > I'll chime in.
>
> The long and descriptive comment is entirely for my benefit. There is a
> very good chance I will forget this, so putting it down in the code.  Also,
> I don't see this described in the spec, so thinking that we will benefit
> from it by having it here. I can put it in the commit msg instead if that
> helps.

I've already R-b'd the patch, so entirely your call. So do whatever you're
comfortable with :)

Ashutosh

>
> >
> >> For the record, here's the explanation of what is happening from Robert
> >> Krzemien's email (which at least makes it simpler for me to understand
> >> what is happening):
> >>
> >>    For Gen12HP+ (ATS/DG2/PVC/MTL+) platforms, context id valid bit is
> >>    set only for context switch reports and when a context is being
> >>    loaded. When exiting a context, a context switch report is
> >>    generated, ctx id is not zero, but the bit is not set. It allows us
> >>    to distinguish whether context switch reports are generated due to
> >>    entering or exiting GPU contexts. Ctx id field is non-zero for
> >>    context switches and mmio triggers. Other types always have ctx id
> >>    set to 0.
> >>
> >>    For previous platforms (like Gen12LP, Gen9/11), the bit is set for
> >>    all types of reports if a context is loaded. But those older
> >>    platforms don’t have mmio triggers. Ctx id field is non-zero for
> >>    all types of reports if a context is loaded.
> >>
> >>    I don’t understand why i915 needs to set ctx id to 0xffffffff if
> >>    the flag is not set. It has been removed from XE KMD as I remember
> >>    correctly.
> >>
> >> >           */
> >> > -                if (oa_report_ctx_invalid(stream, report)) {
> >> > +
> >> > +                if (oa_report_ctx_invalid(stream, report) &&
> >> > +                    GRAPHICS_VER_FULL(stream->engine->i915) < 
> >> > IP_VER(12, 50)) {
> >> >                  ctx_id = INVALID_CTX_ID;
> >> >                  oa_context_id_squash(stream, report32);
> >> >          }
> >>
> >> So I am assuming there's some unknown reason (maybe not hearing from Mesa?)
> >> why we can't get rid of the squashing even for legacy platforms. But that's
> >> ok I guess. So this is:
> >>
> >> Reviewed-by: Ashutosh Dixit <ashutosh.di...@intel.com>

Reply via email to