On Fri, May 21, 2021 at 5:32 PM Sider, Graham <[email protected]> wrote:
>
> Would this be referring to tools that parse 
> /sys/class/.../device/gpu_metrics, or to the actual gpu_metrics_vX_Y structs? For 
> the latter, if there are tools whose parsing depends on version vX_Y, I agree 
> that we would not want to break those.
>
> Since most ASICs currently use different versions, we would have to 
> create a duplicate struct for each gpu_metrics version in use, 
> unless I'm misunderstanding. I'm not sure if this is what you had in mind; 
> let me know.
>

Just update them all to the latest version.  The newer ones are just
supersets of the previous versions.  I think a newer revision went in
within the last day or two to add some new data; you can probably just
piggyback on that, since the code is not upstream yet.

Alex
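
For illustration, here is a minimal sketch of how a user-space tool could
use the version bump discussed in this thread to decide how to read
throttle_status. It assumes every gpu_metrics blob starts with a small
header carrying structure_size, format_revision and content_revision (the
X and Y in gpu_metrics_vX_Y). The names parse_metrics_header,
throttle_layout_for and the first_indep_content_rev threshold are
hypothetical; the thread does not say which revision would first carry the
ASIC-independent layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the common header at the start of a gpu_metrics blob. */
struct metrics_header {
	uint16_t structure_size;
	uint8_t format_revision;   /* X in gpu_metrics_vX_Y */
	uint8_t content_revision;  /* Y in gpu_metrics_vX_Y */
};

enum throttle_layout {
	THROTTLE_LAYOUT_UNKNOWN,
	THROTTLE_LAYOUT_ASIC_DEPENDENT, /* raw firmware bit positions */
	THROTTLE_LAYOUT_INDEPENDENT,    /* translated, ASIC-independent bits */
};

/*
 * Copy the header out of a raw buffer in host byte order.
 * Returns 0 on success, -1 if the buffer is too short or inconsistent.
 */
static int parse_metrics_header(const uint8_t *buf, size_t len,
				struct metrics_header *out)
{
	assert(out != NULL);
	if (buf == NULL || len < sizeof(*out))
		return -1;
	memcpy(out, buf, sizeof(*out));
	if (out->structure_size < sizeof(*out))
		return -1;
	return 0;
}

/*
 * Pick the throttle_status interpretation for a given header.
 * first_indep_content_rev is a hypothetical threshold: the first content
 * revision, within format_rev, that uses the independent bit layout.
 */
static enum throttle_layout
throttle_layout_for(const struct metrics_header *hdr,
		    uint8_t format_rev, uint8_t first_indep_content_rev)
{
	if (hdr->format_revision != format_rev)
		return THROTTLE_LAYOUT_UNKNOWN;
	return hdr->content_revision >= first_indep_content_rev ?
		THROTTLE_LAYOUT_INDEPENDENT : THROTTLE_LAYOUT_ASIC_DEPENDENT;
}
```

A tool written this way keeps decoding older tables with the old bit
positions and only switches once it sees a revision it knows to be
translated, which is the point of bumping the version.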


> Best,
> Graham
>
> -----Original Message-----
> From: Alex Deucher <[email protected]>
> Sent: Friday, May 21, 2021 4:15 PM
> To: Sider, Graham <[email protected]>
> Cc: amd-gfx list <[email protected]>; Kasiviswanathan, Harish 
> <[email protected]>; Sakhnovitch, Elena (Elen) 
> <[email protected]>
> Subject: Re: [PATCH 2/6] drm/amd/pm: Add arcturus throttler translation
>
> On Fri, May 21, 2021 at 1:39 PM Sider, Graham <[email protected]> wrote:
> >
> > Hi Alex,
> >
> > Are you referring to bumping the gpu_metrics_vX_Y version number? Different 
> > ASICs already use different version numbers, so I'm not 
> > sure how feasible this might be (e.g. arcturus == gpu_metrics_v1_1, navi1x 
> > == gpu_metrics_v1_3, vangogh == gpu_metrics_v2_1).
> >
> > Technically speaking, no new fields have been added to any of the 
> > gpu_metrics versions; only the representation of the 
> > throttle_status field has changed. Let me know your thoughts on this.
> >
>
> I don't know if we have any existing tools out there that parse this data, 
> but if so, they would interpret it incorrectly after this change.  If we bump 
> the version, at least the tools will know how to handle it.
>
> Alex
>
>
> > Best,
> > Graham
> >
> > -----Original Message-----
> > From: Alex Deucher <[email protected]>
> > Sent: Friday, May 21, 2021 10:27 AM
> > To: Sider, Graham <[email protected]>
> > Cc: amd-gfx list <[email protected]>; Kasiviswanathan,
> > Harish <[email protected]>; Sakhnovitch, Elena (Elen)
> > <[email protected]>
> > Subject: Re: [PATCH 2/6] drm/amd/pm: Add arcturus throttler
> > translation
> >
> > > General comment on the patch series: do you want to bump the metrics table 
> > > version, since the meaning of the throttler status has changed?
> >
> > Alex
> >
> > On Thu, May 20, 2021 at 10:30 AM Graham Sider <[email protected]> wrote:
> > >
> > > Perform dependent to independent throttle status translation for
> > > arcturus.
> > > ---
> > > >  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 62 ++++++++++++++++---
> > > >  1 file changed, 53 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> > > index 1735a96dd307..7c01c0bf2073 100644
> > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> > > @@ -540,6 +540,49 @@ static int arcturus_freqs_in_same_level(int32_t 
> > > frequency1,
> > >         return (abs(frequency1 - frequency2) <= EPSILON);  }
> > >
> > > > +static uint32_t arcturus_get_indep_throttler_status(
> > > > +                                       unsigned long dep_throttler_status)
> > > > +{
> > > > +       unsigned long indep_throttler_status = 0;
> > > > +
> > > > +       __assign_bit(INDEP_THROTTLER_TEMP_EDGE_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_TEMP_EDGE_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_TEMP_HOTSPOT_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_TEMP_HOTSPOT_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_TEMP_MEM_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_TEMP_MEM_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_TEMP_VR_GFX_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_TEMP_VR_GFX_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_TEMP_VR_MEM_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_TEMP_VR_MEM_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_TEMP_VR_SOC_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_TEMP_VR_SOC_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_TDC_GFX_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_TDC_GFX_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_TDC_SOC_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_TDC_SOC_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_PPT0_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_PPT0_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_PPT1_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_PPT1_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_PPT2_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_PPT2_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_PPT3_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_PPT3_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_PPM_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_PPM_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_FIT_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_FIT_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_APCC_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_APCC_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_VRHOT0_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_VRHOT0_BIT, &dep_throttler_status));
> > > > +       __assign_bit(INDEP_THROTTLER_VRHOT1_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_VRHOT1_BIT, &dep_throttler_status));
> > > > +
> > > > +       return (uint32_t)indep_throttler_status;
> > > > +}
> > > +
> > > >  static int arcturus_get_smu_metrics_data(struct smu_context *smu,
> > > >                                          MetricsMember_t member,
> > > >                                          uint32_t *value)
> > > > @@ -629,7 +672,7 @@ static int arcturus_get_smu_metrics_data(struct smu_context *smu,
> > > >                         SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
> > > >                 break;
> > > >         case METRICS_THROTTLER_STATUS:
> > > > -               *value = metrics->ThrottlerStatus;
> > > > +               *value = arcturus_get_indep_throttler_status(metrics->ThrottlerStatus);
> > > >                 break;
> > > >         case METRICS_CURR_FANSPEED:
> > > >                 *value = metrics->CurrFanSpeed;
> > > > @@ -2213,13 +2256,13 @@ static const struct throttling_logging_label {
> > >         uint32_t feature_mask;
> > >         const char *label;
> > >  } logging_label[] = {
> > > -       {(1U << THROTTLER_TEMP_HOTSPOT_BIT), "GPU"},
> > > -       {(1U << THROTTLER_TEMP_MEM_BIT), "HBM"},
> > > -       {(1U << THROTTLER_TEMP_VR_GFX_BIT), "VR of GFX rail"},
> > > -       {(1U << THROTTLER_TEMP_VR_MEM_BIT), "VR of HBM rail"},
> > > -       {(1U << THROTTLER_TEMP_VR_SOC_BIT), "VR of SOC rail"},
> > > -       {(1U << THROTTLER_VRHOT0_BIT), "VR0 HOT"},
> > > -       {(1U << THROTTLER_VRHOT1_BIT), "VR1 HOT"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_HOTSPOT_BIT), "GPU"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_MEM_BIT), "HBM"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_VR_GFX_BIT), "VR of GFX rail"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_VR_MEM_BIT), "VR of HBM rail"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_VR_SOC_BIT), "VR of SOC rail"},
> > > +       {(1U << INDEP_THROTTLER_VRHOT0_BIT), "VR0 HOT"},
> > > +       {(1U << INDEP_THROTTLER_VRHOT1_BIT), "VR1 HOT"},
> > >  };
> > > >  static void arcturus_log_thermal_throttling_event(struct smu_context *smu)
> > > >  {
> > > > @@ -2314,7 +2357,8 @@ static ssize_t arcturus_get_gpu_metrics(struct smu_context *smu,
> > > >         gpu_metrics->current_vclk0 = metrics.CurrClock[PPCLK_VCLK];
> > > >         gpu_metrics->current_dclk0 = metrics.CurrClock[PPCLK_DCLK];
> > > >
> > > > -       gpu_metrics->throttle_status = metrics.ThrottlerStatus;
> > > > +       gpu_metrics->throttle_status =
> > > > +               arcturus_get_indep_throttler_status(metrics.ThrottlerStatus);
> > >
> > >         gpu_metrics->current_fan_speed = metrics.CurrFanSpeed;
> > >
> > > --
> > > 2.17.1
> > >
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
