RE: gpu_metrics does not provide 'current_gfxclk', 'current_uclk', 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU

2023-02-09 Thread Quan, Evan

For some members, "0" is a valid value. 
Thus "0xFFFF" is used instead to indicate that the output is invalid/unsupported.
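
On the reader side, this can be sketched as below. It is only a minimal illustration, assuming the fields are little-endian unsigned integers and that an unsupported field has every bit set. The helper name decode_field is hypothetical and not part of any existing tool.

```python
def decode_field(raw: bytes):
    """Decode one little-endian unsigned field from the metrics table.

    Returns None when every bit is set (the "unsupported" marker),
    so that a genuine reading of 0 is still reported as 0.
    """
    value = int.from_bytes(raw, "little")
    all_ones = (1 << (8 * len(raw))) - 1  # 0xFFFF for 2 bytes, 0xFFFFFFFF for 4
    return None if value == all_ones else value


print(decode_field(b"\x00\x00"))  # 0 (a valid reading)
print(decode_field(b"\xff\xff"))  # None (unsupported)
```

Returning None rather than a number keeps "unsupported" from being mistaken for a measurement.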

BR
Evan
> -Original Message-
> From: amd-gfx  On Behalf Of
> sfrcorne
> Sent: Wednesday, February 8, 2023 7:12 AM
> To: Alex Deucher 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: gpu_metrics does not provide 'current_gfxclk', 'current_uclk',
> 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU
> 
> Dear Alex,
> 
> If current_gfxclk is not supported for my CPU, then using
> average_gfxclk_frequency instead is indeed the best solution in my opinion.
> I will try to get a fix merged for my CPU in Mangohud.
> 
> On a side note: you mentioned that unsupported fields would be 0 but I
> don't think this is correct. In the Linux kernel/driver there is a line of
> code that first sets all values to 0xFF by a memset() and then populates the
> supported fields.
> 
> see
> "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c#L999":
> memset(header, 0xFF, structure_size);
> 
> The value of the unsupported uint16_t fields thus should be 0xFFFF (or 65535
> in decimal). This is also what I get when reading the gpu_metrics file. I just
> wanted to mention this in case someone reads this in the Archive.
> 
> Anyway, thank you for your help!
> 
> Kind regards,
> sfrcorne
> 
> --- Original Message ---
> On Tuesday, February 7th, 2023 at 05:05, Alex Deucher
>  wrote:
> 
> 
> > On Mon, Feb 6, 2023 at 5:48 PM sfrcorne sfrco...@protonmail.com wrote:
> >
> > > Dear Alex,
> > >
> > > First of all, thank you for your response. Personally, I use a Ryzen 5
> > > 7600X however people with a Ryzen 9 7900X are also reporting this issue.
> > > The relevant bug report in Mangohud can be found here:
> > > "https://github.com/flightlessmango/MangoHud/issues/868".
> > >
> > > I looked around a bit in both the Mangohud source code and the Linux
> > > kernel source code.
> > >
> > > (Mangohud source): From what I understand, Mangohud looks for a file
> > > "/sys/class/drm/card*/device/gpu_metrics". If this file exists (and it
> > > does exist on my machine), it tries to read this file and extract the
> > > relevant GPU data (and in case of an APU also the CPU data) from it (these
> > > are the values I was talking about in my previous mail). When the file
> > > "/sys/class/drm/card*/device/gpu_metrics" exists, it will not use the data
> > > provided by hwmon (/sys/class/hwmon/hwmon*/*).
> > >
> > > (Linux kernel): The gpu_metrics file contains different data, depending
> > > on what version is used. All valid versions can be found in the source code:
> > > "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/include/kgd_pp_interface.h#L725".
> > > For my CPU/APU the 'gpu_metrics_v2_1' structure is used (I tested this by
> > > reading the gpu_metrics file myself). Furthermore, I think that for my
> > > case, this structure is set by the function
> > > "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c#L459"
> > > but I am not completely sure about this.
> >
> >
> > The metrics provided by the SMU firmware vary from asic to asic.
> > For things that are not supported by the metrics table for a
> > particular asic, those fields would be 0. You can see what metrics are
> > supported for your asic in smu_v13_0_5_get_gpu_metrics() as that
> > function populates the supported fields from the firmware to the
> > common structure. current_gfxclk is not supported in your asic, but
> > average_gfxclk_frequency is. So you'd want to use whichever field is
> > available for a particular asic in Mangohud.
> >
> > > Lastly, I am not familiar with umr. I assume that you are referring to
> > > "https://gitlab.freedesktop.org/tomstdenis/umr"? If I find some time this
> > > weekend, then I will look into this some more.
> >
> >
> > Yes, that is the right link. umr uses the same interface as mangohud,
> > so you should see the same data.
> >
> > Alex
> >
> > > Kind regards,
> > > sfrcorne
> > >
> > > --- Original Message ---
> > > On Monday, February 6th, 2023 at 22:22, Alex Deucher
> > > alexdeuc...@gmail.com wrote:
> > >
> > > > On Mon, Feb 6, 2023 at 9:22 AM sfrcorne sfrco...@protonmail.com
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I hope this is the cor

Re: gpu_metrics does not provide 'current_gfxclk', 'current_uclk', 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU

2023-02-07 Thread sfrcorne
Dear Alex,

If current_gfxclk is not supported for my CPU, then using 
average_gfxclk_frequency instead is indeed the best solution in my opinion. I 
will try to get a fix merged for my CPU in Mangohud.

On a side note: you mentioned that unsupported fields would be 0 but I don't
think this is correct. In the Linux kernel/driver there is a line of code that
first sets all values to 0xFF by a memset() and then populates the supported
fields.

see 
"https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c#L999":
 memset(header, 0xFF, structure_size);

The value of the unsupported uint16_t fields thus should be 0xFFFF (or 65535 in
decimal). This is also what I get when reading the gpu_metrics file. I just 
wanted to mention this in case someone reads this in the Archive.
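
To illustrate the effect, here is a minimal sketch that emulates the prefill-then-populate pattern on a plain byte buffer. It is not the kernel code: the buffer size, the offsets and the helper names are hypothetical, and little-endian uint16_t fields are assumed.

```python
import struct


def build_metrics(structure_size, supported):
    """Fill a buffer with 0xFF bytes, then write only the supported fields.

    `supported` maps a (hypothetical) byte offset to a uint16 value.
    """
    # Counterpart of memset(header, 0xFF, structure_size)
    buf = bytearray(b"\xff" * structure_size)
    for offset, value in supported.items():
        struct.pack_into("<H", buf, offset, value)
    return bytes(buf)


def read_u16(buf, offset):
    """Read one little-endian uint16 field."""
    return struct.unpack_from("<H", buf, offset)[0]


table = build_metrics(8, {0: 1200})  # only the field at offset 0 is populated
print(read_u16(table, 0))  # 1200
print(read_u16(table, 2))  # 65535, i.e. never written after the prefill
```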

Anyway, thank you for your help!

Kind regards,
sfrcorne

--- Original Message ---
On Tuesday, February 7th, 2023 at 05:05, Alex Deucher  
wrote:


> On Mon, Feb 6, 2023 at 5:48 PM sfrcorne sfrco...@protonmail.com wrote:
> 
> > Dear Alex,
> > 
> > First of all, thank you for your response. Personally, I use a Ryzen 5 
> > 7600X however people with a Ryzen 9 7900X are also reporting this issue. 
> > The relevant bug report in Mangohud can be found here:
> > "https://github.com/flightlessmango/MangoHud/issues/868".
> > 
> > I looked around a bit in both the Mangohud source code and the Linux kernel 
> > source code.
> > 
> > (Mangohud source): From what I understand, Mangohud looks for a file
> > "/sys/class/drm/card*/device/gpu_metrics". If this file exists (and it does
> > exist on my machine), it tries to read this file and extract the relevant
> > GPU data (and in case of an APU also the CPU data) from it (these are the 
> > values I was talking about in my previous mail). When the file 
> > "/sys/class/drm/card*/device/gpu_metrics" exists, it will not use the data 
> > provided by hwmon (/sys/class/hwmon/hwmon*/*).
> > 
> > (Linux kernel): The gpu_metrics file contains different data, depending on 
> > what version is used. All valid versions can be found in the source code:
> > "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/include/kgd_pp_interface.h#L725".
> >  For my CPU/APU the 'gpu_metrics_v2_1' structure is used (I tested this by 
> > reading the gpu_metrics file myself). Furthermore, I think that for my 
> > case, this structure is set by the function
> > "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c#L459"
> >  but I am not completely sure about this.
> 
> 
> The metrics provided by the SMU firmware vary from asic to asic.
> For things that are not supported by the metrics table for a
> particular asic, those fields would be 0. You can see what metrics
> are supported for your asic in smu_v13_0_5_get_gpu_metrics() as that
> function populates the supported fields from the firmware to the
> common structure. current_gfxclk is not supported in your asic, but
> average_gfxclk_frequency is. So you'd want to use whichever field is
> available for a particular asic in Mangohud.
> 
> > Lastly, I am not familiar with umr. I assume that you are referring to
> > "https://gitlab.freedesktop.org/tomstdenis/umr"? If I find some time this
> > weekend, then I will look into this some more.
> 
> 
> Yes, that is the right link. umr uses the same interface as mangohud,
> so you should see the same data.
> 
> Alex
> 
> > Kind regards,
> > sfrcorne
> > 
> > --- Original Message ---
> > On Monday, February 6th, 2023 at 22:22, Alex Deucher alexdeuc...@gmail.com 
> > wrote:
> > 
> > > On Mon, Feb 6, 2023 at 9:22 AM sfrcorne sfrco...@protonmail.com wrote:
> > > 
> > > > Hello,
> > > > 
> > > > I hope this is the correct place to ask my question. I was not sure if 
> > > > I should have opened a new issue on Gitlab or sent an email here, since
> > > > I don't know whether this is a bug or intended behaviour.
> > > > 
> > > > The question is about the new AMD Ryzen 7000 CPUs. These new CPUs
> > > > have an iGPU and consequently provide a gpu_metrics file for monitoring 
> > > > the GPU/CPU (APU?). This file is used by programs like Mangohud, that 
> > > > try to read (among other values) the following 4 values:
> > > > - current_gfxclk
> > > > - current_uclk
> > > > - average_cpu_power
> > > > - temperature_core
> > > > However it appears that on AMD Ryzen 7000 CPUs these 4 values are not
> > > > provided/updated in the gpu_metrics file. Other values like 
> > > > 'average_core_power', 'temperature_l3' and the other 'current_clk' 
> > > > are also not provided/updated but these are not used by Mangohud at the 
> > > > moment.
> > > > 
> > > > Is this intentional or a bug? And will this be fixed and/or will support
> > > > for these 4 values be added in the future?
> > > 
> > > What specific CPU/APU is this? I don't recall off hand how mangohud
> > > queries this stuff, but you can take a look at the hwmon inter

Re: gpu_metrics does not provide 'current_gfxclk', 'current_uclk', 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU

2023-02-06 Thread Alex Deucher
On Mon, Feb 6, 2023 at 5:48 PM sfrcorne  wrote:
>
> Dear Alex,
>
> First of all, thank you for your response. Personally, I use a Ryzen 5 7600X 
> however people with a Ryzen 9 7900X are also reporting this issue. The 
> relevant bug report in Mangohud can be found here:
> "https://github.com/flightlessmango/MangoHud/issues/868".
>
> I looked around a bit in both the Mangohud source code and the Linux kernel 
> source code.
>
> (Mangohud source): From what I understand, Mangohud looks for a file
> "/sys/class/drm/card*/device/gpu_metrics". If this file exists (and it does
> exist on my machine), it tries to read this file and extract the relevant
> GPU data (and in case of an APU also the CPU data) from it (these are the 
> values I was talking about in my previous mail). When the file 
> "/sys/class/drm/card*/device/gpu_metrics" exists, it will not use the data 
> provided by hwmon (/sys/class/hwmon/hwmon*/*).
>
> (Linux kernel): The gpu_metrics file contains different data, depending on 
> what version is used. All valid versions can be found in the source code:
> "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/include/kgd_pp_interface.h#L725".
>  For my CPU/APU the 'gpu_metrics_v2_1' structure is used (I tested this by 
> reading the gpu_metrics file myself). Furthermore, I think that for my case, 
> this structure is set by the function
> "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c#L459"
>  but I am not completely sure about this.

The metrics provided by the SMU firmware vary from asic to asic.
For things that are not supported by the metrics table for a
particular asic, those fields would be 0.  You can see what metrics
are supported for your asic in smu_v13_0_5_get_gpu_metrics() as that
function populates the supported fields from the firmware to the
common structure.  current_gfxclk is not supported in your asic, but
average_gfxclk_frequency is.  So you'd want to use whichever field is
available for a particular asic in Mangohud.
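
A minimal sketch of such a selection, assuming the table has already been decoded into a dict of field names and integer values (that dict and the helper name are hypothetical). It treats 65535 as "not supported" for these uint16_t fields, as reported elsewhere in this thread. A tool that also wants to reject 0 would need an extra check.

```python
UNSUPPORTED_U16 = 0xFFFF  # value read back for unsupported uint16_t fields


def pick_gfx_clock(metrics):
    """Return (field_name, value) for the first usable gfx clock field, else None."""
    for name in ("current_gfxclk", "average_gfxclk_frequency"):
        value = metrics.get(name)
        if value is not None and value != UNSUPPORTED_U16:
            return name, value
    return None


# Example values are made up for illustration.
decoded = {"current_gfxclk": 0xFFFF, "average_gfxclk_frequency": 600}
print(pick_gfx_clock(decoded))  # ('average_gfxclk_frequency', 600)
```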

>
> Lastly, I am not familiar with umr. I assume that you are referring to
> "https://gitlab.freedesktop.org/tomstdenis/umr"? If I find some time this
> weekend, then I will look into this some more.

Yes, that is the right link.  umr uses the same interface as mangohud,
so you should see the same data.

Alex


>
> Kind regards,
> sfrcorne
>
> --- Original Message ---
> On Monday, February 6th, 2023 at 22:22, Alex Deucher  
> wrote:
>
> > On Mon, Feb 6, 2023 at 9:22 AM sfrcorne sfrco...@protonmail.com wrote:
> >
> > > Hello,
> > >
> > > I hope this is the correct place to ask my question. I was not sure if I 
> > > should have opened a new issue on Gitlab or sent an email here, since I
> > > don't know whether this is a bug or intended behaviour.
> > >
> > > The question is about the new AMD Ryzen 7000 CPUs. These new CPUs have
> > > an iGPU and consequently provide a gpu_metrics file for monitoring the 
> > > GPU/CPU (APU?). This file is used by programs like Mangohud, that try to 
> > > read (among other values) the following 4 values:
> > > - current_gfxclk
> > > - current_uclk
> > > - average_cpu_power
> > > - temperature_core
> > > However it appears that on AMD Ryzen 7000 CPUs these 4 values are not
> > > provided/updated in the gpu_metrics file. Other values like 
> > > 'average_core_power', 'temperature_l3' and the other 'current_clk' are 
> > > also not provided/updated but these are not used by Mangohud at the 
> > > moment.
> > >
> > > Is this intentional or a bug? And will this be fixed and/or will support
> > > for these 4 values be added in the future?
> >
> >
> > What specific CPU/APU is this? I don't recall off hand how mangohud
> > queries this stuff, but you can take a look at the hwmon interfaces
> > exposed by the driver or if you want the whole metrics table, you can
> > use umr to fetch and decode it via the kernel interface. That will
> > allow you to verify that the firmware is producing the proper data.
> >
> > Alex


Re: gpu_metrics does not provide 'current_gfxclk', 'current_uclk', 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU

2023-02-06 Thread sfrcorne
Dear Alex,

First of all, thank you for your response. Personally, I use a Ryzen 5 7600X 
however people with a Ryzen 9 7900X are also reporting this issue. The relevant 
bug report in Mangohud can be found here:
"https://github.com/flightlessmango/MangoHud/issues/868".

I looked around a bit in both the Mangohud source code and the Linux kernel 
source code.

(Mangohud source): From what I understand, Mangohud looks for a file
"/sys/class/drm/card*/device/gpu_metrics". If this file exists (and it does
exist on my machine), it tries to read this file and extract the relevant GPU
data (and in case of an APU also the CPU data) from it (these are the values I 
was talking about in my previous mail). When the file 
"/sys/class/drm/card*/device/gpu_metrics" exists, it will not use the data 
provided by hwmon (/sys/class/hwmon/hwmon*/*).
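
The lookup order described above can be sketched roughly as follows. This is not Mangohud's actual code, only an illustration of "prefer gpu_metrics, otherwise hwmon". The function name and the sysfs_root parameter are hypothetical; the parameter is there so the logic can be tried against a test directory.

```python
import glob
import os


def find_metrics_source(sysfs_root="/sys"):
    """Return ("gpu_metrics", path) if such a file exists, else ("hwmon", dir), else None."""
    metrics = sorted(glob.glob(
        os.path.join(sysfs_root, "class", "drm", "card*", "device", "gpu_metrics")))
    if metrics:
        # gpu_metrics takes priority; hwmon is not consulted in this case
        return ("gpu_metrics", metrics[0])
    hwmon = sorted(glob.glob(os.path.join(sysfs_root, "class", "hwmon", "hwmon*")))
    if hwmon:
        return ("hwmon", hwmon[0])
    return None


if __name__ == "__main__":
    print(find_metrics_source())
```

With this order, a gpu_metrics file that exists but leaves a field unsupported hides the hwmon data, which matches the behaviour reported in the bug.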

(Linux kernel): The gpu_metrics file contains different data, depending on what 
version is used. All valid versions can be found in the source code:
"https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/include/kgd_pp_interface.h#L725".
 For my CPU/APU the 'gpu_metrics_v2_1' structure is used (I tested this by 
reading the gpu_metrics file myself). Furthermore, I think that for my case, 
this structure is set by the function
"https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c#L459"
 but I am not completely sure about this.
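
For reference, a sketch of how the version can be read from the start of the file. It assumes the table begins with a small header made of a 16-bit structure size followed by an 8-bit format revision and an 8-bit content revision, stored little-endian. Please check that layout against kgd_pp_interface.h before relying on it. The function names and the sample size are made up.

```python
import struct

# Assumed header layout: uint16 structure_size, uint8 format_revision,
# uint8 content_revision (little-endian).
HEADER_FORMAT = "<HBB"


def parse_metrics_header(blob):
    """Return (structure_size, format_revision, content_revision)."""
    if len(blob) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("gpu_metrics data is too short to contain a header")
    return struct.unpack_from(HEADER_FORMAT, blob, 0)


def version_name(blob):
    """Build a name such as 'gpu_metrics_v2_1' from the two revision bytes."""
    _, format_revision, content_revision = parse_metrics_header(blob)
    return f"gpu_metrics_v{format_revision}_{content_revision}"


sample = struct.pack(HEADER_FORMAT, 128, 2, 1)  # 128 is a made-up size
print(version_name(sample))  # gpu_metrics_v2_1
```

On a real system the input would be the bytes read from the gpu_metrics file opened in binary mode.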

Lastly, I am not familiar with umr. I assume that you are referring to
"https://gitlab.freedesktop.org/tomstdenis/umr"? If I find some time this
weekend, then I will look into this some more.

Kind regards,
sfrcorne

--- Original Message ---
On Monday, February 6th, 2023 at 22:22, Alex Deucher  
wrote:

> On Mon, Feb 6, 2023 at 9:22 AM sfrcorne sfrco...@protonmail.com wrote:
> 
> > Hello,
> > 
> > I hope this is the correct place to ask my question. I was not sure if I 
> > should have opened a new issue on Gitlab or sent an email here, since I
> > don't know whether this is a bug or intended behaviour.
> > 
> > The question is about the new AMD Ryzen 7000 CPUs. These new CPUs have an
> > iGPU and consequently provide a gpu_metrics file for monitoring the GPU/CPU 
> > (APU?). This file is used by programs like Mangohud, that try to read 
> > (among other values) the following 4 values:
> > - current_gfxclk
> > - current_uclk
> > - average_cpu_power
> > - temperature_core
> > However it appears that on AMD Ryzen 7000 CPUs these 4 values are not
> > provided/updated in the gpu_metrics file. Other values like 
> > 'average_core_power', 'temperature_l3' and the other 'current_clk' are 
> > also not provided/updated but these are not used by Mangohud at the moment.
> > 
> > Is this intentional or a bug? And will this be fixed and/or will support for
> > these 4 values be added in the future?
> 
> 
> What specific CPU/APU is this? I don't recall off hand how mangohud
> queries this stuff, but you can take a look at the hwmon interfaces
> exposed by the driver or if you want the whole metrics table, you can
> use umr to fetch and decode it via the kernel interface. That will
> allow you to verify that the firmware is producing the proper data.
> 
> Alex


Re: gpu_metrics does not provide 'current_gfxclk', 'current_uclk', 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU

2023-02-06 Thread Alex Deucher
On Mon, Feb 6, 2023 at 9:22 AM sfrcorne  wrote:
>
> Hello,
>
> I hope this is the correct place to ask my question. I was not sure if I 
> should have opened a new issue on Gitlab or sent an email here, since I don't
> know whether this is a bug or intended behaviour.
>
> The question is about the new AMD Ryzen 7000 CPUs. These new CPUs have an
> iGPU and consequently provide a gpu_metrics file for monitoring the GPU/CPU 
> (APU?). This file is used by programs like Mangohud, that try to read (among 
> other values) the following 4 values:
>  - current_gfxclk
>  - current_uclk
>  - average_cpu_power
>  - temperature_core
> However it appears that on AMD Ryzen 7000 CPUs these 4 values are not
> provided/updated in the gpu_metrics file. Other values like 
> 'average_core_power', 'temperature_l3' and the other 'current_clk' are 
> also not provided/updated but these are not used by Mangohud at the moment.
>
> Is this intentional or a bug? And will this be fixed and/or will support for
> these 4 values be added in the future?

What specific CPU/APU is this?  I don't recall off hand how mangohud
queries this stuff, but you can take a look at the hwmon interfaces
exposed by the driver or if you want the whole metrics table, you can
use umr to fetch and decode it via the kernel interface.  That will
allow you to verify that the firmware is producing the proper data.
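
For the hwmon route, a quick way to see what is exposed is to walk /sys/class/hwmon and dump every *_input file. The following is only a rough sketch. The function name and the sysfs_root parameter are hypothetical, and the values are printed raw, in whatever unit the hwmon interface uses for that file.

```python
import glob
import os


def read_hwmon_inputs(sysfs_root="/sys"):
    """Return [(chip_name, {file_name: raw_int})] for every hwmon device found."""
    devices = []
    pattern = os.path.join(sysfs_root, "class", "hwmon", "hwmon*")
    for hwmon_dir in sorted(glob.glob(pattern)):
        try:
            with open(os.path.join(hwmon_dir, "name")) as f:
                chip_name = f.read().strip()
        except OSError:
            chip_name = os.path.basename(hwmon_dir)  # fall back to e.g. "hwmon0"
        readings = {}
        for path in sorted(glob.glob(os.path.join(hwmon_dir, "*_input"))):
            try:
                with open(path) as f:
                    readings[os.path.basename(path)] = int(f.read().strip())
            except (OSError, ValueError):
                continue  # skip files that cannot be read or parsed
        devices.append((chip_name, readings))
    return devices


if __name__ == "__main__":
    for chip_name, readings in read_hwmon_inputs():
        print(chip_name, readings)
```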

Alex