Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-25 Thread Bryan Steele
On Mon, Apr 25, 2022 at 05:33:51PM +0200, Claudio Jeker wrote:
> On Mon, Apr 25, 2022 at 11:31:22AM -0400, Bryan Steele wrote:
> > On Mon, Apr 25, 2022 at 05:20:46PM +0200, Claudio Jeker wrote:
> > > On Sun, Apr 24, 2022 at 07:06:19PM +0200, Claudio Jeker wrote:
> > > > On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> > > > excludes Zen APU CPUs) this should show additional temp info. This is
> > > > based on info from the Linux k10temp driver.
> > > > 
> > > > Additionally use the MSRs defined in "Open-Source Register Reference For
> > > > AMD Family 17h Processors" to measure the CPU core frequency.
> > > > That should be the actual speed of the CPU core during the measuring
> > > > interval.
> > > > 
> > > > On my T14g2 the output is now for example:
> > > > ksmn0.temp0   63.88 degC  Tctl
> > > > ksmn0.frequency0  3553141515.00 Hz  CPU0
> > > > ksmn0.frequency1  3549080315.00 Hz  CPU2
> > > > ksmn0.frequency2  3552369937.00 Hz  CPU4
> > > > ksmn0.frequency3  3546055048.00 Hz  CPU6
> > > > ksmn0.frequency4  3546854449.00 Hz  CPU8
> > > > ksmn0.frequency5  3543869698.00 Hz  CPU10
> > > > ksmn0.frequency6  3542551127.00 Hz  CPU12
> > > > ksmn0.frequency7  4441623647.00 Hz  CPU14
> > > > 
> > > > It is interesting to watch turbo kick in and how temp causes the CPU to
> > > > throttle.
> > > > 
> > > > I only tested this on systems with APUs so I could not test the Tccd temp
> > > > reporting.
> > > 
> > > With the frequency sensor moved to cpu(4), this just adds the
> > > additional Tccd temperature sensors. It does not fix the duplication of the
> > > ksmn(4) sensors. That needs to be fixed by not attaching on the duplicate
> > > root complexes. How that is done I still need to figure out.
> > > 
> > > -- 
> > > :wq Claudio
> > 
> > I think this looks good now, I'm not too worried about the tCTL offsets.
> > That can be figured out later. ok brynet@
> > 
> > As for the attaching issue, would only attaching one device in the
> > kernel config be a nasty hack? If I understand, this should make the
> > duplicate complex attach as ppb(4) again.
> > 
> > ksmn0   at pci?
> > 
> 
> But that would break systems with multiple CPU sockets. Like the two
> socket EPYC server.

Ah, I missed that possibility.

> -- 
> :wq Claudio



Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-25 Thread Claudio Jeker
On Mon, Apr 25, 2022 at 11:31:22AM -0400, Bryan Steele wrote:
> On Mon, Apr 25, 2022 at 05:20:46PM +0200, Claudio Jeker wrote:
> > On Sun, Apr 24, 2022 at 07:06:19PM +0200, Claudio Jeker wrote:
> > > On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> > > excludes Zen APU CPUs) this should show additional temp info. This is
> > > based on info from the Linux k10temp driver.
> > > 
> > > Additionally use the MSRs defined in "Open-Source Register Reference For
> > > AMD Family 17h Processors" to measure the CPU core frequency.
> > > That should be the actual speed of the CPU core during the measuring
> > > interval.
> > > 
> > > On my T14g2 the output is now for example:
> > > ksmn0.temp0   63.88 degC  Tctl
> > > ksmn0.frequency0  3553141515.00 Hz  CPU0
> > > ksmn0.frequency1  3549080315.00 Hz  CPU2
> > > ksmn0.frequency2  3552369937.00 Hz  CPU4
> > > ksmn0.frequency3  3546055048.00 Hz  CPU6
> > > ksmn0.frequency4  3546854449.00 Hz  CPU8
> > > ksmn0.frequency5  3543869698.00 Hz  CPU10
> > > ksmn0.frequency6  3542551127.00 Hz  CPU12
> > > ksmn0.frequency7  4441623647.00 Hz  CPU14
> > > 
> > > It is interesting to watch turbo kick in and how temp causes the CPU to
> > > throttle.
> > > 
> > > I only tested this on systems with APUs so I could not test the Tccd temp
> > > reporting.
> > 
> > With the frequency sensor moved to cpu(4), this just adds the
> > additional Tccd temperature sensors. It does not fix the duplication of the
> > ksmn(4) sensors. That needs to be fixed by not attaching on the duplicate
> > root complexes. How that is done I still need to figure out.
> > 
> > -- 
> > :wq Claudio
> 
> I think this looks good now, I'm not too worried about the tCTL offsets.
> That can be figured out later. ok brynet@
> 
> As for the attaching issue, would only attaching one device in the
> kernel config be a nasty hack? If I understand, this should make the
> duplicate complex attach as ppb(4) again.
> 
> ksmn0   at pci?
> 

But that would break systems with multiple CPU sockets. Like the two
socket EPYC server.

-- 
:wq Claudio



Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-25 Thread Bryan Steele
On Mon, Apr 25, 2022 at 05:20:46PM +0200, Claudio Jeker wrote:
> On Sun, Apr 24, 2022 at 07:06:19PM +0200, Claudio Jeker wrote:
> > On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> > excludes Zen APU CPUs) this should show additional temp info. This is
> > based on info from the Linux k10temp driver.
> > 
> > Additionally use the MSRs defined in "Open-Source Register Reference For
> > AMD Family 17h Processors" to measure the CPU core frequency.
> > That should be the actual speed of the CPU core during the measuring
> > interval.
> > 
> > On my T14g2 the output is now for example:
> > ksmn0.temp0   63.88 degC  Tctl
> > ksmn0.frequency0  3553141515.00 Hz  CPU0
> > ksmn0.frequency1  3549080315.00 Hz  CPU2
> > ksmn0.frequency2  3552369937.00 Hz  CPU4
> > ksmn0.frequency3  3546055048.00 Hz  CPU6
> > ksmn0.frequency4  3546854449.00 Hz  CPU8
> > ksmn0.frequency5  3543869698.00 Hz  CPU10
> > ksmn0.frequency6  3542551127.00 Hz  CPU12
> > ksmn0.frequency7  4441623647.00 Hz  CPU14
> > 
> > It is interesting to watch turbo kick in and how temp causes the CPU to
> > throttle.
> > 
> > I only tested this on systems with APUs so I could not test the Tccd temp
> > reporting.
> 
> With the frequency sensor moved to cpu(4), this just adds the
> additional Tccd temperature sensors. It does not fix the duplication of the
> ksmn(4) sensors. That needs to be fixed by not attaching on the duplicate
> root complexes. How that is done I still need to figure out.
> 
> -- 
> :wq Claudio

I think this looks good now, I'm not too worried about the tCTL offsets.
That can be figured out later. ok brynet@

As for the attaching issue, would only attaching one device in the
kernel config be a nasty hack? If I understand, this should make the
duplicate complex attach as ppb(4) again.

ksmn0   at pci?

> Index: ksmn.c
> ===
> RCS file: /cvs/src/sys/dev/pci/ksmn.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 ksmn.c
> --- ksmn.c	11 Mar 2022 18:00:50 -  1.6
> +++ ksmn.c	25 Apr 2022 15:19:21 -
> @@ -42,6 +42,7 @@
>    * [31:21]  Current reported temperature.
>    */
>  #define SMU_17H_THM		0x59800
> +#define SMU_17H_CCD_THM(o, x)	(SMU_17H_THM + (o) + ((x) * 4))
>  #define GET_CURTMP(r)		(((r) >> 21) & 0x7ff)
>  
>  /*
> @@ -50,6 +51,8 @@
>   */
>  #define CURTMP_17H_RANGE_SEL	(1 << 19)
>  #define CURTMP_17H_RANGE_ADJUST	490
> +#define CURTMP_CCD_VALID	(1 << 11)
> +#define CURTMP_CCD_MASK		0x7ff
>  
>  /*
>   * Undocumented tCTL offsets gleamed from Linux k10temp driver.
> @@ -75,13 +78,18 @@ struct ksmn_softc {
>  	pcitag_t		sc_pcitag;
>  
>  	int			sc_tctl_offset;
> +	unsigned int		sc_ccd_valid;	/* available Tccds */
> +	unsigned int		sc_ccd_offset;
>  
> -	struct ksensor		sc_sensor;
>  	struct ksensordev	sc_sensordev;
> +	struct ksensor		sc_sensor;	/* Tctl */
> +	struct ksensor		sc_ccd_sensor[12];	/* Tccd */
>  };
>  
>  int	ksmn_match(struct device *, void *, void *);
>  void	ksmn_attach(struct device *, struct device *, void *);
> +uint32_t	ksmn_read_reg(struct ksmn_softc *, uint32_t);
> +void	ksmn_ccd_attach(struct ksmn_softc *, int);
>  void	ksmn_refresh(void *);
>  
>  const struct cfattach ksmn_ca = {
> @@ -113,7 +121,9 @@ ksmn_attach(struct device *parent, struc
>  	struct ksmn_softc	*sc = (struct ksmn_softc *)self;
>  	struct pci_attach_args	*pa = aux;
>  	struct curtmp_offset	*p;
> -	extern char		cpu_model[];
> +	struct cpu_info		*ci = curcpu();
> +	extern char		 cpu_model[];
> +
>  
>  	sc->sc_pc = pa->pa_pc;
>  	sc->sc_pcitag = pa->pa_tag;
> @@ -122,6 +132,7 @@ ksmn_attach(struct device *parent, struc
>  	    sizeof(sc->sc_sensordev.xname));
>  
>  	sc->sc_sensor.type = SENSOR_TEMP;
> +	snprintf(sc->sc_sensor.desc, sizeof(sc->sc_sensor.desc), "Tctl");
>  	sensor_attach(&sc->sc_sensordev, &sc->sc_sensor);
>  
>  	/*
> @@ -136,6 +147,38 @@ ksmn_attach(struct device *parent, struc
>  		sc->sc_tctl_offset = p->tctl_offset;
>  	}
>  
> +	sc->sc_ccd_offset = 0x154;
> +
> +	if (ci->ci_family == 0x17 || ci->ci_family == 0x18) {
> +		switch (ci->ci_model) {
> +		case 0x1:	/* Zen */
> +		case 0x8:	/* Zen+ */
> +		case 0x11:	/* Zen APU */
> +		case 0x18:	/* Zen+ APU */
> +			ksmn_ccd_attach(sc, 4);
> +			break;
> +		case 0x31:	/* Zen2 Threadripper */
> +		case 0x60:	/* Renoir */
> +		case 0x68:	/* Lucienne */
> +  

Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-25 Thread Claudio Jeker
On Sun, Apr 24, 2022 at 07:06:19PM +0200, Claudio Jeker wrote:
> On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> excludes Zen APU CPUs) this should show additional temp info. This is
> based on info from the Linux k10temp driver.
> 
> Additionally use the MSRs defined in "Open-Source Register Reference For
> AMD Family 17h Processors" to measure the CPU core frequency.
> That should be the actual speed of the CPU core during the measuring
> interval.
> 
> On my T14g2 the output is now for example:
> ksmn0.temp0   63.88 degC  Tctl
> ksmn0.frequency0  3553141515.00 Hz  CPU0
> ksmn0.frequency1  3549080315.00 Hz  CPU2
> ksmn0.frequency2  3552369937.00 Hz  CPU4
> ksmn0.frequency3  3546055048.00 Hz  CPU6
> ksmn0.frequency4  3546854449.00 Hz  CPU8
> ksmn0.frequency5  3543869698.00 Hz  CPU10
> ksmn0.frequency6  3542551127.00 Hz  CPU12
> ksmn0.frequency7  4441623647.00 Hz  CPU14
> 
> It is interesting to watch turbo kick in and how temp causes the CPU to
> throttle.
> 
> I only tested this on systems with APUs so I could not test the Tccd temp
> reporting.

With the frequency sensor moved to cpu(4), this just adds the
additional Tccd temperature sensors. It does not fix the duplication of the
ksmn(4) sensors. That needs to be fixed by not attaching on the duplicate
root complexes. How that is done I still need to figure out.
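
For readers following along, here is a rough userland sketch of how the
raw register values might be turned into temperatures using the macros
from the diff. It is not the driver code. It assumes, following the
Linux k10temp driver, that raw counts are 0.125 degC steps, that
CURTMP_17H_RANGE_ADJUST is in tenths of a degree (49.0 degC), and that
Tccd readings always carry that 49 degC offset. The function names
tctl_to_mdegc() and tccd_to_mdegc() are made up for this example.

```c
#include <stdint.h>

/* Constants as they appear in the ksmn.c diff. */
#define SMU_17H_THM		0x59800
#define SMU_17H_CCD_THM(o, x)	(SMU_17H_THM + (o) + ((x) * 4))
#define GET_CURTMP(r)		(((r) >> 21) & 0x7ff)
#define CURTMP_17H_RANGE_SEL	(1 << 19)
#define CURTMP_17H_RANGE_ADJUST	490
#define CURTMP_CCD_VALID	(1 << 11)
#define CURTMP_CCD_MASK		0x7ff

/*
 * Tctl in millidegrees Celsius (hypothetical helper).
 * Assumption: 0.125 degC per count, minus 49.0 degC when the
 * range-select bit is set.
 */
int32_t
tctl_to_mdegc(uint32_t reg)
{
	int32_t mdeg = (int32_t)GET_CURTMP(reg) * 125;

	if (reg & CURTMP_17H_RANGE_SEL)
		mdeg -= CURTMP_17H_RANGE_ADJUST * 100;
	return mdeg;
}

/*
 * Tccd in millidegrees Celsius (hypothetical helper).
 * Returns 0 and stores the reading if the valid bit is set, else -1.
 * Assumption: the 49.0 degC offset always applies to Tccd.
 */
int
tccd_to_mdegc(uint32_t reg, int32_t *mdeg)
{
	if ((reg & CURTMP_CCD_VALID) == 0)
		return -1;
	*mdeg = (int32_t)(reg & CURTMP_CCD_MASK) * 125 -
	    CURTMP_17H_RANGE_ADJUST * 100;
	return 0;
}
```

Under these assumptions a raw Tctl count of 511 decodes to 63.875 degC,
and SMU_17H_CCD_THM(0x154, 2) addresses the third CCD register at
0x5995c.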

-- 
:wq Claudio

Index: ksmn.c
===
RCS file: /cvs/src/sys/dev/pci/ksmn.c,v
retrieving revision 1.6
diff -u -p -r1.6 ksmn.c
--- ksmn.c  11 Mar 2022 18:00:50 -  1.6
+++ ksmn.c  25 Apr 2022 15:19:21 -
@@ -42,6 +42,7 @@
   * [31:21]  Current reported temperature.
   */
 #define SMU_17H_THM		0x59800
+#define SMU_17H_CCD_THM(o, x)	(SMU_17H_THM + (o) + ((x) * 4))
 #define GET_CURTMP(r)		(((r) >> 21) & 0x7ff)
 
 /*
@@ -50,6 +51,8 @@
  */
 #define CURTMP_17H_RANGE_SEL	(1 << 19)
 #define CURTMP_17H_RANGE_ADJUST	490
+#define CURTMP_CCD_VALID	(1 << 11)
+#define CURTMP_CCD_MASK		0x7ff
 
 /*
  * Undocumented tCTL offsets gleamed from Linux k10temp driver.
@@ -75,13 +78,18 @@ struct ksmn_softc {
 	pcitag_t		sc_pcitag;
 
 	int			sc_tctl_offset;
+	unsigned int		sc_ccd_valid;	/* available Tccds */
+	unsigned int		sc_ccd_offset;
 
-	struct ksensor		sc_sensor;
 	struct ksensordev	sc_sensordev;
+	struct ksensor		sc_sensor;	/* Tctl */
+	struct ksensor		sc_ccd_sensor[12];	/* Tccd */
 };
 
 int	ksmn_match(struct device *, void *, void *);
 void	ksmn_attach(struct device *, struct device *, void *);
+uint32_t	ksmn_read_reg(struct ksmn_softc *, uint32_t);
+void	ksmn_ccd_attach(struct ksmn_softc *, int);
 void	ksmn_refresh(void *);
 
 const struct cfattach ksmn_ca = {
@@ -113,7 +121,9 @@ ksmn_attach(struct device *parent, struc
 	struct ksmn_softc	*sc = (struct ksmn_softc *)self;
 	struct pci_attach_args	*pa = aux;
 	struct curtmp_offset	*p;
-	extern char		cpu_model[];
+	struct cpu_info		*ci = curcpu();
+	extern char		 cpu_model[];
+
 
 	sc->sc_pc = pa->pa_pc;
 	sc->sc_pcitag = pa->pa_tag;
@@ -122,6 +132,7 @@ ksmn_attach(struct device *parent, struc
 	    sizeof(sc->sc_sensordev.xname));
 
 	sc->sc_sensor.type = SENSOR_TEMP;
+	snprintf(sc->sc_sensor.desc, sizeof(sc->sc_sensor.desc), "Tctl");
 	sensor_attach(&sc->sc_sensordev, &sc->sc_sensor);
 
 	/*
@@ -136,6 +147,38 @@ ksmn_attach(struct device *parent, struc
 		sc->sc_tctl_offset = p->tctl_offset;
 	}
 
+	sc->sc_ccd_offset = 0x154;
+
+	if (ci->ci_family == 0x17 || ci->ci_family == 0x18) {
+		switch (ci->ci_model) {
+		case 0x1:	/* Zen */
+		case 0x8:	/* Zen+ */
+		case 0x11:	/* Zen APU */
+		case 0x18:	/* Zen+ APU */
+			ksmn_ccd_attach(sc, 4);
+			break;
+		case 0x31:	/* Zen2 Threadripper */
+		case 0x60:	/* Renoir */
+		case 0x68:	/* Lucienne */
+		case 0x71:	/* Zen2 */
+			ksmn_ccd_attach(sc, 8);
+			break;
+		}
+	} else if (ci->ci_family == 0x19) {
+		uint32_t m = ci->ci_model;
+
+		if ((m >= 0x40 && m <= 0x4f) ||
+		    (m >= 0x10 && m <= 0x1f) ||
+		    (m >= 0xa0 && m <= 0xaf))
+			sc->sc_ccd_offset = 0x300;
+
+		if ((m >= 0x10 && m <= 0x1f) ||
+		    (m >= 0xa0 && m <= 0xaf))
+   

Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-24 Thread David Gwynne
On Sun, Apr 24, 2022 at 11:32:53PM +0200, Claudio Jeker wrote:
> On Sun, Apr 24, 2022 at 02:30:37PM -0400, Bryan Steele wrote:
> > On Sun, Apr 24, 2022 at 07:06:19PM +0200, Claudio Jeker wrote:
> > > On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> > > excludes Zen APU CPUs) this should show additional temp info. This is
> > > based on info from the Linux k10temp driver.
> > > 
> > > Additionally use the MSRs defined in "Open-Source Register Reference For
> > > AMD Family 17h Processors" to measure the CPU core frequency.
> > > > That should be the actual speed of the CPU core during the measuring
> > > interval.

Indeed.

Intel have aperf and mperf (and pperf and smi_count) MSRs too, so
I would argue that an AMD specific driver is the wrong place for this.
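
To make the vendor-neutral point concrete, here is a minimal sketch of
the arithmetic, not taken from the diff. It assumes the usual meaning of
the two counters: mperf ticks at a fixed reference clock and aperf ticks
at the actual core clock, so their ratio over an interval scales the
reference frequency. The struct, the function name effective_hz() and
the ref_hz parameter are illustrative, and reading the MSRs themselves
is left out because that needs kernel context.

```c
#include <stdint.h>

/* One snapshot of the two counters; field names are illustrative. */
struct perf_sample {
	uint64_t aperf;		/* ticks at the actual core clock */
	uint64_t mperf;		/* ticks at the fixed reference clock */
};

/*
 * Effective frequency over the interval between two snapshots:
 * ref_hz scaled by the ratio of actual to reference ticks.
 * Returns 0 if no reference ticks elapsed.
 */
uint64_t
effective_hz(const struct perf_sample *prev, const struct perf_sample *cur,
    uint64_t ref_hz)
{
	uint64_t da = cur->aperf - prev->aperf;	/* unsigned math survives wrap */
	uint64_t dm = cur->mperf - prev->mperf;

	if (dm == 0)
		return 0;
	/* double keeps the sketch short; kernel code would use integer math */
	return (uint64_t)((double)ref_hz * (double)da / (double)dm);
}
```

With a 2 GHz reference clock, 3000 actual ticks against 2000 reference
ticks works out to an effective 3 GHz. Nothing in the calculation is
specific to one vendor.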

> > > 
> > > On my T14g2 the output is now for example:
> > > ksmn0.temp0   63.88 degC  Tctl
> > > > ksmn0.frequency0  3553141515.00 Hz  CPU0
> > > > ksmn0.frequency1  3549080315.00 Hz  CPU2
> > > > ksmn0.frequency2  3552369937.00 Hz  CPU4
> > > > ksmn0.frequency3  3546055048.00 Hz  CPU6
> > > > ksmn0.frequency4  3546854449.00 Hz  CPU8
> > > > ksmn0.frequency5  3543869698.00 Hz  CPU10
> > > > ksmn0.frequency6  3542551127.00 Hz  CPU12
> > > > ksmn0.frequency7  4441623647.00 Hz  CPU14
> > > > 
> > > > It is interesting to watch turbo kick in and how temp causes the CPU to
> > > throttle.

Yes, I've been exporting it via some quick and dirty kstat code at work
on a bunch of boxes, which in turn gets stored with all the other kstats
and some other metrics we're interested in such as the CPU stats the
kernel collects, and some counters specific programs keep track of and
report along with their own getrusage.

We have one thing in particular that has a fairly constant workload.
When it's the only thing running you see the effective performance
from these MSRs report the clock running at about 1.2GHz, and the
program says it's averaging about 10% CPU. If you recompile something,
effective performance of the system spikes to 2.8GHz and that program
says its CPU usage halves, but it's doing the same amount of work
via all the other metrics it reports.

Pretty cool.

I don't think this driver is the right place to read the MSRs though,
and I'm not sure the cpu driver is the right place either. There's
a bunch of other MSRs on recent CPUs from both AMD and Intel that
can report power/energy measurements (the Running Average Power
Limit aka RAPL stuff), but on AMD those measurements are at the core and
package level rather than on each thread like our cpu driver attaches
to. From what I remember the Intel RAPL bits report stuff about DRAM and
different bits on the die.
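
As a rough illustration of what consuming such an energy counter
involves, here is a sketch under assumptions that should be checked
against the vendor documentation: the energy unit is taken to be
1/2^ESU joules with ESU in bits 12:8 of a power-unit register, and the
energy counter is taken to be a 32-bit value that wraps. The function
names are hypothetical and no MSR is actually read.

```c
#include <stdint.h>

/*
 * Joules represented by one counter increment.
 * Assumption: bits 12:8 of the unit register hold the energy
 * status unit (ESU), and one count is 1/2^ESU joules.
 */
double
rapl_joules_per_count(uint64_t unit_reg)
{
	unsigned int esu = (unsigned int)((unit_reg >> 8) & 0x1f);

	return 1.0 / (double)(1ULL << esu);
}

/*
 * Average power in watts between two counter snapshots.
 * Assumption: the counter is 32 bits wide, so unsigned
 * subtraction handles a single wraparound.
 */
double
rapl_watts(uint32_t prev, uint32_t cur, double joules_per_count,
    double seconds)
{
	uint32_t delta = cur - prev;

	if (seconds <= 0.0)
		return 0.0;
	return (double)delta * joules_per_count / seconds;
}
```

Whether a counter covers a core or the whole package only changes which
snapshots are fed in, which is part of why the per-thread cpu driver is
an awkward home for it.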

dlg



Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-24 Thread Claudio Jeker
On Sun, Apr 24, 2022 at 02:30:37PM -0400, Bryan Steele wrote:
> On Sun, Apr 24, 2022 at 07:06:19PM +0200, Claudio Jeker wrote:
> > On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> > excludes Zen APU CPUs) this should show additional temp info. This is
> > based on info from the Linux k10temp driver.
> > 
> > Additionally use the MSRs defined in "Open-Source Register Reference For
> > AMD Family 17h Processors" to measure the CPU core frequency.
> > That should be the actual speed of the CPU core during the measuring
> > interval.
> > 
> > On my T14g2 the output is now for example:
> > ksmn0.temp0   63.88 degC  Tctl
> > ksmn0.frequency0  3553141515.00 Hz  CPU0
> > ksmn0.frequency1  3549080315.00 Hz  CPU2
> > ksmn0.frequency2  3552369937.00 Hz  CPU4
> > ksmn0.frequency3  3546055048.00 Hz  CPU6
> > ksmn0.frequency4  3546854449.00 Hz  CPU8
> > ksmn0.frequency5  3543869698.00 Hz  CPU10
> > ksmn0.frequency6  3542551127.00 Hz  CPU12
> > ksmn0.frequency7  4441623647.00 Hz  CPU14
> > 
> > It is interesting to watch turbo kick in and how temp causes the CPU to
> > throttle.
> > 
> > I only tested this on systems with APUs so I could not test the Tccd temp
> > reporting.
> > -- 
> > :wq Claudio
> 
> Awesome! :-)
> 
> I can see this adding a bunch of extra frequency sensors on higher-end
> Ryzen/Threadripper/EPYC CPUs, considering it's displayed in "Hz" that
> might be a bit overwhelming to look at.. but that's just a cosmetic
> nit for systat.
> 
> I'll test on some of my AMD machines later, if someone doesn't beat me
> to it.
> 
> Some comments below..
> 
> > Index: ksmn.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/ksmn.c,v
> > retrieving revision 1.6
> > diff -u -p -r1.6 ksmn.c
> > --- ksmn.c  11 Mar 2022 18:00:50 -  1.6
> > +++ ksmn.c  24 Apr 2022 16:47:08 -
> > @@ -20,6 +20,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  
> > @@ -42,6 +43,7 @@
> >    * [31:21]  Current reported temperature.
> >    */
> >  #define SMU_17H_THM		0x59800
> > +#define SMU_17H_CCD_THM(o, x)	(SMU_17H_THM + (o) + ((x) * 4))
> >  #define GET_CURTMP(r)		(((r) >> 21) & 0x7ff)
> >  
> >  /*
> > @@ -50,6 +52,8 @@
> >   */
> >  #define CURTMP_17H_RANGE_SEL	(1 << 19)
> >  #define CURTMP_17H_RANGE_ADJUST	490
> > +#define CURTMP_CCD_VALID	(1 << 11)
> > +#define CURTMP_CCD_MASK		0x7ff
> >  
> >  /*
> >   * Undocumented tCTL offsets gleamed from Linux k10temp driver.
> > @@ -75,13 +79,24 @@ struct ksmn_softc {
> >  	pcitag_t		sc_pcitag;
> >  
> >  	int			sc_tctl_offset;
> > +	unsigned int		sc_ccd_valid;	/* available Tccds */
> > +	unsigned int		sc_ccd_offset;
> > +	int			sc_hz_count;
> >  
> > -	struct ksensor		sc_sensor;
> >  	struct ksensordev	sc_sensordev;
> > +	struct ksensor		sc_sensor;	/* Tctl */
> > +	struct ksensor		sc_ccd_sensor[12];	/* Tccd */
> > +
> > +	struct timeout		sc_hz_timeout;
> > +	struct ksensor		*sc_hz_sensor;
> > +	struct cpu_info		**sc_hz_cpu_info;
> >  };
> >  
> >  int	ksmn_match(struct device *, void *, void *);
> >  void	ksmn_attach(struct device *, struct device *, void *);
> > +uint32_t	ksmn_read_reg(struct ksmn_softc *, uint32_t);
> > +void	ksmn_ccd_attach(struct ksmn_softc *, int);
> > +void	ksmn_hz_task(void *);
> >  void	ksmn_refresh(void *);
> >  
> >  const struct cfattach ksmn_ca = {
> > @@ -113,7 +128,12 @@ ksmn_attach(struct device *parent, struc
> >  	struct ksmn_softc	*sc = (struct ksmn_softc *)self;
> >  	struct pci_attach_args	*pa = aux;
> >  	struct curtmp_offset	*p;
> > -	extern char		cpu_model[];
> > +	CPU_INFO_ITERATOR	 cii;
> > +	struct cpu_info		*ci = curcpu();
> > +	struct ksensor		*s;
> > +	extern char		 cpu_model[];
> > +	int			 i;
> > +
> >  
> >  	sc->sc_pc = pa->pa_pc;
> >  	sc->sc_pcitag = pa->pa_tag;
> > @@ -122,6 +142,7 @@ ksmn_attach(struct device *parent, struc
> >  	    sizeof(sc->sc_sensordev.xname));
> >  
> >  	sc->sc_sensor.type = SENSOR_TEMP;
> > +	snprintf(sc->sc_sensor.desc, sizeof(sc->sc_sensor.desc), "Tctl");
> >  	sensor_attach(&sc->sc_sensordev, &sc->sc_sensor);
> >  
> >  	/*
> > @@ -136,6 +157,80 @@ ksmn_attach(struct device *parent, struc
> >  		sc->sc_tctl_offset = p->tctl_offset;
> >  	}
> >  
> > +	sc->sc_ccd_offset = 0x154;
> > +
> > +	if (ci->ci_family == 0x17 || ci->ci_family == 0x18) {
> > +		switch (ci->ci_model) {
> > +		case 0x1:	/* Zen */
> > +   case 0x8:   /* 

Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-24 Thread Hrvoje Popovski
On 24.4.2022. 20:24, Hrvoje Popovski wrote:
> after diff
> smc24# sysctl | grep ksmn
> hw.sensors.ksmn0.temp0=47.50 degC (Tctl)
> hw.sensors.ksmn0.temp1=45.25 degC (Tccd0)
> hw.sensors.ksmn0.temp2=46.00 degC (Tccd2)
> hw.sensors.ksmn0.temp3=45.75 degC (Tccd4)
> hw.sensors.ksmn0.temp4=47.50 degC (Tccd6)
> hw.sensors.ksmn0.frequency0=1960043474.00 Hz (CPU0)
> hw.sensors.ksmn0.frequency1=2178969010.00 Hz (CPU1)
> hw.sensors.ksmn0.frequency2=2021703765.00 Hz (CPU2)
> hw.sensors.ksmn0.frequency3=2791496996.00 Hz (CPU3)
> hw.sensors.ksmn0.frequency4=1936332732.00 Hz (CPU4)
> hw.sensors.ksmn0.frequency5=1952819576.00 Hz (CPU5)
> hw.sensors.ksmn0.frequency6=1895289933.00 Hz (CPU6)
> hw.sensors.ksmn0.frequency7=1906813124.00 Hz (CPU7)
> hw.sensors.ksmn0.frequency8=1916662200.00 Hz (CPU8)
> hw.sensors.ksmn0.frequency9=1925515463.00 Hz (CPU9)
> hw.sensors.ksmn0.frequency10=2165544390.00 Hz (CPU10)
> hw.sensors.ksmn0.frequency11=1940854644.00 Hz (CPU11)
> hw.sensors.ksmn0.frequency12=1963695350.00 Hz (CPU12)
> hw.sensors.ksmn0.frequency13=2038281258.00 Hz (CPU13)
> hw.sensors.ksmn0.frequency14=1973428768.00 Hz (CPU14)
> hw.sensors.ksmn0.frequency15=2035124252.00 Hz (CPU15)
> hw.sensors.ksmn0.frequency16=1931312925.00 Hz (CPU16)
> hw.sensors.ksmn0.frequency17=191422.00 Hz (CPU17)
> hw.sensors.ksmn0.frequency18=1913169799.00 Hz (CPU18)
> hw.sensors.ksmn0.frequency19=2472108200.00 Hz (CPU19)
> hw.sensors.ksmn0.frequency20=1915108480.00 Hz (CPU20)
> hw.sensors.ksmn0.frequency21=2862980120.00 Hz (CPU21)
> hw.sensors.ksmn0.frequency22=2639124653.00 Hz (CPU22)
> hw.sensors.ksmn0.frequency23=1908989778.00 Hz (CPU23)
> hw.sensors.ksmn1.temp0=47.50 degC (Tctl)
> hw.sensors.ksmn1.temp1=45.25 degC (Tccd0)
> hw.sensors.ksmn1.temp2=46.00 degC (Tccd2)
> hw.sensors.ksmn1.temp3=45.75 degC (Tccd4)
> hw.sensors.ksmn1.temp4=47.50 degC (Tccd6)
> hw.sensors.ksmn1.frequency0=1968382096.00 Hz (CPU0)
> hw.sensors.ksmn1.frequency1=2376295723.00 Hz (CPU1)
> hw.sensors.ksmn1.frequency2=2369074799.00 Hz (CPU2)
> hw.sensors.ksmn1.frequency3=2404712762.00 Hz (CPU3)
> hw.sensors.ksmn1.frequency4=2457581506.00 Hz (CPU4)
> hw.sensors.ksmn1.frequency5=2401611206.00 Hz (CPU5)
> hw.sensors.ksmn1.frequency6=2339088239.00 Hz (CPU6)
> hw.sensors.ksmn1.frequency7=2463824725.00 Hz (CPU7)
> hw.sensors.ksmn1.frequency8=2359482485.00 Hz (CPU8)
> hw.sensors.ksmn1.frequency9=2483767808.00 Hz (CPU9)
> hw.sensors.ksmn1.frequency10=2413048435.00 Hz (CPU10)
> hw.sensors.ksmn1.frequency11=2391201370.00 Hz (CPU11)
> hw.sensors.ksmn1.frequency12=1944466261.00 Hz (CPU12)
> hw.sensors.ksmn1.frequency13=1939033492.00 Hz (CPU13)
> hw.sensors.ksmn1.frequency14=1949862067.00 Hz (CPU14)
> hw.sensors.ksmn1.frequency15=1947783743.00 Hz (CPU15)
> hw.sensors.ksmn1.frequency16=1919198696.00 Hz (CPU16)
> hw.sensors.ksmn1.frequency17=1953120383.00 Hz (CPU17)
> hw.sensors.ksmn1.frequency18=2543332610.00 Hz (CPU18)
> hw.sensors.ksmn1.frequency19=2564893500.00 Hz (CPU19)
> hw.sensors.ksmn1.frequency20=2638202441.00 Hz (CPU20)
> hw.sensors.ksmn1.frequency21=2814783269.00 Hz (CPU21)
> hw.sensors.ksmn1.frequency22=2808046584.00 Hz (CPU22)
> hw.sensors.ksmn1.frequency23=2578708588.00 Hz (CPU23)
> hw.sensors.ksmn2.temp0=47.50 degC (Tctl)
> hw.sensors.ksmn2.temp1=45.25 degC (Tccd0)
> hw.sensors.ksmn2.temp2=46.00 degC (Tccd2)
> hw.sensors.ksmn2.temp3=45.75 degC (Tccd4)
> hw.sensors.ksmn2.temp4=47.50 degC (Tccd6)
> hw.sensors.ksmn2.frequency0=2001533749.00 Hz (CPU0)
> hw.sensors.ksmn2.frequency1=1948022864.00 Hz (CPU1)
> hw.sensors.ksmn2.frequency2=1949718978.00 Hz (CPU2)
> hw.sensors.ksmn2.frequency3=2093756889.00 Hz (CPU3)
> hw.sensors.ksmn2.frequency4=1948401172.00 Hz (CPU4)
> hw.sensors.ksmn2.frequency5=1990612716.00 Hz (CPU5)
> hw.sensors.ksmn2.frequency6=2112140214.00 Hz (CPU6)
> hw.sensors.ksmn2.frequency7=1962903090.00 Hz (CPU7)
> hw.sensors.ksmn2.frequency8=1985202582.00 Hz (CPU8)
> hw.sensors.ksmn2.frequency9=2190306365.00 Hz (CPU9)
> hw.sensors.ksmn2.frequency10=1991116471.00 Hz (CPU10)
> hw.sensors.ksmn2.frequency11=2002007440.00 Hz (CPU11)
> hw.sensors.ksmn2.frequency12=3126687467.00 Hz (CPU12)
> hw.sensors.ksmn2.frequency13=3360747003.00 Hz (CPU13)
> hw.sensors.ksmn2.frequency14=2544531280.00 Hz (CPU14)
> hw.sensors.ksmn2.frequency15=3270889025.00 Hz (CPU15)
> hw.sensors.ksmn2.frequency16=3112205978.00 Hz (CPU16)
> hw.sensors.ksmn2.frequency17=2553566819.00 Hz (CPU17)
> hw.sensors.ksmn2.frequency18=2106320461.00 Hz (CPU18)
> hw.sensors.ksmn2.frequency19=2580420523.00 Hz (CPU19)
> hw.sensors.ksmn2.frequency20=2046857758.00 Hz (CPU20)
> hw.sensors.ksmn2.frequency21=2440632976.00 Hz (CPU21)
> hw.sensors.ksmn2.frequency22=2398193682.00 Hz (CPU22)
> hw.sensors.ksmn2.frequency23=2242716702.00 Hz (CPU23)
> hw.sensors.ksmn3.temp0=47.50 degC (Tctl)
> hw.sensors.ksmn3.temp1=45.25 degC (Tccd0)
> hw.sensors.ksmn3.temp2=46.00 degC (Tccd2)
> hw.sensors.ksmn3.temp3=45.75 degC (Tccd4)
> hw.sensors.ksmn3.temp4=47.50 degC (Tccd6)
> hw.sensors.ksmn3.frequency0=1793941053.00 Hz (CPU0)
> 

Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-24 Thread Bryan Steele
On Sun, Apr 24, 2022 at 07:06:19PM +0200, Claudio Jeker wrote:
> On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> excludes Zen APU CPUs) this should show additional temp info. This is
> based on info from the Linux k10temp driver.
> 
> Additionally use the MSRs defined in "Open-Source Register Reference For
> AMD Family 17h Processors" to measure the CPU core frequency.
> That should be the actual speed of the CPU core during the measuring
> interval.
> 
> On my T14g2 the output is now for example:
> ksmn0.temp0   63.88 degC  Tctl
> ksmn0.frequency0  3553141515.00 Hz  CPU0
> ksmn0.frequency1  3549080315.00 Hz  CPU2
> ksmn0.frequency2  3552369937.00 Hz  CPU4
> ksmn0.frequency3  3546055048.00 Hz  CPU6
> ksmn0.frequency4  3546854449.00 Hz  CPU8
> ksmn0.frequency5  3543869698.00 Hz  CPU10
> ksmn0.frequency6  3542551127.00 Hz  CPU12
> ksmn0.frequency7  4441623647.00 Hz  CPU14
> 
> It is interesting to watch turbo kick in and how temp causes the CPU to
> throttle.
> 
> I only tested this on systems with APUs so I could not test the Tccd temp
> reporting.
> -- 
> :wq Claudio

Awesome! :-)

I can see this adding a bunch of extra frequency sensors on higher-end
Ryzen/Threadripper/EPYC CPUs, considering it's displayed in "Hz" that
might be a bit overwhelming to look at.. but that's just a cosmetic
nit for systat.
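
To show what I mean by cosmetic, here is a small sketch of how a display
layer could scale the raw value into a readable unit. It is only an
illustration: format_hz() is a made-up helper and not something systat
or the sensors framework provides.

```c
#include <stdio.h>

/*
 * Format a frequency given in Hz with a scaled unit suffix.
 * Hypothetical helper; returns what snprintf() returns.
 */
int
format_hz(char *buf, size_t len, double hz)
{
	static const char *units[] = { "Hz", "kHz", "MHz", "GHz" };
	size_t i = 0;

	/* Divide by 1000 until the value fits, stopping at GHz. */
	while (hz >= 1000.0 && i < 3) {
		hz /= 1000.0;
		i++;
	}
	return snprintf(buf, len, "%.2f %s", hz, units[i]);
}
```

Called with 3553141515.0 this produces "3.55 GHz", which is a lot easier
to scan across a few dozen cores than ten digits of Hz.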

I'll test on some of my AMD machines later, if someone doesn't beat me
to it.

Some comments below..

> Index: ksmn.c
> ===
> RCS file: /cvs/src/sys/dev/pci/ksmn.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 ksmn.c
> --- ksmn.c	11 Mar 2022 18:00:50 -  1.6
> +++ ksmn.c	24 Apr 2022 16:47:08 -
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -42,6 +43,7 @@
>    * [31:21]  Current reported temperature.
>    */
>  #define SMU_17H_THM		0x59800
> +#define SMU_17H_CCD_THM(o, x)	(SMU_17H_THM + (o) + ((x) * 4))
>  #define GET_CURTMP(r)		(((r) >> 21) & 0x7ff)
>  
>  /*
> @@ -50,6 +52,8 @@
>   */
>  #define CURTMP_17H_RANGE_SEL	(1 << 19)
>  #define CURTMP_17H_RANGE_ADJUST	490
> +#define CURTMP_CCD_VALID	(1 << 11)
> +#define CURTMP_CCD_MASK		0x7ff
>  
>  /*
>   * Undocumented tCTL offsets gleamed from Linux k10temp driver.
> @@ -75,13 +79,24 @@ struct ksmn_softc {
>  	pcitag_t		sc_pcitag;
>  
>  	int			sc_tctl_offset;
> +	unsigned int		sc_ccd_valid;	/* available Tccds */
> +	unsigned int		sc_ccd_offset;
> +	int			sc_hz_count;
>  
> -	struct ksensor		sc_sensor;
>  	struct ksensordev	sc_sensordev;
> +	struct ksensor		sc_sensor;	/* Tctl */
> +	struct ksensor		sc_ccd_sensor[12];	/* Tccd */
> +
> +	struct timeout		sc_hz_timeout;
> +	struct ksensor		*sc_hz_sensor;
> +	struct cpu_info		**sc_hz_cpu_info;
>  };
>  
>  int	ksmn_match(struct device *, void *, void *);
>  void	ksmn_attach(struct device *, struct device *, void *);
> +uint32_t	ksmn_read_reg(struct ksmn_softc *, uint32_t);
> +void	ksmn_ccd_attach(struct ksmn_softc *, int);
> +void	ksmn_hz_task(void *);
>  void	ksmn_refresh(void *);
>  
>  const struct cfattach ksmn_ca = {
> @@ -113,7 +128,12 @@ ksmn_attach(struct device *parent, struc
>  	struct ksmn_softc	*sc = (struct ksmn_softc *)self;
>  	struct pci_attach_args	*pa = aux;
>  	struct curtmp_offset	*p;
> -	extern char		cpu_model[];
> +	CPU_INFO_ITERATOR	 cii;
> +	struct cpu_info		*ci = curcpu();
> +	struct ksensor		*s;
> +	extern char		 cpu_model[];
> +	int			 i;
> +
>  
>  	sc->sc_pc = pa->pa_pc;
>  	sc->sc_pcitag = pa->pa_tag;
> @@ -122,6 +142,7 @@ ksmn_attach(struct device *parent, struc
>  	    sizeof(sc->sc_sensordev.xname));
>  
>  	sc->sc_sensor.type = SENSOR_TEMP;
> +	snprintf(sc->sc_sensor.desc, sizeof(sc->sc_sensor.desc), "Tctl");
>  	sensor_attach(&sc->sc_sensordev, &sc->sc_sensor);
>  
>  	/*
> @@ -136,6 +157,80 @@ ksmn_attach(struct device *parent, struc
>  		sc->sc_tctl_offset = p->tctl_offset;
>  	}
>  
> +	sc->sc_ccd_offset = 0x154;
> +
> +	if (ci->ci_family == 0x17 || ci->ci_family == 0x18) {
> +		switch (ci->ci_model) {
> +		case 0x1:	/* Zen */
> +		case 0x8:	/* Zen+ */
> +		case 0x11:	/* Zen APU */
> +		case 0x18:	/* Zen+ APU */
> +			ksmn_ccd_attach(sc, 4);
> +			break;
> +		case 0x31:	/* Zen2 Threadripper */
> + case 0x60:  /* Renoir 

Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-24 Thread Hrvoje Popovski
On 24.4.2022. 19:06, Claudio Jeker wrote:
> On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> excludes Zen APU CPUs) this should show additional temp info. This is
> based on info from the Linux k10temp driver.
> 
> Additionally use the MSRs defined in "Open-Source Register Reference For
> AMD Family 17h Processors" to measure the CPU core frequency.
> That should be the actual speed of the CPU core during the measuring
> interval.
> 
> On my T14g2 the output is now for example:
> ksmn0.temp0   63.88 degC  Tctl
> ksmn0.frequency0  3553141515.00 Hz  CPU0
> ksmn0.frequency1  3549080315.00 Hz  CPU2
> ksmn0.frequency2  3552369937.00 Hz  CPU4
> ksmn0.frequency3  3546055048.00 Hz  CPU6
> ksmn0.frequency4  3546854449.00 Hz  CPU8
> ksmn0.frequency5  3543869698.00 Hz  CPU10
> ksmn0.frequency6  3542551127.00 Hz  CPU12
> ksmn0.frequency7  4441623647.00 Hz  CPU14
> 
> It is interesting to watch turbo kick in and how temp causes the CPU to
> throttle.
> 
> I only tested this on systems with APUs so I could not test the Tccd temp
> reporting.

Hi,

before diff

smc24# sysctl | grep ksmn
hw.sensors.ksmn0.temp0=48.00 degC
hw.sensors.ksmn1.temp0=48.00 degC
hw.sensors.ksmn2.temp0=48.00 degC
hw.sensors.ksmn3.temp0=48.00 degC
smc24#


after diff
smc24# sysctl | grep ksmn
hw.sensors.ksmn0.temp0=47.50 degC (Tctl)
hw.sensors.ksmn0.temp1=45.25 degC (Tccd0)
hw.sensors.ksmn0.temp2=46.00 degC (Tccd2)
hw.sensors.ksmn0.temp3=45.75 degC (Tccd4)
hw.sensors.ksmn0.temp4=47.50 degC (Tccd6)
hw.sensors.ksmn0.frequency0=1960043474.00 Hz (CPU0)
hw.sensors.ksmn0.frequency1=2178969010.00 Hz (CPU1)
hw.sensors.ksmn0.frequency2=2021703765.00 Hz (CPU2)
hw.sensors.ksmn0.frequency3=2791496996.00 Hz (CPU3)
hw.sensors.ksmn0.frequency4=1936332732.00 Hz (CPU4)
hw.sensors.ksmn0.frequency5=1952819576.00 Hz (CPU5)
hw.sensors.ksmn0.frequency6=1895289933.00 Hz (CPU6)
hw.sensors.ksmn0.frequency7=1906813124.00 Hz (CPU7)
hw.sensors.ksmn0.frequency8=1916662200.00 Hz (CPU8)
hw.sensors.ksmn0.frequency9=1925515463.00 Hz (CPU9)
hw.sensors.ksmn0.frequency10=2165544390.00 Hz (CPU10)
hw.sensors.ksmn0.frequency11=1940854644.00 Hz (CPU11)
hw.sensors.ksmn0.frequency12=1963695350.00 Hz (CPU12)
hw.sensors.ksmn0.frequency13=2038281258.00 Hz (CPU13)
hw.sensors.ksmn0.frequency14=1973428768.00 Hz (CPU14)
hw.sensors.ksmn0.frequency15=2035124252.00 Hz (CPU15)
hw.sensors.ksmn0.frequency16=1931312925.00 Hz (CPU16)
hw.sensors.ksmn0.frequency17=191422.00 Hz (CPU17)
hw.sensors.ksmn0.frequency18=1913169799.00 Hz (CPU18)
hw.sensors.ksmn0.frequency19=2472108200.00 Hz (CPU19)
hw.sensors.ksmn0.frequency20=1915108480.00 Hz (CPU20)
hw.sensors.ksmn0.frequency21=2862980120.00 Hz (CPU21)
hw.sensors.ksmn0.frequency22=2639124653.00 Hz (CPU22)
hw.sensors.ksmn0.frequency23=1908989778.00 Hz (CPU23)
hw.sensors.ksmn1.temp0=47.50 degC (Tctl)
hw.sensors.ksmn1.temp1=45.25 degC (Tccd0)
hw.sensors.ksmn1.temp2=46.00 degC (Tccd2)
hw.sensors.ksmn1.temp3=45.75 degC (Tccd4)
hw.sensors.ksmn1.temp4=47.50 degC (Tccd6)
hw.sensors.ksmn1.frequency0=1968382096.00 Hz (CPU0)
hw.sensors.ksmn1.frequency1=2376295723.00 Hz (CPU1)
hw.sensors.ksmn1.frequency2=2369074799.00 Hz (CPU2)
hw.sensors.ksmn1.frequency3=2404712762.00 Hz (CPU3)
hw.sensors.ksmn1.frequency4=2457581506.00 Hz (CPU4)
hw.sensors.ksmn1.frequency5=2401611206.00 Hz (CPU5)
hw.sensors.ksmn1.frequency6=2339088239.00 Hz (CPU6)
hw.sensors.ksmn1.frequency7=2463824725.00 Hz (CPU7)
hw.sensors.ksmn1.frequency8=2359482485.00 Hz (CPU8)
hw.sensors.ksmn1.frequency9=2483767808.00 Hz (CPU9)
hw.sensors.ksmn1.frequency10=2413048435.00 Hz (CPU10)
hw.sensors.ksmn1.frequency11=2391201370.00 Hz (CPU11)
hw.sensors.ksmn1.frequency12=1944466261.00 Hz (CPU12)
hw.sensors.ksmn1.frequency13=1939033492.00 Hz (CPU13)
hw.sensors.ksmn1.frequency14=1949862067.00 Hz (CPU14)
hw.sensors.ksmn1.frequency15=1947783743.00 Hz (CPU15)
hw.sensors.ksmn1.frequency16=1919198696.00 Hz (CPU16)
hw.sensors.ksmn1.frequency17=1953120383.00 Hz (CPU17)
hw.sensors.ksmn1.frequency18=2543332610.00 Hz (CPU18)
hw.sensors.ksmn1.frequency19=2564893500.00 Hz (CPU19)
hw.sensors.ksmn1.frequency20=2638202441.00 Hz (CPU20)
hw.sensors.ksmn1.frequency21=2814783269.00 Hz (CPU21)
hw.sensors.ksmn1.frequency22=2808046584.00 Hz (CPU22)
hw.sensors.ksmn1.frequency23=2578708588.00 Hz (CPU23)
hw.sensors.ksmn2.temp0=47.50 degC (Tctl)
hw.sensors.ksmn2.temp1=45.25 degC (Tccd0)
hw.sensors.ksmn2.temp2=46.00 degC (Tccd2)
hw.sensors.ksmn2.temp3=45.75 degC (Tccd4)
hw.sensors.ksmn2.temp4=47.50 degC (Tccd6)
hw.sensors.ksmn2.frequency0=2001533749.00 Hz (CPU0)
hw.sensors.ksmn2.frequency1=1948022864.00 Hz (CPU1)
hw.sensors.ksmn2.frequency2=1949718978.00 Hz (CPU2)
hw.sensors.ksmn2.frequency3=2093756889.00 Hz (CPU3)
hw.sensors.ksmn2.frequency4=1948401172.00 Hz (CPU4)
hw.sensors.ksmn2.frequency5=1990612716.00 Hz (CPU5)
hw.sensors.ksmn2.frequency6=2112140214.00