Re: Wrong temperature with AMD and amdtemp.ko
On 6-10-2015 06:28, Don Lewis wrote:
> On 3 Oct, Willem Jan Withagen wrote:
>> On 2-10-2015 23:32, Don Lewis wrote:
>>> On 2 Oct, Willem Jan Withagen wrote:
>>>>
>>>> Hi
>>>>
>>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>>
>>>> Processor: Opteron 6812, in Supermicro H8SGL
>>>>
>>>> dev.cpu.7.temperature: 11.1C
>>>> dev.cpu.6.temperature: 11.1C
>>>> dev.cpu.5.temperature: 11.1C
>>>> dev.cpu.4.temperature: 11.1C
>>>> dev.cpu.3.temperature: 11.1C
>>>> dev.cpu.2.temperature: 11.1C
>>>> dev.cpu.1.temperature: 11.1C
>>>> dev.cpu.0.temperature: 11.1C
>>>>
>>>> But I'm pretty sure it is not 11.1C in the datacenter
>>
>> If one boots into the BIOS, the BIOS suggests that it knows how to do
>> this conversion. Perhaps one can question the ultimate correctness of
>> the outcome, but the 51.3C value suggests some accuracy.
>
> That may be a measurement from a separate temperature sensor on the
> motherboard underneath the CPU socket.

Interesting point.
Sort of hard to get there to see if that is really the case.
But then it would be accessible for any of the other drivers/programs
like lm* or smb* to get the readings??
Now I only need to figure out which one of the many...

--WjW
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Wrong temperature with AMD and amdtemp.ko
On 3 Oct, Willem Jan Withagen wrote:
> On 2-10-2015 23:32, Don Lewis wrote:
>> On 2 Oct, Willem Jan Withagen wrote:
>>>
>>> Hi
>>>
>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>
>>> Processor: Opteron 6812, in Supermicro H8SGL
>>>
>>> dev.cpu.7.temperature: 11.1C
>>> dev.cpu.6.temperature: 11.1C
>>> dev.cpu.5.temperature: 11.1C
>>> dev.cpu.4.temperature: 11.1C
>>> dev.cpu.3.temperature: 11.1C
>>> dev.cpu.2.temperature: 11.1C
>>> dev.cpu.1.temperature: 11.1C
>>> dev.cpu.0.temperature: 11.1C
>>>
>>> But I'm pretty sure it is not 11.1C in the datacenter
>>>
>>> Or should I not use amdtemp.ko for this?
>>
>> The definition of the value that can be read from the temperature
>> register is pretty strange. For AMD Family 15h processors, the BIOS and
>> Kernel Developer's Guide (BKDG) says this:
>>
>>   Tctl is a processor temperature control value used for processor
>>   thermal management. Tctl is accessible through D18F3xA4[CurTmp].
>>   Tctl is a temperature on its own scale aligned to the processors
>>   cooling requirements. Therefore Tctl does not represent a temperature
>>   which could be measured on the die or the case of the processor.
>>   Instead, it specifies the processor temperature relative to the
>>   maximum operating temperature, Tctl,max. Tctl,max is specified in the
>>   power and thermal data sheet. Tctl is defined as follows for all
>>   parts:
>>
>>   A: For Tctl = Tctl_max to 255.875: the temperature of the part is
>>   [Tctl - Tctl_max] over the maximum operating temperature. The
>>   processor may take corrective actions that affects performance, such
>>   as HTC, to support the return to Tctl range A.
>>
>>   B: For Tctl = 0 to Tctl_max - 0.125: the temperature of the part is
>>   [Tctl_max - Tctl] under the maximum operating temperature.
>>
>> It would be nice to report Tctl_max so that we could at least know how
>> far the temperature is from the limit, but I don't know if that is
>> available. It might be the value in the HtcTmpLmt register, but the
>> BKDG is unclear about that. If not, we would have to build a table of
>> values from the datasheet.
>
> And
>
> On 2-10-2015 23:06, Jung-uk Kim wrote:
>> On 10/02/2015 16:49, Willem Jan Withagen wrote:
>
>> amdtemp(4):
>>
>> For Family 10h and later processors, “(the reported temperature) is a
>> non-physical temperature measured on an arbitrary scale and it does not
>> represent an actual physical temperature like die or case temperature.
>> Instead, it specifies the processor temperature relative to the point at
>> which the system must supply the maximum cooling for the processor's
>> specified maximum case temperature and maximum thermal power dissipation”
>> according to BIOS and Kernel Developer's Guide (BKDG) for AMD Processors,
>> http://developer.amd.com/documentation/guides/Pages/default.aspx.
>
> If one boots into the BIOS, the BIOS suggests that it knows how to do
> this conversion. Perhaps one can question the ultimate correctness of
> the outcome, but the 51.3C value suggests some accuracy.

That may be a measurement from a separate temperature sensor on the
motherboard underneath the CPU socket.

> Thus far I have not been able to locate the "Power and Thermal Datasheet"
> for the family 15h.
> Perhaps need to disassemble the BIOS, or check other tools or OSes on
> how they do this.
>
> --WjW
Re: Wrong temperature with AMD and amdtemp.ko
On 04/10/2015 20:10, Willem Jan Withagen wrote:
> On 4-10-2015 03:26, Shane Ambler wrote:
>> On 03/10/2015 20:12, Willem Jan Withagen wrote:
>>> On 2-10-2015 23:32, Don Lewis wrote:
>>>> On 2 Oct, Willem Jan Withagen wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>>>
>>>>> Processor: Opteron 6812, in Supermicro H8SGL
>>
>>> Thus far I have not been able to locate the "Power and Thermal Datasheet"
>>> for the family 15h.
>>> Perhaps need to disassemble the BIOS, or check other tools or OSes on
>>> how they do this.
>>
>> According to Supermicro the H8SGL supports FreeBSD 8.0 but not 9.1
>
> I see this as an omission on the website. I've never been bothered by
> FreeBSD compatibility in the 15+ years I've been using Supermicro.
>
>> Maybe you could approach Supermicro support and ask for assistance in
>> updating FreeBSD support - maybe for other boards as well.
>>
>> Another approach is asking iXsystems, I'm pretty sure they sell
>> re-badged Supermicro machines.
>
> Turns out it has nothing to do with the motherboard, but more with the
> temperature management hardware in the CPU. So asking either iXsystems or
> Supermicro does not really help.

But Supermicro would have access to the BIOS source and can tell quickly
how it is done there. As a motherboard manufacturer they would also have
the chance to ask AMD for info not easily found in docs.

If Supermicro doesn't provide info to OSS developers, then iXsystems
could ask Supermicro for it: they sell the hardware as their own and
need to write drivers to support it, and as a supporter of FreeBSD they
may be interested in writing the update themselves.

At the minimum one of them could tell you how the BIOS calculates the
temperature so you can match it. That would save a lot of experiments on
your part.

--
FreeBSD - the place to B...Software Developing

Shane Ambler
Re: Wrong temperature with AMD and amdtemp.ko
On 4-10-2015 03:26, Shane Ambler wrote:
> On 03/10/2015 20:12, Willem Jan Withagen wrote:
>> On 2-10-2015 23:32, Don Lewis wrote:
>>> On 2 Oct, Willem Jan Withagen wrote:
>>>>
>>>> Hi
>>>>
>>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>>
>>>> Processor: Opteron 6812, in Supermicro H8SGL
>
>> Thus far I have not been able to locate the "Power and Thermal Datasheet"
>> for the family 15h.
>> Perhaps need to disassemble the BIOS, or check other tools or OSes on
>> how they do this.
>
> According to Supermicro the H8SGL supports FreeBSD 8.0 but not 9.1

I see this as an omission on the website. I've never been bothered by
FreeBSD compatibility in the 15+ years I've been using Supermicro.

> Maybe you could approach Supermicro support and ask for assistance in
> updating FreeBSD support - maybe for other boards as well.
>
> Another approach is asking iXsystems, I'm pretty sure they sell
> re-badged Supermicro machines.

Turns out it has nothing to do with the motherboard, but more with the
temperature management hardware in the CPU. So asking either iXsystems or
Supermicro does not really help.

Fun part is that I have a few desktop systems with
"AMD Phenom(tm) II X6 1075T Processor"
and they more or less run with a temp that makes more sense.

I've verified the temps on the board with a thermo-cam and the CPU
cooler is around that temp. The junction temperature in the CPU package
will be higher, and you'd need to have a much better understanding of
what AMD does with its on-chip sensor.

I've been reading a bit more on this topic in all kinds of different
fora, and thus far everybody recognises that it is a sort of black art.

So I guess I'll just measure, plot and monitor it, to get a feeling for
where it should be.

--WjW
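For the "measure, plot and monitor" approach, a minimal Python sketch of the parsing step might look like the following. It assumes the sysctl output has been captured as text (for example from `sysctl dev.cpu`) and that each line has the `dev.cpu.N.temperature: 11.1C` form shown in this thread; the function names are hypothetical and not part of any FreeBSD tool.

```python
import re
from typing import Dict, Tuple

# Matches lines in the form shown in this thread,
# e.g. "dev.cpu.7.temperature: 11.1C".
LINE_RE = re.compile(r"^dev\.cpu\.(\d+)\.temperature:\s*(-?\d+(?:\.\d+)?)C$")


def parse_temperatures(sysctl_output: str) -> Dict[int, float]:
    """Return {cpu_index: reported_value} for every matching line."""
    readings: Dict[int, float] = {}
    for line in sysctl_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            readings[int(match.group(1))] = float(match.group(2))
    return readings


def summarize(readings: Dict[int, float]) -> Tuple[float, float]:
    """Return (lowest, highest) reported value across all cores."""
    values = list(readings.values())
    return min(values), max(values)
```

Logging the pair returned by `summarize` at a fixed interval would give one data point per sample to plot. On Family 10h and later parts the values are still on AMD's control scale rather than physical degrees, so only the trend is meaningful.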
Re: Wrong temperature with AMD and amdtemp.ko
On 03/10/2015 20:12, Willem Jan Withagen wrote:
> On 2-10-2015 23:32, Don Lewis wrote:
>> On 2 Oct, Willem Jan Withagen wrote:
>>>
>>> Hi
>>>
>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>
>>> Processor: Opteron 6812, in Supermicro H8SGL

> Thus far I have not been able to locate the "Power and Thermal Datasheet"
> for the family 15h.
> Perhaps need to disassemble the BIOS, or check other tools or OSes on
> how they do this.

According to Supermicro the H8SGL supports FreeBSD 8.0 but not 9.1

Maybe you could approach Supermicro support and ask for assistance in
updating FreeBSD support - maybe for other boards as well.

Another approach is asking iXsystems, I'm pretty sure they sell
re-badged Supermicro machines.

--
FreeBSD - the place to B...Software Developing

Shane Ambler
Re: Wrong temperature with AMD and amdtemp.ko
On 2-10-2015 23:32, Don Lewis wrote:
> On 2 Oct, Willem Jan Withagen wrote:
>>
>> Hi
>>
>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>
>> Processor: Opteron 6812, in Supermicro H8SGL
>>
>> dev.cpu.7.temperature: 11.1C
>> dev.cpu.6.temperature: 11.1C
>> dev.cpu.5.temperature: 11.1C
>> dev.cpu.4.temperature: 11.1C
>> dev.cpu.3.temperature: 11.1C
>> dev.cpu.2.temperature: 11.1C
>> dev.cpu.1.temperature: 11.1C
>> dev.cpu.0.temperature: 11.1C
>>
>> But I'm pretty sure it is not 11.1C in the datacenter
>>
>> Or should I not use amdtemp.ko for this?
>
> The definition of the value that can be read from the temperature
> register is pretty strange. For AMD Family 15h processors, the BIOS and
> Kernel Developer's Guide (BKDG) says this:
>
>   Tctl is a processor temperature control value used for processor
>   thermal management. Tctl is accessible through D18F3xA4[CurTmp].
>   Tctl is a temperature on its own scale aligned to the processors
>   cooling requirements. Therefore Tctl does not represent a temperature
>   which could be measured on the die or the case of the processor.
>   Instead, it specifies the processor temperature relative to the
>   maximum operating temperature, Tctl,max. Tctl,max is specified in the
>   power and thermal data sheet. Tctl is defined as follows for all
>   parts:
>
>   A: For Tctl = Tctl_max to 255.875: the temperature of the part is
>   [Tctl - Tctl_max] over the maximum operating temperature. The
>   processor may take corrective actions that affects performance, such
>   as HTC, to support the return to Tctl range A.
>
>   B: For Tctl = 0 to Tctl_max - 0.125: the temperature of the part is
>   [Tctl_max - Tctl] under the maximum operating temperature.
>
> It would be nice to report Tctl_max so that we could at least know how
> far the temperature is from the limit, but I don't know if that is
> available. It might be the value in the HtcTmpLmt register, but the
> BKDG is unclear about that. If not, we would have to build a table of
> values from the datasheet.

And

On 2-10-2015 23:06, Jung-uk Kim wrote:
> On 10/02/2015 16:49, Willem Jan Withagen wrote:

> amdtemp(4):
>
> For Family 10h and later processors, “(the reported temperature) is a
> non-physical temperature measured on an arbitrary scale and it does not
> represent an actual physical temperature like die or case temperature.
> Instead, it specifies the processor temperature relative to the point at
> which the system must supply the maximum cooling for the processor's
> specified maximum case temperature and maximum thermal power dissipation”
> according to BIOS and Kernel Developer's Guide (BKDG) for AMD Processors,
> http://developer.amd.com/documentation/guides/Pages/default.aspx.

If one boots into the BIOS, the BIOS suggests that it knows how to do
this conversion. Perhaps one can question the ultimate correctness of
the outcome, but the 51.3C value suggests some accuracy.

Thus far I have not been able to locate the "Power and Thermal Datasheet"
for the family 15h.
Perhaps need to disassemble the BIOS, or check other tools or OSes on
how they do this.

--WjW
Re: Wrong temperature with AMD and amdtemp.ko
On 2 Oct, Willem Jan Withagen wrote:
>
> Hi
>
> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>
> Processor: Opteron 6812, in Supermicro H8SGL
>
> dev.cpu.7.temperature: 11.1C
> dev.cpu.6.temperature: 11.1C
> dev.cpu.5.temperature: 11.1C
> dev.cpu.4.temperature: 11.1C
> dev.cpu.3.temperature: 11.1C
> dev.cpu.2.temperature: 11.1C
> dev.cpu.1.temperature: 11.1C
> dev.cpu.0.temperature: 11.1C
>
> But I'm pretty sure it is not 11.1C in the datacenter
>
> Or should I not use amdtemp.ko for this?

The definition of the value that can be read from the temperature
register is pretty strange. For AMD Family 15h processors, the BIOS and
Kernel Developer's Guide (BKDG) says this:

  Tctl is a processor temperature control value used for processor
  thermal management. Tctl is accessible through D18F3xA4[CurTmp].
  Tctl is a temperature on its own scale aligned to the processors
  cooling requirements. Therefore Tctl does not represent a temperature
  which could be measured on the die or the case of the processor.
  Instead, it specifies the processor temperature relative to the
  maximum operating temperature, Tctl,max. Tctl,max is specified in the
  power and thermal data sheet. Tctl is defined as follows for all
  parts:

  A: For Tctl = Tctl_max to 255.875: the temperature of the part is
  [Tctl - Tctl_max] over the maximum operating temperature. The
  processor may take corrective actions that affects performance, such
  as HTC, to support the return to Tctl range A.

  B: For Tctl = 0 to Tctl_max - 0.125: the temperature of the part is
  [Tctl_max - Tctl] under the maximum operating temperature.

It would be nice to report Tctl_max so that we could at least know how
far the temperature is from the limit, but I don't know if that is
available. It might be the value in the HtcTmpLmt register, but the
BKDG is unclear about that. If not, we would have to build a table of
values from the datasheet.
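The two ranges in the BKDG text quoted above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the amdtemp(4) implementation: the placement of CurTmp in bits 31:21 of D18F3xA4 is an assumption based on one reading of the BKDG, the 0.125 step follows from the 255.875 upper bound quoted above (11 bits), and the `tctl_max` value in the usage note is purely hypothetical because the thread never finds the real one.

```python
# Sketch only: the CurTmp bit position is an assumption, not stated in the thread.
CURTMP_SHIFT = 21      # assumed: CurTmp occupies bits 31:21 of D18F3xA4
CURTMP_MASK = 0x7FF    # 11 bits, so the largest raw value is 2047
TCTL_STEP = 0.125      # 2047 * 0.125 = 255.875, the upper bound in the BKDG text


def tctl_from_register(reg_value: int) -> float:
    """Extract CurTmp from a raw 32-bit register value and scale it to Tctl."""
    cur_tmp = (reg_value >> CURTMP_SHIFT) & CURTMP_MASK
    return cur_tmp * TCTL_STEP


def describe_margin(tctl: float, tctl_max: float) -> str:
    """Apply ranges A and B from the quoted BKDG definition."""
    if tctl >= tctl_max:
        # Range A: at or above the maximum operating temperature.
        return f"{tctl - tctl_max:.3f} over the maximum operating temperature"
    # Range B: below the maximum operating temperature.
    return f"{tctl_max - tctl:.3f} under the maximum operating temperature"
```

With a hypothetical `tctl_max` of 70.0, `describe_margin(tctl_from_register(89 << 21), 70.0)` describes a Tctl of 11.125 as 58.875 under the maximum, which is the kind of "distance from the limit" report suggested above. Without the real Tctl_max from the data sheet the number itself carries no meaning.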
Re: Wrong temperature with AMD and amdtemp.ko
On 10/02/2015 16:49, Willem Jan Withagen wrote:
>
> Hi
>
> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>
> Processor: Opteron 6812, in Supermicro H8SGL
>
> dev.cpu.7.temperature: 11.1C
> dev.cpu.6.temperature: 11.1C
> dev.cpu.5.temperature: 11.1C
> dev.cpu.4.temperature: 11.1C
> dev.cpu.3.temperature: 11.1C
> dev.cpu.2.temperature: 11.1C
> dev.cpu.1.temperature: 11.1C
> dev.cpu.0.temperature: 11.1C
>
> But I'm pretty sure it is not 11.1C in the datacenter
>
> Or should I not use amdtemp.ko for this?

amdtemp(4):

For Family 10h and later processors, “(the reported temperature) is a
non-physical temperature measured on an arbitrary scale and it does not
represent an actual physical temperature like die or case temperature.
Instead, it specifies the processor temperature relative to the point at
which the system must supply the maximum cooling for the processor's
specified maximum case temperature and maximum thermal power dissipation”
according to BIOS and Kernel Developer's Guide (BKDG) for AMD Processors,
http://developer.amd.com/documentation/guides/Pages/default.aspx.

Jung-uk Kim
Wrong temperature with AMD and amdtemp.ko
Hi

10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24

Processor: Opteron 6812, in Supermicro H8SGL

dev.cpu.7.temperature: 11.1C
dev.cpu.6.temperature: 11.1C
dev.cpu.5.temperature: 11.1C
dev.cpu.4.temperature: 11.1C
dev.cpu.3.temperature: 11.1C
dev.cpu.2.temperature: 11.1C
dev.cpu.1.temperature: 11.1C
dev.cpu.0.temperature: 11.1C

But I'm pretty sure it is not 11.1C in the datacenter

Or should I not use amdtemp.ko for this?

--WjW