Re: Wrong temperature with AMD and amdtemp.ko

2015-10-06 Thread Willem Jan Withagen

On 6-10-2015 06:28, Don Lewis wrote:
> On  3 Oct, Willem Jan Withagen wrote:
>> On 2-10-2015 23:32, Don Lewis wrote:
>>> On  2 Oct, Willem Jan Withagen wrote:
>>>>
>>>> Hi
>>>>
>>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>>
>>>> Processor: Opteron 6812, in Supermicro H8SGL
>>>>
>>>> dev.cpu.7.temperature: 11.1C
>>>> dev.cpu.6.temperature: 11.1C
>>>> dev.cpu.5.temperature: 11.1C
>>>> dev.cpu.4.temperature: 11.1C
>>>> dev.cpu.3.temperature: 11.1C
>>>> dev.cpu.2.temperature: 11.1C
>>>> dev.cpu.1.temperature: 11.1C
>>>> dev.cpu.0.temperature: 11.1C
>>>>
>>>> But I'm pretty sure it is not 11.1C in the datacenter
>>
>> If one boots into the BIOS, the BIOS suggests that it knows how to do
>> this conversion. Perhaps one can question the ultimate correctness of
>> the outcome, but the 51.3C value suggests some accuracy.
>
> That may be a measurement from a separate temperature sensor on the
> motherboard underneath the CPU socket.


Interesting point.

It is sort of hard to get in there to check whether that is really the case.

But then it would be accessible to any of the other drivers/programs
like lm* or smb* to get the readings? Now I only need to figure out which
one of the many...


--WjW

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Wrong temperature with AMD and amdtemp.ko

2015-10-05 Thread Don Lewis
On  3 Oct, Willem Jan Withagen wrote:
> On 2-10-2015 23:32, Don Lewis wrote:
>> On  2 Oct, Willem Jan Withagen wrote:
>>>
>>> Hi
>>>
>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>
>>> Processor: Opteron 6812, in Supermicro H8SGL
>>>
>>> dev.cpu.7.temperature: 11.1C
>>> dev.cpu.6.temperature: 11.1C
>>> dev.cpu.5.temperature: 11.1C
>>> dev.cpu.4.temperature: 11.1C
>>> dev.cpu.3.temperature: 11.1C
>>> dev.cpu.2.temperature: 11.1C
>>> dev.cpu.1.temperature: 11.1C
>>> dev.cpu.0.temperature: 11.1C
>>>
>>> But I'm pretty sure it is not 11.1C in the datacenter
>>>
>>> Or should I not use amdtemp.ko for this?
>> 
>> The definition of the value that can be read from the temperature
>> register is pretty strange.  For AMD Family 15h processors, the BIOS and
>> Kernel Developer's Guide (BKDG) says this:
>> 
>>   Tctl is a processor temperature control value used for processor
>>   thermal management. Tctl is accessible through D18F3xA4[CurTmp].
>>   Tctl is a temperature on its own scale aligned to the processors
>>   cooling requirements. Therefore Tctl does not represent a temperature
>>   which could be measured on the die or the case of the processor.
>>   Instead, it specifies the processor temperature relative to the
>>   maximum operating temperature, Tctl,max. Tctl,max is specified in the
>>   power and thermal data sheet. Tctl is defined as follows for all
>>   parts:
>> 
>>   A: For Tctl = Tctl_max to 255.875: the temperature of the part is
>>   [Tctl - Tctl_max] over the maximum operating temperature.  The
>>   processor may take corrective actions that affects performance, such
>>   as HTC, to support the return to Tctl range A.
>> 
>>   B: For Tctl = 0 to Tctl_max - 0.125: the temperature of the part is
>>   [Tctl_max - Tctl] under the maximum operating temperature.
>> 
>> It would be nice to report Tctl_max so that we could at least know how
>> far the temperature is from the limit, but I don't know if that is
>> available.  It might be the value in the HtcTmpLmt register, but the
>> BKDG is unclear about that.  If not, we would have to build a table of
>> values from the datasheet.
> 
> And
> 
> On 2-10-2015 23:06, Jung-uk Kim wrote:
>> On 10/02/2015 16:49, Willem Jan Withagen wrote:
> 
>> amdtemp(4):
>>
>> For Family 10h and later processors, “(the reported temperature) is a
>> non-physical temperature measured on an arbitrary scale and it does not
>> represent an actual physical temperature like die or case temperature.
>> Instead, it specifies the processor temperature relative to the point at
>> which the system must supply the maximum cooling for the processor's
>> specified maximum case temperature and maximum thermal power dissipation”
>> according to BIOS and Kernel Developer's Guide (BKDG) for AMD Processors,
>> http://developer.amd.com/documentation/guides/Pages/default.aspx.
> 
> If one boots into the BIOS, the BIOS suggests that it knows how to do
> this conversion. Perhaps one can question the ultimate correctness of
> the outcome, but the 51.3C value suggests some accuracy.

That may be a measurement from a separate temperature sensor on the
motherboard underneath the CPU socket.

> Thus far I have not been able to locate the "Power and Thermal Datasheet"
> for the family 15h.
> Perhaps I need to disassemble the BIOS, or check other tools or OSes to see
> how they do this.
> 
> --WjW
> 



Re: Wrong temperature with AMD and amdtemp.ko

2015-10-04 Thread Shane Ambler

On 04/10/2015 20:10, Willem Jan Withagen wrote:
> On 4-10-2015 03:26, Shane Ambler wrote:
>> On 03/10/2015 20:12, Willem Jan Withagen wrote:
>>> On 2-10-2015 23:32, Don Lewis wrote:
>>>> On  2 Oct, Willem Jan Withagen wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>>>
>>>>> Processor: Opteron 6812, in Supermicro H8SGL
>>>
>>> Thus far I have not been able to locate the "Power and Thermal Datasheet"
>>> for the family 15h.
>>> Perhaps I need to disassemble the BIOS, or check other tools or OSes to see
>>> how they do this.
>>
>> According to Supermicro the H8SGL supports FreeBSD 8.0 but not 9.1
>
> I see this as an omission on the website.
> I've never been bothered by FreeBSD compatibility in the 15+ years I've
> been using Supermicro.
>
>> Maybe you could approach Supermicro support and ask for assistance in
>> updating FreeBSD support - maybe for other boards as well.
>>
>> Another approach is asking iXsystems, I'm pretty sure they sell
>> re-badged Supermicro machines.
>
> Turns out it has nothing to do with the motherboard, but more with the
> temperature management hardware in the CPU.
> So asking either iXsystems or Supermicro does not really help.


But Supermicro would have access to the BIOS source and can tell
quickly how it is done there. As a motherboard manufacturer they would
also have the chance to ask AMD for info not easily found in the docs. If
Supermicro doesn't provide info to OSS developers, then iXsystems could
ask Supermicro for it: they sell the hardware as their own and need to
write drivers to support it, and as a supporter of FreeBSD they may be
interested in writing the update themselves.

At the minimum one of them could tell you how the BIOS calculates the
temperature so you can match it. That would save a lot of experiments on
your part.


--
FreeBSD - the place to B...Software Developing

Shane Ambler



Re: Wrong temperature with AMD and amdtemp.ko

2015-10-04 Thread Willem Jan Withagen
On 4-10-2015 03:26, Shane Ambler wrote:
> On 03/10/2015 20:12, Willem Jan Withagen wrote:
>> On 2-10-2015 23:32, Don Lewis wrote:
>>> On  2 Oct, Willem Jan Withagen wrote:

>>>> Hi
>>>>
>>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>>
>>>> Processor: Opteron 6812, in Supermicro H8SGL

> 
>> Thus far I have not been able to locate the "Power and Thermal Datasheet"
>> for the family 15h.
>> Perhaps I need to disassemble the BIOS, or check other tools or OSes to see
>> how they do this.
> 
> According to Supermicro the H8SGL supports FreeBSD 8.0 but not 9.1

I see this as an omission on the website.
I've never been bothered by FreeBSD compatibility in the 15+ years I've
been using Supermicro.

> Maybe you could approach Supermicro support and ask for assistance in
> updating FreeBSD support - maybe for other boards as well.
> 
> Another approach is asking iXSystems, I'm pretty sure they sell
> re-badged Supermicro machines.

Turns out it has nothing to do with the motherboard, but more with the
temperature management hardware in the CPU.
So asking either iXsystems or Supermicro does not really help.

Fun part is that I have a few desktop systems with "AMD Phenom(tm) II X6
1075T Processor" and they more or less run with a temp that makes more
sense. I've verified the temps on the board with a thermo-cam and the
CPU cooler is around that temp. The junction temp in the CPU package
will be higher, and you'd need to have a much better understanding of
what AMD does with its on-chip sensor.

I've been reading a bit more on this topic in all kinds of different
fora, and thus far everybody recognises that it is a sort of black art.
So I guess I'll just measure, plot and monitor it, to get a feeling for
where it should be.

--WjW
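
The "measure, plot and monitor" approach above could be sketched roughly as follows. This is a minimal Python sketch under assumptions, not something posted in the thread: it assumes the sysctl output has the `dev.cpu.N.temperature: 11.1C` form shown in the first message, and the function names and the 60 second interval are made up for illustration.

```python
import re
import subprocess
import time
from typing import Dict

# Matches lines such as "dev.cpu.7.temperature: 11.1C" (format taken from the thread).
_TEMP_LINE = re.compile(r"^dev\.cpu\.(\d+)\.temperature:\s*(-?\d+(?:\.\d+)?)C$")


def parse_temperatures(text: str) -> Dict[int, float]:
    """Return {cpu index: reported value} for every temperature line in text."""
    readings = {}
    for line in text.splitlines():
        match = _TEMP_LINE.match(line.strip())
        if match:
            readings[int(match.group(1))] = float(match.group(2))
    return readings


def read_sysctl() -> str:
    """Run `sysctl dev.cpu` and return its output (needs amdtemp loaded)."""
    result = subprocess.run(
        ["sysctl", "dev.cpu"], capture_output=True, text=True, check=True
    )
    return result.stdout


def log_forever(interval_s: float = 60.0) -> None:
    """Print one timestamped line per sample; the interval is an arbitrary choice."""
    while True:
        temps = parse_temperatures(read_sysctl())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        values = " ".join(f"cpu{i}={temps[i]}" for i in sorted(temps))
        print(stamp, values, flush=True)
        time.sleep(interval_s)
```

Calling `log_forever()` and redirecting the output to a file gives a series that can be plotted later. Since the value is on AMD's own Tctl scale, the trend over time is the useful part, not the absolute number.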




Re: Wrong temperature with AMD and amdtemp.ko

2015-10-03 Thread Willem Jan Withagen
On 2-10-2015 23:32, Don Lewis wrote:
> On  2 Oct, Willem Jan Withagen wrote:
>>
>> Hi
>>
>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>
>> Processor: Opteron 6812, in Supermicro H8SGL
>>
>> dev.cpu.7.temperature: 11.1C
>> dev.cpu.6.temperature: 11.1C
>> dev.cpu.5.temperature: 11.1C
>> dev.cpu.4.temperature: 11.1C
>> dev.cpu.3.temperature: 11.1C
>> dev.cpu.2.temperature: 11.1C
>> dev.cpu.1.temperature: 11.1C
>> dev.cpu.0.temperature: 11.1C
>>
>> But I'm pretty sure it is not 11.1C in the datacenter
>>
>> Or should I not use amdtemp.ko for this?
> 
> The definition of the value that can be read from the temperature
> register is pretty strange.  For AMD Family 15h processors, the BIOS and
> Kernel Developer's Guide (BKDG) says this:
> 
>   Tctl is a processor temperature control value used for processor
>   thermal management. Tctl is accessible through D18F3xA4[CurTmp].
>   Tctl is a temperature on its own scale aligned to the processors
>   cooling requirements. Therefore Tctl does not represent a temperature
>   which could be measured on the die or the case of the processor.
>   Instead, it specifies the processor temperature relative to the
>   maximum operating temperature, Tctl,max. Tctl,max is specified in the
>   power and thermal data sheet. Tctl is defined as follows for all
>   parts:
> 
>   A: For Tctl = Tctl_max to 255.875: the temperature of the part is
>   [Tctl - Tctl_max] over the maximum operating temperature.  The
>   processor may take corrective actions that affects performance, such
>   as HTC, to support the return to Tctl range A.
> 
>   B: For Tctl = 0 to Tctl_max - 0.125: the temperature of the part is
>   [Tctl_max - Tctl] under the maximum operating temperature.
> 
> It would be nice to report Tctl_max so that we could at least know how
> far the temperature is from the limit, but I don't know if that is
> available.  It might be the value in the HtcTmpLmt register, but the
> BKDG is unclear about that.  If not, we would have to build a table of
> values from the datasheet.

And

On 2-10-2015 23:06, Jung-uk Kim wrote:
> On 10/02/2015 16:49, Willem Jan Withagen wrote:

> amdtemp(4):
>
> For Family 10h and later processors, “(the reported temperature) is a
> non-physical temperature measured on an arbitrary scale and it does not
> represent an actual physical temperature like die or case temperature.
> Instead, it specifies the processor temperature relative to the point at
> which the system must supply the maximum cooling for the processor's
> specified maximum case temperature and maximum thermal power dissipation”
> according to BIOS and Kernel Developer's Guide (BKDG) for AMD Processors,
> http://developer.amd.com/documentation/guides/Pages/default.aspx.

If one boots into the BIOS, the BIOS suggests that it knows how to do
this conversion. Perhaps one can question the ultimate correctness of
the outcome, but the 51.3C value suggests some accuracy.

Thus far I have not been able to locate the "Power and Thermal Datasheet"
for the family 15h.
Perhaps I need to disassemble the BIOS, or check other tools or OSes to see
how they do this.

--WjW



Re: Wrong temperature with AMD and amdtemp.ko

2015-10-03 Thread Shane Ambler

On 03/10/2015 20:12, Willem Jan Withagen wrote:
> On 2-10-2015 23:32, Don Lewis wrote:
>> On  2 Oct, Willem Jan Withagen wrote:
>>>
>>> Hi
>>>
>>> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
>>>
>>> Processor: Opteron 6812, in Supermicro H8SGL
>
> Thus far I have not been able to locate the "Power and Thermal Datasheet"
> for the family 15h.
> Perhaps I need to disassemble the BIOS, or check other tools or OSes to see
> how they do this.

According to Supermicro the H8SGL supports FreeBSD 8.0 but not 9.1

Maybe you could approach Supermicro support and ask for assistance in
updating FreeBSD support - maybe for other boards as well.

Another approach is asking iXsystems, I'm pretty sure they sell
re-badged Supermicro machines.


--
FreeBSD - the place to B...Software Developing

Shane Ambler



Re: Wrong temperature with AMD and amdtemp.ko

2015-10-02 Thread Jung-uk Kim

On 10/02/2015 16:49, Willem Jan Withagen wrote:
> 
> Hi
> 
> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
> 
> Processor: Opteron 6812, in Supermicro H8SGL
> 
> dev.cpu.7.temperature: 11.1C
> dev.cpu.6.temperature: 11.1C
> dev.cpu.5.temperature: 11.1C
> dev.cpu.4.temperature: 11.1C
> dev.cpu.3.temperature: 11.1C
> dev.cpu.2.temperature: 11.1C
> dev.cpu.1.temperature: 11.1C
> dev.cpu.0.temperature: 11.1C
> 
> But I'm pretty sure it is not 11.1C in the datacenter
> 
> Or should I not use amdtemp.ko for this?

amdtemp(4):

For Family 10h and later processors, “(the reported temperature) is a
non-physical temperature measured on an arbitrary scale and it does not
represent an actual physical temperature like die or case temperature.
Instead, it specifies the processor temperature relative to the point at
which the system must supply the maximum cooling for the processor's
specified maximum case temperature and maximum thermal power dissipation”
according to BIOS and Kernel Developer's Guide (BKDG) for AMD Processors,
http://developer.amd.com/documentation/guides/Pages/default.aspx.

Jung-uk Kim


Re: Wrong temperature with AMD and amdtemp.ko

2015-10-02 Thread Don Lewis
On  2 Oct, Willem Jan Withagen wrote:
> 
> Hi
> 
> 10.2-STABLE FreeBSD 10.2-STABLE #0 r287102: Mon Aug 24
> 
> Processor: Opteron 6812, in Supermicro H8SGL
> 
> dev.cpu.7.temperature: 11.1C
> dev.cpu.6.temperature: 11.1C
> dev.cpu.5.temperature: 11.1C
> dev.cpu.4.temperature: 11.1C
> dev.cpu.3.temperature: 11.1C
> dev.cpu.2.temperature: 11.1C
> dev.cpu.1.temperature: 11.1C
> dev.cpu.0.temperature: 11.1C
> 
> But I'm pretty sure it is not 11.1C in the datacenter
> 
> Or should I not use amdtemp.ko for this?

The definition of the value that can be read from the temperature
register is pretty strange.  For AMD Family 15h processors, the BIOS and
Kernel Developer's Guide (BKDG) says this:

  Tctl is a processor temperature control value used for processor
  thermal management. Tctl is accessible through D18F3xA4[CurTmp].
  Tctl is a temperature on its own scale aligned to the processors
  cooling requirements. Therefore Tctl does not represent a temperature
  which could be measured on the die or the case of the processor.
  Instead, it specifies the processor temperature relative to the
  maximum operating temperature, Tctl,max. Tctl,max is specified in the
  power and thermal data sheet. Tctl is defined as follows for all
  parts:

  A: For Tctl = Tctl_max to 255.875: the temperature of the part is
  [Tctl - Tctl_max] over the maximum operating temperature.  The
  processor may take corrective actions that affects performance, such
  as HTC, to support the return to Tctl range A.

  B: For Tctl = 0 to Tctl_max - 0.125: the temperature of the part is
  [Tctl_max - Tctl] under the maximum operating temperature.

It would be nice to report Tctl_max so that we could at least know how
far the temperature is from the limit, but I don't know if that is
available.  It might be the value in the HtcTmpLmt register, but the
BKDG is unclear about that.  If not, we would have to build a table of
values from the datasheet.
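
The two ranges quoted above can be turned into a small calculation. What follows is a minimal Python sketch under assumptions, not the amdtemp(4) driver code: it assumes CurTmp occupies bits 31:21 of D18F3xA4 and counts in 0.125 steps (which matches the 0 to 255.875 range in the quote), the function names are illustrative, and any Tctl_max value passed in has to come from the power and thermal data sheet, which nobody in this thread has located.

```python
from typing import Tuple

TCTL_STEP = 0.125      # step size implied by the 0 .. 255.875 range in the quote
TCTL_LIMIT = 255.875   # largest value the quoted text allows


def decode_curtmp(reg_value: int) -> float:
    """Convert a raw 32-bit D18F3xA4 value to Tctl.

    Assumption: CurTmp is the 11-bit field in bits 31:21.
    """
    return ((reg_value >> 21) & 0x7FF) * TCTL_STEP


def classify_tctl(tctl: float, tctl_max: float) -> Tuple[str, float]:
    """Return (range label, distance from Tctl_max), labels as in the quote.

    "A": Tctl is at or above Tctl_max; distance is how far over the limit.
    "B": Tctl is below Tctl_max; distance is how far under the limit.
    """
    if not 0.0 <= tctl <= TCTL_LIMIT:
        raise ValueError("Tctl outside the 0 .. 255.875 range")
    if tctl >= tctl_max:
        return ("A", tctl - tctl_max)
    return ("B", tctl_max - tctl)
```

With a purely hypothetical Tctl_max of 70.0, `classify_tctl(decode_curtmp(89 << 21), 70.0)` returns `("B", 58.875)`, meaning the part is 58.875 under the limit on the Tctl scale. Without the real Tctl_max the reported value only tells you where you are on that scale, not how much headroom is left.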

