Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-21 Thread Martin Peres

On 16/08/2013 09:14, Pali Rohár wrote:
> On Thursday 15 August 2013 18:21:51 Martin Peres wrote:
>> On 15/08/2013 03:24, Pali Rohár wrote:
>>> On Thursday 15 August 2013 04:07:24 Martin Peres wrote:
>>>> On 14/08/2013 05:02, Pali Rohár wrote:
>>>>> On Tuesday 13 August 2013 15:55:28 Martin Peres wrote:
>>>>>> On 13/08/2013 09:53, Pali Rohár wrote:
>>>>>>> On Tuesday, 13 August 2013 15:32:45 CEST, Martin Peres wrote:
>>>>>>>> On 13/08/2013 09:23, Pali Rohár wrote:
>>>>>>>>> On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:
>>>>>>>> ...
>>>>>>>>
>>>>>>>> You can check the temperature by running nvidia-settings.
>>>>>>>> If you can't see the temperature in it, then nvidia
>>>>>>>> doesn't support it on your card and
>>>>>>>> I'm not sure we should :s
>>>>>>>>
>>>>>>>> Thanks for the vbios you sent me in private. For the
>>>>>>>> others, the reason why he doesn't have temperature
>>>>>>>> anymore is because his vbios lacks sensor calibration
>>>>>>>> values.
>>>>>>>
>>>>>>> In nvidia-settings tab "GPU 0 - (GeForce 6600 GT)" -->
>>>>>>> "Thermal Settings" is:
>>>>>>>
>>>>>>> Thermal Sensor Information:
>>>>>>> ID: 0
>>>>>>> Target: GPU
>>>>>>> Provider: GPU Internal
>>>>>>> Temperature: 70 C (now)
>>>>>>>
>>>>>>> I looked in the Windows program SpeedFan. It found the Nvidia PCI
>>>>>>> card and reported a "GPU Temp" of about 68-70 C. So it looks
>>>>>>> like both the nvidia driver and SpeedFan on Windows are
>>>>>>> reading the same values.
>>>>>>
>>>>>> Great, I'll cook you a patch in a bit and you'll see what
>>>>>> the temperature is like. It won't be perfectly accurate
>>>>>> but there is some kind of default for nvidia cards of this
>>>>>> generation.
>>>>>
>>>>> Ok, send me the patch and I can test whether it works and
>>>>> reports values similar to those from Windows or the nvidia driver.
>>>>
>>>> Sorry for the late answer.
>>>>
>>>> Please test this patch. Be aware that temperature with nouveau
>>>> will be higher than with the blob.
>>>> I only want to see if nouveau reports a temperature.
>>>>
>>>> The only way to be sure if the values are good enough would be
>>>> to use the blob and run:
>>>> nvapeek 0x15b0
>>>> Please send me the result along with the temperature reported
>>>> by nvidia at the time of the peek.
>>>>
>>>> Martin
>>>>
>>>> PS: This patch has only been compile-tested, I don't have access
>>>> to an nv4x right now.
>>>
>>> Hello,
>>>
>>> Now, with the patch, nouveau reports a temperature:
>>>
>>> $ sensors
>>> ...
>>> nouveau-pci-0500
>>> Adapter: PCI adapter
>>> temp1:        +63.0°C  (high = +95.0°C, hyst =  +3.0°C)
>>>                        (crit = +145.0°C, hyst =  +2.0°C)
>>>                        (emerg = +135.0°C, hyst =  +5.0°C)
>>
>> Ok, that was expected ;)
>>
>>> ...
>>>
>>> I found that the nvidia binary driver has a command-line utility,
>>> nvidia-smi, which reports the same temperature as the X utility
>>> nvidia-settings. So I will use nvidia-smi (if that is OK).
>>>
>>> And after a reboot, nvidia reports a different temperature value:
>>>
>>> $ nvidia-smi -q -d TEMPERATURE
>>> ...
>>> GPU :05:00.0
>>>     Temperature
>>>         Gpu : 70 C
>>>
>>> Immediately afterwards I ran the nvapeek command:
>>>
>>> $ nvapeek 0x15b0
>>> 15b0: 108e
>>>
>>> So the value reported by nouveau is lower than the value reported by
>>> the nvidia binary driver.
>>
>> As you didn't run nvapeek 15b0 when running nouveau, it is hard to tell
>> if it is due to calibration values or because the temperature was lower.
>
> I ran it, and it always reported the value 00ff (even when the
> temperature changed).

Seems like we may not calibrate the ADC correctly, this is weird.

>> Could you please read the temperature + peek 15b0 when running nouveau?
>>
>> Anyway, it is weird because I cannot find 70°C with 0x8e as an input
>> temperature and with the current default values :o
>
> My idea is that this register does not contain the temperature. With both
> nouveau and the nvidia driver, the output of "nvapeek 0x15b0" stays the
> same even when the reported temperature changes.
>
> I have now started the computer with the nouveau driver. The temperature is
> increasing, but nvapeek 0x15b0 still returns the same value.
>
> So do you really need further tests with nvapeek 0x15b0? Is that the
> correct register?

I want you to be really sure that 15b0 doesn't change with temperature ON THE
PROPRIETARY driver. This is very serious if this is not the case.

If the register really doesn't change, then you must have an i2c device from
which the blob is reading the temperature, and this device isn't detected by
Nouveau.
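
A minimal sketch of the check asked for above, assuming the readings are logged by hand as (temperature in °C, raw register value) pairs taken under a single driver. The function name `classify_register` and the `min_temp_span` threshold are illustrative and not part of nouveau or envytools; the sample pairs are the two readings Pali reported with the proprietary driver.

```python
def classify_register(samples, min_temp_span=10):
    """Decide whether a register follows the reported temperature.

    samples: list of (temp_celsius, raw_value) pairs from ONE driver.
    min_temp_span: hypothetical threshold; a smaller temperature spread
    is treated as too little to draw a conclusion from.
    """
    if not samples:
        return "inconclusive"
    temps = [temp for temp, _ in samples]
    raws = {raw for _, raw in samples}
    if len(raws) > 1:
        # The register changed between readings.
        return "varies"
    if max(temps) - min(temps) < min_temp_span:
        # Same raw value, but the temperature barely moved.
        return "inconclusive"
    return "constant"


# Readings reported in this thread with the proprietary driver.
blob_samples = [(70, 0x108E), (67, 0x108E)]

if __name__ == "__main__":
    print(classify_register(blob_samples))
    print(classify_register(blob_samples, min_temp_span=3))
```

With only 3 °C between the two blob readings, the default threshold reports "inconclusive"; a reading taken at a much higher temperature would be needed before calling the register constant.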



Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-16 Thread Pali Rohár
On Thursday 15 August 2013 18:21:51 Martin Peres wrote:
> On 15/08/2013 03:24, Pali Rohár wrote:
> > On Thursday 15 August 2013 04:07:24 Martin Peres wrote:
> >> On 14/08/2013 05:02, Pali Rohár wrote:
> >>> On Tuesday 13 August 2013 15:55:28 Martin Peres wrote:
> >>>> On 13/08/2013 09:53, Pali Rohár wrote:
> >>>> > On Tuesday, 13 August 2013 15:32:45 CEST, Martin Peres wrote:
> >>>> >> On 13/08/2013 09:23, Pali Rohár wrote:
> >>>> >>> On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:
> >>>> >> ...
> >>>> >>
> >>>> >> You can check the temperature by running nvidia-settings.
> >>>> >> If you can't see the temperature in it, then nvidia
> >>>> >> doesn't support it on your card and
> >>>> >> I'm not sure we should :s
> >>>> >>
> >>>> >> Thanks for the vbios you sent me in private. For the
> >>>> >> others, the reason why he doesn't have temperature
> >>>> >> anymore is because his vbios lacks sensor calibration
> >>>> >> values.
> >>>> >
> >>>> > In nvidia-settings tab "GPU 0 - (GeForce 6600 GT)" -->
> >>>> > "Thermal Settings" is:
> >>>> >
> >>>> > Thermal Sensor Information:
> >>>> > ID: 0
> >>>> > Target: GPU
> >>>> > Provider: GPU Internal
> >>>> > Temperature: 70 C (now)
> >>>> >
> >>>> > I looked in the Windows program SpeedFan. It found the Nvidia PCI
> >>>> > card and reported a "GPU Temp" of about 68-70 C. So it looks
> >>>> > like both the nvidia driver and SpeedFan on Windows are
> >>>> > reading the same values.
> >>>>
> >>>> Great, I'll cook you a patch in a bit and you'll see what
> >>>> the temperature is like. It won't be perfectly accurate
> >>>> but there is some kind of default for nvidia cards of this
> >>>> generation.
> >>>
> >>> Ok, send me the patch and I can test whether it works and
> >>> reports values similar to those from Windows or the nvidia driver.
> >>
> >> Sorry for the late answer.
> >>
> >> Please test this patch. Be aware that temperature with nouveau
> >> will be higher than with the blob.
> >> I only want to see if nouveau reports a temperature.
> >>
> >> The only way to be sure if the values are good enough would be
> >> to use the blob and run:
> >> nvapeek 0x15b0
> >> Please send me the result along with the temperature reported
> >> by nvidia at the time of the peek.
> >>
> >> Martin
> >>
> >> PS: This patch has only been compile-tested, I don't have access
> >> to an nv4x right now.
> > 
> > Hello,
> >
> > Now, with the patch, nouveau reports a temperature:
> >
> > $ sensors
> > ...
> > nouveau-pci-0500
> > Adapter: PCI adapter
> > temp1:        +63.0°C  (high = +95.0°C, hyst =  +3.0°C)
> >                        (crit = +145.0°C, hyst =  +2.0°C)
> >                        (emerg = +135.0°C, hyst =  +5.0°C)
>
> Ok, that was expected ;)
>
> > ...
> >
> > I found that the nvidia binary driver has a command-line utility,
> > nvidia-smi, which reports the same temperature as the X utility
> > nvidia-settings. So I will use nvidia-smi (if that is OK).
> >
> > And after a reboot, nvidia reports a different temperature value:
> >
> > $ nvidia-smi -q -d TEMPERATURE
> > ...
> > GPU :05:00.0
> >     Temperature
> >         Gpu : 70 C
> >
> > Immediately afterwards I ran the nvapeek command:
> >
> > $ nvapeek 0x15b0
> > 15b0: 108e
> >
> > So the value reported by nouveau is lower than the value reported by
> > the nvidia binary driver.
>
> As you didn't run nvapeek 15b0 when running nouveau, it is hard to tell
> if it is due to calibration values or because the temperature was lower.
> 

I ran it, and it always reported the value 00ff (even when the temperature changed).

> Could you please read the temperature + peek 15b0 when running nouveau?
> 
> Anyway, it is weird because I cannot find 70°C with 0x8e as an input
> temperature and with
> the current default values :o
> 

My idea is that this register does not contain the temperature. With both
nouveau and the nvidia driver, the output of "nvapeek 0x15b0" stays the same
even when the reported temperature changes.

I have now started the computer with the nouveau driver. The temperature is
increasing, but nvapeek 0x15b0 still returns the same value.

So do you really need further tests with nvapeek 0x15b0? Is that the correct
register?

> > I waited some time and ran nvidia-smi and nvapeek again; here
> > are the results:
> >
> > $ nvidia-smi -q -d TEMPERATURE
> > ...
> > GPU :05:00.0
> >     Temperature
> >         Gpu : 67 C
> >
> > $ nvapeek 0x15b0
> > 15b0: 108e
> >
> > So it looks like nvapeek always returns the same value and
> > does not depend on the temperature... Is that OK?
>
> Well, it looks like the temperature reading is very noisy!
> Could you please get the temperature + peek when the card is as hot as
> possible?
>
> There is a very effective way to get a GPU hot: use a hair dryer.
> If you could get your GPU to 110°C (or less, if you feel like it is
> too much), that could help me check the formula and default values.
>
> PS: I attached a new version of the patch that should improve the
> temperature accuracy for nv43s. Could you test it and send me your
> kernel log?

-- 
Pali Rohár
pali.ro...@gmail.com


Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-15 Thread Martin Peres

On 15/08/2013 03:24, Pali Rohár wrote:
> On Thursday 15 August 2013 04:07:24 Martin Peres wrote:
>> On 14/08/2013 05:02, Pali Rohár wrote:
>>> On Tuesday 13 August 2013 15:55:28 Martin Peres wrote:
>>>> On 13/08/2013 09:53, Pali Rohár wrote:
>>>>> On Tuesday, 13 August 2013 15:32:45 CEST, Martin Peres wrote:
>>>>>> On 13/08/2013 09:23, Pali Rohár wrote:
>>>>>>> On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:
>>>>>> ...
>>>>>>
>>>>>> You can check the temperature by running nvidia-settings.
>>>>>> If you can't see the temperature in it, then nvidia
>>>>>> doesn't support it on your card and
>>>>>> I'm not sure we should :s
>>>>>>
>>>>>> Thanks for the vbios you sent me in private. For the
>>>>>> others, the reason why he doesn't have temperature
>>>>>> anymore is because his vbios lacks sensor calibration
>>>>>> values.
>>>>>
>>>>> In nvidia-settings tab "GPU 0 - (GeForce 6600 GT)" -->
>>>>> "Thermal Settings" is:
>>>>>
>>>>> Thermal Sensor Information:
>>>>> ID: 0
>>>>> Target: GPU
>>>>> Provider: GPU Internal
>>>>> Temperature: 70 C (now)
>>>>>
>>>>> I looked in the Windows program SpeedFan. It found the Nvidia PCI
>>>>> card and reported a "GPU Temp" of about 68-70 C. So it looks
>>>>> like both the nvidia driver and SpeedFan on Windows are
>>>>> reading the same values.
>>>>
>>>> Great, I'll cook you a patch in a bit and you'll see what
>>>> the temperature is like. It won't be perfectly accurate
>>>> but there is some kind of default for nvidia cards of this
>>>> generation.
>>>
>>> Ok, send me the patch and I can test whether it works and
>>> reports values similar to those from Windows or the nvidia driver.
>>
>> Sorry for the late answer.
>>
>> Please test this patch. Be aware that temperature with nouveau
>> will be higher than with the blob.
>> I only want to see if nouveau reports a temperature.
>>
>> The only way to be sure if the values are good enough would be
>> to use the blob and run:
>> nvapeek 0x15b0
>> Please send me the result along with the temperature reported
>> by nvidia at the time of the peek.
>>
>> Martin
>>
>> PS: This patch has only been compile-tested, I don't have access
>> to an nv4x right now.
>
> Hello,
>
> Now, with the patch, nouveau reports a temperature:
>
> $ sensors
> ...
> nouveau-pci-0500
> Adapter: PCI adapter
> temp1:        +63.0°C  (high = +95.0°C, hyst =  +3.0°C)
>                        (crit = +145.0°C, hyst =  +2.0°C)
>                        (emerg = +135.0°C, hyst =  +5.0°C)

Ok, that was expected ;)

> ...
>
> I found that the nvidia binary driver has a command-line utility,
> nvidia-smi, which reports the same temperature as the X utility
> nvidia-settings. So I will use nvidia-smi (if that is OK).
>
> And after a reboot, nvidia reports a different temperature value:
>
> $ nvidia-smi -q -d TEMPERATURE
> ...
> GPU :05:00.0
>     Temperature
>         Gpu : 70 C
>
> Immediately afterwards I ran the nvapeek command:
>
> $ nvapeek 0x15b0
> 15b0: 108e
>
> So the value reported by nouveau is lower than the value reported by
> the nvidia binary driver.

As you didn't run nvapeek 15b0 when running nouveau, it is hard to tell
if it is due to calibration values or because the temperature was lower.

Could you please read the temperature + peek 15b0 when running nouveau?

Anyway, it is weird because I cannot find 70°C with 0x8e as an input
temperature and with the current default values :o

> I waited some time and ran nvidia-smi and nvapeek again; here
> are the results:
>
> $ nvidia-smi -q -d TEMPERATURE
> ...
> GPU :05:00.0
>     Temperature
>         Gpu : 67 C
>
> $ nvapeek 0x15b0
> 15b0: 108e
>
> So it looks like nvapeek always returns the same value and
> does not depend on the temperature... Is that OK?

Well, it looks like the temperature reading is very noisy!
Could you please get the temperature + peek when the card is as hot as
possible?

There is a very effective way to get a GPU hot: use a hair dryer.
If you could get your GPU to 110°C (or less, if you feel like it is
too much), that could help me check the formula and default values.

PS: I attached a new version of the patch that should improve the
temperature accuracy for nv43s. Could you test it and send me your
kernel log?
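
The request for a much hotter reading amounts to collecting a second, well-separated calibration point. A minimal sketch, assuming the sensor response is linear (temperature = slope * raw + offset). The patch description only mentions an offset numerator, so the linear model, the function names, and the example readings below are assumptions for illustration and not values taken from the patch.

```python
from fractions import Fraction


def fit_linear(point_a, point_b):
    """Fit temperature = slope * raw + offset through two (raw, temp) points."""
    (raw_a, temp_a), (raw_b, temp_b) = point_a, point_b
    if raw_a == raw_b:
        # Two identical raw readings cannot determine a slope.
        raise ValueError("need two distinct raw readings to fit a slope")
    slope = Fraction(temp_b - temp_a, raw_b - raw_a)
    offset = temp_a - slope * raw_a
    return slope, offset


def to_celsius(raw, slope, offset):
    """Apply the fitted calibration to a raw reading."""
    return float(slope * raw + offset)


if __name__ == "__main__":
    # Hypothetical readings, NOT measured on real hardware.
    slope, offset = fit_linear((100, 50), (150, 90))
    print(slope, offset, to_celsius(125, slope, offset))
```

Fed the two readings from this thread, (0x8e, 70) and (0x8e, 67), `fit_linear` raises `ValueError`, which is one way to see why a reading at a clearly different temperature is needed before the formula can be checked.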

From 8c806fd49d87ecf57e98713537a7d23d1f7712d9 Mon Sep 17 00:00:00 2001
From: Martin Peres <martin.pe...@labri.fr>
Date: Wed, 14 Aug 2013 22:00:48 -0400
Subject: [PATCH] drm/nv40/therm: set default calibration values if needed

Some vbios expose a thermal sensor but do not set default
calibration values. As they are almost always the same, let's
set some default ones.

v2:
- the nv43 requires a different offset numerator
- cosmetic changes

Signed-off-by: Martin Peres <martin.pe...@labri.fr>
---
 .../drm/nouveau/core/include/subdev/bios/therm.h   |  1 +
 drivers/gpu/drm/nouveau/core/subdev/bios/therm.c   |  1 +
 drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c   | 43 +++---
 drivers/gpu/drm/nouveau/core/subdev/therm/temp.c   |  5 +--
 4 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h b/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
index 083541d..11b7993 100644
--- a/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
+++ b/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
@@ -10,6 +10,7 @@ struct nvbios_therm_threshold {
 
 struct nvbios_therm_sensor {
 	

Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-15 Thread Pali Rohár
On Thursday 15 August 2013 04:07:24 Martin Peres wrote:
> On 14/08/2013 05:02, Pali Rohár wrote:
> > On Tuesday 13 August 2013 15:55:28 Martin Peres wrote:
> >> On 13/08/2013 09:53, Pali Rohár wrote:
> >>> On Tuesday, 13 August 2013 15:32:45 CEST, Martin Peres wrote:
> >>>> On 13/08/2013 09:23, Pali Rohár wrote:
> >>>> > On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:
> >>>>   ...
> >>>>
> >>>> You can check the temperature by running nvidia-settings.
> >>>> If you can't see the temperature in it, then nvidia
> >>>> doesn't support it on your card and
> >>>> I'm not sure we should :s
> >>>>
> >>>> Thanks for the vbios you sent me in private. For the
> >>>> others, the reason why he doesn't have temperature
> >>>> anymore is because his vbios lacks sensor calibration
> >>>> values.
> >>>
> >>> In nvidia-settings tab "GPU 0 - (GeForce 6600 GT)" -->
> >>> "Thermal Settings" is:
> >>>
> >>> Thermal Sensor Information:
> >>> ID: 0
> >>> Target: GPU
> >>> Provider: GPU Internal
> >>> Temperature: 70 C (now)
> >>>
> >>> I looked in the Windows program SpeedFan. It found the Nvidia PCI
> >>> card and reported a "GPU Temp" of about 68-70 C. So it looks
> >>> like both the nvidia driver and SpeedFan on Windows are
> >>> reading the same values.
> >>
> >> Great, I'll cook you a patch in a bit and you'll see what
> >> the temperature is like. It won't be perfectly accurate
> >> but there is some kind of default for nvidia cards of this
> >> generation.
> >
> > Ok, send me the patch and I can test whether it works and
> > reports values similar to those from Windows or the nvidia driver.
>
> Sorry for the late answer.
>
> Please test this patch. Be aware that temperature with nouveau
> will be higher than with the blob.
> I only want to see if nouveau reports a temperature.
>
> The only way to be sure if the values are good enough would be
> to use the blob and run:
> nvapeek 0x15b0
> Please send me the result along with the temperature reported
> by nvidia at the time of the peek.
>
> Martin
>
> PS: This patch has only been compile-tested, I don't have access
> to an nv4x right now.

Hello,

Now, with the patch, nouveau reports a temperature:

$ sensors
...
nouveau-pci-0500
Adapter: PCI adapter
temp1:        +63.0°C  (high = +95.0°C, hyst =  +3.0°C)
                       (crit = +145.0°C, hyst =  +2.0°C)
                       (emerg = +135.0°C, hyst =  +5.0°C)
...

I found that the nvidia binary driver has a command-line utility,
nvidia-smi, which reports the same temperature as the X utility
nvidia-settings. So I will use nvidia-smi (if that is OK).

And after a reboot, nvidia reports a different temperature value:

$ nvidia-smi -q -d TEMPERATURE
...
GPU :05:00.0
    Temperature
        Gpu : 70 C

Immediately afterwards I ran the nvapeek command:

$ nvapeek 0x15b0
15b0: 108e

So the value reported by nouveau is lower than the value reported by
the nvidia binary driver.

I waited some time and ran nvidia-smi and nvapeek again; here
are the results:

$ nvidia-smi -q -d TEMPERATURE
...
GPU :05:00.0
    Temperature
        Gpu : 67 C

$ nvapeek 0x15b0
15b0: 108e

So it looks like nvapeek always returns the same value and
does not depend on the temperature... Is that OK?
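
A small sketch of how the nvapeek line above can be decoded, assuming the output is always "ADDR: VALUE" in hexadecimal, as shown in this message. Treating the low byte as the raw sensor reading is inferred from the reference elsewhere in the thread to 0x8e as the input value; the helper names are hypothetical and not part of envytools.

```python
def parse_nvapeek_line(line):
    """Split an 'ADDR: VALUE' line into two integers (both parsed as hex)."""
    addr_text, value_text = line.split(":", 1)
    return int(addr_text.strip(), 16), int(value_text.strip(), 16)


def low_byte(value):
    """Keep only the lowest 8 bits of a register value (assumed raw reading)."""
    return value & 0xFF


if __name__ == "__main__":
    addr, value = parse_nvapeek_line("15b0: 108e")
    print(hex(addr), hex(value), hex(low_byte(value)), low_byte(value))
```

Both readings in this message decode to the same low byte, 0x8e (142 decimal), even though the reported temperature moved from 70 C to 67 C.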

-- 
Pali Rohár
pali.ro...@gmail.com




Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-15 Thread Martin Peres

On 15/08/2013 03:24, Pali Rohár wrote:

On Thursday 15 August 2013 04:07:24 Martin Peres wrote:

On 14/08/2013 05:02, Pali Rohár wrote:

On Tuesday 13 August 2013 15:55:28 Martin Peres wrote:

On 13/08/2013 09:53, Pali Rohár wrote:

On utorok, 13. augusta 2013 15:32:45 CEST, Martin Peres

wrote:

On 13/08/2013 09:23, Pali Rohár wrote:

On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:

   ...

You can check the temperature by running nvidia-settings.
If you can't see the temperature in it, then nvidia
doesn't support it on your card and
I'm not sure we should :s

Thanks for the vbios you sent me in private. For the
others, the reason why he doesn't have temperature
anymore is because his vbios lacks sensor calibration
values.

In nvidia-settings tab GPU 0 - (GeForce 6600 GT) --
Thermal Settings is:

Thermal Sensor Information:
ID: 0
Target: GPU
Provider: GPU Internal
Temperature: 70 C (now)

I looked in Windows program SpeedFan. It found Nvidia PCI
card and reported GPU Temp about 68-70 C. So it looks
like both nvidia driver and windows SpeedFan program
reading same values.

Great, I'll cook you a patch in a bit and you'll see what
the temperature is like. It won't be perfectly accurate
but there is some kind of default for nvidia cards of this
generation.

Ok, send me patch and I can try it if it will work and
report similar values as windows or nvidia driver.

Sorry for the late answer.

Please test this patch. Be aware that temperature with nouveau
will be higher than with the blob.
I only want to see if nouveau reports a temperature.

The only way to be sure if the values are good-enough would be
to use the blob and run:
nvapeek 0x15b0
Please send me the result along with the temperature reported
by nvidia at the time of the peek.

Martin

PS: This patch has only be compile-tested, I don't have access
to an nv4x right now.

Hello,

now after patch nouveau report temperature:

$ sensors
...
nouveau-pci-0500
Adapter: PCI adapter
temp1:+63.0°C  (high = +95.0°C, hyst =  +3.0°C)
(crit = +145.0°C, hyst =  +2.0°C)
(emerg = +135.0°C, hyst =  +5.0°C)


Ok, that was expected ;)

...

I found that nvidia binary driver has command line utility
nvidia-smi which report same temperature as X utility nvidia-
settings. So I will use nvidia-smi (if it is OK).

And after reboot nvidia report another temperature value:

$ nvidia-smi -q -d TEMPERATURE
...
GPU :05:00.0
 Temperature
 Gpu : 70 C

Immediately I called nvapeek command:

$ nvapeek 0x15b0
15b0: 108e

So the value reported by nouveau is lower than the value reported by
the nvidia binary driver.

As you didn't run nvapeek 15b0 when running nouveau, it is hard to tell
if it is due to calibration values or because the temperature was lower.

Could you please read the temperature + peek 15b0 when running nouveau?

Anyway, it is weird because I cannot find 70°C with 0x8e as an input
temperature and with the current default values :o
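
For readers following along, the conversion referred to here can be sketched from the three arithmetic lines of nv40_temp_get() that appear as context in the patch. This is only a sketch under assumptions: the struct name calib, the helper raw_to_celsius() and its error return are made up for illustration, and the 0xff mask is inferred from the "0x8e" input mentioned above. The calibration numbers in the note below are placeholders, not the defaults the patch sets.

```c
#include <stdint.h>

/* Calibration fields, named after struct nvbios_therm_sensor in the patch. */
struct calib {
	int16_t slope_mult;
	int16_t slope_div;
	int16_t offset_num;
	int16_t offset_den;
	int8_t  offset_constant;
};

/*
 * Convert a raw 0x15b0 readout to degrees Celsius with the same three
 * steps as nv40_temp_get(). Returns 0 on success, -1 if a divisor is 0.
 * Assumption: only the low byte of the register carries the raw value.
 */
static int raw_to_celsius(uint32_t reg, const struct calib *c, int *celsius)
{
	int core_temp = (int)(reg & 0xff);

	if (c->slope_div == 0 || c->offset_den == 0)
		return -1;

	core_temp = core_temp * c->slope_mult / c->slope_div;
	core_temp = core_temp + c->offset_num / c->offset_den;
	core_temp = core_temp + c->offset_constant - 8;

	*celsius = core_temp;
	return 0;
}
```

With a placeholder set such as { 1, 1, 0, 1, 0 }, a readout of 0x108e gives 142 - 8 = 134. That only demonstrates the arithmetic; finding which slope and offset map 0x8e to the 70°C reported by the blob is exactly the open question in this thread.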

I waited some time and ran nvidia-smi and nvapeek again; here
are the results:

$ nvidia-smi -q -d TEMPERATURE
...
GPU :05:00.0
 Temperature
 Gpu : 67 C

$ nvapeek 0x15b0
15b0: 108e

So it looks like nvapeek always returns the same value and
does not depend on the temperature... Is that OK?

Well, it looks like the temperature reading is very noisy!
Could you please get the temperature + peek when the card is as hot as
possible?

There is a very effective solution to get a GPU hot: use a hair drier.
If you could get your GPU to 110°C (or less, if you feel like it is too
much), that could help me check the formula and default values.

PS: I attached a new version of the patch that should improve the
temperature accuracy for nv43s. Could you test it and send me your
kernel log?

From 8c806fd49d87ecf57e98713537a7d23d1f7712d9 Mon Sep 17 00:00:00 2001
From: Martin Peres martin.pe...@labri.fr
Date: Wed, 14 Aug 2013 22:00:48 -0400
Subject: [PATCH] drm/nv40/therm: set default calibration values if needed

Some vbios expose a thermal sensor but do not set default
calibration values. As they are almost always the same, let's
set some default ones.

v2:
- the nv43 requires a different offset numerator
- cosmetic changes

Signed-off-by: Martin Peres martin.pe...@labri.fr
---
 .../drm/nouveau/core/include/subdev/bios/therm.h   |  1 +
 drivers/gpu/drm/nouveau/core/subdev/bios/therm.c   |  1 +
 drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c   | 43 +++---
 drivers/gpu/drm/nouveau/core/subdev/therm/temp.c   |  5 +--
 4 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h b/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
index 083541d..11b7993 100644
--- a/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
+++ b/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
@@ -10,6 +10,7 @@ struct nvbios_therm_threshold {

Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-14 Thread Martin Peres

On 14/08/2013 05:02, Pali Rohár wrote:

On Tuesday 13 August 2013 15:55:28 Martin Peres wrote:

On 13/08/2013 09:53, Pali Rohár wrote:

On Tuesday, 13 August 2013 15:32:45 CEST, Martin Peres wrote:

On 13/08/2013 09:23, Pali Rohár wrote:

On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:

  ...

You can check the temperature by running nvidia-settings.
If you can't see the temperature in it, then nvidia
doesn't support it on your card and
I'm not sure we should :s

Thanks for the vbios you sent me in private. For the
others, the reason why he doesn't have temperature anymore
is because his vbios lacks sensor calibration values.

In nvidia-settings tab "GPU 0 - (GeForce 6600 GT)" -->
"Thermal Settings" is:

Thermal Sensor Information:
ID: 0
Target: GPU
Provider: GPU Internal
Temperature: 70 C (now)

I looked in Windows program SpeedFan. It found Nvidia PCI
card and reported "GPU Temp" about 68-70 C. So it looks
like both nvidia driver and windows SpeedFan program
reading same values.

Great, I'll cook you a patch in a bit and you'll see what the
temperature is like. It won't be perfectly accurate but there
is some kind of default for nvidia cards of this generation.

Ok, send me patch and I can try it if it will work and report
similar values as windows or nvidia driver.


Sorry for the late answer.

Please test this patch. Be aware that temperature with nouveau will be
higher than with the blob.
I only want to see if nouveau reports a temperature.

The only way to be sure if the values are good-enough would be to use
the blob and run:
nvapeek 0x15b0
Please send me the result along with the temperature reported by nvidia
at the time of the peek.

Martin

PS: This patch has only been compile-tested, I don't have access to an
nv4x right now.
From abe97f1e5de0b7ae5114802fcbc99d6e3408cd00 Mon Sep 17 00:00:00 2001
From: Martin Peres 
Date: Wed, 14 Aug 2013 22:00:48 -0400
Subject: [PATCH] drm/nv40/therm: set default calibration values if needed

Some vbios expose a thermal sensor but do not set default
calibration values. As they are almost always the same, let's
set some default ones.

Signed-off-by: Martin Peres 
---
 .../drm/nouveau/core/include/subdev/bios/therm.h   |  1 +
 drivers/gpu/drm/nouveau/core/subdev/bios/therm.c   |  1 +
 drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c   | 36 ++
 drivers/gpu/drm/nouveau/core/subdev/therm/temp.c   |  5 ++-
 4 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h b/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
index 083541d..11b7993 100644
--- a/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
+++ b/drivers/gpu/drm/nouveau/core/include/subdev/bios/therm.h
@@ -10,6 +10,7 @@ struct nvbios_therm_threshold {
 
 struct nvbios_therm_sensor {
 	/* diode */
+	int has_sensor;
 	s16 slope_mult;
 	s16 slope_div;
 	s16 offset_num;
diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/therm.c b/drivers/gpu/drm/nouveau/core/subdev/bios/therm.c
index 22a2057..16b763d 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/bios/therm.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/bios/therm.c
@@ -95,6 +95,7 @@ nvbios_therm_sensor_parse(struct nouveau_bios *bios,
 			sensor_section++;
 			if (sensor_section == 0) {
 				offset = ((s8) nv_ro08(bios, entry + 2)) / 2;
+				sensor->has_sensor = 1;
 				sensor->offset_constant = offset;
 			}
 			break;
diff --git a/drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c b/drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c
index 002e51b..5312bbd 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c
@@ -93,11 +93,6 @@ nv40_temp_get(struct nouveau_therm *therm)
 	} else
 		return -ENODEV;
 
-	/* if the slope or the offset is unset, do no use the sensor */
-	if (!sensor->slope_div || !sensor->slope_mult ||
-	    !sensor->offset_num || !sensor->offset_den)
-		return -ENODEV;
-
 	core_temp = core_temp * sensor->slope_mult / sensor->slope_div;
 	core_temp = core_temp + sensor->offset_num / sensor->offset_den;
 	core_temp = core_temp + sensor->offset_constant - 8;
@@ -171,7 +166,7 @@ nv40_therm_intr(struct nouveau_subdev *subdev)
 	struct nouveau_therm *therm = nouveau_therm(subdev);
 	uint32_t stat = nv_rd32(therm, 0x1100);
 
-	/* traitement */
+	/* TODO: do something? Need more RE first */
 
 	/* ack all IRQs */
 	nv_wr32(therm, 0x1100, 0x7);
@@ -202,11 +197,40 @@ nv40_therm_ctor(struct nouveau_object *parent,
 	return nouveau_therm_preinit(&priv->base.base);
 }
 
+static void
+nv40_therm_temp_safety_checks(struct nouveau_therm *therm)
+{
+	struct nouveau_therm_priv *priv = (void *)therm;
+	struct nvbios_therm_sensor *sensor = &priv->bios_sensor;
+	enum nv40_sensor_style style = nv40_sensor_style(therm);
+
+	/* if the slope or the offset is unset, do no use the sensor */
+	if (sensor->has_sensor && (!sensor->slope_div || !sensor->slope_mult ||
+	!sensor->offset_num || 

Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-14 Thread Pali Rohár
On Tuesday 13 August 2013 15:55:28 Martin Peres wrote:
> On 13/08/2013 09:53, Pali Rohár wrote:
> > On Tuesday, 13 August 2013 15:32:45 CEST, Martin Peres wrote:
> >> On 13/08/2013 09:23, Pali Rohár wrote:
> >>> On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:
> >>  ...
> >> 
> >> You can check the temperature by running nvidia-settings.
> >> If you can't see the temperature in it, then nvidia
> >> doesn't support it on your card and
> >> I'm not sure we should :s
> >> 
> >> Thanks for the vbios you sent me in private. For the
> >> others, the reason why he doesn't have temperature anymore
> >> is because his vbios lacks sensor calibration values.
> > 
> > In nvidia-settings tab "GPU 0 - (GeForce 6600 GT)" -->
> > "Thermal Settings" is:
> > 
> > Thermal Sensor Information:
> > ID: 0
> > Target: GPU
> > Provider: GPU Internal
> > Temperature: 70 C (now)
> > 
> > I looked in Windows program SpeedFan. It found Nvidia PCI
> > card and reported "GPU Temp" about 68-70 C. So it looks
> > like both nvidia driver and windows SpeedFan program
> > reading same values.
> 
> Great, I'll cook you a patch in a bit and you'll see what the
> temperature is like. It won't be perfectly accurate but there
> is some kind of default for nvidia cards of this generation.

OK, send me the patch and I will test whether it works and reports
values similar to those from Windows or the nvidia driver.

-- 
Pali Rohár
pali.ro...@gmail.com


signature.asc
Description: This is a digitally signed message part.


Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-13 Thread Martin Peres

On 13/08/2013 09:53, Pali Rohár wrote:

On Tuesday, 13 August 2013 15:32:45 CEST, Martin Peres wrote:

On 13/08/2013 09:23, Pali Rohár wrote:

On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:

 ...
You can check the temperature by running nvidia-settings. If you can't
see the temperature in it, then nvidia doesn't support it on your card
and I'm not sure we should :s

Thanks for the vbios you sent me in private. For the others, the reason
why he doesn't have temperature anymore is because his vbios lacks
sensor calibration values.

In nvidia-settings tab "GPU 0 - (GeForce 6600 GT)" --> "Thermal
Settings" is:

Thermal Sensor Information:
ID: 0
Target: GPU
Provider: GPU Internal
Temperature: 70 C (now)

I looked in Windows program SpeedFan. It found Nvidia PCI card and 
reported "GPU Temp" about 68-70 C. So it looks like both nvidia driver 
and windows SpeedFan program reading same values.


Great, I'll cook you a patch in a bit and you'll see what the 
temperature is like. It won't be perfectly accurate but there is some 
kind of default for nvidia cards of this generation.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-13 Thread Pali Rohár

On Tuesday, 13 August 2013 15:32:45 CEST, Martin Peres wrote:

On 13/08/2013 09:23, Pali Rohár wrote:

On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:

 ...
You can check the temperature by running nvidia-settings. If you can't
see the temperature in it, then nvidia doesn't support it on your card and
I'm not sure we should :s

Thanks for the vbios you sent me in private. For the others, the reason
why he doesn't have temperature anymore is because his vbios lacks
sensor calibration values.




In nvidia-settings tab "GPU 0 - (GeForce 6600 GT)" --> "Thermal Settings" is:

Thermal Sensor Information:
ID: 0
Target: GPU
Provider: GPU Internal
Temperature: 70 C (now)

I looked in the Windows program SpeedFan. It found the Nvidia PCI card and
reported a "GPU Temp" of about 68-70 C. So it looks like both the nvidia
driver and the Windows SpeedFan program are reading the same values.

--
Pali Rohár
pali.ro...@gmail.com



Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-13 Thread Martin Peres

On 13/08/2013 09:23, Pali Rohár wrote:

On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:

On 13/08/2013 05:56, Pali Rohár wrote:

Hello,

after commit ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954 temperature
information from lm sensors is not available on my Nvidia 6600GT graphics
card. There is no nouveau hwmon entry in sysfs anymore. Why it was
removed? Can I help with debugging? I'd like to see temperature sensor
working again.

Hi,

Thanks for bisecting the issue. Can you send me your vbios and tell
me if the nvidia driver shows your temperature probe (and what is
its source).

To fetch your vbios, you can extract it using nvagetbios from the
envytools repo: https://github.com/envytools/envytools

Cheers,
Martin

How can I check source and temperature probe in nvidia binary driver?

You can check the temperature by running nvidia-settings. If you can't
see the temperature in it, then nvidia doesn't support it on your card and
I'm not sure we should :s

Thanks for the vbios you sent me in private. For the others, the reason
why he doesn't have temperature anymore is because his vbios lacks
sensor calibration values.


Re: nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-13 Thread Pali Rohár
On Tuesday 13 August 2013 09:01:19 Martin Peres wrote:
> On 13/08/2013 05:56, Pali Rohár wrote:
> > Hello,
> > 
> > after commit ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954 temperature
> > information from lm sensors is not available on my Nvidia 6600GT graphics
> > card. There is no nouveau hwmon entry in sysfs anymore. Why it was
> > removed? Can I help with debugging? I'd like to see temperature sensor
> > working again.
> 
> Hi,
> 
> Thanks for bisecting the issue. Can you send me your vbios and tell
> me if the nvidia driver shows your temperature probe (and what is
> its source).
> 
> To fetch your vbios, you can extract it using nvagetbios from the
> envytools repo: https://github.com/envytools/envytools
> 
> Cheers,
> Martin

How can I check the source and the temperature probe in the nvidia binary driver?

-- 
Pali Rohár
pali.ro...@gmail.com





nouveau: temperature on nv40 is unavailable since ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954

2013-08-13 Thread Pali Rohár
Hello,

after commit ad40d73ef533ab0ad16b4a1ab2f7870c1f8ab954, temperature information
from lm-sensors is no longer available on my Nvidia 6600GT graphics card. There
is no nouveau hwmon entry in sysfs anymore. Why was it removed? Can I help with
debugging? I'd like to see the temperature sensor working again.

-- 
Pali Rohár
pali.ro...@gmail.com
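
One way to check for the missing entry from userspace is to look for a hwmon device whose name attribute reads "nouveau". A minimal sketch, assuming the name file sits either at /sys/class/hwmon/hwmonN/name or at /sys/class/hwmon/hwmonN/device/name (the location has varied between kernel versions); the helper names hwmon_name_is() and find_hwmon() are hypothetical and not part of any tool mentioned in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if the contents of a hwmon "name" file equal `driver`. */
static int hwmon_name_is(const char *contents, const char *driver)
{
	size_t n = strcspn(contents, "\n");	/* ignore the trailing newline */

	return n == strlen(driver) && strncmp(contents, driver, n) == 0;
}

/*
 * Scan hwmon0 .. hwmon(max-1) and return the index of the first device
 * named `driver`, or -1 if none is found. Both candidate locations of
 * the name attribute are tried (assumption, see the text above).
 */
static int find_hwmon(const char *driver, int max)
{
	static const char *const fmts[] = {
		"/sys/class/hwmon/hwmon%d/name",
		"/sys/class/hwmon/hwmon%d/device/name",
	};
	char path[64], buf[64];
	size_t k;
	int i;

	for (i = 0; i < max; i++) {
		for (k = 0; k < sizeof(fmts) / sizeof(fmts[0]); k++) {
			FILE *f;
			int match;

			snprintf(path, sizeof(path), fmts[k], i);
			f = fopen(path, "r");
			if (!f)
				continue;	/* missing index or attribute */
			match = fgets(buf, sizeof(buf), f) != NULL &&
				hwmon_name_is(buf, driver);
			fclose(f);
			if (match)
				return i;
		}
	}
	return -1;
}
```

A call such as find_hwmon("nouveau", 16) returning -1 would correspond to the situation described in this report, where the hwmon entry is gone.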




