Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-13 Thread Borislav Petkov
On Thu, Dec 13, 2007 at 09:17:18AM -0700, Bjorn Helgaas wrote:
> On Thursday 13 December 2007 12:09:23 am Borislav Petkov wrote:
> > On Wed, Dec 12, 2007 at 09:21:41AM -0700, Bjorn Helgaas wrote:
> > > On Wednesday 12 December 2007 03:11:23 am Borislav Petkov wrote:
> > > > On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > > > > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > > > > From what i can roughly tell so far it seems like an resource conflict between acpi and
> > > > > > the pnp requested regions in your patch which result in the acpi_thermal code
> > > > > > to read the wrong (0xff) temperature value and halt the machine, but i might be
> > > > > > wrong on the details since acpi is such a big code chunk to swallow.
> > > > > > 
> > > I think Alexey is on the right track with the PCI resource allocation
> > > failure.
> > 
> > Then it should be the SMBus controller, PCI id 00:1f:3, which is having problems
> > registering its io ports region 4, AFAICT.
> 
> Yes, it looks like the ioport region 0x540-0x55f is described both in
> PNP and ACPI:
> 
>   /sys/devices/pnp0/00:0d/resources:state = active
>   /sys/devices/pnp0/00:0d/resources:io 0x540-0x55f
>   /sys/devices/pnp0/00:0d/resources:io 0x400-0x47f
> 
>   00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 03)
>         Subsystem: ASUSTeK Computer Inc. Unknown device 1869
>         Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>         Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>         Interrupt: pin B routed to IRQ 0
>         Region 4: I/O ports at 0540 [size=32]
> 
> The PCI SMBus device was enabled by a quirk, asus_hides_smbus_lpc().
> 
> This quirk seems dangerous to me, and the comments above asus_hides_smbus
> allude to problems similar to what you're seeing.  It's obvious that a
> lot of blood, sweat, and tears have gone into this quirk, so I'm not
> suggesting that it's time to revert it, but I would be interested in
> knowing whether the critical temperature problem goes away if we leave
> the PCI device hidden, e.g., with the following patch:
> 
> Index: linux-mm/drivers/pci/quirks.c
> ===
> --- linux-mm.orig/drivers/pci/quirks.c	2007-12-13 09:11:31.0 -0700
> +++ linux-mm/drivers/pci/quirks.c	2007-12-13 09:12:27.0 -0700
> @@ -1073,12 +1073,7 @@
>  
>  	pci_read_config_word(dev, 0xF2, &val);
>  	if (val & 0x8) {
> -		pci_write_config_word(dev, 0xF2, val & (~0x8));
> -		pci_read_config_word(dev, 0xF2, &val);
> -		if (val & 0x8)
> -			printk(KERN_INFO "PCI: i801 SMBus device continues to play 'hide and seek'! 0x%x\n", val);
> -		else
> -			printk(KERN_INFO "PCI: Enabled i801 SMBus device\n");
> +		printk(KERN_INFO "PCI: Leaving i801 SMBus device hidden\n");
>  	}
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_82801AA_0,	asus_hides_smbus_lpc);

yep, this fixes it. Bootlog attached.

-- 
Regards/Gruß,
Boris.


bootlog-smbus-hidden.bz2
Description: Binary data
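
A note on the patch quoted above: it hinges on a single bit, 0x8, in the
16-bit word read from config offset 0xF2. When the bit is set the SMBus
function is hidden; the unpatched quirk clears it, and the test patch leaves
it alone. The stand-alone C below is only a sketch of that bit arithmetic on
a plain integer. It performs no PCI access, and the names SMBUS_HIDE_BIT,
smbus_hidden(), smbus_unhide() and quirk_selfcheck() are invented here for
illustration (the kernel code uses the literal 0x8 inline).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit tested by the quirk in the word at config offset 0xF2.
 * Hypothetical macro name; the kernel source uses the literal 0x8. */
#define SMBUS_HIDE_BIT 0x8u

/* True if the hide bit is set, i.e. the SMBus function is hidden. */
bool smbus_hidden(uint16_t f2_word)
{
    return (f2_word & SMBUS_HIDE_BIT) != 0;
}

/* Value the unpatched quirk writes back: same word, hide bit cleared. */
uint16_t smbus_unhide(uint16_t f2_word)
{
    return (uint16_t)(f2_word & ~SMBUS_HIDE_BIT);
}

/* Self-check on a made-up register value (0x0048 is not from the thread). */
void quirk_selfcheck(void)
{
    uint16_t word = 0x0048;                /* hypothetical, hide bit set */
    assert(smbus_hidden(word));
    assert(!smbus_hidden(smbus_unhide(word)));
    assert(smbus_unhide(word) == 0x0040);  /* other bits are preserved */
}
```

With the test patch applied, the equivalent of smbus_unhide() is simply never
called, so the device stays invisible to the PCI core and never tries to
claim its I/O region.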


Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-13 Thread Bjorn Helgaas
On Thursday 13 December 2007 12:09:23 am Borislav Petkov wrote:
> On Wed, Dec 12, 2007 at 09:21:41AM -0700, Bjorn Helgaas wrote:
> > On Wednesday 12 December 2007 03:11:23 am Borislav Petkov wrote:
> > > On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > > > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > > > From what i can roughly tell so far it seems like an resource conflict between acpi and
> > > > > the pnp requested regions in your patch which result in the acpi_thermal code
> > > > > to read the wrong (0xff) temperature value and halt the machine, but i might be
> > > > > wrong on the details since acpi is such a big code chunk to swallow.
> > > > > 
> > I think Alexey is on the right track with the PCI resource allocation
> > failure.
> 
> Then it should be the SMBus controller, PCI id 00:1f:3, which is having problems
> registering its io ports region 4, AFAICT.

Yes, it looks like the ioport region 0x540-0x55f is described both in
PNP and ACPI:

  /sys/devices/pnp0/00:0d/resources:state = active
  /sys/devices/pnp0/00:0d/resources:io 0x540-0x55f
  /sys/devices/pnp0/00:0d/resources:io 0x400-0x47f

  00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 03)
        Subsystem: ASUSTeK Computer Inc. Unknown device 1869
        Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin B routed to IRQ 0
        Region 4: I/O ports at 0540 [size=32]

The PCI SMBus device was enabled by a quirk, asus_hides_smbus_lpc().

This quirk seems dangerous to me, and the comments above asus_hides_smbus
allude to problems similar to what you're seeing.  It's obvious that a
lot of blood, sweat, and tears have gone into this quirk, so I'm not
suggesting that it's time to revert it, but I would be interested in
knowing whether the critical temperature problem goes away if we leave
the PCI device hidden, e.g., with the following patch:

Index: linux-mm/drivers/pci/quirks.c
===
--- linux-mm.orig/drivers/pci/quirks.c	2007-12-13 09:11:31.0 -0700
+++ linux-mm/drivers/pci/quirks.c	2007-12-13 09:12:27.0 -0700
@@ -1073,12 +1073,7 @@
 
 	pci_read_config_word(dev, 0xF2, &val);
 	if (val & 0x8) {
-		pci_write_config_word(dev, 0xF2, val & (~0x8));
-		pci_read_config_word(dev, 0xF2, &val);
-		if (val & 0x8)
-			printk(KERN_INFO "PCI: i801 SMBus device continues to play 'hide and seek'! 0x%x\n", val);
-		else
-			printk(KERN_INFO "PCI: Enabled i801 SMBus device\n");
+		printk(KERN_INFO "PCI: Leaving i801 SMBus device hidden\n");
 	}
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_82801AA_0,	asus_hides_smbus_lpc);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
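
The conflict described in the message above comes down to two inclusive I/O
port ranges covering the same addresses: the PNP resource 0x540-0x55f and PCI
region 4 at 0x540 with size 32. A minimal sketch of that check in C follows.
It is an illustration only, not how the kernel resource tree is implemented,
and struct io_range, io_range_from_bar() and io_ranges_overlap() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive I/O port range, as printed in the pnp "resources" file. */
struct io_range {
    uint16_t start;
    uint16_t end;
};

/* Build an inclusive range from a base and size, the way lspci reports
 * a region ("I/O ports at 0540 [size=32]"). */
struct io_range io_range_from_bar(uint16_t base, uint16_t size)
{
    assert(size > 0);
    assert((uint32_t)base + size - 1 <= 0xFFFF);  /* stays in 16-bit port space */
    struct io_range r = { base, (uint16_t)(base + size - 1) };
    return r;
}

/* Two inclusive ranges overlap if each one starts no later than the
 * other one ends. */
bool io_ranges_overlap(struct io_range a, struct io_range b)
{
    return a.start <= b.end && b.start <= a.end;
}
```

Feeding in the numbers from the message, io_range_from_bar(0x540, 32) ends at
0x55f, which is exactly the PNP range, while the other PNP entry,
0x400-0x47f, does not touch it.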


Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-12 Thread Borislav Petkov
On Wed, Dec 12, 2007 at 09:21:41AM -0700, Bjorn Helgaas wrote:
> On Wednesday 12 December 2007 03:11:23 am Borislav Petkov wrote:
> > On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > > From what i can roughly tell so far it seems like an resource conflict between acpi and
> > > > the pnp requested regions in your patch which result in the acpi_thermal code
> > > > to read the wrong (0xff) temperature value and halt the machine, but i might be
> > > > wrong on the details since acpi is such a big code chunk to swallow.
> > > 
> > > I don't see any obvious conflict from the log you posted.  For the sake
> > > of comparison, can you post the corresponding dmesg log after you removed
> > > the patch?
> > 
> > The only difference i see is that ACPI finds EC in DSDT in the working kernel
> > and in the broken case something silently fails. Please find attached the 2 bootlogs
> > and a disassembled DSDT.
> 
> Thanks very much!
> 
> "ACPI: EC: Look up EC in DSDT" appears in the working log, but not
> in the broken one.  But I think we *do* find the EC in both cases,
> because we see "ACPI: EC: non-query interrupt received" even before
> acpi_ec_add() (which prints the "ACPI: EC: GPE = 0x1c, ...").  Maybe
> the logs were collected with different log levels?

Well, hm, actually no, the only difference is that the broken log was taken over
netconsole so the lines might appear in a different order. I'll capture that
log again on the weekend to see whether something is missing..
 
> I think Alexey is on the right track with the PCI resource allocation
> failure.

Then it should be the SMBus controller, PCI id 00:1f:3, which is having problems
registering its io ports region 4, AFAICT.

> On your working kernel, can you collect this:
> 
>   lspci -vv > lspci
>   cat /proc/ioports > ioports
>   cat /proc/iomem > iomem
>   grep . /sys/devices/pnp*/*/resources > pnp
>   tar -jcf resources.tar.bz2 lspci ioports iomem pnp

attached.

-- 
Regards/Gruß,
Boris.


resources.tar.bz2
Description: Binary data


Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-12 Thread Bjorn Helgaas
On Wednesday 12 December 2007 03:11:23 am Borislav Petkov wrote:
> On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > From what i can roughly tell so far it seems like an resource conflict between acpi and
> > > the pnp requested regions in your patch which result in the acpi_thermal code
> > > to read the wrong (0xff) temperature value and halt the machine, but i might be
> > > wrong on the details since acpi is such a big code chunk to swallow.
> > 
> > I don't see any obvious conflict from the log you posted.  For the sake
> > of comparison, can you post the corresponding dmesg log after you removed
> > the patch?
> 
> The only difference i see is that ACPI finds EC in DSDT in the working kernel
> and in the broken case something silently fails. Please find attached the 2 bootlogs
> and a disassembled DSDT.

Thanks very much!

"ACPI: EC: Look up EC in DSDT" appears in the working log, but not
in the broken one.  But I think we *do* find the EC in both cases,
because we see "ACPI: EC: non-query interrupt received" even before
acpi_ec_add() (which prints the "ACPI: EC: GPE = 0x1c, ...").  Maybe
the logs were collected with different log levels?

I think Alexey is on the right track with the PCI resource allocation
failure.  On your working kernel, can you collect this:

  lspci -vv > lspci
  cat /proc/ioports > ioports
  cat /proc/iomem > iomem
  grep . /sys/devices/pnp*/*/resources > pnp
  tar -jcf resources.tar.bz2 lspci ioports iomem pnp

Bjorn
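
The files requested above are plain text, and /proc/ioports in particular
lists one "start-end : owner" range per line in hex. When comparing such a
dump against the PNP resources by hand gets tedious, a few lines of C can
pull the numbers out. This is a sketch under assumptions: the function name
parse_ioports_line() is made up, and the sample line in the usage note is
illustrative, not copied from the attached archive.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the leading "start-end" pair of a /proc/ioports-style line.
 * Returns 1 on success, 0 if the line does not begin with two hex
 * numbers separated by '-'. Leading spaces (used for nested entries)
 * are skipped by the %x conversion. */
int parse_ioports_line(const char *line, unsigned int *start, unsigned int *end)
{
    assert(line != NULL && start != NULL && end != NULL);
    return sscanf(line, "%x-%x", start, end) == 2;
}
```

Called on a line such as "0540-055f : 0000:00:1f.3", this yields start 0x540
and end 0x55f, which can then be compared with the "io 0x540-0x55f" entry
from the pnp file.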


Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-12 Thread Alexey Starikovskiy

Borislav Petkov wrote:
> On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > From what i can roughly tell so far it seems like an resource conflict between acpi and
> > > the pnp requested regions in your patch which result in the acpi_thermal code
> > > to read the wrong (0xff) temperature value and halt the machine, but i might be
> > > wrong on the details since acpi is such a big code chunk to swallow.
> > 
> > I don't see any obvious conflict from the log you posted.  For the sake
> > of comparison, can you post the corresponding dmesg log after you removed
> > the patch?
> 
> The only difference i see is that ACPI finds EC in DSDT in the working kernel
> and in the broken case something silently fails. Please find attached the 2 bootlogs
> and a disassembled DSDT.

This seems to be the start of trouble...
   PCI: Cannot allocate resource region 4 of device :00:1f.3



Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-12 Thread Borislav Petkov
On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > From what i can roughly tell so far it seems like an resource conflict between acpi and
> > the pnp requested regions in your patch which result in the acpi_thermal code
> > to read the wrong (0xff) temperature value and halt the machine, but i might be
> > wrong on the details since acpi is such a big code chunk to swallow.
> 
> I don't see any obvious conflict from the log you posted.  For the sake
> of comparison, can you post the corresponding dmesg log after you removed
> the patch?

The only difference i see is that ACPI finds EC in DSDT in the working kernel
and in the broken case something silently fails. Please find attached the 2 bootlogs
and a disassembled DSDT.

-- 
Regards/Gruß,
Boris.


(attachment, bzip2-compressed)
Description: Binary data


Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-11 Thread Bjorn Helgaas
On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> From what i can roughly tell so far it seems like an resource conflict between acpi and
> the pnp requested regions in your patch which result in the acpi_thermal code
> to read the wrong (0xff) temperature value and halt the machine, but i might be
> wrong on the details since acpi is such a big code chunk to swallow.

I don't see any obvious conflict from the log you posted.  For the sake
of comparison, can you post the corresponding dmesg log after you removed
the patch?

acpi_thermal_get_temperature() only evaluates _TMP, which isn't very
interesting.  I wonder if there's some conflict between that AML method
and the EC driver or something.

If you can also collect the DSDT, maybe I can poke around in there and
see what _TMP is really doing.

Thanks,
  Bjorn
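
For context on the numbers in this thread: ACPI's _TMP object reports
temperature in tenths of a kelvin. The debug line "Old temp: 4294967023"
quoted elsewhere in the thread equals 2^32 - 273, which is what -273 looks
like when printed as an unsigned 32-bit value. One plausible reading,
assuming the added printk showed a Celsius conversion through an unsigned
format (the thread does not show that code), is that the stored reading was
still 0 from the zeroed allocation. The sketch below only works through the
arithmetic; decikelvin_to_celsius() and as_unsigned32() are hypothetical
helpers, not the kernel's macros.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a reading in tenths of a kelvin to whole degrees Celsius.
 * 2732 tenths of a kelvin is 0 C; C division truncates toward zero. */
long decikelvin_to_celsius(long dk)
{
    return (dk - 2732) / 10;
}

/* What a (possibly negative) Celsius value becomes when reinterpreted
 * as an unsigned 32-bit number, e.g. by an unsigned printk format on a
 * 32-bit machine. */
uint32_t as_unsigned32(long celsius)
{
    assert(celsius >= INT32_MIN && celsius <= INT32_MAX);
    return (uint32_t)celsius;
}
```

A raw reading of 0 converts to -273 C, and -273 reinterpreted as unsigned
32-bit is 4294967023. In the other direction, a reported 255 C corresponds to
a raw value of 5282 under this conversion.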


Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-11 Thread Borislav Petkov
On Tue, Dec 11, 2007 at 01:00:24PM -0700, Bjorn Helgaas wrote:
> On Tuesday 11 December 2007 10:44:43 am Borislav Petkov wrote:
> > On Sun, Dec 09, 2007 at 10:19:47AM +0100, Borislav Petkov wrote:
> > > On Sun, Dec 09, 2007 at 08:50:02AM +0100, Borislav Petkov wrote:
> > > > Hi Andrew,
> > > > Hi Len,
> > > > 
> > > > after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
> > > > fine) on my asus laptop, the machine reboots after claiming that
> > > > "Critical temperature reached (255 C)." However, the degrees number
> > > > is kinda hinting at 0xff all-ones field. Will try dump_stack in
> > > > acpi_thermal_critical() to checkout the call path. For now here's the netconsole bootlog:
> > > 
> > > Here's what i got so far:
> > > 
> > > [   50.287939] Pid: 1, comm: swapper Not tainted 2.6.24-rc4-mm1 #14
> > > [   50.287999]  [] show_trace_log_lvl+0x12/0x25
> > > [   50.288103]  [] show_trace+0xd/0x10
> > > [   50.288202]  [] dump_stack+0x57/0x5f
> > > [   50.288303]  [] acpi_thermal_check+0x150/0x3bb
> > > [   50.288415]  [] acpi_thermal_add+0x261/0x2cf
> > > [   50.288515]  [] acpi_device_probe+0x3e/0xdb
> > > [   50.288615]  [] driver_probe_device+0xaf/0x12a
> > > [   50.288717]  [] __driver_attach+0x6c/0xa5
> > > [   50.288817]  [] bus_for_each_dev+0x3e/0x60
> > > [   50.288916]  [] driver_attach+0x14/0x16
> > > [   50.289015]  [] bus_add_driver+0xa6/0x1a8
> > > [   50.289114]  [] driver_register+0x42/0x47
> > > [   50.289214]  [] acpi_bus_register_driver+0x3a/0x3c
> > > [   50.289316]  [] acpi_thermal_init+0x57/0x76
> > > [   50.289424]  [] kernel_init+0x138/0x280
> > > [   50.289525]  [] kernel_thread_helper+0x7/0x10
> > > [   50.289625]  ===
> > > [   50.289680] ACPI: Critical trip point
> > > [   50.289736] Critical temperature reached (255 C), shutting down.
> > > 
> > > so in acpi_thermal_get_temperature() called in acpi_thermal_add() the
> > > tz->temperature thingy is not set properly (printk's added):
> > > 
> > > [   50.276607] Old temp: 4294967023
> > > [   50.281890] Got temp: 255
> > > [   50.282567] Old temp: 255
> > > [   50.287882] Got temp: 255
> > > 
> > > > What's also strange is that the tz acpi_thermal is alloc'd with kzalloc
> > > > and there's still garbage in it after reading it in
> > > > acpi_thermal_get_temperature() for the first time. Debugging continues...
> > 
> > (i almost suspected that the problem might be something completely 
> > different.)
> > well, after bisecting the rc4-mm1 tree for a whole day today, the evildoer
> > turned out to be
> > 
> > broken-out/pnp-request-ioport-and-iomem-resources-used-by-active-devices.patch.
> > 
> > After backing this one out, mm1 boots just fine here.
> 
> Thanks for tracking this down.  I'll look into your logs and see if I
> can figure out what's going on.  There's another report related to that
> patch here: http://lkml.org/lkml/2007/11/22/110 .  Looks like a different
> symptom though, so probably a different fix.

From what i can roughly tell so far it seems like a resource conflict between
acpi and the pnp requested regions in your patch, which results in the
acpi_thermal code reading the wrong (0xff) temperature value and halting the
machine, but i might be wrong on the details since acpi is such a big code
chunk to swallow. Anyways, this is a different issue than the one you quote
above.

-- 
Regards/Gruß,
Boris.


Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-11 Thread Bjorn Helgaas
On Tuesday 11 December 2007 10:44:43 am Borislav Petkov wrote:
> On Sun, Dec 09, 2007 at 10:19:47AM +0100, Borislav Petkov wrote:
> > On Sun, Dec 09, 2007 at 08:50:02AM +0100, Borislav Petkov wrote:
> > > Hi Andrew,
> > > Hi Len,
> > > 
> > > after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
> > > fine) on my asus laptop, the machine reboots after claiming that
> > > "Critical temperature reached (255 C)." However, the degrees number
> > > is kinda hinting at 0xff all-ones field. Will try dump_stack in
> > > acpi_thermal_critical() to checkout the call path. For now here's the 
> > > netconsole bootlog:
> > 
> > Here's what i got so far:
> > 
> > [   50.287939] Pid: 1, comm: swapper Not tainted 2.6.24-rc4-mm1 #14
> > [   50.287999]  [] show_trace_log_lvl+0x12/0x25
> > [   50.288103]  [] show_trace+0xd/0x10
> > [   50.288202]  [] dump_stack+0x57/0x5f
> > [   50.288303]  [] acpi_thermal_check+0x150/0x3bb
> > [   50.288415]  [] acpi_thermal_add+0x261/0x2cf
> > [   50.288515]  [] acpi_device_probe+0x3e/0xdb
> > [   50.288615]  [] driver_probe_device+0xaf/0x12a
> > [   50.288717]  [] __driver_attach+0x6c/0xa5
> > [   50.288817]  [] bus_for_each_dev+0x3e/0x60
> > [   50.288916]  [] driver_attach+0x14/0x16
> > [   50.289015]  [] bus_add_driver+0xa6/0x1a8
> > [   50.289114]  [] driver_register+0x42/0x47
> > [   50.289214]  [] acpi_bus_register_driver+0x3a/0x3c
> > [   50.289316]  [] acpi_thermal_init+0x57/0x76
> > [   50.289424]  [] kernel_init+0x138/0x280
> > [   50.289525]  [] kernel_thread_helper+0x7/0x10
> > [   50.289625]  ===
> > [   50.289680] ACPI: Critical trip point
> > [   50.289736] Critical temperature reached (255 C), shutting down.
> > 
> > so in acpi_thermal_get_temperature() called in acpi_thermal_add() the
> > tz->temperature thingy is not set properly (printk's added):
> > 
> > [   50.276607] Old temp: 4294967023
> > [   50.281890] Got temp: 255
> > [   50.282567] Old temp: 255
> > [   50.287882] Got temp: 255
> > 
> > What's also strange is that the tz acpi_thermal is alloc'd with kzalloc and
> > > there's still garbage in it after reading it in
> > > acpi_thermal_get_temperature() for the first time. Debugging continues...
> 
> (i almost suspected that the problem might be something completely different.)
> well, after bisecting the rc4-mm1 tree for a whole day today, the evildoer
> turned out to be
> 
> broken-out/pnp-request-ioport-and-iomem-resources-used-by-active-devices.patch.
> 
> After backing this one out, mm1 boots just fine here.

Thanks for tracking this down.  I'll look into your logs and see if I
can figure out what's going on.  There's another report related to that
patch here: http://lkml.org/lkml/2007/11/22/110 .  Looks like a different
symptom though, so probably a different fix.

Bjorn



Re: 2.6.24-rc4-mm1: acpi reboots machine... solved

2007-12-11 Thread Borislav Petkov
On Sun, Dec 09, 2007 at 10:19:47AM +0100, Borislav Petkov wrote:
> On Sun, Dec 09, 2007 at 08:50:02AM +0100, Borislav Petkov wrote:
> > Hi Andrew,
> > Hi Len,
> > 
> > after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
> > fine) on my asus laptop, the machine reboots after claiming that
> > "Critical temperature reached (255 C)." However, the degrees number
> > is kinda hinting at 0xff all-ones field. Will try dump_stack in
> > acpi_thermal_critical() to checkout the call path. For now here's the 
> > netconsole bootlog:
> 
> Here's what i got so far:
> 
> [   50.287939] Pid: 1, comm: swapper Not tainted 2.6.24-rc4-mm1 #14
> [   50.287999]  [] show_trace_log_lvl+0x12/0x25
> [   50.288103]  [] show_trace+0xd/0x10
> [   50.288202]  [] dump_stack+0x57/0x5f
> [   50.288303]  [] acpi_thermal_check+0x150/0x3bb
> [   50.288415]  [] acpi_thermal_add+0x261/0x2cf
> [   50.288515]  [] acpi_device_probe+0x3e/0xdb
> [   50.288615]  [] driver_probe_device+0xaf/0x12a
> [   50.288717]  [] __driver_attach+0x6c/0xa5
> [   50.288817]  [] bus_for_each_dev+0x3e/0x60
> [   50.288916]  [] driver_attach+0x14/0x16
> [   50.289015]  [] bus_add_driver+0xa6/0x1a8
> [   50.289114]  [] driver_register+0x42/0x47
> [   50.289214]  [] acpi_bus_register_driver+0x3a/0x3c
> [   50.289316]  [] acpi_thermal_init+0x57/0x76
> [   50.289424]  [] kernel_init+0x138/0x280
> [   50.289525]  [] kernel_thread_helper+0x7/0x10
> [   50.289625]  ===
> [   50.289680] ACPI: Critical trip point
> [   50.289736] Critical temperature reached (255 C), shutting down.
> 
> so in acpi_thermal_get_temperature() called in acpi_thermal_add() the
> tz->temperature thingy is not set properly (printk's added):
> 
> [   50.276607] Old temp: 4294967023
> [   50.281890] Got temp: 255
> [   50.282567] Old temp: 255
> [   50.287882] Got temp: 255
> 
> What's also strange is that the tz acpi_thermal is alloc'd with kzalloc and
> there's still garbage in it after reading it in acpi_thermal_get_temperature()
> for the first time. Debugging continues...

(i almost suspected that the problem might be something completely different.)
well, after bisecting the rc4-mm1 tree for a whole day today, the evildoer
turned out to be

broken-out/pnp-request-ioport-and-iomem-resources-used-by-active-devices.patch.

After backing this one out, mm1 boots just fine here.
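
The bisection itself amounts to a binary search over the ordered broken-out series. Below is a minimal sketch, assuming a single offending patch and a hypothetical boots_ok(n) test that stands for "apply the first n patches, build, boot". All patch names except the PNP one are placeholders, and the simulated boot result only illustrates the search.

```python
def first_bad_patch(series, boots_ok):
    """Binary-search an ordered patch series for the first patch that breaks boot.

    boots_ok(n) must report whether the tree with the first n patches applied
    still boots. Assumes boots_ok(0) is True and boots_ok(len(series)) is False.
    """
    good, bad = 0, len(series)
    while bad - good > 1:
        mid = (good + bad) // 2
        if boots_ok(mid):
            good = mid
        else:
            bad = mid
    return series[bad - 1]


# Placeholder series; only the PNP patch name is taken from this thread.
series = [
    "placeholder-a.patch",
    "placeholder-b.patch",
    "pnp-request-ioport-and-iomem-resources-used-by-active-devices.patch",
    "placeholder-c.patch",
    "placeholder-d.patch",
]
culprit_index = 2  # simulated: boot fails once this patch is applied


def simulated_boot(n):
    """Stand-in for a real build-and-boot cycle (hypothetical)."""
    return n <= culprit_index


print(first_bad_patch(series, simulated_boot))
```

Each step halves the remaining range, so even a series of several hundred patches needs only about ten build-and-boot cycles.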
-- 
Regards/Gruß,
Boris.



Re: 2.6.24-rc4-mm1: acpi reboots machine

2007-12-09 Thread Borislav Petkov
On Sun, Dec 09, 2007 at 08:50:02AM +0100, Borislav Petkov wrote:
> Hi Andrew,
> Hi Len,
> 
> after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
> fine) on my asus laptop, the machine reboots after claiming that
> "Critical temperature reached (255 C)." However, the degrees number
> is kinda hinting at 0xff all-ones field. Will try dump_stack in
> acpi_thermal_critical() to checkout the call path. For now here's the 
> netconsole bootlog:

Here's what i got so far:

[   50.287939] Pid: 1, comm: swapper Not tainted 2.6.24-rc4-mm1 #14
[   50.287999]  [<c0104b65>] show_trace_log_lvl+0x12/0x25
[   50.288103]  [<c01053e7>] show_trace+0xd/0x10
[   50.288202]  [<c0105a6c>] dump_stack+0x57/0x5f
[   50.288303]  [<c021c991>] acpi_thermal_check+0x150/0x3bb
[   50.288415]  [<c021d4b3>] acpi_thermal_add+0x261/0x2cf
[   50.288515]  [<c0213549>] acpi_device_probe+0x3e/0xdb
[   50.288615]  [<c023f8f5>] driver_probe_device+0xaf/0x12a
[   50.288717]  [<c023fa88>] __driver_attach+0x6c/0xa5
[   50.288817]  [<c023ee5a>] bus_for_each_dev+0x3e/0x60
[   50.288916]  [<c023f77d>] driver_attach+0x14/0x16
[   50.289015]  [<c023f5a6>] bus_add_driver+0xa6/0x1a8
[   50.289114]  [<c023fc53>] driver_register+0x42/0x47
[   50.289214]  [<c02138c2>] acpi_bus_register_driver+0x3a/0x3c
[   50.289316]  [<c044306b>] acpi_thermal_init+0x57/0x76
[   50.289424]  [<c04344a7>] kernel_init+0x138/0x280
[   50.289525]  [<c01047df>] kernel_thread_helper+0x7/0x10
[   50.289625]  ===
[   50.289680] ACPI: Critical trip point
[   50.289736] Critical temperature reached (255 C), shutting down.

so in acpi_thermal_get_temperature() called in acpi_thermal_add() the
tz->temperature thingy is not set properly (printk's added):

[   50.276607] Old temp: 4294967023
[   50.281890] Got temp: 255
[   50.282567] Old temp: 255
[   50.287882] Got temp: 255

What's also strange is that the tz acpi_thermal is alloc'd with kzalloc and
there's still garbage in it after reading it in acpi_thermal_get_temperature()
for the first time. Debugging continues...
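
A possible reading of the odd "Old temp" value, offered as a guess only: 4294967023 equals 2^32 - 273, i.e. -273 shown as an unsigned 32-bit number. A zeroed tz->temperature (0 tenths of a Kelvin) would give exactly that after conversion to Celsius, so the kzalloc'd structure may not contain garbage at all. The sketch below assumes the added printk printed the Celsius-converted value with an unsigned format, and that the conversion has the form (t - 2732 +/- 5) / 10 with C-style truncation. Both are assumptions, and the helper names are made up.

```python
def kelvin_to_celsius(deci_kelvin):
    """Convert tenths of a Kelvin to whole degrees Celsius.

    Adds or subtracts 5 before dividing by 10 and truncates toward zero,
    as C integer division would (assumed form of the conversion).
    """
    n = deci_kelvin - 2732
    n = n + 5 if n >= 0 else n - 5
    q = abs(n) // 10
    return q if n >= 0 else -q


def as_u32(value):
    """Reinterpret a signed value as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


print(as_u32(kelvin_to_celsius(0)))    # zeroed field, printed unsigned
print(kelvin_to_celsius(2732 + 2550))  # a reading of 255.0 C in tenths of K
```

Under these assumptions the two lines print 4294967023 and 255, matching the first "Old temp" and the "Got temp" values above. It says nothing about why _TMP reports 255 C in the first place.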
-- 
Regards/Gruß,
Boris.



2.6.24-rc4-mm1: acpi reboots machine

2007-12-08 Thread Borislav Petkov
Hi Andrew,
Hi Len,

after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
fine) on my asus laptop, the machine reboots after claiming that
"Critical temperature reached (255 C)." However, the degrees number
is kinda hinting at 0xff all-ones field. Will try dump_stack in
acpi_thermal_critical() to checkout the call path. For now here's the 
netconsole bootlog:

[0.00] Linux version 2.6.24-rc4-mm1 ([EMAIL PROTECTED]) (gcc version 
4.2.3 20071123 (prerelease) (Debian 4.2.2-4)) #7 SMP PREEMPT Sun Dec 9 08:27:26 
CET 2007
[0.00] BIOS-provided physical RAM map:
[0.00]  BIOS-e820:  - 0009fc00 (usable)
[0.00]  BIOS-e820: 0009fc00 - 000a (reserved)
[0.00]  BIOS-e820: 000e - 0010 (reserved)
[0.00]  BIOS-e820: 0010 - 1ff4 (usable)
[0.00]  BIOS-e820: 1ff4 - 1ff5 (ACPI data)
[0.00]  BIOS-e820: 1ff5 - 2000 (ACPI NVS)
[0.00] 511MB LOWMEM available.
[0.00] Zone PFN ranges:
[0.00]   DMA 0 -> 4096
[0.00]   Normal   4096 ->   130880
[0.00] Movable zone start PFN for each node
[0.00] early_node_map[1] active PFN ranges
[0.00] 0:0 ->   130880
[0.00] DMI 2.3 present.
[0.00] ACPI: RSDP 000F5DF0, 0014 (r0 ACPIAM)
[0.00] ACPI: RSDT 1FF4, 002C (r1 A M I  OEMRSDT   6000423 MSFT  
 97)
[0.00] ACPI: FACP 1FF40200, 0081 (r1 A M I  OEMFACP   6000423 MSFT  
 97)
[0.00] ACPI: DSDT 1FF40400, 628D (r1  1ABSP 1ABSP0011 MSFT  
201)
[0.00] ACPI: FACS 1FF5, 0040
[0.00] ACPI: OEMB 1FF50040, 0053 (r1 A M I  OEMBIOS   6000423 MSFT  
 97)
[0.00] ACPI: PM-Timer IO Port: 0x408
[0.00] Allocating PCI resources starting at 3000 (gap: 
2000:e000)
[0.00] swsusp: Registered nosave memory region: 0009f000 - 
000a
[0.00] swsusp: Registered nosave memory region: 000a - 
000e
[0.00] swsusp: Registered nosave memory region: 000e - 
0010
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total 
pages: 129475
[0.00] Kernel command line: root=/dev/hda1 vga=0 nmi_watchdog=1 [EMAIL 
PROTECTED]/,@192.168.45.26/
[0.00] Found and enabled local APIC!
[0.00] Enabling fast FPU save and restore... done.
[0.00] Enabling unmasked SIMD FPU exception support... done.
[0.00] Initializing CPU#0
[0.00] CPU 0 irqstacks, hard=c0451000 soft=c0449000
[0.00] PID hash table entries: 2048 (order: 11, 8192 bytes)
[0.00] Detected 1500.114 MHz processor.
[   50.138075] Console: colour VGA+ 80x25
[   50.138080] console [tty0] enabled
[   50.140479] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[   50.140882] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[   50.160065] Memory: 513364k/523520k available (2049k kernel code, 9712k 
reserved, 1113k data, 172k init, 0k highmem)
[   50.160147] virtual kernel memory layout:
[   50.160148] fixmap  : 0xfffb5000 - 0xf000   ( 296 kB)
[   50.160150] vmalloc : 0xe080 - 0xfffb3000   ( 503 MB)
[   50.160151] lowmem  : 0xc000 - 0xdff4   ( 511 MB)
[   50.160153]   .init : 0xc041b000 - 0xc0446000   ( 172 kB)
[   50.160154]   .data : 0xc030067f - 0xc0416ca8   (1113 kB)
[   50.160156]   .text : 0xc010 - 0xc030067f   (2049 kB)
[   50.160549] Checking if this processor honours the WP bit even in supervisor 
mode... Ok.
[   50.160705] SLUB: Genslabs=11, HWalign=64, Order=0-1, MinObjects=4, CPUs=1, 
Nodes=1
[   50.220728] Calibrating delay using timer specific routine.. 3003.73 
BogoMIPS (lpj=1501865)
[   50.220857] Security Framework initialized
[   50.220934] Mount-cache hash table entries: 512
[   50.221174] CPU: L1 I cache: 32K, L1 D cache: 32K
[   50.221273] CPU: L2 cache: 1024K
[   50.221338] Intel machine check architecture supported.
[   50.221398] Intel machine check reporting enabled on CPU#0.
[   50.221459] Compat vDSO mapped to e000.
[   50.221524] Checking 'hlt' instruction... OK.
[   50.225022] SMP alternatives: switching to UP code
[   50.225766] Freeing SMP alternatives: 11k freed
[   50.225823] ACPI: Core revision 20070126
[   50.229623] ACPI: setting ELCR to 0200 (from 0c30)
[   50.734915] CPU0: Intel(R) Pentium(R) M processor 1500MHz stepping 05
[   50.735059] SMP motherboard not detected.
[   50.836119] Brought up 1 CPUs
[   50.836305] khelper used greatest stack depth: 3352 bytes left
[   50.836463] net_namespace: 108 bytes
[   50.837167] NET: Registered protocol family 16
[   50.837466] ACPI: bus type pci registered
[   50.838812] PCI: PCI BIOS revision 2.10 entry at 0xf0031, last bus=2
[   50.838872] PCI: Using configuration type 1
[   50.838928] Setting up standard PCI 
