Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Thu, Dec 13, 2007 at 09:17:18AM -0700, Bjorn Helgaas wrote:
> On Thursday 13 December 2007 12:09:23 am Borislav Petkov wrote:
> > On Wed, Dec 12, 2007 at 09:21:41AM -0700, Bjorn Helgaas wrote:
> > > On Wednesday 12 December 2007 03:11:23 am Borislav Petkov wrote:
> > > > On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > > > > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > > > > From what i can roughly tell so far it seems like a resource
> > > > > > conflict between acpi and the pnp requested regions in your patch
> > > > > > which results in the acpi_thermal code reading the wrong (0xff)
> > > > > > temperature value and halting the machine, but i might be wrong on
> > > > > > the details since acpi is such a big code chunk to swallow.
> > >
> > > I think Alexey is on the right track with the PCI resource allocation
> > > failure.
> >
> > Then it should be the SMBus controller, PCI id 00:1f.3, which is having
> > problems registering its io ports region 4, AFAICT.
>
> Yes, it looks like the ioport region 0x540-0x55f is described both in
> PNP and ACPI:
>
>   /sys/devices/pnp0/00:0d/resources:state = active
>   /sys/devices/pnp0/00:0d/resources:io 0x540-0x55f
>   /sys/devices/pnp0/00:0d/resources:io 0x400-0x47f
>
>   00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 03)
>         Subsystem: ASUSTeK Computer Inc. Unknown device 1869
>         Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>         Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>         Interrupt: pin B routed to IRQ 0
>         Region 4: I/O ports at 0540 [size=32]
>
> The PCI SMBus device was enabled by a quirk, asus_hides_smbus_lpc().
>
> This quirk seems dangerous to me, and the comments above asus_hides_smbus
> allude to problems similar to what you're seeing.  It's obvious that a
> lot of blood, sweat, and tears have gone into this quirk, so I'm not
> suggesting that it's time to revert it, but I would be interested in
> knowing whether the critical temperature problem goes away if we leave
> the PCI device hidden, e.g., with the following patch:
>
> Index: linux-mm/drivers/pci/quirks.c
> ===
> --- linux-mm.orig/drivers/pci/quirks.c  2007-12-13 09:11:31.0 -0700
> +++ linux-mm/drivers/pci/quirks.c       2007-12-13 09:12:27.0 -0700
> @@ -1073,12 +1073,7 @@
>
>          pci_read_config_word(dev, 0xF2, &val);
>          if (val & 0x8) {
> -                pci_write_config_word(dev, 0xF2, val & (~0x8));
> -                pci_read_config_word(dev, 0xF2, &val);
> -                if (val & 0x8)
> -                        printk(KERN_INFO "PCI: i801 SMBus device continues to play 'hide and seek'! 0x%x\n", val);
> -                else
> -                        printk(KERN_INFO "PCI: Enabled i801 SMBus device\n");
> +                printk(KERN_INFO "PCI: Leaving i801 SMBus device hidden\n");
>          }
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801AA_0, asus_hides_smbus_lpc);

yep, this fixes it. Bootlog attached.

--
Regards/Gruß,
Boris.

bootlog-smbus-hidden.bz2
Description: Binary data
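For readers following the patch, here is a small userspace sketch of the bit test it revolves around. It assumes, based only on the quoted hunk, that bit 3 (0x8) of the 16-bit word at config offset 0xF2 is the flag that keeps the SMBus device hidden; the helper names are made up for illustration and are not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption taken from the quoted hunk: bit 3 (0x8) of the word read
 * from config offset 0xF2 means "SMBus device hidden". */
#define SMBUS_HIDE_BIT 0x8u

/* Hypothetical helper: true if the hide flag is set in a register value. */
static inline bool smbus_is_hidden(uint16_t val)
{
    return (val & SMBUS_HIDE_BIT) != 0;
}

/* Hypothetical helper: the value the unpatched quirk writes back, i.e.
 * the same word with only the hide flag cleared. */
static inline uint16_t smbus_unhide(uint16_t val)
{
    return (uint16_t)(val & ~SMBUS_HIDE_BIT);
}
```

With the patch applied the write-back is simply skipped, so the flag stays set and the device never shows up on the PCI bus.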
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Thursday 13 December 2007 12:09:23 am Borislav Petkov wrote:
> On Wed, Dec 12, 2007 at 09:21:41AM -0700, Bjorn Helgaas wrote:
> > On Wednesday 12 December 2007 03:11:23 am Borislav Petkov wrote:
> > > On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > > > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > > > From what i can roughly tell so far it seems like a resource
> > > > > conflict between acpi and the pnp requested regions in your patch
> > > > > which results in the acpi_thermal code reading the wrong (0xff)
> > > > > temperature value and halting the machine, but i might be wrong on
> > > > > the details since acpi is such a big code chunk to swallow.
> >
> > I think Alexey is on the right track with the PCI resource allocation
> > failure.
>
> Then it should be the SMBus controller, PCI id 00:1f.3, which is having
> problems registering its io ports region 4, AFAICT.

Yes, it looks like the ioport region 0x540-0x55f is described both in
PNP and ACPI:

  /sys/devices/pnp0/00:0d/resources:state = active
  /sys/devices/pnp0/00:0d/resources:io 0x540-0x55f
  /sys/devices/pnp0/00:0d/resources:io 0x400-0x47f

  00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 03)
        Subsystem: ASUSTeK Computer Inc. Unknown device 1869
        Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin B routed to IRQ 0
        Region 4: I/O ports at 0540 [size=32]

The PCI SMBus device was enabled by a quirk, asus_hides_smbus_lpc().

This quirk seems dangerous to me, and the comments above asus_hides_smbus
allude to problems similar to what you're seeing.  It's obvious that a
lot of blood, sweat, and tears have gone into this quirk, so I'm not
suggesting that it's time to revert it, but I would be interested in
knowing whether the critical temperature problem goes away if we leave
the PCI device hidden, e.g., with the following patch:

Index: linux-mm/drivers/pci/quirks.c
===
--- linux-mm.orig/drivers/pci/quirks.c  2007-12-13 09:11:31.0 -0700
+++ linux-mm/drivers/pci/quirks.c       2007-12-13 09:12:27.0 -0700
@@ -1073,12 +1073,7 @@

         pci_read_config_word(dev, 0xF2, &val);
         if (val & 0x8) {
-                pci_write_config_word(dev, 0xF2, val & (~0x8));
-                pci_read_config_word(dev, 0xF2, &val);
-                if (val & 0x8)
-                        printk(KERN_INFO "PCI: i801 SMBus device continues to play 'hide and seek'! 0x%x\n", val);
-                else
-                        printk(KERN_INFO "PCI: Enabled i801 SMBus device\n");
+                printk(KERN_INFO "PCI: Leaving i801 SMBus device hidden\n");
         }
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801AA_0, asus_hides_smbus_lpc);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
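To make the overlap concrete: the PNP listing gives an inclusive range (0x540-0x55f), while lspci gives a base and a size (0540, size=32). The sketch below, with hypothetical type and function names and nothing taken from kernel code, converts the second form into the first and applies the usual interval-overlap test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive I/O port range, in the form the PNP "resources" files and
 * /proc/ioports print it (e.g. 0x540-0x55f). Hypothetical type. */
struct io_range {
    uint32_t start;
    uint32_t end;
};

/* Build an inclusive range from the base/size form lspci prints for a
 * BAR ("I/O ports at 0540 [size=32]"). size must be non-zero. */
static inline struct io_range io_range_from_size(uint32_t base, uint32_t size)
{
    struct io_range r = { base, base + size - 1 };
    return r;
}

/* Two inclusive ranges overlap when each one starts no later than the
 * other one ends. */
static inline bool io_ranges_overlap(struct io_range a, struct io_range b)
{
    return a.start <= b.end && b.start <= a.end;
}
```

Fed the numbers from the listings above, the SMBus BAR comes out as exactly 0x540-0x55f, the same ports PNP device 00:0d reports, whereas 00:0d's other range (0x400-0x47f) does not touch it.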
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Wed, Dec 12, 2007 at 09:21:41AM -0700, Bjorn Helgaas wrote:
> On Wednesday 12 December 2007 03:11:23 am Borislav Petkov wrote:
> > On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > > From what i can roughly tell so far it seems like a resource conflict
> > > > between acpi and the pnp requested regions in your patch which results
> > > > in the acpi_thermal code reading the wrong (0xff) temperature value and
> > > > halting the machine, but i might be wrong on the details since acpi is
> > > > such a big code chunk to swallow.
> > >
> > > I don't see any obvious conflict from the log you posted.  For the sake
> > > of comparison, can you post the corresponding dmesg log after you removed
> > > the patch?
> >
> > The only difference i see is that ACPI finds EC in DSDT in the working
> > kernel and in the broken case something silently fails. Please find
> > attached the 2 bootlogs and a disassembled DSDT.
>
> Thanks very much!
>
> "ACPI: EC: Look up EC in DSDT" appears in the working log, but not
> in the broken one.  But I think we *do* find the EC in both cases,
> because we see "ACPI: EC: non-query interrupt received" even before
> acpi_ec_add() (which prints the "ACPI: EC: GPE = 0x1c, ...").  Maybe
> the logs were collected with different log levels?

Well, hm, actually no, the only difference is that the broken log was taken
over netconsole so the lines might appear in a different order. I'll capture
that log again on the weekend to see whether something is missing..

> I think Alexey is on the right track with the PCI resource allocation
> failure.

Then it should be the SMBus controller, PCI id 00:1f.3, which is having
problems registering its io ports region 4, AFAICT.

> On your working kernel, can you collect this:
>
>   lspci -vv > lspci
>   cat /proc/ioports > ioports
>   cat /proc/iomem > iomem
>   grep . /sys/devices/pnp*/*/resources > pnp
>   tar -jcf resources.tar.bz2 lspci ioports iomem pnp

attached.

--
Regards/Gruß,
Boris.

resources.tar.bz2
Description: Binary data
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Wednesday 12 December 2007 03:11:23 am Borislav Petkov wrote:
> On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > From what i can roughly tell so far it seems like a resource conflict
> > > between acpi and the pnp requested regions in your patch which results
> > > in the acpi_thermal code reading the wrong (0xff) temperature value and
> > > halting the machine, but i might be wrong on the details since acpi is
> > > such a big code chunk to swallow.
> >
> > I don't see any obvious conflict from the log you posted.  For the sake
> > of comparison, can you post the corresponding dmesg log after you removed
> > the patch?
>
> The only difference i see is that ACPI finds EC in DSDT in the working kernel
> and in the broken case something silently fails. Please find attached the 2
> bootlogs and a disassembled DSDT.

Thanks very much!

"ACPI: EC: Look up EC in DSDT" appears in the working log, but not
in the broken one.  But I think we *do* find the EC in both cases,
because we see "ACPI: EC: non-query interrupt received" even before
acpi_ec_add() (which prints the "ACPI: EC: GPE = 0x1c, ...").  Maybe
the logs were collected with different log levels?

I think Alexey is on the right track with the PCI resource allocation
failure.

On your working kernel, can you collect this:

  lspci -vv > lspci
  cat /proc/ioports > ioports
  cat /proc/iomem > iomem
  grep . /sys/devices/pnp*/*/resources > pnp
  tar -jcf resources.tar.bz2 lspci ioports iomem pnp

Bjorn
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
Borislav Petkov wrote:
> On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> > On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > > From what i can roughly tell so far it seems like a resource conflict
> > > between acpi and the pnp requested regions in your patch which results
> > > in the acpi_thermal code reading the wrong (0xff) temperature value and
> > > halting the machine, but i might be wrong on the details since acpi is
> > > such a big code chunk to swallow.
> >
> > I don't see any obvious conflict from the log you posted.  For the sake
> > of comparison, can you post the corresponding dmesg log after you removed
> > the patch?
>
> The only difference i see is that ACPI finds EC in DSDT in the working kernel
> and in the broken case something silently fails. Please find attached the 2
> bootlogs and a disassembled DSDT.

This seems to be the start of trouble...

PCI: Cannot allocate resource region 4 of device :00:1f.3
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Tue, Dec 11, 2007 at 05:08:59PM -0700, Bjorn Helgaas wrote:
> On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> > From what i can roughly tell so far it seems like a resource conflict
> > between acpi and the pnp requested regions in your patch which results
> > in the acpi_thermal code reading the wrong (0xff) temperature value and
> > halting the machine, but i might be wrong on the details since acpi is
> > such a big code chunk to swallow.
>
> I don't see any obvious conflict from the log you posted.  For the sake
> of comparison, can you post the corresponding dmesg log after you removed
> the patch?

The only difference i see is that ACPI finds EC in DSDT in the working kernel
and in the broken case something silently fails. Please find attached the 2
bootlogs and a disassembled DSDT.

--
Regards/Gruß,
Boris.
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> From what i can roughly tell so far it seems like a resource conflict
> between acpi and the pnp requested regions in your patch which results
> in the acpi_thermal code reading the wrong (0xff) temperature value and
> halting the machine, but i might be wrong on the details since acpi is
> such a big code chunk to swallow.

I don't see any obvious conflict from the log you posted.  For the sake
of comparison, can you post the corresponding dmesg log after you removed
the patch?

acpi_thermal_get_temperature() only evaluates _TMP, which isn't very
interesting.  I wonder if there's some conflict between that AML method
and the EC driver or something.  If you can also collect the DSDT, maybe
I can poke around in there and see what _TMP is really doing.

Thanks,
  Bjorn
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Tue, Dec 11, 2007 at 01:00:24PM -0700, Bjorn Helgaas wrote:
> On Tuesday 11 December 2007 10:44:43 am Borislav Petkov wrote:
> > On Sun, Dec 09, 2007 at 10:19:47AM +0100, Borislav Petkov wrote:
> > > On Sun, Dec 09, 2007 at 08:50:02AM +0100, Borislav Petkov wrote:
> > > > Hi Andrew,
> > > > Hi Len,
> > > >
> > > > after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
> > > > fine) on my asus laptop, the machine reboots after claiming that
> > > > "Critical temperature reached (255 C)." However, the degrees number
> > > > is kinda hinting at 0xff all-ones field. Will try dump_stack in
> > > > acpi_thermal_critical() to checkout the call path. For now here's the
> > > > netconsole bootlog:
> > >
> > > Here's what i got so far:
> > >
> > > [ 50.287939] Pid: 1, comm: swapper Not tainted 2.6.24-rc4-mm1 #14
> > > [ 50.287999] [] show_trace_log_lvl+0x12/0x25
> > > [ 50.288103] [] show_trace+0xd/0x10
> > > [ 50.288202] [] dump_stack+0x57/0x5f
> > > [ 50.288303] [] acpi_thermal_check+0x150/0x3bb
> > > [ 50.288415] [] acpi_thermal_add+0x261/0x2cf
> > > [ 50.288515] [] acpi_device_probe+0x3e/0xdb
> > > [ 50.288615] [] driver_probe_device+0xaf/0x12a
> > > [ 50.288717] [] __driver_attach+0x6c/0xa5
> > > [ 50.288817] [] bus_for_each_dev+0x3e/0x60
> > > [ 50.288916] [] driver_attach+0x14/0x16
> > > [ 50.289015] [] bus_add_driver+0xa6/0x1a8
> > > [ 50.289114] [] driver_register+0x42/0x47
> > > [ 50.289214] [] acpi_bus_register_driver+0x3a/0x3c
> > > [ 50.289316] [] acpi_thermal_init+0x57/0x76
> > > [ 50.289424] [] kernel_init+0x138/0x280
> > > [ 50.289525] [] kernel_thread_helper+0x7/0x10
> > > [ 50.289625] ===
> > > [ 50.289680] ACPI: Critical trip point
> > > [ 50.289736] Critical temperature reached (255 C), shutting down.
> > >
> > > so in acpi_thermal_get_temperature() called in acpi_thermal_add() the
> > > tz->temperature thingy is not set properly (printk's added):
> > >
> > > [ 50.276607] Old temp: 4294967023
> > > [ 50.281890] Got temp: 255
> > > [ 50.282567] Old temp: 255
> > > [ 50.287882] Got temp: 255
> > >
> > > What's also strange is that the tz acpi_thermal is alloc'd with kzalloc
> > > and there's still garbage in it after reading it in
> > > acpi_thermal_get_temperature() for the first time. Debugging continues...
> >
> > (i almost suspected that the problem might be something completely
> > different.)
> > well, after bisecting the rc4-mm1 tree for a whole day today, the evildoer
> > turned out to be
> >
> > broken-out/pnp-request-ioport-and-iomem-resources-used-by-active-devices.patch.
> >
> > After backing this one out, mm1 boots just fine here.
>
> Thanks for tracking this down.  I'll look into your logs and see if I
> can figure out what's going on.  There's another report related to that
> patch here: http://lkml.org/lkml/2007/11/22/110 .  Looks like a different
> symptom though, so probably a different fix.

From what i can roughly tell so far it seems like a resource conflict between
acpi and the pnp requested regions in your patch which results in the
acpi_thermal code reading the wrong (0xff) temperature value and halting the
machine, but i might be wrong on the details since acpi is such a big code
chunk to swallow. Anyways, this is a different issue than the one you quote
above.

--
Regards/Gruß,
Boris.
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Tuesday 11 December 2007 10:44:43 am Borislav Petkov wrote:
> On Sun, Dec 09, 2007 at 10:19:47AM +0100, Borislav Petkov wrote:
> > On Sun, Dec 09, 2007 at 08:50:02AM +0100, Borislav Petkov wrote:
> > > Hi Andrew,
> > > Hi Len,
> > >
> > > after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
> > > fine) on my asus laptop, the machine reboots after claiming that
> > > "Critical temperature reached (255 C)." However, the degrees number
> > > is kinda hinting at 0xff all-ones field. Will try dump_stack in
> > > acpi_thermal_critical() to checkout the call path. For now here's the
> > > netconsole bootlog:
> >
> > Here's what i got so far:
> >
> > [ 50.287939] Pid: 1, comm: swapper Not tainted 2.6.24-rc4-mm1 #14
> > [ 50.287999] [c0104b65] show_trace_log_lvl+0x12/0x25
> > [ 50.288103] [c01053e7] show_trace+0xd/0x10
> > [ 50.288202] [c0105a6c] dump_stack+0x57/0x5f
> > [ 50.288303] [c021c991] acpi_thermal_check+0x150/0x3bb
> > [ 50.288415] [c021d4b3] acpi_thermal_add+0x261/0x2cf
> > [ 50.288515] [c0213549] acpi_device_probe+0x3e/0xdb
> > [ 50.288615] [c023f8f5] driver_probe_device+0xaf/0x12a
> > [ 50.288717] [c023fa88] __driver_attach+0x6c/0xa5
> > [ 50.288817] [c023ee5a] bus_for_each_dev+0x3e/0x60
> > [ 50.288916] [c023f77d] driver_attach+0x14/0x16
> > [ 50.289015] [c023f5a6] bus_add_driver+0xa6/0x1a8
> > [ 50.289114] [c023fc53] driver_register+0x42/0x47
> > [ 50.289214] [c02138c2] acpi_bus_register_driver+0x3a/0x3c
> > [ 50.289316] [c044306b] acpi_thermal_init+0x57/0x76
> > [ 50.289424] [c04344a7] kernel_init+0x138/0x280
> > [ 50.289525] [c01047df] kernel_thread_helper+0x7/0x10
> > [ 50.289625] ===
> > [ 50.289680] ACPI: Critical trip point
> > [ 50.289736] Critical temperature reached (255 C), shutting down.
> >
> > so in acpi_thermal_get_temperature() called in acpi_thermal_add() the
> > tz->temperature thingy is not set properly (printk's added):
> >
> > [ 50.276607] Old temp: 4294967023
> > [ 50.281890] Got temp: 255
> > [ 50.282567] Old temp: 255
> > [ 50.287882] Got temp: 255
> >
> > What's also strange is that the tz acpi_thermal is alloc'd with kzalloc and
> > there's still garbage in it after reading it in
> > acpi_thermal_get_temperature() for the first time. Debugging continues...
>
> (i almost suspected that the problem might be something completely different.)
> well, after bisecting the rc4-mm1 tree for a whole day today, the evildoer
> turned out to be
>
> broken-out/pnp-request-ioport-and-iomem-resources-used-by-active-devices.patch.
>
> After backing this one out, mm1 boots just fine here.

Thanks for tracking this down. I'll look into your logs and see if I
can figure out what's going on. There's another report related to that
patch here: http://lkml.org/lkml/2007/11/22/110 . Looks like a different
symptom though, so probably a different fix.

Bjorn
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Sun, Dec 09, 2007 at 10:19:47AM +0100, Borislav Petkov wrote:
> On Sun, Dec 09, 2007 at 08:50:02AM +0100, Borislav Petkov wrote:
> > Hi Andrew,
> > Hi Len,
> >
> > after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
> > fine) on my asus laptop, the machine reboots after claiming that
> > "Critical temperature reached (255 C)." However, the degrees number
> > is kinda hinting at 0xff all-ones field. Will try dump_stack in
> > acpi_thermal_critical() to checkout the call path. For now here's the
> > netconsole bootlog:
>
> Here's what i got so far:
>
> [ 50.287939] Pid: 1, comm: swapper Not tainted 2.6.24-rc4-mm1 #14
> [ 50.287999] [c0104b65] show_trace_log_lvl+0x12/0x25
> [ 50.288103] [c01053e7] show_trace+0xd/0x10
> [ 50.288202] [c0105a6c] dump_stack+0x57/0x5f
> [ 50.288303] [c021c991] acpi_thermal_check+0x150/0x3bb
> [ 50.288415] [c021d4b3] acpi_thermal_add+0x261/0x2cf
> [ 50.288515] [c0213549] acpi_device_probe+0x3e/0xdb
> [ 50.288615] [c023f8f5] driver_probe_device+0xaf/0x12a
> [ 50.288717] [c023fa88] __driver_attach+0x6c/0xa5
> [ 50.288817] [c023ee5a] bus_for_each_dev+0x3e/0x60
> [ 50.288916] [c023f77d] driver_attach+0x14/0x16
> [ 50.289015] [c023f5a6] bus_add_driver+0xa6/0x1a8
> [ 50.289114] [c023fc53] driver_register+0x42/0x47
> [ 50.289214] [c02138c2] acpi_bus_register_driver+0x3a/0x3c
> [ 50.289316] [c044306b] acpi_thermal_init+0x57/0x76
> [ 50.289424] [c04344a7] kernel_init+0x138/0x280
> [ 50.289525] [c01047df] kernel_thread_helper+0x7/0x10
> [ 50.289625] ===
> [ 50.289680] ACPI: Critical trip point
> [ 50.289736] Critical temperature reached (255 C), shutting down.
>
> so in acpi_thermal_get_temperature() called in acpi_thermal_add() the
> tz->temperature thingy is not set properly (printk's added):
>
> [ 50.276607] Old temp: 4294967023
> [ 50.281890] Got temp: 255
> [ 50.282567] Old temp: 255
> [ 50.287882] Got temp: 255
>
> What's also strange is that the tz acpi_thermal is alloc'd with kzalloc and
> there's still garbage in it after reading it in acpi_thermal_get_temperature()
> for the first time. Debugging continues...

(i almost suspected that the problem might be something completely different.)
well, after bisecting the rc4-mm1 tree for a whole day today, the evildoer
turned out to be

broken-out/pnp-request-ioport-and-iomem-resources-used-by-active-devices.patch.

After backing this one out, mm1 boots just fine here.

--
Regards/Gruß,
Boris.
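For readers unfamiliar with the approach mentioned above: bisecting a patch series is a binary search for the first patch that makes the kernel misbehave. The following is only a minimal sketch, assuming the series is applied in order, that the bug is absent with no patches applied and present with all of them applied. The names bisect_series, is_bad_fn and bad_from are made up for illustration; in practice the predicate is "build, boot, and watch for the critical temperature shutdown".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate: nonzero if a kernel with the first n_applied
 * patches of the series applied shows the bug. */
typedef int (*is_bad_fn)(size_t n_applied, void *ctx);

/* Returns the 1-based index of the first bad patch.
 * Assumes: 0 patches applied is good, all `total` patches applied is bad.
 * If tests_run is non-NULL it is incremented once per test. */
size_t bisect_series(size_t total, is_bad_fn is_bad, void *ctx,
                     size_t *tests_run)
{
    size_t good = 0;      /* largest count known to be good */
    size_t bad = total;   /* smallest count known to be bad */

    assert(total > 0);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (tests_run)
            (*tests_run)++;
        if (is_bad(mid, ctx))
            bad = mid;
        else
            good = mid;
    }
    return bad;
}

/* Stand-in for a real boot test: the bug appears once patch number
 * *ctx (1-based) has been applied. */
int bad_from(size_t n_applied, void *ctx)
{
    return n_applied >= *(size_t *)ctx;
}
```

With a made-up series of 1000 patches, this takes at most 10 test boots, since each boot halves the remaining range. That is still a long day when every step is a full kernel build and reboot.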
Re: 2.6.24-rc4-mm1: acpi reboots machine... solved
On Tuesday 11 December 2007 01:52:55 pm Borislav Petkov wrote:
> From what i can roughly tell so far it seems like an resource conflict
> between acpi and the pnp requested regions in your patch which result in
> the acpi_thermal code to read the wrong (0xff) temperature value and halt
> the machine, but i might be wrong on the details since acpi is such a big
> code chunk to swallow.

I don't see any obvious conflict from the log you posted. For the sake
of comparison, can you post the corresponding dmesg log after you
removed the patch?

acpi_thermal_get_temperature() only evaluates _TMP, which isn't very
interesting. I wonder if there's some conflict between that AML method
and the EC driver or something. If you can also collect the DSDT, maybe
I can poke around in there and see what _TMP is really doing.

Thanks,
  Bjorn
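The kind of conflict being looked for here can be shown as a small range check. This is only a sketch, not kernel code: struct io_range and both helpers are hypothetical names. The port numbers in the usage note come from the follow-up at the top of this thread, where PNP device 00:0d lists io 0x540-0x55f and the SMBus device reports Region 4 at 0540 with size 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inclusive I/O port range, e.g. 0x540-0x55f. */
struct io_range {
    uint32_t start;
    uint32_t end;   /* inclusive */
};

/* Build a range from a base port and a size in ports (size must be > 0). */
struct io_range io_range_from_size(uint32_t start, uint32_t size)
{
    struct io_range r;

    assert(size > 0);
    r.start = start;
    r.end = start + size - 1;
    return r;
}

/* Two inclusive ranges overlap when each one starts no later than the
 * other one ends. */
bool io_ranges_overlap(struct io_range a, struct io_range b)
{
    return a.start <= b.end && b.start <= a.end;
}
```

With those values, io_range_from_size(0x540, 32) gives 0x540-0x55f, which is identical to the PNP range. Once one side has claimed the region, a second request for it would be expected to fail. The other PNP range, 0x400-0x47f, does not overlap it.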
Re: 2.6.24-rc4-mm1: acpi reboots machine
On Sun, Dec 09, 2007 at 08:50:02AM +0100, Borislav Petkov wrote:
> Hi Andrew,
> Hi Len,
>
> after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
> fine) on my asus laptop, the machine reboots after claiming that
> "Critical temperature reached (255 C)." However, the degrees number
> is kinda hinting at 0xff all-ones field. Will try dump_stack in
> acpi_thermal_critical() to checkout the call path. For now here's the
> netconsole bootlog:

Here's what i got so far:

[ 50.287939] Pid: 1, comm: swapper Not tainted 2.6.24-rc4-mm1 #14
[ 50.287999] [c0104b65] show_trace_log_lvl+0x12/0x25
[ 50.288103] [c01053e7] show_trace+0xd/0x10
[ 50.288202] [c0105a6c] dump_stack+0x57/0x5f
[ 50.288303] [c021c991] acpi_thermal_check+0x150/0x3bb
[ 50.288415] [c021d4b3] acpi_thermal_add+0x261/0x2cf
[ 50.288515] [c0213549] acpi_device_probe+0x3e/0xdb
[ 50.288615] [c023f8f5] driver_probe_device+0xaf/0x12a
[ 50.288717] [c023fa88] __driver_attach+0x6c/0xa5
[ 50.288817] [c023ee5a] bus_for_each_dev+0x3e/0x60
[ 50.288916] [c023f77d] driver_attach+0x14/0x16
[ 50.289015] [c023f5a6] bus_add_driver+0xa6/0x1a8
[ 50.289114] [c023fc53] driver_register+0x42/0x47
[ 50.289214] [c02138c2] acpi_bus_register_driver+0x3a/0x3c
[ 50.289316] [c044306b] acpi_thermal_init+0x57/0x76
[ 50.289424] [c04344a7] kernel_init+0x138/0x280
[ 50.289525] [c01047df] kernel_thread_helper+0x7/0x10
[ 50.289625] ===
[ 50.289680] ACPI: Critical trip point
[ 50.289736] Critical temperature reached (255 C), shutting down.

so in acpi_thermal_get_temperature() called in acpi_thermal_add() the
tz->temperature thingy is not set properly (printk's added):

[ 50.276607] Old temp: 4294967023
[ 50.281890] Got temp: 255
[ 50.282567] Old temp: 255
[ 50.287882] Got temp: 255

What's also strange is that the tz acpi_thermal is alloc'd with kzalloc and
there's still garbage in it after reading it in acpi_thermal_get_temperature()
for the first time. Debugging continues...

--
Regards/Gruß,
Boris.
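A note on the "Old temp: 4294967023" line in the message above: 4294967023 is exactly 2^32 - 273, which is what -273 looks like when printed as an unsigned 32-bit value. ACPI thermal zones report temperature in tenths of a Kelvin, and a zeroed field (0 tenths of a Kelvin) converts to -273 C, so the kzalloc'd struct may not hold garbage after all. This is only one possible reading. It assumes the added printk converted the field to Celsius and printed it as unsigned on this 32-bit machine. The helpers below are hypothetical, written for illustration and not copied from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert tenths of a Kelvin to whole degrees
 * Celsius, rounding to nearest (273.2 K is taken as 0 C). */
long decikelvin_to_celsius(long dk)
{
    long d = dk - 2732;

    return d >= 0 ? (d + 5) / 10 : (d - 5) / 10;
}

/* What a (possibly negative) Celsius value looks like when it is
 * printed as an unsigned 32-bit number. */
uint32_t celsius_as_u32(long celsius)
{
    return (uint32_t)celsius;
}

/* Self-check of the arithmetic discussed above. */
void check_old_temp_reading(void)
{
    /* A zero-initialised temperature field: 0 dK is -273 C. */
    assert(decikelvin_to_celsius(0) == -273);
    /* -273 seen as unsigned 32-bit: 4294967296 - 273. */
    assert(celsius_as_u32(-273) == 4294967023u);
}
```

Under those assumptions the first "Old temp" value is just the untouched zero, and only the "Got temp: 255" readings are the suspicious ones.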
2.6.24-rc4-mm1: acpi reboots machine
Hi Andrew,
Hi Len,

after booting 2.6.24-rc4-mm1 (2.6.24-rc4-190-g94545ba, otoh, boots just
fine) on my asus laptop, the machine reboots after claiming that
"Critical temperature reached (255 C)." However, the degrees number
is kinda hinting at 0xff all-ones field. Will try dump_stack in
acpi_thermal_critical() to checkout the call path. For now here's the
netconsole bootlog:

[0.00] Linux version 2.6.24-rc4-mm1 ([EMAIL PROTECTED]) (gcc version 4.2.3 20071123 (prerelease) (Debian 4.2.2-4)) #7 SMP PREEMPT Sun Dec 9 08:27:26 CET 2007
[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: - 0009fc00 (usable)
[0.00] BIOS-e820: 0009fc00 - 000a (reserved)
[0.00] BIOS-e820: 000e - 0010 (reserved)
[0.00] BIOS-e820: 0010 - 1ff4 (usable)
[0.00] BIOS-e820: 1ff4 - 1ff5 (ACPI data)
[0.00] BIOS-e820: 1ff5 - 2000 (ACPI NVS)
[0.00] 511MB LOWMEM available.
[0.00] Zone PFN ranges:
[0.00] DMA 0 -> 4096
[0.00] Normal 4096 -> 130880
[0.00] Movable zone start PFN for each node
[0.00] early_node_map[1] active PFN ranges
[0.00] 0:0 -> 130880
[0.00] DMI 2.3 present.
[0.00] ACPI: RSDP 000F5DF0, 0014 (r0 ACPIAM)
[0.00] ACPI: RSDT 1FF4, 002C (r1 A M I OEMRSDT 6000423 MSFT 97)
[0.00] ACPI: FACP 1FF40200, 0081 (r1 A M I OEMFACP 6000423 MSFT 97)
[0.00] ACPI: DSDT 1FF40400, 628D (r1 1ABSP 1ABSP0011 MSFT 201)
[0.00] ACPI: FACS 1FF5, 0040
[0.00] ACPI: OEMB 1FF50040, 0053 (r1 A M I OEMBIOS 6000423 MSFT 97)
[0.00] ACPI: PM-Timer IO Port: 0x408
[0.00] Allocating PCI resources starting at 3000 (gap: 2000:e000)
[0.00] swsusp: Registered nosave memory region: 0009f000 - 000a
[0.00] swsusp: Registered nosave memory region: 000a - 000e
[0.00] swsusp: Registered nosave memory region: 000e - 0010
[0.00] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 129475
[0.00] Kernel command line: root=/dev/hda1 vga=0 nmi_watchdog=1 [EMAIL PROTECTED]/,@192.168.45.26/
[0.00] Found and enabled local APIC!
[0.00] Enabling fast FPU save and restore... done.
[0.00] Enabling unmasked SIMD FPU exception support... done.
[0.00] Initializing CPU#0
[0.00] CPU 0 irqstacks, hard=c0451000 soft=c0449000
[0.00] PID hash table entries: 2048 (order: 11, 8192 bytes)
[0.00] Detected 1500.114 MHz processor.
[ 50.138075] Console: colour VGA+ 80x25
[ 50.138080] console [tty0] enabled
[ 50.140479] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 50.140882] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 50.160065] Memory: 513364k/523520k available (2049k kernel code, 9712k reserved, 1113k data, 172k init, 0k highmem)
[ 50.160147] virtual kernel memory layout:
[ 50.160148] fixmap : 0xfffb5000 - 0xf000 ( 296 kB)
[ 50.160150] vmalloc : 0xe080 - 0xfffb3000 ( 503 MB)
[ 50.160151] lowmem : 0xc000 - 0xdff4 ( 511 MB)
[ 50.160153] .init : 0xc041b000 - 0xc0446000 ( 172 kB)
[ 50.160154] .data : 0xc030067f - 0xc0416ca8 (1113 kB)
[ 50.160156] .text : 0xc010 - 0xc030067f (2049 kB)
[ 50.160549] Checking if this processor honours the WP bit even in supervisor mode... Ok.
[ 50.160705] SLUB: Genslabs=11, HWalign=64, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
[ 50.220728] Calibrating delay using timer specific routine.. 3003.73 BogoMIPS (lpj=1501865)
[ 50.220857] Security Framework initialized
[ 50.220934] Mount-cache hash table entries: 512
[ 50.221174] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 50.221273] CPU: L2 cache: 1024K
[ 50.221338] Intel machine check architecture supported.
[ 50.221398] Intel machine check reporting enabled on CPU#0.
[ 50.221459] Compat vDSO mapped to e000.
[ 50.221524] Checking 'hlt' instruction... OK.
[ 50.225022] SMP alternatives: switching to UP code
[ 50.225766] Freeing SMP alternatives: 11k freed
[ 50.225823] ACPI: Core revision 20070126
[ 50.229623] ACPI: setting ELCR to 0200 (from 0c30)
[ 50.734915] CPU0: Intel(R) Pentium(R) M processor 1500MHz stepping 05
[ 50.735059] SMP motherboard not detected.
[ 50.836119] Brought up 1 CPUs
[ 50.836305] khelper used greatest stack depth: 3352 bytes left
[ 50.836463] net_namespace: 108 bytes
[ 50.837167] NET: Registered protocol family 16
[ 50.837466] ACPI: bus type pci registered
[ 50.838812] PCI: PCI BIOS revision 2.10 entry at 0xf0031, last bus=2
[ 50.838872] PCI: Using configuration type 1
[ 50.838928] Setting up standard PCI