Re: More info on port 80 symptoms on MCP51 machine.
Allen Martin wrote:
> Nothing inside the chipset should be decoding port 80 writes. It's possible this board has a port 80 decoder wired onto the board that's misbehaving. I've seen other laptop boards with port 80 decoders wired onto the board, even if the 7 segment display is only populated on debug builds.

This is very helpful. So the next question is whether there is something on the laptop mainboard. Any idea how to look for such a thing? I am not averse to taking the laptop apart to look at the mainboard.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: More info on port 80 symptoms on MCP51 machine.
Allen Martin wrote:
> Nothing inside the chipset should be decoding port 80 writes. It's possible this board has a port 80 decoder wired onto the board that's misbehaving. I've seen other laptop boards with port 80 decoders wired onto the board, even if the 7 segment display is only populated on debug builds.
>
> We use PCI port 80 decoders internally for debugging quite often, so if there were some chipset issue related to port 80 it would have showed up a long time ago, and this is the first I've heard of hangs related to port 80 writes.

Presumably you have programmable decoders to trigger SMI? If not, then they're probably doing the equivalent in a SuperIO chip or similar.

-hpa
RE: More info on port 80 symptoms on MCP51 machine.
> Alan Cox wrote:
> > On Wed, 12 Dec 2007 21:58:25 +0100
> > Rene Herman <[EMAIL PROTECTED]> wrote:
> >
> >> On 12-12-07 21:26, Rene Herman wrote:
> >>
> >>> On 12-12-07 21:07, David P. Reed wrote:
> >>>> Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
> >>
> >> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...
> >
> > I don't. Nvidia are not the most open bunch of people on the planet.
> > This doesn't appear to be a chipset bug anyway but a firmware one (other systems with the same chipset work just fine).
> >
> > The laptop maker might therefore be a better starting point.
>
> One wonders if it does some SMM trick to capture port 0x80 writes and attempt to haul them off for debugging; it almost sounds like some kind of debugging code got let out into the field.
>
> -hpa

Nothing inside the chipset should be decoding port 80 writes. It's possible this board has a port 80 decoder wired onto the board that's misbehaving. I've seen other laptop boards with port 80 decoders wired onto the board, even if the 7 segment display is only populated on debug builds.

We use PCI port 80 decoders internally for debugging quite often, so if there were some chipset issue related to port 80 it would have showed up a long time ago, and this is the first I've heard of hangs related to port 80 writes.

-Allen
Re: More info on port 80 symptoms on MCP51 machine.
On 14-12-07 23:05, Chuck Ebbert wrote:
> On 12/12/2007 04:05 PM, H. Peter Anvin wrote:
>> Rene Herman wrote:
>>> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...
>>>
>>> Summary:
>>>
>>> Unless after booting with "acpi=off", outputs to port 0x80 (the legacy way to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to port 0xed do not, indicating it's not a generic bus abort problem.
>>
>> Sorry, the first sentence didn't parse unambiguously for me. Do you mean "acpi=off" works, or that "acpi=off" allows *subsequent* boots to work?
>>
>> I have some people at nVidia I can probably ping.

Sorry, didn't see this again due to aforementioned horseshit ISP. "acpi=off" works it seems. Report from David Reed here:

http://lkml.org/lkml/2007/12/12/291

> Have them search on Google for: hp tx1000 noapic :)

Rene.
Re: More info on port 80 symptoms on MCP51 machine.
On 12/12/2007 04:05 PM, H. Peter Anvin wrote:
> Rene Herman wrote:
>> On 12-12-07 21:26, Rene Herman wrote:
>>
>>> On 12-12-07 21:07, David P. Reed wrote:
>>>> Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>>
>> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...
>>
>> Summary:
>>
>> Unless after booting with "acpi=off", outputs to port 0x80 (the legacy way to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to port 0xed do not, indicating it's not a generic bus abort problem.
>
> Sorry, the first sentence didn't parse unambiguously for me. Do you mean "acpi=off" works, or that "acpi=off" allows *subsequent* boots to work?
>
> I have some people at nVidia I can probably ping.

Have them search on Google for: hp tx1000 noapic :)
Re: More info on port 80 symptoms on MCP51 machine.
> One wonders if it does some SMM trick to capture port 0x80 writes and attempt to haul them off for debugging; it almost sounds like some kind of debugging code got let out into the field.

Not implausible. We've got a bug I've been dealing with where a vendor left debug stuff enabled via the parallel port and which clearly "escaped" from the test environment to the BIOS proper.
Re: More info on port 80 symptoms on MCP51 machine.
Alan Cox wrote:
> On Wed, 12 Dec 2007 21:58:25 +0100
> Rene Herman <[EMAIL PROTECTED]> wrote:
>
>> On 12-12-07 21:26, Rene Herman wrote:
>>
>>> On 12-12-07 21:07, David P. Reed wrote:
>>>> Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>>
>> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...
>
> I don't. Nvidia are not the most open bunch of people on the planet.
> This doesn't appear to be a chipset bug anyway but a firmware one (other systems with the same chipset work just fine).
>
> The laptop maker might therefore be a better starting point.

One wonders if it does some SMM trick to capture port 0x80 writes and attempt to haul them off for debugging; it almost sounds like some kind of debugging code got let out into the field.

-hpa
Re: More info on port 80 symptoms on MCP51 machine.
Rene Herman wrote:
> On 12-12-07 21:26, Rene Herman wrote:
>
>> On 12-12-07 21:07, David P. Reed wrote:
>>> Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>
> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...
>
> Summary:
>
> Unless after booting with "acpi=off", outputs to port 0x80 (the legacy way to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to port 0xed do not, indicating it's not a generic bus abort problem.

Sorry, the first sentence didn't parse unambiguously for me. Do you mean "acpi=off" works, or that "acpi=off" allows *subsequent* boots to work?

I have some people at nVidia I can probably ping.

-hpa
Re: More info on port 80 symptoms on MCP51 machine.
On Wed, 12 Dec 2007 21:58:25 +0100
Rene Herman <[EMAIL PROTECTED]> wrote:

> On 12-12-07 21:26, Rene Herman wrote:
>
> > On 12-12-07 21:07, David P. Reed wrote:
>
> >> Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>
> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...

I don't. Nvidia are not the most open bunch of people on the planet.
This doesn't appear to be a chipset bug anyway but a firmware one (other systems with the same chipset work just fine).

The laptop maker might therefore be a better starting point.
Re: More info on port 80 symptoms on MCP51 machine.
On 12-12-07 21:26, Rene Herman wrote:
> On 12-12-07 21:07, David P. Reed wrote:
>> Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>
> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...

Summary:

Unless after booting with "acpi=off", outputs to port 0x80 (the legacy way to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to port 0xed do not, indicating it's not a generic bus abort problem.

Rene.
Re: More info on port 80 symptoms on MCP51 machine.
Port 0xED, just FYI:

cycles: out 1430, in 1370
cycles: out 1429, in 1370

(800 MHz)

Rene Herman wrote:
> On 12-12-07 21:07, David P. Reed wrote:
>> Sadly, I've been busy with other crises in my day job for the last few days. I did modify Rene's test program and ran it on my "problem" machine, with the results below. The interesting part of this is that port 80 seems to respond to "in" instructions faster than the presumably "unused" ports 0xEC and 0xEF (those were mentioned by someone as alternatives to port 80).
>
> Don't know if someone else mentioned those but I only said 0xed. That's the value Phoenix BIOSes use (yes, and which H. Peter Anvin reported as being generally problematic as well).
>
> It's in fact not all that unexpected, it seems, that port 0x80 responds to "in", given that it's used by the DMA controller. It's a write that falls on deaf ears. The read is going to be faster if it doesn't time out on an unused port.
>
> Although it's not faster for everyone (such as for me, indicating that for us port 0x80 is really-really unused), it is for many. See results here:
>
> http://lkml.org/lkml/2007/12/12/309
>
>> That, and the fact that the port 80 test reliably freezes the machine solid the second time it is run, and the "hwclock" utility reliably hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE loop, seems to strongly indicate that this chipset or motherboard actually uses port 80, rather than there being a bus problem.
>
> Yes, so it seems. In this case we could in fact also "fix" your situation by just going to 0xed depending on, for example, DMI. Alan Cox just posted a few further problems with a simple udelay() replacement...
>
>> Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>>
>> Hope input from an "outsider" is helpful in going forward. I put a lot of time and effort into tracking down this problem on this particular machine model, largely because I like the machine.
>>
>> Running the (slightly modified to test ports 80, ec, ef instead of just port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's what I get for port 80 and port ec and port ef.
>>
>> port 80: cycles: out 1430, in 792
>
> At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the "in" is somewhat interesting. Did someone at nVidia think it's an "in" from 0x80 which should get the 1 microsec delay?
>
>> port ef: cycles: out 1431, in 1378
>> port ec: cycles: out 1432, in 1372
>
> Unused ports, bus timeouts.
>
>> System info:
>> HP Pavilion dv9000z laptop (AMD64x2)
>> PCI bus controller is nVidia MCP51.
>>
>> processor : 0
>> vendor_id : AuthenticAMD
>> cpu family : 15
>> model : 72
>> model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60
>> stepping : 2
>> cpu MHz : 800.000
>> cache size : 512 KB
>
> Rene.
Re: More info on port 80 symptoms on MCP51 machine.
On 12-12-07 21:07, David P. Reed wrote:
> Sadly, I've been busy with other crises in my day job for the last few days. I did modify Rene's test program and ran it on my "problem" machine, with the results below. The interesting part of this is that port 80 seems to respond to "in" instructions faster than the presumably "unused" ports 0xEC and 0xEF (those were mentioned by someone as alternatives to port 80).

Don't know if someone else mentioned those but I only said 0xed. That's the value Phoenix BIOSes use (yes, and which H. Peter Anvin reported as being generally problematic as well).

It's in fact not all that unexpected, it seems, that port 0x80 responds to "in", given that it's used by the DMA controller. It's a write that falls on deaf ears. The read is going to be faster if it doesn't time out on an unused port.

Although it's not faster for everyone (such as for me, indicating that for us port 0x80 is really-really unused), it is for many. See results here:

http://lkml.org/lkml/2007/12/12/309

> That, and the fact that the port 80 test reliably freezes the machine solid the second time it is run, and the "hwclock" utility reliably hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE loop, seems to strongly indicate that this chipset or motherboard actually uses port 80, rather than there being a bus problem.

Yes, so it seems. In this case we could in fact also "fix" your situation by just going to 0xed depending on, for example, DMI. Alan Cox just posted a few further problems with a simple udelay() replacement...

> Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>
> Hope input from an "outsider" is helpful in going forward. I put a lot of time and effort into tracking down this problem on this particular machine model, largely because I like the machine.
>
> Running the (slightly modified to test ports 80, ec, ef instead of just port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's what I get for port 80 and port ec and port ef.
>
> port 80: cycles: out 1430, in 792

At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the "in" is somewhat interesting. Did someone at nVidia think it's an "in" from 0x80 which should get the 1 microsec delay?

> port ef: cycles: out 1431, in 1378
> port ec: cycles: out 1432, in 1372

Unused ports, bus timeouts.

> System info:
> HP Pavilion dv9000z laptop (AMD64x2)
> PCI bus controller is nVidia MCP51.
>
> processor : 0
> vendor_id : AuthenticAMD
> cpu family : 15
> model : 72
> model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60
> stepping : 2
> cpu MHz : 800.000
> cache size : 512 KB

Rene.
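The idea of switching to 0xed depending on DMI could be sketched roughly as follows. This is a Python illustration of the decision logic only, not kernel code: the table of product-name substrings, the constant names, and both function names are hypothetical, and the actual DMI strings reported by affected machines would have to be checked on the hardware (on Linux they are exposed under /sys/class/dmi/id/).

```python
# Sketch: choose the I/O delay port from the DMI product name.
# QUIRK_PRODUCTS holds assumed, illustrative substrings; they are NOT
# verified DMI strings for any real machine.

DEFAULT_DELAY_PORT = 0x80    # legacy delay port
ALTERNATE_DELAY_PORT = 0xED  # alternative discussed in this thread

QUIRK_PRODUCTS = (
    "Pavilion dv9000",  # hypothetical match for the laptop in this thread
)


def choose_delay_port(product_name: str) -> int:
    """Return the alternate port for listed machines, else the default."""
    name = product_name.strip().lower()
    for needle in QUIRK_PRODUCTS:
        if needle.lower() in name:
            return ALTERNATE_DELAY_PORT
    return DEFAULT_DELAY_PORT


def read_product_name(path: str = "/sys/class/dmi/id/product_name") -> str:
    """Read the DMI product name on Linux; empty string if unavailable."""
    try:
        with open(path, encoding="ascii", errors="replace") as f:
            return f.read().strip()
    except OSError:
        return ""


if __name__ == "__main__":
    print(hex(choose_delay_port(read_product_name())))
```

An unknown or unreadable product name falls back to 0x80, so only machines explicitly listed would change behaviour.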
More info on port 80 symptoms on MCP51 machine.
Sadly, I've been busy with other crises in my day job for the last few days. I did modify Rene's test program and ran it on my "problem" machine, with the results below. The interesting part of this is that port 80 seems to respond to "in" instructions faster than the presumably "unused" ports 0xEC and 0xEF (those were mentioned by someone as alternatives to port 80).

That, and the fact that the port 80 test reliably freezes the machine solid the second time it is run, and the "hwclock" utility reliably hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE loop, seems to strongly indicate that this chipset or motherboard actually uses port 80, rather than there being a bus problem.

Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.

Hope input from an "outsider" is helpful in going forward. I put a lot of time and effort into tracking down this problem on this particular machine model, largely because I like the machine.

Running the (slightly modified to test ports 80, ec, ef instead of just port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's what I get for port 80 and port ec and port ef.

port 80: cycles: out 1430, in 792
port ef: cycles: out 1431, in 1378
port ec: cycles: out 1432, in 1372

System info:
HP Pavilion dv9000z laptop (AMD64x2)
PCI bus controller is nVidia MCP51.

processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 72
model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60
stepping : 2
cpu MHz : 800.000
cache size : 512 KB
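For reading the figures above, cycle counts convert to time by dividing by the clock rate in MHz. A minimal Python sketch of that arithmetic, using only the numbers reported in this message; the function and variable names are illustrative and are not taken from Rene's test program:

```python
# Convert measured cycle counts to microseconds at a given CPU clock.
# The figures are the ones reported in this message; names are illustrative.

CPU_MHZ = 800.0  # "cpu MHz : 800.000" from the system info


def cycles_to_us(cycles: int, mhz: float = CPU_MHZ) -> float:
    """Cycles divided by cycles-per-microsecond (MHz) gives microseconds."""
    return cycles / mhz


# (out cycles, in cycles) per port, as measured on the dv9000z
MEASURED = {
    0x80: (1430, 792),
    0xEC: (1432, 1372),
    0xEF: (1431, 1378),
}

if __name__ == "__main__":
    for port, (out_c, in_c) in sorted(MEASURED.items()):
        print(f"port {port:#04x}: out {cycles_to_us(out_c):.2f} us, "
              f"in {cycles_to_us(in_c):.2f} us")
```

On these numbers an "out" takes roughly 1.79 microseconds on all three ports, while an "in" takes about 0.99 microseconds on port 80 and about 1.7 microseconds on ports ec and ef.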
More info on port 80 symptoms on MCP51 machine.
Sadly, I've been busy with other crises in my day job for the last few days. I did modify Rene's test program and ran it on my problem machine, with the results below. The interesting part of this is that port 80 seems to respond to in instructions faster than the presumably unused ports 0xEC and 0XEF (those were mentioned by someone as alternatives to port 80). That, and the fact that the port 80 test reliably freezes the machine solid the second time it is run, and the hwclock utility reliably hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE loop, seems to strongly indicate that this chipset or motherboard actually uses port 80, rather than there being a bus problem. Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine. Hope input from an outsider is helpful in going forward. I put a lot of time and effort into tracking down this problem on this particular machine model, largely because I like the machine. Running the (slightly modified to test ports 80, ec, ef instead of just port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's what I get for port 80 and port ec and port ef. port 80: cycles: out 1430, in 792 port ef:cycles: out 1431, in 1378 port ec: cycles: out 1432, in 1372 System info: HP Pavilion dv9000z laptop (AMD64x2) PCI bus controller is nVidia MCP51. processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 72 model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60 stepping: 2 cpu MHz : 800.000 cache size : 512 KB -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: More info on port 80 symptoms on MCP51 machine.
On 12-12-07 21:07, David P. Reed wrote: Sadly, I've been busy with other crises in my day job for the last few days. I did modify Rene's test program and ran it on my problem machine, with the results below. The interesting part of this is that port 80 seems to respond to in instructions faster than the presumably unused ports 0xEC and 0XEF (those were mentioned by someone as alternatives to port 80). Don't know if someone else mentioned those but I only said 0xed. That's the value Phoenix BIOSes use (yes, and which H. Peter Anvin) reported as being generally problematic as well). It's in fact not all that unexpected it seems that port 0x80 responds to in given that it's used by the DMA controller. It's a write that falls on deaf ears. The read is going to be faster if it doesn't timeout on an unused port. Although it's not faster for everyone, such as for me indicating that for us port 0x80 is really-really unused, it is for many. See results here: http://lkml.org/lkml/2007/12/12/309 That, and the fact that the port 80 test reliably freezes the machine solid the second time it is run, and the hwclock utility reliably hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE loop, seems to strongly indicate that this chipset or motherboard actually uses port 80, rather than there being a bus problem. Yes, so it seems. In this case we could in fact also fix your situation by just going to 0xed depending on for example DMI. Alan Cox just posted a few further problems with a simple udelay() replacement... Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine. Hope input from an outsider is helpful in going forward. I put a lot of time and effort into tracking down this problem on this particular machine model, largely because I like the machine. 
Running the (slightly modified to test ports 80, ec, ef instead of just port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's what I get for port 80 and port ec and port ef. port 80: cycles: out 1430, in 792 At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the in is somewhat interesting. Did someone at nVidia think it's an in from 0x80 which should get the 1 microsec delay? port ef:cycles: out 1431, in 1378 port ec: cycles: out 1432, in 1372 Unused ports, bus timeouts. System info: HP Pavilion dv9000z laptop (AMD64x2) PCI bus controller is nVidia MCP51. processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 72 model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60 stepping: 2 cpu MHz : 800.000 cache size : 512 KB Rene. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: More info on port 80 symptoms on MCP51 machine.
Port 0xED, just FYI: cycles: out 1430, in 1370 cycles: out 1429, in 1370 (800 Mhz) Rene Herman wrote: On 12-12-07 21:07, David P. Reed wrote: Sadly, I've been busy with other crises in my day job for the last few days. I did modify Rene's test program and ran it on my problem machine, with the results below. The interesting part of this is that port 80 seems to respond to in instructions faster than the presumably unused ports 0xEC and 0XEF (those were mentioned by someone as alternatives to port 80). Don't know if someone else mentioned those but I only said 0xed. That's the value Phoenix BIOSes use (yes, and which H. Peter Anvin) reported as being generally problematic as well). It's in fact not all that unexpected it seems that port 0x80 responds to in given that it's used by the DMA controller. It's a write that falls on deaf ears. The read is going to be faster if it doesn't timeout on an unused port. Although it's not faster for everyone, such as for me indicating that for us port 0x80 is really-really unused, it is for many. See results here: http://lkml.org/lkml/2007/12/12/309 That, and the fact that the port 80 test reliably freezes the machine solid the second time it is run, and the hwclock utility reliably hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE loop, seems to strongly indicate that this chipset or motherboard actually uses port 80, rather than there being a bus problem. Yes, so it seems. In this case we could in fact also fix your situation by just going to 0xed depending on for example DMI. Alan Cox just posted a few further problems with a simple udelay() replacement... Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine. Hope input from an outsider is helpful in going forward. I put a lot of time and effort into tracking down this problem on this particular machine model, largely because I like the machine. 
Running the (slightly modified to test ports 80, ec, ef instead of just port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's what I get for port 80 and port ec and port ef. port 80: cycles: out 1430, in 792 At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the in is somewhat interesting. Did someone at nVidia think it's an in from 0x80 which should get the 1 microsec delay? port ef:cycles: out 1431, in 1378 port ec: cycles: out 1432, in 1372 Unused ports, bus timeouts. System info: HP Pavilion dv9000z laptop (AMD64x2) PCI bus controller is nVidia MCP51. processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 72 model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60 stepping: 2 cpu MHz : 800.000 cache size : 512 KB Rene. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: More info on port 80 symptoms on MCP51 machine.
Alan Cox wrote:
> On Wed, 12 Dec 2007 21:58:25 +0100 Rene Herman [EMAIL PROTECTED] wrote:
> > On 12-12-07 21:26, Rene Herman wrote:
> > > On 12-12-07 21:07, David P. Reed wrote:
> > > > Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
> >
> > By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...
>
> I don't. Nvidia are not the most open bunch of people on the planet. This doesn't appear to be a chipset bug anyway but a firmware one (other systems with the same chipset work just fine).
>
> The laptop maker might therefore be a better starting point.

One wonders if it does some SMM trick to capture port 0x80 writes and attempt to haul them off for debugging; it almost sounds like some kind of debugging code got let out into the field.

-hpa
Re: More info on port 80 symptoms on MCP51 machine.
Rene Herman wrote:
> On 12-12-07 21:26, Rene Herman wrote:
> > On 12-12-07 21:07, David P. Reed wrote:
> > > Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>
> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...
>
> Summary: Unless after booting with acpi=off, outputs to port 0x80 (the legacy way to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to port 0xed do not, indicating it's not a generic bus abort problem.

Sorry, the first sentence didn't parse unambiguously for me. Do you mean acpi=off works, or that acpi=off allows *subsequent* boots to work?

I have some people at nVidia I can probably ping.

-hpa
Re: More info on port 80 symptoms on MCP51 machine.
On Wed, 12 Dec 2007 21:58:25 +0100 Rene Herman [EMAIL PROTECTED] wrote:
> On 12-12-07 21:26, Rene Herman wrote:
> > On 12-12-07 21:07, David P. Reed wrote:
> > > Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.
>
> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...

I don't. Nvidia are not the most open bunch of people on the planet. This doesn't appear to be a chipset bug anyway but a firmware one (other systems with the same chipset work just fine).

The laptop maker might therefore be a better starting point.
Re: More info on port 80 symptoms on MCP51 machine.
On 12-12-07 21:26, Rene Herman wrote:
> On 12-12-07 21:07, David P. Reed wrote:
> > Someone might have an in to nVidia to clarify this, since I don't. In any case, the udelay(2) approach seems to be a safe fix for this machine.

By the way, _does_ anyone have a contact at nVidia who could clarify? Alan maybe? I'm quite curious what they did...

Summary: Unless after booting with acpi=off, outputs to port 0x80 (the legacy way to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to port 0xed do not, indicating it's not a generic bus abort problem.

Rene.
Re: More info on port 80 symptoms on MCP51 machine.
> One wonders if it does some SMM trick to capture port 0x80 writes and attempt to haul them off for debugging; it almost sounds like some kind of debugging code got let out into the field.

Not implausible. We've got a bug I've been dealing with where a vendor left debug stuff enabled via the parallel port and which clearly escaped from the test environment to the BIOS proper.