Re: More info on port 80 symptoms on MCP51 machine.

2007-12-15 Thread David P. Reed

Allen Martin wrote:
> Nothing inside the chipset should be decoding port 80 writes.  It's
> possible this board has a port 80 decoder wired onto the board that's
> misbehaving.  I've seen other laptop boards with port 80 decoders
> wired onto the board, even if the 7 segment display is only populated
> on debug builds.

  
This is very helpful.  So the next question is whether there is something
on the laptop mainboard.


Any idea how to look for such a thing?   I am not averse to taking the 
laptop apart to look at the mainboard.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-15 Thread H. Peter Anvin

Allen Martin wrote:
> Nothing inside the chipset should be decoding port 80 writes.  It's
> possible this board has a port 80 decoder wired onto the board that's
> misbehaving.  I've seen other laptop boards with port 80 decoders
> wired onto the board, even if the 7 segment display is only populated
> on debug builds.
>
> We use PCI port 80 decoders internally for debugging quite often, so
> if there were some chipset issue related to port 80 it would have
> showed up a long time ago, and this is the first I've heard of
> hangs related to port 80 writes.



Presumably you have programmable decoders to trigger SMI?  If not, then 
they're probably doing the equivalent in a SuperIO chip or similar.


-hpa


RE: More info on port 80 symptoms on MCP51 machine.

2007-12-15 Thread Allen Martin
> Alan Cox wrote:
> > On Wed, 12 Dec 2007 21:58:25 +0100
> > Rene Herman <[EMAIL PROTECTED]> wrote:
> > 
> >> On 12-12-07 21:26, Rene Herman wrote:
> >>
> >>> On 12-12-07 21:07, David P. Reed wrote:
> >>>> Someone might have an in to nVidia to clarify this, since I don't.
> >>>> In any case, the udelay(2) approach seems to be a safe fix for
> >>>> this machine.
> >> By the way, _does_ anyone have a contact at nVidia who could clarify?
> >> Alan maybe? I'm quite curious what they did...
> > 
> > I don't. Nvidia are not the most open bunch of people on the planet.
> > This doesn't appear to be a chipset bug anyway but a firmware one 
> > (other systems with the same chipset work just fine).
> > 
> > The laptop maker might therefore be a better starting point.
> 
> One wonders if it does some SMM trick to capture port 0x80 
> writes and attempt to haul them off for debugging; it almost 
> sounds like some kind of debugging code got let out into the field.
> 
>   -hpa

Nothing inside the chipset should be decoding port 80 writes.  It's 
possible this board has a port 80 decoder wired onto the board that's 
misbehaving.  I've seen other laptop boards with port 80 decoders
wired onto the board, even if the 7 segment display is only populated
on debug builds.  

We use PCI port 80 decoders internally for debugging quite often, so 
if there were some chipset issue related to port 80 it would have 
showed up a long time ago, and this is the first I've heard of
hangs related to port 80 writes.

-Allen


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-14 Thread Rene Herman

On 14-12-07 23:05, Chuck Ebbert wrote:

> On 12/12/2007 04:05 PM, H. Peter Anvin wrote:
>> Rene Herman wrote:
>>> By the way, _does_ anyone have a contact at nVidia who could clarify?
>>> Alan maybe? I'm quite curious what they did...
>>>
>>> Summary:
>>>
>>> Unless after booting with "acpi=off", outputs to port 0x80 (the legacy
>>> way to delay I/O) reliably, but not immediately, hang MCP51 machines.
>>> Outputs to port 0xed do not, indicating it's not a generic bus abort
>>> problem.
>>
>> Sorry, the first sentence didn't parse unambiguously for me.  Do you
>> mean "acpi=off" works, or that "acpi=off" allows *subsequent* boots to
>> work?
>>
>> I have some people at nVidia I can probably ping.

Sorry, didn't see this again due to aforementioned horseshit ISP. "acpi=off"
works, it seems. Report from David Reed here:

http://lkml.org/lkml/2007/12/12/291

> Have them search on Google for:
>
>   hp tx1000 noapic
>
> :)


Rene.




Re: More info on port 80 symptoms on MCP51 machine.

2007-12-14 Thread Chuck Ebbert
On 12/12/2007 04:05 PM, H. Peter Anvin wrote:
> Rene Herman wrote:
>> On 12-12-07 21:26, Rene Herman wrote:
>>
>>> On 12-12-07 21:07, David P. Reed wrote:
>>
>>>> Someone might have an in to nVidia to clarify this, since I don't.
>>>> In any case, the udelay(2) approach seems to be a safe fix for this
>>>> machine.
>>
>> By the way, _does_ anyone have a contact at nVidia who could clarify?
>> Alan maybe? I'm quite curious what they did...
>>
>> Summary:
>>
>> Unless after booting with "acpi=off", outputs to port 0x80 (the legacy
>> way to delay I/O) reliably, but not immediately, hang MCP51 machines.
>> Outputs to port 0xed do not, indicating it's not a generic bus abort
>> problem.
>>
> 
> Sorry, the first sentence didn't parse unambiguously for me.  Do you
> mean "acpi=off" works, or that "acpi=off" allows *subsequent* boots to
> work?
> 
> I have some people at nVidia I can probably ping.
> 

Have them search on Google for:

  hp tx1000 noapic

:)


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread Alan Cox
> One wonders if it does some SMM trick to capture port 0x80 writes and 
> attempt to haul them off for debugging; it almost sounds like some kind 
> of debugging code got let out into the field.

Not implausible. We've got a bug I've been dealing with where a vendor
left debug stuff enabled via the parallel port and which clearly
"escaped" from the test environment to the BIOS proper.


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread H. Peter Anvin

Alan Cox wrote:
> On Wed, 12 Dec 2007 21:58:25 +0100
> Rene Herman <[EMAIL PROTECTED]> wrote:
>
>> On 12-12-07 21:26, Rene Herman wrote:
>>
>>> On 12-12-07 21:07, David P. Reed wrote:
>>>> Someone might have an in to nVidia to clarify this, since I don't.  In
>>>> any case, the udelay(2) approach seems to be a safe fix for this machine.
>>
>> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan
>> maybe? I'm quite curious what they did...
>
> I don't. Nvidia are not the most open bunch of people on the planet. This
> doesn't appear to be a chipset bug anyway but a firmware one (other
> systems with the same chipset work just fine).
>
> The laptop maker might therefore be a better starting point.

One wonders if it does some SMM trick to capture port 0x80 writes and 
attempt to haul them off for debugging; it almost sounds like some kind 
of debugging code got let out into the field.


-hpa


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread H. Peter Anvin

Rene Herman wrote:
> On 12-12-07 21:26, Rene Herman wrote:
>
>> On 12-12-07 21:07, David P. Reed wrote:
>>
>>> Someone might have an in to nVidia to clarify this, since I don't.
>>> In any case, the udelay(2) approach seems to be a safe fix for this
>>> machine.
>>
>> By the way, _does_ anyone have a contact at nVidia who could clarify?
>> Alan maybe? I'm quite curious what they did...
>
> Summary:
>
> Unless after booting with "acpi=off", outputs to port 0x80 (the legacy
> way to delay I/O) reliably, but not immediately, hang MCP51 machines.
> Outputs to port 0xed do not, indicating it's not a generic bus abort
> problem.

Sorry, the first sentence didn't parse unambiguously for me.  Do you 
mean "acpi=off" works, or that "acpi=off" allows *subsequent* boots to work?


I have some people at nVidia I can probably ping.

-hpa


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread Alan Cox
On Wed, 12 Dec 2007 21:58:25 +0100
Rene Herman <[EMAIL PROTECTED]> wrote:

> On 12-12-07 21:26, Rene Herman wrote:
> 
> > On 12-12-07 21:07, David P. Reed wrote:
> 
> >> Someone might have an in to nVidia to clarify this, since I don't.  In 
> >> any case, the udelay(2) approach seems to be a safe fix for this machine.
> 
> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan 
> maybe? I'm quite curious what they did...

I don't. Nvidia are not the most open bunch of people on the planet. This
doesn't appear to be a chipset bug anyway but a firmware one (other
systems with the same chipset work just fine).

The laptop maker might therefore be a better starting point.


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread Rene Herman

On 12-12-07 21:26, Rene Herman wrote:

> On 12-12-07 21:07, David P. Reed wrote:
>
>> Someone might have an in to nVidia to clarify this, since I don't.  In
>> any case, the udelay(2) approach seems to be a safe fix for this machine.
>
> By the way, _does_ anyone have a contact at nVidia who could clarify? Alan
> maybe? I'm quite curious what they did...


Summary:

Unless after booting with "acpi=off", outputs to port 0x80 (the legacy way 
to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to 
port 0xed do not, indicating it's not a generic bus abort problem.


Rene.


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread David P. Reed

Port 0xED, just FYI:

cycles: out 1430, in 1370
cycles: out 1429, in 1370

(800 MHz)


Rene Herman wrote:
> On 12-12-07 21:07, David P. Reed wrote:
>
>> Sadly, I've been busy with other crises in my day job for the last
>> few days.   I did modify Rene's test program and ran it on my
>> "problem" machine, with the results below.
>>
>> The interesting part of this is that port 80 seems to respond to "in"
>> instructions faster than the presumably "unused" ports 0xEC and
>> 0xEF (those were mentioned by someone as alternatives to port 80).
>
> Don't know if someone else mentioned those but I only said 0xed.
> That's the value Phoenix BIOSes use (yes, and which H. Peter Anvin
> reported as being generally problematic as well).
>
> It's in fact not all that unexpected, it seems, that port 0x80 responds
> to "in", given that it's used by the DMA controller. It's a write that
> falls on deaf ears. The read is going to be faster if it doesn't
> time out on an unused port.
>
> Although it's not faster for everyone (for me, for example, indicating
> that for us port 0x80 is really-really unused), it is for many. See
> results here:
>
> http://lkml.org/lkml/2007/12/12/309
>
>> That, and the fact that the port 80 test reliably freezes the machine
>> solid the second time it is run, and the "hwclock" utility reliably
>> hangs the machine if the port 80's are used in the
>> CMOS_READ/CMOS_WRITE loop, seems to strongly indicate that this
>> chipset or motherboard actually uses port 80, rather than there being
>> a bus problem.
>
> Yes, so it seems. In this case we could in fact also "fix" your
> situation by just going to 0xed depending on, for example, DMI. Alan Cox
> just posted a few further problems with a simple udelay() replacement...
>
>> Someone might have an in to nVidia to clarify this, since I don't.
>> In any case, the udelay(2) approach seems to be a safe fix for this
>> machine.
>>
>> Hope input from an "outsider" is helpful in going forward.   I put a
>> lot of time and effort into tracking down this problem on this
>> particular machine model, largely because I like the machine.
>>
>> Running the (slightly modified to test ports 80, ec, ef instead of
>> just port 80) test when the 2 GHz max speed CPU is running at 800
>> MHz, here's what I get for port 80 and port ec and port ef.
>>
>> port 80:   cycles: out 1430, in 792
>
> At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the "in"
> is somewhat interesting. Did someone at nVidia think it's an "in" from
> 0x80 which should get the 1 microsec delay?
>
>> port ef:   cycles: out 1431, in 1378
>> port ec:   cycles: out 1432, in 1372
>
> Unused ports, bus timeouts.
>
>> System info:  HP Pavilion dv9000z laptop (AMD64x2)
>>
>> PCI bus controller is nVidia MCP51.
>> processor   : 0
>> vendor_id   : AuthenticAMD
>> cpu family  : 15
>> model       : 72
>> model name  : AMD Turion(tm) 64 X2 Mobile Technology TL-60
>> stepping    : 2
>> cpu MHz     : 800.000
>> cache size  : 512 KB
>
> Rene.




Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread Rene Herman

On 12-12-07 21:07, David P. Reed wrote:

> Sadly, I've been busy with other crises in my day job for the last few
> days.   I did modify Rene's test program and ran it on my "problem"
> machine, with the results below.
>
> The interesting part of this is that port 80 seems to respond to "in"
> instructions faster than the presumably "unused" ports 0xEC and 0xEF
> (those were mentioned by someone as alternatives to port 80).

Don't know if someone else mentioned those but I only said 0xed. That's the 
value Phoenix BIOSes use (yes, and which H. Peter Anvin reported as being 
generally problematic as well).


It's in fact not all that unexpected, it seems, that port 0x80 responds to
"in", given that it's used by the DMA controller. It's a write that falls on
deaf ears. The read is going to be faster if it doesn't time out on an unused
port.
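
One way to check the DMA claim from software is to look the port up in /proc/ioports. The following is a minimal sketch, assuming the usual "start-end : name" line format; the function name and the sample text are made up for illustration and are not taken from the machine discussed here. It only shows which regions drivers have claimed, so it cannot reveal an extra hardware decoder on the board.

```python
def find_port_owner(ioports_text, port):
    """Return the name of the narrowest claimed region containing `port`.

    `ioports_text` is expected to look like the contents of /proc/ioports.
    Returns None when no listed region covers the port.
    """
    owner = None
    best_span = None
    for line in ioports_text.splitlines():
        if " : " not in line:
            continue
        range_part, name = line.strip().split(" : ", 1)
        start_str, _, end_str = range_part.partition("-")
        try:
            start, end = int(start_str, 16), int(end_str, 16)
        except ValueError:
            continue  # skip lines that are not "hex-hex : name"
        if start <= port <= end:
            span = end - start
            # Nested entries are narrower than their parents; prefer them.
            if best_span is None or span < best_span:
                owner, best_span = name, span
    return owner


# Illustrative sample only (hypothetical, not from the dv9000z).
SAMPLE = """\
0000-0cf7 : PCI Bus 0000:00
  0070-0071 : rtc0
  0080-008f : dma page reg
  00a0-00a1 : pic2
"""

print(find_port_owner(SAMPLE, 0x80))
print(find_port_owner(SAMPLE, 0xED))
```

On a live system the text would come from reading /proc/ioports instead of the sample string.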


Although it's not faster for everyone (for me, for example, indicating that
for us port 0x80 is really-really unused), it is for many. See results here:


http://lkml.org/lkml/2007/12/12/309

> That, and the fact that the port 80 test reliably freezes the machine
> solid the second time it is run, and the "hwclock" utility reliably
> hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE
> loop, seems to strongly indicate that this chipset or motherboard
> actually uses port 80, rather than there being a bus problem.

Yes, so it seems. In this case we could in fact also "fix" your situation by 
just going to 0xed depending on, for example, DMI. Alan Cox just posted a few 
further problems with a simple udelay() replacement...
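
The idea of picking the delay port from DMI data can be sketched as a simple lookup. This is only a model of the decision, not kernel code: the table, the function name and the substring match on the product name are assumptions for illustration, and the one product string is taken from the system info in this thread rather than from verified DMI data.

```python
DEFAULT_DELAY_PORT = 0x80    # legacy I/O delay port
ALTERNATE_DELAY_PORT = 0xED  # alternative port mentioned in this thread

# Hypothetical quirk table: substrings of the DMI product name for machines
# that should avoid port 0x80. A real table would need verified DMI strings.
PORT80_QUIRK_PRODUCTS = ("HP Pavilion dv9000z",)


def choose_delay_port(dmi_product_name):
    """Return the I/O port to use for the dummy delay write."""
    name = dmi_product_name.strip().lower()
    for quirk in PORT80_QUIRK_PRODUCTS:
        if quirk.lower() in name:
            return ALTERNATE_DELAY_PORT
    return DEFAULT_DELAY_PORT


print(hex(choose_delay_port("HP Pavilion dv9000z Notebook PC")))
print(hex(choose_delay_port("Some Other Machine")))
```

On Linux the product name can typically be read from /sys/class/dmi/id/product_name. A lookup like this only changes which port is written; it does not address the udelay() concerns mentioned above.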


> Someone might have an in to nVidia to clarify this, since I don't.  In
> any case, the udelay(2) approach seems to be a safe fix for this machine.
>
> Hope input from an "outsider" is helpful in going forward.   I put a lot
> of time and effort into tracking down this problem on this particular
> machine model, largely because I like the machine.
>
> Running the (slightly modified to test ports 80, ec, ef instead of just
> port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's
> what I get for port 80 and port ec and port ef.
>
> port 80:   cycles: out 1430, in 792


At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the "in" is 
somewhat interesting. Did someone at nVidia think it's an "in" from 0x80 
which should get the 1 microsec delay?
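
The conversion above is just the cycle count divided by the clock rate in MHz. A minimal sketch of that arithmetic, assuming the cycle counter ticks at the 800 MHz reported for this CPU (the function name is made up for illustration):

```python
def cycles_to_microseconds(cycles, cpu_mhz):
    """Convert a cycle count to microseconds.

    Assumes the counter ticks at cpu_mhz, i.e. cpu_mhz cycles per microsecond.
    """
    return cycles / cpu_mhz


# Port 0x80 figures quoted above: out 1430 cycles, in 792 cycles, at 800 MHz.
out_us = cycles_to_microseconds(1430, 800.0)
in_us = cycles_to_microseconds(792, 800.0)
print(f"out {out_us:.4f} us, in {in_us:.4f} us")
```

This gives about 1.79 microseconds for the "out" and 0.99 microseconds for the "in", matching the figures above.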



> port ef:   cycles: out 1431, in 1378
> port ec:   cycles: out 1432, in 1372


Unused ports, bus timeouts.




> System info:  HP Pavilion dv9000z laptop (AMD64x2)
>
> PCI bus controller is nVidia MCP51.
> processor   : 0
> vendor_id   : AuthenticAMD
> cpu family  : 15
> model       : 72
> model name  : AMD Turion(tm) 64 X2 Mobile Technology TL-60
> stepping    : 2
> cpu MHz     : 800.000
> cache size  : 512 KB


Rene.


More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread David P. Reed
Sadly, I've been busy with other crises in my day job for the last few 
days.   I did modify Rene's test program and ran it on my "problem" 
machine, with the results below.


The interesting part of this is that port 80 seems to respond to "in" 
instructions faster than the presumably "unused" ports 0xEC and 0xEF 
(those were mentioned by someone as alternatives to port 80).


That, and the fact that the port 80 test reliably freezes the machine 
solid the second time it is run, and the "hwclock" utility reliably 
hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE 
loop, seems to strongly indicate that this chipset or motherboard 
actually uses port 80, rather than there being a bus problem.


Someone might have an in to nVidia to clarify this, since I don't.  In 
any case, the udelay(2) approach seems to be a safe fix for this machine.


Hope input from an "outsider" is helpful in going forward.   I put a lot 
of time and effort into tracking down this problem on this particular 
machine model, largely because I like the machine.


Running the (slightly modified to test ports 80, ec, ef instead of just 
port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's 
what I get for port 80 and port ec and port ef.


port 80:   cycles: out 1430, in 792
port ef:   cycles: out 1431, in 1378
port ec:   cycles: out 1432, in 1372
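
To make the comparison explicit, the three result lines can be parsed and the "in" latencies compared against the ports presumed unused. This is a rough sketch: the function names and the 0.8 margin are arbitrary choices for illustration, not part of the test program.

```python
import re

# Result lines as reported above (800 MHz).
REPORT = """\
port 80: cycles: out 1430, in 792
port ef: cycles: out 1431, in 1378
port ec: cycles: out 1432, in 1372
"""

LINE_RE = re.compile(
    r"port\s+([0-9a-fA-F]+):\s*cycles:\s*out\s+(\d+),\s*in\s+(\d+)"
)


def parse_report(text):
    """Return {port_number: (out_cycles, in_cycles)} from the report text."""
    results = {}
    for match in LINE_RE.finditer(text):
        port = int(match.group(1), 16)
        results[port] = (int(match.group(2)), int(match.group(3)))
    return results


def ports_answering_reads(results, unused_ports, margin=0.8):
    """Ports whose 'in' is clearly faster than the fastest unused port.

    `margin` is an arbitrary threshold: a port is flagged when its 'in'
    takes less than margin * (fastest 'in' among unused_ports).
    """
    baseline = min(results[p][1] for p in unused_ports)
    return sorted(
        port
        for port, (_, in_cycles) in results.items()
        if port not in unused_ports and in_cycles < margin * baseline
    )


results = parse_report(REPORT)
flagged = ports_answering_reads(results, unused_ports=[0xEC, 0xEF])
print([hex(p) for p in flagged])
```

With these numbers only port 0x80 is flagged: its "in" takes 792 cycles against roughly 1370 for the two ports presumed unused, while the "out" times are nearly identical on all three.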


System info:  HP Pavilion dv9000z laptop (AMD64x2)

PCI bus controller is nVidia MCP51.
processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 15
model       : 72
model name  : AMD Turion(tm) 64 X2 Mobile Technology TL-60
stepping    : 2
cpu MHz     : 800.000
cache size  : 512 KB


More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread David P. Reed
Sadly, I've been busy with other crises in my day job for the last few 
days.   I did modify Rene's test program and ran it on my problem 
machine, with the results below.


The interesting part of this is that port 80 seems to respond to in 
instructions faster than the presumably unused ports 0xEC  and 0XEF  
(those were mentioned by someone as alternatives to port 80).


That, and the fact that the port 80 test reliably freezes the machine 
solid the second time it is run, and the hwclock utility reliably 
hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE 
loop, seems to strongly indicate that this chipset or motherboard 
actually uses port 80, rather than there being a bus problem.


Someone might have an in to nVidia to clarify this, since I don't.  In 
any case, the udelay(2) approach seems to be a safe fix for this machine.


Hope input from an outsider is helpful in going forward.   I put a lot 
of time and effort into tracking down this problem on this particular 
machine model, largely because I like the machine.


Running the (slightly modified to test ports 80, ec, ef instead of just 
port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's 
what I get for port 80 and port ec and port ef.


port 80:   cycles: out 1430, in 792
port ef:cycles: out 1431, in 1378
port ec:   cycles: out 1432, in 1372


System info:  HP Pavilion dv9000z laptop (AMD64x2)

PCI bus controller is nVidia MCP51.
processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 72
model name  : AMD Turion(tm) 64 X2 Mobile Technology TL-60
stepping: 2
cpu MHz : 800.000
cache size  : 512 KB
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread Rene Herman

On 12-12-07 21:07, David P. Reed wrote:

Sadly, I've been busy with other crises in my day job for the last few 
days.   I did modify Rene's test program and ran it on my problem 
machine, with the results below.


The interesting part of this is that port 80 seems to respond to in 
instructions faster than the presumably unused ports 0xEC  and 0XEF  
(those were mentioned by someone as alternatives to port 80).


Don't know if someone else mentioned those but I only said 0xed. That's the 
value Phoenix BIOSes use (yes, and which H. Peter Anvin) reported as being 
generally problematic as well).


It's in fact not all that unexpected it seems that port 0x80 responds to in 
given that it's used by the DMA controller. It's a write that falls on deaf 
ears. The read is going to be faster if it doesn't timeout on an unused port.


Although it's not faster for everyone, such as for me indicating that for us 
port 0x80 is really-really unused, it is for many. See results here:


http://lkml.org/lkml/2007/12/12/309

That, and the fact that the port 80 test reliably freezes the machine 
solid the second time it is run, and the hwclock utility reliably 
hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE 
loop, seems to strongly indicate that this chipset or motherboard 
actually uses port 80, rather than there being a bus problem.


Yes, so it seems. In this case we could in fact also fix your situation by 
just going to 0xed depending on for example DMI. Alan Cox just posted a few 
further problems with a simple udelay() replacement...


Someone might have an in to nVidia to clarify this, since I don't.  In 
any case, the udelay(2) approach seems to be a safe fix for this machine.


Hope input from an outsider is helpful in going forward.   I put a lot 
of time and effort into tracking down this problem on this particular 
machine model, largely because I like the machine.


Running the (slightly modified to test ports 80, ec, ef instead of just 
port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's 
what I get for port 80 and port ec and port ef.


port 80:   cycles: out 1430, in 792


At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the in is 
somewhat interesting. Did someone at nVidia think it's an in from 0x80 
which should get the 1 microsec delay?



port ef:cycles: out 1431, in 1378
port ec:   cycles: out 1432, in 1372


Unused ports, bus timeouts.




System info:  HP Pavilion dv9000z laptop (AMD64x2)

PCI bus controller is nVidia MCP51.
processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 72
model name  : AMD Turion(tm) 64 X2 Mobile Technology TL-60
stepping: 2
cpu MHz : 800.000
cache size  : 512 KB


Rene.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread David P. Reed

Port 0xED, just FYI:

cycles: out 1430, in 1370
cycles: out 1429, in 1370

(800 Mhz)


Rene Herman wrote:

On 12-12-07 21:07, David P. Reed wrote:

Sadly, I've been busy with other crises in my day job for the last 
few days.   I did modify Rene's test program and ran it on my 
problem machine, with the results below.


The interesting part of this is that port 80 seems to respond to in 
instructions faster than the presumably unused ports 0xEC  and 
0XEF  (those were mentioned by someone as alternatives to port 80).


Don't know if someone else mentioned those but I only said 0xed. 
That's the value Phoenix BIOSes use (yes, and which H. Peter Anvin) 
reported as being generally problematic as well).


In fact, it seems not all that unexpected that port 0x80 responds to 
an in, given that it's used by the DMA controller. It's a write that 
falls on deaf ears. The read is going to be faster if it doesn't 
time out on an unused port.
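
To put numbers on that: with the figures posted in this thread, the in from port 0x80 takes a bit over half as long as the in from the other ports. A rough sketch of that comparison in Python follows; the helper name responding_ports and the 0.8 cutoff are arbitrary choices for illustration, not anything taken from the test program.

```python
from statistics import median

# "in" cycle counts posted in this thread (CPU at 800 MHz).
IN_CYCLES = {0x80: 792, 0xEC: 1372, 0xED: 1370, 0xEF: 1378}


def responding_ports(in_cycles, cutoff=0.8):
    """Return ports whose read completes well below the typical latency.

    The median is taken as the bus-timeout baseline, on the assumption
    that most of the probed ports are unused. The cutoff is arbitrary.
    """
    baseline = median(in_cycles.values())
    return sorted(p for p, c in in_cycles.items() if c < cutoff * baseline)


print([hex(p) for p in responding_ports(IN_CYCLES)])
```

This only separates a fast read from a timed-out one; it says nothing about what is actually answering on the port.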


Although it's not faster for everyone (it isn't for me, which indicates 
that for us port 0x80 is really-really unused), it is for many. See 
results here:


http://lkml.org/lkml/2007/12/12/309

That, and the fact that the port 80 test reliably freezes the machine 
solid the second time it is run, and the hwclock utility reliably 
hangs the machine if the port 80's are used in the 
CMOS_READ/CMOS_WRITE loop, seems to strongly indicate that this 
chipset or motherboard actually uses port 80, rather than there being 
a bus problem.


Yes, so it seems. In this case we could in fact also fix your 
situation by just going to 0xed depending on, for example, DMI. Alan Cox 
just posted a few further problems with a simple udelay() replacement...
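
The DMI idea could look roughly like the following, shown in Python rather than kernel C purely to illustrate the logic. Everything here is an assumption: the function name, the match table, and in particular the product string, which is taken from the "System info" line in this thread and may not be what the firmware actually reports.

```python
DEFAULT_DELAY_PORT = 0x80      # the legacy I/O delay port
ALTERNATE_DELAY_PORT = 0xED    # the alternative discussed in this thread

# Hypothetical match table; the string is a guess based on the
# "System info" line, not a verified DMI product name.
AFFECTED_PRODUCTS = ("HP Pavilion dv9000z",)


def choose_delay_port(dmi_product_name):
    """Pick the I/O delay port from a DMI product name string."""
    name = dmi_product_name.strip().lower()
    for product in AFFECTED_PRODUCTS:
        if product.lower() in name:
            return ALTERNATE_DELAY_PORT
    return DEFAULT_DELAY_PORT


print(hex(choose_delay_port("HP Pavilion dv9000z Notebook PC")))
print(hex(choose_delay_port("Some Other Machine")))
```

On a running Linux system the product name can usually be read from /sys/class/dmi/id/product_name to see what string would need matching.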


Someone might have an in to nVidia to clarify this, since I don't.  
In any case, the udelay(2) approach seems to be a safe fix for this 
machine.


Hope input from an outsider is helpful in going forward.   I put a 
lot of time and effort into tracking down this problem on this 
particular machine model, largely because I like the machine.


Running the (slightly modified to test ports 80, ec, ef instead of 
just port 80) test when the 2 GHz max speed CPU is running at 800 
MHz, here's what I get for port 80 and port ec and port ef.


port 80:   cycles: out 1430, in 792


At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the in 
is somewhat interesting. Did someone at nVidia think it's an in from 
0x80 which should get the 1 microsec delay?



port ef:   cycles: out 1431, in 1378
port ec:   cycles: out 1432, in 1372


Unused ports, bus timeouts.




System info:  HP Pavilion dv9000z laptop (AMD64x2)

PCI bus controller is nVidia MCP51.
processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 15
model       : 72
model name  : AMD Turion(tm) 64 X2 Mobile Technology TL-60
stepping    : 2
cpu MHz     : 800.000
cache size  : 512 KB


Rene.




Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread H. Peter Anvin

Alan Cox wrote:

On Wed, 12 Dec 2007 21:58:25 +0100
Rene Herman [EMAIL PROTECTED] wrote:


On 12-12-07 21:26, Rene Herman wrote:


On 12-12-07 21:07, David P. Reed wrote:
Someone might have an in to nVidia to clarify this, since I don't.  In 
any case, the udelay(2) approach seems to be a safe fix for this machine.
By the way, _does_ anyone have a contact at nVidia who could clarify? Alan 
maybe? I'm quite curious what they did...


I don't. Nvidia are not the most open bunch of people on the planet. This
doesn't appear to be a chipset bug anyway but a firmware one (other
systems with the same chipset work just fine).

The laptop maker might therefore be a better starting point.


One wonders if it does some SMM trick to capture port 0x80 writes and 
attempt to haul them off for debugging; it almost sounds like some kind 
of debugging code got let out into the field.


-hpa


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread H. Peter Anvin

Rene Herman wrote:

On 12-12-07 21:26, Rene Herman wrote:


On 12-12-07 21:07, David P. Reed wrote:


Someone might have an in to nVidia to clarify this, since I don't.  
In any case, the udelay(2) approach seems to be a safe fix for this 
machine.


By the way, _does_ anyone have a contact at nVidia who could clarify? 
Alan maybe? I'm quite curious what they did...


Summary:

Unless after booting with acpi=off, outputs to port 0x80 (the legacy 
way to delay I/O) reliably, but not immediately, hang MCP51 machines. 
Outputs to port 0xed do not, indicating it's not a generic bus abort 
problem.




Sorry, the first sentence didn't parse unambiguously for me.  Do you 
mean acpi=off works, or that acpi=off allows *subsequent* boots to work?


I have some people at nVidia I can probably ping.

-hpa


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread Alan Cox
On Wed, 12 Dec 2007 21:58:25 +0100
Rene Herman [EMAIL PROTECTED] wrote:

 On 12-12-07 21:26, Rene Herman wrote:
 
  On 12-12-07 21:07, David P. Reed wrote:
 
  Someone might have an in to nVidia to clarify this, since I don't.  In 
  any case, the udelay(2) approach seems to be a safe fix for this machine.
 
 By the way, _does_ anyone have a contact at nVidia who could clarify? Alan 
 maybe? I'm quite curious what they did...

I don't. Nvidia are not the most open bunch of people on the planet. This
doesn't appear to be a chipset bug anyway but a firmware one (other
systems with the same chipset work just fine).

The laptop maker might therefore be a better starting point.


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread Rene Herman

On 12-12-07 21:26, Rene Herman wrote:


On 12-12-07 21:07, David P. Reed wrote:


Someone might have an in to nVidia to clarify this, since I don't.  In 
any case, the udelay(2) approach seems to be a safe fix for this machine.


By the way, _does_ anyone have a contact at nVidia who could clarify? Alan 
maybe? I'm quite curious what they did...


Summary:

Unless after booting with acpi=off, outputs to port 0x80 (the legacy way 
to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to 
port 0xed do not, indicating it's not a generic bus abort problem.


Rene.


Re: More info on port 80 symptoms on MCP51 machine.

2007-12-12 Thread Alan Cox
> One wonders if it does some SMM trick to capture port 0x80 writes and 
> attempt to haul them off for debugging; it almost sounds like some kind 
> of debugging code got let out into the field.

Not implausible. We've got a bug I've been dealing with where a vendor
left debug stuff enabled via the parallel port and which clearly
escaped from the test environment to the BIOS proper.