Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2008-01-07 Thread Tony Camuso
On Thu, 2007-12-20 at 22:50 +0300, Ivan Kokshaysky wrote:

> Use type 1 just for the first 64 bytes and tg3 will be happy. All we need
> is to avoid touching BARs with mmconfig.
> 
> Ivan.

I've tried Ivan's suggestion, and it works. 

The patch is appended below. 

My question is, do we want to incorporate this full-time upstream?

It would fix a lot of existing and potential problems, and the cost
is very small. 

Regards,
Tony

diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 1bf5816..42e7d4a 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -73,7 +73,8 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
}
 
base = get_base_addr(seg, bus, devfn);
-   if (!base)
+   if ((!addr) || (reg < 0x40))
+
return pci_conf1_read(seg,bus,devfn,reg,len,value);
 
spin_lock_irqsave(_config_lock, flags);
@@ -106,7 +107,7 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int 
bus,
return -EINVAL;
 
base = get_base_addr(seg, bus, devfn);
-   if (!base)
+   if ((!addr) || (reg < 0x40))
return pci_conf1_write(seg,bus,devfn,reg,len,value);
 
spin_lock_irqsave(_config_lock, flags);
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index 4095e4d..4ad1fcb 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -61,7 +61,7 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
}
 
addr = pci_dev_base(seg, bus, devfn);
-   if (!addr)
+   if ((!addr) || (reg < 0x40))
return pci_conf1_read(seg,bus,devfn,reg,len,value);
 
switch (len) {
@@ -89,7 +89,7 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
return -EINVAL;
 
addr = pci_dev_base(seg, bus, devfn);
-   if (!addr)
+   if ((!addr) || (reg < 0x40))
return pci_conf1_write(seg,bus,devfn,reg,len,value);
 
switch (len) {


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2008-01-07 Thread Tony Camuso
On Thu, 2007-12-20 at 22:50 +0300, Ivan Kokshaysky wrote:

 Use type 1 just for the first 64 bytes and tg3 will be happy. All we need
 is to avoid touching BARs with mmconfig.
 
 Ivan.

I've tried Ivan's suggestion, and it works. 

The patch is appended below. 

My question is, do we want to incorporate this full-time upstream?

It would fix a lot of existing and potential problems, and the cost
is very small. 

Regards,
Tony

diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 1bf5816..42e7d4a 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -73,7 +73,8 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
}
 
base = get_base_addr(seg, bus, devfn);
-   if (!base)
+   if ((!addr) || (reg  0x40))
+
return pci_conf1_read(seg,bus,devfn,reg,len,value);
 
spin_lock_irqsave(pci_config_lock, flags);
@@ -106,7 +107,7 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int 
bus,
return -EINVAL;
 
base = get_base_addr(seg, bus, devfn);
-   if (!base)
+   if ((!addr) || (reg  0x40))
return pci_conf1_write(seg,bus,devfn,reg,len,value);
 
spin_lock_irqsave(pci_config_lock, flags);
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index 4095e4d..4ad1fcb 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -61,7 +61,7 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
}
 
addr = pci_dev_base(seg, bus, devfn);
-   if (!addr)
+   if ((!addr) || (reg  0x40))
return pci_conf1_read(seg,bus,devfn,reg,len,value);
 
switch (len) {
@@ -89,7 +89,7 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
return -EINVAL;
 
addr = pci_dev_base(seg, bus, devfn);
-   if (!addr)
+   if ((!addr) || (reg  0x40))
return pci_conf1_write(seg,bus,devfn,reg,len,value);
 
switch (len) {


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-24 Thread Robert Hancock

Loic Prylli wrote:

I just realized one thing: the bar sizing code in pci_read_bases() (that
writes 0x in the bars) does not seem to disable the
PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
manipulating the BARs. And it seems nobody else ensures they are
disabled at this point either (or am I missing something?).


No you're not missing anything. This problem causes many machines to 
break horribly when MMCONFIG is enabled. There's a patch in -mm to fix 
this. (It special-cases the case of host bridges and doesn't disable the 
 decode bits for those, since some are known to do crazy things if you 
do that.)


http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc6/2.6.24-rc6-mm1/broken-out/pci-disable-decoding-during-sizing-of-bars.patch



Touching the bars while they are enabled would be buggy behaviour from
our part, and something trivial to fix. And it might well fix that
particular problem (it's fair play from the machine to crash if we
create a decoding conflict, simply disabling the cmd bits in
pci_read_bases() should remove that conflict).

FWIW, to partially answer your last question, Windows does disable
mem-space and/or IO-space when sizing the bars of a device (I have some
traces of configuration-space-access taken on a window machine for one
of the PCI busses).


Good to know. There was some speculation that it did not.

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-24 Thread Grant Grundler
On Thu, Dec 20, 2007 at 02:40:06PM -0800, Greg KH wrote:
> Sure, I realize this, but it solves the problem in one way for broken
> hardware, such that it at least allows it to work, right?  It also
> provides a better incentive for the manufacturer to fix their bios,
> which as you are on-site at HP, it would seem odd that they would just
> not do that instead of trying to work around this in the kernel...

Greg, you know why: Cost.

Asking the BIOS provider to roll a new version costs.
Rolling the BIOS version might also require re-running the OS "Certification".
Besides the cost of an OS re-cert, BIOS changes are unlikely to happen if
it will delay the introduction and/or sale of new products.

And it's not considered to be a "BIOS" problem unless win XP (or maybe
"Vista" these days) fails the same way (which BIOS supplier would have
fixed...).

hth,
grant
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-24 Thread Grant Grundler
On Sun, Dec 23, 2007 at 03:16:24PM -0500, Loic Prylli wrote:
...
> I just realized one thing: the bar sizing code in pci_read_bases() (that
> writes 0x in the bars) does not seem to disable the
> PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
> manipulating the BARs. And it seems nobody else ensures they are
> disabled at this point either (or am I missing something?).

You are missing some history... I posted such a patch in 2002:
http://lkml.org/lkml/2002/12/19/145

> Touching the bars while they are enabled would be buggy behaviour from
> our part, and something trivial to fix. And it might well fix that
> particular problem (it's fair play from the machine to crash if we
> create a decoding conflict, simply disabling the cmd bits in
> pci_read_bases() should remove that conflict).

ISTR willy or Ivan recently posted a patch that was suggested in 2002
as well (don't disable MMIO on bridge devices when sizing BARs)...so the
main objections might be resolved to this "obvious fix". *sigh*

> FWIW, to partially answer your last question, Windows does disable
> mem-space and/or IO-space when sizing the bars of a device (I have some
> traces of configuration-space-access taken on a window machine for one
> of the PCI busses).

Thanks for posting the traces...it's past midnight here and I'll try to
look at those tomorrow. (Sorry - sounds like a lame excuse but I'm likely
to read the trace incorrectly at the moment.)

cheers,
grant
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-24 Thread Grant Grundler
On Sun, Dec 23, 2007 at 03:16:24PM -0500, Loic Prylli wrote:
...
 I just realized one thing: the bar sizing code in pci_read_bases() (that
 writes 0x in the bars) does not seem to disable the
 PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
 manipulating the BARs. And it seems nobody else ensures they are
 disabled at this point either (or am I missing something?).

You are missing some history... I posted such a patch in 2002:
http://lkml.org/lkml/2002/12/19/145

 Touching the bars while they are enabled would be buggy behaviour from
 our part, and something trivial to fix. And it might well fix that
 particular problem (it's fair play from the machine to crash if we
 create a decoding conflict, simply disabling the cmd bits in
 pci_read_bases() should remove that conflict).

ISTR willy or Ivan recently posted a patch that was suggested in 2002
as well (don't disable MMIO on bridge devices when sizing BARs)...so the
main objections might be resolved to this obvious fix. *sigh*

 FWIW, to partially answer your last question, Windows does disable
 mem-space and/or IO-space when sizing the bars of a device (I have some
 traces of configuration-space-access taken on a window machine for one
 of the PCI busses).

Thanks for posting the traces...it's past midnight here and I'll try to
look at those tomorrow. (Sorry - sounds like a lame excuse but I'm likely
to read the trace incorrectly at the moment.)

cheers,
grant
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-24 Thread Grant Grundler
On Thu, Dec 20, 2007 at 02:40:06PM -0800, Greg KH wrote:
 Sure, I realize this, but it solves the problem in one way for broken
 hardware, such that it at least allows it to work, right?  It also
 provides a better incentive for the manufacturer to fix their bios,
 which as you are on-site at HP, it would seem odd that they would just
 not do that instead of trying to work around this in the kernel...

Greg, you know why: Cost.

Asking the BIOS provider to roll a new version costs.
Rolling the BIOS version might also require re-running the OS Certification.
Besides the cost of an OS re-cert, BIOS changes are unlikely to happen if
it will delay the introduction and/or sale of new products.

And it's not considered to be a BIOS problem unless win XP (or maybe
Vista these days) fails the same way (which BIOS supplier would have
fixed...).

hth,
grant
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-24 Thread Robert Hancock

Loic Prylli wrote:

I just realized one thing: the bar sizing code in pci_read_bases() (that
writes 0x in the bars) does not seem to disable the
PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
manipulating the BARs. And it seems nobody else ensures they are
disabled at this point either (or am I missing something?).


No you're not missing anything. This problem causes many machines to 
break horribly when MMCONFIG is enabled. There's a patch in -mm to fix 
this. (It special-cases the case of host bridges and doesn't disable the 
 decode bits for those, since some are known to do crazy things if you 
do that.)


http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc6/2.6.24-rc6-mm1/broken-out/pci-disable-decoding-during-sizing-of-bars.patch



Touching the bars while they are enabled would be buggy behaviour from
our part, and something trivial to fix. And it might well fix that
particular problem (it's fair play from the machine to crash if we
create a decoding conflict, simply disabling the cmd bits in
pci_read_bases() should remove that conflict).

FWIW, to partially answer your last question, Windows does disable
mem-space and/or IO-space when sizing the bars of a device (I have some
traces of configuration-space-access taken on a window machine for one
of the PCI busses).


Good to know. There was some speculation that it did not.

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-23 Thread Loic Prylli
On 12/23/2007 3:55 PM, Matthew Wilcox wrote:
> On Sun, Dec 23, 2007 at 03:16:24PM -0500, Loic Prylli wrote:
>   
>> I just realized one thing: the bar sizing code in pci_read_bases() (that
>> writes 0x in the bars) does not seem to disable the
>> PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
>> manipulating the BARs. And it seems nobody else ensures they are
>> disabled at this point either (or am I missing something?).
>> 
>
> Right, we don't.  Ivan and Greg are convinced that doing so can break
> some machines.  
>   



 It might indeed be scary to suddenly change the command bits on some
old "Host bridges" and maybe other classes of bridges. I was too
optimistic about the triviality of the fix.


Still an obvious improvement to the pci_read_bases() code would be to
not try sizing non-existings BARs (that way if a device has no memory
bars, we never need to touch the mem-enable bit). In addition, if you
exclude Host bridges and other bridges from temporary mem-disabling
while sizing their BARs, then doing temp disabling for the remaining
devices looks less scary than the alternative (messing live with the
bars decoding range especially for regular devices integrated to the
main chipset).


It seems to me worth making that init code safer independantly of the
mmconfig issues.



> Now that contradicts some information we've been told before; can you
> post those traces?  That would argue in favour of disabling memspace
> when configuring BARs.
>   



Here is a trace where COMMAND/STATUS and 0x10 and 0x14 registers
accesses were captured. Comments in brackets are my interpretation of
what's happening. That was with windows 2003 64bit.


Loic



[ reset at 607 second]

[BIOS: read-status, COMMAND = 0 ]

626.442718:cfg-read:0x6.w=0x10
626.442805:cfg-write:0x4.w=0x0

[BIOS: sizing BAR0]

626.448325:cfg-read:0x10.l=0x8
626.448365:cfg-write:0x10.l=0xfffe
626.448439:cfg-read:0x10.l=0xff08
626.448493:cfg-write:0x10.l=0x8

[BIOS: sizing non-existent BAR1]

626.448552:cfg-read:0x14.l=0x0
626.448591:cfg-write:0x14.l=0xfffe
626.448647:cfg-read:0x14.l=0x0
626.448687:cfg-write:0x14.l=0x0

[BIOS: sizing BAR0]

626.449596:cfg-read:0x10.l=0x8
626.449635:cfg-write:0x10.l=0xfffe
626.449710:cfg-read:0x10.l=0xff08
626.449764:cfg-write:0x10.l=0x8

[BIOS: sizing non-existent BAR1]

626.449832:cfg-read:0x14.l=0x0
626.449872:cfg-write:0x14.l=0xfffe
626.449928:cfg-read:0x14.l=0x0
626.449967:cfg-write:0x14.l=0x0

[BIOS: sizing BAR0]


626.451822:cfg-read:0x10.l=0x8
626.451862:cfg-write:0x10.l=0xfffe
626.451936:cfg-read:0x10.l=0xff08
626.451990:cfg-write:0x10.l=0x8

[BIOS: sizing non-existent BAR1]

626.452058:cfg-read:0x14.l=0x0
626.452098:cfg-write:0x14.l=0xfffe
626.452154:cfg-read:0x14.l=0x0
626.452194:cfg-write:0x14.l=0x0


[BIOS: sizing and assigning BAR0]

626.454899:cfg-read:0x10.l=0x8
626.454939:cfg-write:0x10.l=0xfffe
626.455013:cfg-read:0x10.l=0xff08

626.455068:cfg-write:0x10.l=0x8
626.455138:cfg-write:0x10.l=0xfa00

[BIOS: sizing non-existent BAR1]

626.455211:cfg-read:0x14.l=0x0
626.455251:cfg-write:0x14.l=0xfffe
626.455307:cfg-read:0x14.l=0x0
626.455346:cfg-write:0x14.l=0x0


[BIOS: sizing and restoring BAR0]


628.024774:cfg-read:0x10.l=0xfa08
628.024831:cfg-write:0x10.l=0xfffe
628.024908:cfg-read:0x10.l=0xff08
628.024964:cfg-write:0x10.l=0xfa08

[BIOS: sizing non-existent BAR1]

628.025049:cfg-read:0x14.l=0x0
628.025091:cfg-write:0x14.l=0xfffe
628.025149:cfg-read:0x14.l=0x0
628.025190:cfg-write:0x14.l=0x0

[BIOS: clear status ]

628.025714:cfg-write:0x6.w=0x

[BIOS: enable SERR/IO+MEMORY+MASTER+SPECIAL+INVALIDATE ]
628.025788:cfg-write:0x4.w=0x11f

[BIOS: enable MEMORY+MASTER (after seeing IO/special/invalidate were
read-only 0) ]
629.028114:cfg-read:0x4.b=0x6
629.028160:cfg-write:0x4.b=0x6

[BIOS: read status, clear PARITY/SERR status bits ]
629.028490:cfg-read:0x7.b=0x0
629.028536:cfg-write:0x7.b=0xc1

[BIOS: enable PARITY detection ]
629.028818:cfg-read:0x4.b=0x6
629.028863:cfg-write:0x4.b=0x46

[BIOS: enable SERR (already enabled anyway) ]
629.028920:cfg-read:0x5.b=0x1
629.028966:cfg-write:0x5.b=0x1
629.032338:cfg-read:0x6.w=0x10


[ WINDOWS starting ]

661.965932:cfg-read:0x4.l=0x100146
661.965986:cfg-read:0x10.l=0xfa08
661.966040:cfg-read:0x14.l=0x0

662.112971:cfg-read:0x4.l=0x100146
662.113024:cfg-read:0x10.l=0xfa08
662.113078:cfg-read:0x14.l=0x0

662.155903:cfg-read:0x4.l=0x100146

662.156066:cfg-read:0x4.l=0x100146
662.156120:cfg-read:0x10.l=0xfa08
662.156174:cfg-read:0x14.l=0x0

662.156337:cfg-read:0x4.l=0x100146
662.156391:cfg-read:0x10.l=0xfa08
662.156445:cfg-read:0x14.l=0x0

[ WINDOWS: disable MEMORY + MASTER ]
662.156519:cfg-write:0x4.l=0x140

[ WINDOWS: size BAR0 and BAR1]
662.156588:cfg-write:0x10.l=0x
662.156662:cfg-write:0x14.l=0x
662.156755:cfg-read:0x4.l=0x100140
662.156809:cfg-read:0x10.l=0xff08
662.156863:cfg-read:0x14.l=0x0

[ WINDOWS: disable MEMORY + 

Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-23 Thread Matthew Wilcox
On Sun, Dec 23, 2007 at 03:16:24PM -0500, Loic Prylli wrote:
> I just realized one thing: the bar sizing code in pci_read_bases() (that
> writes 0x in the bars) does not seem to disable the
> PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
> manipulating the BARs. And it seems nobody else ensures they are
> disabled at this point either (or am I missing something?).

Right, we don't.  Ivan and Greg are convinced that doing so can break
some machines.  

> FWIW, to partially answer your last question, Windows does disable
> mem-space and/or IO-space when sizing the bars of a device (I have some
> traces of configuration-space-access taken on a window machine for one
> of the PCI busses).

Now that contradicts some information we've been told before; can you
post those traces?  That would argue in favour of disabling memspace
when configuring BARs.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-23 Thread Loic Prylli


On 12/20/2007 1:16 PM, Matthew Wilcox wrote:
> Oh, that's the same bug others (including me) have been complaining
> about.
>
> http://marc.info/?l=linux-kernel=118809338631160=2
>
>   
>> It hangs in exactly the same place every time.
>>
>> I am surmising that the write to that BAR is causing a MCE.
>> 
>
> Bad deduction.  What's happening is that the write to the BAR is causing
> it to overlap the decode for mmconfig space.  So the mmconfig write to
> set the BAR back never gets through.
>
> I have a different idea to fix this problem.  Instead of writing
> 0x, we could look for an unused bit of space in the E820 map and
> write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
> overlap, and we could find its size using MMCONFIG.
>
> Does anyone know how Windows handles these machines? 



I just realized one thing: the bar sizing code in pci_read_bases() (that
writes 0x in the bars) does not seem to disable the
PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
manipulating the BARs. And it seems nobody else ensures they are
disabled at this point either (or am I missing something?).


Touching the bars while they are enabled would be buggy behaviour from
our part, and something trivial to fix. And it might well fix that
particular problem (it's fair play from the machine to crash if we
create a decoding conflict, simply disabling the cmd bits in
pci_read_bases() should remove that conflict).

FWIW, to partially answer your last question, Windows does disable
mem-space and/or IO-space when sizing the bars of a device (I have some
traces of configuration-space-access taken on a window machine for one
of the PCI busses).



Loic






--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-23 Thread Loic Prylli


On 12/20/2007 1:16 PM, Matthew Wilcox wrote:
 Oh, that's the same bug others (including me) have been complaining
 about.

 http://marc.info/?l=linux-kernelm=118809338631160w=2

   
 It hangs in exactly the same place every time.

 I am surmising that the write to that BAR is causing a MCE.
 

 Bad deduction.  What's happening is that the write to the BAR is causing
 it to overlap the decode for mmconfig space.  So the mmconfig write to
 set the BAR back never gets through.

 I have a different idea to fix this problem.  Instead of writing
 0x, we could look for an unused bit of space in the E820 map and
 write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
 overlap, and we could find its size using MMCONFIG.

 Does anyone know how Windows handles these machines? 



I just realized one thing: the bar sizing code in pci_read_bases() (that
writes 0x in the bars) does not seem to disable the
PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
manipulating the BARs. And it seems nobody else ensures they are
disabled at this point either (or am I missing something?).


Touching the bars while they are enabled would be buggy behaviour from
our part, and something trivial to fix. And it might well fix that
particular problem (it's fair play from the machine to crash if we
create a decoding conflict, simply disabling the cmd bits in
pci_read_bases() should remove that conflict).

FWIW, to partially answer your last question, Windows does disable
mem-space and/or IO-space when sizing the bars of a device (I have some
traces of configuration-space-access taken on a window machine for one
of the PCI busses).



Loic






--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-23 Thread Matthew Wilcox
On Sun, Dec 23, 2007 at 03:16:24PM -0500, Loic Prylli wrote:
 I just realized one thing: the bar sizing code in pci_read_bases() (that
 writes 0x in the bars) does not seem to disable the
 PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
 manipulating the BARs. And it seems nobody else ensures they are
 disabled at this point either (or am I missing something?).

Right, we don't.  Ivan and Greg are convinced that doing so can break
some machines.  

 FWIW, to partially answer your last question, Windows does disable
 mem-space and/or IO-space when sizing the bars of a device (I have some
 traces of configuration-space-access taken on a window machine for one
 of the PCI busses).

Now that contradicts some information we've been told before; can you
post those traces?  That would argue in favour of disabling memspace
when configuring BARs.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-23 Thread Loic Prylli
On 12/23/2007 3:55 PM, Matthew Wilcox wrote:
 On Sun, Dec 23, 2007 at 03:16:24PM -0500, Loic Prylli wrote:
   
 I just realized one thing: the bar sizing code in pci_read_bases() (that
 writes 0x in the bars) does not seem to disable the
 PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before
 manipulating the BARs. And it seems nobody else ensures they are
 disabled at this point either (or am I missing something?).
 

 Right, we don't.  Ivan and Greg are convinced that doing so can break
 some machines.  
   



 It might indeed be scary to suddenly change the command bits on some
old Host bridges and maybe other classes of bridges. I was too
optimistic about the triviality of the fix.


Still an obvious improvement to the pci_read_bases() code would be to
not try sizing non-existings BARs (that way if a device has no memory
bars, we never need to touch the mem-enable bit). In addition, if you
exclude Host bridges and other bridges from temporary mem-disabling
while sizing their BARs, then doing temp disabling for the remaining
devices looks less scary than the alternative (messing live with the
bars decoding range especially for regular devices integrated to the
main chipset).


It seems to me worth making that init code safer independantly of the
mmconfig issues.



 Now that contradicts some information we've been told before; can you
 post those traces?  That would argue in favour of disabling memspace
 when configuring BARs.
   



Here is a trace where COMMAND/STATUS and 0x10 and 0x14 registers
accesses were captured. Comments in brackets are my interpretation of
what's happening. That was with windows 2003 64bit.


Loic



[ reset at 607 second]

[BIOS: read-status, COMMAND = 0 ]

626.442718:cfg-read:0x6.w=0x10
626.442805:cfg-write:0x4.w=0x0

[BIOS: sizing BAR0]

626.448325:cfg-read:0x10.l=0x8
626.448365:cfg-write:0x10.l=0xfffe
626.448439:cfg-read:0x10.l=0xff08
626.448493:cfg-write:0x10.l=0x8

[BIOS: sizing non-existent BAR1]

626.448552:cfg-read:0x14.l=0x0
626.448591:cfg-write:0x14.l=0xfffe
626.448647:cfg-read:0x14.l=0x0
626.448687:cfg-write:0x14.l=0x0

[BIOS: sizing BAR0]

626.449596:cfg-read:0x10.l=0x8
626.449635:cfg-write:0x10.l=0xfffe
626.449710:cfg-read:0x10.l=0xff08
626.449764:cfg-write:0x10.l=0x8

[BIOS: sizing non-existent BAR1]

626.449832:cfg-read:0x14.l=0x0
626.449872:cfg-write:0x14.l=0xfffe
626.449928:cfg-read:0x14.l=0x0
626.449967:cfg-write:0x14.l=0x0

[BIOS: sizing BAR0]


626.451822:cfg-read:0x10.l=0x8
626.451862:cfg-write:0x10.l=0xfffe
626.451936:cfg-read:0x10.l=0xff08
626.451990:cfg-write:0x10.l=0x8

[BIOS: sizing non-existent BAR1]

626.452058:cfg-read:0x14.l=0x0
626.452098:cfg-write:0x14.l=0xfffe
626.452154:cfg-read:0x14.l=0x0
626.452194:cfg-write:0x14.l=0x0


[BIOS: sizing and assigning BAR0]

626.454899:cfg-read:0x10.l=0x8
626.454939:cfg-write:0x10.l=0xfffe
626.455013:cfg-read:0x10.l=0xff08

626.455068:cfg-write:0x10.l=0x8
626.455138:cfg-write:0x10.l=0xfa00

[BIOS: sizing non-existent BAR1]

626.455211:cfg-read:0x14.l=0x0
626.455251:cfg-write:0x14.l=0xfffe
626.455307:cfg-read:0x14.l=0x0
626.455346:cfg-write:0x14.l=0x0


[BIOS: sizing and restoring BAR0]


628.024774:cfg-read:0x10.l=0xfa08
628.024831:cfg-write:0x10.l=0xfffe
628.024908:cfg-read:0x10.l=0xff08
628.024964:cfg-write:0x10.l=0xfa08

[BIOS: sizing non-existent BAR1]

628.025049:cfg-read:0x14.l=0x0
628.025091:cfg-write:0x14.l=0xfffe
628.025149:cfg-read:0x14.l=0x0
628.025190:cfg-write:0x14.l=0x0

[BIOS: clear status ]

628.025714:cfg-write:0x6.w=0x

[BIOS: enable SERR/IO+MEMORY+MASTER+SPECIAL+INVALIDATE ]
628.025788:cfg-write:0x4.w=0x11f

[BIOS: enable MEMORY+MASTER (after seeing IO/special/invalidate were
read-only 0) ]
629.028114:cfg-read:0x4.b=0x6
629.028160:cfg-write:0x4.b=0x6

[BIOS: read status, clear PARITY/SERR status bits ]
629.028490:cfg-read:0x7.b=0x0
629.028536:cfg-write:0x7.b=0xc1

[BIOS: enable PARITY detection ]
629.028818:cfg-read:0x4.b=0x6
629.028863:cfg-write:0x4.b=0x46

[BIOS: enable SERR (already enabled anyway) ]
629.028920:cfg-read:0x5.b=0x1
629.028966:cfg-write:0x5.b=0x1
629.032338:cfg-read:0x6.w=0x10


[ WINDOWS starting ]

661.965932:cfg-read:0x4.l=0x100146
661.965986:cfg-read:0x10.l=0xfa08
661.966040:cfg-read:0x14.l=0x0

662.112971:cfg-read:0x4.l=0x100146
662.113024:cfg-read:0x10.l=0xfa08
662.113078:cfg-read:0x14.l=0x0

662.155903:cfg-read:0x4.l=0x100146

662.156066:cfg-read:0x4.l=0x100146
662.156120:cfg-read:0x10.l=0xfa08
662.156174:cfg-read:0x14.l=0x0

662.156337:cfg-read:0x4.l=0x100146
662.156391:cfg-read:0x10.l=0xfa08
662.156445:cfg-read:0x14.l=0x0

[ WINDOWS: disable MEMORY + MASTER ]
662.156519:cfg-write:0x4.l=0x140

[ WINDOWS: size BAR0 and BAR1]
662.156588:cfg-write:0x10.l=0x
662.156662:cfg-write:0x14.l=0x
662.156755:cfg-read:0x4.l=0x100140
662.156809:cfg-read:0x10.l=0xff08
662.156863:cfg-read:0x14.l=0x0

[ WINDOWS: disable MEMORY + MASTER ]


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-22 Thread Robert Hancock

Loic Prylli wrote:

On 12/20/2007 6:21 PM, Tony Camuso wrote:

And the MMCONFIG problem with enterprise systems and workstations, where
we do control the BIOS (for the most part), is due to known bugs in
certain versions of certain chipsets, HT1000, AMD8132, among them, not
the BIOS.




The lack of MMCONFIG support is indeed because some hypertransport
chipsets lack that support. But there are some BIOSes out there that are
advertising support for all busses in their MCFG acpi attribute (even
the busses managed by some amd8131 in a mixed nvidia-ck804/amd8131
motherboard), and the BIOS seems at least faulty for advertising a
capability that does not exist.


This didn't really occur to me before for some reason. But yes, the MCFG 
table lists the buses to which each MMCONFIG region is applicable. If 
there are entire buses which MMCONFIG cannot access, it should not be 
indicating they are accessible via MMCONFIG in the ACPI MCFG table. If 
it is, then it's truly a BIOS bug.


Unless of course Linux isn't handling what the MCFG table is indicating 
properly. Then it's our bug. It would be good to verify this on one of 
the systems involved..


One of the things this patch (currently in -mm) does is dump out the 
segment and starting/ending buses for each MCFG configuration listed. 
The dmesg from this patch applied on such a system would tell you which 
is the case:


http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-x86.git;a=commit;h=e18c985289ee356f06dbc953281a3c140a02fbb3

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-22 Thread Robert Hancock

Loic Prylli wrote:

On 12/20/2007 6:21 PM, Tony Camuso wrote:

And the MMCONFIG problem with enterprise systems and workstations, where
we do control the BIOS (for the most part), is due to known bugs in
certain versions of certain chipsets, HT1000, AMD8132, among them, not
the BIOS.




The lack of MMCONFIG support is indeed because some hypertransport
chipsets lack that support. But there are some BIOSes out there that are
advertising support for all busses in their MCFG acpi attribute (even
the busses managed by some amd8131 in a mixed nvidia-ck804/amd8131
motherboard), and the BIOS seems at least faulty for advertising a
capability that does not exist.


This didn't really occur to me before for some reason. But yes, the MCFG 
table lists the buses to which each MMCONFIG region is applicable. If 
there are entire buses which MMCONFIG cannot access, it should not be 
indicating they are accessible via MMCONFIG in the ACPI MCFG table. If 
it is, then it's truly a BIOS bug.


Unless of course Linux isn't handling what the MCFG table is indicating 
properly. Then it's our bug. It would be good to verify this on one of 
the systems involved..


One of the things this patch (currently in -mm) does is dump out the 
segment and starting/ending buses for each MCFG configuration listed. 
The dmesg from this patch applied on such a system would tell you which 
is the case:


http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-x86.git;a=commit;h=e18c985289ee356f06dbc953281a3c140a02fbb3

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-21 Thread Bhavana Nagendra

Tony Camuso wrote:

Robert Hancock wrote:

First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there 
is no other way to get around this problem than disabling MMCONFIG 
for accesses behind those chips.





And here are the excerpts from that page of the spec which are salient
to the present discussion:

--

The base configuration space of the AMD-8132 and PCI(-X) devices 
attached to it are accessible using only
the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices 
attached to the AMD-8132 in the
extended configuration space are not accessible. The AMD-8132 has no 
registers in the extended

configuration space.

Fix Planned
No


Prarit, Tony et al, this is as official as it gets.   It's pretty 
clearly stated.


Bhavana
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-21 Thread Andi Kleen
Robert Hancock <[EMAIL PROTECTED]> writes:

> First off, I would like to see confirmation from the horses's mouths
> here (namely AMD,

AMD publicly releases errata sheets/data sheets for their PCI bridges
(check their website).  I haven't checked the 8132 errata for this
though. Not sure it implements MMCONFIG at all.

However the PCI Express chipsets typically do implement 
their own MMCONFIG aperture.

> The case of the device built into the K8 northbridge that's
> unreachable by MMCONFIG kind of makes sense, 

The internal northbridge devices on K8 are not reachable through
mmconfig. While BIOS are supposed to express this in MCFG by excluding
that bus many don't. That was the original reason I added the
type1<->mcfg sanity check. It catches the K8 case fine.

> since the northbridge is
> what's translating the MMCONFIG memory access into config accesses. 

The way it works on K8 systems is that the CPU internal northbridge
knows nothing about MMCONFIG, but that the external chipsets
implement an MMCONFIG aperture on their own outside the northbridge.

If you have multiple bridges like some SLI K8 setups that could
be multiple ones.

This has changed on the Quad Core Fam10h CPUs BTW -- there the
NB can deliver an single mmconfig aperture that is translated
to appropiate transactions on the Hypertransport link.

What might happen with K8 and 8132 (I'm speculating here) is that 
they got a PCI Express chipset that implements an MMCONFIG 
aperture for its devices, but the system also has a PCI-X 8132
bridge and the MMCONFIG aperture inside the chipset doesn't
support talking to the 8132 which might be "upstream" in the HT
topology. And the BIOS' MCFG doesn't tell Linux that.

-Andi


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-21 Thread Bhavana Nagendra

Tony Camuso wrote:

Robert Hancock wrote:

First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there 
is no other way to get around this problem than disabling MMCONFIG 
for accesses behind those chips.





And here are the excerpts from that page of the spec which are salient
to the present discussion:

--

The base configuration space of the AMD-8132 and PCI(-X) devices 
attached to it are accessible using only
the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices 
attached to the AMD-8132 in the
extended configuration space are not accessible. The AMD-8132 has no 
registers in the extended

configuration space.

Fix Planned
No


Prarit, Tony et al, this is as official as it gets.   It's pretty 
clearly stated.


Bhavana
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-21 Thread Andi Kleen
Robert Hancock [EMAIL PROTECTED] writes:

 First off, I would like to see confirmation from the horses's mouths
 here (namely AMD,

AMD publicly releases errata sheets/data sheets for their PCI bridges
(check their website).  I haven't checked the 8132 errata for this
though. Not sure it implements MMCONFIG at all.

However the PCI Express chipsets typically do implement 
their own MMCONFIG aperture.

 The case of the device built into the K8 northbridge that's
 unreachable by MMCONFIG kind of makes sense, 

The internal northbridge devices on K8 are not reachable through
mmconfig. While BIOS are supposed to express this in MCFG by excluding
that bus many don't. That was the original reason I added the
type1-mcfg sanity check. It catches the K8 case fine.

 since the northbridge is
 what's translating the MMCONFIG memory access into config accesses. 

The way it works on K8 systems is that the CPU internal northbridge
knows nothing about MMCONFIG, but that the external chipsets
implement an MMCONFIG aperture on their own outside the northbridge.

If you have multiple bridges like some SLI K8 setups that could
be multiple ones.

This has changed on the Quad Core Fam10h CPUs BTW -- there the
NB can deliver an single mmconfig aperture that is translated
to appropiate transactions on the Hypertransport link.

What might happen with K8 and 8132 (I'm speculating here) is that 
they got a PCI Express chipset that implements an MMCONFIG 
aperture for its devices, but the system also has a PCI-X 8132
bridge and the MMCONFIG aperture inside the chipset doesn't
support talking to the 8132 which might be upstream in the HT
topology. And the BIOS' MCFG doesn't tell Linux that.

-Andi


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Loic Prylli wrote:


Just curious, do you know of any system where that recommendation was
not followed? On all motherboards where I have seen a AMD-8131 or a
AMD-8132, they were alone on their hypertransport link, and other
"northbridges" (more precisely hypertransport to pci-express or
pci-whatever, often nvidia) with a "MMCONFIG BAR" where on one of the
other available hypertransport links in the system.


Loic



Here is the PCI configuration of the HP DL585G2.

You can see two nVidia CK804 PCIE root ports at bus 0 and bus 0x40.
Each of them has an 8132 connected as a subordinate bridge.

[EMAIL PROTECTED] ~]# lspci -vt
-+-[:40]-+-00.0  nVidia Corporation CK804 Memory Controller
 |   +-01.0  nVidia Corporation CK804 Memory Controller
 |   +-0b.0-[:4f-51]--
 |   +-0c.0-[:4c-4e]--
 |   +-0d.0-[:49-4b]--
 |   +-0e.0-[:46-48]--
 |   +-10.0-[:41]--+-01.0  Broadcom Corporation NetXtreme II 
BCM5706 Gigabit Ethernet
 |   | \-02.0  Broadcom Corporation NetXtreme II 
BCM5706 Gigabit Ethernet
 |   +-10.1  Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
 |   +-11.0-[:42-45]--
 |   \-11.1  Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
 \-[:00]-+-00.0  nVidia Corporation CK804 Memory Controller
 +-01.0  nVidia Corporation CK804 ISA Bridge
 +-02.0  nVidia Corporation CK804 USB Controller
 +-02.1  nVidia Corporation CK804 USB Controller
 +-06.0  nVidia Corporation CK804 IDE
 +-09.0-[:01]--+-03.0  ATI Technologies Inc ES1000
 | +-04.0  Compaq Computer Corporation Integrated 
Lights Out Controller
 | +-04.2  Compaq Computer Corporation Integrated 
Lights Out  Processor
 | +-04.4  Hewlett-Packard Company Proliant iLO2 
virtual USB controller
 | \-04.6  Hewlett-Packard Company Proliant iLO2 
virtual UART
 +-0c.0-[:08-0a]00.0  Hewlett-Packard Company Smart Array 
Controller
 +-0d.0-[:05-07]--
 +-0e.0-[:02-04]--
 +-18.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
 +-18.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address 
Map
 +-18.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
 +-18.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
 +-19.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
 +-19.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address 
Map
 +-19.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
 +-19.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
 +-1a.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
 +-1a.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address 
Map
 +-1a.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
 +-1a.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
 +-1b.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
 +-1b.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address 
Map
 +-1b.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
 \-1b.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
[EMAIL PROTECTED] ~]#













--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 9:15 PM, Robert Hancock wrote:
>>
>> Suggested Workaround
>>
>> It is strongly recommended that system designers do not connect the
>> AMD-8132 and devices that use extended
>> configuration space MMIO BARs (ex: HyperTransport-to-PCI Express®
>> bridges) to the same processor
>> HyperTransport link.
>>
>> Fix Planned
>> No
>
> That does sound fairly definitive. I have to wonder why certain system
> designers then didn't follow their strong recommendation..



Just curious, do you know of any system where that recommendation was
not followed? On all motherboards where I have seen a AMD-8131 or a
AMD-8132, they were alone on their hypertransport link, and other
"northbridges" (more precisely hypertransport to pci-express or
pci-whatever, often nvidia) with a "MMCONFIG BAR" where on one of the
other available hypertransport links in the system.


Loic

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

I have to wonder why certain system 
designers then didn't follow their strong recommendation..



I don't think I want to go there.

I used to be a hardware/firmware guy.
:D :D
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Robert Hancock

Tony Camuso wrote:

Robert Hancock wrote:

First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there 
is no other way to get around this problem than disabling MMCONFIG for 
accesses behind those chips.




I happen to have this one stored in my desktop.

 From AMD-8132TM HyperTransportTM
 PCI-X®2.0 Tunnel
  Revision Guide

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf 



79 AMD-8132TM Tunnel Lacks Extended Configuration
Space Memory-Mapped I/O Base Address Register

Description

Current AMD processors do not natively support PCI-defined extended 
configuration space. A memory
mapped I/O base address register (MMIO BAR) is required in chipset 
devices to support extended

configuration space. The AMD-8132 does not have this MMIO BAR.
Potential Effect On System

The AMD-8132 is a PCI-X® Mode 2 capable device and requires the MMIO BAR 
to support extended
configuration space. Using a device which does have this MMIO BAR and an 
AMD-8132 on the same
HyperTransportTM link of the processor may cause firmware/software 
problems.


The base configuration space of the AMD-8132 and PCI(-X) devices 
attached to it are accessible using only
the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices 
attached to the AMD-8132 in the
extended configuration space are not accessible. The AMD-8132 has no 
registers in the extended

configuration space.

Suggested Workaround

It is strongly recommended that system designers do not connect the 
AMD-8132 and devices that use extended
configuration space MMIO BARs (ex: HyperTransport-to-PCI Express® 
bridges) to the same processor

HyperTransport link.

Fix Planned
No


That does sound fairly definitive. I have to wonder why certain system 
designers then didn't follow their strong recommendation..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

The case of the device built into the K8 northbridge that's unreachable 
by MMCONFIG kind of makes sense, since the northbridge is what's 
translating the MMCONFIG memory access into config accesses. It seems 
bizarre to me that a bridge chip could possibly have such a problem. The 
MMCONFIG access should get translated into a configuration space access 
in the northbridge and from that point on there's no difference between 
an MMCONFIG and type1 access.



Robert's point is well taken.

Only northbridge chips can give us this kind of trouble, and the only
chips mentioned in the present discussion as not being mmconf-compliant
are northbridges (8132, ht1000).

The patch is aware of this, so once a root bus has been programmed for
legacy pci config access, all descendent buses automatically inherit
this access mechanism and are therefore not probed by the patch.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli

On 12/20/2007 6:21 PM, Tony Camuso wrote:
>
> And the MMCONFIG problem with enterprise systems and workstations, where
> we do control the BIOS (for the most part), is due to known bugs in
> certain versions of certain chipsets, HT1000, AMD8132, among them, not
> the BIOS.



The lack of MMCONFIG support is indeed because some hypertransport
chipsets lack that support. But there are some BIOSes out there that are
advertising support for all busses in their MCFG acpi attribute (even
the busses managed by some amd8131 in a mixed nvidia-ck804/amd8131
motherboard), and the BIOS seems at least faulty for advertising a
capability that does not exist.


Loic

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there 
is no other way to get around this problem than disabling MMCONFIG for 
accesses behind those chips.





And here are the excerpts from that page of the spec which are salient
to the present discussion:

--

The base configuration space of the AMD-8132 and PCI(-X) devices attached to it 
are accessible using only
the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices attached to 
the AMD-8132 in the
extended configuration space are not accessible. The AMD-8132 has no registers 
in the extended
configuration space.

Fix Planned
No


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there is 
no other way to get around this problem than disabling MMCONFIG for 
accesses behind those chips.




I happen to have this one stored in my desktop.

From AMD-8132TM HyperTransportTM
 PCI-X®2.0 Tunnel
  Revision Guide

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf

79 AMD-8132TM Tunnel Lacks Extended Configuration
Space Memory-Mapped I/O Base Address Register

Description

Current AMD processors do not natively support PCI-defined extended 
configuration space. A memory
mapped I/O base address register (MMIO BAR) is required in chipset devices to 
support extended
configuration space. The AMD-8132 does not have this MMIO BAR.
Potential Effect On System

The AMD-8132 is a PCI-X® Mode 2 capable device and requires the MMIO BAR to 
support extended
configuration space. Using a device which does have this MMIO BAR and an 
AMD-8132 on the same
HyperTransportTM link of the processor may cause firmware/software problems.

The base configuration space of the AMD-8132 and PCI(-X) devices attached to it 
are accessible using only
the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices attached to 
the AMD-8132 in the
extended configuration space are not accessible. The AMD-8132 has no registers 
in the extended
configuration space.

Suggested Workaround

It is strongly recommended that system designers do not connect the AMD-8132 
and devices that use extended
configuration space MMIO BARs (ex: HyperTransport-to-PCI Express® bridges) to 
the same processor
HyperTransport link.

Fix Planned
No

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Robert Hancock

Tony Camuso wrote:

Greg KH wrote:


Sure, I realize this, but it solves the problem in one way for broken
hardware, such that it at least allows it to work, right?  It also
provides a better incentive for the manufacturer to fix their bios,
which as you are on-site at HP, it would seem odd that they would just
not do that instead of trying to work around this in the kernel...

thanks,

greg k-h


I don't think that many OEMs have that much control over the BIOS in
their "value lines".
:)

And the MMCONFIG problem with enterprise systems and workstations, where
we do control the BIOS (for the most part), is due to known bugs in
certain versions of certain chipsets, HT1000, AMD8132, among them, not
the BIOS.

Anyway, we are devising better ways to deal with these anomalies
than blacklists and telling customers to use "pci=nommconf"

And we're bringing them to the community for discussion, improvement,
and, we hope, acceptance.


First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there is 
no other way to get around this problem than disabling MMCONFIG for 
accesses behind those chips.


The case of the device built into the K8 northbridge that's unreachable 
by MMCONFIG kind of makes sense, since the northbridge is what's 
translating the MMCONFIG memory access into config accesses. It seems 
bizarre to me that a bridge chip could possibly have such a problem. The 
MMCONFIG access should get translated into a configuration space access 
in the northbridge and from that point on there's no difference between 
an MMCONFIG and type1 access.


Look at MSI for another example, we recently had a patch from NVIDIA 
posted to enable Hypertransport MSI mapping bits on some chipsets so 
that MSI would function, because the BIOS failed to set them up 
properly. Are we sure there's not a similar BIOS configuration issue 
that could ideally be fixed in the BIOS, or else fixed up in the kernel?


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Greg KH wrote:


Sure, I realize this, but it solves the problem in one way for broken
hardware, such that it at least allows it to work, right?  It also
provides a better incentive for the manufacturer to fix their bios,
which as you are on-site at HP, it would seem odd that they would just
not do that instead of trying to work around this in the kernel...

thanks,

greg k-h


I don't think that many OEMs have that much control over the BIOS in
their "value lines".
:)

And the MMCONFIG problem with enterprise systems and workstations, where
we do control the BIOS (for the most part), is due to known bugs in
certain versions of certain chipsets, HT1000, AMD8132, among them, not
the BIOS.

Anyway, we are devising better ways to deal with these anomalies
than blacklists and telling customers to use "pci=nommconf"

And we're bringing them to the community for discussion, improvement,
and, we hope, acceptance.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Greg KH
On Thu, Dec 20, 2007 at 05:36:43PM -0500, Tony Camuso wrote:
> Greg KH wrote:
>> On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
>> Any reason why these changes were never submitted to the upstream kernel
>> versions?  Or do you all just want to keep patching your newer releases
>> with this information forever?  :)
>
> I really don't know why these changes were never submitted to the
> upstream kernel versions".
>
> I was brought on the scene about six months ago as HP's on-site engineer
> at RH, and this was one of the things they wanted me to do.
>
> We wanted a solution that was more generic and could manage this
> problem preemptively, rather than using blacklists. Maintenance of
> blacklists is a bother and almost always done after a new system
> with this problem is discovered.
>
> Furthermore, blacklisting whole platforms to use legacy pci config
> penalizes any mmconfig-friendly buses in those platforms, particularly
> the pci express buses, and causes such platforms to be non-compliant
> with the pci expres spec.

Sure, I realize this, but it solves the problem in one way for broken
hardware, such that it at least allows it to work, right?  It also
provides a better incentive for the manufacturer to fix their bios,
which as you are on-site at HP, it would seem odd that they would just
not do that instead of trying to work around this in the kernel...

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Greg KH wrote:

On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
Any reason why these changes were never submitted to the upstream kernel
versions?  Or do you all just want to keep patching your newer releases
with this information forever?  :)



I really don't know why these changes were never submitted to the
upstream kernel versions".

I was brought on the scene about six months ago as HP's on-site engineer
at RH, and this was one of the things they wanted me to do.

We wanted a solution that was more generic and could manage this
problem preemptively, rather than using blacklists. Maintenance of
blacklists is a bother and almost always done after a new system
with this problem is discovered.

Furthermore, blacklisting whole platforms to use legacy pci config
penalizes any mmconfig-friendly buses in those platforms, particularly
the pci express buses, and causes such platforms to be non-compliant
with the pci expres spec.

> Please send these kinds of changes upstream...
>
As you wish.
:)


I have talked to intel about this, but they haven't gotten back to me.


Try poking them harder, they are usually very responsive to Linux kernel
issues.

thanks,

greg k-h


Matthew Wilcox explained the problem to me. The hang that cannot be fixed
by my patch isn't the chipset's fault. The graphics device has a 64-bit
BAR asking for 256 MB of IO.

If I understand his explanation correctly, when the bus sizing code
writes the low dword of this BAR with 0x, the chip is then
programmed to claim every MMIO reference betweeen 0x.f000 and
0x0001.

It just so happens that MMCONFIG space for the dc5700 (awa some others) is
mapped by its BIOS into that very region, so all MMCONFIG references are now
swallowed up by the graphics chip. At that point, no further boot progress can
be made.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Greg KH
On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
> Appended below is a code snippet embedded in the RHEL version of 
> mmconfig.c,
> for both i386 and x86_64. It does not encompass all the systems that have
> (or will have) problems with mmconf. Only HP platforms are listed, but I
> believe there are others.
>
> The reason the HP platforms got caught is they put these devices beyond
> bus 16, where they would have been trapped, and the problem would have
> been avoided.

Any reason why these changes were never submitted to the upstream kernel
versions?  Or do you all just want to keep patching your newer releases
with this information forever?  :)

Please send these kinds of changes upstream...

>>> The one device we know about that throws exceptions is the 830M/MG
>>> graphics chip. This chip passes the read-compare test, so the code
>>> merrily advances to bus sizing. When the bus sizing code writes the
>>> BAR at offset 0x18 in this device, the system hangs.
>> So it doesn't work at all, with or without this patch?  Does the vendor
>> know about this?
>> thanks,
>> greg k-h
>
> I have talked to intel about this, but they haven't gotten back to me.

Try poking them harder, they are usually very responsive to Linux kernel
issues.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 4:00 PM, Matthew Wilcox wrote:
> On Thu, Dec 20, 2007 at 03:56:29PM -0500, Loic Prylli wrote:
>   
>> I know the final device is not aware on how the config request was
>> originated. I am just saying platforms built around the Intel 82801
>> chipset (ICH2) don't support mmconfig at all. I would also not be
>> surprised if the platforms where tg3 needs TG3_FLAG_PCIX_TARGET_HWBUG
>> might also not support mmconfig (but for this second case, it's only
>> speculation based on seeing a couple posts about
>> TG3_FLAG_PCIX_TARGET_HWBUG where amd hypertransport/PCI-X chipsets where
>> mentioned). If you know of a platform that support mmconfig, and where
>> the tg3 does need to use relatively intensively pci-conf-space, I'll be
>> happy to be corrected.
>> 
>
> tg3 is available as an add-in pci card.  i have one.  i can plug it into
> a machine that does support mmconfig.
>   



That doesn't tell for sure your NIC has the specific rev that would
cause the hwbug workaround to be used. But let's assume so, your
combination would still works correctly (maybe a slightly non-optimal
network performance). A non-default mmconf=always option can help
maximize again the performance.

My starting point was that more systems could be supported out-of-the
box (vs not working at all) by using "type 1" more widely,. This would
not break anybody.  And this would not affect performance except on what
I initially called "obscure hardware or systems".

I already acknowledged in the previous email the possibility of a
performance impact, but it is still not clear to me whether it would be
widespread. The example you  mention is a buggy revision of a chip that
was designed to use memory-mapped IO, and has to use config-space
instead because the memory-mapped IO implementation is buggy (in most
cases only when used in combination with certain chipsets). How much
influence should a modest performance impact there influence the
decision-making?


Usually it's better to trade some performance for stability than the
reverse, and you loose the performance anyway when some many people
start using pci=nommconf by default on all their installs.


Loic

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 03:56:29PM -0500, Loic Prylli wrote:
> I know the final device is not aware on how the config request was
> originated. I am just saying platforms built around the Intel 82801
> chipset (ICH2) don't support mmconfig at all. I would also not be
> surprised if the platforms where tg3 needs TG3_FLAG_PCIX_TARGET_HWBUG
> might also not support mmconfig (but for this second case, it's only
> speculation based on seeing a couple posts about
> TG3_FLAG_PCIX_TARGET_HWBUG where amd hypertransport/PCI-X chipsets where
> mentioned). If you know of a platform that support mmconfig, and where
> the tg3 does need to use relatively intensively pci-conf-space, I'll be
> happy to be corrected.

tg3 is available as an add-in pci card.  i have one.  i can plug it into
a machine that does support mmconfig.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 3:15 PM, Matthew Wilcox wrote:
> On Thu, Dec 20, 2007 at 03:05:52PM -0500, Loic Prylli wrote:
>   
>> I am not familiar with the tg3 driver, just trying to give a 5 minutes
>> look, it seems the typical cases where the pci-conf-space is used
>> intensively are with some rev in combination with the 82801
>> (TG3_FLG2_ICH_WORKAROUND) which I don't think support mmconfig anyway,
>> as well as some very specific PCI-X combinations
>> (TG3_FLAG_PCIX_TARGET_HWBUG) which are also very unlikely to support
>> mmconfig.
>> 
>
> It's not a question of whether the card supports mmconfig or not -- the
> card can't tell whether a first-256-byte pci config transaction was initiated 
> through mmconfig, type1, type2, or even a bios call.
>   





I know the final device is not aware on how the config request was
originated. I am just saying platforms built around the Intel 82801
chipset (ICH2) don't support mmconfig at all. I would also not be
surprised if the platforms where tg3 needs TG3_FLAG_PCIX_TARGET_HWBUG
might also not support mmconfig (but for this second case, it's only
speculation based on seeing a couple posts about
TG3_FLAG_PCIX_TARGET_HWBUG where amd hypertransport/PCI-X chipsets where
mentioned). If you know of a platform that support mmconfig, and where
the tg3 does need to use relatively intensively pci-conf-space, I'll be
happy to be corrected.




> I'm just hacking together an implementation based on Ivan's suggestion
> to always use type 1 for the first 64 bytes.
>
>   



Whatever you think it's best. If you are only trying to address a
BAR-sizing/mmconfig temporary conflict, and think the "MCFG BIOS" or
other mmconfig problems are under control, that makes sense. If the need
arise for somebody, an option to select between 64 to 256 can be added
later.



Loic


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Ivan Kokshaysky wrote:


Use type 1 just for the first 64 bytes and tg3 will be happy. All we need
is to avoid touching BARs with mmconfig.

Ivan.


Re-thinking this ...

The problem I see with this approach is that it does not help devices
that are mmconfig-unfriendly and will not work above the 64-byte offset.

So you will be able to configure the BARS of such a device, but not be
able to do much of anything else.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Ivan Kokshaysky wrote:

On Thu, Dec 20, 2007 at 12:08:33PM -0700, Matthew Wilcox wrote:

On Thu, Dec 20, 2007 at 02:04:31PM -0500, Tony Camuso wrote:

Does anybody see a down side to this?

It'll be slower than it would be if we used mmconfig directly.  Now yes,
nobody should be using pci config space in performance critical paths
... but see the tg3 driver.


Use type 1 just for the first 64 bytes and tg3 will be happy. All we need
is to avoid touching BARs with mmconfig.

Ivan.


Sounds good to me.

However, this does add another in-line test for every pci config access.

The existing test is a lookup in the unreachable_devices bitmap. Even though
the bitmap only covers the first 16 buses, the lookup is performed for every
pci config access.

We would be adding another test,

  if the offset is less than 64 bytes, use legacy pci config
  else use mmconfig

The emitted code would probably only produce one branch, so it shouldn't
present a performance degradation.

Any objections to taking this tact?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Matthew Wilcox wrote:


Here's how BARs work ... when you write 0x to the BAR, it
ignores all the set bits that are less than the size of the BAR.  So,
assuming this is a 256MB BAR (like my G33 is), what ends up written to
this BAR is 0xf000.  Now, because this is graphics, apparently it's
special and embedded in the chipset, even though it looks like it's a
PCI device.  So it actually gets priority over MMCONFIG which is also
mapped to 0xf000.

For your case of a 64-bit BAR, you could write 0x to the high
32-bits first, then write to the low 32-bits, then reset the low, then
high bits, and you'd avoid the problem.  But the G33 has a 32-bit BAR
with the same problem, so it won't work for that case.

BARs that are behind bridges don't have this problem (they can't decode
memory accesses that aren't forwarded to them).  BARs on devices which
have memory IO disabled also don't have theis problem, but disabling
devices has its problems (as does probing BARs for active devices anyway
...).


Thanks for the detailed explanation.

> The question is how large can 32-bit BARs get.  As we've seen, 256MB
> exist, and are causing pain.  I can't imagine any PCI device
> manufacturer thinks they can allocate 2GB of the low space, but we could
> potentially mis-size a large BAR by not using 0x.
>

Point well taken. Graphics devices understandably consume a lot of memory
space, and are likely to consume even more in the not-too-distant future.


I'm really not clear on the purpose of your patchset.  Was it all to
address this one problem?



No. My patch-set does not address this problem at all, but rather the
larger problem of having mmconfig-unfriendly devices on buses that are
out of reach of the unreachable_devices() routine and bitmap.

This problem is one I encountered during my testing and mentioned in
my preamble as not being fixable by my patch-set.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 03:05:52PM -0500, Loic Prylli wrote:
> I am not familiar with the tg3 driver, just trying to give a 5 minutes
> look, it seems the typical cases where the pci-conf-space is used
> intensively are with some rev in combination with the 82801
> (TG3_FLG2_ICH_WORKAROUND) which I don't think support mmconfig anyway,
> as well as some very specific PCI-X combinations
> (TG3_FLAG_PCIX_TARGET_HWBUG) which are also very unlikely to support
> mmconfig.

It's not a question of whether the card supports mmconfig or not -- the
card can't tell whether a first-256-byte pci config transaction was
initiated through mmconfig, type1, type2, or even a bios call.

I'm just hacking together an implementation based on Ivan's suggestion
to always use type 1 for the first 64 bytes.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 2:08 PM, Matthew Wilcox wrote:
> On Thu, Dec 20, 2007 at 02:04:31PM -0500, Tony Camuso wrote:
>   
>> Also, this solution also would allow us to remove the unreachable_devices()
>> routine and bitmap.
>> 
>
> Not really ... we probe reading address 0x100 to see if the device
> supports extended config space or not.  So we need to make that fail
> gracefully for the amd7111 case.
>   



pci_cfg_space_size()  is only done for PCI-express  or PCI-X mode 2
devices, so you still have eliminated the bulk of the problems which are
typically handling the legacy busses on modern machines. I don't know
what is the amd7111.



>   
>> Does anybody see a down side to this?
>> 
>
> It'll be slower than it would be if we used mmconfig directly.  Now yes,
> nobody should be using pci config space in performance critical paths
> ... but see the tg3 driver.
>   


I am not familiar with the tg3 driver, just trying to give a 5 minutes
look, it seems the typical cases where the pci-conf-space is used
intensively are with some rev in combination with the 82801
(TG3_FLG2_ICH_WORKAROUND) which I don't think support mmconfig anyway,
as well as some very specific PCI-X combinations
(TG3_FLAG_PCIX_TARGET_HWBUG) which are also very unlikely to support
mmconfig.

Even if I am wrong for the tg3, I don't really think mmconfig vs type1
could make a noticeable performance on any common systems (obscure
systems or hardware where it could potentially have a performance impact
could  use a non-default configuration).


Loic




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 02:37:49PM -0500, Tony Camuso wrote:
> Matthew Wilcox wrote:
> 
> >Bad deduction.  What's happening is that the write to the BAR is causing
> >it to overlap the decode for mmconfig space.  So the mmconfig write to
> >set the BAR back never gets through.
> >
> >I have a different idea to fix this problem.  Instead of writing
> >0x, we could look for an unused bit of space in the E820 map and
> >write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
> >overlap, and we could find its size using MMCONFIG.
> >
> Let me see if I understand this correctly.
> 
> Writing this BAR with 0x causes it to decode all further mmconfig
> references based at addr 0xfxxx as its own?
> 
> If I've got that right, then why don't any of the other BARs do that? Is
> it because this one's a 64-bit BAR?

Here's how BARs work ... when you write 0x to the BAR, it
ignores all the set bits that are less than the size of the BAR.  So,
assuming this is a 256MB BAR (like my G33 is), what ends up written to
this BAR is 0xf000.  Now, because this is graphics, apparently it's
special and embedded in the chipset, even though it looks like it's a
PCI device.  So it actually gets priority over MMCONFIG which is also
mapped to 0xf000.

For your case of a 64-bit BAR, you could write 0x to the high
32-bits first, then write to the low 32-bits, then reset the low, then
high bits, and you'd avoid the problem.  But the G33 has a 32-bit BAR
with the same problem, so it won't work for that case.

BARs that are behind bridges don't have this problem (they can't decode
memory accesses that aren't forwarded to them).  BARs on devices which
have memory IO disabled also don't have theis problem, but disabling
devices has its problems (as does probing BARs for active devices anyway
...).

> AFIK, there are no devices out there that require 32-bits of address
> space, so using 0xdfff in the low register would certainly work.

The question is how large can 32-bit BARs get.  As we've seen, 256MB
exist, and are causing pain.  I can't imagine any PCI device
manufacturer thinks they can allocate 2GB of the low space, but we could
potentially mis-size a large BAR by not using 0x.

> Does anybody see that change as being within the purview of the patch-set
> I am proposing? Or is that another patch for another time?

I'm really not clear on the purpose of your patchset.  Was it all to
address this one problem?

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Ivan Kokshaysky
On Thu, Dec 20, 2007 at 12:08:33PM -0700, Matthew Wilcox wrote:
> On Thu, Dec 20, 2007 at 02:04:31PM -0500, Tony Camuso wrote:
> > Does anybody see a down side to this?
> 
> It'll be slower than it would be if we used mmconfig directly.  Now yes,
> nobody should be using pci config space in performance critical paths
> ... but see the tg3 driver.

Use type 1 just for the first 64 bytes and tg3 will be happy. All we need
is to avoid touching BARs with mmconfig.

Ivan.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Matthew Wilcox wrote:


Bad deduction.  What's happening is that the write to the BAR is causing
it to overlap the decode for mmconfig space.  So the mmconfig write to
set the BAR back never gets through.

I have a different idea to fix this problem.  Instead of writing
0x, we could look for an unused bit of space in the E820 map and
write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
overlap, and we could find its size using MMCONFIG.


Let me see if I understand this correctly.

Writing this BAR with 0x causes it to decode all further mmconfig
references based at addr 0xfxxx as its own?

If I've got that right, then why don't any of the other BARs do that? Is
it because this one's a 64-bit BAR?

AFIK, there are no devices out there that require 32-bits of address
space, so using 0xdfff in the low register would certainly work.

However, the PCI spec says that we should be able to write 0x
to the BAR, buggy BIOS and hardware notwithstanding.

Does anybody see that change as being within the purview of the patch-set
I am proposing? Or is that another patch for another time?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 02:04:31PM -0500, Tony Camuso wrote:
> Also, this solution also would allow us to remove the unreachable_devices()
> routine and bitmap.

Not really ... we probe reading address 0x100 to see if the device
supports extended config space or not.  So we need to make that fail
gracefully for the amd7111 case.

> Does anybody see a down side to this?

It'll be slower than it would be if we used mmconfig directly.  Now yes,
nobody should be using pci config space in performance critical paths
... but see the tg3 driver.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Loic Prylli wrote:


Always using type 1 for accesses below 256 bytes looks like a very very
attractive solution

I know we had a lot of older kernels over the last two years that we
patched like that (we needed MMCONFIG for our own device development
purposes, but we also needed our machines to boot and discover all
devices reliably). Recent kernels works fine out of the box on all
hardware we have, but all this sometimes tricky and apparently endless
work (in big part because of buggy BIOSes) about MMCONFIG would probably
become relatively easy by limiting the aim to have MMCONFIG work when it
is required (for cfg-space accesses >= 256).


Loic


Hmmm... I think I like this solution.

It may be easier to implement than the solution I posted.

Also, this solution also would allow us to remove the unreachable_devices()
routine and bitmap.

Does anybody see a down side to this?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli

On 12/20/2007 1:16 PM, Matthew Wilcox wrote:
>
> Bad deduction.  What's happening is that the write to the BAR is causing
> it to overlap the decode for mmconfig space.  So the mmconfig write to
> set the BAR back never gets through.
>
> I have a different idea to fix this problem.  Instead of writing
> 0x, we could look for an unused bit of space in the E820 map and
> write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
> overlap, and we could find its size using MMCONFIG.
>
> Does anyone know how Windows handles these machines?  Obviously, if it's
> using MMCONFIG, it'd have the same problems.  Does it just use type 1
> for initial sizing?  Or does it use type 1 for all accesses below 256
> bytes?
>   



Always using type 1 for accesses below 256 bytes looks like a very very
attractive solution

I know we had a lot of older kernels over the last two years that we
patched like that (we needed MMCONFIG for our own device development
purposes, but we also needed our machines to boot and discover all
devices reliably). Recent kernels works fine out of the box on all
hardware we have, but all this sometimes tricky and apparently endless
work (in big part because of buggy BIOSes) about MMCONFIG would probably
become relatively easy by limiting the aim to have MMCONFIG work when it
is required (for cfg-space accesses >= 256).


Loic

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 01:30:16PM -0500, Tony Camuso wrote:
> Matthew Wilcox wrote:
> 
> >Bad deduction.  What's happening is that the write to the BAR is causing
> >it to overlap the decode for mmconfig space.  So the mmconfig write to
> >set the BAR back never gets through.
> >
> >I have a different idea to fix this problem.  Instead of writing
> >0x, we could look for an unused bit of space in the E820 map and
> >write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
> >overlap, and we could find its size using MMCONFIG.
> >
> The BAR claims to be a 64-bit BAR.

And what's in the upper 32 bits by default?  I'm guessing all zeroes.
We size 64-bit BARs by writing to each half individually.  So I bet this
one still overlaps the MMCONFIG space at 0xf000.

> As far as I know, Windows has a blacklist that limits systems with these
> devices to legacy PCI config access.

I don't think that's true.  If the people designing these machines had
noticed Windows failing to boot on them, they'd've fixed the BIOS to put
the MMCONFIG region elsewhere, not hacked Windows until it worked.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Matthew Wilcox wrote:


Bad deduction.  What's happening is that the write to the BAR is causing
it to overlap the decode for mmconfig space.  So the mmconfig write to
set the BAR back never gets through.

I have a different idea to fix this problem.  Instead of writing
0x, we could look for an unused bit of space in the E820 map and
write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
overlap, and we could find its size using MMCONFIG.


The BAR claims to be a 64-bit BAR.


Does anyone know how Windows handles these machines?  Obviously, if it's
using MMCONFIG, it'd have the same problems.  Does it just use type 1
for initial sizing?  Or does it use type 1 for all accesses below 256
bytes?


As far as I know, Windows has a blacklist that limits systems with these
devices to legacy PCI config access.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Greg KH wrote:


Why haven't we gotten reports about this before if this is a common
problem?

And why hasn't the vendor fixed the bios on these to work properly?


I can't really answer either of these questions. All I know is the
problem exists, and we need to deal with it.

That's why the unreachable_devices() routine exists in the first place.

But that routine is limited to only the first 16 buses on segment-0.
You would like to think that hardware designers would confine legacy
hardware, or problematic hardware, to that area, but they don't.

As I said before, expanding that routine to cover more buses would
adversely impact mmconfig pci access, since the mmconfig access
code does a lookup of that bitmap for every request. I don't think
we want that bitmap and the accompanying in-line lookup to increase
enough to encompass the pci configuration of some of these larger
systems.


Do you have a pointer to this blacklist anywhere so that everyone can
benifit from this knowledge?



Appended below is a code snippet embedded in the RHEL version of mmconfig.c,
for both i386 and x86_64. It does not encompass all the systems that have
(or will have) problems with mmconf. Only HP platforms are listed, but I
believe there are others.

The reason the HP platforms got caught is they put these devices beyond
bus 16, where they would have been trapped, and the problem would have
been avoided.

static int __devinit disable_mmconf(struct dmi_system_id *d)
{
pci_probe &= ~PCI_PROBE_MMCONF;
printk(KERN_INFO "%s detected: disabling PCI MMCONFIG\n", d->ident);
return 0;
}


/*
 * Systems which cannot use PCI MMCONFIG at this time...
 */
static struct dmi_system_id __devinitdata nommconf_dmi_table[] = {
{
.callback = disable_mmconf,
.ident = "HP xw9300 Workstation",
.matches = {
DMI_MATCH(DMI_PRODUCT_NAME, "HP xw9300 Workstation"),
},
},
{
.callback = disable_mmconf,
.ident = "HP xw9400 Workstation",
.matches = {
DMI_MATCH(DMI_PRODUCT_NAME, "HP xw9400 Workstation"),
},
},
{
.callback = disable_mmconf,
.ident = "ProLiant DL585 G2",
.matches = {
DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL585 G2"),
},
},
{
.callback = disable_mmconf,
.ident = "HP Compaq dc5700 Microtower",
.matches = {
DMI_MATCH(DMI_PRODUCT_NAME,
"HP Compaq dc5700 Microtower"),
},
},

{}
};


The one device we know about that throws exceptions is the 830M/MG
graphics chip. This chip passes the read-compare test, so the code
merrily advances to bus sizing. When the bus sizing code writes the
BAR at offset 0x18 in this device, the system hangs.


So it doesn't work at all, with or without this patch?  Does the vendor
know about this?

thanks,

greg k-h


I have talked to intel about this, but they haven't gotten back to me.

All I know at this point is that a mmconf write to the BAR at offset 0x18
of this device hangs the system.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 01:04:09PM -0500, Tony Camuso wrote:
> Sorry, that was the 82Q963/Q965 graphics controller, PCI ID 2992.
> 
> I can't remember why I thought PCI ID 2992 maps to 830M/MG. I think
> it was in some intel doc about the ICH8 or ICH9.
> 
> Nevertheless, I have an hp dc5700 microtower right here on the floor
> next to me, and it hangs when the bus sizing code does an mmconfig
> write to the BAR at offset 0x18 in this device (id 2992).

Oh, that's the same bug others (including me) have been complaining
about.

http://marc.info/?l=linux-kernel=118809338631160=2

> It hangs in exactly the same place every time.
> 
> I am surmising that the write to that BAR is causing a MCE.

Bad deduction.  What's happening is that the write to the BAR is causing
it to overlap the decode for mmconfig space.  So the mmconfig write to
set the BAR back never gets through.

I have a different idea to fix this problem.  Instead of writing
0x, we could look for an unused bit of space in the E820 map and
write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
overlap, and we could find its size using MMCONFIG.

Does anyone know how Windows handles these machines?  Obviously, if it's
using MMCONFIG, it'd have the same problems.  Does it just use type 1
for initial sizing?  Or does it use type 1 for all accesses below 256
bytes?

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Matthew Wilcox wrote:

On Thu, Dec 20, 2007 at 09:22:05AM -0800, Greg KH wrote:

I can't imagine there are too many machines with MMCONFIG and an i830
chipset.  The laptop I'm currently typing on is an i830 ... and its
BIOS is from 2002, well predating the specification of MMCONFIG. If I
didn't know better, I'd accuse Tony of lying about the existance of
such a machine ;-)


Sorry, that was the 82Q963/Q965 graphics controller, PCI ID 2992.

I can't remember why I thought PCI ID 2992 maps to 830M/MG. I think
it was in some intel doc about the ICH8 or ICH9.

Nevertheless, I have an hp dc5700 microtower right here on the floor
next to me, and it hangs when the bus sizing code does an mmconfig
write to the BAR at offset 0x18 in this device (id 2992).

It hangs in exactly the same place every time.

I am surmising that the write to that BAR is causing a MCE.

Here is the lspci output of the dc5700. The misbehaving device is
00:02.0

[EMAIL PROTECTED] ~]# lspci
00:00.0 Host bridge: Intel Corporation 82Q963/Q965 Memory Controller Hub
(rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82Q963/Q965
Integrated Graphics Controller (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
#2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio
Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 2 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#2 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
#1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface
Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801H (ICH8 Family) 4 port
SATA IDE Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801H (ICH8 Family) 2 port
SATA IDE Controller (rev 02)
3f:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755
Gigabit Ethernet PCI Express (rev 02)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 09:22:05AM -0800, Greg KH wrote:
> > The one device we know about that throws exceptions is the 830M/MG
> > graphics chip. This chip passes the read-compare test, so the code
> > merrily advances to bus sizing. When the bus sizing code writes the
> > BAR at offset 0x18 in this device, the system hangs.
> 
> So it doesn't work at all, with or without this patch?  Does the vendor
> know about this?

I can't imagine there are too many machines with MMCONFIG and an
i830 chipset.  The laptop I'm currently typing on is an i830 ... and
its BIOS is from 2002, well predating the specification of MMCONFIG.
If I didn't know better, I'd accuse Tony of lying about the existance
of such a machine ;-)

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Greg KH
On Thu, Dec 20, 2007 at 07:28:00AM -0500, Tony Camuso wrote:
>
>
>  Original Message 
> Subject: Re: [PATCH 0/5]PCI: x86 MMCONFIG
> Date: Wed, 19 Dec 2007 19:33:45 -0500
> From: Tony Camuso <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: Greg KH <[EMAIL PROTECTED]>
> References: 
> <[EMAIL PROTECTED]> 
> <[EMAIL PROTECTED]>
>
> Greg KH wrote:
>> On Wed, Dec 19, 2007 at 05:17:46PM -0500, [EMAIL PROTECTED] wrote:
>>> There exist devices that do not respond correctly to PCI
>>> MMCONFIG accesses in x86 platforms.
>> What devices are these?  Do you have reports of them somewhere?
> There are the AMD 8131 and 8132, the Serverworks HT1000 bridge chips
> and the 830M/MG graphics. Not all versions of these chips present
> this pathology, but there are perhaps tens of thousands of systems
> out there that have the broken versions of these chipsets.

Why haven't we gotten reports about this before if this is a common
problem?

And why hasn't the vendor fixed the bios on these to work properly?

> RedHat have been maintaining a blacklist of systems having these
> devices. Systems in the blacklist are confined to legacy PCI
> access.

Do you have a pointer to this blacklist anywhere so that everyone can
benifit from this knowledge?

>> That sounds like this patchset can cause bad side affects on hardware
>> that currently works just fine.  That is not a good thing to be adding
>> to the kernel, right?
> No, the patch set tries to obviate this without requiring endusers to
> write customized scripts with "pci=nommconf" and without requiring the
> RH folks to add another platform (usually belatedly) to the blacklist.
>
> If a device is going to machine check when you touch it with an mmconfig
> access, it will happen with or without this patch-set.
>
> However, the patch-set does cover most of the devices that don't respond
> well to mmconfig access. Such devices almost alway7s return garbage when
> you read from them.
>
> The one device we know about that throws exceptions is the 830M/MG
> graphics chip. This chip passes the read-compare test, so the code
> merrily advances to bus sizing. When the bus sizing code writes the
> BAR at offset 0x18 in this device, the system hangs.

So it doesn't work at all, with or without this patch?  Does the vendor
know about this?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso



 Original Message 
Subject: Re: [PATCH 0/5]PCI: x86 MMCONFIG
Date: Wed, 19 Dec 2007 19:33:45 -0500
From: Tony Camuso <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: Greg KH <[EMAIL PROTECTED]>
References: 
<[EMAIL PROTECTED]> 
<[EMAIL PROTECTED]>


Greg KH wrote:

On Wed, Dec 19, 2007 at 05:17:46PM -0500, [EMAIL PROTECTED] wrote:

There exist devices that do not respond correctly to PCI
MMCONFIG accesses in x86 platforms.


What devices are these?  Do you have reports of them somewhere?


There are the AMD 8131 and 8132, the Serverworks HT1000 bridge chips
and the 830M/MG graphics. Not all versions of these chips present
this pathology, but there are perhaps tens of thousands of systems
out there that have the broken versions of these chipsets.

RedHat have been maintaining a blacklist of systems having these
devices. Systems in the blacklist are confined to legacy PCI
access.

However, some of these systems, high volume ones at that, also have
PCI express buses in them. By limiting the whole platform to legacy
PCI access, we put PCI express extended config capabilities out of
reach. IIRC, the PCIe spec requires platforms to access extended
PCI config space.


That sounds like this patchset can cause bad side affects on hardware
that currently works just fine.  That is not a good thing to be adding
to the kernel, right?


No, the patch set tries to obviate this without requiring endusers to
write customized scripts with "pci=nommconf" and without requiring the
RH folks to add another platform (usually belatedly) to the blacklist.

If a device is going to machine check when you touch it with an mmconfig
access, it will happen with or without this patch-set.

However, the patch-set does cover most of the devices that don't respond
well to mmconfig access. Such devices almost alway7s return garbage when
you read from them.

The one device we know about that throws exceptions is the 830M/MG
graphics chip. This chip passes the read-compare test, so the code
merrily advances to bus sizing. When the bus sizing code writes the
BAR at offset 0x18 in this device, the system hangs.

I am thinking about a machine check handler that can catch these things,
or at least the exceptions. Aborts are not recoverable, according to
intel lore. However, the one device that we know croaks HARD seems to
throw an exception, since it happens in the same exact place every
time. I think we might be able to recover from that.

But that's in the future.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso



 Original Message 
Subject: Re: [PATCH 0/5]PCI: x86 MMCONFIG
Date: Wed, 19 Dec 2007 19:44:13 -0500
From: Tony Camuso <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: Robert Hancock <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]> 
<[EMAIL PROTECTED]> <[EMAIL PROTECTED]>


Robert Hancock wrote:

Greg KH wrote:

I think we need more details on why this patch is needed. Also, we 
already have something like this in arch/x86/pci/mmconfig-shared.c, in 
the unreachable_devices function. This attempts to detect devices where 
MMCONFIG cannot access the configuration space (one of these would be at 
least one device in the AMD K8 built-in northbridge). If this is not 
sufficient, I would suggest expanding that mechanism instead of adding 
all this new code.




We thought of doing that. But IIRC, unreachable_devices() only covers
the first 16 buses on segment 0. What happens to the bitmap when you
want to cover all possible buses on all possible segments?

We felt that we had to come up with a more in-line way of doing this.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso



 Original Message 
Subject: Re: [PATCH 0/5]PCI: x86 MMCONFIG
Date: Wed, 19 Dec 2007 19:33:45 -0500
From: Tony Camuso [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
To: Greg KH [EMAIL PROTECTED]
References: 
[EMAIL PROTECTED] 
[EMAIL PROTECTED]


Greg KH wrote:

On Wed, Dec 19, 2007 at 05:17:46PM -0500, [EMAIL PROTECTED] wrote:

There exist devices that do not respond correctly to PCI
MMCONFIG accesses in x86 platforms.


What devices are these?  Do you have reports of them somewhere?


There are the AMD 8131 and 8132, the Serverworks HT1000 bridge chips
and the 830M/MG graphics. Not all versions of these chips present
this pathology, but there are perhaps tens of thousands of systems
out there that have the broken versions of these chipsets.

RedHat have been maintaining a blacklist of systems having these
devices. Systems in the blacklist are confined to legacy PCI
access.

However, some of these systems, high volume ones at that, also have
PCI express buses in them. By limiting the whole platform to legacy
PCI access, we put PCI express extended config capabilities out of
reach. IIRC, the PCIe spec requires platforms to access extended
PCI config space.


That sounds like this patchset can cause bad side affects on hardware
that currently works just fine.  That is not a good thing to be adding
to the kernel, right?


No, the patch set tries to obviate this without requiring endusers to
write customized scripts with pci=nommconf and without requiring the
RH folks to add another platform (usually belatedly) to the blacklist.

If a device is going to machine check when you touch it with an mmconfig
access, it will happen with or without this patch-set.

However, the patch-set does cover most of the devices that don't respond
well to mmconfig access. Such devices almost alway7s return garbage when
you read from them.

The one device we know about that throws exceptions is the 830M/MG
graphics chip. This chip passes the read-compare test, so the code
merrily advances to bus sizing. When the bus sizing code writes the
BAR at offset 0x18 in this device, the system hangs.

I am thinking about a machine check handler that can catch these things,
or at least the exceptions. Aborts are not recoverable, according to
intel lore. However, the one device that we know croaks HARD seems to
throw an exception, since it happens in the same exact place every
time. I think we might be able to recover from that.

But that's in the future.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso



 Original Message 
Subject: Re: [PATCH 0/5]PCI: x86 MMCONFIG
Date: Wed, 19 Dec 2007 19:44:13 -0500
From: Tony Camuso [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
To: Robert Hancock [EMAIL PROTECTED]
References: [EMAIL PROTECTED] 
[EMAIL PROTECTED] [EMAIL PROTECTED]


Robert Hancock wrote:

Greg KH wrote:

I think we need more details on why this patch is needed. Also, we 
already have something like this in arch/x86/pci/mmconfig-shared.c, in 
the unreachable_devices function. This attempts to detect devices where 
MMCONFIG cannot access the configuration space (one of these would be at 
least one device in the AMD K8 built-in northbridge). If this is not 
sufficient, I would suggest expanding that mechanism instead of adding 
all this new code.




We thought of doing that. But IIRC, unreachable_devices() only covers
the first 16 buses on segment 0. What happens to the bitmap when you
want to cover all possible buses on all possible segments?

We felt that we had to come up with a more in-line way of doing this.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Greg KH
On Thu, Dec 20, 2007 at 07:28:00AM -0500, Tony Camuso wrote:


  Original Message 
 Subject: Re: [PATCH 0/5]PCI: x86 MMCONFIG
 Date: Wed, 19 Dec 2007 19:33:45 -0500
 From: Tony Camuso [EMAIL PROTECTED]
 Reply-To: [EMAIL PROTECTED]
 To: Greg KH [EMAIL PROTECTED]
 References: 
 [EMAIL PROTECTED] 
 [EMAIL PROTECTED]

 Greg KH wrote:
 On Wed, Dec 19, 2007 at 05:17:46PM -0500, [EMAIL PROTECTED] wrote:
 There exist devices that do not respond correctly to PCI
 MMCONFIG accesses in x86 platforms.
 What devices are these?  Do you have reports of them somewhere?
 There are the AMD 8131 and 8132, the Serverworks HT1000 bridge chips
 and the 830M/MG graphics. Not all versions of these chips present
 this pathology, but there are perhaps tens of thousands of systems
 out there that have the broken versions of these chipsets.

Why haven't we gotten reports about this before if this is a common
problem?

And why hasn't the vendor fixed the bios on these to work properly?

 RedHat have been maintaining a blacklist of systems having these
 devices. Systems in the blacklist are confined to legacy PCI
 access.

Do you have a pointer to this blacklist anywhere so that everyone can
benifit from this knowledge?

 That sounds like this patchset can cause bad side affects on hardware
 that currently works just fine.  That is not a good thing to be adding
 to the kernel, right?
 No, the patch set tries to obviate this without requiring endusers to
 write customized scripts with pci=nommconf and without requiring the
 RH folks to add another platform (usually belatedly) to the blacklist.

 If a device is going to machine check when you touch it with an mmconfig
 access, it will happen with or without this patch-set.

 However, the patch-set does cover most of the devices that don't respond
 well to mmconfig access. Such devices almost alway7s return garbage when
 you read from them.

 The one device we know about that throws exceptions is the 830M/MG
 graphics chip. This chip passes the read-compare test, so the code
 merrily advances to bus sizing. When the bus sizing code writes the
 BAR at offset 0x18 in this device, the system hangs.

So it doesn't work at all, with or without this patch?  Does the vendor
know about this?

thanks,

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 09:22:05AM -0800, Greg KH wrote:
  The one device we know about that throws exceptions is the 830M/MG
  graphics chip. This chip passes the read-compare test, so the code
  merrily advances to bus sizing. When the bus sizing code writes the
  BAR at offset 0x18 in this device, the system hangs.
 
 So it doesn't work at all, with or without this patch?  Does the vendor
 know about this?

I can't imagine there are too many machines with MMCONFIG and an
i830 chipset.  The laptop I'm currently typing on is an i830 ... and
its BIOS is from 2002, well predating the specification of MMCONFIG.
If I didn't know better, I'd accuse Tony of lying about the existance
of such a machine ;-)

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Matthew Wilcox wrote:

On Thu, Dec 20, 2007 at 09:22:05AM -0800, Greg KH wrote:

I can't imagine there are too many machines with MMCONFIG and an i830
chipset.  The laptop I'm currently typing on is an i830 ... and its
BIOS is from 2002, well predating the specification of MMCONFIG. If I
didn't know better, I'd accuse Tony of lying about the existance of
such a machine ;-)


Sorry, that was the 82Q963/Q965 graphics controller, PCI ID 2992.

I can't remember why I thought PCI ID 2992 maps to 830M/MG. I think
it was in some intel doc about the ICH8 or ICH9.

Nevertheless, I have an hp dc5700 microtower right here on the floor
next to me, and it hangs when the bus sizing code does an mmconfig
write to the BAR at offset 0x18 in this device (id 2992).

It hangs in exactly the same place every time.

I am surmising that the write to that BAR is causing a MCE.

Here is the lspci output of the dc5700. The misbehaving device is
00:02.0

[EMAIL PROTECTED] ~]# lspci
00:00.0 Host bridge: Intel Corporation 82Q963/Q965 Memory Controller Hub
(rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82Q963/Q965
Integrated Graphics Controller (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
#2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio
Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 2 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
#2 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
#1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface
Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801H (ICH8 Family) 4 port
SATA IDE Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801H (ICH8 Family) 2 port
SATA IDE Controller (rev 02)
3f:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755
Gigabit Ethernet PCI Express (rev 02)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 01:04:09PM -0500, Tony Camuso wrote:
 Sorry, that was the 82Q963/Q965 graphics controller, PCI ID 2992.
 
 I can't remember why I thought PCI ID 2992 maps to 830M/MG. I think
 it was in some intel doc about the ICH8 or ICH9.
 
 Nevertheless, I have an hp dc5700 microtower right here on the floor
 next to me, and it hangs when the bus sizing code does an mmconfig
 write to the BAR at offset 0x18 in this device (id 2992).

Oh, that's the same bug others (including me) have been complaining
about.

http://marc.info/?l=linux-kernelm=118809338631160w=2

 It hangs in exactly the same place every time.
 
 I am surmising that the write to that BAR is causing a MCE.

Bad deduction.  What's happening is that the write to the BAR is causing
it to overlap the decode for mmconfig space.  So the mmconfig write to
set the BAR back never gets through.

I have a different idea to fix this problem.  Instead of writing
0x, we could look for an unused bit of space in the E820 map and
write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
overlap, and we could find its size using MMCONFIG.

Does anyone know how Windows handles these machines?  Obviously, if it's
using MMCONFIG, it'd have the same problems.  Does it just use type 1
for initial sizing?  Or does it use type 1 for all accesses below 256
bytes?

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Greg KH wrote:


Why haven't we gotten reports about this before if this is a common
problem?

And why hasn't the vendor fixed the bios on these to work properly?


I can't really answer either of these questions. All I know is the
problem exists, and we need to deal with it.

That's why the unreachable_devices() routine exists in the first place.

But that routine is limited to only the first 16 buses on segment-0.
You would like to think that hardware designers would confine legacy
hardware, or problematic hardware, to that area, but they don't.

As I said before, expanding that routine to cover more buses would
adversely impact mmconfig pci access, since the mmconfig access
code does a lookup of that bitmap for every request. I don't think
we want that bitmap and the accompanying in-line lookup to increase
enough to encompass the pci configuration of some of these larger
systems.


Do you have a pointer to this blacklist anywhere so that everyone can
benifit from this knowledge?



Appended below is a code snippet embedded in the RHEL version of mmconfig.c,
for both i386 and x86_64. It does not encompass all the systems that have
(or will have) problems with mmconf. Only HP platforms are listed, but I
believe there are others.

The reason the HP platforms got caught is they put these devices beyond
bus 16, where they would have been trapped, and the problem would have
been avoided.

static int __devinit disable_mmconf(struct dmi_system_id *d)
{
pci_probe = ~PCI_PROBE_MMCONF;
printk(KERN_INFO %s detected: disabling PCI MMCONFIG\n, d-ident);
return 0;
}


/*
 * Systems which cannot use PCI MMCONFIG at this time...
 */
static struct dmi_system_id __devinitdata nommconf_dmi_table[] = {
{
.callback = disable_mmconf,
.ident = HP xw9300 Workstation,
.matches = {
DMI_MATCH(DMI_PRODUCT_NAME, HP xw9300 Workstation),
},
},
{
.callback = disable_mmconf,
.ident = HP xw9400 Workstation,
.matches = {
DMI_MATCH(DMI_PRODUCT_NAME, HP xw9400 Workstation),
},
},
{
.callback = disable_mmconf,
.ident = ProLiant DL585 G2,
.matches = {
DMI_MATCH(DMI_PRODUCT_NAME, ProLiant DL585 G2),
},
},
{
.callback = disable_mmconf,
.ident = HP Compaq dc5700 Microtower,
.matches = {
DMI_MATCH(DMI_PRODUCT_NAME,
HP Compaq dc5700 Microtower),
},
},

{}
};


The one device we know about that throws exceptions is the 830M/MG
graphics chip. This chip passes the read-compare test, so the code
merrily advances to bus sizing. When the bus sizing code writes the
BAR at offset 0x18 in this device, the system hangs.


So it doesn't work at all, with or without this patch?  Does the vendor
know about this?

thanks,

greg k-h


I have talked to intel about this, but they haven't gotten back to me.

All I know at this point is that a mmconf write to the BAR at offset 0x18
of this device hangs the system.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Matthew Wilcox wrote:


Bad deduction.  What's happening is that the write to the BAR is causing
it to overlap the decode for mmconfig space.  So the mmconfig write to
set the BAR back never gets through.

I have a different idea to fix this problem.  Instead of writing
0x, we could look for an unused bit of space in the E820 map and
write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
overlap, and we could find its size using MMCONFIG.


The BAR claims to be a 64-bit BAR.


Does anyone know how Windows handles these machines?  Obviously, if it's
using MMCONFIG, it'd have the same problems.  Does it just use type 1
for initial sizing?  Or does it use type 1 for all accesses below 256
bytes?


As far as I know, Windows has a blacklist that limits systems with these
devices to legacy PCI config access.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 01:30:16PM -0500, Tony Camuso wrote:
 Matthew Wilcox wrote:
 
 Bad deduction.  What's happening is that the write to the BAR is causing
 it to overlap the decode for mmconfig space.  So the mmconfig write to
 set the BAR back never gets through.
 
 I have a different idea to fix this problem.  Instead of writing
 0x, we could look for an unused bit of space in the E820 map and
 write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
 overlap, and we could find its size using MMCONFIG.
 
 The BAR claims to be a 64-bit BAR.

And what's in the upper 32 bits by default?  I'm guessing all zeroes.
We size 64-bit BARs by writing to each half individually.  So I bet this
one still overlaps the MMCONFIG space at 0xf000.

 As far as I know, Windows has a blacklist that limits systems with these
 devices to legacy PCI config access.

I don't think that's true.  If the people designing these machines had
noticed Windows failing to boot on them, they'd've fixed the BIOS to put
the MMCONFIG region elsewhere, not hacked Windows until it worked.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli

On 12/20/2007 1:16 PM, Matthew Wilcox wrote:

 Bad deduction.  What's happening is that the write to the BAR is causing
 it to overlap the decode for mmconfig space.  So the mmconfig write to
 set the BAR back never gets through.

 I have a different idea to fix this problem.  Instead of writing
 0x, we could look for an unused bit of space in the E820 map and
 write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
 overlap, and we could find its size using MMCONFIG.

 Does anyone know how Windows handles these machines?  Obviously, if it's
 using MMCONFIG, it'd have the same problems.  Does it just use type 1
 for initial sizing?  Or does it use type 1 for all accesses below 256
 bytes?
   



Always using type 1 for accesses below 256 bytes looks like a very very
attractive solution

I know we had a lot of older kernels over the last two years that we
patched like that (we needed MMCONFIG for our own device development
purposes, but we also needed our machines to boot and discover all
devices reliably). Recent kernels works fine out of the box on all
hardware we have, but all this sometimes tricky and apparently endless
work (in big part because of buggy BIOSes) about MMCONFIG would probably
become relatively easy by limiting the aim to have MMCONFIG work when it
is required (for cfg-space accesses = 256).


Loic

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Loic Prylli wrote:


Always using type 1 for accesses below 256 bytes looks like a very very
attractive solution

I know we had a lot of older kernels over the last two years that we
patched like that (we needed MMCONFIG for our own device development
purposes, but we also needed our machines to boot and discover all
devices reliably). Recent kernels works fine out of the box on all
hardware we have, but all this sometimes tricky and apparently endless
work (in big part because of buggy BIOSes) about MMCONFIG would probably
become relatively easy by limiting the aim to have MMCONFIG work when it
is required (for cfg-space accesses = 256).


Loic


Hmmm... I think I like this solution.

It may be easier to implement than the solution I posted.

Also, this solution also would allow us to remove the unreachable_devices()
routine and bitmap.

Does anybody see a down side to this?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 02:04:31PM -0500, Tony Camuso wrote:
 Also, this solution also would allow us to remove the unreachable_devices()
 routine and bitmap.

Not really ... we probe reading address 0x100 to see if the device
supports extended config space or not.  So we need to make that fail
gracefully for the amd7111 case.

 Does anybody see a down side to this?

It'll be slower than it would be if we used mmconfig directly.  Now yes,
nobody should be using pci config space in performance critical paths
... but see the tg3 driver.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Matthew Wilcox wrote:


Bad deduction.  What's happening is that the write to the BAR is causing
it to overlap the decode for mmconfig space.  So the mmconfig write to
set the BAR back never gets through.

I have a different idea to fix this problem.  Instead of writing
0x, we could look for an unused bit of space in the E820 map and
write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
overlap, and we could find its size using MMCONFIG.


Let me see if I understand this correctly.

Writing this BAR with 0x causes it to decode all further mmconfig
references based at addr 0xfxxx as its own?

If I've got that right, then why don't any of the other BARs do that? Is
it because this one's a 64-bit BAR?

AFIK, there are no devices out there that require 32-bits of address
space, so using 0xdfff in the low register would certainly work.

However, the PCI spec says that we should be able to write 0x
to the BAR, buggy BIOS and hardware notwithstanding.

Does anybody see that change as being within the purview of the patch-set
I am proposing? Or is that another patch for another time?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Ivan Kokshaysky
On Thu, Dec 20, 2007 at 12:08:33PM -0700, Matthew Wilcox wrote:
 On Thu, Dec 20, 2007 at 02:04:31PM -0500, Tony Camuso wrote:
  Does anybody see a down side to this?
 
 It'll be slower than it would be if we used mmconfig directly.  Now yes,
 nobody should be using pci config space in performance critical paths
 ... but see the tg3 driver.

Use type 1 just for the first 64 bytes and tg3 will be happy. All we need
is to avoid touching BARs with mmconfig.

Ivan.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 02:37:49PM -0500, Tony Camuso wrote:
 Matthew Wilcox wrote:
 
 Bad deduction.  What's happening is that the write to the BAR is causing
 it to overlap the decode for mmconfig space.  So the mmconfig write to
 set the BAR back never gets through.
 
 I have a different idea to fix this problem.  Instead of writing
 0x, we could look for an unused bit of space in the E820 map and
 write, say, 0xdfff to the low 32-bits of a BAR.  Then it wouldn't
 overlap, and we could find its size using MMCONFIG.
 
 Let me see if I understand this correctly.
 
 Writing this BAR with 0x causes it to decode all further mmconfig
 references based at addr 0xfxxx as its own?
 
 If I've got that right, then why don't any of the other BARs do that? Is
 it because this one's a 64-bit BAR?

Here's how BARs work ... when you write 0x to the BAR, it
ignores all the set bits that are less than the size of the BAR.  So,
assuming this is a 256MB BAR (like my G33 is), what ends up written to
this BAR is 0xf000.  Now, because this is graphics, apparently it's
special and embedded in the chipset, even though it looks like it's a
PCI device.  So it actually gets priority over MMCONFIG which is also
mapped to 0xf000.

For your case of a 64-bit BAR, you could write 0x to the high
32-bits first, then write to the low 32-bits, then reset the low, then
high bits, and you'd avoid the problem.  But the G33 has a 32-bit BAR
with the same problem, so it won't work for that case.

BARs that are behind bridges don't have this problem (they can't decode
memory accesses that aren't forwarded to them).  BARs on devices which
have memory IO disabled also don't have theis problem, but disabling
devices has its problems (as does probing BARs for active devices anyway
...).

 AFIK, there are no devices out there that require 32-bits of address
 space, so using 0xdfff in the low register would certainly work.

The question is how large can 32-bit BARs get.  As we've seen, 256MB
exist, and are causing pain.  I can't imagine any PCI device
manufacturer thinks they can allocate 2GB of the low space, but we could
potentially mis-size a large BAR by not using 0x.

 Does anybody see that change as being within the purview of the patch-set
 I am proposing? Or is that another patch for another time?

I'm really not clear on the purpose of your patchset.  Was it all to
address this one problem?

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 2:08 PM, Matthew Wilcox wrote:
 On Thu, Dec 20, 2007 at 02:04:31PM -0500, Tony Camuso wrote:
   
 Also, this solution also would allow us to remove the unreachable_devices()
 routine and bitmap.
 

 Not really ... we probe reading address 0x100 to see if the device
 supports extended config space or not.  So we need to make that fail
 gracefully for the amd7111 case.
   



pci_cfg_space_size()  is only done for PCI-express  or PCI-X mode 2
devices, so you still have eliminated the bulk of the problems which are
typically handling the legacy busses on modern machines. I don't know
what is the amd7111.



   
 Does anybody see a down side to this?
 

 It'll be slower than it would be if we used mmconfig directly.  Now yes,
 nobody should be using pci config space in performance critical paths
 ... but see the tg3 driver.
   


I am not familiar with the tg3 driver, just trying to give a 5 minutes
look, it seems the typical cases where the pci-conf-space is used
intensively are with some rev in combination with the 82801
(TG3_FLG2_ICH_WORKAROUND) which I don't think support mmconfig anyway,
as well as some very specific PCI-X combinations
(TG3_FLAG_PCIX_TARGET_HWBUG) which are also very unlikely to support
mmconfig.

Even if I am wrong for the tg3, I don't really think mmconfig vs type1
could make a noticeable performance on any common systems (obscure
systems or hardware where it could potentially have a performance impact
could  use a non-default configuration).


Loic




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Matthew Wilcox wrote:


Here's how BARs work ... when you write 0x to the BAR, it
ignores all the set bits that are less than the size of the BAR.  So,
assuming this is a 256MB BAR (like my G33 is), what ends up written to
this BAR is 0xf000.  Now, because this is graphics, apparently it's
special and embedded in the chipset, even though it looks like it's a
PCI device.  So it actually gets priority over MMCONFIG which is also
mapped to 0xf000.

For your case of a 64-bit BAR, you could write 0x to the high
32-bits first, then write to the low 32-bits, then reset the low, then
high bits, and you'd avoid the problem.  But the G33 has a 32-bit BAR
with the same problem, so it won't work for that case.

BARs that are behind bridges don't have this problem (they can't decode
memory accesses that aren't forwarded to them).  BARs on devices which
have memory IO disabled also don't have theis problem, but disabling
devices has its problems (as does probing BARs for active devices anyway
...).


Thanks for the detailed explanation.

 The question is how large can 32-bit BARs get.  As we've seen, 256MB
 exist, and are causing pain.  I can't imagine any PCI device
 manufacturer thinks they can allocate 2GB of the low space, but we could
 potentially mis-size a large BAR by not using 0x.


Point well taken. Graphics devices understandably consume a lot of memory
space, and are likely to consume even more in the not-too-distant future.


I'm really not clear on the purpose of your patchset.  Was it all to
address this one problem?



No. My patch-set does not address this problem at all, but rather the
larger problem of having mmconfig-unfriendly devices on buses that are
out of reach of the unreachable_devices() routine and bitmap.

This problem is one I encountered during my testing and mentioned in
my preamble as not being fixable by my patch-set.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 03:05:52PM -0500, Loic Prylli wrote:
 I am not familiar with the tg3 driver, just trying to give a 5 minutes
 look, it seems the typical cases where the pci-conf-space is used
 intensively are with some rev in combination with the 82801
 (TG3_FLG2_ICH_WORKAROUND) which I don't think support mmconfig anyway,
 as well as some very specific PCI-X combinations
 (TG3_FLAG_PCIX_TARGET_HWBUG) which are also very unlikely to support
 mmconfig.

It's not a question of whether the card supports mmconfig or not -- the
card can't tell whether a first-256-byte pci config transaction was
initiated through mmconfig, type1, type2, or even a bios call.

I'm just hacking together an implementation based on Ivan's suggestion
to always use type 1 for the first 64 bytes.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Ivan Kokshaysky wrote:

On Thu, Dec 20, 2007 at 12:08:33PM -0700, Matthew Wilcox wrote:

On Thu, Dec 20, 2007 at 02:04:31PM -0500, Tony Camuso wrote:

Does anybody see a down side to this?

It'll be slower than it would be if we used mmconfig directly.  Now yes,
nobody should be using pci config space in performance critical paths
... but see the tg3 driver.


Use type 1 just for the first 64 bytes and tg3 will be happy. All we need
is to avoid touching BARs with mmconfig.

Ivan.


Sounds good to me.

However, this does add another in-line test for every pci config access.

The existing test is a lookup in the unreachable_devices bitmap. Even though
the bitmap only covers the first 16 buses, the lookup is performed for every
pci config access.

We would be adding another test,

  if the offset is less than 64 bytes, use legacy pci config
  else use mmconfig

The emitted code would probably only produce one branch, so it shouldn't
present a performance degradation.

Any objections to taking this tact?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Ivan Kokshaysky wrote:


Use type 1 just for the first 64 bytes and tg3 will be happy. All we need
is to avoid touching BARs with mmconfig.

Ivan.


Re-thinking this ...

The problem I see with this approach is that it does not help devices
that are mmconfig-unfriendly and will not work above the 64-byte offset.

So you will be able to configure the BARS of such a device, but not be
able to do much of anything else.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 3:15 PM, Matthew Wilcox wrote:
 On Thu, Dec 20, 2007 at 03:05:52PM -0500, Loic Prylli wrote:
   
 I am not familiar with the tg3 driver, just trying to give a 5 minutes
 look, it seems the typical cases where the pci-conf-space is used
 intensively are with some rev in combination with the 82801
 (TG3_FLG2_ICH_WORKAROUND) which I don't think support mmconfig anyway,
 as well as some very specific PCI-X combinations
 (TG3_FLAG_PCIX_TARGET_HWBUG) which are also very unlikely to support
 mmconfig.
 

 It's not a question of whether the card supports mmconfig or not -- the
 card can't tell whether a first-256-byte pci config transaction was initiated 
 through mmconfig, type1, type2, or even a bios call.
   





I know the final device is not aware on how the config request was
originated. I am just saying platforms built around the Intel 82801
chipset (ICH2) don't support mmconfig at all. I would also not be
surprised if the platforms where tg3 needs TG3_FLAG_PCIX_TARGET_HWBUG
might also not support mmconfig (but for this second case, it's only
speculation based on seeing a couple posts about
TG3_FLAG_PCIX_TARGET_HWBUG where amd hypertransport/PCI-X chipsets where
mentioned). If you know of a platform that support mmconfig, and where
the tg3 does need to use relatively intensively pci-conf-space, I'll be
happy to be corrected.




 I'm just hacking together an implementation based on Ivan's suggestion
 to always use type 1 for the first 64 bytes.

   



Whatever you think it's best. If you are only trying to address a
BAR-sizing/mmconfig temporary conflict, and think the MCFG BIOS or
other mmconfig problems are under control, that makes sense. If the need
arise for somebody, an option to select between 64 to 256 can be added
later.



Loic


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Matthew Wilcox
On Thu, Dec 20, 2007 at 03:56:29PM -0500, Loic Prylli wrote:
 I know the final device is not aware on how the config request was
 originated. I am just saying platforms built around the Intel 82801
 chipset (ICH2) don't support mmconfig at all. I would also not be
 surprised if the platforms where tg3 needs TG3_FLAG_PCIX_TARGET_HWBUG
 might also not support mmconfig (but for this second case, it's only
 speculation based on seeing a couple posts about
 TG3_FLAG_PCIX_TARGET_HWBUG where amd hypertransport/PCI-X chipsets where
 mentioned). If you know of a platform that support mmconfig, and where
 the tg3 does need to use relatively intensively pci-conf-space, I'll be
 happy to be corrected.

tg3 is available as an add-in pci card.  i have one.  i can plug it into
a machine that does support mmconfig.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 4:00 PM, Matthew Wilcox wrote:
 On Thu, Dec 20, 2007 at 03:56:29PM -0500, Loic Prylli wrote:
   
 I know the final device is not aware on how the config request was
 originated. I am just saying platforms built around the Intel 82801
 chipset (ICH2) don't support mmconfig at all. I would also not be
 surprised if the platforms where tg3 needs TG3_FLAG_PCIX_TARGET_HWBUG
 might also not support mmconfig (but for this second case, it's only
 speculation based on seeing a couple posts about
 TG3_FLAG_PCIX_TARGET_HWBUG where amd hypertransport/PCI-X chipsets where
 mentioned). If you know of a platform that support mmconfig, and where
 the tg3 does need to use relatively intensively pci-conf-space, I'll be
 happy to be corrected.
 

 tg3 is available as an add-in pci card.  i have one.  i can plug it into
 a machine that does support mmconfig.
   



That doesn't tell for sure your NIC has the specific rev that would
cause the hwbug workaround to be used. But let's assume so, your
combination would still works correctly (maybe a slightly non-optimal
network performance). A non-default mmconf=always option can help
maximize again the performance.

My starting point was that more systems could be supported out-of-the
box (vs not working at all) by using type 1 more widely,. This would
not break anybody.  And this would not affect performance except on what
I initially called obscure hardware or systems.

I already acknowledged in the previous email the possibility of a
performance impact, but it is still not clear to me whether it would be
widespread. The example you  mention is a buggy revision of a chip that
was designed to use memory-mapped IO, and has to use config-space
instead because the memory-mapped IO implementation is buggy (in most
cases only when used in combination with certain chipsets). How much
influence should a modest performance impact there influence the
decision-making?


Usually it's better to trade some performance for stability than the
reverse, and you loose the performance anyway when some many people
start using pci=nommconf by default on all their installs.


Loic

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Greg KH
On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
 Appended below is a code snippet embedded in the RHEL version of 
 mmconfig.c,
 for both i386 and x86_64. It does not encompass all the systems that have
 (or will have) problems with mmconf. Only HP platforms are listed, but I
 believe there are others.

 The reason the HP platforms got caught is they put these devices beyond
 bus 16, where they would have been trapped, and the problem would have
 been avoided.

Any reason why these changes were never submitted to the upstream kernel
versions?  Or do you all just want to keep patching your newer releases
with this information forever?  :)

Please send these kinds of changes upstream...

 The one device we know about that throws exceptions is the 830M/MG
 graphics chip. This chip passes the read-compare test, so the code
 merrily advances to bus sizing. When the bus sizing code writes the
 BAR at offset 0x18 in this device, the system hangs.
 So it doesn't work at all, with or without this patch?  Does the vendor
 know about this?
 thanks,
 greg k-h

 I have talked to intel about this, but they haven't gotten back to me.

Try poking them harder, they are usually very responsive to Linux kernel
issues.

thanks,

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Greg KH wrote:

On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
Any reason why these changes were never submitted to the upstream kernel
versions?  Or do you all just want to keep patching your newer releases
with this information forever?  :)



I really don't know why these changes were never submitted to the
upstream kernel versions.

I was brought on the scene about six months ago as HP's on-site engineer
at RH, and this was one of the things they wanted me to do.

We wanted a solution that was more generic and could manage this
problem preemptively, rather than using blacklists. Maintenance of
blacklists is a bother and almost always done after a new system
with this problem is discovered.

Furthermore, blacklisting whole platforms to use legacy pci config
penalizes any mmconfig-friendly buses in those platforms, particularly
the pci express buses, and causes such platforms to be non-compliant
with the pci expres spec.

 Please send these kinds of changes upstream...

As you wish.
:)


I have talked to intel about this, but they haven't gotten back to me.


Try poking them harder, they are usually very responsive to Linux kernel
issues.

thanks,

greg k-h


Matthew Wilcox explained the problem to me. The hang that cannot be fixed
by my patch isn't the chipset's fault. The graphics device has a 64-bit
BAR asking for 256 MB of IO.

If I understand his explanation correctly, when the bus sizing code
writes the low dword of this BAR with 0x, the chip is then
programmed to claim every MMIO reference betweeen 0x.f000 and
0x0001.

It just so happens that MMCONFIG space for the dc5700 (awa some others) is
mapped by its BIOS into that very region, so all MMCONFIG references are now
swallowed up by the graphics chip. At that point, no further boot progress can
be made.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Greg KH
On Thu, Dec 20, 2007 at 05:36:43PM -0500, Tony Camuso wrote:
 Greg KH wrote:
 On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
 Any reason why these changes were never submitted to the upstream kernel
 versions?  Or do you all just want to keep patching your newer releases
 with this information forever?  :)

 I really don't know why these changes were never submitted to the
 upstream kernel versions.

 I was brought on the scene about six months ago as HP's on-site engineer
 at RH, and this was one of the things they wanted me to do.

 We wanted a solution that was more generic and could manage this
 problem preemptively, rather than using blacklists. Maintenance of
 blacklists is a bother and almost always done after a new system
 with this problem is discovered.

 Furthermore, blacklisting whole platforms to use legacy pci config
 penalizes any mmconfig-friendly buses in those platforms, particularly
 the pci express buses, and causes such platforms to be non-compliant
 with the pci expres spec.

Sure, I realize this, but it solves the problem in one way for broken
hardware, such that it at least allows it to work, right?  It also
provides a better incentive for the manufacturer to fix their bios,
which as you are on-site at HP, it would seem odd that they would just
not do that instead of trying to work around this in the kernel...

thanks,

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Greg KH wrote:


Sure, I realize this, but it solves the problem in one way for broken
hardware, such that it at least allows it to work, right?  It also
provides a better incentive for the manufacturer to fix their bios,
which as you are on-site at HP, it would seem odd that they would just
not do that instead of trying to work around this in the kernel...

thanks,

greg k-h


I don't think that many OEMs have that much control over the BIOS in
their value lines.
:)

And the MMCONFIG problem with enterprise systems and workstations, where
we do control the BIOS (for the most part), is due to known bugs in
certain versions of certain chipsets, HT1000, AMD8132, among them, not
the BIOS.

Anyway, we are devising better ways to deal with these anomalies
than blacklists and telling customers to use pci=nommconf

And we're bringing them to the community for discussion, improvement,
and, we hope, acceptance.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Robert Hancock

Tony Camuso wrote:

Greg KH wrote:


Sure, I realize this, but it solves the problem in one way for broken
hardware, such that it at least allows it to work, right?  It also
provides a better incentive for the manufacturer to fix their bios,
which as you are on-site at HP, it would seem odd that they would just
not do that instead of trying to work around this in the kernel...

thanks,

greg k-h


I don't think that many OEMs have that much control over the BIOS in
their value lines.
:)

And the MMCONFIG problem with enterprise systems and workstations, where
we do control the BIOS (for the most part), is due to known bugs in
certain versions of certain chipsets, HT1000, AMD8132, among them, not
the BIOS.

Anyway, we are devising better ways to deal with these anomalies
than blacklists and telling customers to use pci=nommconf

And we're bringing them to the community for discussion, improvement,
and, we hope, acceptance.


First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there is 
no other way to get around this problem than disabling MMCONFIG for 
accesses behind those chips.


The case of the device built into the K8 northbridge that's unreachable 
by MMCONFIG kind of makes sense, since the northbridge is what's 
translating the MMCONFIG memory access into config accesses. It seems 
bizarre to me that a bridge chip could possibly have such a problem. The 
MMCONFIG access should get translated into a configuration space access 
in the northbridge and from that point on there's no difference between 
an MMCONFIG and type1 access.


Look at MSI for another example, we recently had a patch from NVIDIA 
posted to enable Hypertransport MSI mapping bits on some chipsets so 
that MSI would function, because the BIOS failed to set them up 
properly. Are we sure there's not a similar BIOS configuration issue 
that could ideally be fixed in the BIOS, or else fixed up in the kernel?


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there is 
no other way to get around this problem than disabling MMCONFIG for 
accesses behind those chips.




I happen to have this one stored in my desktop.

From AMD-8132TM HyperTransportTM
 PCI-X®2.0 Tunnel
  Revision Guide

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf

79 AMD-8132TM Tunnel Lacks Extended Configuration
Space Memory-Mapped I/O Base Address Register

Description

Current AMD processors do not natively support PCI-defined extended 
configuration space. A memory
mapped I/O base address register (MMIO BAR) is required in chipset devices to 
support extended
configuration space. The AMD-8132 does not have this MMIO BAR.
Potential Effect On System

The AMD-8132 is a PCI-X® Mode 2 capable device and requires the MMIO BAR to 
support extended
configuration space. Using a device which does have this MMIO BAR and an 
AMD-8132 on the same
HyperTransportTM link of the processor may cause firmware/software problems.

The base configuration space of the AMD-8132 and PCI(-X) devices attached to it 
are accessible using only
the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices attached to 
the AMD-8132 in the
extended configuration space are not accessible. The AMD-8132 has no registers 
in the extended
configuration space.

Suggested Workaround

It is strongly recommended that system designers do not connect the AMD-8132 
and devices that use extended
configuration space MMIO BARs (ex: HyperTransport-to-PCI Express® bridges) to 
the same processor
HyperTransport link.

Fix Planned
No

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there 
is no other way to get around this problem than disabling MMCONFIG for 
accesses behind those chips.





And here are the excerpts from that page of the spec which are salient
to the present discussion:

--

The base configuration space of the AMD-8132 and PCI(-X) devices attached to it 
are accessible using only
the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices attached to 
the AMD-8132 in the
extended configuration space are not accessible. The AMD-8132 has no registers 
in the extended
configuration space.

Fix Planned
No


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli

On 12/20/2007 6:21 PM, Tony Camuso wrote:

 And the MMCONFIG problem with enterprise systems and workstations, where
 we do control the BIOS (for the most part), is due to known bugs in
 certain versions of certain chipsets, HT1000, AMD8132, among them, not
 the BIOS.



The lack of MMCONFIG support is indeed because some hypertransport
chipsets lack that support. But there are some BIOSes out there that are
advertising support for all busses in their MCFG acpi attribute (even
the busses managed by some amd8131 in a mixed nvidia-ck804/amd8131
motherboard), and the BIOS seems at least faulty for advertising a
capability that does not exist.


Loic

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

The case of the device built into the K8 northbridge that's unreachable 
by MMCONFIG kind of makes sense, since the northbridge is what's 
translating the MMCONFIG memory access into config accesses. It seems 
bizarre to me that a bridge chip could possibly have such a problem. The 
MMCONFIG access should get translated into a configuration space access 
in the northbridge and from that point on there's no difference between 
an MMCONFIG and type1 access.



Robert's point is well taken.

Only northbridge chips can give us this kind of trouble, and the only
chips mentioned in the present discussion as not being mmconf-compliant
are northbridges (8132, ht1000).

The patch is aware of this, so once a root bus has been programmed for
legacy pci config access, all descendent buses automatically inherit
this access mechanism and are therefore not probed by the patch.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Robert Hancock

Tony Camuso wrote:

Robert Hancock wrote:

First off, I would like to see confirmation from the horses's mouths 
here (namely AMD, ServerWorks/Broadcom, and whoever else) that there 
is no other way to get around this problem than disabling MMCONFIG for 
accesses behind those chips.




I happen to have this one stored in my desktop.

 From AMD-8132TM HyperTransportTM
 PCI-X®2.0 Tunnel
  Revision Guide

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf 



79 AMD-8132TM Tunnel Lacks Extended Configuration
Space Memory-Mapped I/O Base Address Register

Description

Current AMD processors do not natively support PCI-defined extended 
configuration space. A memory
mapped I/O base address register (MMIO BAR) is required in chipset 
devices to support extended

configuration space. The AMD-8132 does not have this MMIO BAR.
Potential Effect On System

The AMD-8132 is a PCI-X® Mode 2 capable device and requires the MMIO BAR 
to support extended
configuration space. Using a device which does have this MMIO BAR and an 
AMD-8132 on the same
HyperTransportTM link of the processor may cause firmware/software 
problems.


The base configuration space of the AMD-8132 and PCI(-X) devices 
attached to it are accessible using only
the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices 
attached to the AMD-8132 in the
extended configuration space are not accessible. The AMD-8132 has no 
registers in the extended

configuration space.

Suggested Workaround

It is strongly recommended that system designers do not connect the 
AMD-8132 and devices that use extended
configuration space MMIO BARs (ex: HyperTransport-to-PCI Express® 
bridges) to the same processor

HyperTransport link.

Fix Planned
No


That does sound fairly definitive. I have to wonder why certain system 
designers then didn't follow their strong recommendation..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Robert Hancock wrote:

I have to wonder why certain system 
designers then didn't follow their strong recommendation..



I don't think I want to go there.

I used to be a hardware/firmware guy.
:D :D
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Loic Prylli
On 12/20/2007 9:15 PM, Robert Hancock wrote:

 Suggested Workaround

 It is strongly recommended that system designers do not connect the
 AMD-8132 and devices that use extended
 configuration space MMIO BARs (ex: HyperTransport-to-PCI Express®
 bridges) to the same processor
 HyperTransport link.

 Fix Planned
 No

 That does sound fairly definitive. I have to wonder why certain system
 designers then didn't follow their strong recommendation..



Just curious, do you know of any system where that recommendation was
not followed? On all motherboards where I have seen a AMD-8131 or a
AMD-8132, they were alone on their hypertransport link, and other
northbridges (more precisely hypertransport to pci-express or
pci-whatever, often nvidia) with a MMCONFIG BAR where on one of the
other available hypertransport links in the system.


Loic

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]

2007-12-20 Thread Tony Camuso

Loic Prylli wrote:


Just curious, do you know of any system where that recommendation was
not followed? On all motherboards where I have seen a AMD-8131 or a
AMD-8132, they were alone on their hypertransport link, and other
northbridges (more precisely hypertransport to pci-express or
pci-whatever, often nvidia) with a MMCONFIG BAR where on one of the
other available hypertransport links in the system.


Loic



Here is the PCI configuration of the HP DL585G2.

You can see two nVidia CK804 PCIE root ports at bus 0 and bus 0x40.
Each of them has an 8132 connected as a subordinate bridge.

[EMAIL PROTECTED] ~]# lspci -vt
-+-[:40]-+-00.0  nVidia Corporation CK804 Memory Controller
 |   +-01.0  nVidia Corporation CK804 Memory Controller
 |   +-0b.0-[:4f-51]--
 |   +-0c.0-[:4c-4e]--
 |   +-0d.0-[:49-4b]--
 |   +-0e.0-[:46-48]--
 |   +-10.0-[:41]--+-01.0  Broadcom Corporation NetXtreme II 
BCM5706 Gigabit Ethernet
 |   | \-02.0  Broadcom Corporation NetXtreme II 
BCM5706 Gigabit Ethernet
 |   +-10.1  Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
 |   +-11.0-[:42-45]--
 |   \-11.1  Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
 \-[:00]-+-00.0  nVidia Corporation CK804 Memory Controller
 +-01.0  nVidia Corporation CK804 ISA Bridge
 +-02.0  nVidia Corporation CK804 USB Controller
 +-02.1  nVidia Corporation CK804 USB Controller
 +-06.0  nVidia Corporation CK804 IDE
 +-09.0-[:01]--+-03.0  ATI Technologies Inc ES1000
 | +-04.0  Compaq Computer Corporation Integrated 
Lights Out Controller
 | +-04.2  Compaq Computer Corporation Integrated 
Lights Out  Processor
 | +-04.4  Hewlett-Packard Company Proliant iLO2 
virtual USB controller
 | \-04.6  Hewlett-Packard Company Proliant iLO2 
virtual UART
 +-0c.0-[:08-0a]00.0  Hewlett-Packard Company Smart Array 
Controller
 +-0d.0-[:05-07]--
 +-0e.0-[:02-04]--
 +-18.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
 +-18.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address 
Map
 +-18.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
 +-18.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
 +-19.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
 +-19.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address 
Map
 +-19.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
 +-19.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
 +-1a.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
 +-1a.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address 
Map
 +-1a.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
 +-1a.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
 +-1b.0  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
 +-1b.1  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address 
Map
 +-1b.2  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
 \-1b.3  Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
[EMAIL PROTECTED] ~]#













--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/