Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-11-08 Thread Martin Mares
Hello!

> > Maybe we should do all bus scanning with conf1 and then try if
> > MMCONFIG returns the same values?
> 
> that is already in the code today but not nearly enough; there's a ton
> of cases where it's "touch mmconfig and the box is dead"...

Argh.  However I still do not think it is an acceptable reason for breaking
all userspace access to the extended config space.

At the very least, there should be a sysfs toggle to enable MMCONFIG if
root really wishes that (maybe for a particular device, as the drivers
would do).

Have a nice fortnight
-- 
Martin `MJ' Mares  <[EMAIL PROTECTED]>   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Do not believe in miracles -- rely on them.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-11-08 Thread Martin Mares
Hello!

  Maybe we should do all bus scanning with conf1 and then try if
  MMCONFIG returns the same values?
 
 that is already in the code today but not nearly enough; there's a ton
 of cases where it's touch mmconfig and the box is dead...

Argh.  However I still do not think it is an acceptable reason for breaking
all userspace access to the extended config space.

At the very least, there should be a sysfs toggle to enable MMCONFIG if
root really wishes that (maybe for a particular device, as the drivers
would do).

Have a nice fortnight
-- 
Martin `MJ' Mares  [EMAIL PROTECTED]   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Do not believe in miracles -- rely on them.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-11-01 Thread Arjan van de Ven
On Thu, 1 Nov 2007 09:31:40 +0100
Martin Mares <[EMAIL PROTECTED]> wrote:

> Hello!
> 
> > something like
> > int pci_enable_mmconfig(struct pci_dev *pdev) ?
> > sounds like a very solid plan to me...
> 
> Please remember that the driver is not the sole user of the PCI config
> space -- user-space programs (e.g., lspci) can access it via sysfs,
> too. Should we force users of such programs to add a magic kernel
> parameter to enable MMCONFIG? Does not make much sense.
> 
> Maybe we should do all bus scanning with conf1 and then try if
> MMCONFIG returns the same values?

that is already in the code today but not nearly enough; there's a ton
of cases where it's "touch mmconfig and the box is dead"...
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-11-01 Thread Martin Mares
Hello!

> something like
> int pci_enable_mmconfig(struct pci_dev *pdev) ?
> sounds like a very solid plan to me...

Please remember that the driver is not the sole user of the PCI config
space -- user-space programs (e.g., lspci) can access it via sysfs, too.
Should we force users of such programs to add a magic kernel parameter
to enable MMCONFIG? Does not make much sense.

Maybe we should do all bus scanning with conf1 and then try if MMCONFIG
returns the same values?

Have a nice fortnight
-- 
Martin `MJ' Mares  <[EMAIL PROTECTED]>   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
"Beware of bugs in the above code; I have only proved it correct, not tried 
it." -- D.E.K.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-11-01 Thread Martin Mares
Hello!

 something like
 int pci_enable_mmconfig(struct pci_dev *pdev) ?
 sounds like a very solid plan to me...

Please remember that the driver is not the sole user of the PCI config
space -- user-space programs (e.g., lspci) can access it via sysfs, too.
Should we force users of such programs to add a magic kernel parameter
to enable MMCONFIG? Does not make much sense.

Maybe we should do all bus scanning with conf1 and then try if MMCONFIG
returns the same values?

Have a nice fortnight
-- 
Martin `MJ' Mares  [EMAIL PROTECTED]   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Beware of bugs in the above code; I have only proved it correct, not tried 
it. -- D.E.K.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-11-01 Thread Arjan van de Ven
On Thu, 1 Nov 2007 09:31:40 +0100
Martin Mares [EMAIL PROTECTED] wrote:

 Hello!
 
  something like
  int pci_enable_mmconfig(struct pci_dev *pdev) ?
  sounds like a very solid plan to me...
 
 Please remember that the driver is not the sole user of the PCI config
 space -- user-space programs (e.g., lspci) can access it via sysfs,
 too. Should we force users of such programs to add a magic kernel
 parameter to enable MMCONFIG? Does not make much sense.
 
 Maybe we should do all bus scanning with conf1 and then try if
 MMCONFIG returns the same values?

that is already in the code today but not nearly enough; there's a ton
of cases where it's touch mmconfig and the box is dead...
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Jesse Barnes
On Tuesday, October 30, 2007 4:59 pm Linus Torvalds wrote:
> Also, there are several devices that don't show up in the MMCFG
> things, or just otherwise get it wrong.
>
> So just take a look at arch/x86/pci/mmconfig-shared.c and look for
> "conf1".
>
> Really. Damn, I'm nervous taking any MMCFG patches that has you as an
> author, if you aren't even aware of these kinds of fundamnetal
> issues. You probably read the standards about how things are
> "supposed" to work, and then just believed them?

We have a few bits of code in there to look at the actual MMCONFIG base 
register, which is good since we can sanity check it against the 
firmware.  However, by itself it's not enough since we may end up using 
it even though the BIOS didn't tell us about an overlap that really 
does exist.  Robert recognized this and added checks to make sure the 
values read from the base regs actually matched what the firmware was 
telling us in the ACPI tables (at least iirc, it's been awhile since I 
looked at the patch).

So don't be too nervous. :)

Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Robert Hancock

Linus Torvalds wrote:


On Tue, 30 Oct 2007, Robert Hancock wrote:

You have to, anyway. Even now the MMCONFIG stuff uses CONF1 cycles for
startup.

If it does, it's not by necessity. As soon as you read the table location out
of the ACPI tables you can start using it, and that shouldn't require any
config space accesses.


Don't be silly. Exactly _BECAUSE_ we cannot trust the firmware, we have to 
use conf1 (which we can trust) to verify it and/or fix things up.


My point was, it's not inherently necessary in order to use MMCONFIG. 
I'm not saying the checks (unreachable_devices and 
pci_mmcfg_check_hostbridge) aren't useful or needed with many real 
machines. However, in the event that type1 access isn't available we 
just skip all those checks because we have no other option. It would 
indeed be a pretty broken spec if there was no way to bootstrap with it 
even under ideal conditions..




Also, there are several devices that don't show up in the MMCFG things, or 
just otherwise get it wrong.


So just take a look at arch/x86/pci/mmconfig-shared.c and look for 
"conf1".


Really. Damn, I'm nervous taking any MMCFG patches that has you as an 
author, if you aren't even aware of these kinds of fundamnetal issues. You 
probably read the standards about how things are "supposed" to work, and 
then just believed them?


Rule #1 in kernel programming: don't *ever* think that things actually 
work the way they are documented to work. The documentation is a starting 
point, nothing else. 

And please be defensive in programming. We *know* conf1 cycles work. The 
hardware has been extensively tested, and there are no firmware 
interactions. There is *zero* reasons to use MMCONF cycles for normal 
devices. Ergo: switching over to MMCONF when not needed is stupid and 
fragile.


I can't really disagree that MMCONFIG doesn't have great advantages for 
most devices (though it likely is faster on a lot of platforms, which 
may be significant if the device does lots of config space accesses). So 
for the moment, avoiding using it except where necessary will likely 
work out (except if some system does indeed puke on mixing type1 and 
MMCONFIG).


However, what Microsoft is doing with Vista may eventually make a 
difference in the future. Many hardware vendors seem to use the testing 
strategy of "test with latest Windows version. Works OK? Ship it." If 
Vista decides that MMCONFIG is good to use all the time, then type1 
access support is likely going to a) end up less tested and b) probably 
deleted entirely in time. We've seen it before - it used to be that not 
using ACPI was the safe option on most hardware with Linux. Now you 
pretty much have to use it because the manufacturers only test with it 
enabled. I've seen at least one board where the interrupt routing was 
completely broken with ACPI off, because they obviously only tested in 
Windows..


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Arjan van de Ven wrote:
> 
> it's per device so that if you have have no users for this stuff,
> you'll never ever get bitten by bugs (be it linux or the bios).

Exactly. It's not that we care about trying to protect a system that 
really needs to use MMCFG - once you really *really* need it, you're 
screwed if it doesn't work.

But currently 99% of all systems will never need it, and for the broken 
platforms, that's a nice round 100%. So if we make code not use MMCFG by 
default, and you have to enable it for just those device drivers that 
really want it and care, you automatically avoid all the broken hardware, 
with no real downsides.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Robert Hancock wrote:
> > 
> > You have to, anyway. Even now the MMCONFIG stuff uses CONF1 cycles for
> > startup.
> 
> If it does, it's not by necessity. As soon as you read the table location out
> of the ACPI tables you can start using it, and that shouldn't require any
> config space accesses.

Don't be silly. Exactly _BECAUSE_ we cannot trust the firmware, we have to 
use conf1 (which we can trust) to verify it and/or fix things up.

Also, there are several devices that don't show up in the MMCFG things, or 
just otherwise get it wrong.

So just take a look at arch/x86/pci/mmconfig-shared.c and look for 
"conf1".

Really. Damn, I'm nervous taking any MMCFG patches that has you as an 
author, if you aren't even aware of these kinds of fundamnetal issues. You 
probably read the standards about how things are "supposed" to work, and 
then just believed them?

Rule #1 in kernel programming: don't *ever* think that things actually 
work the way they are documented to work. The documentation is a starting 
point, nothing else. 

And please be defensive in programming. We *know* conf1 cycles work. The 
hardware has been extensively tested, and there are no firmware 
interactions. There is *zero* reasons to use MMCONF cycles for normal 
devices. Ergo: switching over to MMCONF when not needed is stupid and 
fragile.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Arjan van de Ven
On Tue, 30 Oct 2007 17:41:26 -0600
Robert Hancock <[EMAIL PROTECTED]> wrote:
> > I could easily see device driver writers probing to see if
> > something works, and I absolutely don't think we should just
> > automatically enable MMCONFIG from then on.
> 
> Why per device? It's not like the MSI case where both the platform
> and the device are potentially busted. Whether or not MMCONFIG works
> has nothing to do with the device, all that matters is whether it
> works on the platform. It shouldn't be the driver's responsibility to
> know this.

it's per device so that if you have have no users for this stuff,
you'll never ever get bitten by bugs (be it linux or the bios).
It's just avoiding the problem for a large class of cases... doesn't
mean it's solved for the other class, just in practice it turns it into
a much smaller problem.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Robert Hancock

Linus Torvalds wrote:


On Tue, 30 Oct 2007, Arjan van de Ven wrote:

the problem is... you're not supposed to mix both types of accesses.


You have to, anyway. Even now the MMCONFIG stuff uses CONF1 cycles for 
startup.


If it does, it's not by necessity. As soon as you read the table 
location out of the ACPI tables you can start using it, and that 
shouldn't require any config space accesses.




Also, there's reason to believe that mixing things up _has_ to work 
anyway, and if the issue is between "works in practice" and "theory says 
that you shouldn't mix", I'll take practice every time.


Especially since we *know* that the theory is broken. Right now MMCONFIG 
is effectively disabled very aggressively because it's simply unusably 
flaky. So the choice is between:


 - don't use MMCONFIG at all, because it has so many problems
 - use MMCONFIG sparingly enough to hide the problems


Fact is, we don't really know how many of these systems with supposedly 
"broken" MMCONFIG were really just suffering from the overlapping 
PCI/MMCONFIG address space problem, which is entirely the fault of the 
way we do PCI probing. I would bet quite a few of them.




and what "you're supposed to do" is simply trumped by Real Life(tm). 
Because Intel screwed up so badly when they designed that piece of shit.


(Where "screwed up badly" is the usual "left it to firmware people" thing, 
of course. Dammit, Intel *could* have just made it a real PCI BAR in the 
Northbridge, and specified it as such, and we wouldn't have these 
problems! But no, it had to be another idiotic "firmware tells where it 
is" thing)


This wouldn't have helped anything with the problem in question.



The fact is, CONF1 style accesses are just safer, and *work*. 

I would suggest a slight twist then: use CONF1 *until* you're using
something above 256, and then and only then switch to MMCONFIG from
then on for all accesses.


No.

Maybe if you do it per-device, and only *after* probing (ie we have seen 
multiple, and successful, accesses), but globally, absolutely not. That 
would be useless. The bugs we have had in this area have been exactly the 
kinds of things like "we don't know the real size of the MMCONFIG areas" 
etc.


I could easily see device driver writers probing to see if something 
works, and I absolutely don't think we should just automatically enable 
MMCONFIG from then on.


Why per device? It's not like the MSI case where both the platform and 
the device are potentially busted. Whether or not MMCONFIG works has 
nothing to do with the device, all that matters is whether it works on 
the platform. It shouldn't be the driver's responsibility to know this.




But maybe we could have a per-device flag that a driver *can* set. Ie have 
the logic be:


 - use MMCONFIG if we have to (reg >= 256)

OR

 - use MMCONFIG if the driver specifically asked us to

and then drivers that absolutely need it, and know they do, can set that 
flag. Preferably after they actually verified that it works.


How will they verify that it works? If it works, then verifying it works 
is all well and good. If it doesn't work, trying to verify if it does 
could very well blow up the machine.


I've made the point before that if we're going to allow using it at all, 
we'd better find out if it works or not early on, not after we've been 
running and somebody decides it's a good idea to try using it and 
causing a lockup or something.




That way you _can_ get the "this is how you're supposed to do it" 
behaviour, but you get it when there is a reasonable chance that it 
actually works.


And quite frankly, if you're not supposed to mix these things even across 
devices, then I think we are better off just doing what we effectively do 
now: mostly ignore the damn thing because it's too broken to use.


Maybe somebody inside Intel could just clarify the documentation, and 
change it from "you're not supposed to mix" to "mix all you want". 


Intel could say what they want on the subject.. but that doesn't 
necessarily reflect what happens with anyone else's chipset implementations.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Andi Kleen

> Actually, I guess the bad case wasn't "not listed at all", but incorrectly
> listed - so the probing would go to the wrong address, not find any
> devices, and then promptly result in an unusable machine with no hardware
> attached.

iirc they tended to hang for whatever reason when the mcfg
area was accessed, not just not detect anything.

>
> I _think_ (and hope) those machines were never released.

They were :/ The bug flowed from a Intel reference BIOS to lots of
production boards. 

> But even now, on 
> my main machine, I get "MCFG area at f000 is not E820-reserved", and

That's a different issue. The spec does not require it to be e820-reserved.
That was just a (bogus) heuristic originally added to handle the early BIOS
bug. Even BIOS where the mcfg is fine do not reserve it there.

There were some patches to check it against ACPI resources (which
was presumably better specification conforming and more importantly
similar to what M$ does). I'm not sure that fixed all problems and
made the e820 check obsolete though.

> Basically, the resource allocation for mmconf has always been broken
> (largely by *design* by Intel!). And by being broken, it has been
> unreliable.

Well the overlapping to MMIO would have been fine as design if BIOS had
gotten it right.  Trying to design things that BIOS can't get it wrong
is probably futile -- the BIOSes would just find more subtle ways to break.
The real problem I guess was that Linux was trying to use it before 
Windows which is usually a bad idea regarding BIOS support at least
in the Desktop/Laptop space. 

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Jesse Barnes wrote:
> 
> Yeah, that's definitely a problem, and would be a firmware bug.  There's 
> no doubt that firmwares have had trouble with this in the past, but 
> given that Vista now relies on this stuff working, it's a lot more 
> likely to be reliable in current and future systems.

I agree. I think most of the big problems we had were basically with 
unreleased or very early systems.

But the per-device flag should "just fix it" regardless, and we can go 
forward assuming that things work, but without breaking those borderline 
systems.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread David Miller
From: Jesse Barnes <[EMAIL PROTECTED]>
Date: Tue, 30 Oct 2007 15:31:57 -0700

> I don't disagree there.  I'm just saying the actual mechanism is fine 
> (as illustrated by the numerous non-PC ports of Linux), and this 
> particular problem, at least, isn't really specific to how MMCONFIG is 
> described or configured by the firmware and OS, it's simply a Linux 
> problem.

I totally disagree, it is not a Linux problem.

The non-PC ports don't have this issue because the PCI config space
MMIO area of the PCI controller does not overlap PCI MMIO space at
all.  So the problem simply cannot happen.

They choose to put this MMCONFIG in an area overlapping with PCI MMIO
space, which breaks both existing practice and code.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Jesse Barnes
On Tuesday, October 30, 2007 3:38 pm Linus Torvalds wrote:
> On Tue, 30 Oct 2007, Linus Torvalds wrote:
> > No. You really don't see the big picture. There's been tons of
> > problems with MMCONFIG. Like the fact that other devices have their
> > IO regions registered on top of it, because the MMCONFIG thing was
> > done as a hidden resource. Or the fact that the area claimed was
> > too small. Or too large. Or not listed at all.
>
> Actually, I guess the bad case wasn't "not listed at all", but
> incorrectly listed - so the probing would go to the wrong address,
> not find any devices, and then promptly result in an unusable machine
> with no hardware attached.

Yeah, busted BIOS (and I agree a poor design decision).

> I _think_ (and hope) those machines were never released. But even
> now, on my main machine, I get "MCFG area at f000 is not
> E820-reserved", and probably the only reason the PCI layer doesn't
> overwrite it is because it does show up as a PnP region, and I have
> pnp support enabled.

Unfortunately I think some such machines *were* released, only because 
the release engineers figured no one actually uses MMCONFIG yet 
(Windows == whole world of users), so why worry about that particular 
bug?

The "not E820-reserved" message is actually bogus.  I'll bet MMCONFIG 
works fine on your machine with Robert's patch and the disable decode 
stuff applied.

Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Linus Torvalds wrote:
>
> No. You really don't see the big picture. There's been tons of problems 
> with MMCONFIG. Like the fact that other devices have their IO regions 
> registered on top of it, because the MMCONFIG thing was done as a hidden 
> resource. Or the fact that the area claimed was too small. Or too large. 
> Or not listed at all.

Actually, I guess the bad case wasn't "not listed at all", but incorrectly 
listed - so the probing would go to the wrong address, not find any 
devices, and then promptly result in an unusable machine with no hardware 
attached.

I _think_ (and hope) those machines were never released. But even now, on 
my main machine, I get "MCFG area at f000 is not E820-reserved", and 
probably the only reason the PCI layer doesn't overwrite it is because it 
does show up as a PnP region, and I have pnp support enabled.

Basically, the resource allocation for mmconf has always been broken 
(largely by *design* by Intel!). And by being broken, it has been 
unreliable.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Jesse Barnes
On Tuesday, October 30, 2007 3:22 pm Linus Torvalds wrote:
> On Tue, 30 Oct 2007, Jesse Barnes wrote:
> > The per-device flag is fine with me, but I should make something
> > clear:
> >
> > MMCONFIG IS NOT BROKEN!
>
> Trust me, it is.
>
> The particular problem _you_ had with it is only a small small part
> of the bugs we have had.
>
> > What's broken is our PCI probing with certain address space layouts
> > that include MMCONFIG space.
>
> No. You really don't see the big picture. There's been tons of
> problems with MMCONFIG. Like the fact that other devices have their
> IO regions registered on top of it, because the MMCONFIG thing was
> done as a hidden resource. Or the fact that the area claimed was too
> small. Or too large. Or not listed at all.

Yeah, that's definitely a problem, and would be a firmware bug.  There's 
no doubt that firmwares have had trouble with this in the past, but 
given that Vista now relies on this stuff working, it's a lot more 
likely to be reliable in current and future systems.

> The whole thing is a total disaster. I told Intel engineers literally
> *years* ago to not do that idiotic "hidden IO resources that are
> described by firmware that then inevitably gets things wrong", and
> yet what happens? Every single time.

I don't disagree there.  I'm just saying the actual mechanism is fine 
(as illustrated by the numerous non-PC ports of Linux), and this 
particular problem, at least, isn't really specific to how MMCONFIG is 
described or configured by the firmware and OS, it's simply a Linux 
problem.

But like I said, the per-device flag Arjan suggested is fine with me...

Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Jesse Barnes wrote:
>
> The per-device flag is fine with me, but I should make something clear:
> 
> MMCONFIG IS NOT BROKEN!

Trust me, it is.

The particular problem _you_ had with it is only a small small part of the 
bugs we have had.

> What's broken is our PCI probing with certain address space layouts that 
> include MMCONFIG space.

No. You really don't see the big picture. There's been tons of problems 
with MMCONFIG. Like the fact that other devices have their IO regions 
registered on top of it, because the MMCONFIG thing was done as a hidden 
resource. Or the fact that the area claimed was too small. Or too large. 
Or not listed at all.

The whole thing is a total disaster. I told Intel engineers literally 
*years* ago to not do that idiotic "hidden IO resources that are described 
by firmware that then inevitably gets things wrong", and yet what happens? 
Every single time.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Jesse Barnes
On Tuesday, October 30, 2007 10:07 am Linus Torvalds wrote:
> (Where "screwed up badly" is the usual "left it to firmware people"
> thing, of course. Dammit, Intel *could* have just made it a real PCI
> BAR in the Northbridge, and specified it as such, and we wouldn't
> have these problems! But no, it had to be another idiotic "firmware
> tells where it is" thing)

The per-device flag is fine with me, but I should make something clear:

MMCONFIG IS NOT BROKEN!

Nor is finding mmconfig space.  We know how to do that too.

What's broken is our PCI probing with certain address space layouts that 
include MMCONFIG space.  Since MMCONFIG is in MMIO space (by 
definition) there will always be the potential for problems when we use 
MMCONFIG and don't disable decode while sizing BARs (probably a latent 
bug on many non-PC Linux platforms).

So we can either use I/O ports for sizing and only switch to MMCONFIG 
later, or we can just use it on an as-needed basis, or we can make our 
PCI probing safe if MMCONFIG is in use.

Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread David Miller
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Tue, 30 Oct 2007 19:50:58 +0100

> I believe the tigon3 driver also does config space access frequently
> on some chips.

This case is not worth optimizing for at all.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Andi Kleen
On Tuesday 30 October 2007 20:06:12 Linus Torvalds wrote:
> 
> On Tue, 30 Oct 2007, Andi Kleen wrote:
> > 
> > Also there are still the old x86 Macs where conf1 doesn't work.
> 
> I don't think that was ever true. That was a made-up rumor by the EFI 
> people who were just confused.
> There's no way to disable conf1 accesses on any known chipset, afaik.

There was a tester with such a machine and he complained until
the heuristic handled this case.  So there is evidence it 
really was like this.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Andi Kleen wrote:
> 
> Also there are still the old x86 Macs where conf1 doesn't work.

I don't think that was ever true. That was a made-up rumor by the EFI 
people who were just confused.

There's no way to disable conf1 accesses on any known chipset, afaik.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Andi Kleen

> At that point, almost all the issues with mmconfig just go away, BECAUSE 
> NOBODY USES IT, so it doesn't matter if it's broken?

We seem to slowly grow more uses of registers > 256, but yes it's still mostly 
non critical stuff.
 
> The fact is, CONF1 style accesses are just safer, and *work*. 
> 
> Can we please just do that? 

At least for the AMD GART IOMMU I would prefer if that was not done.
It has a pci config space access in a time critical path (flush).
And Family10h supports accessing the northbridge using mmconfig.

I believe the tigon3 driver also does config space access frequently
on some chips.

Also there are still the old x86 Macs where conf1 doesn't work.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Arjan van de Ven wrote:
> 
> something like
> int pci_enable_mmconfig(struct pci_dev *pdev) ?

Yes, that looks fine. It also matches the kinds of things drivers already 
have to do (ie enable DMA, MSI etc), both conceptually and from a purely 
syntactic/practical standpoint.

I think mmconfig and MSI have a lot in common, in that both are relatively 
new features and both have had issues that made them not work well.

> sounds like a very solid plan to me...
> 
> > Maybe somebody inside Intel could just clarify the documentation, and 
> > change it from "you're not supposed to mix" to "mix all you want". 
> 
> I'll see what I can do ;)

Well, let's see if "pci_enable_mmconfig()" works out first. There probably 
aren't more than a few drivers that would be interested in this anyway, so 
it's probably fine.

And by the time mmconfig is commonly used, probably all the teething 
problems are long gone, so we could plan on a "in five years, maybe we 
could even enable it by default" thing?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Arjan van de Ven
On Tue, 30 Oct 2007 10:07:48 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> 
> > > The fact is, CONF1 style accesses are just safer, and *work*. 
> > 
> > I would suggest a slight twist then: use CONF1 *until* you're using
> > something above 256, and then and only then switch to MMCONFIG from
> > then on for all accesses.
> 
> No.
> 
> Maybe if you do it per-device, and only *after* probing (ie we have
> seen multiple, and successful, accesses), but globally, absolutely
> not. That would be useless. The bugs we have had in this area have
> been exactly the kinds of things like "we don't know the real size of
> the MMCONFIG areas" etc.

sorry I wasn't very clear, I meant "per device".

> 
> I could easily see device driver writers probing to see if something 
> works, and I absolutely don't think we should just automatically
> enable MMCONFIG from then on.
> 
> But maybe we could have a per-device flag that a driver *can* set. Ie
> have the logic be:
> 
>  - use MMCONFIG if we have to (reg >= 256)
> 
> OR
> 
>  - use MMCONFIG if the driver specifically asked us to

something like
int pci_enable_mmconfig(struct pci_dev *pdev) ?
sounds like a very solid plan to me...


> Maybe somebody inside Intel could just clarify the documentation, and 
> change it from "you're not supposed to mix" to "mix all you want". 

I'll see what I can do ;)


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Arjan van de Ven wrote:
> 
> the problem is... you're not supposed to mix both types of accesses.

You have to, anyway. Even now the MMCONFIG stuff uses CONF1 cycles for 
startup.

Also, there's reason to believe that mixing things up _has_ to work 
anyway, and if the issue is between "works in practice" and "theory says 
that you shouldn't mix", I'll take practice every time.

Especially since we *know* that the theory is broken. Right now MMCONFIG 
is effectively disabled very aggressively because it's simply unusably 
flaky. So the choice is between:

 - don't use MMCONFIG at all, because it has so many problems
 - use MMCONFIG sparingly enough to hide the problems

and what "you're supposed to do" is simply trumped by Real Life(tm). 
Because Intel screwed up so badly when they designed that piece of shit.

(Where "screwed up badly" is the usual "left it to firmware people" thing, 
of course. Dammit, Intel *could* have just made it a real PCI BAR in the 
Northbridge, and specified it as such, and we wouldn't have these 
problems! But no, it had to be another idiotic "firmware tells where it 
is" thing)

> > The fact is, CONF1 style accesses are just safer, and *work*. 
> 
> I would suggest a slight twist then: use CONF1 *until* you're using
> something above 256, and then and only then switch to MMCONFIG from
> then on for all accesses.

No.

Maybe if you do it per-device, and only *after* probing (ie we have seen 
multiple, and successful, accesses), but globally, absolutely not. That 
would be useless. The bugs we have had in this area have been exactly the 
kinds of things like "we don't know the real size of the MMCONFIG areas" 
etc.

I could easily see device driver writers probing to see if something 
works, and I absolutely don't think we should just automatically enable 
MMCONFIG from then on.

But maybe we could have a per-device flag that a driver *can* set. Ie have 
the logic be:

 - use MMCONFIG if we have to (reg >= 256)

OR

 - use MMCONFIG if the driver specifically asked us to

and then drivers that absolutely need it, and know they do, can set that 
flag. Preferably after they actually verified that it works.

That way you _can_ get the "this is how you're supposed to do it" 
behaviour, but you get it when there is a reasonable chance that it 
actually works.

And quite frankly, if you're not supposed to mix these things even across 
devices, then I think we are better off just doing what we effectively do 
now: mostly ignore the damn thing because it's too broken to use.

Maybe somebody inside Intel could just clarify the documentation, and 
change it from "you're not supposed to mix" to "mix all you want". 

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Arjan van de Ven
On Tue, 30 Oct 2007 08:15:46 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> 
> 
> On Mon, 29 Oct 2007, Robert Hancock wrote:
> > 
> > The other possible workaround would be to avoid using MMCONFIG
> > until the BAR sizing is done. 
> 
> The only sane solution is the one that people constantly seem to
> ignore:
> 
>  - only use MMCONFIG if absolutely required by the access itself
> 
> In other words, make the MMCONFIG code fall back on old-style
> accesses for any register access to a word with reg+size <= 256.

the problem is... you're not supposed to mix both types of accesses.
 
> At that point, almost all the issues with mmconfig just go away,
> BECAUSE NOBODY USES IT, so it doesn't matter if it's broken?
> 
> The fact is, CONF1 style accesses are just safer, and *work*. 

I would suggest a slight twist then: use CONF1 *until* you're using
something above 256, and then and only then switch to MMCONFIG from
then on for all accesses.

That should solve the spec issue (yes I know about specs, but this is
one of those things that will bite us in the future if we're not
careful)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Mon, 29 Oct 2007, Robert Hancock wrote:
> 
> The other possible workaround would be to avoid using MMCONFIG until the BAR
> sizing is done. 

The only sane solution is the one that people constantly seem to ignore:

 - only use MMCONFIG if absolutely required by the access itself

In other words, make the MMCONFIG code fall back on old-style accesses for 
any register access to a word with reg+size <= 256.

At that point, almost all the issues with mmconfig just go away, BECAUSE 
NOBODY USES IT, so it doesn't matter if it's broken?

The fact is, CONF1 style accesses are just safer, and *work*. 

Can we please just do that? 

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Mon, 29 Oct 2007, Robert Hancock wrote:
 
 The other possible workaround would be to avoid using MMCONFIG until the BAR
 sizing is done. 

The only sane solution is the one that people constantly seem to ignore:

 - only use MMCONFIG if absolutely required by the access itself

In other words, make the MMCONFIG code fall back on old-style accesses for 
any register access to a word with reg+size = 256.

At that point, almost all the issues with mmconfig just go away, BECAUSE 
NOBODY USES IT, so it doesn't matter if it's broken?

The fact is, CONF1 style accesses are just safer, and *work*. 

Can we please just do that? 

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Arjan van de Ven
On Tue, 30 Oct 2007 08:15:46 -0700 (PDT)
Linus Torvalds [EMAIL PROTECTED] wrote:

 
 
 On Mon, 29 Oct 2007, Robert Hancock wrote:
  
  The other possible workaround would be to avoid using MMCONFIG
  until the BAR sizing is done. 
 
 The only sane solution is the one that people constantly seem to
 ignore:
 
  - only use MMCONFIG if absolutely required by the access itself
 
 In other words, make the MMCONFIG code fall back on old-style
 accesses for any register access to a word with reg+size = 256.

the problem is... you're not supposed to mix both types of accesses.
 
 At that point, almost all the issues with mmconfig just go away,
 BECAUSE NOBODY USES IT, so it doesn't matter if it's broken?
 
 The fact is, CONF1 style accesses are just safer, and *work*. 

I would suggest a slight twist then: use CONF1 *until* you're using
something above 256, and then and only then switch to MMCONFIG from
then on for all accesses.

That should solve the spec issue (yes I know about specs, but this is
one of those things that will bite us in the future if we're not
careful)

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Arjan van de Ven wrote:
 
 the problem is... you're not supposed to mix both types of accesses.

You have to, anyway. Even now the MMCONFIG stuff uses CONF1 cycles for 
startup.

Also, there's reason to believe that mixing things up _has_ to work 
anyway, and if the issue is between works in practice and theory says 
that you shouldn't mix, I'll take practice every time.

Especially since we *know* that the theory is broken. Right now MMCONFIG 
is effectively disabled very aggressively because it's simply unusably 
flaky. So the choice is between:

 - don't use MMCONFIG at all, because it has so many problems
 - use MMCONFIG sparingly enough to hide the problems

and what you're supposed to do is simply trumped by Real Life(tm). 
Because Intel screwed up so badly when they designed that piece of shit.

(Where screwed up badly is the usual left it to firmware people thing, 
of course. Dammit, Intel *could* have just made it a real PCI BAR in the 
Northbridge, and specified it as such, and we wouldn't have these 
problems! But no, it had to be another idiotic firmware tells where it 
is thing)

  The fact is, CONF1 style accesses are just safer, and *work*. 
 
 I would suggest a slight twist then: use CONF1 *until* you're using
 something above 256, and then and only then switch to MMCONFIG from
 then on for all accesses.

No.

Maybe if you do it per-device, and only *after* probing (ie we have seen 
multiple, and successful, accesses), but globally, absolutely not. That 
would be useless. The bugs we have had in this area have been exactly the 
kinds of things like we don't know the real size of the MMCONFIG areas 
etc.

I could easily see device driver writers probing to see if something 
works, and I absolutely don't think we should just automatically enable 
MMCONFIG from then on.

But maybe we could have a per-device flag that a driver *can* set. Ie have 
the logic be:

 - use MMCONFIG if we have to (reg = 256)

OR

 - use MMCONFIG if the driver specifically asked us to

and then drivers that absolutely need it, and know they do, can set that 
flag. Preferably after they actually verified that it works.

That way you _can_ get the this is how you're supposed to do it 
behaviour, but you get it when there is a reasonable chance that it 
actually works.

And quite frankly, if you're not supposed to mix these things even across 
devices, then I think we are better off just doing what we effectively do 
now: mostly ignore the damn thing because it's too broken to use.

Maybe somebody inside Intel could just clarify the documentation, and 
change it from you're not supposed to mix to mix all you want. 

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Arjan van de Ven
On Tue, 30 Oct 2007 10:07:48 -0700 (PDT)
Linus Torvalds [EMAIL PROTECTED] wrote:
 
   The fact is, CONF1 style accesses are just safer, and *work*. 
  
  I would suggest a slight twist then: use CONF1 *until* you're using
  something above 256, and then and only then switch to MMCONFIG from
  then on for all accesses.
 
 No.
 
 Maybe if you do it per-device, and only *after* probing (ie we have
 seen multiple, and successful, accesses), but globally, absolutely
 not. That would be useless. The bugs we have had in this area have
 been exactly the kinds of things like we don't know the real size of
 the MMCONFIG areas etc.

sorry I wasn't very clear, I meant per device.

 
 I could easily see device driver writers probing to see if something 
 works, and I absolutely don't think we should just automatically
 enable MMCONFIG from then on.
 
 But maybe we could have a per-device flag that a driver *can* set. Ie
 have the logic be:
 
  - use MMCONFIG if we have to (reg = 256)
 
 OR
 
  - use MMCONFIG if the driver specifically asked us to

something like
int pci_enable_mmconfig(struct pci_dev *pdev) ?
sounds like a very solid plan to me...


 Maybe somebody inside Intel could just clarify the documentation, and 
 change it from you're not supposed to mix to mix all you want. 

I'll see what I can do ;)


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Arjan van de Ven wrote:
 
 something like
 int pci_enable_mmconfig(struct pci_dev *pdev) ?

Yes, that looks fine. It also matches the kinds of things drivers already 
have to do (ie enable DMA, MSI etc), both conceptually and from a purely 
syntactic/practical standpoint.

I think mmconfig and MSI have a lot in common, in that both are relatively 
new features and both have had issues that made them not work well.

 sounds like a very solid plan to me...
 
  Maybe somebody inside Intel could just clarify the documentation, and 
  change it from you're not supposed to mix to mix all you want. 
 
 I'll see what I can do ;)

Well, let's see if pci_enable_mmconfig() works out first. There probably 
aren't more than a few drivers that would be interested in this anyway, so 
it's probably fine.

And by the time mmconfig is commonly used, probably all the teething 
problems are long gone, so we could plan on a in five years, maybe we 
could even enable it by default thing?

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Andi Kleen

 At that point, almost all the issues with mmconfig just go away, BECAUSE 
 NOBODY USES IT, so it doesn't matter if it's broken?

We seem to slowly grow more uses of registers  256, but yes it's still mostly 
non critical stuff.
 
 The fact is, CONF1 style accesses are just safer, and *work*. 
 
 Can we please just do that? 

At least for the AMD GART IOMMU I would prefer if that was not done.
It has a pci config space access in a time critical path (flush).
And Family10h supports accessing the northbridge using mmconfig.

I believe the tigon3 driver also does config space access frequently
on some chips.

Also there are still the old x86 Macs where conf1 doesn't work.

-Andi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Andi Kleen wrote:
 
 Also there are still the old x86 Macs where conf1 doesn't work.

I don't think that was ever true. That was a made-up rumor by the EFI 
people who were just confused.

There's no way to disable conf1 accesses on any known chipset, afaik.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Andi Kleen
On Tuesday 30 October 2007 20:06:12 Linus Torvalds wrote:
 
 On Tue, 30 Oct 2007, Andi Kleen wrote:
  
  Also there are still the old x86 Macs where conf1 doesn't work.
 
 I don't think that was ever true. That was a made-up rumor by the EFI 
 people who were just confused.
 There's no way to disable conf1 accesses on any known chipset, afaik.

There was a tester with such a machine and he complained until
the heuristic handled this case.  So there is evidence it 
really was like this.

-Andi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread David Miller
From: Andi Kleen [EMAIL PROTECTED]
Date: Tue, 30 Oct 2007 19:50:58 +0100

 I believe the tigon3 driver also does config space access frequently
 on some chips.

This case is not worth optimizing for at all.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Jesse Barnes
On Tuesday, October 30, 2007 10:07 am Linus Torvalds wrote:
 (Where screwed up badly is the usual left it to firmware people
 thing, of course. Dammit, Intel *could* have just made it a real PCI
 BAR in the Northbridge, and specified it as such, and we wouldn't
 have these problems! But no, it had to be another idiotic firmware
 tells where it is thing)

The per-device flag is fine with me, but I should make something clear:

MMCONFIG IS NOT BROKEN!

Nor is finding mmconfig space.  We know how to do that too.

What's broken is our PCI probing with certain address space layouts that 
include MMCONFIG space.  Since MMCONFIG is in MMIO space (by 
definition) there will always be the potential for problems when we use 
MMCONFIG and don't disable decode while sizing BARs (probably a latent 
bug on many non-PC Linux platforms).

So we can either use I/O ports for sizing and only switch to MMCONFIG 
later, or we can just use it on an as-needed basis, or we can make our 
PCI probing safe if MMCONFIG is in use.

Jesse
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Jesse Barnes wrote:

 The per-device flag is fine with me, but I should make something clear:
 
 MMCONFIG IS NOT BROKEN!

Trust me, it is.

The particular problem _you_ had with it is only a small small part of the 
bugs we have had.

 What's broken is our PCI probing with certain address space layouts that 
 include MMCONFIG space.

No. You really don't see the big picture. There's been tons of problems 
with MMCONFIG. Like the fact that other devices have their IO regions 
registered on top of it, because the MMCONFIG thing was done as a hidden 
resource. Or the fact that the area claimed was too small. Or too large. 
Or not listed at all.

The whole thing is a total disaster. I told Intel engineers literally 
*years* ago to not do that idiotic hidden IO resources that are described 
by firmware that then inevitably gets things wrong, and yet what happens? 
Every single time.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Jesse Barnes
On Tuesday, October 30, 2007 3:22 pm Linus Torvalds wrote:
 On Tue, 30 Oct 2007, Jesse Barnes wrote:
  The per-device flag is fine with me, but I should make something
  clear:
 
  MMCONFIG IS NOT BROKEN!

 Trust me, it is.

 The particular problem _you_ had with it is only a small small part
 of the bugs we have had.

  What's broken is our PCI probing with certain address space layouts
  that include MMCONFIG space.

 No. You really don't see the big picture. There's been tons of
 problems with MMCONFIG. Like the fact that other devices have their
 IO regions registered on top of it, because the MMCONFIG thing was
 done as a hidden resource. Or the fact that the area claimed was too
 small. Or too large. Or not listed at all.

Yeah, that's definitely a problem, and would be a firmware bug.  There's 
no doubt that firmwares have had trouble with this in the past, but 
given that Vista now relies on this stuff working, it's a lot more 
likely to be reliable in current and future systems.

 The whole thing is a total disaster. I told Intel engineers literally
 *years* ago to not do that idiotic hidden IO resources that are
 described by firmware that then inevitably gets things wrong, and
 yet what happens? Every single time.

I don't disagree there.  I'm just saying the actual mechanism is fine 
(as illustrated by the numerous non-PC ports of Linux), and this 
particular problem, at least, isn't really specific to how MMCONFIG is 
described or configured by the firmware and OS, it's simply a Linux 
problem.

But like I said, the per-device flag Arjan suggested is fine with me...

Jesse
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Linus Torvalds wrote:

 No. You really don't see the big picture. There's been tons of problems 
 with MMCONFIG. Like the fact that other devices have their IO regions 
 registered on top of it, because the MMCONFIG thing was done as a hidden 
 resource. Or the fact that the area claimed was too small. Or too large. 
 Or not listed at all.

Actually, I guess the bad case wasn't not listed at all, but incorrectly 
listed - so the probing would go to the wrong address, not find any 
devices, and then promptly result in an unusable machine with no hardware 
attached.

I _think_ (and hope) those machines were never released. But even now, on 
my main machine, I get MCFG area at f000 is not E820-reserved, and 
probably the only reason the PCI layer doesn't overwrite it is because it 
does show up as a PnP region, and I have pnp support enabled.

Basically, the resource allocation for mmconf has always been broken 
(largely by *design* by Intel!). And by being broken, it has been 
unreliable.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Jesse Barnes
On Tuesday, October 30, 2007 3:38 pm Linus Torvalds wrote:
 On Tue, 30 Oct 2007, Linus Torvalds wrote:
  No. You really don't see the big picture. There's been tons of
  problems with MMCONFIG. Like the fact that other devices have their
  IO regions registered on top of it, because the MMCONFIG thing was
  done as a hidden resource. Or the fact that the area claimed was
  too small. Or too large. Or not listed at all.

 Actually, I guess the bad case wasn't not listed at all, but
 incorrectly listed - so the probing would go to the wrong address,
 not find any devices, and then promptly result in an unusable machine
 with no hardware attached.

Yeah, busted BIOS (and I agree a poor design decision).

 I _think_ (and hope) those machines were never released. But even
 now, on my main machine, I get MCFG area at f000 is not
 E820-reserved, and probably the only reason the PCI layer doesn't
 overwrite it is because it does show up as a PnP region, and I have
 pnp support enabled.

Unfortunately I think some such machines *were* released, only because 
the release engineers figured no one actually uses MMCONFIG yet 
(Windows == whole world of users), so why worry about that particular 
bug?

The not E820-reserved message is actually bogus.  I'll bet MMCONFIG 
works fine on your machine with Robert's patch and the disable decode 
stuff applied.

Jesse
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread David Miller
From: Jesse Barnes [EMAIL PROTECTED]
Date: Tue, 30 Oct 2007 15:31:57 -0700

 I don't disagree there.  I'm just saying the actual mechanism is fine 
 (as illustrated by the numerous non-PC ports of Linux), and this 
 particular problem, at least, isn't really specific to how MMCONFIG is 
 described or configured by the firmware and OS, it's simply a Linux 
 problem.

I totally disagree, it is not a Linux problem.

The non-PC ports don't have this issue because the PCI config space
MMIO area of the PCI controller does not overlap PCI MMIO space at
all.  So the problem simply cannot happen.

They choose to put this MMCONFIG in an area overlapping with PCI MMIO
space, which breaks both existing practice and code.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Jesse Barnes wrote:
 
 Yeah, that's definitely a problem, and would be a firmware bug.  There's 
 no doubt that firmwares have had trouble with this in the past, but 
 given that Vista now relies on this stuff working, it's a lot more 
 likely to be reliable in current and future systems.

I agree. I think most of the big problems we had were basically with 
unreleased or very early systems.

But the per-device flag should just fix it regardless, and we can go 
forward assuming that things work, but without breaking those borderline 
systems.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Andi Kleen

 Actually, I guess the bad case wasn't not listed at all, but incorrectly
 listed - so the probing would go to the wrong address, not find any
 devices, and then promptly result in an unusable machine with no hardware
 attached.

iirc they tended to hang for whatever reason when the mcfg
area was accessed, not just not detect anything.


 I _think_ (and hope) those machines were never released.

They were :/ The bug flowed from a Intel reference BIOS to lots of
production boards. 

 But even now, on 
 my main machine, I get MCFG area at f000 is not E820-reserved, and

That's a different issue. The spec does not require it to be e820-reserved.
That was just a (bogus) heuristic originally added to handle the early BIOS
bug. Even BIOS where the mcfg is fine do not reserve it there.

There were some patches to check it against ACPI resources (which
was presumably better specification conforming and more importantly
similar to what M$ does). I'm not sure that fixed all problems and
made the e820 check obsolete though.

 Basically, the resource allocation for mmconf has always been broken
 (largely by *design* by Intel!). And by being broken, it has been
 unreliable.

Well the overlapping to MMIO would have been fine as design if BIOS had
gotten it right.  Trying to design things that BIOS can't get it wrong
is probably futile -- the BIOSes would just find more subtle ways to break.
The real problem I guess was that Linux was trying to use it before 
Windows which is usually a bad idea regarding BIOS support at least
in the Desktop/Laptop space. 

-Andi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Robert Hancock

Linus Torvalds wrote:


On Tue, 30 Oct 2007, Arjan van de Ven wrote:

the problem is... you're not supposed to mix both types of accesses.


You have to, anyway. Even now the MMCONFIG stuff uses CONF1 cycles for 
startup.


If it does, it's not by necessity. As soon as you read the table 
location out of the ACPI tables you can start using it, and that 
shouldn't require any config space accesses.




Also, there's reason to believe that mixing things up _has_ to work 
anyway, and if the issue is between works in practice and theory says 
that you shouldn't mix, I'll take practice every time.


Especially since we *know* that the theory is broken. Right now MMCONFIG 
is effectively disabled very aggressively because it's simply unusably 
flaky. So the choice is between:


 - don't use MMCONFIG at all, because it has so many problems
 - use MMCONFIG sparingly enough to hide the problems


Fact is, we don't really know how many of these systems with supposedly 
broken MMCONFIG were really just suffering from the overlapping 
PCI/MMCONFIG address space problem, which is entirely the fault of the 
way we do PCI probing. I would bet quite a few of them.




and what you're supposed to do is simply trumped by Real Life(tm). 
Because Intel screwed up so badly when they designed that piece of shit.


(Where screwed up badly is the usual left it to firmware people thing, 
of course. Dammit, Intel *could* have just made it a real PCI BAR in the 
Northbridge, and specified it as such, and we wouldn't have these 
problems! But no, it had to be another idiotic firmware tells where it 
is thing)


This wouldn't have helped anything with the problem in question.



The fact is, CONF1 style accesses are just safer, and *work*. 

I would suggest a slight twist then: use CONF1 *until* you're using
something above 256, and then and only then switch to MMCONFIG from
then on for all accesses.


No.

Maybe if you do it per-device, and only *after* probing (ie we have seen 
multiple, and successful, accesses), but globally, absolutely not. That 
would be useless. The bugs we have had in this area have been exactly the 
kinds of things like we don't know the real size of the MMCONFIG areas 
etc.


I could easily see device driver writers probing to see if something 
works, and I absolutely don't think we should just automatically enable 
MMCONFIG from then on.


Why per device? It's not like the MSI case where both the platform and 
the device are potentially busted. Whether or not MMCONFIG works has 
nothing to do with the device, all that matters is whether it works on 
the platform. It shouldn't be the driver's responsibility to know this.




But maybe we could have a per-device flag that a driver *can* set. Ie have 
the logic be:


 - use MMCONFIG if we have to (reg = 256)

OR

 - use MMCONFIG if the driver specifically asked us to

and then drivers that absolutely need it, and know they do, can set that 
flag. Preferably after they actually verified that it works.


How will they verify that it works? If it works, then verifying it works 
is all well and good. If it doesn't work, trying to verify if it does 
could very well blow up the machine.


I've made the point before that if we're going to allow using it at all, 
we'd better find out if it works or not early on, not after we've been 
running and somebody decides it's a good idea to try using it and 
causing a lockup or something.




That way you _can_ get the this is how you're supposed to do it 
behaviour, but you get it when there is a reasonable chance that it 
actually works.


And quite frankly, if you're not supposed to mix these things even across 
devices, then I think we are better off just doing what we effectively do 
now: mostly ignore the damn thing because it's too broken to use.


Maybe somebody inside Intel could just clarify the documentation, and 
change it from you're not supposed to mix to mix all you want. 


Intel could say what they want on the subject.. but that doesn't 
necessarily reflect what happens with anyone else's chipset implementations.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Arjan van de Ven
On Tue, 30 Oct 2007 17:41:26 -0600
Robert Hancock [EMAIL PROTECTED] wrote:
  I could easily see device driver writers probing to see if
  something works, and I absolutely don't think we should just
  automatically enable MMCONFIG from then on.
 
 Why per device? It's not like the MSI case where both the platform
 and the device are potentially busted. Whether or not MMCONFIG works
 has nothing to do with the device, all that matters is whether it
 works on the platform. It shouldn't be the driver's responsibility to
 know this.

it's per device so that if you have have no users for this stuff,
you'll never ever get bitten by bugs (be it linux or the bios).
It's just avoiding the problem for a large class of cases... doesn't
mean it's solved for the other class, just in practice it turns it into
a much smaller problem.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Robert Hancock wrote:
  
  You have to, anyway. Even now the MMCONFIG stuff uses CONF1 cycles for
  startup.
 
 If it does, it's not by necessity. As soon as you read the table location out
 of the ACPI tables you can start using it, and that shouldn't require any
 config space accesses.

Don't be silly. Exactly _BECAUSE_ we cannot trust the firmware, we have to 
use conf1 (which we can trust) to verify it and/or fix things up.

Also, there are several devices that don't show up in the MMCFG things, or 
just otherwise get it wrong.

So just take a look at arch/x86/pci/mmconfig-shared.c and look for 
conf1.

Really. Damn, I'm nervous taking any MMCFG patches that has you as an 
author, if you aren't even aware of these kinds of fundamnetal issues. You 
probably read the standards about how things are supposed to work, and 
then just believed them?

Rule #1 in kernel programming: don't *ever* think that things actually 
work the way they are documented to work. The documentation is a starting 
point, nothing else. 

And please be defensive in programming. We *know* conf1 cycles work. The 
hardware has been extensively tested, and there are no firmware 
interactions. There is *zero* reasons to use MMCONF cycles for normal 
devices. Ergo: switching over to MMCONF when not needed is stupid and 
fragile.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Linus Torvalds


On Tue, 30 Oct 2007, Arjan van de Ven wrote:
 
 it's per device so that if you have have no users for this stuff,
 you'll never ever get bitten by bugs (be it linux or the bios).

Exactly. It's not that we care about trying to protect a system that 
really needs to use MMCFG - once you really *really* need it, you're 
screwed if it doesn't work.

But currently 99% of all systems will never need it, and for the broken 
platforms, that's a nice round 100%. So if we make code not use MMCFG by 
default, and you have to enable it for just those device drivers that 
really want it and care, you automatically avoid all the broken hardware, 
with no real downsides.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Robert Hancock

Linus Torvalds wrote:


On Tue, 30 Oct 2007, Robert Hancock wrote:

You have to, anyway. Even now the MMCONFIG stuff uses CONF1 cycles for
startup.

If it does, it's not by necessity. As soon as you read the table location out
of the ACPI tables you can start using it, and that shouldn't require any
config space accesses.


Don't be silly. Exactly _BECAUSE_ we cannot trust the firmware, we have to 
use conf1 (which we can trust) to verify it and/or fix things up.


My point was, it's not inherently necessary in order to use MMCONFIG. 
I'm not saying the checks (unreachable_devices and 
pci_mmcfg_check_hostbridge) aren't useful or needed with many real 
machines. However, in the event that type1 access isn't available we 
just skip all those checks because we have no other option. It would 
indeed be a pretty broken spec if there was no way to bootstrap with it 
even under ideal conditions..




Also, there are several devices that don't show up in the MMCFG things, or 
just otherwise get it wrong.


So just take a look at arch/x86/pci/mmconfig-shared.c and look for 
conf1.


Really. Damn, I'm nervous taking any MMCFG patches that has you as an 
author, if you aren't even aware of these kinds of fundamnetal issues. You 
probably read the standards about how things are supposed to work, and 
then just believed them?


Rule #1 in kernel programming: don't *ever* think that things actually 
work the way they are documented to work. The documentation is a starting 
point, nothing else. 

And please be defensive in programming. We *know* conf1 cycles work. The 
hardware has been extensively tested, and there are no firmware 
interactions. There is *zero* reasons to use MMCONF cycles for normal 
devices. Ergo: switching over to MMCONF when not needed is stupid and 
fragile.


I can't really disagree that MMCONFIG doesn't have great advantages for 
most devices (though it likely is faster on a lot of platforms, which 
may be significant if the device does lots of config space accesses). So 
for the moment, avoiding using it except where necessary will likely 
work out (except if some system does indeed puke on mixing type1 and 
MMCONFIG).


However, what Microsoft is doing with Vista may eventually make a 
difference in the future. Many hardware vendors seem to use the testing 
strategy of test with latest Windows version. Works OK? Ship it. If 
Vista decides that MMCONFIG is good to use all the time, then type1 
access support is likely going to a) end up less tested and b) probably 
deleted entirely in time. We've seen it before - it used to be that not 
using ACPI was the safe option on most hardware with Linux. Now you 
pretty much have to use it because the manufacturers only test with it 
enabled. I've seen at least one board where the interrupt routing was 
completely broken with ACPI off, because they obviously only tested in 
Windows..


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-30 Thread Jesse Barnes
On Tuesday, October 30, 2007 4:59 pm Linus Torvalds wrote:
 Also, there are several devices that don't show up in the MMCFG
 things, or just otherwise get it wrong.

 So just take a look at arch/x86/pci/mmconfig-shared.c and look for
 conf1.

 Really. Damn, I'm nervous taking any MMCFG patches that has you as an
 author, if you aren't even aware of these kinds of fundamnetal
 issues. You probably read the standards about how things are
 supposed to work, and then just believed them?

We have a few bits of code in there to look at the actual MMCONFIG base 
register, which is good since we can sanity check it against the 
firmware.  However, by itself it's not enough since we may end up using 
it even though the BIOS didn't tell us about an overlap that really 
does exist.  Robert recognized this and added checks to make sure the 
values read from the base regs actually matched what the firmware was 
telling us in the ACPI tables (at least iirc, it's been awhile since I 
looked at the patch).

So don't be too nervous. :)

Jesse
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-29 Thread Robert Hancock

Greg KH wrote:

On Fri, Oct 26, 2007 at 09:59:45AM -0700, Jesse Barnes wrote:

On Thursday, October 25, 2007 7:54 pm Greg KH wrote:

On Thu, Oct 25, 2007 at 04:22:35PM -0700, Jesse Barnes wrote:

I think Greg doesn't like it, even though we don't have an
alternative at this point...

Yes, I didn't like it, Ivan didn't like it, and I got reports that it
wasn't even needed at all once you upgraded your BIOS to the latest
version.

So, is this still needed?  And if so, can you try to implement what
Ivan suggested to do here instead?
Yes, it's still needed.  Auke rescinded his "BIOS upgrade makes it work" 
message, so something like this is still necessary.


He did?  Ugh, I can't keep these all straight, sorry.

Can someone just send what they think is still needed, and explain why
Ivan will not object to it?  :)


Here's a recap of the whole issue just for people's information:

Right now we disable MMCONFIG on machines where the MCFG area is not 
reserved in the E820 memory map since we figure it's not valid. This is 
a broken heuristic because the PCI Express firmware spec doesn't require 
that it be so reserved, it only needs to be reserved as an ACPI 
motherboard resource, and so many times it's not reserved in E820 
despite being completely valid and working. The 
mmconfig-validate-against-acpi-motherboard-resources.patch changes this 
to validate against the ACPI motherboard resources instead.


The second problem is that on some machines, when we are doing BAR 
sizing on PCI devices, and write all ones to a BAR in order to determine 
how many bits "stick", the BAR ends up overlapping with the MCFG area. 
On some chipsets, this causes writes to the MCFG area (like, to restore 
the original BAR contents) to get decoded by the device instead of by 
the MCFG mechanism, which means the BAR stays disabled and configuration 
access stops working, wreaking havoc. Usually on these machines the 
MMCONFIG is located near the top of 32-bit memory and the PCI device 
causing problems is a PCI Express graphics card. 
pci-disable-decode-of-io-memory-during-bar-sizing.patch, and its 
successors, switch off the device's decoding during sizing so that it 
won't absorb the accesses to the MCFG table.


The concern raised was that this might affect some devices negatively. 
We do avoid disabling decode on host bridges, as it's known that some of 
them disable RAM access when you turn decode off, stupidly. I've yet to 
hear of any other conclusive case where disabling the decode is harmful. 
In general, if disabling the decode causes issues, the mere fact of 
doing the BAR sizing could cause the same issues, and that is unavoidable.


The other possible workaround would be to avoid using MMCONFIG until the 
BAR sizing is done. However, this seems like a poor solution. First of 
all, in the future there may come machines where MMCONFIG is the only 
config mechanism (or, perhaps more likely, it becomes the only tested 
one, so the old methods get broken). Secondly, what happens with 
hot-plug devices that need to be sized after MMCONFIG gets turned on?


The only way these two patches are related is that the E820 check 
happens to wrongly disable MMCONFIG on some of the machines where the 
memory areas could overlap during sizing, so removing that check alone 
without fixing the overlap issue could cause breakage on some machines. 
However, this is purely by chance, and it doesn't prevent the breakage 
on many other machines - as well as the one mentioned in the earlier 
thread, there's this one:


https://bugzilla.redhat.com/show_bug.cgi?id=251493

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-29 Thread Robert Hancock

Greg KH wrote:

On Fri, Oct 26, 2007 at 09:59:45AM -0700, Jesse Barnes wrote:

On Thursday, October 25, 2007 7:54 pm Greg KH wrote:

On Thu, Oct 25, 2007 at 04:22:35PM -0700, Jesse Barnes wrote:

I think Greg doesn't like it, even though we don't have an
alternative at this point...

Yes, I didn't like it, Ivan didn't like it, and I got reports that it
wasn't even needed at all once you upgraded your BIOS to the latest
version.

So, is this still needed?  And if so, can you try to implement what
Ivan suggested to do here instead?
Yes, it's still needed.  Auke rescinded his BIOS upgrade makes it work 
message, so something like this is still necessary.


He did?  Ugh, I can't keep these all straight, sorry.

Can someone just send what they think is still needed, and explain why
Ivan will not object to it?  :)


Here's a recap of the whole issue just for people's information:

Right now we disable MMCONFIG on machines where the MCFG area is not 
reserved in the E820 memory map since we figure it's not valid. This is 
a broken heuristic because the PCI Express firmware spec doesn't require 
that it be so reserved, it only needs to be reserved as an ACPI 
motherboard resource, and so many times it's not reserved in E820 
despite being completely valid and working. The 
mmconfig-validate-against-acpi-motherboard-resources.patch changes this 
to validate against the ACPI motherboard resources instead.


The second problem is that on some machines, when we are doing BAR 
sizing on PCI devices, and write all ones to a BAR in order to determine 
how many bits stick, the BAR ends up overlapping with the MCFG area. 
On some chipsets, this causes writes to the MCFG area (like, to restore 
the original BAR contents) to get decoded by the device instead of by 
the MCFG mechanism, which means the BAR stays disabled and configuration 
access stops working, wreaking havoc. Usually on these machines the 
MMCONFIG is located near the top of 32-bit memory and the PCI device 
causing problems is a PCI Express graphics card. 
pci-disable-decode-of-io-memory-during-bar-sizing.patch, and its 
successors, switch off the device's decoding during sizing so that it 
won't absorb the accesses to the MCFG table.


The concern raised was that this might affect some devices negatively. 
We do avoid disabling decode on host bridges, as it's known that some of 
them disable RAM access when you turn decode off, stupidly. I've yet to 
hear of any other conclusive case where disabling the decode is harmful. 
In general, if disabling the decode causes issues, the mere fact of 
doing the BAR sizing could cause the same issues, and that is unavoidable.


The other possible workaround would be to avoid using MMCONFIG until the 
BAR sizing is done. However, this seems like a poor solution. First of 
all, in the future there may come machines where MMCONFIG is the only 
config mechanism (or, perhaps more likely, it becomes the only tested 
one, so the old methods get broken). Secondly, what happens with 
hot-plug devices that need to be sized after MMCONFIG gets turned on?


The only way these two patches are related is that the E820 check 
happens to wrongly disable MMCONFIG on some of the machines where the 
memory areas could overlap during sizing, so removing that check alone 
without fixing the overlap issue could cause breakage on some machines. 
However, this is purely by chance, and it doesn't prevent the breakage 
on many other machines - as well as the one mentioned in the earlier 
thread, there's this one:


https://bugzilla.redhat.com/show_bug.cgi?id=251493

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-26 Thread Greg KH
On Fri, Oct 26, 2007 at 09:59:45AM -0700, Jesse Barnes wrote:
> On Thursday, October 25, 2007 7:54 pm Greg KH wrote:
> > On Thu, Oct 25, 2007 at 04:22:35PM -0700, Jesse Barnes wrote:
> > > I think Greg doesn't like it, even though we don't have an
> > > alternative at this point...
> >
> > Yes, I didn't like it, Ivan didn't like it, and I got reports that it
> > wasn't even needed at all once you upgraded your BIOS to the latest
> > version.
> >
> > So, is this still needed?  And if so, can you try to implement what
> > Ivan suggested to do here instead?
> 
> Yes, it's still needed.  Auke rescinded his "BIOS upgrade makes it work" 
> message, so something like this is still necessary.

He did?  Ugh, I can't keep these all straight, sorry.

Can someone just send what they think is still needed, and explain why
Ivan will not object to it?  :)

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-26 Thread Jesse Barnes
On Thursday, October 25, 2007 7:54 pm Greg KH wrote:
> On Thu, Oct 25, 2007 at 04:22:35PM -0700, Jesse Barnes wrote:
> > I think Greg doesn't like it, even though we don't have an
> > alternative at this point...
>
> Yes, I didn't like it, Ivan didn't like it, and I got reports that it
> wasn't even needed at all once you upgraded your BIOS to the latest
> version.
>
> So, is this still needed?  And if so, can you try to implement what
> Ivan suggested to do here instead?

Yes, it's still needed.  Auke rescinded his "BIOS upgrade makes it work" 
message, so something like this is still necessary.

Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-26 Thread Jesse Barnes
On Thursday, October 25, 2007 7:54 pm Greg KH wrote:
 On Thu, Oct 25, 2007 at 04:22:35PM -0700, Jesse Barnes wrote:
  I think Greg doesn't like it, even though we don't have an
  alternative at this point...

 Yes, I didn't like it, Ivan didn't like it, and I got reports that it
 wasn't even needed at all once you upgraded your BIOS to the latest
 version.

 So, is this still needed?  And if so, can you try to implement what
 Ivan suggested to do here instead?

Yes, it's still needed.  Auke rescinded his BIOS upgrade makes it work 
message, so something like this is still necessary.

Jesse
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: pci-disable-decode-of-io-memory-during-bar-sizing.patch

2007-10-26 Thread Greg KH
On Fri, Oct 26, 2007 at 09:59:45AM -0700, Jesse Barnes wrote:
 On Thursday, October 25, 2007 7:54 pm Greg KH wrote:
  On Thu, Oct 25, 2007 at 04:22:35PM -0700, Jesse Barnes wrote:
   I think Greg doesn't like it, even though we don't have an
   alternative at this point...
 
  Yes, I didn't like it, Ivan didn't like it, and I got reports that it
  wasn't even needed at all once you upgraded your BIOS to the latest
  version.
 
  So, is this still needed?  And if so, can you try to implement what
  Ivan suggested to do here instead?
 
 Yes, it's still needed.  Auke rescinded his BIOS upgrade makes it work 
 message, so something like this is still necessary.

He did?  Ugh, I can't keep these all straight, sorry.

Can someone just send what they think is still needed, and explain why
Ivan will not object to it?  :)

thanks,

greg k-h
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/