Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-29 Thread Bjorn Helgaas
On Wed, Nov 23, 2016 at 03:23:35AM +, Zheng, Lv wrote:
> Hi, Bjorn
> 
> Thanks for the documentation.
> It really helps!
> 
> However I have a question below.
> 
> > From: linux-acpi-ow...@vger.kernel.org 
> > [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Bjorn
> > Helgaas
> > Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> > 
> > Signed-off-by: Bjorn Helgaas 
> > ---
> >  Documentation/PCI/00-INDEX  |2 +
> >  Documentation/PCI/acpi-info.txt |  136 
> > +++
> >  2 files changed, 138 insertions(+)
> >  create mode 100644 Documentation/PCI/acpi-info.txt
> > 
> > diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> > index 147231f..0780280 100644
> > --- a/Documentation/PCI/00-INDEX
> > +++ b/Documentation/PCI/00-INDEX
> > @@ -1,5 +1,7 @@
> >  00-INDEX
> > - this file
> > +acpi-info.txt
> > +   - info on how PCI host bridges are represented in ACPI
> >  MSI-HOWTO.txt
> > - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
> >  PCIEBUS-HOWTO.txt
> > diff --git a/Documentation/PCI/acpi-info.txt 
> > b/Documentation/PCI/acpi-info.txt
> > new file mode 100644
> > index 000..ccbcfda
> > --- /dev/null
> > +++ b/Documentation/PCI/acpi-info.txt
> > @@ -0,0 +1,136 @@
> > +   ACPI considerations for PCI host bridges
> > +
> > +The basic requirement is that the ACPI namespace should describe
> > +*everything* that consumes address space unless there's another
> > +standard way for the OS to find it [1, 2].  For example, windows that
> > +are forwarded to PCI by a PCI host bridge should be described via ACPI
> > +devices, since the OS can't locate the host bridge by itself.  PCI
> > +devices *below* the host bridge do not need to be described via ACPI,
> > +because the resources they consume are inside the host bridge windows,
> > +and the OS can discover them via the standard PCI enumeration
> > +mechanism (using config accesses to read and size the BARs).
> > +
> > +This ACPI resource description is done via _CRS methods of devices in
> > +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> > +the OS can read _CRS and figure out what resource is being consumed
> > +even if it doesn't have a driver for the device [3].  That's important
> > +because it means an old OS can work correctly even on a system with
> > +new devices unknown to the OS.  The new devices won't do anything, but
> > +the OS can at least make sure no resources conflict with them.
> > +
> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space!  The static tables are for things the OS
> > +needs to know early in boot, before it can parse the ACPI namespace.
> > +If a new table is defined, an old OS needs to operate correctly even
> > +though it ignores the table.  _CRS allows that because it is generic
> > +and understood by the old OS; a static table does not.
> 
> The entire document doesn't talk about the details of _CBA.
> There is only one line below mentioned _CBA as an example.

Yes, that's a good point.  I'll add some more details about MCFG
and _CBA.

Bjorn


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-29 Thread Bjorn Helgaas
On Wed, Nov 23, 2016 at 03:23:35AM +, Zheng, Lv wrote:
> Hi, Bjorn
> 
> Thanks for the documentation.
> It really helps!
> 
> However I have a question below.
> 
> > From: linux-acpi-ow...@vger.kernel.org 
> > [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Bjorn
> > Helgaas
> > Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> > 
> > Signed-off-by: Bjorn Helgaas 
> > ---
> >  Documentation/PCI/00-INDEX  |2 +
> >  Documentation/PCI/acpi-info.txt |  136 
> > +++
> >  2 files changed, 138 insertions(+)
> >  create mode 100644 Documentation/PCI/acpi-info.txt
> > 
> > diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> > index 147231f..0780280 100644
> > --- a/Documentation/PCI/00-INDEX
> > +++ b/Documentation/PCI/00-INDEX
> > @@ -1,5 +1,7 @@
> >  00-INDEX
> > - this file
> > +acpi-info.txt
> > +   - info on how PCI host bridges are represented in ACPI
> >  MSI-HOWTO.txt
> > - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
> >  PCIEBUS-HOWTO.txt
> > diff --git a/Documentation/PCI/acpi-info.txt 
> > b/Documentation/PCI/acpi-info.txt
> > new file mode 100644
> > index 000..ccbcfda
> > --- /dev/null
> > +++ b/Documentation/PCI/acpi-info.txt
> > @@ -0,0 +1,136 @@
> > +   ACPI considerations for PCI host bridges
> > +
> > +The basic requirement is that the ACPI namespace should describe
> > +*everything* that consumes address space unless there's another
> > +standard way for the OS to find it [1, 2].  For example, windows that
> > +are forwarded to PCI by a PCI host bridge should be described via ACPI
> > +devices, since the OS can't locate the host bridge by itself.  PCI
> > +devices *below* the host bridge do not need to be described via ACPI,
> > +because the resources they consume are inside the host bridge windows,
> > +and the OS can discover them via the standard PCI enumeration
> > +mechanism (using config accesses to read and size the BARs).
> > +
> > +This ACPI resource description is done via _CRS methods of devices in
> > +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> > +the OS can read _CRS and figure out what resource is being consumed
> > +even if it doesn't have a driver for the device [3].  That's important
> > +because it means an old OS can work correctly even on a system with
> > +new devices unknown to the OS.  The new devices won't do anything, but
> > +the OS can at least make sure no resources conflict with them.
> > +
> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space!  The static tables are for things the OS
> > +needs to know early in boot, before it can parse the ACPI namespace.
> > +If a new table is defined, an old OS needs to operate correctly even
> > +though it ignores the table.  _CRS allows that because it is generic
> > +and understood by the old OS; a static table does not.
> 
> The entire document doesn't talk about the details of _CBA.
> There is only one line below mentioned _CBA as an example.

Yes, that's a good point.  I'll add some more details about MCFG
and _CBA.

Bjorn


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-29 Thread Bjorn Helgaas
On Wed, Nov 23, 2016 at 09:06:33AM -0600, Bjorn Helgaas wrote:
> On Wed, Nov 23, 2016 at 07:28:12AM +, Ard Biesheuvel wrote:
> > On 23 November 2016 at 01:06, Bjorn Helgaas  wrote:
> > > On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
> > >> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
> > >
> > >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > >> > +describe all the address space they consume.  In principle, this would
> > >> > +be all the windows they forward down to the PCI bus, as well as the
> > >> > +bridge registers themselves.  The bridge registers include things like
> > >> > +secondary/subordinate bus registers that determine the bus range below
> > >> > +the bridge, window registers that describe the apertures, etc.  These
> > >> > +are all device-specific, non-architected things, so the only way a
> > >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > >> > +contain the device-specific details.  These bridge registers also
> > >> > +include ECAM space, since it is consumed by the bridge.
> > >> > +
> > >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > >> > +the bridge apertures from the bridge registers [4, 5].  However,
> > >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > >> > +device itself.
> > >>
> > >> Is that universally true? Or is it still possible to do the right
> > >> thing here on new ACPI architectures such as arm64?
> > >
> > > That's a very good question.  I had thought that the ACPI spec had
> > > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > > spec, the Consumer/Producer bit is still documented in the Extended
> > > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> > >
> > > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > > I think is used for all these descriptors (QWord, DWord, Word, and
> > > Extended).  This doesn't quite follow the spec -- we probably should
> > > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > > IORESOURCE_WINDOW in the PCI host bridge code.
> > >
> > > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > > and looks at producer_consumer.  Then they do a little arch-specific
> > > stuff on the result.
> > >
> > > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > > arch-specific stuff.
> > >
> > > On all three arches, we ignore the Consumer/Producer bit, so all the
> > > resources are treated as Producers, e.g., as bridge windows.
> > >
> > > I think we *could* implement an arm64 version of
> > > pci_acpi_root_prepare_resources() that would pay attention to the
> > > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > > compliant, we would have to use Extended descriptors for all bridge
> > > windows, even if they would fit in a DWord or QWord.
> > >
> > > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> > >
> > 
> > Yes, I think we should. If the spec allows for a way for a PNP0A03
> > device to describe all of its resources unambiguously, we should not
> > be relying on workarounds that were designed for another architecture
> > in another decade (for, presumably, another OS)
> > 
> > Just for my understanding, we will need to use extended descriptors
> > for all consumed *and* produced regions, even though dword/qword are
> > implicitly produced-only, due to the fact that the bit is ignored?
> 
> From an ACPI spec point of view, I would say QWord/DWord/Word
> descriptors are implicitly *consumer*-only because ResourceConsumer
> is the default and they don't have a bit to indicate otherwise.
> 
> The current code assumes all PNP0A03 resources are producers.  If we
> implement an arm64 pci_acpi_root_prepare_resources() that pays
> attention to the Consumer/Producer bit, we would have to:
> 
>   - Reserve all producer regions in the iomem/ioport trees.  This is
> already done via pci_acpi_root_add_resources(), but we might need
> a new check to handle consumers differently.
> 
>   - Reserve all consumer regions.  This corresponds to what
> pnp/system.c does for PNP0C02 devices.  This is similar to the
> producer regions, but I think the consumer ones should be marked
> IORESOURCE_BUSY.
> 
>   - Use every producer (IORESOURCE_WINDOW) as a host bridge window.
> 
> I think it's a bug that acpi_decode_space() looks at producer_consumer
> for QWord/DWord/Word descriptors, but I think QWord/DWord/Word
> descriptors for 

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-29 Thread Bjorn Helgaas
On Wed, Nov 23, 2016 at 09:06:33AM -0600, Bjorn Helgaas wrote:
> On Wed, Nov 23, 2016 at 07:28:12AM +, Ard Biesheuvel wrote:
> > On 23 November 2016 at 01:06, Bjorn Helgaas  wrote:
> > > On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
> > >> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
> > >
> > >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > >> > +describe all the address space they consume.  In principle, this would
> > >> > +be all the windows they forward down to the PCI bus, as well as the
> > >> > +bridge registers themselves.  The bridge registers include things like
> > >> > +secondary/subordinate bus registers that determine the bus range below
> > >> > +the bridge, window registers that describe the apertures, etc.  These
> > >> > +are all device-specific, non-architected things, so the only way a
> > >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > >> > +contain the device-specific details.  These bridge registers also
> > >> > +include ECAM space, since it is consumed by the bridge.
> > >> > +
> > >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > >> > +the bridge apertures from the bridge registers [4, 5].  However,
> > >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > >> > +device itself.
> > >>
> > >> Is that universally true? Or is it still possible to do the right
> > >> thing here on new ACPI architectures such as arm64?
> > >
> > > That's a very good question.  I had thought that the ACPI spec had
> > > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > > spec, the Consumer/Producer bit is still documented in the Extended
> > > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> > >
> > > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > > I think is used for all these descriptors (QWord, DWord, Word, and
> > > Extended).  This doesn't quite follow the spec -- we probably should
> > > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > > IORESOURCE_WINDOW in the PCI host bridge code.
> > >
> > > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > > and looks at producer_consumer.  Then they do a little arch-specific
> > > stuff on the result.
> > >
> > > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > > arch-specific stuff.
> > >
> > > On all three arches, we ignore the Consumer/Producer bit, so all the
> > > resources are treated as Producers, e.g., as bridge windows.
> > >
> > > I think we *could* implement an arm64 version of
> > > pci_acpi_root_prepare_resources() that would pay attention to the
> > > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > > compliant, we would have to use Extended descriptors for all bridge
> > > windows, even if they would fit in a DWord or QWord.
> > >
> > > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> > >
> > 
> > Yes, I think we should. If the spec allows for a way for a PNP0A03
> > device to describe all of its resources unambiguously, we should not
> > be relying on workarounds that were designed for another architecture
> > in another decade (for, presumably, another OS)
> > 
> > Just for my understanding, we will need to use extended descriptors
> > for all consumed *and* produced regions, even though dword/qword are
> > implicitly produced-only, due to the fact that the bit is ignored?
> 
> From an ACPI spec point of view, I would say QWord/DWord/Word
> descriptors are implicitly *consumer*-only because ResourceConsumer
> is the default and they don't have a bit to indicate otherwise.
> 
> The current code assumes all PNP0A03 resources are producers.  If we
> implement an arm64 pci_acpi_root_prepare_resources() that pays
> attention to the Consumer/Producer bit, we would have to:
> 
>   - Reserve all producer regions in the iomem/ioport trees.  This is
> already done via pci_acpi_root_add_resources(), but we might need
> a new check to handle consumers differently.
> 
>   - Reserve all consumer regions.  This corresponds to what
> pnp/system.c does for PNP0C02 devices.  This is similar to the
> producer regions, but I think the consumer ones should be marked
> IORESOURCE_BUSY.
> 
>   - Use every producer (IORESOURCE_WINDOW) as a host bridge window.
> 
> I think it's a bug that acpi_decode_space() looks at producer_consumer
> for QWord/DWord/Word descriptors, but I think QWord/DWord/Word
> descriptors for consumed regions should be safe, as long 

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-23 Thread Bjorn Helgaas
On Wed, Nov 23, 2016 at 07:28:12AM +, Ard Biesheuvel wrote:
> On 23 November 2016 at 01:06, Bjorn Helgaas  wrote:
> > On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
> >> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
> >
> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> >> > +describe all the address space they consume.  In principle, this would
> >> > +be all the windows they forward down to the PCI bus, as well as the
> >> > +bridge registers themselves.  The bridge registers include things like
> >> > +secondary/subordinate bus registers that determine the bus range below
> >> > +the bridge, window registers that describe the apertures, etc.  These
> >> > +are all device-specific, non-architected things, so the only way a
> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> >> > +contain the device-specific details.  These bridge registers also
> >> > +include ECAM space, since it is consumed by the bridge.
> >> > +
> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> >> > +the bridge apertures from the bridge registers [4, 5].  However,
> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> >> > +device itself.
> >>
> >> Is that universally true? Or is it still possible to do the right
> >> thing here on new ACPI architectures such as arm64?
> >
> > That's a very good question.  I had thought that the ACPI spec had
> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > spec, the Consumer/Producer bit is still documented in the Extended
> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> >
> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > I think is used for all these descriptors (QWord, DWord, Word, and
> > Extended).  This doesn't quite follow the spec -- we probably should
> > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > IORESOURCE_WINDOW in the PCI host bridge code.
> >
> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > and looks at producer_consumer.  Then they do a little arch-specific
> > stuff on the result.
> >
> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > arch-specific stuff.
> >
> > On all three arches, we ignore the Consumer/Producer bit, so all the
> > resources are treated as Producers, e.g., as bridge windows.
> >
> > I think we *could* implement an arm64 version of
> > pci_acpi_root_prepare_resources() that would pay attention to the
> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > compliant, we would have to use Extended descriptors for all bridge
> > windows, even if they would fit in a DWord or QWord.
> >
> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> >
> 
> Yes, I think we should. If the spec allows for a way for a PNP0A03
> device to describe all of its resources unambiguously, we should not
> be relying on workarounds that were designed for another architecture
> in another decade (for, presumably, another OS)
> 
> Just for my understanding, we will need to use extended descriptors
> for all consumed *and* produced regions, even though dword/qword are
> implicitly produced-only, due to the fact that the bit is ignored?

>From an ACPI spec point of view, I would say QWord/DWord/Word
descriptors are implicitly *consumer*-only because ResourceConsumer
is the default and they don't have a bit to indicate otherwise.

The current code assumes all PNP0A03 resources are producers.  If we
implement an arm64 pci_acpi_root_prepare_resources() that pays
attention to the Consumer/Producer bit, we would have to:

  - Reserve all producer regions in the iomem/ioport trees.  This is
already done via pci_acpi_root_add_resources(), but we might need
a new check to handle consumers differently.

  - Reserve all consumer regions.  This corresponds to what
pnp/system.c does for PNP0C02 devices.  This is similar to the
producer regions, but I think the consumer ones should be marked
IORESOURCE_BUSY.

  - Use every producer (IORESOURCE_WINDOW) as a host bridge window.

I think it's a bug that acpi_decode_space() looks at producer_consumer
for QWord/DWord/Word descriptors, but I think QWord/DWord/Word
descriptors for consumed regions should be safe, as long as they don't
set the Consumer/Producer bit.


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-23 Thread Bjorn Helgaas
On Wed, Nov 23, 2016 at 07:28:12AM +, Ard Biesheuvel wrote:
> On 23 November 2016 at 01:06, Bjorn Helgaas  wrote:
> > On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
> >> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
> >
> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> >> > +describe all the address space they consume.  In principle, this would
> >> > +be all the windows they forward down to the PCI bus, as well as the
> >> > +bridge registers themselves.  The bridge registers include things like
> >> > +secondary/subordinate bus registers that determine the bus range below
> >> > +the bridge, window registers that describe the apertures, etc.  These
> >> > +are all device-specific, non-architected things, so the only way a
> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> >> > +contain the device-specific details.  These bridge registers also
> >> > +include ECAM space, since it is consumed by the bridge.
> >> > +
> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> >> > +the bridge apertures from the bridge registers [4, 5].  However,
> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> >> > +device itself.
> >>
> >> Is that universally true? Or is it still possible to do the right
> >> thing here on new ACPI architectures such as arm64?
> >
> > That's a very good question.  I had thought that the ACPI spec had
> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > spec, the Consumer/Producer bit is still documented in the Extended
> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> >
> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > I think is used for all these descriptors (QWord, DWord, Word, and
> > Extended).  This doesn't quite follow the spec -- we probably should
> > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > IORESOURCE_WINDOW in the PCI host bridge code.
> >
> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > and looks at producer_consumer.  Then they do a little arch-specific
> > stuff on the result.
> >
> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > arch-specific stuff.
> >
> > On all three arches, we ignore the Consumer/Producer bit, so all the
> > resources are treated as Producers, e.g., as bridge windows.
> >
> > I think we *could* implement an arm64 version of
> > pci_acpi_root_prepare_resources() that would pay attention to the
> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > compliant, we would have to use Extended descriptors for all bridge
> > windows, even if they would fit in a DWord or QWord.
> >
> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> >
> 
> Yes, I think we should. If the spec allows for a way for a PNP0A03
> device to describe all of its resources unambiguously, we should not
> be relying on workarounds that were designed for another architecture
> in another decade (for, presumably, another OS)
> 
> Just for my understanding, we will need to use extended descriptors
> for all consumed *and* produced regions, even though dword/qword are
> implicitly produced-only, due to the fact that the bit is ignored?

>From an ACPI spec point of view, I would say QWord/DWord/Word
descriptors are implicitly *consumer*-only because ResourceConsumer
is the default and they don't have a bit to indicate otherwise.

The current code assumes all PNP0A03 resources are producers.  If we
implement an arm64 pci_acpi_root_prepare_resources() that pays
attention to the Consumer/Producer bit, we would have to:

  - Reserve all producer regions in the iomem/ioport trees.  This is
already done via pci_acpi_root_add_resources(), but we might need
a new check to handle consumers differently.

  - Reserve all consumer regions.  This corresponds to what
pnp/system.c does for PNP0C02 devices.  This is similar to the
producer regions, but I think the consumer ones should be marked
IORESOURCE_BUSY.

  - Use every producer (IORESOURCE_WINDOW) as a host bridge window.

I think it's a bug that acpi_decode_space() looks at producer_consumer
for QWord/DWord/Word descriptors, but I think QWord/DWord/Word
descriptors for consumed regions should be safe, as long as they don't
set the Consumer/Producer bit.


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-23 Thread Lorenzo Pieralisi
On Wed, Nov 23, 2016 at 07:28:12AM +, Ard Biesheuvel wrote:
> On 23 November 2016 at 01:06, Bjorn Helgaas  wrote:
> > On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
> >> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
> >
> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> >> > +describe all the address space they consume.  In principle, this would
> >> > +be all the windows they forward down to the PCI bus, as well as the
> >> > +bridge registers themselves.  The bridge registers include things like
> >> > +secondary/subordinate bus registers that determine the bus range below
> >> > +the bridge, window registers that describe the apertures, etc.  These
> >> > +are all device-specific, non-architected things, so the only way a
> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> >> > +contain the device-specific details.  These bridge registers also
> >> > +include ECAM space, since it is consumed by the bridge.
> >> > +
> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> >> > +the bridge apertures from the bridge registers [4, 5].  However,
> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> >> > +device itself.
> >>
> >> Is that universally true? Or is it still possible to do the right
> >> thing here on new ACPI architectures such as arm64?
> >
> > That's a very good question.  I had thought that the ACPI spec had
> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > spec, the Consumer/Producer bit is still documented in the Extended
> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> >
> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > I think is used for all these descriptors (QWord, DWord, Word, and
> > Extended).  This doesn't quite follow the spec -- we probably should
> > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > IORESOURCE_WINDOW in the PCI host bridge code.
> >
> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > and looks at producer_consumer.  Then they do a little arch-specific
> > stuff on the result.
> >
> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > arch-specific stuff.
> >
> > On all three arches, we ignore the Consumer/Producer bit, so all the
> > resources are treated as Producers, e.g., as bridge windows.
> >
> > I think we *could* implement an arm64 version of
> > pci_acpi_root_prepare_resources() that would pay attention to the
> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > compliant, we would have to use Extended descriptors for all bridge
> > windows, even if they would fit in a DWord or QWord.
> >
> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> >
> 
> Yes, I think we should. If the spec allows for a way for a PNP0A03
> device to describe all of its resources unambiguously, we should not
> be relying on workarounds that were designed for another architecture
> in another decade (for, presumably, another OS)

That was the idea I floated at LPC16. We can override the
acpi_pci_root_ops prepare_resources() function pointer with a function
that checks IORESOURCE_WINDOW and filters resources accordingly (and
specific quirk "drivers" may know how to intepret resources that aren't
IORESOURCE_WINDOW - ie they can use it to describe the PCI ECAM config
space quirk region in their _CRS).

In a way that's something that makes sense anyway because given
that we are starting from a clean slate on ARM64 considering resources
that are not IORESOURCE_WINDOW as host bridge windows is just something
we are inheriting from x86, it is not really ACPI specs compliant (is
it ?).

> Just for my understanding, we will need to use extended descriptors
> for all consumed *and* produced regions, even though dword/qword are
> implicitly produced-only, due to the fact that the bit is ignored?

That's something that has to be clarified within the ASWG ie why the
consumer bit is ignored for *some* descriptors and not for others.

As things stand unfortunately the answer seems yes (I do not know
why).

> > It *would* be nice to have bridge registers in the bridge _CRS.  That
> > would eliminate the need for looking up the HISI0081/PNP0C02 devices
> > to find the bridge registers.  Avoiding that lookup is only a
> > temporary advantage -- the next round of bridges are supposed to fully
> > implement ECAM, and then we won't need to know where the registers
> > are.
> >
> > Apart 

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-23 Thread Lorenzo Pieralisi
On Wed, Nov 23, 2016 at 07:28:12AM +, Ard Biesheuvel wrote:
> On 23 November 2016 at 01:06, Bjorn Helgaas  wrote:
> > On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
> >> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
> >
> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> >> > +describe all the address space they consume.  In principle, this would
> >> > +be all the windows they forward down to the PCI bus, as well as the
> >> > +bridge registers themselves.  The bridge registers include things like
> >> > +secondary/subordinate bus registers that determine the bus range below
> >> > +the bridge, window registers that describe the apertures, etc.  These
> >> > +are all device-specific, non-architected things, so the only way a
> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> >> > +contain the device-specific details.  These bridge registers also
> >> > +include ECAM space, since it is consumed by the bridge.
> >> > +
> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> >> > +the bridge apertures from the bridge registers [4, 5].  However,
> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> >> > +device itself.
> >>
> >> Is that universally true? Or is it still possible to do the right
> >> thing here on new ACPI architectures such as arm64?
> >
> > That's a very good question.  I had thought that the ACPI spec had
> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > spec, the Consumer/Producer bit is still documented in the Extended
> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> >
> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > I think is used for all these descriptors (QWord, DWord, Word, and
> > Extended).  This doesn't quite follow the spec -- we probably should
> > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > IORESOURCE_WINDOW in the PCI host bridge code.
> >
> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > and looks at producer_consumer.  Then they do a little arch-specific
> > stuff on the result.
> >
> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > arch-specific stuff.
> >
> > On all three arches, we ignore the Consumer/Producer bit, so all the
> > resources are treated as Producers, e.g., as bridge windows.
> >
> > I think we *could* implement an arm64 version of
> > pci_acpi_root_prepare_resources() that would pay attention to the
> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > compliant, we would have to use Extended descriptors for all bridge
> > windows, even if they would fit in a DWord or QWord.
> >
> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> >
> 
> Yes, I think we should. If the spec allows for a way for a PNP0A03
> device to describe all of its resources unambiguously, we should not
> be relying on workarounds that were designed for another architecture
> in another decade (for, presumably, another OS)

That was the idea I floated at LPC16. We can override the
acpi_pci_root_ops prepare_resources() function pointer with a function
that checks IORESOURCE_WINDOW and filters resources accordingly (and
specific quirk "drivers" may know how to intepret resources that aren't
IORESOURCE_WINDOW - ie they can use it to describe the PCI ECAM config
space quirk region in their _CRS).

In a way that's something that makes sense anyway because given
that we are starting from a clean slate on ARM64 considering resources
that are not IORESOURCE_WINDOW as host bridge windows is just something
we are inheriting from x86, it is not really ACPI specs compliant (is
it ?).

> Just for my understanding, we will need to use extended descriptors
> for all consumed *and* produced regions, even though dword/qword are
> implicitly produced-only, due to the fact that the bit is ignored?

That's something that has to be clarified within the ASWG ie why the
consumer bit is ignored for *some* descriptors and not for others.

As things stand unfortunately the answer seems yes (I do not know
why).

> > It *would* be nice to have bridge registers in the bridge _CRS.  That
> > would eliminate the need for looking up the HISI0081/PNP0C02 devices
> > to find the bridge registers.  Avoiding that lookup is only a
> > temporary advantage -- the next round of bridges are supposed to fully
> > implement ECAM, and then we won't need to know where the registers
> > are.
> >
> > Apart from the lookup, there's still some 

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Ard Biesheuvel
On 23 November 2016 at 01:06, Bjorn Helgaas  wrote:
> On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
>> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
>
>> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
>> > +describe all the address space they consume.  In principle, this would
>> > +be all the windows they forward down to the PCI bus, as well as the
>> > +bridge registers themselves.  The bridge registers include things like
>> > +secondary/subordinate bus registers that determine the bus range below
>> > +the bridge, window registers that describe the apertures, etc.  These
>> > +are all device-specific, non-architected things, so the only way a
>> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
>> > +contain the device-specific details.  These bridge registers also
>> > +include ECAM space, since it is consumed by the bridge.
>> > +
>> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
>> > +the bridge apertures from the bridge registers [4, 5].  However,
>> > +BIOSes didn't use that bit correctly, and the result is that OSes have
>> > +to assume that everything in a PCI host bridge _CRS is a window.  That
>> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
>> > +device itself.
>>
>> Is that universally true? Or is it still possible to do the right
>> thing here on new ACPI architectures such as arm64?
>
> That's a very good question.  I had thought that the ACPI spec had
> given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> spec, the Consumer/Producer bit is still documented in the Extended
> Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
>
> Linux looks at the producer_consumer bit in acpi_decode_space(), which
> I think is used for all these descriptors (QWord, DWord, Word, and
> Extended).  This doesn't quite follow the spec -- we probably should
> ignore it except for Extended.  In any event, acpi_decode_space() sets
> IORESOURCE_WINDOW for Producer descriptors, but we don't test
> IORESOURCE_WINDOW in the PCI host bridge code.
>
> x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> functions that call acpi_pci_probe_root_resources(), which parses _CRS
> and looks at producer_consumer.  Then they do a little arch-specific
> stuff on the result.
>
> On arm64 we use acpi_pci_probe_root_resources() directly, with no
> arch-specific stuff.
>
> On all three arches, we ignore the Consumer/Producer bit, so all the
> resources are treated as Producers, e.g., as bridge windows.
>
> I think we *could* implement an arm64 version of
> pci_acpi_root_prepare_resources() that would pay attention to the
> Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> compliant, we would have to use Extended descriptors for all bridge
> windows, even if they would fit in a DWord or QWord.
>
> Should we do that?  I dunno.  I'd like to hear your opinion(s).
>

Yes, I think we should. If the spec allows for a way for a PNP0A03
device to describe all of its resources unambiguously, we should not
be relying on workarounds that were designed for another architecture
in another decade (for, presumably, another OS)

Just for my understanding, we will need to use extended descriptors
for all consumed *and* produced regions, even though dword/qword are
implicitly produced-only, due to the fact that the bit is ignored?

> It *would* be nice to have bridge registers in the bridge _CRS.  That
> would eliminate the need for looking up the HISI0081/PNP0C02 devices
> to find the bridge registers.  Avoiding that lookup is only a
> temporary advantage -- the next round of bridges are supposed to fully
> implement ECAM, and then we won't need to know where the registers
> are.
>
> Apart from the lookup, there's still some advantage in describing the
> registers in the PNP0A03 device instead of an unrelated PNP0C02
> device, because it makes /proc/iomem more accurate and potentially
> makes host bridge hotplug cleaner.  We would have to enhance the host
> bridge driver to do the reservations currently done by pnp/system.c.
>
> There's some value in doing it the same way as on x86, even though
> that way is somewhat broken.
>
> Whatever we decide, I think it's very important to get it figured out
> ASAP because it affects the ECAM quirks that we're trying to merge in
> v4.10.
>

I agree. What exactly is the impact for the quirks mechanism as proposed?


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Ard Biesheuvel
On 23 November 2016 at 01:06, Bjorn Helgaas  wrote:
> On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
>> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
>
>> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
>> > +describe all the address space they consume.  In principle, this would
>> > +be all the windows they forward down to the PCI bus, as well as the
>> > +bridge registers themselves.  The bridge registers include things like
>> > +secondary/subordinate bus registers that determine the bus range below
>> > +the bridge, window registers that describe the apertures, etc.  These
>> > +are all device-specific, non-architected things, so the only way a
>> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
>> > +contain the device-specific details.  These bridge registers also
>> > +include ECAM space, since it is consumed by the bridge.
>> > +
>> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
>> > +the bridge apertures from the bridge registers [4, 5].  However,
>> > +BIOSes didn't use that bit correctly, and the result is that OSes have
>> > +to assume that everything in a PCI host bridge _CRS is a window.  That
>> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
>> > +device itself.
>>
>> Is that universally true? Or is it still possible to do the right
>> thing here on new ACPI architectures such as arm64?
>
> That's a very good question.  I had thought that the ACPI spec had
> given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> spec, the Consumer/Producer bit is still documented in the Extended
> Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
>
> Linux looks at the producer_consumer bit in acpi_decode_space(), which
> I think is used for all these descriptors (QWord, DWord, Word, and
> Extended).  This doesn't quite follow the spec -- we probably should
> ignore it except for Extended.  In any event, acpi_decode_space() sets
> IORESOURCE_WINDOW for Producer descriptors, but we don't test
> IORESOURCE_WINDOW in the PCI host bridge code.
>
> x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> functions that call acpi_pci_probe_root_resources(), which parses _CRS
> and looks at producer_consumer.  Then they do a little arch-specific
> stuff on the result.
>
> On arm64 we use acpi_pci_probe_root_resources() directly, with no
> arch-specific stuff.
>
> On all three arches, we ignore the Consumer/Producer bit, so all the
> resources are treated as Producers, e.g., as bridge windows.
>
> I think we *could* implement an arm64 version of
> pci_acpi_root_prepare_resources() that would pay attention to the
> Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> compliant, we would have to use Extended descriptors for all bridge
> windows, even if they would fit in a DWord or QWord.
>
> Should we do that?  I dunno.  I'd like to hear your opinion(s).
>

Yes, I think we should. If the spec allows for a way for a PNP0A03
device to describe all of its resources unambiguously, we should not
be relying on workarounds that were designed for another architecture
in another decade (for, presumably, another OS)

Just for my understanding, we will need to use extended descriptors
for all consumed *and* produced regions, even though dword/qword are
implicitly produced-only, due to the fact that the bit is ignored?

> It *would* be nice to have bridge registers in the bridge _CRS.  That
> would eliminate the need for looking up the HISI0081/PNP0C02 devices
> to find the bridge registers.  Avoiding that lookup is only a
> temporary advantage -- the next round of bridges are supposed to fully
> implement ECAM, and then we won't need to know where the registers
> are.
>
> Apart from the lookup, there's still some advantage in describing the
> registers in the PNP0A03 device instead of an unrelated PNP0C02
> device, because it makes /proc/iomem more accurate and potentially
> makes host bridge hotplug cleaner.  We would have to enhance the host
> bridge driver to do the reservations currently done by pnp/system.c.
>
> There's some value in doing it the same way as on x86, even though
> that way is somewhat broken.
>
> Whatever we decide, I think it's very important to get it figured out
> ASAP because it affects the ECAM quirks that we're trying to merge in
> v4.10.
>

I agree. What exactly is the impact for the quirks mechanism as proposed?


RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Zheng, Lv
Hi, Bjorn

Thanks for the documentation.
It really helps!

However I have a question below.

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Bjorn
> Helgaas
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas 
> ---
>  Documentation/PCI/00-INDEX  |2 +
>  Documentation/PCI/acpi-info.txt |  136 
> +++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>   - this file
> +acpi-info.txt
> + - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>   - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> + ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

The entire document doesn't talk about the details of _CBA.
There is only one line below mentioned _CBA as an example.

> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +
> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to 

RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Zheng, Lv
Hi, Bjorn

Thanks for the documentation.
It really helps!

However I have a question below.

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Bjorn
> Helgaas
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas 
> ---
>  Documentation/PCI/00-INDEX  |2 +
>  Documentation/PCI/acpi-info.txt |  136 
> +++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>   - this file
> +acpi-info.txt
> + - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>   - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> + ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

The entire document doesn't talk about the details of _CBA.
There is only one line below mentioned _CBA as an example.

> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +
> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> 

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Bjorn Helgaas
On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:

> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > +describe all the address space they consume.  In principle, this would
> > +be all the windows they forward down to the PCI bus, as well as the
> > +bridge registers themselves.  The bridge registers include things like
> > +secondary/subordinate bus registers that determine the bus range below
> > +the bridge, window registers that describe the apertures, etc.  These
> > +are all device-specific, non-architected things, so the only way a
> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > +contain the device-specific details.  These bridge registers also
> > +include ECAM space, since it is consumed by the bridge.
> > +
> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > +the bridge apertures from the bridge registers [4, 5].  However,
> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > +device itself.
> 
> Is that universally true? Or is it still possible to do the right
> thing here on new ACPI architectures such as arm64?

That's a very good question.  I had thought that the ACPI spec had
given up on Consumer/Producer completely, but I was wrong.  In the 6.0
spec, the Consumer/Producer bit is still documented in the Extended
Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
"ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).

Linux looks at the producer_consumer bit in acpi_decode_space(), which
I think is used for all these descriptors (QWord, DWord, Word, and
Extended).  This doesn't quite follow the spec -- we probably should
ignore it except for Extended.  In any event, acpi_decode_space() sets
IORESOURCE_WINDOW for Producer descriptors, but we don't test
IORESOURCE_WINDOW in the PCI host bridge code.

x86 and ia64 supply their own pci_acpi_root_prepare_resources()
functions that call acpi_pci_probe_root_resources(), which parses _CRS
and looks at producer_consumer.  Then they do a little arch-specific
stuff on the result.

On arm64 we use acpi_pci_probe_root_resources() directly, with no
arch-specific stuff.

On all three arches, we ignore the Consumer/Producer bit, so all the
resources are treated as Producers, e.g., as bridge windows.

I think we *could* implement an arm64 version of
pci_acpi_root_prepare_resources() that would pay attention to the
Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
compliant, we would have to use Extended descriptors for all bridge
windows, even if they would fit in a DWord or QWord.

Should we do that?  I dunno.  I'd like to hear your opinion(s).

It *would* be nice to have bridge registers in the bridge _CRS.  That
would eliminate the need for looking up the HISI0081/PNP0C02 devices
to find the bridge registers.  Avoiding that lookup is only a
temporary advantage -- the next round of bridges are supposed to fully
implement ECAM, and then we won't need to know where the registers
are.

Apart from the lookup, there's still some advantage in describing the
registers in the PNP0A03 device instead of an unrelated PNP0C02
device, because it makes /proc/iomem more accurate and potentially
makes host bridge hotplug cleaner.  We would have to enhance the host
bridge driver to do the reservations currently done by pnp/system.c.

There's some value in doing it the same way as on x86, even though
that way is somewhat broken.

Whatever we decide, I think it's very important to get it figured out
ASAP because it affects the ECAM quirks that we're trying to merge in
v4.10.

> > +The workaround is to describe the bridge registers (including ECAM
> > +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> > +the bridge register space is device-specific anyway, so the generic
> > +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> > +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> > +method.
> > +
> > +Note that the PCIe spec actually does require ECAM unless there's a
> > +standard firmware interface for config access, e.g., the ia64 SAL
> > +interface [7].  One reason is that we want a generic host bridge
> > +driver (pci_root.c), and a generic driver requires a generic way to
> > +access config space.
> > +
> > +
> > +[1] ACPI 6.0, sec 6.1:
> > +For any device that is on a non-enumerable type of bus (for
> > +example, an ISA bus), OSPM enumerates the devices' identifier(s)
> > +and the ACPI system firmware must supply an _HID object ... for
> > +each device to enable OSPM to do that.
> > +
> > +[2] ACPI 6.0, sec 3.7:
> > +The OS enumerates motherboard devices simply by reading through
> > +the ACPI 

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Bjorn Helgaas
On Tue, Nov 22, 2016 at 10:09:50AM +, Ard Biesheuvel wrote:
> On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:

> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > +describe all the address space they consume.  In principle, this would
> > +be all the windows they forward down to the PCI bus, as well as the
> > +bridge registers themselves.  The bridge registers include things like
> > +secondary/subordinate bus registers that determine the bus range below
> > +the bridge, window registers that describe the apertures, etc.  These
> > +are all device-specific, non-architected things, so the only way a
> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > +contain the device-specific details.  These bridge registers also
> > +include ECAM space, since it is consumed by the bridge.
> > +
> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > +the bridge apertures from the bridge registers [4, 5].  However,
> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > +device itself.
> 
> Is that universally true? Or is it still possible to do the right
> thing here on new ACPI architectures such as arm64?

That's a very good question.  I had thought that the ACPI spec had
given up on Consumer/Producer completely, but I was wrong.  In the 6.0
spec, the Consumer/Producer bit is still documented in the Extended
Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
"ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).

Linux looks at the producer_consumer bit in acpi_decode_space(), which
I think is used for all these descriptors (QWord, DWord, Word, and
Extended).  This doesn't quite follow the spec -- we probably should
ignore it except for Extended.  In any event, acpi_decode_space() sets
IORESOURCE_WINDOW for Producer descriptors, but we don't test
IORESOURCE_WINDOW in the PCI host bridge code.

x86 and ia64 supply their own pci_acpi_root_prepare_resources()
functions that call acpi_pci_probe_root_resources(), which parses _CRS
and looks at producer_consumer.  Then they do a little arch-specific
stuff on the result.

On arm64 we use acpi_pci_probe_root_resources() directly, with no
arch-specific stuff.

On all three arches, we ignore the Consumer/Producer bit, so all the
resources are treated as Producers, e.g., as bridge windows.

I think we *could* implement an arm64 version of
pci_acpi_root_prepare_resources() that would pay attention to the
Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
compliant, we would have to use Extended descriptors for all bridge
windows, even if they would fit in a DWord or QWord.

Should we do that?  I dunno.  I'd like to hear your opinion(s).

It *would* be nice to have bridge registers in the bridge _CRS.  That
would eliminate the need for looking up the HISI0081/PNP0C02 devices
to find the bridge registers.  Avoiding that lookup is only a
temporary advantage -- the next round of bridges are supposed to fully
implement ECAM, and then we won't need to know where the registers
are.

Apart from the lookup, there's still some advantage in describing the
registers in the PNP0A03 device instead of an unrelated PNP0C02
device, because it makes /proc/iomem more accurate and potentially
makes host bridge hotplug cleaner.  We would have to enhance the host
bridge driver to do the reservations currently done by pnp/system.c.

There's some value in doing it the same way as on x86, even though
that way is somewhat broken.

Whatever we decide, I think it's very important to get it figured out
ASAP because it affects the ECAM quirks that we're trying to merge in
v4.10.

> > +The workaround is to describe the bridge registers (including ECAM
> > +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> > +the bridge register space is device-specific anyway, so the generic
> > +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> > +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> > +method.
> > +
> > +Note that the PCIe spec actually does require ECAM unless there's a
> > +standard firmware interface for config access, e.g., the ia64 SAL
> > +interface [7].  One reason is that we want a generic host bridge
> > +driver (pci_root.c), and a generic driver requires a generic way to
> > +access config space.
> > +
> > +
> > +[1] ACPI 6.0, sec 6.1:
> > +For any device that is on a non-enumerable type of bus (for
> > +example, an ISA bus), OSPM enumerates the devices' identifier(s)
> > +and the ACPI system firmware must supply an _HID object ... for
> > +each device to enable OSPM to do that.
> > +
> > +[2] ACPI 6.0, sec 3.7:
> > +The OS enumerates motherboard devices simply by reading through
> > +the ACPI Namespace looking for 

RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Gabriele Paoloni
Hi Bjorn

> -Original Message-
> From: Bjorn Helgaas [mailto:helg...@kernel.org]
> Sent: 21 November 2016 20:10
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Mon, Nov 21, 2016 at 05:23:11PM +, Gabriele Paoloni wrote:
> > Hi Bjorn
> >
> > > -Original Message-
> > > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-
> > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 21 November 2016 16:47
> > > To: Gabriele Paoloni
> > > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > > Subject: Re: [PATCH] PCI: Add information about describing PCI in
> ACPI
> > >
> > > On Mon, Nov 21, 2016 at 08:52:52AM +, Gabriele Paoloni wrote:
> > > > Hi Bjorn
> > > >
> > > > > -Original Message-
> > > > > From: Bjorn Helgaas [mailto:helg...@kernel.org]
> > > > > Sent: 18 November 2016 17:54
> > > > > To: Gabriele Paoloni
> > > > > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > > > > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > > > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > > > > Subject: Re: [PATCH] PCI: Add information about describing PCI
> in
> > > ACPI
> > > > >
> > > > > On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni
> wrote:
> > > > > > > -Original Message-
> > > > > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-
> kernel-
> > > > > > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > > > Sent: 17 November 2016 18:00
> > > > >
> > > > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not*
> > > mechanisms
> > > > > for
> > > > > > > +reserving address space!  The static tables are for things
> the
> > > OS
> > > > > > > +needs to know early in boot, before it can parse the ACPI
> > > > > namespace.
> > > > > > > +If a new table is defined, an old OS needs to operate
> > > correctly
> > > > > even
> > > > > > > +though it ignores the table.  _CRS allows that because it
> is
> > > > > generic
> > > > > > > +and understood by the old OS; a static table does not.
> > > > > >
> > > > > > Right so if my understanding is correct you are saying that
> > > resources
> > > > > > described in the MCFG table should also be declared in
> PNP0C02
> > > > > devices
> > > > > > so that the PNP driver can reserve these resources.
> > > > >
> > > > > Yes.
> > > > >
> > > > > > On the other side the PCI Root bridge driver should not
> reserve
> > > such
> > > > > > resources.
> > > > > >
> > > > > > Well if my understanding is correct I think we have a problem
> > > here:
> > > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > > > >
> > > > > > As you can see pci_ecam_create() will conflict with the pnp
> > > driver
> > > > > > as it will try to reserve the resources from the MCFG
> table...
> > > > > >
> > > > > > Maybe we need to rework pci_ecam_create() ?
> > > > >
> > > > > I think it's OK as it is.
> > > > >
> > > > > The pnp/system.c driver does try to reserve PNP0C02 resources,
> and
> > > it
> > > > > marks them as "not busy".  That way they appear in /proc/iomem
> and
> > > > > won't be allocated for anything else, but they can still be
> > > requested
> > > > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > > > >
> > > > > This is analogous to what the PCI core does in
> > > pci_claim_resource().
> > > > > This is really a function of the ACPI/PNP *core*, which should
> >

RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Gabriele Paoloni
Hi Bjorn

> -Original Message-
> From: Bjorn Helgaas [mailto:helg...@kernel.org]
> Sent: 21 November 2016 20:10
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Mon, Nov 21, 2016 at 05:23:11PM +, Gabriele Paoloni wrote:
> > Hi Bjorn
> >
> > > -Original Message-
> > > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-
> > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 21 November 2016 16:47
> > > To: Gabriele Paoloni
> > > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > > Subject: Re: [PATCH] PCI: Add information about describing PCI in
> ACPI
> > >
> > > On Mon, Nov 21, 2016 at 08:52:52AM +, Gabriele Paoloni wrote:
> > > > Hi Bjorn
> > > >
> > > > > -Original Message-
> > > > > From: Bjorn Helgaas [mailto:helg...@kernel.org]
> > > > > Sent: 18 November 2016 17:54
> > > > > To: Gabriele Paoloni
> > > > > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > > > > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > > > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > > > > Subject: Re: [PATCH] PCI: Add information about describing PCI
> in
> > > ACPI
> > > > >
> > > > > On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni
> wrote:
> > > > > > > -Original Message-
> > > > > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-
> kernel-
> > > > > > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > > > Sent: 17 November 2016 18:00
> > > > >
> > > > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not*
> > > mechanisms
> > > > > for
> > > > > > > +reserving address space!  The static tables are for things
> the
> > > OS
> > > > > > > +needs to know early in boot, before it can parse the ACPI
> > > > > namespace.
> > > > > > > +If a new table is defined, an old OS needs to operate
> > > correctly
> > > > > even
> > > > > > > +though it ignores the table.  _CRS allows that because it
> is
> > > > > generic
> > > > > > > +and understood by the old OS; a static table does not.
> > > > > >
> > > > > > Right so if my understanding is correct you are saying that
> > > resources
> > > > > > described in the MCFG table should also be declared in
> PNP0C02
> > > > > devices
> > > > > > so that the PNP driver can reserve these resources.
> > > > >
> > > > > Yes.
> > > > >
> > > > > > On the other side the PCI Root bridge driver should not
> reserve
> > > such
> > > > > > resources.
> > > > > >
> > > > > > Well if my understanding is correct I think we have a problem
> > > here:
> > > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > > > >
> > > > > > As you can see pci_ecam_create() will conflict with the pnp
> > > driver
> > > > > > as it will try to reserve the resources from the MCFG
> table...
> > > > > >
> > > > > > Maybe we need to rework pci_ecam_create() ?
> > > > >
> > > > > I think it's OK as it is.
> > > > >
> > > > > The pnp/system.c driver does try to reserve PNP0C02 resources,
> and
> > > it
> > > > > marks them as "not busy".  That way they appear in /proc/iomem
> and
> > > > > won't be allocated for anything else, but they can still be
> > > requested
> > > > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > > > >
> > > > > This is analogous to what the PCI core does in
> > > pci_claim_resource().
> > > > > This is really a function of the ACPI/PNP *core*, which should
> >

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Ard Biesheuvel
On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas 
> ---
>  Documentation/PCI/00-INDEX  |2 +
>  Documentation/PCI/acpi-info.txt |  136 
> +++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
> - this file
> +acpi-info.txt
> +   - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
> - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +   ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +

Is that universally true? Or is it still possible to do the right
thing here on new ACPI architectures such as arm64?

> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware 

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-22 Thread Ard Biesheuvel
On 17 November 2016 at 17:59, Bjorn Helgaas  wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas 
> ---
>  Documentation/PCI/00-INDEX  |2 +
>  Documentation/PCI/acpi-info.txt |  136 
> +++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
> - this file
> +acpi-info.txt
> +   - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
> - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +   ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +

Is that universally true? Or is it still possible to do the right
thing here on new ACPI architectures such as arm64?

> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Bjorn Helgaas
On Mon, Nov 21, 2016 at 05:23:11PM +, Gabriele Paoloni wrote:
> Hi Bjorn
> 
> > -Original Message-
> > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-
> > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > Sent: 21 November 2016 16:47
> > To: Gabriele Paoloni
> > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > On Mon, Nov 21, 2016 at 08:52:52AM +, Gabriele Paoloni wrote:
> > > Hi Bjorn
> > >
> > > > -Original Message-
> > > > From: Bjorn Helgaas [mailto:helg...@kernel.org]
> > > > Sent: 18 November 2016 17:54
> > > > To: Gabriele Paoloni
> > > > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > > > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > > > Subject: Re: [PATCH] PCI: Add information about describing PCI in
> > ACPI
> > > >
> > > > On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > > > > > -Original Message-
> > > > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > > > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > > Sent: 17 November 2016 18:00
> > > >
> > > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not*
> > mechanisms
> > > > for
> > > > > > +reserving address space!  The static tables are for things the
> > OS
> > > > > > +needs to know early in boot, before it can parse the ACPI
> > > > namespace.
> > > > > > +If a new table is defined, an old OS needs to operate
> > correctly
> > > > even
> > > > > > +though it ignores the table.  _CRS allows that because it is
> > > > generic
> > > > > > +and understood by the old OS; a static table does not.
> > > > >
> > > > > Right so if my understanding is correct you are saying that
> > resources
> > > > > described in the MCFG table should also be declared in PNP0C02
> > > > devices
> > > > > so that the PNP driver can reserve these resources.
> > > >
> > > > Yes.
> > > >
> > > > > On the other side the PCI Root bridge driver should not reserve
> > such
> > > > > resources.
> > > > >
> > > > > Well if my understanding is correct I think we have a problem
> > here:
> > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > > >
> > > > > As you can see pci_ecam_create() will conflict with the pnp
> > driver
> > > > > as it will try to reserve the resources from the MCFG table...
> > > > >
> > > > > Maybe we need to rework pci_ecam_create() ?
> > > >
> > > > I think it's OK as it is.
> > > >
> > > > The pnp/system.c driver does try to reserve PNP0C02 resources, and
> > it
> > > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > > won't be allocated for anything else, but they can still be
> > requested
> > > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > > >
> > > > This is analogous to what the PCI core does in
> > pci_claim_resource().
> > > > This is really a function of the ACPI/PNP *core*, which should
> > reserve
> > > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > > bunch of historical baggage there.
> > > >
> > > > You'll also notice that in this case, things are out of order:
> > > > logically the pnp/system.c reservation should happen first, but in
> > > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > > could not be reserved".
> > >
> > > Correct me if I am wrong...
> > >
> > > So currently we are relying on the fact that pci_ecam_create() is
> > called
> > > before the pnp driver.
> > > If th

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Bjorn Helgaas
On Mon, Nov 21, 2016 at 05:23:11PM +, Gabriele Paoloni wrote:
> Hi Bjorn
> 
> > -Original Message-
> > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-
> > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > Sent: 21 November 2016 16:47
> > To: Gabriele Paoloni
> > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > On Mon, Nov 21, 2016 at 08:52:52AM +, Gabriele Paoloni wrote:
> > > Hi Bjorn
> > >
> > > > -Original Message-
> > > > From: Bjorn Helgaas [mailto:helg...@kernel.org]
> > > > Sent: 18 November 2016 17:54
> > > > To: Gabriele Paoloni
> > > > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > > > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > > > Subject: Re: [PATCH] PCI: Add information about describing PCI in
> > ACPI
> > > >
> > > > On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > > > > > -Original Message-
> > > > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > > > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > > Sent: 17 November 2016 18:00
> > > >
> > > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not*
> > mechanisms
> > > > for
> > > > > > +reserving address space!  The static tables are for things the
> > OS
> > > > > > +needs to know early in boot, before it can parse the ACPI
> > > > namespace.
> > > > > > +If a new table is defined, an old OS needs to operate
> > correctly
> > > > even
> > > > > > +though it ignores the table.  _CRS allows that because it is
> > > > generic
> > > > > > +and understood by the old OS; a static table does not.
> > > > >
> > > > > Right so if my understanding is correct you are saying that
> > resources
> > > > > described in the MCFG table should also be declared in PNP0C02
> > > > devices
> > > > > so that the PNP driver can reserve these resources.
> > > >
> > > > Yes.
> > > >
> > > > > On the other side the PCI Root bridge driver should not reserve
> > such
> > > > > resources.
> > > > >
> > > > > Well if my understanding is correct I think we have a problem
> > here:
> > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > > >
> > > > > As you can see pci_ecam_create() will conflict with the pnp
> > driver
> > > > > as it will try to reserve the resources from the MCFG table...
> > > > >
> > > > > Maybe we need to rework pci_ecam_create() ?
> > > >
> > > > I think it's OK as it is.
> > > >
> > > > The pnp/system.c driver does try to reserve PNP0C02 resources, and
> > it
> > > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > > won't be allocated for anything else, but they can still be
> > requested
> > > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > > >
> > > > This is analogous to what the PCI core does in
> > pci_claim_resource().
> > > > This is really a function of the ACPI/PNP *core*, which should
> > reserve
> > > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > > bunch of historical baggage there.
> > > >
> > > > You'll also notice that in this case, things are out of order:
> > > > logically the pnp/system.c reservation should happen first, but in
> > > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > > could not be reserved".
> > >
> > > Correct me if I am wrong...
> > >
> > > So currently we are relying on the fact that pci_ecam_create() is
> > called
> > > before the pnp driver.
> > > If th

RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Gabriele Paoloni
Hi Bjorn

> -Original Message-
> From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-
> ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: 21 November 2016 16:47
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Mon, Nov 21, 2016 at 08:52:52AM +, Gabriele Paoloni wrote:
> > Hi Bjorn
> >
> > > -Original Message-
> > > From: Bjorn Helgaas [mailto:helg...@kernel.org]
> > > Sent: 18 November 2016 17:54
> > > To: Gabriele Paoloni
> > > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > > Subject: Re: [PATCH] PCI: Add information about describing PCI in
> ACPI
> > >
> > > On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > > > > -Original Message-
> > > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > Sent: 17 November 2016 18:00
> > >
> > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not*
> mechanisms
> > > for
> > > > > +reserving address space!  The static tables are for things the
> OS
> > > > > +needs to know early in boot, before it can parse the ACPI
> > > namespace.
> > > > > +If a new table is defined, an old OS needs to operate
> correctly
> > > even
> > > > > +though it ignores the table.  _CRS allows that because it is
> > > generic
> > > > > +and understood by the old OS; a static table does not.
> > > >
> > > > Right so if my understanding is correct you are saying that
> resources
> > > > described in the MCFG table should also be declared in PNP0C02
> > > devices
> > > > so that the PNP driver can reserve these resources.
> > >
> > > Yes.
> > >
> > > > On the other side the PCI Root bridge driver should not reserve
> such
> > > > resources.
> > > >
> > > > Well if my understanding is correct I think we have a problem
> here:
> > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > >
> > > > As you can see pci_ecam_create() will conflict with the pnp
> driver
> > > > as it will try to reserve the resources from the MCFG table...
> > > >
> > > > Maybe we need to rework pci_ecam_create() ?
> > >
> > > I think it's OK as it is.
> > >
> > > The pnp/system.c driver does try to reserve PNP0C02 resources, and
> it
> > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > won't be allocated for anything else, but they can still be
> requested
> > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > >
> > > This is analogous to what the PCI core does in
> pci_claim_resource().
> > > This is really a function of the ACPI/PNP *core*, which should
> reserve
> > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > bunch of historical baggage there.
> > >
> > > You'll also notice that in this case, things are out of order:
> > > logically the pnp/system.c reservation should happen first, but in
> > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > could not be reserved".
> >
> > Correct me if I am wrong...
> >
> > So currently we are relying on the fact that pci_ecam_create() is
> called
> > before the pnp driver.
> > If the pnp driver came first we would end up in pci_ecam_create()
> failing
> > here:
> > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> >
> > I am not sure but it seems to me like a bit weak condition to rely
> on...
> > what about removing the error condition in pci_ecam_create() and
> logging
> > just a dev_info()?
> 
> Huh.  I'm confused.  I *thought* it would be safe to reverse the
> order, which would effectively be this:
> 
>   system_pnp_probe
> reserve_reso

RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Gabriele Paoloni
Hi Bjorn

> -Original Message-
> From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-
> ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: 21 November 2016 16:47
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Mon, Nov 21, 2016 at 08:52:52AM +, Gabriele Paoloni wrote:
> > Hi Bjorn
> >
> > > -Original Message-
> > > From: Bjorn Helgaas [mailto:helg...@kernel.org]
> > > Sent: 18 November 2016 17:54
> > > To: Gabriele Paoloni
> > > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > > Subject: Re: [PATCH] PCI: Add information about describing PCI in
> ACPI
> > >
> > > On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > > > > -Original Message-
> > > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > Sent: 17 November 2016 18:00
> > >
> > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not*
> mechanisms
> > > for
> > > > > +reserving address space!  The static tables are for things the
> OS
> > > > > +needs to know early in boot, before it can parse the ACPI
> > > namespace.
> > > > > +If a new table is defined, an old OS needs to operate
> correctly
> > > even
> > > > > +though it ignores the table.  _CRS allows that because it is
> > > generic
> > > > > +and understood by the old OS; a static table does not.
> > > >
> > > > Right so if my understanding is correct you are saying that
> resources
> > > > described in the MCFG table should also be declared in PNP0C02
> > > devices
> > > > so that the PNP driver can reserve these resources.
> > >
> > > Yes.
> > >
> > > > On the other side the PCI Root bridge driver should not reserve
> such
> > > > resources.
> > > >
> > > > Well if my understanding is correct I think we have a problem
> here:
> > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > >
> > > > As you can see pci_ecam_create() will conflict with the pnp
> driver
> > > > as it will try to reserve the resources from the MCFG table...
> > > >
> > > > Maybe we need to rework pci_ecam_create() ?
> > >
> > > I think it's OK as it is.
> > >
> > > The pnp/system.c driver does try to reserve PNP0C02 resources, and
> it
> > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > won't be allocated for anything else, but they can still be
> requested
> > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > >
> > > This is analogous to what the PCI core does in
> pci_claim_resource().
> > > This is really a function of the ACPI/PNP *core*, which should
> reserve
> > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > bunch of historical baggage there.
> > >
> > > You'll also notice that in this case, things are out of order:
> > > logically the pnp/system.c reservation should happen first, but in
> > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > could not be reserved".
> >
> > Correct me if I am wrong...
> >
> > So currently we are relying on the fact that pci_ecam_create() is
> called
> > before the pnp driver.
> > If the pnp driver came first we would end up in pci_ecam_create()
> failing
> > here:
> > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> >
> > I am not sure but it seems to me like a bit weak condition to rely
> on...
> > what about removing the error condition in pci_ecam_create() and
> logging
> > just a dev_info()?
> 
> Huh.  I'm confused.  I *thought* it would be safe to reverse the
> order, which would effectively be this:
> 
>   system_pnp_probe
> reserve_reso

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Bjorn Helgaas
On Mon, Nov 21, 2016 at 08:52:52AM +, Gabriele Paoloni wrote:
> Hi Bjorn
> 
> > -Original Message-
> > From: Bjorn Helgaas [mailto:helg...@kernel.org]
> > Sent: 18 November 2016 17:54
> > To: Gabriele Paoloni
> > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > > > -Original Message-
> > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > Sent: 17 November 2016 18:00
> > 
> > > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms
> > for
> > > > +reserving address space!  The static tables are for things the OS
> > > > +needs to know early in boot, before it can parse the ACPI
> > namespace.
> > > > +If a new table is defined, an old OS needs to operate correctly
> > even
> > > > +though it ignores the table.  _CRS allows that because it is
> > generic
> > > > +and understood by the old OS; a static table does not.
> > >
> > > Right so if my understanding is correct you are saying that resources
> > > described in the MCFG table should also be declared in PNP0C02
> > devices
> > > so that the PNP driver can reserve these resources.
> > 
> > Yes.
> > 
> > > On the other side the PCI Root bridge driver should not reserve such
> > > resources.
> > >
> > > Well if my understanding is correct I think we have a problem here:
> > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > >
> > > As you can see pci_ecam_create() will conflict with the pnp driver
> > > as it will try to reserve the resources from the MCFG table...
> > >
> > > Maybe we need to rework pci_ecam_create() ?
> > 
> > I think it's OK as it is.
> > 
> > The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> > marks them as "not busy".  That way they appear in /proc/iomem and
> > won't be allocated for anything else, but they can still be requested
> > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > 
> > This is analogous to what the PCI core does in pci_claim_resource().
> > This is really a function of the ACPI/PNP *core*, which should reserve
> > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > bunch of historical baggage there.
> > 
> > You'll also notice that in this case, things are out of order:
> > logically the pnp/system.c reservation should happen first, but in
> > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > That means the pnp/system.c one might fail and complain "[mem ...]
> > could not be reserved".
> 
> Correct me if I am wrong...
> 
> So currently we are relying on the fact that pci_ecam_create() is called
> before the pnp driver.
> If the pnp driver came first we would end up in pci_ecam_create() failing
> here:
> http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> 
> I am not sure but it seems to me like a bit weak condition to rely on...
> what about removing the error condition in pci_ecam_create() and logging
> just a dev_info()?

Huh.  I'm confused.  I *thought* it would be safe to reverse the
order, which would effectively be this:

  system_pnp_probe
reserve_resources_of_dev
  reserve_range
request_mem_region([mem 0xb000-0xb1ff])
  ...
  pci_ecam_create
request_resource_conflict([mem 0xb000-0xb1ff])


but I experimented with the patch below on qemu, and it failed as you
predicted:

  ** res test **
  requested [mem 0xa000-0xafff]
  can't claim ECAM area [mem 0xa000-0xafff]: conflict with ECAM PNP 
[mem 0xa000-0xafff]

I expected the request_resource_conflict() to succeed since it's
completely contained in the "ECAM PNP" region.  But I guess I don't
understand kernel/resource.c well enough.

I'm not sure we need to fix anything yet, since we currently do the
ecam.c request before the system.c one, and any change there would be
a long ways off.  If/when that *does* change, I think the correct fix
would be to change ecam.c so its request succeeds (by changing the way
it does the request, fixing kernel/resource.c, or whatever) rather
than to reduce t

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Bjorn Helgaas
On Mon, Nov 21, 2016 at 08:52:52AM +, Gabriele Paoloni wrote:
> Hi Bjorn
> 
> > -Original Message-
> > From: Bjorn Helgaas [mailto:helg...@kernel.org]
> > Sent: 18 November 2016 17:54
> > To: Gabriele Paoloni
> > Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> > a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > > > -Original Message-
> > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > Sent: 17 November 2016 18:00
> > 
> > > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms
> > for
> > > > +reserving address space!  The static tables are for things the OS
> > > > +needs to know early in boot, before it can parse the ACPI
> > namespace.
> > > > +If a new table is defined, an old OS needs to operate correctly
> > even
> > > > +though it ignores the table.  _CRS allows that because it is
> > generic
> > > > +and understood by the old OS; a static table does not.
> > >
> > > Right so if my understanding is correct you are saying that resources
> > > described in the MCFG table should also be declared in PNP0C02
> > devices
> > > so that the PNP driver can reserve these resources.
> > 
> > Yes.
> > 
> > > On the other side the PCI Root bridge driver should not reserve such
> > > resources.
> > >
> > > Well if my understanding is correct I think we have a problem here:
> > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > >
> > > As you can see pci_ecam_create() will conflict with the pnp driver
> > > as it will try to reserve the resources from the MCFG table...
> > >
> > > Maybe we need to rework pci_ecam_create() ?
> > 
> > I think it's OK as it is.
> > 
> > The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> > marks them as "not busy".  That way they appear in /proc/iomem and
> > won't be allocated for anything else, but they can still be requested
> > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > 
> > This is analogous to what the PCI core does in pci_claim_resource().
> > This is really a function of the ACPI/PNP *core*, which should reserve
> > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > bunch of historical baggage there.
> > 
> > You'll also notice that in this case, things are out of order:
> > logically the pnp/system.c reservation should happen first, but in
> > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > That means the pnp/system.c one might fail and complain "[mem ...]
> > could not be reserved".
> 
> Correct me if I am wrong...
> 
> So currently we are relying on the fact that pci_ecam_create() is called
> before the pnp driver.
> If the pnp driver came first we would end up in pci_ecam_create() failing
> here:
> http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> 
> I am not sure but it seems to me like a bit weak condition to rely on...
> what about removing the error condition in pci_ecam_create() and logging
> just a dev_info()?

Huh.  I'm confused.  I *thought* it would be safe to reverse the
order, which would effectively be this:

  system_pnp_probe
reserve_resources_of_dev
  reserve_range
request_mem_region([mem 0xb000-0xb1ff])
  ...
  pci_ecam_create
request_resource_conflict([mem 0xb000-0xb1ff])


but I experimented with the patch below on qemu, and it failed as you
predicted:

  ** res test **
  requested [mem 0xa000-0xafff]
  can't claim ECAM area [mem 0xa000-0xafff]: conflict with ECAM PNP 
[mem 0xa000-0xafff]

I expected the request_resource_conflict() to succeed since it's
completely contained in the "ECAM PNP" region.  But I guess I don't
understand kernel/resource.c well enough.

I'm not sure we need to fix anything yet, since we currently do the
ecam.c request before the system.c one, and any change there would be
a long ways off.  If/when that *does* change, I think the correct fix
would be to change ecam.c so its request succeeds (by changing the way
it does the request, fixing kernel/resource.c, or whatever) rather
than to reduce t

Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Bjorn Helgaas
On Sat, Nov 19, 2016 at 12:02:24AM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 17, 2016 at 6:59 PM, Bjorn Helgaas  wrote:
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> >
> > Signed-off-by: Bjorn Helgaas 
> 
> Looks great overall, but I have a few comments (below).

Thanks a lot for taking a look, Rafael!  I applied all your suggestions.

Bjorn


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Bjorn Helgaas
On Sat, Nov 19, 2016 at 12:02:24AM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 17, 2016 at 6:59 PM, Bjorn Helgaas  wrote:
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> >
> > Signed-off-by: Bjorn Helgaas 
> 
> Looks great overall, but I have a few comments (below).

Thanks a lot for taking a look, Rafael!  I applied all your suggestions.

Bjorn


RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Gabriele Paoloni
Hi Bjorn

> -Original Message-
> From: Bjorn Helgaas [mailto:helg...@kernel.org]
> Sent: 18 November 2016 17:54
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > > -Original Message-
> > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 17 November 2016 18:00
> 
> > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms
> for
> > > +reserving address space!  The static tables are for things the OS
> > > +needs to know early in boot, before it can parse the ACPI
> namespace.
> > > +If a new table is defined, an old OS needs to operate correctly
> even
> > > +though it ignores the table.  _CRS allows that because it is
> generic
> > > +and understood by the old OS; a static table does not.
> >
> > Right so if my understanding is correct you are saying that resources
> > described in the MCFG table should also be declared in PNP0C02
> devices
> > so that the PNP driver can reserve these resources.
> 
> Yes.
> 
> > On the other side the PCI Root bridge driver should not reserve such
> > resources.
> >
> > Well if my understanding is correct I think we have a problem here:
> > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> >
> > As you can see pci_ecam_create() will conflict with the pnp driver
> > as it will try to reserve the resources from the MCFG table...
> >
> > Maybe we need to rework pci_ecam_create() ?
> 
> I think it's OK as it is.
> 
> The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> marks them as "not busy".  That way they appear in /proc/iomem and
> won't be allocated for anything else, but they can still be requested
> by drivers, e.g., pci/ecam.c, which will mark them "busy".
> 
> This is analogous to what the PCI core does in pci_claim_resource().
> This is really a function of the ACPI/PNP *core*, which should reserve
> all _CRS resources for all devices (not just PNP0C02 devices).  But
> it's done by pnp/system.c, and only for PNP0C02, because there's a
> bunch of historical baggage there.
> 
> You'll also notice that in this case, things are out of order:
> logically the pnp/system.c reservation should happen first, but in
> fact the pci/ecam.c request happens *before* the pnp/system.c one.
> That means the pnp/system.c one might fail and complain "[mem ...]
> could not be reserved".

Correct me if I am wrong...

So currently we are relying on the fact that pci_ecam_create() is called
before the pnp driver.
If the pnp driver came first we would end up in pci_ecam_create() failing
here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76

I am not sure but it seems to me like a bit weak condition to rely on...
what about removing the error condition in pci_ecam_create() and logging
just a dev_info()?

Thanks

Gab


> 
> Bjorn


RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-21 Thread Gabriele Paoloni
Hi Bjorn

> -Original Message-
> From: Bjorn Helgaas [mailto:helg...@kernel.org]
> Sent: 18 November 2016 17:54
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-...@vger.kernel.org; linux-
> a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > > -Original Message-
> > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 17 November 2016 18:00
> 
> > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms
> for
> > > +reserving address space!  The static tables are for things the OS
> > > +needs to know early in boot, before it can parse the ACPI
> namespace.
> > > +If a new table is defined, an old OS needs to operate correctly
> even
> > > +though it ignores the table.  _CRS allows that because it is
> generic
> > > +and understood by the old OS; a static table does not.
> >
> > Right so if my understanding is correct you are saying that resources
> > described in the MCFG table should also be declared in PNP0C02
> devices
> > so that the PNP driver can reserve these resources.
> 
> Yes.
> 
> > On the other side the PCI Root bridge driver should not reserve such
> > resources.
> >
> > Well if my understanding is correct I think we have a problem here:
> > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> >
> > As you can see pci_ecam_create() will conflict with the pnp driver
> > as it will try to reserve the resources from the MCFG table...
> >
> > Maybe we need to rework pci_ecam_create() ?
> 
> I think it's OK as it is.
> 
> The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> marks them as "not busy".  That way they appear in /proc/iomem and
> won't be allocated for anything else, but they can still be requested
> by drivers, e.g., pci/ecam.c, which will mark them "busy".
> 
> This is analogous to what the PCI core does in pci_claim_resource().
> This is really a function of the ACPI/PNP *core*, which should reserve
> all _CRS resources for all devices (not just PNP0C02 devices).  But
> it's done by pnp/system.c, and only for PNP0C02, because there's a
> bunch of historical baggage there.
> 
> You'll also notice that in this case, things are out of order:
> logically the pnp/system.c reservation should happen first, but in
> fact the pci/ecam.c request happens *before* the pnp/system.c one.
> That means the pnp/system.c one might fail and complain "[mem ...]
> could not be reserved".

Correct me if I am wrong...

So currently we are relying on the fact that pci_ecam_create() is called
before the pnp driver.
If the pnp driver came first we would end up in pci_ecam_create() failing
here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76

I am not sure but it seems to me like a bit weak condition to rely on...
what about removing the error condition in pci_ecam_create() and logging
just a dev_info()?

Thanks

Gab


> 
> Bjorn


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-18 Thread Rafael J. Wysocki
On Thu, Nov 17, 2016 at 6:59 PM, Bjorn Helgaas  wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas 

Looks great overall, but I have a few comments (below).

> ---
>  Documentation/PCI/00-INDEX  |2 +
>  Documentation/PCI/acpi-info.txt |  136 
> +++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
> - this file
> +acpi-info.txt
> +   - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
> - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +   ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in

To be painfully precise, those need not be methods.  They may be
static objects too.

> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have

I'm not very keen on using the term "ACPI device" in documentation as
it is not particularly well defined.  As a rule, I prefer to talk
about "non-discoverable devices described via ACPI" or similar.

Accordingly, I'd change the above line to something like "If the OS is
expected to manage a non-discoverable device described via ACPI, that
device will have".

> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to

Here I'd say "any address space that is (1) not claimed by _CRS under
any other device object in the ACPI namespace and (2) ...".

> +something else, should be claimed by a PNP0C02 _CRS method.

Thanks,
Rafael


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-18 Thread Rafael J. Wysocki
On Thu, Nov 17, 2016 at 6:59 PM, Bjorn Helgaas  wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas 

Looks great overall, but I have a few comments (below).

> ---
>  Documentation/PCI/00-INDEX  |2 +
>  Documentation/PCI/acpi-info.txt |  136 
> +++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
> - this file
> +acpi-info.txt
> +   - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
> - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +   ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in

To be painfully precise, those need not be methods.  They may be
static objects too.

> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have

I'm not very keen on using the term "ACPI device" in documentation as
it is not particularly well defined.  As a rule, I prefer to talk
about "non-discoverable devices described via ACPI" or similar.

Accordingly, I'd change the above line to something like "If the OS is
expected to manage a non-discoverable device described via ACPI, that
device will have".

> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to

Here I'd say "any address space that is (1) not claimed by _CRS under
any other device object in the ACPI namespace and (2) ...".

> +something else, should be claimed by a PNP0C02 _CRS method.

Thanks,
Rafael


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-18 Thread Bjorn Helgaas
On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > -Original Message-
> > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > Sent: 17 November 2016 18:00

> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space!  The static tables are for things the OS
> > +needs to know early in boot, before it can parse the ACPI namespace.
> > +If a new table is defined, an old OS needs to operate correctly even
> > +though it ignores the table.  _CRS allows that because it is generic
> > +and understood by the old OS; a static table does not.
> 
> Right so if my understanding is correct you are saying that resources
> described in the MCFG table should also be declared in PNP0C02 devices
> so that the PNP driver can reserve these resources.

Yes.

> On the other side the PCI Root bridge driver should not reserve such
> resources.
> 
> Well if my understanding is correct I think we have a problem here:
> http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> 
> As you can see pci_ecam_create() will conflict with the pnp driver
> as it will try to reserve the resources from the MCFG table...
> 
> Maybe we need to rework pci_ecam_create() ?

I think it's OK as it is.

The pnp/system.c driver does try to reserve PNP0C02 resources, and it
marks them as "not busy".  That way they appear in /proc/iomem and
won't be allocated for anything else, but they can still be requested
by drivers, e.g., pci/ecam.c, which will mark them "busy".

This is analogous to what the PCI core does in pci_claim_resource().
This is really a function of the ACPI/PNP *core*, which should reserve
all _CRS resources for all devices (not just PNP0C02 devices).  But
it's done by pnp/system.c, and only for PNP0C02, because there's a
bunch of historical baggage there.

You'll also notice that in this case, things are out of order:
logically the pnp/system.c reservation should happen first, but in
fact the pci/ecam.c request happens *before* the pnp/system.c one.
That means the pnp/system.c one might fail and complain "[mem ...]
could not be reserved".

Bjorn


Re: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-18 Thread Bjorn Helgaas
On Fri, Nov 18, 2016 at 05:17:34PM +, Gabriele Paoloni wrote:
> > -Original Message-
> > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> > ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > Sent: 17 November 2016 18:00

> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space!  The static tables are for things the OS
> > +needs to know early in boot, before it can parse the ACPI namespace.
> > +If a new table is defined, an old OS needs to operate correctly even
> > +though it ignores the table.  _CRS allows that because it is generic
> > +and understood by the old OS; a static table does not.
> 
> Right so if my understanding is correct you are saying that resources
> described in the MCFG table should also be declared in PNP0C02 devices
> so that the PNP driver can reserve these resources.

Yes.

> On the other side the PCI Root bridge driver should not reserve such
> resources.
> 
> Well if my understanding is correct I think we have a problem here:
> http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> 
> As you can see pci_ecam_create() will conflict with the pnp driver
> as it will try to reserve the resources from the MCFG table...
> 
> Maybe we need to rework pci_ecam_create() ?

I think it's OK as it is.

The pnp/system.c driver does try to reserve PNP0C02 resources, and it
marks them as "not busy".  That way they appear in /proc/iomem and
won't be allocated for anything else, but they can still be requested
by drivers, e.g., pci/ecam.c, which will mark them "busy".

This is analogous to what the PCI core does in pci_claim_resource().
This is really a function of the ACPI/PNP *core*, which should reserve
all _CRS resources for all devices (not just PNP0C02 devices).  But
it's done by pnp/system.c, and only for PNP0C02, because there's a
bunch of historical baggage there.

You'll also notice that in this case, things are out of order:
logically the pnp/system.c reservation should happen first, but in
fact the pci/ecam.c request happens *before* the pnp/system.c one.
That means the pnp/system.c one might fail and complain "[mem ...]
could not be reserved".

Bjorn


RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-18 Thread Gabriele Paoloni
Hi Bjorn

Many thanks for putting this together, it really helps!

One thing below..

> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: 17 November 2016 18:00
> To: linux-...@vger.kernel.org
> Cc: linux-a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> arm-ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas 
> ---
>  Documentation/PCI/00-INDEX  |2 +
>  Documentation/PCI/acpi-info.txt |  136
> +++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>   - this file
> +acpi-info.txt
> + - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>   - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and
> FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-
> info.txt
> new file mode 100644
> index 000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> + ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

Right so if my understanding is correct you are saying that resources
described in the MCFG table should also be declared in PNP0C02 devices
so that the PNP driver can reserve these resources.

On the other side the PCI Root bridge driver should not reserve such
resources.

Well if my understanding is correct I think we have a problem here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74

As you can see pci_ecam_create() will conflict with the pnp driver
as it will try to reserve the resources from the MCFG table...

Maybe we need to rework pci_ecam_create() ?

Thanks

Gab

> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM 

RE: [PATCH] PCI: Add information about describing PCI in ACPI

2016-11-18 Thread Gabriele Paoloni
Hi Bjorn

Many thanks for putting this together, it really helps!

One thing below..

> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: 17 November 2016 18:00
> To: linux-...@vger.kernel.org
> Cc: linux-a...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> arm-ker...@lists.infradead.org; linaro-a...@lists.linaro.org
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas 
> ---
>  Documentation/PCI/00-INDEX  |2 +
>  Documentation/PCI/acpi-info.txt |  136
> +++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>   - this file
> +acpi-info.txt
> + - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>   - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and
> FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-
> info.txt
> new file mode 100644
> index 000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> + ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

Right so if my understanding is correct you are saying that resources
described in the MCFG table should also be declared in PNP0C02 devices
so that the PNP driver can reserve these resources.

On the other side the PCI Root bridge driver should not reserve such
resources.

Well if my understanding is correct I think we have a problem here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74

As you can see pci_ecam_create() will conflict with the pnp driver
as it will try to reserve the resources from the MCFG table...

Maybe we need to rework pci_ecam_create() ?

Thanks

Gab

> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is