Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 04:53:55PM -0400, Dave Jones wrote:
> ok, with your config I get back to a console after the modesetting
> switch, but then it hangs in USB init.

Maybe because our machines are not that similar there? Can you take
my config but paste the usb part of yours and see whether it boots fine
then? It could be yours and mine have different USB hw...

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 04:03:52PM -0400, Dave Jones wrote:
 > On Thu, Apr 17, 2014 at 10:01:27PM +0200, Borislav Petkov wrote:
 >  > On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
 >  > > Just as X starts up, I see this in dmesg..
 >  > > 
 >  > > [   42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
 >  > 
 >  > FWIW, I have that too. It should be something i915-related:
 >  > 
 >  > [0.617673] [drm] Memory usable by graphics device = 2048M
 >  > [0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
 >  > [0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
 >  > [0.694631] [drm] Driver supports precise vblank timestamp query.
 >  > [0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
 >  > [0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
 >  > [0.799829] fbcon: inteldrmfb (fb0) is primary device
 >  > [1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
 > 
 > Can you send me your .config off-list ?
 > I wonder if this is something config specific that's causing me to see
 > this, and you not, given we've apparently got similar machines.

ok, with your config I get back to a console after the modesetting
switch, but then it hangs in USB init.

Hrmm.

Dave


Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 1:48 PM, Borislav Petkov  wrote:
> On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:
>> Thanks a lot for testing this out and debugging my issues.
>>
>> Here's a new version that looks for both device IDs I know about.
>>
>> I'm still nervous about the modeset problem Dave is seeing.  Since the
>> original patch wouldn't find an 8086:0c00 device on Dave's system, it
>> should have done nothing.  But since it caused a modesetting problem,
>> there's something else going on that I don't understand.
>
> Yeah, this is strange, to put it mildly. This quirk wouldn't have done
> anything besides the iteration over the pci devices with pci_get_device.
> Which wouldn't do anything (refcount increment or so) if it didn't find
> the device, right?

Right.

> Bah, today is the day of the strange bugs. :-\
>
>> PNP: Work around BIOS defects in Intel MCH area reporting
>>
>> From: Bjorn Helgaas 
>>
>> Work around BIOSes that don't report the entire Intel MCH area.
>>
>> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
>> PNP0C02 resource.  The MCH space was once 16KB, but is 32KB in newer parts.
>> Some BIOSes still report a PNP0C02 resource that is only 16KB, which means
>> the rest of the MCH space is consumed but unreported.
>>
>> This can cause resource map sanity check warnings or (theoretically) a
>> device conflict if we assigned the unreported space to another device.
>>
>> The Intel perf event uncore driver tripped over this when it claimed the
>> MCH region:
>>
>>   resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
>>   Info: mapping multiple BARs. Your kernel is fine.
>>
>> To prevent this, if we find a PNP0C02 resource that covers part of the MCH
>> space, extend it to cover the entire space.
>>
>> Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
>> Reported-by: Borislav Petkov 
>
> Yep, this one works fine:
>
> [0.403855] pnp 00:01: [Firmware Bug]: PNP resource [mem 0xfed10000-0xfed13fff] covers only part of 0000:00:00.0 Intel MCH; extending to [mem 0xfed10000-0xfed17fff]
>
> Acked-by: Borislav Petkov 
> Tested-by: Borislav Petkov 

>> + region.end = region.start + 32*1024 - 1 ;

> checkpatch complains about a trailing space before the semicolon.

Thanks!  I hate typos like that.

I'll fix this, add your tested-by and ack, and send to Rafael.

Bjorn

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 10:01:27PM +0200, Borislav Petkov wrote:
 > On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
 > > Just as X starts up, I see this in dmesg..
 > > 
 > > [   42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
 > 
 > FWIW, I have that too. It should be something i915-related:
 > 
 > [0.617673] [drm] Memory usable by graphics device = 2048M
 > [0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
 > [0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
 > [0.694631] [drm] Driver supports precise vblank timestamp query.
 > [0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
 > [0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
 > [0.799829] fbcon: inteldrmfb (fb0) is primary device
 > [1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

Can you send me your .config off-list ?
I wonder if this is something config specific that's causing me to see
this, and you not, given we've apparently got similar machines.

Dave


Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
> Just as X starts up, I see this in dmesg..
> 
> [   42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

FWIW, I have that too. It should be something i915-related:

[0.617673] [drm] Memory usable by graphics device = 2048M
[0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
[0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[0.694631] [drm] Driver supports precise vblank timestamp query.
[0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
[0.799829] fbcon: inteldrmfb (fb0) is primary device
[1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:
 > Thanks a lot for testing this out and debugging my issues.
 > 
 > Here's a new version that looks for both device IDs I know about.

I can confirm this patch does fix the backtrace.
I disabled lockdep, and now I can get to X each boot, but I still see
a black screen rather than a console between modesetting becoming active, and X 
starting.

(The lockdep thing turned out to be a known XFS false positive, but for
 some reason it actually caused X to lock up)

 > I'm still nervous about the modeset problem Dave is seeing.  Since the
 > original patch wouldn't find an 8086:0c00 device on Dave's system, it
 > should have done nothing.  But since it caused a modesetting problem,
 > there's something else going on that I don't understand.

I don't know if it's relevant, but this laptop (and, I suspect, many other
ThinkPads which seem affected) has dual gfx; both show up on the bus,
even though the nvidia isn't in use..

00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09) (prog-if 00 [VGA controller])
Subsystem: Lenovo Device 2200
Flags: bus master, fast devsel, latency 0, IRQ 44
Memory at f1000000 (64-bit, non-prefetchable) [size=4M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
I/O ports at 6000 [size=64]
Expansion ROM at  [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [d0] Power Management version 2
Capabilities: [a4] PCI Advanced Features
Kernel driver in use: i915

01:00.0 3D controller: NVIDIA Corporation GF117M [GeForce 610M/710M/820M / GT 620M/625M/630M/720M] (rev a1)
Subsystem: Lenovo NVS 5200M
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at f0000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at 5000 [size=128]
Expansion ROM at  [disabled]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [b4] Vendor Specific Information: Len=14 
Capabilities: [100] Virtual Channel
Capabilities: [128] Power Budgeting 
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 


Just as X starts up, I see this in dmesg..

[   42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

Dave


Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:
> Thanks a lot for testing this out and debugging my issues.
> 
> Here's a new version that looks for both device IDs I know about.
> 
> I'm still nervous about the modeset problem Dave is seeing.  Since the
> original patch wouldn't find an 8086:0c00 device on Dave's system, it
> should have done nothing.  But since it caused a modesetting problem,
> there's something else going on that I don't understand.

Yeah, this is strange, to put it mildly. This quirk wouldn't have done
anything besides the iteration over the pci devices with pci_get_device.
Which wouldn't do anything (refcount increment or so) if it didn't find
the device, right?

Bah, today is the day of the strange bugs. :-\

> PNP: Work around BIOS defects in Intel MCH area reporting
> 
> From: Bjorn Helgaas 
> 
> Work around BIOSes that don't report the entire Intel MCH area.
> 
> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
> PNP0C02 resource.  The MCH space was once 16KB, but is 32KB in newer parts.
> Some BIOSes still report a PNP0C02 resource that is only 16KB, which means
> the rest of the MCH space is consumed but unreported.
> 
> This can cause resource map sanity check warnings or (theoretically) a
> device conflict if we assigned the unreported space to another device.
> 
> The Intel perf event uncore driver tripped over this when it claimed the
> MCH region:
> 
>   resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
>   Info: mapping multiple BARs. Your kernel is fine.
> 
> To prevent this, if we find a PNP0C02 resource that covers part of the MCH
> space, extend it to cover the entire space.
> 
> Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
> Reported-by: Borislav Petkov 

Yep, this one works fine:

[0.403855] pnp 00:01: [Firmware Bug]: PNP resource [mem 0xfed10000-0xfed13fff] covers only part of 0000:00:00.0 Intel MCH; extending to [mem 0xfed10000-0xfed17fff]
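
For illustration, the extension reported in that log line can be reproduced in userspace. This is only a sketch, not the kernel code: struct range and extend_if_partial() are hypothetical stand-ins for struct resource and the loop body of quirk_intel_mch(), and the 0xfed10000 base is assumed from the 32KB-aligned end address 0xfed17fff.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct resource; start and end are inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Widen *res to *mch when res overlaps mch without matching it exactly.
 * Returns 1 if res was extended, 0 if it was left untouched.
 */
static int extend_if_partial(struct range *res, const struct range *mch)
{
	if (res->end < mch->start || res->start > mch->end)
		return 0;	/* no overlap */
	if (res->start == mch->start && res->end == mch->end)
		return 0;	/* exact match */

	res->start = mch->start;
	res->end = mch->end;
	return 1;
}
```

Called with res = {0xfed10000, 0xfed13fff} and mch = {0xfed10000, 0xfed17fff}, the helper returns 1 and leaves res covering the full 32KB. A resource that already matches, or one that does not overlap at all, is left alone.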

Acked-by: Borislav Petkov 
Tested-by: Borislav Petkov 

Just a minor nitpick below.

> Signed-off-by: Bjorn Helgaas 
> ---
>  drivers/pnp/quirks.c |   74 ++
>  1 file changed, 74 insertions(+)
> 
> diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
> index 258fef272ea7..403bd5c42ed1 100644
> --- a/drivers/pnp/quirks.c
> +++ b/drivers/pnp/quirks.c
> @@ -334,6 +334,79 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
>  }
>  #endif
>  
> +/* Device IDs of parts that have 32KB MCH space */
> +static const unsigned int mch_quirk_devices[] = {
> +	0x0154,	/* Ivy Bridge */
> +	0x0c00,	/* Haswell */
> +};
> +
> +static struct pci_dev *get_intel_host(void)
> +{
> +	int i;
> +	struct pci_dev *host;
> +
> +	for (i = 0; i < ARRAY_SIZE(mch_quirk_devices); i++) {
> +		host = pci_get_device(PCI_VENDOR_ID_INTEL, mch_quirk_devices[i],
> +				      NULL);
> +		if (host)
> +			return host;
> +	}
> +	return NULL;
> +}
> +
> +static void quirk_intel_mch(struct pnp_dev *dev)
> +{
> +	struct pci_dev *host;
> +	u32 addr_lo, addr_hi;
> +	struct pci_bus_region region;
> +	struct resource mch;
> +	struct pnp_resource *pnp_res;
> +	struct resource *res;
> +
> +	host = get_intel_host();
> +	if (!host)
> +		return;
> +
> +	/*
> +	 * MCHBAR is not an architected PCI BAR, so MCH space is usually
> +	 * reported as a PNP0C02 resource.  The MCH space was originally
> +	 * 16KB, but is 32KB in newer parts.  Some BIOSes still report a
> +	 * PNP0C02 resource that is only 16KB, which means the rest of the
> +	 * MCH space is consumed but unreported.
> +	 */
> +
> +	/*
> +	 * Read MCHBAR for Host Member Mapped Register Range Base
> +	 * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
> +	 * Sec 3.1.12.
> +	 */
> +	pci_read_config_dword(host, 0x48, &addr_lo);
> +	region.start = addr_lo & ~0x7fff;
> +	pci_read_config_dword(host, 0x4c, &addr_hi);
> +	region.start |= (dma_addr_t) addr_hi << 32;
> +	region.end = region.start + 32*1024 - 1 ;

checkpatch complains about a trailing space before the semicolon.

> +
> +	memset(&mch, 0, sizeof(mch));
> +	mch.flags = IORESOURCE_MEM;
> +	pcibios_bus_to_resource(host->bus, &mch, &region);
> +
> +	list_for_each_entry(pnp_res, &dev->resources, list) {
> +		res = &pnp_res->res;
> +		if (res->end < mch.start || res->start > mch.end)
> +			continue;	/* no overlap */
> +		if (res->start == mch.start && res->end == mch.end)
> +			continue;	/* exact match */
> +
> +		dev_info(&dev->dev, FW_BUG "PNP resource %pR covers only part of %s Intel MCH; extending to %pR\n",
> +			 res, pci_name(host), &mch);
> +

Re: Info: mapping multiple BARs. Your kernel is fine.

Thanks a lot for testing this out and debugging my issues.

Here's a new version that looks for both device IDs I know about.

I'm still nervous about the modeset problem Dave is seeing.  Since the
original patch wouldn't find an 8086:0c00 device on Dave's system, it
should have done nothing.  But since it caused a modesetting problem,
there's something else going on that I don't understand.

Bjorn



PNP: Work around BIOS defects in Intel MCH area reporting

From: Bjorn Helgaas 

Work around BIOSes that don't report the entire Intel MCH area.

MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
PNP0C02 resource.  The MCH space was once 16KB, but is 32KB in newer parts.
Some BIOSes still report a PNP0C02 resource that is only 16KB, which means
the rest of the MCH space is consumed but unreported.

This can cause resource map sanity check warnings or (theoretically) a
device conflict if we assigned the unreported space to another device.

The Intel perf event uncore driver tripped over this when it claimed the
MCH region:

  resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
  Info: mapping multiple BARs. Your kernel is fine.

To prevent this, if we find a PNP0C02 resource that covers part of the MCH
space, extend it to cover the entire space.

Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
Reported-by: Borislav Petkov 
Signed-off-by: Bjorn Helgaas 
---
 drivers/pnp/quirks.c |   74 ++
 1 file changed, 74 insertions(+)

diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index 258fef272ea7..403bd5c42ed1 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -334,6 +334,79 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
 }
 #endif
 
+/* Device IDs of parts that have 32KB MCH space */
+static const unsigned int mch_quirk_devices[] = {
+	0x0154,	/* Ivy Bridge */
+	0x0c00,	/* Haswell */
+};
+
+static struct pci_dev *get_intel_host(void)
+{
+	int i;
+	struct pci_dev *host;
+
+	for (i = 0; i < ARRAY_SIZE(mch_quirk_devices); i++) {
+		host = pci_get_device(PCI_VENDOR_ID_INTEL, mch_quirk_devices[i],
+				      NULL);
+		if (host)
+			return host;
+	}
+	return NULL;
+}
+
+static void quirk_intel_mch(struct pnp_dev *dev)
+{
+	struct pci_dev *host;
+	u32 addr_lo, addr_hi;
+	struct pci_bus_region region;
+	struct resource mch;
+	struct pnp_resource *pnp_res;
+	struct resource *res;
+
+	host = get_intel_host();
+	if (!host)
+		return;
+
+	/*
+	 * MCHBAR is not an architected PCI BAR, so MCH space is usually
+	 * reported as a PNP0C02 resource.  The MCH space was originally
+	 * 16KB, but is 32KB in newer parts.  Some BIOSes still report a
+	 * PNP0C02 resource that is only 16KB, which means the rest of the
+	 * MCH space is consumed but unreported.
+	 */
+
+	/*
+	 * Read MCHBAR for Host Member Mapped Register Range Base
+	 * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
+	 * Sec 3.1.12.
+	 */
+	pci_read_config_dword(host, 0x48, &addr_lo);
+	region.start = addr_lo & ~0x7fff;
+	pci_read_config_dword(host, 0x4c, &addr_hi);
+	region.start |= (dma_addr_t) addr_hi << 32;
+	region.end = region.start + 32*1024 - 1 ;
+
+	memset(&mch, 0, sizeof(mch));
+	mch.flags = IORESOURCE_MEM;
+	pcibios_bus_to_resource(host->bus, &mch, &region);
+
+	list_for_each_entry(pnp_res, &dev->resources, list) {
+		res = &pnp_res->res;
+		if (res->end < mch.start || res->start > mch.end)
+			continue;	/* no overlap */
+		if (res->start == mch.start && res->end == mch.end)
+			continue;	/* exact match */
+
+		dev_info(&dev->dev, FW_BUG "PNP resource %pR covers only part of %s Intel MCH; extending to %pR\n",
+			 res, pci_name(host), &mch);
+		res->start = mch.start;
+		res->end = mch.end;
+		break;
+	}
+
+	pci_dev_put(host);
+}
+
 /*
  *  PnP Quirks
  *  Cards or devices that need some tweaking due to incomplete resource info
@@ -364,6 +437,7 @@ static struct pnp_fixup pnp_fixups[] = {
 #ifdef CONFIG_AMD_NB
 	{"PNP0c01", quirk_amd_mmconfig_area},
 #endif
+	{"PNP0c02", quirk_intel_mch},
 	{""}
 };
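
A userspace sketch of the address arithmetic in quirk_intel_mch(), for readers who want to check the numbers by hand. The names mch_region() and struct bus_region are hypothetical; the 0x7fff mask and the 32KB size are taken from the patch, and the sample register values in the note below are assumptions, not readings from real hardware.

```c
#include <stdint.h>

#define MCH_SIZE (32 * 1024)	/* 32KB MCH window, as in the patch */

/* Hypothetical stand-in for struct pci_bus_region; end is inclusive. */
struct bus_region {
	uint64_t start;
	uint64_t end;
};

/* Combine the two MCHBAR dwords (config offsets 0x48 and 0x4c) into a range. */
static struct bus_region mch_region(uint32_t addr_lo, uint32_t addr_hi)
{
	struct bus_region r;

	r.start = addr_lo & ~(uint32_t)0x7fff;	/* drop the low 15 bits */
	r.start |= (uint64_t)addr_hi << 32;	/* upper half of the base */
	r.end = r.start + MCH_SIZE - 1;
	return r;
}
```

With addr_lo = 0xfed10001 and addr_hi = 0, this yields 0xfed10000-0xfed17fff: the low 15 bits are masked off and the window is 32KB long.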
 

Re: Info: mapping multiple BARs. Your kernel is fine.

Hi Bjorn,

thanks for the patch, a couple of notes below:

On Wed, Apr 16, 2014 at 04:56:00PM -0600, Bjorn Helgaas wrote:
> PNP: Work around Haswell BIOS defect in MCH area reporting
> 
> From: Bjorn Helgaas 
> 
> Work around a Haswell BIOS defect that causes part of the MCH area to be
> unreported.

Yep, what Stephane said, this is not HSW only.

> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
> PNP0C02 resource.  The MCH space was 16KB prior to Haswell, but it is 32KB
> in Haswell.  Some Haswell BIOSes still report a PNP0C02 resource that is
> only 16KB, which means the rest of the MCH space is consumed but
> unreported.
> 
> This can cause resource map sanity check warnings or (theoretically) a
> device conflict if we assigned the unreported space to another device.
> 
> The Intel perf event uncore driver tripped over this when it claimed the
> MCH region:
> 
>   resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
>   Info: mapping multiple BARs. Your kernel is fine.
> 
> To prevent this, if we find a PNP0C02 resource that covers part of the MCH
> space, extend it to cover the entire space.
> 
> Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
> Reported-by: Borislav Petkov 
> Signed-off-by: Bjorn Helgaas 
> ---
>  drivers/pnp/quirks.c |   55 ++
>  1 file changed, 55 insertions(+)
> 
> diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
> index 258fef272ea7..8402088d4145 100644
> --- a/drivers/pnp/quirks.c
> +++ b/drivers/pnp/quirks.c
> @@ -334,6 +334,60 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
>  }
>  #endif
>  
> +static void quirk_intel_haswell_mch(struct pnp_dev *dev)
> +{
> +	struct pci_dev *host;
> +	u32 addr_lo, addr_hi;
> +	struct pci_bus_region region;
> +	struct resource mch;
> +	struct pnp_resource *pnp_res;
> +	struct resource *res;
> +
> +	host = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0c00, NULL);

And because it is not HSW only, this PCI device ID doesn't match on my
IVB system. On mine the hostbridge is

00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
Subsystem: Lenovo Device 21fa
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0
Capabilities: <access denied>
Kernel driver in use: ivb_uncore
00: 86 80 54 01 06 00 90 20 09 00 00 06 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 fa 21
30: 00 00 00 00 e0 00 00 00 00 00 00 00 00 00 00 00

and from looking at Dave's, it is the same one, so PCI device ID is
0x154.
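
The ID can be read straight off the first row of the hexdump: PCI config space is little-endian, with the vendor ID at offset 0x00 and the device ID at offset 0x02. A small userspace sketch of that decoding (the helper names are hypothetical, not kernel API):

```c
#include <stdint.h>

/* Assemble a little-endian 16-bit value from two config-space bytes. */
static uint16_t le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Vendor ID lives at config offset 0x00, device ID at offset 0x02. */
static uint16_t pci_vendor_id(const uint8_t *cfg)
{
	return le16(cfg + 0x00);
}

static uint16_t pci_device_id(const uint8_t *cfg)
{
	return le16(cfg + 0x02);
}
```

Fed the bytes 86 80 54 01 from the dump above, this gives vendor 0x8086 (Intel) and device 0x0154.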

With that changed to

host = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0154, NULL);

and a bit of debugging code, it says now:

[0.235739] quirk_intel_haswell_mch: entry
[0.235800] quirk_intel_haswell_mch: got host: 0x0
[0.235860] quirk_intel_haswell_mch: mch: [mem 0xfed10000-0xfed17fff]
[0.235930] quirk_intel_haswell_mch: res: [mem 0xfed10000-0xfed13fff]
[0.235990] pnp 00:01: [Firmware Bug]: [mem 0xfed10000-0xfed13fff] covers only part of Intel Haswell MCH; extending to [mem 0xfed10000-0xfed17fff]

So you probably want to have a list of hostbridge pci ids in the quirk
or so.
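
Such a list could look roughly like the following userspace sketch. Only the two IDs mentioned in this thread are included; the helper name is_known_mch_host() is made up for illustration, and the set of affected host bridges may well be larger.

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define VENDOR_INTEL 0x8086

/* Host bridge device IDs seen in this thread; not an exhaustive list. */
static const uint16_t mch_host_ids[] = {
	0x0154,	/* Ivy Bridge */
	0x0c00,	/* Haswell */
};

/* Returns 1 if vendor:device is one of the listed Intel host bridges. */
static int is_known_mch_host(uint16_t vendor, uint16_t device)
{
	size_t i;

	if (vendor != VENDOR_INTEL)
		return 0;
	for (i = 0; i < ARRAY_SIZE(mch_host_ids); i++) {
		if (mch_host_ids[i] == device)
			return 1;
	}
	return 0;
}
```

Keeping the IDs in a table means supporting another host bridge is a one-line change rather than another hard-coded pci_get_device() call.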

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:
 Thanks a lot for testing this out and debugging my issues.
 
 Here's a new version that looks for both device IDs I know about.
 
 I'm still nervous about the modeset problem Dave is seeing.  Since the
 original patch wouldn't find an 8086:0c00 device on Dave's system, it
 should have done nothing.  But since it caused a modesetting problem,
 there's something else doing on that I don't understand.

Yeah, this is strange, to put it mildly. This quirk wouldnt've done
anything besides the iteration over the pci devices with pci_get_device.
Which wouldn't do anything (refcount increment or so) if it didn't find
the device, right?

Bah, today is the day of the strange bugs. :-\

> PNP: Work around BIOS defects in Intel MCH area reporting
> 
> From: Bjorn Helgaas <bhelg...@google.com>
> 
> Work around BIOSes that don't report the entire Intel MCH area.
> 
> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
> PNP0C02 resource.  The MCH space was once 16KB, but is 32KB in newer parts.
> Some BIOSes still report a PNP0C02 resource that is only 16KB, which means
> the rest of the MCH space is consumed but unreported.
> 
> This can cause resource map sanity check warnings or (theoretically) a
> device conflict if we assigned the unreported space to another device.
> 
> The Intel perf event uncore driver tripped over this when it claimed the
> MCH region:
> 
>   resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
>   Info: mapping multiple BARs. Your kernel is fine.
> 
> To prevent this, if we find a PNP0C02 resource that covers part of the MCH
> space, extend it to cover the entire space.
> 
> Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
> Reported-by: Borislav Petkov <b...@alien8.de>

Yep, this one works fine:

[    0.403855] pnp 00:01: [Firmware Bug]: PNP resource [mem 0xfed10000-0xfed13fff] covers only part of 0000:00:00.0 Intel MCH; extending to [mem 0xfed10000-0xfed17fff]

Acked-by: Borislav Petkov <b...@suse.de>
Tested-by: Borislav Petkov <b...@suse.de>

Just a minor nitpick below.

> Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
> ---
>  drivers/pnp/quirks.c |   74 ++
>  1 file changed, 74 insertions(+)
> 
> diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
> index 258fef272ea7..403bd5c42ed1 100644
> --- a/drivers/pnp/quirks.c
> +++ b/drivers/pnp/quirks.c
> @@ -334,6 +334,79 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
>  }
>  #endif
>  
> +/* Device IDs of parts that have 32KB MCH space */
> +static const unsigned int mch_quirk_devices[] = {
> +   0x0154, /* Ivy Bridge */
> +   0x0c00, /* Haswell */
> +};
> +
> +static struct pci_dev *get_intel_host(void)
> +{
> +   int i;
> +   struct pci_dev *host;
> +
> +   for (i = 0; i < ARRAY_SIZE(mch_quirk_devices); i++) {
> +           host = pci_get_device(PCI_VENDOR_ID_INTEL, mch_quirk_devices[i],
> +                                 NULL);
> +           if (host)
> +                   return host;
> +   }
> +   return NULL;
> +}
> +
> +static void quirk_intel_mch(struct pnp_dev *dev)
> +{
> +   struct pci_dev *host;
> +   u32 addr_lo, addr_hi;
> +   struct pci_bus_region region;
> +   struct resource mch;
> +   struct pnp_resource *pnp_res;
> +   struct resource *res;
> +
> +   host = get_intel_host();
> +   if (!host)
> +           return;
> +
> +   /*
> +    * MCHBAR is not an architected PCI BAR, so MCH space is usually
> +    * reported as a PNP0C02 resource.  The MCH space was originally
> +    * 16KB, but is 32KB in newer parts.  Some BIOSes still report a
> +    * PNP0C02 resource that is only 16KB, which means the rest of the
> +    * MCH space is consumed but unreported.
> +    */
> +
> +   /*
> +    * Read MCHBAR for Host Member Mapped Register Range Base
> +    * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
> +    * Sec 3.1.12.
> +    */
> +   pci_read_config_dword(host, 0x48, &addr_lo);
> +   region.start = addr_lo & ~0x7fff;
> +   pci_read_config_dword(host, 0x4c, &addr_hi);
> +   region.start |= (dma_addr_t) addr_hi << 32;
> +   region.end = region.start + 32*1024 - 1 ;

checkpatch complains about a trailing space before the semicolon.

> +
> +   memset(&mch, 0, sizeof(mch));
> +   mch.flags = IORESOURCE_MEM;
> +   pcibios_bus_to_resource(host->bus, &mch, &region);
> +
> +   list_for_each_entry(pnp_res, &dev->resources, list) {
> +           res = &pnp_res->res;
> +           if (res->end < mch.start || res->start > mch.end)
> +                   continue;   /* no overlap */
> +           if (res->start == mch.start && res->end == mch.end)
> +                   continue;   /* exact match */
> +
> +           dev_info(&dev->dev, FW_BUG "PNP resource %pR covers only part of %s Intel MCH; extending to %pR\n",
> +                    res, pci_name(host), &mch);
> +           res->start = mch.start;
> +

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:
 > Thanks a lot for testing this out and debugging my issues.
 > 
 > Here's a new version that looks for both device IDs I know about.

I can confirm this patch does fix the backtrace.
I disabled lockdep, and now I can get to X each boot, but I still see
a black screen rather than a console between modesetting becoming active, and X 
starting.

(The lockdep thing turned out to be a known XFS false positive, but for
 some reason it actually caused X to lock up)

 > I'm still nervous about the modeset problem Dave is seeing.  Since the
 > original patch wouldn't find an 8086:0c00 device on Dave's system, it
 > should have done nothing.  But since it caused a modesetting problem,
 > there's something else going on that I don't understand.

I don't know if it's relevant, but this laptop (and, I suspect, many other
ThinkPads that seem affected) has dual gfx. Both show up on the bus,
even though the nvidia isn't in use.

00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 2200
        Flags: bus master, fast devsel, latency 0, IRQ 44
        Memory at f1000000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 6000 [size=64]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [a4] PCI Advanced Features
        Kernel driver in use: i915

01:00.0 3D controller: NVIDIA Corporation GF117M [GeForce 610M/710M/820M / GT 620M/625M/630M/720M] (rev a1)
        Subsystem: Lenovo NVS 5200M
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at f0000000 (32-bit, non-prefetchable) [size=16M]
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 5000 [size=128]
        Expansion ROM at <ignored> [disabled]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>

Just as X starts up, I see this in dmesg..

[   42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

Dave


Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
> Just as X starts up, I see this in dmesg..
> 
> [   42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

FWIW, I have that too. It should be something i915-related:

[    0.617673] [drm] Memory usable by graphics device = 2048M
[    0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
[    0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    0.694631] [drm] Driver supports precise vblank timestamp query.
[    0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
[    0.799829] fbcon: inteldrmfb (fb0) is primary device
[    1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 10:01:27PM +0200, Borislav Petkov wrote:
 > On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
 > > Just as X starts up, I see this in dmesg..
 > > 
 > > [   42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
 > 
 > FWIW, I have that too. It should be something i915-related:
 > 
 > [    0.617673] [drm] Memory usable by graphics device = 2048M
 > [    0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
 > [    0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
 > [    0.694631] [drm] Driver supports precise vblank timestamp query.
 > [    0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
 > [    0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
 > [    0.799829] fbcon: inteldrmfb (fb0) is primary device
 > [    1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

Can you send me your .config off-list ?
I wonder if this is something config specific that's causing me to see
this, and you not, given we've apparently got similar machines.

Dave


Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 1:48 PM, Borislav Petkov <b...@alien8.de> wrote:
> On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:
>> Thanks a lot for testing this out and debugging my issues.
>>
>> Here's a new version that looks for both device IDs I know about.
>>
>> I'm still nervous about the modeset problem Dave is seeing.  Since the
>> original patch wouldn't find an 8086:0c00 device on Dave's system, it
>> should have done nothing.  But since it caused a modesetting problem,
>> there's something else going on that I don't understand.
>
> Yeah, this is strange, to put it mildly. This quirk wouldn't have done
> anything besides iterating over the PCI devices with pci_get_device(),
> which wouldn't do anything (refcount increment or so) if it didn't find
> the device, right?

Right.

> Bah, today is the day of the strange bugs. :-\
>
>> PNP: Work around BIOS defects in Intel MCH area reporting
>>
>> From: Bjorn Helgaas <bhelg...@google.com>
>>
>> Work around BIOSes that don't report the entire Intel MCH area.
>>
>> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
>> PNP0C02 resource.  The MCH space was once 16KB, but is 32KB in newer parts.
>> Some BIOSes still report a PNP0C02 resource that is only 16KB, which means
>> the rest of the MCH space is consumed but unreported.
>>
>> This can cause resource map sanity check warnings or (theoretically) a
>> device conflict if we assigned the unreported space to another device.
>>
>> The Intel perf event uncore driver tripped over this when it claimed the
>> MCH region:
>>
>>   resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
>>   Info: mapping multiple BARs. Your kernel is fine.
>>
>> To prevent this, if we find a PNP0C02 resource that covers part of the MCH
>> space, extend it to cover the entire space.
>>
>> Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
>> Reported-by: Borislav Petkov <b...@alien8.de>
>
> Yep, this one works fine:
>
> [    0.403855] pnp 00:01: [Firmware Bug]: PNP resource [mem 0xfed10000-0xfed13fff] covers only part of 0000:00:00.0 Intel MCH; extending to [mem 0xfed10000-0xfed17fff]
>
> Acked-by: Borislav Petkov <b...@suse.de>
> Tested-by: Borislav Petkov <b...@suse.de>
>
>> +   region.end = region.start + 32*1024 - 1 ;
>
> checkpatch complains about a trailing space before the semicolon.

Thanks!  I hate typos like that.

I'll fix this, add your tested-by and ack, and send to Rafael.

Bjorn
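
For illustration, the range check that the quirk's loop performs can be sketched in plain userspace C. This is only a sketch under assumptions: `struct range` and `extend_partial_overlap()` are hypothetical names, not kernel API, and end addresses are assumed to be inclusive, as in `struct resource`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct resource; 'end' is inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Walk the reported ranges. The first one that overlaps the MCH window
 * without matching it exactly is replaced by the full window.
 * Returns true if a range was changed.
 */
static bool extend_partial_overlap(struct range *res, size_t n,
				   const struct range *mch)
{
	assert(mch->start <= mch->end);

	for (size_t i = 0; i < n; i++) {
		if (res[i].end < mch->start || res[i].start > mch->end)
			continue;	/* no overlap */
		if (res[i].start == mch->start && res[i].end == mch->end)
			continue;	/* exact match */

		res[i].start = mch->start;
		res[i].end = mch->end;
		return true;
	}
	return false;
}
```

Like the patch, the sketch overwrites the partial range with the MCH window rather than computing a union, and it stops after the first partial overlap.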

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 04:03:52PM -0400, Dave Jones wrote:
 > On Thu, Apr 17, 2014 at 10:01:27PM +0200, Borislav Petkov wrote:
 >  > On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
 >  > > Just as X starts up, I see this in dmesg..
 >  > > 
 >  > > [   42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
 >  > 
 >  > FWIW, I have that too. It should be something i915-related:
 >  > 
 >  > [    0.617673] [drm] Memory usable by graphics device = 2048M
 >  > [    0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
 >  > [    0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
 >  > [    0.694631] [drm] Driver supports precise vblank timestamp query.
 >  > [    0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
 >  > [    0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
 >  > [    0.799829] fbcon: inteldrmfb (fb0) is primary device
 >  > [    1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
 > 
 > Can you send me your .config off-list ?
 > I wonder if this is something config specific that's causing me to see
 > this, and you not, given we've apparently got similar machines.

ok, with your config I get back to a console after the modesetting
switch, but then it hangs in USB init.

Hrmm.

Dave


Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Apr 17, 2014 at 04:53:55PM -0400, Dave Jones wrote:
> ok, with your config I get back to a console after the modesetting
> switch, but then it hangs in USB init.

Maybe because our machines are not that similar after all? Can you take
my config but paste in the USB part of yours and see whether it boots fine
then? It could be that yours and mine have different USB hw...

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Apr 16, 2014 at 04:56:00PM -0600, Bjorn Helgaas wrote:
 
 > > I'm seeing the exact same message on my thinkpad t430s.
 > > When I try your patch, modesetting no longer works. When it tries
 > > to change to the framebuffer I get a black screen and lockup.
 > > If I boot with nomodeset it locks up when it gets to X.
 > > It all scrolls by too fast to read, but it looks like there's still
 > > a backtrace present.
 > 
 > Ouch, sorry about that.  I do see a bug in my patch (fixed below), but I
 > don't see how that could cause what you're seeing.

updated diff made no difference fwiw.

 > Maybe I could figure
 > out something from this info (this can be from a kernel without my patch):
 > 
 > - dmesg log
 > - output of "find /sys/devices/pnp0 -name id -o -name resources | xargs grep ."
 > - output of "sudo lspci -s00:00.0 -xxx"

attached from a fedora build of rc1.

Dave

/sys/devices/pnp0/00:00/id:PNP0c01
/sys/devices/pnp0/00:00/resources:state = active
/sys/devices/pnp0/00:00/resources:mem 0x0-0x9ffff
/sys/devices/pnp0/00:00/resources:mem 0xc0000-0xc3fff
/sys/devices/pnp0/00:00/resources:mem 0xc4000-0xc7fff
/sys/devices/pnp0/00:00/resources:mem 0xc8000-0xcbfff
/sys/devices/pnp0/00:00/resources:mem 0xcc000-0xcffff
/sys/devices/pnp0/00:00/resources:mem 0xd0000-0xd3fff
/sys/devices/pnp0/00:00/resources:mem 0xd4000-0xd7fff
/sys/devices/pnp0/00:00/resources:mem 0xd8000-0xdbfff
/sys/devices/pnp0/00:00/resources:mem 0xdc000-0xdffff
/sys/devices/pnp0/00:00/resources:mem 0xe0000-0xe3fff
/sys/devices/pnp0/00:00/resources:mem 0xe4000-0xe7fff
/sys/devices/pnp0/00:00/resources:mem 0xe8000-0xebfff
/sys/devices/pnp0/00:00/resources:mem 0xec000-0xeffff
/sys/devices/pnp0/00:00/resources:mem 0xf0000-0xfffff
/sys/devices/pnp0/00:00/resources:mem 0x100000-0xbf9fffff
/sys/devices/pnp0/00:00/resources:mem 0xfec00000-0xfed3ffff
/sys/devices/pnp0/00:00/resources:mem 0xfed4c000-0xffffffff
/sys/devices/pnp0/00:01/id:PNP0c02
/sys/devices/pnp0/00:01/resources:state = active
/sys/devices/pnp0/00:01/resources:io 0x10-0x1f
/sys/devices/pnp0/00:01/resources:io 0x90-0x9f
/sys/devices/pnp0/00:01/resources:io 0x24-0x25
/sys/devices/pnp0/00:01/resources:io 0x28-0x29
/sys/devices/pnp0/00:01/resources:io 0x2c-0x2d
/sys/devices/pnp0/00:01/resources:io 0x30-0x31
/sys/devices/pnp0/00:01/resources:io 0x34-0x35
/sys/devices/pnp0/00:01/resources:io 0x38-0x39
/sys/devices/pnp0/00:01/resources:io 0x3c-0x3d
/sys/devices/pnp0/00:01/resources:io 0xa4-0xa5
/sys/devices/pnp0/00:01/resources:io 0xa8-0xa9
/sys/devices/pnp0/00:01/resources:io 0xac-0xad
/sys/devices/pnp0/00:01/resources:io 0xb0-0xb5
/sys/devices/pnp0/00:01/resources:io 0xb8-0xb9
/sys/devices/pnp0/00:01/resources:io 0xbc-0xbd
/sys/devices/pnp0/00:01/resources:io 0x50-0x53
/sys/devices/pnp0/00:01/resources:io 0x72-0x77
/sys/devices/pnp0/00:01/resources:io 0x400-0x47f
/sys/devices/pnp0/00:01/resources:io 0x500-0x57f
/sys/devices/pnp0/00:01/resources:io 0x800-0x80f
/sys/devices/pnp0/00:01/resources:io 0x15e0-0x15ef
/sys/devices/pnp0/00:01/resources:io 0x1600-0x167f
/sys/devices/pnp0/00:01/resources:mem 0xf8000000-0xfbffffff
/sys/devices/pnp0/00:01/resources:mem disabled
/sys/devices/pnp0/00:01/resources:mem 0xfed1c000-0xfed1ffff
/sys/devices/pnp0/00:01/resources:mem 0xfed10000-0xfed13fff
/sys/devices/pnp0/00:01/resources:mem 0xfed18000-0xfed18fff
/sys/devices/pnp0/00:01/resources:mem 0xfed19000-0xfed19fff
/sys/devices/pnp0/00:01/resources:mem 0xfed45000-0xfed4bfff
/sys/devices/pnp0/00:02/id:PNP0103
/sys/devices/pnp0/00:02/resources:state = active
/sys/devices/pnp0/00:02/resources:mem 0xfed00000-0xfed003ff
/sys/devices/pnp0/00:03/id:PNP0200
/sys/devices/pnp0/00:03/resources:state = active
/sys/devices/pnp0/00:03/resources:io 0x0-0xf
/sys/devices/pnp0/00:03/resources:io 0x80-0x8f
/sys/devices/pnp0/00:03/resources:io 0xc0-0xdf
/sys/devices/pnp0/00:03/resources:dma 4
/sys/devices/pnp0/00:04/id:PNP0800
/sys/devices/pnp0/00:04/resources:state = active
/sys/devices/pnp0/00:04/resources:io 0x61-0x61
/sys/devices/pnp0/00:05/id:PNP0c04
/sys/devices/pnp0/00:05/resources:state = active
/sys/devices/pnp0/00:05/resources:io 0xf0-0xf0
/sys/devices/pnp0/00:05/resources:irq 13
/sys/devices/pnp0/00:06/id:PNP0b00
/sys/devices/pnp0/00:06/resources:state = active
/sys/devices/pnp0/00:06/resources:io 0x70-0x71
/sys/devices/pnp0/00:06/resources:irq 8
/sys/devices/pnp0/00:07/id:LEN0071
/sys/devices/pnp0/00:07/id:PNP0303
/sys/devices/pnp0/00:07/resources:state = active
/sys/devices/pnp0/00:07/resources:io 0x60-0x60
/sys/devices/pnp0/00:07/resources:io 0x64-0x64
/sys/devices/pnp0/00:07/resources:irq 1
/sys/devices/pnp0/00:08/id:LEN0015
/sys/devices/pnp0/00:08/id:PNP0f13
/sys/devices/pnp0/00:08/resources:state = active
/sys/devices/pnp0/00:08/resources:irq 12
/sys/devices/pnp0/00:09/id:SMO1200
/sys/devices/pnp0/00:09/id:PNP0c31
/sys/devices/pnp0/00:09/resources:state = active
/sys/devices/pnp0/00:09/resources:mem 0xfed40000-0xfed44fff

00:00.0 Host bridge: Intel Corporation 3rd Gen Core

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Apr 16, 2014 at 5:08 PM, Stephane Eranian  wrote:
> On Wed, Apr 16, 2014 at 1:31 PM, Bjorn Helgaas  wrote:
>> On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
>>> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
>>> > Right.  Even if we had this long-term solution, we'd still have
>>> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>>> >
>>> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
>>> > the PNP resource if we found the matching PCI device and MCHBAR.  That
>>> > should solve Stephane's problem even with the current
>>> > drivers/pnp/system.c.
>>>
>>> Guys, this still triggers in -rc1. Do we have a fix or something
>>> testable at least?
>>
>> Hi Boris,
>>
>> Can you try the patch below?
>>
>>
>>
>> PNP: Work around Haswell BIOS defect in MCH area reporting
>>
>> From: Bjorn Helgaas 
>>
>> Work around a Haswell BIOS defect that causes part of the MCH area to be
>> unreported.
>>
>> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
>> PNP0C02 resource.  The MCH space was 16KB prior to Haswell, but it is 32KB
>> in Haswell.  Some Haswell BIOSes still report a PNP0C02 resource that is
>> only 16KB, which means the rest of the MCH space is consumed but
>> unreported.
>>
> Why are you saying this is Haswell vs. others. I see the problem on my
> IvyBridge laptop, like Boris.

Ah, good question.  Somewhere I got pointed to the Haswell docs, which
say 32KB.  I don't know what other parts have 32KB MCH spaces.  If we
could figure out a list of device IDs with 32KB spaces, we could add
that to the quirk.

But I don't know how to come up with a complete list.

Bjorn
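
The revised patch in this thread carries such a list as a two-entry table (0x0154 for Ivy Bridge, 0x0c00 for Haswell). A rough userspace sketch of that lookup follows. `has_32k_mch()` and the array name are hypothetical, and the table is only as complete as the IDs named in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host bridge device IDs named in the thread as having a 32KB MCH space. */
static const uint16_t mch_32k_devices[] = {
	0x0154,	/* Ivy Bridge */
	0x0c00,	/* Haswell */
};

/* Linear scan; the table is tiny, so nothing smarter is needed. */
static bool has_32k_mch(uint16_t device_id)
{
	size_t n = sizeof(mch_32k_devices) / sizeof(mch_32k_devices[0]);

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		if (mch_32k_devices[i] == device_id)
			return true;
	return false;
}
```

Adding another part would then only mean appending its device ID to the array, assuming its MCH space really is 32KB.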

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-04-16 Thread Stephane Eranian

On Wed, Apr 16, 2014 at 1:31 PM, Bjorn Helgaas  wrote:
> On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
>> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
>> > Right.  Even if we had this long-term solution, we'd still have
>> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>> >
>> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
>> > the PNP resource if we found the matching PCI device and MCHBAR.  That
>> > should solve Stephane's problem even with the current
>> > drivers/pnp/system.c.
>>
>> Guys, this still triggers in -rc1. Do we have a fix or something
>> testable at least?
>
> Hi Boris,
>
> Can you try the patch below?
>
>
>
> PNP: Work around Haswell BIOS defect in MCH area reporting
>
> From: Bjorn Helgaas 
>
> Work around a Haswell BIOS defect that causes part of the MCH area to be
> unreported.
>
> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
> PNP0C02 resource.  The MCH space was 16KB prior to Haswell, but it is 32KB
> in Haswell.  Some Haswell BIOSes still report a PNP0C02 resource that is
> only 16KB, which means the rest of the MCH space is consumed but
> unreported.
>
Why are you saying this is specific to Haswell? I see the problem on my
IvyBridge laptop, like Boris.

> This can cause resource map sanity check warnings or (theoretically) a
> device conflict if we assigned the unreported space to another device.
>
> The Intel perf event uncore driver tripped over this when it claimed the
> MCH region:
>
>   resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
>   Info: mapping multiple BARs. Your kernel is fine.
>
> To prevent this, if we find a PNP0C02 resource that covers part of the MCH
> space, extend it to cover the entire space.
>
> Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
> Reported-by: Borislav Petkov 
> Signed-off-by: Bjorn Helgaas 
> ---
>  drivers/pnp/quirks.c |   52 
> ++
>  1 file changed, 52 insertions(+)
>
> diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
> index 258fef272ea7..023edf592371 100644
> --- a/drivers/pnp/quirks.c
> +++ b/drivers/pnp/quirks.c
> @@ -334,6 +334,57 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
>  }
>  #endif
>
> +static void quirk_intel_haswell_mch(struct pnp_dev *dev)
> +{
> +   struct pci_dev *host;
> +   u32 addr_lo, addr_hi;
> +   struct pci_bus_region region;
> +   struct resource mch;
> +   struct pnp_resource *pnp_res;
> +   struct resource *res;
> +
> +   host = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0c00, NULL);
> +   if (!host)
> +           return;
> +
> +   /*
> +    * MCHBAR is not an architected PCI BAR, so MCH space is usually
> +    * reported as a PNP0C02 resource.  The MCH space was 16KB prior to
> +    * Haswell, but it is 32KB in Haswell.  Some Haswell BIOSes still
> +    * report a PNP0C02 resource that is only 16KB, which means the
> +    * rest of the MCH space is consumed but unreported.
> +    */
> +
> +   /*
> +    * Read MCHBAR for Host Member Mapped Register Range Base
> +    * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
> +    * Sec 3.1.12.
> +    */
> +   pci_read_config_dword(host, 0x48, &addr_lo);
> +   region.start = addr_lo & ~0x7fff;
> +   pci_read_config_dword(host, 0x4c, &addr_hi);
> +   region.start |= (dma_addr_t) addr_hi << 32;
> +   region.end = region.start + 32*1024 - 1 ;
> +   pcibios_bus_to_resource(host->bus, &mch, &region);
> +
> +   list_for_each_entry(pnp_res, &dev->resources, list) {
> +           res = &pnp_res->res;
> +           if (res->end < mch.start || res->start > mch.end)
> +                   continue;   /* no overlap */
> +           if (res->start == mch.start && res->end == mch.end)
> +                   continue;   /* exact match */
> +
> +           dev_info(&dev->dev, FW_BUG
> +                    "%pR covers only part of Intel Haswell MCH; extending to %pR\n",
> +                    res, &mch);
> +           res->start = mch.start;
> +           res->end = mch.end;
> +           break;
> +   }
> +
> +   pci_dev_put(host);
> +}
> +
>  /*
>   *  PnP Quirks
>   *  Cards or devices that need some tweaking due to incomplete resource info
> @@ -364,6 +415,7 @@ static struct pnp_fixup pnp_fixups[] = {
>  #ifdef CONFIG_AMD_NB
> {"PNP0c01", quirk_amd_mmconfig_area},
>  #endif
> +   {"PNP0c02", quirk_intel_haswell_mch},
> {""}
>  };
>
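
A minimal userspace sketch of the MCHBAR arithmetic in the quoted hunk, assuming the two dwords have already been read from config offsets 0x48 and 0x4c. `struct bus_region` and `mch_region()` are hypothetical names; the 0x7fff mask and the 32KB size are taken from the patch, and the sketch assumes the masked low bits carry no address information.

```c
#include <assert.h>
#include <stdint.h>

#define MCH_SIZE_32K (32u * 1024u)

/* Hypothetical stand-in for struct pci_bus_region; 'end' is inclusive. */
struct bus_region {
	uint64_t start;
	uint64_t end;
};

/*
 * Combine the low and high MCHBAR dwords into a 64-bit base address,
 * clearing the low 15 bits as the patch does, then derive the inclusive
 * end of a 32KB window.
 */
static struct bus_region mch_region(uint32_t addr_lo, uint32_t addr_hi)
{
	struct bus_region r;

	r.start = addr_lo & ~UINT32_C(0x7fff);
	r.start |= (uint64_t)addr_hi << 32;
	r.end = r.start + MCH_SIZE_32K - 1;
	assert(r.end > r.start);
	return r;
}
```

The cast to a 64-bit type before the shift matters: shifting a 32-bit value left by 32 would be undefined in C, which is why the patch casts `addr_hi` before shifting.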

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Apr 16, 2014 at 06:31:22PM -0400, Dave Jones wrote:
> On Wed, Apr 16, 2014 at 02:31:38PM -0600, Bjorn Helgaas wrote:
>  > On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
>  > > On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
>  > > > Right.  Even if we had this long-term solution, we'd still have
>  > > > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>  > > > 
>  > > > We do have a drivers/pnp/quirks.c where we could conceivably adjust
>  > > > the PNP resource if we found the matching PCI device and MCHBAR.  That
>  > > > should solve Stephane's problem even with the current
>  > > > drivers/pnp/system.c.
>  > > 
>  > > Guys, this still triggers in -rc1. Do we have a fix or something
>  > > testable at least?
>  > 
>  > Hi Boris,
>  > 
>  > Can you try the patch below?
> 
> I'm seeing the exact same message on my thinkpad t430s.
> When I try your patch, modesetting no longer works. When it tries
> to change to the framebuffer I get a black screen and lockup.
> If I boot with nomodeset it locks up when it gets to X.
> It all scrolls by too fast to read, but it looks like there's still
> a backtrace present.

Ouch, sorry about that.  I do see a bug in my patch (fixed below), but I
don't see how that could cause what you're seeing.  Maybe I could figure
out something from this info (this can be from a kernel without my patch):

- dmesg log
- output of "find /sys/devices/pnp0 -name id -o -name resources | xargs grep ."
- output of "sudo lspci -s00:00.0 -xxx"



PNP: Work around Haswell BIOS defect in MCH area reporting

From: Bjorn Helgaas 

Work around a Haswell BIOS defect that causes part of the MCH area to be
unreported.

MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
PNP0C02 resource.  The MCH space was 16KB prior to Haswell, but it is 32KB
in Haswell.  Some Haswell BIOSes still report a PNP0C02 resource that is
only 16KB, which means the rest of the MCH space is consumed but
unreported.

This can cause resource map sanity check warnings or (theoretically) a
device conflict if we assigned the unreported space to another device.

The Intel perf event uncore driver tripped over this when it claimed the
MCH region:

  resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
  Info: mapping multiple BARs. Your kernel is fine.

To prevent this, if we find a PNP0C02 resource that covers part of the MCH
space, extend it to cover the entire space.

Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
Reported-by: Borislav Petkov 
Signed-off-by: Bjorn Helgaas 
---
 drivers/pnp/quirks.c |   55 ++
 1 file changed, 55 insertions(+)

diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index 258fef272ea7..8402088d4145 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -334,6 +334,60 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
 }
 #endif
 
+static void quirk_intel_haswell_mch(struct pnp_dev *dev)
+{
+   struct pci_dev *host;
+   u32 addr_lo, addr_hi;
+   struct pci_bus_region region;
+   struct resource mch;
+   struct pnp_resource *pnp_res;
+   struct resource *res;
+
+   host = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0c00, NULL);
+   if (!host)
+           return;
+
+   /*
+    * MCHBAR is not an architected PCI BAR, so MCH space is usually
+    * reported as a PNP0C02 resource.  The MCH space was 16KB prior to
+    * Haswell, but it is 32KB in Haswell.  Some Haswell BIOSes still
+    * report a PNP0C02 resource that is only 16KB, which means the
+    * rest of the MCH space is consumed but unreported.
+    */
+
+   /*
+    * Read MCHBAR for Host Member Mapped Register Range Base
+    * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
+    * Sec 3.1.12.
+    */
+   pci_read_config_dword(host, 0x48, &addr_lo);
+   region.start = addr_lo & ~0x7fff;
+   pci_read_config_dword(host, 0x4c, &addr_hi);
+   region.start |= (dma_addr_t) addr_hi << 32;
+   region.end = region.start + 32*1024 - 1 ;
+
+   memset(&mch, 0, sizeof(mch));
+   mch.flags = IORESOURCE_MEM;
+   pcibios_bus_to_resource(host->bus, &mch, &region);
+
+   list_for_each_entry(pnp_res, &dev->resources, list) {
+           res = &pnp_res->res;
+           if (res->end < mch.start || res->start > mch.end)
+                   continue;   /* no overlap */
+           if (res->start == mch.start && res->end == mch.end)
+                   continue;   /* exact match */
+
+           dev_info(&dev->dev, FW_BUG
+                    "%pR covers only part of Intel Haswell MCH; extending to %pR\n",
+                    res, &mch);
+           res->start = mch.start;
+           res->end = mch.end;
+           break;
+   }
+
+   pci_dev_put(host);
+}
+
 /*

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Apr 16, 2014 at 02:31:38PM -0600, Bjorn Helgaas wrote:
 > On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
 > > On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
 > > > Right.  Even if we had this long-term solution, we'd still have
 > > > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
 > > > 
 > > > We do have a drivers/pnp/quirks.c where we could conceivably adjust
 > > > the PNP resource if we found the matching PCI device and MCHBAR.  That
 > > > should solve Stephane's problem even with the current
 > > > drivers/pnp/system.c.
 > > 
 > > Guys, this still triggers in -rc1. Do we have a fix or something
 > > testable at least?
 > 
 > Hi Boris,
 > 
 > Can you try the patch below?

I'm seeing the exact same message on my thinkpad t430s.
When I try your patch, modesetting no longer works. When it tries
to change to the framebuffer I get a black screen and lockup.
If I boot with nomodeset it locks up when it gets to X.
It all scrolls by too fast to read, but it looks like there's still
a backtrace present.

Dave


Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
> > Right.  Even if we had this long-term solution, we'd still have
> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
> > 
> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
> > the PNP resource if we found the matching PCI device and MCHBAR.  That
> > should solve Stephane's problem even with the current
> > drivers/pnp/system.c.
> 
> Guys, this still triggers in -rc1. Do we have a fix or something
> testable at least?

Hi Boris,

Can you try the patch below?



PNP: Work around Haswell BIOS defect in MCH area reporting

From: Bjorn Helgaas <bhelg...@google.com>

Work around a Haswell BIOS defect that causes part of the MCH area to be
unreported.

MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
PNP0C02 resource.  The MCH space was 16KB prior to Haswell, but it is 32KB
in Haswell.  Some Haswell BIOSes still report a PNP0C02 resource that is
only 16KB, which means the rest of the MCH space is consumed but
unreported.

This can cause resource map sanity check warnings or (theoretically) a
device conflict if we assigned the unreported space to another device.

The Intel perf event uncore driver tripped over this when it claimed the
MCH region:

  resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
  Info: mapping multiple BARs. Your kernel is fine.

To prevent this, if we find a PNP0C02 resource that covers part of the MCH
space, extend it to cover the entire space.

Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
Reported-by: Borislav Petkov <b...@alien8.de>
Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
---
 drivers/pnp/quirks.c |   52 ++
 1 file changed, 52 insertions(+)

diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index 258fef272ea7..023edf592371 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -334,6 +334,57 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
 }
 #endif
 
+static void quirk_intel_haswell_mch(struct pnp_dev *dev)
+{
+   struct pci_dev *host;
+   u32 addr_lo, addr_hi;
+   struct pci_bus_region region;
+   struct resource mch;
+   struct pnp_resource *pnp_res;
+   struct resource *res;
+
+   host = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0c00, NULL);
+   if (!host)
+       return;
+
+   /*
+    * MCHBAR is not an architected PCI BAR, so MCH space is usually
+    * reported as a PNP0C02 resource.  The MCH space was 16KB prior to
+    * Haswell, but it is 32KB in Haswell.  Some Haswell BIOSes still
+    * report a PNP0C02 resource that is only 16KB, which means the
+    * rest of the MCH space is consumed but unreported.
+    */
+
+   /*
+    * Read MCHBAR for Host Memory Mapped Register Range Base
+    * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
+    * Sec 3.1.12.
+    */
+   pci_read_config_dword(host, 0x48, &addr_lo);
+   region.start = addr_lo & ~0x7fff;
+   pci_read_config_dword(host, 0x4c, &addr_hi);
+   region.start |= (dma_addr_t) addr_hi << 32;
+   region.end = region.start + 32*1024 - 1;
+   pcibios_bus_to_resource(host->bus, &mch, &region);
+
+   list_for_each_entry(pnp_res, &dev->resources, list) {
+       res = &pnp_res->res;
+       if (res->end < mch.start || res->start > mch.end)
+           continue;   /* no overlap */
+       if (res->start == mch.start && res->end == mch.end)
+           continue;   /* exact match */
+
+       dev_info(&dev->dev, FW_BUG
+            "%pR covers only part of Intel Haswell MCH; extending to %pR\n",
+            res, &mch);
+       res->start = mch.start;
+       res->end = mch.end;
+       break;
+   }
+
+   pci_dev_put(host);
+}
+
 /*
  *  PnP Quirks
  *  Cards or devices that need some tweaking due to incomplete resource info
@@ -364,6 +415,7 @@ static struct pnp_fixup pnp_fixups[] = {
 #ifdef CONFIG_AMD_NB
    {"PNP0c01", quirk_amd_mmconfig_area},
 #endif
+   {"PNP0c02", quirk_intel_haswell_mch},
    {""}
 };
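
For readers following the address arithmetic in the quirk, here is a
minimal userspace sketch of the MCHBAR decode step.  It is not part of
the posted patch: struct mch_range, mchbar_decode() and the sample
register value mentioned afterwards are hypothetical, chosen only for
illustration.  The 0x7fff mask, the 32KB size and the lo/hi dword split
are the ones used in the diff.

```c
#include <stdint.h>

/* Inclusive address range, standing in for the kernel's struct resource. */
struct mch_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Combine the two 32-bit halves (config offsets 0x48 and 0x4c in the
 * patch), clear the low 15 bits, which are not address bits for a
 * 32KB-aligned window, and compute the inclusive end of that window.
 */
static struct mch_range mchbar_decode(uint32_t addr_lo, uint32_t addr_hi)
{
	struct mch_range r;

	r.start = addr_lo & ~UINT32_C(0x7fff);
	r.start |= (uint64_t)addr_hi << 32;
	r.end = r.start + 32 * 1024 - 1;
	return r;
}
```

With a hypothetical low dword of 0xfed10001 and a zero high dword, this
gives the range 0xfed10000-0xfed17fff, i.e. twice the 16KB that the
BIOS reports.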
 

RE: Info: mapping multiple BARs. Your kernel is fine.

2014-04-16 Thread Zhang, Rui



> -Original Message-
> From: Borislav Petkov [mailto:b...@alien8.de]
> Sent: Wednesday, April 16, 2014 12:04 PM
> To: Bjorn Helgaas; Rafael J. Wysocki
> Cc: Zhang, Rui; Lu, Aaron; lkml; x...@kernel.org; Linux PCI; ACPI Devel
> Maling List; Yinghai Lu; H. Peter Anvin; Stephane Eranian; Yan, Zheng Z
> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> Importance: High
> 
> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
> > Right.  Even if we had this long-term solution, we'd still have
> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
> >
> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
> > the PNP resource if we found the matching PCI device and MCHBAR.  That
> > should solve Stephane's problem even with the current
> > drivers/pnp/system.c.
> 
> Guys, this still triggers in -rc1. Do we have a fix or something
> testable at least?
> 
Could you please attach the dmesg output after a fresh boot in -rc1?

Thanks,
rui
> Thanks.
> 
> --
> Regards/Gruss,
> Boris.
> 
> Sent from a fat crate under my desk. Formatting is fine.
> --

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-04-16 Thread Borislav Petkov

On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
> Right.  Even if we had this long-term solution, we'd still have
> Stephane's current problem, because the PNP0C02 _CRS is still wrong.
> 
> We do have a drivers/pnp/quirks.c where we could conceivably adjust
> the PNP resource if we found the matching PCI device and MCHBAR.  That
> should solve Stephane's problem even with the current
> drivers/pnp/system.c.

Guys, this still triggers in -rc1. Do we have a fix or something
testable at least?

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Apr 16, 2014 at 06:31:22PM -0400, Dave Jones wrote:
> On Wed, Apr 16, 2014 at 02:31:38PM -0600, Bjorn Helgaas wrote:
>  > On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
>  > > On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
>  > > > Right.  Even if we had this long-term solution, we'd still have
>  > > > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>  > > >
>  > > > We do have a drivers/pnp/quirks.c where we could conceivably adjust
>  > > > the PNP resource if we found the matching PCI device and MCHBAR.  That
>  > > > should solve Stephane's problem even with the current
>  > > > drivers/pnp/system.c.
>  > >
>  > > Guys, this still triggers in -rc1. Do we have a fix or something
>  > > testable at least?
>  >
>  > Hi Boris,
>  >
>  > Can you try the patch below?
>
> I'm seeing the exact same message on my thinkpad t430s.
> When I try your patch, modesetting no longer works. When it tries
> to change to the framebuffer I get a black screen and lockup.
> If I boot with nomodeset it locks up when it gets to X.
> It all scrolls by too fast to read, but it looks like there's still
> a backtrace present.

Ouch, sorry about that.  I do see a bug in my patch (fixed below), but I
don't see how that could cause what you're seeing.  Maybe I could figure
out something from this info (this can be from a kernel without my patch):

- dmesg log
- output of "find /sys/devices/pnp0 -name id -o -name resources | xargs grep ."
- output of "sudo lspci -s00:00.0 -xxx"



PNP: Work around Haswell BIOS defect in MCH area reporting

From: Bjorn Helgaas <bhelg...@google.com>

Work around a Haswell BIOS defect that causes part of the MCH area to be
unreported.

MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
PNP0C02 resource.  The MCH space was 16KB prior to Haswell, but it is 32KB
in Haswell.  Some Haswell BIOSes still report a PNP0C02 resource that is
only 16KB, which means the rest of the MCH space is consumed but
unreported.

This can cause resource map sanity check warnings or (theoretically) a
device conflict if we assigned the unreported space to another device.

The Intel perf event uncore driver tripped over this when it claimed the
MCH region:

  resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
  Info: mapping multiple BARs. Your kernel is fine.

To prevent this, if we find a PNP0C02 resource that covers part of the MCH
space, extend it to cover the entire space.

Link: http://lkml.kernel.org/r/20140224162400.ge16...@pd.tnic
Reported-by: Borislav Petkov <b...@alien8.de>
Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
---
 drivers/pnp/quirks.c |   55 ++
 1 file changed, 55 insertions(+)

diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index 258fef272ea7..8402088d4145 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -334,6 +334,60 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
 }
 #endif
 
+static void quirk_intel_haswell_mch(struct pnp_dev *dev)
+{
+   struct pci_dev *host;
+   u32 addr_lo, addr_hi;
+   struct pci_bus_region region;
+   struct resource mch;
+   struct pnp_resource *pnp_res;
+   struct resource *res;
+
+   host = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0c00, NULL);
+   if (!host)
+       return;
+
+   /*
+    * MCHBAR is not an architected PCI BAR, so MCH space is usually
+    * reported as a PNP0C02 resource.  The MCH space was 16KB prior to
+    * Haswell, but it is 32KB in Haswell.  Some Haswell BIOSes still
+    * report a PNP0C02 resource that is only 16KB, which means the
+    * rest of the MCH space is consumed but unreported.
+    */
+
+   /*
+    * Read MCHBAR for Host Memory Mapped Register Range Base
+    * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
+    * Sec 3.1.12.
+    */
+   pci_read_config_dword(host, 0x48, &addr_lo);
+   region.start = addr_lo & ~0x7fff;
+   pci_read_config_dword(host, 0x4c, &addr_hi);
+   region.start |= (dma_addr_t) addr_hi << 32;
+   region.end = region.start + 32*1024 - 1;
+
+   memset(&mch, 0, sizeof(mch));
+   mch.flags = IORESOURCE_MEM;
+   pcibios_bus_to_resource(host->bus, &mch, &region);
+
+   list_for_each_entry(pnp_res, &dev->resources, list) {
+       res = &pnp_res->res;
+       if (res->end < mch.start || res->start > mch.end)
+           continue;   /* no overlap */
+       if (res->start == mch.start && res->end == mch.end)
+           continue;   /* exact match */
+
+       dev_info(&dev->dev, FW_BUG
+            "%pR covers only part of Intel Haswell MCH; extending to %pR\n",
+            res, &mch);
+       res->start = mch.start;
+       res->end = mch.end;
+       break;
+   }
+
+   pci_dev_put(host);
+}
+

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-04-16 Thread Stephane Eranian

On Wed, Apr 16, 2014 at 1:31 PM, Bjorn Helgaas <bhelg...@google.com> wrote:
> On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
>> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
>> > Right.  Even if we had this long-term solution, we'd still have
>> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>> >
>> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
>> > the PNP resource if we found the matching PCI device and MCHBAR.  That
>> > should solve Stephane's problem even with the current
>> > drivers/pnp/system.c.
>>
>> Guys, this still triggers in -rc1. Do we have a fix or something
>> testable at least?
>
> Hi Boris,
>
> Can you try the patch below?
>
>
>
> PNP: Work around Haswell BIOS defect in MCH area reporting
>
> From: Bjorn Helgaas <bhelg...@google.com>
>
> Work around a Haswell BIOS defect that causes part of the MCH area to be
> unreported.
>
> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
> PNP0C02 resource.  The MCH space was 16KB prior to Haswell, but it is 32KB
> in Haswell.  Some Haswell BIOSes still report a PNP0C02 resource that is
> only 16KB, which means the rest of the MCH space is consumed but
> unreported.

Why are you saying this is Haswell vs. others? I see the problem on my
IvyBridge laptop, like Boris.


Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Apr 16, 2014 at 5:08 PM, Stephane Eranian <eran...@google.com> wrote:
> On Wed, Apr 16, 2014 at 1:31 PM, Bjorn Helgaas <bhelg...@google.com> wrote:
>> On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
>>> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
>>> > Right.  Even if we had this long-term solution, we'd still have
>>> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>>> >
>>> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
>>> > the PNP resource if we found the matching PCI device and MCHBAR.  That
>>> > should solve Stephane's problem even with the current
>>> > drivers/pnp/system.c.
>>>
>>> Guys, this still triggers in -rc1. Do we have a fix or something
>>> testable at least?
>>
>> Hi Boris,
>>
>> Can you try the patch below?
>>
>>
>>
>> PNP: Work around Haswell BIOS defect in MCH area reporting
>>
>> From: Bjorn Helgaas <bhelg...@google.com>
>>
>> Work around a Haswell BIOS defect that causes part of the MCH area to be
>> unreported.
>>
>> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
>> PNP0C02 resource.  The MCH space was 16KB prior to Haswell, but it is 32KB
>> in Haswell.  Some Haswell BIOSes still report a PNP0C02 resource that is
>> only 16KB, which means the rest of the MCH space is consumed but
>> unreported.
>
> Why are you saying this is Haswell vs. others? I see the problem on my
> IvyBridge laptop, like Boris.

Ah, good question.  Somewhere I got pointed to the Haswell docs, which
say 32KB.  I don't know what other parts have 32KB MCH spaces.  If we
could figure out a list of device IDs with 32KB spaces, we could add
that to the quirk.

But I don't know how to come up with a complete list.

Bjorn
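
A device-ID list of the kind described above could be as small as a
lookup table.  The following is only a sketch in plain userspace C, not
code from any posted patch: mch_32k_ids[] and mch_is_32k() are
hypothetical names, 0x0c00 is the Haswell ID the quirk already looks
up, and the Ivy Bridge entry is an assumption that would need to be
confirmed (for example with "lspci -n" on an affected machine) before
being relied on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host bridge device IDs assumed to have a 32KB MCH window. */
static const uint16_t mch_32k_ids[] = {
	0x0c00,	/* Haswell: the ID used by the posted quirk */
	0x0154,	/* ASSUMPTION: an Ivy Bridge host bridge; verify before use */
};

/* Return true if device_id is in the table above. */
static bool mch_is_32k(uint16_t device_id)
{
	size_t i;

	for (i = 0; i < sizeof(mch_32k_ids) / sizeof(mch_32k_ids[0]); i++)
		if (mch_32k_ids[i] == device_id)
			return true;
	return false;
}
```

The quirk would then try each listed ID in turn instead of hard-coding
a single one; how complete such a table can be made is exactly the open
question.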

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Apr 16, 2014 at 04:56:00PM -0600, Bjorn Helgaas wrote:
 >
 > > I'm seeing the exact same message on my thinkpad t430s.
 > > When I try your patch, modesetting no longer works. When it tries
 > > to change to the framebuffer I get a black screen and lockup.
 > > If I boot with nomodeset it locks up when it gets to X.
 > > It all scrolls by too fast to read, but it looks like there's still
 > > a backtrace present.
 >
 > Ouch, sorry about that.  I do see a bug in my patch (fixed below), but I
 > don't see how that could cause what you're seeing.

updated diff made no difference fwiw.

 > Maybe I could figure
 > out something from this info (this can be from a kernel without my patch):
 >
 > - dmesg log
 > - output of "find /sys/devices/pnp0 -name id -o -name resources | xargs grep ."
 > - output of "sudo lspci -s00:00.0 -xxx"

attached from a fedora build of rc1.

Dave

/sys/devices/pnp0/00:00/id:PNP0c01
/sys/devices/pnp0/00:00/resources:state = active
/sys/devices/pnp0/00:00/resources:mem 0x0-0x9ffff
/sys/devices/pnp0/00:00/resources:mem 0xc0000-0xc3fff
/sys/devices/pnp0/00:00/resources:mem 0xc4000-0xc7fff
/sys/devices/pnp0/00:00/resources:mem 0xc8000-0xcbfff
/sys/devices/pnp0/00:00/resources:mem 0xcc000-0xcffff
/sys/devices/pnp0/00:00/resources:mem 0xd0000-0xd3fff
/sys/devices/pnp0/00:00/resources:mem 0xd4000-0xd7fff
/sys/devices/pnp0/00:00/resources:mem 0xd8000-0xdbfff
/sys/devices/pnp0/00:00/resources:mem 0xdc000-0xdffff
/sys/devices/pnp0/00:00/resources:mem 0xe0000-0xe3fff
/sys/devices/pnp0/00:00/resources:mem 0xe4000-0xe7fff
/sys/devices/pnp0/00:00/resources:mem 0xe8000-0xebfff
/sys/devices/pnp0/00:00/resources:mem 0xec000-0xeffff
/sys/devices/pnp0/00:00/resources:mem 0xf0000-0xfffff
/sys/devices/pnp0/00:00/resources:mem 0x100000-0xbf9fffff
/sys/devices/pnp0/00:00/resources:mem 0xfec00000-0xfed3ffff
/sys/devices/pnp0/00:00/resources:mem 0xfed4c000-0xffffffff
/sys/devices/pnp0/00:01/id:PNP0c02
/sys/devices/pnp0/00:01/resources:state = active
/sys/devices/pnp0/00:01/resources:io 0x10-0x1f
/sys/devices/pnp0/00:01/resources:io 0x90-0x9f
/sys/devices/pnp0/00:01/resources:io 0x24-0x25
/sys/devices/pnp0/00:01/resources:io 0x28-0x29
/sys/devices/pnp0/00:01/resources:io 0x2c-0x2d
/sys/devices/pnp0/00:01/resources:io 0x30-0x31
/sys/devices/pnp0/00:01/resources:io 0x34-0x35
/sys/devices/pnp0/00:01/resources:io 0x38-0x39
/sys/devices/pnp0/00:01/resources:io 0x3c-0x3d
/sys/devices/pnp0/00:01/resources:io 0xa4-0xa5
/sys/devices/pnp0/00:01/resources:io 0xa8-0xa9
/sys/devices/pnp0/00:01/resources:io 0xac-0xad
/sys/devices/pnp0/00:01/resources:io 0xb0-0xb5
/sys/devices/pnp0/00:01/resources:io 0xb8-0xb9
/sys/devices/pnp0/00:01/resources:io 0xbc-0xbd
/sys/devices/pnp0/00:01/resources:io 0x50-0x53
/sys/devices/pnp0/00:01/resources:io 0x72-0x77
/sys/devices/pnp0/00:01/resources:io 0x400-0x47f
/sys/devices/pnp0/00:01/resources:io 0x500-0x57f
/sys/devices/pnp0/00:01/resources:io 0x800-0x80f
/sys/devices/pnp0/00:01/resources:io 0x15e0-0x15ef
/sys/devices/pnp0/00:01/resources:io 0x1600-0x167f
/sys/devices/pnp0/00:01/resources:mem 0xf8000000-0xfbffffff
/sys/devices/pnp0/00:01/resources:mem disabled
/sys/devices/pnp0/00:01/resources:mem 0xfed1c000-0xfed1ffff
/sys/devices/pnp0/00:01/resources:mem 0xfed10000-0xfed13fff
/sys/devices/pnp0/00:01/resources:mem 0xfed18000-0xfed18fff
/sys/devices/pnp0/00:01/resources:mem 0xfed19000-0xfed19fff
/sys/devices/pnp0/00:01/resources:mem 0xfed45000-0xfed4bfff
/sys/devices/pnp0/00:02/id:PNP0103
/sys/devices/pnp0/00:02/resources:state = active
/sys/devices/pnp0/00:02/resources:mem 0xfed00000-0xfed003ff
/sys/devices/pnp0/00:03/id:PNP0200
/sys/devices/pnp0/00:03/resources:state = active
/sys/devices/pnp0/00:03/resources:io 0x0-0xf
/sys/devices/pnp0/00:03/resources:io 0x80-0x8f
/sys/devices/pnp0/00:03/resources:io 0xc0-0xdf
/sys/devices/pnp0/00:03/resources:dma 4
/sys/devices/pnp0/00:04/id:PNP0800
/sys/devices/pnp0/00:04/resources:state = active
/sys/devices/pnp0/00:04/resources:io 0x61-0x61
/sys/devices/pnp0/00:05/id:PNP0c04
/sys/devices/pnp0/00:05/resources:state = active
/sys/devices/pnp0/00:05/resources:io 0xf0-0xf0
/sys/devices/pnp0/00:05/resources:irq 13
/sys/devices/pnp0/00:06/id:PNP0b00
/sys/devices/pnp0/00:06/resources:state = active
/sys/devices/pnp0/00:06/resources:io 0x70-0x71
/sys/devices/pnp0/00:06/resources:irq 8
/sys/devices/pnp0/00:07/id:LEN0071
/sys/devices/pnp0/00:07/id:PNP0303
/sys/devices/pnp0/00:07/resources:state = active
/sys/devices/pnp0/00:07/resources:io 0x60-0x60
/sys/devices/pnp0/00:07/resources:io 0x64-0x64
/sys/devices/pnp0/00:07/resources:irq 1
/sys/devices/pnp0/00:08/id:LEN0015
/sys/devices/pnp0/00:08/id:PNP0f13
/sys/devices/pnp0/00:08/resources:state = active
/sys/devices/pnp0/00:08/resources:irq 12
/sys/devices/pnp0/00:09/id:SMO1200
/sys/devices/pnp0/00:09/id:PNP0c31
/sys/devices/pnp0/00:09/resources:state = active
/sys/devices/pnp0/00:09/resources:mem 0xfed40000-0xfed44fff

00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev
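
To relate this dump to the warning: PNP device 00:01 reports a 16KB
window at the MCH base (0xfed10000-0xfed13fff), while the mapping
request in the log line from the patch description covers
0xfed10000-0xfed15fff.  The sketch below is a simplified userspace
model of that kind of "request runs past a reported resource" test.  It
is an assumption about the shape of the check, not the kernel's actual
code, and struct span and spans_past() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range. */
struct span {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true when the request overlaps the reported resource but is
 * not fully contained in it, i.e. it runs past one of its edges.
 */
static bool spans_past(struct span req, struct span reported)
{
	if (req.end < reported.start || req.start > reported.end)
		return false;	/* disjoint: nothing to complain about */
	if (req.start >= reported.start && req.end <= reported.end)
		return false;	/* request sits entirely inside */
	return true;
}
```

Under this model the 24KB request trips against the 16KB report, and
would stop tripping once the report covers the full 32KB window
0xfed10000-0xfed17fff.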

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Mar 20, 2014 at 2:55 PM, Rafael J. Wysocki  wrote:
> On Thursday, March 20, 2014 10:45:52 AM Bjorn Helgaas wrote:
>> The purpose of system.c is indeed to prevent resources from being
>> allocated to other devices.  This is really a question for Rafael, but
>> in my opinion this function (reserving resources of PNP/ACPI devices
>> to prevent their allocation to other devices) should be done for *all*
>> PNP and ACPI devices, not just the PNP0C01/PNP0C02 devices handled by
>> system.c.
>>
>> So I think the best solution would be to move that into the ACPI core
>> somehow so it happens for all devices.
>
> Well, I think you got to the bottom of this, but that's something we can
> do long-term.  Still, we need to find a short-term solution for the
> particular issue at hand.

Right.  Even if we had this long-term solution, we'd still have
Stephane's current problem, because the PNP0C02 _CRS is still wrong.

We do have a drivers/pnp/quirks.c where we could conceivably adjust
the PNP resource if we found the matching PCI device and MCHBAR.  That
should solve Stephane's problem even with the current
drivers/pnp/system.c.

Bjorn
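
As a rough illustration of the adjustment described above, here is a
userspace sketch of "extend the PNP resource if it only partly covers
the MCH window".  It models resources as plain inclusive ranges;
struct range and extend_to_mch() are hypothetical names, and the real
quirk would operate on struct pnp_resource entries inside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, standing in for a PNP memory resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * If res overlaps the MCH window without matching it exactly, replace
 * it with the full window.  Returns true when res was changed.
 */
static bool extend_to_mch(struct range *res, const struct range *mch)
{
	if (res->end < mch->start || res->start > mch->end)
		return false;	/* no overlap: unrelated resource */
	if (res->start == mch->start && res->end == mch->end)
		return false;	/* already reported correctly */

	*res = *mch;		/* partial cover: use the whole window */
	return true;
}
```

Applied to a 16KB report of 0xfed10000-0xfed13fff and a 32KB window of
0xfed10000-0xfed17fff, the report grows to the full window; a
neighbouring resource such as 0xfed18000-0xfed18fff is left alone.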

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thursday, March 20, 2014 10:45:52 AM Bjorn Helgaas wrote:
> On Wed, Mar 19, 2014 at 9:03 PM, Zhang, Rui  wrote:
> 
> > I've talked with Yan Zheng, and I was told that this resource
> > "0xfed10000 - 0xfed15fff" is obtained directly from a PCI device
> > register, which is not in its BAR range.
> > Thus IMO, it is impossible for the PNP layer to be aware of this resource.
> 
> Slow down, this isn't quite correct.  The *base* address (0xfed10000)
> is from a PCI config register (MCHBAR, at PCI config offset 0x48) [1].
> This is a device-dependent register, so the PCI core knows neither
> the base nor the size.
> 
> The device consumes address space that is not reported via the
> architected PCI mechanism, so the only way to report that space is via
> the PNP0C02 ACPI device.  The BIOS has to determine the base and size
> based on its knowledge of the hardware.  On this hardware, per the
> spec in [1], the region described by MCHBAR is 32KB in size.
> 
> The 0x6000 (24KB) size of the region above comes from the driver and
> is actually less than what the device consumes.  It is legal for a
> driver to request only the area it requires, but the entire area
> consumed by the device should be reported via the PNP0C02 device.  The
> fact that PNP0C02 reports 16KB but the device actually consumes 32KB
> is a BIOS defect.  This probably happened because previous versions of
> this chip consumed only 16KB, and the BIOS didn't get updated for the
> change.
> 
> > BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
> > resources from being allocated to uninitialized PCI devices, then IMO,
> > the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
> > resources directly, probably via a platform callback, say,
> > 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking 
> > pnp_dev->protocol.
> > 2. introduce acpi_check_reserved_resource() to parsing PNP0C01/PNP0C02 
> > resources.
> > 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
> >resources to PCI devices.
> 
> The purpose of system.c is indeed to prevent resources from being
> allocated to other devices.  This is really a question for Rafael, but
> in my opinion this function (reserving resources of PNP/ACPI devices
> to prevent their allocation to other devices) should be done for *all*
> PNP and ACPI devices, not just the PNP0C01/PNP0C02 devices handled by
> system.c.
> 
> So I think the best solution would be to move that into the ACPI core
> somehow so it happens for all devices.

Well, I think you got to the bottom of this, but that's something we can
do long-term.  Still, we need to find a short-term solution for the
particular issue at hand.

> If we had that, we could get
> rid of system.c altogether, and we wouldn't have to do anything
> special in PCI.  This is much easier to say than to do, however,
> because there are all kinds of issues with legacy resource
> reservations, and we currently can't really deal with overlapping
> resources.

Indeed.

All above said, appended is the relevant piece of the DSDT from the machine
in question (and that is in the PCI host bridge scope).

So we have a PCI device with an ACPI object called LPC which has a child
called SIO and the _HID of that child is "PNP0C02".

I'm not sure if the way system.c handles this is correct in this particular
case to be honest.


Device (LPC)
{
    Name (_ADR, 0x001F)
    Name (_S3D, 0x03)
    Name (RID, 0x00)
    Device (SIO)
    {
        Name (_HID, EisaId ("PNP0C02"))
        Name (_UID, 0x00)
        Name (SCRS, ResourceTemplate ()
        {
            IO (Decode16,
                0x0010, // Range Minimum
                0x0010, // Range Maximum
                0x01,   // Alignment
                0x10,   // Length
                )
            IO (Decode16,
                0x0090, // Range Minimum
                0x0090, // Range Maximum
                0x01,   // Alignment
                0x10,   // Length
                )
            IO (Decode16,
                0x0024, // Range Minimum
                0x0024, // Range Maximum
                0x01,   // Alignment
                0x02,   // Length
                )
            IO (Decode16,
                0x0028, // Range Minimum
                0x0028, // Range Maximum
                0x01,   // Alignment
                0x02,   // Length
                )

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Mar 19, 2014 at 9:03 PM, Zhang, Rui  wrote:

> I've talked with Yan Zheng, and I was told that this resource
> "0xfed10000 - 0xfed15fff"
> is got from PCI device register directly, which is not in its BAR range.
> Thus IMO, it is impossible for PNP layer to be aware of this resource.

Slow down, this isn't quite correct.  The *base* address (0xfed10000)
is from a PCI config register (MCHBAR, at PCI config offset 0x48) [1].
 This is a device-dependent register, so the PCI core knows neither
the base nor the size.

The device consumes address space that is not reported via the
architected PCI mechanism, so the only way to report that space is via
the PNP0C02 ACPI device.  The BIOS has to determine the base and size
based on its knowledge of the hardware.  On this hardware, per the
spec in [1], the region described by MCHBAR is 32KB in size.

The 0x6000 (24KB) size of the region above comes from the driver and
is actually less than what the device consumes.  It is legal for a
driver to request only the area it requires, but the entire area
consumed by the device should be reported via the PNP0C02 device.  The
fact that PNP0C02 reports 16KB but the device actually consumes 32KB
is a BIOS defect.  This probably happened because previous versions of
this chip consumed only 16KB, and the BIOS didn't get updated for the
change.
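
To make the three sizes above concrete, here is a minimal userspace sketch
of the condition that leads to the warning: a mapping that touches a
reserved region without being contained in it. This is a simplified model,
not the kernel's actual sanity-check code. The struct and function names
are made up for illustration, and the base address 0xfed10000 is assumed
from the ranges quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, similar in spirit to a start/end resource pair. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* True if the two inclusive ranges share at least one address. */
bool ranges_overlap(struct range a, struct range b)
{
	return a.start <= b.end && b.start <= a.end;
}

/* True if 'outer' fully contains 'inner'. */
bool range_contains(struct range outer, struct range inner)
{
	return outer.start <= inner.start && inner.end <= outer.end;
}

/*
 * Simplified model (an assumption, not kernel code): the mapping is
 * suspicious when it overlaps a reservation but spills past its edges.
 */
bool mapping_spans_reservation(struct range map, struct range reserved)
{
	assert(map.start <= map.end && reserved.start <= reserved.end);
	return ranges_overlap(map, reserved) && !range_contains(reserved, map);
}
```

With the 16KB PNP0C02 range (0xfed10000 - 0xfed13fff) and the driver's
0x6000 mapping (0xfed10000 - 0xfed15fff) the check fires. If the BIOS
reported the full 32KB (0xfed10000 - 0xfed17fff), the same mapping would be
contained and the check would not fire.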

> BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
> resources from being allocated to uninitialized PCI devices, then IMO,
> the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
> resources directly, probably via a platform callback, say,
> 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking 
> pnp_dev->protocol.
> 2. introduce acpi_check_reserved_resource() to parsing PNP0C01/PNP0C02 
> resources.
> 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
>resources to PCI devices.

The purpose of system.c is indeed to prevent resources from being
allocated to other devices.  This is really a question for Rafael, but
in my opinion this function (reserving resources of PNP/ACPI devices
to prevent their allocation to other devices) should be done for *all*
PNP and ACPI devices, not just the PNP0C01/PNP0C02 devices handled by
system.c.

So I think the best solution would be to move that into the ACPI core
somehow so it happens for all devices.  If we had that, we could get
rid of system.c altogether, and we wouldn't have to do anything
special in PCI.  This is much easier to say than to do, however,
because there are all kinds of issues with legacy resource
reservations, and we currently can't really deal with overlapping
resources.

Bjorn

[1] https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet,
    sec. 3.1.2 on p. 61

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-20 Thread Stephane Eranian

On Thu, Mar 20, 2014 at 9:16 AM, Yan, Zheng  wrote:
> On 03/20/2014 03:53 PM, Zhang, Rui wrote:
>> The resource length is also hardcoded to 0x6000, right?
>> This is probably a problem, because
>> only if the resource length read from PCI config space is larger than 0x4000,
>> drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
>> resource 0xfed10000 - 0xfed13fff, and the PCI device can request this
>> resource successfully.
>> In order to check this, can you please attach the dmesg output after boot?
>
> maybe the issue can be fixed by below untested patch
>
> ---
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c 
> b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> index fd5e883..2b3d834 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -1701,7 +1701,7 @@ static struct uncore_event_desc snb_uncore_imc_events[] 
> = {
>  #define SNB_UNCORE_PCI_IMC_BAR_OFFSET  0x48
>
>  /* page size multiple covering all config regs */
> -#define SNB_UNCORE_PCI_IMC_MAP_SIZE    0x6000
> +#define SNB_UNCORE_PCI_IMC_MAP_SIZE    0x8
>
I assume ioremap() works on page boundaries.
Eventually I want to expose the other counters too, not just reads and
writes (8 bytes total).

The size of 0x6000 comes from the counter offsets: BAR + 0x5040 to BAR + 0x5054.
Maybe a better way of doing this would be to remap just the one page
holding them instead of the 6 pages covering the entire BAR + counters.
That would need changes in read_counter(), but that is okay.

So that would be something along the lines of:

addr = (addr + 0x5040) & ~(PAGE_SIZE - 1);
ioremap(addr, 0x1000);
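
The single-page idea can be sketched in plain userspace C. The page base
comes from clearing the low address bits, and the in-page offset that a
read_counter() would then use comes from keeping them. The helper names and
the 4 KiB page size are assumptions for illustration. The offsets 0x5040
and 0x5054 are the ones mentioned above, and each counter is assumed to be
4 bytes wide.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000ULL /* assumed 4 KiB pages */

/* Page-aligned address of the page that contains 'addr'. */
uint64_t page_base(uint64_t addr)
{
	return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/* Byte offset of 'addr' within its page. */
uint64_t page_offset(uint64_t addr)
{
	return addr & (SKETCH_PAGE_SIZE - 1);
}

/* Nonzero if the inclusive byte range [first, last] sits in one page. */
int same_page(uint64_t first, uint64_t last)
{
	assert(first <= last);
	return page_base(first) == page_base(last);
}
```

For a BAR at 0xfed10000, the counters at BAR + 0x5040 through BAR + 0x5057
all fall in the page at 0xfed15000, at in-page offsets 0x40 through 0x57,
so one page would be enough for them.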


>  #define SNB_UNCORE_PCI_IMC_DATA_READS  0x1
>  #define SNB_UNCORE_PCI_IMC_DATA_READS_BASE 0x5050
> @@ -1736,7 +1736,8 @@ static void snb_uncore_imc_init_box(struct 
> intel_uncore_box *box)
>
> addr &= ~(PAGE_SIZE - 1);
>
> -   box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
> +   box->io_addr = ioremap(addr + SNB_UNCORE_PCI_IMC_CTR_BASE,
> +  SNB_UNCORE_PCI_IMC_MAP_SIZE);
> box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
>  }
>
> @@ -1832,7 +1833,7 @@ static int snb_uncore_imc_event_init(struct perf_event 
> *event)
> }
>
> /* must be done before validate_group */
> -   event->hw.event_base = base;
> +   event->hw.event_base = base - SNB_UNCORE_PCI_IMC_CTR_BASE;
> event->hw.config = cfg;
> event->hw.idx = idx;
>

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, 2014-03-20 at 16:16 +0800, Yan, Zheng wrote:
> On 03/20/2014 03:53 PM, Zhang, Rui wrote:
> > The resource length is also hardcoded to 0x6000, right?
> > This is probably a problem, because
> > only if the resource length read from PCI config space is larger than 
> > 0x4000,
> > drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
> > resource 0xfed10000 - 0xfed13fff, and the PCI device can request this
> > resource successfully.
> > In order to check this, can you please attach the dmesg output after boot?
> 
> maybe the issue can be fixed by below untested patch
> 
> ---
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c 
> b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> index fd5e883..2b3d834 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -1701,7 +1701,7 @@ static struct uncore_event_desc snb_uncore_imc_events[] 
> = {
>  #define SNB_UNCORE_PCI_IMC_BAR_OFFSET  0x48
>  
>  /* page size multiple covering all config regs */
> -#define SNB_UNCORE_PCI_IMC_MAP_SIZE    0x6000
> +#define SNB_UNCORE_PCI_IMC_MAP_SIZE    0x8
>  
>  #define SNB_UNCORE_PCI_IMC_DATA_READS  0x1
>  #define SNB_UNCORE_PCI_IMC_DATA_READS_BASE   0x5050
> @@ -1736,7 +1736,8 @@ static void snb_uncore_imc_init_box(struct 
> intel_uncore_box *box)
>  
>   addr &= ~(PAGE_SIZE - 1);
>  
> - box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
> + box->io_addr = ioremap(addr + SNB_UNCORE_PCI_IMC_CTR_BASE,
> +SNB_UNCORE_PCI_IMC_MAP_SIZE);

you're remapping 0xfed15050 - 0xfed1b04f instead of 0xfed10000 -
0xfed15fff?
I do not quite understand this, but apparently this is not a fix.
If it works for this problem, it is only because 0xfed15050 - 0xfed1b04f
happens not to conflict with any resource reserved by the PNP system
driver on this machine.
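
The range arithmetic behind this remark can be checked with a small
userspace sketch. The helper names are made up for illustration. The offset
0x5050 is an assumption: SNB_UNCORE_PCI_IMC_CTR_BASE is not defined in the
quoted hunks, and 0x5050 is simply the value that reproduces the range
given above with the old 0x6000 size.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive last address of a mapping of 'size' bytes starting at 'start'. */
uint64_t map_last(uint64_t start, uint64_t size)
{
	assert(size > 0);
	return start + size - 1;
}

/* Nonzero if inclusive ranges [a_start, a_last] and [b_start, b_last] overlap. */
int ranges_intersect(uint64_t a_start, uint64_t a_last,
		     uint64_t b_start, uint64_t b_last)
{
	return a_start <= b_last && b_start <= a_last;
}
```

Under those assumptions, 0xfed10000 + 0x5050 with size 0x6000 ends at
0xfed1b04f, and with the patched size of 0x8 it would end at 0xfed15057.
Neither range intersects the PNP0C02 reservation 0xfed10000 - 0xfed13fff,
which is consistent with the warning going away without the underlying
mismatch being addressed.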

thanks,
rui

>   box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
>  }
>  
> @@ -1832,7 +1833,7 @@ static int snb_uncore_imc_event_init(struct perf_event 
> *event)
>   }
>  
>   /* must be done before validate_group */
> - event->hw.event_base = base;
> + event->hw.event_base = base - SNB_UNCORE_PCI_IMC_CTR_BASE;
>   event->hw.config = cfg;
>   event->hw.idx = idx;
>  



RE: Info: mapping multiple BARs. Your kernel is fine.

On Thu, 2014-03-20 at 07:53 +0000, Zhang, Rui wrote:
> 
> > -Original Message-
> > From: Stephane Eranian [mailto:eran...@google.com]
> > Sent: Thursday, March 20, 2014 11:35 AM
> > To: Zhang, Rui
> > Cc: Lu, Aaron; Rafael J. Wysocki; Borislav Petkov; lkml; x...@kernel.org;
> > Bjorn Helgaas; Linux PCI; ACPI Devel Maling List; Yinghai Lu; H. Peter
> > Anvin; Yan, Zheng Z
> > Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> > Importance: High
> > 
> > On Thu, Mar 20, 2014 at 4:03 AM, Zhang, Rui  wrote:
> > >
> > >
> > >> -Original Message-
> > >> From: Lu, Aaron
> > >> Sent: Thursday, March 20, 2014 10:24 AM
> > >> To: Rafael J. Wysocki; Borislav Petkov
> > >> Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel
> > Maling
> > >> List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
> > >> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> > >> Importance: High
> > >>
> > >> On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
> > >> > [CC list rearranged]
> > >> >
> > >> > On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> > >> >> This started happening this morning after booting -rc4+tip, let's
> > >> add
> > >> >> *everybody* to CC :-)
> > >> >>
> > >> >> We have intel_uncore_init, snb_uncore_imc_init_box,
> > >> >> uncore_pci_probe and other goodies on the stack.
> > >> >
> > >> > I've just gone through this.
> > >> >
> > >> > So the problem is that we have the PNP "system" driver whose only
> > >> > purpose seems to be to reserve system resources so that the PCI
> > >> > layer doesn't assign them to new devices on hotplug (disclaimer: I
> > >> > didn't invent it, I only read the code and comments in there).
> > >>
> > >> And to PCI devices which have uninitialized BARs.
> > >>
> > >> >
> > >> > It does that for ACPI device objects having the "PNP0C02" and
> > >> "PNP0C01" IDs.
> > >> >
> > >> > Apparently, snb_uncore_imc_init_box() steps on a range already
> > >> > reserved by that driver on your box.  And this doesn't seem to be
> > a
> > >> > coincidence, because the ACPI device object in question probably
> > >> > *does* correspond to the memory controller that the uncore driver
> > >> attempts to use.
> > >> >
> > >> > I'm not sure how to address that right now to be honest.  Arguably,
> > >> > the PNP "system" driver should be replaced with something saner,
> > >> > but still the resources it claims need to be kept out of reach of
> > >> > the PCI's resource allocation code.
> > >>
> > >> The quirk_system_pci_resources is meant to disable PNP devices'
> > >> resource if they collide with any known PCI device's BAR. I'm not
> > >> sure why it doesn't work here, perhaps the uncore PCI device doesn't
> > >> have a BAR that falls in the PNP device's resource window?
> > >>
> > > I've talked with Yan Zheng, and I was told that this resource
> > "0xfed10000 - 0xfed15fff"
> > > is got from PCI device register directly, which is not in its BAR
> > range.
> > > Thus IMO, it is impossible for PNP layer to be aware of this resource.
> > >
> > That is not what the perf_event code does. Nothing is hardcoded except
> > the IMC PCI device ids. The BAR offset is hardcoded that's all. The
> > 0xfed10000 is discovered.
> > 
> The resource length is also hardcoded to 0x6000, right?
> This is probably a problem, because
> only if the resource length read from PCI config space is larger than 0x4000,
> drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
> resource 0xfed10000 - 0xfed13fff, and the PCI device can request this
> resource successfully.
> In order to check this, can you please attach the dmesg output after boot?
> 
sorry, one correction here. I should say:
if the resource length read from PCI config space is smaller than
0x4000, the problem still exists because drivers/pnp/quirks.c does not
consider this a conflict.
But if the resource length read from PCI config space is larger than
0x4000, drivers/pnp/quirks.c can detect this conflict and prevent
resource 0xfed10000 - 0xfed13fff from be

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thursday, March 20, 2014 03:03:45 AM Zhang, Rui wrote:
> 
> > -Original Message-
> > From: Lu, Aaron
> > Sent: Thursday, March 20, 2014 10:24 AM
> > To: Rafael J. Wysocki; Borislav Petkov
> > Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel Maling
> > List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
> > Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> > Importance: High
> > 
> > On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
> > > [CC list rearranged]
> > >
> > > On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> > >> This started happening this morning after booting -rc4+tip, let's
> > add
> > >> *everybody* to CC :-)
> > >>
> > >> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe
> > >> and other goodies on the stack.
> > >
> > > I've just gone through this.
> > >
> > > So the problem is that we have the PNP "system" driver whose only
> > > purpose seems to be to reserve system resources so that the PCI layer
> > > doesn't assign them to new devices on hotplug (disclaimer: I didn't
> > > invent it, I only read the code and comments in there).
> > 
> > And to PCI devices which have uninitialized BARs.
> > 
> > >
> > > It does that for ACPI device objects having the "PNP0C02" and
> > "PNP0C01" IDs.
> > >
> > > Apparently, snb_uncore_imc_init_box() steps on a range already
> > > reserved by that driver on your box.  And this doesn't seem to be a
> > > coincidence, because the ACPI device object in question probably
> > > *does* correspond to the memory controller that the uncore driver
> > attempts to use.
> > >
> > > I'm not sure how to address that right now to be honest.  Arguably,
> > > the PNP "system" driver should be replaced with something saner, but
> > > still the resources it claims need to be kept out of reach of the
> > > PCI's resource allocation code.
> > 
> > The quirk_system_pci_resources is meant to disable PNP devices'
> > resource if they collide with any known PCI device's BAR. I'm not sure
> > why it doesn't work here, perhaps the uncore PCI device doesn't have a
> > BAR that falls in the PNP device's resource window?
> >
> I've talked with Yan Zheng, and I was told that this resource
> "0xfed10000 - 0xfed15fff"
> is got from PCI device register directly, which is not in its BAR range.
> Thus IMO, it is impossible for PNP layer to be aware of this resource.
> 
> BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
> resources from being allocated to uninitialized PCI devices, then IMO,
> the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
> resources directly, probably via a platform callback, say,
> 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking 
> pnp_dev->protocol.

Then we can drop drivers/pnp/system.c entirely I think.

> 2. introduce acpi_check_reserved_resource() to parsing PNP0C01/PNP0C02 
> resources.
> 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
>resources to PCI devices.

Well, sounds reasonable.


> > >
> > >> ...
> > >> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB)
> > mapped at [8800cac3-8800cec2]
> > >> [0.489975] resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
> > >> [0.490079] [ cut here ]
> > >> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171
> > __ioremap_caller+0x372/0x380()
> > >> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
> > >> [0.490371] Modules linked in:
> > >> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+
> > #1
> > >> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW
> > (2.06 ) 11/13/2012
> > >> [0.490742]  00ab 880213d01ad8 816112e3
> > 0006
> > >> [0.491032]  880213d01b28 880213d01b18 8104e9bc
> > 880213d01b08
> > >> [0.491343]  c9c58000 fed1 fed1
> > 6000
> > >> [0.491631] Call Trace:
> > >> [0.493337]  [] dump_stack+0x4f/0x7c
> > >> [0.493420]  [] warn_slowpath_common+0x8c/0xc0
> > >> [0.493503]  [] warn_slowpath_fmt+0x46/0x50

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-20 Thread Yan, Zheng

On 03/20/2014 03:53 PM, Zhang, Rui wrote:
> The resource length is also hardcoded to 0x6000, right?
> This is probably a problem, because
> only if the resource length read from PCI config space is larger than 0x4000,
> drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
> resource 0xfed10000 - 0xfed13fff, and the PCI device can request this
> resource successfully.
> In order to check this, can you please attach the dmesg output after boot?

maybe the issue can be fixed by below untested patch

---
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c 
b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index fd5e883..2b3d834 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1701,7 +1701,7 @@ static struct uncore_event_desc snb_uncore_imc_events[] = 
{
 #define SNB_UNCORE_PCI_IMC_BAR_OFFSET  0x48
 
 /* page size multiple covering all config regs */
-#define SNB_UNCORE_PCI_IMC_MAP_SIZE    0x6000
+#define SNB_UNCORE_PCI_IMC_MAP_SIZE    0x8
 
 #define SNB_UNCORE_PCI_IMC_DATA_READS  0x1
 #define SNB_UNCORE_PCI_IMC_DATA_READS_BASE 0x5050
@@ -1736,7 +1736,8 @@ static void snb_uncore_imc_init_box(struct 
intel_uncore_box *box)
 
addr &= ~(PAGE_SIZE - 1);
 
-   box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
+   box->io_addr = ioremap(addr + SNB_UNCORE_PCI_IMC_CTR_BASE,
+  SNB_UNCORE_PCI_IMC_MAP_SIZE);
box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
 }
 
@@ -1832,7 +1833,7 @@ static int snb_uncore_imc_event_init(struct perf_event 
*event)
}
 
/* must be done before validate_group */
-   event->hw.event_base = base;
+   event->hw.event_base = base - SNB_UNCORE_PCI_IMC_CTR_BASE;
event->hw.config = cfg;
event->hw.idx = idx;
 

RE: Info: mapping multiple BARs. Your kernel is fine.

2014-03-20 Thread Zhang, Rui



> -Original Message-
> From: Stephane Eranian [mailto:eran...@google.com]
> Sent: Thursday, March 20, 2014 11:35 AM
> To: Zhang, Rui
> Cc: Lu, Aaron; Rafael J. Wysocki; Borislav Petkov; lkml; x...@kernel.org;
> Bjorn Helgaas; Linux PCI; ACPI Devel Maling List; Yinghai Lu; H. Peter
> Anvin; Yan, Zheng Z
> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> Importance: High
> 
> On Thu, Mar 20, 2014 at 4:03 AM, Zhang, Rui  wrote:
> >
> >
> >> -Original Message-
> >> From: Lu, Aaron
> >> Sent: Thursday, March 20, 2014 10:24 AM
> >> To: Rafael J. Wysocki; Borislav Petkov
> >> Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel
> Maling
> >> List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
> >> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> >> Importance: High
> >>
> >> On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
> >> > [CC list rearranged]
> >> >
> >> > On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> >> >> This started happening this morning after booting -rc4+tip, let's
> >> add
> >> >> *everybody* to CC :-)
> >> >>
> >> >> We have intel_uncore_init, snb_uncore_imc_init_box,
> >> >> uncore_pci_probe and other goodies on the stack.
> >> >
> >> > I've just gone through this.
> >> >
> >> > So the problem is that we have the PNP "system" driver whose only
> >> > purpose seems to be to reserve system resources so that the PCI
> >> > layer doesn't assign them to new devices on hotplug (disclaimer: I
> >> > didn't invent it, I only read the code and comments in there).
> >>
> >> And to PCI devices which have uninitialized BARs.
> >>
> >> >
> >> > It does that for ACPI device objects having the "PNP0C02" and
> >> "PNP0C01" IDs.
> >> >
> >> > Apparently, snb_uncore_imc_init_box() steps on a range already
> >> > reserved by that driver on your box.  And this doesn't seem to be
> a
> >> > coincidence, because the ACPI device object in question probably
> >> > *does* correspond to the memory controller that the uncore driver
> >> attempts to use.
> >> >
> >> > I'm not sure how to address that right now to be honest.  Arguably,
> >> > the PNP "system" driver should be replaced with something saner,
> >> > but still the resources it claims need to be kept out of reach of
> >> > the PCI's resource allocation code.
> >>
> >> The quirk_system_pci_resources is meant to disable PNP devices'
> >> resource if they collide with any known PCI device's BAR. I'm not
> >> sure why it doesn't work here, perhaps the uncore PCI device doesn't
> >> have a BAR that falls in the PNP device's resource window?
> >>
> > I've talked with Yan Zheng, and I was told that this resource
> "0xfed10000 - 0xfed15fff"
> > is got from PCI device register directly, which is not in its BAR
> range.
> > Thus IMO, it is impossible for PNP layer to be aware of this resource.
> >
> That is not what the perf_event code does. Nothing is hardcoded except
> the IMC PCI device ids. The BAR offset is hardcoded that's all. The
> 0xfed10000 is discovered.
> 
The resource length is also hardcoded to 0x6000, right?
This is probably a problem, because
only if the resource length read from PCI config space is larger than 0x4000,
drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
resource 0xfed10000 - 0xfed13fff, and the PCI device can request this
resource successfully.
In order to check this, can you please attach the dmesg output after boot?
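
The size dependence described above can be modelled with a short userspace
sketch: the PNP resource is left alone when it does not overlap the PCI
region or when it fully covers it, and it is disabled only on a partial
overlap. This is a simplified model of the behaviour as described in this
thread, not the code in drivers/pnp/quirks.c, and the enum and function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum quirk_action {
	QUIRK_KEEP_PNP,    /* leave the PNP resource reserved */
	QUIRK_DISABLE_PNP  /* treat as a conflict and disable the PNP resource */
};

/* Decide what to do with one PNP region given one known PCI region. */
enum quirk_action pnp_quirk_decision(uint64_t pci_start, uint64_t pci_len,
				     uint64_t pnp_start, uint64_t pnp_len)
{
	assert(pci_len > 0 && pnp_len > 0);

	uint64_t pci_end = pci_start + pci_len - 1;
	uint64_t pnp_end = pnp_start + pnp_len - 1;

	/* No overlap at all: nothing to resolve. */
	if (pnp_end < pci_start || pnp_start > pci_end)
		return QUIRK_KEEP_PNP;

	/* PNP region covers the whole PCI region: not seen as a conflict. */
	if (pnp_start <= pci_start && pnp_end >= pci_end)
		return QUIRK_KEEP_PNP;

	/* Partial overlap: the PCI region extends past the PNP region. */
	return QUIRK_DISABLE_PNP;
}
```

With a 0x4000 PNP region at 0xfed10000, a PCI region of 0x6000 at the same
base yields QUIRK_DISABLE_PNP, while a PCI region of 0x4000 or less yields
QUIRK_KEEP_PNP. Note that this only helps for regions the PCI layer knows
about; a range taken from a device-specific config register is never fed
into such a check.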

Thanks,
rui

> > BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent
> > those resources from being allocated to uninitialized PCI devices,
> > then IMO, the best way to do this is make PCI bus handle those
> > PNP0C01/PNP0C02 resources directly, probably via a platform callback,
> > say, 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking
> pnp_dev->protocol.
> > 2. introduce acpi_check_reserved_resource() to parsing
> PNP0C01/PNP0C02 resources.
> > 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
> >resources to PCI devices.
> >
> > Thanks,
> > rui
> >
> >> Thanks,
> >> Aaron
> >>
> >> >
> >> >> ...
> >> >> [0.488998] software IO TLB [mem 0xc

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-20 Thread Yan, Zheng

On 03/20/2014 03:53 PM, Zhang, Rui wrote:
 The resource length is also hardcoded to 0x6000, right?
 This is probably a problem, because
 only if the resource length read from PCI config space is larger than 0x4000,
 drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
 resource 0xfed1 - 0xfed13fff, and the PCI device can request this
 resource successfully.
 In order to check this, can you please attach the dmesg output after boot?

maybe the issue can be fixed by below untested patch

---
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index fd5e883..2b3d834 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1701,7 +1701,7 @@ static struct uncore_event_desc snb_uncore_imc_events[] = {
 #define SNB_UNCORE_PCI_IMC_BAR_OFFSET  0x48
 
 /* page size multiple covering all config regs */
-#define SNB_UNCORE_PCI_IMC_MAP_SIZE  0x6000
+#define SNB_UNCORE_PCI_IMC_MAP_SIZE  0x8
 
 #define SNB_UNCORE_PCI_IMC_DATA_READS  0x1
 #define SNB_UNCORE_PCI_IMC_DATA_READS_BASE 0x5050
@@ -1736,7 +1736,8 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
 
         addr &= ~(PAGE_SIZE - 1);
 
-        box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
+        box->io_addr = ioremap(addr + SNB_UNCORE_PCI_IMC_CTR_BASE,
+                               SNB_UNCORE_PCI_IMC_MAP_SIZE);
         box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
 }
 
@@ -1832,7 +1833,7 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
         }
 
         /* must be done before validate_group */
-        event->hw.event_base = base;
+        event->hw.event_base = base - SNB_UNCORE_PCI_IMC_CTR_BASE;
         event->hw.config = cfg;
         event->hw.idx = idx;
 

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thursday, March 20, 2014 03:03:45 AM Zhang, Rui wrote:

  -Original Message-
  From: Lu, Aaron
  Sent: Thursday, March 20, 2014 10:24 AM
  To: Rafael J. Wysocki; Borislav Petkov
  Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel Maling
  List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
  Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
  Importance: High

  On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
   [CC list rearranged]

   On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
   This started happening this morning after booting -rc4+tip, let's
  add
   *everybody* to CC :-)

   We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe
   and other goodies on the stack.

   I've just gone through this.

   So the problem is that we have the PNP system driver whose only
   purpose seems to be to reserve system resources so that the PCI layer
   doesn't assign them to new devices on hotplug (disclaimer: I didn't
   invent it, I only read the code and comments in there).

  And to PCI devices which have uninitialized BARs.

   It does that for ACPI device objects having the PNP0C02 and
  PNP0C01 IDs.

   Apparently, snb_uncore_imc_init_box() steps on a range already
   reserved by that driver on your box.  And this doesn't seem to be a
   coincidence, because the ACPI device object in question probably
   *does* correspond to the memory controller that the uncore driver
  attempts to use.

   I'm not sure how to address that right now to be honest.  Arguably,
   the PNP system driver should be replaced with something saner, but
   still the resources it claims need to be kept out of reach of the
   PCI's resource allocation code.

  The quirk_system_pci_resources is meant to disable PNP devices'
  resource if they collide with any known PCI device's BAR. I'm not sure
  why it doesn't work here, perhaps the uncore PCI device doesn't have a
  BAR that falls in the PNP device's resource window?

 I've talked with Yan Zheng, and I was told that this resource 0xfed1 - 
 0xfed15fff
 is got from PCI device register directly, which is not in its BAR range.
 Thus IMO, it is impossible for PNP layer to be aware of this resource.

 BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
 resources from being allocated to uninitialized PCI devices, then IMO,
 the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
 resources directly, probably via a platform callback, say,
 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking pnp_dev->protocol.

Then we can drop drivers/pnp/system.c entirely I think.

 2. introduce acpi_check_reserved_resource() to parsing PNP0C01/PNP0C02 
 resources.
 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
resources to PCI devices.

Well, sounds reasonable.

   ...
   [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
   [0.489975] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
   [0.490079] [ cut here ]
   [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
   [0.490306] Info: mapping multiple BARs. Your kernel is fine.
   [0.490371] Modules linked in:
   [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #1
   [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
   [0.490742]  00ab 880213d01ad8 816112e3 0006
   [0.491032]  880213d01b28 880213d01b18 8104e9bc 880213d01b08
   [0.491343]  c9c58000 fed1 fed1 6000
   [0.491631] Call Trace:
   [0.493337]  [816112e3] dump_stack+0x4f/0x7c
   [0.493420]  [8104e9bc] warn_slowpath_common+0x8c/0xc0
   [0.493503]  [8104eaa6] warn_slowpath_fmt+0x46/0x50
   [0.493588]  [8103f1e2] __ioremap_caller+0x372/0x380
   [0.493674]  [810211a2] ? snb_uncore_imc_init_box+0x62/0x90
   [0.493761]  [8103f247] ioremap_nocache+0x17/0x20
   [0.493846]  [810211a2] snb_uncore_imc_init_box+0x62/0x90
   [0.493933]  [81022925] uncore_pci_probe+0xe5/0x1e0
   [0.494020]  [812d487e] local_pci_probe+0x4e/0xa0
   [0.494104]  [81418a59] ? get_device+0x19/0x20
   [0.494213]  [812d5cd1] pci_device_probe+0xe1/0x130
   [0.494300]  [8141d3cb] driver_probe_device+0x7b/0x240
   [0.494385]  [8141d63b] __driver_attach+0xab/0xb0
   [0.494469]  [8141d590] ? driver_probe_device+0x240/0x240
   [0.494551]  [8141b71e] bus_for_each_dev+0x5e/0x90
   [0.494634]  [8141cede] driver_attach+0x1e/0x20
   [0.494718]  [8141ca57] bus_add_driver+0x117/0x230

RE: Info: mapping multiple BARs. Your kernel is fine.

On Thu, 2014-03-20 at 07:53 +, Zhang, Rui wrote:

  -Original Message-
  From: Stephane Eranian [mailto:eran...@google.com]
  Sent: Thursday, March 20, 2014 11:35 AM
  To: Zhang, Rui
  Cc: Lu, Aaron; Rafael J. Wysocki; Borislav Petkov; lkml; x...@kernel.org;
  Bjorn Helgaas; Linux PCI; ACPI Devel Maling List; Yinghai Lu; H. Peter
  Anvin; Yan, Zheng Z
  Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
  Importance: High

  On Thu, Mar 20, 2014 at 4:03 AM, Zhang, Rui rui.zh...@intel.com wrote:

   -Original Message-
   From: Lu, Aaron
   Sent: Thursday, March 20, 2014 10:24 AM
   To: Rafael J. Wysocki; Borislav Petkov
   Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel
  Maling
   List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
   Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
   Importance: High

   On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
[CC list rearranged]

On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
This started happening this morning after booting -rc4+tip, let's
   add
*everybody* to CC :-)

We have intel_uncore_init, snb_uncore_imc_init_box,
uncore_pci_probe and other goodies on the stack.

I've just gone through this.

So the problem is that we have the PNP system driver whose only
purpose seems to be to reserve system resources so that the PCI
layer doesn't assign them to new devices on hotplug (disclaimer: I
didn't invent it, I only read the code and comments in there).

   And to PCI devices which have uninitialized BARs.

It does that for ACPI device objects having the PNP0C02 and
   PNP0C01 IDs.

Apparently, snb_uncore_imc_init_box() steps on a range already
reserved by that driver on your box.  And this doesn't seem to be
  a
coincidence, because the ACPI device object in question probably
*does* correspond to the memory controller that the uncore driver
   attempts to use.

I'm not sure how to address that right now to be honest.  Arguably,
the PNP system driver should be replaced with something saner,
but still the resources it claims need to be kept out of reach of
the PCI's resource allocation code.

   The quirk_system_pci_resources is meant to disable PNP devices'
   resource if they collide with any known PCI device's BAR. I'm not
   sure why it doesn't work here, perhaps the uncore PCI device doesn't
   have a BAR that falls in the PNP device's resource window?

   I've talked with Yan Zheng, and I was told that this resource
  0xfed1 - 0xfed15fff
   is got from PCI device register directly, which is not in its BAR
  range.
   Thus IMO, it is impossible for PNP layer to be aware of this resource.

  That is not what the perf_event code does. Nothing is hardcoded except
  the IMC PCI device ids. The BAR offset is hardcoded that's all. The
  0xfed1 is discovered.

 The resource length is also hardcoded to 0x6000, right?
 This is probably a problem, because
 only if the resource length read from PCI config space is larger than 0x4000,
 drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
 resource 0xfed1 - 0xfed13fff, and the PCI device can request this
 resource successfully.
 In order to check this, can you please attach the dmesg output after boot?

Sorry, one correction here. I should have said:
if the resource length read from PCI config space is smaller than
0x4000, the problem still exists, because drivers/pnp/quirks.c does not
treat this as a conflict.
But if the resource length read from PCI config space is larger than
0x4000, drivers/pnp/quirks.c can detect the conflict and prevent
resource 0xfed1 - 0xfed13fff from being reserved.
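
The two cases above can be sketched as a three-way classification of a PCI range against a PNP reservation. This is only an illustration of the case distinction described in this mail, not code from drivers/pnp/quirks.c; the type and function names are hypothetical, and any concrete base address passed in is the caller's assumption.

```c
#include <stdint.h>

/* Inclusive address range [start, end]. */
struct range {
    uint64_t start;
    uint64_t end;
};

enum overlap_kind {
    OVERLAP_NONE,      /* the two ranges do not touch */
    OVERLAP_ENCLOSED,  /* the PNP reservation fully covers the PCI range */
    OVERLAP_PARTIAL    /* the PCI range sticks out of the PNP reservation */
};

/* Build a range from a base and a non-zero length. */
static struct range make_range(uint64_t base, uint64_t len)
{
    struct range r = { base, base + len - 1 };
    return r;
}

/* Classify how a PCI range relates to a PNP reservation. */
static enum overlap_kind classify(struct range pnp, struct range pci)
{
    if (pnp.end < pci.start || pnp.start > pci.end)
        return OVERLAP_NONE;
    if (pnp.start <= pci.start && pnp.end >= pci.end)
        return OVERLAP_ENCLOSED;   /* "smaller than 0x4000": not seen as a conflict */
    return OVERLAP_PARTIAL;        /* "larger than 0x4000": conflict is detectable */
}
```

With a 0x4000-byte reservation, a 0x6000-byte PCI range at the same base classifies as OVERLAP_PARTIAL, while a 0x2000-byte one classifies as OVERLAP_ENCLOSED.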

thanks,
rui

 Thanks,
 rui

   BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent
   those resources from being allocated to uninitialized PCI devices,
   then IMO, the best way to do this is make PCI bus handle those
   PNP0C01/PNP0C02 resources directly, probably via a platform callback,
   say, 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking
  pnp_dev->protocol.
   2. introduce acpi_check_reserved_resource() to parsing
  PNP0C01/PNP0C02 resources.
   3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
  resources to PCI devices.

   Thanks,
   rui

   Thanks,
   Aaron

...
[0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB)
   mapped at [8800cac3-8800cec2]
[0.489975] resource map sanity check conflict: 0xfed1
   0xfed15fff 0xfed1 0xfed13fff pnp 00:01
[0.490079] [ cut here ]
[0.490204] WARNING: CPU: 2 PID: 1 at
  arch/x86/mm/ioremap.c:171
   __ioremap_caller+0x372/0x380()
[0.490306] Info: mapping multiple BARs. Your kernel is fine.
[0.490371] Modules linked in:
[0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, 2014-03-20 at 16:16 +0800, Yan, Zheng wrote:
> On 03/20/2014 03:53 PM, Zhang, Rui wrote:
> > The resource length is also hardcoded to 0x6000, right?
> > This is probably a problem, because
> > only if the resource length read from PCI config space is larger than 0x4000,
> > drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
> > resource 0xfed1 - 0xfed13fff, and the PCI device can request this
> > resource successfully.
> > In order to check this, can you please attach the dmesg output after boot?
>
> maybe the issue can be fixed by below untested patch
>
> ---
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> index fd5e883..2b3d834 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -1701,7 +1701,7 @@ static struct uncore_event_desc snb_uncore_imc_events[] = {
>  #define SNB_UNCORE_PCI_IMC_BAR_OFFSET  0x48
>
>  /* page size multiple covering all config regs */
> -#define SNB_UNCORE_PCI_IMC_MAP_SIZE  0x6000
> +#define SNB_UNCORE_PCI_IMC_MAP_SIZE  0x8
>
>  #define SNB_UNCORE_PCI_IMC_DATA_READS  0x1
>  #define SNB_UNCORE_PCI_IMC_DATA_READS_BASE 0x5050
> @@ -1736,7 +1736,8 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
>
>          addr &= ~(PAGE_SIZE - 1);
>
> -        box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
> +        box->io_addr = ioremap(addr + SNB_UNCORE_PCI_IMC_CTR_BASE,
> +                               SNB_UNCORE_PCI_IMC_MAP_SIZE);

You're remapping 0xfed15050 - 0xfed1b04f instead of 0xfed1 -
0xfed15fff?
I do not quite understand this, but apparently this is not a fix.
If it works for this problem, that is only because 0xfed15050 - 0xfed1b04f
happens not to conflict with any resource reserved by the PNP system
driver on this machine.

thanks,
rui

>          box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
>  }
>
> @@ -1832,7 +1833,7 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
>          }
>
>          /* must be done before validate_group */
> -        event->hw.event_base = base;
> +        event->hw.event_base = base - SNB_UNCORE_PCI_IMC_CTR_BASE;
>          event->hw.config = cfg;
>          event->hw.idx = idx;
>



Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-20 Thread Stephane Eranian

On Thu, Mar 20, 2014 at 9:16 AM, Yan, Zheng zheng.z@intel.com wrote:
> On 03/20/2014 03:53 PM, Zhang, Rui wrote:
> > The resource length is also hardcoded to 0x6000, right?
> > This is probably a problem, because
> > only if the resource length read from PCI config space is larger than 0x4000,
> > drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
> > resource 0xfed1 - 0xfed13fff, and the PCI device can request this
> > resource successfully.
> > In order to check this, can you please attach the dmesg output after boot?
>
> maybe the issue can be fixed by below untested patch
>
> ---
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> index fd5e883..2b3d834 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -1701,7 +1701,7 @@ static struct uncore_event_desc snb_uncore_imc_events[] = {
>  #define SNB_UNCORE_PCI_IMC_BAR_OFFSET  0x48
>
>  /* page size multiple covering all config regs */
> -#define SNB_UNCORE_PCI_IMC_MAP_SIZE  0x6000
> +#define SNB_UNCORE_PCI_IMC_MAP_SIZE  0x8

I assume ioremap() works on page boundaries.
Eventually we want to expose the other counters too, not just reads and
writes (8 bytes total).

The size of 0x6000 comes from the counter offsets: BAR + 0x5040 to BAR + 0x5054.
Maybe a better way of doing this would be to remap just the one page
holding them instead of the 6 covering the entire BAR + counters. That
would need changes in read_counter(), but that is okay.

So that would be something along the lines of:

addr = (addr + 0x5040) & ~(PAGE_SIZE - 1);
ioremap(addr, 0x1000);
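
As a rough check of the page arithmetic in this suggestion, here is a small user-space sketch. It assumes 4 KiB pages and 4-byte counter registers; the helper names and PAGE_SIZE_4K are hypothetical and nothing here is taken from the uncore driver.

```c
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000ULL  /* assumption: 4 KiB pages */

/* Round an address down to the start of the page that contains it. */
static uint64_t page_base(uint64_t addr)
{
    return addr & ~(PAGE_SIZE_4K - 1);
}

/* Offset of an address within its page. */
static uint64_t page_offset(uint64_t addr)
{
    return addr & (PAGE_SIZE_4K - 1);
}

/* Round a length up to a whole number of pages. */
static uint64_t page_round_up(uint64_t len)
{
    return (len + PAGE_SIZE_4K - 1) & ~(PAGE_SIZE_4K - 1);
}
```

Under those assumptions, offsets 0x5040 and 0x5054 both land in the page starting at BAR + 0x5000, so a single 0x1000-byte mapping covers them, whereas mapping from the BAR base up to the last counter rounds up to 0x6000 bytes (six pages). The counter reads would then have to use the in-page offsets (0x40, 0x54) rather than the offsets from the BAR base.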


>  #define SNB_UNCORE_PCI_IMC_DATA_READS  0x1
>  #define SNB_UNCORE_PCI_IMC_DATA_READS_BASE 0x5050
> @@ -1736,7 +1736,8 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
>
>          addr &= ~(PAGE_SIZE - 1);
>
> -        box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
> +        box->io_addr = ioremap(addr + SNB_UNCORE_PCI_IMC_CTR_BASE,
> +                               SNB_UNCORE_PCI_IMC_MAP_SIZE);
>          box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
>  }
>
> @@ -1832,7 +1833,7 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
>          }
>
>          /* must be done before validate_group */
> -        event->hw.event_base = base;
> +        event->hw.event_base = base - SNB_UNCORE_PCI_IMC_CTR_BASE;
>          event->hw.config = cfg;
>          event->hw.idx = idx;


Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Mar 19, 2014 at 9:03 PM, Zhang, Rui rui.zh...@intel.com wrote:

> I've talked with Yan Zheng, and I was told that this resource 0xfed1 -
> 0xfed15fff
> is got from PCI device register directly, which is not in its BAR range.
> Thus IMO, it is impossible for PNP layer to be aware of this resource.

Slow down, this isn't quite correct. The *base* address (0xfed1)
is from a PCI config register (MCHBAR, at PCI config offset 0x48) [1].
This is a device-dependent register, so the PCI core knows neither
the base nor the size.

The device consumes address space that is not reported via the
architected PCI mechanism, so the only way to report that space is via
the PNP0C02 ACPI device. The BIOS has to determine the base and size
based on its knowledge of the hardware. On this hardware, per the
spec in [1], the region described by MCHBAR is 32KB in size.

The 0x6000 (24KB) size of the region above comes from the driver and
is actually less than what the device consumes. It is legal for a
driver to request only the area it requires, but the entire area
consumed by the device should be reported via the PNP0C02 device. The
fact that PNP0C02 reports 16KB but the device actually consumes 32KB
is a BIOS defect. This probably happened because previous versions of
this chip consumed only 16KB, and the BIOS didn't get updated for the
change.
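
To make the three sizes in this paragraph concrete, the following sketch computes how much of a request falls outside a reservation that starts at the same base. The helper name and the KIB constant are hypothetical; the sizes are the ones quoted above (16KB reported, 24KB requested by the driver, 32KB per the datasheet).

```c
#include <stdint.h>

#define KIB 1024ULL

/*
 * Bytes of a request [0, request) that lie beyond a reservation
 * [0, reserved) starting at the same base address.
 */
static uint64_t uncovered_bytes(uint64_t reserved, uint64_t request)
{
    return request > reserved ? request - reserved : 0;
}
```

With the 16KB reservation, the driver's 0x6000-byte request leaves 0x2000 bytes uncovered, which is the part that spills past the end of the PNP0C02 resource. Had the reservation been the full 32KB, the same request would have been covered entirely.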

> BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
> resources from being allocated to uninitialized PCI devices, then IMO,
> the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
> resources directly, probably via a platform callback, say,
> 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking
> pnp_dev->protocol.
> 2. introduce acpi_check_reserved_resource() to parsing PNP0C01/PNP0C02
> resources.
> 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
>    resources to PCI devices.

The purpose of system.c is indeed to prevent resources from being
allocated to other devices. This is really a question for Rafael, but
in my opinion this function (reserving resources of PNP/ACPI devices
to prevent their allocation to other devices) should be done for *all*
PNP and ACPI devices, not just the PNP0C01/PNP0C02 devices handled by
system.c.

So I think the best solution would be to move that into the ACPI core
somehow so it happens for all devices. If we had that, we could get
rid of system.c altogether, and we wouldn't have to do anything
special in PCI. This is much easier to say than to do, however,
because there are all kinds of issues with legacy resource
reservations, and we currently can't really deal with overlapping
resources.

Bjorn

[1]
https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet,
sec. 3.1.2 on p. 61

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thursday, March 20, 2014 10:45:52 AM Bjorn Helgaas wrote:
> On Wed, Mar 19, 2014 at 9:03 PM, Zhang, Rui rui.zh...@intel.com wrote:
>
> > I've talked with Yan Zheng, and I was told that this resource 0xfed1 -
> > 0xfed15fff
> > is got from PCI device register directly, which is not in its BAR range.
> > Thus IMO, it is impossible for PNP layer to be aware of this resource.
>
> Slow down, this isn't quite correct.  The *base* address (0xfed1)
> is from a PCI config register (MCHBAR, at PCI config offset 0x48) [1].
> This is a device-dependent register, so the PCI core knows neither
> the base nor the size.
>
> The device consumes address space that is not reported via the
> architected PCI mechanism, so the only way to report that space is via
> the PNP0C02 ACPI device.  The BIOS has to determine the base and size
> based on its knowledge of the hardware.  On this hardware, per the
> spec in [1], the region described by MCHBAR is 32KB in size.
>
> The 0x6000 (24KB) size of the region above comes from the driver and
> is actually less than what the device consumes.  It is legal for a
> driver to request only the area it requires, but the entire area
> consumed by the device should be reported via the PNP0C02 device.  The
> fact that PNP0C02 reports 16KB but the device actually consumes 32KB
> is a BIOS defect.  This probably happened because previous versions of
> this chip consumed only 16KB, and the BIOS didn't get updated for the
> change.
>
> > BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
> > resources from being allocated to uninitialized PCI devices, then IMO,
> > the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
> > resources directly, probably via a platform callback, say,
> > 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking
> > pnp_dev->protocol.
> > 2. introduce acpi_check_reserved_resource() to parsing PNP0C01/PNP0C02
> > resources.
> > 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
> >    resources to PCI devices.
>
> The purpose of system.c is indeed to prevent resources from being
> allocated to other devices.  This is really a question for Rafael, but
> in my opinion this function (reserving resources of PNP/ACPI devices
> to prevent their allocation to other devices) should be done for *all*
> PNP and ACPI devices, not just the PNP0C01/PNP0C02 devices handled by
> system.c.
>
> So I think the best solution would be to move that into the ACPI core
> somehow so it happens for all devices.

Well, I think you got to the bottom of this, but that's something we can
do long-term.  Still, we need to find a short-term solution for the
particular issue at hand.

> If we had that, we could get
> rid of system.c altogether, and we wouldn't have to do anything
> special in PCI.  This is much easier to say than to do, however,
> because there are all kinds of issues with legacy resource
> reservations, and we currently can't really deal with overlapping
> resources.

Indeed.

All above said, appended is the relevant piece of the DSDT from the machine
in question (and that is in the PCI host bridge scope).

So we have a PCI device with an ACPI object called LPC which has a child
called SIO and the _HID of that child is PNP0C02.

I'm not sure if the way system.c handles this is correct in this particular
case to be honest.


Device (LPC)
{
    Name (_ADR, 0x001F)
    Name (_S3D, 0x03)
    Name (RID, 0x00)
    Device (SIO)
    {
        Name (_HID, EisaId ("PNP0C02"))
        Name (_UID, 0x00)
        Name (SCRS, ResourceTemplate ()
        {
            IO (Decode16,
                0x0010, // Range Minimum
                0x0010, // Range Maximum
                0x01,   // Alignment
                0x10,   // Length
                )
            IO (Decode16,
                0x0090, // Range Minimum
                0x0090, // Range Maximum
                0x01,   // Alignment
                0x10,   // Length
                )
            IO (Decode16,
                0x0024, // Range Minimum
                0x0024, // Range Maximum
                0x01,   // Alignment
                0x02,   // Length
                )
            IO (Decode16,
                0x0028, // Range Minimum
                0x0028, // Range Maximum
                0x01,   // Alignment
                0x02,   // Length
                )
            IO (Decode16,

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Mar 20, 2014 at 2:55 PM, Rafael J. Wysocki r...@rjwysocki.net wrote:
> On Thursday, March 20, 2014 10:45:52 AM Bjorn Helgaas wrote:
> > The purpose of system.c is indeed to prevent resources from being
> > allocated to other devices.  This is really a question for Rafael, but
> > in my opinion this function (reserving resources of PNP/ACPI devices
> > to prevent their allocation to other devices) should be done for *all*
> > PNP and ACPI devices, not just the PNP0C01/PNP0C02 devices handled by
> > system.c.
> >
> > So I think the best solution would be to move that into the ACPI core
> > somehow so it happens for all devices.
>
> Well, I think you got to the bottom of this, but that's something we can
> do long-term.  Still, we need to find a short-term solution for the
> particular issue at hand.

Right.  Even if we had this long-term solution, we'd still have
Stephane's current problem, because the PNP0C02 _CRS is still wrong.

We do have a drivers/pnp/quirks.c where we could conceivably adjust
the PNP resource if we found the matching PCI device and MCHBAR.  That
should solve Stephane's problem even with the current
drivers/pnp/system.c.
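
One possible reading of "adjust the PNP resource" is to grow a too-short reservation so that it covers the whole window behind MCHBAR. The sketch below models only that arithmetic in user space; the struct, the function name, and the idea of passing the window size in as a parameter are all assumptions for illustration, not an existing interface in drivers/pnp/quirks.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive memory range, standing in for a PNP memory resource. */
struct mem_res {
    uint64_t start;
    uint64_t end;
};

/*
 * If the reservation starts at the MCHBAR base but is shorter than the
 * window the device decodes, grow it to cover the whole window.
 * Returns true when the reservation was changed.
 */
static bool fixup_mch_reservation(struct mem_res *res,
                                  uint64_t mch_base, uint64_t mch_size)
{
    uint64_t mch_end = mch_base + mch_size - 1;

    if (res->start != mch_base || res->end >= mch_end)
        return false;   /* unrelated resource, or already large enough */

    res->end = mch_end;
    return true;
}
```

A real quirk would also have to find the matching PCI device and read MCHBAR from its config space before calling anything like this; that part is left out here.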

Bjorn

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Mar 20, 2014 at 4:03 AM, Zhang, Rui  wrote:
>
>
>> -Original Message-
>> From: Lu, Aaron
>> Sent: Thursday, March 20, 2014 10:24 AM
>> To: Rafael J. Wysocki; Borislav Petkov
>> Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel Maling
>> List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
>> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
>> Importance: High
>>
>> On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
>> > [CC list rearranged]
>> >
>> > On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
>> >> This started happening this morning after booting -rc4+tip, let's
>> add
>> >> *everybody* to CC :-)
>> >>
>> >> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe
>> >> and other goodies on the stack.
>> >
>> > I've just gone through this.
>> >
>> > So the problem is that we have the PNP "system" driver whose only
>> > purpose seems to be to reserve system resources so that the PCI layer
>> > doesn't assign them to new devices on hotplug (disclaimer: I didn't
>> > invent it, I only read the code and comments in there).
>>
>> And to PCI devices which have uninitialized BARs.
>>
>> >
>> > It does that for ACPI device objects having the "PNP0C02" and
>> "PNP0C01" IDs.
>> >
>> > Apparently, snb_uncore_imc_init_box() steps on a range already
>> > reserved by that driver on your box.  And this doesn't seem to be a
>> > coincidence, because the ACPI device object in question probably
>> > *does* correspond to the memory controller that the uncore driver
>> attempts to use.
>> >
>> > I'm not sure how to address that right now to be honest.  Arguably,
>> > the PNP "system" driver should be replaced with something saner, but
>> > still the resources it claims need to be kept out of reach of the
>> > PCI's resource allocation code.
>>
>> The quirk_system_pci_resources is meant to disable PNP devices'
>> resource if they collide with any known PCI device's BAR. I'm not sure
>> why it doesn't work here, perhaps the uncore PCI device doesn't have a
>> BAR that falls in the PNP device's resource window?
>>
> I've talked with Yan Zheng, and I was told that this resource "0xfed1 - 
> 0xfed15fff"
> is got from PCI device register directly, which is not in its BAR range.
> Thus IMO, it is impossible for PNP layer to be aware of this resource.
>
That is not what the perf_event code does. Nothing is hardcoded except
the IMC PCI device ids. The BAR offset is hardcoded that's all. The 0xfed1
is discovered.

> BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
> resources from being allocated to uninitialized PCI devices, then IMO,
> the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
> resources directly, probably via a platform callback, say,
> 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking 
> pnp_dev->protocol.
> 2. introduce acpi_check_reserved_resource() to parsing PNP0C01/PNP0C02 
> resources.
> 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
>resources to PCI devices.
>
> Thanks,
> rui
>
>> Thanks,
>> Aaron
>>
>> >
>> >> ...
>> >> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB)
>> mapped at [8800cac3-8800cec2]
>> >> [0.489975] resource map sanity check conflict: 0xfed1
>> 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
>> >> [0.490079] [ cut here ]
>> >> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171
>> __ioremap_caller+0x372/0x380()
>> >> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
>> >> [0.490371] Modules linked in:
>> >> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+
>> #1
>> >> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW
>> (2.06 ) 11/13/2012
>> >> [0.490742]  00ab 880213d01ad8 816112e3
>> 0006
>> >> [0.491032]  880213d01b28 880213d01b18 8104e9bc
>> 880213d01b08
>> >> [0.491343]  c9c58000 fed1 fed1
>> 6000
>> >> [0.491631] Call Trace:
>> >> [0.493337]  [] dump_stack+0x4f/0x7c
>> >> [0.493420]  [] warn_slowpath_common+0x8c/0x

RE: Info: mapping multiple BARs. Your kernel is fine.

2014-03-19 Thread Zhang, Rui



> -Original Message-
> From: Lu, Aaron
> Sent: Thursday, March 20, 2014 10:24 AM
> To: Rafael J. Wysocki; Borislav Petkov
> Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel Maling
> List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> Importance: High
> 
> On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
> > [CC list rearranged]
> >
> > On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> >> This started happening this morning after booting -rc4+tip, let's
> add
> >> *everybody* to CC :-)
> >>
> >> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe
> >> and other goodies on the stack.
> >
> > I've just gone through this.
> >
> > So the problem is that we have the PNP "system" driver whose only
> > purpose seems to be to reserve system resources so that the PCI layer
> > doesn't assign them to new devices on hotplug (disclaimer: I didn't
> > invent it, I only read the code and comments in there).
> 
> And to PCI devices which have uninitialized BARs.
> 
> >
> > It does that for ACPI device objects having the "PNP0C02" and
> "PNP0C01" IDs.
> >
> > Apparently, snb_uncore_imc_init_box() steps on a range already
> > reserved by that driver on your box.  And this doesn't seem to be a
> > coincidence, because the ACPI device object in question probably
> > *does* correspond to the memory controller that the uncore driver
> attempts to use.
> >
> > I'm not sure how to address that right now to be honest.  Arguably,
> > the PNP "system" driver should be replaced with something saner, but
> > still the resources it claims need to be kept out of reach of the
> > PCI's resource allocation code.
> 
> The quirk_system_pci_resources is meant to disable PNP devices'
> resource if they collide with any known PCI device's BAR. I'm not sure
> why it doesn't work here, perhaps the uncore PCI device doesn't have a
> BAR that falls in the PNP device's resource window?
>
I've talked with Yan Zheng, and I was told that this resource "0xfed1 - 
0xfed15fff"
is read directly from a PCI device register and is not in its BAR range.
Thus, IMO, it is impossible for the PNP layer to be aware of this resource.

BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
resources from being allocated to uninitialized PCI devices, then IMO,
the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
resources directly, probably via a platform callback, say,
1. make drivers/pnp/system.c a no-op for PNPACPI, by checking pnp_dev->protocol.
2. introduce acpi_check_reserved_resource() to parse PNP0C01/PNP0C02 
resources.
3. in the PCI bus code, invoke acpi_check_reserved_resource() when assigning
   resources to PCI devices.
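
As a rough illustration of what the check in steps 2 and 3 would have to decide, here is a user-space model: a table of reserved ranges and a test whether a candidate assignment overlaps any of them. acpi_check_reserved_resource() is only a proposed name and does not exist; the struct and function below are hypothetical stand-ins and say nothing about how the ACPI or PCI code would actually be wired up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range taken from a PNP0C01/PNP0C02 resource list. */
struct reserved {
    uint64_t start;
    uint64_t end;
};

/* Return true if [start, end] overlaps any reserved range in the table. */
static bool overlaps_reserved(const struct reserved *tbl, size_t n,
                              uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < n; i++) {
        if (start <= tbl[i].end && end >= tbl[i].start)
            return true;
    }
    return false;
}
```

The PCI side would then skip any candidate window for which such a check returns true when assigning resources to a device.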

Thanks,
rui
 
> Thanks,
> Aaron
> 
> >
> >> ...
> >> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB)
> mapped at [8800cac3-8800cec2]
> >> [0.489975] resource map sanity check conflict: 0xfed1
> 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
> >> [0.490079] [ cut here ]
> >> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171
> __ioremap_caller+0x372/0x380()
> >> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
> >> [0.490371] Modules linked in:
> >> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+
> #1
> >> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW
> (2.06 ) 11/13/2012
> >> [0.490742]  00ab 880213d01ad8 816112e3
> 0006
> >> [0.491032]  880213d01b28 880213d01b18 8104e9bc
> 880213d01b08
> >> [0.491343]  c9c58000 fed1 fed1
> 6000
> >> [0.491631] Call Trace:
> >> [0.493337]  [] dump_stack+0x4f/0x7c
> >> [0.493420]  [] warn_slowpath_common+0x8c/0xc0
> >> [0.493503]  [] warn_slowpath_fmt+0x46/0x50
> >> [0.493588]  [] __ioremap_caller+0x372/0x380
> >> [0.493674]  [] ?
> snb_uncore_imc_init_box+0x62/0x90
> >> [0.493761]  [] ioremap_nocache+0x17/0x20
> >> [0.493846]  []
> snb_uncore_imc_init_box+0x62/0x90
> >> [0.493933]  [] uncore_pci_probe+0xe5/0x1e0
> >> [0.494020]  [] local_pci_probe+0x4e/0xa0
> >> [0.494104]  [] ? get_device+0x19/0x20
> >> [0.494213]  [] pci_device_probe+0xe1/0x130
> >> [0.494300]  [] driver_probe_de

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Mar 20, 2014 at 3:24 AM, Aaron Lu  wrote:
> On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
>> [CC list rearranged]
>>
>> On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
>>> This started happening this morning after booting -rc4+tip, let's
>>> add *everybody* to CC :-)
>>>
>>> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe and
>>> other goodies on the stack.
>>
>> I've just gone through this.
>>
>> So the problem is that we have the PNP "system" driver whose only purpose seems
>> to be to reserve system resources so that the PCI layer doesn't assign them to
>> new devices on hotplug (disclaimer: I didn't invent it, I only read the code and
>> comments in there).
>
> And to PCI devices which have uninitialized BARs.
>
>>
>> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.
>>
>> Apparently, snb_uncore_imc_init_box() steps on a range already reserved by that
>> driver on your box.  And this doesn't seem to be a coincidence, because the ACPI
>> device object in question probably *does* correspond to the memory controller
>> that the uncore driver attempts to use.
>>
>> I'm not sure how to address that right now to be honest.  Arguably, the PNP
>> "system" driver should be replaced with something saner, but still the
>> resources it claims need to be kept out of reach of the PCI's resource
>> allocation code.
>
> The quirk_system_pci_resources is meant to disable PNP devices' resource if
> they collide with any known PCI device's BAR. I'm not sure why it doesn't work
> here, perhaps the uncore PCI device doesn't have a BAR that falls in the PNP
> device's resource window?
>
Another hypothesis I am exploring with Bjorn is that the BIOS does not advertise
this correctly or that this BAR has non-standard size or behavior. So far, we
have observed the collision only on Lenovo IvyBridge laptops. I have tried on
my desktop SNB, IVB, HSW machines and never saw the assertion.

> Thanks,
> Aaron
>
>>
>>> ...
>>> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
>>> [0.489975] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
>>> [0.490079] [ cut here ]
>>> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
>>> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
>>> [0.490371] Modules linked in:
>>> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #1
>>> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
>>> [0.490742]  00ab 880213d01ad8 816112e3 0006
>>> [0.491032]  880213d01b28 880213d01b18 8104e9bc 880213d01b08
>>> [0.491343]  c9c58000 fed1 fed1 6000
>>> [0.491631] Call Trace:
>>> [0.493337]  [] dump_stack+0x4f/0x7c
>>> [0.493420]  [] warn_slowpath_common+0x8c/0xc0
>>> [0.493503]  [] warn_slowpath_fmt+0x46/0x50
>>> [0.493588]  [] __ioremap_caller+0x372/0x380
>>> [0.493674]  [] ? snb_uncore_imc_init_box+0x62/0x90
>>> [0.493761]  [] ioremap_nocache+0x17/0x20
>>> [0.493846]  [] snb_uncore_imc_init_box+0x62/0x90
>>> [0.493933]  [] uncore_pci_probe+0xe5/0x1e0
>>> [0.494020]  [] local_pci_probe+0x4e/0xa0
>>> [0.494104]  [] ? get_device+0x19/0x20
>>> [0.494213]  [] pci_device_probe+0xe1/0x130
>>> [0.494300]  [] driver_probe_device+0x7b/0x240
>>> [0.494385]  [] __driver_attach+0xab/0xb0
>>> [0.494469]  [] ? driver_probe_device+0x240/0x240
>>> [0.494551]  [] bus_for_each_dev+0x5e/0x90
>>> [0.494634]  [] driver_attach+0x1e/0x20
>>> [0.494718]  [] bus_add_driver+0x117/0x230
>>> [0.494802]  [] driver_register+0x64/0xf0
>>> [0.494884]  [] __pci_register_driver+0x64/0x70
>>> [0.494972]  [] ? uncore_types_init+0x19c/0x19c
>>> [0.495056]  [] intel_uncore_init+0x177/0x41c
>>> [0.495155]  [] ? uncore_types_init+0x19c/0x19c
>>> [0.495242]  [] do_one_initcall+0x4e/0x170
>>> [0.495326]  [] ? parse_args+0x60/0x360
>>> [0.495411]  [] kernel_init_freeable+0x106/0x19a
>>> [0.495497]  [] ? do_early_param+0x86/0x86
>>> [0.495582]  [] ? rest_init+0xd0/0xd0
>>> [0.495666]  [] kernel_init+0xe/0xf0
>>> [0.495749]  [] ret_from_fork+0x7c/0xb0
>>> [0.495831]  [] ? rest_init+0xd0/0xd0
>>> [0.495921] ---[ end trace 428f365c054d9a01 ]---
>>> [0.496196] RAPL PMU detected, hw unit 2^-16 Joules, API unit is 2^-32 Joules, 3 fixed counters 163840 ms ovfl timer
>>> [0.498598] futex hash table entries: 1024 (order: 5, 131072 bytes)
>>> [0.498833] audit: initializing netlink subsys (disabled)
>>> [0.499024] audit: type=2000 audit(1393259866.477:1): initialized
>>> ...
>>>
>>>
>>
>

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-19 Thread Aaron Lu

On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
> [CC list rearranged]
> 
> On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
>> This started happening this morning after booting -rc4+tip, let's
>> add *everybody* to CC :-)
>>
>> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe and
>> other goodies on the stack.
> 
> I've just gone through this.
>
> So the problem is that we have the PNP "system" driver whose only purpose seems
> to be to reserve system resources so that the PCI layer doesn't assign them to
> new devices on hotplug (disclaimer: I didn't invent it, I only read the code and
> comments in there).

And to PCI devices which have uninitialized BARs.

>
> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.
>
> Apparently, snb_uncore_imc_init_box() steps on a range already reserved by that
> driver on your box.  And this doesn't seem to be a coincidence, because the ACPI
> device object in question probably *does* correspond to the memory controller
> that the uncore driver attempts to use.
> 
> I'm not sure how to address that right now to be honest.  Arguably, the PNP
> "system" driver should be replaced with something saner, but still the
> resources it claims need to be kept out of reach of the PCI's resource
> allocation code.

The quirk_system_pci_resources is meant to disable PNP devices' resource if
they collide with any known PCI device's BAR. I'm not sure why it doesn't work
here, perhaps the uncore PCI device doesn't have a BAR that falls in the PNP
device's resource window?
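
To make the two conditions concrete: the "resource map sanity check conflict"
line in the log reports a mapping that ends at 0xfed15fff against a pnp 00:01
window that ends at 0xfed13fff, so the mapping touches the reserved window but
also runs past it. Below is a minimal user-space sketch of such range checks.
It is not the kernel code; struct addr_range and the three helpers are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, in the spirit of a resource's start/end pair. */
struct addr_range {
    uint64_t start;
    uint64_t end;
};

/* True if the two inclusive ranges share at least one address. */
static bool ranges_overlap(struct addr_range a, struct addr_range b)
{
    assert(a.start <= a.end && b.start <= b.end);
    return a.start <= b.end && b.start <= a.end;
}

/* True if 'inner' lies entirely within 'outer'. */
static bool range_contains(struct addr_range outer, struct addr_range inner)
{
    assert(outer.start <= outer.end && inner.start <= inner.end);
    return outer.start <= inner.start && inner.end <= outer.end;
}

/*
 * True if a requested mapping touches a reserved window but is not fully
 * inside it, i.e. it would straddle the boundary of that window.
 */
static bool mapping_straddles(struct addr_range reserved,
                              struct addr_range mapping)
{
    return ranges_overlap(reserved, mapping) &&
           !range_contains(reserved, mapping);
}
```

A quirk of the kind described above would only need ranges_overlap() against
each known BAR; the straddle test corresponds to the situation the warning
complains about.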

Thanks,
Aaron

> 
>> ...
>> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
>> [0.489975] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
>> [0.490079] [ cut here ]
>> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
>> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
>> [0.490371] Modules linked in:
>> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #1
>> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
>> [0.490742]  00ab 880213d01ad8 816112e3 0006
>> [0.491032]  880213d01b28 880213d01b18 8104e9bc 880213d01b08
>> [0.491343]  c9c58000 fed1 fed1 6000
>> [0.491631] Call Trace:
>> [0.493337]  [] dump_stack+0x4f/0x7c
>> [0.493420]  [] warn_slowpath_common+0x8c/0xc0
>> [0.493503]  [] warn_slowpath_fmt+0x46/0x50
>> [0.493588]  [] __ioremap_caller+0x372/0x380
>> [0.493674]  [] ? snb_uncore_imc_init_box+0x62/0x90
>> [0.493761]  [] ioremap_nocache+0x17/0x20
>> [0.493846]  [] snb_uncore_imc_init_box+0x62/0x90
>> [0.493933]  [] uncore_pci_probe+0xe5/0x1e0
>> [0.494020]  [] local_pci_probe+0x4e/0xa0
>> [0.494104]  [] ? get_device+0x19/0x20
>> [0.494213]  [] pci_device_probe+0xe1/0x130
>> [0.494300]  [] driver_probe_device+0x7b/0x240
>> [0.494385]  [] __driver_attach+0xab/0xb0
>> [0.494469]  [] ? driver_probe_device+0x240/0x240
>> [0.494551]  [] bus_for_each_dev+0x5e/0x90
>> [0.494634]  [] driver_attach+0x1e/0x20
>> [0.494718]  [] bus_add_driver+0x117/0x230
>> [0.494802]  [] driver_register+0x64/0xf0
>> [0.494884]  [] __pci_register_driver+0x64/0x70
>> [0.494972]  [] ? uncore_types_init+0x19c/0x19c
>> [0.495056]  [] intel_uncore_init+0x177/0x41c
>> [0.495155]  [] ? uncore_types_init+0x19c/0x19c
>> [0.495242]  [] do_one_initcall+0x4e/0x170
>> [0.495326]  [] ? parse_args+0x60/0x360
>> [0.495411]  [] kernel_init_freeable+0x106/0x19a
>> [0.495497]  [] ? do_early_param+0x86/0x86
>> [0.495582]  [] ? rest_init+0xd0/0xd0
>> [0.495666]  [] kernel_init+0xe/0xf0
>> [0.495749]  [] ret_from_fork+0x7c/0xb0
>> [0.495831]  [] ? rest_init+0xd0/0xd0
>> [0.495921] ---[ end trace 428f365c054d9a01 ]---
>> [0.496196] RAPL PMU detected, hw unit 2^-16 Joules, API unit is 2^-32 Joules, 3 fixed counters 163840 ms ovfl timer
>> [0.498598] futex hash table entries: 1024 (order: 5, 131072 bytes)
>> [0.498833] audit: initializing netlink subsys (disabled)
>> [0.499024] audit: type=2000 audit(1393259866.477:1): initialized
>> ...
>>
>>
> 


RE: Info: mapping multiple BARs. Your kernel is fine.

2014-03-19 Thread Zhang, Rui

> -----Original Message-----
> From: Lu, Aaron
> Sent: Thursday, March 20, 2014 10:24 AM
> To: Rafael J. Wysocki; Borislav Petkov
> Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel Maling
> List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> Importance: High
>
> On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
> > [CC list rearranged]
> >
> > On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> >> This started happening this morning after booting -rc4+tip, let's add
> >> *everybody* to CC :-)
> >>
> >> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe
> >> and other goodies on the stack.
> >
> > I've just gone through this.
> >
> > So the problem is that we have the PNP "system" driver whose only
> > purpose seems to be to reserve system resources so that the PCI layer
> > doesn't assign them to new devices on hotplug (disclaimer: I didn't
> > invent it, I only read the code and comments in there).
>
> And to PCI devices which have uninitialized BARs.
>
> > It does that for ACPI device objects having the "PNP0C02" and
> > "PNP0C01" IDs.
> >
> > Apparently, snb_uncore_imc_init_box() steps on a range already
> > reserved by that driver on your box.  And this doesn't seem to be a
> > coincidence, because the ACPI device object in question probably
> > *does* correspond to the memory controller that the uncore driver
> > attempts to use.
> >
> > I'm not sure how to address that right now to be honest.  Arguably,
> > the PNP "system" driver should be replaced with something saner, but
> > still the resources it claims need to be kept out of reach of the
> > PCI's resource allocation code.
>
> The quirk_system_pci_resources is meant to disable PNP devices'
> resource if they collide with any known PCI device's BAR. I'm not sure
> why it doesn't work here, perhaps the uncore PCI device doesn't have a
> BAR that falls in the PNP device's resource window?

I've talked with Yan Zheng, and I was told that this resource, 0xfed1 -
0xfed15fff, is read directly from a PCI device register and is not in the
device's BAR range. Thus, IMO, it is impossible for the PNP layer to be
aware of this resource.

BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
resources from being allocated to uninitialized PCI devices, then IMO,
the best way to do this is to make the PCI bus handle those PNP0C01/PNP0C02
resources directly, probably via a platform callback, say:
1. make drivers/pnp/system.c a no-op for PNPACPI, by checking pnp_dev->protocol.
2. introduce acpi_check_reserved_resource() to parse PNP0C01/PNP0C02
   resources.
3. in the PCI bus, invoke acpi_check_reserved_resource() when assigning
   resources to PCI devices.
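
The three steps above can be sketched in user-space C, purely as an
illustration of the idea. Nothing here is existing kernel API:
first_reserved_hit() stands in for the proposed acpi_check_reserved_resource(),
and assign_window() stands in for the PCI side that would consult it while
assigning resources. Both names, and the assumption that a window size is a
power of two and that 0 means "no window found", are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range collected from a PNP0C01/PNP0C02 device (illustrative). */
struct reserved_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Hypothetical stand-in for the proposed acpi_check_reserved_resource():
 * returns the index of the first reserved range that overlaps [start, end],
 * or -1 if the candidate range is free of reservations.
 */
static int first_reserved_hit(const struct reserved_range *r, size_t n,
                              uint64_t start, uint64_t end)
{
    assert(start <= end);
    for (size_t i = 0; i < n; i++)
        if (start <= r[i].end && r[i].start <= end)
            return (int)i;
    return -1;
}

/*
 * Pick the lowest size-aligned window inside [lo, hi] that avoids every
 * reserved range. 'size' must be a power of two. Returns 0 when no window
 * fits (this sketch assumes address 0 is never a valid answer).
 */
static uint64_t assign_window(const struct reserved_range *r, size_t n,
                              uint64_t lo, uint64_t hi, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    assert(lo <= hi);

    uint64_t start = (lo + size - 1) & ~(size - 1);

    while (start >= lo && start <= hi && hi - start >= size - 1) {
        int hit = first_reserved_hit(r, n, start, start + size - 1);
        if (hit < 0)
            return start;

        /* Skip past the conflicting reservation and realign. */
        uint64_t next = (r[hit].end + size) & ~(size - 1);
        if (next <= start)
            break;              /* wrapped around: give up */
        start = next;
    }
    return 0;
}
```

With this split, the reservation list is only data; the allocator decides
where a device may go, which is the division of labour the proposal argues for.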

Thanks,
rui

> Thanks,
> Aaron
>
> >> ...
> >> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
> >> [0.489975] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
> >> [0.490079] [ cut here ]
> >> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
> >> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
> >> [0.490371] Modules linked in:
> >> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #1
> >> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
> >> [0.490742]  00ab 880213d01ad8 816112e3 0006
> >> [0.491032]  880213d01b28 880213d01b18 8104e9bc 880213d01b08
> >> [0.491343]  c9c58000 fed1 fed1 6000
> >> [0.491631] Call Trace:
> >> [0.493337]  [816112e3] dump_stack+0x4f/0x7c
> >> [0.493420]  [8104e9bc] warn_slowpath_common+0x8c/0xc0
> >> [0.493503]  [8104eaa6] warn_slowpath_fmt+0x46/0x50
> >> [0.493588]  [8103f1e2] __ioremap_caller+0x372/0x380
> >> [0.493674]  [810211a2] ? snb_uncore_imc_init_box+0x62/0x90
> >> [0.493761]  [8103f247] ioremap_nocache+0x17/0x20
> >> [0.493846]  [810211a2] snb_uncore_imc_init_box+0x62/0x90
> >> [0.493933]  [81022925] uncore_pci_probe+0xe5/0x1e0
> >> [0.494020]  [812d487e] local_pci_probe+0x4e/0xa0
> >> [0.494104]  [81418a59] ? get_device+0x19/0x20
> >> [0.494213]  [812d5cd1] pci_device_probe+0xe1/0x130
> >> [0.494300]  [8141d3cb] driver_probe_device+0x7b/0x240
> >> [0.494385]  [8141d63b] __driver_attach+0xab/0xb0
> >> [0.494469]  [8141d590] ? driver_probe_device+0x240/0x240
> >> [0.494551]  [8141b71e] bus_for_each_dev+0x5e/0x90
> >> [0.494634]  [8141cede] driver_attach+0x1e/0x20
> >> [0.494718]  [8141ca57] bus_add_driver+0x117/0x230
> >> [0.494802]  [8141dd34] driver_register+0x64/0xf0
> >> [0.494884]  [812d4c14] __pci_register_driver+0x64/0x70
> >> [0.494972]  [81d0319b] ? uncore_types_init+0x19c/0x19c
> >> [0.495056

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Mar 20, 2014 at 4:03 AM, Zhang, Rui <rui.zh...@intel.com> wrote:
>
>> -----Original Message-----
>> From: Lu, Aaron
>> Sent: Thursday, March 20, 2014 10:24 AM
>> To: Rafael J. Wysocki; Borislav Petkov
>> Cc: lkml; x...@kernel.org; Bjorn Helgaas; Linux PCI; ACPI Devel Maling
>> List; Zhang, Rui; Yinghai Lu; H. Peter Anvin; Stephane Eranian
>> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
>> Importance: High
>>
>> On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
>> > [CC list rearranged]
>> >
>> > On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
>> >> This started happening this morning after booting -rc4+tip, let's add
>> >> *everybody* to CC :-)
>> >>
>> >> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe
>> >> and other goodies on the stack.
>> >
>> > I've just gone through this.
>> >
>> > So the problem is that we have the PNP "system" driver whose only
>> > purpose seems to be to reserve system resources so that the PCI layer
>> > doesn't assign them to new devices on hotplug (disclaimer: I didn't
>> > invent it, I only read the code and comments in there).
>>
>> And to PCI devices which have uninitialized BARs.
>>
>> > It does that for ACPI device objects having the "PNP0C02" and
>> > "PNP0C01" IDs.
>> >
>> > Apparently, snb_uncore_imc_init_box() steps on a range already
>> > reserved by that driver on your box.  And this doesn't seem to be a
>> > coincidence, because the ACPI device object in question probably
>> > *does* correspond to the memory controller that the uncore driver
>> > attempts to use.
>> >
>> > I'm not sure how to address that right now to be honest.  Arguably,
>> > the PNP "system" driver should be replaced with something saner, but
>> > still the resources it claims need to be kept out of reach of the
>> > PCI's resource allocation code.
>>
>> The quirk_system_pci_resources is meant to disable PNP devices'
>> resource if they collide with any known PCI device's BAR. I'm not sure
>> why it doesn't work here, perhaps the uncore PCI device doesn't have a
>> BAR that falls in the PNP device's resource window?
>
> I've talked with Yan Zheng, and I was told that this resource 0xfed1 -
> 0xfed15fff
> is got from PCI device register directly, which is not in its BAR range.
> Thus IMO, it is impossible for PNP layer to be aware of this resource.

That is not what the perf_event code does. Nothing is hardcoded except
the IMC PCI device IDs and the BAR offset, that's all. The 0xfed1 address
is discovered, not hardcoded.
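
As a rough illustration of "offset hardcoded, address discovered", here is a
user-space sketch that reads a 64-bit base address out of a snapshot of a
device's PCI config space at a fixed offset. It is not the perf_event code:
cfg_read32() and discover_base() are hypothetical helpers, the snapshot is a
plain byte array rather than real config-space accessors, and masking the
result down to a 4 KiB boundary is an assumption about how the low flag bits
are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_SPACE_SIZE 256u                     /* conventional PCI config space */
#define PAGE_MASK_4K   (~(uint64_t)0xfff)       /* assumed 4 KiB alignment */

/* Read a little-endian 32-bit register from a config-space snapshot. */
static uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
    assert(off % 4 == 0 && off + 4 <= CFG_SPACE_SIZE);
    return (uint32_t)cfg[off] |
           ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) |
           ((uint32_t)cfg[off + 3] << 24);
}

/*
 * Combine the low and high dwords found at 'bar_off' into one address and
 * drop the low bits. Only 'bar_off' is fixed by the caller; the address
 * itself comes from whatever the firmware programmed into the register.
 */
static uint64_t discover_base(const uint8_t *cfg, size_t bar_off)
{
    uint64_t lo = cfg_read32(cfg, bar_off);
    uint64_t hi = cfg_read32(cfg, bar_off + 4);

    return ((hi << 32) | lo) & PAGE_MASK_4K;
}
```

Because the register is not one of the standard BARs, code that only walks
the standard BARs would not see the resulting range.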

> BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
> resources from being allocated to uninitialized PCI devices, then IMO,
> the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
> resources directly, probably via a platform callback, say,
> 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking
>    pnp_dev->protocol.
> 2. introduce acpi_check_reserved_resource() to parsing PNP0C01/PNP0C02
>    resources.
> 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
>    resources to PCI devices.
>
> Thanks,
> rui
>
>> Thanks,
>> Aaron

>> >> ...
>> >> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
>> >> [0.489975] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
>> >> [0.490079] [ cut here ]
>> >> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
>> >> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
>> >> [0.490371] Modules linked in:
>> >> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #1
>> >> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
>> >> [0.490742]  00ab 880213d01ad8 816112e3 0006
>> >> [0.491032]  880213d01b28 880213d01b18 8104e9bc 880213d01b08
>> >> [0.491343]  c9c58000 fed1 fed1 6000
>> >> [0.491631] Call Trace:
>> >> [0.493337]  [816112e3] dump_stack+0x4f/0x7c
>> >> [0.493420]  [8104e9bc] warn_slowpath_common+0x8c/0xc0
>> >> [0.493503]  [8104eaa6] warn_slowpath_fmt+0x46/0x50
>> >> [0.493588]  [8103f1e2] __ioremap_caller+0x372/0x380
>> >> [0.493674]  [810211a2] ? snb_uncore_imc_init_box+0x62/0x90
>> >> [0.493761]  [8103f247] ioremap_nocache+0x17/0x20
>> >> [0.493846]  [810211a2] snb_uncore_imc_init_box+0x62/0x90
>> >> [0.493933]  [81022925] uncore_pci_probe+0xe5/0x1e0
>> >> [0.494020]  [812d487e] local_pci_probe+0x4e/0xa0
>> >> [0.494104]  [81418a59] ? get_device+0x19/0x20
>> >> [0.494213]  [812d5cd1] pci_device_probe+0xe1/0x130
>> >> [0.494300]  [8141d3cb] driver_probe_device+0x7b/0x240
>> >> [0.494385]  [8141d63b] __driver_attach+0xab/0xb0
>> >> [0.494469]  [8141d590] ? driver_probe_device+0x240/0x240
>> >> [0.494551]  [8141b71e] bus_for_each_dev+0x5e/0x90
>> >> [0.494634]  [8141cede] driver_attach+0x1e/0x20
>> >> [0.494718]  [8141ca57

Re: Info: mapping multiple BARs. Your kernel is fine.

On Monday, March 17, 2014 01:09:39 AM Rafael J. Wysocki wrote:
> On Sunday, March 16, 2014 02:08:16 PM Stephane Eranian wrote:
> > Rafael,
> > 
> > Thanks for the analysis.
> > 
> > On Sun, Mar 16, 2014 at 12:55 PM, Borislav Petkov  wrote:
> > > On Sat, Mar 15, 2014 at 03:15:04PM +0100, Rafael J. Wysocki wrote:
> > >> I've just gone through this.
> > >
> > > Thanks.
> > >
> > >> So the problem is that we have the PNP "system" driver whose only
> > >> purpose seems to be to reserve system resources so that the PCI layer
> > >> doesn't assign them to new devices on hotplug (disclaimer: I didn't
> > >> invent it, I only read the code and comments in there).
> > >>
> > >> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01"
> > >> IDs.
> > >
> > > Right, pnp 00:01 is PNP0C02.
> > >
> > >> Apparently, snb_uncore_imc_init_box() steps on a range already reserved
> > >> by that driver on your box.  And this doesn't seem to be a coincidence,
> > >> because the ACPI device object in question probably *does* correspond
> > >> to the memory controller that the uncore driver attempts to use.
> > >>
> > >> I'm not sure how to address that right now to be honest.  Arguably, the
> > >> PNP "system" driver should be replaced with something saner, but still
> > >> the resources it claims need to be kept out of reach of the PCI's
> > >> resource allocation code.
> > >
> > > Well, I'm only conjecturing here but there should be a way for the
> > > uncore code to tell the PNP "system" driver to free this resource
> > > because uncore is going to use it now. Or something to that effect.
> > >
> > I agree. The snb_uncore_imc() is making real (good) use of the device.
> > It needs to own it. So we need a way to free the resource from the PNP
> > system or a way to tell PNP not to grab it on systems with the
> > snb_uncore_imc() support. Does that kind of API exist?
> > 
> > Where do I look to prevent PNP from grabbing the IMC?
> 
> drivers/pnp/system.c is the driver in question and system_pnp_probe() makes
> the reservations via reserve_resources_of_dev(), so you'd need to modify that.
> 
> I'm not sure what's the right way to go here, though.

Boris, can you please send the acpidump output from that machine?

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Sunday, March 16, 2014 02:08:16 PM Stephane Eranian wrote:
> Rafael,
> 
> Thanks for the analysis.
> 
> On Sun, Mar 16, 2014 at 12:55 PM, Borislav Petkov  wrote:
> > On Sat, Mar 15, 2014 at 03:15:04PM +0100, Rafael J. Wysocki wrote:
> >> I've just gone through this.
> >
> > Thanks.
> >
> >> So the problem is that we have the PNP "system" driver whose only purpose
> >> seems to be to reserve system resources so that the PCI layer doesn't
> >> assign them to new devices on hotplug (disclaimer: I didn't invent it, I
> >> only read the code and comments in there).
> >>
> >> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01"
> >> IDs.
> >
> > Right, pnp 00:01 is PNP0C02.
> >
> >> Apparently, snb_uncore_imc_init_box() steps on a range already reserved by
> >> that driver on your box.  And this doesn't seem to be a coincidence,
> >> because the ACPI device object in question probably *does* correspond to
> >> the memory controller that the uncore driver attempts to use.
> >>
> >> I'm not sure how to address that right now to be honest.  Arguably, the PNP
> >> "system" driver should be replaced with something saner, but still the
> >> resources it claims need to be kept out of reach of the PCI's resource
> >> allocation code.
> >
> > Well, I'm only conjecturing here but there should be a way for the
> > uncore code to tell the PNP "system" driver to free this resource
> > because uncore is going to use it now. Or something to that effect.
> >
> I agree. The snb_uncore_imc() is making real (good) use of the device.
> It needs to own it. So we need a way to free the resource from the PNP
> system or a way to tell PNP not to grab it on systems with the
> snb_uncore_imc() support. Does that kind of API exist?
> 
> Where do I look to prevent PNP from grabbing the IMC?

drivers/pnp/system.c is the driver in question and system_pnp_probe() makes
the reservations via reserve_resources_of_dev(), so you'd need to modify that.

I'm not sure what's the right way to go here, though.

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-16 Thread Stephane Eranian

Rafael,

Thanks for the analysis.

On Sun, Mar 16, 2014 at 12:55 PM, Borislav Petkov  wrote:
> On Sat, Mar 15, 2014 at 03:15:04PM +0100, Rafael J. Wysocki wrote:
>> I've just gone through this.
>
> Thanks.
>
>> So the problem is that we have the PNP "system" driver whose only purpose
>> seems to be to reserve system resources so that the PCI layer doesn't
>> assign them to new devices on hotplug (disclaimer: I didn't invent it, I
>> only read the code and comments in there).
>>
>> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.
>
> Right, pnp 00:01 is PNP0C02.
>
>> Apparently, snb_uncore_imc_init_box() steps on a range already reserved by
>> that driver on your box.  And this doesn't seem to be a coincidence, because
>> the ACPI device object in question probably *does* correspond to the memory
>> controller that the uncore driver attempts to use.
>>
>> I'm not sure how to address that right now to be honest.  Arguably, the PNP
>> "system" driver should be replaced with something saner, but still the
>> resources it claims need to be kept out of reach of the PCI's resource
>> allocation code.
>
> Well, I'm only conjecturing here but there should be a way for the
> uncore code to tell the PNP "system" driver to free this resource
> because uncore is going to use it now. Or something to that effect.
>
I agree. The snb_uncore_imc() is making real (good) use of the device.
It needs to own it. So we need a way to free the resource from the PNP
system, or a way to tell PNP not to grab it on systems with
snb_uncore_imc() support. Does that kind of API exist?

Where do I look to prevent PNP from grabbing the IMC?

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-16 Thread Borislav Petkov

On Sat, Mar 15, 2014 at 03:15:04PM +0100, Rafael J. Wysocki wrote:
> I've just gone through this.

Thanks.

> So the problem is that we have the PNP "system" driver whose only purpose
> seems to be to reserve system resources so that the PCI layer doesn't assign
> them to new devices on hotplug (disclaimer: I didn't invent it, I only read
> the code and comments in there).
> 
> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.

Right, pnp 00:01 is PNP0C02.

> Apparently, snb_uncore_imc_init_box() steps on a range already reserved by
> that driver on your box.  And this doesn't seem to be a coincidence, because
> the ACPI device object in question probably *does* correspond to the memory
> controller that the uncore driver attempts to use.
> 
> I'm not sure how to address that right now to be honest.  Arguably, the PNP
> "system" driver should be replaced with something saner, but still the
> resources it claims need to be kept out of reach of the PCI's resource
> allocation code.

Well, I'm only conjecturing here but there should be a way for the
uncore code to tell the PNP "system" driver to free this resource
because uncore is going to use it now. Or something to that effect.

Oh well.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Monday, March 17, 2014 01:09:39 AM Rafael J. Wysocki wrote:
> On Sunday, March 16, 2014 02:08:16 PM Stephane Eranian wrote:
> > Rafael,
> >
> > Thanks for the analysis.
> >
> > On Sun, Mar 16, 2014 at 12:55 PM, Borislav Petkov <b...@alien8.de> wrote:
> > > On Sat, Mar 15, 2014 at 03:15:04PM +0100, Rafael J. Wysocki wrote:
> > >> I've just gone through this.
> > >
> > > Thanks.
> > >
> > >> So the problem is that we have the PNP "system" driver whose only
> > >> purpose seems to be to reserve system resources so that the PCI layer
> > >> doesn't assign them to new devices on hotplug (disclaimer: I didn't
> > >> invent it, I only read the code and comments in there).
> > >>
> > >> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01"
> > >> IDs.
> > >
> > > Right, pnp 00:01 is PNP0C02.
> > >
> > >> Apparently, snb_uncore_imc_init_box() steps on a range already reserved
> > >> by that driver on your box.  And this doesn't seem to be a coincidence,
> > >> because the ACPI device object in question probably *does* correspond
> > >> to the memory controller that the uncore driver attempts to use.
> > >>
> > >> I'm not sure how to address that right now to be honest.  Arguably, the
> > >> PNP "system" driver should be replaced with something saner, but still
> > >> the resources it claims need to be kept out of reach of the PCI's
> > >> resource allocation code.
> > >
> > > Well, I'm only conjecturing here but there should be a way for the
> > > uncore code to tell the PNP "system" driver to free this resource
> > > because uncore is going to use it now. Or something to that effect.
> > >
> > I agree. The snb_uncore_imc() is making real (good) use of the device.
> > It needs to own it. So we need a way to free the resource from the PNP
> > system, or a way to tell PNP not to grab it on systems with
> > snb_uncore_imc() support. Does that kind of API exist?
> >
> > Where do I look to prevent PNP from grabbing the IMC?
>
> drivers/pnp/system.c is the driver in question and system_pnp_probe() makes
> the reservations via reserve_resources_of_dev(), so you'd need to modify that.
>
> I'm not sure what's the right way to go here, though.

Boris, can you please send the acpidump output from that machine?

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-15 Thread Rafael J. Wysocki

[CC list rearranged]

On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> This started happening this morning after booting -rc4+tip, let's
> add *everybody* to CC :-)
> 
> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe and
> other goodies on the stack.

I've just gone through this.

So the problem is that we have the PNP "system" driver whose only purpose seems
to be to reserve system resources so that the PCI layer doesn't assign them to
new devices on hotplug (disclaimer: I didn't invent it, I only read the code and
comments in there).

It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.

Apparently, snb_uncore_imc_init_box() steps on a range already reserved by that
driver on your box.  And this doesn't seem to be a coincidence, because the ACPI
device object in question probably *does* correspond to the memory controller
that the uncore driver attempts to use.

I'm not sure how to address that right now to be honest.  Arguably, the PNP
"system" driver should be replaced with something saner, but still the
resources it claims need to be kept out of reach of the PCI's resource
allocation code.
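
The conflict described above can be illustrated with a small Python sketch.
It is a simplified model of a range sanity check, written under the
assumption that a warning is warranted when a requested mapping overlaps an
existing resource without fitting entirely inside it. The function name
mapping_conflicts and the addresses are hypothetical, and details of the real
check in the kernel (page granularity, resource flags) are not modeled.

```python
def mapping_conflicts(map_start: int, map_size: int,
                      res_start: int, res_end: int) -> bool:
    """Return True when the mapping overlaps the resource but spills past it."""
    map_end = map_start + map_size - 1  # inclusive end of the mapping
    if map_end < res_start or map_start > res_end:
        return False  # disjoint ranges: nothing to complain about
    if res_start <= map_start and map_end <= res_end:
        return False  # mapping lies entirely inside the resource
    return True       # partial overlap: the mapping spans more than the resource


# Illustrative addresses (assumed, not copied from the log):
# a 0x6000-byte mapping against a reserved range that is only 0x4000 bytes long.
conflict = mapping_conflicts(0xFED10000, 0x6000, 0xFED10000, 0xFED13FFF)
```

With these values the mapping starts where the reserved range starts but
extends beyond its end, which is the situation the model reports as a
conflict.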

> ...
> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
> [0.489975] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
> [0.490079] ------------[ cut here ]------------
> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
> [0.490371] Modules linked in:
> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #1
> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
> [0.490742]  00ab 880213d01ad8 816112e3 0006
> [0.491032]  880213d01b28 880213d01b18 8104e9bc 880213d01b08
> [0.491343]  c9c58000 fed1 fed1 6000
> [0.491631] Call Trace:
> [0.493337]  [816112e3] dump_stack+0x4f/0x7c
> [0.493420]  [8104e9bc] warn_slowpath_common+0x8c/0xc0
> [0.493503]  [8104eaa6] warn_slowpath_fmt+0x46/0x50
> [0.493588]  [8103f1e2] __ioremap_caller+0x372/0x380
> [0.493674]  [810211a2] ? snb_uncore_imc_init_box+0x62/0x90
> [0.493761]  [8103f247] ioremap_nocache+0x17/0x20
> [0.493846]  [810211a2] snb_uncore_imc_init_box+0x62/0x90
> [0.493933]  [81022925] uncore_pci_probe+0xe5/0x1e0
> [0.494020]  [812d487e] local_pci_probe+0x4e/0xa0
> [0.494104]  [81418a59] ? get_device+0x19/0x20
> [0.494213]  [812d5cd1] pci_device_probe+0xe1/0x130
> [0.494300]  [8141d3cb] driver_probe_device+0x7b/0x240
> [0.494385]  [8141d63b] __driver_attach+0xab/0xb0
> [0.494469]  [8141d590] ? driver_probe_device+0x240/0x240
> [0.494551]  [8141b71e] bus_for_each_dev+0x5e/0x90
> [0.494634]  [8141cede] driver_attach+0x1e/0x20
> [0.494718]  [8141ca57] bus_add_driver+0x117/0x230
> [0.494802]  [8141dd34] driver_register+0x64/0xf0
> [0.494884]  [812d4c14] __pci_register_driver+0x64/0x70
> [0.494972]  [81d0319b] ? uncore_types_init+0x19c/0x19c
> [0.495056]  [81d03312] intel_uncore_init+0x177/0x41c
> [0.495155]  [81d0319b] ? uncore_types_init+0x19c/0x19c
> [0.495242]  [8100029e] do_one_initcall+0x4e/0x170
> [0.495326]  [81071100] ? parse_args+0x60/0x360
> [0.495411]  [81cfbfb8] kernel_init_freeable+0x106/0x19a
> [0.495497]  [81cfb83b] ? do_early_param+0x86/0x86
> [0.495582]  [81607ef0] ? rest_init+0xd0/0xd0
> [0.495666]  [81607efe] kernel_init+0xe/0xf0
> [0.495749]  [81621f6c] ret_from_fork+0x7c/0xb0
> [0.495831]  [81607ef0] ? rest_init+0xd0/0xd0
> [0.495921] ---[ end trace 428f365c054d9a01 ]---
> [0.496196] RAPL PMU detected, hw unit 2^-16 Joules, API unit is 2^-32 Joules, 3 fixed counters 163840 ms ovfl timer
> [0.498598] futex hash table entries: 1024 (order: 5, 131072 bytes)
> [0.498833] audit: initializing netlink subsys (disabled)
> [0.499024] audit: type=2000 audit(1393259866.477:1): initialized
> ...
> 
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-03-05 Thread Stephane Eranian

Hi,

Any update on this problem?

On Thu, Feb 27, 2014 at 11:21 PM, Borislav Petkov  wrote:
> On Thu, Feb 27, 2014 at 11:12:17PM +0100, Rafael J. Wysocki wrote:
>> I won't be able to look at that before Monday I'm afraid (personal
>> stuff).
>
> No worries, sir, whenever. It can wait.
>
> Thanks a lot!
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:12:17PM +0100, Rafael J. Wysocki wrote:
> I won't be able to look at that before Monday I'm afraid (personal
> stuff).

No worries, sir, whenever. It can wait.

Thanks a lot!

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-02-27 Thread Rafael J. Wysocki

On Thursday, February 27, 2014 11:27:22 AM Borislav Petkov wrote:
> On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> > My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
> > there the BAR is at a completely different address. Same thing on my
> > Haswell desktop system.
> 
> Hrrm, I'd like to see what Rafael finds out, whether what we're reading
> from PCI config space is even sane.

I won't be able to look at that before Monday I'm afraid (personal stuff).

Rafael


Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 12:08 PM, Peter Zijlstra  wrote:
> On Thu, Feb 27, 2014 at 11:32:58AM +0100, Stephane Eranian wrote:
>> On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra wrote:
>> > On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
>> >> As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
>> >> They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable.
>> >> I am not so sure this is all related to the uncore IMC support, though.
>> >
>> > Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
>> > patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
>> > up soon.
>>
>> Yes, I mean from tip.git.
>
> lkml.kernel.org/r/20140224121218.gr15...@twins.programming.kicks-ass.net
>
> Should cure things; unless there's more borkage.

Works again now with your patch.
Thanks.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:32:58AM +0100, Stephane Eranian wrote:
> On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra  wrote:
> > On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> >> As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
> >> They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable.
> >> I am not so sure this is all related to the uncore IMC support, though.
> >
> > Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
> > patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
> > up soon.
> 
> Yes, I mean from tip.git.

lkml.kernel.org/r/20140224121218.gr15...@twins.programming.kicks-ass.net

Should cure things; unless there's more borkage.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra  wrote:
> On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
>> As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
>> They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable. I am
>> not so sure this is all related to the uncore IMC support, though.
>
> Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
> patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
> up soon.

Yes, I mean from tip.git.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
> They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable. I am
> not so sure this is all related to the uncore IMC support, though.

Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
up soon.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
> there the BAR is at a completely different address. Same thing on my
> Haswell desktop system.

Hrrm, I'd like to see what Rafael finds out, whether what we're reading
from PCI config space is even sane.

> As an aside, my SNB and HSW desktops with 3.14-rc4 are totally
> unstable. They hang if I type make in my kernel tree. Whereas 3.14-rc3
> is stable. I am not so sure this is all related to the uncore IMC
> support, though.

Easy to test - just disable the uncore thing.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Feb 26, 2014 at 10:59 AM, Borislav Petkov  wrote:
> Can you please, pretty please, not top-post...
>
> On Wed, Feb 26, 2014 at 10:47:05AM +0100, Stephane Eranian wrote:
>> Hi,
>>
>> Ok, so I am getting the same error message as you.
>> I checked my syslog now.
>>
>> I have my uncore_imc addr=0xfed1 (after masking)
>>
>> And I also have pnp 00:01 overlapping the imc range completely.
>>
>> What pnp device does  it really represent? the DRAM controller?
>>
>> So I think my laptop behaves like yours.
>
> grep -Er . /sys/devices/pnp0/00\:01/* 2>/dev/null
> /sys/devices/pnp0/00:01/firmware_node/hid:PNP0C02
> ...
>
> so this PNP0C02 is
>
> [0.363943] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
>
My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
there the BAR is at a completely different address. Same thing on my Haswell
desktop system.

As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable. I am
not so sure this is all related to the uncore IMC support, though.

> @Rafael, can you please make sense of this whole ACPI gunk?
>
> We have a resource conflict with pnp 00:01, analysis here:
> http://lkml.kernel.org/r/20140226092903.ga22...@pd.tnic
>
> This is the rest of the 00:01 info from sysfs:
>
> /sys/devices/pnp0/00:01/firmware_node/uid:0
> /sys/devices/pnp0/00:01/firmware_node/path:\_SB_.PCI0.LPC_.SIO_
> /sys/devices/pnp0/00:01/firmware_node/power/control:auto
> /sys/devices/pnp0/00:01/firmware_node/power/runtime_active_time:0
> /sys/devices/pnp0/00:01/firmware_node/power/runtime_status:unsupported
> /sys/devices/pnp0/00:01/firmware_node/power/runtime_suspended_time:0
> /sys/devices/pnp0/00:01/firmware_node/modalias:acpi:PNP0C02:
> /sys/devices/pnp0/00:01/firmware_node/uevent:MODALIAS=acpi:PNP0C02:
> /sys/devices/pnp0/00:01/id:PNP0c02
> /sys/devices/pnp0/00:01/power/control:auto
> /sys/devices/pnp0/00:01/power/runtime_active_time:0
> /sys/devices/pnp0/00:01/power/runtime_status:unsupported
> /sys/devices/pnp0/00:01/power/runtime_suspended_time:0
> /sys/devices/pnp0/00:01/resources:state = active
> /sys/devices/pnp0/00:01/resources:io 0x10-0x1f
> /sys/devices/pnp0/00:01/resources:io 0x90-0x9f
> /sys/devices/pnp0/00:01/resources:io 0x24-0x25
> /sys/devices/pnp0/00:01/resources:io 0x28-0x29
> /sys/devices/pnp0/00:01/resources:io 0x2c-0x2d
> /sys/devices/pnp0/00:01/resources:io 0x30-0x31
> /sys/devices/pnp0/00:01/resources:io 0x34-0x35
> /sys/devices/pnp0/00:01/resources:io 0x38-0x39
> /sys/devices/pnp0/00:01/resources:io 0x3c-0x3d
> /sys/devices/pnp0/00:01/resources:io 0xa4-0xa5
> /sys/devices/pnp0/00:01/resources:io 0xa8-0xa9
> /sys/devices/pnp0/00:01/resources:io 0xac-0xad
> /sys/devices/pnp0/00:01/resources:io 0xb0-0xb5
> /sys/devices/pnp0/00:01/resources:io 0xb8-0xb9
> /sys/devices/pnp0/00:01/resources:io 0xbc-0xbd
> /sys/devices/pnp0/00:01/resources:io 0x50-0x53
> /sys/devices/pnp0/00:01/resources:io 0x72-0x77
> /sys/devices/pnp0/00:01/resources:io 0x400-0x47f
> /sys/devices/pnp0/00:01/resources:io 0x500-0x57f
> /sys/devices/pnp0/00:01/resources:io 0x800-0x80f
> /sys/devices/pnp0/00:01/resources:io 0x15e0-0x15ef
> /sys/devices/pnp0/00:01/resources:io 0x1600-0x167f
> /sys/devices/pnp0/00:01/resources:mem 0xf800-0xfbff
> /sys/devices/pnp0/00:01/resources:mem 0xf000-0x
> /sys/devices/pnp0/00:01/resources:mem 0xfed1c000-0xfed1
> /sys/devices/pnp0/00:01/resources:mem 0xfed1-0xfed13fff
> /sys/devices/pnp0/00:01/resources:mem 0xfed18000-0xfed18fff
> /sys/devices/pnp0/00:01/resources:mem 0xfed19000-0xfed19fff
> /sys/devices/pnp0/00:01/resources:mem 0xfed45000-0xfed4bfff
> /sys/devices/pnp0/00:01/resources:mem 0xfed4-0xfed44fff
> /sys/devices/pnp0/00:01/subsystem/drivers_autoprobe:1
> /sys/devices/pnp0/00:01/uevent:DRIVER=system
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Feb 26, 2014 at 10:59 AM, Borislav Petkov b...@alien8.de wrote:
> Can you please, pretty please, not top-post...
>
> On Wed, Feb 26, 2014 at 10:47:05AM +0100, Stephane Eranian wrote:
> > Hi,
> >
> > Ok, so I am getting the same error message as you.
> > I checked my syslog now.
> >
> > I have my uncore_imc addr=0xfed1 (after masking)
> >
> > And I also have pnp 00:01 overlapping the imc range completely.
> >
> > What pnp device does it really represent? the DRAM controller?
> >
> > So I think my laptop behaves like yours.
>
> grep -Er . /sys/devices/pnp0/00\:01/* 2>/dev/null
> /sys/devices/pnp0/00:01/firmware_node/hid:PNP0C02
> ...
>
> so this PNP0C02 is
>
> [0.363943] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)

My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
there the BAR is at a completely different address. Same thing on my Haswell
desktop system.

As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
They hang if I type make in my kernel tree, whereas 3.14-rc3 is stable. I am
not so sure this is all related to the uncore IMC support, though.

> @Rafael, can you please make sense of this whole ACPI gunk?
>
> We have a resource conflict with pnp 00:01, analysis here:
> http://lkml.kernel.org/r/20140226092903.ga22...@pd.tnic
>
> This is the rest of the 00:01 info from sysfs:
>
> /sys/devices/pnp0/00:01/firmware_node/uid:0
> /sys/devices/pnp0/00:01/firmware_node/path:\_SB_.PCI0.LPC_.SIO_
> /sys/devices/pnp0/00:01/firmware_node/power/control:auto
> /sys/devices/pnp0/00:01/firmware_node/power/runtime_active_time:0
> /sys/devices/pnp0/00:01/firmware_node/power/runtime_status:unsupported
> /sys/devices/pnp0/00:01/firmware_node/power/runtime_suspended_time:0
> /sys/devices/pnp0/00:01/firmware_node/modalias:acpi:PNP0C02:
> /sys/devices/pnp0/00:01/firmware_node/uevent:MODALIAS=acpi:PNP0C02:
> /sys/devices/pnp0/00:01/id:PNP0c02
> /sys/devices/pnp0/00:01/power/control:auto
> /sys/devices/pnp0/00:01/power/runtime_active_time:0
> /sys/devices/pnp0/00:01/power/runtime_status:unsupported
> /sys/devices/pnp0/00:01/power/runtime_suspended_time:0
> /sys/devices/pnp0/00:01/resources:state = active
> /sys/devices/pnp0/00:01/resources:io 0x10-0x1f
> /sys/devices/pnp0/00:01/resources:io 0x90-0x9f
> /sys/devices/pnp0/00:01/resources:io 0x24-0x25
> /sys/devices/pnp0/00:01/resources:io 0x28-0x29
> /sys/devices/pnp0/00:01/resources:io 0x2c-0x2d
> /sys/devices/pnp0/00:01/resources:io 0x30-0x31
> /sys/devices/pnp0/00:01/resources:io 0x34-0x35
> /sys/devices/pnp0/00:01/resources:io 0x38-0x39
> /sys/devices/pnp0/00:01/resources:io 0x3c-0x3d
> /sys/devices/pnp0/00:01/resources:io 0xa4-0xa5
> /sys/devices/pnp0/00:01/resources:io 0xa8-0xa9
> /sys/devices/pnp0/00:01/resources:io 0xac-0xad
> /sys/devices/pnp0/00:01/resources:io 0xb0-0xb5
> /sys/devices/pnp0/00:01/resources:io 0xb8-0xb9
> /sys/devices/pnp0/00:01/resources:io 0xbc-0xbd
> /sys/devices/pnp0/00:01/resources:io 0x50-0x53
> /sys/devices/pnp0/00:01/resources:io 0x72-0x77
> /sys/devices/pnp0/00:01/resources:io 0x400-0x47f
> /sys/devices/pnp0/00:01/resources:io 0x500-0x57f
> /sys/devices/pnp0/00:01/resources:io 0x800-0x80f
> /sys/devices/pnp0/00:01/resources:io 0x15e0-0x15ef
> /sys/devices/pnp0/00:01/resources:io 0x1600-0x167f
> /sys/devices/pnp0/00:01/resources:mem 0xf800-0xfbff
> /sys/devices/pnp0/00:01/resources:mem 0xf000-0x
> /sys/devices/pnp0/00:01/resources:mem 0xfed1c000-0xfed1
> /sys/devices/pnp0/00:01/resources:mem 0xfed1-0xfed13fff
> /sys/devices/pnp0/00:01/resources:mem 0xfed18000-0xfed18fff
> /sys/devices/pnp0/00:01/resources:mem 0xfed19000-0xfed19fff
> /sys/devices/pnp0/00:01/resources:mem 0xfed45000-0xfed4bfff
> /sys/devices/pnp0/00:01/resources:mem 0xfed4-0xfed44fff
> /sys/devices/pnp0/00:01/subsystem/drivers_autoprobe:1
> /sys/devices/pnp0/00:01/uevent:DRIVER=system
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
> there the BAR is at a completely different address. Same thing on my
> Haswell desktop system.

Hrrm, I'd like to see what Rafael finds out, whether what we're reading
from PCI config space is even sane.

> As an aside, my SNB and HSW desktops with 3.14-rc4 are totally
> unstable. They hang if I type make in my kernel tree, whereas 3.14-rc3
> is stable. I am not so sure this is all related to the uncore IMC
> support, though.

Easy to test - just disable the uncore thing.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
> They hang if I type make in my kernel tree, whereas 3.14-rc3 is stable. I am
> not so sure this is all related to the uncore IMC support, though.

Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
up soon.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra pet...@infradead.org wrote:
> On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> > As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
> > They hang if I type make in my kernel tree, whereas 3.14-rc3 is stable. I am
> > not so sure this is all related to the uncore IMC support, though.
>
> Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
> patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
> up soon.

Yes, I mean from tip.git.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:32:58AM +0100, Stephane Eranian wrote:
> On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra pet...@infradead.org wrote:
> > On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> > > As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
> > > They hang if I type make in my kernel tree, whereas 3.14-rc3 is stable. I am
> > > not so sure this is all related to the uncore IMC support, though.
> >
> > Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
> > patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
> > up soon.
>
> Yes, I mean from tip.git.

lkml.kernel.org/r/20140224121218.gr15...@twins.programming.kicks-ass.net

Should cure things; unless there's more borkage.

Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 12:08 PM, Peter Zijlstra pet...@infradead.org wrote:
> On Thu, Feb 27, 2014 at 11:32:58AM +0100, Stephane Eranian wrote:
> > On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra pet...@infradead.org wrote:
> > > On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> > > > As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
> > > > They hang if I type make in my kernel tree, whereas 3.14-rc3 is stable. I am
> > > > not so sure this is all related to the uncore IMC support, though.
> > >
> > > Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
> > > patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
> > > up soon.
> >
> > Yes, I mean from tip.git.
>
> lkml.kernel.org/r/20140224121218.gr15...@twins.programming.kicks-ass.net
>
> Should cure things; unless there's more borkage.

Works again now with your patch.
Thanks.

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-02-27 Thread Rafael J. Wysocki

On Thursday, February 27, 2014 11:27:22 AM Borislav Petkov wrote:
> On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> > My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
> > there the BAR is at a completely different address. Same thing on my
> > Haswell desktop system.
>
> Hrrm, I'd like to see what Rafael finds out, whether what we're reading
> from PCI config space is even sane.

I won't be able to look at that before Monday I'm afraid (personal stuff).

Rafael


Re: Info: mapping multiple BARs. Your kernel is fine.

On Thu, Feb 27, 2014 at 11:12:17PM +0100, Rafael J. Wysocki wrote:
> I won't be able to look at that before Monday I'm afraid (personal
> stuff).

No worries, sir, whenever. It can wait.

Thanks a lot!

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Feb 26, 2014 at 02:57:16PM +0100, Rafael J. Wysocki wrote:
> On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> > This started happening this morning after booting -rc4+tip, let's
> > add *everybody* to CC :-)
> 
> What about -rc4 without tip?

I don't think so because

commit b9e1ab6d4c0582cad97699285a6b3cf992251b00
Author: Stephane Eranian 
Date:   Tue Feb 11 16:20:12 2014 +0100

perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

in -tip introduces that snb_uncore_imc_init_box() thing which causes the
ioremap conflict.

Btw, see my last email on this thread for more details about what I'm
seeing here.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-02-26 Thread Peter Zijlstra

On Wed, Feb 26, 2014 at 02:57:16PM +0100, Rafael J. Wysocki wrote:
> On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> > This started happening this morning after booting -rc4+tip, let's
> > add *everybody* to CC :-)
> 
> What about -rc4 without tip?

The driver causing this is new and lives in -tip.

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-02-26 Thread Rafael J. Wysocki

On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> This started happening this morning after booting -rc4+tip, let's
> add *everybody* to CC :-)

What about -rc4 without tip?

> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe and
> other goodies on the stack.
> 
> ...
> [0.488998] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
> [0.489975] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
> [0.490079] [ cut here ]
> [0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
> [0.490306] Info: mapping multiple BARs. Your kernel is fine.
> [0.490371] Modules linked in:
> [0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #1
> [0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
> [0.490742]  00ab 880213d01ad8 816112e3 0006
> [0.491032]  880213d01b28 880213d01b18 8104e9bc 880213d01b08
> [0.491343]  c9c58000 fed1 fed1 6000
> [0.491631] Call Trace:
> [0.493337]  [] dump_stack+0x4f/0x7c
> [0.493420]  [] warn_slowpath_common+0x8c/0xc0
> [0.493503]  [] warn_slowpath_fmt+0x46/0x50
> [0.493588]  [] __ioremap_caller+0x372/0x380
> [0.493674]  [] ? snb_uncore_imc_init_box+0x62/0x90
> [0.493761]  [] ioremap_nocache+0x17/0x20
> [0.493846]  [] snb_uncore_imc_init_box+0x62/0x90
> [0.493933]  [] uncore_pci_probe+0xe5/0x1e0
> [0.494020]  [] local_pci_probe+0x4e/0xa0
> [0.494104]  [] ? get_device+0x19/0x20
> [0.494213]  [] pci_device_probe+0xe1/0x130
> [0.494300]  [] driver_probe_device+0x7b/0x240
> [0.494385]  [] __driver_attach+0xab/0xb0
> [0.494469]  [] ? driver_probe_device+0x240/0x240
> [0.494551]  [] bus_for_each_dev+0x5e/0x90
> [0.494634]  [] driver_attach+0x1e/0x20
> [0.494718]  [] bus_add_driver+0x117/0x230
> [0.494802]  [] driver_register+0x64/0xf0
> [0.494884]  [] __pci_register_driver+0x64/0x70
> [0.494972]  [] ? uncore_types_init+0x19c/0x19c
> [0.495056]  [] intel_uncore_init+0x177/0x41c
> [0.495155]  [] ? uncore_types_init+0x19c/0x19c
> [0.495242]  [] do_one_initcall+0x4e/0x170
> [0.495326]  [] ? parse_args+0x60/0x360
> [0.495411]  [] kernel_init_freeable+0x106/0x19a
> [0.495497]  [] ? do_early_param+0x86/0x86
> [0.495582]  [] ? rest_init+0xd0/0xd0
> [0.495666]  [] kernel_init+0xe/0xf0
> [0.495749]  [] ret_from_fork+0x7c/0xb0
> [0.495831]  [] ? rest_init+0xd0/0xd0
> [0.495921] ---[ end trace 428f365c054d9a01 ]---
> [0.496196] RAPL PMU detected, hw unit 2^-16 Joules, API unit is 2^-32 Joules, 3 fixed counters 163840 ms ovfl timer
> [0.498598] futex hash table entries: 1024 (order: 5, 131072 bytes)
> [0.498833] audit: initializing netlink subsys (disabled)
> [0.499024] audit: type=2000 audit(1393259866.477:1): initialized
> ...
> 
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: Info: mapping multiple BARs. Your kernel is fine.

Can you please, pretty please, not top-post...

On Wed, Feb 26, 2014 at 10:47:05AM +0100, Stephane Eranian wrote:
> Hi,
> 
> Ok, so I am getting the same error message as you.
> I checked my syslog now.
> 
> I have my uncore_imc addr=0xfed1 (after masking)
> 
> And I also have pnp 00:01 overlapping the imc range completely.
> 
> What pnp device does  it really represent? the DRAM controller?
> 
> So I think my laptop behaves like yours.

grep -Er . /sys/devices/pnp0/00\:01/* 2>/dev/null
/sys/devices/pnp0/00:01/firmware_node/hid:PNP0C02
...

so this PNP0C02 is 

[0.363943] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)

@Rafael, can you please make sense of this whole ACPI gunk?

We have a resource conflict with pnp 00:01, analysis here:
http://lkml.kernel.org/r/20140226092903.ga22...@pd.tnic

This is the rest of the 00:01 info from sysfs:

/sys/devices/pnp0/00:01/firmware_node/uid:0
/sys/devices/pnp0/00:01/firmware_node/path:\_SB_.PCI0.LPC_.SIO_
/sys/devices/pnp0/00:01/firmware_node/power/control:auto
/sys/devices/pnp0/00:01/firmware_node/power/runtime_active_time:0
/sys/devices/pnp0/00:01/firmware_node/power/runtime_status:unsupported
/sys/devices/pnp0/00:01/firmware_node/power/runtime_suspended_time:0
/sys/devices/pnp0/00:01/firmware_node/modalias:acpi:PNP0C02:
/sys/devices/pnp0/00:01/firmware_node/uevent:MODALIAS=acpi:PNP0C02:
/sys/devices/pnp0/00:01/id:PNP0c02
/sys/devices/pnp0/00:01/power/control:auto
/sys/devices/pnp0/00:01/power/runtime_active_time:0
/sys/devices/pnp0/00:01/power/runtime_status:unsupported
/sys/devices/pnp0/00:01/power/runtime_suspended_time:0
/sys/devices/pnp0/00:01/resources:state = active
/sys/devices/pnp0/00:01/resources:io 0x10-0x1f
/sys/devices/pnp0/00:01/resources:io 0x90-0x9f
/sys/devices/pnp0/00:01/resources:io 0x24-0x25
/sys/devices/pnp0/00:01/resources:io 0x28-0x29
/sys/devices/pnp0/00:01/resources:io 0x2c-0x2d
/sys/devices/pnp0/00:01/resources:io 0x30-0x31
/sys/devices/pnp0/00:01/resources:io 0x34-0x35
/sys/devices/pnp0/00:01/resources:io 0x38-0x39
/sys/devices/pnp0/00:01/resources:io 0x3c-0x3d
/sys/devices/pnp0/00:01/resources:io 0xa4-0xa5
/sys/devices/pnp0/00:01/resources:io 0xa8-0xa9
/sys/devices/pnp0/00:01/resources:io 0xac-0xad
/sys/devices/pnp0/00:01/resources:io 0xb0-0xb5
/sys/devices/pnp0/00:01/resources:io 0xb8-0xb9
/sys/devices/pnp0/00:01/resources:io 0xbc-0xbd
/sys/devices/pnp0/00:01/resources:io 0x50-0x53
/sys/devices/pnp0/00:01/resources:io 0x72-0x77
/sys/devices/pnp0/00:01/resources:io 0x400-0x47f
/sys/devices/pnp0/00:01/resources:io 0x500-0x57f
/sys/devices/pnp0/00:01/resources:io 0x800-0x80f
/sys/devices/pnp0/00:01/resources:io 0x15e0-0x15ef
/sys/devices/pnp0/00:01/resources:io 0x1600-0x167f
/sys/devices/pnp0/00:01/resources:mem 0xf800-0xfbff
/sys/devices/pnp0/00:01/resources:mem 0xf000-0x
/sys/devices/pnp0/00:01/resources:mem 0xfed1c000-0xfed1
/sys/devices/pnp0/00:01/resources:mem 0xfed1-0xfed13fff
/sys/devices/pnp0/00:01/resources:mem 0xfed18000-0xfed18fff
/sys/devices/pnp0/00:01/resources:mem 0xfed19000-0xfed19fff
/sys/devices/pnp0/00:01/resources:mem 0xfed45000-0xfed4bfff
/sys/devices/pnp0/00:01/resources:mem 0xfed4-0xfed44fff
/sys/devices/pnp0/00:01/subsystem/drivers_autoprobe:1
/sys/devices/pnp0/00:01/uevent:DRIVER=system
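
For anyone who wants to compare the pnp 00:01 memory windows against another address, the grep output above can be turned into numeric ranges. The following is only a rough Python sketch, not something from the thread: the names parse_resources and mem_ranges are made up for illustration, and the sample uses just a few lines of the dump whose addresses are intact in this archive.

```python
import re

# Sample lines copied from the sysfs dump above. Only entries whose
# addresses are intact in the archive are used.
SAMPLE = """\
/sys/devices/pnp0/00:01/resources:state = active
/sys/devices/pnp0/00:01/resources:io 0x400-0x47f
/sys/devices/pnp0/00:01/resources:mem 0xfed18000-0xfed18fff
/sys/devices/pnp0/00:01/resources:mem 0xfed45000-0xfed4bfff
"""

# Matches "resources:io 0xSTART-0xEND" and "resources:mem 0xSTART-0xEND".
LINE_RE = re.compile(r"resources:(io|mem) 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)\s*$")


def parse_resources(text):
    """Return (kind, start, end) tuples for every io/mem line in text."""
    found = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            found.append((m.group(1), int(m.group(2), 16), int(m.group(3), 16)))
    return found


def mem_ranges(text):
    """Keep only the memory ranges, as (start, end) pairs."""
    return [(start, end) for kind, start, end in parse_resources(text) if kind == "mem"]


if __name__ == "__main__":
    for start, end in mem_ranges(SAMPLE):
        print(f"mem {start:#010x}-{end:#010x} size {end - start + 1:#x}")
```

Lines that do not carry a complete start and end address, such as the "state = active" line, are simply skipped by the regular expression.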

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-02-26 Thread Stephane Eranian

Hi,

Ok, so I am getting the same error message as you.
I checked my syslog now.

I have my uncore_imc addr=0xfed1 (after masking)

And I also have pnp 00:01 overlapping the imc range completely.

What pnp device does  it really represent? the DRAM controller?

So I think my laptop behaves like yours.

On Wed, Feb 26, 2014 at 10:29 AM, Borislav Petkov  wrote:
> On Wed, Feb 26, 2014 at 07:56:58AM +0100, Stephane Eranian wrote:
>> > Also IVB, model 58?
>> >
>> Yes.
>
> Right, so it must be chipset-specific.
>
>> > Dunno. What do you mean by "pm callbacks" exactly? I don't know that
>> > code so I have to ask.
>> >
>> power management callbacks.
>
> Ok, just as I thought. But why would they be relevant if this happens
> very early during boot?
>
>> > #define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154
>> Yes. Needs to point to the DRAM controller.
>
> It seems I have it :-)
>
> $ lspci -xxx -s 00.0
> 00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
> 00: 86 80 54 01 06 00 90 20 09 00 00 06 00 00 00 00
>           ^^^^^
>
> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 fa 21
> 30: 00 00 00 00 e0 00 00 00 00 00 00 00 00 00 00 00
> 40: 01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00
> 50: 11 02 00 00 11 00 00 00 07 00 90 df 01 00 00 db
> 60: 05 00 00 f8 00 00 00 00 01 80 d1 fe 00 00 00 00
> 70: 00 00 00 fe 01 00 00 00 00 0c 00 fe 7f 00 00 00
> 80: 10 11 11 11 11 11 11 00 1a 00 00 00 00 00 00 00
> 90: 01 00 00 fe 01 00 00 00 01 00 50 1e 02 00 00 00
> a0: 01 00 00 00 02 00 00 00 01 00 60 1e 02 00 00 00
> b0: 01 00 a0 db 01 00 80 db 01 00 00 db 01 00 a0 df
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 09 00 0c 01 9b 61 00 e2 d0 00 e8 76 00 00 00 00
> f0: 00 00 00 01 00 00 00 00 c8 0f 09 00 00 00 00 00
>
> Anyway, here's some more debugging output and some more staring:
>
> So we're correctly getting 0x154 and then snb_uncore_imc_init_box()
> tries to ioremap 0xfed1 but this fails the resource map check with:
>
> [0.485356] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
>
> and the pnp 00:01 device already partially occupies that range (from
> /proc/iomem):
>
>   fed1-fed13fff : pnp 00:01
>
> Oh, and snb_uncore_imc_init_box() gets that address from
> SNB_UNCORE_PCI_IMC_BAR_OFFSET and SNB_UNCORE_PCI_IMC_BAR_OFFSET+4 and
> they start at offset 0x48 in the PCI config space above, i.e.
>
> 40: 01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00
>                             ^^^^^^^^^^^^^^^^^^^^^^^
>
> which is 0x00fed10001 (the 0x1 bit disappears after addr &= ~(PAGE_SIZE - 1);)
>
> So I'm guessing it is time to talk to platform guys and ask them why
> they're putting SNB_UNCORE_PCI_IMC_BAR_OFFSET{,+4} in an overlapping
> range with pnp 00:01.
>
> [0.484023] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> [0.484108] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
> [0.484971] DBG: will get device: 0x8086:154
> [0.485054] DBG: Got device, bus: 0x0
> [0.485254] DBG: ioremapping addr: 0xfed1
> [0.485356] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
> [0.485460] [ cut here ]
> [0.485544] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
> [0.485643] Info: mapping multiple BARs. Your kernel is fine.
> [0.485709] Modules linked in:
> [0.485935] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #6
> [0.486019] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
> [0.486117]  00ab 880213d01ad8 81611339 0006
> [0.486411]  880213d01b28 880213d01b18 8104e9cc 880213d01b08
> [0.488308]  c9c58000 fed1 fed1 6000
> [0.488595] Call Trace:
> [0.488671]  [] dump_stack+0x4f/0x7c
> [0.488754]  [] warn_slowpath_common+0x8c/0xc0
> [0.488877]  [] warn_slowpath_fmt+0x46/0x50
> [0.488966]  [] __ioremap_caller+0x372/0x380
> [0.489052]  [] ? snb_uncore_imc_init_box+0x76/0xa0
> [0.489137]  [] ioremap_nocache+0x17/0x20
> [0.489221]  [] snb_uncore_imc_init_box+0x76/0xa0
> [0.489307]  [] uncore_pci_probe+0xe5/0x1e0
> [0.489391]  [] local_pci_probe+0x4e/0xa0
> [0.489474]  [] ? get_device+0x19/0x20
> [0.489558]  [] pci_device_probe+0xe1/0x130
> [0.489642]  [] driver_probe_device+0x7b/0x240
> [0.489726]  [] __driver_attach+0xab/0xb0
> [0.489834]  [] ? driver_probe_device+0x240/0x240
> [0.489920]  [] bus_for_each_dev+0x5e/0x90
> [0.490003]  [] driver_attach+0x1e/0x20
> [0.490086]  [] bus_add_driver+0x117/0x230
> [0.490170]  [] driver_register+0x64/0xf0
> [0.490251]  []

Re: Info: mapping multiple BARs. Your kernel is fine.

On Wed, Feb 26, 2014 at 07:56:58AM +0100, Stephane Eranian wrote:
> > Also IVB, model 58?
> >
> Yes.

Right, so it must be chipset-specific.

> > Dunno. What do you mean by "pm callbacks" exactly? I don't know that
> > code so I have to ask.
> >
> power management callbacks.

Ok, just as I thought. But why would they be relevant if this happens
very early during boot?

> > #define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154
> Yes. Needs to point to the DRAM controller.

It seems I have it :-)

$ lspci -xxx -s 00.0
00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
00: 86 80 54 01 06 00 90 20 09 00 00 06 00 00 00 00
          ^^^^^

10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 fa 21
30: 00 00 00 00 e0 00 00 00 00 00 00 00 00 00 00 00
40: 01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00
50: 11 02 00 00 11 00 00 00 07 00 90 df 01 00 00 db
60: 05 00 00 f8 00 00 00 00 01 80 d1 fe 00 00 00 00
70: 00 00 00 fe 01 00 00 00 00 0c 00 fe 7f 00 00 00
80: 10 11 11 11 11 11 11 00 1a 00 00 00 00 00 00 00
90: 01 00 00 fe 01 00 00 00 01 00 50 1e 02 00 00 00
a0: 01 00 00 00 02 00 00 00 01 00 60 1e 02 00 00 00
b0: 01 00 a0 db 01 00 80 db 01 00 00 db 01 00 a0 df
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 09 00 0c 01 9b 61 00 e2 d0 00 e8 76 00 00 00 00
f0: 00 00 00 01 00 00 00 00 c8 0f 09 00 00 00 00 00
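
As a side note for readers, the first row of the dump can be decoded by hand to confirm the 0x0154 device ID. Below is a small Python sketch of that decoding, not code from the thread. It relies only on PCI config space being little-endian, with the vendor ID at offset 0 and the device ID at offset 2; the helper name vendor_device is made up for illustration.

```python
import struct

# Row "00:" of the lspci -xxx dump above, i.e. config-space bytes 0x00-0x0f.
ROW_00 = bytes.fromhex(
    "86 80 54 01 06 00 90 20 09 00 00 06 00 00 00 00".replace(" ", "")
)


def vendor_device(cfg):
    """Return (vendor_id, device_id) from the start of PCI config space."""
    # Two little-endian 16-bit words: vendor ID at offset 0, device ID at offset 2.
    return struct.unpack_from("<HH", cfg, 0)


if __name__ == "__main__":
    vendor, device = vendor_device(ROW_00)
    print(f"{vendor:04x}:{device:04x}")
```

The bytes "86 80" and "54 01" come out as 8086:0154, which matches PCI_DEVICE_ID_INTEL_IVB_IMC as quoted above.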

Anyway, here's some more debugging output and some more staring:

So we're correctly getting 0x154 and then snb_uncore_imc_init_box()
tries to ioremap 0xfed1 but this fails the resource map check with:

[0.485356] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01

and the pnp 00:01 device already partially occupies that range (from
/proc/iomem):

  fed1-fed13fff : pnp 00:01
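
To make the conflict concrete, here is a simplified Python model of the kind of test that produces the message. It is a sketch under assumptions, not the kernel's code: it assumes the check objects when a requested mapping overlaps an existing resource without lying entirely inside it (compared per page), that the truncated 0xfed1 values in this archive stand for 0xfed10000, and that the mapping is 0x6000 bytes long (inferred from the 0xfed15fff end address). The name spans_resource_boundary is made up for illustration.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def pfn(addr):
    """Page frame number of a physical address."""
    return addr >> PAGE_SHIFT


def spans_resource_boundary(addr, size, res_start, res_end):
    """True if [addr, addr + size) overlaps the resource but is not
    fully contained in it, compared at page granularity."""
    last = addr + size - 1
    if res_start > last or res_end < addr:
        return False  # no overlap at all
    if pfn(res_start) <= pfn(addr) and pfn(res_end) >= pfn(last):
        return False  # mapping sits entirely inside the resource
    return True


if __name__ == "__main__":
    # Assumed values: 0xfed10000 for the truncated addresses, size 0x6000.
    print(spans_resource_boundary(0xFED10000, 0x6000, 0xFED10000, 0xFED13FFF))
```

With those assumed numbers the mapping would start inside the pnp 00:01 window and run 0x2000 bytes past its end, which is the "partially occupies" situation described above.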

Oh, and snb_uncore_imc_init_box() gets that address from
SNB_UNCORE_PCI_IMC_BAR_OFFSET and SNB_UNCORE_PCI_IMC_BAR_OFFSET+4 and
they start at offset 0x48 in the PCI config space above, i.e.

40: 01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00
                            ^^^^^^^^^^^^^^^^^^^^^^^

which is 0x00fed10001 (the 0x1 bit disappears after addr &= ~(PAGE_SIZE - 1);)

So I'm guessing it is time to talk to platform guys and ask them why
they're putting SNB_UNCORE_PCI_IMC_BAR_OFFSET{,+4} in an overlapping
range with pnp 00:01.
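
The arithmetic in the paragraphs above can be replayed outside the kernel. This is a minimal Python sketch, not the driver code: it takes the "40:" row of the lspci dump, reads the two 32-bit little-endian words at offsets 0x48 and 0x4c, combines them and applies the page mask. PAGE_SIZE is assumed to be 4096, and the helper name imc_bar is made up for illustration.

```python
import struct

PAGE_SIZE = 4096   # assumption: 4 KiB pages
BAR_OFFSET = 0x48  # SNB_UNCORE_PCI_IMC_BAR_OFFSET, per the message above
ROW_BASE = 0x40    # the row below holds config-space bytes 0x40-0x4f

# Row "40:" of the lspci -xxx dump.
ROW_40 = bytes.fromhex(
    "01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00".replace(" ", "")
)


def imc_bar(row):
    """Return (raw, page_aligned) for the 64-bit value stored at 0x48."""
    # Low dword at 0x48, high dword at 0x4c, both little-endian.
    lo, hi = struct.unpack_from("<II", row, BAR_OFFSET - ROW_BASE)
    raw = (hi << 32) | lo
    return raw, raw & ~(PAGE_SIZE - 1)


if __name__ == "__main__":
    raw, aligned = imc_bar(ROW_40)
    print(f"raw={raw:#x} aligned={aligned:#x}")
```

The raw value is 0xfed10001 and the mask clears the low 0x1 bit, which is the address the driver then hands to ioremap.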

[0.484023] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[0.484108] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at [8800cac3-8800cec2]
[0.484971] DBG: will get device: 0x8086:154
[0.485054] DBG: Got device, bus: 0x0
[0.485254] DBG: ioremapping addr: 0xfed1
[0.485356] resource map sanity check conflict: 0xfed1 0xfed15fff 0xfed1 0xfed13fff pnp 00:01
[0.485460] [ cut here ]
[0.485544] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
[0.485643] Info: mapping multiple BARs. Your kernel is fine.
[0.485709] Modules linked in:
[0.485935] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #6
[0.486019] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
[0.486117]  00ab 880213d01ad8 81611339 0006
[0.486411]  880213d01b28 880213d01b18 8104e9cc 880213d01b08
[0.488308]  c9c58000 fed1 fed1 6000
[0.488595] Call Trace:
[0.488671]  [] dump_stack+0x4f/0x7c
[0.488754]  [] warn_slowpath_common+0x8c/0xc0
[0.488877]  [] warn_slowpath_fmt+0x46/0x50
[0.488966]  [] __ioremap_caller+0x372/0x380
[0.489052]  [] ? snb_uncore_imc_init_box+0x76/0xa0
[0.489137]  [] ioremap_nocache+0x17/0x20
[0.489221]  [] snb_uncore_imc_init_box+0x76/0xa0
[0.489307]  [] uncore_pci_probe+0xe5/0x1e0
[0.489391]  [] local_pci_probe+0x4e/0xa0
[0.489474]  [] ? get_device+0x19/0x20
[0.489558]  [] pci_device_probe+0xe1/0x130
[0.489642]  [] driver_probe_device+0x7b/0x240
[0.489726]  [] __driver_attach+0xab/0xb0
[0.489834]  [] ? driver_probe_device+0x240/0x240
[0.489920]  [] bus_for_each_dev+0x5e/0x90
[0.490003]  [] driver_attach+0x1e/0x20
[0.490086]  [] bus_add_driver+0x117/0x230
[0.490170]  [] driver_register+0x64/0xf0
[0.490251]  [] __pci_register_driver+0x64/0x70
[0.490337]  [] ? uncore_types_init+0x19c/0x19c
[0.490421]  [] intel_uncore_init+0x196/0x462
[0.490504]  [] ? uncore_types_init+0x19c/0x19c
[0.490591]  [] do_one_initcall+0x4e/0x170
[0.490676]  [] ? parse_args+0x50/0x360
[0.490762]  [] kernel_init_freeable+0x106/0x19a
[0.490863]  [] ? do_early_param+0x86/0x86
[0.490948]  [] ? rest_init+0xd0/0xd0
[0.491032]  [] kernel_init+0xe/0xf0
[0.491116]  [] ret_from_fork+0x7c/0xb0
[0.491199]  [] ? rest_init+0xd0/0xd0
[0.491289] ---[ end trace b31a7f760e34b24a

Re: Info: mapping multiple BARs. Your kernel is fine.

2014-02-26 Thread Stephane Eranian

Hi,

Ok, so I am getting the same error message as you.
I checked my syslog now.

I have my uncore_imc addr=0xfed10000 (after masking)

And I also have pnp 00:01 overlapping the imc range completely.

What pnp device does it really represent? The DRAM controller?

So I think my laptop behaves like yours.

On Wed, Feb 26, 2014 at 10:29 AM, Borislav Petkov b...@alien8.de wrote:
 On Wed, Feb 26, 2014 at 07:56:58AM +0100, Stephane Eranian wrote:
  Also IVB, model 58?
 
 Yes.

 Right, so it must be chipset-specific.

  Dunno. What do you mean by pm callbacks exactly? I don't know that
  code so I have to ask.
 
 power management callbacks.

 Ok, just as I thought. But why would they be relevant if this happens
 very early during boot?

  #define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154
 Yes. Needs to point to the DRAM controller.

 It seems I have it :-)

 $ lspci -xxx -s 00.0
 00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller 
 (rev 09)
 00: 86 80 54 01 06 00 90 20 09 00 00 06 00 00 00 00
           ^^^^^

 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 fa 21
 30: 00 00 00 00 e0 00 00 00 00 00 00 00 00 00 00 00
 40: 01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00
 50: 11 02 00 00 11 00 00 00 07 00 90 df 01 00 00 db
 60: 05 00 00 f8 00 00 00 00 01 80 d1 fe 00 00 00 00
 70: 00 00 00 fe 01 00 00 00 00 0c 00 fe 7f 00 00 00
 80: 10 11 11 11 11 11 11 00 1a 00 00 00 00 00 00 00
 90: 01 00 00 fe 01 00 00 00 01 00 50 1e 02 00 00 00
 a0: 01 00 00 00 02 00 00 00 01 00 60 1e 02 00 00 00
 b0: 01 00 a0 db 01 00 80 db 01 00 00 db 01 00 a0 df
 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 e0: 09 00 0c 01 9b 61 00 e2 d0 00 e8 76 00 00 00 00
 f0: 00 00 00 01 00 00 00 00 c8 0f 09 00 00 00 00 00

 Anyway, here's some more debugging output and some more staring:

 So we're correctly getting 0x154 and then snb_uncore_imc_init_box()
 tries to ioremap 0xfed10000 but this fails the resource map check with:

 [0.485356] resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01

 and the pnp 00:01 device already partially occupies that range (from
 /proc/iomem):

   fed10000-fed13fff : pnp 00:01

 Oh, and snb_uncore_imc_init_box() gets that address from
 SNB_UNCORE_PCI_IMC_BAR_OFFSET and SNB_UNCORE_PCI_IMC_BAR_OFFSET+4 and
 they start at offset 0x48 in the PCI config space above, i.e.

 40: 01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00
                             ^^^^^^^^^^^^^^^^^^^^^^^

 which is 0x00000000fed10001 (the 0x1 bit disappears after
 addr &= ~(PAGE_SIZE - 1);)

 So I'm guessing it is time to talk to platform guys and ask them why
 they're putting SNB_UNCORE_PCI_IMC_BAR_OFFSET{,+4} in an overlapping
 range with pnp 00:01.

 [0.484023] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
 [0.484108] software IO TLB [mem 0xcac3-0xcec3] (64MB) mapped at 
 [8800cac3-8800cec2]
 [0.484971] DBG: will get device: 0x8086:154
 [0.485054] DBG: Got device, bus: 0x0
 [0.485254] DBG: ioremapping addr: 0xfed10000
 [0.485356] resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
 [0.485460] [ cut here ]
 [0.485544] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 
 __ioremap_caller+0x372/0x380()
 [0.485643] Info: mapping multiple BARs. Your kernel is fine.
 [0.485709] Modules linked in:
 [0.485935] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #6
 [0.486019] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 
 11/13/2012
 [0.486117]  00ab 880213d01ad8 81611339 
 0006
 [0.486411]  880213d01b28 880213d01b18 8104e9cc 
 880213d01b08
 [0.488308]  c9c58000 fed1 fed1 
 6000
 [0.488595] Call Trace:
 [0.488671]  [81611339] dump_stack+0x4f/0x7c
 [0.488754]  [8104e9cc] warn_slowpath_common+0x8c/0xc0
 [0.488877]  [8104eab6] warn_slowpath_fmt+0x46/0x50
 [0.488966]  [8103f1f2] __ioremap_caller+0x372/0x380
 [0.489052]  [810211b6] ? snb_uncore_imc_init_box+0x76/0xa0
 [0.489137]  [8103f257] ioremap_nocache+0x17/0x20
 [0.489221]  [810211b6] snb_uncore_imc_init_box+0x76/0xa0
 [0.489307]  [81022935] uncore_pci_probe+0xe5/0x1e0
 [0.489391]  [812d488e] local_pci_probe+0x4e/0xa0
 [0.489474]  [81418a69] ? get_device+0x19/0x20
 [0.489558]  [812d5ce1] pci_device_probe+0xe1/0x130
 [0.489642]  [8141d3db] driver_probe_device+0x7b/0x240
 [0.489726]  [8141d64b] __driver_attach+0xab/0xb0
 [0.489834]  [8141d5a0] ? driver_probe_device+0x240/0x240
 [0.489920]  [8141b72e] bus_for_each_dev+0x5e/0x90
 [0.490003]  [8141ceee]

Re: Info: mapping multiple BARs. Your kernel is fine.