Re: [PATCH] pci: Rename pci_dev->untrusted to pci_dev->external

2021-04-20 Thread Bjorn Helgaas

On Tue, Apr 20, 2021 at 07:10:06AM +0100, Christoph Hellwig wrote:
> On Mon, Apr 19, 2021 at 05:30:49PM -0700, Rajat Jain wrote:
> > The current flag name "untrusted" is not correct as it is populated
> > using the firmware property "external-facing" for the parent ports. In
> > other words, the firmware only says which ports are external facing, so
> > the field really identifies the devices as external (vs internal).
> > 
> > Only field renaming. No functional change intended.
> 
> I don't think this is a good idea.  First the field should have been
> added to the generic struct device as requested multiple times before.

Fair point.  There isn't anything PCI-specific about this idea.  The
ACPI "ExternalFacingPort" and DT "external-facing" are currently only
defined for PCI devices, but could be applied elsewhere.

> Right now this requires horrible hacks in the IOMMU code to get at the
> pci_dev, and also doesn't scale to various other potential users.

Agreed, this is definitely suboptimal.  Do you have other users in
mind?  Maybe they could help inform the plan.

> Second, untrusted is objectively a better name, because untrusted
> is how we treat the device, which is what matters.  External is just
> how we come to that conclusion.

The decision to treat "external" as being "untrusted" is a little bit
of policy that the PCI core really doesn't care about, so I think it
does make some sense to let the places that *do* care decide what to
trust based on "external" and possibly other factors, e.g., whether
the device is a BMC or processes untrusted data, etc.

But I guess it makes sense to wait until we have a better motivation
before renaming it, since we don't gain any functionality here.

Bjorn
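
Note: the split described above, where the PCI core records the firmware fact ("external") and each consumer derives its own trust decision, can be sketched in plain C. This is a minimal user-space illustration, not kernel code; struct dev_facts, handles_untrusted_data, and consumer_trusts() are hypothetical names invented here, and the particular policy is only an assumption.

```c
#include <stdbool.h>

/* Hypothetical record of facts about a device (not a kernel structure). */
struct dev_facts {
	bool external;               /* sits below an external-facing port */
	bool handles_untrusted_data; /* assumed extra input, e.g. a BMC-like device */
};

/*
 * Example policy owned by a consumer such as an IOMMU layer: trust only
 * internal devices that do not handle untrusted data. Another consumer
 * could apply a different rule to the same facts.
 */
bool consumer_trusts(const struct dev_facts *f)
{
	return !f->external && !f->handles_untrusted_data;
}
```

The point of the sketch is only that "external" stays a plain fact and the trust decision is computed where it is needed.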

Re: [PATCH] PCI: acpiphp: Fixed coding style

2021-04-16 Thread Bjorn Helgaas

On Mon, Mar 01, 2021 at 12:51:45PM +0530, chakravarthikulkarni wrote:
> This commit fixes coding style for braces and comments.
> 
> Signed-off-by: chakravarthikulkarni 

Applied to pci/hotplug for v5.13, thanks!

I dropped the comment change because it's really one comment that
should remain connected, so it doesn't seem like an improvement to me
to add comment start/stop in the middle.

> ---
>  drivers/pci/hotplug/acpiphp.h | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/acpiphp.h b/drivers/pci/hotplug/acpiphp.h
> index a74b274a8c45..e0964600a78f 100644
> --- a/drivers/pci/hotplug/acpiphp.h
> +++ b/drivers/pci/hotplug/acpiphp.h
> @@ -80,8 +80,8 @@ struct acpiphp_bridge {
>  struct acpiphp_slot {
>   struct list_head node;
>   struct pci_bus *bus;
> - struct list_head funcs; /* one slot may have different
> -objects (i.e. for each function) */
> + struct list_head funcs; /* one slot may have different */
> + /* objects (i.e. for each function) */
>   struct slot *slot;
>  
>   u8  device; /* pci device# */
> @@ -148,8 +148,7 @@ static inline struct acpiphp_root_context *to_acpiphp_root_context(struct acpi_h
>   * ACPI has no generic method of setting/getting attention status
>   * this allows for device specific driver registration
>   */
> -struct acpiphp_attention_info
> -{
> +struct acpiphp_attention_info {
>   int (*set_attn)(struct hotplug_slot *slot, u8 status);
>   int (*get_attn)(struct hotplug_slot *slot, u8 *status);
>   struct module *owner;
> -- 
> 2.17.1
>

Re: [v9,2/7] PCI: Export pci_pio_to_address() for module use

2021-04-16 Thread Bjorn Helgaas

On Tue, Apr 13, 2021 at 10:53:05AM +0100, Lorenzo Pieralisi wrote:
> On Wed, Mar 24, 2021 at 10:09:42AM +0100, Pali Rohár wrote:
> > On Wednesday 24 March 2021 11:05:05 Jianjun Wang wrote:
> > > This interface will be used by PCI host drivers for PIO translation,
> > > export it to support compiling those drivers as kernel modules.
> > > 
> > > Signed-off-by: Jianjun Wang 
> > > ---
> > >  drivers/pci/pci.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 16a17215f633..12bba221c9f2 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -4052,6 +4052,7 @@ phys_addr_t pci_pio_to_address(unsigned long pio)
> > >  
> > >   return address;
> > >  }
> > > +EXPORT_SYMBOL(pci_pio_to_address);
> > 
> > Hello! I'm not sure if EXPORT_SYMBOL is correct because the file has a
> > GPL-2.0 header. Shouldn't only EXPORT_SYMBOL_GPL be used in this case?
> > Maybe other people would know what is correct?
> 
> I think this should be EXPORT_SYMBOL_GPL(), I can make this change
> but this requires Bjorn's ACK to go upstream (Bjorn, it is my fault,
> it was assigned to me on patchwork, now updated, please have a look).

Yep, looks good to me, and I agree it should be EXPORT_SYMBOL_GPL().

Acked-by: Bjorn Helgaas 

> > >  
> > >  unsigned long __weak pci_address_to_pio(phys_addr_t address)
> > >  {
> > > -- 
> > > 2.25.1
> > >

Re: [v9,0/7] PCI: mediatek: Add new generation controller support

2021-04-16 Thread Bjorn Helgaas

On Wed, Mar 24, 2021 at 11:05:03AM +0800, Jianjun Wang wrote:
> This patch series adds pcie-mediatek-gen3.c and a dt-bindings file to
> support the new generation PCIe controller.

Incidental: b4 doesn't work on this thread, I suspect because the
usual subject line format is:

  [PATCH v9 0/7]

instead of:

  [v9,0/7]

For b4 info, see https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/README.rst

Re: [PATCH V4] PCI: Add MCFG quirks for Tegra194 host controllers

2021-04-16 Thread Bjorn Helgaas

On Fri, Apr 16, 2021 at 07:15:37PM +0530, Vidya Sagar wrote:
> The PCIe controller in Tegra194 SoC is not completely ECAM-compliant.
> With the current hardware design limitations in place, ECAM can be enabled
> only for one controller (C5 controller to be precise) with bus numbers
> starting from 160 instead of 0. A different approach is taken to avoid this
> abnormal way of enabling ECAM for just one controller but to enable
> configuration space access for all the other controllers. In this approach,
> ops are added through MCFG quirk mechanism which access the configuration
> spaces by dynamically programming iATU (internal Address Translation Unit)
> to generate respective configuration accesses just like the way it is
> done in DesignWare core sub-system.
> This issue is specific to Tegra194 and it would be fixed in the future
> generations of Tegra SoCs.
> 
> Signed-off-by: Vidya Sagar 

Applied to pci/tegra for v5.13, thanks!

> ---
> V4:
> * Addressed Bjorn's review comments
> * Rebased changes on top of Lorenzo's pci/dwc branch
> 
> V3:
> * Removed MCFG address hardcoding in pci_mcfg.c file
> * Started using 'dbi_base' for accessing root port's own config space
> * and using 'config_base' for accessing config space of downstream hierarchy
> 
> V2:
> * Fixed build issues reported by kbuild test bot
> 
>  drivers/acpi/pci_mcfg.c|   7 ++
>  drivers/pci/controller/dwc/Makefile|   2 +-
>  drivers/pci/controller/dwc/pcie-tegra194.c | 103 +
>  include/linux/pci-ecam.h   |   1 +
>  4 files changed, 112 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/pci_mcfg.c b/drivers/acpi/pci_mcfg.c
> index 95f23acd5b80..53cab975f612 100644
> --- a/drivers/acpi/pci_mcfg.c
> +++ b/drivers/acpi/pci_mcfg.c
> @@ -116,6 +116,13 @@ static struct mcfg_fixup mcfg_quirks[] = {
>   THUNDER_ECAM_QUIRK(2, 12),
>   THUNDER_ECAM_QUIRK(2, 13),
>  
> + { "NVIDIA", "TEGRA194", 1, 0, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 1, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 2, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 3, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 4, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 5, MCFG_BUS_ANY, _pcie_ops},
> +
>  #define XGENE_V1_ECAM_MCFG(rev, seg) \
>   {"APM   ", "XGENE   ", rev, seg, MCFG_BUS_ANY, \
>   _v1_pcie_ecam_ops }
> diff --git a/drivers/pci/controller/dwc/Makefile b/drivers/pci/controller/dwc/Makefile
> index 625f6aaeb5b8..2da826ef18ac 100644
> --- a/drivers/pci/controller/dwc/Makefile
> +++ b/drivers/pci/controller/dwc/Makefile
> @@ -18,7 +18,6 @@ obj-$(CONFIG_PCIE_INTEL_GW) += pcie-intel-gw.o
>  obj-$(CONFIG_PCIE_KIRIN) += pcie-kirin.o
>  obj-$(CONFIG_PCIE_HISI_STB) += pcie-histb.o
>  obj-$(CONFIG_PCI_MESON) += pci-meson.o
> -obj-$(CONFIG_PCIE_TEGRA194) += pcie-tegra194.o
>  obj-$(CONFIG_PCIE_UNIPHIER) += pcie-uniphier.o
>  obj-$(CONFIG_PCIE_UNIPHIER_EP) += pcie-uniphier-ep.o
>  
> @@ -35,4 +34,5 @@ obj-$(CONFIG_PCIE_UNIPHIER_EP) += pcie-uniphier-ep.o
>  ifdef CONFIG_PCI
>  obj-$(CONFIG_ARM64) += pcie-al.o
>  obj-$(CONFIG_ARM64) += pcie-hisi.o
> +obj-$(CONFIG_ARM64) += pcie-tegra194.o
>  endif
> diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
> index 6fa216e52d14..cb38e94a3033 100644
> --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> @@ -22,6 +22,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -311,6 +313,104 @@ struct tegra_pcie_dw_of_data {
>   enum dw_pcie_device_mode mode;
>  };
>  
> +#if defined(CONFIG_ACPI) && defined(CONFIG_PCI_QUIRKS)
> +struct tegra194_pcie_ecam  {
> + void __iomem *config_base;
> + void __iomem *iatu_base;
> + void __iomem *dbi_base;
> +};
> +
> +static int tegra194_acpi_init(struct pci_config_window *cfg)
> +{
> + struct device *dev = cfg->parent;
> + struct tegra194_pcie_ecam *pcie_ecam;
> +
> + pcie_ecam = devm_kzalloc(dev, sizeof(*pcie_ecam), GFP_KERNEL);
> + if (!pcie_ecam)
> + return -ENOMEM;
> +
> + pcie_ecam->config_base = cfg->win;
> + pcie_ecam->iatu_base = cfg->win + SZ_256K;
> + pcie_ecam->dbi_base = cfg->win + SZ_512K;
> + cfg->priv = pcie_ecam;
> +
> + return 0;
> +}
> +
> +static void atu_reg_write(struct tegra194_pcie_ecam *pcie_ecam, int index,
> +   u32 val, u32 reg)
> +{
> + u32 offset = PCIE_GET_ATU_OUTB_UNR_REG_OFFSET(index);
> +
> + writel(val, pcie_ecam->iatu_base + offset + reg);
> +}
> +
> +static void program_outbound_atu(struct tegra194_pcie_ecam *pcie_ecam,
> +  int index, int type, u64 cpu_addr,
> +  u64 pci_addr, u64 size)
> +{
> + atu_reg_write(pcie_ecam, index, lower_32_bits(cpu_addr),
> +

Re: [PATCH] PCI: shpchp: remove unused function

2021-04-16 Thread Bjorn Helgaas

On Thu, Apr 15, 2021 at 04:30:22PM +0800, Jiapeng Chong wrote:
> Fix the following clang warning:
> 
> drivers/pci/hotplug/shpchp_hpc.c:177:20: warning: unused function
> 'shpc_writeb' [-Wunused-function].
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 

Applied to pci/hotplug for v5.13 with the following subject, thanks!

  PCI: shpchp: Remove unused shpc_writeb()

> ---
>  drivers/pci/hotplug/shpchp_hpc.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/shpchp_hpc.c b/drivers/pci/hotplug/shpchp_hpc.c
> index db04728..9e3b277 100644
> --- a/drivers/pci/hotplug/shpchp_hpc.c
> +++ b/drivers/pci/hotplug/shpchp_hpc.c
> @@ -174,11 +174,6 @@ static inline u8 shpc_readb(struct controller *ctrl, int reg)
>   return readb(ctrl->creg + reg);
>  }
>  
> -static inline void shpc_writeb(struct controller *ctrl, int reg, u8 val)
> -{
> - writeb(val, ctrl->creg + reg);
> -}
> -
>  static inline u16 shpc_readw(struct controller *ctrl, int reg)
>  {
>   return readw(ctrl->creg + reg);
> -- 
> 1.8.3.1
>

Re: QCA6174 pcie wifi: Add pci quirks

2021-04-14 Thread Bjorn Helgaas

[+cc Alex]

On Fri, Apr 09, 2021 at 11:26:33AM +0200, Ingmar Klein wrote:
> Edit: Retry, as I did not consider that my mail client would make this
> partly HTML.
> 
> Dear maintainers,
> I recently encountered an issue on my Proxmox server system, that
> includes a Qualcomm QCA6174 m.2 PCIe wifi module.
> https://deviwiki.com/wiki/AIRETOS_AFX-QCA6174-NX
> 
> On system boot and subsequent virtual machine start (with passed-through
> QCA6174), the VM would just freeze/hang, at the point where the ath10k
> driver loads.
> Quick search in the proxmox related topics, brought me to the following
> discussion, which suggested a PCI quirk entry for the QCA6174 in the kernel:
> https://forum.proxmox.com/threads/pcie-passthrough-freezes-proxmox.27513/
> 
> I then went ahead, got the Proxmox kernel source (v5.4.106) and applied
> the attached patch.
> Effect was as hoped, that the VM hangs are now gone. System boots and
> runs as intended.
> 
> Judging by the existing quirk entries for Atheros, I would think, that
> my proposed "fix" could be included in the vanilla kernel.
> As far as I saw, there is no entry yet, even in the latest kernel sources.

This would need a signed-off-by; see
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=v5.11#n361

This is an old issue, and likely we'll end up just applying this as
yet another quirk.  But looking at c3e59ee4e766 ("PCI: Mark Atheros
AR93xx to avoid bus reset"), where it started, it seems to be
connected to 425c1b223dac ("PCI: Add Virtual Channel to save/restore
support").

I'd like to dig into that a bit more to see if there are any clues.
AFAIK Linux itself still doesn't use VC at all, and 425c1b223dac added
a fair bit of code.  I wonder if we're restoring something out of
order or making some simple mistake in the way to restore VC config.

> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 706f27a86a8e..ecfe80ec5b9c 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3584,6 +3584,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, quirk_no_bus_reset);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, quirk_no_bus_reset);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003e, quirk_no_bus_reset);
>  
>  /*
>   * Root port on some Cavium CN8xxx chips do not successfully complete a bus

Re: [PATCH -next] PCI: Use DEFINE_SPINLOCK() for spinlock

2021-04-14 Thread Bjorn Helgaas

On Tue, Apr 06, 2021 at 08:06:37PM +0800, Huang Guobin wrote:
> From: Guobin Huang 
> 
> spinlock can be initialized automatically with DEFINE_SPINLOCK()
> rather than explicitly calling spin_lock_init().
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Guobin Huang 

Applied to pci/hotplug for v5.13, thanks!

> ---
>  drivers/pci/hotplug/cpqphp_nvram.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/cpqphp_nvram.c b/drivers/pci/hotplug/cpqphp_nvram.c
> index 00cd2b43364f..7a65d427ac11 100644
> --- a/drivers/pci/hotplug/cpqphp_nvram.c
> +++ b/drivers/pci/hotplug/cpqphp_nvram.c
> @@ -80,7 +80,7 @@ static u8 evbuffer[1024];
>  static void __iomem *compaq_int15_entry_point;
>  
>  /* lock for ordering int15_bios_call() */
> -static spinlock_t int15_lock;
> +static DEFINE_SPINLOCK(int15_lock);
>  
>  
>  /* This is a series of function that deals with
> @@ -415,9 +415,6 @@ void compaq_nvram_init(void __iomem *rom_start)
>   compaq_int15_entry_point = (rom_start + ROM_INT15_PHY_ADDR - ROM_PHY_ADDR);
>  
>   dbg("int15 entry  = %p\n", compaq_int15_entry_point);
> -
> - /* initialize our int15 lock */
> - spin_lock_init(&int15_lock);
>  }
>  
>  
>
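
For readers outside the kernel, the idea behind this patch (initialize the lock at definition time so that no separate init call can be forgotten or run too late) can be shown with a user-space analogue. This is a sketch only: it uses C11 atomic_flag with ATOMIC_FLAG_INIT in place of the kernel's DEFINE_SPINLOCK(), and the demo_* names are made up for illustration.

```c
#include <stdatomic.h>

/* Statically initialized lock: valid before any init function has run. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;
static int demo_counter;

/* Spin until the flag was previously clear, i.e. until we own the lock. */
void demo_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire))
		;
}

void demo_lock_release(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* Increment the shared counter under the lock and return the new value. */
int demo_increment(void)
{
	int value;

	demo_lock_acquire();
	value = ++demo_counter;
	demo_lock_release();
	return value;
}
```

Because the lock is ready as soon as the program is loaded, demo_increment() can be called from anywhere without an ordering dependency on an init routine.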

Re: [PATCH] PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros

2021-04-14 Thread Bjorn Helgaas

On Tue, Apr 13, 2021 at 10:39:16AM +0200, Pali Rohár wrote:
> On Monday 12 April 2021 14:27:40 Bjorn Helgaas wrote:
> > On Mon, Apr 12, 2021 at 02:46:02PM +0200, Pali Rohár wrote:
> > > Define new PCI_EXP_DEVCTL_PAYLOAD_* macros in linux/pci_regs.h header file
> > > for Max Payload Size. Macros are defined in the same style as existing
> > > macros PCI_EXP_DEVCTL_READRQ_* macros.
> > > 
> > > Signed-off-by: Pali Rohár 
> > > ---
> > >  include/uapi/linux/pci_regs.h | 6 ++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> > > index e709ae8235e7..8f1b15eea53e 100644
> > > --- a/include/uapi/linux/pci_regs.h
> > > +++ b/include/uapi/linux/pci_regs.h
> > > @@ -504,6 +504,12 @@
> > > >  #define  PCI_EXP_DEVCTL_URRE 0x0008  /* Unsupported Request Reporting En. */
> > >  #define  PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
> > >  #define  PCI_EXP_DEVCTL_PAYLOAD  0x00e0  /* Max_Payload_Size */
> > > > +#define  PCI_EXP_DEVCTL_PAYLOAD_128B 0x0000 /* 128 Bytes */
> > > +#define  PCI_EXP_DEVCTL_PAYLOAD_256B 0x0020 /* 256 Bytes */
> > > +#define  PCI_EXP_DEVCTL_PAYLOAD_512B 0x0040 /* 512 Bytes */
> > > +#define  PCI_EXP_DEVCTL_PAYLOAD_1024B 0x0060 /* 1024 Bytes */
> > > +#define  PCI_EXP_DEVCTL_PAYLOAD_2048B 0x0080 /* 2048 Bytes */
> > > +#define  PCI_EXP_DEVCTL_PAYLOAD_4096B 0x00A0 /* 4096 Bytes */
> > 
> > This is fine if we're going to use them, but we generally don't add
> > definitions purely for documentation.
> > 
> > 5929b8a38ce0 ("PCI: Add defines for PCIe Max_Read_Request_Size") added
> > the PCI_EXP_DEVCTL_READRQ_* definitions and we do have a few (very
> > few) uses in drivers.
> 
> I'm planning to use this constant to fix pci-aardvark.c driver. Aardvark
> changes are not ready yet, but I'm preparing them in my git tree
> https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=pci-aardvark
> (commit PCI: aardvark: Fix PCIe Max Payload Size setting)
> 
> But as this is not change in aardvark driver, I sent it separately and
> earlier. As it would be dependency for aardvark changes.

OK, just include this in that series.

Bjorn
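
As a worked example of what the proposed macros encode: Max_Payload_Size occupies bits 7:5 of Device Control (mask 0x00e0), and an encoding n means 128 << n bytes, with encodings 6 and 7 reserved. A minimal user-space sketch follows; the helper name mps_bytes() is assumed here and is not taken from the kernel.

```c
#include <stdint.h>

#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size, bits 7:5 */

/* Return the Max Payload Size in bytes, or 0 for a reserved encoding. */
unsigned int mps_bytes(uint16_t devctl)
{
	unsigned int enc = (devctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5;

	if (enc > 5)	/* encodings 6 and 7 are reserved */
		return 0;
	return 128u << enc;
}
```

For instance, a Device Control value with the field set to 0x0040 decodes to 512 bytes, matching the PCI_EXP_DEVCTL_PAYLOAD_512B definition in the patch.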

Re: [PATCH 1/1] s390/pci: expose a PCI device's UID as its index

2021-04-14 Thread Bjorn Helgaas

On Mon, Apr 12, 2021 at 03:59:05PM +0200, Niklas Schnelle wrote:
> On s390 each PCI device has a user-defined ID (UID) exposed under
> /sys/bus/pci/devices//uid. This ID was designed to serve as the PCI
> device's primary index and to match the device within Linux to the
> device configured in the hypervisor. To serve as a primary identifier
> the UID must be unique within the Linux instance, this is guaranteed by
> the platform if and only if the UID Uniqueness Checking flag is set
> within the CLP List PCI Functions response.
> 
> In this sense the UID serves an analogous function as the SMBIOS
> instance number or ACPI index exposed as the "index" respectively
> "acpi_index" device attributes and used by e.g. systemd to set interface
> names. As s390 does not use and will likely never use ACPI nor SMBIOS
> there is no conflict and we can just expose the UID under the "index"
> attribute whenever UID Uniqueness Checking is active and get systemd's
> interface naming support for free.
> 
> Signed-off-by: Niklas Schnelle 
> Acked-by: Viktor Mihajlovski 

This seems like a nice solution to me.

Acked-by: Bjorn Helgaas 

> ---
>  Documentation/ABI/testing/sysfs-bus-pci | 11 +---
>  arch/s390/pci/pci_sysfs.c   | 35 +
>  2 files changed, 42 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index 25c9c39770c6..1241b6d11a52 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -195,10 +195,13 @@ What:   /sys/bus/pci/devices/.../index
>  Date:July 2010
>  Contact: Narendra K , linux-b...@dell.com
>  Description:
> - Reading this attribute will provide the firmware
> - given instance (SMBIOS type 41 device type instance) of the
> - PCI device. The attribute will be created only if the firmware
> - has given an instance number to the PCI device.
> + Reading this attribute will provide the firmware given instance
> + number of the PCI device.  Depending on the platform this can
> + be for example the SMBIOS type 41 device type instance or the
> + user-defined ID (UID) on s390. The attribute will be created
> + only if the firmware has given an instance number to the PCI
> + device and that number is guaranteed to uniquely identify the
> + device in the system.
>  Users:
>   Userspace applications interested in knowing the
>   firmware assigned device type instance of the PCI
> diff --git a/arch/s390/pci/pci_sysfs.c b/arch/s390/pci/pci_sysfs.c
> index e14d346dafd6..20dbb2058d51 100644
> --- a/arch/s390/pci/pci_sysfs.c
> +++ b/arch/s390/pci/pci_sysfs.c
> @@ -138,6 +138,38 @@ static ssize_t uid_is_unique_show(struct device *dev,
>  }
>  static DEVICE_ATTR_RO(uid_is_unique);
>  
> +#ifndef CONFIG_DMI
> +/* analogous to smbios index */

I think this is smbios_attr_instance, right?  Maybe mention that
specifically to make it easier to match these up.

Looks like smbios_attr_instance and the similar ACPI stuff could use
some updating to use the current attribute group infrastructure.

> +static ssize_t index_show(struct device *dev,
> +   struct device_attribute *attr, char *buf)
> +{
> + struct zpci_dev *zdev = to_zpci(to_pci_dev(dev));
> + u32 index = ~0;
> +
> + if (zpci_unique_uid)
> + index = zdev->uid;
> +
> + return sysfs_emit(buf, "%u\n", index);
> +}
> +static DEVICE_ATTR_RO(index);
> +
> +static umode_t zpci_unique_uids(struct kobject *kobj,
> + struct attribute *attr, int n)
> +{
> + return zpci_unique_uid ? attr->mode : 0;
> +}
> +
> +static struct attribute *zpci_ident_attrs[] = {
> + &dev_attr_index.attr,
> + NULL,
> +};
> +
> +static struct attribute_group zpci_ident_attr_group = {
> + .attrs = zpci_ident_attrs,
> + .is_visible = zpci_unique_uids,

It's conventional to name these functions *_is_visible() (another
convention that smbios_attr_instance and acpi_attr_index probably
predate).

> +};
> +#endif
> +
>  static struct bin_attribute *zpci_bin_attrs[] = {
>   _attr_util_string,
>   _attr_report_error,
> @@ -179,5 +211,8 @@ static struct attribute_group pfip_attr_group = {
>  const struct attribute_group *zpci_attr_groups[] = {
>   _attr_group,
>   _attr_group,
> +#ifndef CONFIG_DMI
> + &zpci_ident_attr_group,
> +#endif
>   NULL,
>  };
> -- 
> 2.25.1
>
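
The attribute-group pattern used in this patch, where an is_visible callback returns the attribute's mode to show it or 0 to hide it, can be modelled outside the kernel. The sketch below uses simplified stand-in types (struct attr, struct attr_group) rather than the real sysfs structures, and follows the *_is_visible() naming suggested in the review; unique_uid stands in for zpci_unique_uid, and count_visible() is a hypothetical helper added only to exercise the callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the sysfs attribute types (not the kernel's). */
struct attr {
	const char *name;
	unsigned short mode;
};

struct attr_group {
	struct attr **attrs;	/* NULL-terminated list */
	unsigned short (*is_visible)(const struct attr *a, int n);
};

bool unique_uid;	/* stand-in for zpci_unique_uid */

/* Expose the attribute with its normal mode only when UIDs are unique. */
unsigned short ident_is_visible(const struct attr *a, int n)
{
	(void)n;
	return unique_uid ? a->mode : 0;
}

struct attr index_attr = { "index", 0444 };
struct attr *ident_attrs[] = { &index_attr, NULL };
struct attr_group ident_group = { ident_attrs, ident_is_visible };

/* Count the attributes a group would expose right now. */
size_t count_visible(const struct attr_group *g)
{
	size_t i, shown = 0;

	for (i = 0; g->attrs[i]; i++)
		if (!g->is_visible || g->is_visible(g->attrs[i], (int)i))
			shown++;
	return shown;
}
```

With unique_uid false the group exposes nothing; once it is true the "index" attribute appears with mode 0444.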

Re: [PATCH v10 3/3] PCI: uniphier: Add misc interrupt handler to invoke PME and AER

2021-04-14 Thread Bjorn Helgaas

On Sat, Apr 10, 2021 at 01:22:18AM +0900, Kunihiko Hayashi wrote:
> This patch adds misc interrupt handler to detect and invoke PME/AER event.
> 
> In UniPhier PCIe controller, PME/AER signals are assigned to the same
> signal as MSI by the internal logic. These signals should be detected by
> the internal register, however, DWC MSI handler can't handle these signals.
> 
> DWC MSI handler calls .msi_host_isr() callback function, that detects
> PME/AER signals using the internal register and invokes the interrupt
> with PME/AER vIRQ numbers.
> 
> These vIRQ numbers are obtained by uniphier_pcie_port_get_irq() function,
> that finds the device that matches PME/AER from the devices associated
> with Root Port, and returns its vIRQ number.

Why do you use the term "vIRQ"?  What exactly is a vIRQ?  It seems no
different than the simple "irq" as stored in pci_dev.irq or
pcie_device.irq and passed to generic_handle_irq().  "virq" is also
used in the patch, so if you change one, please change the other as
well.

Bjorn

Re: [PATCH v2 7/8] cxl/port: Introduce cxl_port objects

2021-04-13 Thread Bjorn Helgaas

On Thu, Apr 08, 2021 at 07:13:38PM -0700, Dan Williams wrote:
> Hi Bjorn, thanks for taking a look.
> 
> On Thu, Apr 8, 2021 at 3:42 PM Bjorn Helgaas  wrote:
> >
> > [+cc Greg, Rafael, Matthew: device model questions]
> >
> > Hi Dan,
> >
> > On Thu, Apr 01, 2021 at 07:31:20AM -0700, Dan Williams wrote:
> > > Once the cxl_root is established then other ports in the hierarchy can
> > > be attached. The cxl_port object, unlike cxl_root that is associated
> > > with host bridges, is associated with PCIE Root Ports or PCIE Switch
> > > Ports. Add cxl_port instances for all PCIE Root Ports in an ACPI0016
> > > host bridge.

Incidentally, "PCIe" is the abbreviation used in the PCIe specs, so I
try to use that instead of "PCIE" in drivers/pci/.

> > I'm not a device model expert, but I'm not sure about adding a new
> > /sys/bus/cxl/devices hierarchy.  I'm under the impression that CXL
> > devices will be enumerated by the PCI core as PCIe devices.
> 
> Yes, PCIe is involved, but mostly only for the CXL.io slow path
> (configuration and provisioning via mailbox) when we're talking about
> memory expander devices (CXL calls these Type-3). So-called "Type-3"
> support is the primary driver of this infrastructure.
>
> You might be thinking of CXL accelerator devices that will look like
> plain PCIe devices that happen to participate in the CPU cache
> hierarchy (CXL calls these Type-1). There will also be accelerator
> devices that want to share coherent memory with the system (CXL calls
> these Type-2).

IIUC all these CXL devices will be enumerated by the PCI core.  They
seem to have regular PCI BARs (separate from the HDM stuff), so the
PCI core will presumably manage address allocation for them.  It looks
like Function Level Reset and hotplug are supposed to use the regular
PCIe code.  I guess this will all be visible via lspci just like
regular PCI devices, right?

> The infrastructure being proposed here is primarily for the memory
> expander (Type-3) device case where the PCI sysfs hierarchy is wholly
> unsuited for modeling it. A single CXL memory region device may span
> multiple endpoints, switches, and host bridges. It poses similar
> stress to an OS device model as RAID where there is a driver for the
> component contributors to an upper level device / driver that exposes
> the RAID Volume (CXL memory region interleave set). The CXL memory
> decode space (HDM: Host Managed Device Memory) is independent of the
> PCIe MMIO BAR space.

It looks like you add a cxl_port for each ACPI0016 device and every
PCIe Root Port below it.  So I guess the upper level spanning is at a
higher level than cxl_port?

> That's where the /sys/bus/cxl hierarchy is needed, to manage the HDM
> space across the CXL topology in a way that is foreign to PCIE (HDM
> Decoder hierarchy).

When we do FLR on the PCIe device, what happens to these CXL clients?
Do they care?  Are they notified?  Do they need to do anything before
or after the FLR?

What about hotplug?  Spec says it leverages PCIe hotplug, but it looks
like maybe this all requires ACPI hotplug (acpiphp) for adding
ACPI0017 devices and notifying of hot remove requests?  If it uses
PCIe native hotplug (pciehp), what connects the CXL side to the PCI
side?

I guess the HDM address space management is entirely outside the scope
of PCI -- the address space is not described by the CXL host bridge
_CRS and not described by CXL endpoint BARs?  Where *is* it described
and who manages and allocates it?  I guess any transaction routing
through the CXL fabric for HDM space is also completely outside the
scope of PCI -- we don't need to worry about managing PCI-to-PCI
bridge windows, for instance?

Is there a cxl_register_driver() or something?  I assume there will be
drivers that need to manage CXL devices?  Or will they use
pci_register_driver() and search for a CXL capability?

> > Doesn't that mean we will have one struct device in the pci_dev,
> > and another one in the cxl_port?
> 
> Yes, that is the proposal.

> The superfluous power/ issue can be cleaned up with
> device_set_pm_not_required().

Thanks, we might be able to use that for portdrv.  I added it to my
list to investigate.

> What are the other problems this poses, because in other areas this
> ability to subdivide a device's functionality into sub-drivers is a
> useful organization principle?

Well, I'm thinking about things like enumeration, hotplug, reset,
resource management (BARs, bridge windows, etc), interrupts, power
management (suspend, resume, etc), and error reporting.  These are all
things that PCIe defines on a per-Function basis and seem kind of hard
to cleanly subdivide.

> So much so that several device writer teams came together to create
> the auxiliary-bus for t

Re: Device driver location for the PCIe root port's DMA engine

2021-04-13 Thread Bjorn Helgaas

On Tue, Apr 13, 2021 at 11:42:15PM +0530, Vidya Sagar wrote:
> On 4/13/2021 3:23 AM, Bjorn Helgaas wrote:

> > The existing port services (AER, DPC, hotplug, etc) are things the
> > device advertises via the PCI Capabilities defined by the generic PCIe
> > spec, and in my opinion the support for them should be directly part
> > of the PCI core and activated when the relevant Capability is present.
> Is there an on-going activity to remove port service drivers and move
> AER/DPC/Hotplug etc. handling within PCI core?

No, not that I'm aware of.  I'd just like to avoid extending that
model.

Bjorn

Re: Device driver location for the PCIe root port's DMA engine

2021-04-12 Thread Bjorn Helgaas

[+cc Matthew for portdrv comment]

On Mon, Apr 12, 2021 at 10:31:02PM +0530, Vidya Sagar wrote:
> Hi
> I'm starting this mail to seek advice on the best approach to be taken to
> add support for the driver of the PCIe root port's DMA engine.
> To give some background, Tegra194's PCIe IPs are dual-mode PCIe IPs i.e.
> they work either in the root port mode or in the endpoint mode based on the
> boot time configuration.
> Since the PCIe hardware IP as such is the same for both (RP and EP) modes,
> the DMA engine sub-system of the PCIe IP is also made available to both
> modes of operation.
> Typically, the DMA engine is seen only in the endpoint mode, and that DMA
> engine’s configuration registers are made available to the host through one
> of its BARs.
> In the situation that we have here, where there is a DMA engine present as
> part of the root port, the DMA engine isn’t a typical general-purpose DMA
> engine in the sense that it can’t have both source and destination addresses
> targeting external memory addresses.
> RP’s DMA engine, while doing a write operation,
> would always fetch data (i.e. source) from local memory and write it to the
> remote memory over PCIe link (i.e. destination would be the BAR of an
> endpoint)
> whereas while doing a read operation,
> would always fetch/read data (i.e. source) from a remote memory over the
> PCIe link and write it to the local memory.
> 
> I see that there are at least two ways we can have a driver for this DMA
> engine.
> a) DMA engine driver as one of the port service drivers
>   Since the DMA engine is a part of the root port hardware itself (although
> it is not part of the standard capabilities of the root port), it is one of
> the options to have the driver for the DMA engine go as one of the port
> service drivers (along with AER, PME, hot-plug, etc...). Based on Vendor-ID
> and Device-ID matching at runtime, either it gets bound/enabled (like in the
> case of Tegra194) or it doesn't.
> b) DMA engine driver as a platform driver
>   The DMA engine hardware can be described as a sub-node under the PCIe
> controller's node in the device tree and a separate platform driver can be
> written to work with it.
> 
> I’m inclined to have the DMA engine driver as a port service driver as it
> makes it cleaner and also in line with the design philosophy (the way I
> understood it) of the port service drivers.
> Please let me know your thoughts on this.

Personally I'm not a fan of the port service driver model.  It creates
additional struct devices for things that are not separate devices.
And it creates a parallel hierarchy in /sys/bus/pci_express/devices/
that I think does not accurately model the hardware.

The existing port services (AER, DPC, hotplug, etc) are things the
device advertises via the PCI Capabilities defined by the generic PCIe
spec, and in my opinion the support for them should be directly part
of the PCI core and activated when the relevant Capability is present.

The DMA engine is different -- this is device-specific functionality
and I think the obvious way to discover it and bind a driver to it is
via the PCI Vendor and Device ID.

This *is* complicated by the fact that you can't just use
pci_register_driver() to claim functionality implemented in Root Ports
or Switch Ports because portdrv binds to them before you have a
chance.  I think that's a defect in the portdrv design.  The usual
workaround is to use pci_get_device(), which has its own issues (it's
ugly, it's outside the normal driver binding model, doesn't work
nicely with hotplug or udev, doesn't coordinate with other drivers
using the same device, etc).  There are many examples of this in the
EDAC code.

Bjorn

Re: [PATCH] PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros

2021-04-12 Thread Bjorn Helgaas

On Mon, Apr 12, 2021 at 02:46:02PM +0200, Pali Rohár wrote:
> Define new PCI_EXP_DEVCTL_PAYLOAD_* macros in linux/pci_regs.h header file
> for Max Payload Size. Macros are defined in the same style as the existing
> PCI_EXP_DEVCTL_READRQ_* macros.
> 
> Signed-off-by: Pali Rohár 
> ---
>  include/uapi/linux/pci_regs.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index e709ae8235e7..8f1b15eea53e 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -504,6 +504,12 @@
>  #define  PCI_EXP_DEVCTL_URRE 0x0008  /* Unsupported Request Reporting En. */
>  #define  PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
>  #define  PCI_EXP_DEVCTL_PAYLOAD  0x00e0  /* Max_Payload_Size */
> +#define  PCI_EXP_DEVCTL_PAYLOAD_128B 0x0000 /* 128 Bytes */
> +#define  PCI_EXP_DEVCTL_PAYLOAD_256B 0x0020 /* 256 Bytes */
> +#define  PCI_EXP_DEVCTL_PAYLOAD_512B 0x0040 /* 512 Bytes */
> +#define  PCI_EXP_DEVCTL_PAYLOAD_1024B 0x0060 /* 1024 Bytes */
> +#define  PCI_EXP_DEVCTL_PAYLOAD_2048B 0x0080 /* 2048 Bytes */
> +#define  PCI_EXP_DEVCTL_PAYLOAD_4096B 0x00A0 /* 4096 Bytes */

This is fine if we're going to use them, but we generally don't add
definitions purely for documentation.

5929b8a38ce0 ("PCI: Add defines for PCIe Max_Read_Request_Size") added
the PCI_EXP_DEVCTL_READRQ_* definitions and we do have a few (very
few) uses in drivers.

If we do need to add these, please follow the local use of lower-case
in the hex bitmasks.  The file is a mixture, but the closest examples
are lower-case.
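
For illustration only, here is how the Max_Payload_Size field relates to a byte count: the 3-bit field under the 0x00e0 mask holds n, and the size is 128 << n. This is a minimal userspace sketch, not kernel code. The mask value is taken from the quoted hunk; the shift macro and the helper names mps_bytes() and mps_encode() are hypothetical.

```c
#include <stdint.h>

/* Mask from the quoted pci_regs.h hunk. */
#define PCI_EXP_DEVCTL_PAYLOAD        0x00e0
/* Hypothetical name: bit position of the field, derived from the mask. */
#define PCI_EXP_DEVCTL_PAYLOAD_SHIFT  5

/*
 * Decode: a field value n means 128 << n bytes.  Only n = 0..5
 * (128..4096 bytes) are covered by the macros in the patch; this
 * sketch does not reject larger field values.
 */
static unsigned int mps_bytes(uint16_t devctl)
{
	unsigned int n = (devctl & PCI_EXP_DEVCTL_PAYLOAD) >>
			 PCI_EXP_DEVCTL_PAYLOAD_SHIFT;

	return 128u << n;
}

/*
 * Encode: return the field bits for a size of 128, 256, ..., 4096
 * bytes, or -1 if the size is not one of those values.
 */
static int mps_encode(unsigned int bytes)
{
	unsigned int n;

	for (n = 0; n <= 5; n++)
		if ((128u << n) == bytes)
			return (int)(n << PCI_EXP_DEVCTL_PAYLOAD_SHIFT);

	return -1;
}
```

With helpers like these, the six proposed constants are just mps_encode(128) through mps_encode(4096), which is one reason the named macros mostly matter when a driver actually writes the field.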

>  #define  PCI_EXP_DEVCTL_EXT_TAG  0x0100  /* Extended Tag Field Enable */
>  #define  PCI_EXP_DEVCTL_PHANTOM  0x0200  /* Phantom Functions Enable */
>  #define  PCI_EXP_DEVCTL_AUX_PME  0x0400  /* Auxiliary Power PM Enable */
> -- 
> 2.20.1
>

Re: [PATCH] PCI: Delay after FLR of Intel DC P4510 NVMe

2021-04-09 Thread Bjorn Helgaas

On Thu, Apr 08, 2021 at 07:05:27PM +0000, Raphael Norwitz wrote:
> Like the Intel DC P3700 NVMe, the Intel P4510 NVMe exhibits a timeout
> failure when the driver tries to interact with the device too soon after
> an FLR. The same reset quirk the P3700 uses also resolves the failure
> for the P4510, so this change introduces the same reset quirk for the
> P4510.
> 
> Reviewed-by: Alex Williamson 
> Signed-off-by: Alay Shah 
> Signed-off-by: Suresh Gumpula 
> Signed-off-by: Raphael Norwitz 

Applied to pci/virtualization for v5.13, thanks!

> ---
>  drivers/pci/quirks.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 653660e3ba9e..5a8c059b848d 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3922,6 +3922,7 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>   reset_ivb_igd },
>   { PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },
>   { PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr },
> + { PCI_VENDOR_ID_INTEL, 0x0a54, delay_250ms_after_flr },
>   { PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
>   reset_chelsio_generic_dev },
>   { 0 }
> -- 
> 2.20.1

Re: [PATCH v2 7/8] cxl/port: Introduce cxl_port objects

2021-04-08 Thread Bjorn Helgaas

[+cc Greg, Rafael, Matthew: device model questions]

Hi Dan,

On Thu, Apr 01, 2021 at 07:31:20AM -0700, Dan Williams wrote:
> Once the cxl_root is established then other ports in the hierarchy can
> be attached. The cxl_port object, unlike cxl_root that is associated
> with host bridges, is associated with PCIE Root Ports or PCIE Switch
> Ports. Add cxl_port instances for all PCIE Root Ports in an ACPI0016
> host bridge. 

I'm not a device model expert, but I'm not sure about adding a new
/sys/bus/cxl/devices hierarchy.  I'm under the impression that CXL
devices will be enumerated by the PCI core as PCIe devices.  Doesn't
that mean we will have one struct device in the pci_dev, and another
one in the cxl_port?  That seems like an issue to me.  More below.

> The cxl_port instances for PCIE Switch Ports are not
> included here as those are to be modeled as another service device
> registered on the pcie_port_bus_type.

I'm hesitant about the idea of adding more uses of pcie_port_bus_type.
I really dislike portdrv because it makes a parallel hierarchy:

  /sys/bus/pci
  /sys/bus/pci_express

for things that really should not be different.  There's a struct
device in pci_dev, and potentially several pcie_devices, each with
another struct device.  We make these pcie_device things for AER, DPC,
hotplug, etc.  E.g.,

  /sys/bus/pci/devices/0000:00:1c.0
  /sys/bus/pci_express/devices/0000:00:1c.0:pcie002  # AER
  /sys/bus/pci_express/devices/0000:00:1c.0:pcie010  # BW notification

These are all the same PCI device.  AER is a PCI capability.
Bandwidth notification is just a feature of all Downstream Ports.  I
think it makes zero sense to have extra struct devices for them.  From
a device point of view (enumeration, power management, VM assignment),
we can't manage them separately from the underlying PCI device.  For
example, we have three separate "power/" directories, but obviously
there's only one point of control (00:1c.0):

  /sys/devices/pci0000:00/0000:00:1c.0/power/
  /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/power/
  /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie010/power/

> A sample sysfs topology for a single-host-bridge with
> single-PCIE/CXL-root-port:
> 
> /sys/bus/cxl/devices/root0
> ├── address_space0
> │   ├── devtype
> │   ├── end
> │   ├── start
> │   ├── supports_ram
> │   ├── supports_type2
> │   ├── supports_type3
> │   └── uevent
> ├── address_space1
> │   ├── devtype
> │   ├── end
> │   ├── start
> │   ├── supports_pmem
> │   ├── supports_type2
> │   ├── supports_type3
> │   └── uevent
> ├── devtype
> ├── port1
> │   ├── devtype
> │   ├── host -> ../../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0016:00
> │   ├── port2
> │   │   ├── devtype
> │   │   ├── host -> ../../../../../pci0000:34/0000:34:00.0
> │   │   ├── subsystem -> ../../../../../../bus/cxl
> │   │   ├── target_id
> │   │   └── uevent
> │   ├── subsystem -> ../../../../../bus/cxl
> │   ├── target_id
> │   └── uevent
> ├── subsystem -> ../../../../bus/cxl
> ├── target_id
> └── uevent
> 
> Signed-off-by: Dan Williams 
> ---
>  drivers/cxl/acpi.c |   99 +++
>  drivers/cxl/core.c |  121 
> 
>  drivers/cxl/cxl.h  |5 ++
>  3 files changed, 224 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index d54c2d5de730..bc2a35ae880b 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -5,18 +5,117 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "cxl.h"
>  
> +static int match_ACPI0016(struct device *dev, const void *host)
> +{
> + struct acpi_device *adev = to_acpi_device(dev);
> + const char *hid = acpi_device_hid(adev);
> +
> + return strcmp(hid, "ACPI0016") == 0;
> +}
> +
> +struct cxl_walk_context {
> + struct device *dev;
> + struct pci_bus *root;
> + struct cxl_port *port;
> + int error;
> + int count;
> +};
> +
> +static int match_add_root_ports(struct pci_dev *pdev, void *data)
> +{
> + struct cxl_walk_context *ctx = data;
> + struct pci_bus *root_bus = ctx->root;
> + struct cxl_port *port = ctx->port;
> + int type = pci_pcie_type(pdev);
> + struct device *dev = ctx->dev;
> + resource_size_t cxl_regs_phys;
> + int target_id = ctx->count;
> +
> + if (pdev->bus != root_bus)
> + return 0;
> + if (!pci_is_pcie(pdev))
> + return 0;
> + if (type != PCI_EXP_TYPE_ROOT_PORT)
> + return 0;
> +
> + ctx->count++;
> +
> + /* TODO walk DVSEC to find component register base */
> + cxl_regs_phys = -1;
> +
> + port = devm_cxl_add_port(dev, port, >dev, target_id,
> +  cxl_regs_phys);
> + if (IS_ERR(port)) {
> + ctx->error = PTR_ERR(port);
> + return ctx->error;
> + }
> +
> + dev_dbg(dev, "%s: register: %s\n", dev_name(>dev),
> + dev_name(>dev));
> +
> + return 0;
> +}
> +

Re: [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver

2021-04-08 Thread Bjorn Helgaas

On Thu, Apr 08, 2021 at 09:22:52PM +0800, Yicong Yang wrote:
> On 2021/4/8 2:55, Bjorn Helgaas wrote:
> > On Tue, Apr 06, 2021 at 08:45:53PM +0800, Yicong Yang wrote:

> >> +On Kunpeng 930 SoC, the PCIe root complex is composed of several
> >> +PCIe cores.
> 
> > Can you connect "Kunpeng 930" to something in the kernel tree?
> > "git grep -i kunpeng" shows nothing that's obviously relevant.
> > I assume there's a related driver in drivers/pci/controller/?
> 
> Kunpeng 930 is the product name of Hip09 platform. The PCIe
> controller uses the generic PCIe driver based on ACPI.

I guess I'm just looking for a hint to help users know when to enable
the Kconfig for this.  Maybe the "HiSilicon" in the Kconfig help is
enough?  Maybe "Kunpeng 930" is not even necessary?  If "Kunpeng 930"
*is* necessary, there should be some way to relate it to something
else.

> >> +from the file, and the desired value written to the file to tune.
> > 
> >> +Tuning multiple events at the same time is not permitted, which means
> >> +you cannot read or write more than one tune file at one time.
> > 
> > I think this is obvious from the model, so the sentence doesn't really
> > add anything.  Each event is a separate file, and it's obvious that
> > there's no way to write to multiple files simultaneously.
> 
> With the usage shown below, this situation won't happen. I just worry
> that users may have a program that opens multiple files at the same time and
> reads/writes them simultaneously, so I added this line to mention the restriction.

How is this possible?  I don't think "writing multiple files
simultaneously" is even possible in the Linux syscall model.  I don't
think a user will do anything differently after reading "you cannot
read or write more than one tune file at one time."

> >> +- tx_path_rx_req_alloc_buf_level: watermark of RX requested
> >> +- tx_path_tx_req_alloc_buf_level: watermark of TX requested
> >> +
> >> +These events influence the watermark of the buffer allocated for each
> >> +type. RX means the inbound while Tx means outbound. For a busy
> >> +direction, you should increase the related buffer watermark to enhance
> >> +the performance.
> > 
> > Based on what you have written here, I would just write 2 to both
> > files to enhance the performance in both directions.  But obviously
> > there must be some tradeoff here, e.g., increasing Rx performance
> > comes at the cost of Tx performance.
> 
> the Rx buffer and Tx buffer are separate, so they won't influence
> each other.

Why would I write anything other than 2 to these files?  That's the
question I think this paragraph should answer.

> >> +9. data_format
> >> +--------------
> >> +
> >> +File to indicate the format of the traced TLP headers. User can also
> >> +specify the desired format of traced TLP headers. Available formats
> >> +are 4DW, 8DW which indicates the length of each TLP headers traced.
> >> +::
> >> +$ cat data_format
> >> +[4DW] 8DW
> >> +$ echo 8 > data_format
> >> +$ cat data_format
> >> +4DW [8DW]
> >> +
> >> +The traced TLP header format is different from the PCIe standard.
> > 
> > I'm confused.  Below you say the fields of the traced TLP header are
> > defined by the PCIe spec.  But here you say the format is *different*.
> > What exactly is different?
> 
> For the Request Header Format for 64-bit addressing of Memory, defined in
> PCIe spec 4.0, Figure 2-15, the 1st DW is like:
> 
> Byte 0 > [Fmt] [Type] [T9] [Tc] [T8] [Attr] [LN] [TH] ... [Length]
> 
> Some of these fields are recorded in our traced header, as shown below,
> and some are not. That's what I mean by the format of the header being
> different. But for a given field like 'Fmt', the meaning stays the same
> as what the spec defines. That's what I mean by the field definitions of
> our traced header being the same as in the spec.

Ah, that helps a lot, thank you.  Maybe you could say something along
the lines of this:

  When using the 8DW data format, the entire TLP header is logged.
  For example, the TLP header for Memory Reads with 64-bit addresses
  is shown in PCIe r5.0, Figure 2-17; the header for Configuration
  Requests is shown in Figure 2-20, etc.

  In addition, 8DW trace buffer entries contain a timestamp and
  possibly a prefix, e.g., a PASID TLP prefix (see Figure 6-20).  TLPs
  may include more than one prefix, but only one can be logged in
  trace buffer entries.

  When using the 4DW data format, DW0 of the trace buffer entry
  contains selected fields of DW0 of the TLP, together with a
  timestamp.  DW1-DW3 of the trace buffer entry contain DW1-DW3
  directly from the TLP header.

This looks like a really cool device.  I wish we had this for more
platforms.

Bjorn

Re: [PATCH v2] ACPI / hotplug / PCI: fix memory leak in enable_slot()

2021-04-08 Thread Bjorn Helgaas

On Thu, Mar 25, 2021 at 03:26:00PM +0800, Zhiqiang Liu wrote:
> From: Feilong Lin 
> 
> In enable_slot() in drivers/pci/hotplug/acpiphp_glue.c, if pci_get_slot()
> returns NULL, we do not set the SLOT_ENABLED flag of the slot. If a
> device is found by calling pci_get_slot(), its reference count is
> increased. In this case, we did not call pci_dev_put() to decrement
> its reference count, so the memory of the device (struct pci_dev type)
> leaks.
> 
> Fix it by calling pci_dev_put() to decrement its reference count after
> pci_get_slot() returns a PCI device.
> 
> Signed-off-by: Feilong Lin 
> Signed-off-by: Zhiqiang Liu 

Applied with Rafael's reviewed-by to pci/hotplug for v5.13, thanks!
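
For readers following the fix, the get/put imbalance can be modeled in a few lines of userspace C. This is only a sketch of the pattern: struct toy_dev, toy_get(), toy_put() and toy_enable_slot() are hypothetical stand-ins for struct pci_dev, pci_get_slot(), pci_dev_put() and the loop in enable_slot(), and nothing here is the kernel's implementation.

```c
#include <stddef.h>

/* Stand-in for struct pci_dev: only the reference count matters here. */
struct toy_dev {
	int refcount;
};

/* Like pci_get_slot(): returns the device with its refcount raised, or NULL. */
static struct toy_dev *toy_get(struct toy_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

/* Like pci_dev_put(): drops one reference; NULL is tolerated. */
static void toy_put(struct toy_dev *d)
{
	if (d)
		d->refcount--;
}

/*
 * Same shape as the loop in enable_slot(): look up each function and
 * only care whether it exists.  With 'fixed' == 0 every found device
 * keeps the extra reference (the leak); with 'fixed' != 0 the reference
 * is dropped again, as the one-line patch does.
 * Returns the number of devices found.
 */
static int toy_enable_slot(struct toy_dev **slots, size_t n, int fixed)
{
	int found = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct toy_dev *dev = toy_get(slots[i]);

		if (!dev)
			continue;	/* nothing found, nothing to release */

		found++;
		if (fixed)
			toy_put(dev);
	}

	return found;
}
```

Running the loop without the put leaves each found device with one reference more than it started with, so its count can never reach zero and the structure is never freed.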

> --
> v2: rewrite subject and commit log as suggested by Bjorn Helgaas.
> ---
>  drivers/pci/hotplug/acpiphp_glue.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index 3365c93abf0e..f031302ad401 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -533,6 +533,7 @@ static void enable_slot(struct acpiphp_slot *slot, bool bridge)
>   slot->flags &= ~SLOT_ENABLED;
>   continue;
>   }
> + pci_dev_put(dev);
>   }
>  }
> 
> -- 
> 2.19.1
>

Re: [PATCH v2] ACPI / hotplug / PCI: fix memory leak in enable_slot()

2021-04-08 Thread Bjorn Helgaas

On Thu, Apr 08, 2021 at 05:18:46PM +0200, Rafael J. Wysocki wrote:
> On Thu, Mar 25, 2021 at 8:27 AM Zhiqiang Liu  wrote:
> >
> > From: Feilong Lin 
> >
> > In enable_slot() in drivers/pci/hotplug/acpiphp_glue.c, if pci_get_slot()
> > returns NULL, we do not set the SLOT_ENABLED flag of the slot. If a
> > device is found by calling pci_get_slot(), its reference count is
> > increased. In this case, we did not call pci_dev_put() to decrement
> > its reference count, so the memory of the device (struct pci_dev type)
> > leaks.
> >
> > Fix it by calling pci_dev_put() to decrement its reference count after
> > pci_get_slot() returns a PCI device.
> >
> > Signed-off-by: Feilong Lin 
> > Signed-off-by: Zhiqiang Liu 
> > --
> > v2: rewrite subject and commit log as suggested by Bjorn Helgaas.
> 
> The fix is correct AFAICS, so
> 
> Reviewed-by: Rafael J. Wysocki 
> 
> Bjorn, has this been applied already?  If not, do you want me to take
> it or are you going to queue it up yourself?

I'll pick it up; thanks for the review and the reminder!

> > ---
> >  drivers/pci/hotplug/acpiphp_glue.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index 3365c93abf0e..f031302ad401 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -533,6 +533,7 @@ static void enable_slot(struct acpiphp_slot *slot, bool bridge)
> > slot->flags &= ~SLOT_ENABLED;
> > continue;
> > }
> > +   pci_dev_put(dev);
> > }
> >  }
> >
> > --
> > 2.19.1
> >

Re: [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver

2021-04-07 Thread Bjorn Helgaas

Move important info in the subject earlier, e.g.,

  docs: Add HiSilicon PTT device documentation

On Tue, Apr 06, 2021 at 08:45:53PM +0800, Yicong Yang wrote:
> Document the introduction and usage of HiSilicon PTT device driver.
> 
> Signed-off-by: Yicong Yang 
> ---
>  Documentation/trace/hisi-ptt.rst | 316 
> +++
>  1 file changed, 316 insertions(+)
>  create mode 100644 Documentation/trace/hisi-ptt.rst
> 
> diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
> new file mode 100644
> index 000..215676f
> --- /dev/null
> +++ b/Documentation/trace/hisi-ptt.rst
> @@ -0,0 +1,316 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +====================================
> +HiSilicon PCIe Tune and Trace device
> +====================================
> +
> +Introduction
> +
> +
> +HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
> +integrated Endpoint (RCiEP) device, providing the capability
> +to dynamically monitor and tune the PCIe link's events (tune),
> +and trace the TLP headers (trace). The two functions are independent,
> +but is recommended to use them together to analyze and enhance the
> +PCIe link's performance.

> +On Kunpeng 930 SoC, the PCIe root complex is composed of several
> +PCIe cores.
> +Each core is composed of several root ports, RCiEPs, and one
> +PTT device, like below. The PTT device is capable of tuning and
> +tracing the link of the PCIe core.

s/root complex/Root Complex/ to match spec, diagram, RCiEP above
s/root ports/Root Ports/ to match spec, etc (also below)

Can you connect "Kunpeng 930" to something in the kernel tree?
"git grep -i kunpeng" shows nothing that's obviously relevant.
I assume there's a related driver in drivers/pci/controller/?

Is this one paragraph or two?  If one, reflow.  If two, add blank line
between.

IIUC, the diagram below shows two PCIe cores, each with three Root
Ports and a PTT RCiEP.  Your text mentions "RCiEPs, and one PTT" which
suggests RCiEPs in addition to the PTT, but the diagram doesn't show
any, and if there are other RCiEPs, they don't seem relevant to this
doc.  Maybe something like this?

  Each PCIe core includes several Root Ports and a PTT RCiEP ...

> +::
> +  +--Core 0---+
> +  |   |   [   PTT   ] |
> +  |   |   [Root Port]---[Endpoint]
> +  |   |   [Root Port]---[Endpoint]
> +  |   |   [Root Port]---[Endpoint]
> +Root Complex  |--Core 1---+
> +  |   |   [   PTT   ] |
> +  |   |   [Root Port]---[ Switch ]---[Endpoint]
> +  |   |   [Root Port]---[Endpoint] `-[Endpoint]
> +  |   |   [Root Port]---[Endpoint]
> +  +---+
> +
> +The PTT device driver cannot be loaded if debugfs is not mounted.
> +Each PTT device will be presented under /sys/kernel/debugfs/hisi_ptt
> +as its root directory, with name of its BDF number.
> +::
> +
> +/sys/kernel/debug/hisi_ptt/::.
> +
> +Tune
> +
> +
> +PTT tune is designed for monitoring and adjusting PCIe link parameters(events).

Add a space before "(".

> +Currently we support events in 4 classes. The scope of the events
> +covers the PCIe core with which the PTT device belongs to.

... the PCIe core to which the PTT device belongs.
> +
> +Each event is presented as a file under $(PTT root dir)/$(BDF)/tune, and
> +mostly a simple open/read/write/close cycle will be used to tune
> +the event.
> +::
> +$ cd /sys/kernel/debug/hisi_ptt/$(BDF)/tune
> +$ ls
> +qos_tx_cpl  qos_tx_np  qos_tx_p
> +tx_path_rx_req_alloc_buf_level
> +tx_path_tx_req_alloc_buf_level
> +$ cat qos_tx_dp
> +1
> +$ echo 2 > qos_tx_dp
> +$ cat qos_tx_dp
> +2
> +
> +Current value(numerical value) of the event can be simply read

Add space before "(".

> +from the file, and the desired value written to the file to tune.

> +Tuning multiple events at the same time is not permitted, which means
> +you cannot read or write more than one tune file at one time.

I think this is obvious from the model, so the sentence doesn't really
add anything.  Each event is a separate file, and it's obvious that
there's no way to write to multiple files simultaneously.

> +1. Tx path QoS control
> +
> +
> +Following files are provided to tune the QoS of the tx path of the PCIe core.

"The following ..."

> +- qos_tx_cpl: weight of tx completion TLPs
> +- qos_tx_np: weight of tx non-posted TLPs
> +- qos_tx_p: weight of tx posted TLPs
> +
> +The weight influences the proportion of certain packets on the PCIe link.
> +For example, for the storage scenario, increase the proportion
> +of the completion packets on the link to enhance the performance as
> +more completions are consumed.

I don't believe you can directly influence the *proportions* of packet
types.  The number and types of TLPs are

Re: [PATCH] PCI: dwc: Change the inheritance between the abstracted structures

2021-04-06 Thread Bjorn Helgaas

On Tue, Apr 06, 2021 at 05:28:25PM +0800, Zhiqiang Hou wrote:
> From: Hou Zhiqiang 
> 
> Currently the core struct dw_pcie includes both struct pcie_port
> and dw_pcie_ep, and the RC and EP platform drivers directly
> include dw_pcie. As a result, an RC or EP platform driver has
> two indirect parents, pcie_port and dw_pcie_ep, but it doesn't
> make sense for an RC platform driver to include dw_pcie_ep, or
> for an EP platform driver to include pcie_port.
> 
> This patch makes struct pcie_port and dw_pcie_ep include
> the core struct dw_pcie, and the RC and EP platform drivers
> include struct pcie_port and dw_pcie_ep respectively.

I really like the way this patch is heading.  There's a lot of
historical cruft in these drivers and this is a good step to cleaning
it up.  Thanks a lot for working on this!

What does this patch apply to?  It doesn't apply cleanly to either my
"main" branch or the "next" branch.  Try to send things that apply to
"main" and if it needs to apply on top of something else, mention what
that is.

> diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
> index 12726c63366f..0e914df6eaba 100644
> --- a/drivers/pci/controller/dwc/pci-dra7xx.c
> +++ b/drivers/pci/controller/dwc/pci-dra7xx.c
> @@ -85,7 +85,8 @@
>  #define PCIE_B0_B1_TSYNCEN   BIT(0)
>  
>  struct dra7xx_pcie {
> - struct dw_pcie  *pci;
> + struct pcie_port  *pp;
> + struct dw_pcie_ep   *ep;

1) This is not related to your patch, but I think "pcie_port" used to
   make more sense before we had endpoint drivers, but now it's the
   wrong name.  Root Ports and Endpoints both have "PCIe Ports", but
   IIUC "struct pcie_port" only applies to Root Ports, and "struct
   dw_pcie_ep" is the analogue for Endpoints.

   It would be nice to coordinate these names with a separate patch,
   e.g., maybe "dw_pcie_rc" (or "dw_pcie_rp") and "dw_pcie_ep".

2) We allocate struct dra7xx_pcie for both RPs and EPs.  But IIUC, RPs
   only use "struct pcie_port", and EPs only use "struct dw_pcie_ep".
   It doesn't seem right to keep both pointers when only one is ever
   used.

3) I'm not sure why these should be pointers at all.  Why can't they
   be directly embedded, e.g., "struct pcie_port pp" instead of
   "struct pcie_port *pp"?  Obviously this would have to be done in a
   way that we allocate an RC-specific structure or an EP-specific
   one.
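
As a rough illustration of point 3, embedding lets one allocation cover the whole chain, and the outer structure can be recovered from the inner one with container_of-style arithmetic. This is a userspace sketch with hypothetical names (toy_core, toy_rc and toy_plat_rc stand in for dw_pcie, pcie_port and an RC-specific dra7xx structure); it is not taken from the patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace equivalent of the kernel's container_of(). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the core "struct dw_pcie". */
struct toy_core {
	int id;
};

/* Stand-in for the RC-side "struct pcie_port": it embeds the core. */
struct toy_rc {
	struct toy_core core;
	int msi_irq;
};

/* Stand-in for an RC-specific platform structure: it embeds the RC part. */
struct toy_plat_rc {
	struct toy_rc rc;
	int phy_count;
};

/* One allocation for everything; a single failure point, nothing to undo. */
static struct toy_plat_rc *toy_plat_rc_alloc(int id)
{
	struct toy_plat_rc *p = calloc(1, sizeof(*p));

	if (p)
		p->rc.core.id = id;
	return p;
}

/* Walk back from the core to the platform structure; no pointers stored. */
static struct toy_plat_rc *to_plat_rc(struct toy_core *core)
{
	struct toy_rc *rc = toy_container_of(core, struct toy_rc, core);

	return toy_container_of(rc, struct toy_plat_rc, rc);
}
```

An EP-specific structure would follow the same pattern with its own embedded member, so neither variant carries a pointer it never uses.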

>   void __iomem  *base;  /* DT ti_conf */
>   int phy_count;  /* DT phy-names count */
>   struct phy  **phy;

> @@ -796,6 +798,17 @@ static int __init dra7xx_pcie_probe(struct platform_device *pdev)
>  
>   switch (mode) {
>   case DW_PCIE_RC_TYPE:
> + pp = devm_kzalloc(dev, sizeof(*pp), GFP_KERNEL);

We know "mode" right after the of_match_device() at the top of this
function.  I think we should allocate the RC or EP structure way up
there, ideally with a single alloc for everything we need
(dra7xx_pcie, pcie_port, dw_pcie_ep, etc).  That would be fewer allocs
and would simplify error handling because if the alloc fails we
wouldn't have to undo anything.

> + if (!pp) {
> + ret = -ENOMEM;
> + goto err_gpio;
> + }
> +
> + pci = >pcie;
> + pci->dev = dev;
> + pci->ops = _pcie_ops;
> + dra7xx->pp = pp;
> +
>   if (!IS_ENABLED(CONFIG_PCI_DRA7XX_HOST)) {
>   ret = -ENODEV;
>   goto err_gpio;
> @@ -813,6 +826,17 @@ static int __init dra7xx_pcie_probe(struct platform_device *pdev)
>   goto err_gpio;
>   break;
>   case DW_PCIE_EP_TYPE:
> + ep = devm_kzalloc(dev, sizeof(*ep), GFP_KERNEL);
> + if (!ep) {
> + ret = -ENOMEM;
> + goto err_gpio;
> + }
> +
> + pci = >pcie;
> + pci->dev = dev;
> + pci->ops = _pcie_ops;
> + dra7xx->ep = ep;
> +
>   if (!IS_ENABLED(CONFIG_PCI_DRA7XX_EP)) {
>   ret = -ENODEV;
>   goto err_gpio;

> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -171,12 +171,44 @@ enum dw_pcie_device_mode {
>   DW_PCIE_RC_TYPE,
>  };
>  
> +struct dw_pcie_ops {
> + u64 (*cpu_addr_fixup)(struct dw_pcie *pcie, u64 cpu_addr);
> + u32 (*read_dbi)(struct dw_pcie *pcie, void __iomem *base, u32 reg,
> + size_t size);
> + void (*write_dbi)(struct dw_pcie *pcie, void __iomem *base, u32 reg,
> +  size_t size, u32 val);
> + void (*write_dbi2)(struct dw_pcie *pcie, void __iomem *base, u32 reg,
> +   size_t size, u32 val);
> + int (*link_up)(struct dw_pcie *pcie);
> + int

Re: [PATCH v1 3/7] PCI: New Primary to Sideband (P2SB) bridge support library

2021-04-01 Thread Bjorn Helgaas

On Thu, Apr 01, 2021 at 09:23:04PM +0300, Andy Shevchenko wrote:
> On Thu, Apr 01, 2021 at 11:42:56AM -0500, Bjorn Helgaas wrote:
> > On Thu, Apr 01, 2021 at 06:45:02PM +0300, Andy Shevchenko wrote:
> > > On Tue, Mar 09, 2021 at 09:42:52AM +0100, Henning Schild wrote:
> > > > Am Mon, 8 Mar 2021 19:42:21 -0600
> > > > schrieb Bjorn Helgaas :
> > > > > On Mon, Mar 08, 2021 at 09:16:50PM +0200, Andy Shevchenko wrote:
> > > > > > On Mon, Mar 08, 2021 at 12:52:12PM -0600, Bjorn Helgaas wrote:  
> > > > > > > On Mon, Mar 08, 2021 at 02:20:16PM +0200, Andy Shevchenko wrote:  
> > > 
> > > ...
> > > 
> > > > > > > > +   /* Read the first BAR of the device in question */
> > > > > > > > +   __pci_bus_read_base(bus, devfn, pci_bar_unknown, mem,
> > > > > > > > PCI_BASE_ADDRESS_0, true);  
> > > > > > > 
> > > > > > > I don't get this.  Apparently this normally hidden device is
> > > > > > > consuming PCI address space.  The PCI core needs to know
> > > > > > > about this.  If it doesn't, the PCI core may assign this
> > > > > > > space to another device.  
> > > > > > 
> > > > > > Right, it returns all 1:s to any request so PCI core *thinks*
> > > > > > it's plugged off (like D3cold or so).  
> > > > > 
> > > > > I'm asking about the MMIO address space.  The BAR is a register
> > > > > in config space.  AFAICT, clearing P2SBC_HIDE_BYTE makes that
> > > > > BAR visible.  The BAR describes a region of PCI address space.
> > > > > It looks like setting P2SBC_HIDE_BIT makes the BAR disappear
> > > > > from config space, but it sounds like the PCI address space
> > > > > *described* by the BAR is still claimed by the device.  If the
> > > > > device didn't respond to that MMIO space, you would have no
> > > > > reason to read the BAR at all.
> > > > > 
> > > > > So what keeps the PCI core from assigning that MMIO space to
> > > > > another device?
> > > > 
> > > > The device will respond to MMIO while being hidden. I am afraid
> > > > nothing stops a collision, except for the assumption that the BIOS
> > > > is always right and PCI devices never get remapped. But just
> > > > guessing here.
> > > > 
> > > > I have seen devices with coreboot having the P2SB visible, and
> > > > most likely relocatable. Making it visible in Linux and not hiding
> > > > it again might work, but probably only as long as Linux will not
> > > > relocate it, which I am afraid might seriously upset the BIOS,
> > > > depending on what a device does with those GPIOs and which parts
> > > > are implemented in the BIOS.
> > > 
> > > So the question is, do we have knobs in PCI core to mark a device's
> > > BARs as fixed, so that no relocation is applied and no other
> > > device is given the region?
> > 
> > I think the closest thing is the IORESOURCE_PCI_FIXED bit that we use
> > for things that must not be moved.  Generally PCI resources are
> > associated with a pci_dev, and we set IORESOURCE_PCI_FIXED for BARs,
> > e.g., dev->resource[n].  We do that for IDE legacy regions (see
> > LEGACY_IO_RESOURCE), Langwell devices (pci_fixed_bar_fixup()),
> > "enhanced allocation" (pci_ea_flags()), and some quirks (quirk_io()).
> > 
> > In your case, the device is hidden so it doesn't respond to config
> > accesses, so there is no pci_dev for it.
> 
> Yes, and the idea is to unhide it at an early stage.
> Would it be possible to quirk it to fix the IO resources?

If I read your current patch right, it unhides the device, reads the
BAR, then hides the device again.  I didn't see that it would create a
pci_dev for it.

If you unhide it and then enumerate it normally (and mark the BAR as
IORESOURCE_PCI_FIXED to make sure we never move it), that might work.
Then there should be a pci_dev for it, and it would then show up in
sysfs, lspci, etc.  And we should insert the BAR in iomem_resource, so
we should see it in /proc/iomem and we won't accidentally put
something else on top of it.

> > Maybe you could do some sort of quirk that allocates its own struct
> > resource, fills it in, sets IORESOURCE_PCI_FIXED, and does something
> > similar to pci_claim_resource()?
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
>

Re: [PATCH v1 3/7] PCI: New Primary to Sideband (P2SB) bridge support library

2021-04-01 Thread Bjorn Helgaas

On Thu, Apr 01, 2021 at 06:45:02PM +0300, Andy Shevchenko wrote:
> On Tue, Mar 09, 2021 at 09:42:52AM +0100, Henning Schild wrote:
> > Am Mon, 8 Mar 2021 19:42:21 -0600
> > schrieb Bjorn Helgaas :
> > > On Mon, Mar 08, 2021 at 09:16:50PM +0200, Andy Shevchenko wrote:
> > > > On Mon, Mar 08, 2021 at 12:52:12PM -0600, Bjorn Helgaas wrote:  
> > > > > On Mon, Mar 08, 2021 at 02:20:16PM +0200, Andy Shevchenko wrote:  
> 
> ...
> 
> > > > > > +   /* Read the first BAR of the device in question */
> > > > > > +   __pci_bus_read_base(bus, devfn, pci_bar_unknown, mem,
> > > > > > PCI_BASE_ADDRESS_0, true);  
> > > > > 
> > > > > I don't get this.  Apparently this normally hidden device is
> > > > > consuming PCI address space.  The PCI core needs to know
> > > > > about this.  If it doesn't, the PCI core may assign this
> > > > > space to another device.  
> > > > 
> > > > Right, it returns all 1:s to any request so PCI core *thinks*
> > > > it's plugged off (like D3cold or so).  
> > > 
> > > I'm asking about the MMIO address space.  The BAR is a register
> > > in config space.  AFAICT, clearing P2SBC_HIDE_BYTE makes that
> > > BAR visible.  The BAR describes a region of PCI address space.
> > > It looks like setting P2SBC_HIDE_BIT makes the BAR disappear
> > > from config space, but it sounds like the PCI address space
> > > *described* by the BAR is still claimed by the device.  If the
> > > device didn't respond to that MMIO space, you would have no
> > > reason to read the BAR at all.
> > > 
> > > So what keeps the PCI core from assigning that MMIO space to
> > > another device?
> > 
> > The device will respond to MMIO while being hidden. I am afraid
> > nothing stops a collision, except for the assumption that the BIOS
> > is always right and PCI devices never get remapped. But just
> > guessing here.
> > 
> > I have seen devices with coreboot having the P2SB visible, and
> > most likely relocatable. Making it visible in Linux and not hiding
> > it again might work, but probably only as long as Linux will not
> > relocate it, which I am afraid might seriously upset the BIOS,
> > depending on what a device does with those GPIOs and which parts
> > are implemented in the BIOS.
> 
> So the question is, do we have knobs in PCI core to mark a device's
> BARs as fixed, so that no relocation is applied and no other
> device is given the region?

I think the closest thing is the IORESOURCE_PCI_FIXED bit that we use
for things that must not be moved.  Generally PCI resources are
associated with a pci_dev, and we set IORESOURCE_PCI_FIXED for BARs,
e.g., dev->resource[n].  We do that for IDE legacy regions (see
LEGACY_IO_RESOURCE), Langwell devices (pci_fixed_bar_fixup()),
"enhanced allocation" (pci_ea_flags()), and some quirks (quirk_io()).

In your case, the device is hidden so it doesn't respond to config
accesses, so there is no pci_dev for it.

Maybe you could do some sort of quirk that allocates its own struct
resource, fills it in, sets IORESOURCE_PCI_FIXED, and does something
similar to pci_claim_resource()?
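
To make the idea concrete, here is a small userspace model, not kernel code.  Every name in it (model_window, model_claim(), model_alloc(), MODEL_FIXED) is made up for illustration; MODEL_FIXED only stands in for IORESOURCE_PCI_FIXED and the model never relocates anything.  It shows only the effect we want: once the hidden device's range is claimed in the window, a later allocation cannot land on it.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_FIXED		0x1u	/* stands in for IORESOURCE_PCI_FIXED */
#define MODEL_MAX_CLAIMS	8

struct model_res {
	uint64_t start;
	uint64_t end;		/* inclusive, like struct resource */
	unsigned int flags;	/* recorded only; the model never moves a claim */
};

struct model_window {
	uint64_t start;
	uint64_t end;
	struct model_res claims[MODEL_MAX_CLAIMS];
	size_t nr_claims;
};

static bool model_overlaps(const struct model_res *r, uint64_t start,
			   uint64_t end)
{
	return r->start <= end && start <= r->end;
}

/* Claim [start, end] inside the window; fails on overlap or a full table. */
static int model_claim(struct model_window *w, uint64_t start, uint64_t end,
		       unsigned int flags)
{
	size_t i;

	if (start > end || start < w->start || end > w->end)
		return -1;
	if (w->nr_claims == MODEL_MAX_CLAIMS)
		return -1;
	for (i = 0; i < w->nr_claims; i++)
		if (model_overlaps(&w->claims[i], start, end))
			return -1;
	w->claims[w->nr_claims++] = (struct model_res){ start, end, flags };
	return 0;
}

/* First-fit allocation of "size" bytes, size-aligned (size: power of two). */
static int model_alloc(struct model_window *w, uint64_t size, uint64_t *out)
{
	uint64_t start = (w->start + size - 1) & ~(size - 1);

	for (; start + size - 1 <= w->end; start += size) {
		if (model_claim(w, start, start + size - 1, 0) == 0) {
			*out = start;
			return 0;
		}
	}
	return -1;
}
```

In the real thing the quirk would fill in a struct resource from the BAR value read while the device is briefly unhidden, and the claim would go through the normal resource tree rather than a private table like this.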

Bjorn

Re: [PATCH] PCI: ACPI: PM: Fix debug message in acpi_pci_set_power_state()

2021-03-31 Thread Bjorn Helgaas

On Thu, Mar 25, 2021 at 07:57:51PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> If PCI_D3cold is passed to acpi_pci_set_power_state() as the second
> argument and there is no ACPI D3cold support for the given device,
> the debug message printed by that function will state that the
> device power state has been changed to D3cold, while in fact it
> will be D3hot, because acpi_device_set_power() falls back to D3hot
> automatically if D3cold is not supported without returning an error.
> 
> To address this issue, modify the debug message in question to print
> the current power state of the target PCI device's ACPI companion
> instead of printing the target power state which may not reflect
> the real final power state of the device.
> 
> Signed-off-by: Rafael J. Wysocki 

Applied with Krzysztof's reviewed-by to pci/pm for v5.13, thanks!

Let me know if you have nearby or related changes that you'd rather
take via your tree.

> ---
>  drivers/pci/pci-acpi.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -1021,7 +1021,7 @@ static int acpi_pci_set_power_state(stru
>  
>   if (!error)
>   pci_dbg(dev, "power state changed by ACPI to %s\n",
> -  acpi_power_state_string(state_conv[state]));
> + acpi_power_state_string(adev->power.state));
>  
>   return error;
>  }
> 
> 
>

Re: [PATCH v2 04/15] ACPI: table: replace attribute((packed)) by __packed

2021-03-31 Thread Bjorn Helgaas

On Wed, Mar 31, 2021 at 11:55:08PM +0800, Zhang Rui wrote:
> ...

> From e18c942855e2f51e814d057fff4dd951cd0d0907 Mon Sep 17 00:00:00 2001
> From: Zhang Rui 
> Date: Wed, 31 Mar 2021 20:34:13 +0800
> Subject: [PATCH] ACPI: tables: FPDT: Fix 64bit alignment issue
> 
> Some of the 64bit items in FPDT table may be 32bit aligned.
> Using __attribute__((packed)) is not needed in this case, fixing it by
> allowing 32bit alignment for these 64bit items.

1) Can you please add a spec reference for this?  I think it's ACPI
   v6.3, sec 5.2.23.5, or something close to that.

2) The exact layout in memory is prescribed by the spec.  I think
   that's basically what "packed" accomplishes.  I don't understand
   why using "aligned" would be preferable.  Using "aligned" means
   things can be at different offsets depending on the starting
   address of the structure.  We always want the identical layout, no
   matter what the starting address is.
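
For comparison, here is a quick userspace check of the two spellings, using GCC/Clang attributes.  The 4-byte record header is an assumption (the patch doesn't show struct fpdt_record_header), and the struct names are made up.  It is only a sketch of what the compiler does with each declaration, not a statement about what the spec requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 4-byte record header (type, length, revision); not in the patch. */
struct rec_header {
	uint16_t type;
	uint8_t length;
	uint8_t revision;
};

/* Same trick as the patch: a 64-bit integer that needs only 4-byte alignment. */
typedef uint64_t __attribute__((aligned(4))) u64_align32;

/* No attribute: the compiler may pad after the header on many 64-bit ABIs. */
struct suspend_rec_plain {
	struct rec_header header;
	uint64_t suspend_start;
	uint64_t suspend_end;
};

/* Packed: no padding, and the whole struct is treated as byte-aligned. */
struct suspend_rec_packed {
	struct rec_header header;
	uint64_t suspend_start;
	uint64_t suspend_end;
} __attribute__((packed));

/* Reduced-alignment members: no padding here, struct stays 4-byte aligned. */
struct suspend_rec_align32 {
	struct rec_header header;
	u64_align32 suspend_start;
	u64_align32 suspend_end;
};
```

With these declarations offsetof(..., suspend_start) is 4 for both the packed and the u64_align32 variants, and both are 20 bytes long.  What differs is the alignment the compiler assumes for the struct as a whole (1 for packed, 4 for u64_align32).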

> Signed-off-by: Zhang Rui 
> ---
>  drivers/acpi/acpi_fpdt.c | 28 +++-
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/acpi/acpi_fpdt.c b/drivers/acpi/acpi_fpdt.c
> index a89a806a7a2a..94e107b9a114 100644
> --- a/drivers/acpi/acpi_fpdt.c
> +++ b/drivers/acpi/acpi_fpdt.c
> @@ -23,12 +23,14 @@ enum fpdt_subtable_type {
>   SUBTABLE_S3PT,
>  };
>  
> +typedef u64 __attribute__((aligned(4))) u64_align32;
> +
>  struct fpdt_subtable_entry {
>   u16 type;   /* refer to enum fpdt_subtable_type */
>   u8 length;
>   u8 revision;
>   u32 reserved;
> - u64 address;/* physical address of the S3PT/FBPT table */
> + u64_align32 address;/* physical address of the S3PT/FBPT 
> table */
>  };
>  
>  struct fpdt_subtable_header {
> @@ -51,25 +53,25 @@ struct fpdt_record_header {
>  struct resume_performance_record {
>   struct fpdt_record_header header;
>   u32 resume_count;
> - u64 resume_prev;
> - u64 resume_avg;
> -} __attribute__((packed));
> + u64_align32 resume_prev;
> + u64_align32 resume_avg;
> +};
>  
>  struct boot_performance_record {
>   struct fpdt_record_header header;
>   u32 reserved;
> - u64 firmware_start;
> - u64 bootloader_load;
> - u64 bootloader_launch;
> - u64 exitbootservice_start;
> - u64 exitbootservice_end;
> -} __attribute__((packed));
> + u64_align32 firmware_start;
> + u64_align32 bootloader_load;
> + u64_align32 bootloader_launch;
> + u64_align32 exitbootservice_start;
> + u64_align32 exitbootservice_end;
> +};
>  
>  struct suspend_performance_record {
>   struct fpdt_record_header header;
> - u64 suspend_start;
> - u64 suspend_end;
> -} __attribute__((packed));
> + u64_align32 suspend_start;
> + u64_align32 suspend_end;
> +};
>  
>  
>  static struct resume_performance_record *record_resume;
> -- 
> 2.17.1
> 
>

Re: [PATCH] PCI: xgene: fix a mistake about cfg address

2021-03-30 Thread Bjorn Helgaas

On Sun, Mar 28, 2021 at 10:41:18PM +0800, Dejin Zheng wrote:
> It has a wrong modification to the xgene driver by the commit
> e2dcd20b1645a. it use devm_platform_ioremap_resource_byname() to
> simplify codes and remove the res variable, But the following code
> needs to use this res variable, So after this commit, the port->cfg_addr
> will get a wrong address. Now, revert it.
> 
> Fixes: e2dcd20b1645a ("PCI: controller: Convert to 
> devm_platform_ioremap_resource_byname()")
> Reported-by: dann.fraz...@canonical.com
> Signed-off-by: Dejin Zheng 

This looks right to me, but since e2dcd20b1645a appeared in v5.9-rc1,
I think it should have:

  Cc: sta...@vger.kernel.org# v5.9+

> ---
>  drivers/pci/controller/pci-xgene.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pci-xgene.c 
> b/drivers/pci/controller/pci-xgene.c
> index 2afdc865253e..7f503dd4ff81 100644
> --- a/drivers/pci/controller/pci-xgene.c
> +++ b/drivers/pci/controller/pci-xgene.c
> @@ -354,7 +354,8 @@ static int xgene_pcie_map_reg(struct xgene_pcie_port 
> *port,
>   if (IS_ERR(port->csr_base))
>   return PTR_ERR(port->csr_base);
>  
> - port->cfg_base = devm_platform_ioremap_resource_byname(pdev, "cfg");
> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "cfg");
> + port->cfg_base = devm_ioremap_resource(dev, res);
>   if (IS_ERR(port->cfg_base))
>   return PTR_ERR(port->cfg_base);
>   port->cfg_addr = res->start;
> -- 
> 2.30.1
>

Re: [PATCH] PCI: Try to find two continuous regions for child resource

2021-03-29 Thread Bjorn Helgaas

On Mon, Mar 29, 2021 at 04:47:59PM +0800, Kai-Heng Feng wrote:
> Built-in grahpics on HP EliteDesk 805 G6 doesn't work because graphics
> can't get the BAR it needs:
> [0.611504] pci_bus :00: root bus resource [mem 
> 0x1002020-0x100303f window]
> [0.611505] pci_bus :00: root bus resource [mem 
> 0x1003040-0x100401f window]
> ...
> [0.638083] pci :00:08.1:   bridge window [mem 0xd200-0xd23f]
> [0.638086] pci :00:08.1:   bridge window [mem 
> 0x1003000-0x100401f 64bit pref]
> [0.962086] pci :00:08.1: can't claim BAR 15 [mem 
> 0x1003000-0x100401f 64bit pref]: no compatible bridge window
> [0.962086] pci :00:08.1: [mem 0x1003000-0x100401f 64bit pref] 
> clipped to [mem 0x1003000-0x100303f 64bit pref]
> [0.962086] pci :00:08.1:   bridge window [mem 
> 0x1003000-0x100303f 64bit pref]
> [0.962086] pci :07:00.0: can't claim BAR 0 [mem 
> 0x1003000-0x1003fff 64bit pref]: no compatible bridge window
> [0.962086] pci :07:00.0: can't claim BAR 2 [mem 
> 0x1004000-0x100401f 64bit pref]: no compatible bridge window
>
> However, the root bus has two continuous regions that can contain the
> child resource requested.
>
> So try to find another parent region if two regions are continuous and
> can contain child resource. This change makes the grahpics works on the
> system in question.

The BIOS description of PCI0 is interesting:

  pci_bus :00: root bus resource [mem 0x100-0x100201f window]
  pci_bus :00: root bus resource [mem 0x1002020-0x100303f window]
  pci_bus :00: root bus resource [mem 0x1003040-0x100401f window]

So the PCI0 _CRS apparently gave us:

  [mem 0x100-0x100201f] size 0x2020 (512MB + 2MB)
  [mem 0x1002020-0x100303f] size 0x1020 (256MB + 2MB)
  [mem 0x1003040-0x100401f] size 0x0fe0 (254MB)

These are all contiguous, so we'd have no problem if we coalesced them
into a single window:

  [mem 0x100-0x100401f window] size 0x4020 (1GB + 2MB)

I think we currently keep these root bus resources separate because if
we ever support _SRS for host bridges, the argument we give to _SRS
must be exactly the same format as what we got from _CRS (see ACPI
v6.3, sec 6.2.16, and pnpacpi_set_resources()).

pnpacpi_encode_resources() is currently very simple-minded and copies
each device resource back into a single _SRS entry.  But (1) we don't
support _SRS for host bridges, and (2) if we ever do, we can make
pnpacpi_encode_resources() smarter so it breaks things back up.

So I think we should try to fix this by coalescing these adjacent
resources from _CRS so we end up with a single root bus resource that
covers all contiguous regions.
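
Roughly what I mean by coalescing, as a standalone userspace sketch.  The names (struct mem_range, coalesce_ranges()) are made up and this is not how the host bridge code stores windows; it only shows the merge rule for ranges that are already sorted by start address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mem_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * Merge ranges that touch or overlap.  "r" must be sorted by start.
 * Works in place and returns the new number of ranges.
 */
static size_t coalesce_ranges(struct mem_range *r, size_t n)
{
	size_t out = 0, i;

	if (n == 0)
		return 0;

	for (i = 1; i < n; i++) {
		/* Adjacent (end + 1 == start) or overlapping: extend current. */
		if (r[out].end == UINT64_MAX || r[i].start <= r[out].end + 1) {
			if (r[i].end > r[out].end)
				r[out].end = r[i].end;
		} else {
			/* A real gap: start a new output range. */
			r[++out] = r[i];
		}
	}
	return out + 1;
}
```

Given three back-to-back windows followed by one that sits past a hole, this leaves two ranges: the merged one and the one after the hole.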

Typos, etc:
  - No need for the timestamps; they're not relevant to the problem.
  - s/grahpics/graphics/ (two occurrences above)
  - s/continuous/contiguous/ (three occurrences above)

> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212013
> Signed-off-by: Kai-Heng Feng 
> ---
>  arch/microblaze/pci/pci-common.c |  4 +--
>  arch/powerpc/kernel/pci-common.c |  8 ++---
>  arch/sparc/kernel/pci.c  |  4 +--
>  drivers/pci/pci.c| 60 +++-
>  drivers/pci/setup-res.c  | 21 +++
>  drivers/pcmcia/rsrc_nonstatic.c  |  4 +--
>  include/linux/pci.h  |  6 ++--
>  7 files changed, 80 insertions(+), 27 deletions(-)
> 
> diff --git a/arch/microblaze/pci/pci-common.c 
> b/arch/microblaze/pci/pci-common.c
> index 557585f1be41..8e65832fb510 100644
> --- a/arch/microblaze/pci/pci-common.c
> +++ b/arch/microblaze/pci/pci-common.c
> @@ -669,7 +669,7 @@ static void pcibios_allocate_bus_resources(struct pci_bus 
> *bus)
>  {
>   struct pci_bus *b;
>   int i;
> - struct resource *res, *pr;
> + struct resource *res, *pr = NULL;
>  
>   pr_debug("PCI: Allocating bus resources for %04x:%02x...\n",
>pci_domain_nr(bus), bus->number);
> @@ -688,7 +688,7 @@ static void pcibios_allocate_bus_resources(struct pci_bus 
> *bus)
>* and as such ensure proper re-allocation
>* later.
>*/
> - pr = pci_find_parent_resource(bus->self, res);
> + pci_find_parent_resource(bus->self, res, &pr, NULL);
>   if (pr == res) {
>   /* this happens when the generic PCI
>* code (wrongly) decides that this
> diff --git a/arch/powerpc/kernel/pci-common.c 
> b/arch/powerpc/kernel/pci-common.c
> index 001e90cd8948..f865354b746d 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -1196,7 +1196,7 @@ static void pcibios_allocate_bus_resources(struct 
> pci_bus *bus)
>  {
>   struct pci_bus *b;
>   int i;
> - struct resource *res, *pr;
> +

Re: [PATCH] PCI: Remove pci_try_set_mwi

2021-03-28 Thread Bjorn Helgaas

On Sun, Mar 28, 2021 at 12:04:35AM +0100, Heiner Kallweit wrote:
> On 26.03.2021 22:26, Bjorn Helgaas wrote:
> > [+cc Randy, Andrew (though I'm sure you have zero interest in this
> > ancient question :))]
> > 
> > On Wed, Dec 09, 2020 at 09:31:21AM +0100, Heiner Kallweit wrote:
> >> pci_set_mwi() and pci_try_set_mwi() do exactly the same, just that the
> >> former one is declared as __must_check. However also some callers of
> >> pci_set_mwi() have a comment that it's an optional feature. I don't
> >> think there's much sense in this separation and the use of
> >> __must_check. Therefore remove pci_try_set_mwi() and remove the
> >> __must_check attribute from pci_set_mwi().
> >> I don't expect either function to be used in new code anyway.
> > 
> > There's not much I like better than removing things.  But some
> > significant thought went into adding pci_try_set_mwi() in the first
> > place, so I need a little more convincing about why it's safe to
> > remove it.
> > 
> 
> Thanks for the link to the 13 yrs old discussion. Unfortunately it
> doesn't mention any real argument for the __must_check, just:
> 
> "And one of the reasons for adding the __must_check annotation is to
> weed out design errors."
> And the very next response in the discussion calls this a "non-argument".
> Plus not mentioning what the other reasons could be.

I think you're referring to Alan's response [1]:

  akpm> And we *need* to be excessively anal in the PCI setup code.
  akpm> We have metric shitloads of bugs due to problems in that area,
  akpm> and the more formality and error handling and error reporting
  akpm> we can get in there the better off we will be.

  ac> No argument there

So Alan is actually *agreeing* that "we need to be excessively anal in
the PCI setup code,"  not saying that "weeding out design errors is
not an argument for __must_check."

> Currently we have three ancient drivers that bail out if the call fails.
> Most callers of pci_set_mwi() use the return code only to emit an
> error message, but they proceed normally. Majority of users calls
> pci_try_set_mwi(). And as stated in the commit message I don't expect
> any new usage of pci_set_mwi().

I would love to merge this patch.  We just need to clarify the commit
log.  Right now the only justification is "I don't think there's much
sense in the __must_check annotation," which may well be true but
could use some support.

If MWI is purely an optimization and there's never a functional
problem if pci_set_mwi() fails, we should say that (and maybe
update any drivers that bail out on failure).

Andrew and Alan both seem to agree that MWI *is* purely advisory:

  akpm> pci_set_mwi() is an advisory thing, and on certain platforms
  akpm> it might fail to set the cacheline size to the desired number.
  akpm> This is not a fatal error and the driver can successfully run
  akpm> at a lesser performance level.

  ac> Correct.

But even after that, Andrew proposed adding pci_try_set_mwi().  So it
makes sense to really understand what was going on there so we don't
break something in the name of cleaning it up.

[1] https://lore.kernel.org/linux-ide/20070405211609.5263d...@the-village.bc.nu/

> > The argument should cite the discussion about adding it.  I think one
> > of the earliest conversations is here:
> > https://lore.kernel.org/linux-ide/20070404213704.224128ec.randy.dun...@oracle.com/

Re: [PATCH] PCI: Remove pci_try_set_mwi

2021-03-26 Thread Bjorn Helgaas

On Fri, Mar 26, 2021 at 11:42:46PM +0200, Andy Shevchenko wrote:
> On Fri, Mar 26, 2021 at 04:26:55PM -0500, Bjorn Helgaas wrote:
> > [+cc Randy, Andrew (though I'm sure you have zero interest in this
> > ancient question :))]
> > 
> > On Wed, Dec 09, 2020 at 09:31:21AM +0100, Heiner Kallweit wrote:
> > > pci_set_mwi() and pci_try_set_mwi() do exactly the same, just that the
> > > former one is declared as __must_check. However also some callers of
> > > pci_set_mwi() have a comment that it's an optional feature. I don't
> > > think there's much sense in this separation and the use of
> > > __must_check. Therefore remove pci_try_set_mwi() and remove the
> > > __must_check attribute from pci_set_mwi().
> > > I don't expect either function to be used in new code anyway.
> > 
> > There's not much I like better than removing things.  But some
> > significant thought went into adding pci_try_set_mwi() in the first
> > place, so I need a little more convincing about why it's safe to
> > remove it.
> > 
> > The argument should cite the discussion about adding it.  I think one
> > of the earliest conversations is here:
> > https://lore.kernel.org/linux-ide/20070404213704.224128ec.randy.dun...@oracle.com/
> 
> It's solely PCI feature which is absent on PCIe.
>
> So, if there is a guarantee that the driver never services a device connected
> to old PCI bus, it's okay to remove the call (it's no-op on PCIe anyway).

Yes, I'm aware that MWI is a no-op on PCIe.  If we want to argue that
we don't need to support Conventional PCI devices, that should be
explicit, and we could remove pci_set_mwi() completely.  But I don't
think we're ready to drop Conventional PCI support.

> OTOH, PCI core may try MWI itself for every device (but this is an opposite,
> what should we do on broken devices that do change their state based on that
> bit while violating specification).
> 
> In any case
> 
> Acked-by: Andy Shevchenko 

Thanks!

Bjorn

Re: [PATCH] PCI: Remove pci_try_set_mwi

2021-03-26 Thread Bjorn Helgaas

[+cc Randy, Andrew (though I'm sure you have zero interest in this
ancient question :))]

On Wed, Dec 09, 2020 at 09:31:21AM +0100, Heiner Kallweit wrote:
> pci_set_mwi() and pci_try_set_mwi() do exactly the same, just that the
> former one is declared as __must_check. However also some callers of
> pci_set_mwi() have a comment that it's an optional feature. I don't
> think there's much sense in this separation and the use of
> __must_check. Therefore remove pci_try_set_mwi() and remove the
> __must_check attribute from pci_set_mwi().
> I don't expect either function to be used in new code anyway.

There's not much I like better than removing things.  But some
significant thought went into adding pci_try_set_mwi() in the first
place, so I need a little more convincing about why it's safe to
remove it.

The argument should cite the discussion about adding it.  I think one
of the earliest conversations is here:
https://lore.kernel.org/linux-ide/20070404213704.224128ec.randy.dun...@oracle.com/

> Signed-off-by: Heiner Kallweit 
> ---
> patch applies on top of pci/misc for v5.11
> ---
>  Documentation/PCI/pci.rst |  5 +
>  drivers/ata/pata_cs5530.c |  2 +-
>  drivers/ata/sata_mv.c |  2 +-
>  drivers/dma/dw/pci.c  |  2 +-
>  drivers/dma/hsu/pci.c |  2 +-
>  drivers/ide/cs5530.c  |  2 +-
>  drivers/mfd/intel-lpss-pci.c  |  2 +-
>  drivers/net/ethernet/adaptec/starfire.c   |  2 +-
>  drivers/net/ethernet/alacritech/slicoss.c |  2 +-
>  drivers/net/ethernet/dec/tulip/tulip_core.c   |  5 +
>  drivers/net/ethernet/sun/cassini.c|  4 ++--
>  drivers/net/wireless/intersil/p54/p54pci.c|  2 +-
>  .../intersil/prism54/islpci_hotplug.c |  3 +--
>  .../wireless/realtek/rtl818x/rtl8180/dev.c|  2 +-
>  drivers/pci/pci.c | 19 ---
>  drivers/scsi/3w-9xxx.c|  4 ++--
>  drivers/scsi/3w-sas.c |  4 ++--
>  drivers/scsi/csiostor/csio_init.c |  2 +-
>  drivers/scsi/lpfc/lpfc_init.c |  2 +-
>  drivers/scsi/qla2xxx/qla_init.c   |  8 
>  drivers/scsi/qla2xxx/qla_mr.c |  2 +-
>  drivers/tty/serial/8250/8250_lpss.c   |  2 +-
>  drivers/usb/chipidea/ci_hdrc_pci.c|  2 +-
>  drivers/usb/gadget/udc/amd5536udc_pci.c   |  2 +-
>  drivers/usb/gadget/udc/net2280.c  |  2 +-
>  drivers/usb/gadget/udc/pch_udc.c  |  2 +-
>  include/linux/pci.h   |  5 ++---
>  27 files changed, 33 insertions(+), 60 deletions(-)
> 
> diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
> index 814b40f83..120362cc9 100644
> --- a/Documentation/PCI/pci.rst
> +++ b/Documentation/PCI/pci.rst
> @@ -226,10 +226,7 @@ If the PCI device can use the PCI 
> Memory-Write-Invalidate transaction,
>  call pci_set_mwi().  This enables the PCI_COMMAND bit for Mem-Wr-Inval
>  and also ensures that the cache line size register is set correctly.
>  Check the return value of pci_set_mwi() as not all architectures
> -or chip-sets may support Memory-Write-Invalidate.  Alternatively,
> -if Mem-Wr-Inval would be nice to have but is not required, call
> -pci_try_set_mwi() to have the system do its best effort at enabling
> -Mem-Wr-Inval.
> +or chip-sets may support Memory-Write-Invalidate.
>  
>  
>  Request MMIO/IOP resources
> diff --git a/drivers/ata/pata_cs5530.c b/drivers/ata/pata_cs5530.c
> index ad75d02b6..8654b3ae1 100644
> --- a/drivers/ata/pata_cs5530.c
> +++ b/drivers/ata/pata_cs5530.c
> @@ -214,7 +214,7 @@ static int cs5530_init_chip(void)
>   }
>  
>   pci_set_master(cs5530_0);
> - pci_try_set_mwi(cs5530_0);
> + pci_set_mwi(cs5530_0);
>  
>   /*
>* Set PCI CacheLineSize to 16-bytes:
> diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
> index 664ef658a..ee37755ea 100644
> --- a/drivers/ata/sata_mv.c
> +++ b/drivers/ata/sata_mv.c
> @@ -4432,7 +4432,7 @@ static int mv_pci_init_one(struct pci_dev *pdev,
>   mv_print_info(host);
>  
>   pci_set_master(pdev);
> - pci_try_set_mwi(pdev);
> + pci_set_mwi(pdev);
>   return ata_host_activate(host, pdev->irq, mv_interrupt, IRQF_SHARED,
>IS_GEN_I(hpriv) ? &mv5_sht : &mv6_sht);
>  }
> diff --git a/drivers/dma/dw/pci.c b/drivers/dma/dw/pci.c
> index 1142aa6f8..1c20b7485 100644
> --- a/drivers/dma/dw/pci.c
> +++ b/drivers/dma/dw/pci.c
> @@ -30,7 +30,7 @@ static int dw_pci_probe(struct pci_dev *pdev, const struct 
> pci_device_id *pid)
>   }
>  
>   pci_set_master(pdev);
> - pci_try_set_mwi(pdev);
> + pci_set_mwi(pdev);
>  
>   ret = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
>   if (ret)
> diff --git a/drivers/dma/hsu/pci.c b/drivers/dma/hsu/pci.c
> index 07cc7320a..420dd3706 100644
> --- a/drivers/dma/hsu/pci.c
>

Re: [PATCH v3 6/6] PCI: brcmstb: Check return value of clk_prepare_enable()

2021-03-26 Thread Bjorn Helgaas

On Fri, Mar 26, 2021 at 03:19:04PM -0400, Jim Quinlan wrote:
> The check was missing on PCIe resume.

"PCIe resume" isn't really a thing, per se.  PCI/PCIe gives us device
power states (D0, D3hot, etc), and Linux power management builds
suspend/resume on top of those.  Maybe:

  Check for failure of clk_prepare_enable() on device resume.

> Signed-off-by: Jim Quinlan 
> Acked-by: Florian Fainelli 
> Fixes: 8195b7417018 ("PCI: brcmstb: Add suspend and resume pm_ops")
> ---
>  drivers/pci/controller/pcie-brcmstb.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-brcmstb.c 
> b/drivers/pci/controller/pcie-brcmstb.c
> index 2d9288399014..f6d9d785b301 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -1396,7 +1396,9 @@ static int brcm_pcie_resume(struct device *dev)
>   int ret;
>  
>   base = pcie->base;
> - clk_prepare_enable(pcie->clk);
> + ret = clk_prepare_enable(pcie->clk);
> + if (ret)
> + return ret;

This fix doesn't look like it depends on the EP regulator support.
Maybe it should be a preparatory patch before patch 1/6?  It could
then easily be backported to kernels that contain 8195b7417018 but not
EP regulator support.

>   ret = brcm_set_regulators(pcie, TURN_ON);
>   if (ret)
> @@ -1535,7 +1537,9 @@ static int brcm_pcie_probe(struct platform_device *pdev)
>  
>   ret = brcm_pcie_get_regulators(pcie);
>   if (ret) {
> - dev_err(pcie->dev, "failed to get regulators (err=%d)\n", ret);
> + pcie->num_supplies = 0;
> + if (ret != -EPROBE_DEFER)
> + dev_err(pcie->dev, "failed to get regulators 
> (err=%d)\n", ret);

Looks like this hunk might belong somewhere else, e.g., in patch 2/6?
The "Fixes:" line suggests that this patch could/should be backported to
every kernel that contains 8195b7417018, but 8195b7417018 doesn't have
pcie->num_supplies.

>   goto fail;
>   }
>  
> -- 
> 2.17.1
>

Re: [PATCH v3 3/6] PCI: brcmstb: Do not turn off regulators if EP can wake up

2021-03-26 Thread Bjorn Helgaas

On Fri, Mar 26, 2021 at 03:19:01PM -0400, Jim Quinlan wrote:
> If any downstream device may wake up during S2/S3 suspend, we do not want
> to turn off its power when suspending.
> 
> Signed-off-by: Jim Quinlan 
> ---
>  drivers/pci/controller/pcie-brcmstb.c | 58 +++
>  1 file changed, 51 insertions(+), 7 deletions(-)

> +enum {
> + TURN_OFF,   /* Turn egulators off, unless an EP is 
> wakeup-capable */
> + TURN_OFF_ALWAYS,/* Turn Regulators off, no exceptions */
> + TURN_ON,/* Turn regulators on, unless 
> pcie->ep_wakeup_capable */

s/egulators/regulators/
s/Regulators/regulators/

Re: [PATCH v3 2/6] PCI: brcmstb: Add control of EP voltage regulators

2021-03-26 Thread Bjorn Helgaas

On Fri, Mar 26, 2021 at 03:19:00PM -0400, Jim Quinlan wrote:
> Control of EP regulators by the RC is needed because of the chicken-and-egg

Can you expand "EP"?  Not sure if this refers to "endpoint" or
something else.

If this refers to a device in a slot, I guess it isn't necessarily a
PCIe *endpoint*; it could also be a switch upstream port.

> situation: although the regulator is "owned" by the EP and would be best
> handled on its driver, the EP cannot be discovered and probed unless its
> regulator is already turned on.
> 
> Signed-off-by: Jim Quinlan 
> ---
>  drivers/pci/controller/pcie-brcmstb.c | 90 ++-
>  1 file changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-brcmstb.c 
> b/drivers/pci/controller/pcie-brcmstb.c
> index e330e6811f0b..b76ec7d9af32 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -24,6 +24,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -169,6 +170,7 @@
>  #define SSC_STATUS_SSC_MASK  0x400
>  #define SSC_STATUS_PLL_LOCK_MASK 0x800
>  #define PCIE_BRCM_MAX_MEMC   3
> +#define PCIE_BRCM_MAX_EP_REGULATORS  4
>  
>  #define IDX_ADDR(pcie)   
> (pcie->reg_offsets[EXT_CFG_INDEX])
>  #define DATA_ADDR(pcie)  
> (pcie->reg_offsets[EXT_CFG_DATA])
> @@ -295,8 +297,27 @@ struct brcm_pcie {
>   u32 hw_rev;
>   void(*perst_set)(struct brcm_pcie *pcie, u32 val);
>   void(*bridge_sw_init_set)(struct brcm_pcie *pcie, 
> u32 val);
> + struct regulator_bulk_data supplies[PCIE_BRCM_MAX_EP_REGULATORS];
> + unsigned intnum_supplies;
>  };
>  
> +static int brcm_set_regulators(struct brcm_pcie *pcie, bool on)
> +{
> + struct device *dev = pcie->dev;
> + int ret;
> +
> + if (!pcie->num_supplies)
> + return 0;
> + if (on)
> + ret = regulator_bulk_enable(pcie->num_supplies, pcie->supplies);
> + else
> + ret = regulator_bulk_disable(pcie->num_supplies, 
> pcie->supplies);
> + if (ret)
> + dev_err(dev, "failed to %s EP regulators\n",
> + on ? "enable" : "disable");
> + return ret;
> +}
> +
>  /*
>   * This is to convert the size of the inbound "BAR" region to the
>   * non-linear values of PCIE_X_MISC_RC_BAR[123]_CONFIG_LO.SIZE
> @@ -1141,16 +1162,63 @@ static void brcm_pcie_turn_off(struct brcm_pcie *pcie)
>   pcie->bridge_sw_init_set(pcie, 1);
>  }
>  
> +static int brcm_pcie_get_regulators(struct brcm_pcie *pcie)
> +{
> + struct device_node *child, *parent = pcie->np;
> + const unsigned int max_name_len = 64 + 4;
> + struct property *pp;
> +
> + /* Look for regulator supply property in the EP device subnodes */
> + for_each_available_child_of_node(parent, child) {
> + /*
> +  * Do a santiy test to ensure that this is an EP node

s/santiy/sanity/

> +  * (e.g. node name: "pci-ep@0,0").  The slot number
> +  * should always be 0 as our controller only has a single
> +  * port.
> +  */
> + const char *p = strstr(child->full_name, "@0");
> +
> + if (!p || (p[2] && p[2] != ','))
> + continue;
> +
> + /* Now look for regulator supply properties */
> + for_each_property_of_node(child, pp) {
> + int i, n = strnlen(pp->name, max_name_len);
> +
> + if (n <= 7 || strncmp("-supply", &pp->name[n - 7], 7))
> + continue;
> +
> + /* Make sure this is not a duplicate */
> + for (i = 0; i < pcie->num_supplies; i++)
> + if (strncmp(pcie->supplies[i].supply,
> + pp->name, max_name_len) == 0)
> + continue;
> +
> + if (pcie->num_supplies < PCIE_BRCM_MAX_EP_REGULATORS)
> + pcie->supplies[pcie->num_supplies++].supply = 
> pp->name;
> + else
> + dev_warn(pcie->dev, "No room for EP supply 
> %s\n",
> +  pp->name);
> + }
> + }
> + /*
> +  * Get the regulators that the EP devices require.  We cannot use
> +  * pcie->dev as the device argument in regulator_bulk_get() since
> +  * it will not find the regulators.  Instead, use NULL and the
> +  * regulators are looked up by their name.

The comment doesn't explain the interesting part of why you need NULL
instead of "pcie->dev".  I assume it has something to do with the
platform topology and its DT description.

This appears to be the only instance in the whole kernel of a use of
regulator_bulk_get() or devm_regulator_bulk_get() with NULL.

Re: [PATCH] PCI: Allow drivers to claim exclusive access to config regions

2021-03-26 Thread Bjorn Helgaas

[+cc Christoph]

On Wed, Mar 24, 2021 at 06:23:54PM -0700, Dan Williams wrote:
> The PCIE Data Object Exchange (DOE) mailbox is a protocol run over
> configuration cycles. It assumes one initiator at a time is
> reading/writing the data registers. If userspace reads from the response
> data payload it may steal data that a kernel driver was expecting to
> read. If userspace writes to the request payload it may corrupt the
> request a driver was trying to send.

IIUC the problem we're talking about is that userspace config access,
e.g., via "lspci" or "setpci" may interfere with kernel usage of DOE.
I attached what I think are the relevant bits from the spec.

It looks to me like config *reads* should not be a problem: A read of
Write Data Mailbox always returns 0 and looks innocuous.  A userspace
read of Read Data Mailbox may return a DW of the data object, but it
doesn't advance the cursor, so it shouldn't interfere with a kernel
read.  

A write to Write Data Mailbox could obviously corrupt an object being
written to the device.  A config write to Read Data Mailbox *does*
advance the cursor, so that would definitely interfere with a kernel
user.  

So I think we're really talking about an issue with "setpci" and I
don't expect "lspci" to be a problem.  "setpci" is a valuable tool,
and the fact that it can hose your system is not really news.  I don't
know how hard we should work to protect against that.
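
Here is my reading of the Read Data Mailbox rules as a tiny userspace model, in case I'm misreading the spec text below.  The names (struct doe_model, doe_read_mailbox_read(), doe_read_mailbox_write()) are made up.  What a read returns when Data Object Ready is clear is left out on purpose, so the read helper just reports "not ready".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct doe_model {
	const uint32_t *obj;	/* data object the device wants to return */
	size_t len;		/* length in DWs */
	size_t cursor;		/* index of the current DW */
};

/* Data Object Ready stays set while DWs remain to be acknowledged. */
static bool doe_ready(const struct doe_model *d)
{
	return d->cursor < d->len;
}

/* Config read of Read Data Mailbox: current DW, cursor unchanged. */
static bool doe_read_mailbox_read(const struct doe_model *d, uint32_t *val)
{
	if (!doe_ready(d))
		return false;
	*val = d->obj[d->cursor];
	return true;
}

/* Config write of Read Data Mailbox: any value acks the current DW. */
static void doe_read_mailbox_write(struct doe_model *d, uint32_t ignored)
{
	(void)ignored;
	if (doe_ready(d))	/* no effect when Data Object Ready is clear */
		d->cursor++;
}
```

In this model a stray read (lspci-style) sees the same DW the kernel will see next, while a stray write (setpci-style) makes the kernel skip a DW.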

Bjorn

From PCIe r6.0 v0.7 draft (sec 7.9.24):

  DOE Control Register

DOE Go – A write of 1b to this bit indicates to the DOE instance
that it can start consuming the data object transferred through the
DOE Write Data Mailbox Register.

Behavior is undefined if the DOE Go bit is Set before the entire
data object has been written to the DOE Write Data Mailbox Register.

Behavior is undefined if the DOE Go bit is written with 1b when the
DOE Busy bit is Set.

Reads from this bit must always return 0b.

  DOE Write Data Mailbox Register

DOE Write Data Mailbox – The DOE instance receives data objects
via writes to this register.

A successfully completed write to this register adds one DW to the
incoming data object.

Setting the DOE Go bit in the DOE Control Register indicates to
the DOE Instance that the final DW of the data object has been
written to this register.

Reads of this register must return all 0’s.

  DOE Read Data Mailbox Register

DOE Read Data Mailbox – If the Data Object Ready bit is Set, a
read of this register returns the current DW of the data object.

A write of any value to this register indicates a successful
transfer of the current data object DW, and the DOE instance must
return the next DW in the data object upon the next read of this
register as long as the Data Object Ready bit remains Set.

It is permitted for multiple data objects to be read from this
register back-to-back. When this scenario occurs the Data Object
Ready bit will remain Set until the final DW is read.

A write of any value to this register when the Data Object Ready
bit is Clear must have no effect.

The value read from this register when Data Object Ready is Clear
must be  h.

> Introduce pci_{request,release}_config_region() for a driver to exclude
> the possibility of userspace induced corruption while accessing the DOE
> mailbox. Likely there are other configuration state assumptions that a
> driver may want to assert are under its exclusive control, so this
> capability is not limited to any specific configuration range.
> 
> Since writes are targeted and are already prepared for failure the
> entire request is failed. The same can not be done for reads as the
> device completely disappears from lspci output if any configuration
> register in the request is exclusive. Instead skip the actual
> configuration cycle on a per-access basis and return all f's as if the
> read had failed.
> 
> Cc: Bjorn Helgaas 
> Cc: Greg Kroah-Hartman 
> Cc: Jonathan Cameron 
> Signed-off-by: Dan Williams 
> ---
>  drivers/pci/access.c|5 +++--
>  drivers/pci/pci-sysfs.c |3 +++
>  drivers/pci/probe.c |5 +
>  include/linux/ioport.h  |2 ++
>  include/linux/pci.h |   16 
>  kernel/resource.c   |   24 +++-
>  6 files changed, 40 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 46935695cfb9..a6b3cdfbd505 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -225,8 +225,9 @@ int pci_user_read_config_##size			\
>   raw_spin_lock_irq(&pci_lock);   \
>   if (unlikely(dev->

Re: [PATCH 2/4] PCI: j721e: Add PCI legacy interrupt support for J721E

2021-03-25 Thread Bjorn Helgaas

I'd promote J721E earlier in subject so it doesn't get truncated, e.g.,

  PCI: j721e: Add J721E PCI legacy interrupt support

On Thu, Mar 25, 2021 at 02:39:34PM +0530, Kishon Vijay Abraham I wrote:

> +static void j721e_pcie_legacy_irq_handler(struct irq_desc *desc)
> +{
> + int i;
> + u32 reg;
> + int virq;
> + struct j721e_pcie *pcie = irq_desc_get_handler_data(desc);
> + struct irq_chip *chip = irq_desc_get_chip(desc);

The rest of this driver sorts locals in order of use, e.g.,

  struct j721e_pcie *pcie = irq_desc_get_handler_data(desc);
  struct irq_chip *chip = irq_desc_get_chip(desc);
  int i;
  u32 reg;
  int virq;

> + chained_irq_enter(chip, desc);
> +
> + for (i = 0; i < PCI_NUM_INTX; i++) {
> + reg = j721e_pcie_intd_readl(pcie, STATUS_REG_SYS_0);
> + if (!(reg & INTx_EN(i)))
> + continue;
> +
> + virq = irq_find_mapping(pcie->legacy_irq_domain, 3 - i);

Whitespace error (should be indented another tab, I think).

> + generic_handle_irq(virq);
> + j721e_pcie_intd_writel(pcie, STATUS_CLR_REG_SYS_0, INTx_EN(i));
> + j721e_pcie_intd_writel(pcie, EOI_REG, 3 - i);
> + }
> +
> + chained_irq_exit(chip, desc);
> +}

Re: [PATCH] PCI: dwc: Move forward the iATU detection process

2021-03-25 Thread Bjorn Helgaas

On Thu, Mar 25, 2021 at 10:24:28AM +0100, Marek Szyprowski wrote:
> On 25.01.2021 05:48, Zhiqiang Hou wrote:
> > From: Hou Zhiqiang 
> >
> > In the dw_pcie_ep_init(), it depends on the detected iATU region
> > numbers to allocate the in/outbound window management bit map.
> > It fails after the commit 281f1f99cf3a ("PCI: dwc: Detect number
> > of iATU windows").
> >
> > So this patch move the iATU region detection into a new function,
> > move forward the detection to the very beginning of functions
> > dw_pcie_host_init() and dw_pcie_ep_init(). And also remove it
> > from the dw_pcie_setup(), since it's more like a software
> > perspective initialization step than hardware setup.
> >
> > Fixes: 281f1f99cf3a ("PCI: dwc: Detect number of iATU windows")
> > Signed-off-by: Hou Zhiqiang 
> 
> This patch causes exynos-pcie to hang during the initialization. It 
> looks that some resources are not enabled yet, so calling 
> dw_pcie_iatu_detect() much earlier causes a hang. When I have some time, 
> I will try to identify what is needed to call it properly.

Thanks, I dropped it for now.  We can add it back after we figure out
what the exynos issue is.

For reference, here's the patch I dropped (I had made some minor
corrections to the commit log):

commit fd4162f05194 ("PCI: dwc: Move iATU detection earlier")
Author: Hou Zhiqiang 
Date:   Mon Jan 25 12:48:03 2021 +0800

PCI: dwc: Move iATU detection earlier

dw_pcie_ep_init() depends on the detected iATU region numbers to allocate
the in/outbound window management bitmap.  It fails after 281f1f99cf3a
("PCI: dwc: Detect number of iATU windows").

Move the iATU region detection into a new function, move the detection to
the very beginning of dw_pcie_host_init() and dw_pcie_ep_init().  Also
remove it from the dw_pcie_setup(), since it's more like a software
initialization step than hardware setup.

Fixes: 281f1f99cf3a ("PCI: dwc: Detect number of iATU windows")
Link: https://lore.kernel.org/r/20210125044803.4310-1-zhiqiang@nxp.com
Tested-by: Kunihiko Hayashi 
Signed-off-by: Hou Zhiqiang 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rob Herring 
Cc: sta...@vger.kernel.org  # v5.11+

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 1c25d8337151..8d028a88b375 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -705,6 +705,8 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
 		}
 	}
 
+	dw_pcie_iatu_detect(pci);
+
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "addr_space");
 	if (!res)
 		return -EINVAL;
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 7e55b2b66182..52f6887179cd 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -319,6 +319,8 @@ int dw_pcie_host_init(struct pcie_port *pp)
 			return PTR_ERR(pci->dbi_base);
 	}
 
+	dw_pcie_iatu_detect(pci);
+
 	bridge = devm_pci_alloc_host_bridge(dev, 0);
 	if (!bridge)
 		return -ENOMEM;
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 004cb860e266..a945f0c0e73d 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -660,11 +660,9 @@ static void dw_pcie_iatu_detect_regions(struct dw_pcie *pci)
 	pci->num_ob_windows = ob;
 }
 
-void dw_pcie_setup(struct dw_pcie *pci)
+void dw_pcie_iatu_detect(struct dw_pcie *pci)
 {
-	u32 val;
 	struct device *dev = pci->dev;
-	struct device_node *np = dev->of_node;
 	struct platform_device *pdev = to_platform_device(dev);
 
 	if (pci->version >= 0x480A || (!pci->version &&
@@ -693,6 +691,13 @@ void dw_pcie_setup(struct dw_pcie *pci)
 
 	dev_info(pci->dev, "Detected iATU regions: %u outbound, %u inbound",
 		 pci->num_ob_windows, pci->num_ib_windows);
+}
+
+void dw_pcie_setup(struct dw_pcie *pci)
+{
+	u32 val;
+	struct device *dev = pci->dev;
+	struct device_node *np = dev->of_node;
 
 	if (pci->link_gen > 0)
 		dw_pcie_link_set_max_speed(pci, pci->link_gen);
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 7247c8b01f04..7d6e9b7576be 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -306,6 +306,7 @@ int dw_pcie_prog_inbound_atu(struct dw_pcie *pci, u8 
func_no, int in

Re: [PATCH v5 1/4] PCI: Introduce pcim_alloc_irq_vectors()

2021-03-23 Thread Bjorn Helgaas

[+cc Christoph, Thomas, Alexander, in case you're interested]
[+cc Jonathan, Kurt, Logan: vmd.c and switchtec.c use managed resources
and pci_alloc_irq_vectors()]

On Fri, Feb 26, 2021 at 11:50:53PM +0800, Dejin Zheng wrote:
> Introduce pcim_alloc_irq_vectors(), a device-managed version of
> pci_alloc_irq_vectors(). Introducing this function can simplify
> the error handling path in many drivers.
> 
> And use pci_free_irq_vectors() to replace some code in pcim_release(),
> they are equivalent, and no functional change. It is more explicit
> that pcim_alloc_irq_vectors() is a device-managed function.
> 
> Suggested-by: Andy Shevchenko 
> Signed-off-by: Dejin Zheng 

Acked-by: Bjorn Helgaas 

Let me know if you'd like me to take the series.
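
For anyone skimming, the behavior of the helper in the quoted patch can
be modeled in a few lines of userspace C.  This is only a sketch under
assumptions: struct model_dev and the model_* functions are made-up
stand-ins for struct pci_dev, pcim_enable_device(),
pci_alloc_irq_vectors(), and pcim_release(), and "avail" is an invented
knob for how many vectors the device can provide.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; only what the model needs. */
struct model_dev {
	bool managed;	/* set by the pcim_enable_device() stand-in */
	int nvecs;	/* vectors currently allocated */
	int avail;	/* vectors the modeled device can provide */
};

/* Stand-in for pcim_enable_device(): marks the device as managed. */
static void model_enable_managed(struct model_dev *dev)
{
	dev->managed = true;
}

/* Stand-in for pci_alloc_irq_vectors(): -ENOSPC below min_vecs. */
static int model_alloc_vectors(struct model_dev *dev, int min_vecs,
			       int max_vecs)
{
	int n = dev->avail < max_vecs ? dev->avail : max_vecs;

	if (n < min_vecs)
		return -ENOSPC;
	dev->nvecs = n;
	return n;
}

/* Mirrors the quoted helper: refuse unmanaged devices, else plain alloc. */
static int model_managed_alloc_vectors(struct model_dev *dev, int min_vecs,
				       int max_vecs)
{
	if (!dev->managed)
		return -EINVAL;
	return model_alloc_vectors(dev, min_vecs, max_vecs);
}

/* Stand-in for pcim_release(): runs at driver detach and frees vectors. */
static void model_release(struct model_dev *dev)
{
	if (dev->managed)
		dev->nvecs = 0;
}
```

The driver-visible difference is that there is no explicit free call on
the error or remove paths; the release hook does it.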

> ---
> v4 -> v5:
>   - Remove the check of enable device in pcim_alloc_irq_vectors()
> and make it as a static line function.
> v3 -> v4:
>   - No change
> v2 -> v3:
>   - Add some commit comments for replace some codes in
> pcim_release() by pci_free_irq_vectors().
> v1 -> v2:
>   - Use pci_free_irq_vectors() to replace some code in
> pcim_release().
>   - Modify some commit messages.
> 
>  drivers/pci/pci.c   |  5 +
>  include/linux/pci.h | 24 
>  2 files changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 16a17215f633..fecfdc0add2f 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1969,10 +1969,7 @@ static void pcim_release(struct device *gendev, void 
> *res)
>   struct pci_devres *this = res;
>   int i;
>  
> - if (dev->msi_enabled)
> - pci_disable_msi(dev);
> - if (dev->msix_enabled)
> - pci_disable_msix(dev);
> + pci_free_irq_vectors(dev);
>  
>   for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
>   if (this->region_mask & (1 << i))
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..5cafd7d65fd7 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1818,6 +1818,30 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned 
> int min_vecs,
> NULL);
>  }
>  
> +/**
> + * pcim_alloc_irq_vectors - a device-managed pci_alloc_irq_vectors()
> + * @dev: PCI device to operate on
> + * @min_vecs:	minimum number of vectors required (must be >= 1)
> + * @max_vecs:maximum (desired) number of vectors
> + * @flags:   flags or quirks for the allocation
> + *
> + * Return the number of vectors allocated, (which might be smaller than
> + * @max_vecs) if successful, or a negative error code on error. If less
> + * than @min_vecs interrupt vectors are available for @dev the function
> + * will fail with -ENOSPC.
> + *
> + * It depends on calling pcim_enable_device() to make IRQ resources
> + * manageable.
> + */
> +static inline int
> +pcim_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
> + unsigned int max_vecs, unsigned int flags)
> +{
> + if (!pci_is_managed(dev))
> + return -EINVAL;
> + return pci_alloc_irq_vectors(dev, min_vecs, max_vecs, flags);
> +}
> +
>  /* Include architecture-dependent settings and functions */
>  
>  #include 
> -- 
> 2.25.0
>

Re: [PATCH] PCI: hotplug: fix null-ptr-dereferencd in cpcihp error path

2021-03-23 Thread Bjorn Helgaas

On Sun, Mar 21, 2021 at 01:51:08AM -0400, Tong Zhang wrote:
> There is an issue in the error path, in which cpci_thread may remain NULL.
> Calling kthread_stop(cpci_thread) will trigger a BUG().
> It is better to check whether the thread is really created and started
> before stopping it.
> 
> [1.292859] BUG: kernel NULL pointer dereference, address: 0028
> [1.293252] #PF: supervisor write access in kernel mode
> [1.293533] #PF: error_code(0x0002) - not-present page
> [1.295163] RIP: 0010:kthread_stop+0x22/0x170
> [1.300491] Call Trace:
> [1.300628]  cpci_hp_unregister_controller+0xf6/0x130
> [1.300906]  zt5550_hc_init_one+0x27a/0x27f [cpcihp_zt5550]

Wow, I didn't know anybody actually used this driver :)

> Signed-off-by: Tong Zhang 
> ---
>  drivers/pci/hotplug/cpci_hotplug_core.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c 
> b/drivers/pci/hotplug/cpci_hotplug_core.c
> index d0559d2faf50..b44da397d631 100644
> --- a/drivers/pci/hotplug/cpci_hotplug_core.c
> +++ b/drivers/pci/hotplug/cpci_hotplug_core.c
> @@ -47,7 +47,7 @@ static atomic_t extracting;
>  int cpci_debug;
>  static struct cpci_hp_controller *controller;
>  static struct task_struct *cpci_thread;
> -static int thread_finished;
> +static int thread_started;

Why are we messing around with "thread_started" or "thread_finished"?
We should know whether cpci_thread has been started by the control
flow.

There are two ways to start cpci_thread:

  1)  cpcihp_generic_init                # module_init function
        cpci_hp_start
          cpci_start_thread
            cpci_thread = kthread_run(...)

  2)  zt5550_hc_init_one                 # .probe function
        cpci_hp_start
          cpci_start_thread
            cpci_thread = kthread_run(...)

cpci_hp_start() returns a non-zero error if kthread_run() fails, and 
both cpcihp_generic_init() and zt5550_hc_init_one() clean up and exit
in that case.

The error cleanup is a little sloppy: if cpci_hp_register_bus() fails,
cpcihp_generic_init() calls cpci_hp_unregister_controller(), which
stops cpci_thread if it has been started.  But in that case, we *know*
there's no cpci_thread because we haven't even tried to start it.  I
think this error cleanup could be done better by splitting the
cpci_stop_thread() out from cpci_hp_unregister_controller() so it
could be done separately.  zt5550_hc_init_one() has a similar problem.

If cpcihp_generic_init() or zt5550_hc_init_one() succeeds, we *know*
there is a cpci_thread.  We should be able to call kthread_stop() on
it unconditionally in the cpcihp_generic_exit() and
zt5550_hc_remove_one() paths.

What do you think?  It's a little more restructuring work, but I think
"thread_started" and "thread_finished" are basically kind of kludgy
and they add complication without giving me confidence that they're
actually correct.
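
Roughly the shape I have in mind, as a userspace model rather than a
patch.  All the model_* names and the fail_* knobs are invented for
illustration; the real init paths also have bus unregistration and
other cleanup that is left out here.

```c
#include <stdbool.h>

/* Hypothetical state standing in for the cpci core; not the driver's code. */
static bool controller_registered;
static bool thread_running;
static int stop_calls;

/* Knobs so each failure path can be exercised. */
static int fail_register_bus;
static int fail_start;

static int model_register_controller(void)
{
	controller_registered = true;
	return 0;
}

/* Unregister no longer touches the thread at all. */
static void model_unregister_controller(void)
{
	controller_registered = false;
}

static int model_register_bus(void)
{
	return fail_register_bus ? -1 : 0;
}

static int model_start(void)
{
	if (fail_start)
		return -1;	/* kthread_run() failed: no thread exists */
	thread_running = true;
	return 0;
}

static void model_stop_thread(void)
{
	stop_calls++;
	thread_running = false;
}

/* Init: every error exit happens while we know there is no thread. */
static int model_init(void)
{
	int rc;

	rc = model_register_controller();
	if (rc)
		return rc;
	rc = model_register_bus();
	if (rc)
		goto err_unregister;	/* thread was never started */
	rc = model_start();
	if (rc)
		goto err_unregister;	/* start failed, still no thread */
	return 0;

err_unregister:
	model_unregister_controller();
	return rc;
}

/* Exit: init succeeded, so the thread exists; stop it unconditionally. */
static void model_exit(void)
{
	model_stop_thread();
	model_unregister_controller();
}
```

With the stop split out like this, no flag is needed: the control flow
itself says whether there is a thread to stop.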

>  static int enable_slot(struct hotplug_slot *slot);
>  static int disable_slot(struct hotplug_slot *slot);
> @@ -447,7 +447,7 @@ event_thread(void *data)
>   msleep(500);
>   } else if (rc < 0) {
>   dbg("%s - error checking slots", __func__);
> - thread_finished = 1;
> + thread_started = 0;
>   goto out;
>   }
>   } while (atomic_read(&extracting) && !kthread_should_stop());
> @@ -479,7 +479,7 @@ poll_thread(void *data)
>   msleep(500);
>   } else if (rc < 0) {
>   dbg("%s - error checking slots", __func__);
> - thread_finished = 1;
> + thread_started = 0;
>   goto out;
>   }
>   } while (atomic_read(&extracting) && !kthread_should_stop());
> @@ -501,7 +501,7 @@ cpci_start_thread(void)
>   err("Can't start up our thread");
>   return PTR_ERR(cpci_thread);
>   }
> - thread_finished = 0;
> + thread_started = 1;
>   return 0;
>  }
>  
> @@ -509,7 +509,7 @@ static void
>  cpci_stop_thread(void)
>  {
>   kthread_stop(cpci_thread);
> - thread_finished = 1;
> + thread_started = 0;
>  }
>  
>  int
> @@ -571,7 +571,7 @@ cpci_hp_unregister_controller(struct cpci_hp_controller 
> *old_controller)
>   int status = 0;
>  
>   if (controller) {
> - if (!thread_finished)
> + if (thread_started)
>   cpci_stop_thread();
>   if (controller->irq)
>   free_irq(controller->irq, controller->dev_id);
> -- 
> 2.25.1
>

Re: [RESEND PATCH V6 2/2] PCI: sprd: Add support for Unisoc SoCs' PCIe controller

2021-03-23 Thread Bjorn Helgaas

On Mon, Mar 22, 2021 at 05:18:31PM +0800, Chunyan Zhang wrote:
> From: Hongtao Wu 
> 
> This series adds PCIe controller driver for Unisoc SoCs.
> This controller is based on DesignWare PCIe IP.
> 
> Signed-off-by: Hongtao Wu 
> Signed-off-by: Chunyan Zhang 
> ---
>  drivers/pci/controller/dwc/Kconfig |  12 +
>  drivers/pci/controller/dwc/Makefile|   1 +
>  drivers/pci/controller/dwc/pcie-sprd.c | 292 +
>  3 files changed, 305 insertions(+)
>  create mode 100644 drivers/pci/controller/dwc/pcie-sprd.c
> 
> diff --git a/drivers/pci/controller/dwc/Kconfig 
> b/drivers/pci/controller/dwc/Kconfig
> index 22c5529e9a65..61f0b79f963d 100644
> --- a/drivers/pci/controller/dwc/Kconfig
> +++ b/drivers/pci/controller/dwc/Kconfig
> @@ -318,4 +318,16 @@ config PCIE_AL
> required only for DT-based platforms. ACPI platforms with the
> Annapurna Labs PCIe controller don't need to enable this.
>  
> +config PCIE_SPRD

Maybe you want PCIE_SPRD_HOST for this one so there's room for a
PCIE_SPRD_EP someday?

> + tristate "Unisoc PCIe controller - Host Mode"
> + depends on ARCH_SPRD || COMPILE_TEST
> + depends on PCI_MSI_IRQ_DOMAIN
> + select PCIE_DW_HOST
> + help
> +   Unisoc PCIe controller uses the DesignWare core. It can be configured
> +   as an Endpoint (EP) or a Root complex (RC). In order to enable host
> +   mode (the controller works as RC), PCIE_SPRD must be selected.
> +   Say Y or M here if you want to PCIe RC controller support on Unisoc
> +   SoCs.
> +
>  endmenu
> diff --git a/drivers/pci/controller/dwc/Makefile 
> b/drivers/pci/controller/dwc/Makefile
> index a751553fa0db..eb546e97c14a 100644
> --- a/drivers/pci/controller/dwc/Makefile
> +++ b/drivers/pci/controller/dwc/Makefile
> @@ -20,6 +20,7 @@ obj-$(CONFIG_PCI_MESON) += pci-meson.o
>  obj-$(CONFIG_PCIE_TEGRA194) += pcie-tegra194.o
>  obj-$(CONFIG_PCIE_UNIPHIER) += pcie-uniphier.o
>  obj-$(CONFIG_PCIE_UNIPHIER_EP) += pcie-uniphier-ep.o
> +obj-$(CONFIG_PCIE_SPRD) += pcie-sprd.o
>  
>  # The following drivers are for devices that use the generic ACPI
>  # pci_root.c driver but don't support standard ECAM config access.
> diff --git a/drivers/pci/controller/dwc/pcie-sprd.c 
> b/drivers/pci/controller/dwc/pcie-sprd.c
> new file mode 100644
> index ..2ccb99eda24f
> --- /dev/null
> +++ b/drivers/pci/controller/dwc/pcie-sprd.c
> @@ -0,0 +1,292 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * PCIe host controller driver for Unisoc SoCs
> + *
> + * Copyright (C) 2020-2021 Unisoc, Inc.
> + *
> + * Author: Hongtao Wu 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "pcie-designware.h"
> +
> +/* aon apb syscon */
> +#define IPA_ACCESS_CFG   0xcd8
> +#define  AON_ACCESS_PCIE_EN  BIT(1)
> +
> +/* pmu apb syscon */
> +#define SNPS_PCIE3_SLP_CTRL  0xac
> +#define  PERST_N_ASSERT  BIT(1)
> +#define  PERST_N_AUTO_EN BIT(0)
> +#define PD_PCIE_CFG_0		0x3e8
> +#define  PCIE_FORCE_SHUTDOWN BIT(25)
> +
> +#define PCIE_SS_REG_BASE 0xE00

Pick uppercase or lowercase for your hex constants and use it
consistently.

> +#define APB_CLKFREQ_TIMEOUT  0x4
> +#define  BUSERR_EN   BIT(12)
> +#define  APB_TIMER_DIS   BIT(10)
> +#define  APB_TIMER_LIMIT GENMASK(31, 16)
> +
> +#define PE0_GEN_CTRL_3   0x58
> +#define  LTSSM_EN		BIT(0)
> +
> +struct sprd_pcie_soc_data {
> + u32 syscon_offset;
> +};
> +
> +static const struct sprd_pcie_soc_data ums9520_syscon_data = {
> + .syscon_offset = 0x1000,/* The offset of set/clear register */
> +};
> +
> +struct sprd_pcie {
> + u32 syscon_offset;
> + struct device   *dev;
> + struct dw_pcie  *pci;
> + struct regmap   *aon_map;
> + struct regmap   *pmu_map;
> + const struct sprd_pcie_soc_data *socdata;
> +};
> +
> +enum sprd_pcie_syscon_type {
> + normal_syscon,  /* it's not a set/clear register */
> + set_syscon, /* set a set/clear register */
> + clr_syscon, /* clear a set/clear register */
> +};
> +
> +static void sprd_pcie_buserr_enable(struct dw_pcie *pci)
> +{
> + u32 val;
> +
> + val = dw_pcie_readl_dbi(pci, PCIE_SS_REG_BASE + APB_CLKFREQ_TIMEOUT);
> + val &= ~APB_TIMER_DIS;
> + val |= BUSERR_EN;
> + val |= APB_TIMER_LIMIT & (0x1f4 << 16);
> + dw_pcie_writel_dbi(pci, PCIE_SS_REG_BASE + APB_CLKFREQ_TIMEOUT, val);
> +}
> +
> +static void sprd_pcie_ltssm_enable(struct dw_pcie *pci, bool enable)
> +{
> + u32 val;
> +
> + val = dw_pcie_readl_dbi(pci, PCIE_SS_REG_BASE + PE0_GEN_CTRL_3);
> + if (enable)
> + dw_pcie_writel_dbi(pci, PCIE_SS_REG_BASE + PE0_GEN_CTRL_3,
> +val | LTSSM_EN);
> + else
> + dw_pcie_writel_dbi(pci,

Re: [PATCH] pci: fix memory leak when virtio pci hotplug

2021-03-23 Thread Bjorn Helgaas

On Sun, Mar 21, 2021 at 11:29:30PM +0800, Zhiqiang Liu wrote:
> From: Feilong Lin 
> 
> Repeated hot-plugging of pci devices for a virtual
> machine driven by virtio, we found that there is a
> leak in kmalloc-4k, which was confirmed as the memory
> of the pci_device structure. Then we found out that
> it was missing pci_dev_put() after pci_get_slot() in
> enable_slot() of acpiphp_glue.c.
> 
> Signed-off-by: Feilong Lin 
> Reviewed-by: Zhiqiang Liu 

Since this came from you, Zhiqiang, it needs a signed-off-by (not just
a reviewed-by) from you.  See
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=v5.11#n361

Also see
https://lore.kernel.org/r/20171026223701.ga25...@bhelgaas-glaptop.roam.corp.google.com
and

  - Wrap commit log to fill 80 columns
  - s/pci/PCI/ (subject and commit log)
  - Run "git log --oneline drivers/pci/hotplug/acpiphp_glue.c".  It's
not completely consistent, but at least match the style of one of
them.

There is no "pci_device" structure.  I think you mean the "struct
pci_dev".

The commit log doesn't actually say what the patch does.  It's obvious
from the patch, but it should say in the commit log.  Look at previous
commit logs to see how they do it.
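
For the commit log, it may help to spell out the get/put pairing.  A
minimal userspace model of it, with made-up names (struct model_pci_dev
and the model_* functions stand in for struct pci_dev, pci_get_slot(),
and pci_dev_put(); the loop is only the general shape of a loop over
slot functions, not the acpiphp code):

```c
/* Hypothetical refcounted object standing in for struct pci_dev. */
struct model_pci_dev {
	int refcount;
};

/* Like pci_get_slot(): returns the device with its refcount raised. */
static struct model_pci_dev *model_get_slot(struct model_pci_dev *dev)
{
	dev->refcount++;
	return dev;
}

/* Like pci_dev_put(): drops one reference. */
static void model_dev_put(struct model_pci_dev *dev)
{
	dev->refcount--;
}

/* One pass over the slot's functions; put_ref selects the fixed behavior. */
static void model_enable_slot(struct model_pci_dev *devs, int n, int put_ref)
{
	for (int i = 0; i < n; i++) {
		struct model_pci_dev *dev = model_get_slot(&devs[i]);

		/* ... per-function work would go here ... */
		if (put_ref)
			model_dev_put(dev);
	}
}
```

Without the put, every pass leaves each refcount one higher, so the
object can never be freed; with it, the count returns to where it
started.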

> ---
>  drivers/pci/hotplug/acpiphp_glue.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c 
> b/drivers/pci/hotplug/acpiphp_glue.c
> index 3365c93abf0e..f031302ad401 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -533,6 +533,7 @@ static void enable_slot(struct acpiphp_slot *slot, bool 
> bridge)
>   slot->flags &= ~SLOT_ENABLED;
>   continue;
>   }
> + pci_dev_put(dev);
>   }
>  }
> 
> -- 
> 2.19.1
> 
>

Re: [PATCH RESEND] PCI: dwc: Fix MSI not work after resume

2021-03-23 Thread Bjorn Helgaas

[-cc Dilip (mail to him bounced)]

On Tue, Mar 23, 2021 at 11:01:15AM +0800, Jisheng Zhang wrote:
> On Mon, 22 Mar 2021 20:24:41 -0500 Bjorn Helgaas wrote:
> > 
> > [+cc Kishon, Richard, Lucas, Dilip]
> > 
> > On Mon, Mar 01, 2021 at 11:10:31AM +0800, Jisheng Zhang wrote:
> > > After we move dw_pcie_msi_init() into core -- dw_pcie_host_init(), the
> > > MSI stops working after resume. Because dw_pcie_host_init() is only
> > > called once during probe. To fix this issue, we move dw_pcie_msi_init()
> > > to dw_pcie_setup_rc().  
> > 
> > This patch looks fine, but I don't think the commit log tells the
> > whole story.
> > 
> > Prior to 59fbab1ae40e, it looks like the only dwc-based drivers with
> > resume functions were dra7xx, imx6, intel-gw, and tegra [1].
> > 
> > Only tegra called dw_pcie_msi_init() in the resume path, and I do
> > think 59fbab1ae40e broke MSI after resume because it removed the
> > dw_pcie_msi_init() call from tegra_pcie_enable_msi_interrupts().
> > 
> > I'm not convinced this patch fixes it reliably, though.  The call
> > chain looks like this:
> > 
> >   tegra_pcie_dw_resume_noirq
> >     tegra_pcie_dw_start_link
> >       if (dw_pcie_wait_for_link(pci))
> >         dw_pcie_setup_rc
> > 
> > dw_pcie_wait_for_link() returns 0 if the link is up, so we only call
> > dw_pcie_setup_rc() in the case where the link *didn't* come up.  If
> > the link comes up nicely without retry, we won't call
> > dw_pcie_setup_rc() and hence won't call dw_pcie_msi_init().
> 
> The v1 version patch was sent before commit 275e88b06a ("PCI: tegra: Fix host
> link initialization"). At that time, the resume path looks like this:
> 
> tegra_pcie_dw_resume_noirq
>   tegra_pcie_dw_host_init
>     tegra_pcie_prepare_host
>       dw_pcie_setup_rc
> 
> so after patch, dw_pcie_msi_init() will be called. But now it seems that
> the tegra version needs one more fix for the resume.
> 
> So could I send a new patch to update the commit-msg a bit?

This patch only touches the dwc core, and the commit log says
generically that it fixes MSI after resume, so one could assume that
it applies to all dwc-based drivers.  But I don't think it's that
simple, so I'd like to know *which* drivers are fixed and which
commits are related.  I don't see how 59fbab1ae40e breaks anything
except tegra.

> > Since then, exynos added a resume function.  My guess is MSI never
> > worked after resume for dra7xx, exynos, imx6, and intel-gw because
> > they don't call dw_pcie_msi_init() in their resume functions.
> > 
> > This patch looks like it should fix MSI after resume for exynos, imx6,
> > and intel-gw because they *do* call dw_pcie_setup_rc() from their
> > resume functions [2], and after this patch, dw_pcie_msi_init() will be
> > called from there.
> > 
> > I suspect MSI after resume still doesn't work on dra7xx.
> 
> I checked the dra7xx history, I'm afraid that the resume never works
> from the beginning if the host lost power during suspend, I guess the
> platform never power off the host but only the phy?

Sounds like that would make sense.

> > [1] git grep -A20 -e "static.*resume_noirq" 59fbab1ae40e^:drivers/pci/controller/dwc
> > [2] git grep -A20 -e "static.*resume_noirq" drivers/pci/controller/dwc
> > 
> > > Fixes: 59fbab1ae40e ("PCI: dwc: Move dw_pcie_msi_init() into core")
> > > Reviewed-by: Rob Herring 
> > > Signed-off-by: Jisheng Zhang 
> > > ---
> > > Since v1:
> > >  - collect Reviewed-by tag
> > >
> > >  drivers/pci/controller/dwc/pcie-designware-host.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> > > b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > index 7e55b2b66182..e6c274f4485c 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > @@ -400,7 +400,6 @@ int dw_pcie_host_init(struct pcie_port *pp)
> > >   }
> > >
> > >   dw_pcie_setup_rc(pp);
> > > - dw_pcie_msi_init(pp);
> > >
> > >   if (!dw_pcie_link_up(pci) && pci->ops && pci->ops->start_link) {
> > >   ret = pci->ops->start_link(pci);
> > > @@ -551,6 +550,8 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
> > >   }
> > >   }
> > >
> > > + dw_pcie_msi_init(pp);
> > > +
> > >   /* Setup RC BARs */
> > >   dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_0, 0x00000004);
> > >   dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_1, 0x00000000);
> > > --
> > > 2.30.1
> > >  
>

Re: [PATCH RESEND] PCI: dwc: Fix MSI not work after resume

2021-03-22 Thread Bjorn Helgaas

[+cc Kishon, Richard, Lucas, Dilip]

On Mon, Mar 01, 2021 at 11:10:31AM +0800, Jisheng Zhang wrote:
> After we move dw_pcie_msi_init() into core -- dw_pcie_host_init(), the
> MSI stops working after resume. Because dw_pcie_host_init() is only
> called once during probe. To fix this issue, we move dw_pcie_msi_init()
> to dw_pcie_setup_rc().

This patch looks fine, but I don't think the commit log tells the
whole story.

Prior to 59fbab1ae40e, it looks like the only dwc-based drivers with
resume functions were dra7xx, imx6, intel-gw, and tegra [1].

Only tegra called dw_pcie_msi_init() in the resume path, and I do
think 59fbab1ae40e broke MSI after resume because it removed the
dw_pcie_msi_init() call from tegra_pcie_enable_msi_interrupts().

I'm not convinced this patch fixes it reliably, though.  The call
chain looks like this:

  tegra_pcie_dw_resume_noirq
    tegra_pcie_dw_start_link
      if (dw_pcie_wait_for_link(pci))
        dw_pcie_setup_rc

dw_pcie_wait_for_link() returns 0 if the link is up, so we only call
dw_pcie_setup_rc() in the case where the link *didn't* come up.  If
the link comes up nicely without retry, we won't call
dw_pcie_setup_rc() and hence won't call dw_pcie_msi_init().
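
To make the concern concrete, here is a tiny userspace model of that
branch (not the tegra code).  The model_* functions are made-up
stand-ins, and the -1 return is just "nonzero", not the real error
code.

```c
#include <stdbool.h>

static int msi_init_calls;	/* counts the dw_pcie_msi_init() stand-in */

/* Stand-in for dw_pcie_wait_for_link(): 0 when the link is up. */
static int model_wait_for_link(bool link_up)
{
	return link_up ? 0 : -1;
}

/* Stand-in for dw_pcie_setup_rc() after this patch: it does the MSI init. */
static void model_setup_rc(void)
{
	msi_init_calls++;
}

/* Shape of the resume chain above: setup_rc is on the retry branch only. */
static void model_resume(bool link_up)
{
	if (model_wait_for_link(link_up))
		model_setup_rc();
}
```

In this model a resume where the link comes up the first time never
reaches the MSI init, which is the case I'm worried about.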

Since then, exynos added a resume function.  My guess is MSI never
worked after resume for dra7xx, exynos, imx6, and intel-gw because
they don't call dw_pcie_msi_init() in their resume functions.

This patch looks like it should fix MSI after resume for exynos, imx6,
and intel-gw because they *do* call dw_pcie_setup_rc() from their
resume functions [2], and after this patch, dw_pcie_msi_init() will be
called from there.

I suspect MSI after resume still doesn't work on dra7xx.

[1] git grep -A20 -e "static.*resume_noirq" 59fbab1ae40e^:drivers/pci/controller/dwc
[2] git grep -A20 -e "static.*resume_noirq" drivers/pci/controller/dwc

> Fixes: 59fbab1ae40e ("PCI: dwc: Move dw_pcie_msi_init() into core")
> Reviewed-by: Rob Herring 
> Signed-off-by: Jisheng Zhang 
> ---
> Since v1:
>  - collect Reviewed-by tag
> 
>  drivers/pci/controller/dwc/pcie-designware-host.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 7e55b2b66182..e6c274f4485c 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -400,7 +400,6 @@ int dw_pcie_host_init(struct pcie_port *pp)
>   }
>  
>   dw_pcie_setup_rc(pp);
> - dw_pcie_msi_init(pp);
>  
>   if (!dw_pcie_link_up(pci) && pci->ops && pci->ops->start_link) {
>   ret = pci->ops->start_link(pci);
> @@ -551,6 +550,8 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
>   }
>   }
>  
> + dw_pcie_msi_init(pp);
> +
>   /* Setup RC BARs */
>   dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_0, 0x00000004);
>   dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_1, 0x00000000);
> -- 
> 2.30.1
>

Re: [PATCH] PCI: dwc: Move forward the iATU detection process

2021-03-22 Thread Bjorn Helgaas

On Mon, Jan 25, 2021 at 12:48:03PM +0800, Zhiqiang Hou wrote:
> From: Hou Zhiqiang 
> 
> In the dw_pcie_ep_init(), it depends on the detected iATU region
> numbers to allocate the in/outbound window management bit map.
> It fails after the commit 281f1f99cf3a ("PCI: dwc: Detect number
> of iATU windows").
> 
> So this patch move the iATU region detection into a new function,
> move forward the detection to the very beginning of functions
> dw_pcie_host_init() and dw_pcie_ep_init(). And also remove it
> from the dw_pcie_setup(), since it's more like a software
> perspective initialization step than hardware setup.
> 
> Fixes: 281f1f99cf3a ("PCI: dwc: Detect number of iATU windows")
> Signed-off-by: Hou Zhiqiang 

Applied to for-linus for v5.12, with stable tag for v5.11, thanks!

> ---
>  drivers/pci/controller/dwc/pcie-designware-ep.c   |  2 ++
>  drivers/pci/controller/dwc/pcie-designware-host.c |  2 ++
>  drivers/pci/controller/dwc/pcie-designware.c  | 11 ---
>  drivers/pci/controller/dwc/pcie-designware.h  |  1 +
>  4 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c 
> b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index bcd1cd9ba8c8..fcf935bf6f5e 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -707,6 +707,8 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>   }
>   }
>  
> + dw_pcie_iatu_detect(pci);
> +
>   res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "addr_space");
>   if (!res)
>   return -EINVAL;
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 8a84c005f32b..8eae817c138d 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -316,6 +316,8 @@ int dw_pcie_host_init(struct pcie_port *pp)
>   return PTR_ERR(pci->dbi_base);
>   }
>  
> + dw_pcie_iatu_detect(pci);
> +
>   bridge = devm_pci_alloc_host_bridge(dev, 0);
>   if (!bridge)
>   return -ENOMEM;
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c 
> b/drivers/pci/controller/dwc/pcie-designware.c
> index 5b72a5448d2e..5b9bf02d918b 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -654,11 +654,9 @@ static void dw_pcie_iatu_detect_regions(struct dw_pcie 
> *pci)
>   pci->num_ob_windows = ob;
>  }
>  
> -void dw_pcie_setup(struct dw_pcie *pci)
> +void dw_pcie_iatu_detect(struct dw_pcie *pci)
>  {
> - u32 val;
>   struct device *dev = pci->dev;
> - struct device_node *np = dev->of_node;
>   struct platform_device *pdev = to_platform_device(dev);
>  
>   if (pci->version >= 0x480A || (!pci->version &&
> @@ -687,6 +685,13 @@ void dw_pcie_setup(struct dw_pcie *pci)
>  
>   dev_info(pci->dev, "Detected iATU regions: %u outbound, %u inbound",
>pci->num_ob_windows, pci->num_ib_windows);
> +}
> +
> +void dw_pcie_setup(struct dw_pcie *pci)
> +{
> + u32 val;
> + struct device *dev = pci->dev;
> + struct device_node *np = dev->of_node;
>  
>   if (pci->link_gen > 0)
>   dw_pcie_link_set_max_speed(pci, pci->link_gen);
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h 
> b/drivers/pci/controller/dwc/pcie-designware.h
> index 5d979953800d..867369d4c4f7 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -305,6 +305,7 @@ int dw_pcie_prog_inbound_atu(struct dw_pcie *pci, u8 
> func_no, int index,
>  void dw_pcie_disable_atu(struct dw_pcie *pci, int index,
>enum dw_pcie_region_type type);
>  void dw_pcie_setup(struct dw_pcie *pci);
> +void dw_pcie_iatu_detect(struct dw_pcie *pci);
>  
>  static inline void dw_pcie_writel_dbi(struct dw_pcie *pci, u32 reg, u32 val)
>  {
> -- 
> 2.17.1
>

Re: [RFC 1/2] arm64: PCI: Allow use arch-specific pci sysdata

2021-03-19 Thread Bjorn Helgaas

[+cc Arnd (author of 37d6a0a6f470 ("PCI: Add
pci_register_host_bridge() interface"), which I think would make my
idea below possible), Marc (IRQ domains maintainer)]

On Sat, Mar 20, 2021 at 12:19:55AM +0800, Boqun Feng wrote:
> Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> ->sysdata in bus and bridge will be treated as struct pci_config_window,
> which is created by generic ECAM using the data from acpi.

It might be a mistake that we put the struct pci_config_window
pointer, which is really arch-independent, in the ->sysdata element,
which normally contains a pointer to arch- or host bridge-dependent 
data.

> However, for a virtualized PCI bus, there might not be enough data in of
> or acpi table to create a pci_config_window. This is similar to the case
> where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
> structure for sysdata, so no acpi table lookup is required.
> 
> In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
> entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
> introduce arch-specific pci sysdata (similar to the one for x86) for
> ARM64, and allow the core PCI code to detect the type of sysdata at the
> runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
> field.
> 
> Originally-by: Sunil Muthuswamy 
> Signed-off-by: Boqun Feng (Microsoft) 
> ---
>  arch/arm64/include/asm/pci.h | 29 +
>  arch/arm64/kernel/pci.c  | 15 ---
>  include/linux/pci.h  |  3 +++
>  3 files changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..dade061a0658 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -22,6 +22,16 @@
>  
>  extern int isa_dma_bridge_buggy;
>  
> +struct pci_sysdata {
> + int domain; /* PCI domain */
> + int node;   /* NUMA Node */
> +#ifdef CONFIG_ACPI
> + struct acpi_device *companion;  /* ACPI companion device */
> +#endif
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> + void *fwnode;   /* IRQ domain for MSI assignment */
> +#endif
> +};

Our PCI domain code is really a mess (mostly my fault) and I hate to
make it even more complicated by adding more switches, e.g.,
->use_arch_sysdata.

I think the design problem is that PCI host bridge drivers should
supply the PCI domain up front instead of having callbacks to extract
it.

We could put "int domain_nr" in struct pci_host_bridge, and the arch
code or host bridge driver (pcibios_init_hw(), *_pcie_probe(), VMD,
HV, etc) could fill in pci_host_bridge.domain_nr before calling
pci_scan_root_bus_bridge() or pci_host_probe().

Then maybe we could get rid of pci_bus_find_domain_nr() and some of
the needlessly arch-specific implementations of pci_domain_nr().
I think we likely could get rid of CONFIG_PCI_DOMAINS_GENERIC, too,
eventually.
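
A minimal userspace sketch of the "supply the domain up front" idea
described above.  The struct and function names here are hypothetical
stand-ins, not kernel API; the only point is that the probe path reads a
field the driver already filled in, instead of calling back into arch
code or casting ->sysdata.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pci_host_bridge; only the proposed field. */
struct host_bridge_sketch {
	int domain_nr;	/* filled in by arch code or the host bridge driver */
	int probed;
};

/* Hypothetical stand-in for pci_host_probe(): it just consumes the domain. */
static int host_probe_sketch(struct host_bridge_sketch *bridge)
{
	/* The core reads the domain directly; no callback is needed. */
	assert(bridge && bridge->domain_nr >= 0);
	bridge->probed = 1;
	return bridge->domain_nr;
}

/* Hypothetical driver probe path: decide the domain, then hand off. */
static int driver_probe_sketch(struct host_bridge_sketch *bridge, int domain)
{
	bridge->domain_nr = domain;	/* set before scanning the bus */
	return host_probe_sketch(bridge);
}
```

With this shape, a Hyper-V style driver and an ECAM-based driver would
differ only in how they pick the value passed as "domain".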

>  #ifdef CONFIG_PCI
>  static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  {
> @@ -31,8 +41,27 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev 
> *dev, int channel)
>  
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
> + if (bus->ops->use_arch_sysdata)
> + return pci_domain_nr(bus);
>   return 1;

I don't understand this.  pci_proc_domain() returns a boolean and
determines whether the /proc/bus/pci/ directory contains, e.g.,

  /proc/bus/pci/00        or
  /proc/bus/pci/0000:00

On arm64, pci_proc_domain() currently always returns 1, so the
directory contains "0000:00".  After these patches, pci_proc_domain()
returns 0 if CONFIG_PCI_DOMAINS_GENERIC=y and "bus" is in domain 0,
so buses in domain 0 will be "00" instead of "0000:00".

This doesn't make sense to me, but at the very least, this
user-visible change needs to be explained.
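
To make the user-visible effect concrete, here is a small userspace
model.  It assumes the /proc/bus/pci directory name is the bus number as
two hex digits, prefixed by a four-hex-digit domain and a colon when
pci_proc_domain() returns nonzero.  patched_pci_proc_domain() mirrors the
quoted hunk; both function names are made up for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mirrors the quoted hunk: returns the domain number, not a boolean. */
static int patched_pci_proc_domain(int use_arch_sysdata, int domain)
{
	if (use_arch_sysdata)
		return domain;	/* 0 for domain 0, nonzero otherwise */
	return 1;
}

/* Model of the directory name choice; show_domain is treated as a boolean. */
static void proc_bus_name(char *buf, size_t len, int show_domain,
			  unsigned int domain, unsigned int busnr)
{
	assert(buf && len > 0);
	if (show_domain)
		snprintf(buf, len, "%04x:%02x", domain, busnr);
	else
		snprintf(buf, len, "%02x", busnr);
}
```

In this model a bus in domain 0 loses its domain prefix while a bus in
domain 1 on the same system keeps it, which is the inconsistency in
question.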

>  }
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
> +{
> + struct pci_sysdata *sd = bus->sysdata;
> +
> + if (bus->ops->use_arch_sysdata)
> + return sd->fwnode;
> +
> + /*
> +  * bus->sysdata is not struct pci_sysdata, fwnode should be able to
> +  * be queried from of/acpi.
> +  */
> + return NULL;
> +}
> +#define pci_root_bus_fwnode  _pci_root_bus_fwnode

Ugh.  pci_root_bus_fwnode() is another callback to find the
irq_domain.  Only one call, from pci_host_bridge_msi_domain(), which
itself is only called from pci_set_bus_msi_domain().  This feels like
another case where we could simplify things by having the host bridge
driver figure out the irq_domain explicitly when it creates the
pci_host_bridge.  It seems like that's where we have the most
information about how to find the irq_domain.

> +#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> +
>  #endif  /* CONFIG_PCI */
>  
>  #endif  /* __ASM_PCI_H */
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 1006ed2d7c60..63d420d57e63 100644
> ---

Re: [RFC 0/2] PCI: Introduce pci_ops::use_arch_sysdata

2021-03-19 Thread Bjorn Helgaas

On Sat, Mar 20, 2021 at 12:19:54AM +0800, Boqun Feng wrote:
> Hi Bjorn,
> 
> I'm currently working on virtual PCI support for Hyper-V ARM64 guests.
> Similar to virtual PCI on x86 Hyper-V guests, the PCI root bus is not
> probed via ACPI (or of), it's probed from Hyper-V VMbus, therefore it

Prime example of why "OF" should be capitalized to prevent the
confusion of reading it as an English word, where it looks like a typo
and makes no sense.  Capitalizing it gives me and other uninitiates a
hint that it's an initialism.  Also applies to your commit logs and
code comments.

> doesn't have config window.

Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

2021-03-19 Thread Bjorn Helgaas

On Fri, Mar 19, 2021 at 02:59:47PM +0200, Leon Romanovsky wrote:
> On Thu, Mar 18, 2021 at 07:34:56PM +0100, Enrico Weigelt, metux IT consult 
> wrote:
> > On 18.03.21 18:22, Leon Romanovsky wrote:
> > 
> > > Which email client do you use?  Your responses are grouped as
> > > one huge block without any chance to respond to you on specific
> > > point or answer to your question.
> > 
> > I'm reading this thread in Tbird, and threading / quoting all
> > looks nice.
> 
> I'm not talking about threading or quoting but about response
> itself.  See it here
> https://lore.kernel.org/lkml/20210318103935.2ec32...@omen.home.shazbot.org/
> Alex's response is one big chunk without any separations to
> paragraphs.

Don't make this harder than it needs to be.  I think it's totally
acceptable to just split Alex's text where you need to respond.  For
example, Alex wrote this:

  vfio-pci uses the internal kernel API, ie. the variants of
  pci_reset_function(), which is the same interface used by the existing
  sysfs reset mechanism.  This proposed configuration of the reset method
  would affect any driver using that same core infrastructure and from my
  perspective that's really the goal.  ...

If I wanted to respond to the first sentence, I would just do this:

aw> vfio-pci uses the internal kernel API, ie. the variants of
aw> pci_reset_function(), which is the same interface used by the existing
aw> sysfs reset mechanism.  

I would write my response to the above here.  The rest of the quote
continues on below.  If the rest of Alex's message isn't relevant to
my response, I would remove it completely.

aw> This proposed configuration of the reset method
aw> would affect any driver using that same core infrastructure and from my
aw> perspective that's really the goal.  ...

Bjorn

Re: [v5] PCI: Add reset quirk for Huawei Intelligent NIC virtual function

2021-03-18 Thread Bjorn Helgaas

On Tue, Mar 16, 2021 at 10:08:47PM +0800, Chiqijun wrote:
> When multiple VFs do FLR at the same time, the firmware is
> processed serially, resulting in some VF FLRs being delayed more
> than 100ms, when the virtual machine restarts and the device
> driver is loaded, the firmware is doing the corresponding VF
> FLR, causing the driver to fail to load.

Nit: VFs do not do FLR; *software* does FLR on a VF.  And I think this
is a spec compliance issue specific to the Huawei NIC.  I would say
something like "When we do an FLR on several VFs at the same time, the
Huawei Intelligent NIC processes them serially, ..."

"VF FLRs being delayed more than 100ms" does not by itself explain
what the problem is.  I'm guessing the problem is that it exceeds the
"msleep(100)" in pcie_flr(), which is based on PCIe r5.0, sec 6.6.2,
which requires:

  After an FLR has been initiated by writing a 1b to the Initiate
  Function Level Reset bit, the Function must complete the FLR within
  100 ms.

So this device is apparently out of spec.  Is there an erratum for
this?  Please cite it and quote the relevant part here.  I want to
avoid having to update this quirk with future device IDs.

IIUC, VFIO is initiating the FLR, probably as part of assigning the VF
to a VM?

> To solve this problem, add host and firmware status synchronization
> during FLR.
> 
> Signed-off-by: Chiqijun 
> ---
> v5:
>  - Fix build warning reported by kernel test robot
> 
> v4:
>  - Addressed Bjorn's review comments
> 
> v3:
>  - The MSE bit in the VF configuration space is hardwired to zero,
>remove the setting of PCI_COMMAND_MEMORY bit. Add comment for
>set PCI_COMMAND register.
> 
> v2:
>  - Update comments
>  - Use the HINIC_VF_FLR_CAP_BIT_SHIFT and HINIC_VF_FLR_PROC_BIT_SHIFT
>macro instead of the magic number
> ---
>  drivers/pci/quirks.c | 69 
>  1 file changed, 69 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 653660e3ba9e..343890432ba8 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3913,6 +3913,73 @@ static int delay_250ms_after_flr(struct pci_dev *dev, 
> int probe)
>   return 0;
>  }
>  
> +#define PCI_DEVICE_ID_HINIC_VF  0x375E
> +#define HINIC_VF_FLR_TYPE   0x1000
> +#define HINIC_VF_FLR_CAP_BIT_SHIFT  30
> +#define HINIC_VF_OP 0xE80
> +#define HINIC_VF_FLR_PROC_BIT_SHIFT 18
> +#define HINIC_OPERATION_TIMEOUT 15000/* 15 seconds */

If you did this:

  #define HINIC_VF_FLR_CAP_BIT   (1UL << 30)
  #define HINIC_VF_FLR_PROC_BIT  (1UL << 18)

the code below could be a little more readable, e.g.:

  if (!(val & HINIC_VF_FLR_CAP_BIT))
    ...
  val |= HINIC_VF_FLR_PROC_BIT;

> +/* Device-specific reset method for Huawei Intelligent NIC virtual functions 
> */
> +static int reset_hinic_vf_dev(struct pci_dev *pdev, int probe)
> +{
> + unsigned long timeout;
> + void __iomem *bar;
> + u32 val;
> +
> + if (probe)
> + return 0;
> +
> + bar = pci_iomap(pdev, 0, 0);
> + if (!bar)
> + return -ENOTTY;
> +
> + /* Get and check firmware capabilities. */
> + val = ioread32be(bar + HINIC_VF_FLR_TYPE);
> + if (!(val & (1UL << HINIC_VF_FLR_CAP_BIT_SHIFT))) {
> + pci_iounmap(pdev, bar);
> + return -ENOTTY;
> + }
> +
> + /*
> +  * Set the processing bit for the start of FLR, which will be cleared
> +  * by the firmware after FLR is completed.
> +  */
> + val = ioread32be(bar + HINIC_VF_OP);
> + val = val | (1UL << HINIC_VF_FLR_PROC_BIT_SHIFT);

> + iowrite32be(val, bar + HINIC_VF_OP);
> +
> + /* Perform the actual device function reset */
> + pcie_flr(pdev);
> +
> + /*
> +  * The device must learn BDF after FLR in order to respond to BAR's
> +  * read request, therefore, we issue a configure write request to let
> +  * the device capture BDF.

Will this device capture the bus/device here even though it hasn't
completed the reset?  Or does this write need to happen below, after
the device has cleared HINIC_VF_FLR_PROC_BIT?

> +  */
> + pci_write_config_word(pdev, PCI_VENDOR_ID, 0);
> +
> + /* Waiting for device reset complete */
> + timeout = jiffies + msecs_to_jiffies(HINIC_OPERATION_TIMEOUT);
> + do {
> + val = ioread32be(bar + HINIC_VF_OP);
> + if (!(val & (1UL << HINIC_VF_FLR_PROC_BIT_SHIFT)))
> + goto reset_complete;
> + msleep(20);
> + } while (time_before(jiffies, timeout));
> +
> + val = ioread32be(bar + HINIC_VF_OP);
> + if (!(val & (1UL << HINIC_VF_FLR_PROC_BIT_SHIFT)))
> + goto reset_complete;
> +
> + pci_warn(pdev, "Reset dev timeout, flr ack reg: %#010x\n", val);

s/flr/FLR/

> +reset_complete:
> + pci_iounmap(pdev, bar);
> +
> + return 0;

You return 0 (success) even if the reset timed out.  Is that what you
want?

I'd consider adding an "int err"
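
A minimal userspace sketch of a poll loop that reports the timeout
instead of returning success unconditionally.  Everything here is an
assumption for illustration: fake_reg() stands in for the
ioread32be() of the status register, max_polls stands in for the
jiffies-based timeout, and -ETIMEDOUT is just one plausible error value.

```c
#include <assert.h>
#include <errno.h>

/* Same bit position as the "processing" bit in the quoted patch. */
#define FLR_PROC_BIT	(1UL << 18)

/* Fake register: reports "still processing" for the next N reads. */
static int polls_until_clear;

static unsigned long fake_reg(void)
{
	if (polls_until_clear > 0) {
		polls_until_clear--;
		return FLR_PROC_BIT;
	}
	return 0;
}

/* Poll until the processing bit clears or the poll budget is used up. */
static int wait_flr_done(unsigned long (*read_reg)(void), int max_polls)
{
	int err = -ETIMEDOUT;	/* assume failure until the bit clears */
	int i;

	assert(read_reg);
	for (i = 0; i < max_polls; i++) {
		if (!(read_reg() & FLR_PROC_BIT)) {
			err = 0;
			break;
		}
	}

	/* A real quirk would unmap the BAR here on both paths. */
	return err;
}
```

The single exit point keeps the cleanup in one place while still letting
the caller see that the reset did not complete.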

Re: [PATCH RESEND] PCI: dwc: Fix MSI not work after resume

2021-03-18 Thread Bjorn Helgaas

On Mon, Mar 01, 2021 at 11:10:31AM +0800, Jisheng Zhang wrote:
> After we move dw_pcie_msi_init() into core -- dw_pcie_host_init(), the
> MSI stops working after resume. Because dw_pcie_host_init() is only
> called once during probe. To fix this issue, we move dw_pcie_msi_init()
> to dw_pcie_setup_rc().
> 
> Fixes: 59fbab1ae40e ("PCI: dwc: Move dw_pcie_msi_init() into core")
> Reviewed-by: Rob Herring 
> Signed-off-by: Jisheng Zhang 

Oops, sorry, looks like this fell through the cracks.  Since
59fbab1ae40e appeared in v5.11, I think we should add:

  Cc: sta...@vger.kernel.org# v5.11+

I'm sure Lorenzo will add it when applying, so no need to repost just
for that.

> ---
> Since v1:
>  - collect Reviewed-by tag
> 
>  drivers/pci/controller/dwc/pcie-designware-host.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 7e55b2b66182..e6c274f4485c 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -400,7 +400,6 @@ int dw_pcie_host_init(struct pcie_port *pp)
>   }
>  
>   dw_pcie_setup_rc(pp);
> - dw_pcie_msi_init(pp);
>  
>   if (!dw_pcie_link_up(pci) && pci->ops && pci->ops->start_link) {
>   ret = pci->ops->start_link(pci);
> @@ -551,6 +550,8 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
>   }
>   }
>  
> + dw_pcie_msi_init(pp);
> +
>   /* Setup RC BARs */
>   dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_0, 0x0004);
>   dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_1, 0x);
> -- 
> 2.30.1
>

Re: [PATCH 2/2] PCI: Revoke mappings like devmem

2021-03-13 Thread Bjorn Helgaas

[+cc Krzysztof, Pali, Oliver]

On Thu, Feb 04, 2021 at 05:58:31PM +0100, Daniel Vetter wrote:
> Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> the region") /dev/kmem zaps ptes when the kernel requests exclusive
> access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> the default for all driver uses.
> 
> Except there's two more ways to access PCI BARs: sysfs and proc mmap
> support. Let's plug that hole.

IIUC, the idea is that if a driver calls request_mem_region() on a PCI
BAR, we prevent access to the BAR via sysfs.  I guess I'm OK with that
if it's a real security improvement or something.

But the downside of this implementation is that it depends on
iomem_get_mapping(), which doesn't work until after fs_initcalls,
which means the sysfs files cannot be static attributes of devices
added before that.  PCI devices are typically enumerated in
subsys_initcall.

Krzysztof is converting PCI sysfs files (config, rom, reset, vpd, etc)
to static attributes.  This is a major improvement that could get rid
of pci_create_sysfs_dev_files(), the late_initcall pci_sysfs_init(),
and the "sysfs_initialized" hack.  This would fix a race reported by
Pali [1] (thanks to Oliver for the idea [2]).

EXCEPT that this revoke change means the "resource%d", "legacy_io",
and "legacy_mem" files cannot be static attributes because of
iomem_get_mapping().

Any ideas on how to deal with this?  Having to keep the
pci_sysfs_init() initcall just for these few files seems like the tail
wagging the dog.

[1] https://lore.kernel.org/r/20200716110423.xtfyb3n6tn5ixedh@pali
[2] 
https://lore.kernel.org/r/caosf1chss03dbsdo4pmttmp0tceu5kscn704zewlkgxqzbf...@mail.gmail.com
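
The ordering problem above can be modeled in a few lines.  This is only
a sketch: the enum uses the usual initcall level numbers (subsys = 4,
fs = 5, late = 7), and the "ready only after fs_initcall" rule is taken
from the description in this thread, not checked against the code.

```c
#include <assert.h>

/* The initcall levels that matter here, in boot order. */
enum initcall_level {
	LEVEL_SUBSYS = 4,	/* PCI devices are typically enumerated here */
	LEVEL_FS     = 5,	/* iomem_get_mapping() becomes usable after this */
	LEVEL_LATE   = 7,	/* pci_sysfs_init() runs here today */
};

/* Assumption from the thread: the mapping is usable only after fs_initcall. */
static int iomem_mapping_ready(enum initcall_level now)
{
	assert(now >= LEVEL_SUBSYS && now <= LEVEL_LATE);
	return now > LEVEL_FS;
}

/* A file that needs ->mapping can only be set up once the mapping exists. */
static int can_create_resource_file(enum initcall_level created_at)
{
	return iomem_mapping_ready(created_at);
}
```

Static attributes are registered together with the device, i.e. at the
subsys level in this model, which is why the files that need ->mapping
cannot be static as things stand.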

> For revoke_devmem() to work we need to link our vma into the same
> address_space, with consistent vma->vm_pgoff. ->pgoff is already
> adjusted, because that's how (io_)remap_pfn_range works, but for the
> mapping we need to adjust vma->vm_file->f_mapping. The cleanest way is
> to adjust this at ->open time:
> 
> - for sysfs this is easy, now that binary attributes support this. We
>   just set bin_attr->mapping when mmap is supported
> - for procfs it's a bit more tricky, since procfs pci access has only
>   one file per device, and access to a specific resources first needs
>   to be set up with some ioctl calls. But mmap is only supported for
>   the same resources as sysfs exposes with mmap support, and otherwise
>   rejected, so we can set the mapping unconditionally at open time
>   without harm.
> 
> A special consideration is for arch_can_pci_mmap_io() - we need to
> make sure that the ->f_mapping doesn't alias between ioport and iomem
> space. There's only 2 ways in-tree to support mmap of ioports: generic
> pci mmap (ARCH_GENERIC_PCI_MMAP_RESOURCE), and sparc as the single
> architecture hand-rolling. Both approaches support ioport mmap through a
> special pfn range and not through magic pte attributes. Aliasing is
> therefore not a problem.
> 
> The only difference in access checks left is that sysfs PCI mmap does
> not check for CAP_RAWIO. I'm not really sure whether that should be
> added or not.
> 
> Acked-by: Bjorn Helgaas 
> Reviewed-by: Dan Williams 
> Signed-off-by: Daniel Vetter 
> Cc: Stephen Rothwell 
> Cc: Jason Gunthorpe 
> Cc: Kees Cook 
> Cc: Dan Williams 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: Greg Kroah-Hartman 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/pci/pci-sysfs.c | 4 
>  drivers/pci/proc.c  | 1 +
>  2 files changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 0c45b4f7b214..f8afd54ca3e1 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -942,6 +942,7 @@ void pci_create_legacy_files(struct pci_bus *b)
>   b->legacy_io->read = pci_read_legacy_io;
>   b->legacy_io->write = pci_write_legacy_io;
>   b->legacy_io->mmap = pci_mmap_legacy_io;
> + b->legacy_io->mapping = iomem_get_mapping();
>   pci_adjust_legacy_attr(b, pci_mmap_io);
>   error = device_create_bin_file(&b->dev, b->legacy_io);
>   if (error)
> @@ -954,6 +955,7 @@ void pci_create_legacy_files(struct pci_bus *b)
>   b->legacy_mem->size = 1024*1024;
>   b->legacy_mem->attr.mode = 0600;
>   b->legacy_mem->mmap = pci_mmap_legacy_mem;
> + b->legacy_io->mapping = iomem_get_mapping();
>   pci_adjust_legacy_attr(b, pci_mmap_mem);
>

Re: [PATCH v1 1/1] PCI: pciehp: Skip DLLSC handling if DPC is triggered

2021-03-12 Thread Bjorn Helgaas

On Fri, Mar 12, 2021 at 02:11:03PM -0800, Kuppuswamy, Sathyanarayanan wrote:
> On 3/12/21 1:33 PM, Bjorn Helgaas wrote:
> > On Mon, Mar 08, 2021 at 10:34:10PM -0800, 
> > sathyanarayanan.kuppusw...@linux.intel.com wrote:
> > > From: Kuppuswamy Sathyanarayanan 
> > > 

> > > +bool is_dpc_reset_active(struct pci_dev *dev)
> > > +{
> > > + struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
> > > + u16 status;
> > > +
> > > + if (!dev->dpc_cap)
> > > + return false;
> > > +
> > > + /*
> > > +  * If DPC is owned by firmware and EDR is not supported, there is
> > > +  * no race between hotplug and DPC recovery handler. So return
> > > +  * false.
> > > +  */
> > > + if (!host->native_dpc && !IS_ENABLED(CONFIG_PCIE_EDR))
> > > + return false;
> > > +
> > > + if (atomic_read_acquire(&dev->dpc_reset_active))
> > > + return true;
> > > +
> > > + pci_read_config_word(dev, dev->dpc_cap + PCI_EXP_DPC_STATUS, &status);
> > > +
> > > + return !!(status & PCI_EXP_DPC_STATUS_TRIGGER);
> > 
> > I know it's somewhat common in drivers/pci/, but I'm not really a
> > big fan of "!!".
> I can change it to use ternary operator.
> (status & PCI_EXP_DPC_STATUS_TRIGGER) ? true : false;

Ternary isn't terrible, but what's wrong with:

  if (status & PCI_EXP_DPC_STATUS_TRIGGER)
return true;
  return false;

which matches the style of the rest of the function.

Looking at this again, we return "true" if either dpc_reset_active or
PCI_EXP_DPC_STATUS_TRIGGER.  I haven't worked this all out, but that
pattern feels racy.  I guess the thought is that if
PCI_EXP_DPC_STATUS_TRIGGER is set, dpc_reset_link() will be invoked
soon and we don't want to interfere?

Re: [PATCH 03/44] PCI: remove synclink entries from pci_ids

2021-03-12 Thread Bjorn Helgaas

On Tue, Mar 02, 2021 at 07:21:33AM +0100, Jiri Slaby wrote:
> The drivers were removed in a1f714b44e34 (tty: Remove redundant synclink
> driver) and 3d608a591b2b (tty: Remove redundant synclinkmp driver).
> 
> So remove also the PCI ID entries.
> 
> Signed-off-by: Jiri Slaby 

Applied with Krzysztof's reviewed-by to pci/misc for v5.13, thanks!

> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org
> ---
>  include/linux/pci_ids.h | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index a76ccb697bef..8a18517696c1 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2065,8 +2065,6 @@
>  #define PCI_DEVICE_ID_EXAR_XR17V358  0x0358
>  
>  #define PCI_VENDOR_ID_MICROGATE  0x13c0
> -#define PCI_DEVICE_ID_MICROGATE_USC  0x0010
> -#define PCI_DEVICE_ID_MICROGATE_SCA  0x0030
>  
>  #define PCI_VENDOR_ID_3WARE  0x13C1
>  #define PCI_DEVICE_ID_3WARE_1000 0x1000
> -- 
> 2.30.1
>

Re: [PATCH v1 1/1] PCI: pciehp: Skip DLLSC handling if DPC is triggered

2021-03-12 Thread Bjorn Helgaas

[+cc Lukas, pciehp expert]

On Mon, Mar 08, 2021 at 10:34:10PM -0800, 
sathyanarayanan.kuppusw...@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan 
> 
> When hotplug and DPC are both enabled on a Root port or
> Downstream Port, during DPC events that cause a DLLSC link
> down/up events, such events must be suppressed to let the DPC
> driver own the recovery path.

I first thought you were saying "during DPC events, DPC events must be
suppressed" which would be nonsensical.  But I guess this is saying
the "*link down/up* events must be suppressed"?

> When DPC is present and enabled, hardware will put the port in
> containment state to allow SW to recover from the error condition
> in the seamless manner. But, during the DPC error recovery process,
> since the link is in disabled state, it will also raise the DLLSC
> event. In Linux kernel architecture, DPC events are handled by DPC
> driver and DLLSC events are handled by hotplug driver. If a hotplug
> driver is allowed to handle such DLLSC event (triggered by DPC
> containment), then we will have a race condition between error
> recovery handler (in DPC driver) and hotplug handler in recovering
> the contained port. Allowing such a race leads to a lot of stability
> issues while recovering the  device. So skip DLLSC handling in the
> hotplug driver when the PCIe port associated with the hotplug event is
> in DPC triggered state and let the DPC driver be responsible for the
> port recovery.
> 
> Following is the sample dmesg log which shows the contention
> between hotplug handler and error recovery handler. In this
> case, hotplug handler won the race and error recovery
> handler reported failure.
> 
> [  724.974237] pcieport :97:02.0: pciehp: Slot(4): Link Down
> [  724.974266] pcieport :97:02.0: DPC: containment event, status:0x1f01 
> source:0x
> [  724.974269] pcieport :97:02.0: DPC: unmasked uncorrectable error 
> detected
> [  724.974275] pcieport :97:02.0: PCIe Bus Error: severity=Uncorrected 
> (Non-Fatal), type=Transaction Layer, (Requester ID)
> [  724.974283] pcieport :97:02.0:   device [8086:347a] error 
> status/mask=4000/00100020
> [  724.974288] pcieport :97:02.0:[14] CmpltTO(First)
> [  724.999181] pci :98:00.0: AER: can't recover (no error_detected 
> callback)
> [  724.999227] pci :98:00.0: Removing from iommu group 181
> [  726.063125] pcieport :97:02.0: pciehp: Slot(4): Card present
> [  726.221117] pcieport :97:02.0: DPC: Data Link Layer Link Active not 
> set in 1000 msec
> [  726.221122] pcieport :97:02.0: AER: subordinate device reset failed
> [  726.221162] pcieport :97:02.0: AER: device recovery failed
> [  727.227176] pci :98:00.0: [8086:0953] type 00 class 0x010802
> [  727.227202] pci :98:00.0: reg 0x10: [mem 0x-0x3fff 64bit]
> [  727.227234] pci :98:00.0: reg 0x30: [mem 0x-0x pref]
> [  727.227246] pci :98:00.0: Max Payload Size set to 256 (was 128, max 
> 256)
> [  727.227251] pci :98:00.0: enabling Extended Tags
> [  727.227736] pci :98:00.0: Adding to iommu group 181
> [  727.231150] pci :98:00.0: BAR 6: assigned [mem 0xd100-0xd100 
> pref]
> [  727.231156] pci :98:00.0: BAR 0: assigned [mem 0xd101-0xd1013fff 
> 64bit]
> [  727.231170] pcieport :97:02.0: PCI bridge to [bus 98]
> [  727.231174] pcieport :97:02.0:   bridge window [io  0xc000-0xcfff]
> [  727.231181] pcieport :97:02.0:   bridge window [mem 
> 0xd100-0xd10f]
> [  727.231186] pcieport :97:02.0:   bridge window [mem 
> 0x2060-0x2060001f 64bit pref]
> [  727.231555] nvme nvme1: pci function :98:00.0
> [  727.231581] nvme :98:00.0: enabling device (0140 -> 0142)
> [  737.141132] nvme nvme1: 31/0/0 default/read/poll queues
> [  737.146211]  nvme1n2: p1

Quite a bit of the above really isn't relevant to the problem, so
stripping it out would reduce distraction.  E.g.,

  Removing from iommu group
  reg ...
  Max Payload Size set
  enabling Extended Tags
  Adding to iommu group
  BAR X: assigned ...
  PCI bridge to [bus 98]
  bridge window ...

Probably the timestamps are also only of incidental interest and could
be removed?

> Signed-off-by: Kuppuswamy Sathyanarayanan 
> 
> Reviewed-by: Dan Williams 
> Reviewed-by: Raj Ashok 
> ---
>  drivers/pci/hotplug/pciehp_hpc.c | 18 +
>  drivers/pci/pci.h|  2 ++
>  drivers/pci/pcie/dpc.c   | 33 ++--
>  include/linux/pci.h  |  1 +
>  4 files changed, 52 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c 
> b/drivers/pci/hotplug/pciehp_hpc.c
> index fb3840e222ad..8e7916abc60e 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -691,6 +691,24 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
>   goto out;
>   }
>  
> + /*
> +  * If the DLLSC link

Re: [PATCH v3] PCI: Add quirk for preventing bus reset on TI C667X

2021-03-12 Thread Bjorn Helgaas

On Mon, Mar 08, 2021 at 02:21:30PM +, Antti Järvinen wrote:
> Some TI KeyStone C667X devices do no support bus/hot reset. Its PCIESS
> automatically disables LTSSM when secondary bus reset is received and
> device stops working. Prevent bus reset by adding quirk_no_bus_reset to
> the device. With this change device can be assigned to VMs with VFIO,
> but it will leak state between VMs.

s/do no/do not/ (also in the comment below)

Does the user get any indication of this leaking state?  I looked
through drivers/vfio and drivers/pci, but I haven't found anything
yet.

We *could* log something in quirk_no_bus_reset(), but that would just
be noise for people who don't pass the device through to a VM.  So
maybe it would be nicer if we logged something when we actually *do*
pass it through to a VM.

> Reference: https://e2e.ti.com/support/processors/f/791/t/954382
> Signed-off-by: Antti Järvinen 
> ---
>  drivers/pci/quirks.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 653660e3ba9e..d9201ad1ca39 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3578,6 +3578,16 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 
> 0x0034, quirk_no_bus_reset);
>   */
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset);
>  
> +/*
> + * Some TI keystone C667X devices do no support bus/hot reset.
> + * Its PCIESS automatically disables LTSSM when secondary bus reset is
> + * received and device stops working. Prevent bus reset by adding
> + * quirk_no_bus_reset to the device. With this change device can be
> + * assigned to VMs with VFIO, but it will leak state between VMs.
> + * Reference https://e2e.ti.com/support/processors/f/791/t/954382
> + */
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TI, 0xb005, quirk_no_bus_reset);
> +
>  static void quirk_no_pm_reset(struct pci_dev *dev)
>  {
>   /*
> -- 
> 2.17.1
>

Re: [RFC PATCH v2 02/11] PCI/P2PDMA: Avoid pci_get_slot() which sleeps

2021-03-12 Thread Bjorn Helgaas

On Thu, Mar 11, 2021 at 04:31:32PM -0700, Logan Gunthorpe wrote:
> In order to use upstream_bridge_distance_warn() from a dma_map function,
> it must not sleep. However, pci_get_slot() takes the pci_bus_sem so it
> might sleep.
> 
> In order to avoid this, try to get the host bridge's device from
> bus->self, and if that is not set just get the first element in the
> list. It should be impossible for the host bridges device to go away
> while references are held on child devices, so the first element
> should not change and this should be safe.
> 
> Signed-off-by: Logan Gunthorpe 
> ---
>  drivers/pci/p2pdma.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index bd89437faf06..2135fe69bb07 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -311,11 +311,15 @@ static const struct pci_p2pdma_whitelist_entry {
>  static bool __host_bridge_whitelist(struct pci_host_bridge *host,
>   bool same_host_bridge)
>  {
> - struct pci_dev *root = pci_get_slot(host->bus, PCI_DEVFN(0, 0));
>   const struct pci_p2pdma_whitelist_entry *entry;
> + struct pci_dev *root = host->bus->self;
>   unsigned short vendor, device;
>  
>   if (!root)
> + root = list_first_entry_or_null(&host->bus->devices,
> + struct pci_dev, bus_list);

Replacing one ugliness (assuming there is a pci_dev for the host
bridge, and that it is at 00.0) with another (still assuming a pci_dev
and that it is host->bus->self or the first entry).  I can't suggest
anything better, but maybe a little comment in the code would help
future readers.

I wish we had a real way to discover this property without the
whitelist, at least for future devices.  Was there ever any interest
in a _DSM or similar interface for this?

I *am* very glad to remove a pci_get_slot() usage.

> +
> + if (!root || root->devfn)
>   return false;
>  
>   vendor = root->vendor;

Don't you need to also remove the "pci_dev_put(root)" a few lines
below?
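
A userspace sketch of why a leftover put would be a problem.
pci_get_slot() returns the device with a reference held, while reading
host->bus->self or the first list entry takes none, so a put that was
balanced before becomes an extra drop.  The types and helpers below are
hypothetical models, not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev with just a reference count. */
struct dev_sketch {
	int refcount;
};

/* Models pci_get_slot(): returns the device with an extra reference. */
static struct dev_sketch *get_with_ref(struct dev_sketch *dev)
{
	assert(dev);
	dev->refcount++;
	return dev;
}

/* Models pci_dev_put(): drops one reference, tolerates NULL. */
static void put_ref(struct dev_sketch *dev)
{
	if (dev)
		dev->refcount--;
}

/* Old pattern: the lookup takes a reference, so the put balances it. */
static int lookup_old(struct dev_sketch *dev)
{
	struct dev_sketch *root = get_with_ref(dev);

	put_ref(root);
	return dev->refcount;
}

/* New lookup with a stale put: no reference taken, one dropped. */
static int lookup_new_with_stale_put(struct dev_sketch *dev)
{
	struct dev_sketch *root = dev;	/* like host->bus->self */

	put_ref(root);			/* drops a reference never taken */
	return dev->refcount;
}
```

Starting from a count of 1, the old pattern leaves it at 1 and the new
lookup with the stale put leaves it at 0, which in the kernel would mean
a premature release.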

Re: [RFC PATCH v2 01/11] PCI/P2PDMA: Pass gfp_mask flags to upstream_bridge_distance_warn()

2021-03-12 Thread Bjorn Helgaas

On Thu, Mar 11, 2021 at 04:31:31PM -0700, Logan Gunthorpe wrote:
> In order to call this function from a dma_map function, it must not sleep.
> The only reason it does sleep so to allocate the seqbuf to print
> which devices are within the ACS path.

s/this function/upstream_bridge_distance_warn()/ ?
s/so to/is to/

Maybe the subject could say something about the purpose, e.g., allow
calling from atomic context or something?  "Pass gfp_mask flags" sort
of restates what we can read from the patch, but without the
motivation of why this is useful.

> Switch the kmalloc call to use a passed in gfp_mask  and don't print that
> message if the buffer fails to be allocated.
> 
> Signed-off-by: Logan Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/p2pdma.c | 21 +++--
>  1 file changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index 196382630363..bd89437faf06 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -267,7 +267,7 @@ static int pci_bridge_has_acs_redir(struct pci_dev *pdev)
>  
>  static void seq_buf_print_bus_devfn(struct seq_buf *buf, struct pci_dev 
> *pdev)
>  {
> - if (!buf)
> + if (!buf || !buf->buffer)
>   return;
>  
>   seq_buf_printf(buf, "%s;", pci_name(pdev));
> @@ -495,25 +495,26 @@ upstream_bridge_distance(struct pci_dev *provider, 
> struct pci_dev *client,
>  
>  static enum pci_p2pdma_map_type
>  upstream_bridge_distance_warn(struct pci_dev *provider, struct pci_dev 
> *client,
> -   int *dist)
> +   int *dist, gfp_t gfp_mask)
>  {
>   struct seq_buf acs_list;
>   bool acs_redirects;
>   int ret;
>  
> - seq_buf_init(&acs_list, kmalloc(PAGE_SIZE, GFP_KERNEL), PAGE_SIZE);
> - if (!acs_list.buffer)
> - return -ENOMEM;
> + seq_buf_init(&acs_list, kmalloc(PAGE_SIZE, gfp_mask), PAGE_SIZE);
>  
>   ret = upstream_bridge_distance(provider, client, dist, &acs_redirects,
>  &acs_list);
>   if (acs_redirects) {
>   pci_warn(client, "ACS redirect is set between the client and 
> provider (%s)\n",
>pci_name(provider));
> - /* Drop final semicolon */
> - acs_list.buffer[acs_list.len-1] = 0;
> - pci_warn(client, "to disable ACS redirect for this path, add 
> the kernel parameter: pci=disable_acs_redir=%s\n",
> -  acs_list.buffer);
> +
> + if (acs_list.buffer) {
> + /* Drop final semicolon */
> + acs_list.buffer[acs_list.len - 1] = 0;
> + pci_warn(client, "to disable ACS redirect for this 
> path, add the kernel parameter: pci=disable_acs_redir=%s\n",
> +  acs_list.buffer);
> + }
>   }
>  
>   if (ret == PCI_P2PDMA_MAP_NOT_SUPPORTED) {
> @@ -566,7 +567,7 @@ int pci_p2pdma_distance_many(struct pci_dev *provider, 
> struct device **clients,
>  
>   if (verbose)
>   ret = upstream_bridge_distance_warn(provider,
> - pci_client, &distance);
> + pci_client, &distance, GFP_KERNEL);
>   else
>   ret = upstream_bridge_distance(provider, pci_client,
>  &distance, NULL, NULL);
> -- 
> 2.20.1
>
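
For reference, the pattern the patch adopts (treat the list buffer as optional and skip the extra message when the allocation failed) can be sketched in plain userspace C.  This is only an illustration under stated assumptions: struct str_buf and the str_buf_*() helpers are hypothetical stand-ins, not the kernel's seq_buf API.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a growable text buffer; not the kernel seq_buf. */
struct str_buf {
	char *buffer;	/* may be NULL if the allocation failed */
	size_t size;
	size_t len;
};

static void str_buf_init(struct str_buf *s, char *buf, size_t size)
{
	s->buffer = buf;
	s->size = size;
	s->len = 0;
}

/* Append "name;" unless there is no buffer to append to. */
static void str_buf_add_name(struct str_buf *s, const char *name)
{
	size_t room;
	int n;

	if (!s || !s->buffer)
		return;

	room = s->size - s->len;
	n = snprintf(s->buffer + s->len, room, "%s;", name);
	if (n > 0 && (size_t)n < room)
		s->len += (size_t)n;
}

/* Drop the final semicolon; return NULL when there is nothing to print. */
static const char *str_buf_finish(struct str_buf *s)
{
	if (!s->buffer || s->len == 0)
		return NULL;

	s->buffer[s->len - 1] = '\0';
	return s->buffer;
}
```

A caller would print the list only when str_buf_finish() returns non-NULL, so a failed allocation costs the hint message but not the distance calculation itself.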

Re: [PATCH 02/17] cfi: add __cficanonical

2021-03-12 Thread Bjorn Helgaas

On Thu, Mar 11, 2021 at 04:49:04PM -0800, Sami Tolvanen wrote:
> With CONFIG_CFI_CLANG, the compiler replaces a function address taken
> in C code with the address of a local jump table entry, which passes
> runtime indirect call checks. However, the compiler won't replace
> addresses taken in assembly code, which will result in a CFI failure
> if we later jump to such an address in instrumented C code. The code
> generated for the non-canonical jump table looks this:
> 
>   <noncanonical.cfi_jt>: /* In C, &noncanonical points here */
>   jmp noncanonical
>   ...
>   <noncanonical>:        /* function body */
>   ...
> 
> This change adds the __cficanonical attribute, which tells the
> compiler to use a canonical jump table for the function instead. This
> means the compiler will rename the actual function to <function>.cfi
> and points the original symbol to the jump table entry instead:
> 
>   <canonical>:           /* jump table entry */
>   jmp canonical.cfi
>   ...
>   <canonical.cfi>:       /* function body */
>   ...
> 
> As a result, the address taken in assembly, or other non-instrumented
> code always points to the jump table and therefore, can be used for
> indirect calls in instrumented code without tripping CFI checks.
> 
> Signed-off-by: Sami Tolvanen 

If you need it:

Acked-by: Bjorn Helgaas  # pci.h

> ---
>  include/linux/compiler-clang.h | 1 +
>  include/linux/compiler_types.h | 4 
>  include/linux/init.h   | 4 ++--
>  include/linux/pci.h| 4 ++--
>  4 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index 1ff22bdad992..c275f23ce023 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -57,3 +57,4 @@
>  #endif
>  
>  #define __nocfi  __attribute__((__no_sanitize__("cfi")))
> +#define __cficanonical   __attribute__((__cfi_canonical_jump_table__))
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 796935a37e37..d29bda7f6ebd 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -246,6 +246,10 @@ struct ftrace_likely_data {
>  # define __nocfi
>  #endif
>  
> +#ifndef __cficanonical
> +# define __cficanonical
> +#endif
> +
>  #ifndef asm_volatile_goto
>  #define asm_volatile_goto(x...) asm goto(x)
>  #endif
> diff --git a/include/linux/init.h b/include/linux/init.h
> index b3ea15348fbd..045ad1650ed1 100644
> --- a/include/linux/init.h
> +++ b/include/linux/init.h
> @@ -220,8 +220,8 @@ extern bool initcall_debug;
>   __initcall_name(initstub, __iid, id)
>  
>  #define __define_initcall_stub(__stub, fn)   \
> - int __init __stub(void);\
> - int __init __stub(void) \
> + int __init __cficanonical __stub(void); \
> + int __init __cficanonical __stub(void)  \
>   {   \
>   return fn();\
>   }   \
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..39684b72db91 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1944,8 +1944,8 @@ enum pci_fixup_pass {
>  #ifdef CONFIG_LTO_CLANG
>  #define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,
> \
> class_shift, hook, stub)  \
> - void stub(struct pci_dev *dev); \
> - void stub(struct pci_dev *dev)  \
> + void __cficanonical stub(struct pci_dev *dev);  \
> + void __cficanonical stub(struct pci_dev *dev)   \
>   {   \
>   hook(dev);  \
>   }   \
> -- 
> 2.31.0.rc2.261.g7f71774620-goog
>

Re: [PATCH 1/3] PCI: controller: al: select CONFIG_PCI_ECAM

2021-03-11 Thread Bjorn Helgaas

On Wed, Mar 10, 2021 at 10:02:55PM +0100, Arnd Bergmann wrote:
> On Wed, Mar 10, 2021 at 8:32 PM Bjorn Helgaas  wrote:
> >
> > On Mon, Mar 08, 2021 at 04:24:46PM +0100, Arnd Bergmann wrote:
> > > From: Arnd Bergmann 
> > >
> > > Compile-testing this driver without ECAM support results in a link
> > > failure:
> > >
> > > ld.lld: error: undefined symbol: pci_ecam_map_bus
> > > >>> referenced by pcie-al.c
> > > >>>   pci/controller/dwc/pcie-al.o:(al_pcie_map_bus) in 
> > > >>> archive drivers/built-in.a
> > >
> > > Select CONFIG_ECAM like the other drivers do.
> >
> > Did we add these compile issues in the v5.12-rc1?  I.e., are the fixes
> > candidates for v5.12?
> 
> No, the bug exists but is hidden until you apply patch 3/3 because the
> driver is never compile tested on anything other than arm64, which
> turns on PCI_ECAM unconditionally.
> 
> Merging all three for 5.13 is sufficient.

I put these on pci/misc for v5.13, thanks!

Re: [PATCH] MAINTAINERS: Update PCI patchwork to kernel.org instance

2021-03-11 Thread Bjorn Helgaas

On Thu, Mar 11, 2021 at 03:12:23PM -0600, Bjorn Helgaas wrote:
> From: Bjorn Helgaas 
> 
> We now use the kernel.org patchwork instance.  Update the links in
> MAINTAINERS.
> 
> Signed-off-by: Bjorn Helgaas 

I put this on for-linus for v5.12.

> ---
>  MAINTAINERS | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d92f85ca831d..a3c2e930b3d5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -13843,7 +13843,7 @@ M:Lorenzo Pieralisi 
>  R:   Rob Herring 
>  L:   linux-...@vger.kernel.org
>  S:   Supported
> -Q:   http://patchwork.ozlabs.org/project/linux-pci/list/
> +Q:   http://patchwork.kernel.org/project/linux-pci/list/
>  T:   git git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/
>  F:   drivers/pci/controller/
>  
> @@ -13851,7 +13851,7 @@ PCI SUBSYSTEM
>  M:   Bjorn Helgaas 
>  L:   linux-...@vger.kernel.org
>  S:   Supported
> -Q:   http://patchwork.ozlabs.org/project/linux-pci/list/
> +Q:   http://patchwork.kernel.org/project/linux-pci/list/
>  T:   git git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git
>  F:   Documentation/PCI/
>  F:   Documentation/devicetree/bindings/pci/
> -- 
> 2.25.1
>

[PATCH] MAINTAINERS: Update PCI patchwork to kernel.org instance

2021-03-11 Thread Bjorn Helgaas

From: Bjorn Helgaas 

We now use the kernel.org patchwork instance.  Update the links in
MAINTAINERS.

Signed-off-by: Bjorn Helgaas 
---
 MAINTAINERS | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d92f85ca831d..a3c2e930b3d5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13843,7 +13843,7 @@ M:  Lorenzo Pieralisi 
 R: Rob Herring 
 L: linux-...@vger.kernel.org
 S: Supported
-Q: http://patchwork.ozlabs.org/project/linux-pci/list/
+Q: http://patchwork.kernel.org/project/linux-pci/list/
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/
 F: drivers/pci/controller/
 
@@ -13851,7 +13851,7 @@ PCI SUBSYSTEM
 M: Bjorn Helgaas 
 L: linux-...@vger.kernel.org
 S: Supported
-Q: http://patchwork.ozlabs.org/project/linux-pci/list/
+Q: http://patchwork.kernel.org/project/linux-pci/list/
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git
 F: Documentation/PCI/
 F: Documentation/devicetree/bindings/pci/
-- 
2.25.1

Re: [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state

2021-03-11 Thread Bjorn Helgaas

On Thu, Jan 28, 2021 at 03:52:42PM +0000, Victor Ding wrote:
> Certain PCIe devices (e.g. GL9750) have high penalties (e.g. high Port
> T_POWER_ON) when exiting L1 but enter L1 aggressively. As a result,
> such devices enter and exit L1 frequently during pci_save_state and
> pci_restore_state; eventually causing poor suspend/resume performance.
> 
> Based on the observation that PCI accesses dominance pci_save_state/
> pci_restore_state plus these accesses are fairly close to each other, the
> actual time the device could stay in low power states is negligible.
> Therefore, the little power-saving benefit from ASPM during suspend/resume
> does not overweight the performance degradation caused by high L1 exit
> penalties.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211187

Thanks for this!

This device can tolerate unlimited delay for L1 exit (DevCtl Endpoint
L1 Acceptable Latency is unlimited) and it makes no guarantees about
how fast it can exit L1 (LnkCap L1 Exit Latency is also unlimited), so
I think there's basically no restriction on when it can enter ASPM
L1.0.

I have a hard time interpreting the L1.2 entry conditions in PCIe
r5.0, sec 5.5.1, but I can believe it enters L1.2 aggressively since
the device says it can tolerate any latencies.

If L1.2 exit takes 3100us, it could do ~60 L1 exits in 200ms.  I guess
config accesses and code execution can account for some of that, but
still seems like a lot of L1 entries/exits during suspend.  I wouldn't
think we would touch the device that much and that intermittently.
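
The estimate above is plain division.  A tiny sketch, taking the 3100us and 200ms figures from this thread as given (the helper name is made up for illustration):

```c
#include <stdint.h>

/*
 * Upper bound on back-to-back L1 exits that fit in a time budget,
 * ignoring everything else that happens during suspend.
 */
static uint32_t max_l1_exits(uint32_t budget_us, uint32_t exit_latency_us)
{
	if (exit_latency_us == 0)
		return UINT32_MAX;	/* no bound from latency alone */

	return budget_us / exit_latency_us;
}
```

With budget_us = 200000 and exit_latency_us = 3100 this gives 64, the same ballpark as the ~60 mentioned above.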

> Signed-off-by: Victor Ding 
> 
> ---
> 
> Changes in v2:
> - Updated commit message to remove unnecessary information
> - Fixed a bug reading wrong register in pcie_save_aspm_control
> - Updated to reuse existing pcie_config_aspm_dev where possible
> - Fixed goto label style
> 
>  drivers/pci/pci.c   | 18 +++---
>  drivers/pci/pci.h   |  6 ++
>  drivers/pci/pcie/aspm.c | 27 +++
>  include/linux/pci.h |  1 +
>  4 files changed, 49 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 32011b7b4c04..9ea88953f90b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1542,6 +1542,10 @@ static void pci_restore_ltr_state(struct pci_dev *dev)
>  int pci_save_state(struct pci_dev *dev)
>  {
>   int i;
> +
> + pcie_save_aspm_control(dev);
> + pcie_disable_aspm(dev);

If I understand this patch correctly, it basically does this:

    pci_save_state
  +   pcie_save_aspm_control
  +   pcie_disable_aspm
      <save state>
  +   pcie_restore_aspm_control

The <save state> part is just a bunch of config reads with very little
other code execution.  I'm really surprised that there's enough time
between config reads for the link to go to L1.  I guess you've
verified that this does speed up suspend significantly, but this just
doesn't make sense to me.

In the bugzilla you say the GL9750 can go to L1.2 after ~4us of
inactivity.  That's enough time for a lot of code execution.  We must
be missing something.  There's so much PCI traffic during save/restore
that it should be easy to match up the analyzer trace with the code.
Can you get any insight into what's going on that way?
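
For readers following along, the shape of the change (save the control setting, disable, do the work, restore on every exit path) can be modeled in userspace roughly like this.  It is a sketch only: struct fake_dev and all helpers here are hypothetical and do not touch real config space.

```c
#include <stdbool.h>

/* Hypothetical model of a device; fail_step selects which step fails. */
struct fake_dev {
	bool aspm_enabled;
	bool saved_aspm;
	int fail_step;
};

static void save_aspm_control(struct fake_dev *d)
{
	d->saved_aspm = d->aspm_enabled;
}

static void disable_aspm(struct fake_dev *d)
{
	d->aspm_enabled = false;
}

static void restore_aspm_control(struct fake_dev *d)
{
	d->aspm_enabled = d->saved_aspm;
}

/* Stand-in for one of the "save some capability" steps. */
static int save_step(struct fake_dev *d, int step)
{
	return d->fail_step == step ? -1 : 0;
}

static int save_state(struct fake_dev *d)
{
	int ret;

	save_aspm_control(d);
	disable_aspm(d);

	ret = save_step(d, 1);
	if (ret)
		goto exit;

	ret = save_step(d, 2);

exit:
	/* Single exit path, so ASPM is restored even when a step fails. */
	restore_aspm_control(d);
	return ret;
}
```

The point of replacing the early returns with a goto is that every path, including the failing ones, passes through the restore.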

>   /* XXX: 100% dword access ok here? */
>   for (i = 0; i < 16; i++) {
>   pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]);
> @@ -1552,18 +1556,22 @@ int pci_save_state(struct pci_dev *dev)
>  
>   i = pci_save_pcie_state(dev);
>   if (i != 0)
> - return i;
> + goto exit;
>  
>   i = pci_save_pcix_state(dev);
>   if (i != 0)
> - return i;
> + goto exit;
>  
>   pci_save_ltr_state(dev);
>   pci_save_aspm_l1ss_state(dev);
>   pci_save_dpc_state(dev);
>   pci_save_aer_state(dev);
>   pci_save_ptm_state(dev);
> - return pci_save_vc_state(dev);
> + i = pci_save_vc_state(dev);
> +
> +exit:
> + pcie_restore_aspm_control(dev);
> + return i;
>  }
>  EXPORT_SYMBOL(pci_save_state);
>  
> @@ -1661,6 +1669,8 @@ void pci_restore_state(struct pci_dev *dev)
>   if (!dev->state_saved)
>   return;
>  
> + pcie_disable_aspm(dev);
> +
>   /*
>* Restore max latencies (in the LTR capability) before enabling
>* LTR itself (in the PCIe capability).
> @@ -1689,6 +1699,8 @@ void pci_restore_state(struct pci_dev *dev)
>   pci_enable_acs(dev);
>   pci_restore_iov_state(dev);
>  
> + pcie_restore_aspm_control(dev);
> +
>   dev->state_saved = false;
>  }
>  EXPORT_SYMBOL(pci_restore_state);
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index a81459159f6d..e074a0cbe73c 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -584,6 +584,9 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev);
>  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
>  void pci_save_aspm_l1ss_state(struct pci_dev *dev);
>  void pci_restore_aspm_l1ss_state(struct

Re: [PATCH] PCI: Add AMD RV2 based APUs, such as 3015Ce, to D3hot to D3 quirk table.

2021-03-11 Thread Bjorn Helgaas

[+cc Daniel, Mika (author, reviewer of 3030df209aa8)]

On Thu, Mar 11, 2021 at 10:11:35AM +0530, Shirish S wrote:
> From: Julian Schroeder 
> 
> This allows for an extra 10ms for the state transition.
> Currently only AMD PCO based APUs are covered by this table.

I'm really glad to see this coming straight from AMD.  Is this a
documented erratum?  Please provide a reference to that.

The point is that quirks are for working around hardware defects.  If
the device is not defective, and it is actually following the spec
correctly, there should be a way to fix this in a generic way that
doesn't require quirks.  That avoids the need to add more quirks for
future devices.

> WIP. Working on commit to kernel.org.

I'm not sure what "WIP. Working on commit to kernel.org." means.  Does
it mean I should ignore this and wait for the final posting?

I'm OCD enough that I like commits doing the same thing to have the
same subject line.  This is an extension of 3030df209aa8 ("PCI:
Increase D3 delay for AMD Ryzen5/7 XHCI controllers"), so it should
look like that.

> Signed-off-by: Julian Schroeder 

This appears to require an additional signoff from you, Shirish; see
[1].

Bjorn

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n356

> ---
>  drivers/pci/quirks.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 653660e3ba9e..7d8f52524ada 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -1904,6 +1904,7 @@ static void quirk_ryzen_xhci_d3hot(struct pci_dev *dev)
>  }
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e0, quirk_ryzen_xhci_d3hot);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e1, quirk_ryzen_xhci_d3hot);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e5, quirk_ryzen_xhci_d3hot);
>  
>  #ifdef CONFIG_X86_IO_APIC
>  static int dmi_disable_ioapicreroute(const struct dmi_system_id *d)
> -- 
> 2.17.1
>

Re: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association

2021-03-10 Thread Bjorn Helgaas

On Mon, Feb 22, 2021 at 09:17:17AM +0800, Qiuxu Zhuo wrote:
> Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
> byte encoding the device and function number) as the device number to
> check whether the corresponding bit was set in the RCiEPBitmap of the
> RCEC (Root Complex Event Collector) while enumerating over each bit of
> the RCiEPBitmap.
> 
> As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
> Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
> use a device number to check whether the corresponding bit was set in
> the RCiEPBitmap.
> 
> Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
> of "rciep->devfn" to a device number to ensure that the RCiEP devices
> associated with the RCEC are linked when the RCEC is enumerated.
> 
> Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
> Reported-and-tested-by: Wen Jin 
> Reviewed-by: Sean V Kelley 
> Signed-off-by: Qiuxu Zhuo 

I think 507b460f8144 appeared in v5.11, so not something we broke in
v5.12.  Applied to pci/error for v5.13, thanks!

If I understand correctly, we previously only got this right in one
case:

   0 == PCI_SLOT(00.0)  # correct
   1 == PCI_SLOT(00.1)  # incorrect
   2 == PCI_SLOT(00.2)  # incorrect
   ...
   8 == PCI_SLOT(01.0)  # incorrect
   9 == PCI_SLOT(01.1)  # incorrect
   ...
  31 == PCI_SLOT(03.7)  # incorrect
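
To make the table concrete, here is a small userspace sketch of the devfn encoding and of the old and fixed comparisons.  The PCI_DEVFN/PCI_SLOT/PCI_FUNC macros mirror the definitions in include/uapi/linux/pci.h; the two assoc_*() helpers are simplified, hypothetical stand-ins for the loop in rcec_assoc_rciep(), not the kernel code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same encoding as the kernel: devfn = device number << 3 | function. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* Old check: compares the bit index against the whole devfn byte. */
static bool assoc_old(uint32_t bitmap, uint8_t devfn)
{
	for (unsigned int devn = 0; devn < 32; devn++)
		if ((bitmap & (1u << devn)) && devn == devfn)
			return true;

	return false;
}

/* Fixed check: compares the bit index against the device number only. */
static bool assoc_fixed(uint32_t bitmap, uint8_t devfn)
{
	for (unsigned int devn = 0; devn < 32; devn++)
		if ((bitmap & (1u << devn)) && devn == PCI_SLOT(devfn))
			return true;

	return false;
}
```

With only bit 1 set in the bitmap (device 1), the old check misses 01.0 (devfn 8) and wrongly matches 00.1 (devfn 1); the fixed check gets both right.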

> ---
> v2->v3:
>  Drop "[ Krzysztof: Update commit message. ]" from the commit message
> 
>  drivers/pci/pcie/rcec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
> index 2c5c552994e4..d0bcd141ac9c 100644
> --- a/drivers/pci/pcie/rcec.c
> +++ b/drivers/pci/pcie/rcec.c
> @@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct 
> pci_dev *rciep)
>  
>   /* Same bus, so check bitmap */
>   for_each_set_bit(devn, &bitmap, 32)
> - if (devn == rciep->devfn)
> + if (devn == PCI_SLOT(rciep->devfn))
>   return true;
>  
>   return false;
> -- 
> 2.17.1
>

Re: [PATCH 1/3] PCI: controller: al: select CONFIG_PCI_ECAM

2021-03-10 Thread Bjorn Helgaas

On Mon, Mar 08, 2021 at 04:24:46PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Compile-testing this driver without ECAM support results in a link
> failure:
> 
> ld.lld: error: undefined symbol: pci_ecam_map_bus
> >>> referenced by pcie-al.c
> >>>   pci/controller/dwc/pcie-al.o:(al_pcie_map_bus) in archive 
> >>> drivers/built-in.a
> 
> Select CONFIG_ECAM like the other drivers do.

Did we add these compile issues in the v5.12-rc1?  I.e., are the fixes
candidates for v5.12?

> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/pci/controller/dwc/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/dwc/Kconfig 
> b/drivers/pci/controller/dwc/Kconfig
> index 5a3032d9b844..d981a0eba99f 100644
> --- a/drivers/pci/controller/dwc/Kconfig
> +++ b/drivers/pci/controller/dwc/Kconfig
> @@ -311,6 +311,7 @@ config PCIE_AL
>   depends on OF && (ARM64 || COMPILE_TEST)
>   depends on PCI_MSI_IRQ_DOMAIN
>   select PCIE_DW_HOST
> + select PCI_ECAM
>   help
> Say Y here to enable support of the Amazon's Annapurna Labs PCIe
> controller IP on Amazon SoCs. The PCIe controller uses the DesignWare
> -- 
> 2.29.2
>

Re: [PATCH 00/13] PCI: MSI: Getting rid of msi_controller, and other cleanups

2021-03-10 Thread Bjorn Helgaas

On Thu, Feb 25, 2021 at 03:10:10PM +0000, Marc Zyngier wrote:
> The msi_controller data structure was the first attempt at treating
> MSIs like any other interrupt. We replaced it a few years ago with the
> generic MSI framework, but as it turns out, some older drivers are
> still using it.
> 
> This series aims at converting these stragglers, drop msi_controller,
> and fix some other nits such as having ways for a host bridge to
> advertise whether it supports MSIs or not.
> 
> A few notes:
> 
> - The Tegra patch is the result of back and forth work with Thierry: I
>   wrote the initial patch, which didn't work (I didn't have any HW at
>   the time). Thierry made it work, and I subsequently fixed a couple
>   of bugs/cleanups. I'm responsible for the result, so don't blame
>   Thierry for any of it! FWIW, I'm now running a Jetson TX2 with its
>   root fs over NVME, and MSIs are OK.
> 
> - RCAR is totally untested, though Marek had a go at a previous
>   version. More testing required.
> 
> - The xilinx stuff is *really* untested. Paul, if you have a RISC-V
>   board that uses it, could you please give it a go? Michal, same
>   thing for the stuff you have at hand...
> 
> - hyperv: I don't have access to such hypervisor, and no way to test
>   it. Help welcomed.
> 
> - The patches dealing with the advertising of MSI handling are the
>   result of a long discussion that took place here[1]. I took the
>   liberty to rejig Thomas' initial patches, and add what I needed for
>   the MSI domain stuff. Again, blame me if something is wrong, and not
>   Thomas.
> 
> Feedback welcome.
> 
>   M.
> 
> [1] https://lore.kernel.org/r/20201031140330.83768-1-li...@fw-web.de
> 
> Marc Zyngier (11):
>   PCI: tegra: Convert to MSI domains
>   PCI: rcar: Convert to MSI domains
>   PCI: xilinx: Convert to MSI domains
>   PCI: hyperv: Drop msi_controller structure
>   PCI: MSI: Drop use of msi_controller from core code
>   PCI: MSI: Kill msi_controller structure
>   PCI: MSI: Kill default_teardown_msi_irqs()
>   PCI: MSI: Let PCI host bridges declare their reliance on MSI domains
>   PCI: Make pci_host_common_probe() declare its reliance on MSI domains
>   PCI: MSI: Document the various ways of ending up with NO_MSI
>   PCI: quirks: Refactor advertising of the NO_MSI flag
> 
> Thomas Gleixner (2):
>   PCI: MSI: Let PCI host bridges declare their lack of MSI handling
>   PCI: mediatek: Advertise lack of MSI handling

All looks good to me; I'm guessing Lorenzo will want to apply it or at
least take a look since the bulk of this is in the native host
drivers.

s|PCI: MSI:|PCI/MSI:| above (I use "PCI/<FEATURE>:" and "PCI: <driver>:")
s|PCI: hyperv:|PCI: hv:| to match previous practice

Maybe:

  PCI: Refactor HT advertising of NO_MSI flag

since "HT" contains more information than "quirks"?

In the 03/13 commit log, s/appaling/appalling/ :)
In the patch, it sounds like the MSI capture address change might be
separable into its own patch?  If it were separate, it would be easier
to see the problem/fix and watch for it elsewhere.

Acked-by: Bjorn Helgaas 

>  drivers/pci/controller/Kconfig   |   4 +-
>  drivers/pci/controller/pci-host-common.c |   1 +
>  drivers/pci/controller/pci-hyperv.c  |   4 -
>  drivers/pci/controller/pci-tegra.c   | 343 ---
>  drivers/pci/controller/pcie-mediatek.c   |   4 +
>  drivers/pci/controller/pcie-rcar-host.c  | 342 +++---
>  drivers/pci/controller/pcie-xilinx.c | 238 +++-
>  drivers/pci/msi.c|  46 +--
>  drivers/pci/probe.c  |   4 +-
>  drivers/pci/quirks.c |  15 +-
>  include/linux/msi.h  |  17 +-
>  include/linux/pci.h  |   4 +-
>  12 files changed, 463 insertions(+), 559 deletions(-)
> 
> -- 
> 2.29.2
>

Re: [patch 12/14] PCI: hv: Use tasklet_disable_in_atomic()

2021-03-09 Thread Bjorn Helgaas

On Tue, Mar 09, 2021 at 09:42:15AM +0100, Thomas Gleixner wrote:
> From: Sebastian Andrzej Siewior 
> 
> The hv_compose_msi_msg() callback in irq_chip::irq_compose_msi_msg is
> invoked via irq_chip_compose_msi_msg(), which itself is always invoked from
> atomic contexts from the guts of the interrupt core code.
> 
> There is no way to change this w/o rewriting the whole driver, so use
> tasklet_disable_in_atomic() which allows to make tasklet_disable()
> sleepable once the remaining atomic users are addressed.
> 
> Signed-off-by: Sebastian Andrzej Siewior 
> Signed-off-by: Thomas Gleixner 
> Cc: "K. Y. Srinivasan" 
> Cc: Haiyang Zhang 
> Cc: Stephen Hemminger 
> Cc: Wei Liu 
> Cc: Lorenzo Pieralisi 
> Cc: Rob Herring 
> Cc: Bjorn Helgaas 
> Cc: linux-hyp...@vger.kernel.org
> Cc: linux-...@vger.kernel.org

Acked-by: Bjorn Helgaas 

It'd be ideal if you could merge this as a group.  Let me know if you
want me to do anything else.

> ---
>  drivers/pci/controller/pci-hyperv.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1458,7 +1458,7 @@ static void hv_compose_msi_msg(struct ir
>* Prevents hv_pci_onchannelcallback() from running concurrently
>* in the tasklet.
>*/
> - tasklet_disable(>callback_event);
> + tasklet_disable_in_atomic(>callback_event);
>  
>   /*
>* Since this function is called with IRQ locks held, can't
>

Re: [PATCH v1 3/7] PCI: New Primary to Sideband (P2SB) bridge support library

2021-03-08 Thread Bjorn Helgaas

On Mon, Mar 08, 2021 at 09:16:50PM +0200, Andy Shevchenko wrote:
> On Mon, Mar 08, 2021 at 12:52:12PM -0600, Bjorn Helgaas wrote:
> > On Mon, Mar 08, 2021 at 02:20:16PM +0200, Andy Shevchenko wrote:
> > > From: Jonathan Yong 
> > > 
> > > There is already one and at least one more user is coming which
> > > requires an access to Primary to Sideband bridge (P2SB) in order to
> > > get IO or MMIO bar hidden by BIOS. Create a library to access P2SB
> > > for x86 devices.
> > 
> > Can you include a spec reference?
> 
> I'm not sure I have a public link to the spec. It's the 100 Series PCH [1].
> The document number to look for is 546955 [2] and there actually a bit of
> information about this.

This link, found by googling for "p2sb bridge", looks like it might
have relevant public links:

https://lab.whitequark.org/notes/2017-11-08/accessing-intel-ich-pch-gpios/

I'd prefer if you could dig out the relevant sections because I really
don't know how to identify them.

> > I'm trying to figure out why this
> > belongs in drivers/pci/.  It looks very device-specific.
> 
> Because it's all about access to PCI configuration spaces of the (hidden)
> devices.

The PCI core generally doesn't deal with device-specific config
registers.

> [1]: 
> https://ark.intel.com/content/www/us/en/ark/products/series/98456/intel-100-series-desktop-chipsets.html
> [2]: https://medium.com/@jacksonchen_43335/bios-gpio-p2sb-70e9b829b403
> 
> ...
> 
> > > +config PCI_P2SB
> > > + bool "Primary to Sideband (P2SB) bridge access support"
> > > + depends on PCI && X86
> > > + help
> > > +   The Primary to Sideband bridge is an interface to some PCI
> > > +   devices connected through it. In particular, SPI NOR
> > > +   controller in Intel Apollo Lake SoC is one of such devices.
> > 
> > This doesn't sound like a "bridge".  If it's a bridge, what's on the
> > primary (upstream) side?  What's on the secondary side?  What
> > resources are passed through the bridge, i.e., what transactions does
> > it transfer from one side to the other?
> 
> It's a confusion terminology here. It's a Bridge according to the spec, but
> it is *not* a PCI Bridge as you may had a first impression.

The code suggests that a register on this device controls whether a
different device is visible in config space.  I think it will be
better if we can describe what's happening.

> ...
> 
> > > + /* Unhide the P2SB device */
> > > + pci_bus_write_config_byte(bus, df, P2SBC_HIDE_BYTE, 0);
> > > +
> > > + /* Read the first BAR of the device in question */
> > > + __pci_bus_read_base(bus, devfn, pci_bar_unknown, mem, 
> > > PCI_BASE_ADDRESS_0, true);
> > 
> > I don't get this.  Apparently this normally hidden device is consuming
> > PCI address space.  The PCI core needs to know about this.  If it
> > doesn't, the PCI core may assign this space to another device.
> 
> Right, it returns all 1:s to any request so PCI core *thinks* it's
> plugged off (like D3cold or so).

I'm asking about the MMIO address space.  The BAR is a register in
config space.  AFAICT, clearing P2SBC_HIDE_BYTE makes that BAR
visible.  The BAR describes a region of PCI address space.  It looks
like setting P2SBC_HIDE_BIT makes the BAR disappear from config space,
but it sounds like the PCI address space *described* by the BAR is
still claimed by the device.  If the device didn't respond to that
MMIO space, you would have no reason to read the BAR at all.

So what keeps the PCI core from assigning that MMIO space to another
device?

This all sounds quite irregular from the point of view of the PCI
core.  If a device responds to address space that is not described by
a standard PCI BAR, or by an EA capability, or by one of the legacy
VGA or IDE exceptions, we have a problem.  That space must be
described *somehow* in a generic way, e.g., ACPI or similar.

What happens if CONFIG_PCI_P2SB is unset?  The device doesn't know
that, and if it is still consuming MMIO address space that we don't
know about, that's a problem.
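
As background for the BAR discussion, a minimal userspace sketch of how a raw 32-bit BAR value is interpreted, i.e., the sense in which the BAR "describes" a region of address space.  The mask values mirror the PCI_BASE_ADDRESS_* definitions in include/uapi/linux/pci_regs.h; struct bar_info and decode_bar() are hypothetical names, and BAR sizing is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror PCI_BASE_ADDRESS_* in include/uapi/linux/pci_regs.h */
#define BAR_SPACE_IO		0x01u
#define BAR_MEM_TYPE_MASK	0x06u
#define BAR_MEM_TYPE_64		0x04u
#define BAR_MEM_PREFETCH	0x08u
#define BAR_MEM_MASK		(~0x0fu)
#define BAR_IO_MASK		(~0x03u)

struct bar_info {
	bool present;	/* false if the config read returned all 1s */
	bool is_io;
	bool is_64bit;
	bool prefetch;
	uint32_t base_lo;	/* low 32 bits of the base address */
};

static struct bar_info decode_bar(uint32_t raw)
{
	struct bar_info info = { 0 };

	/* Config reads to a hidden or absent function return all 1s. */
	if (raw == 0xffffffffu)
		return info;

	info.present = true;

	if (raw & BAR_SPACE_IO) {
		info.is_io = true;
		info.base_lo = raw & BAR_IO_MASK;
		return info;
	}

	info.is_64bit = (raw & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64;
	info.prefetch = (raw & BAR_MEM_PREFETCH) != 0;
	info.base_lo = raw & BAR_MEM_MASK;
	return info;
}
```

The decoded base is only what config space reports; it says nothing about whether the device keeps responding at that address once the BAR is hidden again, which is the question above.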

> > > + /* Hide the P2SB device */
> > > + pci_bus_write_config_byte(bus, df, P2SBC_HIDE_BYTE, P2SBC_HIDE_BIT);
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
>

Re: [PATCH v4 2/2] nPCI: brcmstb: Use reset/rearm instead of deassert/assert

2021-03-08 Thread Bjorn Helgaas

If you update this, please fix the s/nPCI: /PCI: / in the subject

On Mon, Mar 08, 2021 at 02:50:37PM -0500, Jim Quinlan wrote:
> The Brcmstb PCIe RC uses a reset control "rescal" for certain chips.  This
> reset implements a "pulse reset" so it matches more the reset/rearm
> calls instead of the deassert/assert calls.

You say "also" below, but the paragraph above doesn't tell us the
*first* thing this patch does.  It just tells us that some chips use
"rescal" and that "rescal" implements a "pulse reset".

I guess you're replacing reset_control_deassert() with
reset_control_reset(), and reset_control_assert() with
reset_control_rearm().

It's not obvious to me that those are equivalent or why it's safe to
do this for all chips, including those that don't use the "rescal"
(since it sounds like only certain chips have that).

> Also, add reset_control calls in suspend/resume functions.
> 
> Fixes: 740d6c3708a9 ("PCI: brcmstb: Add control of rescal reset")
> Signed-off-by: Jim Quinlan 
> Acked-by: Florian Fainelli 
> ---
>  drivers/pci/controller/pcie-brcmstb.c | 19 +--
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-brcmstb.c 
> b/drivers/pci/controller/pcie-brcmstb.c
> index e330e6811f0b..18f23cba7e3a 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -1148,6 +1148,7 @@ static int brcm_pcie_suspend(struct device *dev)
>  
>   brcm_pcie_turn_off(pcie);
>   ret = brcm_phy_stop(pcie);
> + reset_control_rearm(pcie->rescal);
>   clk_disable_unprepare(pcie->clk);
>  
>   return ret;
> @@ -1163,9 +1164,13 @@ static int brcm_pcie_resume(struct device *dev)
>   base = pcie->base;
>   clk_prepare_enable(pcie->clk);
>  
> + ret = reset_control_reset(pcie->rescal);
> + if (ret)
> + goto err0;
> +
>   ret = brcm_phy_start(pcie);
>   if (ret)
> - goto err;
> + goto err1;
>  
>   /* Take bridge out of reset so we can access the SERDES reg */
>   pcie->bridge_sw_init_set(pcie, 0);
> @@ -1180,14 +1185,16 @@ static int brcm_pcie_resume(struct device *dev)
>  
>   ret = brcm_pcie_setup(pcie);
>   if (ret)
> - goto err;
> + goto err1;
>  
>   if (pcie->msi)
>   brcm_msi_set_regs(pcie->msi);
>  
>   return 0;
>  
> -err:
> +err1:
> + reset_control_rearm(pcie->rescal);
> +err0:
>   clk_disable_unprepare(pcie->clk);
>   return ret;
>  }
> @@ -1197,7 +1204,7 @@ static void __brcm_pcie_remove(struct brcm_pcie *pcie)
>   brcm_msi_remove(pcie);
>   brcm_pcie_turn_off(pcie);
>   brcm_phy_stop(pcie);
> - reset_control_assert(pcie->rescal);
> + reset_control_rearm(pcie->rescal);
>   clk_disable_unprepare(pcie->clk);
>  }
>  
> @@ -1278,13 +1285,13 @@ static int brcm_pcie_probe(struct platform_device 
> *pdev)
>   return PTR_ERR(pcie->perst_reset);
>   }
>  
> - ret = reset_control_deassert(pcie->rescal);
> + ret = reset_control_reset(pcie->rescal);
>   if (ret)
>   dev_err(&pdev->dev, "failed to deassert 'rescal'\n");
>  
>   ret = brcm_phy_start(pcie);
>   if (ret) {
> - reset_control_assert(pcie->rescal);
> + reset_control_rearm(pcie->rescal);
>   clk_disable_unprepare(pcie->clk);
>   return ret;
>   }
> -- 
> 2.17.1
>

Re: [PATCH v1 3/7] PCI: New Primary to Sideband (P2SB) bridge support library

2021-03-08 Thread Bjorn Helgaas

On Mon, Mar 08, 2021 at 02:20:16PM +0200, Andy Shevchenko wrote:
> From: Jonathan Yong 
> 
> There is already one and at least one more user is coming which
> requires an access to Primary to Sideband bridge (P2SB) in order to
> get IO or MMIO bar hidden by BIOS. Create a library to access P2SB
> for x86 devices.

Can you include a spec reference?  I'm trying to figure out why this
belongs in drivers/pci/.  It looks very device-specific.

> Signed-off-by: Jonathan Yong 
> Co-developed-by: Andy Shevchenko 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/pci/Kconfig  |  8 
>  drivers/pci/Makefile |  1 +
>  drivers/pci/pci-p2sb.c   | 83 
>  include/linux/pci-p2sb.h | 28 ++
>  4 files changed, 120 insertions(+)
>  create mode 100644 drivers/pci/pci-p2sb.c
>  create mode 100644 include/linux/pci-p2sb.h
> 
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 0c473d75e625..740e5b30d6fd 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -252,6 +252,14 @@ config PCIE_BUS_PEER2PEER
>  
>  endchoice
>  
> +config PCI_P2SB
> + bool "Primary to Sideband (P2SB) bridge access support"
> + depends on PCI && X86
> + help
> +   The Primary to Sideband bridge is an interface to some PCI
> +   devices connected through it. In particular, SPI NOR
> +   controller in Intel Apollo Lake SoC is one of such devices.

This doesn't sound like a "bridge".  If it's a bridge, what's on the
primary (upstream) side?  What's on the secondary side?  What
resources are passed through the bridge, i.e., what transactions does
it transfer from one side to the other?

>  source "drivers/pci/hotplug/Kconfig"
>  source "drivers/pci/controller/Kconfig"
>  source "drivers/pci/endpoint/Kconfig"
> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> index d62c4ac4ae1b..eee8d5dda7d9 100644
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -23,6 +23,7 @@ obj-$(CONFIG_PCI_IOV)   += iov.o
>  obj-$(CONFIG_PCI_BRIDGE_EMUL)+= pci-bridge-emul.o
>  obj-$(CONFIG_PCI_LABEL)  += pci-label.o
>  obj-$(CONFIG_X86_INTEL_MID)  += pci-mid.o
> +obj-$(CONFIG_PCI_P2SB)   += pci-p2sb.o
>  obj-$(CONFIG_PCI_SYSCALL)+= syscall.o
>  obj-$(CONFIG_PCI_STUB)   += pci-stub.o
>  obj-$(CONFIG_PCI_PF_STUB)+= pci-pf-stub.o
> diff --git a/drivers/pci/pci-p2sb.c b/drivers/pci/pci-p2sb.c
> new file mode 100644
> index ..68d7dad48cdb
> --- /dev/null
> +++ b/drivers/pci/pci-p2sb.c
> @@ -0,0 +1,83 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Primary to Sideband bridge (P2SB) access support
> + *
> + * Copyright (c) 2017, 2021 Intel Corporation.
> + *
> + * Authors: Andy Shevchenko 
> + *   Jonathan Yong 
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +
> +#include "pci.h"
> +
> +#define P2SBC_HIDE_BYTE  0xe1
> +#define P2SBC_HIDE_BIT   BIT(0)
> +
> +static const struct x86_cpu_id p2sb_cpu_ids[] = {
> + X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT,   PCI_DEVFN(13, 0)),
> + {}
> +};
> +
> +static int pci_p2sb_devfn(unsigned int *devfn)
> +{
> + const struct x86_cpu_id *id;
> +
> + id = x86_match_cpu(p2sb_cpu_ids);
> + if (!id)
> + return -ENODEV;
> +
> + *devfn = (unsigned int)id->driver_data;
> + return 0;
> +}
> +
> +/**
> + * pci_p2sb_bar - Get Primary to Sideband bridge (P2SB) device BAR
> + * @pdev:PCI device to get a PCI bus to communicate with
> + * @devfn:   PCI slot and function to communicate with
> + * @mem: memory resource to be filled in
> + *
> + * The BIOS prevents the P2SB device from being enumerated by the PCI
> + * subsystem, so we need to unhide and hide it back to lookup the BAR.
> + *
> + * Caller must provide a valid pointer to @mem.
> + *
> + * Locking is handled by pci_rescan_remove_lock mutex.
> + *
> + * Return:
> + * 0 on success or appropriate errno value on error.
> + */
> +int pci_p2sb_bar(struct pci_dev *pdev, unsigned int devfn, struct resource *mem)
> +{
> + struct pci_bus *bus = pdev->bus;
> + unsigned int df;
> + int ret;
> +
> + /* Get devfn for P2SB device itself */
> + ret = pci_p2sb_devfn(&df);
> + if (ret)
> + return ret;
> +
> + pci_lock_rescan_remove();
> +
> + /* Unhide the P2SB device */
> + pci_bus_write_config_byte(bus, df, P2SBC_HIDE_BYTE, 0);
> +
> + /* Read the first BAR of the device in question */
> + __pci_bus_read_base(bus, devfn, pci_bar_unknown, mem, PCI_BASE_ADDRESS_0, true);

I don't get this.  Apparently this normally hidden device is consuming
PCI address space.  The PCI core needs to know about this.  If it
doesn't, the PCI core may assign this space to another device.
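
To make the concern concrete: if the core has no record of the hidden BAR, nothing stops a later assignment from landing on the same addresses. The sketch below is a stand-alone model of that overlap test, assuming inclusive start/end ranges as in struct resource; struct range and ranges_overlap() are hypothetical names for illustration, not PCI core API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, modelled on the start/end fields of struct resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Two inclusive ranges collide when each one starts before the other ends. */
bool ranges_overlap(struct range a, struct range b)
{
	assert(a.start <= a.end && b.start <= b.end);
	return a.start <= b.end && b.start <= a.end;
}
```

An allocator that never sees the hidden device's range has nothing to run this kind of check against, which is the scenario described above.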

> + /* Hide the P2SB device */
> + pci_bus_write_config_byte(bus, df, P2SBC_HIDE_BYTE, P2SBC_HIDE_BIT);
> +
> + pci_unlock_rescan_remove();
> +
> +

Re: [PATCH V3 2/2] PCI: Add MCFG quirks for Tegra194 host controllers

2021-03-05 Thread Bjorn Helgaas

[+cc Krzysztof for .bus_shift below]

This is [2/2] but I don't see a [1/2].  Is there something missing?

On Sat, Jan 11, 2020 at 12:45:00AM +0530, Vidya Sagar wrote:
> The PCIe controller in Tegra194 SoC is not completely ECAM-compliant.
> With the current hardware design limitations in place, ECAM can be enabled
> only for one controller (C5 controller to be precise) with bus numbers
> starting from 160 instead of 0. A different approach is taken to avoid this
> abnormal way of enabling ECAM for just one controller but to enable
> configuration space access for all the other controllers. In this approach,
> ops are added through MCFG quirk mechanism which access the configuration
> spaces by dynamically programming iATU (internal Address Translation Unit)
> to generate respective configuration accesses just like the way it is
> done in DesignWare core sub-system.

Is this a published erratum in the device?  The purpose of specs is so
we can run existing code on new platforms without having to add quirks
like this, so I'm looking for some acknowledgement that this is an
issue that will be fixed in future designs.

Ideally this would be a URL to published errata, and we would include
the text or a synopsis here in the commit log.
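
For reference, the standard ECAM mapping that the quoted commit log says the hardware cannot fully honor gives every function a fixed 4 KiB window at an offset derived from bus, device and function. A minimal sketch, assuming the usual 20-bit bus shift; ecam_offset() is an illustrative helper, not a kernel function, and the offset is relative to the ECAM base for bus 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standard ECAM layout: 4 KiB of config space per function,
 * 8 functions per device, 32 devices per bus, 256 buses.
 * Returns the byte offset from the ECAM base for bus 0.
 */
uint64_t ecam_offset(unsigned int bus, unsigned int dev,
		     unsigned int fn, unsigned int reg)
{
	assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
	return ((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
	       ((uint64_t)fn << 12) | reg;
}
```

Under this layout, a controller whose buses start at 160 begins 160 MiB into the window, which is why a nonzero starting bus number is awkward for generic ECAM code.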

> Signed-off-by: Vidya Sagar 
> Reported-by: kbuild test robot 

What is this "Reported-by" telling me?  Normally this would be a
person who could supply more information about a defect we're fixing
and might be able to test the fix.

> ---
> V3:
> * Removed MCFG address hardcoding in pci_mcfg.c file
> * Started using 'dbi_base' for accessing root port's own config space
> * and using 'config_base' for accessing config space of downstream hierarchy
> 
> V2:
> * Fixed build issues reported by kbuild test bot

Ah, I see this is probably where the "Reported-by" came from.  To me,
it would make sense to add the tag if the commit *only* fixes the
build problem so it's obvious what the robot reported.

But here, the build fix got squashed in before merging, so it's more
like a general review comment and I think the robot's response on the
mailing list is probably enough.

>  drivers/acpi/pci_mcfg.c|   7 ++
>  drivers/pci/controller/dwc/Kconfig |   3 +-
>  drivers/pci/controller/dwc/Makefile|   2 +-
>  drivers/pci/controller/dwc/pcie-tegra194.c | 102 +
>  include/linux/pci-ecam.h   |   1 +
>  5 files changed, 113 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/acpi/pci_mcfg.c b/drivers/acpi/pci_mcfg.c
> index 6b347d9920cc..707181408173 100644
> --- a/drivers/acpi/pci_mcfg.c
> +++ b/drivers/acpi/pci_mcfg.c
> @@ -116,6 +116,13 @@ static struct mcfg_fixup mcfg_quirks[] = {
>   THUNDER_ECAM_QUIRK(2, 12),
>   THUNDER_ECAM_QUIRK(2, 13),
>  
> + { "NVIDIA", "TEGRA194", 1, 0, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 1, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 2, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 3, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 4, MCFG_BUS_ANY, _pcie_ops},
> + { "NVIDIA", "TEGRA194", 1, 5, MCFG_BUS_ANY, _pcie_ops},
> +
>  #define XGENE_V1_ECAM_MCFG(rev, seg) \
>   {"APM   ", "XGENE   ", rev, seg, MCFG_BUS_ANY, \
>   _v1_pcie_ecam_ops }
> diff --git a/drivers/pci/controller/dwc/Kconfig 
> b/drivers/pci/controller/dwc/Kconfig
> index 0830dfcfa43a..f5b9e75aceed 100644
> --- a/drivers/pci/controller/dwc/Kconfig
> +++ b/drivers/pci/controller/dwc/Kconfig
> @@ -255,7 +255,8 @@ config PCIE_TEGRA194
>   select PHY_TEGRA194_P2U
>   help
> Say Y here if you want support for DesignWare core based PCIe host
> -   controller found in NVIDIA Tegra194 SoC.
> +   controller found in NVIDIA Tegra194 SoC. ACPI platforms with Tegra194
> +   don't need to enable this.
>  
>  config PCIE_UNIPHIER
>   bool "Socionext UniPhier PCIe controllers"
> diff --git a/drivers/pci/controller/dwc/Makefile 
> b/drivers/pci/controller/dwc/Makefile
> index 8a637cfcf6e9..76a6c52b8500 100644
> --- a/drivers/pci/controller/dwc/Makefile
> +++ b/drivers/pci/controller/dwc/Makefile
> @@ -17,7 +17,6 @@ obj-$(CONFIG_PCIE_INTEL_GW) += pcie-intel-gw.o
>  obj-$(CONFIG_PCIE_KIRIN) += pcie-kirin.o
>  obj-$(CONFIG_PCIE_HISI_STB) += pcie-histb.o
>  obj-$(CONFIG_PCI_MESON) += pci-meson.o
> -obj-$(CONFIG_PCIE_TEGRA194) += pcie-tegra194.o
>  obj-$(CONFIG_PCIE_UNIPHIER) += pcie-uniphier.o
>  
>  # The following drivers are for devices that use the generic ACPI
> @@ -33,4 +32,5 @@ obj-$(CONFIG_PCIE_UNIPHIER) += pcie-uniphier.o
>  ifdef CONFIG_PCI
>  obj-$(CONFIG_ARM64) += pcie-al.o
>  obj-$(CONFIG_ARM64) += pcie-hisi.o
> +obj-$(CONFIG_ARM64) += pcie-tegra194.o
>  endif
> diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c 
> b/drivers/pci/controller/dwc/pcie-tegra194.c
> index cbe95f0ea0ca..660f55caa8be 100644
> --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> +++

Re: [PATCH] PCI: tegra: Disable PTM capabilities for EP mode

2021-03-05 Thread Bjorn Helgaas

On Fri, Mar 05, 2021 at 01:42:34PM +0530, Om Prakash Singh wrote:
> PCIe EP compliance expects PTM capabilities (ROOT_CAPABLE, RES_CAPABLE,
> CLK_GRAN) to be disabled.

I guess this is just enforcing the PCIe spec requirements that only
Root Ports, RCRBs, and Switches are allowed to set the PTM Responder
Capable bit, and that the Local Clock Granularity is RsvdP if PTM Root
Capable is zero?  (PCIe r5.0, sec 7.9.16.2)

Should this be done more generally somewhere in the dwc code as
opposed to in the tegra code?
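
As a rough illustration of the masking under discussion, the sketch below clears Responder Capable, Root Capable and Local Clock Granularity in a single step rather than in two read-modify-write passes. The bit values match the pci_regs.h definitions quoted in this patch; ptm_cap_for_endpoint() is a hypothetical stand-alone helper that works on a plain integer, not a dwc or PCI core function.

```c
#include <assert.h>
#include <stdint.h>

/* PTM Capability register bits, as in the quoted pci_regs.h hunk. */
#define PCI_PTM_CAP_REQ			0x00000001u	/* Requester capable */
#define PCI_PTM_CAP_RES			0x00000002u	/* Responder capable */
#define PCI_PTM_CAP_ROOT		0x00000004u	/* Root capable */
#define PCI_PTM_GRANULARITY_MASK	0x0000FF00u	/* Clock granularity */

/* Return the PTM Capability value with the non-endpoint fields cleared. */
uint32_t ptm_cap_for_endpoint(uint32_t cap)
{
	uint32_t out = cap & ~(PCI_PTM_CAP_RES | PCI_PTM_CAP_ROOT |
			       PCI_PTM_GRANULARITY_MASK);

	/* Requester Capable is left exactly as the hardware reported it. */
	assert((out & PCI_PTM_CAP_REQ) == (cap & PCI_PTM_CAP_REQ));
	return out;
}
```

In the driver the same mask would sit between one DBI read and one DBI write of PCI_PTM_CAP.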

> Signed-off-by: Om Prakash Singh 
> ---
>  drivers/pci/controller/dwc/pcie-tegra194.c | 17 -
>  include/uapi/linux/pci_regs.h  |  1 +
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c 
> b/drivers/pci/controller/dwc/pcie-tegra194.c
> index 6fa216e..a588312 100644
> --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> @@ -1639,7 +1639,7 @@ static void pex_ep_event_pex_rst_deassert(struct tegra_pcie_dw *pcie)
>   struct dw_pcie *pci = &pcie->pci;
>   struct dw_pcie_ep *ep = &pcie->ep;
>   struct device *dev = pcie->dev;
> - u32 val;
> + u32 val, ptm_cap_base = 0;

Unnecessary init.

>   int ret;
>  
>   if (pcie->ep_state == EP_STATE_ENABLED)
> @@ -1760,6 +1760,21 @@ static void pex_ep_event_pex_rst_deassert(struct 
> tegra_pcie_dw *pcie)
> PCI_CAP_ID_EXP);
>   clk_set_rate(pcie->core_clk, GEN4_CORE_CLK_FREQ);
>  
> + /* Disable PTM root and responder capability */
> + ptm_cap_base = dw_pcie_find_ext_capability(&pcie->pci,
> +PCI_EXT_CAP_ID_PTM);
> + if (ptm_cap_base) {
> + dw_pcie_dbi_ro_wr_en(pci);
> + val = dw_pcie_readl_dbi(pci, ptm_cap_base + PCI_PTM_CAP);
> + val &= ~PCI_PTM_CAP_ROOT;
> + dw_pcie_writel_dbi(pci, ptm_cap_base + PCI_PTM_CAP, val);
> +
> + val = dw_pcie_readl_dbi(pci, ptm_cap_base + PCI_PTM_CAP);
> + val &= ~(PCI_PTM_CAP_RES | PCI_PTM_GRANULARITY_MASK);
> + dw_pcie_writel_dbi(pci, ptm_cap_base + PCI_PTM_CAP, val);
> + dw_pcie_dbi_ro_wr_dis(pci);
> + }
> +
>   val = (ep->msi_mem_phys & MSIX_ADDR_MATCH_LOW_OFF_MASK);
>   val |= MSIX_ADDR_MATCH_LOW_OFF_EN;
>   dw_pcie_writel_dbi(pci, MSIX_ADDR_MATCH_LOW_OFF, val);
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index e709ae8..9dd6f8d 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -1050,6 +1050,7 @@
>  /* Precision Time Measurement */
>  #define PCI_PTM_CAP  0x04/* PTM Capability */
>  #define  PCI_PTM_CAP_REQ 0x0001  /* Requester capable */
> +#define  PCI_PTM_CAP_RES 0x0002  /* Responder capable */
>  #define  PCI_PTM_CAP_ROOT0x0004  /* Root capable */
>  #define  PCI_PTM_GRANULARITY_MASK0xFF00  /* Clock granularity */
>  #define PCI_PTM_CTRL 0x08/* PTM Control */
> -- 
> 2.7.4
>

Re: [PATCH 3/3] PCI: Convert rtw88 power cycle quirk to shutdown quirk

2021-03-04 Thread Bjorn Helgaas

[+cc Rafael, linux-pm]

On Thu, Mar 04, 2021 at 02:07:18PM +0800, Kai-Heng Feng wrote:
> On Sat, Feb 27, 2021 at 2:17 AM Bjorn Helgaas  wrote:
> > On Fri, Feb 26, 2021 at 02:31:31PM +0100, Heiner Kallweit wrote:
> > > On 26.02.2021 13:18, Kai-Heng Feng wrote:
> > > > On Fri, Feb 26, 2021 at 8:10 PM Heiner Kallweit  
> > > > wrote:
> > > >>
> > > >> On 26.02.2021 08:12, Kalle Valo wrote:
> > > >>> Kai-Heng Feng  writes:
> > > >>>
> > > >>>> Now we have a generic D3 shutdown quirk, so convert the original
> > > >>>> approach to a PCI quirk.
> > > >>>>
> > > >>>> Signed-off-by: Kai-Heng Feng 
> > > >>>> ---
> > > >>>>  drivers/net/wireless/realtek/rtw88/pci.c | 2 --
> > > >>>>  drivers/pci/quirks.c | 6 ++
> > > >>>>  2 files changed, 6 insertions(+), 2 deletions(-)
> > > >>>
> > > >>> It would have been nice to CC linux-wireless also on patches 1-2. I 
> > > >>> only
> > > >>> saw patch 3 and had to search the rest of patches from lkml.
> > > >>>
> > > >>> I assume this goes via the PCI tree so:
> > > >>>
> > > >>> Acked-by: Kalle Valo 
> > > >>
> > > >> To me it looks odd to (mis-)use the quirk mechanism to set a device
> > > >> to D3cold on shutdown. As I see it the quirk mechanism is used to work
> > > >> around certain device misbehavior. And setting a device to a D3
> > > >> state on shutdown is a normal activity, and the shutdown() callback
> > > >> seems to be a good place for it.
> > > >> I miss an explanation what the actual benefit of the change is.
> > > >
> > > > > To make putting devices into D3 more generic, as more than one device
> > > > > needs the quirk.
> > > >
> > > > Here's the discussion:
> > > > https://lore.kernel.org/linux-usb/00de6927-3fa6-a9a3-2d65-2b4d4e8f0...@linux.intel.com/
> > > >
> > >
> > > Thanks for the link. For the AMD USB use case I don't have a strong 
> > > opinion,
> > > what's considered the better option may be a question of personal taste.
> > > For rtw88 however I'd still consider it over-engineering to replace a 
> > > simple
> > > call to pci_set_power_state() with a PCI quirk.
> > > I may be biased here because I find it sometimes bothering if I want to
> > > look up how a device is handled and in addition to checking the respective
> > > driver I also have to grep through quirks.c whether there's any special
> > > handling.
> >
> > I haven't looked at these patches carefully, but in general, I agree
> > that quirks should be used to work around hardware defects in the
> > device.  If the device behaves correctly per spec, we should use a
> > different mechanism so the code remains generic and all devices get
> > the benefit.
> >
> > If we do add quirks, the commit log should explain what the device
> > defect is.
> 
> So maybe it's reasonable to put all PCI devices to D3 at shutdown?

I don't know off-hand.  I added Rafael and linux-pm in case they do.

If not, I suggest working up a patch to do that and a commit log that
explains why that's a good idea and then we can have a discussion
about it.  This thread really doesn't have that justification.  It
says "putting device X in D3cold at shutdown saves 0.03w while in S5",
but doesn't explain why that's safe or desirable for all devices.

Bjorn

Re: [RFC PATCH 5/6] PCI: designware: Add SiFive FU740 PCIe host controller driver

2021-03-04 Thread Bjorn Helgaas

Make the subject like this:

  PCI: fu740: Add SiFive FU740 PCIe host controller driver

since you're adding a "fu740" driver, not a "designware" driver.
Future commits will then look like:

  PCI: fu740: ...

On Tue, Mar 02, 2021 at 06:59:16PM +0800, Greentime Hu wrote:
> From: Paul Walmsley 
> 
> Add driver for the SiFive FU740 PCIe host controller.
> This controller is based on the DesignWare PCIe core.
> 
> Co-developed-by: Henry Styles 
> Signed-off-by: Henry Styles 
> Co-developed-by: Erik Danie 
> Signed-off-by: Erik Danie 
> Co-developed-by: Greentime Hu 
> Signed-off-by: Greentime Hu 
> Signed-off-by: Paul Walmsley 
> ---
>  drivers/pci/controller/dwc/Kconfig  |   9 +
>  drivers/pci/controller/dwc/Makefile |   1 +
>  drivers/pci/controller/dwc/pcie-fu740.c | 455 
>  3 files changed, 465 insertions(+)
>  create mode 100644 drivers/pci/controller/dwc/pcie-fu740.c
> 
> diff --git a/drivers/pci/controller/dwc/Kconfig 
> b/drivers/pci/controller/dwc/Kconfig
> index 22c5529e9a65..0a37d21ed64e 100644
> --- a/drivers/pci/controller/dwc/Kconfig
> +++ b/drivers/pci/controller/dwc/Kconfig
> @@ -318,4 +318,13 @@ config PCIE_AL
> required only for DT-based platforms. ACPI platforms with the
> Annapurna Labs PCIe controller don't need to enable this.
>  
> +config PCIE_FU740
> + bool "SiFive FU740 PCIe host controller"
> + depends on PCI_MSI_IRQ_DOMAIN
> + depends on SOC_SIFIVE || COMPILE_TEST
> + select PCIE_DW_HOST
> + help
> +   Say Y here if you want PCIe controller support for the SiFive
> +   FU740.
> +
>  endmenu
> diff --git a/drivers/pci/controller/dwc/Makefile 
> b/drivers/pci/controller/dwc/Makefile
> index a751553fa0db..625f6aaeb5b8 100644
> --- a/drivers/pci/controller/dwc/Makefile
> +++ b/drivers/pci/controller/dwc/Makefile
> @@ -5,6 +5,7 @@ obj-$(CONFIG_PCIE_DW_EP) += pcie-designware-ep.o
>  obj-$(CONFIG_PCIE_DW_PLAT) += pcie-designware-plat.o
>  obj-$(CONFIG_PCI_DRA7XX) += pci-dra7xx.o
>  obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o
> +obj-$(CONFIG_PCIE_FU740) += pcie-fu740.o
>  obj-$(CONFIG_PCI_IMX6) += pci-imx6.o
>  obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o
>  obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone.o
> diff --git a/drivers/pci/controller/dwc/pcie-fu740.c 
> b/drivers/pci/controller/dwc/pcie-fu740.c
> new file mode 100644
> index ..6916eea40ea5
> --- /dev/null
> +++ b/drivers/pci/controller/dwc/pcie-fu740.c
> @@ -0,0 +1,455 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * FU740 DesignWare PCIe Controller integration
> + * Copyright (C) 2019-2021 SiFive, Inc.
> + * Paul Walmsley
> + * Greentime Hu
> + *
> + * Based in part on the i.MX6 PCIe host controller shim which is:
> + *
> + * Copyright (C) 2013 Kosagi
> + *   https://www.kosagi.com
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "pcie-designware.h"
> +
> +#define to_fu740_pcie(x) dev_get_drvdata((x)->dev)
> +
> +struct fu740_pcie {
> + struct dw_pcie *pci;
> + void __iomem *mgmt_base;
> + int perstn_gpio;
> + int pwren_gpio;
> + struct clk *pcie_aux;
> + struct reset_control *rst;
> +};
> +
> +#define SIFIVE_DEVICESRESETREG   0x28
> +
> +#define PCIEX8MGMT_PERST_N   0x0
> +#define PCIEX8MGMT_APP_LTSSM_ENABLE  0x10
> +#define PCIEX8MGMT_APP_HOLD_PHY_RST  0x18
> +#define PCIEX8MGMT_DEVICE_TYPE   0x708
> +#define PCIEX8MGMT_PHY0_CR_PARA_ADDR 0x860
> +#define PCIEX8MGMT_PHY0_CR_PARA_RD_EN0x870
> +#define PCIEX8MGMT_PHY0_CR_PARA_RD_DATA  0x878
> +#define PCIEX8MGMT_PHY0_CR_PARA_SEL  0x880
> +#define PCIEX8MGMT_PHY0_CR_PARA_WR_DATA  0x888
> +#define PCIEX8MGMT_PHY0_CR_PARA_WR_EN0x890
> +#define PCIEX8MGMT_PHY0_CR_PARA_ACK  0x898
> +#define PCIEX8MGMT_PHY1_CR_PARA_ADDR 0x8a0
> +#define PCIEX8MGMT_PHY1_CR_PARA_RD_EN0x8b0
> +#define PCIEX8MGMT_PHY1_CR_PARA_RD_DATA  0x8b8
> +#define PCIEX8MGMT_PHY1_CR_PARA_SEL  0x8c0
> +#define PCIEX8MGMT_PHY1_CR_PARA_WR_DATA  0x8c8
> +#define PCIEX8MGMT_PHY1_CR_PARA_WR_EN0x8d0
> +#define PCIEX8MGMT_PHY1_CR_PARA_ACK  0x8d8
> +
> +/* PCIe Port Logic registers (memory-mapped) */
> +#define PL_OFFSET0x700
> +#define PCIE_PL_GEN2_CTRL_OFF(PL_OFFSET + 0x10c)
> +#define PCIE_PL_DIRECTED_SPEED_CHANGE_OFF0x2
> +
> +#define PCIE_PHY_MAX_RETRY_CNT   1000
> +
> +static void fu740_pcie_assert_perstn(struct fu740_pcie *afp)
> +{
> + /* PERST_N GPIO */
> + if (gpio_is_valid(afp->perstn_gpio))
> + gpio_direction_output(afp->perstn_gpio, 0);
> +
> +

Re: RFC: sysfs node for Secondary PCI bus reset (PCIe Hot Reset)

2021-03-01 Thread Bjorn Helgaas

[+cc Alex, reset expert]

On Mon, Mar 01, 2021 at 06:12:21PM +0100, Pali Rohár wrote:
> Hello!
> 
> PCIe card can be reset via in-band Hot Reset signal which can be
> triggered by PCIe bridge via Secondary Bus Reset bit in PCI config
> space.
> 
> Kernel already exports sysfs node "reset" for triggering Functional
> Reset of particular function of PCI device. But in some cases Functional
> Reset is not enough and Hot Reset is required.
> 
> Following RFC patch exports sysfs node "reset_bus" for PCI bridges which
> triggers Secondary Bus Reset and therefore for PCIe bridges it resets
> connected PCIe card.
> 
> What do you think about it?
> 
> Currently there is userspace script which can trigger PCIe Hot Reset by
> modifying PCI config space from userspace:
> 
> https://alexforencich.com/wiki/en/pcie/hot-reset-linux
> 
> But because kernel already provides way how to trigger Functional Reset
> it could provide also way how to trigger PCIe Hot Reset.
> 
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 50fcb62d59b5..f5e11c589498 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1321,6 +1321,30 @@ static ssize_t reset_store(struct device *dev, struct 
> device_attribute *attr,
>  
>  static DEVICE_ATTR(reset, 0200, NULL, reset_store);
>  
> +static ssize_t reset_bus_store(struct device *dev, struct device_attribute 
> *attr,
> +const char *buf, size_t count)
> +{
> + struct pci_dev *pdev = to_pci_dev(dev);
> + unsigned long val;
> + ssize_t result = kstrtoul(buf, 0, &val);
> +
> + if (result < 0)
> + return result;
> +
> + if (val != 1)
> + return -EINVAL;
> +
> + pm_runtime_get_sync(dev);
> + result = pci_bridge_secondary_bus_reset(pdev);
> + pm_runtime_put(dev);
> + if (result < 0)
> + return result;
> +
> + return count;
> +}
> +
> +static DEVICE_ATTR(reset_bus, 0200, NULL, reset_bus_store);
> +
>  static int pci_create_capabilities_sysfs(struct pci_dev *dev)
>  {
>   int retval;
> @@ -1332,8 +1356,15 @@ static int pci_create_capabilities_sysfs(struct 
> pci_dev *dev)
>   if (retval)
>   goto error;
>   }
> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
> + retval = device_create_file(&dev->dev, &dev_attr_reset_bus);
> + if (retval)
> + goto error_reset_bus;
> + }
>   return 0;
>  
> +error_reset_bus:
> + device_remove_file(&dev->dev, &dev_attr_reset);
>  error:
>   pcie_vpd_remove_sysfs_dev_files(dev);
>   return retval;
> @@ -1414,6 +1445,8 @@ static void pci_remove_capabilities_sysfs(struct 
> pci_dev *dev)
>   device_remove_file(&dev->dev, &dev_attr_reset);
>   dev->reset_fn = 0;
>   }
> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
> + device_remove_file(&dev->dev, &dev_attr_reset_bus);
>  }
>  
>  /**
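
The input check in the quoted reset_bus_store() (parse a number, accept only the value 1) can be tried out in userspace. The sketch below is an approximation using strtoul(). As far as I understand kstrtoul(), it is stricter about leading whitespace and reports overflow as -ERANGE, whereas this version folds every rejection into -EINVAL. reset_request_valid() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Userspace approximation of the check in the quoted store function.
 * Returns 0 if the buffer holds the value 1 (with an optional trailing
 * newline), -EINVAL for anything else.
 */
int reset_request_valid(const char *buf)
{
	char *end;
	unsigned long val;

	assert(buf != NULL);

	errno = 0;
	val = strtoul(buf, &end, 0);	/* base 0: accepts 1, 0x1, 01 */
	if (errno || end == buf)
		return -EINVAL;		/* overflow or no digits at all */

	if (*end == '\n')		/* "echo 1" appends a newline */
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */

	return val == 1 ? 0 : -EINVAL;
}
```

A write such as "echo 1" would pass this check, while "0", "1x" or an empty string would be rejected before any reset is attempted.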

Re: [PATCH 3/3] PCI: Convert rtw88 power cycle quirk to shutdown quirk

2021-02-26 Thread Bjorn Helgaas

On Fri, Feb 26, 2021 at 02:31:31PM +0100, Heiner Kallweit wrote:
> On 26.02.2021 13:18, Kai-Heng Feng wrote:
> > On Fri, Feb 26, 2021 at 8:10 PM Heiner Kallweit  
> > wrote:
> >>
> >> On 26.02.2021 08:12, Kalle Valo wrote:
> >>> Kai-Heng Feng  writes:
> >>>
> >>>> Now we have a generic D3 shutdown quirk, so convert the original
> >>>> approach to a PCI quirk.
> >>>>
> >>>> Signed-off-by: Kai-Heng Feng 
> >>>> ---
> >>>>  drivers/net/wireless/realtek/rtw88/pci.c | 2 --
> >>>>  drivers/pci/quirks.c | 6 ++
> >>>>  2 files changed, 6 insertions(+), 2 deletions(-)
> >>>
> >>> It would have been nice to CC linux-wireless also on patches 1-2. I only
> >>> saw patch 3 and had to search the rest of patches from lkml.
> >>>
> >>> I assume this goes via the PCI tree so:
> >>>
> >>> Acked-by: Kalle Valo 
> >>
> >> To me it looks odd to (mis-)use the quirk mechanism to set a device
> >> to D3cold on shutdown. As I see it the quirk mechanism is used to work
> >> around certain device misbehavior. And setting a device to a D3
> >> state on shutdown is a normal activity, and the shutdown() callback
> >> seems to be a good place for it.
> >> I miss an explanation what the actual benefit of the change is.
> > 
> > To make putting devices into D3 more generic, as more than one device
> > needs the quirk.
> > 
> > Here's the discussion:
> > https://lore.kernel.org/linux-usb/00de6927-3fa6-a9a3-2d65-2b4d4e8f0...@linux.intel.com/
> > 
> 
> Thanks for the link. For the AMD USB use case I don't have a strong opinion,
> what's considered the better option may be a question of personal taste.
> For rtw88 however I'd still consider it over-engineering to replace a simple
> call to pci_set_power_state() with a PCI quirk.
> I may be biased here because I find it sometimes bothering if I want to
> look up how a device is handled and in addition to checking the respective
> driver I also have to grep through quirks.c whether there's any special
> handling.

I haven't looked at these patches carefully, but in general, I agree
that quirks should be used to work around hardware defects in the
device.  If the device behaves correctly per spec, we should use a
different mechanism so the code remains generic and all devices get
the benefit.

If we do add quirks, the commit log should explain what the device
defect is.

Bjorn

Re: [PATCH 1/2] PCI: controller: thunder: fix compile testing

2021-02-25 Thread Bjorn Helgaas

On Thu, Feb 25, 2021 at 09:44:12AM -0800, Kuppuswamy, Sathyanarayanan wrote:
> On 2/25/21 6:37 AM, Arnd Bergmann wrote:
> > From: Arnd Bergmann 
> > 
> > Compile-testing these drivers is currently broken. Enabling
> > it causes a couple of build failures though:
> > 
> > drivers/pci/controller/pci-thunder-ecam.c:119:30: error: shift count >= 
> > width of type [-Werror,-Wshift-count-overflow]
> > drivers/pci/controller/pci-thunder-pem.c:54:2: error: implicit declaration 
> > of function 'writeq' [-Werror,-Wimplicit-function-declaration]
> > drivers/pci/controller/pci-thunder-pem.c:392:8: error: implicit declaration 
> > of function 'acpi_get_rc_resources' 
> > [-Werror,-Wimplicit-function-declaration]
> > 
> > Fix them with the obvious one-line changes.
> Looks good to me.

Thanks for looking this over!  I'd like to acknowledge your review,
but I need an explicit Reviewed-by or similar.  I don't want to put
words in your mouth by converting "Looks good to me" to "Reviewed-by".

[GIT PULL v2] PCI changes for v5.12

2021-02-25 Thread Bjorn Helgaas

The following changes since commit 7c53f6b671f4aba70ff15e1b05148b10d58c2837:

  Linux 5.11-rc3 (2021-01-10 14:34:50 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git tags/pci-v5.12-changes

for you to fetch changes up to e18fb64b79860cf5f381208834b8fbc493ef7cbc:

  Merge branch 'remotes/lorenzo/pci/misc' (2021-02-24 14:59:25 -0600)


This is a resend to explain the recent dates of some of these commits.
This material has all appeared in linux-next.  More details at [1],
but I fixed typos in commit logs and documentation and put a few
patches on different topic branches, resulting in recent commit dates.

The SHA1 above (e18fb64b7986) is different from yesterday's because I
added my Signed-off-by to a couple patches that I missed.

I apologize for the confusion.

Bjorn

[1] https://lore.kernel.org/r/20210224214036.GA1586541@bjorn-Precision-5520



Enumeration:
  - Remove unnecessary locking around _OSC (Bjorn Helgaas)
  - Clarify message about _OSC failure (Bjorn Helgaas)
  - Remove notification of PCIe bandwidth changes (Bjorn Helgaas)
  - Tidy checking of syscall user config accessors (Heiner Kallweit)

Resource management:
  - Decline to resize resources if boot config must be preserved (Ard
Biesheuvel)
  - Fix pci_register_io_range() memory leak (Geert Uytterhoeven)

Error handling (Keith Busch):
  - Clear error status from the correct device
  - Retain error recovery status so drivers can use it after reset
  - Log the type of Port (Root or Switch Downstream) that we reset
  - Always request a reset for Downstream Ports in frozen state

Endpoint framework and NTB (Kishon Vijay Abraham I):
  - Make *_get_first_free_bar() take into account 64 bit BAR
  - Add helper API to get the 'next' unreserved BAR
  - Make *_free_bar() return error codes on failure
  - Remove unused pci_epf_match_device()
  - Add support to associate secondary EPC with EPF
  - Add support in configfs to associate two EPCs with EPF
  - Add pci_epc_ops to map MSI IRQ
  - Add pci_epf_ops to expose function-specific attrs
  - Allow user to create sub-directory of 'EPF Device' directory
  - Implement ->msi_map_irq() ops for cadence
  - Configure LM_EP_FUNC_CFG based on epc->function_num_map for cadence
  - Add EP function driver to provide NTB functionality
  - Add support for EPF PCI Non-Transparent Bridge
  - Add specification for PCI NTB function device
  - Add PCI endpoint NTB function user guide
  - Add configfs binding documentation for pci-ntb endpoint function

Broadcom STB PCIe controller driver:
  - Add support for BCM4908 and external PERST# signal controller (Rafał
Miłecki)

Cadence PCIe controller driver:
  - Retrain Link to work around Gen2 training defect (Nadeem Athani)
  - Fix merge botch in cdns_pcie_host_map_dma_ranges() (Krzysztof
Wilczyński)

Freescale Layerscape PCIe controller driver:
  - Add LX2160A rev2 EP mode support (Hou Zhiqiang)
  - Convert to builtin_platform_driver() (Michael Walle)

MediaTek PCIe controller driver:
  - Fix OF node reference leak (Krzysztof Wilczyński)

Microchip PolarFire PCIe controller driver:
  - Add Microchip PolarFire PCIe controller driver (Daire McNamara)

Qualcomm PCIe controller driver:
  - Use PHY_REFCLK_USE_PAD only for ipq8064 (Ansuel Smith)
  - Add support for ddrss_sf_tbu clock for sm8250 (Dmitry Baryshkov)

Renesas R-Car PCIe controller driver:
  - Drop PCIE_RCAR config option (Lad Prabhakar)
  - Always allocate MSI addresses in 32bit space (Marek Vasut)

Rockchip PCIe controller driver:
  - Add FriendlyARM NanoPi M4B DT binding (Chen-Yu Tsai)
  - Make 'ep-gpios' DT property optional (Chen-Yu Tsai)

Synopsys DesignWare PCIe controller driver:
  - Work around ECRC configuration hardware defect (Vidya Sagar)
  - Drop support for config space in DT 'ranges' (Rob Herring)
  - Change size to u64 for EP outbound iATU (Shradha Todi)
  - Add upper limit address for outbound iATU (Shradha Todi)
  - Make dw_pcie ops optional (Jisheng Zhang)
  - Remove unnecessary dw_pcie_ops from al driver (Jisheng Zhang)

Xilinx Versal CPM PCIe controller driver:
  - Fix OF node reference leak (Pan Bian)

Miscellaneous:
  - Remove tango host controller driver (Arnd Bergmann)
  - Remove IRQ handler & data together (altera-msi, brcmstb, dwc) (Martin
Kaiser)
  - Fix xgene-msi race in installing chained IRQ handler (Martin Kaiser)
  - Apply CONFIG_PCI_DEBUG to entire drivers/pci hierarchy (Junhao He)
  - Fix pci-bridge-emul array overruns (Russell King)
  - Remove obsolete uses of WARN_ON(in_interrupt()) (Sebastian Andrzej
Siewior)


Ansuel Smith (1):
  PCI: qcom: Use PHY_REFCLK_USE_PAD only for ipq8064

Ard Biesheuvel (1):
  PCI: Decline to resize resources if boot config must be preserved

Arnd Bergmann (1):
  PCI: Remove tango host controller driver

Bjorn Helgaas (28):

Re: [GIT PULL] PCI changes for v5.12

2021-02-24 Thread Bjorn Helgaas

On Wed, Feb 24, 2021 at 11:21:44AM -0800, Linus Torvalds wrote:
> On Wed, Feb 24, 2021 at 11:03 AM Bjorn Helgaas  wrote:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git 
> > tags/pci-v5.12-changes
> 
> I pulled this, but I'm now unpulling it again.
> 
> Why are many of those commits only two hours old, and most of the rest
> is from yesterday?
> 
> Has any of this been in linux-next?

Sorry about the mess.  This has been in linux-next.  Most of the
recent commit dates are from typos I fixed in commit logs and
documentation patches.  A few are because I also sorted a few patches
onto different topic branches.

Here's a little history of the pci/next branch from linux-next and
from my pull request:

  next-20210222: pci/next 84c8d3d0b60e
  next-20210223: pci/next 4cb431e82c25
$ git diff 84c8d3d0b60e 4cb431e82c25
- fix documentation and comment typos, whitespace issues
- add fc235fcb0f7c ("PCI: acpiphp: Remove unused acpiphp_callback typedef")
- add f8ee579d53ac ("PCI: pci-bridge-emul: Fix array overruns, improve 
safety")
- add f6bda644fa3a ("PCI: Fix pci_register_io_range() memory leak")
- add d2bb2f9e1af6 ("PCI/ASPM: Move LTR, ASPM L1SS save/restore into PCIe 
save/restore")
- add e34a4f0b7001 ("PCI/ASPM: Move LTR save/restore state functions 
earlier")

  next-20210224: pci/next 6039bd61b69f
$ git diff 4cb431e82c25 6039bd61b69f
- drop d2bb2f9e1af6 ("PCI/ASPM: Move LTR, ASPM L1SS save/restore into PCIe 
save/restore")
- drop e34a4f0b7001 ("PCI/ASPM: Move LTR save/restore state functions 
earlier")
- dropped these cosmetic changes

  pci-v5.12-changes: 2bd36c391515
$ git diff 6039bd61b69f 2bd36c391515

- no content changes; changed commit logs and moved patches
  between topic branches

  pci-v5.12-changes: e18fb64b7986 (updated)
$ git diff 2bd36c391515 e18fb64b7986

- no content changes; added Signed-off-by for patches moved to
  topic branch

I'll send you a new pull request because I forgot to add my sign-off
on a couple patches I had moved to a topic branch.

Sorry again.

Bjorn

Re: linux-next: Signed-off-by missing for commit in the pci tree

2021-02-24 Thread Bjorn Helgaas

On Thu, Feb 25, 2021 at 07:21:31AM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Commits
> 
>   557fb5faf4ca ("PCI: qcom: Add support for ddrss_sf_tbu clock")
>   3d0e5cf9c062 ("dt-bindings: PCI: qcom: Document ddrss_sf_tbu clock for 
> sm8250")
> 
> are missing a Signed-off-by from their committer.

Thanks, Stephen, I fixed these.  Sigh, it's not my day ;)

[GIT PULL] PCI changes for v5.12

2021-02-24 Thread Bjorn Helgaas

The following changes since commit 7c53f6b671f4aba70ff15e1b05148b10d58c2837:

  Linux 5.11-rc3 (2021-01-10 14:34:50 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git 
tags/pci-v5.12-changes

for you to fetch changes up to 2bd36c391515cba855b8db8ae5708154f1082b8e:

  Merge branch 'remotes/lorenzo/pci/misc' (2021-02-24 11:17:05 -0600)



Enumeration:
  - Remove unnecessary locking around _OSC (Bjorn Helgaas)
  - Clarify message about _OSC failure (Bjorn Helgaas)
  - Remove notification of PCIe bandwidth changes (Bjorn Helgaas)
  - Tidy checking of syscall user config accessors (Heiner Kallweit)

Resource management:
  - Decline to resize resources if boot config must be preserved (Ard
Biesheuvel)
  - Fix pci_register_io_range() memory leak (Geert Uytterhoeven)

Error handling (Keith Busch):
  - Clear error status from the correct device
  - Retain error recovery status so drivers can use it after reset
  - Log the type of Port (Root or Switch Downstream) that we reset
  - Always request a reset for Downstream Ports in frozen state

Endpoint framework and NTB (Kishon Vijay Abraham I):
  - Make *_get_first_free_bar() take into account 64 bit BAR
  - Add helper API to get the 'next' unreserved BAR
  - Make *_free_bar() return error codes on failure
  - Remove unused pci_epf_match_device()
  - Add support to associate secondary EPC with EPF
  - Add support in configfs to associate two EPCs with EPF
  - Add pci_epc_ops to map MSI IRQ
  - Add pci_epf_ops to expose function-specific attrs
  - Allow user to create sub-directory of 'EPF Device' directory
  - Implement ->msi_map_irq() ops for cadence
  - Configure LM_EP_FUNC_CFG based on epc->function_num_map for cadence
  - Add EP function driver to provide NTB functionality
  - Add support for EPF PCI Non-Transparent Bridge
  - Add specification for PCI NTB function device
  - Add PCI endpoint NTB function user guide
  - Add configfs binding documentation for pci-ntb endpoint function

Broadcom STB PCIe controller driver:
  - Add support for BCM4908 and external PERST# signal controller (Rafał
Miłecki)

Cadence PCIe controller driver:
  - Retrain Link to work around Gen2 training defect (Nadeem Athani)
  - Fix merge botch in cdns_pcie_host_map_dma_ranges() (Krzysztof
Wilczyński)

Freescale Layerscape PCIe controller driver:
  - Add LX2160A rev2 EP mode support (Hou Zhiqiang)
  - Convert to builtin_platform_driver() (Michael Walle)

MediaTek PCIe controller driver:
  - Fix OF node reference leak (Krzysztof Wilczyński)

Microchip PolarFire PCIe controller driver:
  - Add Microchip PolarFire PCIe controller driver (Daire McNamara)

Qualcomm PCIe controller driver:
  - Use PHY_REFCLK_USE_PAD only for ipq8064 (Ansuel Smith)
  - Add support for ddrss_sf_tbu clock for sm8250 (Dmitry Baryshkov)

Renesas R-Car PCIe controller driver:
  - Drop PCIE_RCAR config option (Lad Prabhakar)
  - Always allocate MSI addresses in 32bit space (Marek Vasut)

Rockchip PCIe controller driver:
  - Add FriendlyARM NanoPi M4B DT binding (Chen-Yu Tsai)
  - Make 'ep-gpios' DT property optional (Chen-Yu Tsai)

Synopsys DesignWare PCIe controller driver:
  - Work around ECRC configuration hardware defect (Vidya Sagar)
  - Drop support for config space in DT 'ranges' (Rob Herring)
  - Change size to u64 for EP outbound iATU (Shradha Todi)
  - Add upper limit address for outbound iATU (Shradha Todi)
  - Make dw_pcie ops optional (Jisheng Zhang)
  - Remove unnecessary dw_pcie_ops from al driver (Jisheng Zhang)

Xilinx Versal CPM PCIe controller driver:
  - Fix OF node reference leak (Pan Bian)

Miscellaneous:
  - Remove tango host controller driver (Arnd Bergmann)
  - Remove IRQ handler & data together (altera-msi, brcmstb, dwc) (Martin
Kaiser)
  - Fix xgene-msi race in installing chained IRQ handler (Martin Kaiser)
  - Apply CONFIG_PCI_DEBUG to entire drivers/pci hierarchy (Junhao He)
  - Fix pci-bridge-emul array overruns (Russell King)
  - Remove obsolete uses of WARN_ON(in_interrupt()) (Sebastian Andrzej
Siewior)


Ansuel Smith (1):
  PCI: qcom: Use PHY_REFCLK_USE_PAD only for ipq8064

Ard Biesheuvel (1):
  PCI: Decline to resize resources if boot config must be preserved

Arnd Bergmann (1):
  PCI: Remove tango host controller driver

Bjorn Helgaas (28):
  PCI/ACPI: Make acpi_pci_osc_control_set() static
  PCI/ACPI: Remove unnecessary osc_lock
  PCI/ACPI: Clarify message about _OSC failure
  PCI: xgene: Fix CRS SV comment
  PCI: hv: Fix typo
  Fix "ordering" comment typos
  MAINTAINERS: Fix 'ARM/TEXAS INSTRUMENT KEYSTONE CLOCKSOURCE' 
capitalization
  PCI/LINK: Remove bandwidth notification
  Merge branch 'pci/enumeration'
  Merge branch 'pci/error'
  Merge branch 'pci/hotplug'
  Merge branch 'pci/link'

Re: [PATCH] PCI: hotplug: Remove unused function pointer typedef acpiphp_callback

2021-02-18 Thread Bjorn Helgaas

On Tue, Feb 16, 2021 at 10:38:40AM +0800, Chen Lin wrote:
> From: Chen Lin 
> 
> Remove the 'acpiphp_callback' typedef as it is not used.
> 
> Signed-off-by: Chen Lin 

Applied to pci/hotplug for v5.12, thanks!

> ---
>  drivers/pci/hotplug/acpiphp.h |3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/acpiphp.h b/drivers/pci/hotplug/acpiphp.h
> index a2094c0..a74b274 100644
> --- a/drivers/pci/hotplug/acpiphp.h
> +++ b/drivers/pci/hotplug/acpiphp.h
> @@ -176,9 +176,6 @@ struct acpiphp_attention_info
>  int acpiphp_register_hotplug_slot(struct acpiphp_slot *slot, unsigned int 
> sun);
>  void acpiphp_unregister_hotplug_slot(struct acpiphp_slot *slot);
>  
> -/* acpiphp_glue.c */
> -typedef int (*acpiphp_callback)(struct acpiphp_slot *slot, void *data);
> -
>  int acpiphp_enable_slot(struct acpiphp_slot *slot);
>  int acpiphp_disable_slot(struct acpiphp_slot *slot);
>  u8 acpiphp_get_power_status(struct acpiphp_slot *slot);
> -- 
> 1.7.9.5
> 
>

Re: [PATCH v7 04/15] PCI: Add pci_find_vsec_capability() to find a specific VSEC

2021-02-18 Thread Bjorn Helgaas

On Thu, Feb 18, 2021 at 08:03:58PM +0100, Gustavo Pimentel wrote:
> Add pci_find_vsec_capability() to locate a Vendor-Specific Extended
> Capability with the specified VSEC ID.
> 
> The Vendor-Specific Extended Capability (VSEC) allows one or more
> proprietary capabilities defined by the vendor which aren't standard
> or shared between vendors.
> 
> Signed-off-by: Gustavo Pimentel 

Beautiful, thanks!

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/pci.c   | 30 ++
>  include/linux/pci.h |  1 +
>  2 files changed, 31 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b9fecc2..aef217c 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -693,6 +693,36 @@ u8 pci_find_ht_capability(struct pci_dev *dev, int 
> ht_cap)
>  EXPORT_SYMBOL_GPL(pci_find_ht_capability);
>  
>  /**
> + * pci_find_vsec_capability - Find a vendor-specific extended capability
> + * @dev: PCI device to query
> + * @vendor: Vendor ID for which capability is defined
> + * @cap: Vendor-specific capability ID
> + *
> + * If @dev has Vendor ID @vendor, search for a VSEC capability with
> + * VSEC ID @cap. If found, return the capability offset in
> + * config space; otherwise return 0.
> + */
> +u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap)
> +{
> + u16 vsec = 0;
> + u32 header;
> +
> + if (vendor != dev->vendor)
> + return 0;
> +
> + while ((vsec = pci_find_next_ext_capability(dev, vsec,
> +  PCI_EXT_CAP_ID_VNDR))) {
> + if (pci_read_config_dword(dev, vsec + PCI_VNDR_HEADER,
> +   &header) == PCIBIOS_SUCCESSFUL &&
> + PCI_VNDR_HEADER_ID(header) == cap)
> + return vsec;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(pci_find_vsec_capability);
> +
> +/**
>   * pci_find_parent_resource - return resource region of parent bus of given
>   * region
>   * @dev: PCI device structure contains resources to be searched
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b32126d..814f814 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1077,6 +1077,7 @@ u8 pci_find_next_ht_capability(struct pci_dev *dev, u8 
> pos, int ht_cap);
>  u16 pci_find_ext_capability(struct pci_dev *dev, int cap);
>  u16 pci_find_next_ext_capability(struct pci_dev *dev, u16 pos, int cap);
>  struct pci_bus *pci_find_next_bus(const struct pci_bus *from);
> +u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap);

If you do any updates for other reasons, slide this up one more line
so we have:

  u16 pci_find_ext_capability(struct pci_dev *dev, int cap);
  u16 pci_find_next_ext_capability(struct pci_dev *dev, u16 pos, int cap);
  u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap);

  struct pci_bus *pci_find_next_bus(const struct pci_bus *from);

I don't know why pci_find_next_bus() got stuck with the capability
things.  It doesn't have anything to do with finding capabilities.  It
goes more with pci_get_device(), etc.

But don't roll the series just for that.

>  u64 pci_get_dsn(struct pci_dev *dev);
>  
> -- 
> 2.7.4
>

Re: [v4] PCI: Avoid unsync of LTR mechanism configuration

2021-02-18 Thread Bjorn Helgaas

On Thu, Feb 04, 2021 at 05:51:25PM +0800, mingchuang.q...@mediatek.com wrote:
> From: Mingchuang Qiao 
> 
> In bus scan flow, the "LTR Mechanism Enable" bit of DEVCTL2 register is
> configured in pci_configure_ltr(). If device and bridge both support LTR
> mechanism, the "LTR Mechanism Enable" bit of device and bridge will be
> enabled in DEVCTL2 register. And pci_dev->ltr_path will be set as 1.
> 
> If PCIe link goes down when device resets, the "LTR Mechanism Enable" bit
> of bridge will change to 0 according to PCIe r5.0, sec 7.5.3.16. However,
> the pci_dev->ltr_path value of bridge is still 1.
> 
> For following conditions, check and re-configure "LTR Mechanism Enable" bit
> of bridge to make "LTR Mechanism Enable" bit match ltr_path value.
>-before configuring device's LTR for hot-remove/hot-add
>-before restoring device's DEVCTL2 register when restore device state

There's definitely a bug here.  The commit log should say a little
more about what it is.  I *think* if LTR is enabled and we suspend
(putting the device in D3cold) and resume, LTR probably doesn't work
after resume because LTR is disabled in the upstream bridge, which
would be an obvious bug.

Also, if a device with LTR enabled is hot-removed, and we hot-add a
device, I think LTR will not work on the new device.  Possibly also a
bug, although I'm not convinced we know how to configure LTR on the
new device anyway.

So I'd *like* to merge the bug fix for v5.12, but I think I'll wait
because of the issue below.

> Signed-off-by: Mingchuang Qiao 
> ---
> changes of v4
>  -fix typo of commit message
>  -rename: pci_reconfigure_bridge_ltr()->pci_bridge_reconfigure_ltr()
> changes of v3
>  -call pci_reconfigure_bridge_ltr() in probe.c
> changes of v2
>  -modify patch description
>  -reconfigure bridge's LTR before restoring device DEVCTL2 register
> ---
>  drivers/pci/pci.c   | 25 +
>  drivers/pci/pci.h   |  1 +
>  drivers/pci/probe.c | 13 ++---
>  3 files changed, 36 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b9fecc25d213..6bf65d295331 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1437,6 +1437,24 @@ static int pci_save_pcie_state(struct pci_dev *dev)
>   return 0;
>  }
>  
> +void pci_bridge_reconfigure_ltr(struct pci_dev *dev)
> +{
> +#ifdef CONFIG_PCIEASPM
> + struct pci_dev *bridge;
> + u32 ctl;
> +
> + bridge = pci_upstream_bridge(dev);
> + if (bridge && bridge->ltr_path) {
> + pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2, &ctl);
> + if (!(ctl & PCI_EXP_DEVCTL2_LTR_EN)) {
> + pci_dbg(bridge, "re-enabling LTR\n");
> + pcie_capability_set_word(bridge, PCI_EXP_DEVCTL2,
> +  PCI_EXP_DEVCTL2_LTR_EN);

This pattern of updating the upstream bridge on behalf of "dev" is
problematic because it's racy:

  CPU 1                     CPU 2
  -----                     -----
  ctl = read(DEVCTL2)       ctl = read(DEVCTL2)
  ctl |= DEVCTL2_LTR_EN     ctl |= DEVCTL2_ARI
  write(DEVCTL2, ctl)
                            write(DEVCTL2, ctl)

Now the bridge has ARI set, but not LTR_EN.
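
As a rough user-space illustration of the lost update in the interleaving
above (not kernel code: devctl2, racy_update() and serialized_update() are
made-up names, and the interleaving is replayed by hand instead of with
real threads):

```c
#include <assert.h>
#include <stdint.h>

/* Device Control 2 bits, as defined in include/uapi/linux/pci_regs.h */
#define DEVCTL2_ARI     0x0020	/* ARI Forwarding Enable */
#define DEVCTL2_LTR_EN  0x0400	/* LTR Mechanism Enable */

/* Hypothetical stand-in for the bridge's DEVCTL2 register */
static uint16_t devctl2;

/*
 * Replay the interleaving: both CPUs read the register before either
 * writes it back, so the later write overwrites the earlier one.
 */
static uint16_t racy_update(void)
{
	uint16_t cpu1, cpu2;

	devctl2 = 0;
	cpu1 = devctl2;			/* CPU 1: ctl = read(DEVCTL2) */
	cpu2 = devctl2;			/* CPU 2: ctl = read(DEVCTL2) */
	cpu1 |= DEVCTL2_LTR_EN;
	cpu2 |= DEVCTL2_ARI;
	devctl2 = cpu1;			/* CPU 1: write(DEVCTL2, ctl) */
	devctl2 = cpu2;			/* CPU 2: write(DEVCTL2, ctl) */
	return devctl2;			/* LTR_EN has been lost */
}

/*
 * Same two updates, but each read-modify-write finishes before the
 * next one starts, as it would if both paths held a common lock.
 */
static uint16_t serialized_update(void)
{
	devctl2 = 0;
	devctl2 |= DEVCTL2_LTR_EN;
	devctl2 |= DEVCTL2_ARI;
	return devctl2;
}
```

racy_update() ends with only the ARI bit set, while serialized_update()
keeps both bits.  Whatever the real fix looks like, it needs the second
property for updates made to a bridge on behalf of its children.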

We have the same problem in the pci_enable_device() path.  The most
recent try at fixing it is [1].

[1] 
https://lore.kernel.org/linux-pci/20201218174011.340514-2-s.miroshniche...@yadro.com/

> + }
> + }
> +#endif
> +}
> +
>  static void pci_restore_pcie_state(struct pci_dev *dev)
>  {
>   int i = 0;
> @@ -1447,6 +1465,13 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
>   if (!save_state)
>   return;
>  
> + /*
> +  * Downstream ports reset the LTR enable bit when link goes down.
> +  * Check and re-configure the bit here before restoring device.
> +  * PCIe r5.0, sec 7.5.3.16.
> +  */
> + pci_bridge_reconfigure_ltr(dev);
> +
>   cap = (u16 *)&save_state->cap.data[0];
>   pcie_capability_write_word(dev, PCI_EXP_DEVCTL, cap[i++]);
>   pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]);
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 5c59365092fa..b3a5e5287cb7 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -111,6 +111,7 @@ void pci_free_cap_save_buffers(struct pci_dev *dev);
>  bool pci_bridge_d3_possible(struct pci_dev *dev);
>  void pci_bridge_d3_update(struct pci_dev *dev);
>  void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev);
> +void pci_bridge_reconfigure_ltr(struct pci_dev *dev);
>  
>  static inline void pci_wakeup_event(struct pci_dev *dev)
>  {
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 953f15abc850..ade055e9fb58 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2132,9 +2132,16 @@ static void pci_configure_ltr(struct pci_dev *dev)
>* Complex and all intermediate Switches indicate support for LTR.
>* PCIe r4.0, sec 6.18.
>*/
> -

Re: [PATCH] PCI: Fix memory leak in pci_register_io_range()

2021-02-17 Thread Bjorn Helgaas

On Tue, Feb 02, 2021 at 11:03:32AM +0100, Geert Uytterhoeven wrote:
> Kmemleak reports:
> 
> unreferenced object 0xc328de40 (size 64):
>   comm "kworker/1:1", pid 21, jiffies 4294938212 (age 1484.670s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 e0 d8 fc eb 00 00 00 00  
> 00 00 10 fe 00 00 00 00 00 00 00 00 00 00 00 00  
> 
> backtrace:
>   [] pci_register_io_range+0x3c/0x80
>   [<2c7f139e>] of_pci_range_to_resource+0x48/0xc0
>   [] 
> devm_of_pci_get_host_bridge_resources.constprop.0+0x2ac/0x3ac
>   [] devm_of_pci_bridge_init+0x60/0x1b8
>   [] devm_pci_alloc_host_bridge+0x54/0x64
>   [] rcar_pcie_probe+0x2c/0x644
> 
> In case a PCI host driver's probe is deferred, the same I/O range may be
> allocated again, and be ignored, causing a memory leak.
> 
> Fix this by (a) letting logic_pio_register_range() return -EEXIST if the
> passed range already exists, so pci_register_io_range() will free it,
> and by (b) making pci_register_io_range() not consider -EEXIST an error
> condition.
> 
> Signed-off-by: Geert Uytterhoeven 

Applied to pci/enumeration for v5.12, thanks!
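
For readers skimming the thread, the ownership rule the patch establishes
can be sketched in user space roughly like this.  All toy_* names are
invented for illustration, and the list and allocation counter only model
what logic_pio and kfree() do:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_range {
	const void *fwnode;		/* identity of the firmware node */
	struct toy_range *next;
};

static struct toy_range *toy_range_list;
static int toy_live_allocs;		/* allocated and not yet freed */
static const int toy_fwnode = 0;	/* stand-in for a fwnode_handle */

/* Models logic_pio_register_range(): report duplicates with -EEXIST */
static int toy_register_range(struct toy_range *new_range)
{
	struct toy_range *r;

	for (r = toy_range_list; r; r = r->next)
		if (r->fwnode == new_range->fwnode)
			return -EEXIST;	/* caller still owns new_range */

	new_range->next = toy_range_list;
	toy_range_list = new_range;
	return 0;
}

/* Models pci_register_io_range(): free on any failure, forgive -EEXIST */
static int toy_register_io_range(const void *fwnode)
{
	struct toy_range *range = calloc(1, sizeof(*range));
	int ret;

	if (!range)
		return -ENOMEM;
	toy_live_allocs++;
	range->fwnode = fwnode;

	ret = toy_register_range(range);
	if (ret) {
		free(range);
		toy_live_allocs--;
	}

	/* Ignore duplicates due to deferred probing */
	if (ret == -EEXIST)
		ret = 0;

	return ret;
}
```

Registering the same node twice, as a deferred probe would, succeeds both
times and leaves exactly one live allocation.  Before the fix the second
range was neither listed nor freed.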

> ---
>  drivers/pci/pci.c | 4 
>  lib/logic_pio.c   | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 09b03cfba8894955..c651003e304a2b71 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4037,6 +4037,10 @@ int pci_register_io_range(struct fwnode_handle 
> *fwnode, phys_addr_t addr,
>   ret = logic_pio_register_range(range);
>   if (ret)
>   kfree(range);
> +
> + /* Ignore duplicates due to deferred probing */
> + if (ret == -EEXIST)
> + ret = 0;
>  #endif
>  
>   return ret;
> diff --git a/lib/logic_pio.c b/lib/logic_pio.c
> index f32fe481b4922bc1..07b4b9a1f54b6bf5 100644
> --- a/lib/logic_pio.c
> +++ b/lib/logic_pio.c
> @@ -28,6 +28,8 @@ static DEFINE_MUTEX(io_range_mutex);
>   * @new_range: pointer to the IO range to be registered.
>   *
>   * Returns 0 on success, the error code in case of failure.
> + * If the range already exists, -EEXIST will be returned, which should be
> + * considered a success.
>   *
>   * Register a new IO range node in the IO range list.
>   */
> @@ -51,6 +53,7 @@ int logic_pio_register_range(struct logic_pio_hwaddr 
> *new_range)
>   list_for_each_entry(range, &io_range_list, list) {
>   if (range->fwnode == new_range->fwnode) {
>   /* range already there */
> + ret = -EEXIST;
>   goto end_register;
>   }
>   if (range->flags == LOGIC_PIO_CPU_MMIO &&
> -- 
> 2.25.1
>

Re: [PATCH v6 04/15] PCI: Add pci_find_vsec_capability() to find a specific VSEC

2021-02-17 Thread Bjorn Helgaas

[+cc Krzysztof, since he commented on a previous version]
[+cc Lukas, who previously proposed exactly what I suggest below,
sorry for repeating.  I think Lukas was right to propose passing in
the vendor ID because it makes it easier to read the caller.]

When you post new versions of a series, please cc people who commented
on previous versions.

On Fri, Feb 12, 2021 at 06:37:39PM +0100, Gustavo Pimentel wrote:
> Adds another helper to ones that already exist called
> pci_find_vsec_capability. This helper crawls through the device PCI
> config space searching for a specific ID on the Vendor-Specific Extended
> Capabilities section.

  Add pci_find_vsec_capability() to locate a Vendor-Specific Extended
  Capability with the specified VSEC ID.
> 
> The Vendor-Specific Extended Capability (VSEC) is a special PCI
> capability (acts like container) defined by PCI-SIG that allows the one
> or more proprietary capabilities defined by the vendor which aren't
> standard or shared between the manufactures.

s/is a special ... by PCI-SIG that//
s/allows the one/allows one/
s/the manufactures/manufacturers/ (or maybe "vendors" to match previous use)

> Signed-off-by: Gustavo Pimentel 
> ---
>  drivers/pci/pci.c | 34 ++
>  include/linux/pci.h   |  2 ++
>  include/uapi/linux/pci_regs.h |  6 ++
>  3 files changed, 42 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b9fecc2..628aa9f 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -693,6 +693,40 @@ u8 pci_find_ht_capability(struct pci_dev *dev, int 
> ht_cap)
>  EXPORT_SYMBOL_GPL(pci_find_ht_capability);
>  
>  /**
> + * pci_find_vsec_capability - Find a vendor-specific extended capability
> + * @dev: PCI device to query
> + * @cap: vendor-specific capability ID code
> + *
> + * Typically this function will be called by the PCI driver, which passes
> + * through argument the 'struct pci_dev *' already pointing for the device
> + * config space that is associated with the vendor and device ID which will
> + * know which ID to search and what to do with it, however, there might be
> + * cases that this function could be called outside of this scope and
> + * therefore is the caller responsibility to check the vendor and/or
> + * device ID first.

This is important because it's a bit subtle.  IIUC, each vendor
(identified by Vendor ID at 0x00 in config space) can define its own
VSEC IDs, so it can define up to 2^16 == 64K VSEC structures.

Of course there's not room for that many in config space; but the
point is that the vendor chooses its own VSEC IDs and doesn't need to
coordinate with anybody.

So a VSEC ID 0x0006 in a Synopsys device (Vendor ID 0x16c3) has
nothing to do with a VSEC ID 0x0006 in an Intel device (Vendor ID
0x8086), and it's up to the caller to make sure it's using the correct
one.

I wonder if it would help avoid mistakes if we made the interface look
like this:

  u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int vsec_cap_id)
  {
if (vendor != dev->vendor)
  return 0;

while ((vsec = ...))
  ...
  }

so calls would look like this:

  vsec = pci_find_vsec_capability(dev, PCI_VENDOR_ID_SYNOPSYS, 
DW_PCIE_VSEC_DMA_ID);

which would make it more obvious that DW_PCIE_VSEC_DMA_ID is only
valid in a Synopsys device.
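
To make the vendor scoping concrete, here is a self-contained user-space
sketch of the lookup over a fake config space.  It is only an illustration
under assumptions: struct toy_dev, the cfg_*() helpers, toy_find_vsec()
and the example device layout are all invented, and only the header field
layout (capability ID in bits 15:0, next pointer in bits 31:20, VSEC ID in
bits 15:0 of the dword at offset 4) is taken from the spec:

```c
#include <assert.h>
#include <stdint.h>

#define CFG_SIZE        4096
#define EXT_CAP_START   0x100	/* first extended capability */
#define EXT_CAP_ID_VNDR 0x000b	/* Vendor-Specific Extended Capability */
#define VNDR_HEADER     4	/* VSEC header offset within the capability */

/* Hypothetical device model: a Vendor ID plus raw config space */
struct toy_dev {
	uint16_t vendor;
	uint8_t cfg[CFG_SIZE];
};

/* Config space is little-endian regardless of the host */
static uint32_t cfg_read32(const struct toy_dev *dev, uint16_t off)
{
	const uint8_t *p = &dev->cfg[off];

	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void cfg_write32(struct toy_dev *dev, uint16_t off, uint32_t val)
{
	for (int i = 0; i < 4; i++)
		dev->cfg[off + i] = (uint8_t)(val >> (8 * i));
}

/* Return the offset of the VSEC with @vsec_id, or 0 if absent */
static uint16_t toy_find_vsec(const struct toy_dev *dev, uint16_t vendor,
			      uint16_t vsec_id)
{
	uint16_t pos = EXT_CAP_START;
	int ttl = (CFG_SIZE - EXT_CAP_START) / 8;	/* guard against loops */

	/* VSEC IDs mean nothing outside the vendor that defined them */
	if (vendor != dev->vendor)
		return 0;

	while (pos >= EXT_CAP_START && pos <= CFG_SIZE - 8 && ttl-- > 0) {
		uint32_t hdr = cfg_read32(dev, pos);

		if (hdr == 0)
			break;
		if ((hdr & 0xffff) == EXT_CAP_ID_VNDR &&
		    (cfg_read32(dev, pos + VNDR_HEADER) & 0xffff) == vsec_id)
			return pos;
		pos = (hdr >> 20) & 0xffc;	/* next capability offset */
	}
	return 0;
}

/* Made-up device with Vendor ID 0x16c3: one non-VSEC cap, two VSECs */
static const struct toy_dev *toy_example_dev(void)
{
	static struct toy_dev dev;

	dev.vendor = 0x16c3;
	cfg_write32(&dev, 0x100, 0x0001 | (1u << 16) | (0x140u << 20));
	cfg_write32(&dev, 0x140, EXT_CAP_ID_VNDR | (1u << 16) | (0x160u << 20));
	cfg_write32(&dev, 0x140 + VNDR_HEADER, 0x0002);
	cfg_write32(&dev, 0x160, EXT_CAP_ID_VNDR | (1u << 16));
	cfg_write32(&dev, 0x160 + VNDR_HEADER, 0x0006);
	return &dev;
}
```

With this layout, asking for VSEC ID 0x0006 under vendor 0x16c3 finds the
capability at 0x160, while the same VSEC ID under vendor 0x8086 returns 0
without walking the list at all.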

The function comment could be something like this:

  pci_find_vsec_capability - Find a vendor-specific extended capability
  @dev: PCI device to query
  @vendor: Vendor ID for which capability is defined
  @vsec_cap_id: Vendor-specific capability ID

  If @dev has Vendor ID @vendor, search for a VSEC capability with
  VSEC ID @vsec_cap_id.  If found, return the capability offset in
  config space; otherwise return 0.

Or maybe it's even more subtle than I thought, and I'm missing
something :)

> + * Returns the address of the vendor-specific structure that matches the
> + * requested capability ID code within the device's PCI configuration space
> + * or 0 if it does not find a match.
> + */
> +u16 pci_find_vsec_capability(struct pci_dev *dev, int vsec_cap_id)
> +{
> + u16 vsec = 0;
> + u32 header;
> +
> + while ((vsec = pci_find_next_ext_capability(dev, vsec,
> +  PCI_EXT_CAP_ID_VNDR))) {
> + if (pci_read_config_dword(dev, vsec + PCI_VSEC_HDR,
> +   &header) == PCIBIOS_SUCCESSFUL &&
> + PCI_VSEC_CAP_ID(header) == vsec_cap_id)
> + return vsec;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(pci_find_vsec_capability);
> +
> +/**
>   * pci_find_parent_resource - return resource region of parent bus of given
>   * region
>   * @dev: PCI device structure contains resources to be searched
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b32126d..da6ab6a 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1080,6 +1080,8 @@

Re: [PATCH] PCI: quirk for preventing bus reset on TI C667X

2021-02-17 Thread Bjorn Helgaas

On Thu, Jan 21, 2021 at 05:55:47PM -0600, Bjorn Helgaas wrote:
> On Tue, Jan 12, 2021 at 03:36:43PM +, Antti Järvinen wrote:
> > TI C667X does not support bus/hot reset.
> > See https://e2e.ti.com/support/processors/f/791/t/954382
> 
> You can cite the URL as the source, but the URL will eventually become
> stale, so let's include the relevant details here directly.  

Thanks for trying the experiment below.  I'll look for a repost that
includes details from the URL directly in the commit log.

> From the forum, it looks like the device doesn't respond after a
> reset (config accesses return ~0).  It seems somewhat surprising that
> something as basic as a reset would be completely broken.  I wonder if
> we're not doing the reset correctly.
> 
> It looks like we would probably be trying a Secondary Bus Reset using
> the bridge leading to the C667X.  Can you confirm?  Wonder if you
> could try doing what pci_reset_secondary_bus() does by hand:
> 
>   # BRIDGE=...  # PCI address, e.g., 00:1c.0
>   # C667X=...
>   # setpci -s$C667X VENDOR_ID.w
>   # setpci -s$BRIDGE BRIDGE_CONTROL.w   # prints "val"
>   # setpci -s$BRIDGE BRIDGE_CONTROL.w=  # val | 0x40 (set SBR)
>   # sleep 1
>   # setpci -s$BRIDGE BRIDGE_CONTROL.w=  # val (clear SBR)
>   # sleep 1
>   # setpci -s$C667X VENDOR_ID.w=0
>   # setpci -s$C667X VENDOR_ID.w
> 
> If we use this quirk and avoid the reset, I assume that means
> assigning the device to VMs with VFIO will leak state between VMs?
> 
> > Signed-off-by: Antti Järvinen 
> > ---
> >  drivers/pci/quirks.c | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 653660e3ba9e..c8fcf24c5bd0 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3578,6 +3578,12 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 
> > 0x0034, quirk_no_bus_reset);
> >   */
> >  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset);
> >  
> > +/*
> > + * Some TI keystone C667X devices do no support bus/hot reset.
> > + * https://e2e.ti.com/support/processors/f/791/t/954382
> > + */
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TI, 0xb005, quirk_no_bus_reset);
> > +
> >  static void quirk_no_pm_reset(struct pci_dev *dev)
> >  {
> > /*
> > -- 
> > 2.17.1
> >

Re: [PATCH] PCI : check if type 0 devices have all BARs of size zero

2021-02-16 Thread Bjorn Helgaas

On Tue, Feb 16, 2021 at 07:52:08AM +, Wasim Khan wrote:
> > -Original Message-
> > From: Bjorn Helgaas 
> > Sent: Tuesday, February 16, 2021 2:43 AM
> > To: Wasim Khan (OSS) 
> > Cc: bhelg...@google.com; linux-...@vger.kernel.org; linux-
> > ker...@vger.kernel.org; Wasim Khan 
> > Subject: Re: [PATCH] PCI : check if type 0 devices have all BARs of size 
> > zero
> > 
> > On Fri, Feb 12, 2021 at 11:08:56AM +0100, Wasim Khan wrote:
> > > From: Wasim Khan 
> > >
> > > Log a message if all BARs of type 0 devices are of size zero. This can
> > > help detecting type 0 devices not reporting BAR size correctly.
> > 
> > I could be missing something, but I don't think we can do this.  I
> > would think the simplest possible presilicon testing would find
> > errors like this, and the first attempt to have a driver claim the
> > device would fail if required BARs were missing, so I'm not sure
> > what this would add.
> 
> Thank you for the review.
> I observed this issue with an under development EP. Due to some
> logic problem in EP's firmware, the BAR sizes were reported zero and
> crash was observed sometime later in PCIe code. 

I'm interested in this crash.  The PCI core should not crash just
because a BAR size is zero, i.e., the BAR looks like it's
unimplemented.
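
For context, a simplified sketch of how a 32-bit BAR is sized, and why a
readback of zero just means "not implemented".  This is an illustration
only: toy_bar_size() is a made-up helper, it ignores 64-bit BARs and ROM
BARs, and it is not the code in __pci_read_base():

```c
#include <assert.h>
#include <stdint.h>

#define BAR_SPACE_IO  0x1u	/* bit 0 set: I/O BAR, clear: memory BAR */
#define BAR_IO_MASK   (~0x3u)	/* address bits of an I/O BAR */
#define BAR_MEM_MASK  (~0xfu)	/* address bits of a memory BAR */

/*
 * @orig:   value the BAR held before sizing (only the type bit is used)
 * @sizing: value read back after writing 0xffffffff to the BAR
 *
 * Returns the decoded size in bytes, or 0 if no address bit is
 * writable, i.e., the BAR looks unimplemented.
 */
static uint32_t toy_bar_size(uint32_t orig, uint32_t sizing)
{
	uint32_t mask = (orig & BAR_SPACE_IO) ? BAR_IO_MASK : BAR_MEM_MASK;
	uint32_t bits = sizing & mask;

	if (!bits)
		return 0;

	/* The lowest writable address bit gives the BAR size */
	return bits & (~bits + 1);
}
```

A memory BAR that reads back 0xfff00000 decodes to 1MB, and one that reads
back 0 decodes to size 0, which callers should treat as "no BAR here", not
as an error.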

> I agree with you that such issues should have been caught in
> pre-silicon testing, but not sure of pre-si testing details and if
> the issue was specifically observed with real OS. Also, because the
> EP is in early stage of development, device driver of EP is not
> available as of now. 

> So, I though it will be a good idea to print an information message
> only for *type 0* devices to give a quick hint if the zero BAR size
> is expected for the given EP or not. So that SW can contribute to
> identify HW problem.

> > While the subject line says "type 0 devices," this code path is
> > also used for type 1 devices (bridges), and it's quite common for
> > bridges to have no BARs, which means they would all be hardwired
> > to zero.
> 
> Yes, for type 1 devices, it is common to have zero BAR size, so I
> added log msg for type 0 devices only , which are in-general
> expected to have valid BARs.

Oh, right, I missed your check of dev->hdr_type.

> > It is also legal for even type 0 devices to implement no BARs.
> > They may be operated entirely via config space or via
> > device-specific BARs that are unknown to the PCI core.
> 
> OK, I did not know this . Thank you for sharing this.

This is actually quite common.  On my garden-variety laptop, this:

  $ lspci -v | grep -E "^(\S|(Memory|I/O))"

finds two type 0 devices that have no BARs:

  00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor 
Host Bridge/DRAM Registers
  00:1f.0 ISA bridge: Intel Corporation CM238 Chipset LPC/eSPI Controller

I don't really want to add more dmesg logging for things like this
that are working correctly.  In this case, I think the best solution
is to either keep this patch in your private branch for testing or to
manually inspect the dmesg log, where we already log every BAR we
discover, for devices that should have BARs but don't.

> > > Signed-off-by: Wasim Khan 
> > > ---
> > >  drivers/pci/probe.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index
> > > 953f15abc850..6438d6d56777 100644
> > > --- a/drivers/pci/probe.c
> > > +++ b/drivers/pci/probe.c
> > > @@ -321,6 +321,7 @@ int __pci_read_base(struct pci_dev *dev, enum
> > > pci_bar_type type,  static void pci_read_bases(struct pci_dev *dev,
> > > unsigned int howmany, int rom)  {
> > >   unsigned int pos, reg;
> > > + bool found = false;
> > >
> > >   if (dev->non_compliant_bars)
> > >   return;
> > > @@ -333,8 +334,12 @@ static void pci_read_bases(struct pci_dev *dev,
> > unsigned int howmany, int rom)
> > >   struct resource *res = &dev->resource[pos];
> > >   reg = PCI_BASE_ADDRESS_0 + (pos << 2);
> > >   pos += __pci_read_base(dev, pci_bar_unknown, res, reg);
> > > + found |= res->flags ? 1 : 0;
> > >   }
> > >
> > > + if (!dev->hdr_type && !found)
> > > + pci_info(dev, "BAR size is 0 for BAR[0..%d]\n", howmany - 1);
> > > +
> > >   if (rom) {
> > >   struct resource *res = &dev->resource[PCI_ROM_RESOURCE];
> > >   dev->rom_base_reg = rom;
> > > --
> > > 2.25.1
> > >

Re: [PATCH] PCI : check if type 0 devices have all BARs of size zero

2021-02-15 Thread Bjorn Helgaas

On Fri, Feb 12, 2021 at 11:08:56AM +0100, Wasim Khan wrote:
> From: Wasim Khan 
> 
> Log a message if all BARs of type 0 devices are of
> size zero. This can help detecting type 0 devices
> not reporting BAR size correctly.

I could be missing something, but I don't think we can do this.  I
would think the simplest possible presilicon testing would find errors
like this, and the first attempt to have a driver claim the device
would fail if required BARs were missing, so I'm not sure what this
would add.

While the subject line says "type 0 devices," this code path is also
used for type 1 devices (bridges), and it's quite common for bridges
to have no BARs, which means they would all be hardwired to zero.

It is also legal for even type 0 devices to implement no BARs.  They
may be operated entirely via config space or via device-specific BARs
that are unknown to the PCI core.

> Signed-off-by: Wasim Khan 
> ---
>  drivers/pci/probe.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 953f15abc850..6438d6d56777 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -321,6 +321,7 @@ int __pci_read_base(struct pci_dev *dev, enum 
> pci_bar_type type,
>  static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int 
> rom)
>  {
>   unsigned int pos, reg;
> + bool found = false;
>  
>   if (dev->non_compliant_bars)
>   return;
> @@ -333,8 +334,12 @@ static void pci_read_bases(struct pci_dev *dev, unsigned 
> int howmany, int rom)
>   struct resource *res = &dev->resource[pos];
>   reg = PCI_BASE_ADDRESS_0 + (pos << 2);
>   pos += __pci_read_base(dev, pci_bar_unknown, res, reg);
> + found |= res->flags ? 1 : 0;
>   }
>  
> + if (!dev->hdr_type && !found)
> + pci_info(dev, "BAR size is 0 for BAR[0..%d]\n", howmany - 1);
> +
>   if (rom) {
>   struct resource *res = &dev->resource[PCI_ROM_RESOURCE];
>   dev->rom_base_reg = rom;
> -- 
> 2.25.1
>

Re: [PATCH] PCI: Run platform power transition on initial D0 entry

2021-02-10 Thread Bjorn Helgaas

[+cc Rafael, linux-pm]

On Thu, Feb 04, 2021 at 11:06:40PM +0100, Maximilian Luz wrote:
> On some devices and platforms, the initial platform power state is not
> in sync with the power state of the PCI device.
> 
> pci_enable_device_flags() updates the state of a PCI device by reading
> from the PCI_PM_CTRL register. This may change the stored power state of
> the device without running the appropriate platform power transition.

At this point in the code, setting dev->current_state based on the
value of PCI_PM_CTRL seems reasonable.  We're making the pci_dev state
match the PCI device hardware state.  This paragraph sort of implies
we're missing an "appropriate platform power transition" here, but I
don't think that's the case.

But it would be nice if we could combine this bit from
pci_enable_device_flags() with the pci_set_power_state() in
do_pci_enable_device().

> Due to the stored power-state being changed, the later call to
> pci_set_power_state(..., PCI_D0) in do_pci_enable_device() can evaluate
> to a no-op if the stored state has been changed to D0 via that. This
> will then prevent the appropriate platform power transition to be run,
> which can on some devices and platforms lead to platform and PCI power
> state being entirely different, i.e. out-of-sync. On ACPI platforms,
> this can lead to power resources not being turned on, even though they
> are marked as required for D0.
> 
> Specifically, on the Microsoft Surface Book 2 and 3, some ACPI power
> regions that should be "on" for the D0 state (and others) are
> initialized as "off" in ACPI, whereas the PCI device is in D0.

So some ACPI power regions are in fact "on" (because the PCI device
that requires them is in D0), but the ACPI core believes them to be
"off" (or probably "unknown, treated as 'off'")?

> As the
> state is updated in pci_enable_device_flags() without ensuring that the
> platform state is also updated, the power resource will never be
> properly turned on. Instead, it lives in a sort of on-but-marked-as-off
> zombie-state, which confuses things down the line when attempting to
> transition the device into D3cold: As the resource is already marked as
> off, it won't be turned off and the device does not fully enter D3cold,
> causing increased power consumption during (runtime-)suspend.
> 
> By replacing pci_set_power_state() in do_pci_enable_device() with
> pci_power_up(), we can force pci_platform_power_transition() to be
> called, which will then check if the platform power state needs updating
> and appropriate actions need to be taken.
> 
> Signed-off-by: Maximilian Luz 

I added Rafael & linux-pm because he should chime in here.
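
To make the sequence in the commit log above concrete, here is a toy
userspace model of it.  Everything in it (struct fake_pci_dev and the
fake_*() functions) is invented for illustration; it only mirrors the
behavior as the commit log describes it and says nothing about whether
that description is right.

```c
/* Toy model only: these types and functions are hypothetical stand-ins. */
enum fake_state { FAKE_UNKNOWN, FAKE_D0, FAKE_D3COLD };

struct fake_pci_dev {
	enum fake_state current_state;	/* what the core has stored */
	enum fake_state hw_state;	/* what PCI_PM_CTRL would report */
	int platform_transitions;	/* times the platform hook ran */
};

/* Models pci_enable_device_flags() syncing the stored state from hardware. */
static void fake_sync_state(struct fake_pci_dev *dev)
{
	dev->current_state = dev->hw_state;
}

/* Models the early return when the stored state already matches. */
static int fake_set_power_state(struct fake_pci_dev *dev, enum fake_state state)
{
	if (dev->current_state == state)
		return 0;	/* no-op: the platform transition is skipped */

	dev->platform_transitions++;
	dev->hw_state = state;
	dev->current_state = state;
	return 0;
}

/* Models the patch's pci_power_up() call: no "already there" check. */
static int fake_power_up(struct fake_pci_dev *dev)
{
	dev->platform_transitions++;
	dev->hw_state = FAKE_D0;
	dev->current_state = FAKE_D0;
	return 0;
}
```

In this model, calling fake_sync_state() on a device whose hardware is
already in D0 makes a following fake_set_power_state(dev, FAKE_D0) a
no-op, so platform_transitions stays at 0; fake_power_up() always bumps
it.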

> ---
> 
> I'm not entirely sure if this is the best way to do this, so I'm open to
> alternatives. In a previous version of this, I've tried to run the
> platform/ACPI transition directly after the pci_read_config_word() in
> pci_enable_device_flags(), however, that caused some regression in
> intel-lpss-pci, specifically that then had trouble accessing its config
> space for initial setup.
> 
> This version has been tested for a while now on [1/2] without any
> complaints. As this essentially only drops the initial are-we-already-
> in-that-state-check, I don't expect any issues to be caused by that.
> 
> [1]: https://github.com/linux-surface/linux-surface
> [2]: https://github.com/linux-surface/kernel
> 
> ---
>  drivers/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b9fecc25d213..eb778e80d8cf 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1802,7 +1802,7 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
>   u16 cmd;
>   u8 pin;
>  
> - err = pci_set_power_state(dev, PCI_D0);
> + err = pci_power_up(dev);
>   if (err < 0 && err != -EIO)
>   return err;
>  
> -- 
> 2.30.0
>

Re: [PATCH 2/2] PCI: Revoke mappings like devmem

2021-02-10 Thread Bjorn Helgaas

I see I already acked this, but if you haven't merged it yet there are
a few typos in the commit log:

On Thu, Feb 04, 2021 at 05:58:31PM +0100, Daniel Vetter wrote:
> Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> the region") /dev/kmem zaps ptes when the kernel requests exclusive
> acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> the default for all driver uses.

s/ptes/PTEs/

> Except there's two more ways to access PCI BARs: sysfs and proc mmap
> support. Let's plug that hole.

s/there's two/there are two/

> For revoke_devmem() to work we need to link our vma into the same
> address_space, with consistent vma->vm_pgoff. ->pgoff is already
> adjusted, because that's how (io_)remap_pfn_range works, but for the
> mapping we need to adjust vma->vm_file->f_mapping. The cleanest way is
> to adjust this at at ->open time:
> 
> - for sysfs this is easy, now that binary attributes support this. We
>   just set bin_attr->mapping when mmap is supported
> - for procfs it's a bit more tricky, since procfs pci access has only
>   one file per device, and access to a specific resources first needs
>   to be set up with some ioctl calls. But mmap is only supported for
>   the same resources as sysfs exposes with mmap support, and otherwise
>   rejected, so we can set the mapping unconditionally at open time
>   without harm.

s/pci access/PCI access/
s/a specific resources/a specific resource/

> A special consideration is for arch_can_pci_mmap_io() - we need to
> make sure that the ->f_mapping doesn't alias between ioport and iomem
> space. There's only 2 ways in-tree to support mmap of ioports: generic
> pci mmap (ARCH_GENERIC_PCI_MMAP_RESOURCE), and sparc as the single
> architecture hand-rolling. Both approach support ioport mmap through a
> special pfn range and not through magic pte attributes. Aliasing is
> therefore not a problem.

s/There's only 2/There are only two/
s/pci mmap/PCI mmap/
s/Both approach/Both approaches/
s/pfn/PFN/
s/pte/PTE/

> The only difference in access checks left is that sysfs PCI mmap does
> not check for CAP_RAWIO. I'm not really sure whether that should be
> added or not.
> 
> Acked-by: Bjorn Helgaas 
> Reviewed-by: Dan Williams 
> Signed-off-by: Daniel Vetter 
> Cc: Stephen Rothwell 
> Cc: Jason Gunthorpe 
> Cc: Kees Cook 
> Cc: Dan Williams 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: Greg Kroah-Hartman 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/pci/pci-sysfs.c | 4 
>  drivers/pci/proc.c  | 1 +
>  2 files changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 0c45b4f7b214..f8afd54ca3e1 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -942,6 +942,7 @@ void pci_create_legacy_files(struct pci_bus *b)
>   b->legacy_io->read = pci_read_legacy_io;
>   b->legacy_io->write = pci_write_legacy_io;
>   b->legacy_io->mmap = pci_mmap_legacy_io;
> + b->legacy_io->mapping = iomem_get_mapping();
>   pci_adjust_legacy_attr(b, pci_mmap_io);
>   error = device_create_bin_file(&b->dev, b->legacy_io);
>   if (error)
> @@ -954,6 +955,7 @@ void pci_create_legacy_files(struct pci_bus *b)
>   b->legacy_mem->size = 1024*1024;
>   b->legacy_mem->attr.mode = 0600;
>   b->legacy_mem->mmap = pci_mmap_legacy_mem;
> + b->legacy_io->mapping = iomem_get_mapping();
>   pci_adjust_legacy_attr(b, pci_mmap_mem);
>   error = device_create_bin_file(&b->dev, b->legacy_mem);
>   if (error)
> @@ -1169,6 +1171,8 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
>   res_attr->mmap = pci_mmap_resource_uc;
>   }
>   }
> + if (res_attr->mmap)
> + res_attr->mapping = iomem_get_mapping();
>   res_attr->attr.name = res_attr_name;
>   res_attr->attr.mode = 0600;
>   res_attr->size = pci_resource_len(pdev, num);
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3a2f90beb4cb..9bab07302bbf 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -298,6 +298,7 @@ static int proc_bus_pci_open(struct inode *inode, struct file *file)
>   fpriv->write_combine = 0;
>  
>   file->private_data = fpriv;
> + file->f_mapping = iomem_get_mapping();
>  
>   return 0;
>  }
> -- 
> 2.30.0
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Re: [PATCH] PCI: Also set up legacy files only after sysfs init

2021-02-10 Thread Bjorn Helgaas

On Fri, Feb 05, 2021 at 02:36:32PM +0100, Daniel Vetter wrote:
> We are already doing this for all the regular sysfs files on PCI
> devices, but not yet on the legacy io files on the PCI buses. Thus far
> no problem, but in the next patch I want to wire up iomem revoke
> support. That needs the vfs up and running already to make sure that
> iomem_get_mapping() works.
> 
> Wire it up exactly like the existing code in
> pci_create_sysfs_dev_files(). Note that pci_remove_legacy_files()
> doesn't need a check since the one for pci_bus->legacy_io is
> sufficient.
> 
> An alternative solution would be to implement a callback in sysfs to
> set up the address space from iomem_get_mapping() when userspace calls
> mmap(). This also works, but Greg didn't really like that just to work
> around an ordering issue when the kernel loads initially.
> 
> v2: Improve commit message (Bjorn)
> 
> Signed-off-by: Daniel Vetter 

Acked-by: Bjorn Helgaas 

I wish we weren't extending a known-racy mechanism to do this, but at
least we're not *adding* a brand new race.

> Cc: Stephen Rothwell 
> Cc: Jason Gunthorpe 
> Cc: Kees Cook 
> Cc: Dan Williams 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: Greg Kroah-Hartman 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/pci/pci-sysfs.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index fb072f4b3176..0c45b4f7b214 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
>  {
>   int error;
>  
> + if (!sysfs_initialized)
> + return;
> +
>   b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
>  GFP_ATOMIC);
>   if (!b->legacy_io)
> @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
>  static int __init pci_sysfs_init(void)
>  {
>   struct pci_dev *pdev = NULL;
> + struct pci_bus *pbus = NULL;
>   int retval;
>  
>   sysfs_initialized = 1;
> @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
>   }
>   }
>  
> + while ((pbus = pci_find_next_bus(pbus)))
> + pci_create_legacy_files(pbus);
> +
>   return 0;
>  }
>  late_initcall(pci_sysfs_init);
> -- 
> 2.30.0

Re: [PATCH] checkpatch: add warning for non-lore mailing list URLs

2021-02-10 Thread Bjorn Helgaas

On Wed, Feb 10, 2021 at 12:22:35AM -0800, Kees Cook wrote:
> On Thu, Dec 17, 2020 at 04:50:41PM -0800, Joe Perches wrote:
> > On Thu, 2020-12-17 at 17:56 -0600, Bjorn Helgaas wrote:
> > > From: Bjorn Helgaas 
> > > 
> > > The lkml.org, marc.info, spinics.net, etc archives are not quite as useful
> > > as lore.kernel.org because they use different styles, add advertising, and
> > > may disappear in the future.  The lore archives are more consistent and
> > > more likely to stick around, so prefer https://lore.kernel.org URLs when
> > > they exist.
> > 
> > Hi Bjorn.
> > 
> > I like the idea, thanks, but a couple notes.
> > 
> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > []
> > > @@ -564,6 +564,17 @@ sub find_standard_signature {
> > >   return "";
> > >  }
> >  
> > > +our $obsolete_archives = qr{(?xi:
> > > + freedesktop.org/archives/dri-devel|
> > > + lists.infradead.org|
> > > + lkml.org|
> > > + mail-archive.com|
> > > + mailman.alsa-project.org/pipermail|
> > > + marc.info|
> > > + ozlabs.org/pipermail|
> > > + spinics.net
> > > +)};
> > 
> > Strictly, these all need \Q \E escaping so uses like lkmlAorg do not match.
> > 
> > 
> > > @@ -3101,6 +3112,12 @@ sub process {
> > >   }
> > >   }
> > >  
> > > +# Check for mailing list archives other than lore.kernel.org
> > > + if ($line =~ /(http|https):\/\/\S*$obsolete_archives/) {
> > 
> > The https?:// doesn't seem necessary.  Perhaps:
> > 
> > if ($line =~ m{\b$obsolete_archives}) {
> > 
> > > + WARN("PREFER_LORE_ARCHIVE",
> > > +  "Use lore.kernel.org archive links when possible; see https://lore.kernel.org/lists.html\n" . $herecurr);
> > 
> > Perhaps:
> >  "Prefer lore.kernel.org links. see: https://www.kernel.org/lore.html#linking-to-list-discussions-from-commits\n" . $herecurr);
> > 
> > So maybe instead:
> > ---
> >  scripts/checkpatch.pl | 17 +
> >  1 file changed, 17 insertions(+)
> > 
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > index 00085308ed9d..c2a324d628a6 100755
> > --- a/scripts/checkpatch.pl
> > +++ b/scripts/checkpatch.pl
> > @@ -564,6 +564,17 @@ sub find_standard_signature {
> > return "";
> >  }
> >  
> > +our $obsolete_archives = qr{(?xi:
> > +   \Qfreedesktop.org/archives/dri-devel\E |
> > +   \Qlists.infradead.org\E |
> > +   \Qlkml.org\E |
> > +   \Qmail-archive.com\E |
> > +   \Qmailman.alsa-project.org/pipermail\E |
> > +   \Qmarc.info\E |
> > +   \Qozlabs.org/pipermail\E |
> > +   \Qspinics.net\E
> > +)};
> > +
> >  our @typeListMisordered = (
> > qr{char\s+(?:un)?signed},
> > qr{int\s+(?:(?:un)?signed\s+)?short\s},
> > @@ -3101,6 +3112,12 @@ sub process {
> > }
> > }
> >  
> > +   # Check for mailing list archives other than lore.kernel.org
> > +   if ($rawline =~ m{\b$obsolete_archives}) {
> > +   WARN("PREFER_LORE_ARCHIVE",
> > +"Use lore.kernel.org archive links when possible - 
> > see https://lore.kernel.org/lists.html\n; . $herecurr);
> > +   }
> > +
> >  # Check for added, moved or deleted files
> > if (!$reported_maintainer_file && !$in_commit_log &&
> > ($line =~ /^(?:new|deleted) file mode\s*\d+\s*$/ ||
> > 
> > 
> 
> Ah, nice. Yes, this would be great to get added. Joe, can you respin as
> a full path? Please consider it:

I hate to ask Joe to rework *my* patch just because I've dropped the
ball on it!  Sorry, I'll try to resurrect this.
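
On Joe's escaping point quoted above, a small userspace sketch using
POSIX regcomp()/regexec() shows the difference.  matches() is a helper
made up for this example, and since POSIX regexes have no \Q...\E, the
dot is escaped by hand here.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if pattern matches anywhere in text,
 * 0 if it does not, and -1 if the pattern fails to compile.
 */
static int matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

With the unescaped pattern "lkml.org" the string "lkmlAorg" matches
because "." accepts any character; with "lkml\\.org" (a literal dot) it
does not, while a real lkml.org URL still does.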

> Reviewed-by: Kees Cook

Re: [PATCH] PCI: Use subdir-ccflags-* to inherit debug flag

2021-02-09 Thread Bjorn Helgaas

[+cc Masahiro, Michal, linux-kbuild, linux-kernel]

On Thu, Feb 04, 2021 at 07:30:15PM +0800, Yicong Yang wrote:
> From: Junhao He 
> 
> Use subdir-ccflags-* instead of ccflags-* to inherit the debug
> settings from Kconfig when traversing subdirectories.
> 
> Signed-off-by: Junhao He 
> Signed-off-by: Yicong Yang 

I applied this with Krzysztof's reviewed-by and the commit log below
to pci/misc for v5.12, thanks!

Feel free to copy or improve the commit log for use elsewhere.

> ---
>  drivers/pci/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> index 11cc794..d62c4ac 100644
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -36,4 +36,4 @@ obj-$(CONFIG_PCI_ENDPOINT)  += endpoint/
>  obj-y+= controller/
>  obj-y+= switch/
>  
> -ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
> +subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG

commit e8e9aababe60 ("PCI: Apply CONFIG_PCI_DEBUG to entire drivers/pci hierarchy")
Author: Junhao He 
Date:   Thu Feb 4 19:30:15 2021 +0800

PCI: Apply CONFIG_PCI_DEBUG to entire drivers/pci hierarchy

CONFIG_PCI_DEBUG=y adds -DDEBUG to CFLAGS, which enables things like
pr_debug() and dev_dbg() (and hence pci_dbg()).  Previously we added
-DDEBUG for files in drivers/pci/, but not files in subdirectories of
drivers/pci/.

Add -DDEBUG to CFLAGS for all files below drivers/pci/ so CONFIG_PCI_DEBUG
applies to the entire hierarchy.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/1612438215-33105-1-git-send-email-yangyic...@hisilicon.com
Signed-off-by: Junhao He 
Signed-off-by: Yicong Yang 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Krzysztof Wilczyński 

diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 11cc79411e2d..d62c4ac4ae1b 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -36,4 +36,4 @@ obj-$(CONFIG_PCI_ENDPOINT)+= endpoint/
 obj-y  += controller/
 obj-y  += switch/
 
-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
+subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG

Re: [PATCHv2] PCI: Add Silicom Denmark vendor ID

2021-02-09 Thread Bjorn Helgaas

On Mon, Feb 08, 2021 at 04:01:57PM +0100, Martin Hundebøll wrote:
> Update pci_ids.h with the vendor ID for Silicom Denmark. The define is
> going to be referenced in driver(s) for FPGA accelerated smart NICs.
> 
> Signed-off-by: Martin Hundebøll 

Applied to pci/misc for v5.12 with reviewed-by from Krzysztof and Tom,
thanks!

> ---
> 
> Changes since v1:
>  * Align commit message/shortlog with similar changes to pci_ids.h
> 
>  include/linux/pci_ids.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index f968fcda338e..c119f0eb41b6 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2589,6 +2589,8 @@
>  
>  #define PCI_VENDOR_ID_REDHAT 0x1b36
>  
> +#define PCI_VENDOR_ID_SILICOM_DENMARK0x1c2c
> +
>  #define PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS  0x1c36
>  
>  #define PCI_VENDOR_ID_CIRCUITCO  0x1cc8
> -- 
> 2.29.2
>

Re: [RESEND v4 1/6] misc: Add Synopsys DesignWare xData IP driver

2021-02-09 Thread Bjorn Helgaas

On Tue, Feb 09, 2021 at 03:28:16PM +, Gustavo Pimentel wrote:
> On Mon, Feb 8, 2021 at 22:53:54, Krzysztof Wilczyński wrote:
> > [...]
> > > Thanks for your review. I will wait for a couple of days, before sending 
> > > a new version of this patch series based on your feedback.
> > 
> > Thank you!
> > 
> > There might be one more change, and improvement, to be done as per
> > Bjorn's feedback, see:
> > 
> >   https://lore.kernel.org/linux-pci/20210208193516.GA406304@bjorn-Precision-5520/
> > 
> > 
> > The code in question would be (excerpt from the patch):
> > 
> > [...]
> > +static int dw_xdata_pcie_probe(struct pci_dev *pdev,
> > +  const struct pci_device_id *pid)
> > +{
> > +   const struct dw_xdata_pcie_data *pdata = (void *)pid->driver_data;
> > +   struct dw_xdata *dw;
> > [...]
> > +   dw->rg_region.vaddr = pcim_iomap_table(pdev)[pdata->rg_bar];
> > +   if (!dw->rg_region.vaddr)
> > +   return -ENOMEM;
> > [...]
> > 
> > Perhaps something like the following would work?
> > 
> > void __iomem * const *iomap_table;
> > 
> > iomap_table = pcim_iomap_table(pdev);
> > if (!iomap_table)
> > return -ENOMEM;
> > 
> > dw->rg_region.vaddr = iomap_table[pdata->rg_bar];
> > if (!dw->rg_region.vaddr)
> > return -ENOMEM;
> > 
> > With sensible error messages added, of course.  What do you think?
> 
> I think all the improvements are welcome. I will do that.
> My only doubt is if Bjorn recommends removing the 
> iomap_table[pdata->rg_bar] check, after adding the verification on the 
> pcim_iomap_table, because all other drivers don't do that.

I misunderstood the usage of pcim_iomap_table() -- it looks like one
must call pcim_iomap_regions() *first*, and test its result, and
*that* is where we should catch any pcim_iomap_table() failures, e.g.,

  rc = pcim_iomap_regions()   # or pcim_iomap_regions_request_all()
  if (rc)
    return rc;                # pcim_iomap_table() or other failure

  vaddr = pcim_iomap_table()[BAR];
  if (!vaddr)
    return -ENOMEM;           # BAR doesn't exist

You *do* correctly call pcim_iomap_regions() first, which calls
pcim_iomap_table() internally, so if pcim_iomap_table() were to return
NULL, you should catch it there.

Then we assume that the subsequent "pcim_iomap_table()[BAR]" call will
succeed and NOT return NULL, so it should be safe to index into the
table.  And if the table[BAR] entry is NULL, it means the BAR doesn't
exist or isn't mapped.

That sort of makes sense, but the API design doesn't quite seem
obviously correct to me.  The table was created by
pcim_iomap_regions(), and pcim_iomap_table() is basically retrieving
that artifact.

I wonder if it could be improved by making pcim_iomap_table() strictly
internal to devres.c and having the pcim_iomap functions return the
table directly.  Then the code would look something like this:

  table = pcim_iomap_regions();
  if (IS_ERR(table))
    return PTR_ERR(table);    # pcim_iomap_table() or other failure

  vaddr = table[BAR];         # "table" is guaranteed to be non-NULL
  if (!vaddr)
    return -ENOMEM;

Obviously this is not something you should do for *this* series.
I think you should follow the example of other drivers, which means
keeping your patch exactly as you posted it.  I'm just interested in
opinions on this as a possible future API improvement.
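
For anyone who wants to poke at the idea, here is a userspace sketch of
what the "return the table directly" shape could look like.  It is only
a model: struct fake_dev, fake_iomap_regions(), fake_probe() and the
err_ptr()/is_err()/ptr_err() helpers are stand-ins invented here, not
the devres API or the kernel's ERR_PTR() macros.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BARS	6
#define MAX_ERRNO	4095

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
static inline void *err_ptr(long err)
{
	return (void *)err;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *p)
{
	return (long)p;
}

struct fake_dev {
	void *table[NUM_BARS];	/* NULL entry: BAR absent or not mapped */
	int fail;		/* set to simulate a mapping failure */
};

/* Hypothetical: returns the table itself, or an encoded errno. */
static void *const *fake_iomap_regions(struct fake_dev *dev)
{
	if (dev->fail)
		return err_ptr(-ENOMEM);
	return dev->table;
}

static int fake_probe(struct fake_dev *dev, int bar, void **vaddr)
{
	void *const *table = fake_iomap_regions(dev);

	if (is_err(table))
		return (int)ptr_err(table);

	*vaddr = table[bar];	/* table is known to be valid here */
	if (!*vaddr)
		return -ENOMEM;	/* BAR doesn't exist */
	return 0;
}
```

The point of the shape is that the caller never indexes a pointer it
hasn't tested: the only way to get the table is through the call whose
result it just checked.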

Bjorn

Re: [PATCH v4 15/15] dmaengine: dw-edma: Add pcim_iomap_table return checker

2021-02-08 Thread Bjorn Helgaas

[+cc Krzysztof]

From reading the subject, I thought you were adding a function to
check the return values, i.e., a "checker."  But you're really adding
"checks" :)

On Wed, Feb 03, 2021 at 10:58:06PM +0100, Gustavo Pimentel wrote:
> Detected by CoverityScan CID 16555 ("Dereference null return")
> 
> Signed-off-by: Gustavo Pimentel 
> ---
>  drivers/dma/dw-edma/dw-edma-pcie.c | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/dma/dw-edma/dw-edma-pcie.c b/drivers/dma/dw-edma/dw-edma-pcie.c
> index 686b4ff..7445033 100644
> --- a/drivers/dma/dw-edma/dw-edma-pcie.c
> +++ b/drivers/dma/dw-edma/dw-edma-pcie.c
> @@ -238,6 +238,9 @@ static int dw_edma_pcie_probe(struct pci_dev *pdev,
>   dw->rd_ch_cnt = vsec_data.rd_ch_cnt;
>  
>   dw->rg_region.vaddr = pcim_iomap_table(pdev)[vsec_data.rg.bar];
> + if (!dw->rg_region.vaddr)
> + return -ENOMEM;

This doesn't seem quite right.  If pcim_iomap_table() fails, it
returns NULL.  But then we assign "vaddr = NULL[vsec_data.rg.bar]"
which dereferences the NULL pointer even before your test.

This "pcim_iomap_table(dev)[n]" pattern is extremely common.  There
are over 100 calls of pcim_iomap_table(), and

  $ git grep "pcim_iomap_table(.*)\[.*\]" | wc -l

says about 75 of them are of this form, where we dereference the
result before testing it.

>   dw->rg_region.vaddr += vsec_data.rg.off;
>   dw->rg_region.paddr = pdev->resource[vsec_data.rg.bar].start;
>   dw->rg_region.paddr += vsec_data.rg.off;
> @@ -250,12 +253,18 @@ static int dw_edma_pcie_probe(struct pci_dev *pdev,
>   struct dw_edma_block *dt_block = &vsec_data.dt_wr[i];
>  
>   ll_region->vaddr = pcim_iomap_table(pdev)[ll_block->bar];
> + if (!ll_region->vaddr)
> + return -ENOMEM;
> +
>   ll_region->vaddr += ll_block->off;
>   ll_region->paddr = pdev->resource[ll_block->bar].start;
>   ll_region->paddr += ll_block->off;
>   ll_region->sz = ll_block->sz;
>  
>   dt_region->vaddr = pcim_iomap_table(pdev)[dt_block->bar];
> + if (!dt_region->vaddr)
> + return -ENOMEM;
> +
>   dt_region->vaddr += dt_block->off;
>   dt_region->paddr = pdev->resource[dt_block->bar].start;
>   dt_region->paddr += dt_block->off;
> @@ -269,12 +278,18 @@ static int dw_edma_pcie_probe(struct pci_dev *pdev,
>   struct dw_edma_block *dt_block = &vsec_data.dt_rd[i];
>  
>   ll_region->vaddr = pcim_iomap_table(pdev)[ll_block->bar];
> + if (!ll_region->vaddr)
> + return -ENOMEM;
> +
>   ll_region->vaddr += ll_block->off;
>   ll_region->paddr = pdev->resource[ll_block->bar].start;
>   ll_region->paddr += ll_block->off;
>   ll_region->sz = ll_block->sz;
>  
>   dt_region->vaddr = pcim_iomap_table(pdev)[dt_block->bar];
> + if (!dt_region->vaddr)
> + return -ENOMEM;
> +
>   dt_region->vaddr += dt_block->off;
>   dt_region->paddr = pdev->resource[dt_block->bar].start;
>   dt_region->paddr += dt_block->off;
> -- 
> 2.7.4
>

Re: [PATCH 2/4] hwmon: Use subdir-ccflags-* to inherit debug flag

2021-02-05 Thread Bjorn Helgaas

On Fri, Feb 05, 2021 at 10:28:32AM -0800, Guenter Roeck wrote:
> On Fri, Feb 05, 2021 at 05:44:13PM +0800, Yicong Yang wrote:
> > From: Junhao He 
> > 
> > Use subdir-ccflags-* instead of ccflags-* to inherit the debug
> > settings from Kconfig when traversing subdirectories.
> > 
> > Suggested-by: Bjorn Helgaas 
> > Signed-off-by: Junhao He 
> > Signed-off-by: Yicong Yang 
> 
> What problem does this fix ? Maybe I am missing it, but I don't see
> DEBUG being used in a subdirectory of drivers/hwmon.

It's my fault for raising this question [1].  Yicong fixed a real
problem in drivers/pci, where we are currently using

  ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG

so CONFIG_PCI_DEBUG=y turns on debug in drivers/pci, but not in the
subdirectories.  That's surprising to users.

So my question was whether we should default to using subdir-ccflags
for -DDEBUG in general, and only use ccflags when we have
subdirectories that have their own debug options, e.g.,

  drivers/i2c/Makefile:ccflags-$(CONFIG_I2C_DEBUG_CORE) := -DDEBUG
  drivers/i2c/algos/Makefile:ccflags-$(CONFIG_I2C_DEBUG_ALGO) := -DDEBUG
  drivers/i2c/busses/Makefile:ccflags-$(CONFIG_I2C_DEBUG_BUS) := -DDEBUG
  drivers/i2c/muxes/Makefile:ccflags-$(CONFIG_I2C_DEBUG_BUS) := -DDEBUG

I mentioned drivers/hwmon along with a few others that have
subdirectories, do not have per-subdirectory debug options, and use
ccflags.  I didn't try to determine whether those subdirectories
currently use -DDEBUG.

In the case of drivers/hwmon, several drivers do use pr_debug(),
and CONFIG_HWMON_DEBUG_CHIP=y turns those on.  But if somebody
were to add pr_debug() to drivers/hwmon/occ/common.c, for example,
CONFIG_HWMON_DEBUG_CHIP=y would *not* turn it on.  That sounds
surprising to me, but if that's what you intend, that's totally fine.
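
In case the mechanism isn't obvious, here is a simplified userspace
model of how -DDEBUG gates debug output at compile time.  my_pr_debug()
and fake_read_sensor() are made-up names; the kernel's pr_debug() is
more involved (dynamic debug, printk levels), so treat this as a sketch
only.

```c
#include <stdio.h>

/*
 * Simplified model: the debug macro expands to a real print only when
 * the file is compiled with -DDEBUG, and to nothing otherwise.
 */
#ifdef DEBUG
#define my_pr_debug(...) fprintf(stderr, __VA_ARGS__)
#else
#define my_pr_debug(...) do { } while (0)
#endif

/* Hypothetical driver function containing a debug message. */
static int fake_read_sensor(int channel)
{
	my_pr_debug("reading channel %d\n", channel);
	return channel * 10;
}
```

Built with plain "cc", the message is compiled out; built with
"cc -DDEBUG", it is printed.  A file in a subdirectory that never sees
-DDEBUG behaves like the first case regardless of the Kconfig option.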

[1] https://lore.kernel.org/r/20210204161048.GA68790@bjorn-Precision-5520

> > ---
> >  drivers/hwmon/Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> > index 09a86c5..1c0c089 100644
> > --- a/drivers/hwmon/Makefile
> > +++ b/drivers/hwmon/Makefile
> > @@ -201,5 +201,5 @@ obj-$(CONFIG_SENSORS_XGENE) += xgene-hwmon.o
> >  obj-$(CONFIG_SENSORS_OCC)  += occ/
> >  obj-$(CONFIG_PMBUS)+= pmbus/
> >  
> > -ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
> > +subdir-ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
> >  
> > -- 
> > 2.8.1
> >

Re: [PATCH 1/2] PCI/AER: Disable AER interrupt during suspend

2021-02-04 Thread Bjorn Helgaas

[+cc Alex]

On Thu, Jan 28, 2021 at 12:09:37PM +0800, Kai-Heng Feng wrote:
> On Thu, Jan 28, 2021 at 4:51 AM Bjorn Helgaas  wrote:
> > On Thu, Jan 28, 2021 at 01:31:00AM +0800, Kai-Heng Feng wrote:
> > > Commit 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in
> > > hint") enables ACS, and some platforms lose its NVMe after resume from
> > > firmware:
> > > [   50.947816] pcieport :00:1b.0: DPC: containment event, status:0x1f01 source:0x
> > > [   50.947817] pcieport :00:1b.0: DPC: unmasked uncorrectable error detected
> > > [   50.947829] pcieport :00:1b.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
> > > [   50.947830] pcieport :00:1b.0:   device [8086:06ac] error status/mask=0020/0001
> > > [   50.947831] pcieport :00:1b.0:[21] ACSViol(First)
> > > [   50.947841] pcieport :00:1b.0: AER: broadcast error_detected message
> > > [   50.947843] nvme nvme0: frozen state error detected, reset controller
> > >
> > > It happens right after ACS gets enabled during resume.
> > >
> > > To prevent that from happening, disable AER interrupt and enable it on
> > > system suspend and resume, respectively.
> >
> > Lots of questions here.  Maybe this is what we'll end up doing, but I
> > am curious about why the error is reported in the first place.
> >
> > Is this a consequence of the link going down and back up?
> 
> Could be. From the observations, it only happens when firmware suspend
> (S3) is used.
> Maybe it happens when it's gets powered up, but I don't have equipment
> to debug at hardware level.
> 
> If we use non-firmware suspend method, enabling ACS after resume won't
> trip AER and DPC.
> 
> > Is it consequence of the device doing a DMA when it shouldn't?
> 
> If it's doing DMA while suspending, the same error should also happen
> after NVMe is suspended and before PCIe port suspending.
> Furthermore, if non-firmware suspend method is used, there's so such
> issue, so less likely to be any DMA operation.
> 
> > Are we doing something in the wrong order during suspend?  Or maybe
> > resume, since I assume the error is reported during resume?
> 
> Yes the error is reported during resume. The suspend/resume order
> seems fine as non-firmware suspend doesn't have this issue.

I really feel like we need a better understanding of what's going on
here.  Disabling the AER interrupt is like closing our eyes and
pretending that because we don't see it, it didn't happen.

An ACS error is triggered by a DMA, right?  I'm assuming an MMIO
access from the CPU wouldn't trigger this error.  And it sounds like
the error is triggered before we even start running the driver after
resume.

If we're powering up an NVMe device from D3cold and it DMAs before the
driver touches it, something would be seriously broken.  I doubt
that's what's happening.  Maybe a device could resume some previously
programmed DMA after powering up from D3hot.

Or maybe the error occurred on suspend, like if the device wasn't
quiesced or something, but we didn't notice it until resume?  The 
AER error status bits are RW1CS, which means they can be preserved
across hot/warm/cold resets.
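
To spell out what that implies, here is a toy userspace model of a
write-1-to-clear sticky status register.  The struct and the rw1cs_*()
functions are illustrative names, not kernel APIs; bit 21 is taken from
the "[21] ACSViol" line in the log above.

```c
#include <stdint.h>

#define ACS_VIOLATION	(1u << 21)	/* "[21] ACSViol" in the log */

/* Toy model of an RW1CS status register. */
struct rw1cs_reg {
	uint32_t status;
};

/* Hardware side: an error sets its status bit. */
static void rw1cs_hw_log_error(struct rw1cs_reg *reg, uint32_t bits)
{
	reg->status |= bits;
}

/* Software side: writing 1 clears a bit, writing 0 leaves it alone. */
static void rw1cs_write(struct rw1cs_reg *reg, uint32_t value)
{
	reg->status &= ~value;
}

/* Sticky: in this model a reset does not touch the logged bits. */
static void rw1cs_reset(struct rw1cs_reg *reg)
{
	(void)reg;
}
```

In this model an error logged before suspend is still set after
rw1cs_reset(), and stays set until software explicitly writes a 1 to
that bit, which is why an old error could plausibly be noticed only at
resume.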

Can you instrument the code to see whether the AER error status bit is
set before enabling ACS?  I'm not sure that merely enabling ACS (I
assume you mean pci_std_enable_acs(), where we write PCI_ACS_CTRL)
should cause an interrupt for a previously-logged error.  I suspect
that could happen when enabling *AER*, but I wouldn't think it would
happen when enabling *ACS*.

Does this error happen on multiple machines from different vendors?
Wondering if it could be a BIOS issue, e.g., BIOS not cleaning up
after it did something to cause an error.

> > If we *do* take the error, why doesn't DPC recovery work?
> 
> It works for the root port, but not for the NVMe drive:
> [   50.947816] pcieport :00:1b.0: DPC: containment event, status:0x1f01 source:0x
> [   50.947817] pcieport :00:1b.0: DPC: unmasked uncorrectable error detected
> [   50.947829] pcieport :00:1b.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
> [   50.947830] pcieport :00:1b.0:   device [8086:06ac] error status/mask=0020/0001
> [   50.947831] pcieport :00:1b.0:[21] ACSViol(First)
> [   50.947841] pcieport :00:1b.0: AER: broadcast error_detected message
> [   50.947843] nvme nvme0: frozen state error detected, reset controller
> [   50.948400] ACPI: EC: event unblocked
> [   50.948432] xhci_

[GIT PULL] PCI fixes for v5.11

2021-02-04 Thread Bjorn Helgaas

The following changes since commit 7c53f6b671f4aba70ff15e1b05148b10d58c2837:

  Linux 5.11-rc3 (2021-01-10 14:34:50 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git pci-v5.11-fixes-2

for you to fetch changes up to 40fb68c7725aee024ed99ad38504f5d25820c6f0:

  Revert "PCI/ASPM: Save/restore L1SS Capability for suspend/resume" (2021-01-27 10:12:43 -0600)


PCI fixes:

  - Revert ASPM suspend/resume fix that regressed NVMe devices (Bjorn
    Helgaas)

--------
Bjorn Helgaas (1):
  Revert "PCI/ASPM: Save/restore L1SS Capability for suspend/resume"

 drivers/pci/pci.c   |  7 ---
 drivers/pci/pci.h   |  4 
 drivers/pci/pcie/aspm.c | 44 
 3 files changed, 55 deletions(-)

Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init

2021-02-04 Thread Bjorn Helgaas

[+cc Oliver, Pali, Krzysztof]

s/also/Also/ in subject

On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote:
> We are already doing this for all the regular sysfs files on PCI
> devices, but not yet on the legacy io files on the PCI buses. Thus far
> now problem, but in the next patch I want to wire up iomem revoke
> support. That needs the vfs up an running already to make so that
> iomem_get_mapping() works.

s/now problem/no problem/
s/an running/and running/
s/so that/sure that/ ?

iomem_get_mapping() doesn't exist; I don't know what that should be.

> Wire it up exactly like the existing code. Note that
> pci_remove_legacy_files() doesn't need a check since the one for
> pci_bus->legacy_io is sufficient.

I'm not sure exactly what you mean by "the existing code."  I could
probably figure it out, but it would save time to mention the existing
function here.

This looks like another instance where we should really apply Oliver's
idea of converting these to attribute_groups [1].

The cover letter mentions options discussed with Greg in [2], but I
don't think the "sysfs_initialized" hack vs attribute_groups was part
of that discussion.

It's not absolutely a show-stopper, but it *is* a shame to extend the
sysfs_initialized hack if attribute_groups could do this more cleanly
and help solve more than one issue.
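
As a rough illustration of why that would be cleaner, here is a userspace
model of the pattern.  All the model_* names are hypothetical stand-ins,
not kernel API; in the kernel this would be a static struct
attribute_group with .bin_attrs and an .is_bin_visible callback, which the
driver core instantiates itself when the device is registered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's sysfs structures. */
struct model_bus {
	int has_legacy_io;	/* bus supports legacy I/O space */
	int has_legacy_mem;	/* bus supports legacy memory space */
};

struct model_attr {
	const char *name;
};

struct model_group {
	const struct model_attr *attrs;
	size_t nattrs;
	/* Returns nonzero if attribute n should be exposed for this bus */
	int (*is_visible)(const struct model_bus *bus, size_t n);
};

static int legacy_is_visible(const struct model_bus *bus, size_t n)
{
	return n == 0 ? bus->has_legacy_io : bus->has_legacy_mem;
}

static const struct model_attr legacy_attrs[] = {
	{ "legacy_io" },
	{ "legacy_mem" },
};

/* Static description of the files; nothing is allocated at runtime */
static const struct model_group legacy_group = {
	.attrs = legacy_attrs,
	.nattrs = 2,
	.is_visible = legacy_is_visible,
};

/*
 * Model of the core registering a device: it walks the static group and
 * creates whatever is visible.  The bus code never tests an "is sysfs up
 * yet?" flag because the core only gets here once it is ready.
 */
static size_t model_register(const struct model_bus *bus,
			     const struct model_group *grp,
			     const char **created, size_t max)
{
	size_t i, n = 0;

	assert(grp->is_visible != NULL);
	for (i = 0; i < grp->nattrs && n < max; i++)
		if (grp->is_visible(bus, i))
			created[n++] = grp->attrs[i].name;
	return n;
}
```

With something like this, the ordering problem goes away: the files are
described statically and created by the core, so there is no need for a
late_initcall pass over the buses.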

Bjorn

[1] https://lore.kernel.org/r/caosf1chss03dbsdo4pmttmp0tceu5kscn704zewlkgxqzbf...@mail.gmail.com
[2] https://lore.kernel.org/dri-devel/cakmk7ugrddrbtj0oyzqqc0cgrqwc2f3tfju9vlfm2jjufaz...@mail.gmail.com/

> Signed-off-by: Daniel Vetter 
> Cc: Stephen Rothwell 
> Cc: Jason Gunthorpe 
> Cc: Kees Cook 
> Cc: Dan Williams 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: Greg Kroah-Hartman 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/pci/pci-sysfs.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index fb072f4b3176..0c45b4f7b214 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
>  {
>  	int error;
>  
> +	if (!sysfs_initialized)
> +		return;
> +
>  	b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
>  			       GFP_ATOMIC);
>  	if (!b->legacy_io)
> @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
>  static int __init pci_sysfs_init(void)
>  {
>  	struct pci_dev *pdev = NULL;
> +	struct pci_bus *pbus = NULL;
>  	int retval;
>  
>  	sysfs_initialized = 1;
> @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
>  		}
>  	}
>  
> +	while ((pbus = pci_find_next_bus(pbus)))
> +		pci_create_legacy_files(pbus);
> +
>  	return 0;
>  }
>  late_initcall(pci_sysfs_init);
> -- 
> 2.30.0
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
