Re: [PATCH v10 5/7] PCI: mediatek-gen3: Add MSI support
On Tuesday 20 April 2021 12:01:10 Marc Zyngier wrote: > On Tue, 20 Apr 2021 10:44:02 +0100, > Pali Rohár wrote: > > > > Hello! > > > > On Tuesday 20 April 2021 14:17:21 Jianjun Wang wrote: > > > +static void mtk_pcie_enable_msi(struct mtk_pcie_port *port) > > > +{ > > > + int i; > > > + u32 val; > > > + > > > + for (i = 0; i < PCIE_MSI_SET_NUM; i++) { > > > + struct mtk_msi_set *msi_set = >msi_sets[i]; > > > + > > > + msi_set->base = port->base + PCIE_MSI_SET_BASE_REG + > > > + i * PCIE_MSI_SET_OFFSET; > > > + msi_set->msg_addr = port->reg_base + PCIE_MSI_SET_BASE_REG + > > > + i * PCIE_MSI_SET_OFFSET; > > > + > > > + /* Configure the MSI capture address */ > > > + writel_relaxed(lower_32_bits(msi_set->msg_addr), msi_set->base); > > > + writel_relaxed(upper_32_bits(msi_set->msg_addr), > > > +port->base + PCIE_MSI_SET_ADDR_HI_BASE + > > > +i * PCIE_MSI_SET_ADDR_HI_OFFSET); > > > > This looks like as setting MSI doorbell address to MSI doorbell address. > > > > > +static void mtk_compose_msi_msg(struct irq_data *data, struct msi_msg > > > *msg) > > > +{ > > > + struct mtk_msi_set *msi_set = irq_data_get_irq_chip_data(data); > > > + struct mtk_pcie_port *port = data->domain->host_data; > > > + unsigned long hwirq; > > > + > > > + hwirq = data->hwirq % PCIE_MSI_IRQS_PER_SET; > > > + > > > + msg->address_hi = upper_32_bits(msi_set->msg_addr); > > > + msg->address_lo = lower_32_bits(msi_set->msg_addr); > > > + msg->data = hwirq; > > > + dev_dbg(port->dev, "msi#%#lx address_hi %#x address_lo %#x data %d\n", > > > + hwirq, msg->address_hi, msg->address_lo, msg->data); > > > > ... which is later used in compose_msi_msg(). > > > > Marc in some other patches for other pci controller drivers changed this > > address to just main "port" structure. It simplified implementations and > > also avoided need to declare additional member "msg_addr". > > > > Marc, would it be possible to simplify it also for this driver and just > > set msg_addr to virt_to_phys(port)? > > Maybe. It really depends on what range the HW accepts, and the sole > requirement is to use an address that the endpoint cannot DMA > to. Here, the driver seems to be using something based on the port > base address, which is good enough as far as I am concerned (the thing > I usually object to is the allocation of memory just for the sake of > getting a capture address). > > If you want to further simplify it, you could simply use port.reg_base > as the MSI address for all sets, as I don't think they have to be > distinct. But someone with access to the TRM for this should go and > check it. > > I don't believe this should gate the merging od this driver though. Of course! I'm just asking details to understand best practises and how it works. So thanks for information! > M. > > -- > Without deviation from the norm, progress is not possible.
Re: [PATCH v10 5/7] PCI: mediatek-gen3: Add MSI support
Hello! On Tuesday 20 April 2021 14:17:21 Jianjun Wang wrote: > +static void mtk_pcie_enable_msi(struct mtk_pcie_port *port) > +{ > + int i; > + u32 val; > + > + for (i = 0; i < PCIE_MSI_SET_NUM; i++) { > + struct mtk_msi_set *msi_set = >msi_sets[i]; > + > + msi_set->base = port->base + PCIE_MSI_SET_BASE_REG + > + i * PCIE_MSI_SET_OFFSET; > + msi_set->msg_addr = port->reg_base + PCIE_MSI_SET_BASE_REG + > + i * PCIE_MSI_SET_OFFSET; > + > + /* Configure the MSI capture address */ > + writel_relaxed(lower_32_bits(msi_set->msg_addr), msi_set->base); > + writel_relaxed(upper_32_bits(msi_set->msg_addr), > +port->base + PCIE_MSI_SET_ADDR_HI_BASE + > +i * PCIE_MSI_SET_ADDR_HI_OFFSET); This looks like as setting MSI doorbell address to MSI doorbell address. > +static void mtk_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) > +{ > + struct mtk_msi_set *msi_set = irq_data_get_irq_chip_data(data); > + struct mtk_pcie_port *port = data->domain->host_data; > + unsigned long hwirq; > + > + hwirq = data->hwirq % PCIE_MSI_IRQS_PER_SET; > + > + msg->address_hi = upper_32_bits(msi_set->msg_addr); > + msg->address_lo = lower_32_bits(msi_set->msg_addr); > + msg->data = hwirq; > + dev_dbg(port->dev, "msi#%#lx address_hi %#x address_lo %#x data %d\n", > + hwirq, msg->address_hi, msg->address_lo, msg->data); ... which is later used in compose_msi_msg(). Marc in some other patches for other pci controller drivers changed this address to just main "port" structure. It simplified implementations and also avoided need to declare additional member "msg_addr". Marc, would it be possible to simplify it also for this driver and just set msg_addr to virt_to_phys(port)?
Re: [PATCH] arm64: dts: marvell: armada-37xx: Set linux,pci-domain to zero
On Saturday 17 April 2021 17:19:38 Andrew Lunn wrote: > > Currently this code is implemented in pci_bus_find_domain_nr() function. > > IIRC domain number is 16bit integer, so plain bitmap would consume 8 kB > > of memory. I'm not sure if it is fine or some other tree-based structure > > for allocated domain numbers is needed. > > Hi Pali > > Have a look at lib/idr.c > > Andrew Great! So number allocation is already implemented in kernel (via radix trees).
Re: [PATCH] arm64: dts: marvell: armada-37xx: Set linux,pci-domain to zero
On Thursday 15 April 2021 10:13:17 Rob Herring wrote: > On Thu, Apr 15, 2021 at 3:45 AM Marek Behun wrote: > > > > On Thu, 15 Apr 2021 10:36:40 +0200 > > Pali Rohár wrote: > > > > > On Tuesday 13 April 2021 13:17:29 Rob Herring wrote: > > > > On Mon, Apr 12, 2021 at 7:41 AM Pali Rohár wrote: > > > > > > > > > > Since commit 526a76991b7b ("PCI: aardvark: Implement driver 'remove' > > > > > function and allow to build it as module") PCIe controller driver for > > > > > Armada 37xx can be dynamically loaded and unloaded at runtime. Also > > > > > driver > > > > > allows dynamic binding and unbinding of PCIe controller device. > > > > > > > > > > Kernel PCI subsystem assigns by default dynamically allocated PCI > > > > > domain > > > > > number (starting from zero) for this PCIe controller every time when > > > > > device > > > > > is bound. So PCI domain changes after every unbind / bind operation. > > > > > > > > PCI host bridges as a module are relatively new, so seems likely a bug > > > > to me. > > > > > > Why a bug? It is there since 5.10 and it is working. > > I mean historically, the PCI subsystem didn't even support host > bridges as a module. They weren't even proper drivers and it was all > arch specific code. Most of the host bridge drivers are still built-in > only. This seems like a small detail that was easily overlooked. > unbind is not a well tested path. Ok! Just to note that during my testing I have not spotted any issue. > > > > > Alternative way for assigning PCI domain number is to use static > > > > > allocated > > > > > numbers defined in Device Tree. This option has requirement that > > > > > every PCI > > > > > controller in system must have defined PCI bus number in Device Tree. > > > > > > > > That seems entirely pointless from a DT point of view with a single PCI > > > > bridge. > > > > > > If domain id is not specified in DT then kernel uses counter and assigns > > > counter++. So it is not pointless if we want to have stable domain id. > > > > What Rob is trying to say is that > > - the bug is that kernel assigns counter++ > > - device-tree should not be used to fix problems with how kernel does > > things > > - if a device has only one PCIe controller, it is pointless to define > > it's pci-domain. If there were multiple controllers, then it would > > make sense, but there is only one > > Yes. I think what we want here is a domain bitmap rather than a > counter and we assign the lowest free bit. That could also allow for > handling a mixture of fixed domain numbers and dynamically assigned > ones. Currently this code is implemented in pci_bus_find_domain_nr() function. IIRC domain number is 16bit integer, so plain bitmap would consume 8 kB of memory. I'm not sure if it is fine or some other tree-based structure for allocated domain numbers is needed. > You could create scenarios where the numbers change on you, but it > wouldn't be any different than say plugging in USB serial adapters. > You get the same ttyUSBx device when you re-attach unless there's been > other ttyUSBx devices attached/detached. This should be fine for most scenarios. Dynamically attaching / detaching PCI domain is not such common action... Will you implement this new feature?
Re: [”PATCH” v2 2/5] PCI: armada8k: Add link-down handle
On Wednesday 14 April 2021 16:20:51 bpe...@marvell.com wrote: > From: Ben Peled > > In PCIE ISR routine caused by RST_LINK_DOWN > we schedule work to handle the link-down procedure. > Link-down procedure will: > 1. Remove PCIe bus > 2. Reset the MAC > 3. Reconfigure link back up > 4. Rescan PCIe bus Hello! This looks like a part of PCIe Hot Plug procedure, which is already handled by kernel pci hotplug code, it can detect link down interrupt and remove PCI device from the bus. I'm not sure if it would not be better to use existing "infrastructure" for hotplug rather then implementing new in controller driver. But I do not know what is "correct" way, so I hope that other people (maybe Bjorn?) would say something about this topic. > Signed-off-by: Ben Peled > --- > drivers/pci/controller/dwc/pcie-armada8k.c | 69 > 1 file changed, 69 insertions(+) > > diff --git a/drivers/pci/controller/dwc/pcie-armada8k.c > b/drivers/pci/controller/dwc/pcie-armada8k.c > index b2278b1..34b253c 100644 > --- a/drivers/pci/controller/dwc/pcie-armada8k.c > +++ b/drivers/pci/controller/dwc/pcie-armada8k.c > @@ -22,6 +22,8 @@ > #include > #include > #include > +#include > +#include > > #include "pcie-designware.h" > > @@ -33,6 +35,9 @@ struct armada8k_pcie { > struct clk *clk_reg; > struct phy *phy[ARMADA8K_PCIE_MAX_LANES]; > unsigned int phy_count; > + struct regmap *sysctrl_base; > + u32 mac_rest_bitmask; > + struct work_struct recover_link_work; > }; > > #define PCIE_VENDOR_REGS_OFFSET 0x8000 > @@ -73,6 +78,8 @@ struct armada8k_pcie { > #define AX_USER_DOMAIN_MASK 0x3 > #define AX_USER_DOMAIN_SHIFT 4 > > +#define UNIT_SOFT_RESET_CONFIG_REG 0x268 > + > #define to_armada8k_pcie(x) dev_get_drvdata((x)->dev) > > static void armada8k_pcie_disable_phys(struct armada8k_pcie *pcie) > @@ -225,6 +232,50 @@ static int armada8k_pcie_host_init(struct pcie_port *pp) > return 0; > } > > +static void armada8k_pcie_recover_link(struct work_struct *ws) > +{ > + struct armada8k_pcie *pcie = container_of(ws, struct armada8k_pcie, > recover_link_work); > + struct pcie_port *pp = >pci->pp; > + struct pci_bus *bus = pp->bridge->bus; > + struct pci_dev *root_port; > + int ret; > + > + root_port = pci_get_slot(bus, 0); > + if (!root_port) { > + dev_err(pcie->pci->dev, "failed to get root port\n"); > + return; > + } > + pci_lock_rescan_remove(); > + pci_stop_and_remove_bus_device(root_port); > + /* > + * Sleep needed to make sure all pcie transactions and access > + * are flushed before resetting the mac > + */ > + msleep(100); > + > + /* Reset mac */ > + regmap_update_bits_base(pcie->sysctrl_base, UNIT_SOFT_RESET_CONFIG_REG, > + pcie->mac_rest_bitmask, 0, NULL, false, true); > + udelay(1); > + regmap_update_bits_base(pcie->sysctrl_base, UNIT_SOFT_RESET_CONFIG_REG, > + pcie->mac_rest_bitmask, pcie->mac_rest_bitmask, > + NULL, false, true); > + udelay(1); > + ret = armada8k_pcie_host_init(pp); > + if (ret) { > + dev_err(pcie->pci->dev, "failed to initialize host: %d\n", ret); > + pci_unlock_rescan_remove(); > + pci_dev_put(root_port); > + return; > + } > + > + bus = NULL; > + while ((bus = pci_find_next_bus(bus)) != NULL) > + pci_rescan_bus(bus); > + pci_unlock_rescan_remove(); > + pci_dev_put(root_port); > +} > + > static irqreturn_t armada8k_pcie_irq_handler(int irq, void *arg) > { > struct armada8k_pcie *pcie = arg; > @@ -262,6 +313,9 @@ static irqreturn_t armada8k_pcie_irq_handler(int irq, > void *arg) >* initiate a link retrain. If link retrains were >* possible, that is. >*/ > + if (pcie->sysctrl_base && pcie->mac_rest_bitmask) > + schedule_work(>recover_link_work); > + > dev_dbg(pci->dev, "%s: link went down\n", __func__); > } > > @@ -330,6 +384,8 @@ static int armada8k_pcie_probe(struct platform_device > *pdev) > > pcie->pci = pci; > > + INIT_WORK(>recover_link_work, armada8k_pcie_recover_link); > + > pcie->clk = devm_clk_get(dev, NULL); > if (IS_ERR(pcie->clk)) > return PTR_ERR(pcie->clk); > @@ -357,6 +413,19 @@ static int armada8k_pcie_probe(struct platform_device > *pdev) > goto fail_clkreg; > } > > + pcie->sysctrl_base = syscon_regmap_lookup_by_phandle(pdev->dev.of_node, > + > "marvell,system-controller"); > + if (IS_ERR(pcie->sysctrl_base)) { > + dev_warn(dev, "failed to find marvell,system-controller\n"); > + pcie->sysctrl_base = 0x0; > + } > + > + ret =
Re: QCA6174 pcie wifi: Add pci quirks
Hello! On Thursday 15 April 2021 13:01:19 Alex Williamson wrote: > [cc +Pali] > > On Thu, 15 Apr 2021 20:02:23 +0200 > Ingmar Klein wrote: > > > First thanks to you both, Alex and Bjorn! > > I am in no way an expert on this topic, so I have to fully rely on your > > feedback, concerning this issue. > > > > If you should have any other solution approach, in form of patch-set, I > > would be glad to test it out. Just let me know, what you think might > > make sense. > > I will wait for your further feedback on the issue. In the meantime I > > have my current workaround via quirk entry. > > > > By the way, my layman's question: > > Do you think, that the following topic might also apply for the QCA6174? > > https://www.spinics.net/lists/linux-pci/msg106395.html I have been testing more ath cards and I'm going to send a new version of this patch with including more PCI ids. > > Or in other words, should a similar approach be tried for the QCA6174 > > and if yes, would it bring any benefit at all? > > I hope you can excuse me, in case the questions should not make too much > > sense. > > If you run lspci -vvv on your device, what do LnkCap and LnkSta report > under the express capability? I wonder if your device even supports > >Gen1 speeds, mine does not. > > I would not expect that patch to be relevant to you based on your > report. I understand it to resolve an issue during link retraining to a > higher speed on boot, not during a bus reset. Pali can correct if I'm > wrong. Thanks, These two issues are are related. Both operations (PCIe Hot Reset and PCIe Link Retraining) cause reset of ath chips. Seems that they cause double reset. After reset these chips reads configuration from internal EEPROM/OTP and if another reset is triggered prior chip finishes internal configuration read then it stops working. My testing showed that ath10k chips completely disappear from the PCIe bus, some ath9k chips works fine but starts reporting incorrect PCI ID (0xABCD) and some other ath9k chips reports correct PCI ID but does not work. I had discussion with Adrian Chadd who knows probably everything about ath9k and confirmed me that this issue is there with ath9k and ath10k chips. He wrote me that workaround to turn card back from this "broken" state is to do PCIe Cold Reset of the card, which means turning power supply off for particular PCIe slot. Such thing is not supported on many low-end boards, so workaround cannot be applied. I was able to recover my testing cards from this "broken" state by PCIe Warm Reset (= reset via PERST# pin). I have tried many other reset methods (PCIe PM reset, Link Down, PCIe Hot Reset with bigger internal, ...) but nothing worked. So seems that the only workaround is to do PCIe Cold Reset or PCIe Warm Reset. I will send V2 of my patch with details and explanation. As kernel does not have API for doing PCIe Warm Reset, I think is another argument why kernel really needs it. I do not have any QCA6174 card for testing, but based on the fact I reproduced this issue with more ath9k and ath10 cards and Adrian confirmed that above reset issue is there, I think that it affects all AR9xxx and QCA cards handled by ath9k and ath10 drivers. I was told that AMI BIOS was patching their BIOSes found in notebooks to avoid triggering this issue on notebooks ath9k cards. > Alex > > > Am 15.04.2021 um 04:36 schrieb Alex Williamson: > > > On Wed, 14 Apr 2021 16:03:50 -0500 > > > Bjorn Helgaas wrote: > > > > > >> [+cc Alex] > > >> > > >> On Fri, Apr 09, 2021 at 11:26:33AM +0200, Ingmar Klein wrote: > > >>> Edit: Retry, as I did not consider, that my mail-client would make this > > >>> party html. > > >>> > > >>> Dear maintainers, > > >>> I recently encountered an issue on my Proxmox server system, that > > >>> includes a Qualcomm QCA6174 m.2 PCIe wifi module. > > >>> https://deviwiki.com/wiki/AIRETOS_AFX-QCA6174-NX > > >>> > > >>> On system boot and subsequent virtual machine start (with passed-through > > >>> QCA6174), the VM would just freeze/hang, at the point where the ath10k > > >>> driver loads. > > >>> Quick search in the proxmox related topics, brought me to the following > > >>> discussion, which suggested a PCI quirk entry for the QCA6174 in the > > >>> kernel: > > >>> https://forum.proxmox.com/threads/pcie-passthrough-freezes-proxmox.27513/ > > >>> > > >>> I then went ahead, got the Proxmox kernel source (v5.4.106) and applied > > >>> the attached patch. > > >>> Effect was as hoped, that the VM hangs are now gone. System boots and > > >>> runs as intended. > > >>> > > >>> Judging by the existing quirk entries for Atheros, I would think, that > > >>> my proposed "fix" could be included in the vanilla kernel. > > >>> As far as I saw, there is no entry yet, even in the latest kernel > > >>> sources. > > >> This would need a signed-off-by; see > > >>
Re: [PATCH] arm64: dts: marvell: armada-37xx: Set linux,pci-domain to zero
On Tuesday 13 April 2021 13:17:29 Rob Herring wrote: > On Mon, Apr 12, 2021 at 7:41 AM Pali Rohár wrote: > > > > Since commit 526a76991b7b ("PCI: aardvark: Implement driver 'remove' > > function and allow to build it as module") PCIe controller driver for > > Armada 37xx can be dynamically loaded and unloaded at runtime. Also driver > > allows dynamic binding and unbinding of PCIe controller device. > > > > Kernel PCI subsystem assigns by default dynamically allocated PCI domain > > number (starting from zero) for this PCIe controller every time when device > > is bound. So PCI domain changes after every unbind / bind operation. > > PCI host bridges as a module are relatively new, so seems likely a bug to me. Why a bug? It is there since 5.10 and it is working. > > Alternative way for assigning PCI domain number is to use static allocated > > numbers defined in Device Tree. This option has requirement that every PCI > > controller in system must have defined PCI bus number in Device Tree. > > That seems entirely pointless from a DT point of view with a single PCI > bridge. If domain id is not specified in DT then kernel uses counter and assigns counter++. So it is not pointless if we want to have stable domain id. > > Armada 37xx has only one PCIe controller, so assign for it PCI domain 0 in > > Device Tree. > > > > After this change PCI domain on Armada 37xx is always zero, even after > > repeated unbind and bind operations. > > > > Signed-off-by: Pali Rohár > > Fixes: 526a76991b7b ("PCI: aardvark: Implement driver 'remove' function and > > allow to build it as module") > > --- > > arch/arm64/boot/dts/marvell/armada-37xx.dtsi | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi > > b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi > > index 7a2df148c6a3..f02058ef5364 100644 > > --- a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi > > +++ b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi > > @@ -495,6 +495,7 @@ > > <0 0 0 2 _intc 1>, > > <0 0 0 3 _intc 2>, > > <0 0 0 4 _intc 3>; > > + linux,pci-domain = <0>; > > max-link-speed = <2>; > > phys = < 0>; > > pcie_intc: interrupt-controller { > > -- > > 2.20.1 > >
Re: [PATCH] PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
On Wednesday 14 April 2021 15:23:09 Bjorn Helgaas wrote: > On Tue, Apr 13, 2021 at 10:39:16AM +0200, Pali Rohár wrote: > > On Monday 12 April 2021 14:27:40 Bjorn Helgaas wrote: > > > On Mon, Apr 12, 2021 at 02:46:02PM +0200, Pali Rohár wrote: > > > > Define new PCI_EXP_DEVCTL_PAYLOAD_* macros in linux/pci_regs.h header > > > > file > > > > for Max Payload Size. Macros are defined in the same style as existing > > > > macros PCI_EXP_DEVCTL_READRQ_* macros. > > > > > > > > Signed-off-by: Pali Rohár > > > > --- > > > > include/uapi/linux/pci_regs.h | 6 ++ > > > > 1 file changed, 6 insertions(+) > > > > > > > > diff --git a/include/uapi/linux/pci_regs.h > > > > b/include/uapi/linux/pci_regs.h > > > > index e709ae8235e7..8f1b15eea53e 100644 > > > > --- a/include/uapi/linux/pci_regs.h > > > > +++ b/include/uapi/linux/pci_regs.h > > > > @@ -504,6 +504,12 @@ > > > > #define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request > > > > Reporting En. */ > > > > #define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */ > > > > #define PCI_EXP_DEVCTL_PAYLOAD0x00e0 /* Max_Payload_Size */ > > > > +#define PCI_EXP_DEVCTL_PAYLOAD_128B 0x /* 128 Bytes */ > > > > +#define PCI_EXP_DEVCTL_PAYLOAD_256B 0x0020 /* 256 Bytes */ > > > > +#define PCI_EXP_DEVCTL_PAYLOAD_512B 0x0040 /* 512 Bytes */ > > > > +#define PCI_EXP_DEVCTL_PAYLOAD_1024B 0x0060 /* 1024 Bytes */ > > > > +#define PCI_EXP_DEVCTL_PAYLOAD_2048B 0x0080 /* 2048 Bytes */ > > > > +#define PCI_EXP_DEVCTL_PAYLOAD_4096B 0x00A0 /* 4096 Bytes */ > > > > > > This is fine if we're going to use them, but we generally don't add > > > definitions purely for documentation. > > > > > > 5929b8a38ce0 ("PCI: Add defines for PCIe Max_Read_Request_Size") added > > > the PCI_EXP_DEVCTL_READRQ_* definitions and we do have a few (very > > > few) uses in drivers. > > > > I'm planning to use this constant to fix pci-aardvark.c driver. Aardvark > > changes are not ready yet, but I'm preparing them in my git tree > > https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=pci-aardvark > > (commit PCI: aardvark: Fix PCIe Max Payload Size setting) > > > > But as this is not change in aardvark driver, I sent it separately and > > earlier. As it would be dependency for aardvark changes. > > OK, just include this in that series. Ok, fine.
Re: [PATCH] PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
On Monday 12 April 2021 14:27:40 Bjorn Helgaas wrote: > On Mon, Apr 12, 2021 at 02:46:02PM +0200, Pali Rohár wrote: > > Define new PCI_EXP_DEVCTL_PAYLOAD_* macros in linux/pci_regs.h header file > > for Max Payload Size. Macros are defined in the same style as existing > > macros PCI_EXP_DEVCTL_READRQ_* macros. > > > > Signed-off-by: Pali Rohár > > --- > > include/uapi/linux/pci_regs.h | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h > > index e709ae8235e7..8f1b15eea53e 100644 > > --- a/include/uapi/linux/pci_regs.h > > +++ b/include/uapi/linux/pci_regs.h > > @@ -504,6 +504,12 @@ > > #define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request > > Reporting En. */ > > #define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */ > > #define PCI_EXP_DEVCTL_PAYLOAD0x00e0 /* Max_Payload_Size */ > > +#define PCI_EXP_DEVCTL_PAYLOAD_128B 0x /* 128 Bytes */ > > +#define PCI_EXP_DEVCTL_PAYLOAD_256B 0x0020 /* 256 Bytes */ > > +#define PCI_EXP_DEVCTL_PAYLOAD_512B 0x0040 /* 512 Bytes */ > > +#define PCI_EXP_DEVCTL_PAYLOAD_1024B 0x0060 /* 1024 Bytes */ > > +#define PCI_EXP_DEVCTL_PAYLOAD_2048B 0x0080 /* 2048 Bytes */ > > +#define PCI_EXP_DEVCTL_PAYLOAD_4096B 0x00A0 /* 4096 Bytes */ > > This is fine if we're going to use them, but we generally don't add > definitions purely for documentation. > > 5929b8a38ce0 ("PCI: Add defines for PCIe Max_Read_Request_Size") added > the PCI_EXP_DEVCTL_READRQ_* definitions and we do have a few (very > few) uses in drivers. I'm planning to use this constant to fix pci-aardvark.c driver. Aardvark changes are not ready yet, but I'm preparing them in my git tree https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=pci-aardvark (commit PCI: aardvark: Fix PCIe Max Payload Size setting) But as this is not change in aardvark driver, I sent it separately and earlier. As it would be dependency for aardvark changes. > If we do need to add these, please follow the local use of lower-case > in the hex bitmasks. The file is a mixture, but the closest examples > are lower-case. Ok, therefore I will change 0x00A0 to 0x00a0. > > #define PCI_EXP_DEVCTL_EXT_TAG0x0100 /* Extended Tag Field Enable */ > > #define PCI_EXP_DEVCTL_PHANTOM0x0200 /* Phantom Functions Enable */ > > #define PCI_EXP_DEVCTL_AUX_PME0x0400 /* Auxiliary Power PM Enable */ > > -- > > 2.20.1 > >
[PATCH v2] net: phy: marvell: fix detection of PHY on Topaz switches
Since commit fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading"), Linux reports the temperature of Topaz hwmon as constant -75°C. This is because switches from the Topaz family (88E6141 / 88E6341) have the address of the temperature sensor register different from Peridot. This address is instead compatible with 88E1510 PHYs, as was used for Topaz before the above mentioned commit. Create a new mapping table between switch family and PHY ID for families which don't have a model number. And define PHY IDs for Topaz and Peridot families. Create a new PHY ID and a new PHY driver for Topaz's internal PHY. The only difference from Peridot's PHY driver is the HWMON probing method. Prior this change Topaz's internal PHY is detected by kernel as: PHY [...] driver [Marvell 88E6390] (irq=63) And afterwards as: PHY [...] driver [Marvell 88E6341 Family] (irq=63) Signed-off-by: Pali Rohár BugLink: https://github.com/globalscaletechnologies/linux/issues/1 Fixes: fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading") Reviewed-by: Marek Behún --- Changes since v1: * create direct mapping table between family and prod_id which avoid need to use of forward declaration and simplify implementation * fill mapping table only for Topaz and Peridot families --- drivers/net/dsa/mv88e6xxx/chip.c | 30 +- drivers/net/phy/marvell.c| 32 +--- include/linux/marvell_phy.h | 5 +++-- 3 files changed, 45 insertions(+), 22 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 903d619e08ed..e08bf9377140 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -3026,10 +3026,17 @@ static int mv88e6xxx_setup(struct dsa_switch *ds) return err; } +/* prod_id for switch families which do not have a PHY model number */ +static const u16 family_prod_id_table[] = { + [MV88E6XXX_FAMILY_6341] = MV88E6XXX_PORT_SWITCH_ID_PROD_6341, + [MV88E6XXX_FAMILY_6390] = MV88E6XXX_PORT_SWITCH_ID_PROD_6390, +}; + static int mv88e6xxx_mdio_read(struct mii_bus *bus, int phy, int reg) { struct mv88e6xxx_mdio_bus *mdio_bus = bus->priv; struct mv88e6xxx_chip *chip = mdio_bus->chip; + u16 prod_id; u16 val; int err; @@ -3040,23 +3047,12 @@ static int mv88e6xxx_mdio_read(struct mii_bus *bus, int phy, int reg) err = chip->info->ops->phy_read(chip, bus, phy, reg, ); mv88e6xxx_reg_unlock(chip); - if (reg == MII_PHYSID2) { - /* Some internal PHYs don't have a model number. */ - if (chip->info->family != MV88E6XXX_FAMILY_6165) - /* Then there is the 6165 family. It gets is -* PHYs correct. But it can also have two -* SERDES interfaces in the PHY address -* space. And these don't have a model -* number. But they are not PHYs, so we don't -* want to give them something a PHY driver -* will recognise. -* -* Use the mv88e6390 family model number -* instead, for anything which really could be -* a PHY, -*/ - if (!(val & 0x3f0)) - val |= MV88E6XXX_PORT_SWITCH_ID_PROD_6390 >> 4; + /* Some internal PHYs don't have a model number. */ + if (reg == MII_PHYSID2 && !(val & 0x3f0) && + chip->info->family < ARRAY_SIZE(family_prod_id_table)) { + prod_id = family_prod_id_table[chip->info->family]; + if (prod_id) + val |= prod_id >> 4; } return err ? err : val; diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c index e26a5d663f8a..8018ddf7f316 100644 --- a/drivers/net/phy/marvell.c +++ b/drivers/net/phy/marvell.c @@ -3021,9 +3021,34 @@ static struct phy_driver marvell_drivers[] = { .get_stats = marvell_get_stats, }, { - .phy_id = MARVELL_PHY_ID_88E6390, + .phy_id = MARVELL_PHY_ID_88E6341_FAMILY, .phy_id_mask = MARVELL_PHY_ID_MASK, - .name = "Marvell 88E6390", + .name = "Marvell 88E6341 Family", + /* PHY_GBIT_FEATURES */ + .flags = PHY_POLL_CABLE_TEST, + .probe = m88e1510_probe, + .config_init = marvell_config_init, + .config_aneg = m88e6390_config_aneg, + .read_status = marvell_read_status, + .config_intr = marvell_config_intr, + .handle_interrupt = marvell_handle_interrupt, + .resume = genphy_resume, +
Re: [PATCH] net: phy: marvell: fix detection of PHY on Topaz switches
On Monday 12 April 2021 18:12:35 Andrew Lunn wrote: > On Mon, Apr 12, 2021 at 05:52:39PM +0200, Pali Rohár wrote: > > On Monday 12 April 2021 17:32:33 Andrew Lunn wrote: > > > > Anyway, now I'm looking at phy/marvell.c driver again and it supports > > > > only 88E6341 and 88E6390 families from whole 88E63xxx range. > > > > > > > > So do we need to define for now table for more than > > > > MV88E6XXX_FAMILY_6341 and MV88E6XXX_FAMILY_6390 entries? > > > > > > Probably not. I've no idea if the 6393 has an ID, so to be safe you > > > should add that. Assuming it has a family of its own. > > > > So what about just? > > > > if (reg == MII_PHYSID2 && !(val & 0x3f0)) { > > if (chip->info->family == MV88E6XXX_FAMILY_6341) > > val |= MV88E6XXX_PORT_SWITCH_ID_PROD_6341 >> 4; > > else if (chip->info->family == MV88E6XXX_FAMILY_6390) > > val |= MV88E6XXX_PORT_SWITCH_ID_PROD_6390 >> 4; > > } > > As i said, i expect the 6393 also has no ID. And i recently found out > Marvell have some automotive switches, 88Q5xxx which are actually > based around the same IP and could be added to this driver. They also > might not have an ID. I suspect this list is going to get longer, so > having it table driven will make that simpler, less error prone. > > Andrew Ok, I will use table but I fill it only with Topaz (6341) and Peridot (6390) which was there before as I do not have 6393 switch for testing. If you or anybody else has 6393 unit for testing, please extend then table.
Re: [PATCH] net: phy: marvell: fix detection of PHY on Topaz switches
On Monday 12 April 2021 17:32:33 Andrew Lunn wrote: > > Anyway, now I'm looking at phy/marvell.c driver again and it supports > > only 88E6341 and 88E6390 families from whole 88E63xxx range. > > > > So do we need to define for now table for more than > > MV88E6XXX_FAMILY_6341 and MV88E6XXX_FAMILY_6390 entries? > > Probably not. I've no idea if the 6393 has an ID, so to be safe you > should add that. Assuming it has a family of its own. So what about just? if (reg == MII_PHYSID2 && !(val & 0x3f0)) { if (chip->info->family == MV88E6XXX_FAMILY_6341) val |= MV88E6XXX_PORT_SWITCH_ID_PROD_6341 >> 4; else if (chip->info->family == MV88E6XXX_FAMILY_6390) val |= MV88E6XXX_PORT_SWITCH_ID_PROD_6390 >> 4; }
Re: [PATCH] net: phy: marvell: fix detection of PHY on Topaz switches
On Monday 12 April 2021 16:44:11 Andrew Lunn wrote: > On Mon, Apr 12, 2021 at 03:34:47PM +0200, Pali Rohár wrote: > > On Monday 12 April 2021 15:15:07 Andrew Lunn wrote: > > > > +static u16 mv88e6xxx_physid_for_family(enum mv88e6xxx_family family); > > > > + > > > > > > No forward declaration please. Move the code around. It is often best > > > to do that in a patch which just moves code, no other changes. It > > > makes it easier to review. > > > > Avoiding forward declaration would mean to move about half of source > > code. mv88e6xxx_physid_for_family depends on mv88e6xxx_table which > > depends on all _ops structures which depends on all lot of other > > functions. > > So this is basically what you are trying to do: > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c > b/drivers/net/dsa/mv88e6xxx/chip.c > index 903d619e08ed..ef4dbcb052b7 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -3026,6 +3026,18 @@ static int mv88e6xxx_setup(struct dsa_switch *ds) > return err; > } > > +static const enum mv88e6xxx_model family_model_table[] = { > + [MV88E6XXX_FAMILY_6095] = MV88E6XXX_PORT_SWITCH_ID_PROD_6095, > + [MV88E6XXX_FAMILY_6097] = MV88E6XXX_PORT_SWITCH_ID_PROD_6097, > + [MV88E6XXX_FAMILY_6185] = MV88E6XXX_PORT_SWITCH_ID_PROD_6185, > + [MV88E6XXX_FAMILY_6250] = MV88E6XXX_PORT_SWITCH_ID_PROD_6250, > + [MV88E6XXX_FAMILY_6320] = MV88E6XXX_PORT_SWITCH_ID_PROD_6320, > + [MV88E6XXX_FAMILY_6341] = MV88E6XXX_PORT_SWITCH_ID_PROD_6341, > + [MV88E6XXX_FAMILY_6351] = MV88E6XXX_PORT_SWITCH_ID_PROD_6351, > + [MV88E6XXX_FAMILY_6352] = MV88E6XXX_PORT_SWITCH_ID_PROD_6352, > + [MV88E6XXX_FAMILY_6390] = MV88E6XXX_PORT_SWITCH_ID_PROD_6390, > +}; Ok, no problem. I can change it in this way. I just thought that if prod_id is already defined for every model in mv88e6xxx_table[] table I could reuse it, instead of duplicating it... Anyway, now I'm looking at phy/marvell.c driver again and it supports only 88E6341 and 88E6390 families from whole 88E63xxx range. So do we need to define for now table for more than MV88E6XXX_FAMILY_6341 and MV88E6XXX_FAMILY_6390 entries? > + > static int mv88e6xxx_mdio_read(struct mii_bus *bus, int phy, int reg) > { > struct mv88e6xxx_mdio_bus *mdio_bus = bus->priv; > @@ -3056,7 +3068,7 @@ static int mv88e6xxx_mdio_read(struct mii_bus *bus, int > phy, int reg) > * a PHY, > */ > if (!(val & 0x3f0)) > - val |= MV88E6XXX_PORT_SWITCH_ID_PROD_6390 >> > 4; > + val |= family_model_table[chip->info->family] > >> 4; > } > > and it compiles. No forward declarations needed. It is missing all the > error checking etc, but i don't see why that should change the > dependencies. > > Andrew
Re: [PATCH] net: phy: marvell: fix detection of PHY on Topaz switches
On Monday 12 April 2021 16:30:03 Andrew Lunn wrote: > > > > +/* This table contains representative model for every family */ > > > > +static const enum mv88e6xxx_model family_model_table[] = { > > > > + [MV88E6XXX_FAMILY_6095] = MV88E6095, > > > > + [MV88E6XXX_FAMILY_6097] = MV88E6097, > > > > + [MV88E6XXX_FAMILY_6185] = MV88E6185, > > > > + [MV88E6XXX_FAMILY_6250] = MV88E6250, > > > > + [MV88E6XXX_FAMILY_6320] = MV88E6320, > > > > + [MV88E6XXX_FAMILY_6341] = MV88E6341, > > > > + [MV88E6XXX_FAMILY_6351] = MV88E6351, > > > > + [MV88E6XXX_FAMILY_6352] = MV88E6352, > > > > + [MV88E6XXX_FAMILY_6390] = MV88E6390, > > > > +}; > > > > > > This table is wrong. MV88E6390 does not equal > > > MV88E6XXX_PORT_SWITCH_ID_PROD_6390. MV88E6XXX_PORT_SWITCH_ID_PROD_6390 > > > was chosen because it is already an MDIO device ID, in register 2 and > > > 3. It probably will never clash with a real Marvell PHY ID. MV88E6390 > > > is just a small integer, and there is a danger it will clash with a > > > real PHY. > > > > So... how to solve this issue? What should be in the mapping table? > > You need to use MV88E6XXX_PORT_SWITCH_ID_PROD_6095, > MV88E6XXX_PORT_SWITCH_ID_PROD_6097, > ... > MV88E6XXX_PORT_SWITCH_ID_PROD_6390, But I'm using it. First chip->info->family (enum mv88e6xxx_family; MV88E6XXX_FAMILY_6341) is mapped to enum mv88e6xxx_model (MV88E6341) via family_model_table[] and then enum mv88e6xxx_model (MV88E6341) is mapped to prod_num (MV88E6XXX_PORT_SWITCH_ID_PROD_6341) via mv88e6xxx_table[]. All this is done in mv88e6xxx_physid_for_family() function. So at the end, this function converts MV88E6XXX_FAMILY_6341 to MV88E6XXX_PORT_SWITCH_ID_PROD_6341. And therefore I do not see anything wrong in family_model_table[] table. I defined family_model_table[] table to just maps enum mv88e6xxx_family to enum mv88e6xxx_model as mv88e6xxx_table[] table already contains mapping from enum mv88e6xxx_model to phys_id, to simplify implementation.
Re: [PATCH] net: phy: marvell: fix detection of PHY on Topaz switches
On Monday 12 April 2021 15:15:07 Andrew Lunn wrote: > > +static u16 mv88e6xxx_physid_for_family(enum mv88e6xxx_family family); > > + > > No forward declaration please. Move the code around. It is often best > to do that in a patch which just moves code, no other changes. It > makes it easier to review. Avoiding forward declaration would mean to move about half of source code. mv88e6xxx_physid_for_family depends on mv88e6xxx_table which depends on all _ops structures which depends on all lot of other functions. I wanted to create a small fixup patch which can be easily backported to stable releases which are affected by this issue. If you do not like forward declarations, I can create a followup patch which moves this half of code in this file to avoid forward declaration. But having it in one patch would mean to complicate reviewing code... > > static int mv88e6xxx_mdio_read(struct mii_bus *bus, int phy, int reg) > > { > > struct mv88e6xxx_mdio_bus *mdio_bus = bus->priv; > > @@ -3040,24 +3042,9 @@ static int mv88e6xxx_mdio_read(struct mii_bus *bus, > > int phy, int reg) > > err = chip->info->ops->phy_read(chip, bus, phy, reg, ); > > mv88e6xxx_reg_unlock(chip); > > > > - if (reg == MII_PHYSID2) { > > - /* Some internal PHYs don't have a model number. */ > > - if (chip->info->family != MV88E6XXX_FAMILY_6165) > > - /* Then there is the 6165 family. It gets is > > -* PHYs correct. But it can also have two > > -* SERDES interfaces in the PHY address > > -* space. And these don't have a model > > -* number. But they are not PHYs, so we don't > > -* want to give them something a PHY driver > > -* will recognise. > > -* > > -* Use the mv88e6390 family model number > > -* instead, for anything which really could be > > -* a PHY, > > -*/ > > - if (!(val & 0x3f0)) > > - val |= MV88E6XXX_PORT_SWITCH_ID_PROD_6390 >> 4; > > - } > > + /* Some internal PHYs don't have a model number. */ > > + if (reg == MII_PHYSID2 && !(val & 0x3f0)) > > + val |= mv88e6xxx_physid_for_family(chip->info->family); > > > > return err ? err : val; > > } > > @@ -5244,6 +5231,39 @@ static const struct mv88e6xxx_info > > *mv88e6xxx_lookup_info(unsigned int prod_num) > > return NULL; > > } > > > > +/* This table contains representative model for every family */ > > +static const enum mv88e6xxx_model family_model_table[] = { > > + [MV88E6XXX_FAMILY_6095] = MV88E6095, > > + [MV88E6XXX_FAMILY_6097] = MV88E6097, > > + [MV88E6XXX_FAMILY_6185] = MV88E6185, > > + [MV88E6XXX_FAMILY_6250] = MV88E6250, > > + [MV88E6XXX_FAMILY_6320] = MV88E6320, > > + [MV88E6XXX_FAMILY_6341] = MV88E6341, > > + [MV88E6XXX_FAMILY_6351] = MV88E6351, > > + [MV88E6XXX_FAMILY_6352] = MV88E6352, > > + [MV88E6XXX_FAMILY_6390] = MV88E6390, > > +}; > > This table is wrong. MV88E6390 does not equal > MV88E6XXX_PORT_SWITCH_ID_PROD_6390. MV88E6XXX_PORT_SWITCH_ID_PROD_6390 > was chosen because it is already an MDIO device ID, in register 2 and > 3. It probably will never clash with a real Marvell PHY ID. MV88E6390 > is just a small integer, and there is a danger it will clash with a > real PHY. So... how to solve this issue? What should be in the mapping table? > > --- a/drivers/net/phy/marvell.c > > +++ b/drivers/net/phy/marvell.c > > @@ -3021,9 +3021,34 @@ static struct phy_driver marvell_drivers[] = { > > .get_stats = marvell_get_stats, > > }, > > { > > - .phy_id = MARVELL_PHY_ID_88E6390, > > + .phy_id = MARVELL_PHY_ID_88E6341_FAMILY, > > .phy_id_mask = MARVELL_PHY_ID_MASK, > > - .name = "Marvell 88E6390", > > + .name = "Marvell 88E6341 Family", > > You cannot just replace the MARVELL_PHY_ID_88E6390. That will break > the 6390! You need to add the new PHY for the 88E6341. I have not replaced anything. I just put there a new phy_id section for 88E6341. You are probably confused by 'git diff' output as quickly looking at it, somebody can think that there is phy replacement. But there is no replacement, I only added a new PHY. Entry for 88E6390 is still there! Also this is reason why I wanted to avoid big code movement. It will be harder to read the 'git diff' output in this patch.
[PATCH] PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
Define new PCI_EXP_DEVCTL_PAYLOAD_* macros in linux/pci_regs.h header file for Max Payload Size. Macros are defined in the same style as existing macros PCI_EXP_DEVCTL_READRQ_* macros. Signed-off-by: Pali Rohár --- include/uapi/linux/pci_regs.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index e709ae8235e7..8f1b15eea53e 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -504,6 +504,12 @@ #define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */ #define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */ #define PCI_EXP_DEVCTL_PAYLOAD0x00e0 /* Max_Payload_Size */ +#define PCI_EXP_DEVCTL_PAYLOAD_128B 0x /* 128 Bytes */ +#define PCI_EXP_DEVCTL_PAYLOAD_256B 0x0020 /* 256 Bytes */ +#define PCI_EXP_DEVCTL_PAYLOAD_512B 0x0040 /* 512 Bytes */ +#define PCI_EXP_DEVCTL_PAYLOAD_1024B 0x0060 /* 1024 Bytes */ +#define PCI_EXP_DEVCTL_PAYLOAD_2048B 0x0080 /* 2048 Bytes */ +#define PCI_EXP_DEVCTL_PAYLOAD_4096B 0x00A0 /* 4096 Bytes */ #define PCI_EXP_DEVCTL_EXT_TAG0x0100 /* Extended Tag Field Enable */ #define PCI_EXP_DEVCTL_PHANTOM0x0200 /* Phantom Functions Enable */ #define PCI_EXP_DEVCTL_AUX_PME0x0400 /* Auxiliary Power PM Enable */ -- 2.20.1
[PATCH] arm64: dts: marvell: armada-37xx: Set linux,pci-domain to zero
Since commit 526a76991b7b ("PCI: aardvark: Implement driver 'remove' function and allow to build it as module") PCIe controller driver for Armada 37xx can be dynamically loaded and unloaded at runtime. Also driver allows dynamic binding and unbinding of PCIe controller device. Kernel PCI subsystem assigns by default dynamically allocated PCI domain number (starting from zero) for this PCIe controller every time when device is bound. So PCI domain changes after every unbind / bind operation. Alternative way for assigning PCI domain number is to use static allocated numbers defined in Device Tree. This option has requirement that every PCI controller in system must have defined PCI bus number in Device Tree. Armada 37xx has only one PCIe controller, so assign for it PCI domain 0 in Device Tree. After this change PCI domain on Armada 37xx is always zero, even after repeated unbind and bind operations. Signed-off-by: Pali Rohár Fixes: 526a76991b7b ("PCI: aardvark: Implement driver 'remove' function and allow to build it as module") --- arch/arm64/boot/dts/marvell/armada-37xx.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi index 7a2df148c6a3..f02058ef5364 100644 --- a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi +++ b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi @@ -495,6 +495,7 @@ <0 0 0 2 _intc 1>, <0 0 0 3 _intc 2>, <0 0 0 4 _intc 3>; + linux,pci-domain = <0>; max-link-speed = <2>; phys = < 0>; pcie_intc: interrupt-controller { -- 2.20.1
[PATCH] net: phy: marvell: fix detection of PHY on Topaz switches
Since commit fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading"), Linux reports the temperature of Topaz hwmon as constant -75°C. This is because switches from the Topaz family (88E6141 / 88E6341) have the address of the temperature sensor register different from Peridot. This address is instead compatible with 88E1510 PHYs, as was used for Topaz before the above mentioned commit. Define a representative model for each switch family and add a mapping table for constructing PHY IDs for the internal PHYs (since they don't have a model number). Create a new PHY ID and a new PHY driver for Topaz' internal PHY. The only difference from Peridot's PHY driver is the HWMON probing method. Prior this change Topaz's internal PHY is detected by kernel as: PHY [...] driver [Marvell 88E6390] (irq=63) And afterwards as: PHY [...] driver [Marvell 88E6341 Family] (irq=63) Signed-off-by: Pali Rohár BugLink: https://github.com/globalscaletechnologies/linux/issues/1 Fixes: fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading") Reviewed-by: Marek Behún --- drivers/net/dsa/mv88e6xxx/chip.c | 56 ++-- drivers/net/dsa/mv88e6xxx/chip.h | 2 ++ drivers/net/phy/marvell.c| 32 -- include/linux/marvell_phy.h | 5 +-- 4 files changed, 72 insertions(+), 23 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 903d619e08ed..e602c9816aee 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -3026,6 +3026,8 @@ static int mv88e6xxx_setup(struct dsa_switch *ds) return err; } +static u16 mv88e6xxx_physid_for_family(enum mv88e6xxx_family family); + static int mv88e6xxx_mdio_read(struct mii_bus *bus, int phy, int reg) { struct mv88e6xxx_mdio_bus *mdio_bus = bus->priv; @@ -3040,24 +3042,9 @@ static int mv88e6xxx_mdio_read(struct mii_bus *bus, int phy, int reg) err = chip->info->ops->phy_read(chip, bus, phy, reg, ); mv88e6xxx_reg_unlock(chip); - if (reg == MII_PHYSID2) { - /* Some internal PHYs don't have a model number. */ - if (chip->info->family != MV88E6XXX_FAMILY_6165) - /* Then there is the 6165 family. It gets is -* PHYs correct. But it can also have two -* SERDES interfaces in the PHY address -* space. And these don't have a model -* number. But they are not PHYs, so we don't -* want to give them something a PHY driver -* will recognise. -* -* Use the mv88e6390 family model number -* instead, for anything which really could be -* a PHY, -*/ - if (!(val & 0x3f0)) - val |= MV88E6XXX_PORT_SWITCH_ID_PROD_6390 >> 4; - } + /* Some internal PHYs don't have a model number. */ + if (reg == MII_PHYSID2 && !(val & 0x3f0)) + val |= mv88e6xxx_physid_for_family(chip->info->family); return err ? err : val; } @@ -5244,6 +5231,39 @@ static const struct mv88e6xxx_info *mv88e6xxx_lookup_info(unsigned int prod_num) return NULL; } +/* This table contains representative model for every family */ +static const enum mv88e6xxx_model family_model_table[] = { + [MV88E6XXX_FAMILY_6095] = MV88E6095, + [MV88E6XXX_FAMILY_6097] = MV88E6097, + [MV88E6XXX_FAMILY_6185] = MV88E6185, + [MV88E6XXX_FAMILY_6250] = MV88E6250, + [MV88E6XXX_FAMILY_6320] = MV88E6320, + [MV88E6XXX_FAMILY_6341] = MV88E6341, + [MV88E6XXX_FAMILY_6351] = MV88E6351, + [MV88E6XXX_FAMILY_6352] = MV88E6352, + [MV88E6XXX_FAMILY_6390] = MV88E6390, +}; +static_assert(ARRAY_SIZE(family_model_table) == MV88E6XXX_FAMILY_LAST); + +static u16 mv88e6xxx_physid_for_family(enum mv88e6xxx_family family) +{ + enum mv88e6xxx_model model; + + /* 6165 family gets it's PHYs correct. But it can also have two SERDES +* interfaces in the PHY address space. And these don't have a model +* number. But they are not PHYs, so we don't want to give them +* something a PHY driver will recognise. +*/ + if (family == MV88E6XXX_FAMILY_6165) + return 0; + + model = family_model_table[family]; + if (model == MV88E_NA) + return 0; + + return mv88e6xxx_table[model].prod_num >> 4; +} + static int mv88e6xxx_detect(struct mv88e6xxx_chip *chip) { const struct mv88e6xxx_info *info; diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index a57c8886f3ac..70c736788a68 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/m
Re: [PATCH v2] hwmon: (dell-smm) Add Dell Latitude E7440 to fan control whitelist
On Sunday 11 April 2021 11:57:41 Sebastian Oechsle wrote: > Allow manual PWM control on Dell Latitude E7440. > > Signed-off-by: Sebastian Oechsle Reviewed-by: Pali Rohár > > Changes in v2: > - Fix spelling > - Fix format > --- > drivers/hwmon/dell-smm-hwmon.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c > index 73b9db9e3aab..2970892bed82 100644 > --- a/drivers/hwmon/dell-smm-hwmon.c > +++ b/drivers/hwmon/dell-smm-hwmon.c > @@ -1210,6 +1210,14 @@ static struct dmi_system_id > i8k_whitelist_fan_control[] __initdata = { > }, > .driver_data = (void *)_fan_control_data[I8K_FAN_34A3_35A3], > }, > + { > + .ident = "Dell Latitude E7440", > + .matches = { > + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), > + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Latitude E7440"), > + }, > + .driver_data = (void *)_fan_control_data[I8K_FAN_34A3_35A3], > + }, > { } > }; > > -- > 2.31.1 >
Re: PCI: Race condition in pci_create_sysfs_dev_files
On Wednesday 07 April 2021 16:25:46 Petr Štetiar wrote: > Pali Rohár [2020-10-07 10:12:27]: > > Hi, > > [adding Koen to Cc:] > > > I'm hitting these race conditions randomly on pci aardvark controller > > driver- I prepared patch which speed up initialization of this driver, > > but also increase probability that it hits above race conditions :-( > > it seems, that I'm able to reproduce this race condition on every boot with > 5.10 on my Freescale i.MX6Q board, see the log excerpt bellow. I don't know if > this helps, but it's not happening on 5.4 kernel. Hello! This is same race condition which I described in my original report. Good to know that other people can reproduce it too! > [0.00] Booting Linux on physical CPU 0x0 > [0.00] Linux version 5.10.27 (ynezz@ntbk) > (arm-openwrt-linux-muslgnueabi-gcc (OpenWrt GCC 8.4.0 r12719+30-84f4a783c698) > 8.4.0, GNU ld (GNU Binutils) 2.34) #0 SMP Wed Apr 7 12:52:23 2021 > [0.00] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), > cr=10c5387d > [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing > instruction cache > [0.00] OF: fdt: Machine model: Toradex Apalis iMX6Q/D Module on Ixora > Carrier Board > > ... > > [2.239498] pci :00:00.0: BAR 0: assigned [mem 0x0100-0x010f] > [2.266430] pci :00:00.0: BAR 6: assigned [mem 0x0110-0x0110 > pref] > #���+$HH��.274570] pci :00:00.0: Max Payload Size set to 128/ 128 > (was 128), Max Read Rq 128 > > (this serial console hiccup during PCI initialization seems quite strange as > well, happens always) I'm seeing similar garbage on UART output when such thing happen. But I suspect that this is the issue when doing serialization of more parallel messages print by kernel. Nothing relevant to PCI. But running 'dmesg' after full bootup can show also these lost messages. > [2.283074] sysfs: cannot create duplicate filename > '/devices/platform/soc/1ffc000.pcie/pci:00/:00:00.0/config' > [2.293884] CPU: 1 PID: 47 Comm: kworker/u8:3 Not tainted 5.10.27 #0 > [2.300165] random: fast init done > [2.300249] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [2.310186] Workqueue: events_unbound async_run_entry_fn And this is important information! Device is registered by some async workqueue. When I tried to use my (private) patch which speed by aardvark initialization by putting init code into separate worker then probability of hitting this race condition increased to about 90%. So same situation as yours, it was called from async workqueue. But now with 5.12-rc kernel I'm not able to reproduce it because something changed in kernel and kernel schedule this "racy" worker long after race condition may happen. So it needs perfect timing and seems that one important thing is to call pci_bus_add_device() function from separate worker. > [2.315507] Backtrace: > [2.317976] [<8010cc88>] (dump_backtrace) from [<8010d134>] > (show_stack+0x20/0x24) > [2.325556] r7:813283d4 r6:6013 r5: r4:80ec845c > [2.331236] [<8010d114>] (show_stack) from [<805971fc>] > (dump_stack+0xa4/0xb8) > [2.338480] [<80597158>] (dump_stack) from [<803a7128>] > (sysfs_warn_dup+0x68/0x74) > [2.346058] r7:813283d4 r6:80af0dfc r5:812f48f0 r4:81afa000 > [2.351726] [<803a70c0>] (sysfs_warn_dup) from [<803a6c90>] > (sysfs_add_file_mode_ns+0x100/0x1cc) > [2.360518] r7:813283d4 r6:812f48f0 r5:80b4299c r4:ffef > [2.366187] [<803a6b90>] (sysfs_add_file_mode_ns) from [<803a6fe8>] > (sysfs_create_bin_file+0x94/0x9c) > [2.375411] r6:81eb8078 r5:80b4299c r4: > [2.380043] [<803a6f54>] (sysfs_create_bin_file) from [<805da848>] > (pci_create_sysfs_dev_files+0x58/0x2cc) > [2.389701] r6:81eb8000 r5:81eb8078 r4:81eb8000 > [2.394341] [<805da7f0>] (pci_create_sysfs_dev_files) from [<805cba98>] > (pci_bus_add_device+0x34/0x90) > [2.403659] r10:80b45d88 r9:81818810 r8:81328200 r7:813283d4 r6:8190c000 > r5:81eb8078 > [2.411490] r4:81eb8000 > [2.414034] [<805cba64>] (pci_bus_add_device) from [<805cbb30>] > (pci_bus_add_devices+0x3c/0x80) > [2.421744] random: crng init done > [2.422737] r5:8190c014 r4:81eb8000 > [2.429720] [<805cbaf4>] (pci_bus_add_devices) from [<805cf898>] > (pci_host_probe+0x50/0xa0) > [2.438078] r7:813283d4 r6:8190c000 r5:8190c00c r4: > [2.443753] [<805cf848>] (pci_host_probe) from [<805eeb20>] > (dw_pcie_host_init+0x1d0/0x414) > [2.452111] r7:813283d4 r6:81328058 r5:0200 r4: > [2.457780] [<805ee950>] (dw_pc
Re: PCI: Race condition in pci_create_sysfs_dev_files
On Thursday 16 July 2020 13:04:23 Pali Rohár wrote: > Hello Bjorn! > > I see following error message in dmesg which looks like a race condition: > > sysfs: cannot create duplicate filename > '/devices/platform/soc/d007.pcie/pci:00/:00:00.0/config' > > I looked at it deeper and found out that in PCI subsystem code is race > condition between pci_bus_add_device() and pci_sysfs_init() calls. Both > of these functions calls pci_create_sysfs_dev_files() and calling this > function more times for same pci device throws above error message. > > There can be two different race conditions: > > 1. pci_bus_add_device() called pcibios_bus_add_device() or > pci_fixup_device() but have not called pci_create_sysfs_dev_files() yet. > Meanwhile pci_sysfs_init() is running and pci_create_sysfs_dev_files() > was called for newly registered device. In this case function > pci_create_sysfs_dev_files() is called two times, ones from > pci_bus_add_device() and once from pci_sysfs_init(). > > 2. pci_sysfs_init() is called. It first sets sysfs_initialized to 1 > which unblock calling pci_create_sysfs_dev_files(). Then another bus > registers new PCI device and calls pci_bus_add_device() which calls > pci_create_sysfs_dev_files() and registers sysfs files. Function > pci_sysfs_init() continues execution and calls function > pci_create_sysfs_dev_files() also for this newly registered device. So > pci_create_sysfs_dev_files() is again called two times. > > > I workaround both race conditions I created following hack patch. After > applying it I'm not getting that 'sysfs: cannot create duplicate filename' > error message anymore. Scratch this hack patch, it contains another new race condition. The only way how to get rid of this race condition is either to protect whole "sysfs_initialized" variable by mutex or by completely removing "sysfs_initialized" variable and therefore also removing function pci_create_sysfs_dev_files(). I'm for second variant.
[PATCH] ath9k: Fix kernel NULL pointer dereference during ath_reset_internal()
Function ath9k_hw_reset() is dereferencing chan structure pointer, so it needs to be non-NULL pointer. Function ath9k_stop() already contains code which sets ah->curchan to valid non-NULL pointer prior calling ath9k_hw_reset() function. Add same code pattern also into ath_reset_internal() function to prevent kernel NULL pointer dereference in ath9k_hw_reset() function. This change fixes kernel NULL pointer dereference in ath9k_hw_reset() which is caused by calling ath9k_hw_reset() from ath_reset_internal() with NULL chan structure. [ 45.334305] Unable to handle kernel NULL pointer dereference at virtual address 0008 [ 45.344417] Mem abort info: [ 45.347301] ESR = 0x9605 [ 45.350448] EC = 0x25: DABT (current EL), IL = 32 bits [ 45.356166] SET = 0, FnV = 0 [ 45.359350] EA = 0, S1PTW = 0 [ 45.362596] Data abort info: [ 45.365756] ISV = 0, ISS = 0x0005 [ 45.369735] CM = 0, WnR = 0 [ 45.372814] user pgtable: 4k pages, 39-bit VAs, pgdp=0685d000 [ 45.379663] [0008] pgd=, p4d=, pud= [ 45.388856] Internal error: Oops: 9605 [#1] SMP [ 45.393897] Modules linked in: ath9k ath9k_common ath9k_hw [ 45.399574] CPU: 1 PID: 309 Comm: kworker/u4:2 Not tainted 5.12.0-rc2-dirty #785 [ 45.414746] Workqueue: phy0 ath_reset_work [ath9k] [ 45.419713] pstate: 4005 (nZcv daif -PAN -UAO -TCO BTYPE=--) [ 45.425910] pc : ath9k_hw_reset+0xc4/0x1c48 [ath9k_hw] [ 45.431234] lr : ath9k_hw_reset+0xc0/0x1c48 [ath9k_hw] [ 45.436548] sp : ffc0118dbca0 [ 45.439961] x29: ffc0118dbca0 x28: [ 45.445442] x27: ff800dee4080 x26: [ 45.450923] x25: ff800df9b9d8 x24: [ 45.456404] x23: ffc0115f6000 x22: ffc008d0d408 [ 45.461885] x21: ff800dee5080 x20: ff800df9b9d8 [ 45.467366] x19: x18: [ 45.472846] x17: x16: [ 45.478326] x15: 0010 x14: [ 45.483807] x13: ffc0918db94f x12: ffc011498720 [ 45.489289] x11: 0003 x10: ffc0114806e0 [ 45.494770] x9 : ffc01014b2ec x8 : 00017fe8 [ 45.500251] x7 : c000efff x6 : 0001 [ 45.505733] x5 : x4 : [ 45.511213] x3 : x2 : ff801fece870 [ 45.516693] x1 : ffc00eded000 x0 : 003f [ 45.522174] Call trace: [ 45.524695] ath9k_hw_reset+0xc4/0x1c48 [ath9k_hw] [ 45.529653] ath_reset_internal+0x1a8/0x2b8 [ath9k] [ 45.534696] ath_reset_work+0x2c/0x40 [ath9k] [ 45.539198] process_one_work+0x210/0x480 [ 45.543339] worker_thread+0x5c/0x510 [ 45.547115] kthread+0x12c/0x130 [ 45.550445] ret_from_fork+0x10/0x1c [ 45.554138] Code: 910922c2 9117e021 95ff0398 b4000294 (b9400a61) [ 45.560430] ---[ end trace 566410ba90b50e8b ]--- [ 45.565193] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 45.572282] SMP: stopping secondary CPUs [ 45.576331] Kernel Offset: disabled [ 45.579924] CPU features: 0x00040002,200c [ 45.584416] Memory Limit: none [ 45.587564] Rebooting in 3 seconds.. Signed-off-by: Pali Rohár Cc: sta...@vger.kernel.org --- drivers/net/wireless/ath/ath9k/main.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 45f6402478b5..97c3a53f9cef 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -307,6 +307,11 @@ static int ath_reset_internal(struct ath_softc *sc, struct ath9k_channel *hchan) hchan = ah->curchan; } + if (!hchan) { + fastcc = false; + hchan = ath9k_cmn_get_channel(sc->hw, ah, >cur_chan->chandef); + } + if (!ath_prepare_reset(sc)) fastcc = false; -- 2.20.1
Re: [PATCH] power: supply: bq27xxx: Return the value instead of -ENODATA
On Thursday 01 April 2021 01:51:28 Hermes Zhang wrote: > On 3/31/21 10:02 PM, Pali Rohár wrote: > > @@ -1655,9 +1655,6 @@ static int bq27xxx_battery_read_time(struct > > bq27xxx_device_info *di, u8 reg) > > return tval; > > } > > > > - if (tval == 65535) > > - return -ENODATA; > > - > > return tval * 60; > > I'm not sure if this is correct change. If value 65535 is special which > > indicates that data are not available then driver should not return > > (converted) value 65535*60. If -ENODATA is there to indicate that data > > are not available then -ENODATA should not be used. > > > > And if there is application which does not handle -ENODATA for state > > when data are not available then it is a bug in application. > > Yeah, I just have a feeling return -ENODATA for time_to_full/empty is > not good here. Because: > > 1. From chip datasheet, it mentioned return 65535 when it's not > available (e.g. read time_to_full when discharging), but the driver > changes behavior here. So if chip reports that value is not available (by special value 65535) then driver should not report that data are available with value 3932100s. > 2. There is other case will return -ENODATA (e.g. the gauge not > calibrated), so it will confuse application which is real failure. In both cases, driver does not have (meaningful) value, so it reports -ENODATA. If you think that some other error code is better for 2. then we can discuss about it. But -ENODATA seems a correct error code for reporting when data are not available. > Could we change the value in minute instead of seconds in > bq27xxx_battery_read_time(), so that means driver do nothing but only > pass the value from the chip? No, according to API, driver must report value in seconds, not in minutes, see documentation: Documentation/power/power_supply_class.rst All voltages, currents, charges, energies, time and temperatures in µV, µA, µAh, µWh, seconds and tenths of degree Celsius unless otherwise stated. It's driver's job to convert its raw values to units in which this class operates. There is no API for reporting value in minutes.
Re: [PATCH] power: supply: bq27xxx: Return the value instead of -ENODATA
On Wednesday 31 March 2021 21:51:41 Hermes Zhang wrote: > From: Hermes Zhang > > It might be better to return value (e.g. 65535) instead of an error > (e.g. No data available) for the time property. > > Normally a common function will handle the read string and parse to > integer for all the properties, but will have problem when read the > time property because need to handle the NODATA error as non-error. > So it will make simple for application which indicate success when > read a number, otherwise as an error to handle. > > Signed-off-by: Hermes Zhang > --- > drivers/power/supply/bq27xxx_battery.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/drivers/power/supply/bq27xxx_battery.c > b/drivers/power/supply/bq27xxx_battery.c > index 4c4a7b1c64c5..b75e54aa8ada 100644 > --- a/drivers/power/supply/bq27xxx_battery.c > +++ b/drivers/power/supply/bq27xxx_battery.c > @@ -1655,9 +1655,6 @@ static int bq27xxx_battery_read_time(struct > bq27xxx_device_info *di, u8 reg) > return tval; > } > > - if (tval == 65535) > - return -ENODATA; > - > return tval * 60; I'm not sure if this is correct change. If value 65535 is special which indicates that data are not available then driver should not return (converted) value 65535*60. If -ENODATA is there to indicate that data are not available then -ENODATA should not be used. And if there is application which does not handle -ENODATA for state when data are not available then it is a bug in application. > } > > -- > 2.20.1 >
Re: [PATCH] hwmon: (dell-smm) Add Dell Latitude 7440 to fan control whitelist
On Tuesday 30 March 2021 19:02:55 Sebastian Oechsle wrote: > Allow manual PWM control on Dell Latitude 7440. > > Signed-off-by: Sebastian Oechsle <mailto:setbool...@icloud.com>> Reviewed-by: Pali Rohár (Btw, in commit message is small typo, model name is E7440, not 7440) Guenter, would you take this patch? > --- > drivers/hwmon/dell-smm-hwmon.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c > index 73b9db9e3aab..2970892bed82 100644 > --- a/drivers/hwmon/dell-smm-hwmon.c > +++ b/drivers/hwmon/dell-smm-hwmon.c > @@ -1210,6 +1210,14 @@ static struct dmi_system_id > i8k_whitelist_fan_control[] __initdata = { > }, > .driver_data = (void *)_fan_control_data[I8K_FAN_34A3_35A3], > }, > + { > + .ident = "Dell Latitude E7440", > + .matches = { > + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), > + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Latitude E7440"), > + }, > + .driver_data = (void *)_fan_control_data[I8K_FAN_34A3_35A3], > + }, > { } > }; > > -- > 2.31.1
Re: How long should be PCIe card in Warm Reset state?
On Tuesday 30 March 2021 16:34:47 Maciej W. Rozycki wrote: > On Tue, 30 Mar 2021, Pali Rohár wrote: > > > > If I were to implement this stuff, for good measure I'd give it a safety > > > margin beyond what the spec requires and use a timeout of say 2-4s while > > > actively querying the status of the device. The values given in the spec > > > are only the minimum requirements. > > > > Are you able to also figure out what is the minimal timeout value for > > PCIe Warm Reset? > > > > Because we are having troubles to "decode" correct minimal timeout value > > for this PCIe Warm Reset (not Function-level reset). > > The spec does not give any exceptions AFAICT as to the timeouts required > between the three kinds of a Conventional Reset (Hot, Warm, or Cold) and > refers to them collectively as a Conventional Reset across the relevant > parts of the document, so clearly the same rules apply. > > Maciej There are specified more timeouts related to Warm reset and PERST# signal. Just they are not in Base spec, but in CEM spec. See previous Amey's email where are described some timeouts and also links in my first email where I put other timeouts defined in specs relevant for PERST# signal and therefore also for Warm Reset.
Re: How long should be PCIe card in Warm Reset state?
On Tuesday 30 March 2021 15:04:02 Maciej W. Rozycki wrote: > On Thu, 25 Mar 2021, David Laight wrote: > > > I can't see the value in the (nice bound) copy of the PCI 2.0 spec I have. > > But IIRC it is 100ms (it might just me 500ms). > > While this might seem like ages it can be problematic if targets have > > to load large FPGA images from serial EEPROMs. > > AFAICT it is 100ms for the Conventional Reset before Configuration > Requests are allowed to be issued in the first place, and then they are > allowed to fail with the Configuration Request Retry Status (CRS) status > until the device is ready to respond. Then it is 1.0s before the Root > Complex and/or system software is allowed to consider a device broken that > does not return a Successful Completion status for a valid Configuration > Request. > > This 1.0s period is analogous to the Trhfa parameter for PCI/PCI-X buses > (2^25/2^27 bus clocks respectively; I don't know why the PCIe spec quotes > the latter value as 2^26, contrary to what the original PCI-X spec says, > but obviously the latter document is what sets the norm), which also has > to be respected for the respective bus segments in the presence of PCIe to > PCI/PCI-X bridges. Hello Maciej! Thank you for information. > For Function-level reset the timeout is 100ms. > > This is specified in sections 6.6.1. "Conventional Reset" and 6.6.2. > "Function-Level Reset (FLR)" respectively in the copy of PCIe 2.0 base > spec I have access to; I imagine other versions may have different section > numbers, but will have them named similarly. > > If I were to implement this stuff, for good measure I'd give it a safety > margin beyond what the spec requires and use a timeout of say 2-4s while > actively querying the status of the device. The values given in the spec > are only the minimum requirements. Are you able to also figure out what is the minimal timeout value for PCIe Warm Reset? Because we are having troubles to "decode" correct minimal timeout value for this PCIe Warm Reset (not Function-level reset).
Re: [v8,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
On Thursday 18 March 2021 13:48:07 Jianjun Wang wrote: > On Thu, 2021-03-18 at 01:02 +0100, Pali Rohár wrote: > > On Saturday 13 March 2021 15:43:14 Jianjun Wang wrote: > > > On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote: > > > > On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote: > > > > > + > > > > > + /* Check if the link is up or not */ > > > > > + err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val, > > > > > + !!(val & PCIE_PORT_LINKUP), 20, > > > > > + 50 * USEC_PER_MSEC); > > > > > > > > IIRC, you need to wait at least 100ms after de-asserting PERST# signal > > > > as it is required by PCIe specs and also because experiments proved that > > > > some Compex wifi cards (e.g. WLE900VX) are not detected if you do not > > > > wait this minimal time. > > > > > > Yes, this should be 100ms, I will fix it at next version, thanks for > > > your review. > > > > In past Bjorn suggested to use msleep(PCI_PM_D3COLD_WAIT); macro for > > this step during reviewing aardvark driver. > > > > https://lore.kernel.org/linux-pci/20190426161050.ga189...@google.com/ > > > > And next iteration used this PCI_PM_D3COLD_WAIT macro instead of 100: > > > > https://lore.kernel.org/linux-pci/20190522213351.21366-2-r...@triplefau.lt/ > > Sure, I will use PCI_PM_D3COLD_WAIT macro instead in the next version. > > Thanks. Anyway, now I found out that kernel has functions for this waiting: pcie_wait_for_link_delay() and pcie_wait_for_link() Function is called from pci_bridge_wait_for_secondary_bus(). But in current form it is not usable for native controller drivers. This looks like another candidate for code de-duplication or providing "framework". Lorenzo, as maintainer of native controller drivers, do you have some ideas about providing "framework", common functions or something for avoiding to implement same code patterns in every native controller driver, which is de-facto standard PCIe codepath? Including a way how to export PERST# reset gpio?
Re: [PATCH v2] eeprom/optoe: driver to read/write SFP/QSFP/CMIS EEPROMS
On Friday 26 March 2021 11:43:10 Don Bollinger wrote: > > Hello Don! > > > > I have read whole discussion and your EEPROM patch proposal. But for me it > > looks like some kernel glue code for some old legacy / proprietary access > > method which does not have any usage outside of that old code. > > I don't know if 'kernel glue code' is good or bad. It is a driver. It > locks access to a device so it can perform multiple accesses without > interference. It organizes the data on a weird device into a simple linear > address space that can be accessed with open(), seek(), read() and write() > calls. > > As for 'old code', this code and variations of it are under active > development by multiple Network OS vendors and multiple switch vendors, and > in production on hundreds of thousands of switches with millions of > SFP/QSFP/CMIS devices. This stuff is running the biggest clouds in the > world. > > > > > Your code does not contain any quirks which are needed to read different > > EEPROMs in different SFP modules. As Andrew wrote there are lot of broken > > SFPs which needs special handling and this logic is already implemented in > > sfp.c and sfp-bus.c kernel drivers. These drivers then export EEPROM > > content to userspace via ethtool -m API in unified way and userspace does > > not implement any quirks (nor does not have to deal with quirks). > > As a technical matter, you handle those quirks in the code that interprets > EEPROM data. You have figured out what devices have what quirks, then coded > up solutions to make them work. Then, after all the quirk handling is done, > you call the actual access routines (sfp_i2c_read() and sfp_i2c_write()) to > access the module EEPROMs. My code works the same way, except in my > community all the interpretation of EEPROM data is done in user space. You > may not like that, but it is not clear why you should care how my community > chooses to handle the data. In this architecture, the only thing needed > from the kernel is the equivalent of sfp_i2c_read() and sfp_i2c_write, which > optoe provides. The key point here is that my community wants the kernel to > just access the data. No interpretation, no identification, no special > cases. > > > > > If you try to read EEPROM "incorrectly" then SFP module with its EEPROM > > chip (or emulation of chip) locks and is fully unusable after you unplug > it and > > plug it again. Kernel really should not export API to userspace which can > > cause "damage" to SFP modules. And currently it does *not* do it. > > In my community, such devices are not tolerated. Modules which can be > "damaged" should be thrown away. Well, those tested / problematic modules are not damaged by real. They just stop working until you do power off / on cycle and unplug / plug them again. But I agree with you. Russel wrote about those SFP modules that they should have biohazard label :-) The best would be if those modules disappear, but that would not happen in near future. These modules are already used by lot of users and users wanted support for them. Hence we wrote what was possible. The main issue is that we do know know any well-behaved GPON SFP module. GPON is now very common technology for internet access, so it is not obvious that more people are asking about support for GPON HW compatible with Linux kernel. > Please be clear, I am not disagreeing with your implementation. For your > GPON devices, you handle this in kernel code. Cool. Keep it that way. > Just don't make my community implement that where it is not needed and not > wanted. Optoe just accesses the device. Other layers handle interpretation > and quirks. Your handling is in sfp.c, mine is in user space. Not right or > wrong, just different. Both work. > > > > > I have contributed code for some GPON SFP modules, so their EEPROM can > > be correctly read and exported to userspace via ethtool -m. So I know that > > this is very fragile area and needs to be properly handled. > > My code is in use in cloud datacenters and campus switches around the world. > I know it needs to be properly handled. > > > > > So I do not see any reason why can be a new optoe method in _current_ > > form useful. It does not implemented required things for handling > different > > EEPROM modules. > > optoe would be useful in your current environment as a replacement for > sfp_i2c_read() and sfp_i2c_write(). Those routines just access the EEPROM. > They don't identify or interpret or implement quirk handling. Neither does > optoe. Now that we know that there are SFP modules which needs special handling without it they lock themselves on i2c bus, we really cannot export to userspace interface which allows this locking. I understand that you do not care for these broken GPON SFP modules, but "generic" API which would be part of mainline kernel really cannot have known issue that can cause some modules to lock by just simple "read" operation. > AND, optoe is
Re: [PATCH mvebu v3 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Friday 12 March 2021 10:12:06 Gregory CLEMENT wrote: > Hello Pali, > > > Hello Gregory! > > > > Patches are the for almost two months and more people have tested them. > > They are marked with Fixed/CC-stable tags, they should go also into > > stable trees as they are fixing CPU scaling and instability issues. > > > > Are there any issues with these patches? If not, could you please merge > > them for upcoming Linux version? > > Actually I am not the maintainer of the clk and cpufreq subsystems, so > the only thing I can apply is the device tree relative patch. > > Gregory Hello Gregory! Could you please at least review this patches, so other maintainers could merge them?
Re: [v9,5/7] PCI: mediatek-gen3: Add MSI support
On Saturday 27 March 2021 19:44:30 Marc Zyngier wrote: > On Sat, 27 Mar 2021 19:28:37 +, > Pali Rohár wrote: > > > > On Wednesday 24 March 2021 11:05:08 Jianjun Wang wrote: > > > +static void mtk_pcie_msi_handler(struct mtk_pcie_port *port, int set_idx) > > > +{ > > > + struct mtk_msi_set *msi_set = >msi_sets[set_idx]; > > > + unsigned long msi_enable, msi_status; > > > + unsigned int virq; > > > + irq_hw_number_t bit, hwirq; > > > + > > > + msi_enable = readl_relaxed(msi_set->base + PCIE_MSI_SET_ENABLE_OFFSET); > > > + > > > + do { > > > + msi_status = readl_relaxed(msi_set->base + > > > +PCIE_MSI_SET_STATUS_OFFSET); > > > + msi_status &= msi_enable; > > > + if (!msi_status) > > > + break; > > > + > > > + for_each_set_bit(bit, _status, PCIE_MSI_IRQS_PER_SET) { > > > + hwirq = bit + set_idx * PCIE_MSI_IRQS_PER_SET; > > > + virq = irq_find_mapping(port->msi_bottom_domain, hwirq); > > > + generic_handle_irq(virq); > > > + } > > > + } while (true); > > > > Hello! > > > > Just a question, cannot this while-loop cause block of processing other > > interrupts? > > This is a level interrupt. You don't have much choice but to handle it > immediately, although an alternative would be to mask it and deal with > it in a thread. And since Linux doesn't deal with interrupt priority, > a screaming interrupt is never a good thing. I see. Something like "interrupt priority" (which does not exist?) would be needed to handle it. > > I have done tests with different HW (aardvark) but with same while(true) > > loop logic. One XHCI PCIe controller was sending MSI interrupts too fast > > and interrupt handler with this while(true) logic was in infinite loop. > > During one IRQ it was calling infinite many times generic_handle_irq() > > as HW was feeding new and new MSI hwirq into status register. > > Define "too fast". Fast - next interrupt comes prior checking if while(true)-loop should stop. > If something in the system is able to program the > XHCI device in such a way that it causes a screaming interrupt, that's > the place to look for problems, and probably not in the interrupt > handling itself, which does what it is supposed to do. > > > But this is different HW, so it can have different behavior and does not > > have to cause above issue. > > > > I have just spotted same code pattern for processing MSI interrupts... > > This is a common pattern that you will find in pretty much any > interrupt handling/demuxing, and is done this way when the cost of > taking the exception is high compared to that of handling it. And would not help if while(true)-loop is replaced by loop with upper limit of iterations? Or just call only one iteration? > Which is pretty much any of the badly designed, level-driving, > DW-inspired, sorry excuse for MSI implementations that are popular on > low-end ARM SoCs. Ok. So thank you for information! > Thanks, > > M. > > -- > Without deviation from the norm, progress is not possible.
Re: [v9,5/7] PCI: mediatek-gen3: Add MSI support
On Wednesday 24 March 2021 11:05:08 Jianjun Wang wrote: > +static void mtk_pcie_msi_handler(struct mtk_pcie_port *port, int set_idx) > +{ > + struct mtk_msi_set *msi_set = >msi_sets[set_idx]; > + unsigned long msi_enable, msi_status; > + unsigned int virq; > + irq_hw_number_t bit, hwirq; > + > + msi_enable = readl_relaxed(msi_set->base + PCIE_MSI_SET_ENABLE_OFFSET); > + > + do { > + msi_status = readl_relaxed(msi_set->base + > +PCIE_MSI_SET_STATUS_OFFSET); > + msi_status &= msi_enable; > + if (!msi_status) > + break; > + > + for_each_set_bit(bit, _status, PCIE_MSI_IRQS_PER_SET) { > + hwirq = bit + set_idx * PCIE_MSI_IRQS_PER_SET; > + virq = irq_find_mapping(port->msi_bottom_domain, hwirq); > + generic_handle_irq(virq); > + } > + } while (true); Hello! Just a question, cannot this while-loop cause block of processing other interrupts? I have done tests with different HW (aardvark) but with same while(true) loop logic. One XHCI PCIe controller was sending MSI interrupts too fast and interrupt handler with this while(true) logic was in infinite loop. During one IRQ it was calling infinite many times generic_handle_irq() as HW was feeding new and new MSI hwirq into status register. But this is different HW, so it can have different behavior and does not have to cause above issue. I have just spotted same code pattern for processing MSI interrupts... > +} > + > static void mtk_pcie_irq_handler(struct irq_desc *desc) > { > struct mtk_pcie_port *port = irq_desc_get_handler_data(desc); > @@ -405,6 +673,14 @@ static void mtk_pcie_irq_handler(struct irq_desc *desc) > generic_handle_irq(virq); > } > > + irq_bit = PCIE_MSI_SHIFT; > + for_each_set_bit_from(irq_bit, , PCIE_MSI_SET_NUM + > + PCIE_MSI_SHIFT) { > + mtk_pcie_msi_handler(port, irq_bit - PCIE_MSI_SHIFT); > + > + writel_relaxed(BIT(irq_bit), port->base + PCIE_INT_STATUS_REG); > + } > + > chained_irq_exit(irqchip, desc); > } > > -- > 2.25.1 >
Re: [PATCH] PCI: Disallow retraining link for Atheros QCA98xx chips on non-Gen1 PCIe bridges
On Saturday 27 March 2021 15:42:13 Marek Behún wrote: > On Sat, 27 Mar 2021 14:29:59 +0100 > Pali Rohár wrote: > > > I can change this to 'if (!ret)' if needed, no problem. > > > > I use 'if (!val)' mostly for boolean and pointer variables. If > > variable can contain more integer values then I lot of times I use > > '=='. > > Comparing integer varibales with explicit literals is sensible, but > if a function returning integer returns 0 on success and negative value > on error, Linux kernel has a tradition of using just > if (!ret) > or > if (ret) > > Marek Ok, I will change it.
Re: [PATCH] PCI: Disallow retraining link for Atheros QCA98xx chips on non-Gen1 PCIe bridges
Hello! On Saturday 27 March 2021 01:14:10 Krzysztof Wilczyński wrote: > Hi Pali, > > Thank you for sending the patch over! > > [...] > > +static int pcie_change_tls_to_gen1(struct pci_dev *parent) > > Just a nitpick, so feel free to ignore it. I would just call the > variable "dev" as we pass a pointer to a particular device, but it does > not matter as much, so I am leaving this to you. I called it 'parent' because it is called 'parent' also in caller function. Link consists of two devices, so 'dev' could be ambiguous. > [...] > > + if (ret == 0) { > > You prefer this style over "if (!ret)"? Just asking in the view of the > style that seem to be preferred in the code base at the moment. I can change this to 'if (!ret)' if needed, no problem. I use 'if (!val)' mostly for boolean and pointer variables. If variable can contain more integer values then I lot of times I use '=='. > > + /* Verify that new value was really set */ > > + pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, ); > > + if ((reg16 & PCI_EXP_LNKCTL2_TLS) != PCI_EXP_LNKCTL2_TLS_2_5GT) > > + ret = -EINVAL; > > I am wondering about this verification - did you have a case where the > device would not properly set its capability, or accept the write and do > nothing? I do not know any real device which is doing this thing. But this issue can happen with kernel's PCIe emulated root bridge: drivers/pci/pci-bridge-emul.c Drivers which are using this emulated root bridge (and both pci-mvebu.c and pci-aardvark.c are using it!) do not have to implement callback for every read and every write operation of every register. Note that both pci-mvebu.c and pci-aardvark.c currently does not implement access to HW register PCI_EXP_LNKCTL2. So currently it is not possible to set PCI_EXP_LNKCTL2_TLS_2_5GT. And above code correctly fails and disallow ASPM code to retrain link. I have some WIP patches which implement LNKCAP2, LNKCTL2 and LNKSTA2 read/write callbacks on emulated bridge for pci-mvebu.c and pci-aardvark.c. And I have tested that with those WIP patches ASPM code can correctly switch link to 2.5GT/s and enable ASPM. > > + if (ret != 0) > > I think "if (ret)" would be fine to use here, unless you prefer being > more explicit. See my question about style above. > > > static bool pcie_retrain_link(struct pcie_link_state *link) > > { > > struct pci_dev *parent = link->pdev; > > unsigned long end_jiffies; > > u16 reg16; > > + u32 reg32; > > + > > + /* Check if link is capable of higher speed than 2.5 GT/s and > > needs quirk */ > > + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, ); > > + if ((reg32 & PCI_EXP_LNKCAP_SLS) > PCI_EXP_LNKCAP_SLS_2_5GB) { > > I wonder if moving this check to pcie_change_tls_to_gen1() would make > more sense? It would then make this function a little cleaner. What do > you think? Ok, no problem. But then function needs to be renamed. Any idea how should be this function called? > [...] > > +static void quirk_no_bus_reset_and_no_retrain_link(struct pci_dev *dev) > > +{ > > + dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET | > > PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1; > > +} > [...] > > I know that the style has been changed to allow 100 characters width and > that checkpatch.pl now also does not warn about line length, as per > commit bdc48fa11e46 ("checkpatch/coding-style: deprecate 80-column > warning"), but I think Bjorn still prefers 80 characters, thus this line > above might have to be aligned. Ok! If needed I can reformat patch to 80 characters per line.
[PATCH] PCI: Disallow retraining link for Atheros QCA98xx chips on non-Gen1 PCIe bridges
Atheros QCA9880 and QCA9890 chips do not behave after a bus reset and also after retrain link when PCIe bridge is not in GEN1 mode at 2.5 GT/s speed. The device will throw a Link Down error and config space is not accessible again. Retrain link can be called only when using GEN1 PCIe bridge or when PCIe bridge has forced link speed to 2.5 GT/s via PCI_EXP_LNKCTL2 register. This issue was reproduced with more Compex WLE900VX cards (QCA9880 based) on Armada 385 with pci-mvebu.c driver and also on Armada 3720 with pci-aardvark.c driver. Also this issue was reproduced with some "noname" card with QCA9890 WiFi chip on Armada 3720. All problematic cards with these QCA chips have PCI device id 0x003c. Tests showed that other WiFi cards based on AR93xx (PCI device id 0x0030) and AR9287 (PCI device id 0x002e) chips do not have these problems. To workaround this issue, this change introduces a new PCI quirk called PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1 for PCI device id 0x003c. When this quirk is set then kernel disallows triggering PCI_EXP_LNKCTL_RL bit in config space of PCIe Bridge in case PCIe Bridge is capable of higher speed than 2.5 GT/s and higher speed is already allowed. When PCIe Bridge has accessible LNKCTL2 register then kernel tries to force target link speed via PCI_EXP_LNKCTL2_TLS* bits to 2.5 GT/s. After this change it is possible to trigger PCI_EXP_LNKCTL_RL bit without causing issues on problematic Atheros QCA98xx cards. Currently only PCIe ASPM kernel code triggers this PCI_EXP_LNKCTL_RL bit, so quirk check is added only into pcie/aspm.c file. Signed-off-by: Pali Rohár Reported-by: Toke Høiland-Jørgensen Tested-by: Marek Behún Link: https://lore.kernel.org/linux-pci/87h7l8axqp@toke.dk/ Cc: sta...@vger.kernel.org # c80851f6ce63a ("PCI: Add PCI_EXP_LNKCTL2_TLS* macros") --- drivers/pci/pcie/aspm.c | 43 + drivers/pci/quirks.c| 15 +- include/linux/pci.h | 2 ++ 3 files changed, 59 insertions(+), 1 deletion(-) diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index ac0557a305af..ea5bdf6107f6 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -192,11 +192,54 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist) link->clkpm_disable = blacklist ? 1 : 0; } +static int pcie_change_tls_to_gen1(struct pci_dev *parent) +{ + u16 reg16; + u32 reg32; + int ret; + + /* Check if link speed can be forced to 2.5 GT/s */ + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, ); + if (!(reg32 & PCI_EXP_LNKCAP2_SLS_2_5GB)) { + pci_err(parent, "ASPM: Bridge does not support changing Link Speed to 2.5 GT/s\n"); + return -EOPNOTSUPP; + } + + /* Force link speed to 2.5 GT/s */ + ret = pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2, +PCI_EXP_LNKCTL2_TLS, +PCI_EXP_LNKCTL2_TLS_2_5GT); + if (ret == 0) { + /* Verify that new value was really set */ + pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, ); + if ((reg16 & PCI_EXP_LNKCTL2_TLS) != PCI_EXP_LNKCTL2_TLS_2_5GT) + ret = -EINVAL; + } + + if (ret != 0) + pci_err(parent, "ASPM: Changing Target Link Speed to 2.5 GT/s failed: %d\n", ret); + + return ret; +} + static bool pcie_retrain_link(struct pcie_link_state *link) { struct pci_dev *parent = link->pdev; unsigned long end_jiffies; u16 reg16; + u32 reg32; + + if (link->downstream->dev_flags & PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1) { + /* Check if link is capable of higher speed than 2.5 GT/s and needs quirk */ + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, ); + if ((reg32 & PCI_EXP_LNKCAP_SLS) > PCI_EXP_LNKCAP_SLS_2_5GB) { + if (pcie_change_tls_to_gen1(parent) != 0) { + pci_err(parent, "ASPM: Retrain Link at higher speed is disallowed by quirk\n"); + return false; + } + pci_info(parent, "ASPM: Target Link Speed changed to 2.5 GT/s due to quirk\n"); + } + } pcie_capability_read_word(parent, PCI_EXP_LNKCTL, ); reg16 |= PCI_EXP_LNKCTL_RL; diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 653660e3ba9e..8ff690c7679d 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3558,6 +3558,11 @@ static void quirk_no_bus_reset(struct pci_dev *dev) dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET; } +static void quirk_no_bus_reset_and_no_retrain_link(struct pci_dev *dev) +{ + dev->dev_flags |= PCI_DEV_FL
Re: [v9,2/7] PCI: Export pci_pio_to_address() for module use
On Wednesday 24 March 2021 11:05:05 Jianjun Wang wrote: > This interface will be used by PCI host drivers for PIO translation, > export it to support compiling those drivers as kernel modules. > > Signed-off-by: Jianjun Wang > --- > drivers/pci/pci.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 16a17215f633..12bba221c9f2 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -4052,6 +4052,7 @@ phys_addr_t pci_pio_to_address(unsigned long pio) > > return address; > } > +EXPORT_SYMBOL(pci_pio_to_address); Hello! I'm not sure if EXPORT_SYMBOL is correct because file has GPL-2.0 header. Should not be in this case used only EXPORT_SYMBOL_GPL? Maybe other people would know what is correct? > > unsigned long __weak pci_address_to_pio(phys_addr_t address) > { > -- > 2.25.1 >
Re: [PATCH 5.4 00/24] 5.4.105-rc1 review
On Tuesday 23 March 2021 10:20:27 Florian Fainelli wrote: > On 3/15/2021 2:50 AM, Pali Rohár wrote: > > On Friday 12 March 2021 12:54:18 Alexander Lobakin wrote: > >> From: Florian Fainelli > >> Date: Thu, 11 Mar 2021 09:41:27 -0800 > >> > >> Hi Florian, > >> > >>> On 3/11/21 9:40 AM, Greg KH wrote: > >>>> On Thu, Mar 11, 2021 at 09:23:56AM -0800, Florian Fainelli wrote: > >>>>> On 3/11/21 5:08 AM, Greg KH wrote: > >>>>>> On Wed, Mar 10, 2021 at 08:19:45PM -0800, Florian Fainelli wrote: > >>>>>>> +Alex, > >>>>>>> > >>>>>>> On 3/10/2021 5:24 AM, gre...@linuxfoundation.org wrote: > >>>>>>>> From: Greg Kroah-Hartman > >>>>>>>> > >>>>>>>> This is the start of the stable review cycle for the 5.4.105 release. > >>>>>>>> There are 24 patches in this series, all will be posted as a response > >>>>>>>> to this one. If anyone has any issues with these being applied, > >>>>>>>> please > >>>>>>>> let me know. > >>>>>>>> > >>>>>>>> Responses should be made by Fri, 12 Mar 2021 13:23:09 +. > >>>>>>>> Anything received after that time might be too late. > >>>>>>>> > >>>>>>>> The whole patch series can be found in one patch at: > >>>>>>>> > >>>>>>>> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.105-rc1.gz > >>>>>>>> or in the git tree and branch at: > >>>>>>>> > >>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > >>>>>>>> linux-5.4.y > >>>>>>>> and the diffstat can be found below. > >>>>>>>> > >>>>>>>> thanks, > >>>>>>>> > >>>>>>>> greg k-h > >>>>>>> > >>>>>>> I believe you need to drop "net: dsa: add GRO support via gro_cells" > >>>>>>> as > >>>>>>> it causes the following kernel panic on a DSA-enabled platform: > >>>>>>> > >>>>>>> Configuring rgmii_2 interface > >>>>>>> [ 10.170527] brcm-sf2 f0b0.ethernet_switch rgmii_2: configuring > >>>>>>> for fixed/rgmii-txid link mode > >>>>>>> [ 10.179597] 8021q: adding VLAN 0 to HW filter on device rgmii_2 > >>>>>>> [ 10.185608] brcm-sf2 f0b0.ethernet_switch rgmii_2: Link is Up - > >>>>>>> 1Gbps/Full - flow control off > >>>>>>> [ 10.198631] IPv6: ADDRCONF(NETDEV_CHANGE): rgmii_2: link becomes > >>>>>>> ready > >>>>>>> Configuring sit0 interface > >>>>>>> [ 10.254346] 8<--- cut here --- > >>>>>>> [ 10.257438] Unable to handle kernel paging request at virtual > >>>>>>> address > >>>>>>> d6df6190 > >>>>>>> [ 10.264685] pgd = (ptrval) > >>>>>>> [ 10.267411] [d6df6190] *pgd=807003, *pmd= > >>>>>>> [ 10.272846] Internal error: Oops: 206 [#1] SMP ARM > >>>>>>> [ 10.277661] Modules linked in: > >>>>>>> [ 10.280739] CPU: 0 PID: 1886 Comm: sed Not tainted > >>>>>>> 5.4.105-1.0pre-geff642e2af2b #4 > >>>>>>> [ 10.288337] Hardware name: Broadcom STB (Flattened Device Tree) > >>>>>>> [ 10.294292] PC is at gro_cells_receive+0x90/0x11c > >>>>>>> [ 10.299020] LR is at dsa_switch_rcv+0x120/0x1d4 > >>>>>>> [ 10.303562] pc : []lr : []psr: 600f0113 > >>>>>>> [ 10.309841] sp : c1d33cd0 ip : 03e8 fp : c1d33ce4 > >>>>>>> [ 10.315078] r10: c8901000 r9 : c8901000 r8 : c0b4a53c > >>>>>>> [ 10.320314] r7 : c2208920 r6 : r5 : r4 : > >>>>>>> 4000 > >>>>>>> [ 10.326855] r3 : d6df6188 r2 : c4927000 r1 : c8adc300 r0 : > >>>>>>> c22069dc > >>>>>>> [ 10.98] Flags: nZCv IRQs on FIQs on
Re: How long should be PCIe card in Warm Reset state?
On Tuesday 23 March 2021 21:49:41 Amey Narkhede wrote: > On 21/03/10 12:05PM, Pali Rohár wrote: > > Hello! > > > > I would like to open a question about PCIe Warm Reset. Warm Reset of > > PCIe card is triggered by asserting PERST# signal and in most cases > > PERST# signal is controlled by GPIO. > > > > Basically every native Linux PCIe controller driver is doing this Warm > > Reset of connected PCIe card during native driver initialization > > procedure. > > > > And now the important question is: How long should be PCIe card in Warm > > Reset state? After which timeout can be PERST# signal de-asserted by > > Linux controller driver? > > > > Lorenzo and Rob already expressed concerns [1] [2] that this Warm Reset > > timeout should not be driver specific and I agree with them. > > > > I have done investigation which timeout is using which native PCIe > > driver [3] and basically every driver is using different timeout. > > > > I have tried to find timeouts in PCIe specifications, I was not able to > > understand and deduce correct timeout value for Warm Reset from PCIe > > specifications. What I have found is written in my email [4]. > > > > Alex (as a "reset expert"), could you look at this issue? > > > > Or is there somebody else who understand PCIe specifications and PCIe > > diagrams to figure out what is the minimal timeout for de-asserting > > PERST# signal? > > > > There are still some issues with WiFi cards (e.g. Compex one) which > > sometimes do not appear on PCIe bus. And based on these "reset timeout > > differences" in Linux PCIe controller drivers, I suspect that it is not > > (only) the problems in WiFi cards but also in Linux PCIe controller > > drivers. In my email [3] I have written that I figured out that WLE1216 > > card needs to be in Warm Reset state for at least 10ms, otherwise card > > is not detected. > > > > [1] - > > https://lore.kernel.org/linux-pci/20200513115940.fiemtnxfqcyqo6ik@pali/ > > [2] - https://lore.kernel.org/linux-pci/20200507212002.GA32182@bogus/ > > [3] - > > https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/ > > [4] - > > https://lore.kernel.org/linux-pci/20200430082245.xblvb7xeamm4e336@pali/ > > I somehow got my hands on PCIe Gen4 spec. It says on page no 555- > "When PERST# is provided to a component or adapter, this signal must be > used by the component or adapter as Fundamental Reset. > When PERST# is not provided to a component or adapter, Fundamental Reset is > generated autonomously by the component or adapter, and the details of how > this is done are outside the scope of this document." > Not sure what component/adapter means in this context. > > Then below it says- > "In some cases, it may be possible for the Fundamental Reset mechanism > to be triggered by hardware without the removal and re-application of > power to the component. This is called a warm reset. This document does > not specify a means for generating a warm reset." > > Thanks, > Amey Hello Amey, PCIe Base document does not specify how to control PERST# signal and how to issue Warm Reset. But it is documented in PCIe CEM, Mini PCIe CEM and M.2 CEM documents (maybe in some other PCIe docs too). It is needed look into more documents, "merge them in head" and then deduce final meaning...
Re: [v8,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
On Tuesday 23 March 2021 09:31:34 Jianjun Wang wrote: > One more question, is there any chance that we can put this linkup flow > to a more "standard" way, such as drivers provides the ops of the PERST# > pin and let the framework to decide how to start a link training, or we > just use macro to replace this timeout value in the future? This is something about which I was thinking that could be useful for pci-aardvark.c driver. But I was not sure if some other driver can benefit from such "framework". But now I see that your driver is another candidate which can benefit from it. Currently there is no such "framework" in kernel and the hardest part would be to design it. Having this API would allow kernel to implement and export PCIe Warm Reset (which is done via PERST# signal) and easily extend Amey's reset patches to export also Warm Reset via sysfs. But to implement this framework and using it for reset we first need to answer questions which I have sent in email: https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/ Bjorn, Alex: any opinion about PERST#? Also see Enrico's email, where confirmed that there are platforms which shares one PERST# signal for more endpoint cards: https://lore.kernel.org/linux-pci/1da0fa2c-8056-9ae8-6ce4-ab6453177...@metux.net/
Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism
On Thursday 18 March 2021 20:01:55 Amey Narkhede wrote: > On 21/03/17 09:13PM, Pali Rohár wrote: > > On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote: > > > On Wed, 17 Mar 2021 20:40:24 +0100 > > > Pali Rohár wrote: > > > > > > > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote: > > > > > On Wed, 17 Mar 2021 20:24:24 +0100 > > > > > Pali Rohár wrote: > > > > > > > > > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote: > > > > > > > On Wed, 17 Mar 2021 20:02:06 +0100 > > > > > > > Pali Rohár wrote: > > > > > > > > > > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote: > > > > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100 > > > > > > > > > Pali Rohár wrote: > > > > > > > > > > > > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote: > > > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100 > > > > > > > > > > > Pali Rohár wrote: > > > > > > > > > > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and > > > > > > > > > > > > > secondary bus > > > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot > > > > > > > > > > > > > reset and > > > > > > > > > > > > > warm reset respectively. > > > > > > > > > > > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot > > > > > > > > > > > > reset is just another > > > > > > > > > > > > type of reset, which is currently implemented only for > > > > > > > > > > > > PCIe hot plug > > > > > > > > > > > > bridges and for PowerPC PowerNV platform and it just > > > > > > > > > > > > call PCI secondary > > > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does > > > > > > > > > > > > not have API in > > > > > > > > > > > > kernel and therefore drivers do not export this type of > > > > > > > > > > > > reset via any > > > > > > > > > > > > kernel function (yet). > > > > > > > > > > > > > > > > > > > > > > Warm reset is beyond the scope of this series, but could > > > > > > > > > > > be implemented > > > > > > > > > > > in a compatible way to fit within the > > > > > > > > > > > pci_reset_fn_methods[] array > > > > > > > > > > > defined here. > > > > > > > > > > > > > > > > > > > > Ok! > > > > > > > > > > > > > > > > > > > > > Note that with this series the resets available through > > > > > > > > > > > pci_reset_function() and the per device reset attribute > > > > > > > > > > > is sysfs remain > > > > > > > > > > > exactly the same as they are currently. The bus and slot > > > > > > > > > > > reset > > > > > > > > > > > methods used here are limited to devices where only a > > > > > > > > > > > single function is > > > > > > > > > > > affected by the reset, therefore it is not like the patch > > > > > > > > > > > you proposed > > > > > > > > > > > which performed a reset irrespective of the downstream > > > > > > > > > > > devices. This > > > > > > > > > > > series only enables selection of the existing methods. > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > Alex > > > > > >
Re: [PATCH v2] eeprom/optoe: driver to read/write SFP/QSFP/CMIS EEPROMS
Hello Don! I have read whole discussion and your EEPROM patch proposal. But for me it looks like some kernel glue code for some old legacy / proprietary access method which does not have any usage outside of that old code. Your code does not contain any quirks which are needed to read different EEPROMs in different SFP modules. As Andrew wrote there are lot of broken SFPs which needs special handling and this logic is already implemented in sfp.c and sfp-bus.c kernel drivers. These drivers then export EEPROM content to userspace via ethtool -m API in unified way and userspace does not implement any quirks (nor does not have to deal with quirks). If you try to read EEPROM "incorrectly" then SFP module with its EEPROM chip (or emulation of chip) locks and is fully unusable after you unplug it and plug it again. Kernel really should not export API to userspace which can cause "damage" to SFP modules. And currently it does *not* do it. I have contributed code for some GPON SFP modules, so their EEPROM can be correctly read and exported to userspace via ethtool -m. So I know that this is very fragile area and needs to be properly handled. So I do not see any reason why can be a new optoe method in _current_ form useful. It does not implemented required things for handling different EEPROM modules. I would rather suggest you to use ethtool -m IOCTL API and in case it is not suitable for QSFP (e.g. because of paging system) then extend this API. There were already proposals for using netlink socket interface which is today de-facto standard interface for new code. sysfs API for such thing really looks like some legacy code and nowadays we have better access methods. If you want, I can help you with these steps and make patches to be in acceptable state; not written in "legacy" style. As I'm working with GPON SFP modules with different EEPROMs in them, I'm really interested in any improvements in their EEPROM area.
Re: [v8,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
On Thursday 18 March 2021 13:48:07 Jianjun Wang wrote: > On Thu, 2021-03-18 at 01:02 +0100, Pali Rohár wrote: > > On Saturday 13 March 2021 15:43:14 Jianjun Wang wrote: > > > On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote: > > > > On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote: > > > > > +static int mtk_pcie_startup_port(struct mtk_pcie_port *port) > > > > > +{ > > > > ... > > > > > + > > > > > + /* Delay 100ms to wait the reference clocks become stable */ > > > > > + msleep(100); > > > > > + > > > > > + /* De-assert PERST# signal */ > > > > > + val &= ~PCIE_PE_RSTB; > > > > > + writel_relaxed(val, port->base + PCIE_RST_CTRL_REG); > > > > > > > > Hello! This is a new driver which introduce yet another custom timeout > > > > prior PERST# signal for PCIe card is de-asserted. Timeouts for other > > > > drivers I collected in older email [2]. > > > > > > > > Please look at my email [1] about PCIe Warm Reset if you have any clue > > > > about it. Lorenzo and Rob already expressed that this timeout should not > > > > be driver specific. But nobody was able to "decode" and "understand" > > > > PCIe spec yet about these timeouts. > > > > > > Hi Pali, > > > > > > I think this is more like a platform specific timeout, which is used to > > > wait for the reference clocks to become stable and finish the reset flow > > > of HW blocks. > > > > > > Here is the steps to start a link training in this HW: > > > > > > 1. Assert all reset signals which including the transaction layer, PIPE > > > interface and internal bus interface; > > > > > > 2. De-assert reset signals except the PERST#, this will make the > > > physical layer active and start to output the reference clock, but the > > > EP device remains in the reset state. > > >Before releasing the PERST# signal, the HW blocks needs at least 10ms > > > to finish the reset flow, and ref-clk needs about 30us to become stable. > > > > > > 3. De-assert PERST# signal, wait LTSSM enter L0 state. > > > > > > This 100ms timeout is reference to TPVPERL in the PCIe CEM spec. Since > > > we are in the kernel stage, the power supply has already stabled, this > > > timeout may not take that long. > > > > I think that this is not platform specific timeout or platform specific > > steps. This matches generic steps as defined in PCIe CEM spec, section > > 2.2.1. Initial Power-Up (G3 to S0). > > > > What is platform specific is just how to achieve these steps. > > > > Am I right? > > > > ... > > > > TPVPERL is one of my timeout candidates as minimal required timeout for > > Warm Reset. I have wrote it in email: > > > > https://lore.kernel.org/linux-pci/20200430082245.xblvb7xeamm4e336@pali/ > > > > But I'm not sure as specially in none diagram is described just warm > > reset as defined in mPCIe CEM (3.2.4.3. PERST# Signal). > > > > ... > > > > Anyway, I would suggest to define constants for those timeouts. I guess > > that in future we could be able to define "generic" timeout constants > > which would not be in private driver section, but in some common header > > file. > > I agree with this, but I'm not sure if we really need that long time in > the kernel stage, because the power supply has already stable and it's > really impact the boot time, especially when the platform have multi > ports and not connect any EP device, we need to wait 200ms for each port > when system bootup. Ports are independent. So you can initialize them in parallel, right? If you initialize each port in separate worker then during msleep calls kernel can schedule other kernel thread to run and so it does not increase boot time. While pcie is sleeping kernel can do other things. So the result is that whole boot time is not increased, just reordered. > For this PCIe controller driver, I would like to change the timeout > value to 10ms to comply with the HW design, and save some boot time. In case you can connect _any_ PCIe card to your HW then you cannot decrease or change timeouts required by PCIe specs. Otherwise there can be a card which would not be initialized correctly. I'm debugging driver for aardvark PCIe controller and I see that Compex cards really needs these timeouts, otherwise link is down and card cannot be detected. So I guess that t
Re: [v8,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
On Saturday 13 March 2021 15:43:14 Jianjun Wang wrote: > On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote: > > On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote: > > > +static int mtk_pcie_startup_port(struct mtk_pcie_port *port) > > > +{ > > ... > > > + > > > + /* Delay 100ms to wait the reference clocks become stable */ > > > + msleep(100); > > > + > > > + /* De-assert PERST# signal */ > > > + val &= ~PCIE_PE_RSTB; > > > + writel_relaxed(val, port->base + PCIE_RST_CTRL_REG); > > > > Hello! This is a new driver which introduce yet another custom timeout > > prior PERST# signal for PCIe card is de-asserted. Timeouts for other > > drivers I collected in older email [2]. > > > > Please look at my email [1] about PCIe Warm Reset if you have any clue > > about it. Lorenzo and Rob already expressed that this timeout should not > > be driver specific. But nobody was able to "decode" and "understand" > > PCIe spec yet about these timeouts. > > Hi Pali, > > I think this is more like a platform specific timeout, which is used to > wait for the reference clocks to become stable and finish the reset flow > of HW blocks. > > Here is the steps to start a link training in this HW: > > 1. Assert all reset signals which including the transaction layer, PIPE > interface and internal bus interface; > > 2. De-assert reset signals except the PERST#, this will make the > physical layer active and start to output the reference clock, but the > EP device remains in the reset state. >Before releasing the PERST# signal, the HW blocks needs at least 10ms > to finish the reset flow, and ref-clk needs about 30us to become stable. > > 3. De-assert PERST# signal, wait LTSSM enter L0 state. > > This 100ms timeout is reference to TPVPERL in the PCIe CEM spec. Since > we are in the kernel stage, the power supply has already stabled, this > timeout may not take that long. I think that this is not platform specific timeout or platform specific steps. This matches generic steps as defined in PCIe CEM spec, section 2.2.1. Initial Power-Up (G3 to S0). What is platform specific is just how to achieve these steps. Am I right? ... TPVPERL is one of my timeout candidates as minimal required timeout for Warm Reset. I have wrote it in email: https://lore.kernel.org/linux-pci/20200430082245.xblvb7xeamm4e336@pali/ But I'm not sure as specially in none diagram is described just warm reset as defined in mPCIe CEM (3.2.4.3. PERST# Signal). ... Anyway, I would suggest to define constants for those timeouts. I guess that in future we could be able to define "generic" timeout constants which would not be in private driver section, but in some common header file. > > > + > > > + /* Check if the link is up or not */ > > > + err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val, > > > + !!(val & PCIE_PORT_LINKUP), 20, > > > + 50 * USEC_PER_MSEC); > > > > IIRC, you need to wait at least 100ms after de-asserting PERST# signal > > as it is required by PCIe specs and also because experiments proved that > > some Compex wifi cards (e.g. WLE900VX) are not detected if you do not > > wait this minimal time. > > Yes, this should be 100ms, I will fix it at next version, thanks for > your review. In past Bjorn suggested to use msleep(PCI_PM_D3COLD_WAIT); macro for this step during reviewing aardvark driver. https://lore.kernel.org/linux-pci/20190426161050.ga189...@google.com/ And next iteration used this PCI_PM_D3COLD_WAIT macro instead of 100: https://lore.kernel.org/linux-pci/20190522213351.21366-2-r...@triplefau.lt/ > Thanks. > > > > > + if (err) { > > > + val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG); > > > + dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val); > > > + return err; > > > + } > > > > [1] - > > https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/ > > [2] - > > https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/ >
Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism
On Wednesday 17 March 2021 14:00:20 Alex Williamson wrote: > On Wed, 17 Mar 2021 20:40:24 +0100 > Pali Rohár wrote: > > > On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote: > > > On Wed, 17 Mar 2021 20:24:24 +0100 > > > Pali Rohár wrote: > > > > > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote: > > > > > On Wed, 17 Mar 2021 20:02:06 +0100 > > > > > Pali Rohár wrote: > > > > > > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote: > > > > > > > On Mon, 15 Mar 2021 15:52:38 +0100 > > > > > > > Pali Rohár wrote: > > > > > > > > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote: > > > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100 > > > > > > > > > Pali Rohár wrote: > > > > > > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: > > > > > > > > > > > > > > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus > > > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset > > > > > > > > > > > and > > > > > > > > > > > warm reset respectively. > > > > > > > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is > > > > > > > > > > just another > > > > > > > > > > type of reset, which is currently implemented only for PCIe > > > > > > > > > > hot plug > > > > > > > > > > bridges and for PowerPC PowerNV platform and it just call > > > > > > > > > > PCI secondary > > > > > > > > > > bus reset with some other hook. PCIe Warm Reset does not > > > > > > > > > > have API in > > > > > > > > > > kernel and therefore drivers do not export this type of > > > > > > > > > > reset via any > > > > > > > > > > kernel function (yet). > > > > > > > > > > > > > > > > > > Warm reset is beyond the scope of this series, but could be > > > > > > > > > implemented > > > > > > > > > in a compatible way to fit within the pci_reset_fn_methods[] > > > > > > > > > array > > > > > > > > > defined here. > > > > > > > > > > > > > > > > Ok! > > > > > > > > > > > > > > > > > Note that with this series the resets available through > > > > > > > > > pci_reset_function() and the per device reset attribute is > > > > > > > > > sysfs remain > > > > > > > > > exactly the same as they are currently. The bus and slot > > > > > > > > > reset > > > > > > > > > methods used here are limited to devices where only a single > > > > > > > > > function is > > > > > > > > > affected by the reset, therefore it is not like the patch you > > > > > > > > > proposed > > > > > > > > > which performed a reset irrespective of the downstream > > > > > > > > > devices. This > > > > > > > > > series only enables selection of the existing methods. > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > > > > > > > > > > > > But with this patch series, there is still an issue with PCI > > > > > > > > secondary > > > > > > > > bus reset mechanism as exported sysfs attribute does not do that > > > > > > > > remove-reset-rescan procedure. As discussed in other thread, > > > > > > > > this reset > > > > > > > > let device in unconfigured / broken state. > > > > > > > > > > > > > > No, the
Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism
On Wednesday 17 March 2021 13:32:45 Alex Williamson wrote: > On Wed, 17 Mar 2021 20:24:24 +0100 > Pali Rohár wrote: > > > On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote: > > > On Wed, 17 Mar 2021 20:02:06 +0100 > > > Pali Rohár wrote: > > > > > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote: > > > > > On Mon, 15 Mar 2021 15:52:38 +0100 > > > > > Pali Rohár wrote: > > > > > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote: > > > > > > > On Mon, 15 Mar 2021 14:52:26 +0100 > > > > > > > Pali Rohár wrote: > > > > > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: > > > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus > > > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and > > > > > > > > > warm reset respectively. > > > > > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is > > > > > > > > just another > > > > > > > > type of reset, which is currently implemented only for PCIe hot > > > > > > > > plug > > > > > > > > bridges and for PowerPC PowerNV platform and it just call PCI > > > > > > > > secondary > > > > > > > > bus reset with some other hook. PCIe Warm Reset does not have > > > > > > > > API in > > > > > > > > kernel and therefore drivers do not export this type of reset > > > > > > > > via any > > > > > > > > kernel function (yet). > > > > > > > > > > > > > > Warm reset is beyond the scope of this series, but could be > > > > > > > implemented > > > > > > > in a compatible way to fit within the pci_reset_fn_methods[] array > > > > > > > defined here. > > > > > > > > > > > > Ok! > > > > > > > > > > > > > Note that with this series the resets available through > > > > > > > pci_reset_function() and the per device reset attribute is sysfs > > > > > > > remain > > > > > > > exactly the same as they are currently. The bus and slot reset > > > > > > > methods used here are limited to devices where only a single > > > > > > > function is > > > > > > > affected by the reset, therefore it is not like the patch you > > > > > > > proposed > > > > > > > which performed a reset irrespective of the downstream devices. > > > > > > > This > > > > > > > series only enables selection of the existing methods. Thanks, > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > > > > > > But with this patch series, there is still an issue with PCI > > > > > > secondary > > > > > > bus reset mechanism as exported sysfs attribute does not do that > > > > > > remove-reset-rescan procedure. As discussed in other thread, this > > > > > > reset > > > > > > let device in unconfigured / broken state. > > > > > > > > > > No, there's not: > > > > > > > > > > int pci_reset_function(struct pci_dev *dev) > > > > > { > > > > > int rc; > > > > > > > > > > if (!dev->reset_fn) > > > > > return -ENOTTY; > > > > > > > > > > pci_dev_lock(dev); > > > > > >>> pci_dev_save_and_disable(dev); > > > > > > > > > > rc = __pci_reset_function_locked(dev); > > > > > > > > > > >>> pci_dev_restore(dev); > > > > > pci_dev_unlock(dev); > > > > > > > > > > return rc; > > > > > } > > > > > > > > > > The remove/re-scan was discussed primarily because your patch > > > > > performed > > > > > a bus reset regardless of what devices were
Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism
On Wednesday 17 March 2021 13:15:36 Alex Williamson wrote: > On Wed, 17 Mar 2021 20:02:06 +0100 > Pali Rohár wrote: > > > On Monday 15 March 2021 09:03:39 Alex Williamson wrote: > > > On Mon, 15 Mar 2021 15:52:38 +0100 > > > Pali Rohár wrote: > > > > > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote: > > > > > On Mon, 15 Mar 2021 14:52:26 +0100 > > > > > Pali Rohár wrote: > > > > > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: > > > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus > > > > > > > reset(pci_parent_bus_reset) which I think are hot reset and > > > > > > > warm reset respectively. > > > > > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just > > > > > > another > > > > > > type of reset, which is currently implemented only for PCIe hot plug > > > > > > bridges and for PowerPC PowerNV platform and it just call PCI > > > > > > secondary > > > > > > bus reset with some other hook. PCIe Warm Reset does not have API in > > > > > > kernel and therefore drivers do not export this type of reset via > > > > > > any > > > > > > kernel function (yet). > > > > > > > > > > Warm reset is beyond the scope of this series, but could be > > > > > implemented > > > > > in a compatible way to fit within the pci_reset_fn_methods[] array > > > > > defined here. > > > > > > > > Ok! > > > > > > > > > Note that with this series the resets available through > > > > > pci_reset_function() and the per device reset attribute is sysfs > > > > > remain > > > > > exactly the same as they are currently. The bus and slot reset > > > > > methods used here are limited to devices where only a single function > > > > > is > > > > > affected by the reset, therefore it is not like the patch you proposed > > > > > which performed a reset irrespective of the downstream devices. This > > > > > series only enables selection of the existing methods. Thanks, > > > > > > > > > > Alex > > > > > > > > > > > > > But with this patch series, there is still an issue with PCI secondary > > > > bus reset mechanism as exported sysfs attribute does not do that > > > > remove-reset-rescan procedure. As discussed in other thread, this reset > > > > let device in unconfigured / broken state. > > > > > > No, there's not: > > > > > > int pci_reset_function(struct pci_dev *dev) > > > { > > > int rc; > > > > > > if (!dev->reset_fn) > > > return -ENOTTY; > > > > > > pci_dev_lock(dev); > > > >>> pci_dev_save_and_disable(dev); > > > > > > rc = __pci_reset_function_locked(dev); > > > > > > >>> pci_dev_restore(dev); > > > pci_dev_unlock(dev); > > > > > > return rc; > > > } > > > > > > The remove/re-scan was discussed primarily because your patch performed > > > a bus reset regardless of what devices were affected by that reset and > > > it's difficult to manage the scope where multiple devices are affected. > > > Here, the bus and slot reset functions will fail unless the scope is > > > limited to the single device triggering this reset. Thanks, > > > > > > Alex > > > > > > > I was thinking a bit more about it and I'm really sure how it would > > behave with hotplugging PCIe bridge. > > > > On aardvark PCIe controller I have already tested that secondary bus > > reset bit is triggering Hot Reset event and then also Link Down event. > > These events are not handled by aardvark driver yet (needs to > > implemented into kernel's emulated root bridge code). > > > > But I'm not sure how it would behave on real HW PCIe hotplugging bridge. > > Kernel has already code which removes PCIe device if it changes presence > > bit (and inform via interrupt). And Link Down event triggers this > > change. > > This is the difference between slot and bus resets, the slot reset is > implemented by the hotplug controller and disables presence detection > around the bus reset. Thanks, Yes, but I'm talking about bus reset, not about slot reset. I mean: to use bus reset via sysfs on hardware which supports slots and hotplugging. And if I'm reading code correctly, this combination is allowed, right? Via these new patches it is possible to disable slot reset and enable bus reset.
Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism
On Monday 15 March 2021 09:03:39 Alex Williamson wrote: > On Mon, 15 Mar 2021 15:52:38 +0100 > Pali Rohár wrote: > > > On Monday 15 March 2021 08:34:09 Alex Williamson wrote: > > > On Mon, 15 Mar 2021 14:52:26 +0100 > > > Pali Rohár wrote: > > > > > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: > > > > > slot reset (pci_dev_reset_slot_function) and secondary bus > > > > > reset(pci_parent_bus_reset) which I think are hot reset and > > > > > warm reset respectively. > > > > > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another > > > > type of reset, which is currently implemented only for PCIe hot plug > > > > bridges and for PowerPC PowerNV platform and it just call PCI secondary > > > > bus reset with some other hook. PCIe Warm Reset does not have API in > > > > kernel and therefore drivers do not export this type of reset via any > > > > kernel function (yet). > > > > > > Warm reset is beyond the scope of this series, but could be implemented > > > in a compatible way to fit within the pci_reset_fn_methods[] array > > > defined here. > > > > Ok! > > > > > Note that with this series the resets available through > > > pci_reset_function() and the per device reset attribute is sysfs remain > > > exactly the same as they are currently. The bus and slot reset > > > methods used here are limited to devices where only a single function is > > > affected by the reset, therefore it is not like the patch you proposed > > > which performed a reset irrespective of the downstream devices. This > > > series only enables selection of the existing methods. Thanks, > > > > > > Alex > > > > > > > But with this patch series, there is still an issue with PCI secondary > > bus reset mechanism as exported sysfs attribute does not do that > > remove-reset-rescan procedure. As discussed in other thread, this reset > > let device in unconfigured / broken state. > > No, there's not: > > int pci_reset_function(struct pci_dev *dev) > { > int rc; > > if (!dev->reset_fn) > return -ENOTTY; > > pci_dev_lock(dev); > >>> pci_dev_save_and_disable(dev); > > rc = __pci_reset_function_locked(dev); > > >>> pci_dev_restore(dev); > pci_dev_unlock(dev); > > return rc; > } > > The remove/re-scan was discussed primarily because your patch performed > a bus reset regardless of what devices were affected by that reset and > it's difficult to manage the scope where multiple devices are affected. > Here, the bus and slot reset functions will fail unless the scope is > limited to the single device triggering this reset. Thanks, > > Alex > I was thinking a bit more about it and I'm really sure how it would behave with hotplugging PCIe bridge. On aardvark PCIe controller I have already tested that secondary bus reset bit is triggering Hot Reset event and then also Link Down event. These events are not handled by aardvark driver yet (needs to implemented into kernel's emulated root bridge code). But I'm not sure how it would behave on real HW PCIe hotplugging bridge. Kernel has already code which removes PCIe device if it changes presence bit (and inform via interrupt). And Link Down event triggers this change. Can somebody test these changes on some PCIe hotplug controller what secondary bus reset via sysfs would do? Because currently it is not exported as reset method and there can be different race conditions and maybe error (?) if hotplug code is going to remove device on which user triggered bus reset via sysfs. And in my opinion this can happen also in case when only one device is on the bus, so it perfectly matches all conditions when sysfs can use bus reset for one device. I can try to implement hotplug code into aardvark driver and root bridge emulator to test how this patch would happen. But it would take some time...
Re: [PATCHv2 18/38] dt-bindings: power: supply: n900-battery: Convert to DT schema format
On Wednesday 17 March 2021 14:48:44 Sebastian Reichel wrote: > Convert the binding to DT schema format. > > Cc: Pali Rohár Rejected-by: Pali Rohár > Signed-off-by: Sebastian Reichel Hello Sebastian! I'm really really sorry, I have nothing against you, but personally I cannot ack change signed by company where some people are supporting censorship, GPL violations and other similar immoral activities against other individual developers. > --- > .../power/supply/nokia,n900-battery.yaml | 49 +++ > .../bindings/power/supply/rx51-battery.txt| 25 -- > 2 files changed, 49 insertions(+), 25 deletions(-) > create mode 100644 > Documentation/devicetree/bindings/power/supply/nokia,n900-battery.yaml > delete mode 100644 > Documentation/devicetree/bindings/power/supply/rx51-battery.txt
Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism
On Monday 15 March 2021 08:34:09 Alex Williamson wrote: > On Mon, 15 Mar 2021 14:52:26 +0100 > Pali Rohár wrote: > > > On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: > > > slot reset (pci_dev_reset_slot_function) and secondary bus > > > reset(pci_parent_bus_reset) which I think are hot reset and > > > warm reset respectively. > > > > No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another > > type of reset, which is currently implemented only for PCIe hot plug > > bridges and for PowerPC PowerNV platform and it just call PCI secondary > > bus reset with some other hook. PCIe Warm Reset does not have API in > > kernel and therefore drivers do not export this type of reset via any > > kernel function (yet). > > Warm reset is beyond the scope of this series, but could be implemented > in a compatible way to fit within the pci_reset_fn_methods[] array > defined here. Ok! > Note that with this series the resets available through > pci_reset_function() and the per device reset attribute is sysfs remain > exactly the same as they are currently. The bus and slot reset > methods used here are limited to devices where only a single function is > affected by the reset, therefore it is not like the patch you proposed > which performed a reset irrespective of the downstream devices. This > series only enables selection of the existing methods. Thanks, > > Alex > But with this patch series, there is still an issue with PCI secondary bus reset mechanism as exported sysfs attribute does not do that remove-reset-rescan procedure. As discussed in other thread, this reset let device in unconfigured / broken state.
Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism
On Monday 15 March 2021 19:13:23 Amey Narkhede wrote: > slot reset (pci_dev_reset_slot_function) and secondary bus > reset(pci_parent_bus_reset) which I think are hot reset and > warm reset respectively. No. PCI secondary bus reset = PCIe Hot Reset. Slot reset is just another type of reset, which is currently implemented only for PCIe hot plug bridges and for PowerPC PowerNV platform and it just call PCI secondary bus reset with some other hook. PCIe Warm Reset does not have API in kernel and therefore drivers do not export this type of reset via any kernel function (yet).
Re: [PATCH 5.4 00/24] 5.4.105-rc1 review
On Friday 12 March 2021 12:54:18 Alexander Lobakin wrote: > From: Florian Fainelli > Date: Thu, 11 Mar 2021 09:41:27 -0800 > > Hi Florian, > > > On 3/11/21 9:40 AM, Greg KH wrote: > > > On Thu, Mar 11, 2021 at 09:23:56AM -0800, Florian Fainelli wrote: > > >> On 3/11/21 5:08 AM, Greg KH wrote: > > >>> On Wed, Mar 10, 2021 at 08:19:45PM -0800, Florian Fainelli wrote: > > +Alex, > > > > On 3/10/2021 5:24 AM, gre...@linuxfoundation.org wrote: > > > From: Greg Kroah-Hartman > > > > > > This is the start of the stable review cycle for the 5.4.105 release. > > > There are 24 patches in this series, all will be posted as a response > > > to this one. If anyone has any issues with these being applied, > > > please > > > let me know. > > > > > > Responses should be made by Fri, 12 Mar 2021 13:23:09 +. > > > Anything received after that time might be too late. > > > > > > The whole patch series can be found in one patch at: > > > > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.105-rc1.gz > > > or in the git tree and branch at: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > > > linux-5.4.y > > > and the diffstat can be found below. > > > > > > thanks, > > > > > > greg k-h > > > > I believe you need to drop "net: dsa: add GRO support via gro_cells" as > > it causes the following kernel panic on a DSA-enabled platform: > > > > Configuring rgmii_2 interface > > [ 10.170527] brcm-sf2 f0b0.ethernet_switch rgmii_2: configuring > > for fixed/rgmii-txid link mode > > [ 10.179597] 8021q: adding VLAN 0 to HW filter on device rgmii_2 > > [ 10.185608] brcm-sf2 f0b0.ethernet_switch rgmii_2: Link is Up - > > 1Gbps/Full - flow control off > > [ 10.198631] IPv6: ADDRCONF(NETDEV_CHANGE): rgmii_2: link becomes > > ready > > Configuring sit0 interface > > [ 10.254346] 8<--- cut here --- > > [ 10.257438] Unable to handle kernel paging request at virtual > > address > > d6df6190 > > [ 10.264685] pgd = (ptrval) > > [ 10.267411] [d6df6190] *pgd=807003, *pmd= > > [ 10.272846] Internal error: Oops: 206 [#1] SMP ARM > > [ 10.277661] Modules linked in: > > [ 10.280739] CPU: 0 PID: 1886 Comm: sed Not tainted > > 5.4.105-1.0pre-geff642e2af2b #4 > > [ 10.288337] Hardware name: Broadcom STB (Flattened Device Tree) > > [ 10.294292] PC is at gro_cells_receive+0x90/0x11c > > [ 10.299020] LR is at dsa_switch_rcv+0x120/0x1d4 > > [ 10.303562] pc : []lr : []psr: 600f0113 > > [ 10.309841] sp : c1d33cd0 ip : 03e8 fp : c1d33ce4 > > [ 10.315078] r10: c8901000 r9 : c8901000 r8 : c0b4a53c > > [ 10.320314] r7 : c2208920 r6 : r5 : r4 : > > 4000 > > [ 10.326855] r3 : d6df6188 r2 : c4927000 r1 : c8adc300 r0 : > > c22069dc > > [ 10.98] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM > > Segment user > > [ 10.340547] Control: 30c5387d Table: 04ac4c80 DAC: fffd > > [ 10.346307] Process sed (pid: 1886, stack limit = 0x(ptrval)) > > [ 10.352066] Stack: (0xc1d33cd0 to 0xc1d34000) > > [ 10.356434] 3cc0: c8adc300 > > c4927000 c1d33d04 c1d33ce8 > > [ 10.364631] 3ce0: c0b4a65c c0a579a4 c1d33d24 c2208920 c1d33d24 > > c1d33d5c c1d33d08 > > [ 10.372827] 3d00: c0a0b38c c0b4a548 c021e070 c2204cc8 > > c89015c0 04b87700 c89015c0 > > [ 10.381023] 3d20: c2208920 c1d33d24 c1d33d24 00976ec2 04b87700 > > c8adc300 c89015c0 > > [ 10.389218] 3d40: c1d33d74 c1d32000 c230742c c1d33dac > > c1d33d60 c0a0b5c0 c0a0b180 > > [ 10.397414] 3d60: c2204cc8 c1d33d6c c1d33d6c > > c1d33d80 c029daf8 00976ec2 > > [ 10.405610] 3d80: 0800 c8901540 c89015c0 c8901540 > > 0001 016c 0162 > > [ 10.413805] 3da0: c1d33dc4 c1d33db0 c0a0b7fc c0a0b3b8 > > c8adc300 c1d33dfc c1d33dc8 > > [ 10.422001] 3dc0: c0a0c660 c0a0b7e4 c8901540 c8adc300 c1d33dfc > > c1d33de0 c8901540 c8adc300 > > [ 10.430196] 3de0: 015e c8901000 0001 016c c1d33e74 > > c1d33e00 c083df00 c0a0c4fc > > [ 10.438391] 3e00: 012c c22b0f14 c1d33e4c c1d33e18 c0fbd9b8 > > c0fbd9cc c0fbd9e0 c0fbd98c > > [ 10.446586] 3e20: 0001 0040 c8901500 0001 > > > > [ 10.454780] 3e40: c02f65a0 c8901540 0001 > > 0040 c22b07e4 012c > > [ 10.462975] 3e60: d1003000 fffb942f c1d33edc c1d33e78 c0a0c94c > > c083dafc d051ad80 c2204cc8 > > [ 10.471170] 3e80: c2204cf0 c1d32000 c22b40b0
Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism
On Friday 12 March 2021 23:04:52 ameynarkhed...@gmail.com wrote: > From: Amey Narkhede > > Add reset_methods_enabled bitmap to struct pci_dev to > keep track of user preferred device reset mechanisms. > Add reset_method sysfs attribute to query and set > user preferred device reset mechanisms. > > Signed-off-by: Amey Narkhede > --- > Reviewed-by: Alex Williamson > Reviewed-by: Raphael Norwitz > > Documentation/ABI/testing/sysfs-bus-pci | 15 ++ > drivers/pci/pci-sysfs.c | 66 +++-- > drivers/pci/pci.c | 3 +- > include/linux/pci.h | 2 + > 4 files changed, 82 insertions(+), 4 deletions(-) > > diff --git a/Documentation/ABI/testing/sysfs-bus-pci > b/Documentation/ABI/testing/sysfs-bus-pci > index 25c9c3977..ae53ecd2e 100644 > --- a/Documentation/ABI/testing/sysfs-bus-pci > +++ b/Documentation/ABI/testing/sysfs-bus-pci > @@ -121,6 +121,21 @@ Description: > child buses, and re-discover devices removed earlier > from this part of the device tree. > > +What:/sys/bus/pci/devices/.../reset_method > +Date:March 2021 > +Contact: Amey Narkhede > +Description: > + Some devices allow an individual function to be reset > + without affecting other functions in the same slot. > + For devices that have this support, a file named reset_method > + will be present in sysfs. Reading this file will give names > + of the device supported reset methods. Currently used methods > + are enclosed in brackets. Writing the name of any of the device > + supported reset method to this file will set the reset method to > + be used when resetting the device. Writing "none" to this file > + will disable ability to reset the device and writing "default" > + will return to the original value. > + Hello Amey! I think that this API does not work for PCIe Hot Reset (=PCI secondary bus reset) and PCIe Warm Reset. First reset method is bound to the bus, not device and therefore kernel does not have to see any registered device. So there would be no "reset_method" sysfs file, and also no "reset" sysfs file. But PCIe Hot Reset is in most cases needed when buggy card is not registered on bus, to trigger this reset. And with this API this is not possible. PCIe Warm Reset is done by PERST# signal. When signal is asserted then device is in reset state and therefore is not registered. So again kernel does not have to see registered device. Moreover for mPCIe form factor cards, boards can share one PERST# signal with more PCIe cards and control this signal via GPIO. So asserting PERST# GPIO can trigger Warm reset for more PCIe cards, not just one. It depends on board or topology. So... I do not think that current approach with "reset_method" sysfs entry bound to the PCI device does not work for PCI secondary bus reset and also cannot be used for implementing PCIe Warm Reset. I would rather suggest to re-design and prepare a new API which would work also with PCIe Hot Reset and PCIe Warm Reset. This "reset" sysfs file can work only with PCI Function Level Reset or some PM or device specific reset. But not with reset types which are more like slot or bus orientated. > What:/sys/bus/pci/devices/.../reset > Date:July 2009 > Contact: Michael S. Tsirkin > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index 78d2c130c..3cd06d1c0 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -1304,6 +1304,59 @@ static const struct bin_attribute pcie_config_attr = { > .write = pci_write_config, > }; > > +static ssize_t reset_method_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + const struct pci_reset_fn_method *reset; > + struct pci_dev *pdev = to_pci_dev(dev); > + ssize_t len = 0; > + int i; > + > + for (i = 0, reset = pci_reset_fn_methods; reset->reset_fn; i++, > reset++) { > + if (pdev->reset_methods_enabled & (1 << i)) > + len += sysfs_emit_at(buf, len, "[%s] ", reset->name); > + else if (pdev->reset_methods & (1 << i)) > + len += sysfs_emit_at(buf, len, "%s ", reset->name); > + } > + > + return len; > +} > + > +static ssize_t reset_method_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > +{ > + const struct pci_reset_fn_method *reset = pci_reset_fn_methods; > + struct pci_dev *pdev = to_pci_dev(dev); > + u8 reset_mechanism; > + int i = 0; > + > + /* Writing none disables reset */ > + if (sysfs_streq(buf, "none")) { > + reset_mechanism = 0; > + } else if
Re: [PATCH 3/4] PCI: Remove reset_fn field from pci_dev
On Friday 12 March 2021 23:04:51 ameynarkhed...@gmail.com wrote: > From: Amey Narkhede > > reset_fn field is used to indicate whether the > device supports any reset mechanism or not. > Deprecate use of reset_fn in favor of new > reset_methods bitmap which can be used to keep > track of all supported reset mechanisms of a device. Hello Amey! You cannot trigger PCIe Hot Reset (PCI secondary bus reset) in this simple way from sysfs via new reset methods. I proposed very similar functionality just few days ago: https://lore.kernel.org/linux-pci/20210301171221.3d42a55i7h5ubqsb@pali/T/#u And I realized that it needs more steps to be done. At least some remove-reset-rescan procedure done atomically is required. > Signed-off-by: Amey Narkhede > --- > Reviewed-by: Alex Williamson > Reviewed-by: Raphael Norwitz > > drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 2 +- > drivers/pci/pci-sysfs.c| 6 ++ > drivers/pci/pci.c | 6 +++--- > drivers/pci/probe.c| 1 - > drivers/pci/quirks.c | 2 +- > include/linux/pci.h| 1 - > 6 files changed, 7 insertions(+), 11 deletions(-) > > diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c > b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c > index 9b9d305c6..3e2c49e08 100644 > --- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c > +++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c > @@ -526,7 +526,7 @@ static void octeon_destroy_resources(struct octeon_device > *oct) > oct->irq_name_storage = NULL; > } > /* Soft reset the octeon device before exiting */ > - if (oct->pci_dev->reset_fn) > + if (oct->pci_dev->reset_methods) > octeon_pci_flr(oct); > else > cn23xx_vf_ask_pf_to_do_flr(oct); > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index f8afd54ca..78d2c130c 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -1334,7 +1334,7 @@ static int pci_create_capabilities_sysfs(struct pci_dev > *dev) > > pcie_vpd_create_sysfs_dev_files(dev); > > - if (dev->reset_fn) { > + if (dev->reset_methods) { > retval = device_create_file(>dev, _attr_reset); > if (retval) > goto error; > @@ -1417,10 +1417,8 @@ int __must_check pci_create_sysfs_dev_files(struct > pci_dev *pdev) > static void pci_remove_capabilities_sysfs(struct pci_dev *dev) > { > pcie_vpd_remove_sysfs_dev_files(dev); > - if (dev->reset_fn) { > + if (dev->reset_methods) > device_remove_file(>dev, _attr_reset); > - dev->reset_fn = 0; > - } > } > > /** > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 407b44e85..b7f6c6588 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -5175,7 +5175,7 @@ int pci_reset_function(struct pci_dev *dev) > { > int rc; > > - if (!dev->reset_fn) > + if (!dev->reset_methods) > return -ENOTTY; > > pci_dev_lock(dev); > @@ -5211,7 +5211,7 @@ int pci_reset_function_locked(struct pci_dev *dev) > { > int rc; > > - if (!dev->reset_fn) > + if (!dev->reset_methods) > return -ENOTTY; > > pci_dev_save_and_disable(dev); > @@ -5234,7 +5234,7 @@ int pci_try_reset_function(struct pci_dev *dev) > { > int rc; > > - if (!dev->reset_fn) > + if (!dev->reset_methods) > return -ENOTTY; > > if (!pci_dev_trylock(dev)) > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 01dd037bd..4764e031a 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -2404,7 +2404,6 @@ static void pci_init_capabilities(struct pci_dev *dev) > > pcie_report_downtraining(dev); > pci_init_reset_methods(dev); > - dev->reset_fn = !!dev->reset_methods; > } > > /* > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 0a3df84c9..20a81b1bc 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -5535,7 +5535,7 @@ static void > quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev) > > if (pdev->subsystem_vendor != PCI_VENDOR_ID_LENOVO || > pdev->subsystem_device != 0x222e || > - !pdev->reset_fn) > + !pdev->reset_methods) > return; > > if (pci_enable_device_mem(pdev)) > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 56d6e4750..a2f003f4e 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -437,7 +437,6 @@ struct pci_dev { > unsigned intstate_saved:1; > unsigned intis_physfn:1; > unsigned intis_virtfn:1; > - unsigned intreset_fn:1; > unsigned intis_hotplug_bridge:1; > unsigned intshpc_managed:1; /* SHPC owned by shpchp */ >
Re: [PATCH 2/4] PCI: Add new bitmap for keeping track of supported reset mechanisms
On Friday 12 March 2021 23:04:50 ameynarkhed...@gmail.com wrote: > From: Amey Narkhede > > Introduce a new bitmap reset_methods in struct pci_dev > to keep track of reset mechanisms supported by the > device. Also refactor probing and reset functions > to take advantage of calling convention of reset > functions. > > Signed-off-by: Amey Narkhede > --- > Reviewed-by: Alex Williamson > Reviewed-by: Raphael Norwitz > > drivers/pci/pci.c | 106 > drivers/pci/pci.h | 11 - > drivers/pci/probe.c | 5 +-- > include/linux/pci.h | 10 + > 4 files changed, 79 insertions(+), 53 deletions(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 4a7c084a3..407b44e85 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -40,6 +40,26 @@ const char *pci_power_names[] = { > }; > EXPORT_SYMBOL_GPL(pci_power_names); > > +static int pci_af_flr(struct pci_dev *dev, int probe); > +static int pci_pm_reset(struct pci_dev *dev, int probe); > +static int pci_dev_reset_slot_function(struct pci_dev *dev, int probe); > +static int pci_parent_bus_reset(struct pci_dev *dev, int probe); > + > +/* > + * The ordering for functions in pci_reset_fn_methods > + * is required for bitmap positions defined > + * in reset_methods in struct pci_dev > + */ > +const struct pci_reset_fn_method pci_reset_fn_methods[] = { > + { .reset_fn = _dev_specific_reset, .name = "device_specific" }, > + { .reset_fn = _flr, .name = "flr" }, > + { .reset_fn = _af_flr, .name = "af_flr" }, > + { .reset_fn = _pm_reset, .name = "pm" }, > + { .reset_fn = _dev_reset_slot_function, .name = "slot" }, > + { .reset_fn = _parent_bus_reset, .name = "bus" }, Hello Amey! In the list of reset methods is missing PCIe Warm Reset. Could you extend and prepare API also for PCIe Warm Reset? According to PCI Express mini card and m.2 electromechanical specifications, PCIe Warm Reset can be triggered by PERST# signal and more kernel drivers can internally control PERST#. Just there is no kernel API and therefore PCIe Warm Reset nor PERST# signal is unified. > + { 0 }, > +}; > + > int isa_dma_bridge_buggy; > EXPORT_SYMBOL(isa_dma_bridge_buggy); > > @@ -5080,71 +5100,59 @@ static void pci_dev_restore(struct pci_dev *dev) > */ > int __pci_reset_function_locked(struct pci_dev *dev) > { > - int rc; > + int i, rc = -ENOTTY; > + const struct pci_reset_fn_method *reset; > > might_sleep(); > > - /* > - * A reset method returns -ENOTTY if it doesn't support this device > - * and we should try the next method. > - * > - * If it returns 0 (success), we're finished. If it returns any > - * other error, we're also finished: this indicates that further > - * reset mechanisms might be broken on the device. > - */ > - rc = pci_dev_specific_reset(dev, 0); > - if (rc != -ENOTTY) > - return rc; > - rc = pcie_flr(dev, 0); > - if (rc != -ENOTTY) > - return rc; > - rc = pci_af_flr(dev, 0); > - if (rc != -ENOTTY) > - return rc; > - rc = pci_pm_reset(dev, 0); > - if (rc != -ENOTTY) > - return rc; > - rc = pci_dev_reset_slot_function(dev, 0); > - if (rc != -ENOTTY) > - return rc; > - return pci_parent_bus_reset(dev, 0); > + for (i = 0, reset = pci_reset_fn_methods; reset->reset_fn; i++, > reset++) { > + if (!(dev->reset_methods & (1 << i))) > + continue; > + > + /* > + * A reset method returns -ENOTTY if it doesn't support this > device > + * and we should try the next method. > + * > + * If it returns 0 (success), we're finished. If it returns any > + * other error, we're also finished: this indicates that further > + * reset mechanisms might be broken on the device. > + */ > + rc = reset->reset_fn(dev, 0); > + if (rc != -ENOTTY) > + return rc; > + } > + return rc; > } > EXPORT_SYMBOL_GPL(__pci_reset_function_locked); > > /** > - * pci_probe_reset_function - check whether the device can be safely reset > - * @dev: PCI device to reset > + * pci_init_reset_methods - check whether device can be safely reset > + * and store supported reset mechanisms. > + * @dev: PCI device to check for reset mechanisms > * > * Some devices allow an individual function to be reset without affecting > * other functions in the same device. The PCI device must be responsive > - * to PCI config space in order to use this function. > + * to reads and writes to its PCI config space in order to use this function. > * > - * Returns 0 if the device function can be reset or negative if the > - * device doesn't support resetting a single function. > + * Stores reset mechanisms supported by device in reset_methods bitmap > + * field of struct pci_dev
Re: RFC: sysfs node for Secondary PCI bus reset (PCIe Hot Reset)
On Thursday 11 March 2021 19:53:26 Vidya Sagar wrote: > On 3/3/2021 11:26 PM, Pali Rohár wrote: > > External email: Use caution opening links or attachments > > > > > > On Tuesday 02 March 2021 12:58:29 Alex Williamson wrote: > > > On Mon, 1 Mar 2021 14:28:17 -0600 > > > Bjorn Helgaas wrote: > > > > > > > [+cc Alex, reset expert] > > > > > > > > On Mon, Mar 01, 2021 at 06:12:21PM +0100, Pali Rohár wrote: > > > > > Hello! > > > > > > > > > > PCIe card can be reset via in-band Hot Reset signal which can be > > > > > triggered by PCIe bridge via Secondary Bus Reset bit in PCI config > > > > > space. > > > > > > > > > > Kernel already exports sysfs node "reset" for triggering Functional > > > > > Reset of particular function of PCI device. But in some cases > > > > > Functional > > > > > Reset is not enough and Hot Reset is required. > > > > > > > > > > Following RFC patch exports sysfs node "reset_bus" for PCI bridges > > > > > which > > > > > triggers Secondary Bus Reset and therefore for PCIe bridges it resets > > > > > connected PCIe card. > > > > > > > > > > What do you think about it? > > > > > > > > > > Currently there is userspace script which can trigger PCIe Hot Reset > > > > > by > > > > > modifying PCI config space from userspace: > > > > > > > > > > https://alexforencich.com/wiki/en/pcie/hot-reset-linux > > > > > > > > > > But because kernel already provides way how to trigger Functional > > > > > Reset > > > > > it could provide also way how to trigger PCIe Hot Reset. > > > > > > What that script does and what this does, or what the existing reset > > > attribute does, are very different. The script finds the upstream > > > bridge for a given device, removes the device (ignoring that more than > > > one device might be affected by the bus reset), uses setpci to trigger > > > a secondary bus reset, then rescans devices. The below only triggers > > > the secondary bus reset, neither saving and restoring affected device > > > state like the existing function level reset attribute, nor removing > > > and rescanning as the script does. It simply leaves an entire > > > hierarchy of PCI devices entirely un-programmed yet still has struct > > > pci_devs attached to them for untold future misery. > > > > > > In fact, for the case of a single device affected by the bus reset, as > > > intended by the script, the existing reset attribute will already do > > > that if the device supports no other reset mechanism. There's actually > > > a running LFX mentorship project that aims to allow the user to control > > > the type of reset performed by the existing reset attribute such that a > > > user could force the bus reset behavior over other reset methods. > > > > Hello Alex? Do you have a link for this "reset" project? I'm interesting > > in it as I'm dealing with Compex wifi cards which are causing problems. > > > > For correct initialization I need to issue PCIe Warm Reset for these > > cards (Warm Reset is done via PERST# pin which most linux controller > > drivers controls via GPIO subsystem). And for now there is no way to > > trigger PCIe Warm Reset for particular PCIe device from userspace. As > > there is no userspace <--> kernel API for it. > > > > > There might be some justification for an attribute that actually > > > implements the referenced script correctly, perhaps in kernel we could > > > avoid races with bus rescans, but simply triggering an SBR to quietly > > > de-program all downstream devices with no state restore or device > > > rescan is not it. Any affected device would be unusable. Was this > > > tested? Thanks, > > > > I have tested my change. First I called 'remove' attribute for PCIe > > card, then I called this 'bus_reset' on parent PCIe bridge and later I > > called 'rescan' attribute on bridge. It correctly rested tested ath9k > > card. So I did something similar as in above script. But I agree that > > there are race conditions and basically lot of other calls needs to be > > done to restore state. > > > > So I see that to make it 'usable' we need to do it automatically in > > kernel and also re
Re: [v8,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote: > +static int mtk_pcie_startup_port(struct mtk_pcie_port *port) > +{ ... > + > + /* Delay 100ms to wait the reference clocks become stable */ > + msleep(100); > + > + /* De-assert PERST# signal */ > + val &= ~PCIE_PE_RSTB; > + writel_relaxed(val, port->base + PCIE_RST_CTRL_REG); Hello! This is a new driver which introduce yet another custom timeout prior PERST# signal for PCIe card is de-asserted. Timeouts for other drivers I collected in older email [2]. Please look at my email [1] about PCIe Warm Reset if you have any clue about it. Lorenzo and Rob already expressed that this timeout should not be driver specific. But nobody was able to "decode" and "understand" PCIe spec yet about these timeouts. > + > + /* Check if the link is up or not */ > + err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val, > + !!(val & PCIE_PORT_LINKUP), 20, > + 50 * USEC_PER_MSEC); IIRC, you need to wait at least 100ms after de-asserting PERST# signal as it is required by PCIe specs and also because experiments proved that some Compex wifi cards (e.g. WLE900VX) are not detected if you do not wait this minimal time. > + if (err) { > + val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG); > + dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val); > + return err; > + } [1] - https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/ [2] - https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/
Re: [v8,5/7] PCI: mediatek-gen3: Add MSI support
On Wednesday 24 February 2021 14:11:30 Jianjun Wang wrote: > +static int mtk_msi_bottom_domain_alloc(struct irq_domain *domain, > +unsigned int virq, unsigned int nr_irqs, > +void *arg) > +{ > + struct mtk_pcie_port *port = domain->host_data; > + struct mtk_msi_set *msi_set; > + int i, hwirq, set_idx; > + > + mutex_lock(>lock); > + > + hwirq = bitmap_find_free_region(port->msi_irq_in_use, PCIE_MSI_IRQS_NUM, > + order_base_2(nr_irqs)); > + > + mutex_unlock(>lock); > + > + if (hwirq < 0) > + return -ENOSPC; > + > + set_idx = hwirq / PCIE_MSI_IRQS_PER_SET; > + msi_set = >msi_sets[set_idx]; > + > + for (i = 0; i < nr_irqs; i++) > + irq_domain_set_info(domain, virq + i, hwirq + i, > + _msi_bottom_irq_chip, msi_set, > + handle_edge_irq, NULL, NULL); > + > + return 0; > +} > + > +static void mtk_msi_bottom_domain_free(struct irq_domain *domain, > +unsigned int virq, unsigned int nr_irqs) > +{ > + struct mtk_pcie_port *port = domain->host_data; > + struct irq_data *data = irq_domain_get_irq_data(domain, virq); > + > + mutex_lock(>lock); > + > + bitmap_clear(port->msi_irq_in_use, data->hwirq, nr_irqs); Marc, should not be there bitmap_release_region() with order_base_2()? bitmap_release_region(port->msi_irq_in_use, data->hwirq, order_base_2(nr_irqs)); Because mtk_msi_bottom_domain_alloc() is allocating order_base_2(nr_irqs) interrupts, not only nr_irqs. > + > + mutex_unlock(>lock); > + > + irq_domain_free_irqs_common(domain, virq, nr_irqs); > +}
How long should be PCIe card in Warm Reset state?
Hello! I would like to open a question about PCIe Warm Reset. Warm Reset of PCIe card is triggered by asserting PERST# signal and in most cases PERST# signal is controlled by GPIO. Basically every native Linux PCIe controller driver is doing this Warm Reset of connected PCIe card during native driver initialization procedure. And now the important question is: How long should be PCIe card in Warm Reset state? After which timeout can be PERST# signal de-asserted by Linux controller driver? Lorenzo and Rob already expressed concerns [1] [2] that this Warm Reset timeout should not be driver specific and I agree with them. I have done investigation which timeout is using which native PCIe driver [3] and basically every driver is using different timeout. I have tried to find timeouts in PCIe specifications, I was not able to understand and deduce correct timeout value for Warm Reset from PCIe specifications. What I have found is written in my email [4]. Alex (as a "reset expert"), could you look at this issue? Or is there somebody else who understand PCIe specifications and PCIe diagrams to figure out what is the minimal timeout for de-asserting PERST# signal? There are still some issues with WiFi cards (e.g. Compex one) which sometimes do not appear on PCIe bus. And based on these "reset timeout differences" in Linux PCIe controller drivers, I suspect that it is not (only) the problems in WiFi cards but also in Linux PCIe controller drivers. In my email [3] I have written that I figured out that WLE1216 card needs to be in Warm Reset state for at least 10ms, otherwise card is not detected. [1] - https://lore.kernel.org/linux-pci/20200513115940.fiemtnxfqcyqo6ik@pali/ [2] - https://lore.kernel.org/linux-pci/20200507212002.GA32182@bogus/ [3] - https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/ [4] - https://lore.kernel.org/linux-pci/20200430082245.xblvb7xeamm4e336@pali/
Re: [PATCH AUTOSEL 5.11 16/67] net: sfp: add mode quirk for GPON module Ubiquiti U-Fiber Instant
On Thursday 04 March 2021 17:33:01 Sasha Levin wrote: > On Thu, Feb 25, 2021 at 08:03:06PM +0100, Pali Rohár wrote: > > On Wednesday 24 February 2021 07:49:34 Sasha Levin wrote: > > > From: Pali Rohár > > > > > > [ Upstream commit f0b4f847673299577c29b71d3f3acd3c313d81b7 ] > > > > Hello! This commit requires also commit~1 from that patch series: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=426c6cbc409cbda9ab1a9dbf15d3c2ef947eb8c1 > > > > Without it kernel cannot read EEPROM from Ubiquiti U-Fiber Instant > > module and therefore the hook based on EEPROM data which is below would > > not be applied. > > Looks like that commit is already in, thanks! Yes! Now I see that commit in 5.11 queue. So 5.11 would be OK. But I do not see it in 5.10 queue. In 5.10 queue I see only backport of f0b4f8476732 commit. 426c6cbc409c seems to be still missing. Could you check it?
Re: RFC: sysfs node for Secondary PCI bus reset (PCIe Hot Reset)
On Tuesday 02 March 2021 12:58:29 Alex Williamson wrote: > On Mon, 1 Mar 2021 14:28:17 -0600 > Bjorn Helgaas wrote: > > > [+cc Alex, reset expert] > > > > On Mon, Mar 01, 2021 at 06:12:21PM +0100, Pali Rohár wrote: > > > Hello! > > > > > > PCIe card can be reset via in-band Hot Reset signal which can be > > > triggered by PCIe bridge via Secondary Bus Reset bit in PCI config > > > space. > > > > > > Kernel already exports sysfs node "reset" for triggering Functional > > > Reset of particular function of PCI device. But in some cases Functional > > > Reset is not enough and Hot Reset is required. > > > > > > Following RFC patch exports sysfs node "reset_bus" for PCI bridges which > > > triggers Secondary Bus Reset and therefore for PCIe bridges it resets > > > connected PCIe card. > > > > > > What do you think about it? > > > > > > Currently there is userspace script which can trigger PCIe Hot Reset by > > > modifying PCI config space from userspace: > > > > > > https://alexforencich.com/wiki/en/pcie/hot-reset-linux > > > > > > But because kernel already provides way how to trigger Functional Reset > > > it could provide also way how to trigger PCIe Hot Reset. > > What that script does and what this does, or what the existing reset > attribute does, are very different. The script finds the upstream > bridge for a given device, removes the device (ignoring that more than > one device might be affected by the bus reset), uses setpci to trigger > a secondary bus reset, then rescans devices. The below only triggers > the secondary bus reset, neither saving and restoring affected device > state like the existing function level reset attribute, nor removing > and rescanning as the script does. It simply leaves an entire > hierarchy of PCI devices entirely un-programmed yet still has struct > pci_devs attached to them for untold future misery. > > In fact, for the case of a single device affected by the bus reset, as > intended by the script, the existing reset attribute will already do > that if the device supports no other reset mechanism. There's actually > a running LFX mentorship project that aims to allow the user to control > the type of reset performed by the existing reset attribute such that a > user could force the bus reset behavior over other reset methods. Hello Alex? Do you have a link for this "reset" project? I'm interesting in it as I'm dealing with Compex wifi cards which are causing problems. For correct initialization I need to issue PCIe Warm Reset for these cards (Warm Reset is done via PERST# pin which most linux controller drivers controls via GPIO subsystem). And for now there is no way to trigger PCIe Warm Reset for particular PCIe device from userspace. As there is no userspace <--> kernel API for it. > There might be some justification for an attribute that actually > implements the referenced script correctly, perhaps in kernel we could > avoid races with bus rescans, but simply triggering an SBR to quietly > de-program all downstream devices with no state restore or device > rescan is not it. Any affected device would be unusable. Was this > tested? Thanks, I have tested my change. First I called 'remove' attribute for PCIe card, then I called this 'bus_reset' on parent PCIe bridge and later I called 'rescan' attribute on bridge. It correctly rested tested ath9k card. So I did something similar as in above script. But I agree that there are race conditions and basically lot of other calls needs to be done to restore state. So I see that to make it 'usable' we need to do it automatically in kernel and also rescan/restore state of PCIe devices behind bridge after reset... > Alex > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > > index 50fcb62d59b5..f5e11c589498 100644 > > > --- a/drivers/pci/pci-sysfs.c > > > +++ b/drivers/pci/pci-sysfs.c > > > @@ -1321,6 +1321,30 @@ static ssize_t reset_store(struct device *dev, > > > struct device_attribute *attr, > > > > > > static DEVICE_ATTR(reset, 0200, NULL, reset_store); > > > > > > +static ssize_t reset_bus_store(struct device *dev, struct > > > device_attribute *attr, > > > +const char *buf, size_t count) > > > +{ > > > + struct pci_dev *pdev = to_pci_dev(dev); > > > + unsigned long val; > > > + ssize_t result = kstrtoul(buf, 0, ); > > > + > > > + if (result < 0) > > > + return result; > > > + > > > + if
[PATCH] PCI: iproc: Fix return value of iproc_msi_irq_domain_alloc()
IRQ domain alloc function should return zero on success. Non-zero value indicates failure. Signed-off-by: Pali Rohár Fixes: fc54bae28818 ("PCI: iproc: Allow allocation of multiple MSIs") --- drivers/pci/controller/pcie-iproc-msi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/controller/pcie-iproc-msi.c b/drivers/pci/controller/pcie-iproc-msi.c index 908475d27e0e..eede4e8f3f75 100644 --- a/drivers/pci/controller/pcie-iproc-msi.c +++ b/drivers/pci/controller/pcie-iproc-msi.c @@ -271,7 +271,7 @@ static int iproc_msi_irq_domain_alloc(struct irq_domain *domain, NULL, NULL); } - return hwirq; + return 0; } static void iproc_msi_irq_domain_free(struct irq_domain *domain, -- 2.20.1
Re: [PATCH mvebu v3 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
Hello Gregory! Patches are the for almost two months and more people have tested them. They are marked with Fixed/CC-stable tags, they should go also into stable trees as they are fixing CPU scaling and instability issues. Are there any issues with these patches? If not, could you please merge them for upcoming Linux version? On Monday 22 February 2021 20:41:48 Pali Rohár wrote: > Hello! > > This is third version of patches for Armada 37xx cpufreq driver which > fix CPU scaling with 1 GHz base frequency. > > The only change in this third version is modified patch 04/10 with fixes > for 1.2 GHz variant of Espressobin. Minimal CPU voltage in L1 load for > 1.2 GHz variant was increased to 1.155V. > > Patches are now rebased on top of the kernel version 5.11 with all > collected Acked-by/Tested-by lines and are available also in my git > tree in branch a3720-cpufreq-issues: > > https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=a3720-cpufreq-issues > > If you have other Armada 3720 boards with 1.2 GHz CPU, including > Espressobin V7, let us know if it is working fine for you. > > Marek & Pali > > Marek Behún (3): > arm64: dts: marvell: armada-37xx: add syscon compatible to NB clk node > cpufreq: armada-37xx: Fix setting TBG parent for load levels > clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM > clock > > Pali Rohár (7): > cpufreq: armada-37xx: Fix the AVS value for load L1 > clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to > 1 GHz > clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 > to L0 > cpufreq: armada-37xx: Fix driver cleanup when registration failed > cpufreq: armada-37xx: Fix determining base CPU frequency > cpufreq: armada-37xx: Remove cur_frequency variable > cpufreq: armada-37xx: Fix module unloading > > arch/arm64/boot/dts/marvell/armada-37xx.dtsi | 3 +- > drivers/clk/mvebu/armada-37xx-periph.c | 83 +++--- > drivers/cpufreq/armada-37xx-cpufreq.c| 111 +++ > 3 files changed, 135 insertions(+), 62 deletions(-) > > -- > 2.20.1 >
RFC: sysfs node for Secondary PCI bus reset (PCIe Hot Reset)
Hello! PCIe card can be reset via in-band Hot Reset signal which can be triggered by PCIe bridge via Secondary Bus Reset bit in PCI config space. Kernel already exports sysfs node "reset" for triggering Functional Reset of particular function of PCI device. But in some cases Functional Reset is not enough and Hot Reset is required. Following RFC patch exports sysfs node "reset_bus" for PCI bridges which triggers Secondary Bus Reset and therefore for PCIe bridges it resets connected PCIe card. What do you think about it? Currently there is userspace script which can trigger PCIe Hot Reset by modifying PCI config space from userspace: https://alexforencich.com/wiki/en/pcie/hot-reset-linux But because kernel already provides way how to trigger Functional Reset it could provide also way how to trigger PCIe Hot Reset. diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index 50fcb62d59b5..f5e11c589498 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -1321,6 +1321,30 @@ static ssize_t reset_store(struct device *dev, struct device_attribute *attr, static DEVICE_ATTR(reset, 0200, NULL, reset_store); +static ssize_t reset_bus_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct pci_dev *pdev = to_pci_dev(dev); + unsigned long val; + ssize_t result = kstrtoul(buf, 0, ); + + if (result < 0) + return result; + + if (val != 1) + return -EINVAL; + + pm_runtime_get_sync(dev); + result = pci_bridge_secondary_bus_reset(pdev); + pm_runtime_put(dev); + if (result < 0) + return result; + + return count; +} + +static DEVICE_ATTR(reset_bus, 0200, NULL, reset_bus_store); + static int pci_create_capabilities_sysfs(struct pci_dev *dev) { int retval; @@ -1332,8 +1356,15 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev) if (retval) goto error; } + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + retval = device_create_file(>dev, _attr_reset_bus); + if (retval) + goto error_reset_bus; + } return 0; +error_reset_bus: + device_remove_file(>dev, _attr_reset); error: pcie_vpd_remove_sysfs_dev_files(dev); return retval; @@ -1414,6 +1445,8 @@ static void pci_remove_capabilities_sysfs(struct pci_dev *dev) device_remove_file(>dev, _attr_reset); dev->reset_fn = 0; } + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + device_remove_file(>dev, _attr_reset_bus); } /**
Re: [PATCH AUTOSEL 5.11 16/67] net: sfp: add mode quirk for GPON module Ubiquiti U-Fiber Instant
On Wednesday 24 February 2021 07:49:34 Sasha Levin wrote: > From: Pali Rohár > > [ Upstream commit f0b4f847673299577c29b71d3f3acd3c313d81b7 ] Hello! This commit requires also commit~1 from that patch series: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=426c6cbc409cbda9ab1a9dbf15d3c2ef947eb8c1 Without it kernel cannot read EEPROM from Ubiquiti U-Fiber Instant module and therefore the hook based on EEPROM data which is below would not be applied. > The Ubiquiti U-Fiber Instant SFP GPON module has nonsensical information > stored in its EEPROM. It claims to support all transceiver types including > 10G Ethernet. Clear all claimed modes and set only 1000baseX_Full, which is > the only one supported. > > This module has also phys_id set to SFF, and the SFP subsystem currently > does not allow to use SFP modules detected as SFFs. Add exception for this > module so it can be detected as supported. > > This change finally allows to detect and use SFP GPON module Ubiquiti > U-Fiber Instant on Linux system. > > EEPROM content of this SFP module is (where XX is serial number): > > 00: 02 04 0b ff ff ff ff ff ff ff ff 03 0c 00 14 c8?????.?? > 10: 00 00 00 00 55 42 4e 54 20 20 20 20 20 20 20 20UBNT > 20: 20 20 20 20 00 18 e8 29 55 46 2d 49 4e 53 54 41.??)UF-INSTA > 30: 4e 54 20 20 20 20 20 20 34 20 20 20 05 1e 00 36NT 4 ??.6 > 40: 00 06 00 00 55 42 4e 54 XX XX XX XX XX XX XX XX.?..UBNT > 50: 20 20 20 20 31 34 30 31 32 33 20 20 60 80 02 41140123 `??A > > Signed-off-by: Pali Rohár > Signed-off-by: Jakub Kicinski > Signed-off-by: Sasha Levin > --- > drivers/net/phy/sfp-bus.c | 15 +++ > drivers/net/phy/sfp.c | 17 +++-- > 2 files changed, 30 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c > index 20b91f5dfc6ed..4cf874fb5c5b4 100644 > --- a/drivers/net/phy/sfp-bus.c > +++ b/drivers/net/phy/sfp-bus.c > @@ -44,6 +44,17 @@ static void sfp_quirk_2500basex(const struct sfp_eeprom_id > *id, > phylink_set(modes, 2500baseX_Full); > } > > +static void sfp_quirk_ubnt_uf_instant(const struct sfp_eeprom_id *id, > + unsigned long *modes) > +{ > + /* Ubiquiti U-Fiber Instant module claims that support all transceiver > + * types including 10G Ethernet which is not truth. So clear all claimed > + * modes and set only one mode which module supports: 1000baseX_Full. > + */ > + phylink_zero(modes); > + phylink_set(modes, 1000baseX_Full); > +} > + > static const struct sfp_quirk sfp_quirks[] = { > { > // Alcatel Lucent G-010S-P can operate at 2500base-X, but > @@ -63,6 +74,10 @@ static const struct sfp_quirk sfp_quirks[] = { > .vendor = "HUAWEI", > .part = "MA5671A", > .modes = sfp_quirk_2500basex, > + }, { > + .vendor = "UBNT", > + .part = "UF-INSTANT", > + .modes = sfp_quirk_ubnt_uf_instant, > }, > }; > > diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c > index 91d74c1a920ab..804295ad8a044 100644 > --- a/drivers/net/phy/sfp.c > +++ b/drivers/net/phy/sfp.c > @@ -273,8 +273,21 @@ static const struct sff_data sff_data = { > > static bool sfp_module_supported(const struct sfp_eeprom_id *id) > { > - return id->base.phys_id == SFF8024_ID_SFP && > -id->base.phys_ext_id == SFP_PHYS_EXT_ID_SFP; > + if (id->base.phys_id == SFF8024_ID_SFP && > + id->base.phys_ext_id == SFP_PHYS_EXT_ID_SFP) > + return true; > + > + /* SFP GPON module Ubiquiti U-Fiber Instant has in its EEPROM stored > + * phys id SFF instead of SFP. Therefore mark this module explicitly > + * as supported based on vendor name and pn match. > + */ > + if (id->base.phys_id == SFF8024_ID_SFF_8472 && > + id->base.phys_ext_id == SFP_PHYS_EXT_ID_SFP && > + !memcmp(id->base.vendor_name, "UBNT", 16) && > + !memcmp(id->base.vendor_pn, "UF-INSTANT ", 16)) > + return true; > + > + return false; > } > > static const struct sff_data sfp_data = { > -- > 2.27.0 > On Wednesday 24 February 2021 07:51:28 Sasha Levin wrote: > From: Pali Rohár > > [ Upstream commit f0b4f847673299577c29b71d3f3acd3c313d81b7 ] > > The Ubiquiti U-Fiber Instant SFP GPON module has nonsensical information > stored in its EEPROM. It claims to support all transceiver types including > 10G Eth
Re: [PATCH 1/3] power: supply: bq27xxx: fix sign of current_now for newer ICs
On Tuesday 23 February 2021 15:49:46 Andreas Kemnade wrote: > On Tue, 23 Feb 2021 15:11:20 +0100 > Matthias Schiffer wrote: > > > Commit cd060b4d0868 ("power: supply: bq27xxx: fix polarity of current_now") > > changed the sign of current_now for all bq27xxx variants, but on BQ28Z610 > > I'm now seeing negated values *with* that patch. > > > > The GTA04/Openmoko device that was used for testing uses a BQ27000 or > > BQ27010 IC, so I assume only the BQ27XXX_O_ZERO code path was incorrect. > > Revert the behaviour for newer ICs. > > > > Fixes: cd060b4d0868 "power: supply: bq27xxx: fix polarity of current_now" > > Signed-off-by: Matthias Schiffer > > --- > > > > @Andreas Kemnade: It would be great to get a confirmation that the > > Openmoko battery indeed uses BQ27000/BQ27010 - I was having some trouble > > finding that information. > > > I can confirm that. > here is the corresponding schematic: > > http://people.openmoko.org/tony_tu/GTA02/hardware/GTA02/CT-GTA02.pdf > > Regards, > Andreas Nokia N900 has BQ27200 connected via i2c. CCing Maemo mailing list maemo-le...@lists.dyne.org
[PATCH mvebu v3 10/10] cpufreq: armada-37xx: Fix module unloading
This driver is missing module_exit hook. Add proper driver exit function which unregisters the platform device and cleans up the data. Signed-off-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares --- drivers/cpufreq/armada-37xx-cpufreq.c | 25 + 1 file changed, 25 insertions(+) diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index 050abff18308..3fc98a3ffd91 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -86,6 +86,8 @@ static int avs_map[] = { }; struct armada37xx_cpufreq_state { + struct platform_device *pdev; + struct device *cpu_dev; struct regmap *regmap; u32 nb_l0l1; u32 nb_l2l3; @@ -506,6 +508,9 @@ static int __init armada37xx_cpufreq_driver_init(void) if (ret) goto disable_dvfs; + armada37xx_cpufreq_state->cpu_dev = cpu_dev; + armada37xx_cpufreq_state->pdev = pdev; + platform_set_drvdata(pdev, dvfs); return 0; disable_dvfs: @@ -524,6 +529,26 @@ static int __init armada37xx_cpufreq_driver_init(void) /* late_initcall, to guarantee the driver is loaded after A37xx clock driver */ late_initcall(armada37xx_cpufreq_driver_init); +static void __exit armada37xx_cpufreq_driver_exit(void) +{ + struct platform_device *pdev = armada37xx_cpufreq_state->pdev; + struct armada_37xx_dvfs *dvfs = platform_get_drvdata(pdev); + unsigned long freq; + int load_lvl; + + platform_device_unregister(pdev); + + armada37xx_cpufreq_disable_dvfs(armada37xx_cpufreq_state->regmap); + + for (load_lvl = ARMADA_37XX_DVFS_LOAD_0; load_lvl < LOAD_LEVEL_NR; load_lvl++) { + freq = dvfs->cpu_freq_max / dvfs->divider[load_lvl]; + dev_pm_opp_remove(armada37xx_cpufreq_state->cpu_dev, freq); + } + + kfree(armada37xx_cpufreq_state); +} +module_exit(armada37xx_cpufreq_driver_exit); + static const struct of_device_id __maybe_unused armada37xx_cpufreq_of_match[] = { { .compatible = "marvell,armada-3700-nb-pm" }, { }, -- 2.20.1
[PATCH mvebu v3 08/10] cpufreq: armada-37xx: Fix determining base CPU frequency
When current CPU load is not L0 then loading armada-37xx-cpufreq.ko driver fails with following error: # modprobe armada-37xx-cpufreq [ 502.702097] Unsupported CPU frequency 250 MHz This issue was partially fixed by commit 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp"), but only for calculating CPU frequency for opp. Fix this also for determination of base CPU frequency. Signed-off-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares Fixes: 92ce45fb875d ("cpufreq: Add DVFS support for Armada 37xx") Cc: sta...@vger.kernel.org # 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp") --- drivers/cpufreq/armada-37xx-cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index 1ab2113daef5..e4782f562e7a 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -469,7 +469,7 @@ static int __init armada37xx_cpufreq_driver_init(void) return -EINVAL; } - dvfs = armada_37xx_cpu_freq_info_get(cur_frequency); + dvfs = armada_37xx_cpu_freq_info_get(base_frequency); if (!dvfs) { clk_put(clk); return -EINVAL; -- 2.20.1
[PATCH mvebu v3 07/10] cpufreq: armada-37xx: Fix driver cleanup when registration failed
Commit 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp") changed calculation of frequency passed to the dev_pm_opp_add() function call. But the code for dev_pm_opp_remove() function call was not updated, so the driver cleanup phase does not work when registration fails. This fixes the issue by using the same frequency in both calls. Signed-off-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares Fixes: 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp") Cc: sta...@vger.kernel.org --- drivers/cpufreq/armada-37xx-cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index c7683d447b11..1ab2113daef5 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -521,7 +521,7 @@ static int __init armada37xx_cpufreq_driver_init(void) remove_opp: /* clean-up the already added opp before leaving */ while (load_lvl-- > ARMADA_37XX_DVFS_LOAD_0) { - freq = cur_frequency / dvfs->divider[load_lvl]; + freq = base_frequency / dvfs->divider[load_lvl]; dev_pm_opp_remove(cpu_dev, freq); } -- 2.20.1
[PATCH mvebu v3 06/10] clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0
When CPU frequency is at 250 MHz and set_rate() is called with 500 MHz (L1) quickly followed by a call with 1 GHz (L0), the CPU does not necessarily stay in L1 for at least 20ms as is required by Marvell errata. This situation happens frequently with the ondemand cpufreq governor and can be also reproduced with userspace governor. In most cases it causes CPU to crash. This change fixes the above issue and ensures that the CPU always stays in L1 for at least 20ms when switching from any state to L0. Signed-off-by: Marek Behún Signed-off-by: Pali Rohár Acked-by: Stephen Boyd Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares Fixes: 61c40f35f5cd ("clk: mvebu: armada-37xx-periph: Fix switching CPU rate from 300Mhz to 1.2GHz") Cc: sta...@vger.kernel.org --- drivers/clk/mvebu/armada-37xx-periph.c | 45 ++ 1 file changed, 39 insertions(+), 6 deletions(-) diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c index b15e177bea7e..32ac6b6b7530 100644 --- a/drivers/clk/mvebu/armada-37xx-periph.c +++ b/drivers/clk/mvebu/armada-37xx-periph.c @@ -84,6 +84,7 @@ struct clk_pm_cpu { void __iomem *reg_div; u8 shift_div; struct regmap *nb_pm_base; + unsigned long l1_expiration; }; #define to_clk_double_div(_hw) container_of(_hw, struct clk_double_div, hw) @@ -504,22 +505,52 @@ static long clk_pm_cpu_round_rate(struct clk_hw *hw, unsigned long rate, * 2. Sleep 20ms for stabling VDD voltage * 3. Then switch from L1 (500/600 MHz) to L0 (1000/1200 MHz). */ -static void clk_pm_cpu_set_rate_wa(unsigned long rate, struct regmap *base) +static void clk_pm_cpu_set_rate_wa(struct clk_pm_cpu *pm_cpu, + unsigned int new_level, unsigned long rate, + struct regmap *base) { unsigned int cur_level; - if (rate < 1000 * 1000 * 1000) - return; - regmap_read(base, ARMADA_37XX_NB_CPU_LOAD, _level); cur_level &= ARMADA_37XX_NB_CPU_LOAD_MASK; - if (cur_level <= ARMADA_37XX_DVFS_LOAD_1) + + if (cur_level == new_level) + return; + + /* +* System wants to go to L1 on its own. If we are going from L2/L3, +* remember when 20ms will expire. If from L0, set the value so that +* next switch to L0 won't have to wait. +*/ + if (new_level == ARMADA_37XX_DVFS_LOAD_1) { + if (cur_level == ARMADA_37XX_DVFS_LOAD_0) + pm_cpu->l1_expiration = jiffies; + else + pm_cpu->l1_expiration = jiffies + msecs_to_jiffies(20); return; + } + + /* +* If we are setting to L2/L3, just invalidate L1 expiration time, +* sleeping is not needed. +*/ + if (rate < 1000*1000*1000) + goto invalidate_l1_exp; + + /* +* We are going to L0 with rate >= 1GHz. Check whether we have been at +* L1 for long enough time. If not, go to L1 for 20ms. +*/ + if (pm_cpu->l1_expiration && jiffies >= pm_cpu->l1_expiration) + goto invalidate_l1_exp; regmap_update_bits(base, ARMADA_37XX_NB_CPU_LOAD, ARMADA_37XX_NB_CPU_LOAD_MASK, ARMADA_37XX_DVFS_LOAD_1); msleep(20); + +invalidate_l1_exp: + pm_cpu->l1_expiration = 0; } static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate, @@ -553,7 +584,9 @@ static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate, reg = ARMADA_37XX_NB_CPU_LOAD; mask = ARMADA_37XX_NB_CPU_LOAD_MASK; - clk_pm_cpu_set_rate_wa(rate, base); + /* Apply workaround when base CPU frequency is 1000 or 1200 MHz */ + if (parent_rate >= 1000*1000*1000) + clk_pm_cpu_set_rate_wa(pm_cpu, load_level, rate, base); regmap_update_bits(base, reg, mask, load_level); -- 2.20.1
[PATCH mvebu v3 09/10] cpufreq: armada-37xx: Remove cur_frequency variable
Variable cur_frequency in armada37xx_cpufreq_driver_init() is unused. Signed-off-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares --- drivers/cpufreq/armada-37xx-cpufreq.c | 10 +- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index e4782f562e7a..050abff18308 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -400,7 +400,7 @@ static int __init armada37xx_cpufreq_driver_init(void) struct armada_37xx_dvfs *dvfs; struct platform_device *pdev; unsigned long freq; - unsigned int cur_frequency, base_frequency; + unsigned int base_frequency; struct regmap *nb_clk_base, *nb_pm_base, *avs_base; struct device *cpu_dev; int load_lvl, ret; @@ -461,14 +461,6 @@ static int __init armada37xx_cpufreq_driver_init(void) return -EINVAL; } - /* Get nominal (current) CPU frequency */ - cur_frequency = clk_get_rate(clk); - if (!cur_frequency) { - dev_err(cpu_dev, "Failed to get clock rate for CPU\n"); - clk_put(clk); - return -EINVAL; - } - dvfs = armada_37xx_cpu_freq_info_get(base_frequency); if (!dvfs) { clk_put(clk); -- 2.20.1
[PATCH mvebu v3 05/10] clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz
It was observed that the workaround introduced by commit 61c40f35f5cd ("clk: mvebu: armada-37xx-periph: Fix switching CPU rate from 300Mhz to 1.2GHz") when base CPU frequency is 1.2 GHz is also required when base CPU frequency is 1 GHz. Otherwise switching CPU frequency directly from L2 (250 MHz) to L0 (1 GHz) causes a crash. When base CPU frequency is just 800 MHz no crashed were observed during switch from L2 to L0. Signed-off-by: Pali Rohár Acked-by: Stephen Boyd Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares Fixes: 2089dc33ea0e ("clk: mvebu: armada-37xx-periph: add DVFS support for cpu clocks") Cc: sta...@vger.kernel.org # 61c40f35f5cd ("clk: mvebu: armada-37xx-periph: Fix switching CPU rate from 300Mhz to 1.2GHz") --- drivers/clk/mvebu/armada-37xx-periph.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c index 6507bd2c5f31..b15e177bea7e 100644 --- a/drivers/clk/mvebu/armada-37xx-periph.c +++ b/drivers/clk/mvebu/armada-37xx-periph.c @@ -487,8 +487,10 @@ static long clk_pm_cpu_round_rate(struct clk_hw *hw, unsigned long rate, } /* - * Switching the CPU from the L2 or L3 frequencies (300 and 200 Mhz - * respectively) to L0 frequency (1.2 Ghz) requires a significant + * Workaround when base CPU frequnecy is 1000 or 1200 MHz + * + * Switching the CPU from the L2 or L3 frequencies (250/300 or 200 MHz + * respectively) to L0 frequency (1/1.2 GHz) requires a significant * amount of time to let VDD stabilize to the appropriate * voltage. This amount of time is large enough that it cannot be * covered by the hardware countdown register. Due to this, the CPU @@ -498,15 +500,15 @@ static long clk_pm_cpu_round_rate(struct clk_hw *hw, unsigned long rate, * To work around this problem, we prevent switching directly from the * L2/L3 frequencies to the L0 frequency, and instead switch to the L1 * frequency in-between. The sequence therefore becomes: - * 1. First switch from L2/L3(200/300MHz) to L1(600MHZ) + * 1. First switch from L2/L3 (200/250/300 MHz) to L1 (500/600 MHz) * 2. Sleep 20ms for stabling VDD voltage - * 3. Then switch from L1(600MHZ) to L0(1200Mhz). + * 3. Then switch from L1 (500/600 MHz) to L0 (1000/1200 MHz). */ static void clk_pm_cpu_set_rate_wa(unsigned long rate, struct regmap *base) { unsigned int cur_level; - if (rate != 1200 * 1000 * 1000) + if (rate < 1000 * 1000 * 1000) return; regmap_read(base, ARMADA_37XX_NB_CPU_LOAD, _level); -- 2.20.1
[PATCH mvebu v3 03/10] clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock
From: Marek Behún Remove the .set_parent method in clk_pm_cpu_ops. This method was supposed to be needed by the armada-37xx-cpufreq driver, but was never actually called due to wrong assumptions in the cpufreq driver. After this was fixed in the cpufreq driver, this method is not needed anymore. Signed-off-by: Marek Behún Acked-by: Stephen Boyd Tested-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares Fixes: 2089dc33ea0e ("clk: mvebu: armada-37xx-periph: add DVFS support for cpu clocks") --- drivers/clk/mvebu/armada-37xx-periph.c | 28 -- 1 file changed, 28 deletions(-) diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c index f5746f9ea929..6507bd2c5f31 100644 --- a/drivers/clk/mvebu/armada-37xx-periph.c +++ b/drivers/clk/mvebu/armada-37xx-periph.c @@ -440,33 +440,6 @@ static u8 clk_pm_cpu_get_parent(struct clk_hw *hw) return val; } -static int clk_pm_cpu_set_parent(struct clk_hw *hw, u8 index) -{ - struct clk_pm_cpu *pm_cpu = to_clk_pm_cpu(hw); - struct regmap *base = pm_cpu->nb_pm_base; - int load_level; - - /* -* We set the clock parent only if the DVFS is available but -* not enabled. -*/ - if (IS_ERR(base) || armada_3700_pm_dvfs_is_enabled(base)) - return -EINVAL; - - /* Set the parent clock for all the load level */ - for (load_level = 0; load_level < LOAD_LEVEL_NR; load_level++) { - unsigned int reg, mask, val, - offset = ARMADA_37XX_NB_TBG_SEL_OFF; - - armada_3700_pm_dvfs_update_regs(load_level, , ); - - val = index << offset; - mask = ARMADA_37XX_NB_TBG_SEL_MASK << offset; - regmap_update_bits(base, reg, mask, val); - } - return 0; -} - static unsigned long clk_pm_cpu_recalc_rate(struct clk_hw *hw, unsigned long parent_rate) { @@ -592,7 +565,6 @@ static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate, static const struct clk_ops clk_pm_cpu_ops = { .get_parent = clk_pm_cpu_get_parent, - .set_parent = clk_pm_cpu_set_parent, .round_rate = clk_pm_cpu_round_rate, .set_rate = clk_pm_cpu_set_rate, .recalc_rate = clk_pm_cpu_recalc_rate, -- 2.20.1
[PATCH mvebu v3 02/10] cpufreq: armada-37xx: Fix setting TBG parent for load levels
From: Marek Behún With CPU frequency determining software [1] we have discovered that after this driver does one CPU frequency change, the base frequency of the CPU is set to the frequency of TBG-A-P clock, instead of the TBG that is parent to the CPU. This can be reproduced on EspressoBIN and Turris MOX: cd /sys/devices/system/cpu/cpufreq/policy0 echo powersave >scaling_governor echo performance >scaling_governor Running the mhz tool before this driver is loaded reports 1000 MHz, and after loading the driver and executing commands above the tool reports 800 MHz. The change of TBG clock selector is supposed to happen in function armada37xx_cpufreq_dvfs_setup. Before the function returns, it does this: parent = clk_get_parent(clk); clk_set_parent(clk, parent); The armada-37xx-periph clock driver has the .set_parent method implemented correctly for this, so if the method was actually called, this would work. But since the introduction of the common clock framework in commit b2476490ef11 ("clk: introduce the common clock..."), the clk_set_parent function checks whether the parent is actually changing, and if the requested new parent is same as the old parent (which is obviously the case for the code above), the .set_parent method is not called at all. This patch fixes this issue by filling the correct TBG clock selector directly in the armada37xx_cpufreq_dvfs_setup during the filling of other registers at the same address. But the determination of CPU TBG index cannot be done via the common clock framework, therefore we need to access the North Bridge Peripheral Clock registers directly in this driver. [1] https://github.com/wtarreau/mhz Signed-off-by: Marek Behún Tested-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares Fixes: 92ce45fb875d ("cpufreq: Add DVFS support for Armada 37xx") Cc: sta...@vger.kernel.org --- drivers/cpufreq/armada-37xx-cpufreq.c | 35 ++- 1 file changed, 23 insertions(+), 12 deletions(-) diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index b4af4094309b..b8dc6c849579 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -25,6 +25,10 @@ #include "cpufreq-dt.h" +/* Clk register set */ +#define ARMADA_37XX_CLK_TBG_SEL0 +#define ARMADA_37XX_CLK_TBG_SEL_CPU_OFF22 + /* Power management in North Bridge register set */ #define ARMADA_37XX_NB_L0L10x18 #define ARMADA_37XX_NB_L2L30x1C @@ -120,10 +124,15 @@ static struct armada_37xx_dvfs *armada_37xx_cpu_freq_info_get(u32 freq) * will be configured then the DVFS will be enabled. */ static void __init armada37xx_cpufreq_dvfs_setup(struct regmap *base, -struct clk *clk, u8 *divider) +struct regmap *clk_base, u8 *divider) { + u32 cpu_tbg_sel; int load_lvl; - struct clk *parent; + + /* Determine to which TBG clock is CPU connected */ + regmap_read(clk_base, ARMADA_37XX_CLK_TBG_SEL, _tbg_sel); + cpu_tbg_sel >>= ARMADA_37XX_CLK_TBG_SEL_CPU_OFF; + cpu_tbg_sel &= ARMADA_37XX_NB_TBG_SEL_MASK; for (load_lvl = 0; load_lvl < LOAD_LEVEL_NR; load_lvl++) { unsigned int reg, mask, val, offset = 0; @@ -142,6 +151,11 @@ static void __init armada37xx_cpufreq_dvfs_setup(struct regmap *base, mask = (ARMADA_37XX_NB_CLK_SEL_MASK << ARMADA_37XX_NB_CLK_SEL_OFF); + /* Set TBG index, for all levels we use the same TBG */ + val = cpu_tbg_sel << ARMADA_37XX_NB_TBG_SEL_OFF; + mask = (ARMADA_37XX_NB_TBG_SEL_MASK + << ARMADA_37XX_NB_TBG_SEL_OFF); + /* * Set cpu divider based on the pre-computed array in * order to have balanced step. @@ -160,14 +174,6 @@ static void __init armada37xx_cpufreq_dvfs_setup(struct regmap *base, regmap_update_bits(base, reg, mask, val); } - - /* -* Set cpu clock source, for all the level we keep the same -* clock source that the one already configured. For this one -* we need to use the clock framework -*/ - parent = clk_get_parent(clk); - clk_set_parent(clk, parent); } /* @@ -358,11 +364,16 @@ static int __init armada37xx_cpufreq_driver_init(void) struct platform_device *pdev; unsigned long freq; unsigned int cur_frequency, base_frequency; - struct regmap *nb_pm_base, *avs_base; + struct regmap *nb_clk_base, *nb_pm_base, *avs_base; struct device *cpu_dev; int load_lvl, ret; struct clk *clk, *parent; + nb_clk_base = + syscon_regmap_lookup_by_compatible(&qu
[PATCH mvebu v3 04/10] cpufreq: armada-37xx: Fix the AVS value for load L1
The original CPU voltage value for load L1 is too low for Armada 37xx SoC when base CPU frequency is 1000 or 1200 MHz. It leads to instabilities where CPU gets stuck soon after dynamic voltage scaling from load L1 to L0. Update the CPU voltage value for load L1 accordingly when base frequency is 1000 or 1200 MHz. The minimal L1 value for base CPU frequency 1000 MHz is updated from the original 1.05V to 1.108V and for 1200 MHz is updated to 1.155V. This minimal L1 value is used only in the case when it is lower than value for L0. This change fixes CPU instability issues on 1 GHz and 1.2 GHz variants of Espressobin and 1 GHz Turris Mox. Marvell previously for 1 GHz variant of Espressobin provided a patch [1] suitable only for their Marvell Linux kernel 4.4 fork which workarounded this issue. Patch forced CPU voltage value to 1.108V in all loads. But such change does not fix CPU instability issues on 1.2 GHz variants of Armada 3720 SoC. During testing we come to the conclusion that using 1.108V as minimal value for L1 load makes 1 GHz variants of Espressobin and Turris Mox boards stable. And similarly 1.155V for 1.2 GHz variant of Espressobin. These two values 1.108V and 1.155V are documented in Armada 3700 Hardware Specifications as typical initial CPU voltage values. Discussion about this issue is also at the Armbian forum [2]. [1] - https://github.com/MarvellEmbeddedProcessors/linux-marvell/commit/dc33b62c90696afb6adc7dbcc4ebbd48bedec269 [2] - https://forum.armbian.com/topic/10429-how-to-make-espressobin-v7-stable/ Signed-off-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares Fixes: 1c3528232f4b ("cpufreq: armada-37xx: Add AVS support") Cc: sta...@vger.kernel.org --- drivers/cpufreq/armada-37xx-cpufreq.c | 37 +++ 1 file changed, 37 insertions(+) diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index b8dc6c849579..c7683d447b11 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -73,6 +73,8 @@ #define LOAD_LEVEL_NR 4 #define MIN_VOLT_MV 1000 +#define MIN_VOLT_MV_FOR_L1_1000MHZ 1108 +#define MIN_VOLT_MV_FOR_L1_1200MHZ 1155 /* AVS value for the corresponding voltage (in mV) */ static int avs_map[] = { @@ -208,6 +210,8 @@ static u32 armada_37xx_avs_val_match(int target_vm) * - L2 & L3 voltage should be about 150mv smaller than L0 voltage. * This function calculates L1 & L2 & L3 AVS values dynamically based * on L0 voltage and fill all AVS values to the AVS value table. + * When base CPU frequency is 1000 or 1200 MHz then there is additional + * minimal avs value for load L1. */ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base, struct armada_37xx_dvfs *dvfs) @@ -239,6 +243,19 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base, for (load_level = 1; load_level < LOAD_LEVEL_NR; load_level++) dvfs->avs[load_level] = avs_min; + /* +* Set the avs values for load L0 and L1 when base CPU frequency +* is 1000/1200 MHz to its typical initial values according to +* the Armada 3700 Hardware Specifications. +*/ + if (dvfs->cpu_freq_max >= 1000*1000*1000) { + if (dvfs->cpu_freq_max >= 1200*1000*1000) + avs_min = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1200MHZ); + else + avs_min = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1000MHZ); + dvfs->avs[0] = dvfs->avs[1] = avs_min; + } + return; } @@ -258,6 +275,26 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base, target_vm = avs_map[l0_vdd_min] - 150; target_vm = target_vm > MIN_VOLT_MV ? target_vm : MIN_VOLT_MV; dvfs->avs[2] = dvfs->avs[3] = armada_37xx_avs_val_match(target_vm); + + /* +* Fix the avs value for load L1 when base CPU frequency is 1000/1200 MHz, +* otherwise the CPU gets stuck when switching from load L1 to load L0. +* Also ensure that avs value for load L1 is not higher than for L0. +*/ + if (dvfs->cpu_freq_max >= 1000*1000*1000) { + u32 avs_min_l1; + + if (dvfs->cpu_freq_max >= 1200*1000*1000) + avs_min_l1 = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1200MHZ); + else + avs_min_l1 = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1000MHZ); + + if (avs_min_l1 > dvfs->avs[0]) + avs_min_l1 = dvfs->avs[0]; + + if (dvfs->avs[1] < avs_min_l1) +
[PATCH mvebu v3 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
Hello! This is third version of patches for Armada 37xx cpufreq driver which fix CPU scaling with 1 GHz base frequency. The only change in this third version is modified patch 04/10 with fixes for 1.2 GHz variant of Espressobin. Minimal CPU voltage in L1 load for 1.2 GHz variant was increased to 1.155V. Patches are now rebased on top of the kernel version 5.11 with all collected Acked-by/Tested-by lines and are available also in my git tree in branch a3720-cpufreq-issues: https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=a3720-cpufreq-issues If you have other Armada 3720 boards with 1.2 GHz CPU, including Espressobin V7, let us know if it is working fine for you. Marek & Pali Marek Behún (3): arm64: dts: marvell: armada-37xx: add syscon compatible to NB clk node cpufreq: armada-37xx: Fix setting TBG parent for load levels clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock Pali Rohár (7): cpufreq: armada-37xx: Fix the AVS value for load L1 clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0 cpufreq: armada-37xx: Fix driver cleanup when registration failed cpufreq: armada-37xx: Fix determining base CPU frequency cpufreq: armada-37xx: Remove cur_frequency variable cpufreq: armada-37xx: Fix module unloading arch/arm64/boot/dts/marvell/armada-37xx.dtsi | 3 +- drivers/clk/mvebu/armada-37xx-periph.c | 83 +++--- drivers/cpufreq/armada-37xx-cpufreq.c| 111 +++ 3 files changed, 135 insertions(+), 62 deletions(-) -- 2.20.1
[PATCH mvebu v3 01/10] arm64: dts: marvell: armada-37xx: add syscon compatible to NB clk node
From: Marek Behún Add "syscon" compatible to the North Bridge clocks node to allow the cpufreq driver to access these registers via syscon API. This is needed for a fix of cpufreq driver. Signed-off-by: Marek Behún Tested-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Tested-by: Anders Trier Olesen Tested-by: Philip Soares Fixes: e8d66e7927b2 ("arm64: dts: marvell: armada-37xx: add nodes...") Cc: sta...@vger.kernel.org --- arch/arm64/boot/dts/marvell/armada-37xx.dtsi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi index d5b6c0a1c54a..a89e47d95eef 100644 --- a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi +++ b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi @@ -156,7 +156,8 @@ }; nb_periph_clk: nb-periph-clk@13000 { - compatible = "marvell,armada-3700-periph-clock-nb"; + compatible = "marvell,armada-3700-periph-clock-nb", +"syscon"; reg = <0x13000 0x100>; clocks = < 0>, < 1>, < 2>, < 3>, <>; -- 2.20.1
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Sunday 21 February 2021 19:17:40 nnet wrote: > > Could you test if 1.155V voltage for L1 is stable on 1.2 GHz variant? > > ++#define MIN_VOLT_MV_FOR_L1_1200MHZ 1155 > ... > ++ if (avs_min_l1 > dvfs->avs[0]) > ++ avs_min_l1 = dvfs->avs[0]; > ++ > ++ if (dvfs->avs[1] < avs_min_l1) > ++ dvfs->avs[1] = avs_min_l1; > > This works fine. Tested with switching 600MHz to 1.2GHz under load. Perfect! Therefore here is final version of patch 04/10 for both 1 GHz and 1.2 GHz variants of A3720 SoC. I will resend whole patch series. Could I add your Tested-by line to patch series? diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index b8dc6c849..c7683d447 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -73,6 +73,8 @@ #define LOAD_LEVEL_NR 4 #define MIN_VOLT_MV 1000 +#define MIN_VOLT_MV_FOR_L1_1000MHZ 1108 +#define MIN_VOLT_MV_FOR_L1_1200MHZ 1155 /* AVS value for the corresponding voltage (in mV) */ static int avs_map[] = { @@ -208,6 +210,8 @@ static u32 armada_37xx_avs_val_match(int target_vm) * - L2 & L3 voltage should be about 150mv smaller than L0 voltage. * This function calculates L1 & L2 & L3 AVS values dynamically based * on L0 voltage and fill all AVS values to the AVS value table. + * When base CPU frequency is 1000 or 1200 MHz then there is additional + * minimal avs value for load L1. */ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base, struct armada_37xx_dvfs *dvfs) @@ -239,6 +243,19 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base, for (load_level = 1; load_level < LOAD_LEVEL_NR; load_level++) dvfs->avs[load_level] = avs_min; + /* +* Set the avs values for load L0 and L1 when base CPU frequency +* is 1000/1200 MHz to its typical initial values according to +* the Armada 3700 Hardware Specifications. +*/ + if (dvfs->cpu_freq_max >= 1000*1000*1000) { + if (dvfs->cpu_freq_max >= 1200*1000*1000) + avs_min = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1200MHZ); + else + avs_min = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1000MHZ); + dvfs->avs[0] = dvfs->avs[1] = avs_min; + } + return; } @@ -258,6 +275,26 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base, target_vm = avs_map[l0_vdd_min] - 150; target_vm = target_vm > MIN_VOLT_MV ? target_vm : MIN_VOLT_MV; dvfs->avs[2] = dvfs->avs[3] = armada_37xx_avs_val_match(target_vm); + + /* +* Fix the avs value for load L1 when base CPU frequency is 1000/1200 MHz, +* otherwise the CPU gets stuck when switching from load L1 to load L0. +* Also ensure that avs value for load L1 is not higher than for L0. +*/ + if (dvfs->cpu_freq_max >= 1000*1000*1000) { + u32 avs_min_l1; + + if (dvfs->cpu_freq_max >= 1200*1000*1000) + avs_min_l1 = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1200MHZ); + else + avs_min_l1 = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1000MHZ); + + if (avs_min_l1 > dvfs->avs[0]) + avs_min_l1 = dvfs->avs[0]; + + if (dvfs->avs[1] < avs_min_l1) + dvfs->avs[1] = avs_min_l1; + } } static void __init armada37xx_cpufreq_avs_setup(struct regmap *base,
Re: [PATCH] drivers: input: mouse: Change postive to positive in the file alps.c
On Monday 22 February 2021 00:07:40 Randy Dunlap wrote: > On 2/21/21 11:54 PM, Bhaskar Chowdhury wrote: > > > > s/postive/positive/ > > > > Signed-off-by: Bhaskar Chowdhury > > Acked-by: Randy Dunlap Reviewed-by: Pali Rohár > > --- > > drivers/input/mouse/alps.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c > > index b067bfd2699c..4a6b33bbe7ea 100644 > > --- a/drivers/input/mouse/alps.c > > +++ b/drivers/input/mouse/alps.c > > @@ -986,7 +986,7 @@ static void alps_get_finger_coordinate_v7(struct > > input_mt_pos *mt, > > case V7_PACKET_ID_TWO: > > mt[1].x &= ~0x000F; > > mt[1].y |= 0x000F; > > - /* Detect false-postive touches where x & y report max value */ > > + /* Detect false-positive touches where x & y report max value */ > > if (mt[1].y == 0x7ff && mt[1].x == 0xff0) { > > mt[1].x = 0; > > /* y gets set to 0 at the end of this function */ > > -- > > > -- > ~Randy >
Re: [PATCH v3 2/3] docs: arm: marvell: drop a dead links
> This one has gone away. ... > --- a/Documentation/arm/marvell.rst > +++ b/Documentation/arm/marvell.rst > @@ -399,7 +399,6 @@ Berlin family (Multimedia Solutions) >- Flavors: > - 88DE3010, Armada 1000 (no Linux support) > - Core: Marvell PJ1 (ARMv5TE), Dual-core > - - Product Brief: > http://www.marvell.com.cn/digital-entertainment/assets/armada_1000_pb.pdf It is still in web archive: https://web.archive.org/web/20131103162620/http://www.marvell.com/digital-entertainment/assets/armada_1000_pb.pdf
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Tuesday 16 February 2021 08:27:10 nnet wrote: > > Therefore I'm thinking if the correct way is instead to use L1 := L0 > > voltage value for 1/1.2 GHz mode. > > This latest 04/10 works fine for me going 600MHz <-> 1.2GHz under with and > without load. Ok, thanks for testing! Just to note that typical documented value for 1.2GHz mode is 1.155V, so it would be useful to know if this value could be stable for L1 with 1.2GHz mode. I'm thinking that for 1GHz variant it would be better to rather use 1.108V like in my original patch as this is already tested by lot of people, nobody complained yet and it can be lower value as L0 (so there is benefit to decrease CPU frequency when CPU is idle). For 1.2GHz variant I still do not know. You wrote that 1.132V is unstable, so it cannot be used for sure. Documented typical value 1.155V is bigger, so maybe it can be stable but needs testing. And stable seems to be L0 value... But then I do not see a benefit for downclocking CPU from 1.2 GHz frequency in L0 to 600 MHz freq. in L1 if it use same CPU voltage... But it is still better than unstable CPU with crashes! Could you test if 1.155V voltage for L1 is stable on 1.2 GHz variant?
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Monday 15 February 2021 21:48:08 nnet wrote: > > Could you test following change instead of PATCH 04/10? I added here also > > logic for 1.2 GHz variant with 1.132 V value another change is that > > value for load L0 is not touched as it is stable. > > These changes to patch 04/10 worked going 600 MHz <-> 1.2 GHz , _but_ only > with: > > ++#define MIN_VOLT_MV_FOR_L1_1200MHZ 1213 > > During this latest testing I saw freezes with 1132 mV. > > I've had no lockups with 1213 mV which I just used from the values for L1/L0 > from OTP. I still do not know what is the meaning of values stored in OTP... And there are more non-zero bits which are not used (yet). Anyway, with your another test it looks like that limit is not based on fixed value but rather on current L0 value. Therefore I'm thinking if the correct way is instead to use L1 := L0 voltage value for 1/1.2 GHz mode. Could you try following change instead of previous and PATCH 04/10? diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index b8dc6c849..12d0ff7b1 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -208,6 +208,8 @@ static u32 armada_37xx_avs_val_match(int target_vm) * - L2 & L3 voltage should be about 150mv smaller than L0 voltage. * This function calculates L1 & L2 & L3 AVS values dynamically based * on L0 voltage and fill all AVS values to the AVS value table. + * When base CPU frequency is 1000 or 1200 MHz then there is additional + * minimal avs value for load L1. */ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base, struct armada_37xx_dvfs *dvfs) @@ -239,17 +241,36 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base, for (load_level = 1; load_level < LOAD_LEVEL_NR; load_level++) dvfs->avs[load_level] = avs_min; + /* +* Set the avs values for load L0 and L1 when base CPU frequency +* is 1000/1200 MHz to its typical initial values according to +* the Armada 3700 Hardware Specifications. +*/ + if (dvfs->cpu_freq_max >= 1000*1000*1000) { + if (dvfs->cpu_freq_max >= 1200*1000*1000) + avs_min = armada_37xx_avs_val_match(1155); + else + avs_min = armada_37xx_avs_val_match(1108); + dvfs->avs[0] = dvfs->avs[1] = avs_min; + } + return; } /* * L1 voltage is equal to L0 voltage - 100mv and it must be -* larger than 1000mv +* larger than 1000mv. When base CPU frequency is 1000/1200 MHz, +* L1 voltage must must be equal to L0 voltage, otherwise +* the CPU gets stuck when switching from load L1 to load L0. */ - target_vm = avs_map[l0_vdd_min] - 100; - target_vm = target_vm > MIN_VOLT_MV ? target_vm : MIN_VOLT_MV; - dvfs->avs[1] = armada_37xx_avs_val_match(target_vm); + if (dvfs->cpu_freq_max >= 1000*1000*1000) { + dvfs->avs[1] = dvfs->avs[0]; + } else { + target_vm = avs_map[l0_vdd_min] - 100; + target_vm = target_vm > MIN_VOLT_MV ? target_vm : MIN_VOLT_MV; + dvfs->avs[1] = armada_37xx_avs_val_match(target_vm); + } /* * L2 & L3 voltage is equal to L0 voltage - 150mv and it must
Re: [PATCH net-next 1/2] net: mvneta: Remove per-cpu queue mapping for Armada 3700
> According to Errata #23 "The per-CPU GbE interrupt is limited to Core > 0", we can't use the per-cpu interrupt mechanism on the Armada 3700 > familly. > > This is correctly checked for RSS configuration, but the initial queue > mapping is still done by having the queues spread across all the CPUs in > the system, both in the init path and in the cpu_hotplug path. Hello Maxime! This patch looks like a bug fix for Armada 3700 SoC. What about marking this commit with Fixes line? E.g.: Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 SoC")
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Saturday 13 February 2021 10:30:19 nnet wrote: > On Sat, Feb 13, 2021, at 2:01 AM, Pali Rohár wrote: > > On Thursday 11 February 2021 16:41:13 nnet wrote: > > > On Thu, Feb 11, 2021, at 3:44 PM, Pali Rohár wrote: > > > > On Thursday 11 February 2021 12:22:52 nnet wrote: > > > > > On Thu, Feb 11, 2021, at 11:55 AM, Pali Rohár wrote: > > > > > > On Wednesday 10 February 2021 11:08:59 nnet wrote: > > > > > > > On Wed, Feb 10, 2021, at 10:03 AM, Pali Rohár wrote: > > > > > > > > > > Hello! Could you please enable userspace governor during > > > > > > > > > > kernel > > > > > > > > > > compilation? > > > > > > > > > > > > > > > > > > > > CONFIG_CPU_FREQ_GOV_USERSPACE=y > > > > > > > > > > > > > > > > > > > > It can be activated via command: > > > > > > > > > > > > > > > > > > > > echo userspace > > > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > > > > > > > > > > > > > > > > After that you can "force" CPU frequency to specific value, > > > > > > > > > > e.g.: > > > > > > > > > > > > > > > > > > > > echo 100 > > > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed > > > > > > > > > > > > > > > > > > > > I need to know which switch (from --> to freq) cause this > > > > > > > > > > system hang. > > > > > > > > > > > > > > > > > > > > This patch series (via MIN_VOLT_MV_FOR_L0_L1_1GHZ) is > > > > > > > > > > fixing only > > > > > > > > > > switching from 500 MHz to 1000 MHz on 1 GHz variant. As > > > > > > > > > > only this switch > > > > > > > > > > is causing issue. > > > > > > > > > > > > > > > > > > > > I have used following simple bash script to check that > > > > > > > > > > switching between > > > > > > > > > > 500 MHz and 1 GHz is stable: > > > > > > > > > > > > > > > > > > > > while true; do > > > > > > > > > > echo 100 > > > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > > > echo 50 > > > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > > > echo 100 > > > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > > > echo 50 > > > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > > > done > > > > > > > > > > > > > > > > > > echo userspace | tee > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > > > > > while true; do > > > > > > > > > echo 120 | tee > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > > echo 60 | tee > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > > done > > > > > > > > > > > > > > > > > > >> +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108 > > > > > > > > > > > > > > > > > > With 1108 I get a freeze within a minute. The last output to > > > > > > > > > stdout is 60. > > > > > > > > > > > > > > > > > > With 1120 it takes a few minutes. > > > > > > > > > > > > > > > > > > With any of 1225, 1155, 1132 the device doesn't freeze over > > > > > > > &
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Thursday 11 February 2021 16:41:13 nnet wrote: > On Thu, Feb 11, 2021, at 3:44 PM, Pali Rohár wrote: > > On Thursday 11 February 2021 12:22:52 nnet wrote: > > > On Thu, Feb 11, 2021, at 11:55 AM, Pali Rohár wrote: > > > > On Wednesday 10 February 2021 11:08:59 nnet wrote: > > > > > On Wed, Feb 10, 2021, at 10:03 AM, Pali Rohár wrote: > > > > > > > > Hello! Could you please enable userspace governor during kernel > > > > > > > > compilation? > > > > > > > > > > > > > > > > CONFIG_CPU_FREQ_GOV_USERSPACE=y > > > > > > > > > > > > > > > > It can be activated via command: > > > > > > > > > > > > > > > > echo userspace > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > > > > > > > > > > > > After that you can "force" CPU frequency to specific value, > > > > > > > > e.g.: > > > > > > > > > > > > > > > > echo 100 > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed > > > > > > > > > > > > > > > > I need to know which switch (from --> to freq) cause this > > > > > > > > system hang. > > > > > > > > > > > > > > > > This patch series (via MIN_VOLT_MV_FOR_L0_L1_1GHZ) is fixing > > > > > > > > only > > > > > > > > switching from 500 MHz to 1000 MHz on 1 GHz variant. As only > > > > > > > > this switch > > > > > > > > is causing issue. > > > > > > > > > > > > > > > > I have used following simple bash script to check that > > > > > > > > switching between > > > > > > > > 500 MHz and 1 GHz is stable: > > > > > > > > > > > > > > > > while true; do > > > > > > > > echo 100 > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > echo 50 > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > echo 100 > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > echo 50 > > > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > > done > > > > > > > > > > > > > > echo userspace | tee > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > > > while true; do > > > > > > > echo 120 | tee > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > echo 60 | tee > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > > done > > > > > > > > > > > > > > >> +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108 > > > > > > > > > > > > > > With 1108 I get a freeze within a minute. The last output to > > > > > > > stdout is 60. > > > > > > > > > > > > > > With 1120 it takes a few minutes. > > > > > > > > > > > > > > With any of 1225, 1155, 1132 the device doesn't freeze over the > > > > > > > full 5 minute load test. > > > > > > > > > > > > > > I'm using ondemand now with the above at 1132 without issue so > > > > > > > far. > > > > > > > > > > > > Great, thank you for testing! > > > > > > > > > > > > Can you check if switching between any two lower frequencies 20 > > > > > > 30 60 is stable? > > > > > > > > > > This is stable using 1132 mV for MIN_VOLT_MV_FOR_L0_L1_1GHZ: > > > > > > > > > > while true; do > > > > > # down > > > > > echo 120 | tee > > > > > /sys/devices/system/cpu/cpufreq/policy0/s
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Thursday 11 February 2021 12:22:52 nnet wrote: > On Thu, Feb 11, 2021, at 11:55 AM, Pali Rohár wrote: > > On Wednesday 10 February 2021 11:08:59 nnet wrote: > > > On Wed, Feb 10, 2021, at 10:03 AM, Pali Rohár wrote: > > > > > > Hello! Could you please enable userspace governor during kernel > > > > > > compilation? > > > > > > > > > > > > CONFIG_CPU_FREQ_GOV_USERSPACE=y > > > > > > > > > > > > It can be activated via command: > > > > > > > > > > > > echo userspace > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > > > > > > > > After that you can "force" CPU frequency to specific value, e.g.: > > > > > > > > > > > > echo 100 > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed > > > > > > > > > > > > I need to know which switch (from --> to freq) cause this system > > > > > > hang. > > > > > > > > > > > > This patch series (via MIN_VOLT_MV_FOR_L0_L1_1GHZ) is fixing only > > > > > > switching from 500 MHz to 1000 MHz on 1 GHz variant. As only this > > > > > > switch > > > > > > is causing issue. > > > > > > > > > > > > I have used following simple bash script to check that switching > > > > > > between > > > > > > 500 MHz and 1 GHz is stable: > > > > > > > > > > > > while true; do > > > > > > echo 100 > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > echo 50 > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > echo 100 > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > echo 50 > > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > > done > > > > > > > > > > echo userspace | tee > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > while true; do > > > > > echo 120 | tee > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > echo 60 | tee > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > done > > > > > > > > > > >> +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108 > > > > > > > > > > With 1108 I get a freeze within a minute. The last output to stdout > > > > > is 60. > > > > > > > > > > With 1120 it takes a few minutes. > > > > > > > > > > With any of 1225, 1155, 1132 the device doesn't freeze over the full > > > > > 5 minute load test. > > > > > > > > > > I'm using ondemand now with the above at 1132 without issue so far. > > > > > > > > Great, thank you for testing! > > > > > > > > Can you check if switching between any two lower frequencies 20 > > > > 30 60 is stable? > > > > > > This is stable using 1132 mV for MIN_VOLT_MV_FOR_L0_L1_1GHZ: > > > > > > while true; do > > > # down > > > echo 120 | tee > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > ... > > > > Hello! > > > > Could you please re-run test without tee, in form as I have shown above? > > UART is slow and printing something to console adds delay which decrease > > probability that real issue is triggered as this is timing issue. > > The test was done over SSH. Ok! But it is still better to not print any results as it adds unwanted delay between frequency switching. > > Also please do tests just between two frequencies in loop as I observed > > that switching between more decreased probability to hit issue. > > > > > > echo userspace | tee > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > while true; do > > > > > echo 120 | tee > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > echo 60 | tee > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > > done > > The first test ^ switched between 600 MHz and 1.2 GHz. > > > The real issue for 1 GHz variant of A3720 is only when doing switch from > > 500 MHz to 1 GHz. So could you try to do some tests also without > > changing MIN_VOLT_MV_FOR_L0_L1_1GHZ and switching just between non-1.2 > > frequencies (to verify that on 1.2 GHz variant it is also from 600 MHz > > to 1.2 GHz)? > > With 1108 mV and switching between 600 MHz and 1.2GHz, I always saw a freeze > within a minute. I mean to try switching with 1.108 V between 200 MHz and 300 MHz or between 300 MHz and 600 MHz. To check that issue is really only with switch from 600 MHz to 1.2 GHz. I need to know if current settings are fine for 200, 300 and 600 MHz frequencies and the only 600 --> 1200 needs to be fixed.
Re: [PATCH v2 05/12] arm64: dts: marvell: armada-3720-db: add comphy references
On Wednesday 10 February 2021 16:09:42 kos...@marvell.com wrote: > From: Grzegorz Jaszczyk > > Adding phy description to pcie, sata and usb will allow appropriate drivers > to configure marvell comphy-a3700 accordingly. > > Signed-off-by: Grzegorz Jaszczyk > Signed-off-by: Konstantin Porotchkin Hello! This patch is not needed too as Gregory already included into his tree alternative patch which defines SATA PHY globally into main include file armada-37xx.dtsi: https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu.git/commit/?h=for-next=6ece0f7dbd558670ec72ba390379949a4d4dc5c0 And PCIe and USB 3.0 PHY definitions are already in include file. > --- > arch/arm64/boot/dts/marvell/armada-3720-db.dts | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/arm64/boot/dts/marvell/armada-3720-db.dts > b/arch/arm64/boot/dts/marvell/armada-3720-db.dts > index 3e5789f37206..15e923f945d4 100644 > --- a/arch/arm64/boot/dts/marvell/armada-3720-db.dts > +++ b/arch/arm64/boot/dts/marvell/armada-3720-db.dts > @@ -132,11 +132,15 @@ > pinctrl-0 = <_reset_pins _clkreq_pins>; > reset-gpios = < 3 GPIO_ACTIVE_LOW>; > status = "okay"; > + /* Generic PHY, providing serdes lanes */ > + phys = < 0>; > }; > > /* CON3 */ > { > status = "okay"; > + /* Generic PHY, providing serdes lanes */ > + phys = < 0>; > }; > > { > @@ -217,4 +221,7 @@ > { > status = "okay"; > usb-phy = <_phy>; > + /* Generic PHY, providing serdes lanes */ > + phys = < 0>; > + phy-names = "usb"; > }; > -- > 2.17.1 >
Re: [PATCH v2 06/12] arm64: dts: marvell: armada-3270-espressobin: add comphy references
On Wednesday 10 February 2021 16:09:43 kos...@marvell.com wrote: > From: Grzegorz Jaszczyk > > Add "phys" entries pointing to COMPHYs to PCIe and USB3 nodes > > Signed-off-by: Grzegorz Jaszczyk > Signed-off-by: Konstantin Porotchkin Hello! This patch is not needed and now does nothing. > --- > arch/arm64/boot/dts/marvell/armada-3720-espressobin.dtsi | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dtsi > b/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dtsi > index daffe136c523..bbd955909813 100644 > --- a/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dtsi > +++ b/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dtsi > @@ -59,6 +59,8 @@ > /* J9 */ > { > status = "okay"; > + /* Generic PHY, providing serdes lanes */ > + phys = < 0>; In mainline kernel is PCIe PHY already provided in armada-37xx.dtsi: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/marvell/armada-37xx.dtsi#n497 > pinctrl-names = "default"; > pinctrl-0 = <_reset_pins _clkreq_pins>; > reset-gpios = < 3 GPIO_ACTIVE_LOW>; > @@ -139,6 +141,9 @@ > /* J7 */ > { > status = "okay"; > + /* Generic PHY, providing serdes lanes */ > + phys = < 0>; > + phy-names = "usb"; In mainline kernel is USB 3.0 PHY already provided in armada-37xx.dtsi: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/marvell/armada-37xx.dtsi#n359 > }; > > /* J8 */ > -- > 2.17.1 > So final binary espressobin DTB files are same with and without this patch.
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Wednesday 10 February 2021 11:08:59 nnet wrote: > On Wed, Feb 10, 2021, at 10:03 AM, Pali Rohár wrote: > > > > Hello! Could you please enable userspace governor during kernel > > > > compilation? > > > > > > > > CONFIG_CPU_FREQ_GOV_USERSPACE=y > > > > > > > > It can be activated via command: > > > > > > > > echo userspace > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > > > > After that you can "force" CPU frequency to specific value, e.g.: > > > > > > > > echo 100 > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed > > > > > > > > I need to know which switch (from --> to freq) cause this system hang. > > > > > > > > This patch series (via MIN_VOLT_MV_FOR_L0_L1_1GHZ) is fixing only > > > > switching from 500 MHz to 1000 MHz on 1 GHz variant. As only this switch > > > > is causing issue. > > > > > > > > I have used following simple bash script to check that switching between > > > > 500 MHz and 1 GHz is stable: > > > > > > > > while true; do > > > > echo 100 > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > echo 50 > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > echo 100 > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > echo 50 > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > > done > > > > > > echo userspace | tee > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > while true; do > > > echo 120 | tee > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > echo 60 | tee > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > > done > > > > > > >> +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108 > > > > > > With 1108 I get a freeze within a minute. The last output to stdout is > > > 60. > > > > > > With 1120 it takes a few minutes. > > > > > > With any of 1225, 1155, 1132 the device doesn't freeze over the full 5 > > > minute load test. > > > > > > I'm using ondemand now with the above at 1132 without issue so far. > > > > Great, thank you for testing! > > > > Can you check if switching between any two lower frequencies 20 > > 30 60 is stable? > > This is stable using 1132 mV for MIN_VOLT_MV_FOR_L0_L1_1GHZ: > > while true; do > # down > echo 120 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; ... Hello! Could you please re-run test without tee, in form as I have shown above? UART is slow and printing something to console adds delay which decrease probability that real issue is triggered as this is timing issue. Also please do tests just between two frequencies in loop as I observed that switching between more decreased probability to hit issue. The real issue for 1 GHz variant of A3720 is only when doing switch from 500 MHz to 1 GHz. So could you try to do some tests also without changing MIN_VOLT_MV_FOR_L0_L1_1GHZ and switching just between non-1.2 frequencies (to verify that on 1.2 GHz variant it is also from 600 MHz to 1.2 GHz)?
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Wednesday 10 February 2021 09:34:07 nnet wrote: > On Wed, Feb 10, 2021, at 1:23 AM, Pali Rohár wrote: > > On Tuesday 09 February 2021 18:07:41 nnet wrote: > > > On Tue, Feb 9, 2021, at 5:51 PM, nnet wrote: > > > > On Tue, Feb 9, 2021, at 5:31 PM, nnet wrote: > > > > > On Tue, Feb 9, 2021, at 3:26 PM, Marek Behún wrote: > > > > > > On Tue, 09 Feb 2021 15:16:45 -0800 > > > > > > nnet wrote: > > > > > > > > > > > > > I've two of these and I've just swapped them (and re-pasted the > > > > > > > heat sinks). > > > > > > > > > > > > > > The second one ran under load for awhile and now has frozen as > > > > > > > well. > > > > > > > > > > > > > > Under a moderate load `wget -O /dev/null ` @X00Mbits > > > > > > > they are fine. > > > > > > > > > > > > > > Under a 1 min speed test of load ~200Mbits routed WireGuard they > > > > > > > freeze. > > > > > > > > > > > > > > They fine with both those workloads @1000_800. > > > > > > > > > > > > > > Perhaps it's heat? Unfortunately I don't have any numbers on that > > > > > > > ATM. > > > > > > > > > > > > Try disabling cpufreq in kernel completely, compile boot image at > > > > > > 1200 MHz. If it continues freezing, then I fear we can't help you > > > > > > with > > > > > > 1200 MHz :( > > > > > > > > > > cat > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_available_frequencies > > > > > 20 30 60 120 > > > > > > > > > > I'm not getting any freezes with 1.2GHz fixed after 20 minutes of > > > > > load: > > > > > > > > > > echo 120 > > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq > > > > > > > > > > Setting it back to min 200MHz I get a freeze within a minute: > > > > > > > > > > echo 20 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq > > > > Hello! Could you please enable userspace governor during kernel > > compilation? > > > > CONFIG_CPU_FREQ_GOV_USERSPACE=y > > > > It can be activated via command: > > > > echo userspace > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > After that you can "force" CPU frequency to specific value, e.g.: > > > > echo 100 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed > > > > I need to know which switch (from --> to freq) cause this system hang. > > > > This patch series (via MIN_VOLT_MV_FOR_L0_L1_1GHZ) is fixing only > > switching from 500 MHz to 1000 MHz on 1 GHz variant. As only this switch > > is causing issue. > > > > I have used following simple bash script to check that switching between > > 500 MHz and 1 GHz is stable: > > > > while true; do > > echo 100 > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > echo 50 > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > echo 100 > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > echo 50 > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > > done > > echo userspace | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > while true; do > echo 120 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > echo 60 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; > done > > >> +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108 > > With 1108 I get a freeze within a minute. The last output to stdout is 60. > > With 1120 it takes a few minutes. > > With any of 1225, 1155, 1132 the device doesn't freeze over the full 5 minute > load test. > > I'm using ondemand now with the above at 1132 without issue so far. Great, thank you for testing! Can you check if switching between any two lower frequencies 20 30 60 is stable? > >> Update the CPU voltage value for loads L0 and L1 accordingly when base > >> frequency is 1000 or 1200 MHz. The minimal value is updated from the > >> original 1.05V to 1.108V. > > Perhaps similiar to how
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Tuesday 09 February 2021 18:07:41 nnet wrote: > On Tue, Feb 9, 2021, at 5:51 PM, nnet wrote: > > On Tue, Feb 9, 2021, at 5:31 PM, nnet wrote: > > > On Tue, Feb 9, 2021, at 3:26 PM, Marek Behún wrote: > > > > On Tue, 09 Feb 2021 15:16:45 -0800 > > > > nnet wrote: > > > > > > > > > I've two of these and I've just swapped them (and re-pasted the heat > > > > > sinks). > > > > > > > > > > The second one ran under load for awhile and now has frozen as well. > > > > > > > > > > Under a moderate load `wget -O /dev/null ` @X00Mbits they > > > > > are fine. > > > > > > > > > > Under a 1 min speed test of load ~200Mbits routed WireGuard they > > > > > freeze. > > > > > > > > > > They fine with both those workloads @1000_800. > > > > > > > > > > Perhaps it's heat? Unfortunately I don't have any numbers on that ATM. > > > > > > > > Try disabling cpufreq in kernel completely, compile boot image at > > > > 1200 MHz. If it continues freezing, then I fear we can't help you with > > > > 1200 MHz :( > > > > > > cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_frequencies > > > 20 30 60 120 > > > > > > I'm not getting any freezes with 1.2GHz fixed after 20 minutes of load: > > > > > > echo 120 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq > > > > > > Setting it back to min 200MHz I get a freeze within a minute: > > > > > > echo 20 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq Hello! Could you please enable userspace governor during kernel compilation? CONFIG_CPU_FREQ_GOV_USERSPACE=y It can be activated via command: echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor After that you can "force" CPU frequency to specific value, e.g.: echo 100 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed I need to know which switch (from --> to freq) cause this system hang. This patch series (via MIN_VOLT_MV_FOR_L0_L1_1GHZ) is fixing only switching from 500 MHz to 1000 MHz on 1 GHz variant. As only this switch is causing issue. I have used following simple bash script to check that switching between 500 MHz and 1 GHz is stable: while true; do echo 100 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; echo 50 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; echo 100 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; echo 50 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed; done (of course on 1.2 GHz variant you need to adjust values as only following frequencies 20 30 60 120 are supported) > > > > Marek > > > > > > > > > +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108 > > > > Based on the below at boot time might an equivalent of the above need > > to be 1225 for 1.2GHz? > > > > 1200_750 > > SVC REV: 5, CPU VDD voltage: 1.225V > > > > 1000_800 > > SVC REV: 5, CPU VDD voltage: 1.108V This value is printed in WTMI avs.c by following code: shift = OTP_SVC_SPEED_1000_OFF; (OR) shift = OTP_SVC_SPEED_1200_OFF; vdd_otp = ((otp_data[OTP_DATA_SVC_SPEED_ID] >> shift) + AVS_VDD_BASE) & AVS_VDD_MASK; regval |= (vdd_otp << HIGH_VDD_LIMIT_OFF); regval |= (vdd_otp << LOW_VDD_LIMIT_OFF); printf("SVC REV: %d, CPU VDD voltage: %s\n", svc_rev, avis_dump[vdd_otp].desc); So voltage value is read from the OTP memory. But I do not know what this value means. > I did this for a quick test for 1.2GHz: > > +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1225 > > This is working well so far. Frequency is shifting up/down with load > applied/stopped.
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Tuesday 09 February 2021 14:52:56 nnet wrote: > On Tue, Feb 9, 2021, at 2:42 PM, Pali Rohár wrote: > > On Tuesday 09 February 2021 13:45:08 nnet wrote: > > > On Tue, Feb 9, 2021, at 1:33 PM, Pali Rohár wrote: > > > > On Tuesday 09 February 2021 13:00:26 nnet wrote: > > > > > > If you have other Armada 3720 boards (Espressobin v5/v7, uDPU, > > > > > > Devel Board, ...) then it will be nice to do an additional tests > > > > > > and check if instability issues are finally fixed. > > > > > > > > > > These patches applied to the 5.4.96 in OpenWrt (98d61b5) work fine so > > > > > far on an Espressobin v7 AFAICT per changing values in > > > > > /sys/devices/system/cpu/cpufreq/policy0. > > > > > > > > > > Are these changes intended to work @1.2 GHz on the v7? > > > > > > > > Hello! Do you have 1.2 GHz A3720 SoC? > > > > > > Maybe (not)? ESPRESSObin_V7_1-0 on the underside. > > > > Look at the package of SoC chip. On top of the package is printed > > identifier 88F3720. On the last line should be one of the string: > > C080, C100, C120, I080, I100 which identifies frequency > > (080 = 800 MHz, 100 = 1 GHz, 120 = 1.2 GHz) > > > > Can you check what is printed on A3720 SoC package? > > Both of mine are 88F3720-A0-C120. Interesting... it is really 1.2 GHz. I have not seen it yet. And few weeks ago I thought that it was never available on market. Well, if 1.2 GHz variant has same issues as 1 GHz variants then this patch series should fix it. But you said that with this patch series is board crashing when running at 1.2 GHz?
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Tuesday 09 February 2021 13:45:08 nnet wrote: > On Tue, Feb 9, 2021, at 1:33 PM, Pali Rohár wrote: > > On Tuesday 09 February 2021 13:00:26 nnet wrote: > > > > If you have other Armada 3720 boards (Espressobin v5/v7, uDPU, Devel > > > > Board, ...) then it will be nice to do an additional tests and check if > > > > instability issues are finally fixed. > > > > > > These patches applied to the 5.4.96 in OpenWrt (98d61b5) work fine so far > > > on an Espressobin v7 AFAICT per changing values in > > > /sys/devices/system/cpu/cpufreq/policy0. > > > > > > Are these changes intended to work @1.2 GHz on the v7? > > > > Hello! Do you have 1.2 GHz A3720 SoC? > > Maybe (not)? ESPRESSObin_V7_1-0 on the underside. Look at the package of SoC chip. On top of the package is printed identifier 88F3720. On the last line should be one of the string: C080, C100, C120, I080, I100 which identifies frequency (080 = 800 MHz, 100 = 1 GHz, 120 = 1.2 GHz) Can you check what is printed on A3720 SoC package? > BTW, with the 1200_750 firmware and the patches: > > root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat > scaling_available_frequencies > 20 30 60 120 > > Of course that could mean nothing, but thought I'd mention what I do see. This is value set by firmware file which you have started.
Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
On Tuesday 09 February 2021 13:00:26 nnet wrote: > > If you have other Armada 3720 boards (Espressobin v5/v7, uDPU, Devel Board, > > ...) then it will be nice to do an additional tests and check if > > instability issues are finally fixed. > > These patches applied to the 5.4.96 in OpenWrt (98d61b5) work fine so far on > an Espressobin v7 AFAICT per changing values in > /sys/devices/system/cpu/cpufreq/policy0. > > Are these changes intended to work @1.2 GHz on the v7? Hello! Do you have 1.2 GHz A3720 SoC?
Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
On Friday 05 February 2021 10:59:50 Daniel Vetter wrote: > On Thu, Feb 4, 2021 at 11:24 PM Pali Rohár wrote: > > > > On Thursday 04 February 2021 15:50:19 Bjorn Helgaas wrote: > > > [+cc Oliver, Pali, Krzysztof] > > > > Just to note that extending or using sysfs_initialized introduces > > another race condition into kernel code which results in PCI fatal > > errors. Details are in email discussion which Bjorn already sent. > > Yeah I wondered why this doesn't race. It races, but with smaller probability. I have not seen this race condition on x86. But I was able to reproduce it with native PCIe drivers on ARM64 (Marvell Armada 3720; pci-aardvark). In mentioned discussion I wrote when this race condition happen. But I understand that it is hard to simulate it. > but since the history goes back > to pre-git times I figured it would have been addressed somehow > already if it indeed does race. > -Daniel > > > > s/also/Also/ in subject > > > > > > On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote: > > > > We are already doing this for all the regular sysfs files on PCI > > > > devices, but not yet on the legacy io files on the PCI buses. Thus far > > > > now problem, but in the next patch I want to wire up iomem revoke > > > > support. That needs the vfs up an running already to make so that > > > > iomem_get_mapping() works. > > > > > > s/now problem/no problem/ > > > s/an running/and running/ > > > s/so that/sure that/ ? > > > > > > iomem_get_mapping() doesn't exist; I don't know what that should be. > > > > > > > Wire it up exactly like the existing code. Note that > > > > pci_remove_legacy_files() doesn't need a check since the one for > > > > pci_bus->legacy_io is sufficient. > > > > > > I'm not sure exactly what you mean by "the existing code." I could > > > probably figure it out, but it would save time to mention the existing > > > function here. > > > > > > This looks like another instance where we should really apply Oliver's > > > idea of converting these to attribute_groups [1]. > > > > > > The cover letter mentions options discussed with Greg in [2], but I > > > don't think the "sysfs_initialized" hack vs attribute_groups was part > > > of that discussion. > > > > > > It's not absolutely a show-stopper, but it *is* a shame to extend the > > > sysfs_initialized hack if attribute_groups could do this more cleanly > > > and help solve more than one issue. > > > > > > Bjorn > > > > > > [1] > > > https://lore.kernel.org/r/caosf1chss03dbsdo4pmttmp0tceu5kscn704zewlkgxqzbf...@mail.gmail.com > > > [2] > > > https://lore.kernel.org/dri-devel/cakmk7ugrddrbtj0oyzqqc0cgrqwc2f3tfju9vlfm2jjufaz...@mail.gmail.com/ > > > > > > > Signed-off-by: Daniel Vetter > > > > Cc: Stephen Rothwell > > > > Cc: Jason Gunthorpe > > > > Cc: Kees Cook > > > > Cc: Dan Williams > > > > Cc: Andrew Morton > > > > Cc: John Hubbard > > > > Cc: Jérôme Glisse > > > > Cc: Jan Kara > > > > Cc: Dan Williams > > > > Cc: Greg Kroah-Hartman > > > > Cc: linux...@kvack.org > > > > Cc: linux-arm-ker...@lists.infradead.org > > > > Cc: linux-samsung-...@vger.kernel.org > > > > Cc: linux-me...@vger.kernel.org > > > > Cc: Bjorn Helgaas > > > > Cc: linux-...@vger.kernel.org > > > > --- > > > > drivers/pci/pci-sysfs.c | 7 +++ > > > > 1 file changed, 7 insertions(+) > > > > > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > > > index fb072f4b3176..0c45b4f7b214 100644 > > > > --- a/drivers/pci/pci-sysfs.c > > > > +++ b/drivers/pci/pci-sysfs.c > > > > @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b) > > > > { > > > > int error; > > > > > > > > + if (!sysfs_initialized) > > > > + return; > > > > + > > > > b->legacy_io = kcalloc(2, sizeof(struct bin_attribute), > > > >GFP_ATOMIC); > > > > if (!b->legacy_io) > > > > @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev > > > > *pdev) > > > > static int __init pci_sysfs_init(void) > > > > { > > > > struct pci_dev *pdev = NULL; > > > > + struct pci_bus *pbus = NULL; > > > > int retval; > > > > > > > > sysfs_initialized = 1; > > > > @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void) > > > > } > > > > } > > > > > > > > + while ((pbus = pci_find_next_bus(pbus))) > > > > + pci_create_legacy_files(pbus); > > > > + > > > > return 0; > > > > } > > > > late_initcall(pci_sysfs_init); > > > > -- > > > > 2.30.0 > > > > > > > > > > > > ___ > > > > linux-arm-kernel mailing list > > > > linux-arm-ker...@lists.infradead.org > > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
On Friday 05 February 2021 11:16:00 Daniel Vetter wrote: > On Fri, Feb 5, 2021 at 11:04 AM Pali Rohár wrote: > > > > On Friday 05 February 2021 10:59:50 Daniel Vetter wrote: > > > On Thu, Feb 4, 2021 at 11:24 PM Pali Rohár wrote: > > > > > > > > On Thursday 04 February 2021 15:50:19 Bjorn Helgaas wrote: > > > > > [+cc Oliver, Pali, Krzysztof] > > > > > > > > Just to note that extending or using sysfs_initialized introduces > > > > another race condition into kernel code which results in PCI fatal > > > > errors. Details are in email discussion which Bjorn already sent. > > > > > > Yeah I wondered why this doesn't race. > > > > It races, but with smaller probability. I have not seen this race > > condition on x86. But I was able to reproduce it with native PCIe > > drivers on ARM64 (Marvell Armada 3720; pci-aardvark). In mentioned > > discussion I wrote when this race condition happen. But I understand > > that it is hard to simulate it. > > btw I looked at your patch, and isn't that just reducing the race window? I probably have not wrote reply to that thread and only to Krzysztof on IRC, but my "hack" really does not solve that race condition. And as you wrote it only reduced occurrence on tested HW. Krzysztof wrote that would look at this issue and try to solve it properly. So I have not doing more investigation on that my "hack" patch, race conditions are hard to catch and solve... > I think we have a very similar problem in drm, where the > drm_dev_register() for the overall device (which also registers all > drm_connector) can race with the hotplug of an individual connector in > drm_connector_register() which is hotplugged at runtime. > > I went with a per-connector registered boolean + a lock to make sure > that really only one of the two call paths can end up registering the > connector. Part of registering connectors is setting up sysfs files, > so I think it's exactly the same problem as here. > > Cheers, Daniel > > > > > > but since the history goes back > > > to pre-git times I figured it would have been addressed somehow > > > already if it indeed does race. > > > -Daniel > > > > > > > > s/also/Also/ in subject > > > > > > > > > > On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote: > > > > > > We are already doing this for all the regular sysfs files on PCI > > > > > > devices, but not yet on the legacy io files on the PCI buses. Thus > > > > > > far > > > > > > now problem, but in the next patch I want to wire up iomem revoke > > > > > > support. That needs the vfs up an running already to make so that > > > > > > iomem_get_mapping() works. > > > > > > > > > > s/now problem/no problem/ > > > > > s/an running/and running/ > > > > > s/so that/sure that/ ? > > > > > > > > > > iomem_get_mapping() doesn't exist; I don't know what that should be. > > > > > > > > > > > Wire it up exactly like the existing code. Note that > > > > > > pci_remove_legacy_files() doesn't need a check since the one for > > > > > > pci_bus->legacy_io is sufficient. > > > > > > > > > > I'm not sure exactly what you mean by "the existing code." I could > > > > > probably figure it out, but it would save time to mention the existing > > > > > function here. > > > > > > > > > > This looks like another instance where we should really apply Oliver's > > > > > idea of converting these to attribute_groups [1]. > > > > > > > > > > The cover letter mentions options discussed with Greg in [2], but I > > > > > don't think the "sysfs_initialized" hack vs attribute_groups was part > > > > > of that discussion. > > > > > > > > > > It's not absolutely a show-stopper, but it *is* a shame to extend the > > > > > sysfs_initialized hack if attribute_groups could do this more cleanly > > > > > and help solve more than one issue. > > > > > > > > > > Bjorn > > > > > > > > > > [1] > > > > > https://lore.kernel.org/r/caosf1chss03dbsdo4pmttmp0tceu5kscn704zewlkgxqzbf...@mail.gmail.com > > > > > [2] > > > > > https://lore.kernel.org/dri-devel/cakmk7ugrddrbtj0oyzqqc0cgrqwc2f3tfju9vlfm2jjufaz...@mail
Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
On Thursday 04 February 2021 15:50:19 Bjorn Helgaas wrote: > [+cc Oliver, Pali, Krzysztof] Just to note that extending or using sysfs_initialized introduces another race condition into kernel code which results in PCI fatal errors. Details are in email discussion which Bjorn already sent. > s/also/Also/ in subject > > On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote: > > We are already doing this for all the regular sysfs files on PCI > > devices, but not yet on the legacy io files on the PCI buses. Thus far > > now problem, but in the next patch I want to wire up iomem revoke > > support. That needs the vfs up an running already to make so that > > iomem_get_mapping() works. > > s/now problem/no problem/ > s/an running/and running/ > s/so that/sure that/ ? > > iomem_get_mapping() doesn't exist; I don't know what that should be. > > > Wire it up exactly like the existing code. Note that > > pci_remove_legacy_files() doesn't need a check since the one for > > pci_bus->legacy_io is sufficient. > > I'm not sure exactly what you mean by "the existing code." I could > probably figure it out, but it would save time to mention the existing > function here. > > This looks like another instance where we should really apply Oliver's > idea of converting these to attribute_groups [1]. > > The cover letter mentions options discussed with Greg in [2], but I > don't think the "sysfs_initialized" hack vs attribute_groups was part > of that discussion. > > It's not absolutely a show-stopper, but it *is* a shame to extend the > sysfs_initialized hack if attribute_groups could do this more cleanly > and help solve more than one issue. > > Bjorn > > [1] > https://lore.kernel.org/r/caosf1chss03dbsdo4pmttmp0tceu5kscn704zewlkgxqzbf...@mail.gmail.com > [2] > https://lore.kernel.org/dri-devel/cakmk7ugrddrbtj0oyzqqc0cgrqwc2f3tfju9vlfm2jjufaz...@mail.gmail.com/ > > > Signed-off-by: Daniel Vetter > > Cc: Stephen Rothwell > > Cc: Jason Gunthorpe > > Cc: Kees Cook > > Cc: Dan Williams > > Cc: Andrew Morton > > Cc: John Hubbard > > Cc: Jérôme Glisse > > Cc: Jan Kara > > Cc: Dan Williams > > Cc: Greg Kroah-Hartman > > Cc: linux...@kvack.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-samsung-...@vger.kernel.org > > Cc: linux-me...@vger.kernel.org > > Cc: Bjorn Helgaas > > Cc: linux-...@vger.kernel.org > > --- > > drivers/pci/pci-sysfs.c | 7 +++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > index fb072f4b3176..0c45b4f7b214 100644 > > --- a/drivers/pci/pci-sysfs.c > > +++ b/drivers/pci/pci-sysfs.c > > @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b) > > { > > int error; > > > > + if (!sysfs_initialized) > > + return; > > + > > b->legacy_io = kcalloc(2, sizeof(struct bin_attribute), > >GFP_ATOMIC); > > if (!b->legacy_io) > > @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev) > > static int __init pci_sysfs_init(void) > > { > > struct pci_dev *pdev = NULL; > > + struct pci_bus *pbus = NULL; > > int retval; > > > > sysfs_initialized = 1; > > @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void) > > } > > } > > > > + while ((pbus = pci_find_next_bus(pbus))) > > + pci_create_legacy_files(pbus); > > + > > return 0; > > } > > late_initcall(pci_sysfs_init); > > -- > > 2.30.0 > > > > > > ___ > > linux-arm-kernel mailing list > > linux-arm-ker...@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Re: [PATCH v2] usb: host: xhci-plat: fix support for XHCI_SKIP_PHY_INIT quirk
On Tuesday 26 January 2021 10:06:06 Pali Rohár wrote: > On Tuesday 26 January 2021 04:27:37 Yoshihiro Shimoda wrote: > > Hi Pali, > > > > I can see the benefit in this. > > > > In the xhci-plat case usb_create_hcd and usb_add_hcd are separate > > > > steps, and > > > > we could both copy the xhci_plat_priv .quirks and run the .init_qurks > > > > before > > > > adding the hcd. > > > > I guess the current way is inherited from pci case where the earliest > > > > place > > > > to do this after hcd is created is the hcd->driver->reset callback > > > > (which is > > > > set to xhci_pci_setup() or xhci_plat_setup()). > > > > > > > > xhci-rcar.c is using the .init_quirk to load firmware, we need to check > > > > with > > > > them if this change is ok. (added Yoshihiro Shimoda to cc) > > > > > > Yoshihiro, is this change OK? > > > > > > Can we move forward? I really need to now how to handle regression in > > > xhci-mvebu driver. And one option is with this patch... > > > > Thank you for asking me about this topic. I tested the patch, but > > unfortunately, > > this patch is possible to break a rcar platform because a phy > > initialization is > > needed before the firmware loading if the platform uses the phy. (Note that > > upstream code (salvator-common.dtsi) doesn't use the phy for xhci. But, > > if we use the phy on other board with this patch, the xhci will not work.) > > > > So, I think we need to add a new function pointer for your case. > > Ok, thank you for testing! I will try to come up with other solution to > mentioned mvebu-xhci issue. Hello! New version of this patch is in following thread, please review it: "[PATCH v2] usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720" https://lore.kernel.org/linux-usb/20210201150803.7305-1-p...@kernel.org/ > > Best regards, > > Yoshihiro Shimoda > > > > > > Their firmware would be loaded before phy parts are initialized, usb bus > > > > registered, or roothub device allocated. > > > > > > > > Thanks > > > > -Mathias
[PATCH v2] usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720
Older ATF does not provide SMC call for USB 3.0 phy power on functionality and therefore initialization of xhci-hcd is failing when older version of ATF is used. In this case phy_power_on() function returns -EOPNOTSUPP. [3.108467] mvebu-a3700-comphy d0018300.phy: unsupported SMC call, try updating your firmware [3.117250] phy phy-d0018300.phy.0: phy poweron failed --> -95 [3.123465] xhci-hcd: probe of d0058000.usb failed with error -95 This patch introduces a new plat_setup callback for xhci platform drivers which is called prior calling usb_add_hcd() function. This function at its beginning skips PHY init if hcd->skip_phy_initialization is set. Current init_quirk callback for xhci platform drivers is called from xhci_plat_setup() function which is called after chip reset completes. It happens in the middle of the usb_add_hcd() function and therefore this callback cannot be used for setting if PHY init should be skipped or not. For Armada 3720 this patch introduce a new xhci_mvebu_a3700_plat_setup() function configured as a xhci platform plat_setup callback. This new function calls phy_power_on() and in case it returns -EOPNOTSUPP then XHCI_SKIP_PHY_INIT quirk is set to instruct xhci-plat to skip PHY initialization. This patch fixes above failure by ignoring 'not supported' error in xhci-hcd driver. In this case it is expected that phy is already power on. It fixes initialization of xhci-hcd on Espressobin boards where is older Marvell's Arm Trusted Firmware without SMC call for USB 3.0 phy power. This is regression introduced in commit bd3d25b07342 ("arm64: dts: marvell: armada-37xx: link USB hosts with their PHYs") where USB 3.0 phy was defined and therefore xhci-hcd on Espressobin with older ATF started failing. Signed-off-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Fixes: bd3d25b07342 ("arm64: dts: marvell: armada-37xx: link USB hosts with their PHYs") Cc: # 5.1+: ea17a0f153af: phy: marvell: comphy: Convert internal SMCC firmware return codes to errno Cc: # 5.1+: f768e718911e: usb: host: xhci-plat: add priv quirk for skip PHY initialization --- Changes in v2: * Drop dependency on patch "usb: host: xhci-plat: fix support for XHCI_SKIP_PHY_INIT quirk" * Introduce a new .plat_setup callback for xhci-plat drivers in this patch (to simplify backporting and review) * Move implementation from xhci_mvebu_mbus_init_quirk() into xhci_mvebu_a3700_plat_setup() --- drivers/usb/host/xhci-mvebu.c | 42 +++ drivers/usb/host/xhci-mvebu.h | 6 + drivers/usb/host/xhci-plat.c | 20 - drivers/usb/host/xhci-plat.h | 1 + 4 files changed, 68 insertions(+), 1 deletion(-) diff --git a/drivers/usb/host/xhci-mvebu.c b/drivers/usb/host/xhci-mvebu.c index 60651a50770f..8ca1a235d164 100644 --- a/drivers/usb/host/xhci-mvebu.c +++ b/drivers/usb/host/xhci-mvebu.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -74,6 +75,47 @@ int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd) return 0; } +int xhci_mvebu_a3700_plat_setup(struct usb_hcd *hcd) +{ + struct xhci_hcd *xhci = hcd_to_xhci(hcd); + struct device *dev = hcd->self.controller; + struct phy *phy; + int ret; + + /* Old bindings miss the PHY handle */ + phy = of_phy_get(dev->of_node, "usb3-phy"); + if (IS_ERR(phy) && PTR_ERR(phy) == -EPROBE_DEFER) + return -EPROBE_DEFER; + else if (IS_ERR(phy)) + goto phy_out; + + ret = phy_init(phy); + if (ret) + goto phy_put; + + ret = phy_set_mode(phy, PHY_MODE_USB_HOST_SS); + if (ret) + goto phy_exit; + + ret = phy_power_on(phy); + if (ret == -EOPNOTSUPP) { + /* Skip initializatin of XHCI PHY when it is unsupported by firmware */ + dev_warn(dev, "PHY unsupported by firmware\n"); + xhci->quirks |= XHCI_SKIP_PHY_INIT; + } + if (ret) + goto phy_exit; + + phy_power_off(phy); +phy_exit: + phy_exit(phy); +phy_put: + of_phy_put(phy); +phy_out: + + return 0; +} + int xhci_mvebu_a3700_init_quirk(struct usb_hcd *hcd) { struct xhci_hcd *xhci = hcd_to_xhci(hcd); diff --git a/drivers/usb/host/xhci-mvebu.h b/drivers/usb/host/xhci-mvebu.h index 3be021793cc8..01bf3fcb3eca 100644 --- a/drivers/usb/host/xhci-mvebu.h +++ b/drivers/usb/host/xhci-mvebu.h @@ -12,6 +12,7 @@ struct usb_hcd; #if IS_ENABLED(CONFIG_USB_XHCI_MVEBU) int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd); +int xhci_mvebu_a3700_plat_setup(struct usb_hcd *hcd); int xhci_mvebu_a3700_init_quirk(struct usb_hcd *hcd); #else static inline int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd) @@ -19,6 +20,11 @@ static inline int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd) return 0; } +static inline int xhci_mvebu_a3700_pla
Re: [PATCH] usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720
On Saturday 30 January 2021 17:31:41 Tomasz Maciej Nowak wrote: > W dniu 23.12.2020 o 17:24, Pali Rohár pisze: > > Older ATF does not provide SMC call for USB 3.0 phy power on functionality > > and therefore initialization of xhci-hcd is failing when older version of > > ATF is used. In this case phy_power_on() function returns -EOPNOTSUPP. > > > > [3.108467] mvebu-a3700-comphy d0018300.phy: unsupported SMC call, try > > updating your firmware > > [3.117250] phy phy-d0018300.phy.0: phy poweron failed --> -95 > > [3.123465] xhci-hcd: probe of d0058000.usb failed with error -95 > > > > This patch calls phy_power_on() in xhci_mvebu_a3700_init_quirk() function > > and in case it returns -EOPNOTSUPP then XHCI_SKIP_PHY_INIT quirk is set to > > instruct xhci-plat to skip PHY initialization. > > > > This patch fixes above failure by ignoring 'not supported' error in > > aardvark driver. In this case it is expected that phy is already power on. > > > > It fixes initialization of xhci-hcd on Espressobin boards where is older > > Marvell's Arm Trusted Firmware without SMC call for USB 3.0 phy power. > > > > This is regression introduced in commit bd3d25b07342 ("arm64: dts: marvell: > > armada-37xx: link USB hosts with their PHYs") where USB 3.0 phy was defined > > and therefore xhci-hcd on Espressobin with older ATF started failing. > > > > Fixes: bd3d25b07342 ("arm64: dts: marvell: armada-37xx: link USB hosts with > > their PHYs") > > Signed-off-by: Pali Rohár > > Cc: # 5.1+: ea17a0f153af: phy: marvell: comphy: > > Convert internal SMCC firmware return codes to errno > > Cc: # 5.1+: f768e718911e: usb: host: xhci-plat: > > add priv quirk for skip PHY initialization > > > > --- > > > > When applying this patch, please include additional line > > > > Cc: # 5.1+: : usb: host: xhci-plat: fix > > support for XHCI_SKIP_PHY_INIT quirk > > > > with correct COMMIT_ID of mentioned patch which is available in the thread: > > Hi, > and sorry for late reply, but that might be good reminder for maintainers. > Anyway I tested this patch in conjunction with v2 from this topic: > > https://lore.kernel.org/lkml/20201221150903.26630-1-p...@kernel.org/T/#u > The USB ports work flawlessly with older ATF, so: > > Tested-by: Tomasz Maciej Nowak Thank you for testing. But as was mentioned in discussion in above patch, I have to rework both patches. I have new changes prepared but have not had a chance to test new changes yet. If you want I can send you a new version of those untested patches. > At OpenWrt we are using patch which removes Comphy assignments from nodes > in dts, but that is sub optimal, since that "discriminates" users with > updated ATF. I would prefer this patch instead of what we are doing now > in OpenWrt. > > Thanks. > > > > > As mentioned patch is required for change in this patch to work. Above > > mentioned patch is prerequisite for this patch and therefore needs to be > > reviewed and applied prior this patch. > > > > Note that same issue as in this USB 3.0 PHY patch was already resolved and > > applied also for SATA PHY and PCIe PHY on A3720 SOC in following commits: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=45aefe3d2251e4e229d7662052739f96ad1d08d9 > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b0c6ae0f8948a2be6bf4e8b4bbab9ca1343289b6 > > > > And these commits were also backported to stable kernel versions (where > > were affected commits which broke drivers initialization). > > --- > > drivers/usb/host/xhci-mvebu.c | 35 +++ > > 1 file changed, 35 insertions(+) > > > > diff --git a/drivers/usb/host/xhci-mvebu.c b/drivers/usb/host/xhci-mvebu.c > > index 60651a50770f..ec4f6d6e44cf 100644 > > --- a/drivers/usb/host/xhci-mvebu.c > > +++ b/drivers/usb/host/xhci-mvebu.c > > @@ -8,6 +8,7 @@ > > #include > > #include > > #include > > +#include > > > > #include > > #include > > @@ -77,9 +78,43 @@ int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd) > > int xhci_mvebu_a3700_init_quirk(struct usb_hcd *hcd) > > { > > struct xhci_hcd *xhci = hcd_to_xhci(hcd); > > + struct device *dev = hcd->self.controller; > > + struct phy *phy; > > + int ret; > > > > /* Without reset on resume, the HC won't work at all */ > > xhci->quirks |= XHCI_RESET_ON_RESUME; > > > > + /* Old bindin
Re: [PATCH] hwmon: (dell-smm) Add XPS 15 L502X to fan control blacklist
On Monday 25 January 2021 21:21:30 Pali Rohár wrote: > On Monday 25 January 2021 12:19:38 Guenter Roeck wrote: > > On Mon, Jan 25, 2021 at 11:05:40AM +0100, Pali Rohár wrote: > > > On Saturday 23 January 2021 18:46:08 Thomas Hebb wrote: > > > > It has been reported[0] that the Dell XPS 15 L502X exhibits similar > > > > freezing behavior to the other systems[1] on this blacklist. The issue > > > > was exposed by a prior change of mine to automatically load > > > > dell_smm_hwmon on a wider set of XPS models. To fix the regression, add > > > > this model to the blacklist. > > > > > > > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=211081 > > > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=195751 > > > > > > > > Fixes: b8a13e5e8f37 ("hwmon: (dell-smm) Use one DMI match for all XPS > > > > models") > > > > Cc: sta...@vger.kernel.org > > > > Reported-by: Bob Hepple > > > > Tested-by: Bob Hepple > > > > Signed-off-by: Thomas Hebb > > > > --- > > > > > > > > drivers/hwmon/dell-smm-hwmon.c | 7 +++ > > > > 1 file changed, 7 insertions(+) > > > > > > > > diff --git a/drivers/hwmon/dell-smm-hwmon.c > > > > b/drivers/hwmon/dell-smm-hwmon.c > > > > index ec448f5f2dc3..73b9db9e3aab 100644 > > > > --- a/drivers/hwmon/dell-smm-hwmon.c > > > > +++ b/drivers/hwmon/dell-smm-hwmon.c > > > > @@ -1159,6 +1159,13 @@ static struct dmi_system_id > > > > i8k_blacklist_fan_support_dmi_table[] __initdata = { > > > > DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "XPS13 9333"), > > > > }, > > > > }, > > > > + { > > > > + .ident = "Dell XPS 15 L502X", > > > > + .matches = { > > > > + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), > > > > + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Dell System > > > > XPS L502X"), > > > > > > Hello! Are you sure that it is required to completely disable fan > > > support? And not only access to fan type label for which is different > > > blaclist i8k_blacklist_fan_type_dmi_table? > > > > > > > I'll drop this patch from my branch. Please send a Reviewed-by: or > > Acked-by: tag > > if/when I should apply it. > > Of course! We will just wait for Bob test results. Guenter, now we have all needed information, fix is really needed in this form. So you can add my: Reviewed-by: Pali Rohár