[PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running
Fixes tip commit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock spinners). Ming reported soft lockups occurring when running xfstest due to commit b3fd4f03ca0b. When doing optimistic spinning in rwsem, threads should stop spinning when the lock owner is not running. While a thread is spinning on owner, if the owner reschedules, owner->on_cpu returns false and we stop spinning. However, commit b3fd4f03ca0b essentially caused the check to get ignored because when we break out of the spin loop due to !on_cpu, we continue spinning if sem->owner != NULL. This patch fixes this by making sure we stop spinning if the owner is not running. Furthermore, just like with mutexes, refactor the code such that we don't have separate checks for owner_running(). This makes it more straightforward in terms of why we exit the spin on owner loop and we would also avoid needing to "guess" why we broke out of the loop to make this more readable. Reported-and-tested-by: Ming Lei Acked-by: Davidlohr Bueso Signed-off-by: Jason Low --- kernel/locking/rwsem-xadd.c | 31 +++ 1 files changed, 11 insertions(+), 20 deletions(-) diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c index 06e2214..3417d01 100644 --- a/kernel/locking/rwsem-xadd.c +++ b/kernel/locking/rwsem-xadd.c @@ -324,32 +324,23 @@ done: return ret; } -static inline bool owner_running(struct rw_semaphore *sem, -struct task_struct *owner) -{ - if (sem->owner != owner) - return false; - - /* -* Ensure we emit the owner->on_cpu, dereference _after_ checking -* sem->owner still matches owner, if that fails, owner might -* point to free()d memory, if it still matches, the rcu_read_lock() -* ensures the memory stays valid. 
-*/ - barrier(); - - return owner->on_cpu; -} - static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner) { long count; rcu_read_lock(); - while (owner_running(sem, owner)) { - /* abort spinning when need_resched */ - if (need_resched()) { + while (sem->owner == owner) { + /* +* Ensure we emit the owner->on_cpu, dereference _after_ +* checking sem->owner still matches owner, if that fails, +* owner might point to free()d memory, if it still matches, +* the rcu_read_lock() ensures the memory stays valid. +*/ + barrier(); + + /* abort spinning when need_resched or owner is not running */ + if (!owner->on_cpu || need_resched()) { rcu_read_unlock(); return false; } -- 1.7.2.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
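For readers following the fix, the per-iteration decision made by the reworked loop can be modelled in plain userspace C. This is only a sketch of the logic described in the changelog: `struct task`, `struct sem`, `spin_check()` and the `SPIN_*` names are hypothetical stand-ins, not kernel definitions, and the RCU protection and barrier() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and rw_semaphore. */
struct task { bool on_cpu; };
struct sem { struct task *owner; };

enum spin_step {
	SPIN_CONTINUE,		/* owner unchanged and running: keep spinning */
	SPIN_OWNER_CHANGED,	/* loop condition sem->owner == owner failed */
	SPIN_ABORT,		/* owner not running, or we must reschedule */
};

/*
 * Models one iteration of the patched loop. The owner check comes first,
 * so owner->on_cpu is only read while sem->owner still matches owner.
 */
static enum spin_step spin_check(const struct sem *sem,
				 const struct task *owner,
				 bool need_resched)
{
	assert(sem != NULL && owner != NULL);

	if (sem->owner != owner)
		return SPIN_OWNER_CHANGED;

	if (!owner->on_cpu || need_resched)
		return SPIN_ABORT;

	return SPIN_CONTINUE;
}
```

The point of the three-way result is that the caller no longer has to guess why the loop ended: an owner that went to sleep yields SPIN_ABORT directly instead of being mistaken for an owner change.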
Re: [PATCH] crypto: powerpc - move files to fix build error
> From: Kim Phillips [kim.phill...@freescale.com]
> Sent: Saturday, 7 March 2015 01:46
> To: Herbert Xu; Benjamin Herrenschmidt; Paul Mackerras; Michael Ellerman
> Cc: Markus Stockhausen; linux-cry...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: [PATCH] crypto: powerpc - move files to fix build error
>
> The current cryptodev-2.6 tree commits:
>
> d9850fc529ef ("crypto: powerpc/sha1 - kernel config")
> 50ba29aaa7b0 ("crypto: powerpc/sha1 - glue")
>
> failed to properly place files under arch/powerpc/crypto, which
> leads to build errors:
>
> make[1]: *** No rule to make target 'arch/powerpc/crypto/sha1-spe-asm.o', needed by 'arch/powerpc/crypto/sha1-ppc-spe.o'. Stop.
> make[1]: *** No rule to make target 'arch/powerpc/crypto/sha1_spe_glue.o', needed by 'arch/powerpc/crypto/sha1-ppc-spe.o'. Stop.
> Makefile:947: recipe for target 'arch/powerpc/crypto' failed
>
> Move the two sha1 spe files under crypto/, and whilst there, rename
> other powerpc crypto files with underscores to use dashes for
> consistency.

Sorry for the glitches. I did not notice that I had the files in adjacent folders and ended up adding the wrong ones to my git tree. Thanks a lot for fixing that.

Markus

This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. E-mails sent over the internet may have been written under a wrong name or been manipulated. That is why this message sent as an e-mail is not a legally binding declaration of intention.

Collogia Unternehmensberatung AG
Ubierring 11
D-50678 Köln
Executive board: Kadir Akin, Dr. Michael Höhnerbach
President of the supervisory board: Hans Kristian Langva
Registry office: district court Cologne
Register number: HRB 52 497
usb ports working only with pci=noacpi (bugzilla 94261)
All details (dmesg, lspci, etc.) are here: https://bugzilla.kernel.org/show_bug.cgi?id=94261

The machine is: LENOVO 90BX0018IX/Aptio CRB, BIOS O07KT49AUS 12/18/2014. The OS is Debian Wheezy amd64.

The issue is that the USB ports work only when the kernel is started with the "pci=noacpi" parameter. Without it, nothing plugged into the USB ports is detected. In particular, "lsusb" returns "unable to initialize libusb: -99".

The same happens with a 3.13.11 kernel (same config). I've tried newer kernels (3.18.8, 3.19) and they show the same issue, with the difference that with "pci=noacpi" they don't start at all.
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 13:54 +0800, Ming Lei wrote:
> On Sat, Mar 7, 2015 at 12:31 PM, Jason Low wrote:
> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote:
> > Cc: Ming Lei
> > Cc: Davidlohr Bueso
> > Signed-off-by: Jason Low
>
> Reported-and-tested-by: Ming Lei

Thanks!

> > static noinline
> > bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
> > {
> > 	long count;
> >
> > 	rcu_read_lock();
> > -	while (owner_running(sem, owner)) {
> > -		/* abort spinning when need_resched */
> > -		if (need_resched()) {
> > +	while (sem->owner == owner) {
> > +		/*
> > +		 * Ensure we emit the owner->on_cpu, dereference _after_
> > +		 * checking sem->owner still matches owner, if that fails,
> > +		 * owner might point to free()d memory, if it still matches,
> > +		 * the rcu_read_lock() ensures the memory stays valid.
> > +		 */
> > +		barrier();
> > +
> > +		/* abort spinning when need_resched or owner is not running */
> > +		if (!owner->on_cpu || need_resched()) {
>
> BTW, could the need_resched() be handled in loop of
> rwsem_optimistic_spin() directly? Then code may get
> simplified a bit.

We still need the need_resched() check here: if the thread needs to reschedule, it should stop spinning for the lock immediately. Otherwise, it could potentially spin for a long time before it checks whether it needs to reschedule.
Re: softlockups in multi_cpu_stop
On Fri, 2015-03-06 at 20:44 -0800, Davidlohr Bueso wrote:
> On Fri, 2015-03-06 at 20:31 -0800, Jason Low wrote:
> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote:
> >
> > Just in case, here's the updated patch which addresses Linus's comments
> > and with a changelog.
> >
> > Note: The changelog says that it fixes (locking/rwsem: Avoid deceiving
> > lock spinners), though I still haven't seen full confirmation that it
> > addresses all of the lockup reports.
> >
> > --
> > Subject: [PATCH] rwsem: Avoid spinning when owner is not running
> >
> > Fixes tip commit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock
> > spinners).
> >
> > When doing optimistic spinning in rwsem, threads should stop spinning when
> > the lock owner is not running. While a thread is spinning on owner, if
> > the owner reschedules, owner->on_cpu returns false and we stop spinning.
> >
> > However, commit b3fd4f03ca0b essentially caused the check to get ignored
> > because when we break out of the spin loop due to !on_cpu, we continue
> > spinning if sem->owner != NULL.
>
> I would mention the actual effects of the bug, either just a "lockup"
> and/or a fragment of the trace.

Right, we should mention the lockup in the changelog.

> > Cc: Ming Lei
> > Cc: Davidlohr Bueso
>
> Acked-by: Davidlohr Bueso

Thanks!
[PATCH] net/macb: Update DT bindings documentation
Add missing "cdns,at91sam9260-macb", "atmel,sama5d3-gem" and "atmel,sama5d4-gem" compatible strings.

Signed-off-by: Boris Brezillon
Reviewed-by: Alexandre Belloni
Acked-by: Nicolas Ferre
---
 Documentation/devicetree/bindings/net/macb.txt | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/macb.txt b/Documentation/devicetree/bindings/net/macb.txt
index aaa6964..ba19d67 100644
--- a/Documentation/devicetree/bindings/net/macb.txt
+++ b/Documentation/devicetree/bindings/net/macb.txt
@@ -2,10 +2,13 @@
 
 Required properties:
 - compatible: Should be "cdns,[<chip>-]{macb|gem}"
-  Use "cdns,at91sam9260-macb" Atmel at91sam9260 and at91sam9263 SoCs.
+  Use "cdns,at91sam9260-macb" for Atmel at91sam9 SoCs or the 10/100Mbit IP
+  available on sama5d3 SoCs.
   Use "cdns,at32ap7000-macb" for other 10/100 usage or use the generic form: "cdns,macb".
   Use "cdns,pc302-gem" for Picochip picoXcell pc302 and later devices based on
   the Cadence GEM, or the generic form: "cdns,gem".
+  Use "cdns,sama5d3-gem" for the Gigabit IP available on Atmel sama5d3 SoCs.
+  Use "cdns,sama5d4-gem" for the Gigabit IP available on Atmel sama5d4 SoCs.
 - reg: Address and length of the register set for the device
 - interrupts: Should contain macb interrupt
 - phy-mode: See ethernet.txt file in the same directory.
-- 
1.9.1
Re: [PATCH v3 0/2] net/macb: merge at91_ether driver into macb driver
Hello Dave,

On Fri, 06 Mar 2015 15:18:30 -0500 (EST) David Miller wrote:
> From: Boris Brezillon
> Date: Fri, 6 Mar 2015 11:48:39 +0100
>
> > Changes since v2:
> > - rebase after changed brought by [1]
>
> Ugh, actually I'm tossing all of your changes. Please do not make
> complex dependencies like this.

Okay, I'll remember that next time I submit patches to netdev. I've sent a new series integrating the patch this series depends on. Let me know if there's something wrong in this one.

> And furthermore, don't reference other sets of changes via lkml
> postings. Reference them by where they are in patchwork which is
> the definitive location for networking patch submissions.

Noted.

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
[PATCH v4 1/4] ARM: at91/dt: fix macb compatible strings
Some at91 SoCs embed a 10/100 Mbit Ethernet IP, that is based on the at91sam9260 SoC. Fix at91 DTs accordingly. Signed-off-by: Boris Brezillon Reviewed-by: Alexandre Belloni --- arch/arm/boot/dts/at91sam9260.dtsi | 2 +- arch/arm/boot/dts/at91sam9263.dtsi | 2 +- arch/arm/boot/dts/at91sam9g45.dtsi | 2 +- arch/arm/boot/dts/at91sam9x5_macb0.dtsi | 2 +- arch/arm/boot/dts/at91sam9x5_macb1.dtsi | 2 +- arch/arm/boot/dts/sama5d3_emac.dtsi | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/arm/boot/dts/at91sam9260.dtsi b/arch/arm/boot/dts/at91sam9260.dtsi index fff0ee6..9f7c737 100644 --- a/arch/arm/boot/dts/at91sam9260.dtsi +++ b/arch/arm/boot/dts/at91sam9260.dtsi @@ -842,7 +842,7 @@ }; macb0: ethernet@fffc4000 { - compatible = "cdns,at32ap7000-macb", "cdns,macb"; + compatible = "cdns,at91sam9260-macb", "cdns,macb"; reg = <0xfffc4000 0x100>; interrupts = <21 IRQ_TYPE_LEVEL_HIGH 3>; pinctrl-names = "default"; diff --git a/arch/arm/boot/dts/at91sam9263.dtsi b/arch/arm/boot/dts/at91sam9263.dtsi index 1f67bb4..340179e 100644 --- a/arch/arm/boot/dts/at91sam9263.dtsi +++ b/arch/arm/boot/dts/at91sam9263.dtsi @@ -845,7 +845,7 @@ }; macb0: ethernet@fffbc000 { - compatible = "cdns,at32ap7000-macb", "cdns,macb"; + compatible = "cdns,at91sam9260-macb", "cdns,macb"; reg = <0xfffbc000 0x100>; interrupts = <21 IRQ_TYPE_LEVEL_HIGH 3>; pinctrl-names = "default"; diff --git a/arch/arm/boot/dts/at91sam9g45.dtsi b/arch/arm/boot/dts/at91sam9g45.dtsi index ee80aa9..586eab7 100644 --- a/arch/arm/boot/dts/at91sam9g45.dtsi +++ b/arch/arm/boot/dts/at91sam9g45.dtsi @@ -956,7 +956,7 @@ }; macb0: ethernet@fffbc000 { - compatible = "cdns,at32ap7000-macb", "cdns,macb"; + compatible = "cdns,at91sam9260-macb", "cdns,macb"; reg = <0xfffbc000 0x100>; interrupts = <25 IRQ_TYPE_LEVEL_HIGH 3>; pinctrl-names = "default"; diff --git a/arch/arm/boot/dts/at91sam9x5_macb0.dtsi b/arch/arm/boot/dts/at91sam9x5_macb0.dtsi index 57e89d1..73d7e30 100644 --- a/arch/arm/boot/dts/at91sam9x5_macb0.dtsi 
+++ b/arch/arm/boot/dts/at91sam9x5_macb0.dtsi @@ -53,7 +53,7 @@ }; macb0: ethernet@f802c000 { - compatible = "cdns,at32ap7000-macb", "cdns,macb"; + compatible = "cdns,at91sam9260-macb", "cdns,macb"; reg = <0xf802c000 0x100>; interrupts = <24 IRQ_TYPE_LEVEL_HIGH 3>; pinctrl-names = "default"; diff --git a/arch/arm/boot/dts/at91sam9x5_macb1.dtsi b/arch/arm/boot/dts/at91sam9x5_macb1.dtsi index 663676c..d81980c 100644 --- a/arch/arm/boot/dts/at91sam9x5_macb1.dtsi +++ b/arch/arm/boot/dts/at91sam9x5_macb1.dtsi @@ -41,7 +41,7 @@ }; macb1: ethernet@f803 { - compatible = "cdns,at32ap7000-macb", "cdns,macb"; + compatible = "cdns,at91sam9260-macb", "cdns,macb"; reg = <0xf803 0x100>; interrupts = <27 IRQ_TYPE_LEVEL_HIGH 3>; pinctrl-names = "default"; diff --git a/arch/arm/boot/dts/sama5d3_emac.dtsi b/arch/arm/boot/dts/sama5d3_emac.dtsi index fe2af92..b4544cf 100644 --- a/arch/arm/boot/dts/sama5d3_emac.dtsi +++ b/arch/arm/boot/dts/sama5d3_emac.dtsi @@ -41,7 +41,7 @@ }; macb1: ethernet@f802c000 { - compatible = "cdns,at32ap7000-macb", "cdns,macb"; + compatible = "cdns,at91sam9260-macb", "cdns,macb"; reg = <0xf802c000 0x100>; interrupts = <35 IRQ_TYPE_LEVEL_HIGH 3>; pinctrl-names = "default"; -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v4 0/4] net/macb: merge at91_ether driver into macb driver
Hello,

The rm9200 boards use the dedicated at91_ether driver instead of the regular macb driver. Both the macb and at91_ether drivers can be compiled as separate modules. Since the at91_ether driver uses code from the macb driver, at91_ether.ko depends on macb.ko.

However, the macb.ko module always fails to load on rm9200 boards: the macb_probe() function expects a hclk clock, which doesn't exist on rm9200. The at91_ether.ko module then can't be loaded either, due to unresolved dependencies.

This series of patches fixes this issue by merging at91_ether into macb.

Patch 1 fixes a problem that might happen when enabling ARM multi-platform support.

Best Regards,

Boris

Changes since v3:
- move "net: macb: remove #if defined(CONFIG_ARCH_AT91) sections" patch into this series to avoid dependency on other patch series.

Changes since v2:
- rebase after changes brought by commit "net: macb: remove #if defined(CONFIG_ARCH_AT91) sections"

Changes since v1:
- rework probe functions to share common probing logic

Boris Brezillon (2):
  ARM: at91/dt: fix macb compatible strings
  net: macb: remove #if defined(CONFIG_ARCH_AT91) sections

Cyrille Pitchen (2):
  net/macb: unify clock management
  net/macb: merge at91_ether driver into macb driver

 arch/arm/boot/dts/at91sam9260.dtsi        |   2 +-
 arch/arm/boot/dts/at91sam9263.dtsi        |   2 +-
 arch/arm/boot/dts/at91sam9g45.dtsi        |   2 +-
 arch/arm/boot/dts/at91sam9x5_macb0.dtsi   |   2 +-
 arch/arm/boot/dts/at91sam9x5_macb1.dtsi   |   2 +-
 arch/arm/boot/dts/sama5d3_emac.dtsi       |   2 +-
 drivers/net/ethernet/cadence/Kconfig      |   8 -
 drivers/net/ethernet/cadence/Makefile     |   1 -
 drivers/net/ethernet/cadence/at91_ether.c | 481 --
 drivers/net/ethernet/cadence/macb.c       | 662 ++
 drivers/net/ethernet/cadence/macb.h       |  12 +-
 11 files changed, 503 insertions(+), 673 deletions(-)
 delete mode 100644 drivers/net/ethernet/cadence/at91_ether.c
-- 
1.9.1
[PATCH v4 2/4] net: macb: remove #if defined(CONFIG_ARCH_AT91) sections
With multi platform support those sections could lead to unexpected behavior if both ARCH_AT91 and another ARM SoC using the MACB IP are selected.

Add two new capabilities to encode the default MII mode and the presence of a CLKEN bit in USRIO register. Then define the appropriate config for IPs embedded in at91 SoCs.

Signed-off-by: Boris Brezillon
Reviewed-by: Alexandre Belloni
---
 drivers/net/ethernet/cadence/macb.c | 32 +++++++++++++++++---------------
 drivers/net/ethernet/cadence/macb.h |  2 ++
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index ad76b8e..86e915f 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -2113,6 +2113,10 @@ static const struct net_device_ops macb_netdev_ops = {
 };
 
 #if defined(CONFIG_OF)
+static struct macb_config at91sam9260_config = {
+	.caps = MACB_CAPS_USRIO_HAS_CLKEN | MACB_CAPS_USRIO_DEFAULT_IS_MII,
+};
+
 static struct macb_config pc302gem_config = {
 	.caps = MACB_CAPS_SG_DISABLED | MACB_CAPS_GIGABIT_MODE_AVAILABLE,
 	.dma_burst_length = 16,
@@ -2130,7 +2134,7 @@ static struct macb_config sama5d4_config = {
 
 static const struct of_device_id macb_dt_ids[] = {
 	{ .compatible = "cdns,at32ap7000-macb" },
-	{ .compatible = "cdns,at91sam9260-macb" },
+	{ .compatible = "cdns,at91sam9260-macb", .data = &at91sam9260_config },
 	{ .compatible = "cdns,macb" },
 	{ .compatible = "cdns,pc302-gem", .data = &pc302gem_config },
 	{ .compatible = "cdns,gem", .data = &pc302gem_config },
@@ -2388,21 +2392,19 @@ static int macb_probe(struct platform_device *pdev)
 		bp->phy_interface = err;
 	}
 
+	config = 0;
 	if (bp->phy_interface == PHY_INTERFACE_MODE_RGMII)
-		macb_or_gem_writel(bp, USRIO, GEM_BIT(RGMII));
-	else if (bp->phy_interface == PHY_INTERFACE_MODE_RMII)
-#if defined(CONFIG_ARCH_AT91)
-		macb_or_gem_writel(bp, USRIO, (MACB_BIT(RMII) |
-					       MACB_BIT(CLKEN)));
-#else
-		macb_or_gem_writel(bp, USRIO, 0);
-#endif
-	else
-#if defined(CONFIG_ARCH_AT91)
-		macb_or_gem_writel(bp, USRIO, MACB_BIT(CLKEN));
-#else
-		macb_or_gem_writel(bp, USRIO, MACB_BIT(MII));
-#endif
+		config = GEM_BIT(RGMII);
+	else if (bp->phy_interface == PHY_INTERFACE_MODE_RMII &&
+		 (bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
+		config = MACB_BIT(RMII);
+	else if (!(bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
+		config = MACB_BIT(MII);
+
+	if (bp->caps & MACB_CAPS_USRIO_HAS_CLKEN)
+		config |= MACB_BIT(CLKEN);
+
+	macb_or_gem_writel(bp, USRIO, config);
 
 	err = register_netdev(dev);
 	if (err) {
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 31dc080..efe0247 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -389,6 +389,8 @@
 
 /* Capability mask bits */
 #define MACB_CAPS_ISR_CLEAR_ON_WRITE		0x00000001
+#define MACB_CAPS_USRIO_HAS_CLKEN		0x00000002
+#define MACB_CAPS_USRIO_DEFAULT_IS_MII		0x00000004
 #define MACB_CAPS_FIFO_MODE			0x10000000
 #define MACB_CAPS_GIGABIT_MODE_AVAILABLE	0x20000000
 #define MACB_CAPS_SG_DISABLED			0x40000000
-- 
1.9.1
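As a reading aid, the capability-driven USRIO selection in the macb_probe() hunk can be pulled out into a pure function and exercised in userspace. This is a sketch under assumptions: the two MACB_CAPS_* values are the ones added by the patch, but the USRIO_* bit values, `enum phy_mode` and `usrio_config()` are hypothetical placeholders and do not reflect the real register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Capability bits as added to macb.h by the patch. */
#define MACB_CAPS_USRIO_HAS_CLKEN	0x00000002
#define MACB_CAPS_USRIO_DEFAULT_IS_MII	0x00000004

/* Placeholder register bits: illustrative only, NOT the hardware layout. */
#define USRIO_MII	0x1u
#define USRIO_RMII	0x2u
#define USRIO_RGMII	0x4u
#define USRIO_CLKEN	0x8u

/* Hypothetical stand-in for the PHY_INTERFACE_MODE_* values. */
enum phy_mode { PHY_MII, PHY_RMII, PHY_RGMII };

/* Mirrors the if/else chain of the patch as posted. */
static uint32_t usrio_config(enum phy_mode mode, uint32_t caps)
{
	uint32_t config = 0;

	if (mode == PHY_RGMII)
		config = USRIO_RGMII;
	else if (mode == PHY_RMII && (caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
		config = USRIO_RMII;
	else if (!(caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
		config = USRIO_MII;

	/* The clock-enable bit is orthogonal to the interface mode. */
	if (caps & MACB_CAPS_USRIO_HAS_CLKEN)
		config |= USRIO_CLKEN;

	return config;
}
```

With the at91sam9260 capabilities (both bits set), RMII yields the RMII and CLKEN bits and MII yields CLKEN alone, which matches what the removed CONFIG_ARCH_AT91 branches wrote.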
[PATCH 2/2] net: dsa: mv88e6352: Add support for EEE
Enable EEE support for MV88E6352.

Signed-off-by: Guenter Roeck
---
Changes since RFT:
- Additional testing; no code changes.

 drivers/net/dsa/mv88e6352.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 1ebd8f9..7bc5998 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -717,6 +717,8 @@ struct dsa_switch_driver mv88e6352_switch_driver = {
 	.get_strings		= mv88e6352_get_strings,
 	.get_ethtool_stats	= mv88e6352_get_ethtool_stats,
 	.get_sset_count		= mv88e6352_get_sset_count,
+	.set_eee		= mv88e6xxx_set_eee,
+	.get_eee		= mv88e6xxx_get_eee,
 #ifdef CONFIG_NET_DSA_HWMON
 	.get_temp		= mv88e6352_get_temp,
 	.get_temp_limit		= mv88e6352_get_temp_limit,
-- 
2.1.0
[PATCH 1/2] net: dsa: mv88e6xxx: Add EEE support
EEE configuration is similar for the various MV88E6xxx chips. Add generic support for it. Signed-off-by: Guenter Roeck Reviewed-by: Florian Fainelli --- Changes since RFT: - Additional testing; no code changes - Dropped comment about phy_init_eee drivers/net/dsa/mv88e6xxx.c | 51 + drivers/net/dsa/mv88e6xxx.h | 3 +++ 2 files changed, 54 insertions(+) diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index a83ace0..c18ffc9 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -649,6 +649,57 @@ int mv88e6xxx_phy_write_indirect(struct dsa_switch *ds, int addr, int regnum, return mv88e6xxx_phy_wait(ds); } +int mv88e6xxx_get_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e) +{ + int reg; + + reg = mv88e6xxx_phy_read_indirect(ds, port, 16); + if (reg < 0) + return -EOPNOTSUPP; + + e->eee_enabled = !!(reg & 0x0200); + e->tx_lpi_enabled = !!(reg & 0x0100); + + reg = REG_READ(REG_PORT(port), 0); + e->eee_active = !!(reg & 0x0040); + + return 0; +} + +static int mv88e6xxx_eee_enable_set(struct dsa_switch *ds, int port, + bool eee_enabled, bool tx_lpi_enabled) +{ + int reg, nreg; + + reg = mv88e6xxx_phy_read_indirect(ds, port, 16); + if (reg < 0) + return reg; + + nreg = reg & ~0x0300; + if (eee_enabled) + nreg |= 0x0200; + if (tx_lpi_enabled) + nreg |= 0x0100; + + if (nreg != reg) + return mv88e6xxx_phy_write_indirect(ds, port, 16, nreg); + + return 0; +} + +int mv88e6xxx_set_eee(struct dsa_switch *ds, int port, + struct phy_device *phydev, struct ethtool_eee *e) +{ + int ret; + + ret = mv88e6xxx_eee_enable_set(ds, port, e->eee_enabled, + e->tx_lpi_enabled); + if (ret) + return -EOPNOTSUPP; + + return 0; +} + static int __init mv88e6xxx_init(void) { #if IS_ENABLED(CONFIG_NET_DSA_MV88E6131) diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h index 7294227..5fd42ce 100644 --- a/drivers/net/dsa/mv88e6xxx.h +++ b/drivers/net/dsa/mv88e6xxx.h @@ -88,6 +88,9 @@ int mv88e6xxx_eeprom_busy_wait(struct dsa_switch 
*ds); int mv88e6xxx_phy_read_indirect(struct dsa_switch *ds, int addr, int regnum); int mv88e6xxx_phy_write_indirect(struct dsa_switch *ds, int addr, int regnum, u16 val); +int mv88e6xxx_get_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e); +int mv88e6xxx_set_eee(struct dsa_switch *ds, int port, + struct phy_device *phydev, struct ethtool_eee *e); extern struct dsa_switch_driver mv88e6131_switch_driver; extern struct dsa_switch_driver mv88e6123_61_65_switch_driver; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
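The read-modify-write done by mv88e6xxx_eee_enable_set() is easy to check in isolation. The sketch below reuses the two masks from the patch (0x0200 and 0x0100 in PHY register 16); the macro names and the helper `eee_update_reg()` are hypothetical labels added here for readability, since the patch itself uses the bare constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks taken from the patch; the names are illustrative only. */
#define EEE_ENABLE_BIT		0x0200
#define TX_LPI_ENABLE_BIT	0x0100

/*
 * Computes the new register value: clear both control bits, then set
 * the requested ones. All other bits of the register are preserved.
 */
static uint16_t eee_update_reg(uint16_t reg, bool eee_enabled,
			       bool tx_lpi_enabled)
{
	uint16_t nreg = reg & (uint16_t)~(EEE_ENABLE_BIT | TX_LPI_ENABLE_BIT);

	if (eee_enabled)
		nreg |= EEE_ENABLE_BIT;
	if (tx_lpi_enabled)
		nreg |= TX_LPI_ENABLE_BIT;

	return nreg;
}
```

As in the patch, a caller would compare the result with the value it read and skip the indirect PHY write when nothing changed.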
[PATCH v4 4/4] net/macb: merge at91_ether driver into macb driver
From: Cyrille Pitchen macb and at91_ether drivers can be compiled as modules, but the at91_ether driver use some functions and variables defined in the macb one, thus creating a dependency on the macb driver. Since these drivers are sharing the same logic we can easily merge at91_ether into macb. In order to factorize common probing logic we've added an ->init() function to struct macb_config (the structure associated with the compatible string), and moved macb specific init code from macb_probe to macb_init. Signed-off-by: Cyrille Pitchen Signed-off-by: Boris Brezillon Tested-by: Alexandre Belloni --- drivers/net/ethernet/cadence/Kconfig | 8 - drivers/net/ethernet/cadence/Makefile | 1 - drivers/net/ethernet/cadence/at91_ether.c | 481 -- drivers/net/ethernet/cadence/macb.c | 639 ++ drivers/net/ethernet/cadence/macb.h | 10 +- 5 files changed, 484 insertions(+), 655 deletions(-) delete mode 100644 drivers/net/ethernet/cadence/at91_ether.c diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig index 321d2ad..fb8d09b 100644 --- a/drivers/net/ethernet/cadence/Kconfig +++ b/drivers/net/ethernet/cadence/Kconfig @@ -20,14 +20,6 @@ config NET_CADENCE if NET_CADENCE -config ARM_AT91_ETHER - tristate "AT91RM9200 Ethernet support" - depends on HAS_DMA && (ARCH_AT91 || COMPILE_TEST) - select MACB - ---help--- - If you wish to compile a kernel for the AT91RM9200 and enable - ethernet support, then you should always answer Y to this. - config MACB tristate "Cadence MACB/GEM support" depends on HAS_DMA && (PLATFORM_AT32AP || ARCH_AT91 || ARCH_PICOXCELL || ARCH_ZYNQ || MICROBLAZE || COMPILE_TEST) diff --git a/drivers/net/ethernet/cadence/Makefile b/drivers/net/ethernet/cadence/Makefile index 9068b83..91f79b1 100644 --- a/drivers/net/ethernet/cadence/Makefile +++ b/drivers/net/ethernet/cadence/Makefile @@ -2,5 +2,4 @@ # Makefile for the Atmel network device drivers. 
# -obj-$(CONFIG_ARM_AT91_ETHER) += at91_ether.o obj-$(CONFIG_MACB) += macb.o diff --git a/drivers/net/ethernet/cadence/at91_ether.c b/drivers/net/ethernet/cadence/at91_ether.c deleted file mode 100644 index 7ef55f5..000 --- a/drivers/net/ethernet/cadence/at91_ether.c +++ /dev/null @@ -1,481 +0,0 @@ -/* - * Ethernet driver for the Atmel AT91RM9200 (Thunder) - * - * Copyright (C) 2003 SAN People (Pty) Ltd - * - * Based on an earlier Atmel EMAC macrocell driver by Atmel and Lineo Inc. - * Initial version by Rick Bronson 01/11/2003 - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version - * 2 of the License, or (at your option) any later version. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "macb.h" - -/* 1518 rounded up */ -#define MAX_RBUFF_SZ 0x600 -/* max number of receive buffers */ -#define MAX_RX_DESCR 9 - -/* Initialize and start the Receiver and Transmit subsystems */ -static int at91ether_start(struct net_device *dev) -{ - struct macb *lp = netdev_priv(dev); - dma_addr_t addr; - u32 ctl; - int i; - - lp->rx_ring = dma_alloc_coherent(>pdev->dev, -(MAX_RX_DESCR * - sizeof(struct macb_dma_desc)), ->rx_ring_dma, GFP_KERNEL); - if (!lp->rx_ring) - return -ENOMEM; - - lp->rx_buffers = dma_alloc_coherent(>pdev->dev, - MAX_RX_DESCR * MAX_RBUFF_SZ, - >rx_buffers_dma, GFP_KERNEL); - if (!lp->rx_buffers) { - dma_free_coherent(>pdev->dev, - MAX_RX_DESCR * sizeof(struct macb_dma_desc), - lp->rx_ring, lp->rx_ring_dma); - lp->rx_ring = NULL; - return -ENOMEM; - } - - addr = lp->rx_buffers_dma; - for (i = 0; i < MAX_RX_DESCR; i++) { - lp->rx_ring[i].addr = addr; - lp->rx_ring[i].ctrl = 0; - addr += MAX_RBUFF_SZ; - } - - /* Set the Wrap bit on the last descriptor */ - lp->rx_ring[MAX_RX_DESCR - 
1].addr |= MACB_BIT(RX_WRAP); - - /* Reset buffer index */ - lp->rx_tail = 0; - - /* Program address of descriptor list in Rx Buffer Queue register */ - macb_writel(lp, RBQP, lp->rx_ring_dma); - - /* Enable Receive and Transmit */ - ctl = macb_readl(lp, NCR); - macb_writel(lp, NCR, ctl | MACB_BIT(RE) | MACB_BIT(TE)); - - return 0; -} - -/* Open the ethernet interface */ -static int
[PATCH v4 3/4] net/macb: unify clock management
From: Cyrille Pitchen

Most of the functions from the Common Clk Framework handle a NULL pointer as input argument. Since the TX clock is optional, we now set tx_clk to NULL instead of ERR_PTR(-ENOENT) when this clock is not available. This simplifies the clock management and avoids the need to test the tx_clk value.

Signed-off-by: Cyrille Pitchen
Acked-by: Boris Brezillon
Acked-by: Alexandre Belloni
---
 drivers/net/ethernet/cadence/macb.c | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 86e915f..d8748c5 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -213,6 +213,9 @@ static void macb_set_tx_clk(struct clk *clk, int speed, struct net_device *dev)
 {
 	long ferr, rate, rate_rounded;
 
+	if (!clk)
+		return;
+
 	switch (speed) {
 	case SPEED_10:
 		rate = 2500000;
@@ -292,8 +295,7 @@ static void macb_handle_link_change(struct net_device *dev)
 
 	spin_unlock_irqrestore(&bp->lock, flags);
 
-	if (!IS_ERR(bp->tx_clk))
-		macb_set_tx_clk(bp->tx_clk, phydev->speed, dev);
+	macb_set_tx_clk(bp->tx_clk, phydev->speed, dev);
 
 	if (status_change) {
 		if (phydev->link) {
@@ -2244,6 +2246,8 @@ static int macb_probe(struct platform_device *pdev)
 	}
 
 	tx_clk = devm_clk_get(&pdev->dev, "tx_clk");
+	if (IS_ERR(tx_clk))
+		tx_clk = NULL;
 
 	err = clk_prepare_enable(pclk);
 	if (err) {
@@ -2257,13 +2261,10 @@ static int macb_probe(struct platform_device *pdev)
 		goto err_out_disable_pclk;
 	}
 
-	if (!IS_ERR(tx_clk)) {
-		err = clk_prepare_enable(tx_clk);
-		if (err) {
-			dev_err(&pdev->dev, "failed to enable tx_clk (%u)\n",
-				err);
-			goto err_out_disable_hclk;
-		}
+	err = clk_prepare_enable(tx_clk);
+	if (err) {
+		dev_err(&pdev->dev, "failed to enable tx_clk (%u)\n", err);
+		goto err_out_disable_hclk;
 	}
 
 	err = -ENOMEM;
@@ -2435,8 +2436,7 @@ err_out_unregister_netdev:
 err_out_free_netdev:
 	free_netdev(dev);
 err_out_disable_clocks:
-	if (!IS_ERR(tx_clk))
-		clk_disable_unprepare(tx_clk);
+	clk_disable_unprepare(tx_clk);
 err_out_disable_hclk:
 	clk_disable_unprepare(hclk);
 err_out_disable_pclk:
@@ -2460,8 +2460,7 @@ static int macb_remove(struct platform_device *pdev)
 		kfree(bp->mii_bus->irq);
 		mdiobus_free(bp->mii_bus);
 		unregister_netdev(dev);
-		if (!IS_ERR(bp->tx_clk))
-			clk_disable_unprepare(bp->tx_clk);
+		clk_disable_unprepare(bp->tx_clk);
 		clk_disable_unprepare(bp->hclk);
 		clk_disable_unprepare(bp->pclk);
 		free_netdev(dev);
@@ -2479,8 +2478,7 @@ static int __maybe_unused macb_suspend(struct device *dev)
 	netif_carrier_off(netdev);
 	netif_device_detach(netdev);
 
-	if (!IS_ERR(bp->tx_clk))
-		clk_disable_unprepare(bp->tx_clk);
+	clk_disable_unprepare(bp->tx_clk);
 	clk_disable_unprepare(bp->hclk);
 	clk_disable_unprepare(bp->pclk);
 
@@ -2495,8 +2493,7 @@ static int __maybe_unused macb_resume(struct device *dev)
 
 	clk_prepare_enable(bp->pclk);
 	clk_prepare_enable(bp->hclk);
-	if (!IS_ERR(bp->tx_clk))
-		clk_prepare_enable(bp->tx_clk);
+	clk_prepare_enable(bp->tx_clk);
 
 	netif_device_attach(netdev);
 
-- 
1.9.1
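To illustrate why storing NULL for the optional clock removes the IS_ERR() tests, here is a small userspace mock of the convention the changelog relies on. Everything in it is hypothetical: `struct clk` is a toy counter, and `clk_enable_mock()`, `clk_disable_mock()`, `struct dev_clocks`, `clocks_resume()` and `clocks_suspend()` only imitate the shape of the driver's suspend/resume paths, not the real clock API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct clk: just counts enables. */
struct clk { int enable_count; };

/* NULL is treated as "no clock": succeed without doing anything. */
static int clk_enable_mock(struct clk *clk)
{
	if (!clk)
		return 0;
	clk->enable_count++;
	return 0;
}

static void clk_disable_mock(struct clk *clk)
{
	if (!clk)
		return;
	assert(clk->enable_count > 0);
	clk->enable_count--;
}

/* tx_clk is optional and is NULL when the platform has none. */
struct dev_clocks { struct clk *pclk, *hclk, *tx_clk; };

/* No per-call check for the optional clock is needed any more. */
static int clocks_resume(struct dev_clocks *c)
{
	int err = clk_enable_mock(c->pclk);

	if (!err)
		err = clk_enable_mock(c->hclk);
	if (!err)
		err = clk_enable_mock(c->tx_clk);
	return err;
}

static void clocks_suspend(struct dev_clocks *c)
{
	clk_disable_mock(c->tx_clk);
	clk_disable_mock(c->hclk);
	clk_disable_mock(c->pclk);
}
```

The design choice is to normalise the "clock absent" case once, at probe time, so that every later call site can treat all three clocks uniformly.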
[PATCH V2 2/7] Drivers: hv: vmbus: Perform device register in the per-channel work element
This patch is a continuation of the rescind handling cleanup work. We cannot block in the global message handling work context especially if we are blocking waiting for the host to wake us up. I would like to thank Dexuan Cui for observing this problem. Signed-off-by: K. Y. Srinivasan --- - Free up the work element after processing rescind. drivers/hv/channel_mgmt.c | 143 +++- drivers/hv/connection.c |6 ++- drivers/hv/hyperv_vmbus.h |2 +- 3 files changed, 107 insertions(+), 44 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 6117891..5f8e47b 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include @@ -37,6 +38,10 @@ struct vmbus_channel_message_table_entry { void (*message_handler)(struct vmbus_channel_message_header *msg); }; +struct vmbus_rescind_work { + struct work_struct work; + struct vmbus_channel *channel; +}; /** * vmbus_prep_negotiate_resp() - Create default response for Hyper-V Negotiate message @@ -134,20 +139,6 @@ fw_error: EXPORT_SYMBOL_GPL(vmbus_prep_negotiate_resp); -static void vmbus_process_device_unregister(struct work_struct *work) -{ - struct device *dev; - struct vmbus_channel *channel = container_of(work, - struct vmbus_channel, - work); - - dev = get_device(>device_obj->device); - if (dev) { - vmbus_device_unregister(channel->device_obj); - put_device(dev); - } -} - static void vmbus_sc_creation_cb(struct work_struct *work) { struct vmbus_channel *newchannel = container_of(work, @@ -220,6 +211,40 @@ static void free_channel(struct vmbus_channel *channel) queue_work(vmbus_connection.work_queue, >work); } +static void process_rescind_fn(struct work_struct *work) +{ + struct vmbus_rescind_work *rc_work; + struct vmbus_channel *channel; + struct device *dev; + + rc_work = container_of(work, struct vmbus_rescind_work, work); + channel = rc_work->channel; + + /* +* We have already acquired a reference on the channel 
+* and so it cannot vanish underneath us. +* It is possible (while very unlikely) that we may +* get here while the processing of the initial offer +* is still not complete. Deal with this situation by +* just waiting until the channel is in the correct state. +*/ + + while (channel->work.func != release_channel) + msleep(1000); + + if (channel->device_obj) { + dev = get_device(>device_obj->device); + if (dev) { + vmbus_device_unregister(channel->device_obj); + put_device(dev); + } + } else { + hv_process_channel_removal(channel, + channel->offermsg.child_relid); + } + kfree(work); +} + static void percpu_channel_enq(void *arg) { struct vmbus_channel *channel = arg; @@ -282,6 +307,46 @@ void vmbus_free_channels(void) } } +static void vmbus_do_device_register(struct work_struct *work) +{ + struct hv_device *device_obj; + int ret; + unsigned long flags; + struct vmbus_channel *newchannel = container_of(work, +struct vmbus_channel, +work); + + ret = vmbus_device_register(newchannel->device_obj); + if (ret != 0) { + pr_err("unable to add child device object (relid %d)\n", + newchannel->offermsg.child_relid); + spin_lock_irqsave(_connection.channel_lock, flags); + list_del(>listentry); + device_obj = newchannel->device_obj; + newchannel->device_obj = NULL; + spin_unlock_irqrestore(_connection.channel_lock, flags); + + if (newchannel->target_cpu != get_cpu()) { + put_cpu(); + smp_call_function_single(newchannel->target_cpu, +percpu_channel_deq, newchannel, true); + } else { + percpu_channel_deq(newchannel); + put_cpu(); + } + + kfree(device_obj); + if (!newchannel->rescind) { + free_channel(newchannel); + return; + } + } + /* +* The next state for this channel is to be freed. +*/ + INIT_WORK(>work, release_channel); +} + /* * vmbus_process_offer - Process the offer by creating a channel/device * associated with this offer @@ -291,7 +356,6
Re: softlockups in multi_cpu_stop
On Sat, Mar 7, 2015 at 12:31 PM, Jason Low wrote: > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > > Just in case, here's the updated patch which addresses Linus's comments > and with a changelog. > > Note: The changelog says that it fixes (locking/rwsem: Avoid deceiving > lock spinners), though I still haven't seen full confirmation that it > addresses all of the lockup reports. > > -- > Subject: [PATCH] rwsem: Avoid spinning when owner is not running > > Fixes tip commmit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock spinners). > > When doing optimistic spinning in rwsem, threads should stop spinning when > the lock owner is not running. While a thread is spinning on owner, if > the owner reschedules, owner->on_cpu returns false and we stop spinning. > > However, commit b3fd4f03ca0b essentially caused the check to get ignored > because when we break out of the spin loop due to !on_cpu, we continue > spinning if sem->owner != NULL. > > This patch fixes this by making sure we stop spinning if the owner is not > running. Furthermore, just like with mutexes, refactor the code such that > we don't have separate checks for owner_running(). This makes it more > straightforward in terms of why we exit the spin on owner loop and we > would also avoid needing to "guess" why we broke out of the loop to make > this more readable. 
> > Cc: Ming Lei > Cc: Davidlohr Bueso > Signed-off-by: Jason Low Reported-and-tested-by: Ming Lei > --- > kernel/locking/rwsem-xadd.c | 31 +++ > 1 files changed, 11 insertions(+), 20 deletions(-) > > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c > index 06e2214..3417d01 100644 > --- a/kernel/locking/rwsem-xadd.c > +++ b/kernel/locking/rwsem-xadd.c > @@ -324,32 +324,23 @@ done: > return ret; > } > > -static inline bool owner_running(struct rw_semaphore *sem, > -struct task_struct *owner) > -{ > - if (sem->owner != owner) > - return false; > - > - /* > -* Ensure we emit the owner->on_cpu, dereference _after_ checking > -* sem->owner still matches owner, if that fails, owner might > -* point to free()d memory, if it still matches, the rcu_read_lock() > -* ensures the memory stays valid. > -*/ > - barrier(); > - > - return owner->on_cpu; > -} > - > static noinline > bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner) > { > long count; > > rcu_read_lock(); > - while (owner_running(sem, owner)) { > - /* abort spinning when need_resched */ > - if (need_resched()) { > + while (sem->owner == owner) { > + /* > +* Ensure we emit the owner->on_cpu, dereference _after_ > +* checking sem->owner still matches owner, if that fails, > +* owner might point to free()d memory, if it still matches, > +* the rcu_read_lock() ensures the memory stays valid. > +*/ > + barrier(); > + > + /* abort spinning when need_resched or owner is not running */ > + if (!owner->on_cpu || need_resched()) { BTW, could the need_resched() be handled in loop of rwsem_optimistic_spin() directly? Then code may get simplified a bit. Thanks, Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH V2 7/7] tools: hv: fcopy_daemon: support >2GB files for x86_32 guest
From: Dexuan Cui

Without this patch, hv_fcopy_daemon's hv_copy_data() -> pwrite() will fail
for >2GB file offset.

Signed-off-by: Alex Ng
Signed-off-by: Dexuan Cui
Cc: K. Y. Srinivasan
Signed-off-by: K. Y. Srinivasan
---
 tools/hv/Makefile |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index 99ffe61..a8ab795 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -3,7 +3,7 @@ CC = $(CROSS_COMPILE)gcc
 PTHREAD_LIBS = -lpthread
 WARNINGS = -Wall -Wextra
-CFLAGS = $(WARNINGS) -g $(PTHREAD_LIBS)
+CFLAGS = $(WARNINGS) -g $(PTHREAD_LIBS) $(shell getconf LFS_CFLAGS)
 all: hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
 %: %.c
--
1.7.4.1
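A minimal sketch of why the extra CFLAGS matter, assuming a glibc system where getconf LFS_CFLAGS typically expands to flags that include -D_FILE_OFFSET_BITS=64 on 32-bit targets. The helper name offset_fits_off_t is made up for illustration; it only checks whether a 64-bit offset survives a round trip through off_t, which is the type pwrite() takes.

```c
/* Must precede every system header; normally supplied via CFLAGS. */
#define _FILE_OFFSET_BITS 64

#include <stdint.h>
#include <sys/types.h>

/*
 * Returns 1 when the offset can be stored in off_t without losing bits.
 * With a 32-bit off_t, any offset of 2GB or more fails this check.
 */
int offset_fits_off_t(int64_t offset)
{
	return (int64_t)(off_t)offset == offset;
}
```

On a 32-bit build without the define, off_t is 32 bits wide and offset_fits_off_t(3LL << 30) would report a loss; with the define, off_t is 64 bits wide and the same offset is representable.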
[PATCH V2 1/7] Drivers: hv: vmbus: Export the vmbus_sendpacket_pagebuffer_ctl()
Export the vmbus_sendpacket_pagebuffer_ctl() interface.

Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/channel.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index da53180..e58cdb7 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -710,6 +710,7 @@ int vmbus_sendpacket_pagebuffer_ctl(struct vmbus_channel *channel,
 	return ret;
 }
+EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer_ctl);
 /*
  * vmbus_sendpacket_pagebuffer - Send a range of single-page buffer
--
1.7.4.1
[PATCH V2 3/7] Drivers: hv: hv_balloon: keep locks balanced on add_memory() failure
From: Vitaly Kuznetsov

When add_memory() fails the following BUG is observed:

[ 743.646107] hv_balloon: hot_add memory failed error is -17
[ 743.679973]
[ 743.680930] =
[ 743.680930] [ BUG: bad unlock balance detected! ]
[ 743.680930] 3.19.0-rc5_bug1131426+ #552 Not tainted
[ 743.680930] -
[ 743.680930] kworker/0:2/255 is trying to release lock (&dm_device.ha_region_mutex) at:
[ 743.680930] [] mutex_unlock+0xe/0x10
[ 743.680930] but there are no more locks to release!

This happens as we don't acquire ha_region_mutex and hot_add_req() expects
us to as it does unconditional mutex_unlock(). Acquire the lock on the error
path.

Signed-off-by: Vitaly Kuznetsov
Acked-by: Jason Wang
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/hv_balloon.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index c5bb872..f1f17c5 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -652,6 +652,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
 			}
 			has->ha_end_pfn -= HA_CHUNK;
 			has->covered_end_pfn -= processed_pfn;
+			mutex_lock(&dm_device.ha_region_mutex);
 			break;
 		}
--
1.7.4.1
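The locking contract described above can be modelled in a few lines of user-space C. This is only a sketch under assumptions: struct demo_lock is a toy lock that asserts on imbalance (it is not a kernel mutex), and hot_add_demo() is a hypothetical stand-in for the shape of hv_mem_hot_add(), where the caller holds the lock, the function drops it around a slow call, and the caller unlocks unconditionally afterwards.

```c
#include <assert.h>

/* Toy lock that traps unbalanced use; not a real mutex. */
struct demo_lock {
	int held;
};

void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);	/* an unbalanced unlock trips here */
	l->held = 0;
}

/*
 * Called with the lock held and expected to return with it held on every
 * path, because the caller unlocks unconditionally afterwards.
 */
int hot_add_demo(struct demo_lock *l, int add_result)
{
	demo_lock_release(l);		/* drop the lock around the slow call */

	if (add_result != 0) {
		demo_lock_acquire(l);	/* error path: retake before bailing out */
		return add_result;
	}

	demo_lock_acquire(l);		/* success path already retook the lock */
	return 0;
}
```

Removing the demo_lock_acquire() call on the error path reproduces the imbalance from the report: the caller's later release would hit the assertion.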
[PATCH V2 4/7] Drivers: hv: hv_balloon: don't lose memory when onlining order is not natural
From: Vitaly Kuznetsov Memory blocks can be onlined in random order. When this order is not natural some memory pages are not onlined because of the redundant check in hv_online_page(). Here is a real world scenario: 1) Host tries to hot-add the following (process_hot_add): pg_start=rg_start=0x48000, pfn_cnt=111616, rg_size=262144 2) This results in adding 4 memory blocks: [ 109.057866] init_memory_mapping: [mem 0x4800-0x4fff] [ 114.102698] init_memory_mapping: [mem 0x5000-0x57ff] [ 119.168039] init_memory_mapping: [mem 0x5800-0x5fff] [ 124.233053] init_memory_mapping: [mem 0x6000-0x67ff] The last one is incomplete but we have special has->covered_end_pfn counter to avoid onlining non-backed frames and hv_bring_pgs_online() function to bring them online later on. 3) Now we have 4 offline memory blocks: /sys/devices/system/memory/memory9-12 $ for f in /sys/devices/system/memory/memory*/state; do echo $f `cat $f`; done | grep -v onlin /sys/devices/system/memory/memory10/state offline /sys/devices/system/memory/memory11/state offline /sys/devices/system/memory/memory12/state offline /sys/devices/system/memory/memory9/state offline 4) We bring them online in non-natural order: $grep MemTotal /proc/meminfo MemTotal: 966348 kB $echo online > /sys/devices/system/memory/memory12/state && grep MemTotal /proc/meminfo MemTotal:1019596 kB $echo online > /sys/devices/system/memory/memory11/state && grep MemTotal /proc/meminfo MemTotal:1150668 kB $echo online > /sys/devices/system/memory/memory9/state && grep MemTotal /proc/meminfo MemTotal:1150668 kB As you can see memory9 block gives us zero additional memory. We can also observe a huge discrepancy between host- and guest-reported memory sizes. The root cause of the issue is the redundant pg >= covered_start_pfn check (and covered_start_pfn advancing) in hv_online_page(). 
When an upper memory block is being onlined before the lower one (memory12
and memory11 in the above case) we advance the covered_start_pfn pointer and
all memory9 pages do not pass the check. If the assumption that the host
always gives us requests in sequential order and pg_start always equals
rg_start when the first request for the new HA region is received (that's
the case in my testing) is correct, then we can get rid of covered_start_pfn,
and the pg >= start_pfn check in hv_online_page() is sufficient.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/hv_balloon.c |   14 --
 1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index f1f17c5..014256a 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -428,14 +428,13 @@ struct dm_info_msg {
  * currently hot added. We hot add in multiples of 128M
  * chunks; it is possible that we may not be able to bring
  * online all the pages in the region. The range
- * covered_start_pfn : covered_end_pfn defines the pages that can
+ * covered_end_pfn defines the pages that can
  * be brough online.
*/ struct hv_hotadd_state { struct list_head list; unsigned long start_pfn; - unsigned long covered_start_pfn; unsigned long covered_end_pfn; unsigned long ha_end_pfn; unsigned long end_pfn; @@ -679,8 +678,7 @@ static void hv_online_page(struct page *pg) list_for_each(cur, _device.ha_region_list) { has = list_entry(cur, struct hv_hotadd_state, list); - cur_start_pgp = (unsigned long) - pfn_to_page(has->covered_start_pfn); + cur_start_pgp = (unsigned long)pfn_to_page(has->start_pfn); cur_end_pgp = (unsigned long)pfn_to_page(has->covered_end_pfn); if (((unsigned long)pg >= cur_start_pgp) && @@ -692,7 +690,6 @@ static void hv_online_page(struct page *pg) __online_page_set_limits(pg); __online_page_increment_counters(pg); __online_page_free(pg); - has->covered_start_pfn++; } } } @@ -736,10 +733,9 @@ static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt) * is, update it. */ - if (has->covered_end_pfn != start_pfn) { + if (has->covered_end_pfn != start_pfn) has->covered_end_pfn = start_pfn; - has->covered_start_pfn = start_pfn; - } + return true; } @@ -784,7 +780,6 @@ static unsigned long handle_pg_range(unsigned long pg_start, pgs_ol = pfn_cnt; hv_bring_pgs_online(start_pfn, pgs_ol); has->covered_end_pfn += pgs_ol; - has->covered_start_pfn += pgs_ol; pfn_cnt -= pgs_ol; } @@ -845,7
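The effect of the redundant lower-bound check can be reproduced with a small simulation. This is a sketch under assumptions: struct demo_region and the demo_* functions are invented, pages are modelled as bare PFN numbers rather than struct page pointers, and the region sizes are tiny illustrative values rather than the ones from the report.

```c
/* Hypothetical model of one hot-add region, using PFNs only. */
struct demo_region {
	unsigned long start_pfn;		/* fixed start of the region */
	unsigned long covered_start_pfn;	/* old scheme: advanced per page */
	unsigned long covered_end_pfn;		/* end of the backed range (exclusive) */
};

/* Old check: the lower bound moves up with every page that is onlined. */
int demo_online_old(struct demo_region *r, unsigned long pfn)
{
	if (pfn >= r->covered_start_pfn && pfn < r->covered_end_pfn) {
		r->covered_start_pfn++;
		return 1;
	}
	return 0;
}

/* New check: the lower bound is the fixed start of the region. */
int demo_online_new(const struct demo_region *r, unsigned long pfn)
{
	return pfn >= r->start_pfn && pfn < r->covered_end_pfn;
}

/* Online one memory block of 'count' pages; returns pages accepted. */
unsigned long demo_online_block(struct demo_region *r, unsigned long first,
				unsigned long count, int use_old)
{
	unsigned long pfn, onlined = 0;

	for (pfn = first; pfn < first + count; pfn++)
		onlined += use_old ? demo_online_old(r, pfn)
				   : demo_online_new(r, pfn);
	return onlined;
}
```

Onlining the upper block first under the old check pushes covered_start_pfn past the lower block, so the lower block contributes no pages, which mirrors the memory9 observation above. Under the new check the order does not matter.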
[PATCH V2 5/7] Correcting truncation error for constant HV_CRASH_CTL_CRASH_NOTIFY
From: Nick Meier

HV_CRASH_CTL_CRASH_NOTIFY is a 64 bit number. Depending on the usage
context, the value may be truncated. This patch is in response to the
following email from Intel:

[char-misc:char-misc-testing 25/45] drivers/hv/vmbus_drv.c:67:9: sparse: constant 0x8000000000000000 is so big it is unsigned long

tree: git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git char-misc-testing
head: b3de8e3719e582f3182bb504295e4a8e43c8c96f
commit: 96c1d0581d00f7abe033350edb021a9d947d8d81 [25/45] Drivers: hv: vmbus: Add support for VMBus panic notifier handler
reproduce:
  # apt-get install sparse
  git checkout 96c1d0581d00f7abe033350edb021a9d947d8d81
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__

sparse warnings: (new ones prefixed by >>)

drivers/hv/vmbus_drv.c:67:9: sparse: constant 0x8000000000000000 is so big it is unsigned long
...

Signed-off-by: Nick Meier
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/hyperv_vmbus.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 6339589..c8e27e0 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -58,7 +58,7 @@ enum hv_cpuid_function {
 #define HV_X64_MSR_CRASH_P4	0x40000104
 #define HV_X64_MSR_CRASH_CTL	0x40000105
-#define HV_CRASH_CTL_CRASH_NOTIFY	0x8000000000000000
+#define HV_CRASH_CTL_CRASH_NOTIFY	(1ULL << 63)
 /* Define version of the synthetic interrupt controller. */
 #define HV_SYNIC_VERSION	(1)
--
1.7.4.1
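A short sketch of the truncation concern, using an illustrative macro name (DEMO_CRASH_NOTIFY) rather than the kernel's. The explicit ULL suffix pins the constant to a 64-bit unsigned type; the two helpers show what is left if the value is ever squeezed through a 32-bit type.

```c
#include <stdint.h>

/* Illustrative copy of the new definition; the name is made up. */
#define DEMO_CRASH_NOTIFY (1ULL << 63)

/* What survives if the value is narrowed to 32 bits. */
uint32_t demo_low_half(uint64_t v)
{
	return (uint32_t)v;
}

/* The upper 32 bits, where the notify bit actually lives. */
uint32_t demo_high_half(uint64_t v)
{
	return (uint32_t)(v >> 32);
}
```

The flag occupies bit 63, so the low half is zero: any 32-bit context silently drops it, which is why the type of the constant needs to be explicit.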
[PATCH V2 6/7] hv: vmbus: missing curly braces in vmbus_process_offer()
From: Dan Carpenter

The indenting makes it clear that there were curly braces intended here.

Fixes: 2dd37cb81580 ('Drivers: hv: vmbus: Handle both rescind and offer messages in the same context')
Signed-off-by: Dan Carpenter
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/channel_mgmt.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 5f8e47b..25dbbaf 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -415,7 +415,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 			newchannel->state = CHANNEL_OPEN_STATE;
 			channel->num_sc++;
-			if (channel->sc_creation_callback != NULL)
+			if (channel->sc_creation_callback != NULL) {
 				/*
 				 * We need to invoke the sub-channel creation
 				 * callback; invoke this in a seperate work
@@ -427,6 +427,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 						vmbus_sc_creation_cb);
 				queue_work(newchannel->controlwq,
 					   &newchannel->work);
+			}
 			return;
 		}
--
1.7.4.1
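The class of bug fixed here can be shown in isolation. The sketch below uses invented names (struct demo_counters and the two demo_* functions) and simple counters in place of INIT_WORK() and queue_work(); the unbraced variant is deliberately written with honest indentation so that compilers do not warn about it.

```c
/* Counters standing in for the two statements guarded by the if. */
struct demo_counters {
	int work_inits;
	int work_queued;
};

/* Before the fix: only the first statement is governed by the if. */
void demo_without_braces(int have_callback, struct demo_counters *c)
{
	if (have_callback)
		c->work_inits++;
	c->work_queued++;	/* runs even when have_callback is 0 */
}

/* After the fix: both statements are skipped when there is no callback. */
void demo_with_braces(int have_callback, struct demo_counters *c)
{
	if (have_callback) {
		c->work_inits++;
		c->work_queued++;
	}
}
```

In the unbraced form, work would be queued without ever being initialised when no callback is set, which is the situation the added braces rule out.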
[PATCH V2 0/7] Drivers: hv: Miscellaneous fixes
This patch-set has miscellaneous fixes for both the VMBUS as well as the
balloon driver. There is also a fix for the hv tools makefile. In this
version of the patchset, I have included a patch from Dexuan. Furthermore,
I have addressed a memory leak issue in the patch:
Drivers: hv: vmbus: Perform device register in the per-channel work element.

Dan Carpenter (1):
  hv: vmbus: missing curly braces in vmbus_process_offer()

Dexuan Cui (1):
  tools: hv: fcopy_daemon: support >2GB files for x86_32 guest

K. Y. Srinivasan (2):
  Drivers: hv: vmbus: Export the vmbus_sendpacket_pagebuffer_ctl()
  Drivers: hv: vmbus: Perform device register in the per-channel work element

Nick Meier (1):
  Correcting truncation error for constant HV_CRASH_CTL_CRASH_NOTIFY

Vitaly Kuznetsov (2):
  Drivers: hv: hv_balloon: keep locks balanced on add_memory() failure
  Drivers: hv: hv_balloon: don't lose memory when onlining order is not natural

 drivers/hv/channel.c      |    1 +
 drivers/hv/channel_mgmt.c |  146 +++-
 drivers/hv/connection.c   |    6 ++-
 drivers/hv/hv_balloon.c   |   15 ++---
 drivers/hv/hyperv_vmbus.h |    4 +-
 tools/hv/Makefile         |    2 +-
 6 files changed, 117 insertions(+), 57 deletions(-)

--
1.7.4.1
[PATCH] SCSI: sd: fix null dereference
We were dereferencing sdkp first and then we were checking for it being NULL.

Signed-off-by: Sudip Mukherjee
---
 drivers/scsi/sd_dif.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index 14c7d42..a514645 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -40,11 +40,16 @@
  */
 void sd_dif_config_host(struct scsi_disk *sdkp)
 {
-	struct scsi_device *sdp = sdkp->device;
-	struct gendisk *disk = sdkp->disk;
-	u8 type = sdkp->protection_type;
+	struct scsi_device *sdp = NULL;
+	struct gendisk *disk = NULL;
+	u8 type;
 	int dif, dix;
+	if (!sdkp)
+		return;
+	sdp = sdkp->device;
+	disk = sdkp->disk;
+	type = sdkp->protection_type;
 	dif = scsi_host_dif_capable(sdp->host, type);
 	dix = scsi_host_dix_capable(sdp->host, type);
@@ -77,9 +82,6 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	disk->integrity->flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
-	if (!sdkp)
-		return;
-
 	if (type == SD_DIF_TYPE3_PROTECTION)
 		disk->integrity->tag_size = sizeof(u16) + sizeof(u32);
 	else
--
1.8.1.2
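The ordering issue is easy to show outside the kernel. In this hedged sketch, struct demo_disk and demo_config_host() are hypothetical stand-ins for struct scsi_disk and sd_dif_config_host(); the only point is that the NULL test has to come before the first member access, so initialisers that dereference the pointer cannot sit in the declarations above the check.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct scsi_disk. */
struct demo_disk {
	int protection_type;
};

/*
 * Returns the protection type, or -1 when no disk was passed in.
 * The pointer is tested before any field is read.
 */
int demo_config_host(const struct demo_disk *d)
{
	int type;

	if (!d)
		return -1;

	type = d->protection_type;	/* safe: d is known to be non-NULL here */
	return type;
}
```

A check placed after the dereference, as in the original function, can never protect anything: the access has already happened by the time the test runs.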
Re: [PATCH] dwc3: make LPM configurable in DT
Hi, On Fri, Mar 06, 2015 at 11:08:53AM +0100, Robert Baldyga wrote: > LPM capability is hardware property, so now it's moved to DT. you need a better commit log here. > Signed-off-by: Robert Baldyga > --- > Documentation/devicetree/bindings/usb/dwc3.txt | 1 + > drivers/usb/dwc3/Kconfig | 7 --- > drivers/usb/dwc3/core.c| 3 +++ > drivers/usb/dwc3/core.h| 1 + > drivers/usb/dwc3/host.c| 5 ++--- > drivers/usb/dwc3/platform_data.h | 1 + > 6 files changed, 8 insertions(+), 10 deletions(-) > > diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt > b/Documentation/devicetree/bindings/usb/dwc3.txt > index cd7f045..36b9148 100644 > --- a/Documentation/devicetree/bindings/usb/dwc3.txt > +++ b/Documentation/devicetree/bindings/usb/dwc3.txt > @@ -14,6 +14,7 @@ Optional properties: > - phys: from the *Generic PHY* bindings > - phy-names: from the *Generic PHY* bindings > - tx-fifo-resize: determines if the FIFO *has* to be reallocated. > + - usb3_lpm_capable: determines if platform is USB3 LPM capable needs a snps, prefix > @@ -848,6 +850,7 @@ static int dwc3_probe(struct platform_device *pdev) > hird_threshold = pdata->hird_threshold; > > dwc->needs_fifo_resize = pdata->tx_fifo_resize; > + dwc->usb3_lpm_capable = pdata->usb3_lpm_capable; > dwc->dr_mode = pdata->dr_mode; > > dwc->disable_scramble_quirk = pdata->disable_scramble_quirk; > diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h > index d201910..622f65f 100644 > --- a/drivers/usb/dwc3/core.h > +++ b/drivers/usb/dwc3/core.h > @@ -812,6 +812,7 @@ struct dwc3 { > unsignedsetup_packet_pending:1; > unsignedstart_config_issued:1; > unsignedthree_stage_setup:1; > + unsignedusb3_lpm_capable:1; missing kdoc for this new field. 
> > unsigneddisable_scramble_quirk:1; > unsignedu2exit_lfps_quirk:1; > diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c > index 12bfd3c..507eddf 100644 > --- a/drivers/usb/dwc3/host.c > +++ b/drivers/usb/dwc3/host.c > @@ -49,9 +49,8 @@ int dwc3_host_init(struct dwc3 *dwc) > > memset(, 0, sizeof(pdata)); > > -#ifdef CONFIG_DWC3_HOST_USB3_LPM_ENABLE > - pdata.usb3_lpm_capable = 1; > -#endif > + if (dwc->usb3_lpm_capable) > + pdata.usb3_lpm_capable = 1; pdata.usb3_lpm_capable = dwc->usb3_lpm_capable; ?? drop the branch altogether > diff --git a/drivers/usb/dwc3/platform_data.h > b/drivers/usb/dwc3/platform_data.h > index a3a3b6d5..a2bd464 100644 > --- a/drivers/usb/dwc3/platform_data.h > +++ b/drivers/usb/dwc3/platform_data.h > @@ -24,6 +24,7 @@ struct dwc3_platform_data { > enum usb_device_speed maximum_speed; > enum usb_dr_mode dr_mode; > bool tx_fifo_resize; > + bool usb3_lpm_capable; add kdoc for this too. -- balbi signature.asc Description: Digital signature
[PATCH 2/2] staging: ft1000: remove code indention
modified the code to keep the logic same but removed some indention. Signed-off-by: Sudip Mukherjee --- this patch will generate checkpatch warning about line more than 80char, and too many use of tab. but unless the total function is rewrtten it will be difficult to fix that. drivers/staging/ft1000/ft1000-usb/ft1000_debug.c | 44 +++- 1 file changed, 21 insertions(+), 23 deletions(-) diff --git a/drivers/staging/ft1000/ft1000-usb/ft1000_debug.c b/drivers/staging/ft1000/ft1000-usb/ft1000_debug.c index 86dd699..584c59a 100644 --- a/drivers/staging/ft1000/ft1000-usb/ft1000_debug.c +++ b/drivers/staging/ft1000/ft1000-usb/ft1000_debug.c @@ -587,32 +587,30 @@ static long ft1000_ioctl(struct file *file, unsigned int command, if (qtype) { } else { /* Put message into Slow Queue */ - /* Only put a message into the DPRAM if msg doorbell is available */ - ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); - /* pr_debug("READ REGISTER tempword=%x\n", tempword); */ - if (tempword & FT1000_DB_DPRAM_TX) { - /* Suspend for 2ms and try again due to DSP doorbell busy */ - mdelay(2); + u8 cnt = 0; + + do { + /* Only put a message into the DPRAM if msg doorbell is available */ ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); + /* pr_debug("READ REGISTER tempword=%x\n", tempword); */ if (tempword & FT1000_DB_DPRAM_TX) { + /* Suspend for 2ms and try again due to DSP doorbell busy */ + if (cnt == 0) + mdelay(2); /* Suspend for 1ms and try again due to DSP doorbell busy */ - mdelay(1); - ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); - if (tempword & FT1000_DB_DPRAM_TX) { - ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); - if (tempword & FT1000_DB_DPRAM_TX) { - /* Suspend for 3ms and try again due to DSP doorbell busy */ - mdelay(3); - ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); - if (tempword & FT1000_DB_DPRAM_TX) { - pr_debug("Doorbell not available\n"); - result = -ENOTTY; - kfree(dpram_data); - break; - } - } - } - } + else if (cnt == 1) + 
mdelay(1); + /* Suspend for 3ms and try again due to DSP doorbell busy */ + else + mdelay(3); + } else + break; + } while (++cnt < 3); + if (tempword & FT1000_DB_DPRAM_TX) { + pr_debug("Doorbell not available\n"); + result = -ENOTTY; + kfree(dpram_data); + break; }
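One way to flatten this kind of nested retry ladder is a table-driven loop. The sketch below is an illustration under assumptions, not the driver's code: DEMO_DB_BUSY, struct demo_doorbell and the demo_* helpers are invented, the fake doorbell replaces ft1000_read_register(), and the delay helper only accumulates the requested milliseconds instead of calling mdelay(). Unlike the hunk above, this variant polls once more after the final 3 ms delay, which is a deliberate difference.

```c
#include <stddef.h>

#define DEMO_DB_BUSY 0x1u	/* made-up "doorbell busy" bit */

/* Fake device: reports busy for the first busy_reads polls. */
struct demo_doorbell {
	int busy_reads;
	unsigned int slept_ms;
};

unsigned int demo_read_doorbell(struct demo_doorbell *db)
{
	if (db->busy_reads > 0) {
		db->busy_reads--;
		return DEMO_DB_BUSY;
	}
	return 0;
}

/* Stand-in for mdelay(): records the delay instead of sleeping. */
void demo_delay_ms(struct demo_doorbell *db, unsigned int ms)
{
	db->slept_ms += ms;
}

/*
 * Poll the doorbell, backing off 2 ms, 1 ms, then 3 ms between polls.
 * Returns 0 once the doorbell is free, -1 if it is still busy after
 * the last delay.
 */
int demo_wait_doorbell_free(struct demo_doorbell *db)
{
	static const unsigned int backoff_ms[] = { 2, 1, 3 };
	const size_t steps = sizeof(backoff_ms) / sizeof(backoff_ms[0]);
	size_t i;

	for (i = 0; ; i++) {
		if (!(demo_read_doorbell(db) & DEMO_DB_BUSY))
			return 0;
		if (i == steps)
			return -1;
		demo_delay_ms(db, backoff_ms[i]);
	}
}
```

Keeping the delays in a table means the retry policy can be changed without touching the control flow, and the failure path (the -ENOTTY branch in the driver) is reached from exactly one place.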
[PATCH 1/2] staging: ft1000: remove unused variables
these variables were assigned some values but they were never being reused again. Signed-off-by: Sudip Mukherjee --- this patch will generate some checkpatch warning about line being above 80char, but the code is so much indented that it is difficult to break the line. drivers/staging/ft1000/ft1000-usb/ft1000_debug.c | 14 ++-- .../staging/ft1000/ft1000-usb/ft1000_download.c| 92 +++--- 2 files changed, 50 insertions(+), 56 deletions(-) diff --git a/drivers/staging/ft1000/ft1000-usb/ft1000_debug.c b/drivers/staging/ft1000/ft1000-usb/ft1000_debug.c index 58ad946..86dd699 100644 --- a/drivers/staging/ft1000/ft1000-usb/ft1000_debug.c +++ b/drivers/staging/ft1000/ft1000-usb/ft1000_debug.c @@ -297,14 +297,13 @@ void ft1000_destroy_dev(struct net_device *netdev) */ static int ft1000_open(struct inode *inode, struct file *file) { - struct ft1000_info *info; struct ft1000_usb *dev = (struct ft1000_usb *)inode->i_private; int i, num; num = (MINOR(inode->i_rdev) & 0xf); pr_debug("minor number=%d\n", num); - info = file->private_data = netdev_priv(dev->net); + file->private_data = netdev_priv(dev->net); pr_debug("f_owner = %p number of application = %d\n", >f_owner, dev->appcnt); @@ -528,7 +527,6 @@ static long ft1000_ioctl(struct file *file, unsigned int command, u16 *pmsg; u16 total_len; u16 app_index; - u16 status; /* pr_debug("IOCTL_FT1000_SET_DPRAM called\n");*/ @@ -590,22 +588,22 @@ static long ft1000_ioctl(struct file *file, unsigned int command, } else { /* Put message into Slow Queue */ /* Only put a message into the DPRAM if msg doorbell is available */ - status = ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); + ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); /* pr_debug("READ REGISTER tempword=%x\n", tempword); */ if (tempword & FT1000_DB_DPRAM_TX) { /* Suspend for 2ms and try again due to DSP doorbell busy */ mdelay(2); - status = ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); + ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); if 
(tempword & FT1000_DB_DPRAM_TX) { /* Suspend for 1ms and try again due to DSP doorbell busy */ mdelay(1); - status = ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); + ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); if (tempword & FT1000_DB_DPRAM_TX) { - status = ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); + ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); if (tempword & FT1000_DB_DPRAM_TX) { /* Suspend for 3ms and try again due to DSP doorbell busy */ mdelay(3); - status = ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); + ft1000_read_register(ft1000dev, , FT1000_REG_DOORBELL); if (tempword & FT1000_DB_DPRAM_TX) { pr_debug("Doorbell not available\n"); result = -ENOTTY; diff --git a/drivers/staging/ft1000/ft1000-usb/ft1000_download.c b/drivers/staging/ft1000/ft1000-usb/ft1000_download.c index e8126325..9e1104b 100644 --- a/drivers/staging/ft1000/ft1000-usb/ft1000_download.c +++ b/drivers/staging/ft1000/ft1000-usb/ft1000_download.c @@ -113,22 +113,21 @@ static int check_usb_db(struct ft1000_usb *ft1000dev) { int loopcnt; u16 temp; - int status; loopcnt = 0; while (loopcnt < 10) { - status = ft1000_read_register(ft1000dev, , -
RE: [PATCH 0/6] Drivers: hv: Miscellaneous fixes
> -Original Message- > From: K. Y. Srinivasan [mailto:k...@microsoft.com] > Sent: Friday, March 6, 2015 9:10 PM > To: gre...@linuxfoundation.org; linux-kernel@vger.kernel.org; > de...@linuxdriverproject.org; o...@aepfle.de; a...@canonical.com; > vkuzn...@redhat.com > Cc: KY Srinivasan > Subject: [PATCH 0/6] Drivers: hv: Miscellaneous fixes > > This patch-set has miscellaneous fixes for both the VMBUS as well as the > balloon driver. > > Dan Carpenter (1): > hv: vmbus: missing curly braces in vmbus_process_offer() > > K. Y. Srinivasan (2): > Drivers: hv: vmbus: Export the vmbus_sendpacket_pagebuffer_ctl() > Drivers: hv: vmbus: Perform device register in the per-channel work > element > > Nick Meier (1): > Correcting truncation error for constant HV_CRASH_CTL_CRASH_NOTIFY > > Vitaly Kuznetsov (2): > Drivers: hv: hv_balloon: keep locks balanced on add_memory() failure > Drivers: hv: hv_balloon: don't lose memory when onlining order is not > natural Greg, Please drop the patch-set; one of the patches I sent was incorrect. I will resend. K. Y > > drivers/hv/channel.c |1 + > drivers/hv/channel_mgmt.c | 146 +++- > - > drivers/hv/connection.c |6 ++- > drivers/hv/hv_balloon.c | 15 ++--- > drivers/hv/hyperv_vmbus.h |4 +- > 5 files changed, 115 insertions(+), 57 deletions(-) > > -- > 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Fri, 2015-03-06 at 20:31 -0800, Jason Low wrote:
> On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote:
>
> Just in case, here's the updated patch which addresses Linus's comments
> and with a changelog.
>
> Note: The changelog says that it fixes (locking/rwsem: Avoid deceiving
> lock spinners), though I still haven't seen full confirmation that it
> addresses all of the lockup reports.
>
> --
> Subject: [PATCH] rwsem: Avoid spinning when owner is not running
>
> Fixes tip commmit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock spinners).
>
> When doing optimistic spinning in rwsem, threads should stop spinning when
> the lock owner is not running. While a thread is spinning on owner, if
> the owner reschedules, owner->on_cpu returns false and we stop spinning.
>
> However, commit b3fd4f03ca0b essentially caused the check to get ignored
> because when we break out of the spin loop due to !on_cpu, we continue
> spinning if sem->owner != NULL.

I would mention the actual effects of the bug, either just a "lockup"
and/or a fragment of the trace. But ultimately this comes down to missing a
need_resched() condition.

>
> This patch fixes this by making sure we stop spinning if the owner is not
> running. Furthermore, just like with mutexes, refactor the code such that
> we don't have separate checks for owner_running(). This makes it more
> straightforward in terms of why we exit the spin on owner loop and we
> would also avoid needing to "guess" why we broke out of the loop to make
> this more readable.
>
> Cc: Ming Lei
> Cc: Davidlohr Bueso

Acked-by: Davidlohr Bueso
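The per-iteration decision the changelog describes can be written down as a pure function. This is a simplified model under assumptions: struct demo_task, struct demo_sem, enum demo_verdict and demo_spin_step() are invented for illustration, and the sketch leaves out everything that makes the real loop safe (rcu_read_lock(), the compiler barrier, and the concurrent updates to sem->owner).

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for task_struct and rw_semaphore. */
struct demo_task {
	bool on_cpu;
};

struct demo_sem {
	struct demo_task *owner;
};

enum demo_verdict {
	DEMO_KEEP_SPINNING,	/* owner unchanged and still running */
	DEMO_STOP_SPINNING,	/* owner off-CPU or we must reschedule */
	DEMO_OWNER_CHANGED	/* loop condition failed; inspect sem afterwards */
};

/*
 * One iteration of the refactored loop: the owner comparison is the loop
 * condition, and the two abort reasons are tested together in the body.
 */
enum demo_verdict demo_spin_step(const struct demo_sem *sem,
				 const struct demo_task *owner,
				 bool need_resched)
{
	if (sem->owner != owner)
		return DEMO_OWNER_CHANGED;
	if (!owner->on_cpu || need_resched)
		return DEMO_STOP_SPINNING;
	return DEMO_KEEP_SPINNING;
}
```

Separating "owner changed" from "stop spinning" is the point of the refactor: code after the loop no longer has to guess which of the two caused the exit, so a sleeping owner cannot be mistaken for a reason to keep spinning.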
Re: softlockups in multi_cpu_stop
On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: Just in case, here's the updated patch which addresses Linus's comments and with a changelog. Note: The changelog says that it fixes (locking/rwsem: Avoid deceiving lock spinners), though I still haven't seen full confirmation that it addresses all of the lockup reports. -- Subject: [PATCH] rwsem: Avoid spinning when owner is not running Fixes tip commmit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock spinners). When doing optimistic spinning in rwsem, threads should stop spinning when the lock owner is not running. While a thread is spinning on owner, if the owner reschedules, owner->on_cpu returns false and we stop spinning. However, commit b3fd4f03ca0b essentially caused the check to get ignored because when we break out of the spin loop due to !on_cpu, we continue spinning if sem->owner != NULL. This patch fixes this by making sure we stop spinning if the owner is not running. Furthermore, just like with mutexes, refactor the code such that we don't have separate checks for owner_running(). This makes it more straightforward in terms of why we exit the spin on owner loop and we would also avoid needing to "guess" why we broke out of the loop to make this more readable. Cc: Ming Lei Cc: Davidlohr Bueso Signed-off-by: Jason Low --- kernel/locking/rwsem-xadd.c | 31 +++ 1 files changed, 11 insertions(+), 20 deletions(-) diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c index 06e2214..3417d01 100644 --- a/kernel/locking/rwsem-xadd.c +++ b/kernel/locking/rwsem-xadd.c @@ -324,32 +324,23 @@ done: return ret; } -static inline bool owner_running(struct rw_semaphore *sem, -struct task_struct *owner) -{ - if (sem->owner != owner) - return false; - - /* -* Ensure we emit the owner->on_cpu, dereference _after_ checking -* sem->owner still matches owner, if that fails, owner might -* point to free()d memory, if it still matches, the rcu_read_lock() -* ensures the memory stays valid. 
-*/ - barrier(); - - return owner->on_cpu; -} - static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner) { long count; rcu_read_lock(); - while (owner_running(sem, owner)) { - /* abort spinning when need_resched */ - if (need_resched()) { + while (sem->owner == owner) { + /* +* Ensure we emit the owner->on_cpu, dereference _after_ +* checking sem->owner still matches owner, if that fails, +* owner might point to free()d memory, if it still matches, +* the rcu_read_lock() ensures the memory stays valid. +*/ + barrier(); + + /* abort spinning when need_resched or owner is not running */ + if (!owner->on_cpu || need_resched()) { rcu_read_unlock(); return false; } -- 1.7.2.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
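The difference between the old and the new exit logic can be sketched in userspace C. This is only a model under stated assumptions: struct task, struct sem and both keep_spinning_*() helpers are hypothetical stand-ins, not kernel definitions; need_resched is passed in as a flag; and the NULL-owner case is simplified to "stop", whereas the kernel inspects sem->count there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and rw_semaphore. */
struct task { bool on_cpu; };
struct sem  { struct task *owner; };

/*
 * Old behaviour: owner_running() folds "owner changed" and "owner not
 * running" into a single false result, so the code after the loop can
 * only guess which one happened by looking at sem->owner.
 */
static bool keep_spinning_old(const struct sem *sem, const struct task *owner,
			      bool need_resched)
{
	assert(sem != NULL && owner != NULL);

	bool owner_running = (sem->owner == owner) && owner->on_cpu;

	if (owner_running)
		return !need_resched;

	/* Still true when the same owner was merely scheduled out. */
	return sem->owner != NULL;
}

/*
 * New behaviour: the on_cpu test sits next to need_resched inside the
 * loop, so a descheduled owner ends the spin explicitly.
 */
static bool keep_spinning_new(const struct sem *sem, const struct task *owner,
			      bool need_resched)
{
	assert(sem != NULL && owner != NULL);

	if (sem->owner == owner)
		return owner->on_cpu && !need_resched;

	/* Owner changed: a non-NULL owner means "retry with the new owner". */
	return sem->owner != NULL;
}
```

With an owner that is still set but has on_cpu == false, the old helper reports "keep spinning" while the new one reports "stop", which is the case the changelog describes.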
Re: [REGRESSION in 3.18][PPC] PA Semi fails to boot after: of/base: Fix PowerPC address parsing hack
On Fri, 2015-03-06 at 15:50 -0800, Olof Johansson wrote:
> On Fri, Mar 6, 2015 at 2:56 PM, Benjamin Herrenschmidt
> wrote:
> > On Fri, 2015-03-06 at 10:00 -0500, Steven Rostedt wrote:
> >> On Fri, 06 Mar 2015 15:18:42 +1100
> >> Benjamin Herrenschmidt wrote:
> >>
> >>
> >> > Can you shoot me the DT (/proc/device-tree in a tarball) ?
> >>
> >> Attached.
> >
> > This is indeed a bug in their DT. We might want to add quirks for
> > that unless it can be fixed (or has been via FW update). Olof ?
>
> FW updates on this platform are highly unlikely. Quirk it is.

Oh I was not expecting a new FW, I was mostly wondering whether Steven
had the latest one since I *think* Michael has been testing with the
PA board we got here and didn't see that problem ... anyway, I'll check
with him early next week and clean up / submit that patch.

Cheers,
Ben.

>
> -Olof
[PATCH 6/6] hv: vmbus: missing curly braces in vmbus_process_offer()
From: Dan Carpenter

The indenting makes it clear that there were curly braces intended here.

Fixes: 2dd37cb81580 ('Drivers: hv: vmbus: Handle both rescind and offer messages in the same context')
Signed-off-by: Dan Carpenter
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/channel_mgmt.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 681c806b..1bc2378 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -413,7 +413,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 		newchannel->state = CHANNEL_OPEN_STATE;
 		channel->num_sc++;
-		if (channel->sc_creation_callback != NULL)
+		if (channel->sc_creation_callback != NULL) {
 			/*
 			 * We need to invoke the sub-channel creation
 			 * callback; invoke this in a seperate work
@@ -425,6 +425,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 				  vmbus_sc_creation_cb);
 			queue_work(newchannel->controlwq,
 				   &newchannel->work);
+		}
 
 		return;
 	}
-- 
1.7.4.1
[PATCH 1/6] Drivers: hv: vmbus: Export the vmbus_sendpacket_pagebuffer_ctl()
Export the vmbus_sendpacket_pagebuffer_ctl() interface.

Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/channel.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index da53180..e58cdb7 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -710,6 +710,7 @@ int vmbus_sendpacket_pagebuffer_ctl(struct vmbus_channel *channel,
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer_ctl);
 
 /*
  * vmbus_sendpacket_pagebuffer - Send a range of single-page buffer
-- 
1.7.4.1
[PATCH 4/6] Drivers: hv: hv_balloon: don't lose memory when onlining order is not natural
From: Vitaly Kuznetsov

Memory blocks can be onlined in random order. When this order is not natural
some memory pages are not onlined because of the redundant check in
hv_online_page().

Here is a real world scenario:
1) Host tries to hot-add the following (process_hot_add):
   pg_start=rg_start=0x48000, pfn_cnt=111616, rg_size=262144

2) This results in adding 4 memory blocks:
[ 109.057866] init_memory_mapping: [mem 0x4800-0x4fff]
[ 114.102698] init_memory_mapping: [mem 0x5000-0x57ff]
[ 119.168039] init_memory_mapping: [mem 0x5800-0x5fff]
[ 124.233053] init_memory_mapping: [mem 0x6000-0x67ff]
   The last one is incomplete but we have special has->covered_end_pfn counter to
   avoid onlining non-backed frames and hv_bring_pgs_online() function to bring
   them online later on.

3) Now we have 4 offline memory blocks: /sys/devices/system/memory/memory9-12
$ for f in /sys/devices/system/memory/memory*/state; do echo $f `cat $f`; done | grep -v onlin
/sys/devices/system/memory/memory10/state offline
/sys/devices/system/memory/memory11/state offline
/sys/devices/system/memory/memory12/state offline
/sys/devices/system/memory/memory9/state offline

4) We bring them online in non-natural order:
$grep MemTotal /proc/meminfo
MemTotal: 966348 kB
$echo online > /sys/devices/system/memory/memory12/state && grep MemTotal /proc/meminfo
MemTotal:1019596 kB
$echo online > /sys/devices/system/memory/memory11/state && grep MemTotal /proc/meminfo
MemTotal:1150668 kB
$echo online > /sys/devices/system/memory/memory9/state && grep MemTotal /proc/meminfo
MemTotal:1150668 kB

As you can see memory9 block gives us zero additional memory. We can also
observe a huge discrepancy between host- and guest-reported memory sizes.

The root cause of the issue is the redundant pg >= covered_start_pfn check (and
covered_start_pfn advancing) in hv_online_page(). When an upper memory block is
being onlined before the lower one (memory12 and memory11 in the above case) we
advance the covered_start_pfn pointer and all memory9 pages do not pass the
check. If the assumption that the host always gives us requests in sequential
order and pg_start always equals rg_start when the first request for the new HA
region is received (that's the case in my testing) is correct, then we can get
rid of covered_start_pfn, and the pg >= start_pfn check in hv_online_page() is
sufficient.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/hv_balloon.c |   14 --
 1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index f1f17c5..014256a 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -428,14 +428,13 @@ struct dm_info_msg {
  * currently hot added. We hot add in multiples of 128M
  * chunks; it is possible that we may not be able to bring
  * online all the pages in the region. The range
- * covered_start_pfn : covered_end_pfn defines the pages that can
+ * covered_end_pfn defines the pages that can
  * be brough online.
  */
 
 struct hv_hotadd_state {
 	struct list_head list;
 	unsigned long start_pfn;
-	unsigned long covered_start_pfn;
 	unsigned long covered_end_pfn;
 	unsigned long ha_end_pfn;
 	unsigned long end_pfn;
@@ -679,8 +678,7 @@ static void hv_online_page(struct page *pg)
 
 	list_for_each(cur, &dm_device.ha_region_list) {
 		has = list_entry(cur, struct hv_hotadd_state, list);
-		cur_start_pgp = (unsigned long)
-				pfn_to_page(has->covered_start_pfn);
+		cur_start_pgp = (unsigned long)pfn_to_page(has->start_pfn);
 		cur_end_pgp = (unsigned long)pfn_to_page(has->covered_end_pfn);
 
 		if (((unsigned long)pg >= cur_start_pgp) &&
@@ -692,7 +690,6 @@ static void hv_online_page(struct page *pg)
 			__online_page_set_limits(pg);
 			__online_page_increment_counters(pg);
 			__online_page_free(pg);
-			has->covered_start_pfn++;
 		}
 	}
 }
@@ -736,10 +733,9 @@ static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
 			 * is, update it.
 			 */
 
-			if (has->covered_end_pfn != start_pfn) {
+			if (has->covered_end_pfn != start_pfn)
 				has->covered_end_pfn = start_pfn;
-				has->covered_start_pfn = start_pfn;
-			}
+
 			return true;
 		}
 
@@ -784,7 +780,6 @@ static unsigned long handle_pg_range(unsigned long pg_start,
 			pgs_ol = pfn_cnt;
 			hv_bring_pgs_online(start_pfn, pgs_ol);
 			has->covered_end_pfn += pgs_ol;
-			has->covered_start_pfn += pgs_ol;
 			pfn_cnt -= pgs_ol;
 		}
 
@@ -845,7
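The effect of the advancing covered_start_pfn can be reproduced with a small userspace model. This is a sketch, not the driver code: struct region and online_block() are hypothetical names, PFNs are plain integers, and the upper bound is assumed to be exclusive. The use_covered_start flag selects between the old check (against covered_start_pfn, which is incremented per onlined page) and the new one (against the fixed start_pfn).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced version of the hot-add region bookkeeping. */
struct region {
	unsigned long start_pfn;
	unsigned long covered_start_pfn;
	unsigned long covered_end_pfn;
};

/*
 * Try to online `count` pages starting at `first` and return how many
 * passed the range check.
 */
static unsigned long online_block(struct region *r, unsigned long first,
				  unsigned long count, bool use_covered_start)
{
	unsigned long onlined = 0;

	assert(r != NULL);

	for (unsigned long pfn = first; pfn < first + count; pfn++) {
		unsigned long lo = use_covered_start ? r->covered_start_pfn
						     : r->start_pfn;

		if (pfn >= lo && pfn < r->covered_end_pfn) {
			onlined++;
			/* Old code: the lower bound creeps up with each page. */
			if (use_covered_start)
				r->covered_start_pfn++;
		}
	}
	return onlined;
}
```

Onlining the top block first, then the one below it, then the lowest block leaves the lowest block with zero pages under the old check, while the fixed lower bound accepts every page regardless of order.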
[PATCH 3/6] Drivers: hv: hv_balloon: keep locks balanced on add_memory() failure
From: Vitaly Kuznetsov

When add_memory() fails the following BUG is observed:
[ 743.646107] hv_balloon: hot_add memory failed error is -17
[ 743.679973]
[ 743.680930] =
[ 743.680930] [ BUG: bad unlock balance detected! ]
[ 743.680930] 3.19.0-rc5_bug1131426+ #552 Not tainted
[ 743.680930] -
[ 743.680930] kworker/0:2/255 is trying to release lock (&dm_device.ha_region_mutex) at:
[ 743.680930] [] mutex_unlock+0xe/0x10
[ 743.680930] but there are no more locks to release!

This happens as we don't acquire ha_region_mutex and hot_add_req() expects us
to as it does unconditional mutex_unlock(). Acquire the lock on the error path.

Signed-off-by: Vitaly Kuznetsov
Acked-by: Jason Wang
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/hv_balloon.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index c5bb872..f1f17c5 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -652,6 +652,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
 			}
 			has->ha_end_pfn -= HA_CHUNK;
 			has->covered_end_pfn -= processed_pfn;
+			mutex_lock(&dm_device.ha_region_mutex);
 			break;
 		}
 
-- 
1.7.4.1
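The unlock imbalance can be modelled with a toy lock that records an unlock without a matching lock. Everything here is hypothetical and only illustrates the pattern: struct toy_lock, hot_add_model() and caller_is_balanced() are made-up names, and the model assumes the callee drops the lock around the slow call and is expected to hold it again on every return path because the caller unlocks unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: depth counts holders, underflow records a bad unlock. */
struct toy_lock { int depth; bool underflow; };

static void toy_acquire(struct toy_lock *l)
{
	assert(l != NULL);
	l->depth++;
}

static void toy_release(struct toy_lock *l)
{
	assert(l != NULL);
	if (l->depth == 0)
		l->underflow = true;	/* "bad unlock balance" */
	else
		l->depth--;
}

/* Entered with the lock held; must also return with it held. */
static void hot_add_model(struct toy_lock *l, bool add_fails,
			  bool relock_on_error)
{
	toy_release(l);			/* drop the lock around the slow call */
	if (add_fails) {
		if (relock_on_error)
			toy_acquire(l);	/* the one-line fix */
		return;
	}
	toy_acquire(l);			/* normal path re-takes the lock */
}

/* Caller takes the lock, calls the helper, then unlocks unconditionally. */
static bool caller_is_balanced(bool add_fails, bool relock_on_error)
{
	struct toy_lock l = { 0, false };

	toy_acquire(&l);
	hot_add_model(&l, add_fails, relock_on_error);
	toy_release(&l);
	return !l.underflow && l.depth == 0;
}
```

Only the failing path without the re-lock trips the underflow flag, which corresponds to the report above.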
[PATCH 5/6] Correcting truncation error for constant HV_CRASH_CTL_CRASH_NOTIFY
From: Nick Meier

HV_CRASH_CTL_CRASH_NOTIFY is a 64 bit number. Depending on the usage context,
the value may be truncated. This patch is in response to the following email
from Intel:

[char-misc:char-misc-testing 25/45] drivers/hv/vmbus_drv.c:67:9: sparse: constant 0x8000000000000000 is so big it is unsigned long

tree: git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git char-misc-testing
head: b3de8e3719e582f3182bb504295e4a8e43c8c96f
commit: 96c1d0581d00f7abe033350edb021a9d947d8d81 [25/45] Drivers: hv: vmbus: Add support for VMBus panic notifier handler
reproduce:
  # apt-get install sparse
  git checkout 96c1d0581d00f7abe033350edb021a9d947d8d81
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__

sparse warnings: (new ones prefixed by >>)

drivers/hv/vmbus_drv.c:67:9: sparse: constant 0x8000000000000000 is so big it is unsigned long
...

Signed-off-by: Nick Meier
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/hyperv_vmbus.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 6339589..c8e27e0 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -58,7 +58,7 @@ enum hv_cpuid_function {
 #define HV_X64_MSR_CRASH_P4 0x4104
 #define HV_X64_MSR_CRASH_CTL 0x4105
 
-#define HV_CRASH_CTL_CRASH_NOTIFY 0x8000000000000000
+#define HV_CRASH_CTL_CRASH_NOTIFY (1ULL << 63)
 
 /* Define version of the synthetic interrupt controller. */
 #define HV_SYNIC_VERSION (1)
-- 
1.7.4.1
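A short userspace check shows why the explicit (1ULL << 63) form is the safer spelling and what truncation would do to the flag. DEMO_CRASH_NOTIFY, narrow_to_u32() and bit_is_set() are hypothetical names used only for this sketch; they are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the new definition; the type is unsigned long long. */
#define DEMO_CRASH_NOTIFY (1ULL << 63)

/* Narrowing store, as happens when the value lands in a 32-bit variable. */
static uint32_t narrow_to_u32(uint64_t v)
{
	return (uint32_t)v;
}

/* Returns 1 when bit `bit` of v is set, 0 otherwise. */
static int bit_is_set(uint64_t v, unsigned int bit)
{
	assert(bit < 64);
	return (int)((v >> bit) & 1u);
}
```

The constant keeps bit 63 when held in a 64-bit type, but a 32-bit store drops it entirely, so the notify flag would silently read as zero.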
[PATCH 0/6] Drivers: hv: Miscellaneous fixes
This patch-set has miscellaneous fixes for both the VMBUS as well as the
balloon driver.

Dan Carpenter (1):
  hv: vmbus: missing curly braces in vmbus_process_offer()

K. Y. Srinivasan (2):
  Drivers: hv: vmbus: Export the vmbus_sendpacket_pagebuffer_ctl()
  Drivers: hv: vmbus: Perform device register in the per-channel work
    element

Nick Meier (1):
  Correcting truncation error for constant HV_CRASH_CTL_CRASH_NOTIFY

Vitaly Kuznetsov (2):
  Drivers: hv: hv_balloon: keep locks balanced on add_memory() failure
  Drivers: hv: hv_balloon: don't lose memory when onlining order is not
    natural

 drivers/hv/channel.c      |    1 +
 drivers/hv/channel_mgmt.c |  146 +++--
 drivers/hv/connection.c   |    6 ++-
 drivers/hv/hv_balloon.c   |   15 ++---
 drivers/hv/hyperv_vmbus.h |    4 +-
 5 files changed, 115 insertions(+), 57 deletions(-)

-- 
1.7.4.1
[PATCH 2/6] Drivers: hv: vmbus: Perform device register in the per-channel work element
This patch is a continuation of the rescind handling cleanup work. We cannot block in the global message handling work context especially if we are blocking waiting for the host to wake us up. I would like to thank Dexuan Cui for observing this problem. Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel_mgmt.c | 143 +++-- drivers/hv/connection.c |6 ++- drivers/hv/hyperv_vmbus.h |2 +- 3 files changed, 106 insertions(+), 45 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 6117891..681c806b 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include @@ -37,6 +38,10 @@ struct vmbus_channel_message_table_entry { void (*message_handler)(struct vmbus_channel_message_header *msg); }; +struct vmbus_rescind_work { + struct work_struct work; + struct vmbus_channel *channel; +}; /** * vmbus_prep_negotiate_resp() - Create default response for Hyper-V Negotiate message @@ -134,20 +139,6 @@ fw_error: EXPORT_SYMBOL_GPL(vmbus_prep_negotiate_resp); -static void vmbus_process_device_unregister(struct work_struct *work) -{ - struct device *dev; - struct vmbus_channel *channel = container_of(work, - struct vmbus_channel, - work); - - dev = get_device(>device_obj->device); - if (dev) { - vmbus_device_unregister(channel->device_obj); - put_device(dev); - } -} - static void vmbus_sc_creation_cb(struct work_struct *work) { struct vmbus_channel *newchannel = container_of(work, @@ -201,7 +192,6 @@ static void release_channel(struct work_struct *work) work); destroy_workqueue(channel->controlwq); - kfree(channel); } @@ -220,6 +210,39 @@ static void free_channel(struct vmbus_channel *channel) queue_work(vmbus_connection.work_queue, >work); } +static void process_rescind_fn(struct work_struct *work) +{ + struct vmbus_rescind_work *rc_work; + struct vmbus_channel *channel; + struct device *dev; + + rc_work = container_of(work, struct vmbus_rescind_work, work); + 
channel = rc_work->channel; + + /* +* We have already acquired a reference on the channel +* and so it cannot vanish underneath us. +* It is possible (while very unlikely) that we may +* get here while the processing of the initial offer +* is still not complete. Deal with this situation by +* just waiting until the channel is in the correct state. +*/ + + while (channel->work.func != release_channel) + msleep(1000); + + if (channel->device_obj) { + dev = get_device(>device_obj->device); + if (dev) { + vmbus_device_unregister(channel->device_obj); + put_device(dev); + } + } else { + hv_process_channel_removal(channel, + channel->offermsg.child_relid); + } +} + static void percpu_channel_enq(void *arg) { struct vmbus_channel *channel = arg; @@ -282,6 +305,46 @@ void vmbus_free_channels(void) } } +static void vmbus_do_device_register(struct work_struct *work) +{ + struct hv_device *device_obj; + int ret; + unsigned long flags; + struct vmbus_channel *newchannel = container_of(work, +struct vmbus_channel, +work); + + ret = vmbus_device_register(newchannel->device_obj); + if (ret != 0) { + pr_err("unable to add child device object (relid %d)\n", + newchannel->offermsg.child_relid); + spin_lock_irqsave(_connection.channel_lock, flags); + list_del(>listentry); + device_obj = newchannel->device_obj; + newchannel->device_obj = NULL; + spin_unlock_irqrestore(_connection.channel_lock, flags); + + if (newchannel->target_cpu != get_cpu()) { + put_cpu(); + smp_call_function_single(newchannel->target_cpu, +percpu_channel_deq, newchannel, true); + } else { + percpu_channel_deq(newchannel); + put_cpu(); + } + + kfree(device_obj); + if (!newchannel->rescind) { + free_channel(newchannel); + return; + } + } + /* +* The next state for this channel is to be freed. +*/ + INIT_WORK(>work,
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 11:39 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 11:17 AM, Jason Low wrote: > > On Sat, 2015-03-07 at 11:08 +0800, Ming Lei wrote: > >> On Sat, Mar 7, 2015 at 10:56 AM, Jason Low wrote: > >> > On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: > >> >> On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso > >> >> wrote: > >> >> > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: > >> >> >> On Fri, 06 Mar 2015 14:15:37 -0800 > >> >> >> Davidlohr Bueso wrote: > >> >> >> > >> >> >> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > >> >> >> > > In owner_running() there are 2 conditions that would make it > >> >> >> > > return > >> >> >> > > false: if the owner changed or if the owner is not running. > >> >> >> > > However, > >> >> >> > > that patch continues spinning if there is a "new owner" but it > >> >> >> > > does not > >> >> >> > > take into account that we may want to stop spinning if the owner > >> >> >> > > is not > >> >> >> > > running (due to getting rescheduled). > >> >> >> > > >> >> >> > So you're rationale is that we're missing this need_resched: > >> >> >> > > >> >> >> > while (owner_running(sem, owner)) { > >> >> >> > /* abort spinning when need_resched */ > >> >> >> > if (need_resched()) { > >> >> >> > rcu_read_unlock(); > >> >> >> > return false; > >> >> >> > } > >> >> >> > } > >> >> >> > > >> >> >> > Because the owner_running() would return false, right? Yeah that > >> >> >> > makes > >> >> >> > sense, as missing a resched is a bug, as opposed to our heuristics > >> >> >> > being > >> >> >> > so painfully off. > >> >> >> > > >> >> >> > Sasha, Ming (Cc'ed), does this address the issues you guys are > >> >> >> > seeing? 
> >> >> >> > >> >> >> For the xfstest lockup, what matters is that the owner isn't > >> >> >> running, since > >> >> >> the following simple change does fix the issue: > >> >> > > >> >> > I much prefer Jason's approach, which should also take care of the > >> >> > issue, as it includes the !owner->on_cpu stop condition to stop > >> >> > spinning. > >> >> > >> >> But the check on owner->on_cpu should be moved outside the loop > >> >> because new owner can be scheduled out too, right? > >> > > >> > We should keep the owner->on_cpu check inside the loop, otherwise we > >> > could continue spinning if the owner is not running. > >> > >> So how about checking in this way outside the loop for avoiding the spin? > >> > >> if (owner) > >>return owner->on_cpu; > > > > So these owner->on_cpu checks outside of the loop "fixes" the issue as > > well, but I don't see the benefit of needing to guess why we break out > > of the spin loop (which may make things less readable) and checking > > owner->on_cpu duplicate times when one check is enough. > > I mean moving the check on owner->on_cpu outside loop, so there is > only one check for both new and old owner. If it is inside loop, > the check is only on old owner. > > That is correct to keep it inside loop if you guys are sure new > owner can't be scheduled out, but better to add comment why > it can't, looks no one explained yet. The new owner can get rescheduled. And if there's a new owner, then the spinner goes to rwsem_spin_on_owner() again and checks the new owner's on_cpu. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 11:19 +0800, Ming Lei wrote:
> On Sat, Mar 7, 2015 at 11:10 AM, Davidlohr Bueso wrote:
> > On Sat, 2015-03-07 at 10:55 +0800, Ming Lei wrote:
> >> On Sat, Mar 7, 2015 at 10:29 AM, Davidlohr Bueso wrote:
> >> > On Fri, 2015-03-06 at 18:26 -0800, Davidlohr Bueso wrote:
> >> >> That's not what this is about. New lock _owners_ need to worry about
> >> >                                    ^^^ make that "need not"
> >>
> >> Sorry, could you explain a bit why new owner can't be scheduled
> >> out(on_cpu becomes zero)? If that is possible, it still can cause
> >> soft lockup like current problem.
> >
> > Oh its not that it can't be scheduled out. The point is we don't care
> > what happens with the lock owner itself (new or not). We care about, and
> > the point of this discussion, how _other_ threads handle themselves when
> > trying to take that lock (a lock having an owner implies the lock is not
> > free, of course). So if a lock owner gets scheduled out... so what?
> > That's already taken into account by spinners.
>
> Not exactly, current problem is just in spinner because it
> ignores scheduled out owner and continues to spin, then
> cause lockup, isn't it?

Exactly my point, Ming. It's the _spinner_ that has the problem, hence
the fix in the part of the code that must decide just that. By the time
we're doing this:

	if (READ_ONCE(sem->owner))
		return true; /* new owner, continue spinning */

We need to have already taken into account the owner->on_cpu situation.
We fix spinners, not lock owners.

I'm really running out of ways to explain this, and you are going in
circles, which is getting annoying given that you haven't even tried the
other patch.
Re: softlockups in multi_cpu_stop
On Sat, Mar 7, 2015 at 11:17 AM, Jason Low wrote: > On Sat, 2015-03-07 at 11:08 +0800, Ming Lei wrote: >> On Sat, Mar 7, 2015 at 10:56 AM, Jason Low wrote: >> > On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: >> >> On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso >> >> wrote: >> >> > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: >> >> >> On Fri, 06 Mar 2015 14:15:37 -0800 >> >> >> Davidlohr Bueso wrote: >> >> >> >> >> >> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: >> >> >> > > In owner_running() there are 2 conditions that would make it return >> >> >> > > false: if the owner changed or if the owner is not running. >> >> >> > > However, >> >> >> > > that patch continues spinning if there is a "new owner" but it >> >> >> > > does not >> >> >> > > take into account that we may want to stop spinning if the owner >> >> >> > > is not >> >> >> > > running (due to getting rescheduled). >> >> >> > >> >> >> > So you're rationale is that we're missing this need_resched: >> >> >> > >> >> >> > while (owner_running(sem, owner)) { >> >> >> > /* abort spinning when need_resched */ >> >> >> > if (need_resched()) { >> >> >> > rcu_read_unlock(); >> >> >> > return false; >> >> >> > } >> >> >> > } >> >> >> > >> >> >> > Because the owner_running() would return false, right? Yeah that >> >> >> > makes >> >> >> > sense, as missing a resched is a bug, as opposed to our heuristics >> >> >> > being >> >> >> > so painfully off. >> >> >> > >> >> >> > Sasha, Ming (Cc'ed), does this address the issues you guys are >> >> >> > seeing? >> >> >> >> >> >> For the xfstest lockup, what matters is that the owner isn't running, >> >> >> since >> >> >> the following simple change does fix the issue: >> >> > >> >> > I much prefer Jason's approach, which should also take care of the >> >> > issue, as it includes the !owner->on_cpu stop condition to stop >> >> > spinning. 
>> >> >> >> But the check on owner->on_cpu should be moved outside the loop >> >> because new owner can be scheduled out too, right? >> > >> > We should keep the owner->on_cpu check inside the loop, otherwise we >> > could continue spinning if the owner is not running. >> >> So how about checking in this way outside the loop for avoiding the spin? >> >> if (owner) >>return owner->on_cpu; > > So these owner->on_cpu checks outside of the loop "fixes" the issue as > well, but I don't see the benefit of needing to guess why we break out > of the spin loop (which may make things less readable) and checking > owner->on_cpu duplicate times when one check is enough. I mean moving the check on owner->on_cpu outside loop, so there is only one check for both new and old owner. If it is inside loop, the check is only on old owner. That is correct to keep it inside loop if you guys are sure new owner can't be scheduled out, but better to add comment why it can't, looks no one explained yet. Thanks, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Commemorative linux-1.0-3.19 compressed tarball
As I did prior to the linux-3.0 release, I've created a commemorative tarball of all stable point releases from linux 1.0 to linux 3.19 to commemorate the upcoming 4.0 release, excluding minor point releases. http://ck.kolivas.org/linux-1.0-3.19.tar.lrz This was a 29GB tarball compressed to 355MB. Of course this is mostly just for fun, and as a shameless plug for lrzip. Be aware that to create the tarball I scripted up downloading of patches and recreated tarballs so many of the tarballs will not be identical files to the official release ones. The command used to create the tarball was: lrzip -U linux-1.0-3.19.tar linux-1.0-3.19.tar - Compression Ratio: 77.885. Average Compression Speed: 37.565MB/s. md5sums: b9a1e7e6762150028ec36b481df7f264 linux-1.0-3.19.tar 5f5dff66416346fdf9453825f8e4435c linux-1.0-3.19.tar.lrz tar -tvf linux-1.0-3.19.tar drwxrwxr-x con/con 0 2015-03-07 11:33 v1.0/ -rw-rw-r-- con/con 5171200 1994-03-13 10:00 v1.0/linux-1.0.tar drwxrwxr-x con/con 0 2015-03-07 11:33 v1.2/ -rw-rw-r-- con/con 9308160 1995-04-29 10:00 v1.2/linux-1.2.7.tar -rw-rw-r-- con/con 9379840 1995-08-02 10:00 v1.2/linux-1.2.13.tar -rw-rw-r-- con/con 9195520 1995-03-17 11:00 v1.2/linux-1.2.1.tar -rw-rw-r-- con/con 9359360 1995-05-03 10:00 v1.2/linux-1.2.8.tar -rw-rw-r-- con/con 9267200 1995-04-02 10:00 v1.2/linux-1.2.3.tar -rw-rw-r-- con/con 9369600 1995-06-12 10:00 v1.2/linux-1.2.10.tar -rw-rw-r-- con/con 9369600 1995-06-01 10:00 v1.2/linux-1.2.9.tar -rw-rw-r-- con/con 9267200 1995-03-27 10:00 v1.2/linux-1.2.2.tar -rw-rw-r-- con/con 9369600 1995-06-26 10:00 v1.2/linux-1.2.11.tar -rw-rw-r-- con/con 9277440 1995-04-12 10:00 v1.2/linux-1.2.5.tar -rw-rw-r-- con/con 9277440 1995-04-06 10:00 v1.2/linux-1.2.4.tar -rw-rw-r-- con/con 9379840 1995-07-25 10:00 v1.2/linux-1.2.12.tar -rw-rw-r-- con/con 9297920 1995-04-23 10:00 v1.2/linux-1.2.6.tar -rw-rw-r-- con/con 9185280 1995-03-07 11:00 v1.2/linux-1.2.0.tar drwxrwxr-x con/con 0 2015-03-07 11:02 v2.0/ -rw-rw-r-- con/con26603520 
1997-11-18 13:34 v2.0/linux-2.0.32.tar -rw-rw-r-- con/con24709120 1996-12-02 05:18 v2.0/linux-2.0.27.tar -rw-rw-r-- con/con23900160 1996-07-05 10:00 v2.0/linux-2.0.2.tar -rw-rw-r-- con/con30228480 1998-11-16 16:50 v2.0/linux-2.0.36.tar -rw-rw-r-- con/con28231680 1998-06-04 15:15 v2.0/linux-2.0.34.tar -rw-rw-r-- con/con24688640 1996-10-30 14:14 v2.0/linux-2.0.24.tar -rw-rw-r-- con/con24401920 1996-09-01 04:03 v2.0/linux-2.0.16.tar -rw-rw-r-- con/con24166400 1996-07-12 10:00 v2.0/linux-2.0.6.tar -rw-rw-r-- con/con24483840 1996-10-18 22:20 v2.0/linux-2.0.23.tar -rw-rw-r-- con/con26562560 1997-10-18 08:25 v2.0/linux-2.0.31.tar -rw-rw-r-- con/con24494080 1997-02-08 01:56 v2.0/linux-2.0.29.tar -rw-rw-r-- con/con24176640 1996-07-15 10:00 v2.0/linux-2.0.7.tar -rw-rw-r-- con/con24432640 1996-09-12 00:21 v2.0/linux-2.0.19.tar -rw-rw-r-- con/con31324160 1999-08-26 08:11 v2.0/linux-2.0.38.tar -rw-rw-r-- con/con24238080 1996-08-05 10:00 v2.0/linux-2.0.11.tar -rw-rw-r-- con/con24238080 1996-08-09 10:00 v2.0/linux-2.0.12.tar -rw-rw-r-- con/con24412160 1996-09-02 20:37 v2.0/linux-2.0.17.tar -rw-rw-r-- con/con23879680 1996-06-09 10:00 v2.0/linux-2.0.tar -rw-rw-r-- con/con31324160 1999-06-14 15:15 v2.0/linux-2.0.37.tar -rw-rw-r-- con/con24422400 1996-09-06 00:38 v2.0/linux-2.0.18.tar -rw-rw-r-- con/con24197120 1996-07-26 10:00 v2.0/linux-2.0.9.tar -rw-rw-r-- con/con24186880 1996-07-25 10:00 v2.0/linux-2.0.8.tar -rw-rw-r-- con/con29122560 1998-07-14 07:09 v2.0/linux-2.0.35.tar -rw-rw-r-- con/con24166400 1996-07-06 10:00 v2.0/linux-2.0.3.tar -rw-rw-r-- con/con24401920 1996-08-25 20:20 v2.0/linux-2.0.15.tar -rw-rw-r-- con/con24729600 1996-11-23 00:17 v2.0/linux-2.0.26.tar -rw-rw-r-- con/con31426560 2001-01-10 08:30 v2.0/linux-2.0.39.tar -rw-rw-r-- con/con24494080 1997-01-14 23:33 v2.0/linux-2.0.28.tar -rw-rw-r-- con/con24432640 1996-09-13 22:53 v2.0/linux-2.0.20.tar -rw-rw-r-- con/con24473600 1996-10-09 03:02 v2.0/linux-2.0.22.tar -rw-rw-r-- con/con24698880 1996-11-08 20:31 
v2.0/linux-2.0.25.tar -rw-rw-r-- con/con25108480 1997-04-09 02:34 v2.0/linux-2.0.30.tar -rw-rw-r-- con/con24442880 1996-09-20 23:51 v2.0/linux-2.0.21.tar -rw-rw-r-- con/con24145920 1996-07-08 10:00 v2.0/linux-2.0.4.tar -rw-rw-r-- con/con23900160 1996-07-03 10:00 v2.0/linux-2.0.1.tar -rw-rw-r-- con/con26624000 1997-12-17 09:55 v2.0/linux-2.0.33.tar -rw-rw-r-- con/con24391680 1996-08-21 01:52 v2.0/linux-2.0.14.tar -rw-rw-r-- con/con24371200 1996-08-16 20:19 v2.0/linux-2.0.13.tar -rw-rw-r-- con/con31467520 2004-02-08 18:13 v2.0/linux-2.0.40.tar -rw-rw-r-- con/con24156160 1996-07-10 10:00 v2.0/linux-2.0.5.tar -rw-rw-r-- con/con24197120 1996-07-27 10:00 v2.0/linux-2.0.10.tar drwxrwxr-x con/con 0 2015-03-07 11:06 v2.2/
Re: [PATCH v3 5/9] mtd: pxa3xx_nand: add support for the Marvell Berlin nand controller
Hi Antoine, On 03/05/2015 08:31 AM, Antoine Tenart wrote: [..] > + > +static struct pxa3xx_nand_flash berlin_builtin_flash_types[] = { > +{ "4GiB 8-bit",0xd7ec, 128, 8192, 8, 8, 4096 }, > +{ }, IMHO, supporting a specific flash shouldn't be part of this patch. In any case, why do you need this? If you can share the details about this device, it would be interesting for me to take a look. This driver's open-coded, legacy-style flash detection is nasty, and the only reason I've kept it is to avoid breaking some wacky user with some old board. In fact, maybe we can just kill it so nobody thinks it's sane. Flash detection is the NAND core's job, and duplicating it in the driver is not nice. Let's try to avoid it. BTW, nand_ids.c seems to list a similar device: EXTENDED_ID_NAND("NAND 4GiB 3,3V 8-bit", 0xD7, 4096, LP_OPTIONS), Have you tried this? -- Ezequiel García, Free Electrons Embedded Linux, Kernel and Android Engineering http://free-electrons.com
Re: softlockups in multi_cpu_stop
On Sat, Mar 7, 2015 at 11:10 AM, Davidlohr Bueso wrote: > On Sat, 2015-03-07 at 10:55 +0800, Ming Lei wrote: >> On Sat, Mar 7, 2015 at 10:29 AM, Davidlohr Bueso wrote: >> > On Fri, 2015-03-06 at 18:26 -0800, Davidlohr Bueso wrote: >> >> That's not what this is about. New lock _owners_ need to worry about >> > ^^^ make that "need >> > not" >> >> Sorry, could you explain a bit why new owner can't be scheduled >> out(on_cpu becomes zero)? If that is possible, it still can cause >> soft lockup like current problem. > > Oh its not that it can't be scheduled out. The point is we don't care > what happens with the lock owner itself (new or not). We care about, and > the point of this discussion, how _other_ threads handle themselves when > trying to take that lock (a lock having an owner implies the lock is not > free, of course). So if a lock owner gets scheduled out... so what? > That's already taken into account by spinners. Not exactly, current problem is just in spinner because it ignores scheduled out owner and continues to spin, then cause lockup, isn't it? Thanks, Ming Lei
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 11:08 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 10:56 AM, Jason Low wrote: > > On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: > >> On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso wrote: > >> > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: > >> >> On Fri, 06 Mar 2015 14:15:37 -0800 > >> >> Davidlohr Bueso wrote: > >> >> > >> >> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > >> >> > > In owner_running() there are 2 conditions that would make it return > >> >> > > false: if the owner changed or if the owner is not running. However, > >> >> > > that patch continues spinning if there is a "new owner" but it does > >> >> > > not > >> >> > > take into account that we may want to stop spinning if the owner is > >> >> > > not > >> >> > > running (due to getting rescheduled). > >> >> > > >> >> > So you're rationale is that we're missing this need_resched: > >> >> > > >> >> > while (owner_running(sem, owner)) { > >> >> > /* abort spinning when need_resched */ > >> >> > if (need_resched()) { > >> >> > rcu_read_unlock(); > >> >> > return false; > >> >> > } > >> >> > } > >> >> > > >> >> > Because the owner_running() would return false, right? Yeah that makes > >> >> > sense, as missing a resched is a bug, as opposed to our heuristics > >> >> > being > >> >> > so painfully off. > >> >> > > >> >> > Sasha, Ming (Cc'ed), does this address the issues you guys are seeing? > >> >> > >> >> For the xfstest lockup, what matters is that the owner isn't running, > >> >> since > >> >> the following simple change does fix the issue: > >> > > >> > I much prefer Jason's approach, which should also take care of the > >> > issue, as it includes the !owner->on_cpu stop condition to stop > >> > spinning. > >> > >> But the check on owner->on_cpu should be moved outside the loop > >> because new owner can be scheduled out too, right? 
> > > > We should keep the owner->on_cpu check inside the loop, otherwise we > > could continue spinning if the owner is not running. > > So how about checking in this way outside the loop for avoiding the spin? > > if (owner) >return owner->on_cpu; So these owner->on_cpu checks outside of the loop "fixes" the issue as well, but I don't see the benefit of needing to guess why we break out of the spin loop (which may make things less readable) and checking owner->on_cpu duplicate times when one check is enough. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 00/18] ARM: msm multiplatform support
On Wed, Mar 04, 2015 at 08:32:54PM +0100, Arnd Bergmann wrote: > This is my final piece of the puzzle for ARMv6/v7 multiplatform > support. In combination with the other patches that are now > at git://kernel.org/pub/scm/linux/kernel/git/arnd/playground.git > multiplatform-4.0-rc2 and the at91 and shmobile parts from other > developers, you can now build one kernel that includes all > ARMv6 and ARMv7 targets we support in Linux. > > Since mach-msm has seen very few updates over the last years, > it was more work to get to this point than the others, and > some patches are more of a band-aid than a proper solution. > Still, I think that each patch in the series is an improvement > over the status-quo and I really want to see the last one > merged into 4.1 and it depends on all the other ones. > > Stephen Boyd mentioned on IRC that he has been workin on > a similar series, and I'm more than happy to replace some > of this work with patches that he has done, as long as we > can still have the full multiplatform support for 4.1. > > Since a lot of the patches are nontrivial and I have not > been able to test any of this, I'm posting it as an RFC, > but I'm also very interested in people testing it. > I think I would support deleting mach-msm at this point. I did work on adding device tree support last year, but lost my motivation. It seems like the community has a tendency to attack things that are "old" and mach-msm seems like a constant whipping horse. I don't have plans for mach-msm. Qualcomm never cared about the remaining platforms and Google never cared either. I have no reason to care about it anymore. If someone out there still wants the code I'm game to start actively maintaining it, now that it's clear David and Bryan have completely walked away. Barring someone coming forward, it seems no one else is actively using it.
Daniel
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 11:08 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 10:56 AM, Jason Low wrote: > > On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: > >> On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso wrote: > >> > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: > >> >> On Fri, 06 Mar 2015 14:15:37 -0800 > >> >> Davidlohr Bueso wrote: > >> >> > >> >> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > >> >> > > In owner_running() there are 2 conditions that would make it return > >> >> > > false: if the owner changed or if the owner is not running. However, > >> >> > > that patch continues spinning if there is a "new owner" but it does > >> >> > > not > >> >> > > take into account that we may want to stop spinning if the owner is > >> >> > > not > >> >> > > running (due to getting rescheduled). > >> >> > > >> >> > So you're rationale is that we're missing this need_resched: > >> >> > > >> >> > while (owner_running(sem, owner)) { > >> >> > /* abort spinning when need_resched */ > >> >> > if (need_resched()) { > >> >> > rcu_read_unlock(); > >> >> > return false; > >> >> > } > >> >> > } > >> >> > > >> >> > Because the owner_running() would return false, right? Yeah that makes > >> >> > sense, as missing a resched is a bug, as opposed to our heuristics > >> >> > being > >> >> > so painfully off. > >> >> > > >> >> > Sasha, Ming (Cc'ed), does this address the issues you guys are seeing? > >> >> > >> >> For the xfstest lockup, what matters is that the owner isn't running, > >> >> since > >> >> the following simple change does fix the issue: > >> > > >> > I much prefer Jason's approach, which should also take care of the > >> > issue, as it includes the !owner->on_cpu stop condition to stop > >> > spinning. > >> > >> But the check on owner->on_cpu should be moved outside the loop > >> because new owner can be scheduled out too, right? 
> > > > We should keep the owner->on_cpu check inside the loop, otherwise we > > could continue spinning if the owner is not running. > > So how about checking in this way outside the loop for avoiding the spin? > > if (owner) >return owner->on_cpu; Ming are you reading the thread?? Have you at least tried jason's patch?? *sigh* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 10:55 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 10:29 AM, Davidlohr Bueso wrote: > > On Fri, 2015-03-06 at 18:26 -0800, Davidlohr Bueso wrote: > >> That's not what this is about. New lock _owners_ need to worry about > > ^^^ make that "need > > not" > > Sorry, could you explain a bit why new owner can't be scheduled > out(on_cpu becomes zero)? If that is possible, it still can cause > soft lockup like current problem. Oh its not that it can't be scheduled out. The point is we don't care what happens with the lock owner itself (new or not). We care about, and the point of this discussion, how _other_ threads handle themselves when trying to take that lock (a lock having an owner implies the lock is not free, of course). So if a lock owner gets scheduled out... so what? That's already taken into account by spinners.
Re: softlockups in multi_cpu_stop
On Sat, Mar 7, 2015 at 10:56 AM, Jason Low wrote: > On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: >> On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso wrote: >> > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: >> >> On Fri, 06 Mar 2015 14:15:37 -0800 >> >> Davidlohr Bueso wrote: >> >> >> >> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: >> >> > > In owner_running() there are 2 conditions that would make it return >> >> > > false: if the owner changed or if the owner is not running. However, >> >> > > that patch continues spinning if there is a "new owner" but it does >> >> > > not >> >> > > take into account that we may want to stop spinning if the owner is >> >> > > not >> >> > > running (due to getting rescheduled). >> >> > >> >> > So you're rationale is that we're missing this need_resched: >> >> > >> >> > while (owner_running(sem, owner)) { >> >> > /* abort spinning when need_resched */ >> >> > if (need_resched()) { >> >> > rcu_read_unlock(); >> >> > return false; >> >> > } >> >> > } >> >> > >> >> > Because the owner_running() would return false, right? Yeah that makes >> >> > sense, as missing a resched is a bug, as opposed to our heuristics being >> >> > so painfully off. >> >> > >> >> > Sasha, Ming (Cc'ed), does this address the issues you guys are seeing? >> >> >> >> For the xfstest lockup, what matters is that the owner isn't running, >> >> since >> >> the following simple change does fix the issue: >> > >> > I much prefer Jason's approach, which should also take care of the >> > issue, as it includes the !owner->on_cpu stop condition to stop >> > spinning. >> >> But the check on owner->on_cpu should be moved outside the loop >> because new owner can be scheduled out too, right? > > We should keep the owner->on_cpu check inside the loop, otherwise we > could continue spinning if the owner is not running. So how about checking in this way outside the loop for avoiding the spin? 
if (owner) return owner->on_cpu; Thanks, Ming Lei
Re: [RFC] With 8250 Designware UART, if writes to the LCR failed the kernel will hung up
You only hit the silicon bug if you bombard the uart with characters and simultaneously request a baud rate or framing change. I'm not sure why you would do either to the uart console. Is it possible your host machine is doing something weird? If you have the leverage, remind the SoC vendor to upgrade the serial block. Synopsys fixed this a long time ago. -Tim On Fri, Mar 6, 2015 at 8:50 AM, Peter Hurley wrote: > Hi Zhang, > > On 03/06/2015 04:11 AM, Zhang Zhen wrote: >> Hi, >> >> I'm testing 4.0-rc1 kernel on my board with 8250 Designware UART.(ARM >> Cortex-a15 single core). >> >> I found if serial is busy and writes to the LCR failed after tried >> 1000 times. >> The kernel will hung up. >> >> The system boot success after changed from: >> >> 95 static void dw8250_serial_out(struct uart_port *p, int offset, int value) >> 96 { >> 97 struct dw8250_data *d = p->private_data; >> 98 >> ... >> ... >> 112 writeb(value, p->membase + (UART_LCR << >> p->regshift)); >> 113 } >> 114 dev_err(p->dev, "Couldn't set LCR to %d\n", value); >> 115 } >> 116 } >> >> to: >> >> 95 static void dw8250_serial_out(struct uart_port *p, int offset, int value) >> 96 { >> 97 struct dw8250_data *d = p->private_data; >> 98 >> ... >> ... >> 112 writeb(value, p->membase + (UART_LCR << >> p->regshift)); >> 113 } >> 114 dev_info(p->dev, "Couldn't set LCR to %d\n", value); >>//changed here >> 115 } >> 116 } >> >> The reason is serial8250_console_write can't get port->lock because >> serial8250_do_set_termios has >> got port->lock. >> So i think here we should change from dev_err to dev_info ? > > That's not really going to help because this will still hang if the > console_loglevel is set to < KERN_INFO. > >> Any suggestions are welcome. > > Check that the port is not the uart_console() before logging the error, > like; > > if (!uart_console(p)) > dev_err(p->dev, "Couldn't ."); > > Use a global flag to note the error and check it from other contexts. > Plus, find out why you can't write LCR there. 
> > Also, consider re-designing how the 8250_dw driver implements that > "feature". > > Regards, > Peter Hurley > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso wrote: > > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: > >> On Fri, 06 Mar 2015 14:15:37 -0800 > >> Davidlohr Bueso wrote: > >> > >> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > >> > > In owner_running() there are 2 conditions that would make it return > >> > > false: if the owner changed or if the owner is not running. However, > >> > > that patch continues spinning if there is a "new owner" but it does not > >> > > take into account that we may want to stop spinning if the owner is not > >> > > running (due to getting rescheduled). > >> > > >> > So you're rationale is that we're missing this need_resched: > >> > > >> > while (owner_running(sem, owner)) { > >> > /* abort spinning when need_resched */ > >> > if (need_resched()) { > >> > rcu_read_unlock(); > >> > return false; > >> > } > >> > } > >> > > >> > Because the owner_running() would return false, right? Yeah that makes > >> > sense, as missing a resched is a bug, as opposed to our heuristics being > >> > so painfully off. > >> > > >> > Sasha, Ming (Cc'ed), does this address the issues you guys are seeing? > >> > >> For the xfstest lockup, what matters is that the owner isn't running, since > >> the following simple change does fix the issue: > > > > I much prefer Jason's approach, which should also take care of the > > issue, as it includes the !owner->on_cpu stop condition to stop > > spinning. > > But the check on owner->on_cpu should be moved outside the loop > because new owner can be scheduled out too, right? We should keep the owner->on_cpu check inside the loop, otherwise we could continue spinning if the owner is not running.
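The point about keeping the on_cpu check inside the loop can be modelled in user space. This is only a minimal sketch, not the kernel code: struct task, struct sem, model_need_resched() and model_spin_on_owner() are hypothetical stand-ins, and the RCU protection, barrier() and the post-loop checks of the real rwsem_spin_on_owner() are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: only the field the loop reads. */
struct task {
	volatile bool on_cpu;		/* true while the task runs on a CPU */
};

/* Hypothetical stand-in for rw_semaphore: only the owner pointer. */
struct sem {
	struct task *volatile owner;	/* current lock owner, NULL if none */
};

/* Hypothetical stand-in for need_resched(); always false in this model. */
static bool model_need_resched(void)
{
	return false;
}

/*
 * Returns false when the spinner should give up (owner not running or a
 * reschedule is pending), true when the loop ended because the owner
 * changed or released the lock.
 */
static bool model_spin_on_owner(struct sem *sem, struct task *owner)
{
	while (sem->owner == owner) {
		/* abort spinning when need_resched or owner is not running */
		if (!owner->on_cpu || model_need_resched())
			return false;
		/* the kernel would call cpu_relax() here */
	}
	return true;
}
```

With the check inside the loop, a false return always means "stop spinning", so the caller does not have to guess why the loop ended.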
Re: softlockups in multi_cpu_stop
On Sat, Mar 7, 2015 at 10:29 AM, Davidlohr Bueso wrote: > On Fri, 2015-03-06 at 18:26 -0800, Davidlohr Bueso wrote: >> That's not what this is about. New lock _owners_ need to worry about > ^^^ make that "need not" Sorry, could you explain a bit why new owner can't be scheduled out(on_cpu becomes zero)? If that is possible, it still can cause soft lockup like current problem. Thanks, Ming Lei
[PATCH 04/12] clocksource: Add max_cycles to clocksource structure
In order to facilitate some clocksource validation, add a max_cycles entry to the structure which will hold the maximum cycle value that can safely be multiplied without potentially causing an overflow. Cc: Dave Jones Cc: Linus Torvalds Cc: Thomas Gleixner Cc: Richard Cochran Cc: Prarit Bhargava Cc: Stephen Boyd Cc: Ingo Molnar Cc: Peter Zijlstra Signed-off-by: John Stultz --- include/linux/clocksource.h | 6 -- kernel/time/clocksource.c | 15 --- kernel/time/sched_clock.c | 2 +- 3 files changed, 17 insertions(+), 6 deletions(-) diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h index 9c78d15..63fe52f 100644 --- a/include/linux/clocksource.h +++ b/include/linux/clocksource.h @@ -56,6 +56,7 @@ struct module; * @shift: cycle to nanosecond divisor (power of two) * @max_idle_ns: max idle time permitted by the clocksource (nsecs) * @maxadj:maximum adjustment value to mult (~11%) + * @max_cycles:maximum safe cycle value which won't overflow on mult * @flags: flags describing special properties * @archdata: arch-specific data * @suspend: suspend function for the clocksource, if necessary @@ -76,7 +77,7 @@ struct clocksource { #ifdef CONFIG_ARCH_CLOCKSOURCE_DATA struct arch_clocksource_data archdata; #endif - + u64 max_cycles; const char *name; struct list_head list; int rating; @@ -189,7 +190,8 @@ extern struct clocksource * __init clocksource_default_clock(void); extern void clocksource_mark_unstable(struct clocksource *cs); extern u64 -clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask); +clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, + u64 *max_cycles); extern void clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec); diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index 4988411..e6c752b 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -469,11 +469,14 @@ static u32 clocksource_max_adjustment(struct clocksource *cs) * @shift: cycle to nanosecond divisor 
(power of two) * @maxadj:maximum adjustment value to mult (~11%) * @mask: bitmask for two's complement subtraction of non 64 bit counters + * @max_cyc:maximum cycle value before potential overflow (does not include + * any saftey margin) * * NOTE: This function includes a saftey margin of 50%, so that bad clock values * can be detected. */ -u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask) +u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, + u64 *max_cyc) { u64 max_nsecs, max_cycles; @@ -493,6 +496,10 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask) max_cycles = min(max_cycles, mask); max_nsecs = clocksource_cyc2ns(max_cycles, mult - maxadj, shift); + /* return the max_cycles value as well if requested */ + if (max_cyc) + *max_cyc = max_cycles; + /* Return 50% of the actual maximum, so we can detect bad values */ max_nsecs >>= 1; @@ -671,7 +678,8 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq) } cs->max_idle_ns = clocks_calc_max_nsecs(cs->mult, cs->shift, -cs->maxadj, cs->mask); +cs->maxadj, cs->mask, +&cs->max_cycles); } EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale); @@ -719,7 +727,8 @@ int clocksource_register(struct clocksource *cs) /* calculate max idle time permitted for this clocksource */ cs->max_idle_ns = clocks_calc_max_nsecs(cs->mult, cs->shift, -cs->maxadj, cs->mask); +cs->maxadj, cs->mask, +&cs->max_cycles); mutex_lock(&clocksource_mutex); clocksource_enqueue(cs); diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c index c794b84..d43855b 100644 --- a/kernel/time/sched_clock.c +++ b/kernel/time/sched_clock.c @@ -126,7 +126,7 @@ void __init sched_clock_register(u64 (*read)(void), int bits, new_mask = CLOCKSOURCE_MASK(bits); /* calculate how many ns until we risk wrapping */ - wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask); + wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask, NULL); new_wrap_kt = ns_to_ktime(wrap); /* update epoch for 
new counter and update epoch_ns from old counter*/ -- 1.9.1
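The optional out-parameter this patch adds can be illustrated outside the kernel. The following is a rough user-space sketch under stated assumptions: cyc2ns() and calc_max_nsecs() are hypothetical names that mirror the shape of clocksource_cyc2ns() and clocks_calc_max_nsecs() as shown in the diff, the sum mult + maxadj is assumed to be non-zero, and it is widened to 64 bits here, which the kernel code does not do.

```c
#include <stddef.h>
#include <stdint.h>

/* cycles to nanoseconds: (cycles * mult) >> shift */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}

/*
 * Returns half of the largest interval (in ns) that can be converted
 * without overflowing, and optionally reports the uncapped-by-margin
 * cycle limit through max_cyc. Pass NULL if the cycle limit is not needed.
 */
static uint64_t calc_max_nsecs(uint32_t mult, uint32_t shift, uint32_t maxadj,
			       uint64_t mask, uint64_t *max_cyc)
{
	/* largest cycle count whose product with (mult + maxadj) fits in 64 bits */
	uint64_t max_cycles = UINT64_MAX / ((uint64_t)mult + maxadj);
	uint64_t max_nsecs;

	/* the counter itself cannot report more than mask cycles */
	if (max_cycles > mask)
		max_cycles = mask;

	max_nsecs = cyc2ns(max_cycles, mult - maxadj, shift);

	/* return the max_cycles value as well if requested (no safety margin) */
	if (max_cyc)
		*max_cyc = max_cycles;

	/* keep a 50% safety margin so bad values can be detected */
	return max_nsecs >> 1;
}
```

Passing NULL, as the sched_clock caller in the diff does, keeps the old behaviour for users that only want the nanosecond value.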
[PATCH 00/12] Increased clocksource validation and cleanups (v3)
So Ingo asked me to resend this series, which is the result of earlier discussions with Linus and his suggestions around improvements to clocksource validation in the hope we can more easily catch bad hardware. There are also a few cleanups Linus suggested as well as a few I've been meaning to get to for a while. I tried to address all the feedback that had been given, adding the checks behind CONFIG_DEBUG_TIMEKEEPING. I also sorted out a sane way to print rate-limited warnings if we see cycle deltas that are too large, or if they look like small underflows. I'd like to get this queued into -tip soon so it can get as much testing in -next as possible. If there are any objections or feedback, I'd love to hear it! New in v3: * It's been a while since v2, and I don't recall any substantial changes, but I did change a function macro into a static inline per PeterZ's request. thanks -john The patches are also available via a git pull: The following changes since commit c517d838eb7d07bbe9507871fab3931deccff539: Linux 4.0-rc1 (2015-02-22 18:21:14 -0800) are available in the git repository at: https://git.linaro.org/people/john.stultz/linux.git fortglx/4.1/time for you to fetch changes up to 904d6befdcab1b9947c14dd517977782216daafa: clocksource: Add some debug info about clocksources being registered (2015-02-25 20:58:30 -0800) Cc: Dave Jones Cc: Linus Torvalds Cc: Thomas Gleixner Cc: Richard Cochran Cc: Prarit Bhargava Cc: Stephen Boyd Cc: Ingo Molnar Cc: Peter Zijlstra John Stultz (12): clocksource: Simplify clocks_calc_max_nsecs logic clocksource: Simplify logic around clocksource wrapping saftey margins clocksource: Remove clocksource_max_deferment() clocksource: Add max_cycles to clocksource structure time: Add debugging checks to warn if we see delays time: Add infrastructure to cap clocksource reads to the max_cycles value time: Try to catch clocksource delta underflows time: Add warnings when overflows or underflows are observed clocksource: Improve clocksource watchdog 
reporting clocksource: Mostly kill clocksource_register() sparc: Convert to using clocksource_register_hz() clocksource: Add some debug info about clocksources being registered arch/s390/kernel/time.c | 2 +- arch/sparc/kernel/time_32.c | 6 +- include/linux/clocksource.h | 16 - kernel/time/clocksource.c | 164 +++- kernel/time/jiffies.c | 5 +- kernel/time/sched_clock.c | 6 +- kernel/time/timekeeping.c | 116 +++ lib/Kconfig.debug | 12 8 files changed, 208 insertions(+), 119 deletions(-) -- 1.9.1
[PATCH 01/12] clocksource: Simplify clocks_calc_max_nsecs logic
The previous clocks_calc_max_nsecs had some unnecessarily complex bit logic to find the max interval that could cause multiplication overflows. Since this is not in the hot path, just do the divide to make it easier to read. The previous implementation also had a subtle issue in that it avoided overflows into signed 64-bit values, whereas the intervals are always unsigned. This resulted in overly conservative intervals, which other safety margins were then added to, reducing the intended interval length. Cc: Dave Jones Cc: Linus Torvalds Cc: Thomas Gleixner Cc: Richard Cochran Cc: Prarit Bhargava Cc: Stephen Boyd Cc: Ingo Molnar Cc: Peter Zijlstra Signed-off-by: John Stultz --- kernel/time/clocksource.c | 15 +++ 1 file changed, 3 insertions(+), 12 deletions(-) diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index 4892352..11323f4 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -476,19 +476,10 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask) /* * Calculate the maximum number of cycles that we can pass to the -* cyc2ns function without overflowing a 64-bit signed result. The -* maximum number of cycles is equal to ULLONG_MAX/(mult+maxadj) -* which is equivalent to the below. -* max_cycles < (2^63)/(mult + maxadj) -* max_cycles < 2^(log2((2^63)/(mult + maxadj))) -* max_cycles < 2^(log2(2^63) - log2(mult + maxadj)) -* max_cycles < 2^(63 - log2(mult + maxadj)) -* max_cycles < 1 << (63 - log2(mult + maxadj)) -* Please note that we add 1 to the result of the log2 to account for -* any rounding errors, ensure the above inequality is satisfied and -* no overflow will occur. +* cyc2ns function without overflowing a 64-bit result. 
*/ - max_cycles = 1ULL << (63 - (ilog2(mult + maxadj) + 1)); + max_cycles = ULLONG_MAX; + do_div(max_cycles, mult+maxadj); /* * The actual maximum number of cycles we can defer the clocksource is -- 1.9.1
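To see how conservative the removed bit logic was, the two bounds can be compared side by side in user space. This is a sketch only: ilog2_u64(), max_cycles_old() and max_cycles_new() are hypothetical helpers written for illustration, with a plain loop standing in for the kernel's ilog2() and ordinary 64-bit division standing in for do_div(); mult + maxadj is assumed to be non-zero.

```c
#include <stdint.h>

/* floor(log2(v)) for v > 0; illustrative stand-in for the kernel's ilog2() */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* old bound: a power of two chosen to stay below 2^63 / (mult + maxadj) */
static uint64_t max_cycles_old(uint32_t mult, uint32_t maxadj)
{
	return 1ULL << (63 - (ilog2_u64((uint64_t)mult + maxadj) + 1));
}

/* new bound: the exact unsigned limit, ULLONG_MAX / (mult + maxadj) */
static uint64_t max_cycles_new(uint32_t mult, uint32_t maxadj)
{
	return UINT64_MAX / ((uint64_t)mult + maxadj);
}
```

For mult = 1000 and maxadj = 110 the old bound is 2^52, while the division yields a limit more than three times larger, which is the lost interval length the commit message refers to.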
[PATCH 03/12] clocksource: Remove clocksource_max_deferment()
clocksource_max_deferment() doesn't do anything useful anymore, so zap it. Cc: Dave Jones Cc: Linus Torvalds Cc: Thomas Gleixner Cc: Richard Cochran Cc: Prarit Bhargava Cc: Stephen Boyd Cc: Ingo Molnar Cc: Peter Zijlstra Signed-off-by: John Stultz --- kernel/time/clocksource.c | 20 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index e5d00e6..4988411 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -499,20 +499,6 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask) return max_nsecs; } -/** - * clocksource_max_deferment - Returns max time the clocksource should be deferred - * @cs: Pointer to clocksource - * - */ -static u64 clocksource_max_deferment(struct clocksource *cs) -{ - u64 max_nsecs; - - max_nsecs = clocks_calc_max_nsecs(cs->mult, cs->shift, cs->maxadj, - cs->mask); - return max_nsecs; -} - #ifndef CONFIG_ARCH_USES_GETTIMEOFFSET static struct clocksource *clocksource_find_best(bool oneshot, bool skipcur) @@ -684,7 +670,8 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq) cs->maxadj = clocksource_max_adjustment(cs); } - cs->max_idle_ns = clocksource_max_deferment(cs); + cs->max_idle_ns = clocks_calc_max_nsecs(cs->mult, cs->shift, +cs->maxadj, cs->mask); } EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale); @@ -731,7 +718,8 @@ int clocksource_register(struct clocksource *cs) cs->name); /* calculate max idle time permitted for this clocksource */ - cs->max_idle_ns = clocksource_max_deferment(cs); + cs->max_idle_ns = clocks_calc_max_nsecs(cs->mult, cs->shift, +cs->maxadj, cs->mask); mutex_lock(&clocksource_mutex); clocksource_enqueue(cs); -- 1.9.1
[PATCH 06/12] time: Add infrastructure to cap clocksource reads to the max_cycles value
When calculating the current delta since the last tick, we
currently have no hard protections to prevent a multiplication
overflow from occurring.

This patch introduces infrastructure to allow a cap that
limits the read delta value to the max_cycles value, which is
where an overflow would occur.

Since this is in the hotpath, it adds the extra checking under
CONFIG_DEBUG_TIMEKEEPING.

There was some concern that capping time like this could cause
problems as we may stop expiring timers, which could go circular
if the timer that triggers time accumulation were mis-scheduled
too far in the future, which would cause time to stop.

However, since the mult overflow would result in a smaller time
value, we would effectively have the same problem there.

Cc: Dave Jones
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: John Stultz
---
 kernel/time/timekeeping.c | 39 +++
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 7e9d433..8b9e328 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -134,11 +134,34 @@ static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 			" the %s 50%% safety margin (%lld)\n",
 			offset, name, max_cycles>>1);
 }
+
+static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
+{
+	cycle_t cycle_now, delta;
+
+	/* read clocksource */
+	cycle_now = tkr->read(tkr->clock);
+
+	/* calculate the delta since the last update_wall_time */
+	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+
+	/* Cap delta value to the max_cycles values to avoid mult overflows */
+	if (unlikely(delta > tkr->clock->max_cycles))
+		delta = tkr->clock->max_cycles;
+
+	return delta;
+}
 #else
 static inline
 void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 {
 }
+static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
+{
+	/* calculate the delta since the last update_wall_time */
+	return clocksource_delta(tkr->read(tkr->clock), tkr->cycle_last,
+				 tkr->mask);
+}
 #endif

 /**
@@ -216,14 +239,10 @@ static inline u32 arch_gettimeoffset(void) { return 0; }

 static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 {
-	cycle_t cycle_now, delta;
+	cycle_t delta;
 	s64 nsec;

-	/* read clocksource: */
-	cycle_now = tkr->read(tkr->clock);
-
-	/* calculate the delta since the last update_wall_time: */
-	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+	delta = timekeeping_get_delta(tkr);

 	nsec = delta * tkr->mult + tkr->xtime_nsec;
 	nsec >>= tkr->shift;
@@ -235,14 +254,10 @@ static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 static inline s64 timekeeping_get_ns_raw(struct timekeeper *tk)
 {
 	struct clocksource *clock = tk->tkr.clock;
-	cycle_t cycle_now, delta;
+	cycle_t delta;
 	s64 nsec;

-	/* read clocksource: */
-	cycle_now = tk->tkr.read(clock);
-
-	/* calculate the delta since the last update_wall_time: */
-	delta = clocksource_delta(cycle_now, tk->tkr.cycle_last, tk->tkr.mask);
+	delta = timekeeping_get_delta(&tk->tkr);

 	/* convert delta to nanoseconds. */
 	nsec = clocksource_cyc2ns(delta, clock->mult, clock->shift);
--
1.9.1
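As a rough illustration of why the cap helps, here is a minimal userspace
sketch, not the kernel code. The helper names (cap_delta, cyc_to_ns,
max_cycles_for) are made up for this example, and max_cycles_for() is a
simplification that ignores the NTP adjustment the kernel accounts for.
It shows that an uncapped delta can wrap the 64-bit multiply and produce
a smaller time value, while the capped delta cannot.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cycle_t */
typedef uint64_t cycle_t;

/* Clamp a clocksource delta to max_cycles, as the debug path does */
static cycle_t cap_delta(cycle_t delta, cycle_t max_cycles)
{
	return delta > max_cycles ? max_cycles : delta;
}

/* cycles -> ns with the usual mult/shift scheme; wraps on overflow */
static uint64_t cyc_to_ns(cycle_t delta, uint32_t mult, uint32_t shift)
{
	return (delta * mult) >> shift;
}

/* Largest delta for which delta * mult still fits in 64 bits */
static cycle_t max_cycles_for(uint32_t mult)
{
	return UINT64_MAX / mult;
}
```

With mult = 1 << 24 and shift = 24 (one cycle per nanosecond, values picked
only for the example), a delta of 1 << 40 wraps to zero nanoseconds when
uncapped, but converts to just under 2^40 ns once capped.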
[PATCH 09/12] clocksource: Improve clocksource watchdog reporting
The clocksource watchdog reporting has been less helpful than
desired, as it just printed the delta between the two clocksources.
This prevents any useful analysis of why the skew occurred.

Thus this patch tries to improve the output when we mark a
clocksource as unstable, printing out the cycle last and now values
for both the current clocksource and the watchdog clocksource.
This will allow us to see if the result was due to a false positive
caused by a problematic watchdog.

Signed-off-by: John Stultz
---
 kernel/time/clocksource.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index e6c752b..51c7b3a 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -142,13 +142,6 @@ static void __clocksource_unstable(struct clocksource *cs)
 	schedule_work(&watchdog_work);
 }

-static void clocksource_unstable(struct clocksource *cs, int64_t delta)
-{
-	printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns)\n",
-	       cs->name, delta);
-	__clocksource_unstable(cs);
-}
-
 /**
  * clocksource_mark_unstable - mark clocksource unstable via watchdog
  * @cs:clocksource to be marked unstable
@@ -174,7 +167,7 @@ void clocksource_mark_unstable(struct clocksource *cs)
 static void clocksource_watchdog(unsigned long data)
 {
 	struct clocksource *cs;
-	cycle_t csnow, wdnow, delta;
+	cycle_t csnow, wdnow, cslast, wdlast, delta;
 	int64_t wd_nsec, cs_nsec;
 	int next_cpu, reset_pending;

@@ -213,6 +206,8 @@ static void clocksource_watchdog(unsigned long data)

 		delta = clocksource_delta(csnow, cs->cs_last, cs->mask);
 		cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+		wdlast = cs->wd_last; /* save these incase we print them */
+		cslast = cs->cs_last;
 		cs->cs_last = csnow;
 		cs->wd_last = wdnow;

@@ -221,7 +216,15 @@ static void clocksource_watchdog(unsigned long data)

 		/* Check the deviation from the watchdog clocksource. */
 		if ((abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD)) {
-			clocksource_unstable(cs, cs_nsec - wd_nsec);
+			pr_warn("Watchdog: clocksource %s unstable\n",
+				cs->name);
+			pr_warn(" "
+				"%s wd_now: %llx wd_last: %llx mask: %llx\n",
+				watchdog->name, wdnow, wdlast, watchdog->mask);
+			pr_warn(" "
+				"%s cs_now: %llx cs_last: %llx mask: %llx\n",
+				cs->name, csnow, cslast, cs->mask);
+			__clocksource_unstable(cs);
 			continue;
 		}

--
1.9.1
[PATCH 08/12] time: Add warnings when overflows or underflows are observed
It was suggested that the underflow/overflow protection
should probably throw some sort of warning out, rather
than just silently fixing the issue.

So this patch adds some warnings here. The flag variables
used are not protected by locks, but since we can't print
from the reading functions, just being able to say we
saw an issue in the update interval is useful enough,
and can be slightly racy without real consequence.

The big complication is that we're only under a read
seqlock, so the data could shift under us during
our calculation to see if there was a problem. This
patch avoids this issue by nesting another seqlock
which allows us to snapshot just the required values
atomically. So we shouldn't see false positives.

I also added some basic ratelimiting here, since
on one build machine w/ skewed TSCs it was fairly
noisy at bootup.

Cc: Dave Jones
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: John Stultz
---
 kernel/time/timekeeping.c | 58 +--
 1 file changed, 51 insertions(+), 7 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 4e8ccde..5f62308 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -119,11 +119,23 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
 }

 #ifdef CONFIG_DEBUG_TIMEKEEPING
+#define WARNINGFREQ (HZ*300) /* 5 minute rate-limiting */
+/*
+ * These simple flag variables are managed
+ * without locks, which is racy, but ok since
+ * we don't really care about being super
+ * precise about how many events were seen,
+ * just that a problem was observed.
+ */
+static int timekeeping_underflow_seen;
+static int timekeeping_overflow_seen;
+
 static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 {

 	cycle_t max_cycles = tk->tkr.clock->max_cycles;
 	const char *name = tk->tkr.clock->name;
+	static long last_warning; /* we always hold write on timekeeper lock */

 	if (offset > max_cycles)
 		printk_deferred("ERROR: cycle offset (%lld) is larger then"
@@ -133,28 +145,60 @@ static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 		printk_deferred("WARNING: cycle offset (%lld) is past"
 			" the %s 50%% safety margin (%lld)\n",
 			offset, name, max_cycles>>1);
+
+	if (timekeeping_underflow_seen) {
+		if (jiffies - last_warning > WARNINGFREQ) {
+			printk_deferred("WARNING: Clocksource underflow observed\n");
+			last_warning = jiffies;
+		}
+		timekeeping_underflow_seen = 0;
+	}
+	if (timekeeping_overflow_seen) {
+		if (jiffies - last_warning > WARNINGFREQ) {
+			printk_deferred("WARNING: Clocksource overflow observed\n");
+			last_warning = jiffies;
+		}
+		timekeeping_overflow_seen = 0;
+	}
+
 }

 static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
 {
-	cycle_t cycle_now, delta;
+	cycle_t now, last, mask, max, delta;
+	unsigned int seq;

-	/* read clocksource */
-	cycle_now = tkr->read(tkr->clock);
+	/*
+	 * Since we're called holding a seqlock, the data may shift
+	 * under us while we're doign the calculation. This can cause
+	 * false positives, since we'd note a problem but throw the
+	 * results away. So nest another seqlock here to atomically
+	 * grab the points we are checking with.
+	 */
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+		now = tkr->read(tkr->clock);
+		last = tkr->cycle_last;
+		mask = tkr->mask;
+		max = tkr->clock->max_cycles;
+	} while (read_seqcount_retry(&tk_core.seq, seq));

-	/* calculate the delta since the last update_wall_time */
-	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+	delta = clocksource_delta(now, last, mask);

 	/*
 	 * Try to catch underflows by checking if we are seeing small
 	 * mask-relative negative values.
 	 */
-	if (unlikely((~delta & tkr->mask) < (tkr->mask >> 3)))
+	if (unlikely((~delta & mask) < (mask >> 3))) {
+		timekeeping_underflow_seen = 1;
 		delta = 0;
+	}

 	/* Cap delta value to the max_cycles values to avoid mult overflows */
-	if (unlikely(delta > tkr->clock->max_cycles))
+	if (unlikely(delta > max)) {
+		timekeeping_overflow_seen = 1;
 		delta = tkr->clock->max_cycles;
+	}

 	return delta;
 }
--
1.9.1
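The flag-plus-ratelimit pattern described above can be sketched in
userspace roughly as follows. This is only an illustration under stated
assumptions: struct warn_state, note_problem() and should_warn() are
hypothetical names, the tick counter stands in for jiffies, and
WARN_INTERVAL is an arbitrary value rather than the kernel's HZ*300.

```c
#include <stdbool.h>

#define WARN_INTERVAL 300 /* ticks between warnings; arbitrary for this sketch */

struct warn_state {
	int seen;          /* set by the read path, cleared by the update path */
	long last_warning; /* tick count of the last emitted warning */
};

/* Read path: only record that a problem happened, never print here */
static void note_problem(struct warn_state *ws)
{
	ws->seen = 1;
}

/* Update path: returns true if a warning should be printed now */
static bool should_warn(struct warn_state *ws, long now)
{
	bool warn = false;

	if (ws->seen) {
		if (now - ws->last_warning > WARN_INTERVAL) {
			warn = true;
			ws->last_warning = now;
		}
		/* The flag is cleared even when the warning is suppressed */
		ws->seen = 0;
	}
	return warn;
}
```

The design point is that the read path only sets a flag, and the decision
to print is made later in the update path, where printing is possible.
Losing an occasional event to a race only changes how many problems are
counted, not whether one is reported.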
[PATCH 0/2] i2c_imc: New driver, at long last
This adds i2c_imc, a driver for the SMBUS lines on DIMM slots on modern
Intel server chips. Conceptually, I like it a lot -- it's a driver for
a bus for which we know the exact topology a priori. That means that
we can actually enumerate the things on the bus reasonably cleanly.

This driver is weird, but I don't think that merging it will cause
problems, and I believe that there are Real Users (tm) of this driver.

It has two caveats.

Big caveat:

Lots of things like to touch this bus, such as a BMC, the memory
controller power management stuff, and maybe even SMM code. Intel
forgot to define a way to arbitrate between them, and the SMBUS master
regs aren't designed to be poked by multiple things at once. The upshot
is that loading this driver is generally unsafe. The driver takes some
measures to detect contention for the registers and shut itself down,
but no one should rely on that.

My understanding is that there's work in ACPI land to define an
arbitration mechanism. Once this happens, we can support it in this
driver.

In the meantime, this driver has actual users, and there are
motherboards specifically designed to be used with a driver like this.
(I have such a board, and you can even buy them on regular online
shopping sites.) Therefore, I added a module parameter called
allow_unsafe_access. If unset, the driver will refuse to load with an
informative message. If set, the driver will pr_warn and load.

Little caveat:

This submission only supports Sandy Bridge. I'd rather get it merged
without Ivy Bridge and Haswell support and add them later (should be
easy) rather than trying to make the driver support all possible
hardware before merging it.

Changes:

This is an updated resubmission after about a year of thumb twiddling.

Andy Lutomirski (2):
  i2c_imc: New driver for Intel's iMC, found on LGA2011 chips
  i2c, i2c_imc: Add DIMM bus code

 drivers/i2c/busses/Kconfig    |  22 ++
 drivers/i2c/busses/Makefile   |   5 +
 drivers/i2c/busses/dimm-bus.c |  97 +++
 drivers/i2c/busses/i2c-imc.c  | 586 ++
 include/linux/i2c/dimm-bus.h  |  24 ++
 5 files changed, 734 insertions(+)
 create mode 100644 drivers/i2c/busses/dimm-bus.c
 create mode 100644 drivers/i2c/busses/i2c-imc.c
 create mode 100644 include/linux/i2c/dimm-bus.h

--
2.1.0
[PATCH 11/12] sparc: Convert to using clocksource_register_hz()
While cleaning up some clocksource code, I noticed the
time_32 implementation uses the hz2mult helper, but doesn't
use the clocksource_register_hz() method.

I don't believe the sparc clocksource is a default
clocksource, so we shouldn't need to self-define
the mult/shift pair.

So convert the time_32.c implementation to use
clocksource_register_hz().

Untested.

Cc: Dave Jones
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: "David S. Miller"
Acked-by: David S. Miller
Signed-off-by: John Stultz
---
 arch/sparc/kernel/time_32.c | 6 +
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/sparc/kernel/time_32.c b/arch/sparc/kernel/time_32.c
index a31c0c8..18147a5 100644
--- a/arch/sparc/kernel/time_32.c
+++ b/arch/sparc/kernel/time_32.c
@@ -181,17 +181,13 @@ static struct clocksource timer_cs = {
 	.rating	= 100,
 	.read	= timer_cs_read,
 	.mask	= CLOCKSOURCE_MASK(64),
-	.shift	= 2,
 	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
 };

 static __init int setup_timer_cs(void)
 {
 	timer_cs_enabled = 1;
-	timer_cs.mult = clocksource_hz2mult(sparc_config.clock_rate,
-					    timer_cs.shift);
-
-	return __clocksource_register(&timer_cs);
+	return clocksource_register_hz(&timer_cs, sparc_config.clock_rate);
 }

 #ifdef CONFIG_SMP
--
1.9.1
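For context on what a self-defined mult/shift pair means, here is a
small userspace sketch of the hz-to-mult conversion. It is an
approximation written for illustration, not the kernel helper itself:
hz2mult() and cyc2ns() are local names, and the 1 MHz and 3 MHz clock
rates are made-up examples, not the actual sparc clock rate.

```c
#include <stdint.h>

/* Userspace rendition of an hz -> mult conversion, rounding to nearest */
static uint32_t hz2mult(uint32_t hz, uint32_t shift)
{
	uint64_t tmp = (uint64_t)1000000000 << shift;

	tmp += hz / 2; /* round before the integer division */
	return (uint32_t)(tmp / hz);
}

/* cycles -> ns using the mult/shift pair */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}
```

With a fixed shift of 2, a 1 MHz clock converts exactly (mult = 4000),
but a hypothetical 3 MHz clock gets mult = 1333, so one million cycles
convert to 333250000 ns instead of roughly 333333333 ns, an error of
about 250 parts per million. A hard-coded small shift limits precision,
which is one reason to let the registration helper choose the pair from
the frequency.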
[PATCH 05/12] time: Add debugging checks to warn if we see delays
Recently there have been some requests for better sanity
checking in the time code, so that it's clearer
when something is going wrong, since timekeeping issues
could manifest in a large number of strange ways
in various subsystems.

Thus, this patch adds some extra infrastructure to
add a check to update_wall_time that prints warnings if we
see the call delayed beyond the max_cycles overflow
point, or beyond the clocksource max_idle_ns value,
which is currently 50% of the overflow point.

This extra infrastructure is conditionalized
behind a new CONFIG_DEBUG_TIMEKEEPING option
also added in this patch.

Tested this a bit by halting qemu for specified
lengths of time to trigger the warnings.

Cc: Dave Jones
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: John Stultz
---
 kernel/time/jiffies.c     |  1 +
 kernel/time/timekeeping.c | 26 ++
 lib/Kconfig.debug         | 12
 3 files changed, 39 insertions(+)

diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index a6a5bf5..7e41390 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -71,6 +71,7 @@ static struct clocksource clocksource_jiffies = {
 	.mask		= 0xffffffff, /*32bits*/
 	.mult		= NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
 	.shift		= JIFFIES_SHIFT,
+	.max_cycles	= 10,
 };

 __cacheline_aligned_in_smp DEFINE_SEQLOCK(jiffies_lock);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 91db941..7e9d433 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -118,6 +118,29 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
 	tk->offs_boot = ktime_add(tk->offs_boot, delta);
 }

+#ifdef CONFIG_DEBUG_TIMEKEEPING
+static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+
+	cycle_t max_cycles = tk->tkr.clock->max_cycles;
+	const char *name = tk->tkr.clock->name;
+
+	if (offset > max_cycles)
+		printk_deferred("ERROR: cycle offset (%lld) is larger then"
+			" allowed %s max_cycles (%lld)\n",
+			offset, name, max_cycles);
+	else if (offset > (max_cycles >> 1))
+		printk_deferred("WARNING: cycle offset (%lld) is past"
+			" the %s 50%% safety margin (%lld)\n",
+			offset, name, max_cycles>>1);
+}
+#else
+static inline
+void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+}
+#endif
+
 /**
  * tk_setup_internals - Set up internals to use clocksource clock.
  *
@@ -1630,6 +1653,9 @@ void update_wall_time(void)
 	if (offset < real_tk->cycle_interval)
 		goto out;

+	/* Do some additional sanity checking */
+	timekeeping_check_update(real_tk, offset);
+
 	/*
 	 * With NO_HZ we may have to accumulate many cycle_intervals
 	 * (think "ticks") worth of time at once. To do this efficiently,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c5cefb3..32065f6 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -865,6 +865,18 @@ config SCHED_STACK_END_CHECK
 	  data corruption or a sporadic crash at a later stage once the region
 	  is examined. The runtime overhead introduced is minimal.

+config DEBUG_TIMEKEEPING
+	bool "Enable extra timekeeping sanity checking"
+	help
+	  This option will enable additional timekeeping sanity checks
+	  which may be helpful when diagnoising issues where timekeeping
+	  problems are suspected.
+
+	  This may include checks in the timekeeping hotpaths, so this
+	  option may have a performance impact to some workloads.
+
+	  If unsure, say N.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
--
1.9.1
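The two thresholds checked in update_wall_time can be summarized with a
small userspace sketch. The enum and classify_offset() are hypothetical
names introduced only for this illustration; the comparison logic follows
the check added by the patch.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch */
enum tk_check {
	TK_OK,           /* offset is within the 50% safety margin */
	TK_WARN_HALF,    /* offset is past half of max_cycles */
	TK_ERR_OVERFLOW  /* offset is past max_cycles */
};

/* Classify an update offset the same way the debug check does */
static enum tk_check classify_offset(uint64_t offset, uint64_t max_cycles)
{
	if (offset > max_cycles)
		return TK_ERR_OVERFLOW;
	if (offset > (max_cycles >> 1))
		return TK_WARN_HALF;
	return TK_OK;
}
```

For example, with max_cycles = 1000 an offset of 500 is still fine, 501
would trigger the warning, and 1001 would trigger the error.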
[PATCH 2/2] i2c, i2c_imc: Add DIMM bus code
Add i2c_scan_dimm_bus to declare that a particular i2c_adapter
contains DIMMs. This will probe (and autoload modules!) for useful
SMBUS devices that live on DIMMs. i2c_imc calls it.

As more SMBUS-addressable DIMM components become supported, this
code can be extended to probe for them.

Signed-off-by: Andy Lutomirski
---
 drivers/i2c/busses/Kconfig    |  4 ++
 drivers/i2c/busses/Makefile   |  4 ++
 drivers/i2c/busses/dimm-bus.c | 97 +++
 drivers/i2c/busses/i2c-imc.c  |  3 ++
 include/linux/i2c/dimm-bus.h  | 24 +++
 5 files changed, 132 insertions(+)
 create mode 100644 drivers/i2c/busses/dimm-bus.c
 create mode 100644 include/linux/i2c/dimm-bus.h

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index d6b9ce164fbf..2ea6648492eb 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -149,6 +149,10 @@ config I2C_ISMT
 	  This driver can also be built as a module. If so, the module will be
 	  called i2c-ismt.

+config I2C_DIMM_BUS
+	tristate
+	default n
+
 config I2C_IMC
 	tristate "Intel iMC (LGA 2011) SMBus Controller"
 	depends on PCI && X86
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 4287c891e782..a01bdcc0e239 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -25,6 +25,10 @@ obj-$(CONFIG_I2C_SIS96X)	+= i2c-sis96x.o
 obj-$(CONFIG_I2C_VIA)		+= i2c-via.o
 obj-$(CONFIG_I2C_VIAPRO)	+= i2c-viapro.o

+# DIMM busses
+obj-$(CONFIG_I2C_DIMM_BUS)	+= dimm-bus.o
+obj-$(CONFIG_I2C_IMC)		+= i2c-imc.o
+
 # Mac SMBus host controller drivers
 obj-$(CONFIG_I2C_HYDRA)		+= i2c-hydra.o
 obj-$(CONFIG_I2C_POWERMAC)	+= i2c-powermac.o
diff --git a/drivers/i2c/busses/dimm-bus.c b/drivers/i2c/busses/dimm-bus.c
new file mode 100644
index 000000000000..096842811199
--- /dev/null
+++ b/drivers/i2c/busses/dimm-bus.c
@@ -0,0 +1,97 @@
+/*
+ * Copyright (c) 2013 Andrew Lutomirski
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include
+#include
+#include
+#include
+
+static bool probe_addr(struct i2c_adapter *adapter, int addr)
+{
+	/*
+	 * So far, all known devices that live on DIMMs can be safely
+	 * and reliably detected by trying to read a byte at address
+	 * zero. (The exception is the SPD write protection control,
+	 * which can't be probed and requires special hardware and/or
+	 * quick writes to access, and has no driver.)
+	 */
+	union i2c_smbus_data dummy;
+	return i2c_smbus_xfer(adapter, addr, 0, I2C_SMBUS_READ, 0,
+			      I2C_SMBUS_BYTE_DATA, &dummy) >= 0;
+}
+
+/**
+ * i2c_scan_dimm_bus() - Scans an SMBUS segment known to contain DIMMs
+ * @adapter: The SMBUS adapter to scan
+ *
+ * This function tells the DIMM-bus code that the adapter is known to
+ * contain DIMMs. i2c_scan_dimm_bus will probe for devices known to
+ * live on DIMMs.
+ *
+ * Do NOT call this function on general-purpose system SMBUS segments
+ * unless you know that the only things on the bus are DIMMs.
+ * Otherwise is it very likely to mis-identify other things on the
+ * bus.
+ *
+ * Callers are advised not to set adapter->class = I2C_CLASS_SPD.
+ */
+void i2c_scan_dimm_bus(struct i2c_adapter *adapter)
+{
+	struct i2c_board_info info = {};
+	int slot;
+
+	/*
+	 * We probe with "read byte data". If any DIMM SMBUS driver can't
+	 * support that access type, this function should be updated.
+	 */
+	if (WARN_ON(!i2c_check_functionality(adapter,
+					     I2C_FUNC_SMBUS_READ_BYTE_DATA)))
+		return;
+
+	/*
+	 * Addresses on DIMMs use the three low bits to identify the slot
+	 * and the four high bits to identify the device type. Known
+	 * devices are:
+	 *
+	 * - 0x50 - 0x57: SPD (Serial Presence Detect) EEPROM
+	 * - 0x30 - 0x37: SPD WP control -- not worth trying to probe
+	 * - 0x18 - 0x1f: TSOD (Temperature Sensor on DIMM)
+	 *
+	 * There may be more some day.
+	 */
+	for (slot = 0; slot < 8; slot++) {
+		/* If there's no SPD, then assume there's no DIMM here. */
+		if (!probe_addr(adapter, 0x50 | slot))
+			continue;
+
+
[PATCH 1/2] i2c_imc: New driver for Intel's iMC, found on LGA2011 chips
Sandy Bridge Xeon and Extreme chips have integrated memory
controllers with (rather limited) onboard SMBUS masters. This
driver gives access to the bus.

There are various groups working on standardizing a way to arbitrate
access to the bus between the OS, SMM firmware, a BMC, hardware
thermal control, etc. In the meantime, running this driver is
unsafe except under special circumstances. Nonetheless, this driver
has real users.

As a compromise, the driver will refuse to load unless
i2c_imc.allow_unsafe_access=Y. When safe access becomes available,
we can leave this option as a way for legacy users to run the
driver, and we'll allow the driver to load by default if safe bus
access is available.

Signed-off-by: Andy Lutomirski
---
 drivers/i2c/busses/Kconfig   |  18 ++
 drivers/i2c/busses/Makefile  |   1 +
 drivers/i2c/busses/i2c-imc.c | 583 +++
 3 files changed, 602 insertions(+)
 create mode 100644 drivers/i2c/busses/i2c-imc.c

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index ab838d9e28b6..d6b9ce164fbf 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -149,6 +149,24 @@ config I2C_ISMT
 	  This driver can also be built as a module. If so, the module will be
 	  called i2c-ismt.

+config I2C_IMC
+	tristate "Intel iMC (LGA 2011) SMBus Controller"
+	depends on PCI && X86
+	select I2C_DIMM_BUS
+	help
+	  If you say yes to this option, support will be included for the Intel
+	  Integrated Memory Controller SMBus host controller interface. This
+	  controller is found on LGA 2011 Xeons and Core i7 Extremes.
+
+	  There are currently no systems on which the kernel knows that it can
+	  safely enable this driver. For now, you need to pass this driver a
+	  scary module parameter, and you should only pass that parameter if you
+	  have a special motherboard and know exactly what you are doing.
+	  Special motherboards include the Supermicro X9DRH-iF-NV.
+
+	  This driver can also be built as a module. If so, the module will be
+	  called i2c-imc.
+
 config I2C_PIIX4
 	tristate "Intel PIIX4 and compatible (ATI/AMD/Serverworks/Broadcom/SMSC)"
 	depends on PCI
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 56388f658d2f..4287c891e782 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_I2C_AMD8111)	+= i2c-amd8111.o
 obj-$(CONFIG_I2C_I801)		+= i2c-i801.o
 obj-$(CONFIG_I2C_ISCH)		+= i2c-isch.o
 obj-$(CONFIG_I2C_ISMT)		+= i2c-ismt.o
+obj-$(CONFIG_I2C_IMC)		+= i2c-imc.o
 obj-$(CONFIG_I2C_NFORCE2)	+= i2c-nforce2.o
 obj-$(CONFIG_I2C_NFORCE2_S4985)	+= i2c-nforce2-s4985.o
 obj-$(CONFIG_I2C_PIIX4)		+= i2c-piix4.o
diff --git a/drivers/i2c/busses/i2c-imc.c b/drivers/i2c/busses/i2c-imc.c
new file mode 100644
index 000000000000..2dbf171114c6
--- /dev/null
+++ b/drivers/i2c/busses/i2c-imc.c
@@ -0,0 +1,583 @@
+/*
+ * Copyright (c) 2013 Andrew Lutomirski
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * The datasheet can be found here, for example:
+ * http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-1600-2600-vol-2-datasheet.pdf
+ *
+ * There seem to be quite a few bugs or spec errors, though:
+ *
+ * - A successful transaction sets WOD and RDO.
+ *
+ * - The docs for TSOD_POLL_EN make no sense (see imc_channel_claim).
+ *
+ * - Erratum BT109, which says:
+ *
+ *   The processor may not complete SMBus (System Management Bus)
+ *   transactions targeting the TSOD (Temperature Sensor On DIMM)
+ *   when Package C-States are enabled. Due to this erratum, if the
+ *   processor transitions into a Package C-State while an SMBus
+ *   transaction with the TSOD is in process, the processor will
+ *   suspend receipt of the transaction. The transaction completes
+ *   while the processor is in a Package C-State. Upon exiting
+ *   Package C-State, the processor will attempt to resume the
+ *   SMBus transaction, detect a protocol violation, and log an
+ *   error.
+ *
+ *   The description notwithstanding, I've seen
[PATCH 12/12] clocksource: Add some debug info about clocksources being registered
Print the mask, max_cycles, and max_idle_ns values for
clocksources being registered.

Cc: Dave Jones
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: John Stultz
---
 kernel/time/clocksource.c | 4
 1 file changed, 4 insertions(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 3f24bb3..9b75316 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -697,6 +697,10 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 	cs->max_idle_ns = clocks_calc_max_nsecs(cs->mult, cs->shift,
 						 cs->maxadj, cs->mask,
 						 &cs->max_cycles);
+
+	pr_info("clocksource %s: mask: 0x%llx max_cycles: 0x%llx, max_idle_ns: %lld ns\n",
+		cs->name, cs->mask, cs->max_cycles, cs->max_idle_ns);
+
 }
 EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);

--
1.9.1
[PATCH 10/12] clocksource: Mostly kill clocksource_register()
A long-running project has been to clean up remaining uses
of clocksource_register(), replacing it with the simpler
clocksource_register_khz/hz().

However, there are a few cases where we need to self-define
our mult/shift values, so switch the function to a more
obviously internal __clocksource_register(), and consolidate
much of the internal logic so we don't have duplication.

Cc: Dave Jones
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: "David S. Miller"
Cc: Martin Schwidefsky
Signed-off-by: John Stultz
---
 arch/s390/kernel/time.c     |  2 +-
 arch/sparc/kernel/time_32.c |  2 +-
 include/linux/clocksource.h | 10 +-
 kernel/time/clocksource.c   | 83 +++--
 kernel/time/jiffies.c       |  4 +--
 5 files changed, 47 insertions(+), 54 deletions(-)

diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index 20660dd..6c273cd 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -283,7 +283,7 @@ void __init time_init(void)
 	if (register_external_irq(EXT_IRQ_TIMING_ALERT, timing_alert_interrupt))
 		panic("Couldn't request external interrupt 0x1406");

-	if (clocksource_register(&clocksource_tod) != 0)
+	if (__clocksource_register(&clocksource_tod) != 0)
 		panic("Could not register TOD clock source");

 	/* Enable TOD clock interrupts on the boot cpu. */
diff --git a/arch/sparc/kernel/time_32.c b/arch/sparc/kernel/time_32.c
index 2f80d23..a31c0c8 100644
--- a/arch/sparc/kernel/time_32.c
+++ b/arch/sparc/kernel/time_32.c
@@ -191,7 +191,7 @@ static __init int setup_timer_cs(void)
 	timer_cs.mult = clocksource_hz2mult(sparc_config.clock_rate,
 					    timer_cs.shift);

-	return clocksource_register(&timer_cs);
+	return __clocksource_register(&timer_cs);
 }

 #ifdef CONFIG_SMP
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 63fe52f..c064349 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -179,7 +179,6 @@ static inline s64 clocksource_cyc2ns(cycle_t cycles, u32 mult, u32 shift)
 }


-extern int clocksource_register(struct clocksource*);
 extern int clocksource_unregister(struct clocksource*);
 extern void clocksource_touch_watchdog(void);
 extern struct clocksource* clocksource_get_next(void);
@@ -204,6 +203,15 @@ __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
 extern void
 __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq);

+/*
+ * Dont' call this unless you're a default clocksource
+ * (AKA: jiffies) and absolutely have to.
+ */
+static inline int __clocksource_register(struct clocksource *cs)
+{
+	return __clocksource_register_scale(cs, 1, 0);
+}
+
 static inline int clocksource_register_hz(struct clocksource *cs, u32 hz)
 {
 	return __clocksource_register_scale(cs, 1, hz);
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 51c7b3a..3f24bb3 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -648,38 +648,52 @@ static void clocksource_enqueue(struct clocksource *cs)
 void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 {
 	u64 sec;
+
 	/*
-	 * Calc the maximum number of seconds which we can run before
-	 * wrapping around. For clocksources which have a mask > 32bit
-	 * we need to limit the max sleep time to have a good
-	 * conversion precision. 10 minutes is still a reasonable
-	 * amount. That results in a shift value of 24 for a
-	 * clocksource with mask >= 40bit and f >= 4GHz. That maps to
-	 * ~ 0.06ppm granularity for NTP.
+	 * Default clocksources are *special* and self-define their mult/shift.
+	 * But, you're not special, so you should specify a freq value.
 	 */
-	sec = cs->mask;
-	do_div(sec, freq);
-	do_div(sec, scale);
-	if (!sec)
-		sec = 1;
-	else if (sec > 600 && cs->mask > UINT_MAX)
-		sec = 600;
-
-	clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
-			       NSEC_PER_SEC / scale, sec * scale);
-
+	if (freq) {
+		/*
+		 * Calc the maximum number of seconds which we can run before
+		 * wrapping around. For clocksources which have a mask > 32bit
+		 * we need to limit the max sleep time to have a good
+		 * conversion precision. 10 minutes is still a reasonable
+		 * amount. That results in a shift value of 24 for a
+		 * clocksource with mask >= 40bit and f >= 4GHz. That maps to
+		 * ~ 0.06ppm granularity for NTP.
+		 */
+		sec = cs->mask;
+		do_div(sec, freq);
+		do_div(sec, scale);
+		if (!sec)
+			sec = 1;
+
[PATCH 07/12] time: Try to catch clocksource delta underflows
In the case where there is a broken clocksource
where there are multiple actual clocks that
aren't perfectly aligned, we may see small "negative"
deltas when we subtract cycle_last from now.

The values are actually negative with respect
to the clocksource mask value, not necessarily
negative if cast to an s64, but we can detect them by
checking whether the delta is a small
(relative to the mask) negative value (again,
negative relative to the mask).

If so, we assume we jumped backwards somehow
and instead use zero for our delta.

Cc: Dave Jones
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Stephen Boyd
Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: John Stultz
---
 kernel/time/timekeeping.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 8b9e328..4e8ccde 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -145,6 +145,13 @@ static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
 	/* calculate the delta since the last update_wall_time */
 	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);

+	/*
+	 * Try to catch underflows by checking if we are seeing small
+	 * mask-relative negative values.
+	 */
+	if (unlikely((~delta & tkr->mask) < (tkr->mask >> 3)))
+		delta = 0;
+
 	/* Cap delta value to the max_cycles values to avoid mult overflows */
 	if (unlikely(delta > tkr->clock->max_cycles))
 		delta = tkr->clock->max_cycles;
--
1.9.1
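To make the mask-relative check concrete, here is a small userspace
sketch. It assumes the delta is computed as a plain masked subtraction;
masked_delta() and looks_like_underflow() are names invented for this
example, and the 32-bit mask is just a convenient test value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cycle_t */
typedef uint64_t cycle_t;

/* Masked subtraction, assumed here to be how the delta is formed */
static cycle_t masked_delta(cycle_t now, cycle_t last, cycle_t mask)
{
	return (now - last) & mask;
}

/*
 * True if delta sits in the top 1/8th of the mask range, i.e. it is a
 * small negative number when interpreted relative to the mask.
 */
static bool looks_like_underflow(cycle_t delta, cycle_t mask)
{
	return (~delta & mask) < (mask >> 3);
}
```

With a 32-bit mask, now = 100 and last = 105 give a delta of 0xfffffffb.
Inverting and masking yields 4, which is well below mask >> 3, so the
value is treated as a backwards jump. The forward case (now = 105,
last = 100) gives a delta of 5 and is left alone.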
[PATCH 02/12] clocksource: Simplify logic around clocksource wrapping safety margins
The clocksource logic has a number of places where we try to include a safety margin. Most of these are 12.5% safety margins, but they are inconsistently applied and sometimes are applied on top of each other. Additionally, in the previous patch, we corrected an issue where we unintentionally in effect created a 50% safety margin, which these 12.5% margins were then added to. So to simplify the logic here, this patch removes the various 12.5% margins, and consolidates adding the margin in one place: clocks_calc_max_nsecs(). Additionally, Linus prefers a 50% safety margin, as it allows bad clock values to be more easily caught. This should really have no net effect, due to the corrected issue earlier which caused greater than 50% margins to be used w/o issue. Cc: Dave Jones Cc: Linus Torvalds Cc: Thomas Gleixner Cc: Richard Cochran Cc: Prarit Bhargava Cc: Stephen Boyd Cc: Ingo Molnar Cc: Peter Zijlstra Acked-by: Stephen Boyd (for sched_clock.c bit) Signed-off-by: John Stultz --- kernel/time/clocksource.c | 26 -- kernel/time/sched_clock.c | 4 ++-- 2 files changed, 14 insertions(+), 16 deletions(-) diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index 11323f4..e5d00e6 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -469,6 +469,9 @@ static u32 clocksource_max_adjustment(struct clocksource *cs) * @shift: cycle to nanosecond divisor (power of two) * @maxadj:maximum adjustment value to mult (~11%) * @mask: bitmask for two's complement subtraction of non 64 bit counters + * + * NOTE: This function includes a saftey margin of 50%, so that bad clock values + * can be detected. 
*/ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask) { @@ -490,11 +493,14 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask) max_cycles = min(max_cycles, mask); max_nsecs = clocksource_cyc2ns(max_cycles, mult - maxadj, shift); + /* Return 50% of the actual maximum, so we can detect bad values */ + max_nsecs >>= 1; + return max_nsecs; } /** - * clocksource_max_deferment - Returns max time the clocksource can be deferred + * clocksource_max_deferment - Returns max time the clocksource should be deferred * @cs: Pointer to clocksource * */ @@ -504,13 +510,7 @@ static u64 clocksource_max_deferment(struct clocksource *cs) max_nsecs = clocks_calc_max_nsecs(cs->mult, cs->shift, cs->maxadj, cs->mask); - /* -* To ensure that the clocksource does not wrap whilst we are idle, -* limit the time the clocksource can be deferred by 12.5%. Please -* note a margin of 12.5% is used because this can be computed with -* a shift, versus say 10% which would require division. -*/ - return max_nsecs - (max_nsecs >> 3); + return max_nsecs; } #ifndef CONFIG_ARCH_USES_GETTIMEOFFSET @@ -659,10 +659,9 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq) * conversion precision. 10 minutes is still a reasonable * amount. That results in a shift value of 24 for a * clocksource with mask >= 40bit and f >= 4GHz. That maps to -* ~ 0.06ppm granularity for NTP. We apply the same 12.5% -* margin as we do in clocksource_max_deferment() +* ~ 0.06ppm granularity for NTP. */ - sec = (cs->mask - (cs->mask >> 3)); + sec = cs->mask; do_div(sec, freq); do_div(sec, scale); if (!sec) @@ -674,9 +673,8 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq) NSEC_PER_SEC / scale, sec * scale); /* -* for clocksources that have large mults, to avoid overflow. -* Since mult may be adjusted by ntp, add an safety extra margin -* +* Ensure clocksources that have large mults don't overflow +* when adjusted. 
*/ cs->maxadj = clocksource_max_adjustment(cs); while ((cs->mult + cs->maxadj < cs->mult) diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c index 01d2d15..c794b84 100644 --- a/kernel/time/sched_clock.c +++ b/kernel/time/sched_clock.c @@ -125,9 +125,9 @@ void __init sched_clock_register(u64 (*read)(void), int bits, new_mask = CLOCKSOURCE_MASK(bits); - /* calculate how many ns until we wrap */ + /* calculate how many ns until we risk wrapping */ wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask); - new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3)); + new_wrap_kt = ns_to_ktime(wrap); /* update epoch for new counter and update epoch_ns from old counter*/ new_epoch = read(); -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at
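To make the consolidated 50% margin concrete, here is a rough userspace sketch of the calculation. It is not the kernel function: cyc2ns() is a simplified stand-in for clocksource_cyc2ns(), the initial max_cycles bound (UINT64_MAX divided by mult + maxadj) is an assumption since that part of the function is not shown in the diff, and max_nsecs_with_margin() is an illustrative name.

```c
#include <stdint.h>

/* Simplified stand-in for clocksource_cyc2ns(). */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}

static uint64_t max_nsecs_with_margin(uint32_t mult, uint32_t shift,
				      uint32_t maxadj, uint64_t mask)
{
	/* Assumed: largest cycle count that cannot overflow the multiply. */
	uint64_t max_cycles = UINT64_MAX / ((uint64_t)mult + maxadj);
	uint64_t max_nsecs;

	/* The counter cannot count past its mask. */
	if (max_cycles > mask)
		max_cycles = mask;

	max_nsecs = cyc2ns(max_cycles, mult - maxadj, shift);

	/* Return 50% of the actual maximum, so bad values can be detected. */
	return max_nsecs >> 1;
}
```

Because the halving happens in this one place, callers such as clocksource_max_deferment() and sched_clock_register() no longer subtract their own 12.5%.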
[PATCH] perf, tools, script: Always print raw IP
From: Andi Kleen Fix regression caused by 85c116a6cb. We did not print the IP anymore for perf script -o ip, but instead symbol+offset, and if there was no symbol only +offset. Print the raw IP correctly again in this case. Reported-by: Yuanfang Chen Cc: Yuanfang Chen Signed-off-by: Andi Kleen --- tools/perf/util/srcline.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c index c93fb0c..7d61f8b 100644 --- a/tools/perf/util/srcline.c +++ b/tools/perf/util/srcline.c @@ -293,8 +293,14 @@ out: dso__free_a2l(dso); } if (sym) { - if (asprintf(&srcline, "%s+%" PRIu64, show_sym ? sym->name : "", - addr - sym->start) < 0) + int err; + + if (show_sym) + err = asprintf(&srcline, "%s+%" PRIu64, sym->name, + addr - sym->start); + else + err = asprintf(&srcline, "%" PRIx64, addr); + if (err < 0) return SRCLINE_UNKNOWN; } else if (asprintf(&srcline, "%s[%" PRIx64 "]", dso->short_name, addr) < 0) return SRCLINE_UNKNOWN; -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
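The two output formats chosen by the patch can be illustrated outside of perf. This is only a sketch: format_ip() is a hypothetical helper, and it uses snprintf() into a caller-supplied buffer rather than the asprintf() call the real code uses.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring the patched branch: with show_sym,
 * print "symbol+offset" (decimal offset); without it, print the raw
 * IP in hex instead of a bare "+offset".
 */
static int format_ip(char *buf, size_t len, const char *sym_name,
		     uint64_t sym_start, uint64_t addr, int show_sym)
{
	if (show_sym)
		return snprintf(buf, len, "%s+%" PRIu64, sym_name,
				addr - sym_start);

	return snprintf(buf, len, "%" PRIx64, addr);
}
```

For a symbol "main" starting at 0x400000 and an address of 0x400010, this gives "main+16" with show_sym set and "400010" without it.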
Re: softlockups in multi_cpu_stop
On Fri, 2015-03-06 at 18:26 -0800, Davidlohr Bueso wrote: > That's not what this is about. New lock _owners_ need to worry about ^^^ make that "need not" -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso wrote: > > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: > >> On Fri, 06 Mar 2015 14:15:37 -0800 > >> Davidlohr Bueso wrote: > >> > >> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > >> > > In owner_running() there are 2 conditions that would make it return > >> > > false: if the owner changed or if the owner is not running. However, > >> > > that patch continues spinning if there is a "new owner" but it does not > >> > > take into account that we may want to stop spinning if the owner is not > >> > > running (due to getting rescheduled). > >> > > >> > So you're rationale is that we're missing this need_resched: > >> > > >> > while (owner_running(sem, owner)) { > >> > /* abort spinning when need_resched */ > >> > if (need_resched()) { > >> > rcu_read_unlock(); > >> > return false; > >> > } > >> > } > >> > > >> > Because the owner_running() would return false, right? Yeah that makes > >> > sense, as missing a resched is a bug, as opposed to our heuristics being > >> > so painfully off. > >> > > >> > Sasha, Ming (Cc'ed), does this address the issues you guys are seeing? > >> > >> For the xfstest lockup, what matters is that the owner isn't running, since > >> the following simple change does fix the issue: > > > > I much prefer Jason's approach, which should also take care of the > > issue, as it includes the !owner->on_cpu stop condition to stop > > spinning. > > But the check on owner->on_cpu should be moved outside the loop > because new owner can be scheduled out too, right? That's not what this is about. 
New lock _owners_ need to worry about burning cycles trying to acquire the lock ;) > >> > >> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c > >> index 06e2214..5e08705 100644 > >> --- a/kernel/locking/rwsem-xadd.c > >> +++ b/kernel/locking/rwsem-xadd.c > >> @@ -358,8 +358,9 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, > >> struct task_struct *owner) > >> } > >> rcu_read_unlock(); > >> > >> - if (READ_ONCE(sem->owner)) > >> - return true; /* new owner, continue spinning */ > >> + owner = READ_ONCE(sem->owner); > >> + if (owner && owner->on_cpu) > >> + return true; So if I'm understanding this right, your patch works because you add another on_cpu check and at this point we could very well have sem->owner == owner -- such that owner_running() returned false for the same reason in the first place! So Jason's patch takes on the issue directly by never allowing us to reach this point. Thanks, Davidlohr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso wrote: > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: >> On Fri, 06 Mar 2015 14:15:37 -0800 >> Davidlohr Bueso wrote: >> >> > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: >> > > In owner_running() there are 2 conditions that would make it return >> > > false: if the owner changed or if the owner is not running. However, >> > > that patch continues spinning if there is a "new owner" but it does not >> > > take into account that we may want to stop spinning if the owner is not >> > > running (due to getting rescheduled). >> > >> > So you're rationale is that we're missing this need_resched: >> > >> > while (owner_running(sem, owner)) { >> > /* abort spinning when need_resched */ >> > if (need_resched()) { >> > rcu_read_unlock(); >> > return false; >> > } >> > } >> > >> > Because the owner_running() would return false, right? Yeah that makes >> > sense, as missing a resched is a bug, as opposed to our heuristics being >> > so painfully off. >> > >> > Sasha, Ming (Cc'ed), does this address the issues you guys are seeing? >> >> For the xfstest lockup, what matters is that the owner isn't running, since >> the following simple change does fix the issue: > > I much prefer Jason's approach, which should also take care of the > issue, as it includes the !owner->on_cpu stop condition to stop > spinning. But the check on owner->on_cpu should be moved outside the loop because new owner can be scheduled out too, right? 
>> >> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c >> index 06e2214..5e08705 100644 >> --- a/kernel/locking/rwsem-xadd.c >> +++ b/kernel/locking/rwsem-xadd.c >> @@ -358,8 +358,9 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, >> struct task_struct *owner) >> } >> rcu_read_unlock(); >> >> - if (READ_ONCE(sem->owner)) >> - return true; /* new owner, continue spinning */ >> + owner = READ_ONCE(sem->owner); >> + if (owner && owner->on_cpu) >> + return true; >> >> /* >>* When the owner is not set, the lock could be free or >> >> >> Thanks, >> Ming Lei > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Receiving a generic netlink multicast - should be restricted to the root user?
I wrote a kernel module that sends generic Netlink multicasts, and wrote a userland client using libmnl that receives them. That all works fine, but my client works even when it's not the root user. man 7 netlink says: Only processes with an effective UID of 0 or the CAP_NET_ADMIN capability may send or listen to a netlink multicast group. The listen part of this is seemingly not true. I've tried this on kernels 3.13 (Ubuntu 14.04), 2.6.32 (CentOS 6) and 2.6.18 (CentOS 5). Is this a bug? If not: I know that restricting receiving generic netlink commands incoming to the kernel to being only from root is possible with GENL_ADMIN_PERM flag, but is it possible to send multicasts from the kernel that can only be received by root? Thank you -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: > On Fri, 06 Mar 2015 14:15:37 -0800 > Davidlohr Bueso wrote: > > > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > > > In owner_running() there are 2 conditions that would make it return > > > false: if the owner changed or if the owner is not running. However, > > > that patch continues spinning if there is a "new owner" but it does not > > > take into account that we may want to stop spinning if the owner is not > > > running (due to getting rescheduled). > > > > So you're rationale is that we're missing this need_resched: > > > > while (owner_running(sem, owner)) { > > /* abort spinning when need_resched */ > > if (need_resched()) { > > rcu_read_unlock(); > > return false; > > } > > } > > > > Because the owner_running() would return false, right? Yeah that makes > > sense, as missing a resched is a bug, as opposed to our heuristics being > > so painfully off. > > > > Sasha, Ming (Cc'ed), does this address the issues you guys are seeing? > > For the xfstest lockup, what matters is that the owner isn't running, since > the following simple change does fix the issue: I much prefer Jason's approach, which should also take care of the issue, as it includes the !owner->on_cpu stop condition to stop spinning. 
> > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c > index 06e2214..5e08705 100644 > --- a/kernel/locking/rwsem-xadd.c > +++ b/kernel/locking/rwsem-xadd.c > @@ -358,8 +358,9 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct > task_struct *owner) > } > rcu_read_unlock(); > > - if (READ_ONCE(sem->owner)) > - return true; /* new owner, continue spinning */ > + owner = READ_ONCE(sem->owner); > + if (owner && owner->on_cpu) > + return true; > > /* >* When the owner is not set, the lock could be free or > > > Thanks, > Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Fri, 2015-03-06 at 14:15 -0800, Davidlohr Bueso wrote: > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > > In owner_running() there are 2 conditions that would make it return > > false: if the owner changed or if the owner is not running. However, > > that patch continues spinning if there is a "new owner" but it does not > > take into account that we may want to stop spinning if the owner is not > > running (due to getting rescheduled). > > So you're rationale is that we're missing this need_resched: > > while (owner_running(sem, owner)) { > /* abort spinning when need_resched */ > if (need_resched()) { > rcu_read_unlock(); > return false; > } > } > > Because the owner_running() would return false, right? Yeah that makes > sense, as missing a resched is a bug, as opposed to our heuristics being > so painfully off. Actually, the rationale is that when the lock owner reschedules while holding the lock, we'd want the spinners to stop spinning. The original owner_running() check takes care of this since it returns false if ->on_cpu gets set to false and the sem->owner != NULL would be false causing us to stop spinning . However, with the patch, when owner_running returns false, we check sem->owner, which causes the ->on_cpu check to essentially get ignored. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
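The distinction discussed in this thread (owner changed versus owner no longer running) can be modelled in userspace as a single decision step. This is a sketch only: struct task, struct sem, enum spin_result and spin_step() are hypothetical names, and it leaves out the RCU protection and barrier() that the real rwsem_spin_on_owner() needs.

```c
#include <stdbool.h>
#include <stddef.h>

struct task { bool on_cpu; };        /* stand-in for task_struct */
struct sem  { struct task *owner; }; /* stand-in for rw_semaphore */

enum spin_result {
	SPIN_AGAIN,         /* same owner, still running: keep spinning */
	SPIN_STOP,          /* owner not running, or we need to resched */
	SPIN_OWNER_CHANGED, /* lock released or handed to someone else */
};

/* One iteration of the spin-on-owner decision, with the reason made explicit. */
static enum spin_result spin_step(const struct sem *sem,
				  const struct task *owner, bool need_resched)
{
	if (sem->owner != owner)
		return SPIN_OWNER_CHANGED;

	if (!owner->on_cpu || need_resched)
		return SPIN_STOP;

	return SPIN_AGAIN;
}
```

Keeping the two exit reasons separate means the caller never has to infer from sem->owner afterwards why the loop ended, which is where the ->on_cpu check was effectively being ignored.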
Re: softlockups in multi_cpu_stop
On Fri, 06 Mar 2015 14:15:37 -0800 Davidlohr Bueso wrote: > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > > In owner_running() there are 2 conditions that would make it return > > false: if the owner changed or if the owner is not running. However, > > that patch continues spinning if there is a "new owner" but it does not > > take into account that we may want to stop spinning if the owner is not > > running (due to getting rescheduled). > > So you're rationale is that we're missing this need_resched: > > while (owner_running(sem, owner)) { > /* abort spinning when need_resched */ > if (need_resched()) { > rcu_read_unlock(); > return false; > } > } > > Because the owner_running() would return false, right? Yeah that makes > sense, as missing a resched is a bug, as opposed to our heuristics being > so painfully off. > > Sasha, Ming (Cc'ed), does this address the issues you guys are seeing? For the xfstest lockup, what matters is that the owner isn't running, since the following simple change does fix the issue: diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c index 06e2214..5e08705 100644 --- a/kernel/locking/rwsem-xadd.c +++ b/kernel/locking/rwsem-xadd.c @@ -358,8 +358,9 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner) } rcu_read_unlock(); - if (READ_ONCE(sem->owner)) - return true; /* new owner, continue spinning */ + owner = READ_ONCE(sem->owner); + if (owner && owner->on_cpu) + return true; /* * When the owner is not set, the lock could be free or Thanks, Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: softlockups in multi_cpu_stop
On Fri, 2015-03-06 at 13:24 -0800, Linus Torvalds wrote: > On Fri, Mar 6, 2015 at 1:12 PM, Jason Low wrote: > > > > + while (true) { > > + if (sem->owner != owner) > > + break; > > That looks *really* odd. > > Why is this not > > while (sem->owner == owner) { Yes, this looks more readable. That while (true) thing was something we recently did for mutexes which was why I originally had that. > Also, this "barrier()" now lost the comment: > > > + barrier(); > > so it looks very odd indeed. Right, we should keep the comment for the barrier(). -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/2] x86: Delay loading sp0 slightly on task switch
The change: 75182b1632a8 x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0() had the unintended side effect of changing the return value of current_thread_info() during part of the context switch process. Change it back. This has no effect as far as I can tell -- it's just for consistency. Signed-off-by: Andy Lutomirski --- arch/x86/kernel/process_32.c | 10 +- arch/x86/kernel/process_64.c | 6 +++--- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c index d3460af3d27a..0405cab6634d 100644 --- a/arch/x86/kernel/process_32.c +++ b/arch/x86/kernel/process_32.c @@ -256,11 +256,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) fpu = switch_fpu_prepare(prev_p, next_p, cpu); /* -* Reload esp0. -*/ - load_sp0(tss, next); - - /* * Save away %gs. No need to save %fs, as it was saved on the * stack on entry. No need to save %es and %ds, as those are * always kernel segments while inside the kernel. Doing this @@ -310,6 +305,11 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) */ arch_end_context_switch(next_p); + /* +* Reload esp0. This changes current_thread_info(). +*/ + load_sp0(tss, next); + this_cpu_write(kernel_stack, (unsigned long)task_stack_page(next_p) + THREAD_SIZE - KERNEL_STACK_OFFSET); diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 2cd562f96c1f..1e393d27d701 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -283,9 +283,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) fpu = switch_fpu_prepare(prev_p, next_p, cpu); - /* Reload esp0 and ss1. */ - load_sp0(tss, next); - /* We must save %fs and %gs before load_TLS() because * %fs and %gs may be cleared by load_TLS(). 
* @@ -413,6 +410,9 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) task_thread_info(prev_p)->saved_preempt_count = this_cpu_read(__preempt_count); this_cpu_write(__preempt_count, task_thread_info(next_p)->saved_preempt_count); + /* Reload esp0 and ss1. This changes current_thread_info(). */ + load_sp0(tss, next); + this_cpu_write(kernel_stack, (unsigned long)task_stack_page(next_p) + THREAD_SIZE - KERNEL_STACK_OFFSET); -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32
I broke 32-bit kernels. The implementation of sp0 was correct as far as I can tell, but sp0 was much weirder on x86_32 than I realized. It has the following issues: - Init's sp0 is inconsistent with everything else's: non-init tasks are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.) - vm86 does crazy things to sp0. Fix it up by replacing this_cpu_sp0() with current_top_of_stack() and using a new percpu variable to track the top of the stack on x86_32. Fixes: 75182b1632a8 x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0() Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/processor.h | 11 ++- arch/x86/include/asm/thread_info.h | 4 +--- arch/x86/kernel/cpu/common.c | 13 +++-- arch/x86/kernel/process_32.c | 11 +++ arch/x86/kernel/smpboot.c | 2 ++ arch/x86/kernel/traps.c| 4 ++-- 6 files changed, 33 insertions(+), 12 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index f5e3ec63767d..48a61c1c626e 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -284,6 +284,10 @@ struct tss_struct { DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss); +#ifdef CONFIG_X86_32 +DECLARE_PER_CPU(unsigned long, cpu_current_top_of_stack); +#endif + /* * Save the original ist values for checking stack pointers during debugging */ @@ -564,9 +568,14 @@ static inline void native_swapgs(void) #endif } -static inline unsigned long this_cpu_sp0(void) +static inline unsigned long current_top_of_stack(void) { +#ifdef CONFIG_X86_64 return this_cpu_read_stable(cpu_tss.x86_tss.sp0); +#else + /* sp0 on x86_32 is special in and around vm86 mode. 
*/ + return this_cpu_read_stable(cpu_current_top_of_stack); +#endif } #ifdef CONFIG_PARAVIRT diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index a2fa1899494e..7740edd56fed 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -158,9 +158,7 @@ DECLARE_PER_CPU(unsigned long, kernel_stack); static inline struct thread_info *current_thread_info(void) { - struct thread_info *ti; - ti = (void *)(this_cpu_sp0() - THREAD_SIZE); - return ti; + return (struct thread_info *)(current_top_of_stack() - THREAD_SIZE); } static inline unsigned long current_stack_pointer(void) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 5d0f0cc7ea26..76348334b934 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1130,8 +1130,8 @@ DEFINE_PER_CPU_FIRST(union irq_stack_union, irq_stack_union) __aligned(PAGE_SIZE) __visible; /* - * The following four percpu variables are hot. Align current_task to - * cacheline size such that all four fall in the same cacheline. + * The following percpu variables are hot. Align current_task to + * cacheline size such that they fall in the same cacheline. */ DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned = &init_task; @@ -1226,6 +1226,15 @@ DEFINE_PER_CPU(int, __preempt_count) = INIT_PREEMPT_COUNT; EXPORT_PER_CPU_SYMBOL(__preempt_count); DEFINE_PER_CPU(struct task_struct *, fpu_owner_task); +/* + * On x86_32, vm86 modifies tss.sp0, so sp0 isn't a reliable way to find + * the top of the kernel stack. Use an extra percpu variable to track the + * top of the kernel stack directly. 
+ */ +DEFINE_PER_CPU(unsigned long, cpu_current_top_of_stack) = + (unsigned long)&init_thread_union + THREAD_SIZE; +EXPORT_PER_CPU_SYMBOL(cpu_current_top_of_stack); + #ifdef CONFIG_CC_STACKPROTECTOR DEFINE_PER_CPU_ALIGNED(struct stack_canary, stack_canary); #endif diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c index 0405cab6634d..1b9963faf4eb 100644 --- a/arch/x86/kernel/process_32.c +++ b/arch/x86/kernel/process_32.c @@ -306,13 +306,16 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) arch_end_context_switch(next_p); /* -* Reload esp0. This changes current_thread_info(). +* Reload esp0, kernel_stack, and current_top_of_stack. This changes +* current_thread_info(). */ load_sp0(tss, next); - this_cpu_write(kernel_stack, - (unsigned long)task_stack_page(next_p) + - THREAD_SIZE - KERNEL_STACK_OFFSET); + (unsigned long)task_stack_page(next_p) + + THREAD_SIZE - KERNEL_STACK_OFFSET); + this_cpu_write(cpu_current_top_of_stack, + (unsigned long)task_stack_page(next_p) + + THREAD_SIZE); /* * Restore %gs if needed (which is common) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index febc6aabc72e..759388c538cf 100644 --- a/arch/x86/kernel/smpboot.c +++
[PATCH 0/2] x86: sp0 fixes
I broke x86_32 and I made an inadvertent change to both bitnesses. Undo the inadvertent change and fix x86_32. This isn't as pretty as I hoped. Sorry. Andy Lutomirski (2): x86: Delay loading sp0 slightly on task switch x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32 arch/x86/include/asm/processor.h | 11 ++- arch/x86/include/asm/thread_info.h | 4 +--- arch/x86/kernel/cpu/common.c | 13 +++-- arch/x86/kernel/process_32.c | 17 ++--- arch/x86/kernel/process_64.c | 6 +++--- arch/x86/kernel/smpboot.c | 2 ++ arch/x86/kernel/traps.c| 4 ++-- 7 files changed, 39 insertions(+), 18 deletions(-) -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
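For readers following the series, the relationship that current_thread_info() relies on in patch 2/2 (thread_info sits THREAD_SIZE below the top of the kernel stack) can be sketched in userspace. The THREAD_SIZE value, the struct layout and thread_info_from_top() are assumptions made purely for illustration.

```c
#include <stdint.h>

#define THREAD_SIZE 8192UL /* assumed value, for illustration only */

struct thread_info { unsigned long flags; };

/* thread_info shares the stack allocation and lives at its lowest address. */
union thread_union {
	struct thread_info info;
	unsigned char stack[THREAD_SIZE];
};

/* Mirrors: (struct thread_info *)(current_top_of_stack() - THREAD_SIZE) */
static struct thread_info *thread_info_from_top(uintptr_t top_of_stack)
{
	return (struct thread_info *)(top_of_stack - THREAD_SIZE);
}
```

This only works if the value used as "top of stack" really is the end of the stack allocation, which is why the series stops deriving it from sp0 on x86_32 and tracks it in a separate percpu variable.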
Re: [RFC 0/8] pmem: Submission of the Persistent memory block device
On Fri, Mar 06, 2015 at 11:37:45AM -0700, Ross Zwisler wrote: > Regarding the PMEM series, my group has been working on an updated > version of this driver for the past 6 months or so since I initially > posted the beginnings of this series: > > https://lkml.org/lkml/2014/8/27/674 > > That new version should be ready for public viewing sometime in April. > > It's my preference that we wait to try and upstream any form of PMEM > until we've released our updated version of the driver, and you've had a > chance to review and add in any changes you need. I'm cool with > gathering additional feedback until then, of course. > > Trying to upstream this older version and then merging it with the newer > stuff in-kernel seems like it'll just end up being more work in the end. We've been waiting far too long to get any version of this merged. I don't think waiting for vapourware is a good idea. So either please post your new code ASAP given that you apparently have it, or you'll just have to do more work later. Given how simple the pmem driver is I really can't see any major merge problems anyway. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] kernel/locking/locktorture: fix deadlock in 'rw_lock_irq' type
On Fri, Mar 06, 2015 at 05:03:00PM -0800, Davidlohr Bueso wrote: > On Fri, 2015-03-06 at 16:37 -0800, Paul E. McKenney wrote: > > On Sat, Mar 07, 2015 at 03:06:53AM +0300, Alexey Kodanev wrote: > > > torture_rwlock_read_unlock_irq() must use read_unlock_irqrestore() > > > instead of write_unlock_irqrestore(). > > > > > > Use read_unlock_irqrestore() instead of write_unlock_irqrestore(). > > > > > > Signed-off-by: Alexey Kodanev > > > > Good catch! If Davidlohr has no objections, I will queue this one. > > Dear me, yes that looks completely borken. Thanks for the fix. Queued for v4.2. (If this is more urgent than that, please let me know.) Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/3] mtd: nand: add Broadcom NAND controller support
Hi, This adds (long in coming) support for the Broadcom BCM7xxx Set-Top Box NAND controller. This controller has been used in a variety of Broadcom SoCs. There are a few more features I'd like to add in the near future, mostly to support more SoCs, but this is the base set, which should only need relatively minor additions to support chips like BCM63138, BCM3384, and Cygnus/iProc. Particularly, we may need to straighten out some endianness issues for the data path on iProc, and interrupt enabling/acking on iProc, BCM63xxx, BCM3xxx, and others. TODO: add this to the DTS(I) files for BCM7445. Happy reviewing! (Speaking of which, I need to catch up on reviewing everybody else's MTD submissions...) Brian Brian Norris (3): mtd: nand: add common DT init code Documentation: devicetree: add binding doc for Broadcom NAND controller mtd: nand: add NAND driver for Broadcom STB NAND controller .../devicetree/bindings/mtd/brcmstb_nand.txt | 109 + drivers/mtd/nand/Kconfig |8 + drivers/mtd/nand/Makefile |1 + drivers/mtd/nand/brcmstb_nand.c| 2182 drivers/mtd/nand/nand_base.c | 41 + include/linux/mtd/nand.h |5 + 6 files changed, 2346 insertions(+) create mode 100644 Documentation/devicetree/bindings/mtd/brcmstb_nand.txt create mode 100644 drivers/mtd/nand/brcmstb_nand.c -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/3] mtd: nand: add NAND driver for Broadcom STB NAND controller
This core originated in Set-Top Box chips (BCM7xxx) but is used in a variety of other Broadcom chips, including some BCM63xxx, BCM33xx, and iProc/Cygnus. It's been used only on ARM and MIPS SoCs, so restrict it to those architectures. There are multiple revisions of this core throughout the years, and almost every version broke register compatibility in some small way, but with some effort, this driver is able to support v4.0, v5.0, v6.x, v7.0, and v7.1. It's been tested on v5.0, v6.0, v7.0, and v7.1 recently, so there hopefully are no more lurking inconsistencies. Signed-off-by: Brian Norris --- drivers/mtd/nand/Kconfig|8 + drivers/mtd/nand/Makefile |1 + drivers/mtd/nand/brcmstb_nand.c | 2182 +++ 3 files changed, 2191 insertions(+) create mode 100644 drivers/mtd/nand/brcmstb_nand.c diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig index 5b76a173cd95..6445323a8cff 100644 --- a/drivers/mtd/nand/Kconfig +++ b/drivers/mtd/nand/Kconfig @@ -394,6 +394,14 @@ config MTD_NAND_GPMI_NAND block, such as SD card. So pay attention to it when you enable the GPMI. +config MTD_NAND_BRCMSTB + tristate "Broadcom STB NAND controller" + depends on ARM || MIPS + help + Enables the Broadcom NAND controller driver. The controller was + originally designed for Set-Top Box but is used on various BCM7xxx, + BCM3xxx, BCM63xxx, iProc/Cygnus and more. 
+ config MTD_NAND_BCM47XXNFLASH tristate "Support for NAND flash on BCM4706 BCMA bus" depends on BCMA_NFLASH diff --git a/drivers/mtd/nand/Makefile b/drivers/mtd/nand/Makefile index 582bbd05aff7..3b1adddc83dd 100644 --- a/drivers/mtd/nand/Makefile +++ b/drivers/mtd/nand/Makefile @@ -52,5 +52,6 @@ obj-$(CONFIG_MTD_NAND_XWAY) += xway_nand.o obj-$(CONFIG_MTD_NAND_BCM47XXNFLASH) += bcm47xxnflash/ obj-$(CONFIG_MTD_NAND_SUNXI) += sunxi_nand.o obj-$(CONFIG_MTD_NAND_HISI504) += hisi504_nand.o +obj-$(CONFIG_MTD_NAND_BRCMSTB) += brcmstb_nand.o nand-objs := nand_base.o nand_bbt.o nand_timings.o diff --git a/drivers/mtd/nand/brcmstb_nand.c b/drivers/mtd/nand/brcmstb_nand.c new file mode 100644 index ..4d74f7a17dc3 --- /dev/null +++ b/drivers/mtd/nand/brcmstb_nand.c @@ -0,0 +1,2182 @@ +/* + * Copyright © 2010-2015 Broadcom Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * This flag controls if WP stays on between erase/write commands to mitigate + * flash corruption due to power glitches. 
Values: + * 0: NAND_WP is not used or not available + * 1: NAND_WP is set by default, cleared for erase/write operations + * 2: NAND_WP is always cleared + */ +static int wp_on = 1; +module_param(wp_on, int, 0444); + +/*** + * Definitions + ***/ + +#define DRV_NAME "brcmstb_nand" + +#define CMD_NULL 0x00 +#define CMD_PAGE_READ 0x01 +#define CMD_SPARE_AREA_READ0x02 +#define CMD_STATUS_READ0x03 +#define CMD_PROGRAM_PAGE 0x04 +#define CMD_PROGRAM_SPARE_AREA 0x05 +#define CMD_COPY_BACK 0x06 +#define CMD_DEVICE_ID_READ 0x07 +#define CMD_BLOCK_ERASE0x08 +#define CMD_FLASH_RESET0x09 +#define CMD_BLOCKS_LOCK0x0a +#define CMD_BLOCKS_LOCK_DOWN 0x0b +#define CMD_BLOCKS_UNLOCK 0x0c +#define CMD_READ_BLOCKS_LOCK_STATUS0x0d +#define CMD_PARAMETER_READ 0x0e +#define CMD_PARAMETER_CHANGE_COL 0x0f +#define CMD_LOW_LEVEL_OP 0x10 + +struct brcm_nand_dma_desc { + u32 next_desc; + u32 next_desc_ext; + u32 cmd_irq; + u32 dram_addr; + u32 dram_addr_ext; + u32 tfr_len; + u32 total_len; + u32 flash_addr; + u32 flash_addr_ext; + u32 cs; + u32 pad2[5]; + u32 status_valid; +} __packed; + +/* Bitfields for brcm_nand_dma_desc::status_valid */ +#define FLASH_DMA_ECC_ERROR(1 << 8) +#define FLASH_DMA_CORR_ERROR (1 << 9) + +/* 512B flash cache in the NAND
[PATCH 1/3] mtd: nand: add common DT init code
These are already-documented common bindings for NAND chips. Let's handle them in nand_base. If NAND controller drivers need to act on this data before bringing up the NAND chip (e.g., fill out ECC callback functions, change HW modes, etc.), then they can do so between calling nand_scan_ident() and nand_scan_tail(). Signed-off-by: Brian Norris --- drivers/mtd/nand/nand_base.c | 41 + include/linux/mtd/nand.h | 5 + 2 files changed, 46 insertions(+) diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c index df7eb4ff07d1..271866b038b3 100644 --- a/drivers/mtd/nand/nand_base.c +++ b/drivers/mtd/nand/nand_base.c @@ -48,6 +48,7 @@ #include #include #include +#include /* Define default oob placement schemes for large and small page devices */ static struct nand_ecclayout nand_oob_8 = { @@ -3779,6 +3780,39 @@ ident_done: return type; } +static int nand_dt_init(struct mtd_info *mtd, struct nand_chip *chip, + struct device_node *dn) +{ + int ecc_mode, ecc_strength, ecc_step; + + if (of_get_nand_bus_width(dn) == 16) + chip->options |= NAND_BUSWIDTH_16; + + if (of_get_nand_on_flash_bbt(dn)) + chip->bbt_options |= NAND_BBT_USE_FLASH; + + ecc_mode = of_get_nand_ecc_mode(dn); + ecc_strength = of_get_nand_ecc_strength(dn); + ecc_step = of_get_nand_ecc_step_size(dn); + + if ((ecc_step >= 0 && !(ecc_strength >= 0)) || + (!(ecc_step >= 0) && ecc_strength >= 0)) { + pr_err("must set both strength and step size in DT\n"); + return -EINVAL; + } + + if (ecc_mode >= 0) + chip->ecc.mode = ecc_mode; + + if (ecc_strength >= 0) + chip->ecc.strength = ecc_strength; + + if (ecc_step > 0) + chip->ecc.size = ecc_step; + + return 0; +} + /** * nand_scan_ident - [NAND Interface] Scan for the NAND device * @mtd: MTD device structure @@ -3796,6 +3830,13 @@ int nand_scan_ident(struct mtd_info *mtd, int maxchips, int i, nand_maf_id, nand_dev_id; struct nand_chip *chip = mtd->priv; struct nand_flash_dev *type; + int ret; + + if (chip->dn) { + ret = nand_dt_init(mtd, chip, chip->dn); + 
if (ret) + return ret; + } /* Set the default functions */ nand_set_defaults(chip, chip->options & NAND_BUSWIDTH_16); diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h index 3d4ea7eb2b68..e0f40e12a2c8 100644 --- a/include/linux/mtd/nand.h +++ b/include/linux/mtd/nand.h @@ -26,6 +26,8 @@ struct mtd_info; struct nand_flash_dev; +struct device_node; + /* Scan and identify a NAND device */ extern int nand_scan(struct mtd_info *mtd, int max_chips); /* @@ -542,6 +544,7 @@ struct nand_buffers { * flash device * @IO_ADDR_W: [BOARDSPECIFIC] address to write the 8 I/O lines of the * flash device. + * @dn:[BOARDSPECIFIC] device node describing this instance * @read_byte: [REPLACEABLE] read one byte from the chip * @read_word: [REPLACEABLE] read one word from the chip * @write_byte:[REPLACEABLE] write a single byte to the chip on the @@ -644,6 +647,8 @@ struct nand_chip { void __iomem *IO_ADDR_R; void __iomem *IO_ADDR_W; + struct device_node *dn; + uint8_t (*read_byte)(struct mtd_info *mtd); u16 (*read_word)(struct mtd_info *mtd); void (*write_byte)(struct mtd_info *mtd, uint8_t byte); -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
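The pairing rule that nand_dt_init() enforces for the ECC properties can be tried out in user space. The following is only a rough sketch, not the kernel code: struct ecc_cfg and ecc_cfg_from_dt() are hypothetical names, and a negative argument stands in for "property absent", on the assumption that the of_get_nand_*() helpers return a negative errno in that case.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the chip->ecc fields touched by nand_dt_init();
 * the real structure is struct nand_chip in include/linux/mtd/nand.h.
 */
struct ecc_cfg {
	int mode;
	int strength;
	int size;
};

/*
 * Negative inputs model "property not present in the device tree"
 * (assumption: the of_get_nand_*() helpers report that as a negative errno).
 */
static int ecc_cfg_from_dt(struct ecc_cfg *cfg, int mode, int strength, int step)
{
	/* Strength and step size are only meaningful as a pair. */
	if ((step >= 0) != (strength >= 0)) {
		fprintf(stderr, "must set both strength and step size in DT\n");
		return -EINVAL;
	}

	/* Only override the defaults for properties that were given. */
	if (mode >= 0)
		cfg->mode = mode;
	if (strength >= 0)
		cfg->strength = strength;
	if (step > 0)
		cfg->size = step;

	return 0;
}
```

Writing the check as a single inequality of two booleans is equivalent to the two-clause condition in the patch; it is shown here only because it makes the "both or neither" intent easier to see.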
[PATCH 2/3] Documentation: devicetree: add binding doc for Broadcom NAND controller
Signed-off-by: Brian Norris --- .../devicetree/bindings/mtd/brcmstb_nand.txt | 109 + 1 file changed, 109 insertions(+) create mode 100644 Documentation/devicetree/bindings/mtd/brcmstb_nand.txt diff --git a/Documentation/devicetree/bindings/mtd/brcmstb_nand.txt b/Documentation/devicetree/bindings/mtd/brcmstb_nand.txt new file mode 100644 index ..933d44943cbb --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/brcmstb_nand.txt @@ -0,0 +1,109 @@ +* Broadcom STB NAND Controller + +The Broadcom Set-Top Box NAND controller supports low-level access to raw NAND +flash chips. It has a memory-mapped register interface for both control +registers and for its data input/output buffer. On some SoCs, this controller is +paired with a custom DMA engine (inventively named "Flash DMA") which supports +basic PROGRAM and READ functions, among other features. + +This controller was originally designed for STB SoCs (BCM7xxx) but is now +available on a variety of Broadcom SoCs, including some BCM3xxx, BCM63xx, and +iProc/Cygnus. Its history includes several similar (but not fully register +compatible) versions. + +Required properties: +- compatible : should contain "brcm,brcmnand" and an appropriate version + compatibility string, like "brcm,brcmnand-v7.0" + Possible values: + brcm,brcmnand-v4.0 + brcm,brcmnand-v5.0 + brcm,brcmnand-v6.0 + brcm,brcmnand-v7.0 + brcm,brcmnand-v7.1 + brcm,brcmnand +- reg : the register start and length for NAND register region. + (optional) Flash DMA register range (if present) + (optional) NAND flash cache range (if at non-standard offset) +- reg-names: a list of the names corresponding to the previous register + ranges. Should contain "nand" and (optionally) + "flash-dma" and/or "nand-cache". 
+- interrupts : The NAND CTLRDY interrupt and (if Flash DMA is available) + FLASH_DMA_DONE +- interrupt-names : May be "nand_ctlrdy" or "flash_dma_done" +- interrupt-parent : See standard interrupt bindings +- #address-cells : <1> - subnodes give the chip-select number +- #size-cells : <0> + +Optional properties: +- brcm,nand-has-wp : Some versions of this IP include a write-protect + (WP) control bit. It is always available on >= + v7.0. Use this property to describe the rare + earlier versions of this core that include WP + +* NAND chip-select + +Each controller (compatible: "brcm,brcmnand") may contain one or more subnodes +to represent enabled chip-selects which (may) contain NAND flash chips. Their +properties are as follows. + +Required properties: +- compatible: should contain "brcm,nandcs" +- reg : a single integer representing the chip-select + number (e.g., 0, 1, 2, etc.) +- #address-cells: see partition.txt +- #size-cells : see partition.txt +- nand-ecc-strength : see nand.txt +- nand-ecc-step-size: must be 512 or 1024. See nand.txt + +Optional properties: +- nand-on-flash-bbt : boolean, to enable the on-flash BBT for this + chip-select. See nand.txt +- brcm,nand-oob-sector-size : integer, to denote the spare area sector size + expected for the ECC layout in use. This size, in + addition to the strength and step-size, + determines how the hardware BCH engine will lay + out the parity bytes it stores on the flash. + This property can be automatically determined by + the flash geometry (particularly the NAND page + and OOB size) in many cases, but when booting + from NAND, the boot controller has only a limited + number of available options for its default ECC + layout. + +Each nandcs device node may optionally contain sub-nodes describing the flash +partition mapping. See partition.txt for more detail. 
+ +Example: + +nand@f0442800 { + compatible = "brcm,brcmnand-v7.0", "brcm,brcmnand"; + reg = <0xF0442800 0x600>, + <0xF0443000 0x100>; + reg-names = "nand", "flash-dma"; + interrupt-parent = <_intr2_intc>; + interrupts = <24>, <4>; + + #address-cells = <1>; + #size-cells = <0>; + + nandcs@1 { + compatible = "brcm,nandcs"; + reg = <1>; // Chip select 1 + nand-on-flash-bbt; + nand-ecc-strength
Re: [PATCH 1/4] PM / Wakeirq: Add minimal device wakeirq helper functions
* Rafael J. Wysocki [150306 16:19]:
> On Friday, March 06, 2015 03:05:40 PM Tony Lindgren wrote:
> >
> > Oh it naturally would not work in irq context, it's for the bottom
> > half again. But I'll take a look if we can just call
> > pm_request_resume() and disable_irq() on the wakeirq without
> > waiting for the device driver runtime_suspend to disable the wakeirq.
> > That would minimize the interface to just dev_pm_request_wakeirq()
> > and dev_pm_free_wakeirq().
>
> But this is part of a bigger picture. Namely, if a separate wakeup interrupt
> is required for a device, the device's power.can_wakeup flag cannot be set
> until that interrupt has been successfully requested. Also for devices that
> can signal wakeup via their own IO interrupts, it would make sense to allow
> those interrupts to be registered somehow as "wakeup interrupts".

It sure would be nice to provide at least some automated handling for those too, even if it was just to deal with the if (device_may_wakeup()) irq_set_irq_wake() pattern.

At least in the cases I've seen, the IO interrupt is capable of waking up too, but not from any deeper idle states. The wakeirq is always capable of waking up the system, so if wakeirq was configured we could just ignore the wake configuration for the IO interrupt.

And it seems some devices have a single wakeirq dealing with a group of IO interrupts (GPIOs), see commit 97d86e07b716 ("Input: gpio_keys - allow separating gpio and irq in device tree"). Not sure if that interrupt is wake-up capable, but that would certainly make sense considering it's for gpio-keys.

So it seems as long as we have one wakeirq entry per device, we should be covered, even if a single wakeirq needs to wake up multiple devices.
> So I wonder if we can define a new struct along the lines of your > struct wakeirq_source, but call it struct wake_irq and make it look > something like this: > > struct wake_irq { >struct device *dev; >int irq; >irq_handler_t handler; > }; > > Then, add a struct wake_irq pointer to struct dev_pm_info *and* to > struct wakeup_source. Next, make dev_pm_request_wake_irq() allocate the > structure and request the interrupt and only set the pointer to it from > struct dev_pm_info *along* *with* power.can_wakeup if all that was > successful. > > For devices that use their own IO IRQ for wakeup, we can add something > like dev_pm_set_wake_irq() that will work analogously, but without requesting > the interrupt. It will just set the dev and irq members of struct wake_irq > and point struct dev_pm_info to it and set its power.can_wakeup flag. OK > Then, device_wakeup_enable() will be able to see that the device has a > wakeup IRQ and it may then point its own struct wake_irq pointer to that. > The core may then use that pointer to trigger enable_irq_wake() for the > IRQ in question and it will cover the devices that don't need separate > wakeup interrupts too. Are you thinking we could do more than automate irq_set_irq_wake() for the devices with just wake-up capable IO IRQ? > Does that make sense to you? Sure, at least for the irq_set_irq_wake() case. Regards, Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
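For readers following the proposal, the "IO IRQ doubles as wake IRQ" case can be modelled in a few lines of user-space C. This is only a sketch of the idea discussed in this thread, not an existing kernel interface: struct device here is a toy stand-in (the pointer and flag would really live in struct dev_pm_info), irq_handler_t is simplified, and the error codes chosen for dev_pm_set_wake_irq() are assumptions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's irq_handler_t. */
typedef int (*irq_handler_t)(int irq, void *data);

struct wake_irq;

/* Toy device: in the proposal these two members belong to struct dev_pm_info. */
struct device {
	struct wake_irq *wakeirq;
	bool can_wakeup;	/* models power.can_wakeup */
};

/* Layout as suggested in the quoted mail. */
struct wake_irq {
	struct device *dev;
	int irq;
	irq_handler_t handler;
};

/*
 * Hypothetical dev_pm_set_wake_irq(): the device wakes the system through
 * its own IO interrupt, so nothing is requested here; the IRQ is only
 * recorded and power.can_wakeup is set once that has succeeded.
 */
static int dev_pm_set_wake_irq(struct device *dev, struct wake_irq *wirq, int irq)
{
	if (!dev || !wirq || irq < 0)
		return -EINVAL;

	/* One wakeirq entry per device, as discussed above. */
	if (dev->wakeirq)
		return -EEXIST;

	wirq->dev = dev;
	wirq->irq = irq;
	wirq->handler = NULL;	/* no dedicated handler for the IO IRQ case */

	dev->wakeirq = wirq;
	dev->can_wakeup = true;

	return 0;
}
```

A dev_pm_request_wake_irq() variant would differ mainly in requesting the interrupt first and setting the flag only if that request succeeded.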
[PATCH] MAINTAINERS: Add missing Toshiba devices and add myself as maintainer
Add the missing toshiba_bluetooth and toshiba_haps entries and add myself as their maintainer.

Also add the Maintainers entry for toshiba_acpi driver and change its status to maintained.

Signed-off-by: Azael Avalos
---
 MAINTAINERS | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3376bda..7dcbd22e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9849,10 +9849,23 @@ S: Maintained
 F: drivers/platform/x86/topstar-laptop.c
 
 TOSHIBA ACPI EXTRAS DRIVER
+M: Azael Avalos
 L: platform-driver-...@vger.kernel.org
-S: Orphan
+S: Maintained
 F: drivers/platform/x86/toshiba_acpi.c
 
+TOSHIBA BLUETOOTH DRIVER
+M: Azael Avalos
+L: platform-driver-...@vger.kernel.org
+S: Maintained
+F: drivers/platform/x86/toshiba_bluetooth.c
+
+TOSHIBA HDD ACTIVE PROTECTION SENSOR DRIVER
+M: Azael Avalos
+L: platform-driver-...@vger.kernel.org
+S: Maintained
+F: drivers/platform/x86/toshiba_haps.c
+
 TOSHIBA SMM DRIVER
 M: Jonathan Buzzard
 L: tlinux-us...@tce.toshiba-dme.co.jp
-- 
2.2.2
[PATCH] toshiba_acpi: Update events in toshiba_acpi_notify
This patch adds a few more events sent to TOS devices. Some of them are already identified, while others simply print a message reporting the type of event received.

Also, a netlink event is generated so that userspace apps, daemons, etc. can act on these events.

Signed-off-by: Azael Avalos
---
 drivers/platform/x86/toshiba_acpi.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
index 09c6a2f..3a895a8 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -2801,6 +2801,21 @@ static void toshiba_acpi_notify(struct acpi_device *acpi_dev, u32 event)
 	case 0x80: /* Hotkeys and some system events */
 		toshiba_acpi_process_hotkeys(dev);
 		break;
+	case 0x81: /* Dock events */
+	case 0x82:
+	case 0x83:
+		pr_info("Dock event received %x\n", event);
+		break;
+	case 0x88: /* Thermal events */
+		pr_info("Thermal event received\n");
+		break;
+	case 0x8f: /* LID closed */
+	case 0x90: /* LID is closed and Dock has been ejected */
+		break;
+	case 0x8c: /* SATA power events */
+	case 0x8b:
+		pr_info("SATA power event received %x\n", event);
+		break;
 	case 0x92: /* Keyboard backlight mode changed */
 		/* Update sysfs entries */
 		ret = sysfs_update_group(&acpi_dev->dev.kobj,
@@ -2808,17 +2823,19 @@ static void toshiba_acpi_notify(struct acpi_device *acpi_dev, u32 event)
 		if (ret)
 			pr_err("Unable to update sysfs entries\n");
 		break;
-	case 0x81: /* Unknown */
-	case 0x82: /* Unknown */
-	case 0x83: /* Unknown */
-	case 0x8c: /* Unknown */
+	case 0x85: /* Unknown */
+	case 0x8d: /* Unknown */
 	case 0x8e: /* Unknown */
-	case 0x8f: /* Unknown */
-	case 0x90: /* Unknown */
+	case 0x94: /* Unknown */
+	case 0x95: /* Unknown */
 	default:
 		pr_info("Unknown event received %x\n", event);
 		break;
 	}
+
+	acpi_bus_generate_netlink_event(acpi_dev->pnp.device_class,
+					dev_name(&acpi_dev->dev),
+					event, 0);
 }
 
 #ifdef CONFIG_PM_SLEEP
-- 
2.2.2
Re: [PATCH v5 tip 0/7] tracing: attach eBPF programs to kprobes
On Wed, 4 Mar 2015 15:48:24 -0500 Steven Rostedt wrote:
> On Wed, 4 Mar 2015 21:33:16 +0100
> Ingo Molnar wrote:
>
> >
> > * Alexei Starovoitov wrote:
> >
> > > On Sun, Mar 1, 2015 at 3:27 PM, Alexei Starovoitov wrote:
> > > > Peter, Steven,
> > > > I think this set addresses everything we've discussed.
> > > > Please review/ack. Thanks!
> > >
> > > icmp echo request
> >
> > I'd really like to have an Acked-by from Steve (propagated into the
> > changelogs) before looking at applying these patches.
>
> I'll have to look at this tomorrow. I'm a bit swamped with other things
> at the moment :-/
>

Just an update. I started looking at it but then was pulled off to do other things. I'll make this a priority next week.

Sorry for the delay.

-- Steve
Re: [PATCH] [RFC] ARM: shmobile: R-Car Gen2: Add da9063/da9210 regulator quirk
On Tue, Mar 03, 2015 at 09:44:06AM +0000, Mark Brown wrote:
> On Mon, Mar 02, 2015 at 06:28:43PM +0100, Geert Uytterhoeven wrote:
> > The r8a7791/koelsch development board has da9063 and da9210 regulators.
> > Both regulators have their interrupt request lines tied to the same
> > interrupt pin (IRQ2) on the SoC.
>
> Reviewed-by: Mark Brown
>
> This is all rather fun isn't it?

Hi Geert,

please repost this with Mark's Ack as a non-RFC if you would like me to pick it up.
Re: [PATCH] kernel/locking/locktorture: fix deadlock in 'rw_lock_irq' type
On Fri, 2015-03-06 at 16:37 -0800, Paul E. McKenney wrote:
> On Sat, Mar 07, 2015 at 03:06:53AM +0300, Alexey Kodanev wrote:
> > torture_rwlock_read_unlock_irq() must use read_unlock_irqrestore()
> > instead of write_unlock_irqrestore().
> >
> > Use read_unlock_irqrestore() instead of write_unlock_irqrestore().
> >
> > Signed-off-by: Alexey Kodanev
>
> Good catch! If Davidlohr has no objections, I will queue this one.

Dear me, yes that looks completely borken. Thanks for the fix.
Re: [PATCH 2/2] crypto: caam_rng: fix rng_unmap_ctx's DMA_UNMAP size problem
On Fri, 6 Mar 2015 10:34:42 +0800 wrote: > From: Yanjiang Jin > > Fix rng_unmap_ctx's DMA_UNMAP size problem for caam_rng, else system would > report the below calltrace during cleanup caam_rng. > Since rng_create_sh_desc() creates a fixed descriptor of exactly 4 > command-lengths now, also update DESC_RNG_LEN to (4 * CAAM_CMD_SZ). > > caam_jr ffe301000.jr: DMA-API: device driver frees DMA memory with different > size [device address=0x7f080010] [map size=16 bytes] [unmap size=40 > bytes] > [ cut here ] > WARNING: at lib/dma-debug.c:887 > Modules linked in: > task: c000f7cdaa80 ti: c000e534 task.ti: c000e534 > NIP: c04f5bc8 LR: c04f5bc4 CTR: c05f69b0 > REGS: c000e53433c0 TRAP: 0700 Not tainted > MSR: 80029000 CR: 24088482 XER: > SOFTE: 0 > > GPR00: c04f5bc4 c000e5343640 c12af360 009f > GPR04: 00a0 c0d02070 c00015980660 > GPR08: c0cff360 c12da018 > GPR12: 01e3 c1fff780 100f 0001 > GPR16: 0002 > GPR20: 0001 > GPR24: 0001 0001 0001 > GPR28: c1556b90 c1565b80 c000e5343750 c000f9427480 > NIP [c04f5bc8] .check_unmap+0x538/0x9c0 > LR [c04f5bc4] .check_unmap+0x534/0x9c0 > Call Trace: > [c000e5343640] [c04f5bc4] .check_unmap+0x534/0x9c0 (unreliable) > [c000e53436e0] [c04f60d4] .debug_dma_unmap_page+0x84/0xb0 > [c000e5343810] [c082f9d4] .caam_cleanup+0x1d4/0x240 > [c000e53438a0] [c056cc88] .hwrng_unregister+0xd8/0x1c0 > Instruction dump: > 7c641b78 41de0410 e8a90050 2fa5 419e0484 e8de0028 e8ff0030 3c62ff90 > e91e0030 38638388 48546ed9 6000 <0fe0> 3c62ff8f 38637fc8 48546ec5 > ---[ end trace e43fd1734d6600df ]--- > > Signed-off-by: Yanjiang Jin > --- Acked-by: Kim Phillips Thanks, Kim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
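The two sizes in the DMA-API warning can be checked by hand. The sketch below assumes CAAM_CMD_SZ is one 32-bit command word (4 bytes) and that the 40-byte unmap came from a 10-word DESC_RNG_LEN; the latter is inferred from the trace and is not stated in the patch description. desc_len_bytes() and unmap_size_ok() are illustrative helpers, not driver functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one CAAM descriptor command is a single 32-bit word. */
#define CAAM_CMD_SZ sizeof(uint32_t)

/* Bytes occupied by a descriptor made of ncmds command words. */
static size_t desc_len_bytes(size_t ncmds)
{
	return ncmds * CAAM_CMD_SZ;
}

/*
 * The rule dma-debug enforces in check_unmap(): a buffer must be
 * unmapped with the same size it was mapped with.
 */
static int unmap_size_ok(size_t mapped, size_t unmapped)
{
	return mapped == unmapped;
}
```

With these assumptions a 4-command descriptor is 4 * 4 = 16 bytes, which matches the "map size=16 bytes" in the warning, while 10 * 4 = 40 bytes matches the reported unmap size, so setting DESC_RNG_LEN to (4 * CAAM_CMD_SZ) makes the two agree.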
Re: [PATCH 2/7] soc/fman: Add the FMan FLIB
On Wed, 2015-03-04 at 23:45 -0600, Emil Medve wrote: > From: Igal Liberman > > Signed-off-by: Igal Liberman > --- > drivers/soc/Kconfig |1 + > drivers/soc/Makefile |1 + > drivers/soc/fsl/Kconfig |1 + > drivers/soc/fsl/Makefile |1 + > drivers/soc/fsl/fman/Kconfig |7 + > drivers/soc/fsl/fman/Makefile | 13 + > drivers/soc/fsl/fman/fman.c | 1396 > + > 7 files changed, 1420 insertions(+) > create mode 100644 drivers/soc/fsl/Kconfig > create mode 100644 drivers/soc/fsl/Makefile > create mode 100644 drivers/soc/fsl/fman/Kconfig > create mode 100644 drivers/soc/fsl/fman/Makefile > create mode 100644 drivers/soc/fsl/fman/fman.c > > diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig > index 76d6bd4..674a6e6 100644 > --- a/drivers/soc/Kconfig > +++ b/drivers/soc/Kconfig > @@ -1,5 +1,6 @@ > menu "SOC (System On Chip) specific Drivers" > > +source "drivers/soc/fsl/Kconfig" > source "drivers/soc/qcom/Kconfig" > source "drivers/soc/ti/Kconfig" > source "drivers/soc/versatile/Kconfig" > diff --git a/drivers/soc/Makefile b/drivers/soc/Makefile > index 063113d..42836ee 100644 > --- a/drivers/soc/Makefile > +++ b/drivers/soc/Makefile > @@ -2,6 +2,7 @@ > # Makefile for the Linux Kernel SOC specific device drivers. 
> # > > +obj-$(CONFIG_FSL_SOC)+= fsl/ > obj-$(CONFIG_ARCH_QCOM) += qcom/ > obj-$(CONFIG_ARCH_TEGRA) += tegra/ > obj-$(CONFIG_SOC_TI) += ti/ > diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig > new file mode 100644 > index 000..38c08ae > --- /dev/null > +++ b/drivers/soc/fsl/Kconfig > @@ -0,0 +1 @@ > +source "drivers/soc/fsl/fman/Kconfig" > diff --git a/drivers/soc/fsl/Makefile b/drivers/soc/fsl/Makefile > new file mode 100644 > index 000..97d715c > --- /dev/null > +++ b/drivers/soc/fsl/Makefile > @@ -0,0 +1 @@ > +obj-$(CONFIG_FSL_FMAN) += fman/ > diff --git a/drivers/soc/fsl/fman/Kconfig b/drivers/soc/fsl/fman/Kconfig > new file mode 100644 > index 000..e5009a9 > --- /dev/null > +++ b/drivers/soc/fsl/fman/Kconfig > @@ -0,0 +1,7 @@ > +menuconfig FSL_FMAN > + tristate "Freescale DPAA Frame Manager" > + depends on FSL_SOC || COMPILE_TEST > + default n > + help > + Freescale Data-Path Acceleration Architecture Frame Manager > + (FMan) support As this doesn't appear to be a complete driver, why is it user-selectable? Likewise for the other flib patches. Why are you adding the flibs in a big batch, rather than bundling the code with the driver patches that use it? Is there a minimal core of fman functionality (complete functionality, not flibs) that you could start with, rather than trying to get the entire thing reviewed at once? In other words, break it up vertically, not horizontally. Note that the lack of caller context is especially harmful given that the functions are not documented. 
> diff --git a/drivers/soc/fsl/fman/Makefile b/drivers/soc/fsl/fman/Makefile
> new file mode 100644
> index 000..d7fbecb
> --- /dev/null
> +++ b/drivers/soc/fsl/fman/Makefile
> @@ -0,0 +1,13 @@
> +ccflags-y += -DVERSION=\"\"
> +
> +ifeq ($(CONFIG_FSL_FMAN),y)
> +
> +FMAN = $(srctree)/drivers/soc/fsl/fman
> +
> +ccflags-y += -I$(FMAN)/flib
> +
> +obj-$(CONFIG_FSL_FMAN) += fsl_fman.o
> +
> +fsl_fman-objs:= fman.o
> +
> +endif

You won't even get into this makefile if CONFIG_FSL_FMAN=n... Did you really mean to exclude this stuff when CONFIG_FSL_FMAN=m?

> +int fman_reset_mac(struct fman_fpm_regs __iomem *fpm_rg, uint8_t mac_id)
> +{
> +	uint32_t msk, timeout = 100;
> +
> +	/* Get the relevant bit mask */
> +	switch (mac_id) {
> +	case (0):
> +		msk = FPM_RSTC_MAC0_RESET;
> +		break;
> +	case (1):
> +		msk = FPM_RSTC_MAC1_RESET;
> +		break;
> +	case (2):
> +		msk = FPM_RSTC_MAC2_RESET;
> +		break;
> +	case (3):
> +		msk = FPM_RSTC_MAC3_RESET;
> +		break;
> +	case (4):
> +		msk = FPM_RSTC_MAC4_RESET;
> +		break;
> +	case (5):
> +		msk = FPM_RSTC_MAC5_RESET;
> +		break;
> +	case (6):
> +		msk = FPM_RSTC_MAC6_RESET;
> +		break;
> +	case (7):
> +		msk = FPM_RSTC_MAC7_RESET;
> +		break;
> +	case (8):
> +		msk = FPM_RSTC_MAC8_RESET;
> +		break;
> +	case (9):
> +		msk = FPM_RSTC_MAC9_RESET;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}

Without seeing the caller, I can't judge whether there's a good reason for passing this in by number rather than passing in FPM_RSTC_MACn_RESET directly and avoiding the switch.

> +	/* reset */
> +	iowrite32be(msk, &fpm_rg->fm_rstc);
> +	while ((ioread32be(&fpm_rg->fm_rstc) & msk) && --timeout)
> +		usleep_range(10, 11);
> +
> +
Re: [PATCH 1/2] Input: add support for Semtech SX8654 I2C touchscreen controller
On Fri, Mar 06, 2015 at 10:21:55AM -0800, Dmitry Torokhov wrote:
> On Fri, Mar 06, 2015 at 07:21:38PM +0100, Sébastien Szymanski wrote:
> > Signed-off-by: Sébastien Szymanski
> > ---
> >  drivers/input/touchscreen/Kconfig  |  11 ++
> >  drivers/input/touchscreen/Makefile |   1 +
> >  drivers/input/touchscreen/sx8654.c | 285 +
> >  3 files changed, 297 insertions(+)
> >  create mode 100644 drivers/input/touchscreen/sx8654.c
> >
> > diff --git a/drivers/input/touchscreen/Kconfig b/drivers/input/touchscreen/Kconfig
> > index 5891752..6f713fd0 100644
> > --- a/drivers/input/touchscreen/Kconfig
> > +++ b/drivers/input/touchscreen/Kconfig
> > @@ -961,6 +961,17 @@ config TOUCHSCREEN_SUR40
> >  	  To compile this driver as a module, choose M here: the
> >  	  module will be called sur40.
> >
> > +config TOUCHSCREEN_SX8654
> > +	tristate "Semtech SX8654 touchscreen"
> > +	depends on I2C && OF
>
> Does it have to depend on OF? I do not see anything OF-specific there...
>
> No need to resubmit.

Applied with some cosmetic edits and DT bindings folded into this patch. Thank you.

>
> > +	help
> > +	  Say Y here if you have a Semtech SX8654 touchscreen controller.
> > +
> > +	  If unsure, say N
> > +
> > +	  To compile this driver as a module, choose M here: the
> > +	  module will be called sx8654.
> > + > > config TOUCHSCREEN_TPS6507X > > tristate "TPS6507x based touchscreens" > > depends on I2C > > diff --git a/drivers/input/touchscreen/Makefile > > b/drivers/input/touchscreen/Makefile > > index 0242fea..a06a752 100644 > > --- a/drivers/input/touchscreen/Makefile > > +++ b/drivers/input/touchscreen/Makefile > > @@ -79,5 +79,6 @@ obj-$(CONFIG_TOUCHSCREEN_WM97XX_ATMEL)+= > > atmel-wm97xx.o > > obj-$(CONFIG_TOUCHSCREEN_WM97XX_MAINSTONE) += mainstone-wm97xx.o > > obj-$(CONFIG_TOUCHSCREEN_WM97XX_ZYLONITE) += zylonite-wm97xx.o > > obj-$(CONFIG_TOUCHSCREEN_W90X900) += w90p910_ts.o > > +obj-$(CONFIG_TOUCHSCREEN_SX8654) += sx8654.o > > obj-$(CONFIG_TOUCHSCREEN_TPS6507X) += tps6507x-ts.o > > obj-$(CONFIG_TOUCHSCREEN_ZFORCE) += zforce_ts.o > > diff --git a/drivers/input/touchscreen/sx8654.c > > b/drivers/input/touchscreen/sx8654.c > > new file mode 100644 > > index 000..58cc478 > > --- /dev/null > > +++ b/drivers/input/touchscreen/sx8654.c > > @@ -0,0 +1,285 @@ > > +/* > > + * drivers/input/touchscreen/sx8654.c > > + * > > + * Copyright (c) 2015 Armadeus Systems > > + * Sébastien Szymanski > > + * > > + * Using code from: > > + * - sx865x.c > > + * Copyright (c) 2013 U-MoBo Srl > > + * Pierluigi Passaro > > + * - sx8650.c > > + * Copyright (c) 2009 Wayne Roberts > > + * - tsc2007.c > > + * Copyright (c) 2008 Kwangwoo Lee > > + * - ads7846.c > > + * Copyright (c) 2005 David Brownell > > + * Copyright (c) 2006 Nokia Corporation > > + * - corgi_ts.c > > + * Copyright (C) 2004-2005 Richard Purdie > > + * - omap_ts.[hc], ads7846.h, ts_osk.c > > + * Copyright (C) 2002 MontaVista Software > > + * Copyright (C) 2004 Texas Instruments > > + * Copyright (C) 2005 Dirk Behme > > + * > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License version 2 as > > + * published by the Free Software Foundation. 
> > + */ > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +/* register addresses */ > > +#define I2C_REG_TOUCH0 0x00 > > +#define I2C_REG_TOUCH1 0x01 > > +#define I2C_REG_CHANMASK 0x04 > > +#define I2C_REG_IRQMASK0x22 > > +#define I2C_REG_IRQSRC 0x23 > > +#define I2C_REG_SOFTRESET 0x3f > > + > > +/* commands */ > > +#define CMD_READ_REGISTER 0x40 > > +#define CMD_MANUAL 0xc0 > > +#define CMD_PENTRG 0xe0 > > + > > +/* value for I2C_REG_SOFTRESET */ > > +#define SOFTRESET_VALUE0xde > > + > > +/* bits for I2C_REG_IRQSRC */ > > +#define IRQ_PENTOUCH_TOUCHCONVDONE 0x08 > > +#define IRQ_PENRELEASE 0x04 > > + > > +/* bits for RegTouch1 */ > > +#define CONDIRQ0x20 > > +#define FILT_7SA 0x03 > > + > > +/* bits for I2C_REG_CHANMASK */ > > +#define CONV_X 0x80 > > +#define CONV_Y 0x40 > > + > > +/* coordinates rate: higher nibble of CTRL0 register */ > > +#define RATE_MANUAL0x00 > > +#define RATE_5000CPS 0xf0 > > + > > +/* power delay: lower nibble of CTRL0 register */ > > +#define POWDLY_1_1MS 0x0b > > + > > +#define MAX_12BIT ((1 << 12) - 1) > > + > > +struct sx8654 { > > + struct input_dev *input; > > + struct i2c_client *client; > > +}; > > + > > +static irqreturn_t sx8654_irq(int irq, void
Re: [PATCH v2 01/15] x86, kaslr: Use init_size instead of run_size
On Fri, Mar 6, 2015 at 11:56 AM, Kees Cook wrote:
> On Fri, Mar 6, 2015 at 11:28 AM, Yinghai Lu wrote:
> Okay, I've proven this to myself now. :) I think it would be valuable
> to call out that brk and bss are included in the _end calculation. For
> others:
...
> So, _end - _text does equal _text + bss offset + bss size + brk size
>
> Thanks! It'll be nice to lose the run_size hack. Adding some
> documentation to the code here would help others in the future trying
> to find this value, I think. :)

In arch/x86/kernel/vmlinux.lds.S, we have:

	/* BSS */
	. = ALIGN(PAGE_SIZE);
	.bss : AT(ADDR(.bss) - LOAD_OFFSET) {
		__bss_start = .;
		*(.bss..page_aligned)
		*(.bss)
		. = ALIGN(PAGE_SIZE);
		__bss_stop = .;
	}

	. = ALIGN(PAGE_SIZE);
	.brk : AT(ADDR(.brk) - LOAD_OFFSET) {
		__brk_base = .;
		. += 64 * 1024;		/* 64k alignment slop space */
		*(.brk_reservation)	/* areas brk users have reserved */
		__brk_limit = .;
	}

	_end = .;

So _end already covers bss and brk.
[PATCH] crypto: powerpc - move files to fix build error
The current cryptodev-2.6 tree commits:

  d9850fc529ef ("crypto: powerpc/sha1 - kernel config")
  50ba29aaa7b0 ("crypto: powerpc/sha1 - glue")

failed to properly place files under arch/powerpc/crypto, which
leads to build errors:

make[1]: *** No rule to make target 'arch/powerpc/crypto/sha1-spe-asm.o', needed by 'arch/powerpc/crypto/sha1-ppc-spe.o'. Stop.
make[1]: *** No rule to make target 'arch/powerpc/crypto/sha1_spe_glue.o', needed by 'arch/powerpc/crypto/sha1-ppc-spe.o'. Stop.
Makefile:947: recipe for target 'arch/powerpc/crypto' failed

Move the two sha1 spe files under crypto/, and whilst there, rename
other powerpc crypto files with underscores to use dashes for
consistency.

Cc: Markus Stockhausen
Signed-off-by: Kim Phillips
---
applies to today's cryptodev-2.6.

 arch/powerpc/crypto/Makefile | 8
 arch/powerpc/crypto/{aes_spe_glue.c => aes-spe-glue.c} | 0
 arch/powerpc/crypto/{md5_glue.c => md5-glue.c} | 0
 arch/powerpc/{ => crypto}/sha1-spe-asm.S | 0
 arch/powerpc/{sha1_spe_glue.c => crypto/sha1-spe-glue.c} | 0
 arch/powerpc/crypto/{sha256_spe_glue.c => sha256-spe-glue.c} | 0
 6 files changed, 4 insertions(+), 4 deletions(-)
 rename arch/powerpc/crypto/{aes_spe_glue.c => aes-spe-glue.c} (100%)
 rename arch/powerpc/crypto/{md5_glue.c => md5-glue.c} (100%)
 rename arch/powerpc/{ => crypto}/sha1-spe-asm.S (100%)
 rename arch/powerpc/{sha1_spe_glue.c => crypto/sha1-spe-glue.c} (100%)
 rename arch/powerpc/crypto/{sha256_spe_glue.c => sha256-spe-glue.c} (100%)

diff --git a/arch/powerpc/crypto/Makefile b/arch/powerpc/crypto/Makefile
index c6b25cba..9c221b6 100644
--- a/arch/powerpc/crypto/Makefile
+++ b/arch/powerpc/crypto/Makefile
@@ -10,8 +10,8 @@ obj-$(CONFIG_CRYPTO_SHA1_PPC) += sha1-powerpc.o
 obj-$(CONFIG_CRYPTO_SHA1_PPC_SPE) += sha1-ppc-spe.o
 obj-$(CONFIG_CRYPTO_SHA256_PPC_SPE) += sha256-ppc-spe.o
 
-aes-ppc-spe-y := aes-spe-core.o aes-spe-keys.o aes-tab-4k.o aes-spe-modes.o aes_spe_glue.o
-md5-ppc-y := md5-asm.o md5_glue.o
+aes-ppc-spe-y := aes-spe-core.o aes-spe-keys.o aes-tab-4k.o aes-spe-modes.o aes-spe-glue.o
+md5-ppc-y := md5-asm.o md5-glue.o
 sha1-powerpc-y := sha1-powerpc-asm.o sha1.o
-sha1-ppc-spe-y := sha1-spe-asm.o sha1_spe_glue.o
-sha256-ppc-spe-y := sha256-spe-asm.o sha256_spe_glue.o
+sha1-ppc-spe-y := sha1-spe-asm.o sha1-spe-glue.o
+sha256-ppc-spe-y := sha256-spe-asm.o sha256-spe-glue.o
diff --git a/arch/powerpc/crypto/aes_spe_glue.c b/arch/powerpc/crypto/aes-spe-glue.c
similarity index 100%
rename from arch/powerpc/crypto/aes_spe_glue.c
rename to arch/powerpc/crypto/aes-spe-glue.c
diff --git a/arch/powerpc/crypto/md5_glue.c b/arch/powerpc/crypto/md5-glue.c
similarity index 100%
rename from arch/powerpc/crypto/md5_glue.c
rename to arch/powerpc/crypto/md5-glue.c
diff --git a/arch/powerpc/sha1-spe-asm.S b/arch/powerpc/crypto/sha1-spe-asm.S
similarity index 100%
rename from arch/powerpc/sha1-spe-asm.S
rename to arch/powerpc/crypto/sha1-spe-asm.S
diff --git a/arch/powerpc/sha1_spe_glue.c b/arch/powerpc/crypto/sha1-spe-glue.c
similarity index 100%
rename from arch/powerpc/sha1_spe_glue.c
rename to arch/powerpc/crypto/sha1-spe-glue.c
diff --git a/arch/powerpc/crypto/sha256_spe_glue.c b/arch/powerpc/crypto/sha256-spe-glue.c
similarity index 100%
rename from arch/powerpc/crypto/sha256_spe_glue.c
rename to arch/powerpc/crypto/sha256-spe-glue.c
-- 
2.3.1
Re: [PATCH 1/2] crypto: caamhash: - fix uninitialized edesc->sec4_sg_bytes field
On Fri, 6 Mar 2015 10:34:41 +0800 wrote:
> From: Yanjiang Jin
>
> sec4_sg_bytes not being properly initialized causes ahash_done
> to try to free unallocated DMA memory:
>
> caam_jr ffe301000.jr: DMA-API: device driver tries to free DMA memory it has
> not allocated [device address=0xdeadbeefdeadbeef] [size=3735928559 bytes]
> [ cut here ]
> WARNING: at lib/dma-debug.c:1093
> Modules linked in:
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.0.0-rc1+ #6
> task: e9598c00 ti: effca000 task.ti: e95a2000
> NIP: c04ef24c LR: c04ef24c CTR: c0549730
> REGS: effcbd40 TRAP: 0700 Not tainted (4.0.0-rc1+)
> MSR: 00029002 CR: 22008084 XER: 2000
>
> GPR00: c04ef24c effcbdf0 e9598c00 0096 c08f7424 c00ab2b0 0001
> GPR08: c0fe7510 effca000 01c3 22008082 c1048e77 c105
> GPR16: c0c36700 493c0040 002c e690e4a0 c1054fb4 c18bac40 00029002 c18b0788
> GPR24: 0014 e690e480 effcbe48 c0fde128 e6ffac10 deadbeef deadbeef
> NIP [c04ef24c] check_unmap+0x93c/0xb40
> LR [c04ef24c] check_unmap+0x93c/0xb40
> Call Trace:
> [effcbdf0] [c04ef24c] check_unmap+0x93c/0xb40 (unreliable)
> [effcbe40] [c04ef4f4] debug_dma_unmap_page+0xa4/0xc0
> [effcbec0] [c070cda8] ahash_done+0x128/0x1a0
> [effcbef0] [c0700070] caam_jr_dequeue+0x1d0/0x290
> [effcbf40] [c0045f40] tasklet_action+0x110/0x1f0
> [effcbf80] [c0044bc8] __do_softirq+0x188/0x700
> [effcbfe0] [c00455d8] irq_exit+0x108/0x120
> [effcbff0] [c000f520] call_do_irq+0x24/0x3c
> [e95a3e20] [c00059b8] do_IRQ+0xc8/0x170
> [e95a3e50] [c0011bc8] ret_from_except+0x0/0x18
>
> Signed-off-by: Yanjiang Jin
> ---

Acked-by: Kim Phillips

Thanks,

Kim
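
A side note for readers of the trace: the reported size, 3735928559 bytes, is 0xdeadbeef in decimal, and the device address is the same pattern repeated, which is consistent with fields that were never assigned. The sketch below is a toy userspace model of that failure mode, not the driver code; struct toy_edesc, toy_alloc_poisoned() and toy_done_would_unmap() are hypothetical names, and the poison is written explicitly so that the example stays deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a job descriptor; not the real driver struct. */
struct toy_edesc {
	uint64_t sg_dma;	/* DMA address of the scatter/gather table */
	uint32_t sg_bytes;	/* size of that table, 0 if none was mapped */
};

/* Emulate an allocation whose memory still holds a poison pattern. */
static void toy_alloc_poisoned(struct toy_edesc *e)
{
	assert(e != NULL);
	e->sg_dma = 0xdeadbeefdeadbeefULL;
	e->sg_bytes = 0xdeadbeefU;
}

/*
 * Completion path of the model: a table is unmapped only when its size
 * is non-zero. Returns true if an unmap would be attempted.
 */
static bool toy_done_would_unmap(const struct toy_edesc *e)
{
	assert(e != NULL);
	return e->sg_bytes != 0;
}
```

In this model a descriptor straight out of toy_alloc_poisoned() makes the completion path attempt an unmap of 0xdeadbeef bytes, while setting sg_bytes to 0 on the path that maps no table makes it skip the unmap.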
Re: [PATCH] kernel/locking/locktorture: fix deadlock in 'rw_lock_irq' type
On Sat, Mar 07, 2015 at 03:06:53AM +0300, Alexey Kodanev wrote:
> torture_rwlock_read_unlock_irq() must use read_unlock_irqrestore()
> instead of write_unlock_irqrestore().
>
> Use read_unlock_irqrestore() instead of write_unlock_irqrestore().
>
> Signed-off-by: Alexey Kodanev

Good catch! If Davidlohr has no objections, I will queue this one.

							Thanx, Paul

> ---
>  kernel/locking/locktorture.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
> index ec8cce2..6a2723c 100644
> --- a/kernel/locking/locktorture.c
> +++ b/kernel/locking/locktorture.c
> @@ -309,7 +309,7 @@ static int torture_rwlock_read_lock_irq(void)
> __acquires(torture_rwlock)
>  static void torture_rwlock_read_unlock_irq(void)
>  __releases(torture_rwlock)
>  {
> -	write_unlock_irqrestore(&torture_rwlock, cxt.cur_ops->flags);
> +	read_unlock_irqrestore(&torture_rwlock, cxt.cur_ops->flags);
>  }
>
>  static struct lock_torture_ops rw_lock_irq_ops = {
> --
> 1.7.1
>
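
To illustrate why a mismatched unlock can end in the deadlock named in the subject, here is a toy single-threaded model of a reader/writer lock. It is an assumption-laden sketch, not the kernel's rwlock_t: struct toy_rwlock and the toy_* helpers are invented for this example, and the real lock word is encoded differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a reader count plus a writer flag. Not the kernel's rwlock_t. */
struct toy_rwlock {
	int readers;	/* number of current read holders */
	bool writer;	/* true while a writer holds the lock */
};

/* Take the lock for reading unless a writer holds it. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
	assert(l != NULL);
	if (l->writer)
		return false;
	l->readers++;
	return true;
}

/* Correct release for a read holder: drop one reader. */
static void toy_read_unlock(struct toy_rwlock *l)
{
	assert(l != NULL);
	l->readers--;
}

/* Take the lock for writing only when nobody else holds it. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	assert(l != NULL);
	if (l->writer || l->readers > 0)
		return false;
	l->writer = true;
	return true;
}

/* Release for a write holder: clears the flag, never touches readers. */
static void toy_write_unlock(struct toy_rwlock *l)
{
	assert(l != NULL);
	l->writer = false;
}
```

In the model, a toy_read_trylock() followed by toy_write_unlock() leaves readers at 1 forever, so every later toy_write_trylock() fails; pairing the read lock with toy_read_unlock() returns the count to zero and a writer can get in again.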