RE: [EXT] Re: [PATCH 1/3] dt-bindings: i2c: add optional mul-value property to binding
> -----Original Message-----
> From: Uwe Kleine-König
> Sent: April 30, 2019 14:38
> To: Chuanhua Han
> Cc: robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org;
> s.ha...@pengutronix.de; Leo Li;
> linux-kernel@vger.kernel.org; devicet...@vger.kernel.org;
> linux-arm-ker...@lists.infradead.org; linux-...@vger.kernel.org;
> ker...@pengutronix.de; dl-linux-imx;
> feste...@gmail.com; wsa+rene...@sang-engineering.com; e...@deif.com;
> li...@rempel-privat.de; Sumit Batra;
> l.st...@pengutronix.de; p...@axentia.se
> Subject: [EXT] Re: [PATCH 1/3] dt-bindings: i2c: add optional mul-value
> property to binding
>
> Caution: EXT Email
>
> On Tue, Apr 30, 2019 at 12:32:40PM +0800, Chuanhua Han wrote:
> > NXP Layerscape SoC have up to three MUL options available for all
> > divider values, we choice of MUL determines the internal monitor rate
> > of the I2C bus (SCL and SDA signals):
> > A lower MUL value results in a higher sampling rate of the I2C signals.
> > A higher MUL value results in a lower sampling rate of the I2C signals.
> >
> > So in Optional properties we added our custom mul-value property in
> > the binding to select which mul option for the device tree i2c
> > controller node.
> >
> > Signed-off-by: Chuanhua Han
> > ---
> >  Documentation/devicetree/bindings/i2c/i2c-imx.txt | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/i2c/i2c-imx.txt
> > b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
> > index b967544590e8..ba8e7b7b3fa8 100644
> > --- a/Documentation/devicetree/bindings/i2c/i2c-imx.txt
> > +++ b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
> > @@ -18,6 +18,9 @@ Optional properties:
> >  - sda-gpios: specify the gpio related to SDA pin
> >  - pinctrl: add extra pinctrl to configure i2c pins to gpio function for i2c
> >    bus recovery, call it "gpio" state
> > +- mul-value: NXP Layerscape SoC have up to three MUL options
> > +available for all I2C divider values, it describes which MUL we
> > +choose to use for the driver, the values should be 1,2,4.
>
> Indention is broken.

Yes, I also noticed this problem. I will fix the indentation in the
next version.

> I wonder why this needs to be configurable on a per-machine/device level.
> What is the trade-off?

According to the NXP Layerscape SoC Reference Manual, the I2C controller
offers three MUL options for configuring the I2C Bus Frequency Divider
Register (IBFD), which determines the I2C clock frequency. Some SoCs
(such as the LS1046A) perform best with MUL=4, while the default is
MUL=1. The property is optional and can be set in the device tree.

> Best regards
> Uwe
>
> --
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
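
For readers following the mul-value discussion: the proposed binding text says the value should be 1, 2 or 4, and the reply above gives MUL=1 as the default. A minimal userspace sketch of that constraint might look like the following. The helper names and the fallback policy for a missing or invalid property are hypothetical and are not taken from the i2c-imx driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true only for the MUL values the binding allows. */
static bool mul_value_is_valid(uint32_t mul)
{
	return mul == 1 || mul == 2 || mul == 4;
}

/*
 * Hypothetical helper: pick the MUL to use. "present" models whether the
 * optional mul-value property exists in the controller node. A missing or
 * invalid property falls back to the default of 1 (assumed policy).
 */
static uint32_t mul_value_or_default(bool present, uint32_t mul)
{
	if (!present || !mul_value_is_valid(mul))
		return 1;
	return mul;
}
```

A real driver would presumably read the property with the usual OF helpers and decide for itself whether an invalid value is an error or a silent fallback; the sketch only shows the value check.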
Re: [PATCH 0/14] v2 multi-die/package topology support
On Tue, Feb 26, 2019 at 2:05 PM Peter Zijlstra wrote:
>
> On Tue, Feb 26, 2019 at 01:19:58AM -0500, Len Brown wrote:
> >  Documentation/cputopology.txt                | 72 ++-
> >  Documentation/x86/topology.txt               |  6 +-
> >  arch/x86/include/asm/processor.h             |  5 +-
> >  arch/x86/include/asm/smp.h                   |  1 +
> >  arch/x86/include/asm/topology.h              |  5 ++
> >  arch/x86/kernel/cpu/topology.c               | 85 +---
> >  arch/x86/kernel/smpboot.c                    | 73 +++-
> >  arch/x86/xen/smp_pv.c                        |  1 +
> >  drivers/base/topology.c                      | 22 +++
> >  drivers/hwmon/coretemp.c                     |  9 +--
> >  drivers/powercap/intel_rapl.c                | 75 +---
> >  drivers/thermal/intel/x86_pkg_temp_thermal.c |  9 +--
> >  include/linux/topology.h                     |  6 ++
> >  13 files changed, 276 insertions(+), 93 deletions(-)
>
> Should we not also have changes to
> arch/x86/kernel/cpu/proc.c:show_cpuinfo_cores() ?

Good question.
I was thinking that /proc/cpuinfo was sort of the legacy API,
and adding a field might break something, while adding an attribute
to the sysfs topology directory was the compatible/safe way to make
additions.

/proc/cpuinfo has these fields today:

physical id : 0
    this is the physical package id
siblings : 8
    this is the count of cpus in the same package
core id : 3
    this is cpu_core_id
cpu cores : 4
    this is booted_cores

If one were to make a change here, I'd consider adding the (physical)
die_id, though it is already in sysfs topology as an attribute.

Not sure if it would then make sense to print the count of cpus in the
die. Not sure what I'd name it, and this info is already in sysfs as a
map and list.

Len Brown, Intel Open Source Technology Center
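
For anyone who consumes the four /proc/cpuinfo fields listed above from userspace, a minimal sketch of a per-line parser is shown here. The function name is hypothetical, and it assumes the "key, optional blanks, colon, integer" layout seen in the example values.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/cpuinfo-style line such as "core id\t\t: 3".
 * On a key match, store the integer value and return 1; otherwise return 0.
 */
static int cpuinfo_field(const char *line, const char *key, int *value)
{
	size_t klen = strlen(key);
	const char *colon;

	if (strncmp(line, key, klen) != 0)
		return 0;

	/*
	 * Only blanks may sit between the key and the colon, so a key that
	 * is merely a prefix of a longer field name does not match.
	 */
	colon = line + klen;
	while (*colon == ' ' || *colon == '\t')
		colon++;
	if (*colon != ':')
		return 0;

	return sscanf(colon + 1, "%d", value) == 1;
}
```

Matching on the exact field name is what makes this kind of parser sensitive to renamed fields, which is why a new sysfs attribute is the lower-risk way to expose die information.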
[PATCH v4 3/3] dt-bindings: power: supply: Add bindings for Microchip UCS1002
Add bindings for Microchip UCS1002 Programmable USB Port Power
Controller with Charger Emulation.

Signed-off-by: Andrey Smirnov
Cc: Enric Balletbo Serra
Cc: Chris Healy
Cc: Lucas Stach
Cc: Fabio Estevam
Cc: Guenter Roeck
Cc: Rob Herring
Cc: devicet...@vger.kernel.org
Cc: Sebastian Reichel
Cc: linux-kernel@vger.kernel.org
Cc: linux...@vger.kernel.org
---
 .../power/supply/microchip,ucs1002.txt        | 27 +++
 1 file changed, 27 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/power/supply/microchip,ucs1002.txt

diff --git a/Documentation/devicetree/bindings/power/supply/microchip,ucs1002.txt b/Documentation/devicetree/bindings/power/supply/microchip,ucs1002.txt
new file mode 100644
index ..021fd7aba75e
--- /dev/null
+++ b/Documentation/devicetree/bindings/power/supply/microchip,ucs1002.txt
@@ -0,0 +1,27 @@
+Microchip UCS1002 USB Port Power Controller
+
+Required properties:
+- compatible		: Should be "microchip,ucs1002";
+- reg			: I2C slave address
+
+Optional properties:
+- interrupts-extended	: A list of interrupts lines present (could be either
+			  corresponding to A_DET# pin, ALERT# pin, or both)
+- interrupt-names	: A list of interrupt names. Should contain (if
+			  present):
+			  - "a_det" for line connected to A_DET# pin
+			  - "alert" for line connected to ALERT# pin
+  Both are expected to be IRQ_TYPE_EDGE_BOTH
+Example:
+
+&i2c3 {
+	charger@32 {
+		compatible = "microchip,ucs1002";
+		pinctrl-names = "default";
+		pinctrl-0 = <&pinctrl_ucs1002_pins>;
+		reg = <0x32>;
+		interrupts-extended = <&gpio5 2 IRQ_TYPE_EDGE_BOTH>,
+				      <&gpio3 21 IRQ_TYPE_EDGE_BOTH>;
+		interrupt-names = "a_det", "alert";
+	};
+};
-- 
2.20.1
[PATCH v4 2/3] power: supply: Add driver for Microchip UCS1002
Add driver for Microchip UCS1002 Programmable USB Port Power Controller with Charger Emulation. The driver exposed a power supply device to control/monitor various parameter of the device as well as a regulator to allow controlling VBUS line. Signed-off-by: Enric Balletbo Serra Signed-off-by: Andrey Smirnov Cc: Chris Healy Cc: Lucas Stach Cc: Fabio Estevam Cc: Guenter Roeck Cc: Sebastian Reichel Cc: linux-kernel@vger.kernel.org Cc: linux...@vger.kernel.org --- drivers/power/supply/Kconfig | 9 + drivers/power/supply/Makefile| 1 + drivers/power/supply/ucs1002_power.c | 646 +++ 3 files changed, 656 insertions(+) create mode 100644 drivers/power/supply/ucs1002_power.c diff --git a/drivers/power/supply/Kconfig b/drivers/power/supply/Kconfig index e901b9879e7e..c614c8a196f3 100644 --- a/drivers/power/supply/Kconfig +++ b/drivers/power/supply/Kconfig @@ -660,4 +660,13 @@ config FUEL_GAUGE_SC27XX Say Y here to enable support for fuel gauge with SC27XX PMIC chips. +config CHARGER_UCS1002 +tristate "Microchip UCS1002 USB Port Power Controller" + depends on I2C + depends on OF + select REGMAP_I2C + help + Say Y to enable support for Microchip UCS1002 Programmable + USB Port Power Controller with Charger Emulation. 
+ endif # POWER_SUPPLY diff --git a/drivers/power/supply/Makefile b/drivers/power/supply/Makefile index b731c2a9b695..c56803a9e4fe 100644 --- a/drivers/power/supply/Makefile +++ b/drivers/power/supply/Makefile @@ -87,3 +87,4 @@ obj-$(CONFIG_AXP288_CHARGER) += axp288_charger.o obj-$(CONFIG_CHARGER_CROS_USBPD) += cros_usbpd-charger.o obj-$(CONFIG_CHARGER_SC2731) += sc2731_charger.o obj-$(CONFIG_FUEL_GAUGE_SC27XX)+= sc27xx_fuel_gauge.o +obj-$(CONFIG_CHARGER_UCS1002) += ucs1002_power.o diff --git a/drivers/power/supply/ucs1002_power.c b/drivers/power/supply/ucs1002_power.c new file mode 100644 index ..d66b4eff9b7a --- /dev/null +++ b/drivers/power/supply/ucs1002_power.c @@ -0,0 +1,646 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Driver for UCS1002 Programmable USB Port Power Controller + * + * Copyright (C) 2019 Zodiac Inflight Innovations + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* UCS1002 Registers */ +#define UCS1002_REG_CURRENT_MEASUREMENT0x00 + +/* + * The Total Accumulated Charge registers store the total accumulated + * charge delivered from the VS source to a portable device. The total + * value is calculated using four registers, from 01h to 04h. The bit + * weighting of the registers is given in mA/hrs. 
+ */ +#define UCS1002_REG_TOTAL_ACC_CHARGE 0x01 + +/* Other Status Register */ +#define UCS1002_REG_OTHER_STATUS 0x0f +# define F_ADET_PIN BIT(4) +# define F_CHG_ACTBIT(3) + +/* Interrupt Status */ +#define UCS1002_REG_INTERRUPT_STATUS 0x10 +# define F_DISCHARGE_ERR BIT(6) +# define F_RESET BIT(5) +# define F_MIN_KEEP_OUT BIT(4) +# define F_TSDBIT(3) +# define F_OVER_VOLT BIT(2) +# define F_BACK_VOLT BIT(1) +# define F_OVER_ILIM BIT(0) + +/* Pin Status Register */ +#define UCS1002_REG_PIN_STATUS 0x14 +# define UCS1002_PWR_STATE_MASK 0x03 +# define F_PWR_EN_PIN BIT(6) +# define F_M2_PIN BIT(5) +# define F_M1_PIN BIT(4) +# define F_EM_EN_PIN BIT(3) +# define F_SEL_PINBIT(2) +# define F_ACTIVE_MODE_MASK GENMASK(5, 3) +# define F_ACTIVE_MODE_PASSTHROUGHF_M2_PIN +# define F_ACTIVE_MODE_DEDICATED F_EM_EN_PIN +# define F_ACTIVE_MODE_BC12_DCP (F_M2_PIN | F_EM_EN_PIN) +# define F_ACTIVE_MODE_BC12_SDP F_M1_PIN +# define F_ACTIVE_MODE_BC12_CDP (F_M1_PIN | F_M2_PIN | F_EM_EN_PIN) + +/* General Configuration Register */ +#define UCS1002_REG_GENERAL_CFG0x15 +# define F_RATION_EN BIT(3) + +/* Emulation Configuration Register */ +#define UCS1002_REG_EMU_CFG0x16 + +/* Switch Configuration Register */ +#define UCS1002_REG_SWITCH_CFG 0x17 +# define F_PIN_IGNORE BIT(7) +# define F_EM_EN_SET BIT(5) +# define F_M2_SET BIT(4) +# define F_M1_SET BIT(3) +# define F_S0_SET BIT(2) +# define F_PWR_EN_SET BIT(1) +# define F_LATCH_SET BIT(0) +# define V_SET_ACTIVE_MODE_MASK GENMASK(5, 3) +# define V_SET_ACTIVE_MODE_PASSTHROUGHF_M2_SET +# define V_SET_ACTIVE_MODE_DEDICATED F_EM_EN_SET +# define V_SET_ACTIVE_MODE_BC12_DCP (F_M2_SET | F_EM_EN_SET) +# define V_SET_ACTIVE_MODE_BC12_SDP F_M1_SE
[PATCH v4 1/3] power: supply: core: Add POWER_SUPPLY_HEALTH_OVERCURRENT constant
Add POWER_SUPPLY_HEALTH_OVERCURRENT constant in order to allow
signalling an overcurrent condition via power supply health
information.

Signed-off-by: Andrey Smirnov
Reviewed-by: Guenter Roeck
Cc: Enric Balletbo Serra
Cc: Chris Healy
Cc: Lucas Stach
Cc: Fabio Estevam
Cc: Guenter Roeck
Cc: Sebastian Reichel
Cc: linux-kernel@vger.kernel.org
Cc: linux...@vger.kernel.org
---
 drivers/power/supply/power_supply_sysfs.c | 2 +-
 include/linux/power_supply.h              | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/power/supply/power_supply_sysfs.c b/drivers/power/supply/power_supply_sysfs.c
index 5358a80d854f..153f4a6ca57c 100644
--- a/drivers/power/supply/power_supply_sysfs.c
+++ b/drivers/power/supply/power_supply_sysfs.c
@@ -62,7 +62,7 @@ static const char * const power_supply_charge_type_text[] = {
 static const char * const power_supply_health_text[] = {
 	"Unknown", "Good", "Overheat", "Dead", "Over voltage",
 	"Unspecified failure", "Cold", "Watchdog timer expire",
-	"Safety timer expire"
+	"Safety timer expire", "Over current"
 };

 static const char * const power_supply_technology_text[] = {
diff --git a/include/linux/power_supply.h b/include/linux/power_supply.h
index 2f9c201a54d1..bdab14c7ca4d 100644
--- a/include/linux/power_supply.h
+++ b/include/linux/power_supply.h
@@ -57,6 +57,7 @@ enum {
 	POWER_SUPPLY_HEALTH_COLD,
 	POWER_SUPPLY_HEALTH_WATCHDOG_TIMER_EXPIRE,
 	POWER_SUPPLY_HEALTH_SAFETY_TIMER_EXPIRE,
+	POWER_SUPPLY_HEALTH_OVERCURRENT,
 };

 enum {
-- 
2.20.1
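
The patch above touches two places that have to stay in step: the enum in power_supply.h and the string table in power_supply_sysfs.c, which is indexed by the enum value. The following is a minimal userspace model of that pairing, not the kernel code. The shortened names, the HEALTH_COUNT sentinel, the compile-time check and the accessor are all hypothetical additions for illustration.

```c
#include <stddef.h>

/* Userspace model of the health enum after the patch (names shortened). */
enum health {
	HEALTH_UNKNOWN = 0,
	HEALTH_GOOD,
	HEALTH_OVERHEAT,
	HEALTH_DEAD,
	HEALTH_OVERVOLTAGE,
	HEALTH_UNSPEC_FAILURE,
	HEALTH_COLD,
	HEALTH_WATCHDOG_TIMER_EXPIRE,
	HEALTH_SAFETY_TIMER_EXPIRE,
	HEALTH_OVERCURRENT,
	HEALTH_COUNT, /* hypothetical sentinel, not part of the patch */
};

/* Same strings, in the same order, as the table in the patch. */
static const char * const health_text[] = {
	"Unknown", "Good", "Overheat", "Dead", "Over voltage",
	"Unspecified failure", "Cold", "Watchdog timer expire",
	"Safety timer expire", "Over current"
};

/* Fails to compile if the enum and the text table drift apart. */
_Static_assert(sizeof(health_text) / sizeof(health_text[0]) == HEALTH_COUNT,
	       "health_text must have one entry per enum value");

/* Hypothetical accessor: map a health value to its string, NULL if out of range. */
static const char *health_to_text(enum health h)
{
	if ((unsigned int)h >= (unsigned int)HEALTH_COUNT)
		return NULL;
	return health_text[h];
}
```

Adding the enum entry without the matching string (or the other way round) would shift or truncate the mapping, which is why the two hunks arrive in a single patch.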
[PATCH v4 0/3] Driver for UCS1002
Everyone:

This small series adds a driver for UCS1002 Programmable USB Port
Power Controller with Charger Emulation. See [page] for product page
and [datasheet] for device datasheet.

Hopefully each individual patch is self explanatory.

Note that this series is a revival of the upstreaming effort by Enric
Balletbo Serra, the last version of which can be found at
[original-effort].

Feedback is welcome!

Thanks,
Andrey Smirnov

Changes since [v3]:

 - Added a check for negative values to ucs1002_set_usb_type()

Changes since [v2]:

 - Fixed a bug pointed out by Lucas

Changes since [v1]:

 - Moved IRQ trigger specification to DT

 - Fixed silent error paths in probe()

 - Dropped error message in ucs1002_set_max_current()

 - Fixed license mismatch

 - Changed the driver to configure the chip to BC1.2 CDP by default

 - Made other small fixes as per feedback for v1

[v3] https://lore.kernel.org/lkml/20190429195349.20335-1-andrew.smir...@gmail.com
[v2] https://lore.kernel.org/lkml/20190429054741.7286-1-andrew.smir...@gmail.com
[v1] https://lore.kernel.org/lkml/20190417084457.28747-1-andrew.smir...@gmail.com/
[page] https://www.microchip.com/wwwproducts/en/UCS1002-2
[datasheet] https://ww1.microchip.com/downloads/en/DeviceDoc/UCS1002-2%20Data%20Sheet.pdf
[original-effort] https://lore.kernel.org/lkml/1460705181-10493-1-git-send-email-enric.balle...@collabora.com/

Andrey Smirnov (3):
  power: supply: core: Add POWER_SUPPLY_HEALTH_OVERCURRENT constant
  power: supply: Add driver for Microchip UCS1002
  dt-bindings: power: supply: Add bindings for Microchip UCS1002

 .../power/supply/microchip,ucs1002.txt        |  27 +
 drivers/power/supply/Kconfig                  |   9 +
 drivers/power/supply/Makefile                 |   1 +
 drivers/power/supply/power_supply_sysfs.c     |   2 +-
 drivers/power/supply/ucs1002_power.c          | 646 ++
 include/linux/power_supply.h                  |   1 +
 6 files changed, 685 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/power/supply/microchip,ucs1002.txt
 create mode 100644 drivers/power/supply/ucs1002_power.c

-- 
2.20.1
[PATCH] quota: check time limit when back out space/inode change
When allocating an inode or space fails, we back out the changes we
have already made. In the special case where the change had pushed
usage over the soft limit, we should also check the time limit and
reset it properly.

Signed-off-by: Chengguang Xu
---
 fs/quota/dquot.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 9d7dfc47c854..58f15a083dd1 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1681,13 +1681,11 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags)
 				if (!dquots[cnt])
 					continue;
 				spin_lock(&dquots[cnt]->dq_dqb_lock);
-				if (reserve) {
-					dquots[cnt]->dq_dqb.dqb_rsvspace -=
-									number;
-				} else {
-					dquots[cnt]->dq_dqb.dqb_curspace -=
-									number;
-				}
+				if (reserve)
+					dquot_free_reserved_space(dquots[cnt],
+								  number);
+				else
+					dquot_decr_space(dquots[cnt], number);
 				spin_unlock(&dquots[cnt]->dq_dqb_lock);
 			}
 			spin_unlock(&inode->i_lock);
@@ -1738,7 +1736,7 @@ int dquot_alloc_inode(struct inode *inode)
 					continue;
 				/* Back out changes we already did */
 				spin_lock(&dquots[cnt]->dq_dqb_lock);
-				dquots[cnt]->dq_dqb.dqb_curinodes--;
+				dquot_decr_inodes(dquots[cnt], 1);
 				spin_unlock(&dquots[cnt]->dq_dqb_lock);
 			}
 			goto warn_put_all;
-- 
2.20.1
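
To make the difference between a bare subtraction and the helper calls in this patch concrete, here is a simplified userspace model of backing out charged space. It is a sketch under stated assumptions, not the kernel implementation: the struct, its field names and model_decr_space() are hypothetical, and it assumes the helper clamps usage at zero and clears the grace timer once usage is back within the soft limit.

```c
#include <stdint.h>

/* Simplified, hypothetical model of the block-usage part of a dquot. */
struct quota_model {
	uint64_t curspace;   /* bytes currently charged */
	uint64_t rsvspace;   /* bytes reserved but not yet charged */
	uint64_t softlimit;  /* soft limit in bytes */
	int64_t  btime;      /* grace-period deadline, 0 when not running */
};

/*
 * Back out "number" bytes of charged space. Unlike a bare subtraction,
 * this also clears the grace timer once usage is back within the soft limit.
 */
static void model_decr_space(struct quota_model *q, uint64_t number)
{
	/* Clamp at zero instead of wrapping around. */
	q->curspace = (q->curspace >= number) ? q->curspace - number : 0;

	if (q->curspace + q->rsvspace <= q->softlimit)
		q->btime = 0;
}
```

With the old open-coded `-=`, a failed allocation that had briefly pushed usage over the soft limit would leave the grace timer running even though usage had returned to its earlier value.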
Re: [PATCH 1/3] dt-bindings: i2c: add optional mul-value property to binding
On Tue, Apr 30, 2019 at 12:32:40PM +0800, Chuanhua Han wrote:
> NXP Layerscape SoC have up to three MUL options available for all
> divider values, we choice of MUL determines the internal monitor rate
> of the I2C bus (SCL and SDA signals):
> A lower MUL value results in a higher sampling rate of the I2C signals.
> A higher MUL value results in a lower sampling rate of the I2C signals.
>
> So in Optional properties we added our custom mul-value property in the
> binding to select which mul option for the device tree i2c controller
> node.
>
> Signed-off-by: Chuanhua Han
> ---
>  Documentation/devicetree/bindings/i2c/i2c-imx.txt | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/i2c/i2c-imx.txt
> b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
> index b967544590e8..ba8e7b7b3fa8 100644
> --- a/Documentation/devicetree/bindings/i2c/i2c-imx.txt
> +++ b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
> @@ -18,6 +18,9 @@ Optional properties:
>  - sda-gpios: specify the gpio related to SDA pin
>  - pinctrl: add extra pinctrl to configure i2c pins to gpio function for i2c
>    bus recovery, call it "gpio" state
> +- mul-value: NXP Layerscape SoC have up to three MUL options available for
> +all I2C divider values, it describes which MUL we choose to use for the driver,
> +the values should be 1,2,4.

Indention is broken.

I wonder why this needs to be configurable on a per-machine/device level.
What is the trade-off?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Re: [PATCH v2 3/4] dt-bindings: pinctrl: meson: Add drive-strength-uA property
Hi Martin,

On 4/27/19 9:21 PM, Martin Blumenstingl wrote:
> Hi Guillaume,
>
> On Thu, Apr 18, 2019 at 2:48 PM Guillaume La Roque
> wrote:
>> Add optional drive-strength-uA property
>>
>> Signed-off-by: Guillaume La Roque
>> ---
>>  Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt
>> b/Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt
>> index a47dd990a8d3..b3e4be696ddc 100644
>> --- a/Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt
>> +++ b/Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt
>> @@ -51,6 +51,9 @@ Configuration nodes support the generic properties "bias-disable",
>>  "bias-pull-up" and "bias-pull-down", described in file pinctrl-bindings.txt
>>
>> +Optional properties :
>> + - drive-strength-uA: Drive strength for the specified pins in uA.
> if you have to re-send this series for whatever reason then please
> mention that drive-strength-uA is only valid for G12A and newer

Thanks for your review. I will do that if I send a new series.

> otherwise:
> Reviewed-by: Martin Blumenstingl
Re: [PATCH] cpufreq: Fix kobject memleak
On Tue, Apr 30, 2019 at 11:35:52AM +0530, Viresh Kumar wrote:
> Currently the error return path from kobject_init_and_add() is not
> followed by a call to kobject_put() - which means we are leaking the
> kobject.
>
> Fix it by adding a call to kobject_put() in the error path of
> kobject_init_and_add().
>
> Signed-off-by: Viresh Kumar
> ---
> Tobin fixed this for schedutil already.

For what it's worth:

Reviewed-by: Tobin C. Harding

Thanks Viresh, one less for me to do!

	Tobin
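
The leak described in the commit message comes from a reference that is taken even when the "add" step fails, so the caller still has to drop it. The following is a small userspace model of that ownership rule, not kernel code. Every name in it (model_kobj, model_put, model_init_and_add, model_setup) is hypothetical and only mirrors the shape of kobject_init_and_add() and kobject_put().

```c
#include <stdlib.h>

/* Hypothetical stand-in for a kobject: a refcount plus a release hook. */
struct model_kobj {
	int refcount;
	void (*release)(struct model_kobj *);
};

static int released; /* counts release calls so the effect is observable */

static void model_release(struct model_kobj *k)
{
	released++;
	free(k);
}

/* Drop one reference; the last put triggers the release hook. */
static void model_put(struct model_kobj *k)
{
	if (--k->refcount == 0)
		k->release(k);
}

/*
 * Models the relevant property of kobject_init_and_add(): the reference
 * is taken even when the "add" step fails, so the caller still owns it.
 */
static int model_init_and_add(struct model_kobj *k, int add_result)
{
	k->refcount = 1;
	k->release = model_release;
	return add_result;
}

/* Returns 0 on success; on failure drops the reference before returning. */
static int model_setup(int add_result, struct model_kobj **out)
{
	struct model_kobj *k = malloc(sizeof(*k));
	int ret;

	if (!k)
		return -1;

	ret = model_init_and_add(k, add_result);
	if (ret) {
		model_put(k); /* without this the object would leak */
		return ret;
	}

	*out = k;
	return 0;
}
```

The point of routing the cleanup through the put rather than a direct free is that the release hook is the one place that knows how to tear the object down.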
Re: [PATCH 2/4] mtd: nand: Move ONFI code into nand/ directory
Hi Shivamurthy,

"Shivamurthy Shastri (sshivamurthy)" wrote on Tue, 26 Mar 2019 10:51:56 +:

> Move generic ONFI code to nand/ directory, which can be used by SPI
> NAND layer.
>
> Signed-off-by: Shivamurthy Shastri

Reviewed-by: Miquel Raynal

Thanks,
Miquèl
Re: [PATCH 1/4] mtd: rawnand: Turn the ONFI support to generic
Hi Shivamurthy,

Sorry for the long delay, I was a bit overloaded.

"Shivamurthy Shastri (sshivamurthy)" wrote on Tue, 26 Mar 2019 10:51:47 +:

> Fix headers to make way for adding helper functions.
>
> Add onfi helper structure.
>
> Add helper functions in raw NAND core, which later will be used during
> ONFI detection.
>

As you are touching the core, I need to identify clearly each change
you make; typically in this commit you do several different changes.
Please split this patch into small meaningful pieces.

> Signed-off-by: Shivamurthy Shastri
> ---
>  drivers/mtd/nand/raw/internals.h |   6 +-
>  drivers/mtd/nand/raw/nand_base.c | 236 ---
>  drivers/mtd/nand/raw/nand_onfi.c | 215 +---
>  include/linux/mtd/nand.h         |  30
>  include/linux/mtd/rawnand.h      |   5 +
>  5 files changed, 289 insertions(+), 203 deletions(-)
>

Thanks,
Miquèl
Re: [PATCH v8] Bluetooth: btqca: inject command complete event during fw download
Hi Matthias,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bluetooth-next/master]
[also build test ERROR on next-20190429]
[cannot apply to v5.1-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Matthias-Kaehlcke/Bluetooth-btqca-inject-command-complete-event-during-fw-download/20190430-125407
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git master
config: xtensa-allyesconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 8.1.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=8.1.0 make.cross ARCH=xtensa

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot

All errors (new ones prefixed by >>):

   drivers/bluetooth/btqca.c: In function 'qca_inject_cmd_complete_event':
>> drivers/bluetooth/btqca.c:286:18: error: 'QCA_HCI_CC_SUCCESS' undeclared (first use in this function); did you mean 'QCA_HCI_CC_OPCODE'?
     skb_put_u8(skb, QCA_HCI_CC_SUCCESS);
                     ^~~~~~~~~~~~~~~~~~
                     QCA_HCI_CC_OPCODE
   drivers/bluetooth/btqca.c:286:18: note: each undeclared identifier is reported only once for each function it appears in

vim +286 drivers/bluetooth/btqca.c

   267
   268  static int qca_inject_cmd_complete_event(struct hci_dev *hdev)
   269  {
   270          struct hci_event_hdr *hdr;
   271          struct hci_ev_cmd_complete *evt;
   272          struct sk_buff *skb;
   273
   274          skb = bt_skb_alloc(sizeof(*hdr) + sizeof(*evt) + 1, GFP_KERNEL);
   275          if (!skb)
   276                  return -ENOMEM;
   277
   278          hdr = skb_put(skb, sizeof(*hdr));
   279          hdr->evt = HCI_EV_CMD_COMPLETE;
   280          hdr->plen = sizeof(*evt) + 1;
   281
   282          evt = skb_put(skb, sizeof(*evt));
   283          evt->ncmd = 1;
   284          evt->opcode = HCI_OP_NOP;
   285
 > 286          skb_put_u8(skb, QCA_HCI_CC_SUCCESS);
   287
   288          hci_skb_pkt_type(skb) = HCI_EVENT_PKT;
   289
   290          return hci_recv_frame(hdev, skb);
   291  }
   292

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
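
For readers trying to picture what the quoted function puts on the wire, here is a userspace sketch that lays out the same bytes in a plain buffer: a two-byte event header, the three-byte command-complete parameters, then one status byte. The MODEL_* names and the builder function are hypothetical. In particular, the value 0x00 for the status byte is an assumption, since QCA_HCI_CC_SUCCESS is exactly the macro the build could not find.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_HCI_EV_CMD_COMPLETE 0x0e   /* HCI Command Complete event code */
#define MODEL_HCI_OP_NOP          0x0000 /* "no operation" opcode */
#define MODEL_CC_SUCCESS          0x00   /* assumed value of the missing macro */

/*
 * Fill buf with the event laid out as in the quoted function:
 * event header, command-complete parameters, then one status byte.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t model_build_cmd_complete(uint8_t *buf, size_t len)
{
	const size_t evt_len = 3; /* ncmd (1 byte) + opcode (2 bytes) */
	size_t n = 0;

	if (len < 2 + evt_len + 1)
		return 0;

	buf[n++] = MODEL_HCI_EV_CMD_COMPLETE;      /* hdr->evt */
	buf[n++] = (uint8_t)(evt_len + 1);         /* hdr->plen */
	buf[n++] = 1;                              /* evt->ncmd */
	buf[n++] = MODEL_HCI_OP_NOP & 0xff;        /* evt->opcode, low byte */
	buf[n++] = (MODEL_HCI_OP_NOP >> 8) & 0xff; /* evt->opcode, high byte */
	buf[n++] = MODEL_CC_SUCCESS;               /* trailing status byte */
	return n;
}
```

The total of six bytes matches the `sizeof(*hdr) + sizeof(*evt) + 1` allocation in the listing, assuming the usual packed layouts of the two HCI structures.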
Re: [tip:sched/urgent] sched/cpufreq: Fix kobject memleak
On Tue, Apr 30, 2019 at 11:26:27AM +0530, Viresh Kumar wrote:
> On 29-04-19, 22:52, tip-bot for Tobin C. Harding wrote:
> > Commit-ID:  8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8
> > Gitweb:     https://git.kernel.org/tip/8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8
> > Author:     Tobin C. Harding
> > AuthorDate: Tue, 30 Apr 2019 10:11:44 +1000
> > Committer:  Ingo Molnar
> > CommitDate: Tue, 30 Apr 2019 06:24:09 +0200
> >
> > sched/cpufreq: Fix kobject memleak
> >
> > Currently the error return path from kobject_init_and_add() is not
> > followed by a call to kobject_put() - which means we are leaking
> > the kobject.
> >
> > Fix it by adding a call to kobject_put() in the error path of
> > kobject_init_and_add().
> >
> > Signed-off-by: Tobin C. Harding
> > Add call to kobject_put() in error path of kobject_init_and_add().
>
> This should have been present before the signed-off ?

Thanks. Some face palm fails on this patch. It's hard to get good help :)

	Tobin
Re: [PATCH v3 3/3] clk: sifive: add a driver for the SiFive FU540 PRCI IP block
Hi Atish,

On Sat, 27 Apr 2019, Atish Patra wrote:

> On 4/11/19 1:28 AM, Paul Walmsley wrote:
> > Add driver code for the SiFive FU540 PRCI IP block.  This IP block
> > handles reset and clock control for the SiFive FU540 device and
> > implements SoC-level clock tree controls and dividers.

[...]

> > +static const struct of_device_id sifive_fu540_prci_of_match[] = {
> > +	{ .compatible = "sifive,fu540-c000-prci", },
>
> All the existing unleashed devices have prci clock compatible string as
> "sifive,aloeprci0" or "sifive,ux00prci0". Should it be added to maintain
> backward compatibility?

As you note, just adding the old (unreviewed) compatible string isn't
enough.

> Even after adding the compatible string (just for my testing purpose), I get
> this while booting.
>
> [0.104571] sifive-fu540-prci 1000.prci: expected only two parent
> clocks, found 1
> [0.112460] sifive-fu540-prci 1000.prci: could not register clocks: -22
> [0.119499] sifive-fu540-prci: probe of 1000.prci failed with error -22
>
> Looking at the DT entries, your DT patch has
>
> +		prci: clock-controller@1000 {
> +			compatible = "sifive,fu540-c000-prci";
> +			reg = <0x0 0x1000 0x0 0x1000>;
> +			clocks = <&hfclk>, <&rtcclk>;
> +			#clock-cells = <1>;
> +		};
>
>
> while current DT from FSBL
> (https://github.com/sifive/freedom-u540-c000-bootloader/blob/master/fsbl/ux00_fsbl.dts)
>
> 	prci: prci@1000 {
> 		compatible = "sifive,aloeprci0", "sifive,ux00prci0";
> 		reg = <0x0 0x1000 0x0 0x1000>;
> 		reg-names = "control";
> 		clocks = <&refclk>;
> 		#clock-cells = <1>;
> 	};
>
> This seems to be the cause of error. It looks like this patch needs a complete
> different DT (your DT patch) than FSBL provides.

That's right.  That old data was completely out of tree and unreviewed.
It's part of the reason why we're going through the process of posting DT
data to the kernel and devicetree lists and getting that data reviewed:

https://lore.kernel.org/linux-riscv/20190411084242.4999-1-paul.walms...@sifive.com/

> This means everybody must upgrade the FSBL to use your DT patch in their
> boards once this driver is merged. Is this okay?

People can continue to use the out-of-tree DT data if they want.  They'll
just have to continue to patch their kernels to add out-of-tree drivers,
as they do now.

Otherwise, if people want to use the upstream PRCI driver in the upstream
kernel, then it's necessary to use DT data that aligns with what's in the
upstream binding documentation.


- Paul
Re: [SOLVED] PROBLEM: Elan touchpad regression on Kernel 5.0.10
Hello,

After a cold restart, this problem seems to have resolved itself on kernel 5.0.10.

Regards,

On Tue, Apr 30, 2019, at 12:21, Outvi V wrote:
> Hello,
>
> [1.] One line summary of the problem: Elan touchpad regression on Kernel
> 5.0.10
>
> [2.] Full description of the problem/report:
> Elan touchpad does not work on 5.0.10 while working on 5.0.9
>
> [3.] Keywords: elan_i2c_core elan i2c touchpad 5.0.10
>
> [4.] Kernel information
> [4.1.] Kernel version:
> Linux version 5.0.10-arch1-1-ARCH (builduser@heftig-2592) (gcc
> version 8.3.0 (GCC)) #1 SMP PREEMPT Sat Apr 27 20:06:45 UTC 2019
> [4.2.] Kernel .config file:
> I'm not sure, but I think it may be referring to
>
> https://git.archlinux.org/svntogit/packages.git/tree/trunk/config?h=packages/linux
> [5.] Most recent kernel version which did not have the bug: 5.0.9
>
> [6.] Output of Oops.. message (if applicable) with symbolic information
> resolved (Not appliable)
> [7.] A small shell script or example program which triggers the
> problem: (Not appliable)
>
> [8.] Environment
> [8.1.]
Software (add the output of the ver_linux script here) > > Linux sheltty 5.0.10-arch1-1-ARCH #1 SMP PREEMPT Sat Apr 27 20:06:45 > UTC 2019 x86_64 GNU/Linux > > GNU C 8.3.0 > GNU Make4.2.1 > Binutils2.32 > Util-linux 2.33.2 > Mount 2.33.2 > Module-init-tools 26 > E2fsprogs 1.45.0 > Jfsutils1.1.15 > Reiserfsprogs 3.6.27 > Xfsprogs4.20.0 > PPP 2.4.7 > Linux C Library 2.29 > Dynamic linker (ldd)2.29 > Linux C++ Library 6.0.25 > Procps 3.3.15 > Kbd 2.0.4 > Console-tools 2.0.4 > Sh-utils8.31 > Udev242 > Modules Loaded 8021q 8250_dw ac ac97_bus acpi_thermal_rel > aesni_intel aes_x86_64 agpgart ahci arc4 atkbd battery bbswitch > bluetooth btbcm btintel btrtl btusb cfg80211 coretemp crc16 > crc32c_generic crc32c_intel crc32_pclmul crct10dif_pclmul cryptd > crypto_simd crypto_user drm drm_kms_helper ecdh_generic elan_i2c evdev > ext4 fat fb_sys_fops fscrypto garp ghash_clmulni_intel glue_helper hid > hid_generic i2c_algo_bit i2c_hid i2c_i801 i8042 i915 idma64 input_leds > int3400_thermal int3403_thermal int340x_thermal_zone intel_cstate > intel_gtt intel_lpss intel_lpss_pci intel_pch_thermal intel_powerclamp > intel_rapl intel_rapl_perf intel_soc_dts_iosf intel_uncore > intel_wmi_thunderbolt ip_tables irqbypass iTCO_vendor_support iTCO_wdt > jbd2 joydev kvm kvmgt kvm_intel ledtrig_audio libahci libata libphy > libps2 llc mac80211 mac_hid mbcache mdev media mei mei_me mousedev mrp > nls_cp437 nls_iso8859_1 pcc_cpufreq processor_thermal_device r8169 > r8822be realtek rfkill rng_core scsi_mod serio serio_raw snd > snd_compress snd_hda_codec snd_hda_codec_generic snd_hda_codec_hdmi > snd_hda_codec_realtek snd_hda_core snd_hda_ext_core snd_hda_intel > snd_hwdep snd_pcm snd_pcm_dmaengine snd_soc_acpi > snd_soc_acpi_intel_match snd_soc_core snd_soc_hdac_hda snd_soc_skl > snd_soc_skl_ipc snd_soc_sst_dsp snd_soc_sst_ipc snd_timer soundcore stp > syscopyarea sysfillrect sysimgblt tpm tpm_crb tpm_tis tpm_tis_core > typec typec_ucsi ucsi_acpi usbhid uvcvideo vfat vfio vfio_iommu_type1 > 
vfio_mdev videobuf2_common videobuf2_memops videobuf2_v4l2 > videobuf2_vmalloc videodev wmi wmi_bmof x86_pkg_temp_thermal xhci_hcd > xhci_pci x_tables > > [8.2.] Processor information (from /proc/cpuinfo): (Maybe not appliable) > [8.3.] Module information (from /proc/modules): > > (Parts related to i2c and elan:) > > i2c_algo_bit 16384 1 i915, Live 0x > i2c_hid 32768 0 - Live 0x > hid 147456 3 hid_generic,usbhid,i2c_hid, Live 0x > elan_i2c 49152 0 - Live 0x > i2c_i801 36864 0 - Live 0x > > [8.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem) > > /proc/ioports: > - : PCI Bus :00 > - : dma1 > - : pic1 > - : iTCO_wdt > - : timer0 > - : timer1 > - : keyboard > - : PNP0C09:00 > - : EC data > - : keyboard > - : PNP0C09:00 > - : EC cmd > - : rtc0 > - : dma page reg > - : pic2 > - : dma2 > - : fpu > - : PNP0C04:00 > - : iTCO_wdt > - : pnp 00:02 > - : PCI conf1 > - : PCI Bus :00 > - : pnp 00:02 > - : pnp 00:00 > - : ACPI PM1a_EVT_BLK > - : ACPI PM1a_CNT_BLK > - : ACPI PM_TMR > - : ACPI CPU throttle > - : ACPI PM2_CNT_BLK > - : pnp 00:04 > - : ACPI GPE0_BLK > - : pnp 00:01 > - : PCI Bus :08 > - : :08:00.0 > -0
Re: [PATCH v3 3/4] Documentation: devicetree: add PPMU events description
Hi Lukasz,

On 19. 4. 19. 10:48 PM, Lukasz Luba wrote:
> Extend the documenation by events description with new 'event-data-type'
> field. Add example how the event might be defined in DT.
>
> Signed-off-by: Lukasz Luba
> ---
>  .../devicetree/bindings/devfreq/event/exynos-ppmu.txt | 18 ++
>  1 file changed, 18 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt
> b/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt
> index 3e36c1d..47feb5f 100644
> --- a/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt
> +++ b/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt
> @@ -145,3 +145,21 @@ Example3 : PPMUv2 nodes in exynos5433.dtsi are listed below.
>  			reg = <0x104d 0x2000>;
>  			status = "disabled";
>  		};
> +
> +The 'event' type specified in the PPMU node defines 'event-name'
> +which also contains 'id' number and optionally 'event-data-type'.
> +
> +Example:
> +
> +	events {
> +		ppmu_leftbus_0: ppmu-event0-leftbus {
> +			event-name = "ppmu-event0-leftbus";
> +			event-data-type = ;
> +		};
> +	};
> +
> +The 'event-data-type' defines the type of data which shell be counted
> +by the counter. You can check include/dt-bindings/pmu/exynos_ppmu.h for
> +all possible type, i.e. count read requests, count write data in bytes,
> +etc. This field is optional and when it is missing, the driver code will
> +use default data type.
>

How about editing it as follows?

--- a/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt
+++ b/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt
@@ -10,14 +10,23 @@ The Exynos PPMU driver uses the devfreq-event class to provide event data
 to various devfreq devices. The devfreq devices would use the event data when
 derterming the current state of each IP.

-Required properties:
+Required properties for PPMU device:
 - compatible: Should be "samsung,exynos-ppmu" or "samsung,exynos-ppmu-v2.
 - reg: physical base address of each PPMU and length of memory mapped region.

-Optional properties:
+Optional properties for PPMU device:
 - clock-names : the name of clock used by the PPMU, "ppmu"
 - clocks : phandles for clock specified in "clock-names" property

+Required properties for 'events' child node of PPMU device:
+- event-name : the unique event name among PPMU device
+Optional properties for 'events' child node of PPMU device:
+- event-data-type : Define the type of data which shell be counted
+by the counter. You can check include/dt-bindings/pmu/exynos_ppmu.h for
+all possible type, i.e. count read requests, count write data in bytes,
+etc. This field is optional and when it is missing, the driver code
+will use default data type.
+
 Example1 : PPMUv1 nodes in exynos3250.dtsi are listed below.

 		ppmu_dmc0: ppmu_dmc0@106a {
@@ -145,3 +154,16 @@ Example3 : PPMUv2 nodes in exynos5433.dtsi are listed below.
 			reg = <0x104d 0x2000>;
 			status = "disabled";
 		};
+
+Example4 : 'event-data-type' in exynos4412-ppmu-common.dtsi are listed below.
+
+	&ppmu_dmc0 {
+		status = "okay";
+		events {
+			ppmu_dmc0_3: ppmu-event3-dmc0 {
+			event-name = "ppmu-event3-dmc0";
+			event-data-type = <(PPMU_RO_DATA_CNT |
+					PPMU_WO_DATA_CNT)>;
+			};
+		};
+	};

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics
[PATCH] ALSA: hda: check RIRB to avoid use NULL pointer
From: Liwei Song

Fix the following BUG:

BUG: unable to handle kernel NULL pointer dereference at 000c
Workqueue: events azx_probe_work [snd_hda_intel]
RIP: 0010:snd_hdac_bus_update_rirb+0x80/0x160 [snd_hda_core]
Call Trace:
 azx_interrupt+0x78/0x140 [snd_hda_codec]
 __handle_irq_event_percpu+0x49/0x300
 handle_irq_event_percpu+0x23/0x60
 handle_irq_event+0x3c/0x60
 handle_edge_irq+0xdb/0x180
 handle_irq+0x23/0x30
 do_IRQ+0x6a/0x140
 common_interrupt+0xf/0xf

The call trace was seen when running kdump on a system with an NFS
rootfs. The following calling sequence occurs when the second kernel
boots:

azx_first_init()
   --> azx_acquire_irq()
                      <-- interrupt comes in, azx_interrupt() is called
   --> hda_intel_init_chip()
      --> azx_init_chip()
         --> snd_hdac_bus_init_chip()
            --> snd_hdac_bus_init_cmd_io();
               --> init rirb.buf and corb.buf

The interrupt arrives after azx_acquire_irq() while the RIRB is still
uninitialized, so a NULL pointer is used when the interrupt is
processed.

Check that the RIRB buffer is not NULL before using it, so that this
special case cannot hang the system.

Fixes: 14752412721c ("ALSA: hda - Add the controller helper codes to hda-core module")
Signed-off-by: Liwei Song
---
 sound/hda/hdac_controller.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c
index 74244d8e2909..2f0fa5353361 100644
--- a/sound/hda/hdac_controller.c
+++ b/sound/hda/hdac_controller.c
@@ -195,6 +195,9 @@ void snd_hdac_bus_update_rirb(struct hdac_bus *bus)
 		return;
 	bus->rirb.wp = wp;
 
+	if (!bus->rirb.buf)
+		return;
+
 	while (bus->rirb.rp != wp) {
 		bus->rirb.rp++;
 		bus->rirb.rp %= AZX_MAX_RIRB_ENTRIES;
--
2.7.4
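The guard added by this patch can be illustrated outside the kernel. What follows is a minimal userspace sketch under stated assumptions, not the driver code: `struct ring`, `ring_drain()` and `RING_ENTRIES` are hypothetical stand-ins for `bus->rirb`, `snd_hdac_bus_update_rirb()` and `AZX_MAX_RIRB_ENTRIES`. It only shows why returning early on a NULL buffer makes an interrupt that fires before initialization harmless.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_ENTRIES 256 /* hypothetical stand-in for AZX_MAX_RIRB_ENTRIES */

/* Hypothetical model of a response ring; buf stays NULL until init runs. */
struct ring {
	uint32_t *buf;   /* response storage, NULL before initialization */
	unsigned int rp; /* read pointer (last entry consumed) */
	unsigned int wp; /* write pointer reported by the "hardware" */
};

/*
 * Consume entries up to hw_wp (must be < RING_ENTRIES) and return how
 * many were read. Mirrors the order in the patch: record wp first, then
 * bail out if the buffer has not been set up yet.
 */
static unsigned int ring_drain(struct ring *r, unsigned int hw_wp,
			       uint32_t *sum)
{
	unsigned int consumed = 0;

	r->wp = hw_wp;

	if (!r->buf) /* interrupt arrived before init: nothing to read */
		return 0;

	while (r->rp != r->wp) {
		r->rp++;
		r->rp %= RING_ENTRIES;
		*sum += r->buf[r->rp];
		consumed++;
	}
	return consumed;
}
```

Calling `ring_drain()` on a ring whose `buf` is still NULL returns 0 and leaves the read pointer untouched; once `buf` points at real storage, the same call walks the entries as usual.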
Re: [PATCH 0/7] introduce cpu.headroom knob to cpu controller
> On Apr 29, 2019, at 8:24 AM, Vincent Guittot > wrote: > > Hi Song, > > On Sun, 28 Apr 2019 at 21:47, Song Liu wrote: >> >> Hi Morten and Vincent, >> >>> On Apr 22, 2019, at 6:22 PM, Song Liu wrote: >>> >>> Hi Vincent, >>> On Apr 17, 2019, at 5:56 AM, Vincent Guittot wrote: On Wed, 10 Apr 2019 at 21:43, Song Liu wrote: > > Hi Morten, > >> On Apr 10, 2019, at 4:59 AM, Morten Rasmussen >> wrote: >> >> >> The bit that isn't clear to me, is _why_ adding idle cycles helps your >> workload. I'm not convinced that adding headroom gives any latency >> improvements beyond watering down the impact of your side jobs. AFAIK, > > We think the latency improvements actually come from watering down the > impact of side jobs. It is not just statistically improving average > latency numbers, but also reduces resource contention caused by the side > workload. I don't know whether it is from reducing contention of ALUs, > memory bandwidth, CPU caches, or something else, but we saw reduced > latencies when headroom is used. > >> the throttling mechanism effectively removes the throttled tasks from >> the schedule according to a specific duty cycle. When the side job is >> not throttled the main workload is experiencing the same latency issues >> as before, but by dynamically tuning the side job throttling you can >> achieve a better average latency. Am I missing something? >> >> Have you looked at your distribution of main job latency and tried to >> compare with when throttling is active/not active? > > cfs_bandwidth adjusts allowed runtime for each task_group each period > (configurable, 100ms by default). cpu.headroom logic applies gentle > throttling, so that the side workload gets some runtime in every period. > Therefore, if we look at time window equal to or bigger than 100ms, we > don't really see "throttling active time" vs. "throttling inactive time". 
> >> >> I'm wondering if the headroom solution is really the right solution for >> your use-case or if what you are really after is something which is >> lower priority than just setting the weight to 1. Something that > > The experiments show that, cpu.weight does proper work for priority: the > main workload gets priority to use the CPU; while the side workload only > fill the idle CPU. However, this is not sufficient, as the side workload > creates big enough contention to impact the main workload. > >> (nearly) always gets pre-empted by your main job (SCHED_BATCH and >> SCHED_IDLE might not be enough). If your main job consist >> of lots of relatively short wake-ups things like the min_granularity >> could have significant latency impact. > > cpu.headroom gives benefits in addition to optimizations in pre-empt > side. By maintaining some idle time, fewer pre-empt actions are > necessary, thus the main workload will get better latency. I agree with Morten's proposal, SCHED_IDLE should help your latency problem because side job will be directly preempted unlike normal cfs task even lowest priority. In addition to min_granularity, sched_period also has an impact on the time that a task has to wait before preempting the running task. Also, some sched_feature like GENTLE_FAIR_SLEEPERS can also impact the latency of a task. It would be nice to know if the latency problem comes from contention on cache resources or if it's mainly because you main load waits before running on a CPU Regards, Vincent >>> >>> Thanks for these suggestions. Here are some more tests to show the impact >>> of scheduler knobs and cpu.headroom. 
>>>
>>> side-load | cpu.headroom | side/cpu.weight | min_gran | cpu-idle | main/latency
>>> ----------+--------------+-----------------+----------+----------+-------------
>>> none      |      0       | n/a             |   1 ms   |  45.20%  | 1.00
>>> ffmpeg    |      0       | 1               |  10 ms   |   3.38%  | 1.46
>>> ffmpeg    |      0       | SCHED_IDLE      |   1 ms   |   5.69%  | 1.42
>>> ffmpeg    |     20%      | SCHED_IDLE      |   1 ms   |  19.00%  | 1.13
>>> ffmpeg    |     30%      | SCHED_IDLE      |   1 ms   |  27.60%  | 1.08
>>>
>>> In all these cases, the main workload is loaded with same level of
>>> traffic (request per second). Main workload latency numbers are normalized
>>> based on the baseline (first row).
>>>
>>> For the baseline, the main workload runs without any side workload, the
>>> system has about 45.20% idle CPU.
>>>
>>> The next two rows compare the impact of scheduling knobs cpu.weight and
>>> sched_min_granularity. With cpu.weight of 1 and min_granularity of 10ms,
>>> we see a latency of 1.46; with SCHED_IDLE and min_granularity of 1ms, we
>>> see a latency of 1.42. So
Re: [PATCH v3 4/4] DT: arm: exynos4412: add event data type which is monitored
Hi, On 19. 4. 19. 10:48 PM, Lukasz Luba wrote: > The patch adds new field in the PPMU event which shows explicitly > what kind of data the event is monitoring. It is possible to change it > using defined values in exynos_ppmu.h file. > > Signed-off-by: Lukasz Luba > --- > arch/arm/boot/dts/exynos4412-ppmu-common.dtsi | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/arch/arm/boot/dts/exynos4412-ppmu-common.dtsi > b/arch/arm/boot/dts/exynos4412-ppmu-common.dtsi > index 3a3b2fa..549faba 100644 > --- a/arch/arm/boot/dts/exynos4412-ppmu-common.dtsi > +++ b/arch/arm/boot/dts/exynos4412-ppmu-common.dtsi > @@ -6,12 +6,16 @@ > * Author: Chanwoo Choi > */ > > +#include > + > &ppmu_dmc0 { > status = "okay"; > > events { > ppmu_dmc0_3: ppmu-event3-dmc0 { > event-name = "ppmu-event3-dmc0"; > +event-data-type = <(PPMU_RO_DATA_CNT | > +PPMU_WO_DATA_CNT)>; > }; > }; > }; > @@ -22,6 +26,8 @@ > events { > ppmu_dmc1_3: ppmu-event3-dmc1 { > event-name = "ppmu-event3-dmc1"; > +event-data-type = <(PPMU_RO_DATA_CNT | > +PPMU_WO_DATA_CNT)>; > }; > }; > }; > @@ -32,6 +38,8 @@ > events { > ppmu_leftbus_3: ppmu-event3-leftbus { > event-name = "ppmu-event3-leftbus"; > +event-data-type = <(PPMU_RO_DATA_CNT | > +PPMU_WO_DATA_CNT)>; > }; > }; > }; > @@ -42,6 +50,8 @@ > events { > ppmu_rightbus_3: ppmu-event3-rightbus { > event-name = "ppmu-event3-rightbus"; > +event-data-type = <(PPMU_RO_DATA_CNT | > +PPMU_WO_DATA_CNT)>; > }; > }; > }; > Acked-by: Chanwoo Choi -- Best Regards, Chanwoo Choi Samsung Electronics
[PATCH] ARM: dts: dra76x: Update MMC2_HS200_MANUAL1 iodelay values
Update the MMC2_HS200_MANUAL1 iodelay values to match with the latest dra76x data manual[1]. Also this particular pinctrl-array is using spaces instead of tabs for spacing between the values and the comments. Fix this as well. [1] http://www.ti.com/lit/ds/symlink/dra76p.pdf Signed-off-by: Faiz Abbas --- Tested on dra76x-evm and am574x-idk. arch/arm/boot/dts/dra76x-mmc-iodelay.dtsi | 40 +++ 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/arch/arm/boot/dts/dra76x-mmc-iodelay.dtsi b/arch/arm/boot/dts/dra76x-mmc-iodelay.dtsi index baba7b00eca7..fdca48186916 100644 --- a/arch/arm/boot/dts/dra76x-mmc-iodelay.dtsi +++ b/arch/arm/boot/dts/dra76x-mmc-iodelay.dtsi @@ -22,7 +22,7 @@ * * Datamanual Revisions: * - * DRA76x Silicon Revision 1.0: SPRS993A, Revised July 2017 + * DRA76x Silicon Revision 1.0: SPRS993E, Revised December 2018 * */ @@ -169,25 +169,25 @@ /* Corresponds to MMC2_HS200_MANUAL1 in datamanual */ mmc2_iodelay_hs200_conf: mmc2_iodelay_hs200_conf { pinctrl-pin-array = < - 0x190 A_DELAY_PS(384) G_DELAY_PS(0) /* CFG_GPMC_A19_OEN */ - 0x194 A_DELAY_PS(0) G_DELAY_PS(174) /* CFG_GPMC_A19_OUT */ - 0x1a8 A_DELAY_PS(410) G_DELAY_PS(0) /* CFG_GPMC_A20_OEN */ - 0x1ac A_DELAY_PS(85) G_DELAY_PS(0)/* CFG_GPMC_A20_OUT */ - 0x1b4 A_DELAY_PS(468) G_DELAY_PS(0) /* CFG_GPMC_A21_OEN */ - 0x1b8 A_DELAY_PS(139) G_DELAY_PS(0) /* CFG_GPMC_A21_OUT */ - 0x1c0 A_DELAY_PS(676) G_DELAY_PS(0) /* CFG_GPMC_A22_OEN */ - 0x1c4 A_DELAY_PS(69) G_DELAY_PS(0)/* CFG_GPMC_A22_OUT */ - 0x1d0 A_DELAY_PS(1062) G_DELAY_PS(154)/* CFG_GPMC_A23_OUT */ - 0x1d8 A_DELAY_PS(640) G_DELAY_PS(0) /* CFG_GPMC_A24_OEN */ - 0x1dc A_DELAY_PS(0) G_DELAY_PS(0) /* CFG_GPMC_A24_OUT */ - 0x1e4 A_DELAY_PS(356) G_DELAY_PS(0) /* CFG_GPMC_A25_OEN */ - 0x1e8 A_DELAY_PS(0) G_DELAY_PS(0) /* CFG_GPMC_A25_OUT */ - 0x1f0 A_DELAY_PS(579) G_DELAY_PS(0) /* CFG_GPMC_A26_OEN */ - 0x1f4 A_DELAY_PS(0) G_DELAY_PS(0) /* CFG_GPMC_A26_OUT */ - 0x1fc A_DELAY_PS(435) G_DELAY_PS(0) /* CFG_GPMC_A27_OEN */ - 0x200 A_DELAY_PS(36) 
G_DELAY_PS(0)/* CFG_GPMC_A27_OUT */ - 0x364 A_DELAY_PS(759) G_DELAY_PS(0) /* CFG_GPMC_CS1_OEN */ - 0x368 A_DELAY_PS(72) G_DELAY_PS(0)/* CFG_GPMC_CS1_OUT */ + 0x190 A_DELAY_PS(384) G_DELAY_PS(0) /* CFG_GPMC_A19_OEN */ + 0x194 A_DELAY_PS(350) G_DELAY_PS(174) /* CFG_GPMC_A19_OUT */ + 0x1a8 A_DELAY_PS(410) G_DELAY_PS(0) /* CFG_GPMC_A20_OEN */ + 0x1ac A_DELAY_PS(335) G_DELAY_PS(0) /* CFG_GPMC_A20_OUT */ + 0x1b4 A_DELAY_PS(468) G_DELAY_PS(0) /* CFG_GPMC_A21_OEN */ + 0x1b8 A_DELAY_PS(339) G_DELAY_PS(0) /* CFG_GPMC_A21_OUT */ + 0x1c0 A_DELAY_PS(676) G_DELAY_PS(0) /* CFG_GPMC_A22_OEN */ + 0x1c4 A_DELAY_PS(219) G_DELAY_PS(0) /* CFG_GPMC_A22_OUT */ + 0x1d0 A_DELAY_PS(1062) G_DELAY_PS(154) /* CFG_GPMC_A23_OUT */ + 0x1d8 A_DELAY_PS(640) G_DELAY_PS(0) /* CFG_GPMC_A24_OEN */ + 0x1dc A_DELAY_PS(150) G_DELAY_PS(0) /* CFG_GPMC_A24_OUT */ + 0x1e4 A_DELAY_PS(356) G_DELAY_PS(0) /* CFG_GPMC_A25_OEN */ + 0x1e8 A_DELAY_PS(150) G_DELAY_PS(0) /* CFG_GPMC_A25_OUT */ + 0x1f0 A_DELAY_PS(579) G_DELAY_PS(0) /* CFG_GPMC_A26_OEN */ + 0x1f4 A_DELAY_PS(200) G_DELAY_PS(0) /* CFG_GPMC_A26_OUT */ + 0x1fc A_DELAY_PS(435) G_DELAY_PS(0) /* CFG_GPMC_A27_OEN */ + 0x200 A_DELAY_PS(236) G_DELAY_PS(0) /* CFG_GPMC_A27_OUT */ + 0x364 A_DELAY_PS(759) G_DELAY_PS(0) /* CFG_GPMC_CS1_OEN */ + 0x368 A_DELAY_PS(372) G_DELAY_PS(0) /* CFG_GPMC_CS1_OUT */ >; }; -- 2.19.2
[PATCH] cpufreq: Fix kobject memleak
Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add(). Signed-off-by: Viresh Kumar --- Tobin fixed this for schedutil already. drivers/cpufreq/cpufreq.c | 1 + drivers/cpufreq/cpufreq_governor.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index e10922709d13..bbf79544d0ad 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1098,6 +1098,7 @@ static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu) cpufreq_global_kobject, "policy%u", cpu); if (ret) { pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret); + kobject_put(&policy->kobj); goto err_free_real_cpus; } diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c index ffa9adeaba31..9d1d9bf02710 100644 --- a/drivers/cpufreq/cpufreq_governor.c +++ b/drivers/cpufreq/cpufreq_governor.c @@ -459,6 +459,8 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy) /* Failure, so roll back. */ pr_err("initialization failed (dbs_data kobject init error %d)\n", ret); + kobject_put(&dbs_data->attr_set.kobj); + policy->governor_data = NULL; if (!have_governor_per_policy()) -- 2.21.0.rc0.269.g1a574e7a288b
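The rule behind this fix is that the initial reference is held even when the add step fails, so the error path has to drop it. The following is a small userspace sketch of that pattern, not the kobject API: `struct obj`, `obj_init_and_add()`, `obj_put()` and `obj_create()` are hypothetical names invented for illustration, and `released_count` exists only so the effect can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model of a refcounted object; not the kobject API. */
struct obj {
	int refcount;
	void (*release)(struct obj *o);
};

static int released_count; /* counts release() calls, for demonstration */

static void obj_release(struct obj *o)
{
	released_count++;
	free(o);
}

/* Takes the initial reference, then "registers"; registration may fail. */
static int obj_init_and_add(struct obj *o, int fail_add)
{
	o->refcount = 1; /* the reference is held even if the add step fails */
	o->release = obj_release;
	return fail_add ? -1 : 0;
}

/* Drops one reference and runs release() when the count reaches zero. */
static void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		o->release(o);
}

/* Returns the object on success, NULL on failure; no leak either way. */
static struct obj *obj_create(int fail_add)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	if (obj_init_and_add(o, fail_add)) {
		obj_put(o); /* the step the patch adds: drop the initial ref */
		return NULL;
	}
	return o;
}
```

Without the `obj_put()` call in the failure branch, the object allocated in `obj_create()` would never reach `obj_release()`, which is the leak described above.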
Re: [PATCH v7 11/14] irqchip: ti-sci-inta: Add support for Interrupt Aggregator driver
On 29/04/19 6:41 PM, Marc Zyngier wrote: > On 20/04/2019 11:09, Lokesh Vutla wrote: >> Texas Instruments' K3 generation SoCs has an IP Interrupt Aggregator >> which is an interrupt controller that does the following: >> - Converts events to interrupts that can be understood by >> an interrupt router. >> - Allows for multiplexing of events to interrupts. >> >> Configuration of the interrupt aggregator registers can only be done by >> a system co-processor and the driver needs to send a message to this >> co processor over TISCI protocol. This patch adds support for Interrupt >> Aggregator irqdomain. >> >> Signed-off-by: Lokesh Vutla >> --- >> Changes since v6: >> - Updated commit message. >> - Arranged header files in alphabetical order >> - Included vint_bit in struct ti_sci_inta_event_desc >> - With the above change now the chip_data is event_desc instead of vint_desc >> - No loops are used in atomic contexts. >> - Fixed locking issue while freeing parent virq >> - Fixed few other cosmetic changes. >> >> MAINTAINERS | 1 + >> drivers/irqchip/Kconfig | 11 + >> drivers/irqchip/Makefile | 1 + >> drivers/irqchip/irq-ti-sci-inta.c | 589 ++ >> 4 files changed, 602 insertions(+) >> create mode 100644 drivers/irqchip/irq-ti-sci-inta.c >> > > [...] > >> +/** >> + * ti_sci_inta_alloc_irq() - Allocate an irq within INTA domain >> + * @domain: irq_domain pointer corresponding to INTA >> + * @hwirq: hwirq of the input event >> + * >> + * Note: Allocation happens in the following manner: >> + * - Find a free bit available in any of the vints available in the list. >> + * - If not found, allocate a vint from the vint pool >> + * - Attach the free bit to input hwirq. >> + * Return event_desc if all went ok else appropriate error value. 
>> + */ >> +static struct ti_sci_inta_event_desc *ti_sci_inta_alloc_irq(struct >> irq_domain *domain, >> +u32 hwirq) >> +{ >> +struct ti_sci_inta_irq_domain *inta = domain->host_data; >> +struct ti_sci_inta_vint_desc *vint_desc = NULL; >> +u16 free_bit; >> + >> +mutex_lock(&inta->vint_mutex); >> +list_for_each_entry(vint_desc, &inta->vint_list, list) { >> +mutex_lock(&vint_desc->event_mutex); >> +free_bit = find_first_zero_bit(vint_desc->event_map, >> + MAX_EVENTS_PER_VINT); >> +if (free_bit != MAX_EVENTS_PER_VINT) { >> +set_bit(free_bit, vint_desc->event_map); >> +mutex_unlock(&vint_desc->event_mutex); >> +mutex_unlock(&inta->vint_mutex); >> +goto alloc_event; >> +} >> +mutex_unlock(&vint_desc->event_mutex); >> +} >> +mutex_unlock(&inta->vint_mutex); >> + >> +/* No free bits available. Allocate a new vint */ >> +vint_desc = ti_sci_inta_alloc_parent_irq(domain); >> +if (IS_ERR(vint_desc)) >> +return ERR_PTR(PTR_ERR(vint_desc)); >> + >> +mutex_lock(&vint_desc->event_mutex); >> +free_bit = find_first_zero_bit(vint_desc->event_map, >> + MAX_EVENTS_PER_VINT); >> +set_bit(free_bit, vint_desc->event_map); >> +mutex_unlock(&vint_desc->event_mutex); > > This code is still quite racy: you can have two parallel allocations > failing to get a free bit in any of the already allocated vint_desc, and > then both allocating a new vint_desc. If there was only one left, one of > the allocation will fail despite having at least 63 free interrupts. Good point. After thinking about it a bit more, I see a similar issue when two parallel frees happen on a vint with only 2 bits allocated. The first free, while freeing the parent_irq, might see all the bits cleared and do kfree(vint). The second free will then crash when freeing the parent irq. I'll guard the entire allocation and freeing with vint_mutex and drop the event_mutex altogether. Thanks and regards, Lokesh > > M. >
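The single-lock approach discussed in this reply can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the driver: `struct vint_pool`, `alloc_event()`, `free_event()` and `MAX_VINTS` are hypothetical, a pthread mutex plays the role of vint_mutex, and 64 events per vint is inferred from the "63 free interrupts" remark in the review.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define EVENTS_PER_VINT 64 /* assumed from "63 free interrupts" in the review */
#define MAX_VINTS 2        /* hypothetical pool size for the sketch */

/* Hypothetical model: each vint tracks its events in a 64-bit map. */
struct vint_pool {
	pthread_mutex_t lock; /* plays the role of vint_mutex */
	uint64_t event_map[MAX_VINTS];
	int nr_vints; /* vints handed out so far */
};

static struct vint_pool pool = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Returns a global event index, or -1 when every vint is full. */
static int alloc_event(struct vint_pool *p)
{
	int ret = -1;

	pthread_mutex_lock(&p->lock);
	/*
	 * The search and the "allocate a new vint" step share one lock, so
	 * two callers cannot both conclude that a new vint is needed.
	 */
	for (int v = 0; v < p->nr_vints && ret < 0; v++) {
		for (int bit = 0; bit < EVENTS_PER_VINT; bit++) {
			if (!(p->event_map[v] & (UINT64_C(1) << bit))) {
				p->event_map[v] |= UINT64_C(1) << bit;
				ret = v * EVENTS_PER_VINT + bit;
				break;
			}
		}
	}
	if (ret < 0 && p->nr_vints < MAX_VINTS) {
		int v = p->nr_vints++;

		p->event_map[v] = 1; /* take bit 0 of the new vint */
		ret = v * EVENTS_PER_VINT;
	}
	pthread_mutex_unlock(&p->lock);
	return ret;
}

/* Clears an event and releases trailing vints that became empty. */
static void free_event(struct vint_pool *p, int event)
{
	int v = event / EVENTS_PER_VINT;

	pthread_mutex_lock(&p->lock);
	p->event_map[v] &= ~(UINT64_C(1) << (event % EVENTS_PER_VINT));
	/* The emptiness check is atomic with the clear, so only one
	 * caller can observe "all bits cleared" and release the vint. */
	while (p->nr_vints > 0 && p->event_map[p->nr_vints - 1] == 0)
		p->nr_vints--;
	pthread_mutex_unlock(&p->lock);
}
```

Holding one lock across both the bitmap update and the decision to create or release a vint is what removes the two races described in the thread; the cost is that allocations on different vints no longer proceed in parallel.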
[tip:sched/urgent] sched/cpufreq: Fix kobject memleak
Commit-ID: 9a4f26cc98d81b67ecc23b890c28e2df324e29f3 Gitweb: https://git.kernel.org/tip/9a4f26cc98d81b67ecc23b890c28e2df324e29f3 Author: Tobin C. Harding AuthorDate: Tue, 30 Apr 2019 10:11:44 +1000 Committer: Ingo Molnar CommitDate: Tue, 30 Apr 2019 07:57:23 +0200 sched/cpufreq: Fix kobject memleak Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add(). Signed-off-by: Tobin C. Harding Cc: Greg Kroah-Hartman Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rafael J. Wysocki Cc: Thomas Gleixner Cc: Tobin C. Harding Cc: Vincent Guittot Cc: Viresh Kumar Link: http://lkml.kernel.org/r/20190430001144.24890-1-to...@kernel.org Signed-off-by: Ingo Molnar --- kernel/sched/cpufreq_schedutil.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 5c41ea367422..3638d2377e3c 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -771,6 +771,7 @@ out: return 0; fail: + kobject_put(&tunables->attr_set.kobj); policy->governor_data = NULL; sugov_tunables_free(tunables);
Re: [PATCH v3 1/3] clk: analogbits: add Wide-Range PLL library
On Mon, 29 Apr 2019, Stephen Boyd wrote: > Quoting Paul Walmsley (2019-04-29 12:42:07) > > On Fri, 26 Apr 2019, Paul Walmsley wrote: > > > On Fri, 26 Apr 2019, Stephen Boyd wrote: > > > > > > > Quoting Paul Walmsley (2019-04-11 01:27:32) > > > > > Add common library code for the Analog Bits Wide-Range PLL (WRPLL) IP > > > > > block, as implemented in TSMC CLN28HPC. > > > > > > > > I haven't deeply reviewed at all, but I already get two problems when > > > > compile testing these patches. I can fix them up if nothing else needs > > > > fixing. > > > > > > > > drivers/clk/analogbits/wrpll-cln28hpc.c:165 __wrpll_calc_divq() warn: > > > > should 'target_rate << divq' be a 64 bit type? > > > > drivers/clk/sifive/fu540-prci.c:214:16: error: return expression in > > > > void function > > > > > > Hmm, that's odd. I will definitely take a look and repost. > > > > I'm not able to reproduce these problems. The configs tried here were: > > > > - 64-bit RISC-V defconfig w/ PRCI driver enabled (gcc 8.2.0 built with > > crosstool-NG 1.24.0) > > > > - 32-bit ARM defconfig w/ PRCI driver enabled (gcc 8.3.0 built with > > crosstool-NG 1.24.0) > > > > - 32-bit i386 defconfig w/ PRCI driver enabled (gcc > > 5.4.0-6ubuntu1~16.04.11) > > > > Could you post the toolchain and kernel config you're using? > > > > I'm running sparse and smatch too. OK. I was able to reproduce the __wrpll_calc_divq() warning. It's been resolved in the upcoming revision. But I don't see the second error with either sparse or smatch. (This is with sparse at commit 2b96cd804dc7 and smatch at commit f0092daff69d.) - Paul
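The smatch warning quoted in this thread is about the width in which a shift is evaluated. A minimal sketch, assuming `target_rate` is a 32-bit value (the helper names below are hypothetical and are not taken from the WRPLL code), shows the difference between shifting first and widening first:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the shift is evaluated in 32 bits, so any bits
 * pushed past bit 31 are lost before the result is widened. */
static uint64_t scaled_rate_narrow(uint32_t target_rate, unsigned int divq)
{
	return (uint32_t)(target_rate << divq);
}

/* Hypothetical helper: widen first so the shift is evaluated in 64 bits. */
static uint64_t scaled_rate_wide(uint32_t target_rate, unsigned int divq)
{
	return (uint64_t)target_rate << divq;
}
```

For a 3 GHz rate and `divq` of 1, the narrow variant wraps modulo 2^32 while the wide variant yields the full 6 GHz value, which is the kind of silent truncation the checker is pointing at.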
Re: [tip:sched/urgent] sched/cpufreq: Fix kobject memleak
On 29-04-19, 22:52, tip-bot for Tobin C. Harding wrote: > Commit-ID: 8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8 > Gitweb: > https://git.kernel.org/tip/8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8 > Author: Tobin C. Harding > AuthorDate: Tue, 30 Apr 2019 10:11:44 +1000 > Committer: Ingo Molnar > CommitDate: Tue, 30 Apr 2019 06:24:09 +0200 > > sched/cpufreq: Fix kobject memleak > > Currently the error return path from kobject_init_and_add() is not > followed by a call to kobject_put() - which means we are leaking > the kobject. > > Fix it by adding a call to kobject_put() in the error path of > kobject_init_and_add(). > > Signed-off-by: Tobin C. Harding > Add call to kobject_put() in error path of kobject_init_and_add(). This should have been present before the signed-off ? > Cc: Greg Kroah-Hartman > Cc: Linus Torvalds > Cc: Peter Zijlstra > Cc: Rafael J. Wysocki > Cc: Thomas Gleixner > Cc: Tobin C. Harding > Cc: Vincent Guittot > Cc: Viresh Kumar > Link: http://lkml.kernel.org/r/20190430001144.24890-1-to...@kernel.org > Signed-off-by: Ingo Molnar > --- > kernel/sched/cpufreq_schedutil.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index 5c41ea367422..3638d2377e3c 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -771,6 +771,7 @@ out: > return 0; > > fail: > + kobject_put(&tunables->attr_set.kobj); > policy->governor_data = NULL; > sugov_tunables_free(tunables); > -- viresh
Re: linux-next: build warning after merge of the clk tree
Hi Anson, On Tue, 30 Apr 2019 01:44:58 + Anson Huang wrote: > > Thanks for notice. > As it is intentional, I will send out a patch to add "/* fall through > */" to avoid this build warning, Excellent, thanks. -- Cheers, Stephen Rothwell
[tip:sched/urgent] sched/cpufreq: Fix kobject memleak
Commit-ID: 8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8 Gitweb: https://git.kernel.org/tip/8bf7ab9c79f3d1a5f02ebac369f656de9ec0aca8 Author: Tobin C. Harding AuthorDate: Tue, 30 Apr 2019 10:11:44 +1000 Committer: Ingo Molnar CommitDate: Tue, 30 Apr 2019 06:24:09 +0200 sched/cpufreq: Fix kobject memleak Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add(). Signed-off-by: Tobin C. Harding Add call to kobject_put() in error path of kobject_init_and_add(). Cc: Greg Kroah-Hartman Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rafael J. Wysocki Cc: Thomas Gleixner Cc: Tobin C. Harding Cc: Vincent Guittot Cc: Viresh Kumar Link: http://lkml.kernel.org/r/20190430001144.24890-1-to...@kernel.org Signed-off-by: Ingo Molnar --- kernel/sched/cpufreq_schedutil.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 5c41ea367422..3638d2377e3c 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -771,6 +771,7 @@ out: return 0; fail: + kobject_put(&tunables->attr_set.kobj); policy->governor_data = NULL; sugov_tunables_free(tunables);
Re: [PATCH] RISC-V: Add an Image header that boot loader can parse.
On 4/29/19 4:40 PM, Palmer Dabbelt wrote: On Tue, 23 Apr 2019 16:25:06 PDT (-0700), atish.pa...@wdc.com wrote: Currently, last stage boot loaders such as U-Boot can accept only uImage which is an unnecessary additional step in automating boot flows. Add a simple image header that boot loaders can parse and directly load kernel flat Image. The existing booting methods will continue to work as it is. Tested on both QEMU and HiFive Unleashed using OpenSBI + U-Boot + Linux. Signed-off-by: Atish Patra --- arch/riscv/include/asm/image.h | 32 arch/riscv/kernel/head.S | 28 2 files changed, 60 insertions(+) create mode 100644 arch/riscv/include/asm/image.h diff --git a/arch/riscv/include/asm/image.h b/arch/riscv/include/asm/image.h new file mode 100644 index ..76a7e0d4068a --- /dev/null +++ b/arch/riscv/include/asm/image.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __ASM_IMAGE_H +#define __ASM_IMAGE_H + +#define RISCV_IMAGE_MAGIC "RISCV" + +#ifndef __ASSEMBLY__ +/* + * struct riscv_image_header - riscv kernel image header + * + * @code0: Executable code + * @code1: Executable code + * @text_offset: Image load offset + * @image_size:Effective Image size + * @reserved: reserved + * @magic: Magic number + * @reserved: reserved + */ + +struct riscv_image_header { + u32 code0; + u32 code1; + u64 text_offset; + u64 image_size; + u64 res1; + u64 magic; + u32 res2; + u32 res3; +}; I don't want to invent our own file format. Is there a reason we can't just use something standard? Off the top of my head I can think of ELF files and multiboot. Additional header is required to accommodate PE header format. Currently, this is only used for booti command but it will be reused for EFI headers as well. Linux kernel Image can pretend as an EFI application if PE/COFF header is present. This removes the need of an explicit EFI boot loader and EFI firmware can directly load Linux (obviously after EFI stub implementation for RISC-V). 
ARM64 follows a similar header format as well. https://www.kernel.org/doc/Documentation/arm64/booting.txt Regards, Atish +#endif /* __ASSEMBLY__ */ +#endif /* __ASM_IMAGE_H */ diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index fe884cd69abd..154647395601 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -19,9 +19,37 @@ #include #include #include +#include __INIT ENTRY(_start) + /* +* Image header expected by Linux boot-loaders. The image header data +* structure is described in asm/image.h. +* Do not modify it without modifying the structure and all bootloaders +* that expects this header format!! +*/ + /* jump to start kernel */ + j _start_kernel + /* reserved */ + .word 0 + .balign 8 +#if __riscv_xlen == 64 + /* Image load offset(2MB) from start of RAM */ + .dword 0x200000 +#else + /* Image load offset(4MB) from start of RAM */ + .dword 0x400000 +#endif + /* Effective size of kernel image */ + .dword _end - _start + .dword 0 + .asciz RISCV_IMAGE_MAGIC + .word 0 + .word 0 + +.global _start_kernel +_start_kernel: /* Mask all interrupts */ csrw sie, zero
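To make the proposed header concrete from the boot loader's side, here is a hedged userspace sketch. The struct fields are copied from the asm/image.h hunk in this patch, with fixed-width types substituted for u32/u64; `riscv_image_parse()` is a hypothetical helper, not part of the patch or of any boot loader, and it assumes the host has the same byte order as the image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RISCV_IMAGE_MAGIC "RISCV"

/* Field layout copied from the proposed asm/image.h, fixed-width types. */
struct riscv_image_header {
	uint32_t code0;
	uint32_t code1;
	uint64_t text_offset;
	uint64_t image_size;
	uint64_t res1;
	uint64_t magic;
	uint32_t res2;
	uint32_t res3;
};

/*
 * Hypothetical loader-side check: returns 0 and fills *hdr when buf
 * starts with a header carrying the "RISCV" magic, -1 otherwise.
 */
static int riscv_image_parse(const void *buf, size_t len,
			     struct riscv_image_header *hdr)
{
	if (len < sizeof(*hdr))
		return -1;
	memcpy(hdr, buf, sizeof(*hdr)); /* avoids unaligned access */
	if (memcmp(&hdr->magic, RISCV_IMAGE_MAGIC, strlen(RISCV_IMAGE_MAGIC)))
		return -1;
	return 0;
}
```

A loader that accepts the image would then read `hdr.text_offset` and `hdr.image_size` to decide where to place the kernel and how much to copy; how those values are used is left out here because the thread does not spell it out.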
Re: sh4-linux-gnu-ld: arch/sh/kernel/cpu/sh2/clock-sh7619.o:undefined reference to `followparent_recalc'
On 4/29/19 9:48 PM, kbuild test robot wrote: > Hi Randy, > > It's probably a bug fix that unveils the link errors. Yoshinori Sato (cc-ed) has a patch for this. I guess that it's not in the arch/sh git tree yet ??? or wherever arch/sh changes come from. > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: 83a50840e72a5a964b4704fcdc2fbb2d771015ab > commit: acaf892ecbf5be7710ae05a61fd43c668f68ad95 sh: fix multiple function > definition build errors > date: 3 weeks ago > config: sh-allmodconfig (attached as .config) > compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 > reproduce: > wget > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > git checkout acaf892ecbf5be7710ae05a61fd43c668f68ad95 > # save the attached .config to linux build tree > GCC_VERSION=7.2.0 make.cross ARCH=sh > > If you fix the issue, kindly add following tag > Reported-by: kbuild test robot > > All errors (new ones prefixed by >>): > >>> sh4-linux-gnu-ld: arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.data+0x1c): >>> undefined reference to `followparent_recalc' > > --- > 0-DAY kernel test infrastructureOpen Source Technology Center > https://lists.01.org/pipermail/kbuild-all Intel Corporation > -- ~Randy
Re: [PATCH 7/7] dmaengine: sprd: Add interrupt support for 2-stage transfer
On Mon, 29 Apr 2019 at 22:10, Vinod Koul wrote: > > On 29-04-19, 20:11, Baolin Wang wrote: > > On Mon, 29 Apr 2019 at 20:01, Vinod Koul wrote: > > > On 15-04-19, 20:15, Baolin Wang wrote: > > > > > @@ -429,6 +433,9 @@ static int sprd_dma_set_2stage_config(struct > > > > sprd_dma_chn *schan) > > > > val = chn & SPRD_DMA_GLB_SRC_CHN_MASK; > > > > val |= BIT(schan->trg_mode - 1) << > > > > SPRD_DMA_GLB_TRG_OFFSET; > > > > val |= SPRD_DMA_GLB_2STAGE_EN; > > > > + if (schan->int_type != SPRD_DMA_NO_INT) > > > > > > Who configure int_type? > > > > The int_type is configured through the flags of > > sprd_dma_prep_slave_sg() by users, see: > > https://elixir.bootlin.com/linux/v5.1-rc6/source/include/linux/dma/sprd-dma.h#L9 > > Please use DMA_PREP_INTERRUPT flag instead! We can not use DMA_PREP_INTERRUPT flag, since we have some Spreadtrum specific DMA interrupt flags configured by users, which I think we have made a consensus before. See: https://elixir.bootlin.com/linux/v5.1-rc6/source/include/linux/dma/sprd-dma.h#L105 -- Baolin Wang Best Regards
[PATCH] pid: Remove unneeded hash header file
Hash functions are not needed since idr is used now. Let's remove hash header file for cleanup. Signed-off-by: Timmy Li --- kernel/pid.c | 1 - 1 file changed, 1 deletion(-) diff --git a/kernel/pid.c b/kernel/pid.c index 20881598bdfa..89548d35eefb 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -32,7 +32,6 @@ #include #include #include -#include #include #include #include -- 2.17.1
Re: [PATCH 4/7] dmaengine: sprd: Add device validation to support multiple controllers
On Mon, 29 Apr 2019 at 22:05, Vinod Koul wrote: > > On 29-04-19, 20:20, Baolin Wang wrote: > > On Mon, 29 Apr 2019 at 19:57, Vinod Koul wrote: > > > > > > On 15-04-19, 20:14, Baolin Wang wrote: > > > > From: Eric Long > > > > > > > > Since we can support multiple DMA engine controllers, we should add > > > > device validation in filter function to check if the correct controller > > > > to be requested. > > > > > > > > Signed-off-by: Eric Long > > > > Signed-off-by: Baolin Wang > > > > --- > > > > drivers/dma/sprd-dma.c |5 + > > > > 1 file changed, 5 insertions(+) > > > > > > > > diff --git a/drivers/dma/sprd-dma.c b/drivers/dma/sprd-dma.c > > > > index 0f92e60..9f99d4b 100644 > > > > --- a/drivers/dma/sprd-dma.c > > > > +++ b/drivers/dma/sprd-dma.c > > > > @@ -1020,8 +1020,13 @@ static void sprd_dma_free_desc(struct > > > > virt_dma_desc *vd) > > > > static bool sprd_dma_filter_fn(struct dma_chan *chan, void *param) > > > > { > > > > struct sprd_dma_chn *schan = to_sprd_dma_chan(chan); > > > > + struct of_phandle_args *dma_spec = > > > > + container_of(param, struct of_phandle_args, args[0]); > > > > u32 slave_id = *(u32 *)param; > > > > > > > > + if (chan->device->dev->of_node != dma_spec->np) > > > > > > Are you not using of_dma_find_controller() that does this, so this would > > > be useless! > > > > Yes, we can use of_dma_find_controller(), but that will be a little > > complicated than current solution. Since we need introduce one > > structure to save the node to validate in the filter function like > > below, which seems make things complicated. But if you still like to > > use of_dma_find_controller(), I can change to use it in next version. > > Sorry I should have clarified more.. 
> > of_dma_find_controller() is called by xlate, so you already run this > check, so why use this :) The of_dma_find_controller() can save the requested device node into dma_spec, and in the of_dma_simple_xlate() function, it will call dma_request_channel() to request one channel, but it did not validate the device node to find the corresponding dma device in dma_request_channel(). So we should in our filter function to validate the device node with the device node specified by the dma_spec. Hope I make things clear. -- Baolin Wang Best Regards
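The check being debated here can be sketched in userspace. This is only a model under stated assumptions: `struct node`, `struct phandle_args` and `struct chan` are hypothetical stand-ins for `struct device_node`, `struct of_phandle_args` and `struct dma_chan`, and `filter_fn()` is not the sprd-dma filter. It shows how a filter that receives a pointer to `args[0]` can recover the enclosing specifier and compare controller nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same idea as the kernel macro: step back from a member to its struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for device_node / of_phandle_args / dma_chan. */
struct node {
	const char *name;
};

struct phandle_args {
	struct node *np;  /* controller named in the client's DMA specifier */
	uint32_t args[1]; /* args[0] carries the slave id */
};

struct chan {
	struct node *controller; /* node of the controller owning the channel */
};

/* Accept a channel only if it belongs to the controller in the specifier. */
static bool filter_fn(const struct chan *c, void *param)
{
	struct phandle_args *spec =
		container_of(param, struct phandle_args, args[0]);

	return c->controller == spec->np;
}
```

With two controllers present, a channel owned by the wrong controller is rejected even though the slave id passed through `param` is the same, which is the behaviour the patch description asks for.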
Re: [PATCH v4] panic: add an option to replay all the printk message in buffer
On (04/29/19 13:44), Petr Mladek wrote: > On Sat 2019-04-27 02:16:40, Sergey Senozhatsky wrote: > > On (04/27/19 01:43), Sergey Senozhatsky wrote: > > [..] > > > > The console waiter logic is effective but it does not always > > > > work. The current console owner must be calling the console > > > > drivers. > > > > > > > > > Hmm, we might have a bit of a problem here, maybe. > > > > > > > > Hmm, the printk() might wait forever when NMI stopped > > > > the current console owner in the console driver code > > > > or with the logbuf_lock taken. > > > > > > I guess this is why we re-init logbuf lock from panic, > > > however, we don't do anything with the console_owner. > > > > > The console waiter logic might get solved by clearing > > > > the console_owner in console_flush_on_panic(). It can't > > > > be much worse, we already ignore console_lock() there, ... > > > > Hmm, or maybe we are fine... console_waiter logic should work > > before we send out stop IPI/NMI from panic CPU. When we call > > flush_on_panic() console_unlock() clears console_owner, so > > panic_print_sys_info() should not deadlock on console_owner. > > Good point! > > > It's probably only problematic if we kill a console_owner > > CPU and then try to printk() (from smp_send_stop()) before > > we do flush_on_panic()->console_unlock(). > > Yup. There are called several functions between smp_send_stop() > and console_flush_on_panic(). > > The question is if it is worth a code complication. We could > never 100% guarantee that printk() would work in panic(). > I more and more understand what Peter Zijlstra means > by the duct taping. Agreed. -ss
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
> On Apr 29, 2019, at 10:26 PM, Al Viro wrote: > > On Mon, Apr 29, 2019 at 10:18:04PM -0600, Andreas Dilger wrote: >>> >>> void*i_private; /* fs or device private pointer */ >>> + void (*free_inode)(struct inode *); >> >> It seems like a waste to increase the size of every struct inode just to >> access >> a static pointer. Is this the only place that ->free_inode() is called? Why >> not move the ->free_inode() pointer into inode->i_fop->free_inode() so that >> it >> is still directly accessible at this point. > > i_op, surely? Yes, i_op is what I was thinking. > In any case, increasing sizeof(struct inode) is not a problem - > if anything, I'd turn ->i_fop into an anon union with that. As in, > > diff --git a/fs/inode.c b/fs/inode.c > index fb45590d284e..627e1766503a 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -211,8 +211,8 @@ EXPORT_SYMBOL(free_inode_nonrcu); > static void i_callback(struct rcu_head *head) > { > struct inode *inode = container_of(head, struct inode, i_rcu); > - if (inode->i_sb->s_op->free_inode) > - inode->i_sb->s_op->free_inode(inode); > + if (inode->free_inode) > + inode->free_inode(inode); > else > free_inode_nonrcu(inode); > } > @@ -236,6 +236,7 @@ static struct inode *alloc_inode(struct super_block *sb) > if (!ops->free_inode) > return NULL; > } > + inode->free_inode = ops->free_inode; > i_callback(&inode->i_rcu); > return NULL; > } > @@ -276,6 +277,7 @@ static void destroy_inode(struct inode *inode) > if (!ops->free_inode) > return; > } > + inode->free_inode = ops->free_inode; > call_rcu(&inode->i_rcu, i_callback); > } This seems like kind of a hack. I guess your goal is to have ->free_inode accessible regardless of whether the filesystem has installed its own ->i_op methods or not, and i_fop is no longer used by this point. That said, this seems better than increasing the size of struct inode. 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2e9b9f87caca..92732286b748 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -694,7 +694,10 @@ struct inode {
>  #ifdef CONFIG_IMA
>  	atomic_t		i_readcount; /* struct files open RO */
>  #endif
> -	const struct file_operations	*i_fop;	/* former ->i_op->default_file_ops */
> +	union {
> +		const struct file_operations	*i_fop;	/* former ->i_op->default_file_ops */
> +		void (*free_inode)(struct inode *);
> +	};

Cheers, Andreas
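A rough userspace sketch of the anonymous-union idea quoted above, for readers following along. It only models the control flow: the free hook is copied into the slot that i_fop used to occupy just before the (here simulated) RCU callback runs. The structs are stripped-down stand-ins, s_op is reached directly instead of through i_sb, the RCU grace period is replaced by a direct call, and nothing is actually freed; the counters exist only so the two paths can be observed.

```c
#include <assert.h>
#include <stddef.h>

struct inode;

/* Stand-ins for the kernel structures; only the relevant members exist. */
struct file_operations {
	int unused;
};

struct super_operations {
	void (*free_inode)(struct inode *);
};

struct inode {
	const struct super_operations *s_op;	/* simplification of i_sb->s_op */
	union {
		const struct file_operations *i_fop;	/* valid while the inode is live */
		void (*free_inode)(struct inode *);	/* valid only during teardown */
	};
};

static int fs_free_calls;	/* times the filesystem hook ran */
static int default_free_calls;	/* times the fallback ran */

static void fs_free_inode(struct inode *inode)
{
	(void)inode;
	fs_free_calls++;
}

static void free_inode_nonrcu(struct inode *inode)
{
	(void)inode;
	default_free_calls++;
}

/* Mirrors the shape of i_callback() in the quoted diff. */
static void i_callback(struct inode *inode)
{
	if (inode->free_inode)
		inode->free_inode(inode);
	else
		free_inode_nonrcu(inode);
}

static void destroy_inode(struct inode *inode)
{
	assert(inode != NULL && inode->s_op != NULL);
	/* i_fop is dead by now, so its storage can carry the hook. */
	inode->free_inode = inode->s_op->free_inode;
	i_callback(inode);	/* the kernel would use call_rcu() here */
}

static const struct super_operations ops_with_hook = { fs_free_inode };
static const struct super_operations ops_without_hook = { NULL };
```

The point of the union is that the struct does not grow: the hook reuses storage that is no longer needed once the inode is being torn down.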
RE: [PATCH v3 1/1] Add support for IPMB driver
> -Original Message- > From: Asmaa Mnebhi > Sent: Tuesday, April 30, 2019 12:57 AM > To: miny...@acm.org; w...@the-dreams.de; Vadim Pasternak > ; Michael Shych > Cc: Asmaa Mnebhi ; linux-kernel@vger.kernel.org; > linux-...@vger.kernel.org > Subject: [PATCH v3 1/1] Add support for IPMB driver > > Support receiving IPMB requests on a Satellite MC from the BMC. > Once a response is ready, this driver will send back a response to the BMC via > the IPMB channel. Hi Asmaa, Few common questions. You define this driver as "Mellanox BlueField IPMB driver". What makes it Mellanox BlueField specific? Which HW configuration you used for testing? Could you please explain connectivity schema between main BMC and satellite BMCs? How this module is supposed to be activated? Don't you need to add DTS/ACPI records? Also few comments below. > > Signed-off-by: Asmaa Mnebhi > --- > drivers/char/ipmi/Kconfig| 8 + > drivers/char/ipmi/Makefile | 1 + > drivers/char/ipmi/ipmb_dev_int.c | 386 > +++ > 3 files changed, 395 insertions(+) > create mode 100644 drivers/char/ipmi/ipmb_dev_int.c > > diff --git a/drivers/char/ipmi/Kconfig b/drivers/char/ipmi/Kconfig index > 94719fc..12fe8f2 100644 > --- a/drivers/char/ipmi/Kconfig > +++ b/drivers/char/ipmi/Kconfig > @@ -74,6 +74,14 @@ config IPMI_SSIF >have a driver that must be accessed over an I2C bus instead of a >standard interface. This module requires I2C support. > > +config IPMB_DEVICE_INTERFACE > + tristate 'IPMB Interface handler' > + depends on I2C && I2C_SLAVE > + help > + Provides a driver for a device (Satellite MC) to > + receive requests and send responses back to the BMC via > + the IPMB interface. This module requires I2C support. 
> + > config IPMI_POWERNV > depends on PPC_POWERNV > tristate 'POWERNV (OPAL firmware) IPMI interface' > diff --git a/drivers/char/ipmi/Makefile b/drivers/char/ipmi/Makefile index > 3f06b20..0822adc 100644 > --- a/drivers/char/ipmi/Makefile > +++ b/drivers/char/ipmi/Makefile > @@ -26,3 +26,4 @@ obj-$(CONFIG_IPMI_KCS_BMC) += kcs_bmc.o > obj-$(CONFIG_ASPEED_BT_IPMI_BMC) += bt-bmc.o > obj-$(CONFIG_ASPEED_KCS_IPMI_BMC) += kcs_bmc_aspeed.o > obj-$(CONFIG_NPCM7XX_KCS_IPMI_BMC) += kcs_bmc_npcm7xx.o > +obj-$(CONFIG_IPMB_DEVICE_INTERFACE) += ipmb_dev_int.o > diff --git a/drivers/char/ipmi/ipmb_dev_int.c > b/drivers/char/ipmi/ipmb_dev_int.c > new file mode 100644 > index 000..63122c3 > --- /dev/null > +++ b/drivers/char/ipmi/ipmb_dev_int.c > @@ -0,0 +1,386 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +/* > + * Mellanox IPMB driver to receive a request and send a response > + * > + * Copyright (C) 2018 Mellanox Techologies, Ltd. > + * > + * This was inspired by Brendan Higgins' ipmi-bmc-bt-i2c driver. > + */ > + > +#define pr_fmt(fmt) "ipmb_dev_int: " fmt > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#define MAX_MSG_LEN 128 > +#define IPMB_REQUEST_LEN_MIN7 > +#define NETFN_RSP_BIT_MASK 0x4 > +#define REQUEST_QUEUE_MAX_LEN 256 > + > +#define IPMB_MSG_LEN_IDX0 > +#define RQ_SA_8BIT_IDX 1 > +#define NETFN_LUN_IDX 2 > + > +#define IPMB_MSG_PAYLOAD_LEN_MAX (MAX_MSG_LEN - > IPMB_REQUEST_LEN_MIN - 1) > + > +struct ipmb_msg { > + u8 len; > + u8 rs_sa; > + u8 netfn_rs_lun; > + u8 checksum1; > + u8 rq_sa; > + u8 rq_seq_rq_lun; > + u8 cmd; > + u8 payload[IPMB_MSG_PAYLOAD_LEN_MAX]; > + /* checksum2 is included in payload */ } __packed; > + > +static u32 ipmb_msg_len(struct ipmb_msg *ipmb_msg) { > + return ipmb_msg->len + 1; > +} Do you really need it as function? 
> + > +struct ipmb_request_elem { > + struct list_head list; > + struct ipmb_msg request; > +}; > + > +struct ipmb_dev { > + struct i2c_client *client; > + struct miscdevice miscdev; > + struct ipmb_msg request; > + struct list_head request_queue; > + atomic_t request_queue_len; > + struct ipmb_msg response; Where you are using 'response' field? > + size_t msg_idx; > + spinlock_t lock; > + wait_queue_head_t wait_queue; > + struct mutex file_mutex; > +}; > + > +static int receive_ipmb_request(struct ipmb_dev *ipmb_dev_p, > + bool non_blocking, > + struct ipmb_msg *ipmb_request) > +{ > + struct ipmb_request_elem *queue_elem; > + unsigned long flags; > + int res; > + > + spin_lock_irqsave(&ipmb_dev_p->lock, flags); > + > + while (!atomic_read(&ipmb_dev_p->request_queue_len)) { > + spin_unlock_irqrestore(&ipmb_dev_p->lock, flags); > + if (non_blocking) > + return -EAGAIN; > + > + res = wait_event_interruptible(ipmb_dev_p->wait_queue, > +
Re: [PATCH RESEND] sched/cpufreq: Fix kobject memleak
On Tue, Apr 30, 2019 at 06:24:43AM +0200, Ingo Molnar wrote: > > * Tobin C. Harding wrote: > > > Currently error return from kobject_init_and_add() is not followed by a > > call to kobject_put(). This means there is a memory leak. > > > > Add call to kobject_put() in error path of kobject_init_and_add(). > > > > Signed-off-by: Tobin C. Harding > > --- > > > > Resend with SOB tag. > > Please ignore my previous mail :-) Cheers Ingo, caught myself not checkpatching :( thanks, Tobin.
[PATCH v1] mmc: dt: add DT bindings for ls1028a eSDHC host controller
From: Yinbo Zhu

Add "fsl,ls1028a-esdhc" bindings for ls1028a eSDHC host controller

Signed-off-by: Yinbo Zhu
---
 .../devicetree/bindings/mmc/fsl-esdhc.txt |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
index 99c5cf8..a7250b9 100644
--- a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
+++ b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt
@@ -21,6 +21,7 @@ Required properties:
 	"fsl,ls1043a-esdhc"
 	"fsl,ls1046a-esdhc"
 	"fsl,ls2080a-esdhc"
+	"fsl,ls1028a-esdhc"
   - clock-frequency : specifies eSDHC base clock frequency.
 
 Optional properties:
-- 
1.7.1
Re: [PATCH v2 17/19] iommu: Add max num of cache and granu types
Hi Jacob, On 4/29/19 6:17 PM, Jacob Pan wrote: > On Fri, 26 Apr 2019 18:22:46 +0200 > Auger Eric wrote: > >> Hi Jacob, >> >> On 4/24/19 1:31 AM, Jacob Pan wrote: >>> To convert to/from cache types and granularities between generic and >>> VT-d specific counterparts, a 2D arrary is used. Introduce the >>> limits >> array >>> to help define the converstion array size. >> conversion >>> > will fix, thanks >>> Signed-off-by: Jacob Pan >>> --- >>> include/uapi/linux/iommu.h | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h >>> index 5c95905..2d8fac8 100644 >>> --- a/include/uapi/linux/iommu.h >>> +++ b/include/uapi/linux/iommu.h >>> @@ -197,6 +197,7 @@ struct iommu_inv_addr_info { >>> __u64 granule_size; >>> __u64 nb_granules; >>> }; >>> +#define NR_IOMMU_CACHE_INVAL_GRANU (3) >>> >>> /** >>> * First level/stage invalidation information >>> @@ -235,6 +236,7 @@ struct iommu_cache_invalidate_info { >>> struct iommu_inv_addr_info addr_info; >>> }; >>> }; >>> +#define NR_IOMMU_CACHE_TYPE(3) >>> /** >>> * struct gpasid_bind_data - Information about device and guest >>> PASID binding >>> * @gcr3: Guest CR3 value from guest mm >>> >> Is it really something that needs to be exposed in the uapi? >> > I put it in uapi since the related definitions for granularity and > cache type are in the same file. > Maybe putting them close together like this? I was thinking you can just > fold it into your next series as one patch for introducing cache > invalidation. 
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h > index 2d8fac8..4ff6929 100644 > --- a/include/uapi/linux/iommu.h > +++ b/include/uapi/linux/iommu.h > @@ -164,6 +164,7 @@ enum iommu_inv_granularity { > IOMMU_INV_GRANU_DOMAIN, /* domain-selective invalidation */ > IOMMU_INV_GRANU_PASID, /* pasid-selective invalidation */ > IOMMU_INV_GRANU_ADDR, /* page-selective invalidation */ > + NR_IOMMU_INVAL_GRANU, /* number of invalidation granularities > */ }; > > /** > @@ -228,6 +229,7 @@ struct iommu_cache_invalidate_info { > #define IOMMU_CACHE_INV_TYPE_IOTLB (1 << 0) /* IOMMU IOTLB */ > #define IOMMU_CACHE_INV_TYPE_DEV_IOTLB (1 << 1) /* Device IOTLB */ > #define IOMMU_CACHE_INV_TYPE_PASID (1 << 2) /* PASID cache */ > +#define NR_IOMMU_CACHE_TYPE(3) OK I will add this. Thanks Eric > __u8cache; > __u8granularity; > >> Thanks >> >> Eric > > [Jacob Pan] >
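To make the purpose of the two limits concrete, here is a small sketch of a 2D conversion table sized by them. The enum and the cache-type bits follow the quoted diff; the table contents are made-up placeholder codes (not VT-d values), and to_vendor_granu is a hypothetical helper name.

```c
#include <assert.h>

enum iommu_inv_granularity {
	IOMMU_INV_GRANU_DOMAIN,	/* domain-selective invalidation */
	IOMMU_INV_GRANU_PASID,	/* pasid-selective invalidation */
	IOMMU_INV_GRANU_ADDR,	/* page-selective invalidation */
	NR_IOMMU_INVAL_GRANU,	/* number of invalidation granularities */
};

#define IOMMU_CACHE_INV_TYPE_IOTLB	(1 << 0)
#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB	(1 << 1)
#define IOMMU_CACHE_INV_TYPE_PASID	(1 << 2)
#define NR_IOMMU_CACHE_TYPE		(3)

/*
 * Placeholder vendor codes, invented for this sketch; -1 marks a
 * combination the hypothetical hardware does not support.
 */
static const int vendor_granu[NR_IOMMU_CACHE_TYPE][NR_IOMMU_INVAL_GRANU] = {
	{ 10, 11, 12 },	/* IOTLB */
	{ -1, -1, 22 },	/* device IOTLB */
	{ -1, 31, -1 },	/* PASID cache */
};

/* The limits keep the table and the lookup in sync at build time. */
static_assert(sizeof(vendor_granu) / sizeof(vendor_granu[0]) == NR_IOMMU_CACHE_TYPE,
	      "one row per cache type");

/* Map (single cache-type bit, generic granularity) to a vendor code. */
static int to_vendor_granu(unsigned int cache_type, enum iommu_inv_granularity g)
{
	unsigned int row = 0;

	if ((unsigned int)g >= NR_IOMMU_INVAL_GRANU)
		return -1;
	/* Require exactly one bit, and only a known one. */
	if (cache_type == 0 || (cache_type & (cache_type - 1)) ||
	    cache_type >= (1u << NR_IOMMU_CACHE_TYPE))
		return -1;

	while (!(cache_type & 1)) {
		cache_type >>= 1;
		row++;
	}
	return vendor_granu[row][g];
}
```

Using the enum terminator for the granularity count means the second dimension follows the enum automatically, while the cache-type count still has to be updated by hand when a new bit is added.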
Re: [RFC PATCH 2/7] x86/sci: add core implementation for system call isolation
* Andy Lutomirski wrote: > On Sat, Apr 27, 2019 at 3:46 AM Ingo Molnar wrote: > > So I'm wondering whether there's a 4th choice as well, which avoids > > control flow corruption *before* it happens: > > > > - A C language runtime that is a subset of current C syntax and > >semantics used in the kernel, and which doesn't allow access outside > >of existing objects and thus creates a strictly enforced separation > >between memory used for data, and memory used for code and control > >flow. > > > > - This would involve, at minimum: > > > > - tracking every type and object and its inherent length and valid > > access patterns, and never losing track of its type. > > > > - being a lot more organized about initialization, i.e. no > > uninitialized variables/fields. > > > > - being a lot more strict about type conversions and pointers in > > general. > > You're not the only one to suggest this. There are at least a few > things that make this extremely difficult if not impossible. For > example, consider this code: > > void maybe_buggy(void) > { > int a, b; > int *p = &a; > int *q = (int *)some_function((unsigned long)p); > *q = 1; > } > > If some_function(&a) returns &a, then all is well. But if > some_function(&a) returns &b or even a valid address of some unrelated > kernel object, then the code might be entirely valid and correct C, > but I don't see how the runtime checks are supposed to tell whether > the resulting address is valid or is a bug. This type of code is, I > think, quite common in the kernel -- it happens in every data > structure where we have unions of pointers and integers or where we > steal some known-zero bits of a pointer to store something else. So the thing is, for the infinitely large state space of "valid C code" we already disallow an infinitely many versions in the Linux kernel. 
We have complicated rules that disallow certain C syntactical and semantical constructs, both on the tooling (build failure/warning) and on the review (style/taste) level. So the question IMHO isn't whether it's "valid C", because we already have the Linux kernel's own C syntax variant and are enforcing it with varying degrees of success. The question is whether the example you gave can be written in a strongly typed fashion, whether it makes sense to do so, and what the costs are. I think it's evident that it can be written with strongly typed constructs, by separating pointers from embedded error codes - with negative side effects to code generation: for example it increases structure sizes and error return paths. I think there's four main costs of converting such a pattern to strongly typed constructs: - memory/cache footprint: there's a nonzero cost there. - performance: this will hurt too. - code readability:this will probably improve. - code robustness: this will improve too. So I think the proper question to ask is not whether there's common C syntax within the kernel that would have to be rewritten, but whether the total sum of memory and runtime overhead of strongly typed C programming (if it's possible/desirable) is larger than the total sum of a typical Linux distro enabling the various current and proposed kernel hardening features that have a runtime overhead: - the SMAP/SMEP overhead of STAC/CLAC for every single user copy - other usercopy hardening features - stackprotector - KASLR - compiler plugins against information leaks - proposed KASLR extension to implement module randomization and -PIE overhead - proposed function call integrity checks - proposed per system call kernel stack offset randomization - ( and I'm sure I forgot about a few more, and it's all still only reactive security, not proactive security. 
) That's death by a thousand cuts, and CR3 switching during system calls is also throwing a hand grenade into the fight ;-) So if people are also proposing to do CR3 switches in every system call, I'm pretty sure the answer is "yes, even a managed C runtime is probably faster than *THAT* sum of a performance mess" - at least with the current CR3 switching x86-uarch cost structure... Thanks, Ingo
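As a concrete picture of "separating pointers from embedded error codes", the following userspace sketch puts the two styles side by side. The err_ptr/is_err/ptr_err helpers imitate the kernel's ERR_PTR family in simplified form; struct widget, lookup_untyped and lookup_typed are hypothetical names invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Hypothetical object type used only in this sketch. */
struct widget {
	int id;
};

static struct widget the_widget = { 42 };

/* Style 1: the error code is folded into the pointer value. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static long ptr_err(const void *ptr)
{
	assert(is_err(ptr));
	return (long)(intptr_t)ptr;
}

static struct widget *lookup_untyped(int id)
{
	if (id != the_widget.id)
		return err_ptr(-ENOENT);
	return &the_widget;
}

/* Style 2: pointer and error code are separate, typed fields. */
struct widget_result {
	int err;		/* 0 on success, negative errno otherwise */
	struct widget *ptr;	/* valid only when err == 0 */
};

static struct widget_result lookup_typed(int id)
{
	struct widget_result res = { 0, NULL };

	if (id != the_widget.id)
		res.err = -ENOENT;
	else
		res.ptr = &the_widget;
	return res;
}
```

The typed variant never stores a non-address in a pointer, which is the property a checked runtime would need, and its return value is larger than a bare pointer, which is the footprint cost mentioned above.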
Re: [PATCH v3 1/4] include: dt-bindings: add Performance Monitoring Unit for Exynos
Hi,

I agree with this patch, but I have a few minor comments. If you address them, feel free to add my tag:

Acked-by: Chanwoo Choi

On 19. 4. 19. 10:48 PM, Lukasz Luba wrote:
> This patch add support of a new feature which can be used in DT:
> Performance Monitoring Unit with defined event data type.
> In this patch the event data types are defined for Exynos PPMU.
> The patch also updates the MAINTAINERS file accordingly and
> adds the header file to devfreq event subsystem.
>
> Signed-off-by: Lukasz Luba
> ---
> MAINTAINERS | 1 +
> include/dt-bindings/pmu/exynos_ppmu.h | 26 ++
> 2 files changed, 27 insertions(+)
> create mode 100644 include/dt-bindings/pmu/exynos_ppmu.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3671fde..1ba4b9b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4560,6 +4560,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq.git
> S: Supported
> F: drivers/devfreq/event/
> F: drivers/devfreq/devfreq-event.c
> +F: include/dt-bindings/pmu/exynos_ppmu.h
> F: include/linux/devfreq-event.h
> F: Documentation/devicetree/bindings/devfreq/event/
>
> diff --git a/include/dt-bindings/pmu/exynos_ppmu.h b/include/dt-bindings/pmu/exynos_ppmu.h
> new file mode 100644
> index 000..08fdce9
> --- /dev/null
> +++ b/include/dt-bindings/pmu/exynos_ppmu.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Samsung Exynos PPMU event types for counting in regs
> + *
> + * Copyright (c) 2019, Samsung

Maybe "Samsung Electronics" instead of "Samsung".

> + * Author: Lukasz Luba
> + */
> +
> +#ifndef __DT_BINDINGS_PMU_EXYNOS_PPMU_H
> +#define __DT_BINDINGS_PMU_EXYNOS_PPMU_H
> +
> +

Remove the unneeded blank line.
> +#define PPMU_RO_BUSY_CYCLE_CNT 0x0 > +#define PPMU_WO_BUSY_CYCLE_CNT 0x1 > +#define PPMU_RW_BUSY_CYCLE_CNT 0x2 > +#define PPMU_RO_REQUEST_CNT 0x3 > +#define PPMU_WO_REQUEST_CNT 0x4 > +#define PPMU_RO_DATA_CNT 0x5 > +#define PPMU_WO_DATA_CNT 0x6 > +#define PPMU_RO_LATENCY 0x12 > +#define PPMU_WO_LATENCY 0x16 > +#define PPMU_V2_RO_DATA_CNT 0x4 > +#define PPMU_V2_WO_DATA_CNT 0x5 > +#define PPMU_V2_EVT3_RW_DATA_CNT 0x22 > + > +#endif > -- Best Regards, Chanwoo Choi Samsung Electronics
sh4-linux-gnu-ld: arch/sh/kernel/cpu/sh2/clock-sh7619.o:undefined reference to `followparent_recalc'
Hi Randy,

It's probably a bug fix that unveils the link errors.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   83a50840e72a5a964b4704fcdc2fbb2d771015ab
commit: acaf892ecbf5be7710ae05a61fd43c668f68ad95 sh: fix multiple function definition build errors
date:   3 weeks ago
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout acaf892ecbf5be7710ae05a61fd43c668f68ad95
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=sh

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot

All errors (new ones prefixed by >>):

>> sh4-linux-gnu-ld: arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.data+0x1c): undefined reference to `followparent_recalc'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Re: [PATCH v6 01/10] clk: samsung: add needed IDs for DMC clocks in Exynos5420
Hi, On 19. 4. 19. 오후 11:19, Lukasz Luba wrote: > Define new IDs for clocks used by Dynamic Memory Controller in > Exynos5422 SoC. > > Acked-by: Rob Herring > Signed-off-by: Lukasz Luba > --- > include/dt-bindings/clock/exynos5420.h | 18 +- > 1 file changed, 17 insertions(+), 1 deletion(-) > > diff --git a/include/dt-bindings/clock/exynos5420.h > b/include/dt-bindings/clock/exynos5420.h > index 355f469..abb1842 100644 > --- a/include/dt-bindings/clock/exynos5420.h > +++ b/include/dt-bindings/clock/exynos5420.h > @@ -60,6 +60,7 @@ > #define CLK_MAU_EPLL 159 > #define CLK_SCLK_HSIC_12M160 > #define CLK_SCLK_MPHY_IXTAL24161 > +#define CLK_SCLK_BPLL162 > > /* gate clocks */ > #define CLK_UART0257 > @@ -195,6 +196,18 @@ > #define CLK_ACLK432_CAM 518 > #define CLK_ACLK_FL1550_CAM 519 > #define CLK_ACLK550_CAM 520 > +#define CLK_CLKM_PHY0521 > +#define CLK_CLKM_PHY1522 > +#define CLK_ACLK_PPMU_DREX0_0523 > +#define CLK_ACLK_PPMU_DREX0_1524 > +#define CLK_ACLK_PPMU_DREX1_0525 > +#define CLK_ACLK_PPMU_DREX1_1526 > +#define CLK_PCLK_PPMU_DREX0_0527 > +#define CLK_PCLK_PPMU_DREX0_1528 > +#define CLK_PCLK_PPMU_DREX1_0529 > +#define CLK_PCLK_PPMU_DREX1_1530 > +#define CLK_CDREX_PAUSE 531 > +#define CLK_CDREX_TIMING_SET 532 I cannot find the usage code of both CLK_CDREX_PAUSE and CLK_CDREX_TIMING_SET in these patchset. Please remove them. (snip) -- Best Regards, Chanwoo Choi Samsung Electronics
[PATCH 1/2] i2c: imx: I2C Driver doesn't consider I2C_IPGCLK_SEL RCW bit when using ls1046a SoC
The current kernel driver does not consider I2C_IPGCLK_SEL (424 bit of RCW) in deciding i2c_clk_rate in function i2c_imx_set_clk() { 0 Platform clock/4, 1 Platform clock/2}. When using ls1046a SoC, this populates incorrect value in IBFD register if I2C_IPGCLK_SEL = 0, which generates half of the desired Clock. Therefore, if ls1046a SoC is used, we need to set the i2c clock according to the corresponding RCW. Signed-off-by: Sumit Batra Signed-off-by: Chuanhua Han --- drivers/i2c/busses/i2c-imx.c | 64 1 file changed, 64 insertions(+) diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index 422f1a445b55..7186cf3c7d24 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -45,6 +45,8 @@ #include #include #include +#include +#include /* This will be the driver name the kernel reports */ #define DRIVER_NAME "imx-i2c" @@ -109,6 +111,21 @@ #define I2C_PM_TIMEOUT 10 /* ms */ +/* 14-1 Since array index starts from 0 */ +#define RCW_I2C_IPGCLK_WORD (14 - 1) +/* + * Set mask for RCW 424th bit, reading from DCFG_CCSR RCW Status Registers + * Since this register in RM depicted as big endian, + * so consider 31st bit as LSB for creating the mask. 
+ */ +#define RCW_I2C_IPGCLK_MASK0x80 +int i2c_ipgclk_sel = 1; + +static const struct soc_device_attribute ls1046a_soc[] = { + {.family = "QorIQ LS1046A"}, + { /* sentinel */ } +}; + /* * sorted list of clock divider, register value pairs * taken from table 26-5, p.26-9, Freescale i.MX @@ -304,6 +321,11 @@ static const struct platform_device_id imx_i2c_devtype[] = { }; MODULE_DEVICE_TABLE(platform, imx_i2c_devtype); +static const struct of_device_id guts_device_ids[] = { + { .compatible = "fsl,qoriq-device-config", }, + {} +}; + static const struct of_device_id i2c_imx_dt_ids[] = { { .compatible = "fsl,imx1-i2c", .data = &imx1_i2c_hwdata, }, { .compatible = "fsl,imx21-i2c", .data = &imx21_i2c_hwdata, }, @@ -533,6 +555,9 @@ static void i2c_imx_set_clk(struct imx_i2c_struct *i2c_imx, unsigned int div; int i; + if (!i2c_ipgclk_sel) + i2c_clk_rate = i2c_clk_rate / 2; + /* Divider value calculation */ if (i2c_imx->cur_clk == i2c_clk_rate) return; @@ -551,6 +576,10 @@ static void i2c_imx_set_clk(struct imx_i2c_struct *i2c_imx, /* Store divider value */ i2c_imx->ifdr = i2c_clk_div[i].val; + pr_alert("[%s] CLK Rate=%u Bitrate =%u Div =%u Value =%d\n", +__func__, i2c_clk_rate, i2c_imx->bitrate, +div, i2c_clk_div[i].val); + /* * There dummy delay is calculated. * It should be about one I2C clock period long. @@ -1116,6 +1145,9 @@ static int i2c_imx_probe(struct platform_device *pdev) int irq, ret; dma_addr_t phy_addr; u32 mul_value; + struct device_node *guts_node; + static struct ccsr_guts __iomem *guts_regs; + u32 rcw_reg; dev_dbg(&pdev->dev, "<%s>\n", __func__); @@ -1135,6 +1167,38 @@ static int i2c_imx_probe(struct platform_device *pdev) if (!i2c_imx) return -ENOMEM; + if (soc_device_match(ls1046a_soc)) { + /* +* Make device node for GUTS/DCFG (global utilities block) +* to read RCW. 
+*/ + guts_node = of_find_matching_node(NULL, guts_device_ids); + if (!guts_node) { + dev_err(&pdev->dev, "Could not find GUTS node\n"); + return -ENODEV; + } + /* +* Memory (IO) MAP the DCFG registers(for RCW) to +* be used in kernel virtual address space. +*/ + guts_regs = of_iomap(guts_node, 0); + of_node_put(guts_node); + if (!guts_regs) { + dev_err(&pdev->dev, "IOREMAP of GUTS node failed\n"); + return -ENOMEM; + } + /* Read rcw bit 424 (starting from 0) */ + rcw_reg = ioread32be(&guts_regs->rcwsr[RCW_I2C_IPGCLK_WORD]); + pr_alert("RCW REG[%d]=0x%x\n", RCW_I2C_IPGCLK_WORD, rcw_reg); + if (rcw_reg & RCW_I2C_IPGCLK_MASK) { + pr_alert("Div by 2 Case Detected in RCW\n"); + i2c_ipgclk_sel = 1; + } else { + pr_alert("Div by 4 Case Detected in RCW\n"); + i2c_ipgclk_sel = 0; + } + } + if (of_id) { i2c_imx->hwdata = of_id->data; ret = of_property_read_u32(pdev->dev.of_node, -- 2.17.1
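The word and mask arithmetic behind the RCW lookup can be checked in isolation. The sketch below assumes RCW bits are numbered from 0 with bit 0 as the most significant bit of the first 32-bit word (the big-endian numbering the patch comment refers to), and that the register values have already been converted to CPU order, as ioread32be() would do. The helper names are hypothetical, and the mask literal in the patch itself should be taken from the original posting rather than from this note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the 32-bit status word that holds RCW bit 'bit'. */
static unsigned int rcw_word_index(unsigned int bit)
{
	return bit / 32;
}

/* Mask for 'bit' inside its word when bit 0 is the MSB. */
static uint32_t rcw_bit_mask(unsigned int bit)
{
	return UINT32_C(1) << (31 - (bit % 32));
}

/* Test one RCW bit in an array of CPU-order status words. */
static int rcw_test_bit(const uint32_t *rcwsr, size_t nwords, unsigned int bit)
{
	assert(rcw_word_index(bit) < nwords);
	return (rcwsr[rcw_word_index(bit)] & rcw_bit_mask(bit)) != 0;
}

/*
 * Mirror of the adjustment in i2c_imx_set_clk(): when the select bit is 0
 * the I2C block runs at half the rate the driver was told about.
 */
static unsigned int effective_i2c_clk(unsigned int i2c_clk_rate, int ipgclk_sel)
{
	return ipgclk_sel ? i2c_clk_rate : i2c_clk_rate / 2;
}
```

Bit 424 lands in word 13 (424 / 32), which matches RCW_I2C_IPGCLK_WORD being defined as (14 - 1), at position 8 counted from the most significant bit.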
Re: [PATCH v6 06/10] dt-bindings: memory-controllers: add Exynos5422 DMC device description
On 19. 4. 19. 11:19 PM, Lukasz Luba wrote:
> The patch adds description for DT binding for a new Exynos5422 Dynamic
> Memory Controller device.
>
> Signed-off-by: Lukasz Luba
> ---
> .../bindings/memory-controllers/exynos5422-dmc.txt | 73 ++
> 1 file changed, 73 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/memory-controllers/exynos5422-dmc.txt
>
> diff --git a/Documentation/devicetree/bindings/memory-controllers/exynos5422-dmc.txt b/Documentation/devicetree/bindings/memory-controllers/exynos5422-dmc.txt
> new file mode 100644
> index 000..133b3cc
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/memory-controllers/exynos5422-dmc.txt
> @@ -0,0 +1,73 @@
> +* Exynos5422 frequency and voltage scaling for Dynamic Memory Controller device
> +
> +The Samsung Exynos5422 SoC has DMC (Dynamic Memory Controller) to which the DRAM
> +memory chips are connected. The driver is to monitor the controller in runtime
> +and switch frequency and voltage. To monitor the usage of the controller in
> +runtime, the driver uses the PPMU (Platform Performance Monitoring Unit), which
> +is able to measure the current load of the memory.
> +When 'userspace' governor is used for the driver, an application is able to
> +switch the DMC and memory frequency.
> +
> +Required properties for DMC device for Exynos5422:
> +- compatible: Should be "samsung,exynos5422-bus".

As I have already mentioned many times, this is still not fixed. Please change it as follows:
- exynos5422-bus -> exynos5422-dmc

> +- clock-names : the name of clock used by the bus, "bus".

The example below doesn't contain the 'bus' clock name.

> +- clocks : phandles for clock specified in "clock-names" property.
> +- devfreq-events : phandles for PPMU devices connected to this DMC.
> +- vdd-supply : phandle for voltage regulator which is connected.
> +- reg : registers of two CDREX controllers, chip information, clocks subsystem.
> +- operating-points-v2 : phandle for OPPs described in v2 definition. > +- device-handle : phandle of the connected DRAM memory device. For more > + information please refer to Documentation > +- devfreq-events : phandles of the PPMU events used by the controller. > + > +Example: > + > + ppmu_dmc0_0: ppmu@10d0 { > + compatible = "samsung,exynos-ppmu"; > + reg = <0x10d0 0x2000>; > + clocks = <&clock CLK_PCLK_PPMU_DREX0_0>; > + clock-names = "ppmu"; > + status = "okay"; > + events { > + ppmu_event_dmc0_0: ppmu-event3-dmc0_0 { > + event-name = "ppmu-event3-dmc0_0"; > + }; > + }; > + }; > + > + dmc: memory-controller@10c2 { > + compatible = "samsung,exynos5422-dmc"; > + reg = <0x10c2 0x1>, <0x10c3 0x1>, > + <0x1000 0x1000>, <0x1003 0x1000>; > + clocks =<&clock CLK_FOUT_SPLL>, > + <&clock CLK_MOUT_SCLK_SPLL>, > + <&clock CLK_FF_DOUT_SPLL2>, > + <&clock CLK_FOUT_BPLL>, > + <&clock CLK_MOUT_BPLL>, > + <&clock CLK_SCLK_BPLL>, > + <&clock CLK_MOUT_MX_MSPLL_CCORE>, > + <&clock CLK_MOUT_MX_MSPLL_CCORE_PHY>, > + <&clock CLK_MOUT_MCLK_CDREX>, > + <&clock CLK_DOUT_CLK2X_PHY0>, > + <&clock CLK_CLKM_PHY0>, > + <&clock CLK_CLKM_PHY1>; > + clock-names = "fout_spll", > + "mout_sclk_spll", > + "ff_dout_spll2", > + "fout_bpll", > + "mout_bpll", > + "sclk_bpll", > + "mout_mx_mspll_ccore", > + "mout_mx_mspll_ccore_phy", > + "mout_mclk_cdrex", > + "dout_clk2x_phy0", > + "clkm_phy0", > + "clkm_phy1"; > + status = "okay"; > + operating-points-v2 = <&dmc_opp_table>; > + devfreq-events = <&ppmu_event3_dmc0_0>, <&ppmu_event3_dmc0_1>, > + <&ppmu_event3_dmc1_0>, <&ppmu_event3_dmc1_1>; > + operating-points-v2 = <&dmc_opp_table>; > + device-handle = <&samsung_K3QF2F20DB>; > + vdd-supply = <&buck1_reg>; > + }; > -- Best Regards, Chanwoo Choi Samsung Electronics
[PATCH 2/2] arm64: dts: fsl: ls1046a: Add the guts node in dts
For NXP ls1046a SoC, the i2c clock needs to be configured with the appropriate bit of RCW, so we add the guts node (GUTS/DCFG global utilities block) for the driver to read.

Signed-off-by: Sumit Batra
Signed-off-by: Chuanhua Han
---
 arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
index 373310e4c0ea..f88599df18bb 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
@@ -205,6 +205,11 @@
 			status = "disabled";
 		};
 
+		guts: global-utilities@1ee {
+			compatible = "fsl,qoriq-device-config";
+			reg = <0x0 0x1ee 0x0 0x1000>;
+		};
+
 		qspi: spi@155 {
 			compatible = "fsl,ls1021a-qspi";
 			#address-cells = <1>;
-- 
2.17.1
Re: [RFC PATCH v2 00/17] Core scheduling v2
* Aubrey Li wrote:

> On Tue, Apr 30, 2019 at 12:01 AM Ingo Molnar wrote:
> > * Li, Aubrey wrote:
> > >
> > > > I.e. showing the approximate CPU thread-load figure column would be
> > > > very useful too, where '50%' shows half-loaded, '100%' fully-loaded,
> > > > '200%' over-saturated, etc. - for each row?
> > >
> > > See below, hope this helps.
> > >
> > > | NA/AVX  | vanilla-SMT [std%/sem%]     cpu% | coresched-SMT [std%/sem%]    +/-   cpu% | no-SMT [std%/sem%]           +/-   cpu% |
> > > |---------|----------------------------------|-----------------------------------------|-----------------------------------------|
> > > | 1/1     |       508.5 [ 0.2%/ 0.0%]   2.1% |       504.7 [ 1.1%/ 0.1%]  -0.8%   2.1% |       509.0 [ 0.2%/ 0.0%]   0.1%   4.3% |
> > > | 2/2     |      1000.2 [ 1.4%/ 0.1%]   4.1% |      1004.1 [ 1.6%/ 0.2%]   0.4%   4.1% |       997.6 [ 1.2%/ 0.1%]  -0.3%   8.1% |
> > > | 4/4     |      1912.1 [ 1.0%/ 0.1%]   7.9% |      1904.2 [ 1.1%/ 0.1%]  -0.4%   7.9% |      1914.9 [ 1.3%/ 0.1%]   0.1%  15.1% |
> > > | 8/8     |      3753.5 [ 0.3%/ 0.0%]  14.9% |      3748.2 [ 0.3%/ 0.0%]  -0.1%  14.9% |      3751.3 [ 0.4%/ 0.0%]  -0.1%  30.5% |
> > > | 16/16   |      7139.3 [ 2.4%/ 0.2%]  30.3% |      7137.9 [ 1.8%/ 0.2%]  -0.0%  30.3% |      7049.2 [ 2.4%/ 0.2%]  -1.3%  60.4% |
> > > | 32/32   |     10899.0 [ 4.2%/ 0.4%]  60.3% |     10780.3 [ 4.4%/ 0.4%]  -1.1%  55.9% |     10339.2 [ 9.6%/ 0.9%]  -5.1%  97.7% |
> > > | 64/64   |     15086.1 [11.5%/ 1.2%]  97.7% |     14262.0 [ 8.2%/ 0.8%]  -5.5%  82.0% |     11168.7 [22.2%/ 1.7%] -26.0% 100.0% |
> > > | 128/128 |     15371.9 [22.0%/ 2.2%] 100.0% |     14675.8 [14.4%/ 1.4%]  -4.5%  82.8% |     10963.9 [18.5%/ 1.4%] -28.7% 100.0% |
> > > | 256/256 |     15990.8 [22.0%/ 2.2%] 100.0% |     12227.9 [10.3%/ 1.0%] -23.5%  73.2% |     10469.9 [19.6%/ 1.7%] -34.5% 100.0% |
> >
> > Very nice, thank you!
> >
> > What's interesting is how in the over-saturated case (the last three
> > rows: 128, 256 and 512 total threads) coresched-SMT leaves 20-30% CPU
> > performance on the floor according to the load figures.
>
> Yeah, I found the next focus.
>
> > Is this true idle time (which shows up as 'id' during 'top'), or some
> > load average artifact?
>
> vmstat periodically reported intermediate CPU utilization in one
> second, it was running simultaneously when the benchmarks run. The cpu%
> is computed by the average of (100-idle) series.

Ok - so 'vmstat' uses /proc/stat, which uses cpustat[CPUTIME_IDLE] (or
its NOHZ work-alike), so this should be true idle time - to the extent
the HZ process clock's sampling is accurate.

So I guess the answer to my question is "yes". ;-)

BTW., for robustness sake you might want to add iowait to idle time
(it's the 'wa' field of vmstat) - it shouldn't matter for this particular
benchmark which doesn't do much IO, but it might for others.

Both CPUTIME_IDLE and CPUTIME_IOWAIT are idle states when a CPU is not
utilized.

[ Side note: we should really implement precise idle time accounting
  when CONFIG_IRQ_TIME_ACCOUNTING=y is enabled. We pay all the costs of
  the timestamps, but AFAICS we don't propagate that into the idle
  cputime metrics. ]

Thanks,

	Ingo
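The cpu% computation discussed above can be sketched roughly as follows. This is only an illustration, not the script used for the benchmark: the struct layout and the function names are made up here, and the counters are assumed to be two successive readings of the aggregate "cpu" line of /proc/stat. As suggested in the mail, both idle and iowait ticks are counted as unused time.

```c
#include <stdint.h>

/* Hypothetical container for the counters on the "cpu" line of /proc/stat. */
struct cpu_sample {
    uint64_t user, nice, system, idle, iowait, irq, softirq, steal;
};

/* Sum of all counters in one sample. */
static uint64_t sample_total(const struct cpu_sample *s)
{
    return s->user + s->nice + s->system + s->idle +
           s->iowait + s->irq + s->softirq + s->steal;
}

/*
 * Utilization in percent between two samples, counting both idle and
 * iowait ticks as "CPU not utilized". Returns 0.0 if no ticks elapsed.
 */
double cpu_util_percent(const struct cpu_sample *prev,
                        const struct cpu_sample *cur)
{
    uint64_t total = sample_total(cur) - sample_total(prev);
    uint64_t unused = (cur->idle - prev->idle) +
                      (cur->iowait - prev->iowait);

    if (total == 0)
        return 0.0;
    return 100.0 * (double)(total - unused) / (double)total;
}
```

Averaging cpu_util_percent() over the one-second samples taken during a run would give a figure comparable to the cpu% column, with iowait no longer counted as load.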
Re: [PATCH v3 2/2] dt-bindings: cpufreq: Document allwinner,cpu-operating-points-v2
On 29-04-19, 11:18, Rob Herring wrote: > On Sun, Apr 28, 2019 at 4:53 AM Frank Lee wrote: > > > > On Sat, Apr 27, 2019 at 5:15 AM Rob Herring wrote: > > > > > > On Wed, Apr 10, 2019 at 01:41:39PM -0400, Yangtao Li wrote: > > > > Allwinner Process Voltage Scaling Tables defines the voltage and > > > > frequency value based on the speedbin blown in the efuse combination. > > > > The sunxi-cpufreq-nvmem driver reads the efuse value from the SoC to > > > > provide the OPP framework with required information. > > > > This is used to determine the voltage and frequency value for each > > > > OPP of operating-points-v2 table when it is parsed by the OPP framework. > > > > > > > > The "allwinner,cpu-operating-points-v2" DT extends the > > > > "operating-points-v2" > > > > with following parameters: > > > > - nvmem-cells (NVMEM area containig the speedbin information) > > > > - opp-microvolt-: voltage in micro Volts. > > > > At runtime, the platform can pick a and matching > > > > opp-microvolt- property. > > > > HW: : > > > > sun50iw-h6 speed0 speed1 speed2 > > > > > > We already have at least one way to support speed bins with QC kryo > > > binding. Why do we need a different way? > > > > For some SOCs, for some reason (making the CPU have approximate > > performance), > > they use the same frequency but different voltage. In the case where > > this speed bin > > is not a lot and opp uses the same frequency, too many repeated opp > > nodes are a bit > > redundant and not intuitive enough. > > > > So, I think it's worth the new method. > > Well, I don't. > > We can't have every SoC vendor doing their own thing just because they > want to. If there are technical reasons why existing bindings don't > work, then maybe we need to do something different. But I haven't > heard any reasons. Well there is a good reason for attempting the new bindings and I wasn't sure if updating the earlier bindings or adding another one for platform is correct. 
As we aren't really adding new bindings, just documentation around them.

So there are two ways the OPP core supports this:

- opp-supported-hw: This is a better fit if we have a smaller group of
  frequencies to select from a bigger group, so we disable non-required
  OPPs completely. This is what Qcom did, as they wanted to select
  different frequencies altogether.

- opp-microvolt-<name>: This is a better fit if the frequencies remain
  the same and only a few of the properties, like voltage/current, have
  a different value. So we don't disable any OPPs but just select the
  right voltage/current for those frequencies. This avoids unnecessary
  duplication of the OPPs in DT and that's what the allwinner guys want.

The kryo nvmem bindings currently support opp-supported-hw; maybe we
can mention support for the second one in the same file and rename it
accordingly.

-- 
viresh
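To make the difference between the two approaches concrete, here is a rough userspace sketch. It is not the OPP core code: struct prop and both functions are hypothetical, the supported-hw check is reduced to a single mask level, and the fallback to plain "opp-microvolt" is an assumption about the intended behaviour.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a DT property: a name and one voltage in uV. */
struct prop {
    const char *name;
    uint32_t microvolt;
};

/*
 * opp-supported-hw style: an OPP stays enabled only if its mask shares
 * a bit with the version mask reported for this chip.
 */
int opp_supported(uint32_t supported_hw, uint32_t chip_version_mask)
{
    return (supported_hw & chip_version_mask) != 0;
}

/*
 * opp-microvolt-<name> style: every OPP stays enabled; look up the
 * named voltage first, then fall back to plain "opp-microvolt".
 * Returns 0 if neither property is present.
 */
uint32_t opp_microvolt(const struct prop *props, size_t n, const char *name)
{
    char key[64];
    size_t i;

    snprintf(key, sizeof(key), "opp-microvolt-%s", name);
    for (i = 0; i < n; i++)
        if (strcmp(props[i].name, key) == 0)
            return props[i].microvolt;
    for (i = 0; i < n; i++)
        if (strcmp(props[i].name, "opp-microvolt") == 0)
            return props[i].microvolt;
    return 0;
}
```

With the first scheme each speed bin needs its own OPP nodes; with the second, one node per frequency carries a voltage per bin, which is the duplication argument made above.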
[PATCH 1/3] dt-bindings: i2c: add optional mul-value property to binding
NXP Layerscape SoCs have up to three MUL options available for all
divider values. The choice of MUL determines the internal monitor rate
of the I2C bus (SCL and SDA signals):
A lower MUL value results in a higher sampling rate of the I2C signals.
A higher MUL value results in a lower sampling rate of the I2C signals.

So add a custom optional mul-value property to the binding, to select
which MUL option is used for the i2c controller node in the device tree.

Signed-off-by: Chuanhua Han
---
 Documentation/devicetree/bindings/i2c/i2c-imx.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-imx.txt b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
index b967544590e8..ba8e7b7b3fa8 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-imx.txt
+++ b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
@@ -18,6 +18,9 @@ Optional properties:
 - sda-gpios: specify the gpio related to SDA pin
 - pinctrl: add extra pinctrl to configure i2c pins to gpio function for i2c
   bus recovery, call it "gpio" state
+- mul-value: NXP Layerscape SoC have up to three MUL options available for
+all I2C divider values, it describes which MUL we choose to use for the driver,
+the values should be 1,2,4.
 
 Examples:
 
-- 
2.17.1
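As an illustration of what "the values should be 1,2,4" could mean for a consumer of the property, here is a small sketch. The function name is hypothetical, and the mapping to register bits is an assumption inferred from the divider tables in patch 2/3, where the MUL=2 entries use register values 0x40-0x7F and the MUL=4 entries use 0x80-0xBF; it has not been checked against the reference manual.

```c
#include <stdint.h>

/*
 * Map a "mul-value" of 1, 2 or 4 to the two high bits of the 8-bit
 * divider register value. Returns -1 for any other value.
 * Assumption: MUL=1 -> 0x00, MUL=2 -> 0x40, MUL=4 -> 0x80.
 */
int mul_value_to_ibfd_bits(uint32_t mul_value)
{
    switch (mul_value) {
    case 1:
        return 0x00;
    case 2:
        return 0x40;
    case 4:
        return 0x80;
    default:
        return -1;
    }
}
```

A caller would reject the node (or fall back to MUL=1) when this returns -1, rather than silently accepting an out-of-range value from the device tree.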
[PATCH 2/3] i2c: imx: I2C Driver IBC and SCL Divider for MUL=2 and MUL=4
NXP Layerscape SoC have up to three MUL options available for all divider values,we choice of MUL determines the internal monitor rate of the I2C bus (SCL and SDA signals). The current kernel driver supports MUL=1 by default ,but doesn't have the IBC and SCL Divider entries in vf610_i2c_clk_div for MUL=2 and MUL=4,so we need to add the corresponding support. Signed-off-by: Sumit Batra Signed-off-by: Chuanhua Han --- drivers/i2c/busses/i2c-imx.c | 71 +++- 1 file changed, 69 insertions(+), 2 deletions(-) diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index 42fed40198a0..ac5a334b7339 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -38,6 +38,7 @@ #include #include #include +#include #include #include #include @@ -156,6 +157,44 @@ static struct imx_i2c_clk_pair vf610_i2c_clk_div[] = { { 3840, 0x3F }, { 4096, 0x7B }, { 5120, 0x7D }, { 6144, 0x7E }, }; +static struct imx_i2c_clk_pair mul2_i2c_clk_div[] = { + { 40, 0x40 }, { 44, 0x41 }, { 48, 0x42 }, { 52, 0x43 }, + { 56, 0x44 }, { 60, 0x45 }, { 68, 0x46 }, { 80, 0x47 }, + { 56, 0x48 }, { 64, 0x49 }, { 72, 0x4A }, { 80, 0x4B }, + { 88, 0x4C }, { 96, 0x4D }, { 112, 0x4E }, { 136, 0x4F }, + { 96, 0x50 }, { 112, 0x51 }, { 128, 0x52 }, { 144, 0x53 }, + { 160, 0x54 }, { 176, 0x55 }, { 208, 0x56 }, { 256, 0x57 }, + { 160, 0x58 }, { 192, 0x59 }, { 224, 0x5A }, { 256, 0x5B }, + { 288, 0x5C }, { 320, 0x5D }, { 384, 0x5E }, { 480, 0x5F }, + { 320, 0x60 }, { 384, 0x61 }, { 448, 0x62 }, { 512, 0x63 }, + { 576, 0x64 }, { 640, 0x65 }, { 768, 0x66 }, { 960, 0x67 }, + { 640, 0x68 }, { 768, 0x69 }, { 896, 0x6A }, { 1024, 0x6B }, + { 1152, 0x6C }, { 1280, 0x6D }, { 1536, 0x6E }, { 1920, 0x6F }, + { 1280, 0x70 }, { 1536, 0x71 }, { 1792, 0x72 }, { 2048, 0x73 }, + { 2304, 0x74 }, { 2560, 0x75 }, { 3072, 0x76 }, { 3840, 0x77 }, + { 2560, 0x78 }, { 3072, 0x79 }, { 3584, 0x7A }, { 4096, 0x7B }, + { 4608, 0x7C }, { 5120, 0x7D }, { 6144, 0x7E }, { 7680, 0x7F }, +}; + +static struct 
imx_i2c_clk_pair mul4_i2c_clk_div[] = { + { 80,0x80 }, { 88,0x81 }, { 96,0x82 }, { 104, 0x83 }, + { 112, 0x84 }, { 120, 0x85 }, { 136, 0x86 }, { 160, 0x87 }, + { 112, 0x88 }, { 128, 0x89 }, { 144, 0x8A }, { 160, 0x8B }, + { 176, 0x8C }, { 192, 0x8D }, { 224, 0x8E }, { 272, 0x8F }, + { 192, 0x90 }, { 224, 0x91 }, { 256, 0x92 }, { 288, 0x93 }, + { 320, 0x94 }, { 352, 0x95 }, { 416, 0x96 }, { 512, 0x97 }, + { 320, 0x98 }, { 384, 0x99 }, { 448, 0x9A }, { 512, 0x9B }, + { 576, 0x9C }, { 640, 0x9D }, { 768, 0x9E }, { 960, 0x9F }, + { 640, 0xA0 }, { 768, 0xA1 }, { 896, 0xA2 }, { 1024, 0xA3 }, + { 1152, 0xA4 }, { 1280, 0xA5 }, { 1536, 0xA6 }, { 1792, 0xAA }, + { 1280, 0xA8 }, { 1536, 0xA9 }, { 1920, 0xA7 }, { 2048, 0xAB }, + { 2304, 0xAC }, { 2560, 0xAD }, { 3072, 0xAE }, { 3584, 0xB2 }, + { 2560, 0xB0 }, { 3072, 0xB1 }, { 3820, 0xAF }, { 4096, 0xB3 }, + { 4608, 0xB4 }, { 5120, 0xB5 }, { 6144, 0xB6 }, { 7680, 0xB7 }, + { 5120, 0xB8 }, { 6144, 0xB9 }, { 7168, 0xBA }, { 8192, 0xBB }, + { 9216, 0xBC }, { 10240, 0xBD }, { 12288, 0xBE }, { 15360, 0xBF }, +}; + enum imx_i2c_type { IMX1_I2C, IMX21_I2C, @@ -234,6 +273,24 @@ static struct imx_i2c_hwdata vf610_i2c_hwdata = { }; +static struct imx_i2c_hwdata mul2_i2c_hwdata = { + .devtype= VF610_I2C, + .regshift = VF610_I2C_REGSHIFT, + .clk_div= mul2_i2c_clk_div, + .ndivs = ARRAY_SIZE(mul2_i2c_clk_div), + .i2sr_clr_opcode= I2SR_CLR_OPCODE_W1C, + .i2cr_ien_opcode= I2CR_IEN_OPCODE_0, +}; + +static struct imx_i2c_hwdata mul4_i2c_hwdata = { + .devtype= VF610_I2C, + .regshift = VF610_I2C_REGSHIFT, + .clk_div= mul4_i2c_clk_div, + .ndivs = ARRAY_SIZE(mul4_i2c_clk_div), + .i2sr_clr_opcode= I2SR_CLR_OPCODE_W1C, + .i2cr_ien_opcode= I2CR_IEN_OPCODE_0, +}; + static const struct platform_device_id imx_i2c_devtype[] = { { .name = "imx1-i2c", @@ -1058,6 +1115,7 @@ static int i2c_imx_probe(struct platform_device *pdev) void __iomem *base; int irq, ret; dma_addr_t phy_addr; + u32 mul_value; dev_dbg(&pdev->dev, "<%s>\n", __func__); @@ -1077,11 
+1135,20 @@ static int i2c_imx_probe(struct platform_device *pdev) if (!i2c_imx) return -ENOMEM; - if (of_id) + if (of_id) { i2c_imx->hwdata = of_id->data; - else + ret = of_property_read_u32(pdev->dev.of_nod
[PATCH 3/3] arm64: dts: fsl: ls1046a: Add mul-value property of the i2c controller nodes
According to LS1046A Reference Manual, for the i2c controller, you have up to three MUL options available for all divider values. Therefore, we need to determine which MUL to use in the device tree for driver use. The "mul-value" property provides which mul is used in our driver. Signed-off-by: Chuanhua Han --- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi index b0ef08b090dd..373310e4c0ea 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi @@ -385,6 +385,7 @@ dmas = <&edma0 1 39>, <&edma0 1 38>; dma-names = "tx", "rx"; + mul-value = <4>; status = "disabled"; }; @@ -395,6 +396,7 @@ reg = <0x0 0x219 0x0 0x1>; interrupts = ; clocks = <&clockgen 4 1>; + mul-value = <4>; status = "disabled"; }; @@ -405,6 +407,7 @@ reg = <0x0 0x21a 0x0 0x1>; interrupts = ; clocks = <&clockgen 4 1>; + mul-value = <4>; status = "disabled"; }; @@ -415,6 +418,7 @@ reg = <0x0 0x21b 0x0 0x1>; interrupts = ; clocks = <&clockgen 4 1>; + mul-value = <4>; status = "disabled"; }; -- 2.17.1
PROBLEM: Elan touchpad regression on Kernel 5.0.10
Hello, [1.] One line summary of the problem: Elan touchpad regression on Kernel 5.0.10 [2.] Full description of the problem/report: Elan touchpad does not work on 5.0.10 while working on 5.0.9 [3.] Keywords: elan_i2c_core elan i2c touchpad 5.0.10 [4.] Kernel information [4.1.] Kernel version: Linux version 5.0.10-arch1-1-ARCH (builduser@heftig-2592) (gcc version 8.3.0 (GCC)) #1 SMP PREEMPT Sat Apr 27 20:06:45 UTC 2019 [4.2.] Kernel .config file: I'm not sure, but I think it may be referring to https://git.archlinux.org/svntogit/packages.git/tree/trunk/config?h=packages/linux [5.] Most recent kernel version which did not have the bug: 5.0.9 [6.] Output of Oops.. message (if applicable) with symbolic information resolved (Not appliable) [7.] A small shell script or example program which triggers the problem: (Not appliable) [8.] Environment [8.1.] Software (add the output of the ver_linux script here) Linux sheltty 5.0.10-arch1-1-ARCH #1 SMP PREEMPT Sat Apr 27 20:06:45 UTC 2019 x86_64 GNU/Linux GNU C 8.3.0 GNU Make4.2.1 Binutils2.32 Util-linux 2.33.2 Mount 2.33.2 Module-init-tools 26 E2fsprogs 1.45.0 Jfsutils1.1.15 Reiserfsprogs 3.6.27 Xfsprogs4.20.0 PPP 2.4.7 Linux C Library 2.29 Dynamic linker (ldd)2.29 Linux C++ Library 6.0.25 Procps 3.3.15 Kbd 2.0.4 Console-tools 2.0.4 Sh-utils8.31 Udev242 Modules Loaded 8021q 8250_dw ac ac97_bus acpi_thermal_rel aesni_intel aes_x86_64 agpgart ahci arc4 atkbd battery bbswitch bluetooth btbcm btintel btrtl btusb cfg80211 coretemp crc16 crc32c_generic crc32c_intel crc32_pclmul crct10dif_pclmul cryptd crypto_simd crypto_user drm drm_kms_helper ecdh_generic elan_i2c evdev ext4 fat fb_sys_fops fscrypto garp ghash_clmulni_intel glue_helper hid hid_generic i2c_algo_bit i2c_hid i2c_i801 i8042 i915 idma64 input_leds int3400_thermal int3403_thermal int340x_thermal_zone intel_cstate intel_gtt intel_lpss intel_lpss_pci intel_pch_thermal intel_powerclamp intel_rapl intel_rapl_perf intel_soc_dts_iosf intel_uncore intel_wmi_thunderbolt 
ip_tables irqbypass iTCO_vendor_support iTCO_wdt jbd2 joydev kvm kvmgt kvm_intel ledtrig_audio libahci libata libphy libps2 llc mac80211 mac_hid mbcache mdev media mei mei_me mousedev mrp nls_cp437 nls_iso8859_1 pcc_cpufreq processor_thermal_device r8169 r8822be realtek rfkill rng_core scsi_mod serio serio_raw snd snd_compress snd_hda_codec snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_core snd_hda_ext_core snd_hda_intel snd_hwdep snd_pcm snd_pcm_dmaengine snd_soc_acpi snd_soc_acpi_intel_match snd_soc_core snd_soc_hdac_hda snd_soc_skl snd_soc_skl_ipc snd_soc_sst_dsp snd_soc_sst_ipc snd_timer soundcore stp syscopyarea sysfillrect sysimgblt tpm tpm_crb tpm_tis tpm_tis_core typec typec_ucsi ucsi_acpi usbhid uvcvideo vfat vfio vfio_iommu_type1 vfio_mdev videobuf2_common videobuf2_memops videobuf2_v4l2 videobuf2_vmalloc videodev wmi wmi_bmof x86_pkg_temp_thermal xhci_hcd xhci_pci x_tables [8.2.] Processor information (from /proc/cpuinfo): (Maybe not appliable) [8.3.] Module information (from /proc/modules): (Parts related to i2c and elan:) i2c_algo_bit 16384 1 i915, Live 0x i2c_hid 32768 0 - Live 0x hid 147456 3 hid_generic,usbhid,i2c_hid, Live 0x elan_i2c 49152 0 - Live 0x i2c_i801 36864 0 - Live 0x [8.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem) /proc/ioports: - : PCI Bus :00 - : dma1 - : pic1 - : iTCO_wdt - : timer0 - : timer1 - : keyboard - : PNP0C09:00 - : EC data - : keyboard - : PNP0C09:00 - : EC cmd - : rtc0 - : dma page reg - : pic2 - : dma2 - : fpu - : PNP0C04:00 - : iTCO_wdt - : pnp 00:02 - : PCI conf1 - : PCI Bus :00 - : pnp 00:02 - : pnp 00:00 - : ACPI PM1a_EVT_BLK - : ACPI PM1a_CNT_BLK - : ACPI PM_TMR - : ACPI CPU throttle - : ACPI PM2_CNT_BLK - : pnp 00:04 - : ACPI GPE0_BLK - : pnp 00:01 - : PCI Bus :08 - : :08:00.0 - : PCI Bus :07 - : :07:00.0 - : r8822be - : PCI Bus :01 - : :01:00.0 - : :00:02.0 - : :00:1f.4 - : i801_smbus - : :00:17.0 - : ahci - : :00:17.0 - : ahci - : :00:17.0 - : ahci [8.5.] 
PCI information It seems to be long (
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
On Mon, Apr 29, 2019 at 10:18:04PM -0600, Andreas Dilger wrote: > > > > void*i_private; /* fs or device private pointer */ > > + void (*free_inode)(struct inode *); > > It seems like a waste to increase the size of every struct inode just to > access > a static pointer. Is this the only place that ->free_inode() is called? Why > not move the ->free_inode() pointer into inode->i_fop->free_inode() so that it > is still directly accessible at this point. i_op, surely? In any case, increasing sizeof(struct inode) is not a problem - if anything, I'd turn ->i_fop into an anon union with that. As in, diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index 9d80f9e0855e..b8d3ddd8b8db 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -655,3 +655,11 @@ in your dentry operations instead. * if ->free_inode() is non-NULL, it gets scheduled by call_rcu() * combination of NULL ->destroy_inode and NULL ->free_inode is treated as NULL/free_inode_nonrcu, to preserve the compatibility. + + Note that the callback (be it via ->free_inode() or explicit call_rcu() + in ->destroy_inode()) is *NOT* ordered wrt superblock destruction; + as the matter of fact, the superblock and all associated structures + might be already gone. The filesystem driver is guaranteed to be still + there, but that's it. Freeing memory in the callback is fine; doing + more than that is possible, but requires a lot of care and is best + avoided. 
diff --git a/fs/inode.c b/fs/inode.c index fb45590d284e..627e1766503a 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -211,8 +211,8 @@ EXPORT_SYMBOL(free_inode_nonrcu); static void i_callback(struct rcu_head *head) { struct inode *inode = container_of(head, struct inode, i_rcu); - if (inode->i_sb->s_op->free_inode) - inode->i_sb->s_op->free_inode(inode); + if (inode->free_inode) + inode->free_inode(inode); else free_inode_nonrcu(inode); } @@ -236,6 +236,7 @@ static struct inode *alloc_inode(struct super_block *sb) if (!ops->free_inode) return NULL; } + inode->free_inode = ops->free_inode; i_callback(&inode->i_rcu); return NULL; } @@ -276,6 +277,7 @@ static void destroy_inode(struct inode *inode) if (!ops->free_inode) return; } + inode->free_inode = ops->free_inode; call_rcu(&inode->i_rcu, i_callback); } diff --git a/include/linux/fs.h b/include/linux/fs.h index 2e9b9f87caca..92732286b748 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -694,7 +694,10 @@ struct inode { #ifdef CONFIG_IMA atomic_ti_readcount; /* struct files open RO */ #endif - const struct file_operations*i_fop; /* former ->i_op->default_file_ops */ + union { + const struct file_operations*i_fop; /* former ->i_op->default_file_ops */ + void (*free_inode)(struct inode *); + }; struct file_lock_context*i_flctx; struct address_spacei_data; struct list_headi_devices;
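The ordering problem behind this change can be modelled in userspace roughly as below. This is only a sketch under simplifying assumptions: the structures are heavily reduced stand-ins, malloc()/free() replace the kernel allocators, a direct call stands in for the RCU callback, and every demo_* name is made up. What it shows is the pattern of copying the callback into the inode while ->i_sb is still valid, so that the deferred callback never has to dereference the superblock.

```c
#include <stddef.h>
#include <stdlib.h>

struct inode;

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct super_operations {
    void (*free_inode)(struct inode *);
};

struct super_block {
    const struct super_operations *s_op;
};

struct inode {
    struct super_block *i_sb;
    void (*free_inode)(struct inode *); /* copy taken before deferral */
};

static int freed_count; /* counts callback invocations, for the demo */

static void demo_free_inode(struct inode *inode)
{
    freed_count++;
    free(inode);
}

static const struct super_operations demo_ops = {
    .free_inode = demo_free_inode,
};

/* Runs "later": must not touch inode->i_sb, it may already be gone. */
static void deferred_callback(struct inode *inode)
{
    if (inode->free_inode)
        inode->free_inode(inode);
    else
        free(inode);
}

/* Returns how many times the filesystem callback ran, or -1 on OOM. */
int demo_destroy_after_sb_gone(void)
{
    struct super_block *sb = malloc(sizeof(*sb));
    struct inode *inode = malloc(sizeof(*inode));

    if (!sb || !inode) {
        free(sb);
        free(inode);
        return -1;
    }
    sb->s_op = &demo_ops;
    inode->i_sb = sb;

    /* destroy_inode(): remember the callback while i_sb is still valid */
    inode->free_inode = inode->i_sb->s_op->free_inode;

    /* the superblock goes away before the deferred callback runs */
    free(sb);
    inode->i_sb = NULL;

    freed_count = 0;
    deferred_callback(inode);
    return freed_count;
}
```

The operations table itself is static in the sketch, mirroring the point that the filesystem driver is still around when the callback runs even though the superblock is not.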
Re: [PATCH RESEND] sched/cpufreq: Fix kobject memleak
* Tobin C. Harding wrote: > Currently error return from kobject_init_and_add() is not followed by a > call to kobject_put(). This means there is a memory leak. > > Add call to kobject_put() in error path of kobject_init_and_add(). > > Signed-off-by: Tobin C. Harding > --- > > Resend with SOB tag. Please ignore my previous mail :-) Thanks, Ingo
Re: [PATCH] sched/cpufreq: Fix kobject memleak
* Tobin C. Harding wrote: > Currently error return from kobject_init_and_add() is not followed by a > call to kobject_put(). This means there is a memory leak. > > Add call to kobject_put() in error path of kobject_init_and_add(). > --- > kernel/sched/cpufreq_schedutil.c | 1 + > 1 file changed, 1 insertion(+) I've added your: Signed-off-by: Tobin C. Harding Which I suppose you intended to include? Thanks, Ingo
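The leak described in the patch can be illustrated with a small userspace mock. None of this is kernel code: the mock_* names are hypothetical, and the only property modelled is that the init-and-add step takes the initial reference even when it fails, so the error path has to drop that reference for the release hook to run.

```c
#include <stdlib.h>

/* Hypothetical userspace mock of a refcounted object with a release hook. */
struct mock_kobj {
    int refcount;
    void (*release)(struct mock_kobj *);
};

static int released; /* set by the release hook, for the demo */

static void mock_release(struct mock_kobj *k)
{
    released = 1;
    free(k);
}

/* Like kobject_init_and_add(): takes the initial reference, then may fail. */
static int mock_init_and_add(struct mock_kobj *k, int fail)
{
    k->refcount = 1;
    k->release = mock_release;
    return fail ? -1 : 0;
}

/* Like kobject_put(): drops a reference and releases on the last one. */
static void mock_put(struct mock_kobj *k)
{
    if (--k->refcount == 0)
        k->release(k);
}

/* Returns 1 if the error path released the object, 0 if it leaked. */
int demo_error_path(int call_put)
{
    struct mock_kobj *k = malloc(sizeof(*k));
    int was_released;

    if (!k)
        return -1;
    released = 0;
    if (mock_init_and_add(k, 1) != 0 && call_put)
        mock_put(k);            /* the fix: drop the initial reference */

    was_released = released;
    if (!was_released)
        free(k);                /* demo-only cleanup of the leaked object */
    return was_released;
}
```

Calling demo_error_path(0) models the code before the patch (the object is never released), and demo_error_path(1) models the fixed error path.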
Re: [PATCH 1/2] RISC-V: Add DT documentation for SiFive L2 Cache Controller
On Fri, Apr 26, 2019 at 3:04 PM Sudeep Holla wrote: > > On Fri, Apr 26, 2019 at 11:20:17AM +0530, Yash Shah wrote: > > On Thu, Apr 25, 2019 at 3:43 PM Sudeep Holla wrote: > > > > > > On Thu, Apr 25, 2019 at 11:24:55AM +0530, Yash Shah wrote: > > > > Add device tree bindings for SiFive FU540 L2 cache controller driver > > > > > > > > Signed-off-by: Yash Shah > > > > --- > > > > .../devicetree/bindings/riscv/sifive-l2-cache.txt | 53 > > > > ++ > > > > 1 file changed, 53 insertions(+) > > > > create mode 100644 > > > > Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt > > > > > > > > diff --git > > > > a/Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt > > > > b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt > > > > new file mode 100644 > > > > index 000..15132e2 > > > > --- /dev/null > > > > +++ b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt > > > > @@ -0,0 +1,53 @@ > > > > +SiFive L2 Cache Controller > > > > +-- > > > > +The SiFive Level 2 Cache Controller is used to provide access to fast > > > > copies > > > > +of memory for masters in a Core Complex. The Level 2 Cache Controller > > > > also > > > > +acts as directory-based coherency manager. 
> > > > + > > > > +Required Properties: > > > > + > > > > +- compatible: Should be "sifive,fu540-c000-ccache" > > > > + > > > > +- cache-block-size: Specifies the block size in bytes of the cache > > > > + > > > > +- cache-level: Should be set to 2 for a level 2 cache > > > > + > > > > +- cache-sets: Specifies the number of associativity sets of the cache > > > > + > > > > +- cache-size: Specifies the size in bytes of the cache > > > > + > > > > +- cache-unified: Specifies the cache is a unified cache > > > > + > > > > +- interrupt-parent: Must be core interrupt controller > > > > + > > > > +- interrupts: Must contain 3 entries (DirError, DataError and DataFail > > > > signals) > > > > + > > > > +- reg: Physical base address and size of L2 cache controller registers > > > > map > > > > + > > > > +- reg-names: Should be "control" > > > > + > > > > > > It would be good if you mark the properties that are present in DT > > > specification and those that are added for sifive,fu540-c000-ccache > > > > I believe there isn't any property which is added explicitly for > > sifive,fu540-c000-ccache. > > > > reg and interrupts are generally optional for normal cache and may be > required for cache controller like this. DT specification[1] covers > only caches and not cache controllers. 
Are you suggesting something like this:

Required Properties:

Standard Properties:
- compatible: Should be "sifive,-ccache"
  Supported compatible strings are:
  "sifive,fu540-c000-ccache" and "sifive,fu740-c000-ccache"
- cache-block-size: Specifies the block size in bytes of the cache
- cache-level: Should be set to 2 for a level 2 cache
- cache-sets: Specifies the number of associativity sets of the cache
- cache-size: Specifies the size in bytes of the cache
- cache-unified: Specifies the cache is a unified cache

Non-Standard Properties:
- interrupt-parent: Must be core interrupt controller
- interrupts: Must contain 3 entries for FU540 (DirError, DataError and
  DataFail signals) or 4 entries for other chips (DirError, DirFail,
  DataError, DataFail signals)
- reg: Physical base address and size of L2 cache controller registers map
- reg-names: Should be "control"

- Yash

>
> --
> Regards,
> Sudeep
>
> [1]
> https://github.com/devicetree-org/devicetree-specification/releases/download/v0.2/devicetree-specification-v0.2.pdf
Re: [PATCH v4 1/7] ocxl: Split pci.c
On 27/3/19 4:31 pm, Alastair D'Silva wrote: From: Alastair D'Silva In preparation for making core code available for external drivers, move the core code out of pci.c and into core.c Signed-off-by: Alastair D'Silva There doesn't seem to be much left in pci.c, is there? Acked-by: Andrew Donnellan --- drivers/misc/ocxl/Makefile| 1 + drivers/misc/ocxl/core.c | 517 + drivers/misc/ocxl/ocxl_internal.h | 5 + drivers/misc/ocxl/pci.c | 519 +- 4 files changed, 524 insertions(+), 518 deletions(-) create mode 100644 drivers/misc/ocxl/core.c diff --git a/drivers/misc/ocxl/Makefile b/drivers/misc/ocxl/Makefile index 5229dcda8297..bc4e39bfda7b 100644 --- a/drivers/misc/ocxl/Makefile +++ b/drivers/misc/ocxl/Makefile @@ -3,6 +3,7 @@ ccflags-$(CONFIG_PPC_WERROR)+= -Werror ocxl-y+= main.o pci.o config.o file.o pasid.o ocxl-y+= link.o context.o afu_irq.o sysfs.o trace.o +ocxl-y += core.o obj-$(CONFIG_OCXL)+= ocxl.o # For tracepoints to include our trace.h from tracepoint infrastructure: diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c new file mode 100644 index ..1a4411b72d35 --- /dev/null +++ b/drivers/misc/ocxl/core.c @@ -0,0 +1,517 @@ +// SPDX-License-Identifier: GPL-2.0+ +// Copyright 2019 IBM Corp. +#include +#include "ocxl_internal.h" + +static struct ocxl_fn *ocxl_fn_get(struct ocxl_fn *fn) +{ + return (get_device(&fn->dev) == NULL) ? NULL : fn; +} + +static void ocxl_fn_put(struct ocxl_fn *fn) +{ + put_device(&fn->dev); +} + +struct ocxl_afu *ocxl_afu_get(struct ocxl_afu *afu) +{ + return (get_device(&afu->dev) == NULL) ? 
NULL : afu; +} + +void ocxl_afu_put(struct ocxl_afu *afu) +{ + put_device(&afu->dev); +} + +static struct ocxl_afu *alloc_afu(struct ocxl_fn *fn) +{ + struct ocxl_afu *afu; + + afu = kzalloc(sizeof(struct ocxl_afu), GFP_KERNEL); + if (!afu) + return NULL; + + mutex_init(&afu->contexts_lock); + mutex_init(&afu->afu_control_lock); + idr_init(&afu->contexts_idr); + afu->fn = fn; + ocxl_fn_get(fn); + return afu; +} + +static void free_afu(struct ocxl_afu *afu) +{ + idr_destroy(&afu->contexts_idr); + ocxl_fn_put(afu->fn); + kfree(afu); +} + +static void free_afu_dev(struct device *dev) +{ + struct ocxl_afu *afu = to_ocxl_afu(dev); + + ocxl_unregister_afu(afu); + free_afu(afu); +} + +static int set_afu_device(struct ocxl_afu *afu, const char *location) +{ + struct ocxl_fn *fn = afu->fn; + int rc; + + afu->dev.parent = &fn->dev; + afu->dev.release = free_afu_dev; + rc = dev_set_name(&afu->dev, "%s.%s.%hhu", afu->config.name, location, + afu->config.idx); + return rc; +} + +static int assign_afu_actag(struct ocxl_afu *afu, struct pci_dev *dev) +{ + struct ocxl_fn *fn = afu->fn; + int actag_count, actag_offset; + + /* +* if there were not enough actags for the function, each afu +* reduces its count as well +*/ + actag_count = afu->config.actag_supported * + fn->actag_enabled / fn->actag_supported; + actag_offset = ocxl_actag_afu_alloc(fn, actag_count); + if (actag_offset < 0) { + dev_err(&afu->dev, "Can't allocate %d actags for AFU: %d\n", + actag_count, actag_offset); + return actag_offset; + } + afu->actag_base = fn->actag_base + actag_offset; + afu->actag_enabled = actag_count; + + ocxl_config_set_afu_actag(dev, afu->config.dvsec_afu_control_pos, + afu->actag_base, afu->actag_enabled); + dev_dbg(&afu->dev, "actag base=%d enabled=%d\n", + afu->actag_base, afu->actag_enabled); + return 0; +} + +static void reclaim_afu_actag(struct ocxl_afu *afu) +{ + struct ocxl_fn *fn = afu->fn; + int start_offset, size; + + start_offset = afu->actag_base - fn->actag_base; + size = 
afu->actag_enabled; + ocxl_actag_afu_free(afu->fn, start_offset, size); +} + +static int assign_afu_pasid(struct ocxl_afu *afu, struct pci_dev *dev) +{ + struct ocxl_fn *fn = afu->fn; + int pasid_count, pasid_offset; + + /* +* We only support the case where the function configuration +* requested enough PASIDs to cover all AFUs. +*/ + pasid_count = 1 << afu->config.pasid_supported_log; + pasid_offset = ocxl_pasid_afu_alloc(fn, pasid_count); + if (pasid_offset < 0) { + dev_err(&afu->dev, "Can't allocate %d PASIDs for AFU: %d\n", + pasid_count, pasid_offset); + return pasid_offset; + } + afu->pasid_base = fn->pasid_base + pasid_offset; + afu->pasid_count = 0; + afu->pasid_max = pas
Re: [PATCH V2] staging: fieldbus: anybus-s: force endiannes annotation
On Tue, Apr 30, 2019 at 05:33:10AM +0200, Nicholas Mc Guire wrote:

> ok - my bad then - I had assumed that using __force is reasonable
> if the handling is correct and it's a localized conversion only
> like var = be16_to_cpu(var) which evaded introducing additional
> variables just to have different types but no different function.

If compiler can't recognize that in

	T1 v1;
	T2 v2;
	code using v1, but not v2
	v2 = f(v1);
	code using v2, but not v1

it can use the same memory for v1 and v2, file a bug against the
compiler. Or stop using that toy altogether - that kind of optimization
is early 60s stuff and any real compiler will handle that. Both gcc and
clang certainly do handle that.

Another thing they handle is figuring out that be16_to_cpu() et.al. are
pure functions, so

	f(be16_to_cpu(n));
	no modifications of n
	g(be16_to_cpu(n));

doesn't need to have be16_to_cpu recalculated. IOW, that particular code
could as well have been

	dev_info(dev, "Fieldbus type: %04X", be16_to_cpu(fieldbus_type));
	...
	cd->client->fieldbus_type = be16_to_cpu(fieldbus_type);
	...

not that there's much sense keeping ->fieldbus_type in host-endian,
while we are at it.
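A userspace sketch of the "separate variables, no cast" style argued for above might look like this. The be16_t type and both demo_* functions are hypothetical stand-ins for the kernel's __be16 and be16_to_cpu(); the point is only that the wire-format value and the host-order value get distinct types, and the conversion is a pure function the compiler is free to evaluate once.

```c
#include <stdint.h>

/* Hypothetical stand-in for __be16: a 16-bit value stored big-endian. */
typedef struct {
    uint8_t b[2];
} be16_t;

/* Pure function: the result depends only on the argument. */
static inline uint16_t demo_be16_to_cpu(be16_t v)
{
    return (uint16_t)((v.b[0] << 8) | v.b[1]);
}

/* Separate variables for the wire value and the host value; no cast needed. */
uint16_t demo_fieldbus_type(const uint8_t raw[2])
{
    be16_t wire = { { raw[0], raw[1] } };   /* as read from the device */
    uint16_t host = demo_be16_to_cpu(wire); /* converted once, host order */

    return host;
}
```

Because wire and host have different types, mixing them up is a compile-time type error here, which is roughly the job the sparse annotations do for the kernel types.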
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
On Apr 29, 2019, at 9:09 PM, Al Viro wrote: > > On Tue, Apr 16, 2019 at 11:01:16AM -0700, Linus Torvalds wrote: >> >> I only skimmed through the actual filesystem (and one networking) >> patches, but they looked like trivial conversions to a better >> interface. > > ... except that this callback can (and always could) get executed after > freeing struct super_block. So we can't just dereference ->i_sb->s_op > and expect to survive; the table ->s_op pointed to will still be there, > but ->i_sb might very well have been freed, with all its contents overwritten. > We need to copy the callback into struct inode itself, unfortunately. > The following incremental fixes it; I'm going to fold it into the first > commit in there. > > diff --git a/fs/inode.c b/fs/inode.c > index fb45590d284e..855dad43b11d 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -164,6 +164,7 @@ int inode_init_always(struct super_block *sb, struct > inode *inode) > inode->i_wb_frn_avg_time = 0; > inode->i_wb_frn_history = 0; > #endif > + inode->free_inode = sb->s_op->free_inode; > > if (security_inode_alloc(inode)) > goto out; > @@ -211,8 +212,8 @@ EXPORT_SYMBOL(free_inode_nonrcu); > static void i_callback(struct rcu_head *head) > { > struct inode *inode = container_of(head, struct inode, i_rcu); > - if (inode->i_sb->s_op->free_inode) > - inode->i_sb->s_op->free_inode(inode); > + if (inode->free_inode) > + inode->free_inode(inode); > else > free_inode_nonrcu(inode); > } > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 2e9b9f87caca..5ed6b39e588e 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -718,6 +718,7 @@ struct inode { > #endif > > void*i_private; /* fs or device private pointer */ > + void (*free_inode)(struct inode *); It seems like a waste to increase the size of every struct inode just to access a static pointer. Is this the only place that ->free_inode() is called? 
Why not move the ->free_inode() pointer into inode->i_fop->free_inode() so that it is still directly accessible at this point. Cheers, Andreas
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
On Mon, Apr 29, 2019 at 08:37:29PM -0700, Linus Torvalds wrote: > On Mon, Apr 29, 2019, 20:09 Al Viro wrote: > > > > > ... except that this callback can (and always could) get executed after > > freeing struct super_block. > > > > Ugh. > > That food looks nasty. Shouldn't the super block freeing wait for the > filesystem to be all done instead? Do a rcu synchronization or something? > > Adding that pointer looks really wrong to me. I'd much rather delay the sb > freeing. Is there some reason that can't be done that I'm missing? Where would you put that synchronize_rcu()? Doing that before ->put_super() is too early - inode references might be dropped in there. OTOH, doing that after that point means that while struct super_block itself will be there, any number of data structures hanging from it might be not. So we are still very limited in what we can do inside ->free_inode() instance *and* we get bunch of synchronize_rcu() for no good reason. Note that for normal lockless accesses (lockless ->d_revalidate(), ->d_hash(), etc.) we are just fine with having struct super_block freeing RCU-delayed (along with any data structures we might need) - the superblock had been seen at some point after we'd taken rcu_read_lock(), so its freeing won't happen until we drop it. So we don't need synchronize_rcu() for that. Here the problem is that we are dealing with another RCU callback; synchronize_rcu() would be needed for it, but it will only protect that intermediate dereference of ->i_sb; any rcu-delayed stuff scheduled from inside ->put_super() would not be ordered wrt ->free_inode(). And if we are doing that just for the sake of that one dereference, we might as well do it before scheduling i_callback(). PS: we *are* guaranteed that module will still be there (unregister_filesystem() does synchronize_rcu() and rcu_barrier() is done before kmem_cache_destroy() in assorted exit_foo_fs()).
linux-next: manual merge of the mlx5-next tree with the rdma tree
Hi Leon, Today's linux-next merge of the mlx5-next tree got a conflict in: drivers/infiniband/hw/mlx5/main.c between commit: 35b0aa67b298 ("RDMA/mlx5: Refactor netdev affinity code") from the rdma tree and commit: c42260f19545 ("net/mlx5: Separate and generalize dma device from pci device") from the mlx5-next tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc drivers/infiniband/hw/mlx5/main.c index 6135a0b285de,fae6a6a1fbea.. --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@@ -200,12 -172,18 +200,12 @@@ static int mlx5_netdev_event(struct not switch (event) { case NETDEV_REGISTER: + /* Should already be registered during the load */ + if (ibdev->is_rep) + break; write_lock(&roce->netdev_lock); - if (ndev->dev.parent == &mdev->pdev->dev) - if (ibdev->rep) { - struct mlx5_eswitch *esw = ibdev->mdev->priv.eswitch; - struct net_device *rep_ndev; - - rep_ndev = mlx5_ib_get_rep_netdev(esw, -ibdev->rep->vport); - if (rep_ndev == ndev) - roce->netdev = ndev; - } else if (ndev->dev.parent == mdev->device) { ++ if (ndev->dev.parent == mdev->device) roce->netdev = ndev; - } write_unlock(&roce->netdev_lock); break;
linux-next: build warning after merge of the thermal tree
Hi Zhang, After merging the thermal tree, today's linux-next build (arm multi_v7_defconfig) produced this warning: boolean symbol THERMAL tested for 'm'? test forced to 'n' Introduced by commit be33e4fbbea5 ("thermal/drivers/core: Remove the module Kconfig's option") There is a test for =m in drivers/net/ethernet/mellanox/mlxsw/Kconfig. -- Cheers, Stephen Rothwell
[PATCH v6 0/4] x86: Add the support of ACRN guest under x86
ACRN is a flexible, lightweight reference hypervisor, built with real-time and safety-criticality in mind, optimized to streamline embedded development through an open source platform. It is built for embedded IOT with small footprint and real-time features. More details can be found in https://projectacrn.org/ This is the patch set that allows the Linux to work on ACRN hypervisor and it can work with the following patch set to manage the Linux guest on ACRN hypervisor. It includes the detection of ACRN hypervisor, upcall notification vector from hypervisor, hypercall. The hypervisor detection is similar to Xen/VMWARE/Hyperv. ACRN also uses the upcall notification mechanism similar to that in Xen/Microsoft HyperV when it needs to send the notification to Linux guest. The hypercall provides the mechanism that can be used to query/configure the ACRN hypervisor by Linux guest. Following this patch set, we will send acrn driver part, which provides the interface that can be used to manage the virtualized CPU/memory/device/interrupt for other guest OS after the ACRN hypervisor is detected. v1->v2: Change the CONFIG_ACRN to CONFIG_ACRN_GUEST, which makes it easy to understand. Remove the export of x86_hyper_acrn. Remove the unused API definition of acrn_setup_intr_handler and acrn_remove_intr_handler. Adjust the order of header file Add the declaration of acrn_hv_vector_handler and tracing definition of acrn_hv_callback_vector. 
Refine the comments for the function of acrn_hypercall0/1/2 v2-v3: Add one new config symbol to unify the conditional definition of hv_irq_callback_count Use the "vmcall" mnemonic to replace the hard-code byte definition Remove the unnecessary dependency of CONFIG_PARAVIRT for ACRN_GUEST v3-v4: Rename the file name of acrnhyper.h to acrn.h Refine the commit log and some other minor changes(more comments and redundant ifdef in acrn.h, sorting the header file in acrn.c) v4->v5: Minor changes of comments/commit log in patch 04 Use _ASM_X86_ACRN_HYPERCALL_H instead of _ASM_X86_ACRNHYPERCALL_H. Use the "VMCALL" mnemonic in comment/commit log. Uppercase r8/rdi/rsi/rax for hypercall parameter register in comment. v5->v6: Remove the explicit register variable for inline assembly Add the "extern" for the function declaration in acrn.h Add comments about acking ACPI EOI in acrn_hv_callback_handler Minor changes for comments/commit log in patch 03/04 Zhao Yakui (4): x86/Kconfig: Add new config symbol to unify conditional definition of hv_irq_callback_count x86: Add the support of Linux guest on ACRN hypervisor x86/acrn: Use HYPERVISOR_CALLBACK_VECTOR for ACRN guest upcall vector x86/acrn: Add hypercall for ACRN guest arch/x86/Kconfig | 16 +++ arch/x86/entry/entry_64.S | 5 +++ arch/x86/include/asm/acrn.h | 11 + arch/x86/include/asm/acrn_hypercall.h | 84 +++ arch/x86/include/asm/hardirq.h| 2 +- arch/x86/include/asm/hypervisor.h | 1 + arch/x86/kernel/cpu/Makefile | 1 + arch/x86/kernel/cpu/acrn.c| 68 arch/x86/kernel/cpu/hypervisor.c | 4 ++ arch/x86/kernel/irq.c | 2 +- arch/x86/xen/Kconfig | 1 + drivers/hv/Kconfig| 1 + 12 files changed, 194 insertions(+), 2 deletions(-) create mode 100644 arch/x86/include/asm/acrn.h create mode 100644 arch/x86/include/asm/acrn_hypercall.h create mode 100644 arch/x86/kernel/cpu/acrn.c -- 2.7.4
[PATCH v6 1/4] x86/Kconfig: Add new config symbol to unify conditional definition of hv_irq_callback_count
Add a special Kconfig symbol X86_HV_CALLBACK_VECTOR so that the guests using the hypervisor interrupt callback counter can select and thus enable that counter. Select it when xen or hyperv support is enabled. No functional changes. Signed-off-by: Zhao Yakui Reviewed-by: Borislav Petkov Reviewed-by: Thomas Gleixner --- v3->v4: Follow the comments to refine the commit log. --- arch/x86/Kconfig | 3 +++ arch/x86/include/asm/hardirq.h | 2 +- arch/x86/kernel/irq.c | 2 +- arch/x86/xen/Kconfig | 1 + drivers/hv/Kconfig | 1 + 5 files changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 62fc3fd..2fc9297 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -791,6 +791,9 @@ config QUEUED_LOCK_STAT behavior of paravirtualized queued spinlocks and report them on debugfs. +config X86_HV_CALLBACK_VECTOR + def_bool n + source "arch/x86/xen/Kconfig" config KVM_GUEST diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h index d9069bb..0753379 100644 --- a/arch/x86/include/asm/hardirq.h +++ b/arch/x86/include/asm/hardirq.h @@ -37,7 +37,7 @@ typedef struct { #ifdef CONFIG_X86_MCE_AMD unsigned int irq_deferred_error_count; #endif -#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN) +#ifdef CONFIG_X86_HV_CALLBACK_VECTOR unsigned int irq_hv_callback_count; #endif #if IS_ENABLED(CONFIG_HYPERV) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 59b5f2e..a147826 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -134,7 +134,7 @@ int arch_show_interrupts(struct seq_file *p, int prec) seq_printf(p, "%10u ", per_cpu(mce_poll_count, j)); seq_puts(p, " Machine check polls\n"); #endif -#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN) +#ifdef CONFIG_X86_HV_CALLBACK_VECTOR if (test_bit(HYPERVISOR_CALLBACK_VECTOR, system_vectors)) { seq_printf(p, "%*s: ", prec, "HYP"); for_each_online_cpu(j) diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig index e07abef..ba5a418 100644 --- 
a/arch/x86/xen/Kconfig +++ b/arch/x86/xen/Kconfig @@ -7,6 +7,7 @@ config XEN bool "Xen guest support" depends on PARAVIRT select PARAVIRT_CLOCK + select X86_HV_CALLBACK_VECTOR depends on X86_64 || (X86_32 && X86_PAE) depends on X86_LOCAL_APIC && X86_TSC help diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig index 1c1a251..cafcb97 100644 --- a/drivers/hv/Kconfig +++ b/drivers/hv/Kconfig @@ -6,6 +6,7 @@ config HYPERV tristate "Microsoft Hyper-V client drivers" depends on X86 && ACPI && X86_LOCAL_APIC && HYPERVISOR_GUEST select PARAVIRT + select X86_HV_CALLBACK_VECTOR help Select this option to run Linux as a Hyper-V client operating system. -- 2.7.4
[PATCH v6 4/4] x86/acrn: Add hypercall for ACRN guest
When the ACRN hypervisor is detected, the hypercall is needed so that the ACRN guest can query/config some settings. For example: it can be used to query the resources in hypervisor and manage the CPU/memory/device/ interrupt for guest operating system. Add the hypercall so that the ACRN guest can communicate with the low-level ACRN hypervisor. On x86 it is implemented with the VMCALL instruction. Co-developed-by: Jason Chen CJ Signed-off-by: Jason Chen CJ Signed-off-by: Zhao Yakui Reviewed-by: Thomas Gleixner --- V1->V2: Refine the comments for the function of acrn_hypercall0/1/2 v2->v3: Use the "vmcall" mnemonic to replace hard-code byte definition v4->v5: Use _ASM_X86_ACRN_HYPERCALL_H instead of _ASM_X86_ACRNHYPERCALL_H. Use the "VMCALL" mnemonic in comment/commit log. Uppercase r8/rdi/rsi/rax for hypercall parameter register in comment. v5->v6: Remove explicit local register variable for inline assembly --- arch/x86/include/asm/acrn_hypercall.h | 84 +++ 1 file changed, 84 insertions(+) create mode 100644 arch/x86/include/asm/acrn_hypercall.h diff --git a/arch/x86/include/asm/acrn_hypercall.h b/arch/x86/include/asm/acrn_hypercall.h new file mode 100644 index 000..5cb438e --- /dev/null +++ b/arch/x86/include/asm/acrn_hypercall.h @@ -0,0 +1,84 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _ASM_X86_ACRN_HYPERCALL_H +#define _ASM_X86_ACRN_HYPERCALL_H + +#include + +#ifdef CONFIG_ACRN_GUEST + +/* + * Hypercalls for ACRN guest + * + * Hypercall number is passed in R8 register. + * Up to 2 arguments are passed in RDI, RSI. + * Return value will be placed in RAX. + */ + +static inline long acrn_hypercall0(unsigned long hcall_id) +{ + long result; + + /* the hypercall is implemented with the VMCALL instruction. +* volatile qualifier is added to avoid that it is dropped +* because of compiler optimization. 
+*/ + asm volatile("movq %[hcall_id], %%r8\n\t" +"vmcall\n\t" +: "=a" (result) +: [hcall_id] "g" (hcall_id) +: "r8"); + + return result; +} + +static inline long acrn_hypercall1(unsigned long hcall_id, + unsigned long param1) +{ + long result; + + asm volatile("movq %[hcall_id], %%r8\n\t" +"vmcall\n\t" +: "=a" (result) +: [hcall_id] "g" (hcall_id), "D" (param1) +: "r8"); + + return result; +} + +static inline long acrn_hypercall2(unsigned long hcall_id, + unsigned long param1, + unsigned long param2) +{ + long result; + + asm volatile("movq %[hcall_id], %%r8\n\t" +"vmcall\n\t" +: "=a" (result) +: [hcall_id] "g" (hcall_id), "D" (param1), "S" (param2) +: "r8"); + + return result; +} + +#else + +static inline long acrn_hypercall0(unsigned long hcall_id) +{ + return -ENOTSUPP; +} + +static inline long acrn_hypercall1(unsigned long hcall_id, + unsigned long param1) +{ + return -ENOTSUPP; +} + +static inline long acrn_hypercall2(unsigned long hcall_id, + unsigned long param1, + unsigned long param2) +{ + return -ENOTSUPP; +} +#endif /* CONFIG_ACRN_GUEST */ +#endif /* _ASM_X86_ACRN_HYPERCALL_H */ -- 2.7.4
[PATCH v6 2/4] x86: Add the support of Linux guest on ACRN hypervisor
ACRN is an open-source hypervisor maintained by Linux Foundation. It is built for embedded IOT with small footprint and real-time features. Add the ACRN guest support so that it allows linux to be booted under the ACRN hypervisor. Following this patch it will setup the upcall notification vector, enable hypercall and provide the interface that is used to manage the virtualized CPU/memory/device/interrupt for other guest OS. Co-developed-by: Jason Chen CJ Signed-off-by: Jason Chen CJ Signed-off-by: Zhao Yakui Reviewed-by: Thomas Gleixner --- v1->v2: Change the CONFIG_ACRN to CONFIG_ACRN_GUEST, which makes it easy to understand. Remove the export of x86_hyper_acrn. v2->v3: Remove the unnecessary dependency of PARAVIRT v3->v4: Refine the commit log and add more meaningful description in Kconfig v4->v5: No change v5->v6: No change --- arch/x86/Kconfig | 12 arch/x86/include/asm/hypervisor.h | 1 + arch/x86/kernel/cpu/Makefile | 1 + arch/x86/kernel/cpu/acrn.c| 39 +++ arch/x86/kernel/cpu/hypervisor.c | 4 5 files changed, 57 insertions(+) create mode 100644 arch/x86/kernel/cpu/acrn.c diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2fc9297..8dc4200 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -845,6 +845,18 @@ config JAILHOUSE_GUEST cell. You can leave this option disabled if you only want to start Jailhouse and run Linux afterwards in the root cell. +config ACRN_GUEST + bool "ACRN Guest support" + depends on X86_64 + help + This option allows to run Linux as guest in ACRN hypervisor. Enabling + this will allow the kernel to boot in virtualized environment under + the ACRN hypervisor. + ACRN is a flexible, lightweight reference open-source hypervisor, built + with real-time and safety-criticality in mind. It is built for embedded + IOT with small footprint and real-time features. 
More details can be + found in https://projectacrn.org/ + endif #HYPERVISOR_GUEST source "arch/x86/Kconfig.cpu" diff --git a/arch/x86/include/asm/hypervisor.h b/arch/x86/include/asm/hypervisor.h index 8c5aaba..50a30f6 100644 --- a/arch/x86/include/asm/hypervisor.h +++ b/arch/x86/include/asm/hypervisor.h @@ -29,6 +29,7 @@ enum x86_hypervisor_type { X86_HYPER_XEN_HVM, X86_HYPER_KVM, X86_HYPER_JAILHOUSE, + X86_HYPER_ACRN, }; #ifdef CONFIG_HYPERVISOR_GUEST diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile index cfd24f9..17a7cdf 100644 --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -44,6 +44,7 @@ obj-$(CONFIG_X86_CPU_RESCTRL) += resctrl/ obj-$(CONFIG_X86_LOCAL_APIC) += perfctr-watchdog.o obj-$(CONFIG_HYPERVISOR_GUEST) += vmware.o hypervisor.o mshyperv.o +obj-$(CONFIG_ACRN_GUEST) += acrn.o ifdef CONFIG_X86_FEATURE_NAMES quiet_cmd_mkcapflags = MKCAP $@ diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c new file mode 100644 index 000..f556640 --- /dev/null +++ b/arch/x86/kernel/cpu/acrn.c @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ACRN detection support + * + * Copyright (C) 2019 Intel Corporation. All rights reserved. + * + * Jason Chen CJ + * Zhao Yakui + * + */ + +#include + +static uint32_t __init acrn_detect(void) +{ + return hypervisor_cpuid_base("ACRNACRNACRN\0\0", 0); +} + +static void __init acrn_init_platform(void) +{ +} + +static bool acrn_x2apic_available(void) +{ + /* x2apic is not supported now. +* Later it needs to check the X86_FEATURE_X2APIC bit of cpu info +* returned by CPUID to determine whether the x2apic is +* supported in Linux guest. 
+*/ + return false; +} + +const __initconst struct hypervisor_x86 x86_hyper_acrn = { + .name = "ACRN", + .detect = acrn_detect, + .type = X86_HYPER_ACRN, + .init.init_platform = acrn_init_platform, + .init.x2apic_available = acrn_x2apic_available, +}; diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c index 479ca47..87e39ad 100644 --- a/arch/x86/kernel/cpu/hypervisor.c +++ b/arch/x86/kernel/cpu/hypervisor.c @@ -32,6 +32,7 @@ extern const struct hypervisor_x86 x86_hyper_xen_pv; extern const struct hypervisor_x86 x86_hyper_xen_hvm; extern const struct hypervisor_x86 x86_hyper_kvm; extern const struct hypervisor_x86 x86_hyper_jailhouse; +extern const struct hypervisor_x86 x86_hyper_acrn; static const __initconst struct hypervisor_x86 * const hypervisors[] = { @@ -49,6 +50,9 @@ static const __initconst struct hypervisor_x86 * const hypervisors[] = #ifdef CONFIG_JAILHOUSE_GUEST &x86_hyper_jailhouse, #endif +#ifdef CONFIG_ACRN_GUEST + &x86_hyper_acrn, +#endif }; enum x86_hypervisor_type x86_hyper_type; -- 2.7
[PATCH v6 3/4] x86/acrn: Use HYPERVISOR_CALLBACK_VECTOR for ACRN guest upcall vector
Linux kernel uses the HYPERVISOR_CALLBACK_VECTOR for hypervisor upcall vector. It is already used for Xen and HyperV. After the ACRN hypervisor is detected, it will also use this defined vector to notify the ACRN guest. Co-developed-by: Jason Chen CJ Signed-off-by: Jason Chen CJ Signed-off-by: Zhao Yakui Reviewed-by: Thomas Gleixner --- V1->V2: Remove the unused API definition of acrn_setup_intr_handler and acrn_remove_intr_handler. Adjust the order of header file Add the declaration of acrn_hv_vector_handler and tracing definition of acrn_hv_callback_vector. v2->v3: No change v3->v4: Refine the file name of acrnhyper.h to acrn.h v5->v6: Add the "extern" for the function declarations in header file Add some comments for calling entering_ack_irq Some other minor changes(unnecessary spliting two lines. and minor change in commit log) --- arch/x86/Kconfig| 1 + arch/x86/entry/entry_64.S | 5 + arch/x86/include/asm/acrn.h | 11 +++ arch/x86/kernel/cpu/acrn.c | 29 + 4 files changed, 46 insertions(+) create mode 100644 arch/x86/include/asm/acrn.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 8dc4200..d7a10f6 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -848,6 +848,7 @@ config JAILHOUSE_GUEST config ACRN_GUEST bool "ACRN Guest support" depends on X86_64 + select X86_HV_CALLBACK_VECTOR help This option allows to run Linux as guest in ACRN hypervisor. 
Enabling this will allow the kernel to boot in virtualized environment under diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 1f0efdb..d1b8ad3 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1129,6 +1129,11 @@ apicinterrupt3 HYPERV_STIMER0_VECTOR \ hv_stimer0_callback_vector hv_stimer0_vector_handler #endif /* CONFIG_HYPERV */ +#if IS_ENABLED(CONFIG_ACRN_GUEST) +apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \ + acrn_hv_callback_vector acrn_hv_vector_handler +#endif + idtentry debug do_debughas_error_code=0 paranoid=1 shift_ist=DEBUG_STACK idtentry int3 do_int3 has_error_code=0 idtentry stack_segment do_stack_segmenthas_error_code=1 diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h new file mode 100644 index 000..4adb13f --- /dev/null +++ b/arch/x86/include/asm/acrn.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_ACRN_H +#define _ASM_X86_ACRN_H + +extern void acrn_hv_callback_vector(void); +#ifdef CONFIG_TRACING +#define trace_acrn_hv_callback_vector acrn_hv_callback_vector +#endif + +extern void acrn_hv_vector_handler(struct pt_regs *regs); +#endif /* _ASM_X86_ACRN_H */ diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c index f556640..ce88d2d 100644 --- a/arch/x86/kernel/cpu/acrn.c +++ b/arch/x86/kernel/cpu/acrn.c @@ -9,7 +9,11 @@ * */ +#include +#include +#include #include +#include static uint32_t __init acrn_detect(void) { @@ -18,6 +22,8 @@ static uint32_t __init acrn_detect(void) static void __init acrn_init_platform(void) { + /* Setup the IDT for ACRN hypervisor callback */ + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, acrn_hv_callback_vector); } static bool acrn_x2apic_available(void) @@ -30,6 +36,29 @@ static bool acrn_x2apic_available(void) return false; } +static void (*acrn_intr_handler)(void); + +__visible void __irq_entry acrn_hv_vector_handler(struct pt_regs *regs) +{ + struct pt_regs *old_regs = set_irq_regs(regs); + + /* +* The 
hypervisor requires that the APIC EOI should be acked. +* If the APIC EOI is not acked, the APIC ISR bit for the +* HYPERVISOR_CALLBACK_VECTOR will not be cleared and then it +* will block the interrupt whose vector is lower than +* HYPERVISOR_CALLBACK_VECTOR. +*/ + entering_ack_irq(); + inc_irq_stat(irq_hv_callback_count); + + if (acrn_intr_handler) + acrn_intr_handler(); + + exiting_irq(); + set_irq_regs(old_regs); +} + const __initconst struct hypervisor_x86 x86_hyper_acrn = { .name = "ACRN", .detect = acrn_detect, -- 2.7.4
[PATCH] drivers: thermal: processor_thermal: Read PPCC on resume
Read PPCC power limits on system resume in case those limits changed while system was suspended. Signed-off-by: Srinivas Pandruvada --- .../int340x_thermal/processor_thermal_device.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c index 436c256f111d..acb22157b9ac 100644 --- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c +++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c @@ -465,6 +465,18 @@ static void proc_thermal_pci_remove(struct pci_dev *pdev) pci_disable_device(pdev); } +static int proc_thermal_resume(struct device *dev) +{ + struct proc_thermal_device *proc_dev; + + proc_dev = dev_get_drvdata(dev); + proc_thermal_read_ppcc(proc_dev); + + return 0; +} + +static SIMPLE_DEV_PM_OPS(proc_thermal_pm, NULL, proc_thermal_resume); + static const struct pci_device_id proc_thermal_pci_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_PROC_BDW_THERMAL)}, { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_PROC_HSB_THERMAL)}, @@ -489,6 +501,7 @@ static struct pci_driver proc_thermal_pci_driver = { .probe = proc_thermal_pci_probe, .remove = proc_thermal_pci_remove, .id_table = proc_thermal_pci_ids, + .driver.pm = &proc_thermal_pm, }; static const struct acpi_device_id int3401_device_ids[] = { @@ -503,6 +516,7 @@ static struct platform_driver int3401_driver = { .driver = { .name = "int3401 thermal", .acpi_match_table = int3401_device_ids, + .pm = &proc_thermal_pm, }, }; -- 2.17.2
[PATCH] drivers: thermal: processor_thermal: Downgrade error message
Downgrade "Unsupported event" message from dev_err to dev_dbg. Otherwise the log is flooded with this message on some platforms. Signed-off-by: Srinivas Pandruvada --- .../thermal/intel/int340x_thermal/processor_thermal_device.c| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c index 4b206b594825..436c256f111d 100644 --- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c +++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c @@ -275,7 +275,7 @@ static void proc_thermal_notify(acpi_handle handle, u32 event, void *data) THERMAL_DEVICE_POWER_CAPABILITY_CHANGED); break; default: - dev_err(proc_priv->dev, "Unsupported event [0x%x]\n", event); + dev_dbg(proc_priv->dev, "Unsupported event [0x%x]\n", event); break; } } -- 2.17.2
Re: [PATCH V2] staging: fieldbus: anybus-s: force endiannes annotation
On Tue, Apr 30, 2019 at 04:02:23AM +0100, Al Viro wrote: > On Tue, Apr 30, 2019 at 04:22:38AM +0200, Nicholas Mc Guire wrote: > > On Mon, Apr 29, 2019 at 10:03:36AM -0400, Sven Van Asbroeck wrote: > > > On Mon, Apr 29, 2019 at 2:11 AM Nicholas Mc Guire > > > wrote: > > > > > > > > V2: As requested by Sven Van Asbroeck make the > > > > impact of the patch clear in the commit message. > > > > > > Thank you, but did you miss my comment about creating a local variable > > > instead? See: > > > https://lkml.org/lkml/2019/4/28/97 > > > > Did not miss it - I just don't think that makes it any more > > understandable - the __force __be16 makes it clear I believe > > that this is correct, sparse does not like this though - so tell > > sparse. > > ... to STFU, 'cause you know better. The trouble is, how do we > (or yourself a year or two later) know *why* it is correct? > Worse, how do we (or yourself, etc.) know if a change about to be > done to the code won't invalidate the proof of yours? > > > The local variable would need to be explained as it is > > functionally not necessary - therefor I find it more confusing > > that using __force here. > > What's confusing is mixing host- and fixed-endian values in the > same variable at different times. Treat those as unrelated > types that happen to have the same sizeof. > > Quite a few of __force instances in the tree should be taken out > and shot. Don't add to their number. ok - my bad then - I had assumed that using __force is reasonable if the handling is correct and it's a localized conversion only, like var = be16_to_cpu(var), which avoided introducing additional variables just to have different types but no different function. But the long-term issue of hiding bugs by __force makes sense to me - will give it another shot at scripting this in coccinelle. thx! hofrat
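The advice above (treat host-endian and fixed-endian values as unrelated types and give each its own variable) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the anybus-s driver code: `wire_be16`, `wire_be16_to_cpu()`, `cpu_to_wire_be16()` and `parse_len()` are hypothetical names standing in for the kernel's `__be16`, `be16_to_cpu()` and `cpu_to_be16()`, and the struct wrapper only imitates the type separation that sparse enforces in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's __be16: a distinct struct type, so
 * the compiler refuses to mix it with host-endian integers by accident.
 */
typedef struct { uint8_t b[2]; } wire_be16;

static_assert(sizeof(wire_be16) == 2, "wire value must be exactly two bytes");

/* Convert a big-endian wire value to a host-endian integer. */
static inline uint16_t wire_be16_to_cpu(wire_be16 v)
{
	return (uint16_t)((v.b[0] << 8) | v.b[1]);
}

/* Convert a host-endian integer to its big-endian wire representation. */
static inline wire_be16 cpu_to_wire_be16(uint16_t v)
{
	wire_be16 r = { { (uint8_t)(v >> 8), (uint8_t)(v & 0xff) } };
	return r;
}

/*
 * Parse a 16-bit length field from a raw buffer. The fixed-endian value and
 * the host-endian value live in two separate variables of different types,
 * so neither variable ever changes meaning halfway through the function.
 */
static inline uint16_t parse_len(const uint8_t *buf)
{
	wire_be16 raw = { { buf[0], buf[1] } };	/* as received from the wire */
	uint16_t len = wire_be16_to_cpu(raw);	/* safe to do arithmetic on */

	return len;
}
```

The extra variable costs nothing at run time, and no cast is needed to keep the type checker quiet, which seems to be the property the review comments are after.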
Re: [PATCH 2/2] memcg, fsnotify: no oom-kill for remote memcg charging
On Mon, Apr 29, 2019 at 5:41 PM Michal Hocko wrote: > > On Mon 29-04-19 10:13:32, Shakeel Butt wrote: > [...] > > /* > >* For queues with unlimited length lost events are not expected and > >* can possibly have security implications. Avoid losing events when > >* memory is short. > > + * > > + * Note: __GFP_NOFAIL takes precedence over __GFP_RETRY_MAYFAIL. > >*/ > > No, there is no rule like that. Combining the two is undefined > currently and I do not think we want to legitimize it. What does it even > mean? > Actually the code is doing that, but I agree this is not documented and weird. I will fix this. Shakeel
Re: [PATCH] riscv: Support non-coherency memory model
On Mon, Apr 29, 2019 at 01:11:43PM -0700, Palmer Dabbelt wrote: > On Mon, 22 Apr 2019 08:44:30 PDT (-0700), guo...@kernel.org wrote: > >From: Guo Ren > > > >The current riscv linux implementation requires SOC system to support > >memory coherence between all I/O devices and CPUs. But some SOC systems > >cannot maintain the coherence and they need support cache clean/invalid > >operations to synchronize data. > > > >Current implementation is no problem with SiFive FU540, because FU540 > >keeps all IO devices and DMA master devices coherence with CPU. But to a > >traditional SOC vendor, it may already have a stable non-coherency SOC > >system, the need is simply to replace the CPU with RV CPU and rebuild > >the whole system with IO-coherency is very expensive. > > > >So we should make riscv linux also support non-coherency memory model. > >Here are the two points that riscv linux needs to be modified: > > > > - Add _PAGE_COHERENCY bit in current page table entry attributes. The bit > > designates a coherence for this page mapping. Software set the bit to > > tell the hardware that the region of the page's memory area must be > > coherent with IOs devices in SOC system by PMA settings. > > If IOs and CPU are already coherent in SOC system, CPU just ignore > > this bit. > > > > PTE format: > > | XLEN-1 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 > > PFN C RSW D A G U X W R V > > ^ > > BIT(9): Coherence attribute bit > > 0: hardware needn't keep the page coherenct and software will > > maintain the coherence with cache clear/invalid operations. > > 1: hardware must keep the page coherenct and software needn't > > maintain the coherence. 
> > BIT(8): Reserved for software and now it's _PAGE_SPECIAL in linux > > > > Add a new hardware bit in PTE also need to modify Privileged > > Architecture Supervisor-Level ISA: > > https://github.com/riscv/riscv-isa-manual/pull/374 > > This is a RISC-V ISA modification, which isn't really appropriate to suggest > on > the kernel mailing lists. The right place to talk about this is at the RISC-V > foundation, which owns the ISA -- we can't change the hardware with a patch to > Linux :). I just want a discussion, and a wide discussion is good for all of us :) > > > - Add SBI_FENCE_DMA 9 in riscv-sbi. > > sbi_fence_dma(start, size, dir) could synchronize CPU cache data with > > DMA device in non-coherency memory model. The third param's definition > > is the same with linux's in include/linux/dma-direction.h: > > > > enum dma_data_direction { > > DMA_BIDIRECTIONAL = 0, > > DMA_TO_DEVICE = 1, > > DMA_FROM_DEVICE = 2, > > DMA_NONE = 3, > > }; > > > > The first param:start must be physical address which could be handled > > in M-state. > > > > Here is a pull request to the riscv-sbi-doc: > > https://github.com/riscv/riscv-sbi-doc/pull/15 > > > >We have tested the patch on our fpga SOC system which network controller > >connected to a non-cache-coherency interconnect in and it couldn't work > >without the patch. > > > >There is no side effect for FU540 whose CPU don't care _PAGE_COHERENCY > >in PTE, but FU540's bbl also need to implement a simple sbi_fence_dma > >by directly return. In fact, if you give a correct configuration for > >dev_is_dma_conherent(), linux dma framework wouldn't call sbi_fence_dma > >any more. > > Non-coherent fences also need to be discussed as part of a RISC-V ISA > extension. Fence instructions? Not page attributes? > I know people have expressed interest, but I don't know of a > working group that's already been set up. Does that mean the current RISC-V ISA forces the SoC to use a coherent memory model? Best Regards Guo Ren
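For readers following the PTE layout quoted in this thread, here is a minimal userspace sketch of the proposed bit assignment. It assumes the layout exactly as quoted (V, R, W, X, U, G, A, D in bits 0 to 7, the software bit in bit 8, the proposed coherence bit in bit 9, PFN from bit 10 upward). The `PTE_*` names and the helpers are hypothetical, and the coherence bit is only a proposal under discussion here, not something this sketch claims is part of the RISC-V ISA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low PTE bits, following the layout quoted in the proposal. */
#define PTE_V		(UINT64_C(1) << 0)	/* valid */
#define PTE_R		(UINT64_C(1) << 1)	/* readable */
#define PTE_W		(UINT64_C(1) << 2)	/* writable */
#define PTE_X		(UINT64_C(1) << 3)	/* executable */
#define PTE_U		(UINT64_C(1) << 4)	/* user */
#define PTE_G		(UINT64_C(1) << 5)	/* global */
#define PTE_A		(UINT64_C(1) << 6)	/* accessed */
#define PTE_D		(UINT64_C(1) << 7)	/* dirty */
#define PTE_SPECIAL	(UINT64_C(1) << 8)	/* software bit (_PAGE_SPECIAL in the quoted mail) */
#define PTE_COHERENCY	(UINT64_C(1) << 9)	/* PROPOSED coherence bit, hypothetical name */
#define PTE_PFN_SHIFT	10

static_assert(PTE_COHERENCY == 0x200, "proposed coherence bit is BIT(9)");

/* Build a PTE value from a page frame number and protection bits. */
static inline uint64_t mk_pte(uint64_t pfn, uint64_t prot)
{
	return (pfn << PTE_PFN_SHIFT) | prot;
}

/* Bit set: hardware is expected to keep the mapping coherent with I/O. */
static inline bool pte_hw_coherent(uint64_t pte)
{
	return (pte & PTE_COHERENCY) != 0;
}

/*
 * Bit clear: software has to maintain coherence itself with cache
 * clean/invalidate operations (the role the proposed sbi_fence_dma() call
 * plays in the quoted mail).
 */
static inline bool pte_needs_sw_sync(uint64_t pte)
{
	return (pte & PTE_COHERENCY) == 0;
}
```

Under this reading, a platform that is already I/O-coherent simply ignores bit 9, which matches the claim in the quoted mail that FU540 is unaffected.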
Re: INFO: task hung in __get_super
On Tue, Apr 30, 2019 at 04:55:01AM +0200, Jan Kara wrote: > Yeah, you're right. And if we push the patch a bit further to not take > loop_ctl_mutex for invalid ioctl number, that would fix the problem. I > can send a fix. Huh? We don't take it until in lo_simple_ioctl(), and that patch doesn't get to its call on invalid ioctl numbers. What am I missing here?
[RFC PATCH v4 15/15] dcache: Add CONFIG_DCACHE_SMO
In an attempt to make the SMO patchset as non-invasive as possible add a config option CONFIG_DCACHE_SMO (under "Memory Management options") for enabling SMO for the DCACHE. Without this option the dcache constructor is used but no other code is built in; with this option enabled, slab mobility is enabled and the isolate/migrate functions are built in. Add CONFIG_DCACHE_SMO to guard the partial shrinking of the dcache via Slab Movable Objects infrastructure. Signed-off-by: Tobin C. Harding --- fs/dcache.c | 4 mm/Kconfig | 7 +++ 2 files changed, 11 insertions(+) diff --git a/fs/dcache.c b/fs/dcache.c index 3f9daba1cc78..9edce104613b 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -3068,6 +3068,7 @@ void d_tmpfile(struct dentry *dentry, struct inode *inode) } EXPORT_SYMBOL(d_tmpfile); +#ifdef CONFIG_DCACHE_SMO /* * d_isolate() - Dentry isolation callback function. * @s: The dentry cache. @@ -3140,6 +3141,7 @@ static void d_partial_shrink(struct kmem_cache *s, void **_unused, int __unused, kfree(private); } +#endif /* CONFIG_DCACHE_SMO */ static __initdata unsigned long dhash_entries; static int __init set_dhash_entries(char *str) @@ -3186,7 +3188,9 @@ static void __init dcache_init(void) sizeof_field(struct dentry, d_iname), dcache_ctor); +#ifdef CONFIG_DCACHE_SMO kmem_cache_setup_mobility(dentry_cache, d_isolate, d_partial_shrink); +#endif /* Hash may have been set up in dcache_init_early */ if (!hashdist) diff --git a/mm/Kconfig b/mm/Kconfig index 47040d939f3b..92fc27ad3472 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -265,6 +265,13 @@ config SMO_NODE help On NUMA systems enable moving objects to and from a specified node. +config DCACHE_SMO + bool "Enable Slab Movable Objects for the dcache" + depends on SLUB + help + Under memory pressure we can try to free dentry slab cache objects from + the partial slab list if this is enabled. + config PHYS_ADDR_T_64BIT def_bool 64BIT -- 2.21.0
[RFC PATCH v4 13/15] dcache: Provide a dentry constructor
In order to support object migration on the dentry cache we need to have a determined object state at all times. Without a constructor the object would have a random state after allocation. Provide a dentry constructor. Signed-off-by: Tobin C. Harding --- fs/dcache.c | 30 +- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index aac41adf4743..3d6cc06eca56 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1603,6 +1603,16 @@ void d_invalidate(struct dentry *dentry) } EXPORT_SYMBOL(d_invalidate); +static void dcache_ctor(void *p) +{ + struct dentry *dentry = p; + + /* Mimic lockref_mark_dead() */ + dentry->d_lockref.count = -128; + + spin_lock_init(&dentry->d_lock); +} + /** * __d_alloc - allocate a dcache entry * @sb: filesystem it will belong to @@ -1658,7 +1668,6 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) dentry->d_lockref.count = 1; dentry->d_flags = 0; - spin_lock_init(&dentry->d_lock); seqcount_init(&dentry->d_seq); dentry->d_inode = NULL; dentry->d_parent = dentry; @@ -3091,14 +3100,17 @@ static void __init dcache_init_early(void) static void __init dcache_init(void) { - /* -* A constructor could be added for stable state like the lists, -* but it is probably not worth it because of the cache nature -* of the dcache. -*/ - dentry_cache = KMEM_CACHE_USERCOPY(dentry, - SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT, - d_iname); + slab_flags_t flags = + SLAB_RECLAIM_ACCOUNT | SLAB_PANIC | SLAB_MEM_SPREAD | SLAB_ACCOUNT; + + dentry_cache = + kmem_cache_create_usercopy("dentry", + sizeof(struct dentry), + __alignof__(struct dentry), + flags, + offsetof(struct dentry, d_iname), + sizeof_field(struct dentry, d_iname), + dcache_ctor); /* Hash may have been set up in dcache_init_early */ if (!hashdist) -- 2.21.0
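The reasoning in this commit message (a scanner that may look at any object needs every object to be in a determined state, which a constructor provides) can be illustrated with a toy userspace pool. This is a sketch under loud assumptions: `struct pool`, `pool_init()`, `pool_alloc()`, `pool_free()` and `pool_count_live()` are hypothetical and are not the SLUB API. Only the idea of marking unallocated objects with a count of -128, as the patch does to mimic lockref_mark_dead(), is taken from the source.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_OBJS	4
#define DEAD_COUNT	(-128)	/* same marker value the patch stores in d_lockref.count */

static_assert(DEAD_COUNT < 0, "dead marker must never look like a live refcount");

struct obj {
	int count;	/* DEAD_COUNT whenever the object is not allocated */
	int lock_ready;	/* stand-in for an initialised spinlock */
};

struct pool {
	struct obj objs[POOL_OBJS];
};

/* Constructor: runs once per object when the backing memory is set up. */
static inline void obj_ctor(void *p)
{
	struct obj *o = p;

	o->count = DEAD_COUNT;
	o->lock_ready = 1;
}

static inline void pool_init(struct pool *pl, void (*ctor)(void *))
{
	for (size_t i = 0; i < POOL_OBJS; i++)
		ctor(&pl->objs[i]);
}

/* Allocation only touches per-use state; the lock stays initialised. */
static inline struct obj *pool_alloc(struct pool *pl)
{
	for (size_t i = 0; i < POOL_OBJS; i++) {
		if (pl->objs[i].count == DEAD_COUNT) {
			pl->objs[i].count = 1;
			return &pl->objs[i];
		}
	}
	return NULL;
}

/* Freeing puts the object back into its constructed state. */
static inline void pool_free(struct obj *o)
{
	o->count = DEAD_COUNT;
}

/*
 * What a migration-style scan might do: inspect every slot, allocated or
 * not. This is only safe because no slot ever holds indeterminate data.
 */
static inline int pool_count_live(const struct pool *pl)
{
	int live = 0;

	for (size_t i = 0; i < POOL_OBJS; i++)
		if (pl->objs[i].count != DEAD_COUNT)
			live++;
	return live;
}

/* Allocate two objects, free one, and report how many the scan sees. */
static inline int demo_live_after_alloc_free(void)
{
	struct pool pl;
	struct obj *a, *b;

	pool_init(&pl, obj_ctor);
	a = pool_alloc(&pl);
	b = pool_alloc(&pl);
	(void)b;
	pool_free(a);
	return pool_count_live(&pl);
}
```

Without the constructor call in `pool_init()`, the scan in `pool_count_live()` would read uninitialised memory for never-allocated slots, which is the userspace analogue of the "random state after allocation" problem the commit message describes.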
[RFC PATCH v4 11/15] slub: Enable moving objects to/from specific nodes
We have just implemented Slab Movable Objects (object migration). Currently object migration is used to defrag a cache. On NUMA systems it would be nice to be able to control the source and destination nodes when moving objects. Add CONFIG_SMO_NODE to guard this feature. CONFIG_SMO_NODE depends on CONFIG_SLUB_DEBUG because we use the full list. Leave it like this for the RFC because the patch will be less cluttered to review, separate full list out of CONFIG_DEBUG before doing a PATCH version. Implement moving all objects (including those in full slabs) to a specific node. Expose this functionality to userspace via a sysfs entry. Add sysfs entry: /sysfs/kernel/slab//move With this users get access to the following functionality: - Move all objects to specified node. echo "N1" > move - Move all objects from specified node to other specified node (from N1 -> to N2): echo "N1 N2" > move This also enables shrinking slabs on a specific node: echo "N1 N1" > move Signed-off-by: Tobin C. Harding --- mm/Kconfig | 7 ++ mm/slub.c | 249 + 2 files changed, 256 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index 25c71eb8a7db..47040d939f3b 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -258,6 +258,13 @@ config ARCH_ENABLE_HUGEPAGE_MIGRATION config ARCH_ENABLE_THP_MIGRATION bool +config SMO_NODE + bool "Enable per node control of Slab Movable Objects" + depends on SLUB && SYSFS + select SLUB_DEBUG + help + On NUMA systems enable moving objects to and from a specified node. + config PHYS_ADDR_T_64BIT def_bool 64BIT diff --git a/mm/slub.c b/mm/slub.c index e601c804ed79..e4f3dde443f5 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -4345,6 +4345,106 @@ static void move_slab_page(struct page *page, void *scratch, int node) s->migrate(s, vector, count, node, private); } +#ifdef CONFIG_SMO_NODE +/* + * kmem_cache_move() - Attempt to move all slab objects. + * @s: The cache we are working on. + * @node: The node to move objects away from. + * @target_node: The node to move objects on to. 
+ * + * Attempts to move all objects (partial slabs and full slabs) to target + * node. + * + * Context: Takes the list_lock. + * Return: The number of slabs remaining on node. + */ +static unsigned long kmem_cache_move(struct kmem_cache *s, +int node, int target_node) +{ + struct kmem_cache_node *n = get_node(s, node); + LIST_HEAD(move_list); + struct page *page, *page2; + unsigned long flags; + void **scratch; + + if (!s->migrate) { + pr_warn("%s SMO not enabled, cannot move objects\n", s->name); + goto out; + } + + scratch = alloc_scratch(s); + if (!scratch) + goto out; + + spin_lock_irqsave(&n->list_lock, flags); + + list_for_each_entry_safe(page, page2, &n->partial, lru) { + if (!slab_trylock(page)) + /* Busy slab. Get out of the way */ + continue; + + if (page->inuse) { + list_move(&page->lru, &move_list); + /* Stop page being considered for allocations */ + n->nr_partial--; + page->frozen = 1; + + slab_unlock(page); + } else {/* Empty slab page */ + list_del(&page->lru); + n->nr_partial--; + slab_unlock(page); + discard_slab(s, page); + } + } + list_for_each_entry_safe(page, page2, &n->full, lru) { + if (!slab_trylock(page)) + continue; + + list_move(&page->lru, &move_list); + page->frozen = 1; + slab_unlock(page); + } + + spin_unlock_irqrestore(&n->list_lock, flags); + + list_for_each_entry(page, &move_list, lru) { + if (page->inuse) + move_slab_page(page, scratch, target_node); + } + kfree(scratch); + + /* Bail here to save taking the list_lock */ + if (list_empty(&move_list)) + goto out; + + /* Inspect results and dispose of pages */ + spin_lock_irqsave(&n->list_lock, flags); + list_for_each_entry_safe(page, page2, &move_list, lru) { + list_del(&page->lru); + slab_lock(page); + page->frozen = 0; + + if (page->inuse) { + if (page->inuse == page->objects) { + list_add(&page->lru, &n->full); + slab_unlock(page); + } else { + n->nr_partial++; + list_add_tail(&page->lru, &n->partial); + slab_
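The three forms of write accepted by the `move` file, as described in the commit message above, can be sketched in Python. This is only an illustration of the stated semantics, not the kernel's parser: the function name `parse_move_command` and the returned tuples are hypothetical, and it assumes that "N1" and "N2" in the commit message stand for plain decimal node numbers.

```python
def parse_move_command(text):
    """Classify a string written to the 'move' sysfs file.

    Hypothetical helper: assumes nodes are written as decimal integers.
    """
    fields = text.split()
    if len(fields) == 1:
        # echo "N1" > move : move all objects to node N1
        return ("move_all_to", int(fields[0]))
    if len(fields) == 2:
        src, dst = int(fields[0]), int(fields[1])
        if src == dst:
            # echo "N1 N1" > move : shrink the slabs on node N1
            return ("shrink_node", src)
        # echo "N1 N2" > move : move objects from N1 to N2
        return ("move_between", src, dst)
    raise ValueError("expected one or two node numbers")
```

Writing the same node twice is treated as a shrink because moving objects within a node can only consolidate partially filled slabs.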
[RFC PATCH v4 12/15] slub: Enable balancing slabs across nodes
We have just implemented Slab Movable Objects (SMO). On NUMA systems
slabs can become unbalanced i.e. many slabs on one node while other
nodes have few slabs. Using SMO we can balance the slabs across all
the nodes.

The algorithm used is as follows:

 1. Move all objects to node 0 (this has the effect of defragmenting
    the cache).

 2. Calculate the desired number of slabs for each node (this is done
    using the approximation nr_slabs / nr_nodes).

 3. Loop over the nodes moving the desired number of slabs from node 0
    to the node.

The feature is conditionally built in with CONFIG_SMO_NODE because we
need the full list (we enable SLUB_DEBUG to get this). A future version
may separate the full list out of SLUB_DEBUG.

Expose this functionality to userspace via a sysfs entry. Add sysfs
entry:

   /sys/kernel/slab//balance

A write of '1' to this file triggers a balance; no other value is
accepted.

This feature relies on SMO being enabled for the cache. This is done
with a call to the following function, after the isolate/migrate
functions have been defined:

	kmem_cache_setup_mobility(s, isolate, migrate)

Signed-off-by: Tobin C. Harding
---
 mm/slub.c | 120 ++
 1 file changed, 120 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index e4f3dde443f5..a5c48c41d72b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4583,6 +4583,109 @@ static unsigned long kmem_cache_move_to_node(struct kmem_cache *s, int node)
 	return left;
 }
+
+/*
+ * kmem_cache_move_slabs() - Attempt to move @num slabs to target_node,
+ * @s: The cache we are working on.
+ * @node: The node to move objects from.
+ * @target_node: The node to move objects to.
+ * @num: The number of slabs to move.
+ *
+ * Attempts to move @num slabs from @node to @target_node. This is done
+ * by migrating objects from slabs on the full_list.
+ *
+ * Return: The number of slabs moved or error code.
+ */ +static long kmem_cache_move_slabs(struct kmem_cache *s, + int node, int target_node, long num) +{ + struct kmem_cache_node *n = get_node(s, node); + LIST_HEAD(move_list); + struct page *page, *page2; + unsigned long flags; + void **scratch; + long done = 0; + + if (node == target_node) + return -EINVAL; + + scratch = alloc_scratch(s); + if (!scratch) + return -ENOMEM; + + spin_lock_irqsave(&n->list_lock, flags); + list_for_each_entry_safe(page, page2, &n->full, lru) { + if (!slab_trylock(page)) + /* Busy slab. Get out of the way */ + continue; + + list_move(&page->lru, &move_list); + page->frozen = 1; + slab_unlock(page); + + if (++done >= num) + break; + } + spin_unlock_irqrestore(&n->list_lock, flags); + + list_for_each_entry(page, &move_list, lru) { + if (page->inuse) + move_slab_page(page, scratch, target_node); + } + kfree(scratch); + + /* Inspect results and dispose of pages */ + spin_lock_irqsave(&n->list_lock, flags); + list_for_each_entry_safe(page, page2, &move_list, lru) { + list_del(&page->lru); + slab_lock(page); + page->frozen = 0; + + if (page->inuse) { + /* +* This is best effort only, if slab still has +* objects just put it back on the partial list. +*/ + n->nr_partial++; + list_add_tail(&page->lru, &n->partial); + slab_unlock(page); + } else { + slab_unlock(page); + discard_slab(s, page); + } + } + spin_unlock_irqrestore(&n->list_lock, flags); + + return done; +} + +/* + * kmem_cache_balance_nodes() - Balance slabs across nodes. + * @s: The cache we are working on. 
+ */ +static void kmem_cache_balance_nodes(struct kmem_cache *s) +{ + struct kmem_cache_node *n = get_node(s, 0); + unsigned long desired_nr_slabs_per_node; + unsigned long nr_slabs; + int nr_nodes = 0; + int nid; + + (void)kmem_cache_move_to_node(s, 0); + + for_each_node_state(nid, N_NORMAL_MEMORY) + nr_nodes++; + + nr_slabs = atomic_long_read(&n->nr_slabs); + desired_nr_slabs_per_node = nr_slabs / nr_nodes; + + for_each_node_state(nid, N_NORMAL_MEMORY) { + if (nid == 0) + continue; + + kmem_cache_move_slabs(s, 0, nid, desired_nr_slabs_per_node); + } +} #endif /** @@ -5847,6 +5950,22 @@ static ssize_t move_store(struct kmem_cache *s, const char *buf, size_t length) return length; } SLAB_ATTR(move); + +static ssize_t balance_show(struct kmem_cac
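The three-step balancing algorithm from the commit message can be worked through with a small Python model. This is a sketch under simplifying assumptions, not the kernel implementation: `balance_plan` is a hypothetical name, the model assumes step 1 leaves the total slab count unchanged (in practice defragmentation may reduce it), and it assumes every requested slab move succeeds (the kernel code is best effort).

```python
def balance_plan(slabs_per_node):
    """Return the slab count per node after the balance algorithm.

    slabs_per_node maps node id -> number of slabs; node 0 must exist.
    """
    nodes = sorted(slabs_per_node)
    # Step 1: everything is moved to node 0.
    total = sum(slabs_per_node.values())
    # Step 2: desired slabs per node, approximated as nr_slabs / nr_nodes.
    desired = total // len(nodes)
    result = {node: 0 for node in nodes}
    result[0] = total
    # Step 3: hand 'desired' slabs from node 0 to every other node.
    for node in nodes:
        if node == 0:
            continue
        result[node] += desired
        result[0] -= desired
    return result
```

With 10 slabs spread over 3 nodes the desired count is 3, so nodes 1 and 2 receive 3 slabs each and node 0 keeps the remaining 4. The integer division means node 0 always retains the remainder.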
[RFC PATCH v4 14/15] dcache: Implement partial shrink via Slab Movable Objects
The dentry slab cache is susceptible to internal fragmentation. Now that we have Slab Movable Objects we can attempt to defragment the dcache. Dentry objects are inherently _not_ relocatable however under some conditions they can be free'd. This is the same as shrinking the dcache but instead of shrinking the whole cache we only attempt to free those objects that are located in partially full slab pages. There is no guarantee that this will reduce the memory usage of the system, it is a compromise between fragmented memory and total cache shrinkage with the hope that some memory pressure can be alleviated. This is implemented using the newly added Slab Movable Objects infrastructure. The dcache 'migration' function is intentionally _not_ called 'd_migrate' because we only free, we do not migrate. Call it 'd_partial_shrink' to make explicit that no reallocation is done. Implement isolate and 'migrate' functions for the dentry slab cache. Signed-off-by: Tobin C. Harding --- fs/dcache.c | 76 + 1 file changed, 76 insertions(+) diff --git a/fs/dcache.c b/fs/dcache.c index 3d6cc06eca56..3f9daba1cc78 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -30,6 +30,7 @@ #include #include #include +#include #include "internal.h" #include "mount.h" @@ -3067,6 +3068,79 @@ void d_tmpfile(struct dentry *dentry, struct inode *inode) } EXPORT_SYMBOL(d_tmpfile); +/* + * d_isolate() - Dentry isolation callback function. + * @s: The dentry cache. + * @v: Vector of pointers to the objects to isolate. + * @nr: Number of objects in @v. + * + * The slab allocator is holding off frees. We can safely examine + * the object without the danger of it vanishing from under us. 
+ */ +static void *d_isolate(struct kmem_cache *s, void **v, int nr) +{ + struct list_head *dispose; + struct dentry *dentry; + int i; + + dispose = kmalloc(sizeof(*dispose), GFP_KERNEL); + if (!dispose) + return NULL; + + INIT_LIST_HEAD(dispose); + + for (i = 0; i < nr; i++) { + dentry = v[i]; + spin_lock(&dentry->d_lock); + + if (dentry->d_lockref.count > 0 || + dentry->d_flags & DCACHE_SHRINK_LIST) { + spin_unlock(&dentry->d_lock); + continue; + } + + if (dentry->d_flags & DCACHE_LRU_LIST) + d_lru_del(dentry); + + d_shrink_add(dentry, dispose); + spin_unlock(&dentry->d_lock); + } + + return dispose; +} + +/* + * d_partial_shrink() - Dentry migration callback function. + * @s: The dentry cache. + * @_unused: We do not access the vector. + * @__unused: No need for length of vector. + * @___unused: We do not do any allocation. + * @private: list_head pointer representing the shrink list. + * + * Dispose of the shrink list created during isolation function. + * + * Dentry objects can _not_ be relocated and shrinking the whole dcache + * can be expensive. This is an effort to free dentry objects that are + * stopping slab pages from being free'd without clearing the whole dcache. + * + * This callback is called from the SLUB allocator object migration + * infrastructure in attempt to free up slab pages by freeing dentry + * objects from partially full slabs. + */ +static void d_partial_shrink(struct kmem_cache *s, void **_unused, int __unused, +int ___unused, void *private) +{ + struct list_head *dispose = private; + + if (!private) /* kmalloc error during isolate. 
*/ + return; + + if (!list_empty(dispose)) + shrink_dentry_list(dispose); + + kfree(private); +} + static __initdata unsigned long dhash_entries; static int __init set_dhash_entries(char *str) { @@ -3112,6 +3186,8 @@ static void __init dcache_init(void) sizeof_field(struct dentry, d_iname), dcache_ctor); + kmem_cache_setup_mobility(dentry_cache, d_isolate, d_partial_shrink); + /* Hash may have been set up in dcache_init_early */ if (!hashdist) return; -- 2.21.0
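The division of work between d_isolate() and d_partial_shrink() can be modelled in a few lines of Python. This is a simplified sketch, not the dcache code: the `Dentry` dataclass and its boolean fields are hypothetical stand-ins for d_lockref.count, DCACHE_SHRINK_LIST and DCACHE_LRU_LIST, and locking is omitted entirely.

```python
from dataclasses import dataclass


@dataclass
class Dentry:
    name: str
    refcount: int = 0             # stands in for d_lockref.count
    on_shrink_list: bool = False  # stands in for DCACHE_SHRINK_LIST
    on_lru: bool = False          # stands in for DCACHE_LRU_LIST


def isolate(dentries):
    """Collect the dentries that may be freed onto a dispose list."""
    dispose = []
    for dentry in dentries:
        # In use, or already queued by another shrinker: leave it alone.
        if dentry.refcount > 0 or dentry.on_shrink_list:
            continue
        dentry.on_lru = False
        dentry.on_shrink_list = True
        dispose.append(dentry)
    return dispose


def partial_shrink(dispose):
    """Free everything on the dispose list; nothing is reallocated."""
    if dispose is None:  # isolation failed to allocate the list
        return []
    freed = [dentry.name for dentry in dispose]
    dispose.clear()
    return freed
```

Note that the 'migrate' step never allocates a replacement object, which is why the patch names it d_partial_shrink rather than d_migrate.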
[RFC PATCH v4 10/15] tools/testing/slab: Add XArray movable objects tests
We just implemented movable objects for the XArray. Let's test it
in-tree.

Add test module for the XArray's movable objects implementation.

Functionality of the XArray Slab Movable Object implementation can
usually be seen simply by using `slabinfo` on a running machine, since
the radix tree is typically in use on a running machine and will have
partial slabs. For repeated testing we can use the test module to
simulate a workload on the XArray then use `slabinfo` to test that
object migration is functioning.

If testing on a freshly spun up VM (low radix tree workload) it may be
necessary to load/unload the module a number of times to create partial
slabs.

Example test session

Relevant /proc/slabinfo column headers:

  name

Prior to testing slabinfo report for radix_tree_node:

  # slabinfo radix_tree_node --report

  Slabcache: radix_tree_node  Aliases:  0 Order :  2 Objects: 8352
  ** Reclaim accounting active
  ** Defragmentation at 30%

  Sizes (bytes)     Slabs              Debug                Memory
  Object :     576  Total  :     497   Sanity Checks : On   Total: 8142848
  SlabObj:     912  Full   :     473   Redzoning     : On   Used : 4810752
  SlabSiz:   16384  Partial:      24   Poisoning     : On   Loss : 3332096
  Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig: 2806272
  Align  :       8  Objects:      17   Tracing       : Off  Lpadd:  437360

Here you can see the kernel was built with Slab Movable Objects enabled
for the XArray (XArray uses the radix tree below the surface).

After inserting the test module (note we have triggered allocation of a
number of radix tree nodes increasing the object count but decreasing
the number of partial slabs):

  # slabinfo radix_tree_node --report

  Slabcache: radix_tree_node  Aliases:  0 Order :  2 Objects: 8442
  ** Reclaim accounting active
  ** Defragmentation at 30%

  Sizes (bytes)     Slabs              Debug                Memory
  Object :     576  Total  :     499   Sanity Checks : On   Total: 8175616
  SlabObj:     912  Full   :     484   Redzoning     : On   Used : 4862592
  SlabSiz:   16384  Partial:      15   Poisoning     : On   Loss : 3313024
  Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig: 2836512
  Align  :       8  Objects:      17   Tracing       : Off  Lpadd:  439120

Now we can shrink the radix_tree_node cache:

  # slabinfo radix_tree_node --shrink
  # slabinfo radix_tree_node --report

  Slabcache: radix_tree_node  Aliases:  0 Order :  2 Objects: 8515
  ** Reclaim accounting active
  ** Defragmentation at 30%

  Sizes (bytes)     Slabs              Debug                Memory
  Object :     576  Total  :     501   Sanity Checks : On   Total: 8208384
  SlabObj:     912  Full   :     500   Redzoning     : On   Used : 4904640
  SlabSiz:   16384  Partial:       1   Poisoning     : On   Loss : 3303744
  Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig: 2861040
  Align  :       8  Objects:      17   Tracing       : Off  Lpadd:  440880

Note the single remaining partial slab.

Signed-off-by: Tobin C. Harding
---
 tools/testing/slab/Makefile             |   2 +-
 tools/testing/slab/slub_defrag_xarray.c | 211
 2 files changed, 212 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/slab/slub_defrag_xarray.c

diff --git a/tools/testing/slab/Makefile b/tools/testing/slab/Makefile
index 440c2e3e356f..44c18d9a4d52 100644
--- a/tools/testing/slab/Makefile
+++ b/tools/testing/slab/Makefile
@@ -1,4 +1,4 @@
-obj-m += slub_defrag.o
+obj-m += slub_defrag.o slub_defrag_xarray.o
 
 KTREE=../../..
diff --git a/tools/testing/slab/slub_defrag_xarray.c b/tools/testing/slab/slub_defrag_xarray.c new file mode 100644 index ..41143f73256c --- /dev/null +++ b/tools/testing/slab/slub_defrag_xarray.c @@ -0,0 +1,211 @@ +// SPDX-License-Identifier: GPL-2.0+ +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define SMOX_CACHE_NAME "smox_test" +static struct kmem_cache *cachep; + +/* + * Declare XArrays globally so we can clean them up on module unload. + */ + +/* Used by test_smo_xarray()*/ +DEFINE_XARRAY(things); + +/* Thing to store pointers to in the XArray */ +struct smox_thing { + long id; +}; + +/* It's up to the caller to ensure id is unique */ +static struct smox_thing *alloc_thing(int id) +{ + struct smox_thing *thing; + + thing = kmem_cache_alloc(cachep, GFP_KERNEL); + if (!thing) + return ERR_PTR(-ENOMEM); + + thing->id = id; + return thing; +} + +/** + * smox_object_ctor() - SMO object constructor function. + * @ptr: Pointer to memory where the object should be constructed. + */ +void smox_object_
[RFC PATCH v4 09/15] xarray: Implement migration function for objects
Implement functions to migrate objects. This is based on initial code by
Matthew Wilcox and was modified to work with slab object migration.

This patch can not be merged until all radix tree & IDR users are
converted to the XArray because xa_nodes and radix tree nodes share the
same slab cache (thanks Matthew).

Co-developed-by: Christoph Lameter
Signed-off-by: Tobin C. Harding
---
 lib/radix-tree.c | 13 +
 lib/xarray.c     | 49
 2 files changed, 62 insertions(+)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 14d51548bea6..9412c2853726 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -1613,6 +1613,17 @@ static int radix_tree_cpu_dead(unsigned int cpu)
 	return 0;
 }
 
+extern void xa_object_migrate(void *tree_node, int numa_node);
+
+static void radix_tree_migrate(struct kmem_cache *s, void **objects, int nr,
+			       int node, void *private)
+{
+	int i;
+
+	for (i = 0; i < nr; i++)
+		xa_object_migrate(objects[i], node);
+}
+
 void __init radix_tree_init(void)
 {
 	int ret;
@@ -1627,4 +1638,6 @@ void __init radix_tree_init(void)
 	ret = cpuhp_setup_state_nocalls(CPUHP_RADIX_DEAD, "lib/radix:dead",
 					NULL, radix_tree_cpu_dead);
 	WARN_ON(ret < 0);
+	kmem_cache_setup_mobility(radix_tree_node_cachep, NULL,
+				  radix_tree_migrate);
 }
diff --git a/lib/xarray.c b/lib/xarray.c
index 6be3acbb861f..731dd3d8ddb8 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1971,6 +1971,55 @@ void xa_destroy(struct xarray *xa)
 }
 EXPORT_SYMBOL(xa_destroy);
 
+void xa_object_migrate(struct xa_node *node, int numa_node)
+{
+	struct xarray *xa = READ_ONCE(node->array);
+	void __rcu **slot;
+	struct xa_node *new_node;
+	int i;
+
+	/* Freed or not yet in tree then skip */
+	if (!xa || xa == XA_RCU_FREE)
+		return;
+
+	new_node = kmem_cache_alloc_node(radix_tree_node_cachep,
+					 GFP_KERNEL, numa_node);
+	if (!new_node)
+		return;
+
+	xa_lock_irq(xa);
+
+	/* Check again. */
+	if (xa != node->array) {
+		node = new_node;
+		goto unlock;
+	}
+
+	memcpy(new_node, node, sizeof(struct xa_node));
+
+	if (list_empty(&node->private_list))
+		INIT_LIST_HEAD(&new_node->private_list);
+	else
+		list_replace(&node->private_list, &new_node->private_list);
+
+	for (i = 0; i < XA_CHUNK_SIZE; i++) {
+		void *x = xa_entry_locked(xa, new_node, i);
+
+		if (xa_is_node(x))
+			rcu_assign_pointer(xa_to_node(x)->parent, new_node);
+	}
+	if (!new_node->parent)
+		slot = &xa->xa_head;
+	else
+		slot = &xa_parent_locked(xa, new_node)->slots[new_node->offset];
+	rcu_assign_pointer(*slot, xa_mk_node(new_node));
+
+unlock:
+	xa_unlock_irq(xa);
+	xa_node_free(node);
+	rcu_barrier();
+}
+
 #ifdef XA_DEBUG
 void xa_dump_node(const struct xa_node *node)
 {
-- 
2.21.0
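The pointer fix-ups performed by xa_object_migrate() (copy the node, re-parent its children, then redirect the parent's slot or the tree head) can be illustrated with a plain Python tree. This is a sketch of the general technique only: the `Node` and `Tree` classes are hypothetical, and locking, RCU, the private list and NUMA placement are all left out.

```python
class Node:
    def __init__(self, parent=None, offset=0, fanout=4):
        self.parent = parent      # parent Node, or None for the root
        self.offset = offset      # index of this node in parent.slots
        self.slots = [None] * fanout


class Tree:
    def __init__(self):
        self.head = None


def migrate_node(tree, old):
    """Replace 'old' with a fresh copy and fix every pointer to it."""
    new = Node(old.parent, old.offset, len(old.slots))
    new.slots = list(old.slots)          # analogous to the memcpy()
    # Children must now point at the copy.
    for child in new.slots:
        if isinstance(child, Node):
            child.parent = new
    # The slot that referenced the old node must reference the copy.
    if new.parent is None:
        tree.head = new
    else:
        new.parent.slots[new.offset] = new
    return new
```

After the call nothing in the tree references the old node, so it can be freed; in the kernel that freeing is deferred through RCU.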
[RFC PATCH v4 08/15] tools/testing/slab: Add object migration test suite
We just added a module that enables testing the SLUB allocator's
ability to defrag/shrink caches via movable objects. Tests are better
when they are automated.

Add automated testing via a python script for SLUB movable objects.

Example output:

  $ cd path/to/linux/tools/testing/slab
  $ ./slub_defrag.py
  Please run script as root

  $ sudo ./slub_defrag.py

  $ sudo ./slub_defrag.py --debug
  Loading module ...
  Slab cache smo_test created
  Objects per slab: 20
  Running sanity checks ...

  Running module stress test (see dmesg for additional test output) ...
  Removing module slub_defrag ...
  Loading module ...
  Slab cache smo_test created

  Running test non-movable ...
  testing slab 'smo_test' prior to enabling movable objects ...
  verified non-movable slabs are NOT shrinkable

  Running test movable ...
  testing slab 'smo_test' after enabling movable objects ...
  verified movable slabs are shrinkable

  Removing module slub_defrag ...

Signed-off-by: Tobin C. Harding
---
 tools/testing/slab/slub_defrag.c  |   1 +
 tools/testing/slab/slub_defrag.py | 451 ++
 2 files changed, 452 insertions(+)
 create mode 100755 tools/testing/slab/slub_defrag.py

diff --git a/tools/testing/slab/slub_defrag.c b/tools/testing/slab/slub_defrag.c
index 4a5c24394b96..8332e69ee868 100644
--- a/tools/testing/slab/slub_defrag.c
+++ b/tools/testing/slab/slub_defrag.c
@@ -337,6 +337,7 @@ static int smo_run_module_tests(int nr_objs, int keep)
 
 /*
  * struct functions() - Map command to a function pointer.
+ * If you update this please update the documentation in slub_defrag.py
  */
 struct functions {
 	char *fn_name;
diff --git a/tools/testing/slab/slub_defrag.py b/tools/testing/slab/slub_defrag.py
new file mode 100755
index ..41747c0db39b
--- /dev/null
+++ b/tools/testing/slab/slub_defrag.py
@@ -0,0 +1,451 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import subprocess
+import sys
+from os import path
+
+# SLUB Movable Objects test suite.
+#
+# Requirements:
+#  - CONFIG_SLUB=y
+#  - CONFIG_SLUB_DEBUG=y
+#  - The slub_defrag module in this directory.
+
+# Test SMO using a kernel module that enables triggering arbitrary
+# kernel code from userspace via a debugfs file.
+#
+# Module code is in ./slub_defrag.c, basically the functionality is as
+# follows:
+#
+#  - Creates debugfs file /sys/kernel/debugfs/smo/callfn
+#  - Writes to 'callfn' are parsed as a command string and the function
+#    associated with command is called.
+#  - Defines 4 commands (all commands operate on smo_test cache):
+#     - 'test': Runs module stress tests.
+#     - 'alloc N': Allocates N slub objects
+#     - 'free N POS': Frees N objects starting at POS (see below)
+#     - 'enable': Enables SLUB Movable Objects
+#
+# The module maintains a list of allocated objects. Allocation adds
+# objects to the tail of the list. Free'ing frees from the head of the
+# list. This has the effect of creating free slots in the slab. For
+# finer grained control over where in the cache slots are free'd POS
+# (position) argument may be used.
+
+# The main() function is reasonably readable; the test suite does the
+# following:
+#
+# 1. Runs the module stress tests.
+# 2. Tests the cache without movable objects enabled.
+#    - Creates multiple partial slabs as explained above.
+#    - Verifies that partial slabs are _not_ removed by shrink (see below).
+# 3. Tests the cache with movable objects enabled.
+#    - Creates multiple partial slabs as explained above.
+#    - Verifies that partial slabs _are_ removed by shrink (see below).
+
+# The sysfs file /sys/kernel/slab//shrink enables calling the
+# function kmem_cache_shrink() (see mm/slab_common.c and mm/slub.c).
+# Shrinking a cache attempts to consolidate all partial slabs by moving
+# objects if object migration is enabled for the cache, otherwise
+# shrinking a cache simply re-orders the partial list so that the most
+# densely populated slabs are at the head of the list.
+
+# Enable/disable debugging output (also enabled via -d | --debug).
+debug = False
+
+# Used in debug messages and when running `insmod`.
+MODULE_NAME = "slub_defrag"
+
+# Slab cache created by the test module.
+CACHE_NAME = "smo_test"
+
+# Set by get_slab_config()
+objects_per_slab = 0
+pages_per_slab = 0
+debugfs_mounted = False         # Set to true if we mount debugfs.
+
+
+def eprint(*args, **kwargs):
+    print(*args, file=sys.stderr, **kwargs)
+
+
+def dprint(*args, **kwargs):
+    if debug:
+        print(*args, file=sys.stderr, **kwargs)
+
+
+def run_shell(cmd):
+    return subprocess.call([cmd], shell=True)
+
+
+def run_shell_get_stdout(cmd):
+    return subprocess.check_output([cmd], shell=True)
+
+
+def assert_root():
+    user = run_shell_get_stdout('whoami')
+    if user != b'root\n':
+        eprint("Please run script as root")
+        sys.exit(1)
+
+
+def mount_debugfs():
+    mounted = False
+
+    # Check if debugfs is mounted at a known mount
Re: [RFC][PATCHSET] sorting out RCU-delayed stuff in ->destroy_inode()
On Tue, Apr 16, 2019 at 11:01:16AM -0700, Linus Torvalds wrote:
> On Tue, Apr 16, 2019 at 10:49 AM Al Viro wrote:
> >
> > 83 files changed, 241 insertions(+), 516 deletions(-)
>
> I think this single line is pretty convincing on its own. Ignoring
> docs and fs/inode.c, we have
>
> 80 files changed, 190 insertions(+), 494 deletions(-)
>
> IOW, just over 300 lines of boiler plate code removed.
>
> The additions are
>
> - Ten more lines of actual code in fs/inode.c (and that's not
> actually added complexity, it looks simpler if anything - most of it
> is the new "i_callback()" helper function)
>
> - 19 lines of doc updates.
>
> So it absolutely looks fine to me.
>
> I only skimmed through the actual filesystem (and one networking)
> patches, but they looked like trivial conversions to a better
> interface.

... except that this callback can (and always could) get executed after
freeing struct super_block.

So we can't just dereference ->i_sb->s_op and expect to survive;
the table ->s_op pointed to will still be there, but ->i_sb might very
well have been freed, with all its contents overwritten. We need to
copy the callback into struct inode itself, unfortunately. The
following incremental fixes it; I'm going to fold it into the first
commit in there.

diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index 9d80f9e0855e..b8d3ddd8b8db 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -655,3 +655,11 @@ in your dentry operations instead.
 	* if ->free_inode() is non-NULL, it gets scheduled by call_rcu()
 	* combination of NULL ->destroy_inode and NULL ->free_inode is
 	  treated as NULL/free_inode_nonrcu, to preserve the compatibility.
+
+	Note that the callback (be it via ->free_inode() or explicit call_rcu()
+	in ->destroy_inode()) is *NOT* ordered wrt superblock destruction;
+	as the matter of fact, the superblock and all associated structures
+	might be already gone. The filesystem driver is guaranteed to be still
+	there, but that's it. Freeing memory in the callback is fine; doing
+	more than that is possible, but requires a lot of care and is best
+	avoided.
diff --git a/fs/inode.c b/fs/inode.c
index fb45590d284e..855dad43b11d 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -164,6 +164,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	inode->i_wb_frn_avg_time = 0;
 	inode->i_wb_frn_history = 0;
 #endif
+	inode->free_inode = sb->s_op->free_inode;
 
 	if (security_inode_alloc(inode))
 		goto out;
@@ -211,8 +212,8 @@ EXPORT_SYMBOL(free_inode_nonrcu);
 static void i_callback(struct rcu_head *head)
 {
 	struct inode *inode = container_of(head, struct inode, i_rcu);
-	if (inode->i_sb->s_op->free_inode)
-		inode->i_sb->s_op->free_inode(inode);
+	if (inode->free_inode)
+		inode->free_inode(inode);
 	else
 		free_inode_nonrcu(inode);
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2e9b9f87caca..5ed6b39e588e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -718,6 +718,7 @@ struct inode {
 #endif
 
 	void			*i_private; /* fs or device private pointer */
+	void (*free_inode)(struct inode *);
 } __randomize_layout;
 
 static inline unsigned int i_blocksize(const struct inode *node)
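The reason for copying the callback into the inode can be shown with a toy Python model. It is only a sketch of the lifetime argument, not kernel code: the `SuperBlock` and `Inode` classes and the `log` list are hypothetical, and "freeing" the superblock is simulated by clearing its field.

```python
class SuperBlock:
    def __init__(self, free_inode):
        self.free_inode = free_inode   # stands in for sb->s_op->free_inode


class Inode:
    def __init__(self, sb):
        self.sb = sb
        # Copy taken at init time, as in inode_init_always() above.
        self.free_inode = sb.free_inode


def i_callback(inode, log):
    """Delayed callback: must not look at inode.sb, it may be gone."""
    if inode.free_inode:
        inode.free_inode(inode, log)
    else:
        log.append("free_inode_nonrcu")


def fs_free_inode(inode, log):
    log.append("fs free_inode")
```

If the callback read the function through `inode.sb` instead, it would see whatever the destroyed superblock now contains; the per-inode copy stays valid for as long as the inode itself does.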
[RFC PATCH v4 07/15] tools/testing/slab: Add object migration test module
We just implemented slab movable objects for the SLUB allocator. We
should test that code. In order to do so we need to be able to do a
number of things:

 - Create a cache
 - Enable Slab Movable Objects for the cache
 - Allocate objects to the cache
 - Free objects from within specific slabs of the cache

We can do all this via a loadable module.

Add a module that defines functions that can be triggered from userspace
via a debugfs entry. From the source:

  /*
   * SLUB defragmentation a.k.a. Slab Movable Objects (SMO).
   *
   * This module is used for testing the SLUB allocator. Enables
   * userspace to run kernel functions via a debugfs file.
   *
   * debugfs: /sys/kernel/debugfs/smo/callfn (write only)
   *
   * String written to `callfn` is parsed by the module and associated
   * function is called. See fn_tab for mapping of strings to functions.
   */

References to allocated objects are kept by the module in a linked list
so that userspace can control which object to free.

We introduce the following four functions via the function table:

  "enable": Enables object migration for the test cache.
  "alloc X": Allocates X objects
  "free X [Y]": Frees X objects starting at list position Y (default Y==0)
  "test": Runs [stress] tests from within the module (see below).

       {"enable", smo_enable_cache_mobility},
       {"alloc", smo_alloc_objects},
       {"free", smo_free_object},
       {"test", smo_run_module_tests},

Freeing from the start of the list creates a hole in the slab being
freed from (i.e. creates a partial slab). The results of running these
commands can be seen using `slabinfo` (available in tools/vm/):

  make -o slabinfo tools/vm/slabinfo.c

Stress tests can be run from within the module. These tests are
internal to the module because we verify that object references are
still good after object migration. These are called 'stress' tests
because it is intended that they create/free a lot of objects.
Userspace can control the number of objects to create, default is 1000.

Example test session

Relevant /proc/slabinfo column headers:

  name

  # mount -t debugfs none /sys/kernel/debug/
  $ cd path/to/linux/tools/testing/slab; make
  ...

  # insmod slub_defrag.ko
  # cat /proc/slabinfo | grep smo_test | sed 's/:.*//'
  smo_test               0      0    392   20    2

From this we can see that the module created cache 'smo_test' with 20
objects per slab and 2 pages per slab (and cache is currently empty).

We can play with the slab allocator manually:

  # insmod slub_defrag.ko
  # echo 'alloc 21' > callfn
  # cat /proc/slabinfo | grep smo_test | sed 's/:.*//'
  smo_test              21     40    392   20    2

We see here that 21 active objects have been allocated creating 2 slabs
(40 total objects).

  # slabinfo smo_test --report

  Slabcache: smo_test  Aliases:  0 Order :  1 Objects: 21

  Sizes (bytes)     Slabs              Debug                Memory
  Object :      56  Total  :       2   Sanity Checks : On   Total:   16384
  SlabObj:     392  Full   :       1   Redzoning     : On   Used :    1176
  SlabSiz:    8192  Partial:       1   Poisoning     : On   Loss :   15208
  Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig:    7056
  Align  :       8  Objects:      20   Tracing       : Off  Lpadd:     704

Now free an object from the first slot of the first slab

  # echo 'free 1' > callfn
  # cat /proc/slabinfo | grep smo_test | sed 's/:.*//'
  smo_test              20     40    392   20    2

  # slabinfo smo_test --report

  Slabcache: smo_test  Aliases:  0 Order :  1 Objects: 20

  Sizes (bytes)     Slabs              Debug                Memory
  Object :      56  Total  :       2   Sanity Checks : On   Total:   16384
  SlabObj:     392  Full   :       0   Redzoning     : On   Used :    1120
  SlabSiz:    8192  Partial:       2   Poisoning     : On   Loss :   15264
  Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig:    6720
  Align  :       8  Objects:      20   Tracing       : Off  Lpadd:     704

Calling shrink now on the cache does nothing because object migration is
not enabled (output omitted). If we enable object migration then shrink
the cache we expect the object from the second slab to be moved to the
first slot in the first slab and the second slab to be removed from the
partial list.

  # echo 'enable' > callfn
  # slabinfo smo_test --shrink
  # slabinfo smo_test --report

  Slabcache: smo_test  Aliases:  0 Order :  1 Objects: 20
  ** Defragmentation at 30%

  Sizes (bytes)     Slabs              Debug                Memory
  Object :      56  Total  :       1   Sanity Checks : On   Total:    8192
  SlabObj:     392  Full   :       1   Redzonin
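The slab counts seen in the session above (21 objects in 2 slabs, then 1 slab after enabling migration and shrinking) can be reproduced with a small Python model. This is a toy sketch, not how SLUB tracks slabs: each slab is represented only by its in-use count, all function names are hypothetical, and the model assumes every object can be moved once migration is enabled.

```python
def alloc(slabs, count, per_slab):
    """Allocate objects, opening a new slab when the last one is full."""
    for _ in range(count):
        if not slabs or slabs[-1] == per_slab:
            slabs.append(0)
        slabs[-1] += 1


def free_from_head(slabs, count):
    """Free objects starting from the first slab that still has any."""
    for _ in range(count):
        for index, inuse in enumerate(slabs):
            if inuse > 0:
                slabs[index] -= 1
                break


def shrink(slabs, per_slab, movable):
    """Return the slab fill counts after a shrink."""
    if not movable:
        # Without migration shrink only drops empty slabs and re-orders.
        return sorted((s for s in slabs if s > 0), reverse=True)
    # With migration objects are packed into as few slabs as possible.
    total = sum(slabs)
    full, rest = divmod(total, per_slab)
    return [per_slab] * full + ([rest] if rest else [])
```

Running `alloc` for 21 objects with 20 per slab gives `[20, 1]`, freeing one object from the head gives `[19, 1]`, and only the movable shrink collapses this to the single full slab shown in the final report.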
[RFC PATCH v4 06/15] tools/vm/slabinfo: Add defrag_used_ratio output
Add output for the newly added defrag_used_ratio sysfs knob.

Signed-off-by: Tobin C. Harding
---
 tools/vm/slabinfo.c | 4
 1 file changed, 4 insertions(+)

diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
index d2c22f9ee2d8..ef4ff93df4cc 100644
--- a/tools/vm/slabinfo.c
+++ b/tools/vm/slabinfo.c
@@ -34,6 +34,7 @@ struct slabinfo {
 	unsigned int sanity_checks, slab_size, store_user, trace;
 	int order, poison, reclaim_account, red_zone;
 	int movable, ctor;
+	int defrag_used_ratio;
 	int remote_node_defrag_ratio;
 	unsigned long partial, objects, slabs, objects_partial, objects_total;
 	unsigned long alloc_fastpath, alloc_slowpath;
@@ -549,6 +550,8 @@ static void report(struct slabinfo *s)
 		printf("** Slabs are destroyed via RCU\n");
 	if (s->reclaim_account)
 		printf("** Reclaim accounting active\n");
+	if (s->movable)
+		printf("** Defragmentation at %d%%\n", s->defrag_used_ratio);
 
 	printf("\nSizes (bytes) Slabs Debug Memory\n");
 	printf("\n");
@@ -1279,6 +1282,7 @@ static void read_slab_dir(void)
 			slab->deactivate_bypass = get_obj("deactivate_bypass");
 			slab->remote_node_defrag_ratio =
 					get_obj("remote_node_defrag_ratio");
+			slab->defrag_used_ratio = get_obj("defrag_used_ratio");
 			chdir("..");
 			if (read_slab_obj(slab, "ops")) {
 				if (strstr(buffer, "ctor :"))
-- 
2.21.0
[RFC PATCH v4 05/15] tools/vm/slabinfo: Add remote node defrag ratio output
Add output line for NUMA remote node defrag ratio.

Signed-off-by: Tobin C. Harding
---
 tools/vm/slabinfo.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
index cbfc56c44c2f..d2c22f9ee2d8 100644
--- a/tools/vm/slabinfo.c
+++ b/tools/vm/slabinfo.c
@@ -34,6 +34,7 @@ struct slabinfo {
 	unsigned int sanity_checks, slab_size, store_user, trace;
 	int order, poison, reclaim_account, red_zone;
 	int movable, ctor;
+	int remote_node_defrag_ratio;
 	unsigned long partial, objects, slabs, objects_partial, objects_total;
 	unsigned long alloc_fastpath, alloc_slowpath;
 	unsigned long free_fastpath, free_slowpath;
@@ -377,6 +378,10 @@ static void slab_numa(struct slabinfo *s, int mode)
 	if (skip_zero && !s->slabs)
 		return;
 
+	if (mode) {
+		printf("\nNUMA remote node defrag ratio: %3d\n",
+		       s->remote_node_defrag_ratio);
+	}
 	if (!line) {
 		printf("\n%-21s:", mode ? "NUMA nodes" : "Slab");
 		for(node = 0; node <= highest_node; node++)
@@ -1272,6 +1277,8 @@ static void read_slab_dir(void)
 			slab->cpu_partial_free = get_obj("cpu_partial_free");
 			slab->alloc_node_mismatch = get_obj("alloc_node_mismatch");
 			slab->deactivate_bypass = get_obj("deactivate_bypass");
+			slab->remote_node_defrag_ratio =
+					get_obj("remote_node_defrag_ratio");
 			chdir("..");
 			if (read_slab_obj(slab, "ops")) {
 				if (strstr(buffer, "ctor :"))
-- 
2.21.0
[RFC PATCH v4 04/15] slub: Slab defrag core
Internal fragmentation can occur within pages used by the slub
allocator. Under some workloads large numbers of pages can be used by
partial slab pages. This under-utilisation is bad not only because it
wastes memory but also because if the system is under memory pressure
higher order allocations may become difficult to satisfy. If we can
defrag slab caches we can alleviate these problems.

Implement Slab Movable Objects in order to defragment slab caches.

Slab defragmentation may occur:

1. Unconditionally when __kmem_cache_shrink() is called on a slab cache
   by the kernel calling kmem_cache_shrink().

2. Unconditionally through the use of the slabinfo command.

	slabinfo -s

3. Conditionally via the use of kmem_cache_defrag()

- Use Slab Movable Objects when shrinking cache.

  Currently when the kernel calls kmem_cache_shrink() we curate the
  partial slabs list. If object migration is not enabled for the cache
  we still do this. If, however, SMO is enabled we attempt to move
  objects in partially full slabs in order to defragment the cache.
  Shrink attempts to move all objects in order to reduce the cache to a
  single partial slab for each node.

- Add conditional per node defrag via new function:

	kmem_defrag_slabs(int node).

  kmem_defrag_slabs() attempts to defragment all slab caches for
  node. Defragmentation is done conditionally dependent on MAX_PARTIAL
  _AND_ defrag_used_ratio.

  Caches are only considered for defragmentation if the number of
  partial slabs exceeds MAX_PARTIAL (per node).

  Also, defragmentation only occurs if the usage ratio of the slab is
  lower than the configured percentage (sysfs field added in this
  patch). Fragmentation ratios are measured by calculating the
  percentage of objects in use compared to the total number of objects
  that the slab page can accommodate.

  The scanning of slab caches is optimized because the defragmentable
  slabs come first on the list. Thus we can terminate scans on the
  first slab encountered that does not support defragmentation.

  kmem_defrag_slabs() takes a node parameter. This can either be -1 if
  defragmentation should be performed on all nodes, or a node number.

  Defragmentation may be disabled by setting defrag ratio to 0

	echo 0 > /sys/kernel/slab//defrag_used_ratio

- Add a defrag ratio sysfs field and set it to 30% by default. A limit
  of 30% specifies that more than 3 out of 10 available slots for
  objects need to be in use otherwise slab defragmentation will be
  attempted on the remaining objects.

In order for a cache to be defragmentable the cache must support object
migration (SMO). Enabling SMO for a cache is done via a call to the
recently added function:

	void kmem_cache_setup_mobility(struct kmem_cache *,
				       kmem_cache_isolate_func,
				       kmem_cache_migrate_func);

Co-developed-by: Christoph Lameter
Signed-off-by: Tobin C. Harding
---
 Documentation/ABI/testing/sysfs-kernel-slab |  14 +
 include/linux/slab.h                        |   1 +
 include/linux/slub_def.h                    |   7 +
 mm/slub.c                                   | 385
 4 files changed, 334 insertions(+), 73 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
index 29601d93a1c2..7770c03be6b4 100644
--- a/Documentation/ABI/testing/sysfs-kernel-slab
+++ b/Documentation/ABI/testing/sysfs-kernel-slab
@@ -180,6 +180,20 @@ Description:
 		list. It can be written to clear the current count.
 		Available when CONFIG_SLUB_STATS is enabled.
 
+What:		/sys/kernel/slab/cache/defrag_used_ratio
+Date:		February 2019
+KernelVersion:	5.0
+Contact:	Christoph Lameter
+		Pekka Enberg ,
+Description:
+		The defrag_used_ratio file allows the control of how aggressive
+		slab fragmentation reduction works at reclaiming objects from
+		sparsely populated slabs. This is a percentage. If a slab has
+		less than this percentage of objects allocated then reclaim will
+		attempt to reclaim objects so that the whole slab page can be
+		freed. 0% specifies no reclaim attempt (defrag disabled), 100%
+		specifies attempt to reclaim all pages. The default is 30%.
+ What: /sys/kernel/slab/cache/deactivate_to_tail Date: February 2008 KernelVersion: 2.6.25 diff --git a/include/linux/slab.h b/include/linux/slab.h index 886fc130334d..4bf381b34829 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -149,6 +149,7 @@ struct kmem_cache *kmem_cache_create_usercopy(const char *name, void (*ctor)(void *)); void kmem_cache_destroy(struct kmem_cache *); int kmem_cache_shrink(struct kmem_cache *); +unsigned long kmem_defrag_slabs(int node); void memcg_create_kmem_cache(struct mem
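The two conditions described in the commit message above can be sketched in user space as follows. This is only an illustration under stated assumptions: the helper names (used_percent(), slab_wants_defrag(), cache_is_candidate()) are hypothetical and do not come from the patch, max_partial is passed in rather than taken from the kernel's MAX_PARTIAL, and the comparison follows the sysfs description ("less than this percentage"). The commit message wording ("more than 3 out of 10 ... need to be in use") could be read as treating exactly 30% differently, so the boundary case here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Percentage of object slots in use on one slab page. */
static unsigned int used_percent(unsigned int inuse, unsigned int total)
{
	assert(total > 0 && inuse <= total);
	return inuse * 100 / total;
}

/*
 * Hypothetical per-slab check: defragment only when the usage ratio is
 * below the configured defrag_used_ratio; a ratio of 0 disables it.
 */
static bool slab_wants_defrag(unsigned int inuse, unsigned int total,
			      unsigned int defrag_used_ratio)
{
	if (defrag_used_ratio == 0)
		return false;
	return used_percent(inuse, total) < defrag_used_ratio;
}

/*
 * Hypothetical per-cache check: a cache is only considered when its
 * number of partial slabs (per node) exceeds the threshold.
 */
static bool cache_is_candidate(unsigned long nr_partial,
			       unsigned long max_partial)
{
	return nr_partial > max_partial;
}
```

With the default ratio of 30, a slab holding 2 of 10 possible objects would be selected in this sketch, while one holding 5 of 10 would be left alone.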
[RFC PATCH v4 02/15] tools/vm/slabinfo: Add support for -C and -M options
-C lists caches that use a ctor. -M lists caches that support object migration. Add command line options to show caches with a constructor and caches that are movable (i.e. have migrate function). Co-developed-by: Christoph Lameter Signed-off-by: Tobin C. Harding --- tools/vm/slabinfo.c | 40 1 file changed, 36 insertions(+), 4 deletions(-) diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c index 73818f1b2ef8..cbfc56c44c2f 100644 --- a/tools/vm/slabinfo.c +++ b/tools/vm/slabinfo.c @@ -33,6 +33,7 @@ struct slabinfo { unsigned int hwcache_align, object_size, objs_per_slab; unsigned int sanity_checks, slab_size, store_user, trace; int order, poison, reclaim_account, red_zone; + int movable, ctor; unsigned long partial, objects, slabs, objects_partial, objects_total; unsigned long alloc_fastpath, alloc_slowpath; unsigned long free_fastpath, free_slowpath; @@ -67,6 +68,8 @@ int show_report; int show_alias; int show_slab; int skip_zero = 1; +int show_movable; +int show_ctor; int show_numa; int show_track; int show_first_alias; @@ -109,11 +112,13 @@ static void fatal(const char *x, ...) static void usage(void) { - printf("slabinfo 4/15/2011. (c) 2007 sgi/(c) 2011 Linux Foundation.\n\n" - "slabinfo [-aADefhilnosrStTvz1LXBU] [N=K] [-dafzput] [slab-regexp]\n" + printf("slabinfo 4/15/2017. 
(c) 2007 sgi/(c) 2011 Linux Foundation/(c) 2017 Jump Trading LLC.\n\n" + "slabinfo [-aACDefhilMnosrStTvz1LXBU] [N=K] [-dafzput] [slab-regexp]\n" + "-a|--aliases Show aliases\n" "-A|--activity Most active slabs first\n" "-B|--Bytes Show size in bytes\n" + "-C|--ctor Show slabs with ctors\n" "-D|--display-activeSwitch line format to activity\n" "-e|--empty Show empty slabs\n" "-f|--first-alias Show first alias\n" @@ -121,6 +126,7 @@ static void usage(void) "-i|--inverted Inverted list\n" "-l|--slabs Show slabs\n" "-L|--Loss Sort by loss\n" + "-M|--movable Show caches that support movable objects\n" "-n|--numa Show NUMA information\n" "-N|--lines=K Show the first K slabs\n" "-o|--ops Show kmem_cache_ops\n" @@ -588,6 +594,12 @@ static void slabcache(struct slabinfo *s) if (show_empty && s->slabs) return; + if (show_ctor && !s->ctor) + return; + + if (show_movable && !s->movable) + return; + if (sort_loss == 0) store_size(size_str, slab_size(s)); else @@ -602,6 +614,10 @@ static void slabcache(struct slabinfo *s) *p++ = '*'; if (s->cache_dma) *p++ = 'd'; + if (s->ctor) + *p++ = 'C'; + if (s->movable) + *p++ = 'M'; if (s->hwcache_align) *p++ = 'A'; if (s->poison) @@ -636,7 +652,8 @@ static void slabcache(struct slabinfo *s) printf("%-21s %8ld %7d %15s %14s %4d %1d %3ld %3ld %s\n", s->name, s->objects, s->object_size, size_str, dist_str, s->objs_per_slab, s->order, - s->slabs ? (s->partial * 100) / s->slabs : 100, + s->slabs ? (s->partial * 100) / + (s->slabs * s->objs_per_slab) : 100, s->slabs ? 
(s->objects * s->object_size * 100) / (s->slabs * (page_size << s->order)) : 100, flags); @@ -1256,6 +1273,13 @@ static void read_slab_dir(void) slab->alloc_node_mismatch = get_obj("alloc_node_mismatch"); slab->deactivate_bypass = get_obj("deactivate_bypass"); chdir(".."); + if (read_slab_obj(slab, "ops")) { + if (strstr(buffer, "ctor :")) + slab->ctor = 1; + if (strstr(buffer, "migrate :")) + slab->movable = 1; + } + if (slab->name[0] == ':') alias_targets++; slab++; @@ -1332,6 +1356,8 @@ static void xtotals(void) } struct option opts[] = { + { "ctor", no_argument, NULL, 'C' }, + { "movable", no_argument, NULL, 'M' }, { "aliases", no_argument, NULL, 'a' }, { "activity", no_argument, NULL, 'A' }, { "debug", optional_argument, NULL, 'd' }, @@ -1367,7 +1393,7 @@ int main(int argc, char *argv[]) page_size = getpagesize(); -
[RFC PATCH v4 03/15] slub: Sort slab cache list
It is advantageous to have all defragmentable slabs together at the
beginning of the list of slabs so that there is no need to scan the
complete list. Put defragmentable caches first when adding a slab cache
and others last.

Co-developed-by: Christoph Lameter
Signed-off-by: Tobin C. Harding
---
 mm/slab_common.c | 2 +-
 mm/slub.c        | 6 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 58251ba63e4a..db5e9a0b1535 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -393,7 +393,7 @@ static struct kmem_cache *create_cache(const char *name,
 		goto out_free_cache;
 
 	s->refcount = 1;
-	list_add(&s->list, &slab_caches);
+	list_add_tail(&s->list, &slab_caches);
 	memcg_link_cache(s);
 out:
 	if (err)
diff --git a/mm/slub.c b/mm/slub.c
index ae44d640b8c1..f6b0e4a395ef 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4342,6 +4342,8 @@ void kmem_cache_setup_mobility(struct kmem_cache *s,
 		return;
 	}
 
+	mutex_lock(&slab_mutex);
+
 	s->isolate = isolate;
 	s->migrate = migrate;
 
@@ -4350,6 +4352,10 @@ void kmem_cache_setup_mobility(struct kmem_cache *s,
 	 * to disable fast cmpxchg based processing.
 	 */
 	s->flags &= ~__CMPXCHG_DOUBLE;
+
+	list_move(&s->list, &slab_caches); /* Move to top */
+
+	mutex_unlock(&slab_mutex);
 }
 EXPORT_SYMBOL(kmem_cache_setup_mobility);
 
-- 
2.21.0
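The ordering idea in this patch can be illustrated with a small user-space sketch. Everything below is an assumption made for illustration: the list helpers are simplified stand-ins for the kernel's list API, and struct cache, cache_register(), cache_setup_mobility() and scan_movable() are hypothetical names, not code from the series. It shows new caches going to the tail, caches that gain mobility moving to the front, and a scan that can stop at the first non-movable cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

struct cache {
	const char *name;
	bool movable;
	struct list_head list;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void insert_between(struct list_head *entry,
			   struct list_head *prev, struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	insert_between(entry, head->prev, head);
}

/* Unlink the entry and re-insert it right after the list head. */
static void list_move_front(struct list_head *entry, struct list_head *head)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	insert_between(entry, head, head->next);
}

/* New caches start out non-movable and are appended at the tail. */
static void cache_register(struct cache *c, struct list_head *caches)
{
	c->movable = false;
	list_add_tail(&c->list, caches);
}

/* Enabling mobility moves the cache to the front of the list. */
static void cache_setup_mobility(struct cache *c, struct list_head *caches)
{
	c->movable = true;
	list_move_front(&c->list, caches);
}

/*
 * Count movable caches, stopping at the first non-movable one.
 * *visited reports how many list entries were examined.
 */
static int scan_movable(struct list_head *caches, int *visited)
{
	struct list_head *pos;
	int movable = 0;

	assert(caches && visited);
	*visited = 0;
	for (pos = caches->next; pos != caches; pos = pos->next) {
		struct cache *c = container_of(pos, struct cache, list);

		(*visited)++;
		if (!c->movable)
			break;	/* everything after this is non-movable */
		movable++;
	}
	return movable;
}
```

The sketch leaves out locking entirely; the patch itself takes slab_mutex around the list manipulation.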
[RFC PATCH v4 01/15] slub: Add isolate() and migrate() methods
Add the two methods needed for moving objects and enable the display of the callbacks via the /sys/kernel/slab interface. Add documentation explaining the use of these methods and the prototypes for slab.h. Add functions to setup the callbacks method for a slab cache. Add empty functions for SLAB/SLOB. The API is generic so it could be theoretically implemented for these allocators as well. Change sysfs 'ctor' field to be 'ops' to contain all the callback operations defined for a slab cache. Display the existing 'ctor' callback in the ops fields contents along with 'isolate' and 'migrate' callbacks. Co-developed-by: Christoph Lameter Signed-off-by: Tobin C. Harding --- include/linux/slab.h | 70 include/linux/slub_def.h | 3 ++ mm/slub.c| 59 + 3 files changed, 126 insertions(+), 6 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index 9449b19c5f10..886fc130334d 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -154,6 +154,76 @@ void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *); void memcg_deactivate_kmem_caches(struct mem_cgroup *); void memcg_destroy_kmem_caches(struct mem_cgroup *); +/* + * Function prototypes passed to kmem_cache_setup_mobility() to enable + * mobile objects and targeted reclaim in slab caches. + */ + +/** + * typedef kmem_cache_isolate_func - Object migration callback function. + * @s: The cache we are working on. + * @ptr: Pointer to an array of pointers to the objects to isolate. + * @nr: Number of objects in @ptr array. + * + * The purpose of kmem_cache_isolate_func() is to pin each object so that + * they cannot be freed until kmem_cache_migrate_func() has processed + * them. This may be accomplished by increasing the refcount or setting + * a flag. + * + * The object pointer array passed is also passed to + * kmem_cache_migrate_func(). The function may remove objects from the + * array by setting pointers to %NULL. 
This is useful if we can + * determine that an object is being freed because + * kmem_cache_isolate_func() was called when the subsystem was calling + * kmem_cache_free(). In that case it is not necessary to increase the + * refcount or specially mark the object because the release of the slab + * lock will lead to the immediate freeing of the object. + * + * Context: Called with locks held so that the slab objects cannot be + * freed. We are in an atomic context and no slab operations + * may be performed. + * Return: A pointer that is passed to the migrate function. If any + * objects cannot be touched at this point then the pointer may + * indicate a failure and then the migration function can simply + * remove the references that were already obtained. The private + * data could be used to track the objects that were already pinned. + */ +typedef void *kmem_cache_isolate_func(struct kmem_cache *s, void **ptr, int nr); + +/** + * typedef kmem_cache_migrate_func - Object migration callback function. + * @s: The cache we are working on. + * @ptr: Pointer to an array of pointers to the objects to migrate. + * @nr: Number of objects in @ptr array. + * @node: The NUMA node where the object should be allocated. + * @private: The pointer returned by kmem_cache_isolate_func(). + * + * This function is responsible for migrating objects. Typically, for + * each object in the input array you will want to allocate an new + * object, copy the original object, update any pointers, and free the + * old object. + * + * After this function returns all pointers to the old object should now + * point to the new object. + * + * Context: Called with no locks held and interrupts enabled. Sleeping + * is possible. Any operation may be performed. + */ +typedef void kmem_cache_migrate_func(struct kmem_cache *s, void **ptr, +int nr, int node, void *private); + +/* + * kmem_cache_setup_mobility() is used to setup callbacks for a slab cache. 
+ */ +#ifdef CONFIG_SLUB +void kmem_cache_setup_mobility(struct kmem_cache *, kmem_cache_isolate_func, + kmem_cache_migrate_func); +#else +static inline void +kmem_cache_setup_mobility(struct kmem_cache *s, kmem_cache_isolate_func isolate, + kmem_cache_migrate_func migrate) {} +#endif + /* * Please use this macro to create slab caches. Simply specify the * name of the structure and maybe some flags that are listed above. diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index d2153789bd9f..2879a2f5f8eb 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -99,6 +99,9 @@ struct kmem_cache { gfp_t allocflags; /* gfp flags to use on each alloc */ int refcount; /* Refcount for slab cache destroy */ void (*ctor)(void *); + kmem_cache_isolate_func *isolate; +
[RFC PATCH v4 00/15] Slab Movable Objects (SMO)
Hi,

Another iteration of the SMO patch set; updates to this version are restricted to the dcache patch #14.

Applies on top of Linus' tree (tag: v5.1-rc6).

This is a patch set implementing movable objects within the SLUB allocator. This is work based on Christoph Lameter's patch set:

https://lore.kernel.org/patchwork/project/lkml/list/?series=377335

The original code logic is from that set and implemented by Christoph. Clean up, refactoring, documentation, and additional features by myself. Responsibility for any bugs remaining falls solely with myself.

Changes to this version:

Re-write the dcache Slab Movable Objects isolate/migrate functions. Based on review/suggestions by Alexander on the last version.

In this version the isolate function loops over the object vector and builds a shrink list for all objects that have refcount==0 AND are NOT on anyone else's shrink list. A pointer to this list is returned from the isolate function and passed to the migrate function (by the SMO infrastructure). The dentry migration function d_partial_shrink() simply calls shrink_dentry_list() on the received shrink list pointer and frees the memory associated with the list_head.

Hopefully if this is all ok I can move on to violating the inode slab cache :)

FWIW testing on a VM in Qemu brings this mild benefit to the dentry slab cache with no _apparent_ negatives.
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB=y
CONFIG_SLUB_CPU_PARTIAL=y
CONFIG_SLUB_DEBUG_ON=y
CONFIG_SLUB_STATS=y
CONFIG_SMO_NODE=y
CONFIG_DCACHE_SMO=y

[root@vm ~]# slabinfo dentry -r | head -n 13

Slabcache: dentry           Aliases:  0 Order :  1 Objects: 38585
** Reclaim accounting active
** Defragmentation at 30%

Sizes (bytes)     Slabs              Debug                Memory
Object :     192  Total  :    2582   Sanity Checks : On   Total: 21151744
SlabObj:     528  Full   :    2547   Redzoning     : On   Used :  7408320
SlabSiz:    8192  Partial:      35   Poisoning     : On   Loss : 13743424
Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig: 12964560
Align  :       8  Objects:      15   Tracing       : Off  Lpadd:   702304

[root@vm ~]# slabinfo dentry --shrink
[root@vm ~]# slabinfo dentry -r | head -n 13

Slabcache: dentry           Aliases:  0 Order :  1 Objects: 38426
** Reclaim accounting active
** Defragmentation at 30%

Sizes (bytes)     Slabs              Debug                Memory
Object :     192  Total  :    2578   Sanity Checks : On   Total: 21118976
SlabObj:     528  Full   :    2547   Redzoning     : On   Used :  7377792
SlabSiz:    8192  Partial:      31   Poisoning     : On   Loss : 13741184
Loss   :     336  CpuSlab:       0   Tracking      : On   Lalig: 12911136
Align  :       8  Objects:      15   Tracing       : Off  Lpadd:   701216

Please note, this dentry shrink implementation is 'best effort', results vary. This is as is expected. We are trying to unobtrusively shrink the dentry cache.

thanks,
Tobin.

Tobin C. Harding (15):
  slub: Add isolate() and migrate() methods
  tools/vm/slabinfo: Add support for -C and -M options
  slub: Sort slab cache list
  slub: Slab defrag core
  tools/vm/slabinfo: Add remote node defrag ratio output
  tools/vm/slabinfo: Add defrag_used_ratio output
  tools/testing/slab: Add object migration test module
  tools/testing/slab: Add object migration test suite
  xarray: Implement migration function for objects
  tools/testing/slab: Add XArray movable objects tests
  slub: Enable moving objects to/from specific nodes
  slub: Enable balancing slabs across nodes
  dcache: Provide a dentry constructor
  dcache: Implement partial shrink via Slab Movable Objects
  dcache: Add CONFIG_DCACHE_SMO

 Documentation/ABI/testing/sysfs-kernel-slab |  14 +
 fs/dcache.c                                 | 110 ++-
 include/linux/slab.h                        |  71 ++
 include/linux/slub_def.h                    |  10 +
 lib/radix-tree.c                            |  13 +
 lib/xarray.c                                |  49 ++
 mm/Kconfig                                  |  14 +
 mm/slab_common.c                            |   2 +-
 mm/slub.c                                   | 819 ++--
 tools/testing/slab/Makefile                 |  10 +
 tools/testing/slab/slub_defrag.c            | 567 ++
 tools/testing/slab/slub_defrag.py           | 451 +++
 tools/testing/slab/slub_defrag_xarray.c     | 211 +
 tools/vm/slabinfo.c                         |  51 +-
 14 files changed, 2299 insertions(+), 93 deletions(-)
 create mode 100644 tools/testing/slab/Makefile
 create mode 100644 tools/testing/slab/slub_defrag.c
 create mode 100755 tools/testing/slab/slub_defrag.py
 create mode 100644 tools/testing/slab/slub_defrag_xarray.c

-- 
2.21.0
Re: [PATCH -next] ASoC: sprd: Fix to use list_for_each_entry_safe() when delete items
Hi,

On Mon, 29 Apr 2019 at 20:27, Wei Yongjun wrote:
>
> Since we will remove items off the list using list_del() we need
> to use a safe version of the list_for_each_entry() macro aptly named
> list_for_each_entry_safe().
>
> Fixes: d7bff893e04f ("ASoC: sprd: Add Spreadtrum multi-channel data transfer
> support")
> Signed-off-by: Wei Yongjun

Yes, thanks for your fixes.
Reviewed-by: Baolin Wang

> ---
>  sound/soc/sprd/sprd-mcdt.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/sound/soc/sprd/sprd-mcdt.c b/sound/soc/sprd/sprd-mcdt.c
> index 28f5e649733d..df250f7f2b6f 100644
> --- a/sound/soc/sprd/sprd-mcdt.c
> +++ b/sound/soc/sprd/sprd-mcdt.c
> @@ -978,12 +978,12 @@ static int sprd_mcdt_probe(struct platform_device *pdev)
>
>  static int sprd_mcdt_remove(struct platform_device *pdev)
>  {
> -	struct sprd_mcdt_chan *temp;
> +	struct sprd_mcdt_chan *chan, *temp;
>
>  	mutex_lock(&sprd_mcdt_list_mutex);
>
> -	list_for_each_entry(temp, &sprd_mcdt_chan_list, list)
> -		list_del(&temp->list);
> +	list_for_each_entry_safe(chan, temp, &sprd_mcdt_chan_list, list)
> +		list_del(&chan->list);
>
>  	mutex_unlock(&sprd_mcdt_list_mutex);
>
>

-- 
Baolin Wang
Best Regards
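For anyone wondering why the plain iterator is unsafe here, the following user-space sketch may help. It is not the kernel implementation: the list helpers are simplified stand-ins, struct chan and remove_all() are hypothetical, and list_del() poisons the removed entry with NULL rather than the kernel's poison values. The point it tries to show is that the successor has to be saved before the current entry is unlinked, which is what list_for_each_entry_safe() does with its extra cursor.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

struct chan {
	int id;
	struct list_head list;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry and clear its links, so reading ->next afterwards is a bug. */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = NULL;
	entry->prev = NULL;
}

/* Remove every channel from the list; returns the number removed. */
static int remove_all(struct list_head *head)
{
	struct list_head *pos, *tmp;
	int removed = 0;

	assert(head);
	/* tmp holds the successor before list_del() clears pos->next. */
	for (pos = head->next, tmp = pos->next; pos != head;
	     pos = tmp, tmp = pos->next) {
		struct chan *c = container_of(pos, struct chan, list);

		list_del(&c->list);
		removed++;
	}
	return removed;
}
```

Advancing with pos = pos->next after list_del() would, in this sketch, dereference the cleared link on the second iteration.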
Re: [PATCH V2] staging: fieldbus: anybus-s: force endiannes annotation
On Tue, Apr 30, 2019 at 04:22:38AM +0200, Nicholas Mc Guire wrote:
> On Mon, Apr 29, 2019 at 10:03:36AM -0400, Sven Van Asbroeck wrote:
> > On Mon, Apr 29, 2019 at 2:11 AM Nicholas Mc Guire wrote:
> > >
> > > V2: As requested by Sven Van Asbroeck make the
> > > impact of the patch clear in the commit message.
> >
> > Thank you, but did you miss my comment about creating a local variable
> > instead? See:
> > https://lkml.org/lkml/2019/4/28/97
>
> Did not miss it - I just don't think that makes it any more
> understandable - the __force __be16 makes it clear I believe
> that this is correct, sparse does not like this though - so tell
> sparse.

... to STFU, 'cause you know better. The trouble is, how do we (or yourself a year or two later) know *why* it is correct? Worse, how do we (or yourself, etc.) know if a change about to be done to the code won't invalidate the proof of yours?

> The local variable would need to be explained as it is
> functionally not necessary - therefore I find it more confusing
> than using __force here.

What's confusing is mixing host- and fixed-endian values in the same variable at different times. Treat those as unrelated types that happen to have the same sizeof.

Quite a few of __force instances in the tree should be taken out and shot. Don't add to their number.
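The "unrelated types" advice can be sketched in user space roughly like this. It is an illustration only, not the anybus-s driver code: be16_t, cpu_to_be16_sketch(), be16_to_cpu_sketch() and next_addr() are hypothetical names, and a plain typedef gives none of the checking that sparse provides for __be16. What it does show is the pattern being asked for: the wire value and the host value each get their own variable, and the conversion happens exactly once at the boundary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag: the 16 bits are stored in big-endian byte order. */
typedef uint16_t be16_t;

/* Build the big-endian representation byte by byte, on any host. */
static be16_t cpu_to_be16_sketch(uint16_t host)
{
	uint8_t bytes[2] = { (uint8_t)(host >> 8), (uint8_t)(host & 0xff) };
	be16_t wire;

	memcpy(&wire, bytes, sizeof(wire));
	return wire;
}

/* Decode a big-endian value byte by byte, on any host. */
static uint16_t be16_to_cpu_sketch(be16_t wire)
{
	uint8_t bytes[2];

	memcpy(bytes, &wire, sizeof(bytes));
	return (uint16_t)((bytes[0] << 8) | bytes[1]);
}

/*
 * Take an address as it arrives from the device (big-endian) and
 * return the next address in host order.
 */
static uint16_t next_addr(be16_t wire_addr, uint16_t step)
{
	/* host_addr only ever holds host-order data; wire_addr never does. */
	uint16_t host_addr = be16_to_cpu_sketch(wire_addr);

	assert(step <= UINT16_MAX - host_addr);	/* no wrap-around in this sketch */
	return (uint16_t)(host_addr + step);
}
```

Because the two values never share a variable, a later change to the arithmetic cannot silently end up operating on byte-swapped data.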
RE: [PATCH] clk: imx: pllv3: Fix fall through build warning
> From: Anson Huang
> Sent: Tuesday, April 30, 2019 9:55 AM
> Subject: [PATCH] clk: imx: pllv3: Fix fall through build warning
>
> Fix below fall through build warning:
>
> drivers/clk/imx/clk-pllv3.c:453:21: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>
>    pll->denom_offset = PLL_IMX7_DENOM_OFFSET;
>                      ^
> drivers/clk/imx/clk-pllv3.c:454:2: note: here
>   case IMX_PLLV3_AV:
>   ^~~~
>
> Signed-off-by: Anson Huang

Reviewed-by: Dong Aisheng

Regards
Dong Aisheng
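As background on the warning itself, here is a small stand-alone sketch of an intentional fall through that GCC accepts. The enum, struct and offsets are hypothetical and are not taken from clk-pllv3.c, and the quoted mail does not show how the actual patch silences the warning, so this is only one common way of doing it: a "fall through" comment placed directly before the next case label, which GCC recognises at its default -Wimplicit-fallthrough level.

```c
#include <assert.h>

/* Hypothetical PLL variants; the V7 one shares the AV setup. */
enum pll_type { PLL_AV, PLL_AV_V7 };

struct pll {
	unsigned int num_offset;
	unsigned int denom_offset;
	int has_av_ops;
};

static void pll_setup(struct pll *pll, enum pll_type type)
{
	assert(pll);

	/* Defaults used by the plain AV variant. */
	pll->num_offset = 0x10;
	pll->denom_offset = 0x20;
	pll->has_av_ops = 0;

	switch (type) {
	case PLL_AV_V7:
		pll->num_offset = 0x20;
		pll->denom_offset = 0x30;
		/* fall through */
	case PLL_AV:
		pll->has_av_ops = 1;
		break;
	}
}
```

Without the comment, the assignment before case PLL_AV would trigger the same "this statement may fall through" diagnostic quoted above when the warning is enabled.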
Re: [PATCH -next] ASoC: sprd: Fix return value check in sprd_mcdt_probe()
On Mon, 29 Apr 2019 at 20:15, Wei Yongjun wrote:
>
> In case of error, the function devm_ioremap_resource() returns ERR_PTR()
> and never returns NULL. The NULL test in the return value check should
> be replaced with IS_ERR().
>
> Fixes: d7bff893e04f ("ASoC: sprd: Add Spreadtrum multi-channel data transfer
> support")
> Signed-off-by: Wei Yongjun

Thanks for fixing my mistake.
Reviewed-by: Baolin Wang

> ---
>  sound/soc/sprd/sprd-mcdt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sound/soc/sprd/sprd-mcdt.c b/sound/soc/sprd/sprd-mcdt.c
> index 28f5e649733d..e9318d7a4810 100644
> --- a/sound/soc/sprd/sprd-mcdt.c
> +++ b/sound/soc/sprd/sprd-mcdt.c
> @@ -951,8 +951,8 @@ static int sprd_mcdt_probe(struct platform_device *pdev)
>
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>  	mcdt->base = devm_ioremap_resource(&pdev->dev, res);
> -	if (!mcdt->base)
> -		return -ENOMEM;
> +	if (IS_ERR(mcdt->base))
> +		return PTR_ERR(mcdt->base);
>
>  	mcdt->dev = &pdev->dev;
>  	spin_lock_init(&mcdt->lock);
>
>

-- 
Baolin Wang
Best Regards
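A user-space sketch of the error-pointer convention may make it clearer why the NULL test never fires. The helpers below (err_ptr(), is_err(), ptr_err(), map_region(), probe()) are hypothetical lower-case stand-ins written for this illustration; they mimic the idea behind ERR_PTR()/IS_ERR()/PTR_ERR() but are not the kernel's definitions, and map_region() only imitates a function that reports failure through an encoded pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer near the top of the address space. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True when the pointer falls in the range reserved for encoded errors. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno value from an encoded pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/*
 * Hypothetical mapping helper: on failure it returns an encoded error,
 * never NULL.
 */
static void *map_region(int should_fail)
{
	static char region[16];

	return should_fail ? err_ptr(-ENOMEM) : (void *)region;
}

static long probe(int should_fail)
{
	void *base = map_region(should_fail);

	/* A "!base" test would miss the failure: the encoded error is non-NULL. */
	if (is_err(base))
		return ptr_err(base);
	return 0;
}
```

In this sketch probe(1) propagates -ENOMEM, whereas a check for NULL would have let the bogus pointer through to later code.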
Re: INFO: task hung in __get_super
On Sun 28-04-19 19:51:09, Al Viro wrote: > On Sun, Apr 28, 2019 at 11:14:06AM -0700, syzbot wrote: > > down_read+0x49/0x90 kernel/locking/rwsem.c:26 > > __get_super.part.0+0x203/0x2e0 fs/super.c:788 > > __get_super include/linux/spinlock.h:329 [inline] > > get_super+0x2e/0x50 fs/super.c:817 > > fsync_bdev+0x19/0xd0 fs/block_dev.c:525 > > invalidate_partition+0x36/0x60 block/genhd.c:1581 > > drop_partitions block/partition-generic.c:443 [inline] > > rescan_partitions+0xef/0xa20 block/partition-generic.c:516 > > __blkdev_reread_part+0x1a2/0x230 block/ioctl.c:173 > > blkdev_reread_part+0x27/0x40 block/ioctl.c:193 > > loop_reread_partitions+0x1c/0x40 drivers/block/loop.c:633 > > loop_set_status+0xe57/0x1380 drivers/block/loop.c:1296 > > loop_set_status64+0xc2/0x120 drivers/block/loop.c:1416 > > lo_ioctl+0x8fc/0x2150 drivers/block/loop.c:1559 > > __blkdev_driver_ioctl block/ioctl.c:303 [inline] > > blkdev_ioctl+0x6f2/0x1d10 block/ioctl.c:605 > > block_ioctl+0xee/0x130 fs/block_dev.c:1933 > > vfs_ioctl fs/ioctl.c:46 [inline] > > file_ioctl fs/ioctl.c:509 [inline] > > do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696 > > ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 > > __do_sys_ioctl fs/ioctl.c:720 [inline] > > __se_sys_ioctl fs/ioctl.c:718 [inline] > > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 > > do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > ioctl(..., BLKRRPART) blocked on ->s_umount in __get_super(). 
> The trouble is, the only things holding ->s_umount appears to be > these: > > > 2 locks held by syz-executor274/11716: > > #0: a19e2025 (&type->s_umount_key#38/1){+.+.}, at: > > alloc_super+0x158/0x890 fs/super.c:228 > > #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl > > drivers/block/loop.c:1514 [inline] > > #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150 > > drivers/block/loop.c:1572 > > > 2 locks held by syz-executor274/11717: > > #0: e185c083 (&type->s_umount_key#38/1){+.+.}, at: > > alloc_super+0x158/0x890 fs/super.c:228 > > #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl > > drivers/block/loop.c:1514 [inline] > > #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150 > > drivers/block/loop.c:1572 > > ... and that's bollocks. ->s_umount held there is that on freshly allocated > superblock. It *MUST* be in mount(2); no other syscall should be able to > call alloc_super() in the first place. So what the hell is that doing > trying to call lo_ioctl() inside mount(2)? Something like isofs attempting > cdrom ioctls on the underlying device? Actually UDF also calls CDROMMULTISESSION ioctl during mount. So I could see how we get to lo_simple_ioctl() and indeed that would acquire loop_ctl_mutex under s_umount which is the other way around than in BLKRRPART ioctl. > Why do we have loop_func_table->ioctl(), BTW? All in-tree instances are > either NULL or return -EINVAL unconditionally. Considering that the > caller is > err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL; > we could bloody well just get rid of cryptoloop_ioctl() (the only > non-NULL instance) and get rid of calling lo_simple_ioctl() in > lo_ioctl() switch's default. Yeah, you're right. And if we push the patch a bit further to not take loop_ctl_mutex for invalid ioctl number, that would fix the problem. I can send a fix. 
Honza > > Something like this: > > diff --git a/drivers/block/cryptoloop.c b/drivers/block/cryptoloop.c > index 254ee7d54e91..f16468a562f5 100644 > --- a/drivers/block/cryptoloop.c > +++ b/drivers/block/cryptoloop.c > @@ -167,12 +167,6 @@ cryptoloop_transfer(struct loop_device *lo, int cmd, > } > > static int > -cryptoloop_ioctl(struct loop_device *lo, int cmd, unsigned long arg) > -{ > - return -EINVAL; > -} > - > -static int > cryptoloop_release(struct loop_device *lo) > { > struct crypto_sync_skcipher *tfm = lo->key_data; > @@ -188,7 +182,6 @@ cryptoloop_release(struct loop_device *lo) > static struct loop_func_table cryptoloop_funcs = { > .number = LO_CRYPT_CRYPTOAPI, > .init = cryptoloop_init, > - .ioctl = cryptoloop_ioctl, > .transfer = cryptoloop_transfer, > .release = cryptoloop_release, > .owner = THIS_MODULE > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index bf1c61cab8eb..2ec162b80562 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -955,7 +955,6 @@ static int loop_set_fd(struct loop_device *lo, fmode_t > mode, > lo->lo_flags = lo_flags; > lo->lo_backing_file = file; > lo->transfer = NULL; > - lo->ioctl = NULL; > lo->lo_sizelimit = 0; > lo->old_gfp_mask = mapping_gfp_mask(mapping); > mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)); > @@ -1064,7 +1063,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool > release)
Re: [PATCH 3/4] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation
On Mon, Apr 29, 2019 at 5:45 PM Sean Christopherson wrote:
>
> On Mon, Apr 29, 2019 at 05:08:46PM -0700, Sean Christopherson wrote:
> >
> > It's 486 based, but either way I suspect the answer is "yes". IIRC,
> > Knights Corner, a.k.a. Larrabee, also had funkiness around SMM and that
> > was based on P54C, though I'm struggling to recall exactly what the
> > Larrabee weirdness was.
>
> Aha! Found an ancient comment that explicitly states P5 does not block
> NMI/SMI in the STI shadow, while P6 does block NMI/SMI.

Ok, so the STI shadow really wouldn't be reliable on those machines. Scary.

Of course, the good news is that hopefully nobody has them any more, and if they do, they presumably don't use fancy NMI profiling etc, so any actual NMI's are probably relegated purely to largely rare and effectively fatal errors anyway (ie memory parity errors).

Linus
Re: [PATCH] quota: set init_needed flag only when successfully getting dquot
On 4/30/19 5:49 AM, Jan Kara wrote:
> On Sun 28-04-19 13:39:21, Chengguang Xu wrote:
> > Set init_needed flag only when successfully getting dquot,
> > so that we can skip unnecessary subsequent operation.
> >
> > Signed-off-by: Chengguang Xu
>
> Thanks for the patch but I don't think it's really useful. It will be very
> rare that we race with quotaoff or dqget() fails due to error. So the
> additional overhead of iterating over dquots doesn't really matter in that
> case.

Hi Jan,

Thanks for the comment, I got it.

Chengguang.
Re: [PATCH V2] staging: fieldbus: anybus-s: force endiannes annotation
On Mon, Apr 29, 2019 at 10:03:36AM -0400, Sven Van Asbroeck wrote:
> On Mon, Apr 29, 2019 at 2:11 AM Nicholas Mc Guire wrote:
> >
> > V2: As requested by Sven Van Asbroeck make the
> > impact of the patch clear in the commit message.
>
> Thank you, but did you miss my comment about creating a local variable
> instead? See:
> https://lkml.org/lkml/2019/4/28/97

Did not miss it - I just don't think that makes it any more understandable - the __force __be16 makes it clear I believe that this is correct, sparse does not like this though - so tell sparse.

The local variable would need to be explained as it is functionally not necessary - therefore I find it more confusing than using __force here.

If that rationale is wrong let me know.

thx!
hofrat
[PATCH] treewide: fix awk regexp over-escaping
Fix "warning: regexp escape sequence is not a known regexp operator" on gawk 5.0.0. Results found by: - grepping '\\[^\[\\^$.|?*+()a-z]' on *.awk - grepping 'awk.*\\[^\[\\^$.|?*+()a-z]' - running awk --lint -f /dev/null on *.awk Signed-off-by: Alex Xu (Hello71) --- Documentation/arm/Samsung/clksrc-change-registers.awk | 2 +- arch/x86/tools/gen-insn-attr-x86.awk | 4 ++-- lib/raid6/unroll.awk | 2 +- tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk | 4 ++-- tools/perf/arch/x86/tests/gen-insn-x86-dat.awk | 2 +- tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk | 4 ++-- 6 files changed, 9 insertions(+), 9 deletions(-) diff --git a/Documentation/arm/Samsung/clksrc-change-registers.awk b/Documentation/arm/Samsung/clksrc-change-registers.awk index 7be1b8aa7cd9..d853f750c861 100755 --- a/Documentation/arm/Samsung/clksrc-change-registers.awk +++ b/Documentation/arm/Samsung/clksrc-change-registers.awk @@ -67,7 +67,7 @@ BEGIN { # to replace and create an associative array of values while (getline line < ARGV[1] > 0) { - if (line ~ /\#define.*_MASK/ && + if (line ~ /#define.*_MASK/ && !(line ~ /USB_SIG_MASK/)) { splitdefine(line, fields) name = fields[0] diff --git a/arch/x86/tools/gen-insn-attr-x86.awk b/arch/x86/tools/gen-insn-attr-x86.awk index b02a36b2c14f..a42015b305f4 100644 --- a/arch/x86/tools/gen-insn-attr-x86.awk +++ b/arch/x86/tools/gen-insn-attr-x86.awk @@ -69,7 +69,7 @@ BEGIN { lprefix1_expr = "\\((66|!F3)\\)" lprefix2_expr = "\\(F3\\)" - lprefix3_expr = "\\((F2|!F3|66\\&F2)\\)" + lprefix3_expr = "\\((F2|!F3|66&F2)\\)" lprefix_expr = "\\((66|F2|F3)\\)" max_lprefix = 4 @@ -257,7 +257,7 @@ function convert_operands(count,opnd, i,j,imm,mod) return add_flags(imm, mod) } -/^[0-9a-f]+\:/ { +/^[0-9a-f]+:/ { if (NR == 1) next # get index diff --git a/lib/raid6/unroll.awk b/lib/raid6/unroll.awk index c6aa03631df8..0809805a7e23 100644 --- a/lib/raid6/unroll.awk +++ b/lib/raid6/unroll.awk @@ -13,7 +13,7 @@ BEGIN { for (i = 0; i < rep; ++i) { tmp = $0 gsub(/\$\$/, i, 
tmp) - gsub(/\$\#/, n, tmp) + gsub(/\$#/, n, tmp) gsub(/\$\*/, "$", tmp) print tmp } diff --git a/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk b/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk index b02a36b2c14f..a42015b305f4 100644 --- a/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk +++ b/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk @@ -69,7 +69,7 @@ BEGIN { lprefix1_expr = "\\((66|!F3)\\)" lprefix2_expr = "\\(F3\\)" - lprefix3_expr = "\\((F2|!F3|66\\&F2)\\)" + lprefix3_expr = "\\((F2|!F3|66&F2)\\)" lprefix_expr = "\\((66|F2|F3)\\)" max_lprefix = 4 @@ -257,7 +257,7 @@ function convert_operands(count,opnd, i,j,imm,mod) return add_flags(imm, mod) } -/^[0-9a-f]+\:/ { +/^[0-9a-f]+:/ { if (NR == 1) next # get index diff --git a/tools/perf/arch/x86/tests/gen-insn-x86-dat.awk b/tools/perf/arch/x86/tests/gen-insn-x86-dat.awk index a21454835cd4..27585d032ee6 100644 --- a/tools/perf/arch/x86/tests/gen-insn-x86-dat.awk +++ b/tools/perf/arch/x86/tests/gen-insn-x86-dat.awk @@ -31,7 +31,7 @@ BEGIN { going = 0 } -/^\s*[0-9a-fA-F]+\:/ { +/^\s*[0-9a-fA-F]+:/ { if (going) { colon_pos = index($0, ":") useful_line = substr($0, colon_pos + 1) diff --git a/tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk b/tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk index ddd5c4c21129..606ccd154392 100644 --- a/tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk +++ b/tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk @@ -69,7 +69,7 @@ BEGIN { lprefix1_expr = "\\((66|!F3)\\)" lprefix2_expr = "\\(F3\\)" - lprefix3_expr = "\\((F2|!F3|66\\&F2)\\)" + lprefix3_expr = "\\((F2|!F3|66&F2)\\)" lprefix_expr = "\\((66|F2|F3)\\)" max_lprefix = 4 @@ -257,7 +257,7 @@ function convert_operands(count,opnd, i,j,imm,mod) return add_flags(imm, mod) } -/^[0-9a-f]+\:/ { +/^[0-9a-f]+:/ { if (NR == 1) next # get index -- 2.21.0