Re: [PATCH 4.9 000/107] 4.9.120-stable review
On Wed, Aug 15, 2018 at 02:36:00AM +0200, Sebastian Gottschall wrote: > if SWAP is disabled in kernel config, the following compile error will raise > up with this release > > arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): > undefined reference to `generic_max_swapfile_size' > > of course this is simple to fix. the function max_swapfile_size must be > excluded if CONFIG_SWAP is disabled Now fixed up, thanks. greg k-h
Re: [PATCH 4.9 000/107] 4.9.120-stable review
On Wed, Aug 15, 2018 at 02:36:00AM +0200, Sebastian Gottschall wrote: > if SWAP is disabled in kernel config, the following compile error will raise > up with this release > > arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): > undefined reference to `generic_max_swapfile_size' > > of course this is simple to fix. the function max_swapfile_size must be > excluded if CONFIG_SWAP is disabled Now fixed up, thanks. greg k-h
Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.
On Tue, Aug 14, 2018 at 08:14:47PM -0400, Michael Jin wrote: > Therefore, I would like you to confirm that model 10h uses 0x15e8 (device > F0) and 0x15ee (device F6) as I can not find any documentation or test > whether ECC works. He just asked you to make the change for models 0x10-0x2f. How much more confirmation do you need? -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --
Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.
On Tue, Aug 14, 2018 at 08:14:47PM -0400, Michael Jin wrote: > Therefore, I would like you to confirm that model 10h uses 0x15e8 (device > F0) and 0x15ee (device F6) as I can not find any documentation or test > whether ECC works. He just asked you to make the change for models 0x10-0x2f. How much more confirmation do you need? -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --
Re: [PATCH 4.9 000/107] 4.9.120-stable review
On Tue, Aug 14, 2018 at 11:58:18AM -0700, Nathan Chancellor wrote: > On Tue, Aug 14, 2018 at 07:16:23PM +0200, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 4.9.120 release. > > There are 107 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Thu Aug 16 17:14:53 UTC 2018. > > Anything received after that time might be too late. > > > > The whole patch series can be found in one patch at: > > > > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz > > or in the git tree and branch at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > > linux-4.9.y > > and the diffstat can be found below. > > > > thanks, > > > > greg k-h > > > > Merged, compiled with -Werror, and installed onto my OnePlus 6, which is > my daily driver while my Pixel 2 XL is in for another screen RMA... > > No initial issues noticed in dmesg or general usage. Thanks for the testing and letting me know. greg k-h
Re: [PATCH 4.9 000/107] 4.9.120-stable review
On Tue, Aug 14, 2018 at 11:58:18AM -0700, Nathan Chancellor wrote: > On Tue, Aug 14, 2018 at 07:16:23PM +0200, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 4.9.120 release. > > There are 107 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Thu Aug 16 17:14:53 UTC 2018. > > Anything received after that time might be too late. > > > > The whole patch series can be found in one patch at: > > > > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz > > or in the git tree and branch at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > > linux-4.9.y > > and the diffstat can be found below. > > > > thanks, > > > > greg k-h > > > > Merged, compiled with -Werror, and installed onto my OnePlus 6, which is > my daily driver while my Pixel 2 XL is in for another screen RMA... > > No initial issues noticed in dmesg or general usage. Thanks for the testing and letting me know. greg k-h
Re: [PATCH 1/2] dt-bindings: clk: meson-g12a: Add G12A AO Clock Bindings
On 2018/8/15 4:48, Rob Herring wrote: On Fri, Aug 10, 2018 at 05:54:27PM +0800, Jian Hu wrote: Add new clock controller compatible and dt-bingdings headers for the Always-On domain of the g12a SoC Signed-off-by: Jian Hu --- .../bindings/clock/amlogic,gxbb-aoclkc.txt | 1 + include/dt-bindings/clock/g12a-aoclkc.h| 28 ++ 2 files changed, 29 insertions(+) create mode 100755 include/dt-bindings/clock/g12a-aoclkc.h checkpatch says wrong mode. Yes, I have checked the g12a-aoclk.h file,It is wrong mode. I will chmod a-x for it.Thank you for review. diff --git a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt index 3a88052..6f02288 100644 --- a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt +++ b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt @@ -10,6 +10,7 @@ Required Properties: - GXL (S905X, S905D) : "amlogic,meson-gxl-aoclkc" - GXM (S912) : "amlogic,meson-gxm-aoclkc" - AXG (A113D, A113X) : "amlogic,meson-axg-aoclkc" + - G12A (S905D2, S905X2) : "amlogic,g12a-aoclkc" followed by the common "amlogic,meson-gx-aoclkc" - #clock-cells: should be 1. diff --git a/include/dt-bindings/clock/g12a-aoclkc.h b/include/dt-bindings/clock/g12a-aoclkc.h new file mode 100755 index 000..6b3f921 --- /dev/null +++ b/include/dt-bindings/clock/g12a-aoclkc.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause) */ +/* + * Copyright (c) 2016 BayLibre, SAS + * Author: Neil Armstrong + * + * Copyright (c) 2018 Amlogic, inc. + * Author: Jian Hu + */ + +#ifndef DT_BINDINGS_CLOCK_AMLOGIC_MESON_G12A_AOCLK +#define DT_BINDINGS_CLOCK_AMLOGIC_MESON_G12A_AOCLK + +#define CLKID_AO_AHB_BUS 0 +#define CLKID_AO_REMOTE1 +#define CLKID_AO_I2C_MASTER2 +#define CLKID_AO_I2C_SLAVE 3 +#define CLKID_AO_UART1 4 +#define CLKID_AO_PROD_I2C 5 +#define CLKID_AO_UART2 6 +#define CLKID_AO_IR_BLASTER7 +#define CLKID_AO_SAR_ADC 8 +#define CLKID_AO_CLK81 9 +#define CLKID_AO_SAR_ADC_SEL 10 +#define CLKID_AO_SAR_ADC_DIV 11 +#define CLKID_AO_SAR_ADC_CLK 12 +#define CLKID_AO_ALT_XTAL 13 + +#endif -- 1.9.1 .
Re: [PATCH 1/2] dt-bindings: clk: meson-g12a: Add G12A AO Clock Bindings
On 2018/8/15 4:48, Rob Herring wrote: On Fri, Aug 10, 2018 at 05:54:27PM +0800, Jian Hu wrote: Add new clock controller compatible and dt-bingdings headers for the Always-On domain of the g12a SoC Signed-off-by: Jian Hu --- .../bindings/clock/amlogic,gxbb-aoclkc.txt | 1 + include/dt-bindings/clock/g12a-aoclkc.h| 28 ++ 2 files changed, 29 insertions(+) create mode 100755 include/dt-bindings/clock/g12a-aoclkc.h checkpatch says wrong mode. Yes, I have checked the g12a-aoclk.h file,It is wrong mode. I will chmod a-x for it.Thank you for review. diff --git a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt index 3a88052..6f02288 100644 --- a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt +++ b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt @@ -10,6 +10,7 @@ Required Properties: - GXL (S905X, S905D) : "amlogic,meson-gxl-aoclkc" - GXM (S912) : "amlogic,meson-gxm-aoclkc" - AXG (A113D, A113X) : "amlogic,meson-axg-aoclkc" + - G12A (S905D2, S905X2) : "amlogic,g12a-aoclkc" followed by the common "amlogic,meson-gx-aoclkc" - #clock-cells: should be 1. diff --git a/include/dt-bindings/clock/g12a-aoclkc.h b/include/dt-bindings/clock/g12a-aoclkc.h new file mode 100755 index 000..6b3f921 --- /dev/null +++ b/include/dt-bindings/clock/g12a-aoclkc.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause) */ +/* + * Copyright (c) 2016 BayLibre, SAS + * Author: Neil Armstrong + * + * Copyright (c) 2018 Amlogic, inc. + * Author: Jian Hu + */ + +#ifndef DT_BINDINGS_CLOCK_AMLOGIC_MESON_G12A_AOCLK +#define DT_BINDINGS_CLOCK_AMLOGIC_MESON_G12A_AOCLK + +#define CLKID_AO_AHB_BUS 0 +#define CLKID_AO_REMOTE1 +#define CLKID_AO_I2C_MASTER2 +#define CLKID_AO_I2C_SLAVE 3 +#define CLKID_AO_UART1 4 +#define CLKID_AO_PROD_I2C 5 +#define CLKID_AO_UART2 6 +#define CLKID_AO_IR_BLASTER7 +#define CLKID_AO_SAR_ADC 8 +#define CLKID_AO_CLK81 9 +#define CLKID_AO_SAR_ADC_SEL 10 +#define CLKID_AO_SAR_ADC_DIV 11 +#define CLKID_AO_SAR_ADC_CLK 12 +#define CLKID_AO_ALT_XTAL 13 + +#endif -- 1.9.1 .
[PATCH 3/5] arm64: dts: mt7622: fix ram size for rfb1
Fix ram size and sort nodes in alphabetical order. Signed-off-by: Ryder Lee --- arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts | 196 +-- 1 file changed, 98 insertions(+), 98 deletions(-) diff --git a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts index b783764..033d7d1 100644 --- a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts +++ b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts @@ -51,7 +51,7 @@ }; memory { - reg = <0 0x4000 0 0x3F00>; + reg = <0 0x4000 0 0x2000>; }; reg_1p8v: regulator-1p8v { @@ -81,6 +81,103 @@ }; }; + { + status = "disabled"; +}; + + { + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; + + gmac1: mac@1 { + compatible = "mediatek,eth-mac"; + reg = <1>; + phy-handle = <>; + }; + + mdio-bus { + #address-cells = <1>; + #size-cells = <0>; + + phy5: ethernet-phy@5 { + reg = <5>; + phy-mode = "sgmii"; + }; + }; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <_pins_default>; + pinctrl-1 = <_pins_uhs>; + status = "okay"; + bus-width = <8>; + max-frequency = <5000>; + cap-mmc-highspeed; + mmc-hs200-1_8v; + vmmc-supply = <_3p3v>; + vqmmc-supply = <_1p8v>; + assigned-clocks = < CLK_TOP_MSDC30_0_SEL>; + assigned-clock-parents = < CLK_TOP_UNIV48M>; + non-removable; +}; + + { + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <_pins_default>; + pinctrl-1 = <_pins_uhs>; + status = "okay"; + bus-width = <4>; + max-frequency = <5000>; + cap-sd-highspeed; + r_smpl = <1>; + cd-gpios = < 81 GPIO_ACTIVE_LOW>; + vmmc-supply = <_3p3v>; + vqmmc-supply = <_3p3v>; + assigned-clocks = < CLK_TOP_MSDC30_1_SEL>; + assigned-clock-parents = < CLK_TOP_UNIV48M>; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_nand_pins>; + status = "disabled"; +}; + +_flash { + pinctrl-names = "default"; + pinctrl-0 = <_nor_pins>; + status = "disabled"; + + flash@0 { + compatible = "jedec,spi-nor"; + reg = <0>; + }; +}; + { pinctrl-names = "default"; pinctrl-0 = <_pins>; @@ -344,103 +441,6 @@ }; }; - { - status = "disabled"; -}; - - { - status = "okay"; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_pins>; - status = "okay"; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_pins>; - status = "okay"; - - gmac1: mac@1 { - compatible = "mediatek,eth-mac"; - reg = <1>; - phy-handle = <>; - }; - - mdio-bus { - #address-cells = <1>; - #size-cells = <0>; - - phy5: ethernet-phy@5 { - reg = <5>; - phy-mode = "sgmii"; - }; - }; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_pins>; - status = "okay"; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_pins>; - status = "okay"; -}; - - { - pinctrl-names = "default", "state_uhs"; - pinctrl-0 = <_pins_default>; - pinctrl-1 = <_pins_uhs>; - status = "okay"; - bus-width = <8>; - max-frequency = <5000>; - cap-mmc-highspeed; - mmc-hs200-1_8v; - vmmc-supply = <_3p3v>; - vqmmc-supply = <_1p8v>; - assigned-clocks = < CLK_TOP_MSDC30_0_SEL>; - assigned-clock-parents = < CLK_TOP_UNIV48M>; - non-removable; -}; - - { - pinctrl-names = "default", "state_uhs"; - pinctrl-0 = <_pins_default>; - pinctrl-1 = <_pins_uhs>; - status = "okay"; - bus-width = <4>; - max-frequency = <5000>; - cap-sd-highspeed; - r_smpl = <1>; - cd-gpios = < 81 GPIO_ACTIVE_LOW>; - vmmc-supply = <_3p3v>; - vqmmc-supply = <_3p3v>; - assigned-clocks = < CLK_TOP_MSDC30_1_SEL>; - assigned-clock-parents = < CLK_TOP_UNIV48M>; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_nand_pins>; - status = "disabled"; -}; - -_flash { - pinctrl-names = "default"; - pinctrl-0 = <_nor_pins>; - status = "disabled"; - - flash@0 { - compatible = "jedec,spi-nor"; - reg = <0>; -
[PATCH 3/5] arm64: dts: mt7622: fix ram size for rfb1
Fix ram size and sort nodes in alphabetical order. Signed-off-by: Ryder Lee --- arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts | 196 +-- 1 file changed, 98 insertions(+), 98 deletions(-) diff --git a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts index b783764..033d7d1 100644 --- a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts +++ b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts @@ -51,7 +51,7 @@ }; memory { - reg = <0 0x4000 0 0x3F00>; + reg = <0 0x4000 0 0x2000>; }; reg_1p8v: regulator-1p8v { @@ -81,6 +81,103 @@ }; }; + { + status = "disabled"; +}; + + { + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; + + gmac1: mac@1 { + compatible = "mediatek,eth-mac"; + reg = <1>; + phy-handle = <>; + }; + + mdio-bus { + #address-cells = <1>; + #size-cells = <0>; + + phy5: ethernet-phy@5 { + reg = <5>; + phy-mode = "sgmii"; + }; + }; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <_pins_default>; + pinctrl-1 = <_pins_uhs>; + status = "okay"; + bus-width = <8>; + max-frequency = <5000>; + cap-mmc-highspeed; + mmc-hs200-1_8v; + vmmc-supply = <_3p3v>; + vqmmc-supply = <_1p8v>; + assigned-clocks = < CLK_TOP_MSDC30_0_SEL>; + assigned-clock-parents = < CLK_TOP_UNIV48M>; + non-removable; +}; + + { + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <_pins_default>; + pinctrl-1 = <_pins_uhs>; + status = "okay"; + bus-width = <4>; + max-frequency = <5000>; + cap-sd-highspeed; + r_smpl = <1>; + cd-gpios = < 81 GPIO_ACTIVE_LOW>; + vmmc-supply = <_3p3v>; + vqmmc-supply = <_3p3v>; + assigned-clocks = < CLK_TOP_MSDC30_1_SEL>; + assigned-clock-parents = < CLK_TOP_UNIV48M>; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_nand_pins>; + status = "disabled"; +}; + +_flash { + pinctrl-names = "default"; + pinctrl-0 = <_nor_pins>; + status = "disabled"; + + flash@0 { + compatible = "jedec,spi-nor"; + reg = <0>; + }; +}; + { pinctrl-names = "default"; pinctrl-0 = <_pins>; @@ -344,103 +441,6 @@ }; }; - { - status = "disabled"; -}; - - { - status = "okay"; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_pins>; - status = "okay"; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_pins>; - status = "okay"; - - gmac1: mac@1 { - compatible = "mediatek,eth-mac"; - reg = <1>; - phy-handle = <>; - }; - - mdio-bus { - #address-cells = <1>; - #size-cells = <0>; - - phy5: ethernet-phy@5 { - reg = <5>; - phy-mode = "sgmii"; - }; - }; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_pins>; - status = "okay"; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_pins>; - status = "okay"; -}; - - { - pinctrl-names = "default", "state_uhs"; - pinctrl-0 = <_pins_default>; - pinctrl-1 = <_pins_uhs>; - status = "okay"; - bus-width = <8>; - max-frequency = <5000>; - cap-mmc-highspeed; - mmc-hs200-1_8v; - vmmc-supply = <_3p3v>; - vqmmc-supply = <_1p8v>; - assigned-clocks = < CLK_TOP_MSDC30_0_SEL>; - assigned-clock-parents = < CLK_TOP_UNIV48M>; - non-removable; -}; - - { - pinctrl-names = "default", "state_uhs"; - pinctrl-0 = <_pins_default>; - pinctrl-1 = <_pins_uhs>; - status = "okay"; - bus-width = <4>; - max-frequency = <5000>; - cap-sd-highspeed; - r_smpl = <1>; - cd-gpios = < 81 GPIO_ACTIVE_LOW>; - vmmc-supply = <_3p3v>; - vqmmc-supply = <_3p3v>; - assigned-clocks = < CLK_TOP_MSDC30_1_SEL>; - assigned-clock-parents = < CLK_TOP_UNIV48M>; -}; - - { - pinctrl-names = "default"; - pinctrl-0 = <_nand_pins>; - status = "disabled"; -}; - -_flash { - pinctrl-names = "default"; - pinctrl-0 = <_nor_pins>; - status = "disabled"; - - flash@0 { - compatible = "jedec,spi-nor"; - reg = <0>; -
[PATCH 1/5] arm64: dts: mt7622: add some misc device nodes
Add timer and CCI-400 device nodes for MT7622. Signed-off-by: Ryder Lee --- arch/arm64/boot/dts/mediatek/mt7622.dtsi | 48 1 file changed, 48 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi b/arch/arm64/boot/dts/mediatek/mt7622.dtsi index 9213c96..b235df7 100644 --- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi +++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi @@ -79,6 +79,7 @@ #cooling-cells = <2>; enable-method = "psci"; clock-frequency = <13>; + cci-control-port = <_control2>; }; cpu1: cpu@1 { @@ -91,6 +92,7 @@ operating-points-v2 = <_opp_table>; enable-method = "psci"; clock-frequency = <13>; + cci-control-port = <_control2>; }; }; @@ -217,6 +219,16 @@ #reset-cells = <1>; }; + timer: timer@10004000 { + compatible = "mediatek,mt7622-timer", +"mediatek,mt6577-timer"; + reg = <0 0x10004000 0 0x80>; + interrupts = ; + clocks = < CLK_INFRA_APXGPT_PD>, +< CLK_TOP_RTC>; + clock-names = "system-clk", "rtc-clk"; + }; + scpsys: scpsys@10006000 { compatible = "mediatek,mt7622-scpsys", "syscon"; @@ -317,6 +329,42 @@ <0 0x1036 0 0x2000>; }; + cci: cci@1039 { + compatible = "arm,cci-400"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0 0x1039 0 0x1000>; + ranges = <0 0 0x1039 0x1>; + + cci_control0: slave-if@1000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace-lite"; + reg = <0x1000 0x1000>; + }; + + cci_control1: slave-if@4000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace"; + reg = <0x4000 0x1000>; + }; + + cci_control2: slave-if@5000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace"; + reg = <0x5000 0x1000>; + }; + + pmu@9000 { + compatible = "arm,cci-400-pmu,r1"; + reg = <0x9000 0x5000>; + interrupts = , +, +, +, +; + }; + }; + auxadc: adc@11001000 { compatible = "mediatek,mt7622-auxadc"; reg = <0 0x11001000 0 0x1000>; -- 1.9.1
[PATCH 1/5] arm64: dts: mt7622: add some misc device nodes
Add timer and CCI-400 device nodes for MT7622. Signed-off-by: Ryder Lee --- arch/arm64/boot/dts/mediatek/mt7622.dtsi | 48 1 file changed, 48 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi b/arch/arm64/boot/dts/mediatek/mt7622.dtsi index 9213c96..b235df7 100644 --- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi +++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi @@ -79,6 +79,7 @@ #cooling-cells = <2>; enable-method = "psci"; clock-frequency = <13>; + cci-control-port = <_control2>; }; cpu1: cpu@1 { @@ -91,6 +92,7 @@ operating-points-v2 = <_opp_table>; enable-method = "psci"; clock-frequency = <13>; + cci-control-port = <_control2>; }; }; @@ -217,6 +219,16 @@ #reset-cells = <1>; }; + timer: timer@10004000 { + compatible = "mediatek,mt7622-timer", +"mediatek,mt6577-timer"; + reg = <0 0x10004000 0 0x80>; + interrupts = ; + clocks = < CLK_INFRA_APXGPT_PD>, +< CLK_TOP_RTC>; + clock-names = "system-clk", "rtc-clk"; + }; + scpsys: scpsys@10006000 { compatible = "mediatek,mt7622-scpsys", "syscon"; @@ -317,6 +329,42 @@ <0 0x1036 0 0x2000>; }; + cci: cci@1039 { + compatible = "arm,cci-400"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0 0x1039 0 0x1000>; + ranges = <0 0 0x1039 0x1>; + + cci_control0: slave-if@1000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace-lite"; + reg = <0x1000 0x1000>; + }; + + cci_control1: slave-if@4000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace"; + reg = <0x4000 0x1000>; + }; + + cci_control2: slave-if@5000 { + compatible = "arm,cci-400-ctrl-if"; + interface-type = "ace"; + reg = <0x5000 0x1000>; + }; + + pmu@9000 { + compatible = "arm,cci-400-pmu,r1"; + reg = <0x9000 0x5000>; + interrupts = , +, +, +, +; + }; + }; + auxadc: adc@11001000 { compatible = "mediatek,mt7622-auxadc"; reg = <0 0x11001000 0 0x1000>; -- 1.9.1
[PATCH 4/5] arm64: dts: mt7622: add bananapi BPI-R64 board
Add support for the bananapi R64 (BPI-R64) development board from BIPAI KEJI. Detailed hardware information for BPI-R64 which could be found on http://wiki.banana-pi.org/Banana_Pi_BPI-R64 Signed-off-by: Ryder Lee --- arch/arm64/boot/dts/mediatek/Makefile | 1 + .../boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts | 508 + 2 files changed, 509 insertions(+) create mode 100644 arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts diff --git a/arch/arm64/boot/dts/mediatek/Makefile b/arch/arm64/boot/dts/mediatek/Makefile index ac17f60..2a1abe5 100644 --- a/arch/arm64/boot/dts/mediatek/Makefile +++ b/arch/arm64/boot/dts/mediatek/Makefile @@ -3,5 +3,6 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt2712-evb.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt6755-evb.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt6795-evb.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt6797-evb.dtb +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-bananapi-bpi-r64.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-rfb1.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8173-evb.dtb diff --git a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts new file mode 100644 index 000..bc504fe --- /dev/null +++ b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts @@ -0,0 +1,508 @@ +/* + * Copyright (c) 2018 MediaTek Inc. + * Author: Ryder Lee + * + * SPDX-License-Identifier: (GPL-2.0 OR MIT) + */ + +/dts-v1/; +#include +#include + +#include "mt7622.dtsi" +#include "mt6380.dtsi" + +/ { + model = "Bananapi BPI-R64"; + compatible = "bananapi,bpi-r64", "mediatek,mt7622"; + + chosen { + bootargs = "earlycon=uart8250,mmio32,0x11002000 console=ttyS0,115200n1 swiotlb=512"; + }; + + cpus { + cpu@0 { + proc-supply = <_vcpu_reg>; + sram-supply = <_vm_reg>; + }; + + cpu@1 { + proc-supply = <_vcpu_reg>; + sram-supply = <_vm_reg>; + }; + }; + + gpio-keys { + compatible = "gpio-keys"; + poll-interval = <100>; + + factory { + label = "factory"; + linux,code = ; + gpios = < 0 0>; + }; + + wps { + label = "wps"; + linux,code = ; + gpios = < 102 0>; + }; + }; + + memory { + reg = <0 0x4000 0 0x4000>; + }; + + reg_1p8v: regulator-1p8v { + compatible = "regulator-fixed"; + regulator-name = "fixed-1.8V"; + regulator-min-microvolt = <180>; + regulator-max-microvolt = <180>; + regulator-always-on; + }; + + reg_3p3v: regulator-3p3v { + compatible = "regulator-fixed"; + regulator-name = "fixed-3.3V"; + regulator-min-microvolt = <330>; + regulator-max-microvolt = <330>; + regulator-boot-on; + regulator-always-on; + }; + + reg_5v: regulator-5v { + compatible = "regulator-fixed"; + regulator-name = "fixed-5V"; + regulator-min-microvolt = <500>; + regulator-max-microvolt = <500>; + regulator-boot-on; + regulator-always-on; + }; +}; + + { + status = "disabled"; +}; + + { + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; + + gmac1: mac@1 { + compatible = "mediatek,eth-mac"; + reg = <1>; + phy-handle = <>; + }; + + mdio-bus { + #address-cells = <1>; + #size-cells = <0>; + + phy5: ethernet-phy@5 { + reg = <5>; + phy-mode = "sgmii"; + }; + }; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <_pins_default>; + pinctrl-1 = <_pins_uhs>; + status = "okay"; + bus-width = <8>; + max-frequency = <5000>; + cap-mmc-highspeed; + mmc-hs200-1_8v; + vmmc-supply = <_3p3v>; + vqmmc-supply = <_1p8v>; + assigned-clocks = < CLK_TOP_MSDC30_0_SEL>; + assigned-clock-parents = < CLK_TOP_UNIV48M>; + non-removable; +}; + + { + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <_pins_default>; + pinctrl-1 = <_pins_uhs>; + status = "okay"; + bus-width
[PATCH 2/5] arm64: dts: mt7622: add a bluetooth 5 device node
Add a built-in bluetooth 5 support for MT7622. Signed-off-by: Ryder Lee --- arch/arm64/boot/dts/mediatek/mt7622.dtsi | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi b/arch/arm64/boot/dts/mediatek/mt7622.dtsi index b235df7..f8c3495 100644 --- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi +++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi @@ -515,6 +515,13 @@ reg-shift = <2>; reg-io-width = <4>; status = "disabled"; + + bluetooth { + compatible = "mediatek,mt7622-bluetooth"; + power-domains = < MT7622_POWER_DOMAIN_WB>; + clocks = <>; + clock-names = "ref"; + }; }; nandc: nfi@1100d000 { -- 1.9.1
[PATCH 5/5] dt-bindings: arm: mediatek: add support for bananapi BPI-R64 board
Update binding document for bananapi BPI-R64 board being supported. Signed-off-by: Ryder Lee --- Documentation/devicetree/bindings/arm/mediatek.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/devicetree/bindings/arm/mediatek.txt b/Documentation/devicetree/bindings/arm/mediatek.txt index 7d21ab3..677fdcf 100644 --- a/Documentation/devicetree/bindings/arm/mediatek.txt +++ b/Documentation/devicetree/bindings/arm/mediatek.txt @@ -50,6 +50,9 @@ Supported boards: - Reference board variant 1 for MT7622: Required root node properties: - compatible = "mediatek,mt7622-rfb1", "mediatek,mt7622"; +- Bananapi BPI-R64 for MT7622: +Required root node properties: + - compatible = "bananapi,bpi-r64", "mediatek,mt7622"; - Reference board for MT7623a with eMMC: Required root node properties: - compatible = "mediatek,mt7623a-rfb-emmc", "mediatek,mt7623"; -- 1.9.1
[PATCH 5/5] dt-bindings: arm: mediatek: add support for bananapi BPI-R64 board
Update binding document for bananapi BPI-R64 board being supported. Signed-off-by: Ryder Lee --- Documentation/devicetree/bindings/arm/mediatek.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/devicetree/bindings/arm/mediatek.txt b/Documentation/devicetree/bindings/arm/mediatek.txt index 7d21ab3..677fdcf 100644 --- a/Documentation/devicetree/bindings/arm/mediatek.txt +++ b/Documentation/devicetree/bindings/arm/mediatek.txt @@ -50,6 +50,9 @@ Supported boards: - Reference board variant 1 for MT7622: Required root node properties: - compatible = "mediatek,mt7622-rfb1", "mediatek,mt7622"; +- Bananapi BPI-R64 for MT7622: +Required root node properties: + - compatible = "bananapi,bpi-r64", "mediatek,mt7622"; - Reference board for MT7623a with eMMC: Required root node properties: - compatible = "mediatek,mt7623a-rfb-emmc", "mediatek,mt7623"; -- 1.9.1
[PATCH 4/5] arm64: dts: mt7622: add bananapi BPI-R64 board
Add support for the bananapi R64 (BPI-R64) development board from BIPAI KEJI. Detailed hardware information for BPI-R64 which could be found on http://wiki.banana-pi.org/Banana_Pi_BPI-R64 Signed-off-by: Ryder Lee --- arch/arm64/boot/dts/mediatek/Makefile | 1 + .../boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts | 508 + 2 files changed, 509 insertions(+) create mode 100644 arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts diff --git a/arch/arm64/boot/dts/mediatek/Makefile b/arch/arm64/boot/dts/mediatek/Makefile index ac17f60..2a1abe5 100644 --- a/arch/arm64/boot/dts/mediatek/Makefile +++ b/arch/arm64/boot/dts/mediatek/Makefile @@ -3,5 +3,6 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt2712-evb.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt6755-evb.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt6795-evb.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt6797-evb.dtb +dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-bananapi-bpi-r64.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-rfb1.dtb dtb-$(CONFIG_ARCH_MEDIATEK) += mt8173-evb.dtb diff --git a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts new file mode 100644 index 000..bc504fe --- /dev/null +++ b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts @@ -0,0 +1,508 @@ +/* + * Copyright (c) 2018 MediaTek Inc. + * Author: Ryder Lee + * + * SPDX-License-Identifier: (GPL-2.0 OR MIT) + */ + +/dts-v1/; +#include +#include + +#include "mt7622.dtsi" +#include "mt6380.dtsi" + +/ { + model = "Bananapi BPI-R64"; + compatible = "bananapi,bpi-r64", "mediatek,mt7622"; + + chosen { + bootargs = "earlycon=uart8250,mmio32,0x11002000 console=ttyS0,115200n1 swiotlb=512"; + }; + + cpus { + cpu@0 { + proc-supply = <_vcpu_reg>; + sram-supply = <_vm_reg>; + }; + + cpu@1 { + proc-supply = <_vcpu_reg>; + sram-supply = <_vm_reg>; + }; + }; + + gpio-keys { + compatible = "gpio-keys"; + poll-interval = <100>; + + factory { + label = "factory"; + linux,code = ; + gpios = < 0 0>; + }; + + wps { + label = "wps"; + linux,code = ; + gpios = < 102 0>; + }; + }; + + memory { + reg = <0 0x4000 0 0x4000>; + }; + + reg_1p8v: regulator-1p8v { + compatible = "regulator-fixed"; + regulator-name = "fixed-1.8V"; + regulator-min-microvolt = <180>; + regulator-max-microvolt = <180>; + regulator-always-on; + }; + + reg_3p3v: regulator-3p3v { + compatible = "regulator-fixed"; + regulator-name = "fixed-3.3V"; + regulator-min-microvolt = <330>; + regulator-max-microvolt = <330>; + regulator-boot-on; + regulator-always-on; + }; + + reg_5v: regulator-5v { + compatible = "regulator-fixed"; + regulator-name = "fixed-5V"; + regulator-min-microvolt = <500>; + regulator-max-microvolt = <500>; + regulator-boot-on; + regulator-always-on; + }; +}; + + { + status = "disabled"; +}; + + { + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; + + gmac1: mac@1 { + compatible = "mediatek,eth-mac"; + reg = <1>; + phy-handle = <>; + }; + + mdio-bus { + #address-cells = <1>; + #size-cells = <0>; + + phy5: ethernet-phy@5 { + reg = <5>; + phy-mode = "sgmii"; + }; + }; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + status = "okay"; +}; + + { + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <_pins_default>; + pinctrl-1 = <_pins_uhs>; + status = "okay"; + bus-width = <8>; + max-frequency = <5000>; + cap-mmc-highspeed; + mmc-hs200-1_8v; + vmmc-supply = <_3p3v>; + vqmmc-supply = <_1p8v>; + assigned-clocks = < CLK_TOP_MSDC30_0_SEL>; + assigned-clock-parents = < CLK_TOP_UNIV48M>; + non-removable; +}; + + { + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <_pins_default>; + pinctrl-1 = <_pins_uhs>; + status = "okay"; + bus-width
[PATCH 2/5] arm64: dts: mt7622: add a bluetooth 5 device node
Add a built-in bluetooth 5 support for MT7622. Signed-off-by: Ryder Lee --- arch/arm64/boot/dts/mediatek/mt7622.dtsi | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi b/arch/arm64/boot/dts/mediatek/mt7622.dtsi index b235df7..f8c3495 100644 --- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi +++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi @@ -515,6 +515,13 @@ reg-shift = <2>; reg-io-width = <4>; status = "disabled"; + + bluetooth { + compatible = "mediatek,mt7622-bluetooth"; + power-domains = < MT7622_POWER_DOMAIN_WB>; + clocks = <>; + clock-names = "ref"; + }; }; nandc: nfi@1100d000 { -- 1.9.1
Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference
On Wed, 2018-08-15 at 11:59 +0800, Dave Young wrote: > > Does this improve things, and plug the no boot hole? > > Would you mind to tune my patch with some acpi_rsdp checking and add > some error message in case kexec load failure? Eg. suggest people to use > append acpi_rsdp for noefi booting etc. Yeah, -ENODEV is better than hanging, but not very informative. > I'm still not very satisfied with the code cleanup.. Not surprising, I didn't like it much either (ergo interrogative). -Mike
Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference
On Wed, 2018-08-15 at 11:59 +0800, Dave Young wrote: > > Does this improve things, and plug the no boot hole? > > Would you mind to tune my patch with some acpi_rsdp checking and add > some error message in case kexec load failure? Eg. suggest people to use > append acpi_rsdp for noefi booting etc. Yeah, -ENODEV is better than hanging, but not very informative. > I'm still not very satisfied with the code cleanup.. Not surprising, I didn't like it much either (ergo interrogative). -Mike
Re: linux-next: manual merge of the kvm tree with the tip tree
Hi all, On Wed, 8 Aug 2018 13:54:45 +1000 Stephen Rothwell wrote: > > Paolo pointed out a semantic conflict between the kvm tree and the tip > tree in > > arch/x86/kernel/kvm.c > > between commit: > > 368a540e0232 ("x86/kvmclock: Remove memblock dependency") > > from the tip tree and commit: > > d63bae079b64 ("KVM: X86: Add kvm hypervisor init time platform setup > callback") > > from the kvm tree. > > I fixed it up (see below) and can carry the fix as necessary. This > is now fixed as far as linux-next is concerned, but any non trivial > conflicts should be mentioned to your upstream maintainer when your tree > is submitted for merging. You may also want to consider cooperating > with the maintainer of the conflicting tree to minimise any particularly > complex conflicts. > > I applied the following (as suggested) after the automatic merge: > > From: Stephen Rothwell > Date: Wed, 8 Aug 2018 13:48:34 +1000 > Subject: [PATCH] kvm: x86: fix semantic conflict in platform init > > Signed-off-by: Stephen Rothwell > --- > arch/x86/kernel/kvm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > index 0906b8731393..e2499bad3209 100644 > --- a/arch/x86/kernel/kvm.c > +++ b/arch/x86/kernel/kvm.c > @@ -714,13 +714,13 @@ static void __init kvm_apic_init(void) > static void __init kvm_init_platform(void) > { > x86_platform.apic_post_init = kvm_apic_init; > + kvmclock_init(); > } > > const __initconst struct hypervisor_x86 x86_hyper_kvm = { > .name = "KVM", > .detect = kvm_detect, > .type = X86_HYPER_KVM, > - .init.init_platform = kvmclock_init, > .init.guest_late_init = kvm_guest_init, > .init.x2apic_available = kvm_para_available, > .init.init_platform = kvm_init_platform, > -- > 2.18.0 This is now a conflict between Linus' tree and the kvm tree. -- Cheers, Stephen Rothwell pgpR6IrpU56nd.pgp Description: OpenPGP digital signature
Re: linux-next: manual merge of the kvm tree with the tip tree
Hi all, On Wed, 8 Aug 2018 13:54:45 +1000 Stephen Rothwell wrote: > > Paolo pointed out a semantic conflict between the kvm tree and the tip > tree in > > arch/x86/kernel/kvm.c > > between commit: > > 368a540e0232 ("x86/kvmclock: Remove memblock dependency") > > from the tip tree and commit: > > d63bae079b64 ("KVM: X86: Add kvm hypervisor init time platform setup > callback") > > from the kvm tree. > > I fixed it up (see below) and can carry the fix as necessary. This > is now fixed as far as linux-next is concerned, but any non trivial > conflicts should be mentioned to your upstream maintainer when your tree > is submitted for merging. You may also want to consider cooperating > with the maintainer of the conflicting tree to minimise any particularly > complex conflicts. > > I applied the following (as suggested) after the automatic merge: > > From: Stephen Rothwell > Date: Wed, 8 Aug 2018 13:48:34 +1000 > Subject: [PATCH] kvm: x86: fix semantic conflict in platform init > > Signed-off-by: Stephen Rothwell > --- > arch/x86/kernel/kvm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > index 0906b8731393..e2499bad3209 100644 > --- a/arch/x86/kernel/kvm.c > +++ b/arch/x86/kernel/kvm.c > @@ -714,13 +714,13 @@ static void __init kvm_apic_init(void) > static void __init kvm_init_platform(void) > { > x86_platform.apic_post_init = kvm_apic_init; > + kvmclock_init(); > } > > const __initconst struct hypervisor_x86 x86_hyper_kvm = { > .name = "KVM", > .detect = kvm_detect, > .type = X86_HYPER_KVM, > - .init.init_platform = kvmclock_init, > .init.guest_late_init = kvm_guest_init, > .init.x2apic_available = kvm_para_available, > .init.init_platform = kvm_init_platform, > -- > 2.18.0 This is now a conflict between Linus' tree and the kvm tree. -- Cheers, Stephen Rothwell pgpR6IrpU56nd.pgp Description: OpenPGP digital signature
linux-next: manual merge of the kvm tree with Linus' tree
Hi all, Today's linux-next merge of the kvm tree got a conflict in: arch/x86/include/asm/kvm_host.h between commit: 5b76a3cff011 ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry") from Linus' tree and commit: 4180bf1b655a ("KVM: X86: Implement "send IPI" hypercall") from the kvm tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/x86/include/asm/kvm_host.h index acebb808c4b5,c18958ef17d2.. --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@@ -1418,7 -1457,10 +1462,11 @@@ int kvm_cpu_get_interrupt(struct kvm_vc void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event); void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu); + int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low, + unsigned long ipi_bitmap_high, int min, + unsigned long icr, int op_64_bit); + +u64 kvm_get_arch_capabilities(void); void kvm_define_shared_msr(unsigned index, u32 msr); int kvm_set_shared_msr(unsigned index, u64 val, u64 mask); pgp0kiRgw9mGV.pgp Description: OpenPGP digital signature
linux-next: manual merge of the kvm tree with Linus' tree
Hi all, Today's linux-next merge of the kvm tree got a conflict in: arch/x86/include/asm/kvm_host.h between commit: 5b76a3cff011 ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry") from Linus' tree and commit: 4180bf1b655a ("KVM: X86: Implement "send IPI" hypercall") from the kvm tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/x86/include/asm/kvm_host.h index acebb808c4b5,c18958ef17d2.. --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@@ -1418,7 -1457,10 +1462,11 @@@ int kvm_cpu_get_interrupt(struct kvm_vc void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event); void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu); + int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low, + unsigned long ipi_bitmap_high, int min, + unsigned long icr, int op_64_bit); + +u64 kvm_get_arch_capabilities(void); void kvm_define_shared_msr(unsigned index, u32 msr); int kvm_set_shared_msr(unsigned index, u64 val, u64 mask); pgp0kiRgw9mGV.pgp Description: OpenPGP digital signature
linux-next: manual merge of the kvm tree with Linus' tree
Hi all, Today's linux-next merge of the kvm tree got a conflict in: arch/x86/kvm/vmx.c between commit: a399477e52c1 ("x86/KVM/VMX: Add module argument for L1TF mitigation") from Linus' tree and commit: 877ad952be3d ("KVM: vmx: Add tlb_remote_flush callback support") from the kvm tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/x86/kvm/vmx.c index 46b428c0990e,16f9373c01de.. --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@@ -188,150 -189,12 +189,156 @@@ module_param(ple_window_max, uint, 0444 extern const ulong vmx_return; +static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush); +static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond); +static DEFINE_MUTEX(vmx_l1d_flush_mutex); + +/* Storage for pre module init parameter parsing */ +static enum vmx_l1d_flush_state __read_mostly vmentry_l1d_flush_param = VMENTER_L1D_FLUSH_AUTO; + +static const struct { + const char *option; + enum vmx_l1d_flush_state cmd; +} vmentry_l1d_param[] = { + {"auto",VMENTER_L1D_FLUSH_AUTO}, + {"never", VMENTER_L1D_FLUSH_NEVER}, + {"cond",VMENTER_L1D_FLUSH_COND}, + {"always", VMENTER_L1D_FLUSH_ALWAYS}, +}; + +#define L1D_CACHE_ORDER 4 +static void *vmx_l1d_flush_pages; + +static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf) +{ + struct page *page; + unsigned int i; + + if (!enable_ept) { + l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_EPT_DISABLED; + return 0; + } + + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) { + u64 msr; + + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr); + if (msr & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH) { + l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED; + return 0; + } + } + + /* If set to auto use the default l1tf mitigation method */ + if (l1tf == VMENTER_L1D_FLUSH_AUTO) { + switch (l1tf_mitigation) { + case L1TF_MITIGATION_OFF: + l1tf = VMENTER_L1D_FLUSH_NEVER; + break; + case L1TF_MITIGATION_FLUSH_NOWARN: + case L1TF_MITIGATION_FLUSH: + case L1TF_MITIGATION_FLUSH_NOSMT: + l1tf = VMENTER_L1D_FLUSH_COND; + break; + case L1TF_MITIGATION_FULL: + case L1TF_MITIGATION_FULL_FORCE: + l1tf = VMENTER_L1D_FLUSH_ALWAYS; + break; + } + } else if (l1tf_mitigation == L1TF_MITIGATION_FULL_FORCE) { + l1tf = VMENTER_L1D_FLUSH_ALWAYS; + } + + if (l1tf != VMENTER_L1D_FLUSH_NEVER && !vmx_l1d_flush_pages && + !boot_cpu_has(X86_FEATURE_FLUSH_L1D)) { + page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER); + if (!page) + return -ENOMEM; + vmx_l1d_flush_pages = page_address(page); + + /* + * Initialize each page with a different pattern in + * order to protect against KSM in the nested + * virtualization case. + */ + for (i = 0; i < 1u << L1D_CACHE_ORDER; ++i) { + memset(vmx_l1d_flush_pages + i * PAGE_SIZE, i + 1, + PAGE_SIZE); + } + } + + l1tf_vmx_mitigation = l1tf; + + if (l1tf != VMENTER_L1D_FLUSH_NEVER) + static_branch_enable(_l1d_should_flush); + else + static_branch_disable(_l1d_should_flush); + + if (l1tf == VMENTER_L1D_FLUSH_COND) + static_branch_enable(_l1d_flush_cond); + else + static_branch_disable(_l1d_flush_cond); + return 0; +} + +static int vmentry_l1d_flush_parse(const char *s) +{ + unsigned int i; + + if (s) { + for (i = 0; i < ARRAY_SIZE(vmentry_l1d_param); i++) { + if (sysfs_streq(s, vmentry_l1d_param[i].option)) + return vmentry_l1d_param[i].cmd; + } + } + return -EINVAL; +} + +static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp) +{ + int l1tf, ret; + + if (!boot_cpu_has(X86_BUG_L1TF)) + return 0; + + l1tf = vmentry_l1d_flush_parse(s); + if (l1tf < 0) + return l1tf; + + /* + * Has vmx_init() run already? If not then this is the pre init + * parameter parsing. In that
linux-next: manual merge of the kvm tree with Linus' tree
Hi all, Today's linux-next merge of the kvm tree got a conflict in: arch/x86/kvm/vmx.c between commit: a399477e52c1 ("x86/KVM/VMX: Add module argument for L1TF mitigation") from Linus' tree and commit: 877ad952be3d ("KVM: vmx: Add tlb_remote_flush callback support") from the kvm tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/x86/kvm/vmx.c index 46b428c0990e,16f9373c01de.. --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@@ -188,150 -189,12 +189,156 @@@ module_param(ple_window_max, uint, 0444 extern const ulong vmx_return; +static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush); +static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond); +static DEFINE_MUTEX(vmx_l1d_flush_mutex); + +/* Storage for pre module init parameter parsing */ +static enum vmx_l1d_flush_state __read_mostly vmentry_l1d_flush_param = VMENTER_L1D_FLUSH_AUTO; + +static const struct { + const char *option; + enum vmx_l1d_flush_state cmd; +} vmentry_l1d_param[] = { + {"auto",VMENTER_L1D_FLUSH_AUTO}, + {"never", VMENTER_L1D_FLUSH_NEVER}, + {"cond",VMENTER_L1D_FLUSH_COND}, + {"always", VMENTER_L1D_FLUSH_ALWAYS}, +}; + +#define L1D_CACHE_ORDER 4 +static void *vmx_l1d_flush_pages; + +static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf) +{ + struct page *page; + unsigned int i; + + if (!enable_ept) { + l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_EPT_DISABLED; + return 0; + } + + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) { + u64 msr; + + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr); + if (msr & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH) { + l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED; + return 0; + } + } + + /* If set to auto use the default l1tf mitigation method */ + if (l1tf == VMENTER_L1D_FLUSH_AUTO) { + switch (l1tf_mitigation) { + case L1TF_MITIGATION_OFF: + l1tf = VMENTER_L1D_FLUSH_NEVER; + break; + case L1TF_MITIGATION_FLUSH_NOWARN: + case L1TF_MITIGATION_FLUSH: + case L1TF_MITIGATION_FLUSH_NOSMT: + l1tf = VMENTER_L1D_FLUSH_COND; + break; + case L1TF_MITIGATION_FULL: + case L1TF_MITIGATION_FULL_FORCE: + l1tf = VMENTER_L1D_FLUSH_ALWAYS; + break; + } + } else if (l1tf_mitigation == L1TF_MITIGATION_FULL_FORCE) { + l1tf = VMENTER_L1D_FLUSH_ALWAYS; + } + + if (l1tf != VMENTER_L1D_FLUSH_NEVER && !vmx_l1d_flush_pages && + !boot_cpu_has(X86_FEATURE_FLUSH_L1D)) { + page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER); + if (!page) + return -ENOMEM; + vmx_l1d_flush_pages = page_address(page); + + /* + * Initialize each page with a different pattern in + * order to protect against KSM in the nested + * virtualization case. + */ + for (i = 0; i < 1u << L1D_CACHE_ORDER; ++i) { + memset(vmx_l1d_flush_pages + i * PAGE_SIZE, i + 1, + PAGE_SIZE); + } + } + + l1tf_vmx_mitigation = l1tf; + + if (l1tf != VMENTER_L1D_FLUSH_NEVER) + static_branch_enable(_l1d_should_flush); + else + static_branch_disable(_l1d_should_flush); + + if (l1tf == VMENTER_L1D_FLUSH_COND) + static_branch_enable(_l1d_flush_cond); + else + static_branch_disable(_l1d_flush_cond); + return 0; +} + +static int vmentry_l1d_flush_parse(const char *s) +{ + unsigned int i; + + if (s) { + for (i = 0; i < ARRAY_SIZE(vmentry_l1d_param); i++) { + if (sysfs_streq(s, vmentry_l1d_param[i].option)) + return vmentry_l1d_param[i].cmd; + } + } + return -EINVAL; +} + +static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp) +{ + int l1tf, ret; + + if (!boot_cpu_has(X86_BUG_L1TF)) + return 0; + + l1tf = vmentry_l1d_flush_parse(s); + if (l1tf < 0) + return l1tf; + + /* + * Has vmx_init() run already? If not then this is the pre init + * parameter parsing. In that
linux-next: manual merge of the ftrace tree with Linus' tree
Hi Steven, Today's linux-next merge of the ftrace tree got a conflict in: kernel/events/uprobes.c between commit: 788faab70d5a ("perf, tools: Use correct articles in comments") from Linus' tree and commit: 38e967ae1e60 ("Uprobes: Simplify uprobe_register() body") from the ftrace tree. I fixed it up (the latter moved the uprobe_unregister function - see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc kernel/events/uprobes.c index aed1ba569954,c0418ba52ba8.. --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@@ -860,7 -856,28 +856,28 @@@ __uprobe_unregister(struct uprobe *upro } /* - * uprobe_register - register a probe - * uprobe_unregister - unregister a already registered probe. ++ * uprobe_unregister - unregister an already registered probe. + * @inode: the file in which the probe has to be removed. + * @offset: offset from the start of the file. + * @uc: identify which probe if multiple probes are colocated. + */ + void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc) + { + struct uprobe *uprobe; + + uprobe = find_uprobe(inode, offset); + if (WARN_ON(!uprobe)) + return; + + down_write(>register_rwsem); + __uprobe_unregister(uprobe, uc); + up_write(>register_rwsem); + put_uprobe(uprobe); + } + EXPORT_SYMBOL_GPL(uprobe_unregister); + + /* + * __uprobe_register - register a probe * @inode: the file in which the probe has to be placed. * @offset: offset from the start of the file. * @uc: information on howto handle the probe.. pgpF0XnTROJcL.pgp Description: OpenPGP digital signature
linux-next: manual merge of the ftrace tree with Linus' tree
Hi Steven, Today's linux-next merge of the ftrace tree got a conflict in: kernel/events/uprobes.c between commit: 788faab70d5a ("perf, tools: Use correct articles in comments") from Linus' tree and commit: 38e967ae1e60 ("Uprobes: Simplify uprobe_register() body") from the ftrace tree. I fixed it up (the latter moved the uprobe_unregister function - see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc kernel/events/uprobes.c index aed1ba569954,c0418ba52ba8.. --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@@ -860,7 -856,28 +856,28 @@@ __uprobe_unregister(struct uprobe *upro } /* - * uprobe_register - register a probe - * uprobe_unregister - unregister a already registered probe. ++ * uprobe_unregister - unregister an already registered probe. + * @inode: the file in which the probe has to be removed. + * @offset: offset from the start of the file. + * @uc: identify which probe if multiple probes are colocated. + */ + void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc) + { + struct uprobe *uprobe; + + uprobe = find_uprobe(inode, offset); + if (WARN_ON(!uprobe)) + return; + + down_write(>register_rwsem); + __uprobe_unregister(uprobe, uc); + up_write(>register_rwsem); + put_uprobe(uprobe); + } + EXPORT_SYMBOL_GPL(uprobe_unregister); + + /* + * __uprobe_register - register a probe * @inode: the file in which the probe has to be placed. * @offset: offset from the start of the file. * @uc: information on howto handle the probe.. pgpF0XnTROJcL.pgp Description: OpenPGP digital signature
Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference
Apologize for late reply, I'm occupied with something else. On 08/10/18 at 07:39pm, Mike Galbraith wrote: > On Fri, 2018-08-10 at 18:28 +0800, Dave Young wrote: > > > > > @@ -250,8 +253,10 @@ setup_boot_parameters(struct kimage *image, struct > > > boot_params *params, > > > > > > #ifdef CONFIG_EFI > > > /* Setup EFI state */ > > > - setup_efi_state(params, params_load_addr, efi_map_offset, efi_map_sz, > > > + ret = setup_efi_state(params, params_load_addr, efi_map_offset, > > > efi_map_sz, > > > efi_setup_data_offset); > > > + if (ret) > > > > Here should check efi_enabled(EFI_BOOT) && ret > > Patch with that works for me. > > > In case efi boot we need the efi info set correctly, or one need pass > > acpi_rsdp= in kernel cmdline param. > > > > Still not sure how to allow one to workaround it by using acpi_rsdp= > > param with kexec_file_load.. > > Does this improve things, and plug the no boot hole? Would you mind to tune my patch with some acpi_rsdp checking and add some error message in case kexec load failure? Eg. suggest people to use append acpi_rsdp for noefi booting etc. I'm still not very satisfied with the code cleanup, ideally we should add a separate kbuf for efi stuff, so that we can isolate the efi_map_sz efi_setup_data_offset, and efi_map_offset initialization only when necessary. Anyway the cleanup can be a separate patch. > > x86, kdump: cleanup efi setup data handling a bit > > 1. Remove efi specific variables from bzImage64_load() other than the > one it needs, efi_map_sz, passing it and params_cmdline_sz on to efi > setup functions, giving them all they need without duplication. > > 2. Only allocate space for efi setup data when a 1:1 mapping is available. > Bail early with -ENODEV if not available, but is required to boot, and > acpi_rsdp= was not passed on the command line. > > 3. Use the proper config dependency to isolate efi setup functions, > adding a !EFI_RUNTIME_MAP stub for setup_efi_state(). > > 4. Change efi functions that cannot fail to void. > > Signed-off-by: Mike Galbraith > --- > arch/x86/kernel/kexec-bzimage64.c | 99 > +- > 1 file changed, 45 insertions(+), 54 deletions(-) > > --- a/arch/x86/kernel/kexec-bzimage64.c > +++ b/arch/x86/kernel/kexec-bzimage64.c > @@ -112,35 +112,32 @@ static int setup_e820_entries(struct boo > return 0; > } > > -#ifdef CONFIG_EFI > -static int setup_efi_info_memmap(struct boot_params *params, > +#ifdef CONFIG_EFI_RUNTIME_MAP > +static void setup_efi_info_memmap(struct boot_params *params, > unsigned long params_load_addr, > - unsigned int efi_map_offset, > + unsigned int params_cmdline_sz, > unsigned int efi_map_sz) > { > - void *efi_map = (void *)params + efi_map_offset; > - unsigned long efi_map_phys_addr = params_load_addr + efi_map_offset; > + void *efi_map = (void *)params + params_cmdline_sz; > + unsigned long efi_map_phys_addr = params_load_addr + params_cmdline_sz; > struct efi_info *ei = >efi_info; > > - if (!efi_map_sz) > - return -EINVAL; > - > efi_runtime_map_copy(efi_map, efi_map_sz); > > ei->efi_memmap = efi_map_phys_addr & 0x; > ei->efi_memmap_hi = efi_map_phys_addr >> 32; > ei->efi_memmap_size = efi_map_sz; > - > - return 0; > } > > -static int > +static void > prepare_add_efi_setup_data(struct boot_params *params, > -unsigned long params_load_addr, > -unsigned int efi_setup_data_offset) > +unsigned long params_load_addr, > +unsigned int params_cmdline_sz, > +unsigned int efi_map_sz) > { > + unsigned int data_offset = params_cmdline_sz + ALIGN(efi_map_sz, 16); > unsigned long setup_data_phys; > - struct setup_data *sd = (void *)params + efi_setup_data_offset; > + struct setup_data *sd = (void *)params + data_offset; > struct efi_setup_data *esd = (void *)sd + sizeof(struct setup_data); > > esd->fw_vendor = efi.fw_vendor; > @@ -152,33 +149,20 @@ prepare_add_efi_setup_data(struct boot_p > sd->len = sizeof(struct efi_setup_data); > > /* Add setup data */ > - setup_data_phys = params_load_addr + efi_setup_data_offset; > + setup_data_phys = params_load_addr + data_offset; > sd->next = params->hdr.setup_data; > params->hdr.setup_data = setup_data_phys; > - > - return 0; > } > > static int > setup_efi_state(struct boot_params *params, unsigned long params_load_addr, > - unsigned int efi_map_offset, unsigned int efi_map_sz, > - unsigned int efi_setup_data_offset) > + unsigned int params_cmdline_sz, unsigned int efi_map_sz) > { > struct efi_info *current_ei = _params.efi_info; > struct efi_info *ei =
Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference
Apologize for late reply, I'm occupied with something else. On 08/10/18 at 07:39pm, Mike Galbraith wrote: > On Fri, 2018-08-10 at 18:28 +0800, Dave Young wrote: > > > > > @@ -250,8 +253,10 @@ setup_boot_parameters(struct kimage *image, struct > > > boot_params *params, > > > > > > #ifdef CONFIG_EFI > > > /* Setup EFI state */ > > > - setup_efi_state(params, params_load_addr, efi_map_offset, efi_map_sz, > > > + ret = setup_efi_state(params, params_load_addr, efi_map_offset, > > > efi_map_sz, > > > efi_setup_data_offset); > > > + if (ret) > > > > Here should check efi_enabled(EFI_BOOT) && ret > > Patch with that works for me. > > > In case efi boot we need the efi info set correctly, or one need pass > > acpi_rsdp= in kernel cmdline param. > > > > Still not sure how to allow one to workaround it by using acpi_rsdp= > > param with kexec_file_load.. > > Does this improve things, and plug the no boot hole? Would you mind to tune my patch with some acpi_rsdp checking and add some error message in case kexec load failure? Eg. suggest people to use append acpi_rsdp for noefi booting etc. I'm still not very satisfied with the code cleanup, ideally we should add a separate kbuf for efi stuff, so that we can isolate the efi_map_sz efi_setup_data_offset, and efi_map_offset initialization only when necessary. Anyway the cleanup can be a separate patch. > > x86, kdump: cleanup efi setup data handling a bit > > 1. Remove efi specific variables from bzImage64_load() other than the > one it needs, efi_map_sz, passing it and params_cmdline_sz on to efi > setup functions, giving them all they need without duplication. > > 2. Only allocate space for efi setup data when a 1:1 mapping is available. > Bail early with -ENODEV if not available, but is required to boot, and > acpi_rsdp= was not passed on the command line. > > 3. Use the proper config dependency to isolate efi setup functions, > adding a !EFI_RUNTIME_MAP stub for setup_efi_state(). > > 4. Change efi functions that cannot fail to void. > > Signed-off-by: Mike Galbraith > --- > arch/x86/kernel/kexec-bzimage64.c | 99 > +- > 1 file changed, 45 insertions(+), 54 deletions(-) > > --- a/arch/x86/kernel/kexec-bzimage64.c > +++ b/arch/x86/kernel/kexec-bzimage64.c > @@ -112,35 +112,32 @@ static int setup_e820_entries(struct boo > return 0; > } > > -#ifdef CONFIG_EFI > -static int setup_efi_info_memmap(struct boot_params *params, > +#ifdef CONFIG_EFI_RUNTIME_MAP > +static void setup_efi_info_memmap(struct boot_params *params, > unsigned long params_load_addr, > - unsigned int efi_map_offset, > + unsigned int params_cmdline_sz, > unsigned int efi_map_sz) > { > - void *efi_map = (void *)params + efi_map_offset; > - unsigned long efi_map_phys_addr = params_load_addr + efi_map_offset; > + void *efi_map = (void *)params + params_cmdline_sz; > + unsigned long efi_map_phys_addr = params_load_addr + params_cmdline_sz; > struct efi_info *ei = >efi_info; > > - if (!efi_map_sz) > - return -EINVAL; > - > efi_runtime_map_copy(efi_map, efi_map_sz); > > ei->efi_memmap = efi_map_phys_addr & 0x; > ei->efi_memmap_hi = efi_map_phys_addr >> 32; > ei->efi_memmap_size = efi_map_sz; > - > - return 0; > } > > -static int > +static void > prepare_add_efi_setup_data(struct boot_params *params, > -unsigned long params_load_addr, > -unsigned int efi_setup_data_offset) > +unsigned long params_load_addr, > +unsigned int params_cmdline_sz, > +unsigned int efi_map_sz) > { > + unsigned int data_offset = params_cmdline_sz + ALIGN(efi_map_sz, 16); > unsigned long setup_data_phys; > - struct setup_data *sd = (void *)params + efi_setup_data_offset; > + struct setup_data *sd = (void *)params + data_offset; > struct efi_setup_data *esd = (void *)sd + sizeof(struct setup_data); > > esd->fw_vendor = efi.fw_vendor; > @@ -152,33 +149,20 @@ prepare_add_efi_setup_data(struct boot_p > sd->len = sizeof(struct efi_setup_data); > > /* Add setup data */ > - setup_data_phys = params_load_addr + efi_setup_data_offset; > + setup_data_phys = params_load_addr + data_offset; > sd->next = params->hdr.setup_data; > params->hdr.setup_data = setup_data_phys; > - > - return 0; > } > > static int > setup_efi_state(struct boot_params *params, unsigned long params_load_addr, > - unsigned int efi_map_offset, unsigned int efi_map_sz, > - unsigned int efi_setup_data_offset) > + unsigned int params_cmdline_sz, unsigned int efi_map_sz) > { > struct efi_info *current_ei = _params.efi_info; > struct efi_info *ei =
Re: [PATCH] modpost: check strdup() return value
2018-08-15 5:50 GMT+09:00 Randy Dunlap : > From: Randy Dunlap > > Fix missing error check for function strdup() in scripts/mod/modpost.c. > > Fixes kernel bugzilla #200319: > https://bugzilla.kernel.org/show_bug.cgi?id=200319 > > Signed-off-by: Randy Dunlap > Cc: Yuexing Wang > Cc: Masahiro Yamada > --- > scripts/mod/modpost.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > --- linux-next-20180814.orig/scripts/mod/modpost.c > +++ linux-next-20180814/scripts/mod/modpost.c > @@ -672,7 +672,7 @@ static void handle_modversions(struct mo > if (ELF_ST_TYPE(sym->st_info) == STT_SPARC_REGISTER) > break; > if (symname[0] == '.') { > - char *munged = strdup(symname); > + char *munged = NOFAIL(strdup(symname)); > munged[0] = '_'; > munged[1] = toupper(munged[1]); > symname = munged; > > While you are here, will you fix a little more? line 1321: char *p = malloc(20); line 1341: return strdup(""); line 2039: buf->p = realloc(buf->p, buf->size); -- Best Regards Masahiro Yamada
Re: [PATCH] modpost: check strdup() return value
2018-08-15 5:50 GMT+09:00 Randy Dunlap : > From: Randy Dunlap > > Fix missing error check for function strdup() in scripts/mod/modpost.c. > > Fixes kernel bugzilla #200319: > https://bugzilla.kernel.org/show_bug.cgi?id=200319 > > Signed-off-by: Randy Dunlap > Cc: Yuexing Wang > Cc: Masahiro Yamada > --- > scripts/mod/modpost.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > --- linux-next-20180814.orig/scripts/mod/modpost.c > +++ linux-next-20180814/scripts/mod/modpost.c > @@ -672,7 +672,7 @@ static void handle_modversions(struct mo > if (ELF_ST_TYPE(sym->st_info) == STT_SPARC_REGISTER) > break; > if (symname[0] == '.') { > - char *munged = strdup(symname); > + char *munged = NOFAIL(strdup(symname)); > munged[0] = '_'; > munged[1] = toupper(munged[1]); > symname = munged; > > While you are here, will you fix a little more? line 1321: char *p = malloc(20); line 1341: return strdup(""); line 2039: buf->p = realloc(buf->p, buf->size); -- Best Regards Masahiro Yamada
Greetings in the name of God, Business proposal in God we trust
Greetings in the name of God Dear Friend Greetings in the name of God,please let this not sound strange to you for my only surviving lawyer who would have done this died early this year.I prayed and got your email id from your country guestbook. I am Mrs Suran Yoda from London,I am 72 years old,i am suffering from a long time cancer of the lungs which also affected my brain,from all indication my conditions is really deteriorating and it is quite obvious that,according to my doctors they have advised me that i may not live for the next two months,this is because the cancer stage has gotten to a very bad stage.I am married to (Dr Andrews Yoda) who worked with the Embassy of United Kingdom in South Africa for nine years,Before he died in 2004. I was bred up from a motherless babies home and was married to my late husband for Thirty years without a child,my husband died in a fatal motor accident Before his death we were true believers.Since his death I decided not to re-marry,I sold all my inherited belongings and deposited all the sum of $6.5 Million dollars with Bank in South Africa.Though what disturbs me mostly is the cancer. Having known my condition I decided to donate this fund to church,i want you as God fearing person,to also use this money to fund church,orphanages and widows,I took this decision,before i rest in peace because my time will so on be up. The Bible made us to understand that blessed are the hands that giveth. I took this decision because I don`t have any child that will inherit this money and my husband's relatives are not Christians and I don`t want my husband hard earned money to be misused by unbelievers. I don`t want a situation where these money will be used in an ungodly manner,hence the reason for taking this bold decision.I am not afraid of death hence i know where am going.Presently,I'm with my laptop in a hospital here in London where I have been undergoing treatment for cancer of the lungs. As soon as I receive your reply I shall give you the contact of the Bank.I will also issue you a letter of authority that will prove you as the new beneficiary of my fund.Please assure me that you will act accordingly as I stated.Hoping to hear from you soon. Remain blessed in the name of the Lord. Yours in Christ, Mrs Suran Yoda
Greetings in the name of God, Business proposal in God we trust
Greetings in the name of God Dear Friend Greetings in the name of God,please let this not sound strange to you for my only surviving lawyer who would have done this died early this year.I prayed and got your email id from your country guestbook. I am Mrs Suran Yoda from London,I am 72 years old,i am suffering from a long time cancer of the lungs which also affected my brain,from all indication my conditions is really deteriorating and it is quite obvious that,according to my doctors they have advised me that i may not live for the next two months,this is because the cancer stage has gotten to a very bad stage.I am married to (Dr Andrews Yoda) who worked with the Embassy of United Kingdom in South Africa for nine years,Before he died in 2004. I was bred up from a motherless babies home and was married to my late husband for Thirty years without a child,my husband died in a fatal motor accident Before his death we were true believers.Since his death I decided not to re-marry,I sold all my inherited belongings and deposited all the sum of $6.5 Million dollars with Bank in South Africa.Though what disturbs me mostly is the cancer. Having known my condition I decided to donate this fund to church,i want you as God fearing person,to also use this money to fund church,orphanages and widows,I took this decision,before i rest in peace because my time will so on be up. The Bible made us to understand that blessed are the hands that giveth. I took this decision because I don`t have any child that will inherit this money and my husband's relatives are not Christians and I don`t want my husband hard earned money to be misused by unbelievers. I don`t want a situation where these money will be used in an ungodly manner,hence the reason for taking this bold decision.I am not afraid of death hence i know where am going.Presently,I'm with my laptop in a hospital here in London where I have been undergoing treatment for cancer of the lungs. As soon as I receive your reply I shall give you the contact of the Bank.I will also issue you a letter of authority that will prove you as the new beneficiary of my fund.Please assure me that you will act accordingly as I stated.Hoping to hear from you soon. Remain blessed in the name of the Lord. Yours in Christ, Mrs Suran Yoda
Re: [RFC][PATCH 00/24] tools lib traceevent: Rename pevent to tep for preparation for library
On Fri, 10 Aug 2018 13:57:06 -0300 Arnaldo Carvalho de Melo wrote: > > Arnaldo takes care of that, but I guess pulling from branch is the prefered > > way > > I'll try pulling and building it automatically patch by patch, Hi Arnaldo, Were you able to take this? We have another patch series that's ready to go (it can wait till after the merge window though). -- Steve
Re: [RFC][PATCH 00/24] tools lib traceevent: Rename pevent to tep for preparation for library
On Fri, 10 Aug 2018 13:57:06 -0300 Arnaldo Carvalho de Melo wrote: > > Arnaldo takes care of that, but I guess pulling from branch is the prefered > > way > > I'll try pulling and building it automatically patch by patch, Hi Arnaldo, Were you able to take this? We have another patch series that's ready to go (it can wait till after the merge window though). -- Steve
Re: [PATCH V2] i2c: ismt: fix wrong device address when unmap the data buffer
+Cc: stable Hi Greg, JFI: This one has hit a couple of times on autotests on v4.9 stable. The fix for BUG() is trivial, so probably worth to ship it to v4.9/v4.4/v3.18. 2017-06-13 5:59 GMT+01:00 Song liwei : > From: Liwei Song > > Fix the following kernel bug: > > kernel BUG at drivers/iommu/intel-iommu.c:3260! > invalid opcode: [#5] PREEMPT SMP > Hardware name: Intel Corp. Harcuvar/Server, BIOS > HAVLCRB0.X64.0013.D39.1608311820 08/31/2016 > task: 880175389950 ti: 880176bec000 task.ti: 880176bec000 > RIP: 0010:[] [] intel_unmap+0x25b/0x260 > RSP: 0018:880176bef5e8 EFLAGS: 00010296 > RAX: 0024 RBX: 8800773c7c88 RCX: ce04 > RDX: 8000 RSI: RDI: 0009 > RBP: 880176bef638 R08: 0010 R09: 0004 > R10: 880175389c78 R11: 0a4f R12: 8800773c7868 > R13: ac88 R14: 8800773c7818 R15: 0001 > FS: 7fef21258700() GS:88017b5c() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 0066d6d8 CR3: 7118c000 CR4: 003406e0 > Stack: > ac88 8199867f 880176bef5f8 88010030 > 880176bef668 8800773c7c88 880178288098 8800772c0010 > 8800773c7818 0001 880176bef648 8150a86e > Call Trace: > [] ? printk+0x46/0x48 > [] intel_unmap_page+0xe/0x10 > [] ismt_access+0x27b/0x8fa [i2c_ismt] > [] ? __pm_runtime_suspend+0xa0/0xa0 > [] ? pm_suspend_timer_fn+0x80/0x80 > [] ? __pm_runtime_suspend+0xa0/0xa0 > [] ? pm_suspend_timer_fn+0x80/0x80 > [] ? pci_bus_read_dev_vendor_id+0xf0/0xf0 > [] i2c_smbus_xfer+0xec/0x4b0 > [] ? vprintk_emit+0x345/0x530 > [] i2cdev_ioctl_smbus+0x12b/0x240 [i2c_dev] > [] ? vprintk_default+0x29/0x40 > [] i2cdev_ioctl+0x63/0x1ec [i2c_dev] > [] do_vfs_ioctl+0x328/0x5d0 > [] ? vfs_write+0x11c/0x190 > [] ? rt_up_read+0x19/0x20 > [] SyS_ioctl+0x81/0xa0 > [] system_call_fastpath+0x16/0x6e > > This happen When run "i2cdetect -y 0" detect SMBus iSMT adapter. > > After finished I2C block read/write, when unmap the data buffer, > a wrong device address was pass to dma_unmap_single(). > > To fix this, give dma_unmap_single() the "dev" parameter, just like > what dma_map_single() does, then unmap can find the right devices. > > Fixes: 13f35ac14cd0 ("i2c: Adding support for Intel iSMT SMBus 2.0 host > controller") > Signed-off-by: Liwei Song > --- > drivers/i2c/busses/i2c-ismt.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/i2c/busses/i2c-ismt.c b/drivers/i2c/busses/i2c-ismt.c > index f573448..e98e44e 100644 > --- a/drivers/i2c/busses/i2c-ismt.c > +++ b/drivers/i2c/busses/i2c-ismt.c > @@ -584,7 +584,7 @@ static int ismt_access(struct i2c_adapter *adap, u16 addr, > > /* unmap the data buffer */ > if (dma_size != 0) > - dma_unmap_single(>dev, dma_addr, dma_size, > dma_direction); > + dma_unmap_single(dev, dma_addr, dma_size, dma_direction); > > if (unlikely(!time_left)) { > dev_err(dev, "completion wait timed out\n"); > -- > 2.7.4 > -- Dmitry
Re: [PATCH V2] i2c: ismt: fix wrong device address when unmap the data buffer
+Cc: stable Hi Greg, JFI: This one has hit a couple of times on autotests on v4.9 stable. The fix for BUG() is trivial, so probably worth to ship it to v4.9/v4.4/v3.18. 2017-06-13 5:59 GMT+01:00 Song liwei : > From: Liwei Song > > Fix the following kernel bug: > > kernel BUG at drivers/iommu/intel-iommu.c:3260! > invalid opcode: [#5] PREEMPT SMP > Hardware name: Intel Corp. Harcuvar/Server, BIOS > HAVLCRB0.X64.0013.D39.1608311820 08/31/2016 > task: 880175389950 ti: 880176bec000 task.ti: 880176bec000 > RIP: 0010:[] [] intel_unmap+0x25b/0x260 > RSP: 0018:880176bef5e8 EFLAGS: 00010296 > RAX: 0024 RBX: 8800773c7c88 RCX: ce04 > RDX: 8000 RSI: RDI: 0009 > RBP: 880176bef638 R08: 0010 R09: 0004 > R10: 880175389c78 R11: 0a4f R12: 8800773c7868 > R13: ac88 R14: 8800773c7818 R15: 0001 > FS: 7fef21258700() GS:88017b5c() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 0066d6d8 CR3: 7118c000 CR4: 003406e0 > Stack: > ac88 8199867f 880176bef5f8 88010030 > 880176bef668 8800773c7c88 880178288098 8800772c0010 > 8800773c7818 0001 880176bef648 8150a86e > Call Trace: > [] ? printk+0x46/0x48 > [] intel_unmap_page+0xe/0x10 > [] ismt_access+0x27b/0x8fa [i2c_ismt] > [] ? __pm_runtime_suspend+0xa0/0xa0 > [] ? pm_suspend_timer_fn+0x80/0x80 > [] ? __pm_runtime_suspend+0xa0/0xa0 > [] ? pm_suspend_timer_fn+0x80/0x80 > [] ? pci_bus_read_dev_vendor_id+0xf0/0xf0 > [] i2c_smbus_xfer+0xec/0x4b0 > [] ? vprintk_emit+0x345/0x530 > [] i2cdev_ioctl_smbus+0x12b/0x240 [i2c_dev] > [] ? vprintk_default+0x29/0x40 > [] i2cdev_ioctl+0x63/0x1ec [i2c_dev] > [] do_vfs_ioctl+0x328/0x5d0 > [] ? vfs_write+0x11c/0x190 > [] ? rt_up_read+0x19/0x20 > [] SyS_ioctl+0x81/0xa0 > [] system_call_fastpath+0x16/0x6e > > This happen When run "i2cdetect -y 0" detect SMBus iSMT adapter. > > After finished I2C block read/write, when unmap the data buffer, > a wrong device address was pass to dma_unmap_single(). > > To fix this, give dma_unmap_single() the "dev" parameter, just like > what dma_map_single() does, then unmap can find the right devices. > > Fixes: 13f35ac14cd0 ("i2c: Adding support for Intel iSMT SMBus 2.0 host > controller") > Signed-off-by: Liwei Song > --- > drivers/i2c/busses/i2c-ismt.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/i2c/busses/i2c-ismt.c b/drivers/i2c/busses/i2c-ismt.c > index f573448..e98e44e 100644 > --- a/drivers/i2c/busses/i2c-ismt.c > +++ b/drivers/i2c/busses/i2c-ismt.c > @@ -584,7 +584,7 @@ static int ismt_access(struct i2c_adapter *adap, u16 addr, > > /* unmap the data buffer */ > if (dma_size != 0) > - dma_unmap_single(>dev, dma_addr, dma_size, > dma_direction); > + dma_unmap_single(dev, dma_addr, dma_size, dma_direction); > > if (unlikely(!time_left)) { > dev_err(dev, "completion wait timed out\n"); > -- > 2.7.4 > -- Dmitry
Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum SPI controller documentation
Hi Rob, On 15 August 2018 at 04:27, Rob Herring wrote: > On Thu, Aug 09, 2018 at 11:03:11AM +0800, Baolin Wang wrote: >> Hi Trent, >> >> On 9 August 2018 at 02:57, Trent Piepho wrote: >> > On Wed, 2018-08-08 at 11:54 +0100, Mark Brown wrote: >> >> On Wed, Aug 08, 2018 at 06:35:28PM +0800, Baolin Wang wrote: >> >> > On 8 August 2018 at 17:50, Mark Brown wrote: >> >> > > Right, I don't think we added this yet (if we did I can't see >> >> > > it). I'd >> >> > > add a new field to spi_transfer for this, then other controllers >> >> > > with >> >> > > the same support can implement it as well and drivers can start >> >> > > using >> >> > > it too. >> >> > OK. So I will name the new filed as 'word_delay', is it OK for you? >> >> >> >> Sounds good, yes. >> > >> > Should it be in µs like the existing iter-transfer delay? I think >> > perhaps units of cycles of the SPI clock make more sense? >> >> Since some SPI controllers just want some interval values (neither µs >> unit nor cycles unit ) set into hardware, and the hardware will >> convert to the correct delay time automatically. So I did not force >> 'word_delay' unit as µs or cycle, and just let the slave devices >> decide the unit which depends on the SPI hardware requirement. > > This needs to be defined units in DT, not decided by each controller. Do you mean we should introduce one standard property (maybe named as 'word_delay_unit') to define the word_delay unit? If we really need to specify the unit of word_delay, I think we can add comments for spi_tansfer to specify the unit, which will be better. diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index a64235e..7a72c0a 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -711,6 +711,8 @@ extern void spi_res_release(struct spi_controller *ctlr, * @delay_usecs: microseconds to delay after this transfer before * (optionally) changing the chipselect status, then starting * the next transfer or completing this @spi_message. + * @word_delay: clock cycles to inter word delay after each word size (set by bits_per_word) + * transmission. * @transfer_list: transfers are sequenced through @spi_message.transfers * @tx_sg: Scatterlist for transmit, currently not for client use * @rx_sg: Scatterlist for receive, currently not for client use @@ -793,6 +795,7 @@ struct spi_transfer { u8 bits_per_word; u16 delay_usecs; u32 speed_hz; + u16 word_delay; struct list_head transfer_list; }; -- Baolin Wang Best Regards
Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum SPI controller documentation
Hi Rob, On 15 August 2018 at 04:27, Rob Herring wrote: > On Thu, Aug 09, 2018 at 11:03:11AM +0800, Baolin Wang wrote: >> Hi Trent, >> >> On 9 August 2018 at 02:57, Trent Piepho wrote: >> > On Wed, 2018-08-08 at 11:54 +0100, Mark Brown wrote: >> >> On Wed, Aug 08, 2018 at 06:35:28PM +0800, Baolin Wang wrote: >> >> > On 8 August 2018 at 17:50, Mark Brown wrote: >> >> > > Right, I don't think we added this yet (if we did I can't see >> >> > > it). I'd >> >> > > add a new field to spi_transfer for this, then other controllers >> >> > > with >> >> > > the same support can implement it as well and drivers can start >> >> > > using >> >> > > it too. >> >> > OK. So I will name the new filed as 'word_delay', is it OK for you? >> >> >> >> Sounds good, yes. >> > >> > Should it be in µs like the existing iter-transfer delay? I think >> > perhaps units of cycles of the SPI clock make more sense? >> >> Since some SPI controllers just want some interval values (neither µs >> unit nor cycles unit ) set into hardware, and the hardware will >> convert to the correct delay time automatically. So I did not force >> 'word_delay' unit as µs or cycle, and just let the slave devices >> decide the unit which depends on the SPI hardware requirement. > > This needs to be defined units in DT, not decided by each controller. Do you mean we should introduce one standard property (maybe named as 'word_delay_unit') to define the word_delay unit? If we really need to specify the unit of word_delay, I think we can add comments for spi_tansfer to specify the unit, which will be better. diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index a64235e..7a72c0a 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -711,6 +711,8 @@ extern void spi_res_release(struct spi_controller *ctlr, * @delay_usecs: microseconds to delay after this transfer before * (optionally) changing the chipselect status, then starting * the next transfer or completing this @spi_message. + * @word_delay: clock cycles to inter word delay after each word size (set by bits_per_word) + * transmission. * @transfer_list: transfers are sequenced through @spi_message.transfers * @tx_sg: Scatterlist for transmit, currently not for client use * @rx_sg: Scatterlist for receive, currently not for client use @@ -793,6 +795,7 @@ struct spi_transfer { u8 bits_per_word; u16 delay_usecs; u32 speed_hz; + u16 word_delay; struct list_head transfer_list; }; -- Baolin Wang Best Regards
Re: [PATCH] Bugfix for handling of shadow doorbell buffer.
On Tue, Aug 14, 2018 at 6:35 PM Michal Wnukowski wrote: > > I got confused after comaring disassembly of this code with and > without volatile keyword. Thanks for the correction. Note that _usually_, the "volatile" has absolutely no impact. When there is one read in the source code, it's almost always one access in the generated code too. That's particularly true if the read like this access do dbbuf_ei: if (!nvme_dbbuf_need_event(*dbbuf_ei, value, old_value)) is only used as an argument to a real function call. But if that function is an inline function (which it is), and the argument ends up getting used multiple times (which in this case it is not), then it is possible in theory that gcc ends up saying "instead of reading the value once, and keep it in a register, I'll just read it again for the second time". That happens rarely, but it _can_ happen without the volatile (or the READ_ONCE()). The advantage of READ_ONCE() over using "volatile" in a data structure tends to be that you explicitly mark the memory accesses that you care about. That's nice documentation for whoever reads the code (and it's not unheard of that the _same_ data structure is sometimes volatile, and sometimes not - for example, the data structure might be protected by a lock - not volatile - but people might use an optimistic lockless access to the value too - when it ends up being volatile. So then it's really good to explicitly use READ_ONCE() for the volatile cases where you show that you know that you're now doing something special that really depends on memory ordering or other magic "access exactly once" behavior. > > I'm assuming that's the actual controller hardware, but it needs a > > comment about *that* access being ordered too, because if it isn't, > > then ordering this side is pointless. > > The other side in this case is not actual controller hardware, but > virtual one (the regular hardware should rely on normal MMIO > doorbells). Ok, Maybe worth adding a one-line note about the ordering guarantees by the virtual controller accesses. Linus
Re: [PATCH] Bugfix for handling of shadow doorbell buffer.
On Tue, Aug 14, 2018 at 6:35 PM Michal Wnukowski wrote: > > I got confused after comaring disassembly of this code with and > without volatile keyword. Thanks for the correction. Note that _usually_, the "volatile" has absolutely no impact. When there is one read in the source code, it's almost always one access in the generated code too. That's particularly true if the read like this access do dbbuf_ei: if (!nvme_dbbuf_need_event(*dbbuf_ei, value, old_value)) is only used as an argument to a real function call. But if that function is an inline function (which it is), and the argument ends up getting used multiple times (which in this case it is not), then it is possible in theory that gcc ends up saying "instead of reading the value once, and keep it in a register, I'll just read it again for the second time". That happens rarely, but it _can_ happen without the volatile (or the READ_ONCE()). The advantage of READ_ONCE() over using "volatile" in a data structure tends to be that you explicitly mark the memory accesses that you care about. That's nice documentation for whoever reads the code (and it's not unheard of that the _same_ data structure is sometimes volatile, and sometimes not - for example, the data structure might be protected by a lock - not volatile - but people might use an optimistic lockless access to the value too - when it ends up being volatile. So then it's really good to explicitly use READ_ONCE() for the volatile cases where you show that you know that you're now doing something special that really depends on memory ordering or other magic "access exactly once" behavior. > > I'm assuming that's the actual controller hardware, but it needs a > > comment about *that* access being ordered too, because if it isn't, > > then ordering this side is pointless. > > The other side in this case is not actual controller hardware, but > virtual one (the regular hardware should rely on normal MMIO > doorbells). Ok, Maybe worth adding a one-line note about the ordering guarantees by the virtual controller accesses. Linus
Re: linux-next: manual merge of the tip tree with the rdma tree
Hi all, On Mon, 6 Aug 2018 14:53:31 +1000 Stephen Rothwell wrote: > > Today's linux-next merge of the tip tree got a conflict in: > > drivers/infiniband/core/rdma_core.c > > between commit: > > 9867f5c6695f ("IB/uverbs: Convert 'bool exclusive' into an enum") > > from the rdma tree and commit: > > bfc18e389c7a ("atomics/treewide: Rename __atomic_add_unless() => > atomic_fetch_add_unless()") > > from the tip tree. > > I fixed it up (see below) and can carry the fix as necessary. This > is now fixed as far as linux-next is concerned, but any non trivial > conflicts should be mentioned to your upstream maintainer when your tree > is submitted for merging. You may also want to consider cooperating > with the maintainer of the conflicting tree to minimise any particularly > complex conflicts. > > -- > Cheers, > Stephen Rothwell > > diff --cc drivers/infiniband/core/rdma_core.c > index 4235b9ddc2ad,475910ffbcb6.. > --- a/drivers/infiniband/core/rdma_core.c > +++ b/drivers/infiniband/core/rdma_core.c > @@@ -122,206 -120,19 +122,206 @@@ static int uverbs_try_lock_object(struc >* concurrently, setting the counter to zero is enough for releasing >* this lock. >*/ > -if (!exclusive) > +switch (mode) { > +case UVERBS_LOOKUP_READ: > - return __atomic_add_unless(>usecnt, 1, -1) == -1 ? > + return atomic_fetch_add_unless(>usecnt, 1, -1) == -1 ? > -EBUSY : 0; > +case UVERBS_LOOKUP_WRITE: > +/* lock is exclusive */ > +return atomic_cmpxchg(>usecnt, 0, -1) == 0 ? 0 : -EBUSY; > +case UVERBS_LOOKUP_DESTROY: > +return 0; > +} > +return 0; > +} > + > +static void assert_uverbs_usecnt(struct ib_uobject *uobj, > + enum rdma_lookup_mode mode) > +{ > +#ifdef CONFIG_LOCKDEP > +switch (mode) { > +case UVERBS_LOOKUP_READ: > +WARN_ON(atomic_read(>usecnt) <= 0); > +break; > +case UVERBS_LOOKUP_WRITE: > +WARN_ON(atomic_read(>usecnt) != -1); > +break; > +case UVERBS_LOOKUP_DESTROY: > +break; > +} > +#endif > +} > + > +/* > + * This must be called with the hw_destroy_rwsem locked for read or write, > + * also the uobject itself must be locked for write. > + * > + * Upon return the HW object is guaranteed to be destroyed. > + * > + * For RDMA_REMOVE_ABORT, the hw_destroy_rwsem is not required to be held, > + * however the type's allocat_commit function cannot have been called and > the > + * uobject cannot be on the uobjects_lists > + * > + * For RDMA_REMOVE_DESTROY the caller shold be holding a kref (eg via > + * rdma_lookup_get_uobject) and the object is left in a state where the > caller > + * needs to call rdma_lookup_put_uobject. > + * > + * For all other destroy modes this function internally unlocks the uobject > + * and consumes the kref on the uobj. > + */ > +static int uverbs_destroy_uobject(struct ib_uobject *uobj, > + enum rdma_remove_reason reason) > +{ > +struct ib_uverbs_file *ufile = uobj->ufile; > +unsigned long flags; > +int ret; > + > +lockdep_assert_held(>hw_destroy_rwsem); > +assert_uverbs_usecnt(uobj, UVERBS_LOOKUP_WRITE); > + > +if (uobj->object) { > +ret = uobj->type->type_class->destroy_hw(uobj, reason); > +if (ret) { > +if (ib_is_destroy_retryable(ret, reason, uobj)) > +return ret; > + > +/* Nothing to be done, dangle the memory and move on */ > +WARN(true, > + "ib_uverbs: failed to remove uobject id %d, driver > err=%d", > + uobj->id, ret); > +} > + > +uobj->object = NULL; > +} > + > +if (reason == RDMA_REMOVE_ABORT) { > +WARN_ON(!list_empty(>list)); > +WARN_ON(!uobj->context); > +uobj->type->type_class->alloc_abort(uobj); > +} > + > +uobj->context = NULL; > + > +/* > + * For DESTROY the usecnt is held write locked, the caller is expected > + * to put it unlock and put the object when done with it. Only DESTROY > + * can remove the IDR handle. > + */ > +if (reason != RDMA_REMOVE_DESTROY) > +atomic_set(>usecnt, 0); > +else > +uobj->type->type_class->remove_handle(uobj); > + > +if (!list_empty(>list)) { > +spin_lock_irqsave(>uobjects_lock, flags); > +list_del_init(>list); > +spin_unlock_irqrestore(>uobjects_lock, flags); > + > +/* > + * Pairs with the get in rdma_alloc_commit_uobject(), could > + * destroy uobj. > + */ > +uverbs_uobject_put(uobj); > +} > > -/* lock is either WRITE or DESTROY - should
Re: linux-next: manual merge of the tip tree with the rdma tree
Hi all, On Mon, 6 Aug 2018 14:53:31 +1000 Stephen Rothwell wrote: > > Today's linux-next merge of the tip tree got a conflict in: > > drivers/infiniband/core/rdma_core.c > > between commit: > > 9867f5c6695f ("IB/uverbs: Convert 'bool exclusive' into an enum") > > from the rdma tree and commit: > > bfc18e389c7a ("atomics/treewide: Rename __atomic_add_unless() => > atomic_fetch_add_unless()") > > from the tip tree. > > I fixed it up (see below) and can carry the fix as necessary. This > is now fixed as far as linux-next is concerned, but any non trivial > conflicts should be mentioned to your upstream maintainer when your tree > is submitted for merging. You may also want to consider cooperating > with the maintainer of the conflicting tree to minimise any particularly > complex conflicts. > > -- > Cheers, > Stephen Rothwell > > diff --cc drivers/infiniband/core/rdma_core.c > index 4235b9ddc2ad,475910ffbcb6.. > --- a/drivers/infiniband/core/rdma_core.c > +++ b/drivers/infiniband/core/rdma_core.c > @@@ -122,206 -120,19 +122,206 @@@ static int uverbs_try_lock_object(struc >* concurrently, setting the counter to zero is enough for releasing >* this lock. >*/ > -if (!exclusive) > +switch (mode) { > +case UVERBS_LOOKUP_READ: > - return __atomic_add_unless(>usecnt, 1, -1) == -1 ? > + return atomic_fetch_add_unless(>usecnt, 1, -1) == -1 ? > -EBUSY : 0; > +case UVERBS_LOOKUP_WRITE: > +/* lock is exclusive */ > +return atomic_cmpxchg(>usecnt, 0, -1) == 0 ? 0 : -EBUSY; > +case UVERBS_LOOKUP_DESTROY: > +return 0; > +} > +return 0; > +} > + > +static void assert_uverbs_usecnt(struct ib_uobject *uobj, > + enum rdma_lookup_mode mode) > +{ > +#ifdef CONFIG_LOCKDEP > +switch (mode) { > +case UVERBS_LOOKUP_READ: > +WARN_ON(atomic_read(>usecnt) <= 0); > +break; > +case UVERBS_LOOKUP_WRITE: > +WARN_ON(atomic_read(>usecnt) != -1); > +break; > +case UVERBS_LOOKUP_DESTROY: > +break; > +} > +#endif > +} > + > +/* > + * This must be called with the hw_destroy_rwsem locked for read or write, > + * also the uobject itself must be locked for write. > + * > + * Upon return the HW object is guaranteed to be destroyed. > + * > + * For RDMA_REMOVE_ABORT, the hw_destroy_rwsem is not required to be held, > + * however the type's allocat_commit function cannot have been called and > the > + * uobject cannot be on the uobjects_lists > + * > + * For RDMA_REMOVE_DESTROY the caller shold be holding a kref (eg via > + * rdma_lookup_get_uobject) and the object is left in a state where the > caller > + * needs to call rdma_lookup_put_uobject. > + * > + * For all other destroy modes this function internally unlocks the uobject > + * and consumes the kref on the uobj. > + */ > +static int uverbs_destroy_uobject(struct ib_uobject *uobj, > + enum rdma_remove_reason reason) > +{ > +struct ib_uverbs_file *ufile = uobj->ufile; > +unsigned long flags; > +int ret; > + > +lockdep_assert_held(>hw_destroy_rwsem); > +assert_uverbs_usecnt(uobj, UVERBS_LOOKUP_WRITE); > + > +if (uobj->object) { > +ret = uobj->type->type_class->destroy_hw(uobj, reason); > +if (ret) { > +if (ib_is_destroy_retryable(ret, reason, uobj)) > +return ret; > + > +/* Nothing to be done, dangle the memory and move on */ > +WARN(true, > + "ib_uverbs: failed to remove uobject id %d, driver > err=%d", > + uobj->id, ret); > +} > + > +uobj->object = NULL; > +} > + > +if (reason == RDMA_REMOVE_ABORT) { > +WARN_ON(!list_empty(>list)); > +WARN_ON(!uobj->context); > +uobj->type->type_class->alloc_abort(uobj); > +} > + > +uobj->context = NULL; > + > +/* > + * For DESTROY the usecnt is held write locked, the caller is expected > + * to put it unlock and put the object when done with it. Only DESTROY > + * can remove the IDR handle. > + */ > +if (reason != RDMA_REMOVE_DESTROY) > +atomic_set(>usecnt, 0); > +else > +uobj->type->type_class->remove_handle(uobj); > + > +if (!list_empty(>list)) { > +spin_lock_irqsave(>uobjects_lock, flags); > +list_del_init(>list); > +spin_unlock_irqrestore(>uobjects_lock, flags); > + > +/* > + * Pairs with the get in rdma_alloc_commit_uobject(), could > + * destroy uobj. > + */ > +uverbs_uobject_put(uobj); > +} > > -/* lock is either WRITE or DESTROY - should
Re: [PATCH v23 3/4] arm64: dts: mt8173: Add GCE node
On Wed, 2018-07-25 at 09:26 +0800, Houlong Wei wrote: > This patch adds the device node of the GCE hardware for CMDQ module. > > Signed-off-by: Houlong Wei > Signed-off-by: HS Liao > --- > arch/arm64/boot/dts/mediatek/mt8173.dtsi | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/arch/arm64/boot/dts/mediatek/mt8173.dtsi > b/arch/arm64/boot/dts/mediatek/mt8173.dtsi > index 94597e3..97b1ec6 100644 > --- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi > +++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi > @@ -18,6 +18,7 @@ > #include > #include > #include > +#include > #include "mt8173-pinfunc.h" > > / { > @@ -519,6 +520,15 @@ > status = "disabled"; > }; > > + gce: mailbox@10212000 { > + compatible = "mediatek,mt8173-gce"; > + reg = <0 0x10212000 0 0x1000>; > + interrupts = ; > + clocks = < CLK_INFRA_GCE>; > + clock-names = "gce"; > + #mbox-cells = <3>; > + }; > + > mipi_tx0: mipi-dphy@10215000 { > compatible = "mediatek,mt8173-mipi-tx"; > reg = <0 0x10215000 0 0x1000>; Hi Matthias, Could you please review this patch when you are available? Thanks a lot.
Re: [PATCH v23 3/4] arm64: dts: mt8173: Add GCE node
On Wed, 2018-07-25 at 09:26 +0800, Houlong Wei wrote: > This patch adds the device node of the GCE hardware for CMDQ module. > > Signed-off-by: Houlong Wei > Signed-off-by: HS Liao > --- > arch/arm64/boot/dts/mediatek/mt8173.dtsi | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/arch/arm64/boot/dts/mediatek/mt8173.dtsi > b/arch/arm64/boot/dts/mediatek/mt8173.dtsi > index 94597e3..97b1ec6 100644 > --- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi > +++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi > @@ -18,6 +18,7 @@ > #include > #include > #include > +#include > #include "mt8173-pinfunc.h" > > / { > @@ -519,6 +520,15 @@ > status = "disabled"; > }; > > + gce: mailbox@10212000 { > + compatible = "mediatek,mt8173-gce"; > + reg = <0 0x10212000 0 0x1000>; > + interrupts = ; > + clocks = < CLK_INFRA_GCE>; > + clock-names = "gce"; > + #mbox-cells = <3>; > + }; > + > mipi_tx0: mipi-dphy@10215000 { > compatible = "mediatek,mt8173-mipi-tx"; > reg = <0 0x10215000 0 0x1000>; Hi Matthias, Could you please review this patch when you are available? Thanks a lot.
Re: [PATCH v23 4/4] soc: mediatek: Add Mediatek CMDQ helper
On Wed, 2018-07-25 at 09:26 +0800, Houlong Wei wrote: > Add Mediatek CMDQ helper to create CMDQ packet and assemble GCE op code. > > Signed-off-by: Houlong Wei > Signed-off-by: HS Liao > --- > drivers/soc/mediatek/Kconfig | 12 ++ > drivers/soc/mediatek/Makefile |1 + > drivers/soc/mediatek/mtk-cmdq-helper.c | 291 > > include/linux/soc/mediatek/mtk-cmdq.h | 135 +++ > 4 files changed, 439 insertions(+) > create mode 100644 drivers/soc/mediatek/mtk-cmdq-helper.c > create mode 100644 include/linux/soc/mediatek/mtk-cmdq.h > > diff --git a/drivers/soc/mediatek/Kconfig b/drivers/soc/mediatek/Kconfig > index a7d0667..17bd759 100644 > --- a/drivers/soc/mediatek/Kconfig > +++ b/drivers/soc/mediatek/Kconfig > @@ -4,6 +4,18 @@ > menu "MediaTek SoC drivers" > depends on ARCH_MEDIATEK || COMPILE_TEST > > +config MTK_CMDQ > + tristate "MediaTek CMDQ Support" > + depends on ARCH_MEDIATEK || COMPILE_TEST > + select MAILBOX > + select MTK_CMDQ_MBOX > + select MTK_INFRACFG > + help > + Say yes here to add support for the MediaTek Command Queue (CMDQ) > + driver. The CMDQ is used to help read/write registers with critical > + time limitation, such as updating display configuration during the > + vblank. > + > config MTK_INFRACFG > bool "MediaTek INFRACFG Support" > select REGMAP > diff --git a/drivers/soc/mediatek/Makefile b/drivers/soc/mediatek/Makefile > index 12998b0..64ce5ee 100644 > --- a/drivers/soc/mediatek/Makefile > +++ b/drivers/soc/mediatek/Makefile > @@ -1,3 +1,4 @@ > +obj-$(CONFIG_MTK_CMDQ) += mtk-cmdq-helper.o > obj-$(CONFIG_MTK_INFRACFG) += mtk-infracfg.o > obj-$(CONFIG_MTK_PMIC_WRAP) += mtk-pmic-wrap.o > obj-$(CONFIG_MTK_SCPSYS) += mtk-scpsys.o > diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c > b/drivers/soc/mediatek/mtk-cmdq-helper.c > new file mode 100644 > index 000..e4dbb7e > --- /dev/null > +++ b/drivers/soc/mediatek/mtk-cmdq-helper.c > @@ -0,0 +1,291 @@ > +// SPDX-License-Identifier: GPL-2.0 > +// > +// Copyright (c) 2018 MediaTek Inc. > + > +#include > +#include > +#include > +#include > +#include > + > +#define CMDQ_ARG_A_WRITE_MASK0x > +#define CMDQ_WRITE_ENABLE_MASK BIT(0) > +#define CMDQ_EOC_IRQ_EN BIT(0) > +#define CMDQ_EOC_CMD ((u64)((CMDQ_CODE_EOC << CMDQ_OP_CODE_SHIFT)) \ > + << 32 | CMDQ_EOC_IRQ_EN) > + > +static void cmdq_client_timeout(struct timer_list *t) > +{ > + struct cmdq_client *client = from_timer(client, t, timer); > + > + dev_err(client->client.dev, "cmdq timeout!\n"); > +} > + > +struct cmdq_client *cmdq_mbox_create(struct device *dev, int index, u32 > timeout) > +{ > + struct cmdq_client *client; > + > + client = kzalloc(sizeof(*client), GFP_KERNEL); > + if (!client) > + return (struct cmdq_client *)-ENOMEM; > + > + client->timeout_ms = timeout; > + if (timeout != CMDQ_NO_TIMEOUT) { > + spin_lock_init(>lock); > + timer_setup(>timer, cmdq_client_timeout, 0); > + } > + client->pkt_cnt = 0; > + client->client.dev = dev; > + client->client.tx_block = false; > + client->chan = mbox_request_channel(>client, index); > + > + if (IS_ERR(client->chan)) { > + long err = 0; > + > + dev_err(dev, "failed to request channel\n"); > + err = PTR_ERR(client->chan); > + kfree(client); > + > + return (struct cmdq_client *)err; > + } > + > + return client; > +} > +EXPORT_SYMBOL(cmdq_mbox_create); > + > +void cmdq_mbox_destroy(struct cmdq_client *client) > +{ > + if (client->timeout_ms != CMDQ_NO_TIMEOUT) { > + spin_lock(>lock); > + del_timer_sync(>timer); > + spin_unlock(>lock); > + } > + mbox_free_channel(client->chan); > + kfree(client); > +} > +EXPORT_SYMBOL(cmdq_mbox_destroy); > + > +int cmdq_pkt_create(struct cmdq_client *client, struct cmdq_pkt **pkt_ptr, > + size_t size) > +{ > + struct cmdq_pkt *pkt; > + struct device *dev; > + dma_addr_t dma_addr; > + > + pkt = kzalloc(sizeof(*pkt), GFP_KERNEL); > + if (!pkt) > + return -ENOMEM; > + pkt->va_base = kzalloc(size, GFP_KERNEL); > + if (!pkt->va_base) { > + kfree(pkt); > + return -ENOMEM; > + } > + pkt->buf_size = size; > + pkt->cl = (void *)client; > + > + dev = client->chan->mbox->dev; > + dma_addr = dma_map_single(dev, pkt->va_base, pkt->buf_size, > + DMA_TO_DEVICE); > + if (dma_mapping_error(dev, dma_addr)) { > + dev_err(dev, "dma map failed, size=%u\n", (u32)(u64)size); > + kfree(pkt->va_base); > + kfree(pkt); > + return -ENOMEM; > + } > + > + pkt->pa_base = dma_addr; > + *pkt_ptr = pkt; > + > + return 0; > +} >
Re: [PATCH v23 4/4] soc: mediatek: Add Mediatek CMDQ helper
On Wed, 2018-07-25 at 09:26 +0800, Houlong Wei wrote: > Add Mediatek CMDQ helper to create CMDQ packet and assemble GCE op code. > > Signed-off-by: Houlong Wei > Signed-off-by: HS Liao > --- > drivers/soc/mediatek/Kconfig | 12 ++ > drivers/soc/mediatek/Makefile |1 + > drivers/soc/mediatek/mtk-cmdq-helper.c | 291 > > include/linux/soc/mediatek/mtk-cmdq.h | 135 +++ > 4 files changed, 439 insertions(+) > create mode 100644 drivers/soc/mediatek/mtk-cmdq-helper.c > create mode 100644 include/linux/soc/mediatek/mtk-cmdq.h > > diff --git a/drivers/soc/mediatek/Kconfig b/drivers/soc/mediatek/Kconfig > index a7d0667..17bd759 100644 > --- a/drivers/soc/mediatek/Kconfig > +++ b/drivers/soc/mediatek/Kconfig > @@ -4,6 +4,18 @@ > menu "MediaTek SoC drivers" > depends on ARCH_MEDIATEK || COMPILE_TEST > > +config MTK_CMDQ > + tristate "MediaTek CMDQ Support" > + depends on ARCH_MEDIATEK || COMPILE_TEST > + select MAILBOX > + select MTK_CMDQ_MBOX > + select MTK_INFRACFG > + help > + Say yes here to add support for the MediaTek Command Queue (CMDQ) > + driver. The CMDQ is used to help read/write registers with critical > + time limitation, such as updating display configuration during the > + vblank. > + > config MTK_INFRACFG > bool "MediaTek INFRACFG Support" > select REGMAP > diff --git a/drivers/soc/mediatek/Makefile b/drivers/soc/mediatek/Makefile > index 12998b0..64ce5ee 100644 > --- a/drivers/soc/mediatek/Makefile > +++ b/drivers/soc/mediatek/Makefile > @@ -1,3 +1,4 @@ > +obj-$(CONFIG_MTK_CMDQ) += mtk-cmdq-helper.o > obj-$(CONFIG_MTK_INFRACFG) += mtk-infracfg.o > obj-$(CONFIG_MTK_PMIC_WRAP) += mtk-pmic-wrap.o > obj-$(CONFIG_MTK_SCPSYS) += mtk-scpsys.o > diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c > b/drivers/soc/mediatek/mtk-cmdq-helper.c > new file mode 100644 > index 000..e4dbb7e > --- /dev/null > +++ b/drivers/soc/mediatek/mtk-cmdq-helper.c > @@ -0,0 +1,291 @@ > +// SPDX-License-Identifier: GPL-2.0 > +// > +// Copyright (c) 2018 MediaTek Inc. > + > +#include > +#include > +#include > +#include > +#include > + > +#define CMDQ_ARG_A_WRITE_MASK0x > +#define CMDQ_WRITE_ENABLE_MASK BIT(0) > +#define CMDQ_EOC_IRQ_EN BIT(0) > +#define CMDQ_EOC_CMD ((u64)((CMDQ_CODE_EOC << CMDQ_OP_CODE_SHIFT)) \ > + << 32 | CMDQ_EOC_IRQ_EN) > + > +static void cmdq_client_timeout(struct timer_list *t) > +{ > + struct cmdq_client *client = from_timer(client, t, timer); > + > + dev_err(client->client.dev, "cmdq timeout!\n"); > +} > + > +struct cmdq_client *cmdq_mbox_create(struct device *dev, int index, u32 > timeout) > +{ > + struct cmdq_client *client; > + > + client = kzalloc(sizeof(*client), GFP_KERNEL); > + if (!client) > + return (struct cmdq_client *)-ENOMEM; > + > + client->timeout_ms = timeout; > + if (timeout != CMDQ_NO_TIMEOUT) { > + spin_lock_init(>lock); > + timer_setup(>timer, cmdq_client_timeout, 0); > + } > + client->pkt_cnt = 0; > + client->client.dev = dev; > + client->client.tx_block = false; > + client->chan = mbox_request_channel(>client, index); > + > + if (IS_ERR(client->chan)) { > + long err = 0; > + > + dev_err(dev, "failed to request channel\n"); > + err = PTR_ERR(client->chan); > + kfree(client); > + > + return (struct cmdq_client *)err; > + } > + > + return client; > +} > +EXPORT_SYMBOL(cmdq_mbox_create); > + > +void cmdq_mbox_destroy(struct cmdq_client *client) > +{ > + if (client->timeout_ms != CMDQ_NO_TIMEOUT) { > + spin_lock(>lock); > + del_timer_sync(>timer); > + spin_unlock(>lock); > + } > + mbox_free_channel(client->chan); > + kfree(client); > +} > +EXPORT_SYMBOL(cmdq_mbox_destroy); > + > +int cmdq_pkt_create(struct cmdq_client *client, struct cmdq_pkt **pkt_ptr, > + size_t size) > +{ > + struct cmdq_pkt *pkt; > + struct device *dev; > + dma_addr_t dma_addr; > + > + pkt = kzalloc(sizeof(*pkt), GFP_KERNEL); > + if (!pkt) > + return -ENOMEM; > + pkt->va_base = kzalloc(size, GFP_KERNEL); > + if (!pkt->va_base) { > + kfree(pkt); > + return -ENOMEM; > + } > + pkt->buf_size = size; > + pkt->cl = (void *)client; > + > + dev = client->chan->mbox->dev; > + dma_addr = dma_map_single(dev, pkt->va_base, pkt->buf_size, > + DMA_TO_DEVICE); > + if (dma_mapping_error(dev, dma_addr)) { > + dev_err(dev, "dma map failed, size=%u\n", (u32)(u64)size); > + kfree(pkt->va_base); > + kfree(pkt); > + return -ENOMEM; > + } > + > + pkt->pa_base = dma_addr; > + *pkt_ptr = pkt; > + > + return 0; > +} >
Re: linux-next: manual merge of the block tree with the rdma tree
Hi all, On Thu, 26 Jul 2018 13:58:04 +1000 Stephen Rothwell wrote: > > Today's linux-next merge of the block tree got a conflict in: > > drivers/nvme/target/rdma.c > > between commit: > > 23f96d1f15a7 ("nvmet-rdma: Simplify ib_post_(send|recv|srq_recv)() calls") > 202093848cac ("nvmet-rdma: add an error flow for post_recv failures") > > from the rdma tree and commits: > > 2fc464e2162c ("nvmet-rdma: add unlikely check in the fast path") > > from the block tree. > > I fixed it up (see below) and can carry the fix as necessary. This > is now fixed as far as linux-next is concerned, but any non trivial > conflicts should be mentioned to your upstream maintainer when your tree > is submitted for merging. You may also want to consider cooperating > with the maintainer of the conflicting tree to minimise any particularly > complex conflicts. > > -- > Cheers, > Stephen Rothwell > > diff --cc drivers/nvme/target/rdma.c > index 1a642e214a4c,e7f43d1e1779.. > --- a/drivers/nvme/target/rdma.c > +++ b/drivers/nvme/target/rdma.c > @@@ -382,13 -435,22 +435,21 @@@ static void nvmet_rdma_free_rsps(struc > static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev, > struct nvmet_rdma_cmd *cmd) > { > -struct ib_recv_wr *bad_wr; > + int ret; > + > ib_dma_sync_single_for_device(ndev->device, > cmd->sge[0].addr, cmd->sge[0].length, > DMA_FROM_DEVICE); > > if (ndev->srq) > - return ib_post_srq_recv(ndev->srq, >wr, NULL); > - return ib_post_recv(cmd->queue->cm_id->qp, >wr, NULL); > -ret = ib_post_srq_recv(ndev->srq, >wr, _wr); > ++ret = ib_post_srq_recv(ndev->srq, >wr, NULL); > + else > -ret = ib_post_recv(cmd->queue->cm_id->qp, >wr, _wr); > ++ret = ib_post_recv(cmd->queue->cm_id->qp, >wr, NULL); > + > + if (unlikely(ret)) > + pr_err("post_recv cmd failed\n"); > + > + return ret; > } > > static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue) > @@@ -491,7 -553,7 +552,7 @@@ static void nvmet_rdma_queue_response(s > rsp->send_sge.addr, rsp->send_sge.length, > DMA_TO_DEVICE); > > - if (ib_post_send(cm_id->qp, first_wr, NULL)) { > -if (unlikely(ib_post_send(cm_id->qp, first_wr, _wr))) { > ++if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) { > pr_err("sending cmd response failed\n"); > nvmet_rdma_release_rsp(rsp); > } This is now a conflict between Linus' tree and the rdma tree. -- Cheers, Stephen Rothwell pgpIIARNDrEL8.pgp Description: OpenPGP digital signature
Re: linux-next: manual merge of the block tree with the rdma tree
Hi all, On Thu, 26 Jul 2018 13:58:04 +1000 Stephen Rothwell wrote: > > Today's linux-next merge of the block tree got a conflict in: > > drivers/nvme/target/rdma.c > > between commit: > > 23f96d1f15a7 ("nvmet-rdma: Simplify ib_post_(send|recv|srq_recv)() calls") > 202093848cac ("nvmet-rdma: add an error flow for post_recv failures") > > from the rdma tree and commits: > > 2fc464e2162c ("nvmet-rdma: add unlikely check in the fast path") > > from the block tree. > > I fixed it up (see below) and can carry the fix as necessary. This > is now fixed as far as linux-next is concerned, but any non trivial > conflicts should be mentioned to your upstream maintainer when your tree > is submitted for merging. You may also want to consider cooperating > with the maintainer of the conflicting tree to minimise any particularly > complex conflicts. > > -- > Cheers, > Stephen Rothwell > > diff --cc drivers/nvme/target/rdma.c > index 1a642e214a4c,e7f43d1e1779.. > --- a/drivers/nvme/target/rdma.c > +++ b/drivers/nvme/target/rdma.c > @@@ -382,13 -435,22 +435,21 @@@ static void nvmet_rdma_free_rsps(struc > static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev, > struct nvmet_rdma_cmd *cmd) > { > -struct ib_recv_wr *bad_wr; > + int ret; > + > ib_dma_sync_single_for_device(ndev->device, > cmd->sge[0].addr, cmd->sge[0].length, > DMA_FROM_DEVICE); > > if (ndev->srq) > - return ib_post_srq_recv(ndev->srq, >wr, NULL); > - return ib_post_recv(cmd->queue->cm_id->qp, >wr, NULL); > -ret = ib_post_srq_recv(ndev->srq, >wr, _wr); > ++ret = ib_post_srq_recv(ndev->srq, >wr, NULL); > + else > -ret = ib_post_recv(cmd->queue->cm_id->qp, >wr, _wr); > ++ret = ib_post_recv(cmd->queue->cm_id->qp, >wr, NULL); > + > + if (unlikely(ret)) > + pr_err("post_recv cmd failed\n"); > + > + return ret; > } > > static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue) > @@@ -491,7 -553,7 +552,7 @@@ static void nvmet_rdma_queue_response(s > rsp->send_sge.addr, rsp->send_sge.length, > DMA_TO_DEVICE); > > - if (ib_post_send(cm_id->qp, first_wr, NULL)) { > -if (unlikely(ib_post_send(cm_id->qp, first_wr, _wr))) { > ++if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) { > pr_err("sending cmd response failed\n"); > nvmet_rdma_release_rsp(rsp); > } This is now a conflict between Linus' tree and the rdma tree. -- Cheers, Stephen Rothwell pgpIIARNDrEL8.pgp Description: OpenPGP digital signature
Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum SPI controller documentation
Hi Rob, On 15 August 2018 at 04:21, Rob Herring wrote: > On Tue, Aug 07, 2018 at 06:43:37PM +0800, Baolin Wang wrote: >> From: Lanqing Liu >> >> This patch adds the binding documentation for Spreadtrum SPI >> controller device. >> >> Signed-off-by: Lanqing Liu >> Signed-off-by: Baolin Wang >> --- >> Documentation/devicetree/bindings/spi/spi-sprd.txt | 31 >> >> 1 file changed, 31 insertions(+) >> create mode 100644 Documentation/devicetree/bindings/spi/spi-sprd.txt >> >> diff --git a/Documentation/devicetree/bindings/spi/spi-sprd.txt >> b/Documentation/devicetree/bindings/spi/spi-sprd.txt >> new file mode 100644 >> index 000..06ff746 >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/spi/spi-sprd.txt >> @@ -0,0 +1,31 @@ >> +Spreadtrum SPI Controller >> + >> +Required properties: >> +- compatible: Should be "sprd,sc9860-spi". >> +- reg: Offset and length of SPI controller register space. >> +- interrupts: Should contain SPI interrupt. >> +- clock-names: Should contain following entries: >> + "spi" for SPI clock, >> + "source" for SPI source (parent) clock, > > Do the h/w block actually get this clock or the driver needs it? In the > latter case, it should not be in DT. Yes, we will get the actual SPI source clock form the "source" clock property. > >> + "enable" for SPI module enable clock. >> +- clocks: List of clock input name strings sorted in the same order >> + as the clock-names property. >> +- #address-cells: The number of cells required to define a chip select >> + address on the SPI bus. Should be set to 1. >> +- #size-cells: Should be set to 0. >> + >> +Optional properties: >> +- sprd,spi-interval: Specify the intervals of two SPI frames, which can be >> + converted to the delay clock cycles = interval number * 4 + 10. > > There are read and write delay properties you can use. Right. We've agreed introducing another field for spi_transfer to represent this and will remove 'sprd,spi-interval' from DT. -- Baolin Wang Best Regards
Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum SPI controller documentation
Hi Rob, On 15 August 2018 at 04:21, Rob Herring wrote: > On Tue, Aug 07, 2018 at 06:43:37PM +0800, Baolin Wang wrote: >> From: Lanqing Liu >> >> This patch adds the binding documentation for Spreadtrum SPI >> controller device. >> >> Signed-off-by: Lanqing Liu >> Signed-off-by: Baolin Wang >> --- >> Documentation/devicetree/bindings/spi/spi-sprd.txt | 31 >> >> 1 file changed, 31 insertions(+) >> create mode 100644 Documentation/devicetree/bindings/spi/spi-sprd.txt >> >> diff --git a/Documentation/devicetree/bindings/spi/spi-sprd.txt >> b/Documentation/devicetree/bindings/spi/spi-sprd.txt >> new file mode 100644 >> index 000..06ff746 >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/spi/spi-sprd.txt >> @@ -0,0 +1,31 @@ >> +Spreadtrum SPI Controller >> + >> +Required properties: >> +- compatible: Should be "sprd,sc9860-spi". >> +- reg: Offset and length of SPI controller register space. >> +- interrupts: Should contain SPI interrupt. >> +- clock-names: Should contain following entries: >> + "spi" for SPI clock, >> + "source" for SPI source (parent) clock, > > Do the h/w block actually get this clock or the driver needs it? In the > latter case, it should not be in DT. Yes, we will get the actual SPI source clock form the "source" clock property. > >> + "enable" for SPI module enable clock. >> +- clocks: List of clock input name strings sorted in the same order >> + as the clock-names property. >> +- #address-cells: The number of cells required to define a chip select >> + address on the SPI bus. Should be set to 1. >> +- #size-cells: Should be set to 0. >> + >> +Optional properties: >> +- sprd,spi-interval: Specify the intervals of two SPI frames, which can be >> + converted to the delay clock cycles = interval number * 4 + 10. > > There are read and write delay properties you can use. Right. We've agreed introducing another field for spi_transfer to represent this and will remove 'sprd,spi-interval' from DT. -- Baolin Wang Best Regards
Re: linux-next: build failure after merge of the block tree
Hi all, On Thu, 26 Jul 2018 14:56:24 +1000 Stephen Rothwell wrote: > > After merging the block tree, today's linux-next build (x86_64 > allmodconfig) failed like this: > > drivers/nvme/target/rdma.c: In function 'nvmet_rdma_find_get_device': > drivers/nvme/target/rdma.c:894:26: error: 'struct ib_device_attr' has no > member named 'max_sge'; did you mean 'max_cqe'? > cm_id->device->attrs.max_sge) - 1; > ^ > drivers/nvme/target/rdma.c:893:21: note: in expansion of macro 'max' > inline_sge_count = max(cm_id->device->attrs.max_sge_rd, > ^~~ > In file included from include/linux/list.h:9:0, > from include/linux/module.h:9, > from drivers/nvme/host/rdma.c:15: > drivers/nvme/host/rdma.c: In function 'nvme_rdma_find_get_device': > drivers/nvme/host/rdma.c:381:23: error: 'struct ib_device_attr' has no member > named 'max_sge'; did you mean 'max_cqe'? > ndev->dev->attrs.max_sge - 1); >^ > drivers/nvme/host/rdma.c:380:30: note: in expansion of macro 'min' > ndev->num_inline_segments = min(NVME_RDMA_MAX_INLINE_SEGMENTS, > ^~~ > > Caused by commits > > 64a741c1eaa8 ("nvme-rdma: support up to 4 segments of inline data") > 0d5ee2b2ab4f ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data") > > interacting with commit > > 33023fb85a42 ("IB/core: add max_send_sge and max_recv_sge attributes") > > from the rdma tree. > > I have applied the following merge fix patch for today: > > From: Stephen Rothwell > Date: Thu, 26 Jul 2018 14:32:15 +1000 > Subject: [PATCH] nvme-dma: merge fix up for replacement of max_sge > > Signed-off-by: Stephen Rothwell > --- > drivers/nvme/host/rdma.c | 2 +- > drivers/nvme/target/rdma.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c > index cfa0319fcd1c..368fe5ac0c6b 100644 > --- a/drivers/nvme/host/rdma.c > +++ b/drivers/nvme/host/rdma.c > @@ -378,7 +378,7 @@ nvme_rdma_find_get_device(struct rdma_cm_id *cm_id) > } > > ndev->num_inline_segments = min(NVME_RDMA_MAX_INLINE_SEGMENTS, > - ndev->dev->attrs.max_sge - 1); > + ndev->dev->attrs.max_send_sge - 1); > list_add(>entry, _list); > out_unlock: > mutex_unlock(_list_mutex); > diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c > index 86121a7a19b2..8c30ac7d8078 100644 > --- a/drivers/nvme/target/rdma.c > +++ b/drivers/nvme/target/rdma.c > @@ -891,7 +891,7 @@ nvmet_rdma_find_get_device(struct rdma_cm_id *cm_id) > > inline_page_count = num_pages(port->inline_data_size); > inline_sge_count = max(cm_id->device->attrs.max_sge_rd, > - cm_id->device->attrs.max_sge) - 1; > + cm_id->device->attrs.max_send_sge) - 1; > if (inline_page_count > inline_sge_count) { > pr_warn("inline_data_size %d cannot be supported by device %s. > Reducing to %lu.\n", > port->inline_data_size, cm_id->device->name, > -- > 2.18.0 This is now needed (with the max_send_sge -> max_recv_sge update in drivers/nvme/target/rdma.c patch) when I merge the rdma tree with Linus' tree. -- Cheers, Stephen Rothwell pgp8nEyBD3iLU.pgp Description: OpenPGP digital signature
Re: linux-next: build failure after merge of the block tree
Hi all, On Thu, 26 Jul 2018 14:56:24 +1000 Stephen Rothwell wrote: > > After merging the block tree, today's linux-next build (x86_64 > allmodconfig) failed like this: > > drivers/nvme/target/rdma.c: In function 'nvmet_rdma_find_get_device': > drivers/nvme/target/rdma.c:894:26: error: 'struct ib_device_attr' has no > member named 'max_sge'; did you mean 'max_cqe'? > cm_id->device->attrs.max_sge) - 1; > ^ > drivers/nvme/target/rdma.c:893:21: note: in expansion of macro 'max' > inline_sge_count = max(cm_id->device->attrs.max_sge_rd, > ^~~ > In file included from include/linux/list.h:9:0, > from include/linux/module.h:9, > from drivers/nvme/host/rdma.c:15: > drivers/nvme/host/rdma.c: In function 'nvme_rdma_find_get_device': > drivers/nvme/host/rdma.c:381:23: error: 'struct ib_device_attr' has no member > named 'max_sge'; did you mean 'max_cqe'? > ndev->dev->attrs.max_sge - 1); >^ > drivers/nvme/host/rdma.c:380:30: note: in expansion of macro 'min' > ndev->num_inline_segments = min(NVME_RDMA_MAX_INLINE_SEGMENTS, > ^~~ > > Caused by commits > > 64a741c1eaa8 ("nvme-rdma: support up to 4 segments of inline data") > 0d5ee2b2ab4f ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data") > > interacting with commit > > 33023fb85a42 ("IB/core: add max_send_sge and max_recv_sge attributes") > > from the rdma tree. > > I have applied the following merge fix patch for today: > > From: Stephen Rothwell > Date: Thu, 26 Jul 2018 14:32:15 +1000 > Subject: [PATCH] nvme-dma: merge fix up for replacement of max_sge > > Signed-off-by: Stephen Rothwell > --- > drivers/nvme/host/rdma.c | 2 +- > drivers/nvme/target/rdma.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c > index cfa0319fcd1c..368fe5ac0c6b 100644 > --- a/drivers/nvme/host/rdma.c > +++ b/drivers/nvme/host/rdma.c > @@ -378,7 +378,7 @@ nvme_rdma_find_get_device(struct rdma_cm_id *cm_id) > } > > ndev->num_inline_segments = min(NVME_RDMA_MAX_INLINE_SEGMENTS, > - ndev->dev->attrs.max_sge - 1); > + ndev->dev->attrs.max_send_sge - 1); > list_add(>entry, _list); > out_unlock: > mutex_unlock(_list_mutex); > diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c > index 86121a7a19b2..8c30ac7d8078 100644 > --- a/drivers/nvme/target/rdma.c > +++ b/drivers/nvme/target/rdma.c > @@ -891,7 +891,7 @@ nvmet_rdma_find_get_device(struct rdma_cm_id *cm_id) > > inline_page_count = num_pages(port->inline_data_size); > inline_sge_count = max(cm_id->device->attrs.max_sge_rd, > - cm_id->device->attrs.max_sge) - 1; > + cm_id->device->attrs.max_send_sge) - 1; > if (inline_page_count > inline_sge_count) { > pr_warn("inline_data_size %d cannot be supported by device %s. > Reducing to %lu.\n", > port->inline_data_size, cm_id->device->name, > -- > 2.18.0 This is now needed (with the max_send_sge -> max_recv_sge update in drivers/nvme/target/rdma.c patch) when I merge the rdma tree with Linus' tree. -- Cheers, Stephen Rothwell pgp8nEyBD3iLU.pgp Description: OpenPGP digital signature
Re: [PATCH] Bugfix for handling of shadow doorbell buffer.
On 08/14/2018 04:16 PM, Linus Torvalds wrote: > On Tue, Aug 14, 2018 at 03:17:35PM -0700, Michal Wnukowski wrote: >> >> With memory barrier in place, the volatile keyword around *dbbuf_ei is >> redundant. > > No. The memory barrier enforces _ordering_, but it doesn't enforce > that the accesses are only done once. So when you do > >> *dbbuf_db = value; > > to write to dbbuf_db, and > >>*dbbuf_ei > > to read from dbbuf_ei, without the volatile the write (or the read) > could be done multiple times, which can cause serious confusion. > I got confused after comaring disassembly of this code with and without volatile keyword. Thanks for the correction. > > However, there's a more serious problem with your patch: > >> + /* >> + * Ensure that the doorbell is updated before reading >> + * the EventIdx from memory >> + */ >> + mb(); > > Good comment. Except what about the other side? > > When you use memory ordering rules, as opposed to locking, there's > always *two* sides to any access order. There's this "write dbbuf_db" > vs "read dbbuf_ei" ordering. > > But there's the other side: what about the side that writes dbbuf_ei, > and reads dbbuf_db? > > I'm assuming that's the actual controller hardware, but it needs a > comment about *that* access being ordered too, because if it isn't, > then ordering this side is pointless. > The other side in this case is not actual controller hardware, but virtual one (the regular hardware should rely on normal MMIO doorbells). I spent some time going through the code of internal hypervisor and double-checking all guarantees around memory access before asking the same question: "what about the other side?". This execution ordering is mentioned in NVMe spec under "Controller Architecture", and it turned out that the NVMe driver itself had missing memory barrier. Thanks, Michal
Re: [PATCH] Bugfix for handling of shadow doorbell buffer.
On 08/14/2018 04:16 PM, Linus Torvalds wrote: > On Tue, Aug 14, 2018 at 03:17:35PM -0700, Michal Wnukowski wrote: >> >> With memory barrier in place, the volatile keyword around *dbbuf_ei is >> redundant. > > No. The memory barrier enforces _ordering_, but it doesn't enforce > that the accesses are only done once. So when you do > >> *dbbuf_db = value; > > to write to dbbuf_db, and > >>*dbbuf_ei > > to read from dbbuf_ei, without the volatile the write (or the read) > could be done multiple times, which can cause serious confusion. > I got confused after comaring disassembly of this code with and without volatile keyword. Thanks for the correction. > > However, there's a more serious problem with your patch: > >> + /* >> + * Ensure that the doorbell is updated before reading >> + * the EventIdx from memory >> + */ >> + mb(); > > Good comment. Except what about the other side? > > When you use memory ordering rules, as opposed to locking, there's > always *two* sides to any access order. There's this "write dbbuf_db" > vs "read dbbuf_ei" ordering. > > But there's the other side: what about the side that writes dbbuf_ei, > and reads dbbuf_db? > > I'm assuming that's the actual controller hardware, but it needs a > comment about *that* access being ordered too, because if it isn't, > then ordering this side is pointless. > The other side in this case is not actual controller hardware, but virtual one (the regular hardware should rely on normal MMIO doorbells). I spent some time going through the code of internal hypervisor and double-checking all guarantees around memory access before asking the same question: "what about the other side?". This execution ordering is mentioned in NVMe spec under "Controller Architecture", and it turned out that the NVMe driver itself had missing memory barrier. Thanks, Michal
Re: [PATCH 1/4] regulator: core: If consumers don't call regulator_set_load() assume max
On 08/14/2018 04:56 PM, Doug Anderson wrote: > On Tue, Aug 14, 2018 at 2:59 PM, David Collins > wrote: >> On 08/14/2018 01:03 PM, Doug Anderson wrote: >>> On Tue, Aug 14, 2018 at 11:30 AM, David Collins >>> wrote:>>> --- a/drivers/regulator/core.c > +++ b/drivers/regulator/core.c > @@ -732,6 +732,7 @@ static int drms_uA_update(struct regulator_dev *rdev) > struct regulator *sibling; > int current_uA = 0, output_uV, input_uV, err; > unsigned int mode; > + bool any_unset = false; > > lockdep_assert_held_once(>mutex); > > @@ -751,11 +752,17 @@ static int drms_uA_update(struct regulator_dev > *rdev) > return -EINVAL; > > /* calc total requested load */ > - list_for_each_entry(sibling, >consumer_list, list) > + list_for_each_entry(sibling, >consumer_list, list) { > current_uA += sibling->uA_load; > + if (!sibling->uA_load_set) > + any_unset = true; > + } > > current_uA += rdev->constraints->system_load; > > + if (any_unset) > + current_uA = INT_MAX; > + This check will incorrectly result in a constant load request of INT_MAX for all regulators that have at least one child regulator. This is the case because such child regulators are present in rdev->consumer_list and because regulator_set_load() requests are not propagated up to parent regulators. Thus, the regulator structs for child regulators will always have uA_load==0 and uA_load_set==false. >>> >>> Ah, interesting. >>> >>> ...but doesn't this same bug exist anyway, just in the opposite >>> direction? Without my patch we'll try to request a 0 mA load in this >>> case which seems just as wrong (or perhaps worse?). I guess on RPMh >>> regulator you're "saved" because the RPMh hardware itself knows the >>> parent/child relationship and knows to ignore this 0 mA load, but it's >>> still a bug in the overall Linux framework... >> >> I'm not sure what existing bug you are referring to here. Where is a 0 mA >> load being requested? Could you list the consumer function calls and the >> behavior on the framework side that you're envisioning? > > Imagine you've got a tree like this: > > - regulatorParent (1.8 V) > - regulatorChildA (1.7 V) > - regulatorChildB (1.2 V) > - regulatorChildC (1.5 V) > > ...and this set of calls: > > regulator_set_load(regulatorChildA, 1000); > regulator_set_load(regulatorChildB, 2000); > regulator_set_load(regulatorChildC, 3000); > > regulator_enable(regulatorChildA); > regulator_enable(regulatorChildB); > regulator_enable(regulatorChildC); > > > With the existing code in Mark's tree then ChildA / ChildB / ChildC > will presumably have a load high enough to be in high power mode. > However, as you said, loads aren't propagated. ...so "Parent" will > see no load requested (which translates to a load request of 0 mA). > > The "bug" is that when you did the > "regulator_enable(regulatorChildA);" then that will propagate up and > cause a "regulator_enable(regulatorParent)". In _regulator_enable() > you can see drms_uA_update(). That will cause it to set the mode of > regulatorParent based on a load of 0 mA. That is the bug. You may > respond "but REGULATOR_CHANGE_DRMS isn't set for the parent! See > below and for now imagine that REGULATOR_CHANGE_DRMS _is_ set for the > parent. > > With my code in the same situation the parent will end up with a load > of "MAX_INT" mA which I think is the bug you pointed out. ...while it > would be good to fix this it does seem to be better to set the parent > to a mode based on MAX_INT mA rather than 0 mA? I agree that the existing behavior resulting from a lack of load current propagation from child to parent regulators is wrong. However, I do not agree that replacing it with something else that results in different wrong behavior is the way forward. Adding load current propagation would resolve your 0 mA concern but does not necessary require making MAX_INT the default for all consumers. >>> I have no idea how we ought to propagate regulator_set_load() to >>> parents though. That seems like a tough thing to try to handle >>> automagically. >> >> I attempted it seven years ago with two revisions of a patch series [1] >> and [2]. The feature wasn't completed though. Perhaps something along >> the same lines could be reintroduced now that child regulators are handled >> the same way as ordinary consumers within the regulator framework. >> >> [1]: https://lkml.org/lkml/2011/3/28/246 >> [2]: https://lkml.org/lkml/2011/3/28/530 > > Thanks for the links. My first instinct is just what Mark said there: > you can't really take the child load and map it accurately to a parent > load without loads of per-device complexity and even then you'd > probably get nothing more than an approximation. > > IMO about the
Re: [PATCH 1/4] regulator: core: If consumers don't call regulator_set_load() assume max
On 08/14/2018 04:56 PM, Doug Anderson wrote: > On Tue, Aug 14, 2018 at 2:59 PM, David Collins > wrote: >> On 08/14/2018 01:03 PM, Doug Anderson wrote: >>> On Tue, Aug 14, 2018 at 11:30 AM, David Collins >>> wrote:>>> --- a/drivers/regulator/core.c > +++ b/drivers/regulator/core.c > @@ -732,6 +732,7 @@ static int drms_uA_update(struct regulator_dev *rdev) > struct regulator *sibling; > int current_uA = 0, output_uV, input_uV, err; > unsigned int mode; > + bool any_unset = false; > > lockdep_assert_held_once(>mutex); > > @@ -751,11 +752,17 @@ static int drms_uA_update(struct regulator_dev > *rdev) > return -EINVAL; > > /* calc total requested load */ > - list_for_each_entry(sibling, >consumer_list, list) > + list_for_each_entry(sibling, >consumer_list, list) { > current_uA += sibling->uA_load; > + if (!sibling->uA_load_set) > + any_unset = true; > + } > > current_uA += rdev->constraints->system_load; > > + if (any_unset) > + current_uA = INT_MAX; > + This check will incorrectly result in a constant load request of INT_MAX for all regulators that have at least one child regulator. This is the case because such child regulators are present in rdev->consumer_list and because regulator_set_load() requests are not propagated up to parent regulators. Thus, the regulator structs for child regulators will always have uA_load==0 and uA_load_set==false. >>> >>> Ah, interesting. >>> >>> ...but doesn't this same bug exist anyway, just in the opposite >>> direction? Without my patch we'll try to request a 0 mA load in this >>> case which seems just as wrong (or perhaps worse?). I guess on RPMh >>> regulator you're "saved" because the RPMh hardware itself knows the >>> parent/child relationship and knows to ignore this 0 mA load, but it's >>> still a bug in the overall Linux framework... >> >> I'm not sure what existing bug you are referring to here. Where is a 0 mA >> load being requested? Could you list the consumer function calls and the >> behavior on the framework side that you're envisioning? > > Imagine you've got a tree like this: > > - regulatorParent (1.8 V) > - regulatorChildA (1.7 V) > - regulatorChildB (1.2 V) > - regulatorChildC (1.5 V) > > ...and this set of calls: > > regulator_set_load(regulatorChildA, 1000); > regulator_set_load(regulatorChildB, 2000); > regulator_set_load(regulatorChildC, 3000); > > regulator_enable(regulatorChildA); > regulator_enable(regulatorChildB); > regulator_enable(regulatorChildC); > > > With the existing code in Mark's tree then ChildA / ChildB / ChildC > will presumably have a load high enough to be in high power mode. > However, as you said, loads aren't propagated. ...so "Parent" will > see no load requested (which translates to a load request of 0 mA). > > The "bug" is that when you did the > "regulator_enable(regulatorChildA);" then that will propagate up and > cause a "regulator_enable(regulatorParent)". In _regulator_enable() > you can see drms_uA_update(). That will cause it to set the mode of > regulatorParent based on a load of 0 mA. That is the bug. You may > respond "but REGULATOR_CHANGE_DRMS isn't set for the parent! See > below and for now imagine that REGULATOR_CHANGE_DRMS _is_ set for the > parent. > > With my code in the same situation the parent will end up with a load > of "MAX_INT" mA which I think is the bug you pointed out. ...while it > would be good to fix this it does seem to be better to set the parent > to a mode based on MAX_INT mA rather than 0 mA? I agree that the existing behavior resulting from a lack of load current propagation from child to parent regulators is wrong. However, I do not agree that replacing it with something else that results in different wrong behavior is the way forward. Adding load current propagation would resolve your 0 mA concern but does not necessary require making MAX_INT the default for all consumers. >>> I have no idea how we ought to propagate regulator_set_load() to >>> parents though. That seems like a tough thing to try to handle >>> automagically. >> >> I attempted it seven years ago with two revisions of a patch series [1] >> and [2]. The feature wasn't completed though. Perhaps something along >> the same lines could be reintroduced now that child regulators are handled >> the same way as ordinary consumers within the regulator framework. >> >> [1]: https://lkml.org/lkml/2011/3/28/246 >> [2]: https://lkml.org/lkml/2011/3/28/530 > > Thanks for the links. My first instinct is just what Mark said there: > you can't really take the child load and map it accurately to a parent > load without loads of per-device complexity and even then you'd > probably get nothing more than an approximation. > > IMO about the
[PATCH v5 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook
.flush_iotlb_all can not just wait for previous tlbi operations to be completed, but should also invalid all TLBs of the related domain. Signed-off-by: Zhen Lei Reviewed-by: Robin Murphy --- drivers/iommu/arm-smmu-v3.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 1d64710..4402187 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -1770,6 +1770,14 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, return ops->unmap(ops, iova, size); } +static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + + if (smmu_domain->smmu) + arm_smmu_tlb_inv_context(smmu_domain); +} + static void arm_smmu_iotlb_sync(struct iommu_domain *domain) { struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu; @@ -1998,7 +2006,7 @@ static void arm_smmu_put_resv_regions(struct device *dev, .map= arm_smmu_map, .unmap = arm_smmu_unmap, .map_sg = default_iommu_map_sg, - .flush_iotlb_all= arm_smmu_iotlb_sync, + .flush_iotlb_all= arm_smmu_flush_iotlb_all, .iotlb_sync = arm_smmu_iotlb_sync, .iova_to_phys = arm_smmu_iova_to_phys, .add_device = arm_smmu_add_device, -- 1.8.3
[PATCH v5 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook
.flush_iotlb_all can not just wait for previous tlbi operations to be completed, but should also invalid all TLBs of the related domain. Signed-off-by: Zhen Lei Reviewed-by: Robin Murphy --- drivers/iommu/arm-smmu-v3.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 1d64710..4402187 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -1770,6 +1770,14 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, return ops->unmap(ops, iova, size); } +static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + + if (smmu_domain->smmu) + arm_smmu_tlb_inv_context(smmu_domain); +} + static void arm_smmu_iotlb_sync(struct iommu_domain *domain) { struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu; @@ -1998,7 +2006,7 @@ static void arm_smmu_put_resv_regions(struct device *dev, .map= arm_smmu_map, .unmap = arm_smmu_unmap, .map_sg = default_iommu_map_sg, - .flush_iotlb_all= arm_smmu_iotlb_sync, + .flush_iotlb_all= arm_smmu_flush_iotlb_all, .iotlb_sync = arm_smmu_iotlb_sync, .iova_to_phys = arm_smmu_iova_to_phys, .add_device = arm_smmu_add_device, -- 1.8.3
Re: How delete node or property in overlayd dts?
On 08/14/18 07:46, 张波 wrote: > /delete-node/ /delete-prop/ could be used in dtsi files without device > tree overlay. > > but with device tree overlay, /delete-node/ and /delete-prop/ are not work. > How to delete property and node in overlay dts? > > for example, > in basel.dts have following node > node1 { > property1; > property3; > node2 { > property2; > } > } > > in overlay.dts as following > node1 { > /delete-property/ property1; > /delete-node/ node2; > } > > after overlay, property1 and node2 is not deleted. > The /delete-node/ and /delete-prop/ directives are only used by the dtc compiler within a single compilation. There is nothing in the format of a devicetree blob to represent the notion of deleting a property or a node. You can not delete a property or a node in an overlay dtb.
Re: How delete node or property in overlayd dts?
On 08/14/18 07:46, 张波 wrote: > /delete-node/ /delete-prop/ could be used in dtsi files without device > tree overlay. > > but with device tree overlay, /delete-node/ and /delete-prop/ are not work. > How to delete property and node in overlay dts? > > for example, > in basel.dts have following node > node1 { > property1; > property3; > node2 { > property2; > } > } > > in overlay.dts as following > node1 { > /delete-property/ property1; > /delete-node/ node2; > } > > after overlay, property1 and node2 is not deleted. > The /delete-node/ and /delete-prop/ directives are only used by the dtc compiler within a single compilation. There is nothing in the format of a devicetree blob to represent the notion of deleting a property or a node. You can not delete a property or a node in an overlay dtb.
Re: [RFC PATCH 1/2] mm: rework memcg kernel stack accounting
On Tue, Aug 14, 2018 at 5:37 PM Roman Gushchin wrote: > > If CONFIG_VMAP_STACK is set, kernel stacks are allocated > using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel > stack pages are charged against corresponding memory cgroups > on allocation and uncharged on releasing them. > > The problem is that we do cache kernel stacks in small > per-cpu caches and do reuse them for new tasks, which can > belong to different memory cgroups. > > Each stack page still holds a reference to the original cgroup, > so the cgroup can't be released until the vmap area is released. > > To make this happen we need more than two subsequent exits > without forks in between on the current cpu, which makes it > very unlikely to happen. As a result, I saw a significant number > of dying cgroups (in theory, up to 2 * number_of_cpu + > number_of_tasks), which can't be released even by significant > memory pressure. > > As a cgroup structure can take a significant amount of memory > (first of all, per-cpu data like memcg statistics), it leads > to a noticeable waste of memory. > > Signed-off-by: Roman Gushchin I was also looking into this issue. I was thinking of having a per-memcg per-cpu stack cache. However this solution seems much simpler. Can you also add the performance number for a similar simple benchmark done in ac496bf48d97 ("fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y"). Reviewed-by: Shakeel Butt > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: Andy Lutomirski > Cc: Konstantin Khlebnikov > Cc: Tejun Heo > --- > kernel/fork.c | 44 ++-- > 1 file changed, 38 insertions(+), 6 deletions(-) > > diff --git a/kernel/fork.c b/kernel/fork.c > index 69b6fea5a181..91872b2b37bd 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -224,9 +224,14 @@ static unsigned long *alloc_thread_stack_node(struct > task_struct *tsk, int node) > return s->addr; > } > > + /* > +* Allocated stacks are cached and later reused by new threads, > +* so memcg accounting is performed manually on assigning/releasing > +* stacks to tasks. Drop __GFP_ACCOUNT. > +*/ > stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN, > VMALLOC_START, VMALLOC_END, > -THREADINFO_GFP, > +THREADINFO_GFP & ~__GFP_ACCOUNT, > PAGE_KERNEL, > 0, node, __builtin_return_address(0)); > > @@ -246,12 +251,41 @@ static unsigned long *alloc_thread_stack_node(struct > task_struct *tsk, int node) > #endif > } > > +static void memcg_charge_kernel_stack(struct task_struct *tsk) > +{ > +#ifdef CONFIG_VMAP_STACK > + struct vm_struct *vm = task_stack_vm_area(tsk); > + > + if (vm) { > + int i; > + > + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) > + memcg_kmem_charge(vm->pages[i], __GFP_NOFAIL, > + compound_order(vm->pages[i])); > + > + /* All stack pages belong to the same memcg. */ > + mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, > +THREAD_SIZE / 1024); > + } > +#endif > +} > + > static inline void free_thread_stack(struct task_struct *tsk) > { > #ifdef CONFIG_VMAP_STACK > - if (task_stack_vm_area(tsk)) { > + struct vm_struct *vm = task_stack_vm_area(tsk); > + > + if (vm) { > int i; > > + /* All stack pages belong to the same memcg. */ > + mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, > +-(int)(THREAD_SIZE / 1024)); > + > + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) > + memcg_kmem_uncharge(vm->pages[i], > + compound_order(vm->pages[i])); > + > for (i = 0; i < NR_CACHED_STACKS; i++) { > if (this_cpu_cmpxchg(cached_stacks[i], > NULL, tsk->stack_vm_area) != NULL) > @@ -352,10 +386,6 @@ static void account_kernel_stack(struct task_struct > *tsk, int account) > NR_KERNEL_STACK_KB, > PAGE_SIZE / 1024 * account); > } > - > - /* All stack pages belong to the same memcg. */ > - mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, > -account * (THREAD_SIZE / 1024)); > } else { > /* > * All stack pages are in the same zone and belong to the > @@ -809,6 +839,8 @@ static struct task_struct *dup_task_struct(struct > task_struct *orig, int node) > if (!stack) >
Re: [RFC PATCH 1/2] mm: rework memcg kernel stack accounting
On Tue, Aug 14, 2018 at 5:37 PM Roman Gushchin wrote: > > If CONFIG_VMAP_STACK is set, kernel stacks are allocated > using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel > stack pages are charged against corresponding memory cgroups > on allocation and uncharged on releasing them. > > The problem is that we do cache kernel stacks in small > per-cpu caches and do reuse them for new tasks, which can > belong to different memory cgroups. > > Each stack page still holds a reference to the original cgroup, > so the cgroup can't be released until the vmap area is released. > > To make this happen we need more than two subsequent exits > without forks in between on the current cpu, which makes it > very unlikely to happen. As a result, I saw a significant number > of dying cgroups (in theory, up to 2 * number_of_cpu + > number_of_tasks), which can't be released even by significant > memory pressure. > > As a cgroup structure can take a significant amount of memory > (first of all, per-cpu data like memcg statistics), it leads > to a noticeable waste of memory. > > Signed-off-by: Roman Gushchin I was also looking into this issue. I was thinking of having a per-memcg per-cpu stack cache. However this solution seems much simpler. Can you also add the performance number for a similar simple benchmark done in ac496bf48d97 ("fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y"). Reviewed-by: Shakeel Butt > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: Andy Lutomirski > Cc: Konstantin Khlebnikov > Cc: Tejun Heo > --- > kernel/fork.c | 44 ++-- > 1 file changed, 38 insertions(+), 6 deletions(-) > > diff --git a/kernel/fork.c b/kernel/fork.c > index 69b6fea5a181..91872b2b37bd 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -224,9 +224,14 @@ static unsigned long *alloc_thread_stack_node(struct > task_struct *tsk, int node) > return s->addr; > } > > + /* > +* Allocated stacks are cached and later reused by new threads, > +* so memcg accounting is performed manually on assigning/releasing > +* stacks to tasks. Drop __GFP_ACCOUNT. > +*/ > stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN, > VMALLOC_START, VMALLOC_END, > -THREADINFO_GFP, > +THREADINFO_GFP & ~__GFP_ACCOUNT, > PAGE_KERNEL, > 0, node, __builtin_return_address(0)); > > @@ -246,12 +251,41 @@ static unsigned long *alloc_thread_stack_node(struct > task_struct *tsk, int node) > #endif > } > > +static void memcg_charge_kernel_stack(struct task_struct *tsk) > +{ > +#ifdef CONFIG_VMAP_STACK > + struct vm_struct *vm = task_stack_vm_area(tsk); > + > + if (vm) { > + int i; > + > + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) > + memcg_kmem_charge(vm->pages[i], __GFP_NOFAIL, > + compound_order(vm->pages[i])); > + > + /* All stack pages belong to the same memcg. */ > + mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, > +THREAD_SIZE / 1024); > + } > +#endif > +} > + > static inline void free_thread_stack(struct task_struct *tsk) > { > #ifdef CONFIG_VMAP_STACK > - if (task_stack_vm_area(tsk)) { > + struct vm_struct *vm = task_stack_vm_area(tsk); > + > + if (vm) { > int i; > > + /* All stack pages belong to the same memcg. */ > + mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, > +-(int)(THREAD_SIZE / 1024)); > + > + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) > + memcg_kmem_uncharge(vm->pages[i], > + compound_order(vm->pages[i])); > + > for (i = 0; i < NR_CACHED_STACKS; i++) { > if (this_cpu_cmpxchg(cached_stacks[i], > NULL, tsk->stack_vm_area) != NULL) > @@ -352,10 +386,6 @@ static void account_kernel_stack(struct task_struct > *tsk, int account) > NR_KERNEL_STACK_KB, > PAGE_SIZE / 1024 * account); > } > - > - /* All stack pages belong to the same memcg. */ > - mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, > -account * (THREAD_SIZE / 1024)); > } else { > /* > * All stack pages are in the same zone and belong to the > @@ -809,6 +839,8 @@ static struct task_struct *dup_task_struct(struct > task_struct *orig, int node) > if (!stack) >
Re: [BUG] kernel: rcu: a possible sleep-in-atomic-context bug in srcu_read_delay()
On 2018/8/13 20:42, Paul E. McKenney wrote: On Mon, Aug 13, 2018 at 05:26:49PM +0800, Jia-Ju Bai wrote: On 2018/8/13 12:18, Paul E. McKenney wrote: On Mon, Aug 13, 2018 at 11:04:10AM +0800, Jia-Ju Bai wrote: The kernel may sleep with holding a spinlock. The function call paths (from bottom to top) in Linux-4.16 are: [FUNC] schedule_timeout_interruptible kernel/rcu/rcutorture.c, 523: schedule_timeout_interruptible in srcu_read_delay kernel/rcu/rcutorture.c, 1105: [FUNC_PTR]srcu_read_delay in rcu_torture_timer kernel/rcu/rcutorture.c, 1104: spin_lock in rcu_torture_timer Note that [FUNC_PTR] means a function pointer call is used. I do not find a good way to fix, so I only report. This is found by my static analysis tool (DSAC). Interesting. I would have expected to have gotten a "scheduling while atomic" error message, which I do not recall seeing. And I ran a great deal of rcutorture on v4.16. So let's see... As you say, the rcu_torture_timer() function does in fact acquire rand_lock in 4.16 and 4.17, in which case sleeping would indeed be illegal. But let's take a look at srcu_read_delay(): static void srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp) { long delay; const long uspertick = 100 / HZ; const long longdelay = 10; /* We want there to be long-running readers, but not all the time. */ delay = torture_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick); if (!delay && in_task()) { schedule_timeout_interruptible(longdelay); rtrsp->rt_delay_jiffies = longdelay; } else { rcu_read_delay(rrsp, rtrsp); } } The call to schedule_timeout_interruptible() cannot happen unless the in_task() macro returns true, which it won't if the SOFTIRQ_OFFSET bit is set: #define in_task() (!(preempt_count() & \ (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) And the SOFTIRQ_OFFSET bit will be set if srcu_read_delay() is invoked from a timer handler, which is the case for the call from rcu_torture_timer(). So if that lock is held, schedule_timeout_interruptible() won't ever be invoked. Thanks for your reply :) My tool does not track this bit... Sorry for this false report. Not a problem, a few false positives are to be expected. And it looks like you have some work to do on your tool -- which is good, because I would not want you to be bored. ;-) Thanx, Paul Thanks for your advice. I will improve my tool to produce less false positives :) Best wishes, Jia-Ju Bai
Re: [BUG] kernel: rcu: a possible sleep-in-atomic-context bug in srcu_read_delay()
On 2018/8/13 20:42, Paul E. McKenney wrote: On Mon, Aug 13, 2018 at 05:26:49PM +0800, Jia-Ju Bai wrote: On 2018/8/13 12:18, Paul E. McKenney wrote: On Mon, Aug 13, 2018 at 11:04:10AM +0800, Jia-Ju Bai wrote: The kernel may sleep with holding a spinlock. The function call paths (from bottom to top) in Linux-4.16 are: [FUNC] schedule_timeout_interruptible kernel/rcu/rcutorture.c, 523: schedule_timeout_interruptible in srcu_read_delay kernel/rcu/rcutorture.c, 1105: [FUNC_PTR]srcu_read_delay in rcu_torture_timer kernel/rcu/rcutorture.c, 1104: spin_lock in rcu_torture_timer Note that [FUNC_PTR] means a function pointer call is used. I do not find a good way to fix, so I only report. This is found by my static analysis tool (DSAC). Interesting. I would have expected to have gotten a "scheduling while atomic" error message, which I do not recall seeing. And I ran a great deal of rcutorture on v4.16. So let's see... As you say, the rcu_torture_timer() function does in fact acquire rand_lock in 4.16 and 4.17, in which case sleeping would indeed be illegal. But let's take a look at srcu_read_delay(): static void srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp) { long delay; const long uspertick = 100 / HZ; const long longdelay = 10; /* We want there to be long-running readers, but not all the time. */ delay = torture_random(rrsp) % (nrealreaders * 2 * longdelay * uspertick); if (!delay && in_task()) { schedule_timeout_interruptible(longdelay); rtrsp->rt_delay_jiffies = longdelay; } else { rcu_read_delay(rrsp, rtrsp); } } The call to schedule_timeout_interruptible() cannot happen unless the in_task() macro returns true, which it won't if the SOFTIRQ_OFFSET bit is set: #define in_task() (!(preempt_count() & \ (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) And the SOFTIRQ_OFFSET bit will be set if srcu_read_delay() is invoked from a timer handler, which is the case for the call from rcu_torture_timer(). So if that lock is held, schedule_timeout_interruptible() won't ever be invoked. Thanks for your reply :) My tool does not track this bit... Sorry for this false report. Not a problem, a few false positives are to be expected. And it looks like you have some work to do on your tool -- which is good, because I would not want you to be bored. ;-) Thanx, Paul Thanks for your advice. I will improve my tool to produce less false positives :) Best wishes, Jia-Ju Bai
Re: Build failures with gcc 4.5 and older
On 08/14/2018 04:20 PM, Linus Torvalds wrote: On Tue, Aug 14, 2018 at 4:02 PM Andrew Morton wrote: The m68k build still fails because 0cc3cd21657 ("cpu/hotplug: Boot HT siblings at least once") was evidently never tested on CONFIG_SMP=n. How could that come about - the patch is six weeks old?? Ehh, meet the joys of embargoes. The code was tested (and people even found subtle arm64 problems due to that testing), but because it couldn't be made public until today, it didn't go through all the usual infrastructure we depend on. But: kernel/cpu.c: In function 'boot_cpu_hotplug_init': kernel/cpu.c:2275:2: error: 'struct cpuhp_cpu_state' has no member named 'booted_once' it should be fixed now in -git. @@ -490,6 +490,8 @@ struct mm_struct { #endif } __randomize_layout; + int wibble; + Can we call this something informative? Like int __gcc_4_4_is_garbage_that_shouldnt_be_used; or something? That is, if we actually want to really drag out this whole pointless pain of allowing ancient compilers? Guys, at some point we need to switch to 4.6. The people who feel the pain today *will* feel the pain at some point. Just get it over with already. For my part I am all for making gcc 4.6 mandatory. Guenter
Re: Build failures with gcc 4.5 and older
On 08/14/2018 04:20 PM, Linus Torvalds wrote: On Tue, Aug 14, 2018 at 4:02 PM Andrew Morton wrote: The m68k build still fails because 0cc3cd21657 ("cpu/hotplug: Boot HT siblings at least once") was evidently never tested on CONFIG_SMP=n. How could that come about - the patch is six weeks old?? Ehh, meet the joys of embargoes. The code was tested (and people even found subtle arm64 problems due to that testing), but because it couldn't be made public until today, it didn't go through all the usual infrastructure we depend on. But: kernel/cpu.c: In function 'boot_cpu_hotplug_init': kernel/cpu.c:2275:2: error: 'struct cpuhp_cpu_state' has no member named 'booted_once' it should be fixed now in -git. @@ -490,6 +490,8 @@ struct mm_struct { #endif } __randomize_layout; + int wibble; + Can we call this something informative? Like int __gcc_4_4_is_garbage_that_shouldnt_be_used; or something? That is, if we actually want to really drag out this whole pointless pain of allowing ancient compilers? Guys, at some point we need to switch to 4.6. The people who feel the pain today *will* feel the pain at some point. Just get it over with already. For my part I am all for making gcc 4.6 mandatory. Guenter
Re: [PATCH 4.9 000/107] 4.9.120-stable review
On 08/14/2018 05:36 PM, Sebastian Gottschall wrote: if SWAP is disabled in kernel config, the following compile error will raise up with this release arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): undefined reference to `generic_max_swapfile_size' of course this is simple to fix. the function max_swapfile_size must be excluded if CONFIG_SWAP is disabled Fixed with upstream commit 792adb90fa72 ("x86/init: fix build with CONFIG_SWAP=n"). Also needed in other stable branches (at least in 4.4). Guenter Sebastian Am 14.08.2018 um 19:16 schrieb Greg Kroah-Hartman: This is the start of the stable review cycle for the 4.9.120 release. There are 107 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Thu Aug 16 17:14:53 UTC 2018. Anything received after that time might be too late. The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below. thanks, greg k-h - Pseudo-Shortlog of commits: Greg Kroah-Hartman Linux 4.9.120-rc1 Josh Poimboeuf x86/microcode: Allow late microcode loading with SMT disabled Ashok Raj x86/microcode: Do not upload microcode if CPUs are offline David Woodhouse tools headers: Synchronise x86 cpufeatures.h for L1TF additions Andi Kleen x86/mm/kmmio: Make the tracer robust against L1TF Andi Kleen x86/mm/pat: Make set_memory_np() L1TF safe Andi Kleen x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert Andi Kleen x86/speculation/l1tf: Invert all not present mappings Thomas Gleixner cpu/hotplug: Fix SMT supported evaluation Paolo Bonzini KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry Paolo Bonzini x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry Paolo Bonzini x86/speculation: Simplify sysfs report of VMX L1TF vulnerability Paolo Bonzini KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR Wanpeng Li KVM: X86: Allow userspace to define the microcode version Wanpeng Li KVM: X86: Introduce kvm_get_msr_feature() Tom Lendacky KVM: SVM: Add MSR-based feature support for serializing LFENCE Tom Lendacky KVM: x86: Add a framework for supporting MSR-based features Thomas Gleixner Documentation/l1tf: Remove Yonah processors from not vulnerable list Nicolai Stange x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() Nicolai Stange x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d Nicolai Stange x86: Don't include linux/irq.h from asm/hardirq.h Nicolai Stange x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d Nicolai Stange x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 Nicolai Stange x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() Nicolai Stange x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' Nicolai Stange x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() Josh Poimboeuf cpu/hotplug: detect SMT disabled by BIOS Tony Luck Documentation/l1tf: Fix typos Nicolai Stange x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content Thomas Gleixner Documentation: Add section about CPU vulnerabilities Jiri Kosina x86/bugs, kvm: Introduce boot-time control of L1TF mitigations Thomas Gleixner cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early Jiri Kosina cpu/hotplug: Expose SMT control init function Thomas Gleixner x86/kvm: Allow runtime control of L1D flush Thomas Gleixner x86/kvm: Serialize L1D flush parameter setter Thomas Gleixner x86/kvm: Add static key for flush always Thomas Gleixner x86/kvm: Move l1tf setup function Thomas Gleixner x86/l1tf: Handle EPT disabled state proper Thomas Gleixner x86/kvm: Drop L1TF MSR list approach Thomas Gleixner x86/litf: Introduce vmx status variable Thomas Gleixner cpu/hotplug: Online siblings when SMT control is turned on Konrad Rzeszutek Wilk x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required Konrad Rzeszutek Wilk x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs Konrad Rzeszutek Wilk x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting Konrad Rzeszutek Wilk x86/KVM/VMX: Add find_msr() helper function Konrad Rzeszutek Wilk x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest numbers Jim Mattson kvm: nVMX: Update MSR load counts on a VMCS switch Paolo Bonzini x86/KVM/VMX: Add L1D flush logic Paolo Bonzini x86/KVM/VMX: Add L1D MSR based
Re: [PATCH 4.9 000/107] 4.9.120-stable review
On 08/14/2018 05:36 PM, Sebastian Gottschall wrote: if SWAP is disabled in kernel config, the following compile error will raise up with this release arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): undefined reference to `generic_max_swapfile_size' of course this is simple to fix. the function max_swapfile_size must be excluded if CONFIG_SWAP is disabled Fixed with upstream commit 792adb90fa72 ("x86/init: fix build with CONFIG_SWAP=n"). Also needed in other stable branches (at least in 4.4). Guenter Sebastian Am 14.08.2018 um 19:16 schrieb Greg Kroah-Hartman: This is the start of the stable review cycle for the 4.9.120 release. There are 107 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Thu Aug 16 17:14:53 UTC 2018. Anything received after that time might be too late. The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below. thanks, greg k-h - Pseudo-Shortlog of commits: Greg Kroah-Hartman Linux 4.9.120-rc1 Josh Poimboeuf x86/microcode: Allow late microcode loading with SMT disabled Ashok Raj x86/microcode: Do not upload microcode if CPUs are offline David Woodhouse tools headers: Synchronise x86 cpufeatures.h for L1TF additions Andi Kleen x86/mm/kmmio: Make the tracer robust against L1TF Andi Kleen x86/mm/pat: Make set_memory_np() L1TF safe Andi Kleen x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert Andi Kleen x86/speculation/l1tf: Invert all not present mappings Thomas Gleixner cpu/hotplug: Fix SMT supported evaluation Paolo Bonzini KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry Paolo Bonzini x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry Paolo Bonzini x86/speculation: Simplify sysfs report of VMX L1TF vulnerability Paolo Bonzini KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR Wanpeng Li KVM: X86: Allow userspace to define the microcode version Wanpeng Li KVM: X86: Introduce kvm_get_msr_feature() Tom Lendacky KVM: SVM: Add MSR-based feature support for serializing LFENCE Tom Lendacky KVM: x86: Add a framework for supporting MSR-based features Thomas Gleixner Documentation/l1tf: Remove Yonah processors from not vulnerable list Nicolai Stange x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() Nicolai Stange x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d Nicolai Stange x86: Don't include linux/irq.h from asm/hardirq.h Nicolai Stange x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d Nicolai Stange x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 Nicolai Stange x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() Nicolai Stange x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' Nicolai Stange x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() Josh Poimboeuf cpu/hotplug: detect SMT disabled by BIOS Tony Luck Documentation/l1tf: Fix typos Nicolai Stange x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content Thomas Gleixner Documentation: Add section about CPU vulnerabilities Jiri Kosina x86/bugs, kvm: Introduce boot-time control of L1TF mitigations Thomas Gleixner cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early Jiri Kosina cpu/hotplug: Expose SMT control init function Thomas Gleixner x86/kvm: Allow runtime control of L1D flush Thomas Gleixner x86/kvm: Serialize L1D flush parameter setter Thomas Gleixner x86/kvm: Add static key for flush always Thomas Gleixner x86/kvm: Move l1tf setup function Thomas Gleixner x86/l1tf: Handle EPT disabled state proper Thomas Gleixner x86/kvm: Drop L1TF MSR list approach Thomas Gleixner x86/litf: Introduce vmx status variable Thomas Gleixner cpu/hotplug: Online siblings when SMT control is turned on Konrad Rzeszutek Wilk x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required Konrad Rzeszutek Wilk x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs Konrad Rzeszutek Wilk x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting Konrad Rzeszutek Wilk x86/KVM/VMX: Add find_msr() helper function Konrad Rzeszutek Wilk x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest numbers Jim Mattson kvm: nVMX: Update MSR load counts on a VMCS switch Paolo Bonzini x86/KVM/VMX: Add L1D flush logic Paolo Bonzini x86/KVM/VMX: Add L1D MSR based
Re: [PATCH 4.9 000/107] 4.9.120-stable review
On Wed, Aug 15, 2018 at 02:36:00AM +0200, Sebastian Gottschall wrote: > if SWAP is disabled in kernel config, the following compile error will raise > up with this release > > arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): > undefined reference to `generic_max_swapfile_size' > > of course this is simple to fix. the function max_swapfile_size must be > excluded if CONFIG_SWAP is disabled > > Sebastian > This is fixed upstream as commmit 792adb90fa72 ("x86/init: fix build with CONFIG_SWAP=n"). Cheers, Nathan
Re: [PATCH 4.9 000/107] 4.9.120-stable review
On Wed, Aug 15, 2018 at 02:36:00AM +0200, Sebastian Gottschall wrote: > if SWAP is disabled in kernel config, the following compile error will raise > up with this release > > arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): > undefined reference to `generic_max_swapfile_size' > > of course this is simple to fix. the function max_swapfile_size must be > excluded if CONFIG_SWAP is disabled > > Sebastian > This is fixed upstream as commmit 792adb90fa72 ("x86/init: fix build with CONFIG_SWAP=n"). Cheers, Nathan
Re: [PATCH 4.9 000/107] 4.9.120-stable review
if SWAP is disabled in kernel config, the following compile error will raise up with this release arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): undefined reference to `generic_max_swapfile_size' of course this is simple to fix. the function max_swapfile_size must be excluded if CONFIG_SWAP is disabled Sebastian Am 14.08.2018 um 19:16 schrieb Greg Kroah-Hartman: This is the start of the stable review cycle for the 4.9.120 release. There are 107 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Thu Aug 16 17:14:53 UTC 2018. Anything received after that time might be too late. The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below. thanks, greg k-h - Pseudo-Shortlog of commits: Greg Kroah-Hartman Linux 4.9.120-rc1 Josh Poimboeuf x86/microcode: Allow late microcode loading with SMT disabled Ashok Raj x86/microcode: Do not upload microcode if CPUs are offline David Woodhouse tools headers: Synchronise x86 cpufeatures.h for L1TF additions Andi Kleen x86/mm/kmmio: Make the tracer robust against L1TF Andi Kleen x86/mm/pat: Make set_memory_np() L1TF safe Andi Kleen x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert Andi Kleen x86/speculation/l1tf: Invert all not present mappings Thomas Gleixner cpu/hotplug: Fix SMT supported evaluation Paolo Bonzini KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry Paolo Bonzini x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry Paolo Bonzini x86/speculation: Simplify sysfs report of VMX L1TF vulnerability Paolo Bonzini KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR Wanpeng Li KVM: X86: Allow userspace to define the microcode version Wanpeng Li KVM: X86: Introduce kvm_get_msr_feature() Tom Lendacky KVM: SVM: Add MSR-based feature support for serializing LFENCE Tom Lendacky KVM: x86: Add a framework for supporting MSR-based features Thomas Gleixner Documentation/l1tf: Remove Yonah processors from not vulnerable list Nicolai Stange x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() Nicolai Stange x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d Nicolai Stange x86: Don't include linux/irq.h from asm/hardirq.h Nicolai Stange x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d Nicolai Stange x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 Nicolai Stange x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() Nicolai Stange x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' Nicolai Stange x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() Josh Poimboeuf cpu/hotplug: detect SMT disabled by BIOS Tony Luck Documentation/l1tf: Fix typos Nicolai Stange x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content Thomas Gleixner Documentation: Add section about CPU vulnerabilities Jiri Kosina x86/bugs, kvm: Introduce boot-time control of L1TF mitigations Thomas Gleixner cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early Jiri Kosina cpu/hotplug: Expose SMT control init function Thomas Gleixner x86/kvm: Allow runtime control of L1D flush Thomas Gleixner x86/kvm: Serialize L1D flush parameter setter Thomas Gleixner x86/kvm: Add static key for flush always Thomas Gleixner x86/kvm: Move l1tf setup function Thomas Gleixner x86/l1tf: Handle EPT disabled state proper Thomas Gleixner x86/kvm: Drop L1TF MSR list approach Thomas Gleixner x86/litf: Introduce vmx status variable Thomas Gleixner cpu/hotplug: Online siblings when SMT control is turned on Konrad Rzeszutek Wilk x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required Konrad Rzeszutek Wilk x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs Konrad Rzeszutek Wilk x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting Konrad Rzeszutek Wilk x86/KVM/VMX: Add find_msr() helper function Konrad Rzeszutek Wilk x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest numbers Jim Mattson kvm: nVMX: Update MSR load counts on a VMCS switch Paolo Bonzini x86/KVM/VMX: Add L1D flush logic Paolo Bonzini x86/KVM/VMX: Add L1D MSR based flush Paolo Bonzini x86/KVM/VMX: Add L1D flush algorithm Konrad Rzeszutek Wilk x86/KVM/VMX: Add module argument for L1TF mitigation Konrad Rzeszutek Wilk x86/KVM: Warn user if
Re: [PATCH 4.9 000/107] 4.9.120-stable review
if SWAP is disabled in kernel config, the following compile error will raise up with this release arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): undefined reference to `generic_max_swapfile_size' of course this is simple to fix. the function max_swapfile_size must be excluded if CONFIG_SWAP is disabled Sebastian Am 14.08.2018 um 19:16 schrieb Greg Kroah-Hartman: This is the start of the stable review cycle for the 4.9.120 release. There are 107 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Thu Aug 16 17:14:53 UTC 2018. Anything received after that time might be too late. The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below. thanks, greg k-h - Pseudo-Shortlog of commits: Greg Kroah-Hartman Linux 4.9.120-rc1 Josh Poimboeuf x86/microcode: Allow late microcode loading with SMT disabled Ashok Raj x86/microcode: Do not upload microcode if CPUs are offline David Woodhouse tools headers: Synchronise x86 cpufeatures.h for L1TF additions Andi Kleen x86/mm/kmmio: Make the tracer robust against L1TF Andi Kleen x86/mm/pat: Make set_memory_np() L1TF safe Andi Kleen x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert Andi Kleen x86/speculation/l1tf: Invert all not present mappings Thomas Gleixner cpu/hotplug: Fix SMT supported evaluation Paolo Bonzini KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry Paolo Bonzini x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry Paolo Bonzini x86/speculation: Simplify sysfs report of VMX L1TF vulnerability Paolo Bonzini KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR Wanpeng Li KVM: X86: Allow userspace to define the microcode version Wanpeng Li KVM: X86: Introduce kvm_get_msr_feature() Tom Lendacky KVM: SVM: Add MSR-based feature support for serializing LFENCE Tom Lendacky KVM: x86: Add a framework for supporting MSR-based features Thomas Gleixner Documentation/l1tf: Remove Yonah processors from not vulnerable list Nicolai Stange x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() Nicolai Stange x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d Nicolai Stange x86: Don't include linux/irq.h from asm/hardirq.h Nicolai Stange x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d Nicolai Stange x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 Nicolai Stange x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() Nicolai Stange x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' Nicolai Stange x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() Josh Poimboeuf cpu/hotplug: detect SMT disabled by BIOS Tony Luck Documentation/l1tf: Fix typos Nicolai Stange x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content Thomas Gleixner Documentation: Add section about CPU vulnerabilities Jiri Kosina x86/bugs, kvm: Introduce boot-time control of L1TF mitigations Thomas Gleixner cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early Jiri Kosina cpu/hotplug: Expose SMT control init function Thomas Gleixner x86/kvm: Allow runtime control of L1D flush Thomas Gleixner x86/kvm: Serialize L1D flush parameter setter Thomas Gleixner x86/kvm: Add static key for flush always Thomas Gleixner x86/kvm: Move l1tf setup function Thomas Gleixner x86/l1tf: Handle EPT disabled state proper Thomas Gleixner x86/kvm: Drop L1TF MSR list approach Thomas Gleixner x86/litf: Introduce vmx status variable Thomas Gleixner cpu/hotplug: Online siblings when SMT control is turned on Konrad Rzeszutek Wilk x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required Konrad Rzeszutek Wilk x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs Konrad Rzeszutek Wilk x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting Konrad Rzeszutek Wilk x86/KVM/VMX: Add find_msr() helper function Konrad Rzeszutek Wilk x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest numbers Jim Mattson kvm: nVMX: Update MSR load counts on a VMCS switch Paolo Bonzini x86/KVM/VMX: Add L1D flush logic Paolo Bonzini x86/KVM/VMX: Add L1D MSR based flush Paolo Bonzini x86/KVM/VMX: Add L1D flush algorithm Konrad Rzeszutek Wilk x86/KVM/VMX: Add module argument for L1TF mitigation Konrad Rzeszutek Wilk x86/KVM: Warn user if
Re: [RFC PATCH 2/2] mm: drain memcg stocks on css offlining
On Tue, Aug 14, 2018 at 5:36 PM Roman Gushchin wrote: > > Memcg charge is batched using per-cpu stocks, so an offline memcg > can be pinned by a cached charge up to a moment, when a process > belonging to some other cgroup will charge some memory on the same > cpu. In other words, cached charges can prevent a memory cgroup > from being reclaimed for some time, without any clear need. > > Let's optimize it by explicit draining of all stocks on css offlining. > As draining is performed asynchronously, and is skipped if any > parallel draining is happening, it's cheap. > > Signed-off-by: Roman Gushchin Seems reasonable. Reviewed-by: Shakeel Butt > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: Konstantin Khlebnikov > Cc: Tejun Heo > --- > mm/memcontrol.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 4e3c1315b1de..cfb64b5b9957 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -4575,6 +4575,8 @@ static void mem_cgroup_css_offline(struct > cgroup_subsys_state *css) > memcg_offline_kmem(memcg); > wb_memcg_offline(memcg); > > + drain_all_stock(memcg); > + > mem_cgroup_id_put(memcg); > } > > -- > 2.14.4 >
Re: [RFC PATCH 2/2] mm: drain memcg stocks on css offlining
On Tue, Aug 14, 2018 at 5:36 PM Roman Gushchin wrote: > > Memcg charge is batched using per-cpu stocks, so an offline memcg > can be pinned by a cached charge up to a moment, when a process > belonging to some other cgroup will charge some memory on the same > cpu. In other words, cached charges can prevent a memory cgroup > from being reclaimed for some time, without any clear need. > > Let's optimize it by explicit draining of all stocks on css offlining. > As draining is performed asynchronously, and is skipped if any > parallel draining is happening, it's cheap. > > Signed-off-by: Roman Gushchin Seems reasonable. Reviewed-by: Shakeel Butt > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: Konstantin Khlebnikov > Cc: Tejun Heo > --- > mm/memcontrol.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 4e3c1315b1de..cfb64b5b9957 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -4575,6 +4575,8 @@ static void mem_cgroup_css_offline(struct > cgroup_subsys_state *css) > memcg_offline_kmem(memcg); > wb_memcg_offline(memcg); > > + drain_all_stock(memcg); > + > mem_cgroup_id_put(memcg); > } > > -- > 2.14.4 >
[PATCH] kconfig: add build-only configurator targets
From: Randy Dunlap Add build-only targets for build_menuconfig, build_nconfig, build_xconfig, and build_gconfig. (targets must end in "config" to qualify in top-level Makefile) This allows these target to be built without execution (e.g., to look for errors or warnings) and/or to be built and checked by sparse. Signed-off-by: Randy Dunlap --- scripts/kconfig/Makefile |8 1 file changed, 8 insertions(+) --- linux-next-20180814.orig/scripts/kconfig/Makefile +++ linux-next-20180814/scripts/kconfig/Makefile @@ -33,6 +33,14 @@ config: $(obj)/conf nconfig: $(obj)/nconf $< $(silent) $(Kconfig) +build_menuconfig: $(obj)/mconf + +build_nconfig: $(obj)/nconf + +build_gconfig: $(obj)/gconf + +build_xconfig: $(obj)/qconf + localyesconfig localmodconfig: $(obj)/conf $(Q)perl $(srctree)/$(src)/streamline_config.pl --$@ $(srctree) $(Kconfig) > .tmp.config $(Q)if [ -f .config ]; then \
[RFC PATCH 1/2] mm: rework memcg kernel stack accounting
If CONFIG_VMAP_STACK is set, kernel stacks are allocated using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel stack pages are charged against corresponding memory cgroups on allocation and uncharged on releasing them. The problem is that we do cache kernel stacks in small per-cpu caches and do reuse them for new tasks, which can belong to different memory cgroups. Each stack page still holds a reference to the original cgroup, so the cgroup can't be released until the vmap area is released. To make this happen we need more than two subsequent exits without forks in between on the current cpu, which makes it very unlikely to happen. As a result, I saw a significant number of dying cgroups (in theory, up to 2 * number_of_cpu + number_of_tasks), which can't be released even by significant memory pressure. As a cgroup structure can take a significant amount of memory (first of all, per-cpu data like memcg statistics), it leads to a noticeable waste of memory. Signed-off-by: Roman Gushchin Cc: Johannes Weiner Cc: Michal Hocko Cc: Andy Lutomirski Cc: Konstantin Khlebnikov Cc: Tejun Heo --- kernel/fork.c | 44 ++-- 1 file changed, 38 insertions(+), 6 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index 69b6fea5a181..91872b2b37bd 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -224,9 +224,14 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) return s->addr; } + /* +* Allocated stacks are cached and later reused by new threads, +* so memcg accounting is performed manually on assigning/releasing +* stacks to tasks. Drop __GFP_ACCOUNT. +*/ stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN, VMALLOC_START, VMALLOC_END, -THREADINFO_GFP, +THREADINFO_GFP & ~__GFP_ACCOUNT, PAGE_KERNEL, 0, node, __builtin_return_address(0)); @@ -246,12 +251,41 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) #endif } +static void memcg_charge_kernel_stack(struct task_struct *tsk) +{ +#ifdef CONFIG_VMAP_STACK + struct vm_struct *vm = task_stack_vm_area(tsk); + + if (vm) { + int i; + + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) + memcg_kmem_charge(vm->pages[i], __GFP_NOFAIL, + compound_order(vm->pages[i])); + + /* All stack pages belong to the same memcg. */ + mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, +THREAD_SIZE / 1024); + } +#endif +} + static inline void free_thread_stack(struct task_struct *tsk) { #ifdef CONFIG_VMAP_STACK - if (task_stack_vm_area(tsk)) { + struct vm_struct *vm = task_stack_vm_area(tsk); + + if (vm) { int i; + /* All stack pages belong to the same memcg. */ + mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, +-(int)(THREAD_SIZE / 1024)); + + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) + memcg_kmem_uncharge(vm->pages[i], + compound_order(vm->pages[i])); + for (i = 0; i < NR_CACHED_STACKS; i++) { if (this_cpu_cmpxchg(cached_stacks[i], NULL, tsk->stack_vm_area) != NULL) @@ -352,10 +386,6 @@ static void account_kernel_stack(struct task_struct *tsk, int account) NR_KERNEL_STACK_KB, PAGE_SIZE / 1024 * account); } - - /* All stack pages belong to the same memcg. */ - mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, -account * (THREAD_SIZE / 1024)); } else { /* * All stack pages are in the same zone and belong to the @@ -809,6 +839,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) if (!stack) goto free_tsk; + memcg_charge_kernel_stack(tsk); + stack_vm_area = task_stack_vm_area(tsk); err = arch_dup_task_struct(tsk, orig); -- 2.14.4
[RFC PATCH 1/2] mm: rework memcg kernel stack accounting
If CONFIG_VMAP_STACK is set, kernel stacks are allocated using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel stack pages are charged against corresponding memory cgroups on allocation and uncharged on releasing them. The problem is that we do cache kernel stacks in small per-cpu caches and do reuse them for new tasks, which can belong to different memory cgroups. Each stack page still holds a reference to the original cgroup, so the cgroup can't be released until the vmap area is released. To make this happen we need more than two subsequent exits without forks in between on the current cpu, which makes it very unlikely to happen. As a result, I saw a significant number of dying cgroups (in theory, up to 2 * number_of_cpu + number_of_tasks), which can't be released even by significant memory pressure. As a cgroup structure can take a significant amount of memory (first of all, per-cpu data like memcg statistics), it leads to a noticeable waste of memory. Signed-off-by: Roman Gushchin Cc: Johannes Weiner Cc: Michal Hocko Cc: Andy Lutomirski Cc: Konstantin Khlebnikov Cc: Tejun Heo --- kernel/fork.c | 44 ++-- 1 file changed, 38 insertions(+), 6 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index 69b6fea5a181..91872b2b37bd 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -224,9 +224,14 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) return s->addr; } + /* +* Allocated stacks are cached and later reused by new threads, +* so memcg accounting is performed manually on assigning/releasing +* stacks to tasks. Drop __GFP_ACCOUNT. +*/ stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN, VMALLOC_START, VMALLOC_END, -THREADINFO_GFP, +THREADINFO_GFP & ~__GFP_ACCOUNT, PAGE_KERNEL, 0, node, __builtin_return_address(0)); @@ -246,12 +251,41 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) #endif } +static void memcg_charge_kernel_stack(struct task_struct *tsk) +{ +#ifdef CONFIG_VMAP_STACK + struct vm_struct *vm = task_stack_vm_area(tsk); + + if (vm) { + int i; + + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) + memcg_kmem_charge(vm->pages[i], __GFP_NOFAIL, + compound_order(vm->pages[i])); + + /* All stack pages belong to the same memcg. */ + mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, +THREAD_SIZE / 1024); + } +#endif +} + static inline void free_thread_stack(struct task_struct *tsk) { #ifdef CONFIG_VMAP_STACK - if (task_stack_vm_area(tsk)) { + struct vm_struct *vm = task_stack_vm_area(tsk); + + if (vm) { int i; + /* All stack pages belong to the same memcg. */ + mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, +-(int)(THREAD_SIZE / 1024)); + + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) + memcg_kmem_uncharge(vm->pages[i], + compound_order(vm->pages[i])); + for (i = 0; i < NR_CACHED_STACKS; i++) { if (this_cpu_cmpxchg(cached_stacks[i], NULL, tsk->stack_vm_area) != NULL) @@ -352,10 +386,6 @@ static void account_kernel_stack(struct task_struct *tsk, int account) NR_KERNEL_STACK_KB, PAGE_SIZE / 1024 * account); } - - /* All stack pages belong to the same memcg. */ - mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB, -account * (THREAD_SIZE / 1024)); } else { /* * All stack pages are in the same zone and belong to the @@ -809,6 +839,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) if (!stack) goto free_tsk; + memcg_charge_kernel_stack(tsk); + stack_vm_area = task_stack_vm_area(tsk); err = arch_dup_task_struct(tsk, orig); -- 2.14.4
[PATCH] kconfig: add build-only configurator targets
From: Randy Dunlap Add build-only targets for build_menuconfig, build_nconfig, build_xconfig, and build_gconfig. (targets must end in "config" to qualify in top-level Makefile) This allows these target to be built without execution (e.g., to look for errors or warnings) and/or to be built and checked by sparse. Signed-off-by: Randy Dunlap --- scripts/kconfig/Makefile |8 1 file changed, 8 insertions(+) --- linux-next-20180814.orig/scripts/kconfig/Makefile +++ linux-next-20180814/scripts/kconfig/Makefile @@ -33,6 +33,14 @@ config: $(obj)/conf nconfig: $(obj)/nconf $< $(silent) $(Kconfig) +build_menuconfig: $(obj)/mconf + +build_nconfig: $(obj)/nconf + +build_gconfig: $(obj)/gconf + +build_xconfig: $(obj)/qconf + localyesconfig localmodconfig: $(obj)/conf $(Q)perl $(srctree)/$(src)/streamline_config.pl --$@ $(srctree) $(Kconfig) > .tmp.config $(Q)if [ -f .config ]; then \
[RFC PATCH 2/2] mm: drain memcg stocks on css offlining
Memcg charge is batched using per-cpu stocks, so an offline memcg can be pinned by a cached charge up to a moment, when a process belonging to some other cgroup will charge some memory on the same cpu. In other words, cached charges can prevent a memory cgroup from being reclaimed for some time, without any clear need. Let's optimize it by explicit draining of all stocks on css offlining. As draining is performed asynchronously, and is skipped if any parallel draining is happening, it's cheap. Signed-off-by: Roman Gushchin Cc: Johannes Weiner Cc: Michal Hocko Cc: Konstantin Khlebnikov Cc: Tejun Heo --- mm/memcontrol.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 4e3c1315b1de..cfb64b5b9957 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4575,6 +4575,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css) memcg_offline_kmem(memcg); wb_memcg_offline(memcg); + drain_all_stock(memcg); + mem_cgroup_id_put(memcg); } -- 2.14.4
[RFC PATCH 2/2] mm: drain memcg stocks on css offlining
Memcg charge is batched using per-cpu stocks, so an offline memcg can be pinned by a cached charge up to a moment, when a process belonging to some other cgroup will charge some memory on the same cpu. In other words, cached charges can prevent a memory cgroup from being reclaimed for some time, without any clear need. Let's optimize it by explicit draining of all stocks on css offlining. As draining is performed asynchronously, and is skipped if any parallel draining is happening, it's cheap. Signed-off-by: Roman Gushchin Cc: Johannes Weiner Cc: Michal Hocko Cc: Konstantin Khlebnikov Cc: Tejun Heo --- mm/memcontrol.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 4e3c1315b1de..cfb64b5b9957 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4575,6 +4575,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css) memcg_offline_kmem(memcg); wb_memcg_offline(memcg); + drain_all_stock(memcg); + mem_cgroup_id_put(memcg); } -- 2.14.4
Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.
On Tue, Aug 14, 2018 at 4:26 PM, Ghannam, Yazen wrote: > > > -Original Message- > > From: Michael Jin > > Sent: Friday, August 10, 2018 2:36 PM > > To: Borislav Petkov ; Ghannam, Yazen > > ; Mauro Carvalho Chehab > > > > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; Michael Jin > > > There may be some differences between models, but things should generally > work. It's just a matter of whether or not the Platform enables certain things > like DRAM ECC, etc. > > Does the amd64_edac_mod module load on your platform with just this patch? > > > + fam_type = _types[F17_M11H_CPUS]; > > + pvt->ops = _types[F17_M11H_CPUS].ops; > > + break; > > + } > > fam_type= _types[F17_CPUS]; > > pvt->ops= _types[F17_CPUS].ops; > > break; Yes, I was able to load amd64_edac_mod on my AMD Ryzen Embedded V1807B (further details on my blog - https://ndimcomputing.io/fs-fp5v_ecc_linux.html). > > diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h > > index 1d4b74e9a037..e50226cd53c6 100644 > > --- a/drivers/edac/amd64_edac.h > > +++ b/drivers/edac/amd64_edac.h > > @@ -115,6 +115,8 @@ > > #define PCI_DEVICE_ID_AMD_16H_M30H_NB_F2 0x1582 > > #define PCI_DEVICE_ID_AMD_17H_DF_F0 0x1460 > > #define PCI_DEVICE_ID_AMD_17H_DF_F6 0x1466 > > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F0 0x15e8 > > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F6 0x15ee > > > > These IDs are used for Fam17h Models 10h-2Fh. Can you please change > the names here and in the rest of this patch? > > The format is to use the first supported model in the name, e.g. M11H -> M10H. > > > /* > > * Function 1 - Address Map > > @@ -281,6 +283,7 @@ enum amd_families { > > F16_CPUS, > > F16_M30H_CPUS, > > F17_CPUS, > > + F17_M11H_CPUS, > > NUM_FAMILIES, > > }; > > According to https://en.wikichip.org/wiki/amd/cpuid, family 17h model 10h is not publicly known. Therefore, I would like you to confirm that model 10h uses 0x15e8 (device F0) and 0x15ee (device F6) as I can not find any documentation or test whether ECC works. Raven Ridge BIOS motherboards do not enable ECC, but share the same CPUID (0081_0F10h http://www.cpu-world.com/CPUs/Zen/AMD-Ryzen%205%202400G.html) as AMD Ryzen Embedded V1000 Processor Family. This patch was written due to the fact that amd64_edac_mod erroneously sets the wrong device ids for the AMD Ryzen Embedded V1000 Family (device F0 and F6 fail to load, as they do not share 0x1460 and 0x1466 with the other family 17h processors). Michael
Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.
On Tue, Aug 14, 2018 at 4:26 PM, Ghannam, Yazen wrote: > > > -Original Message- > > From: Michael Jin > > Sent: Friday, August 10, 2018 2:36 PM > > To: Borislav Petkov ; Ghannam, Yazen > > ; Mauro Carvalho Chehab > > > > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; Michael Jin > > > There may be some differences between models, but things should generally > work. It's just a matter of whether or not the Platform enables certain things > like DRAM ECC, etc. > > Does the amd64_edac_mod module load on your platform with just this patch? > > > + fam_type = _types[F17_M11H_CPUS]; > > + pvt->ops = _types[F17_M11H_CPUS].ops; > > + break; > > + } > > fam_type= _types[F17_CPUS]; > > pvt->ops= _types[F17_CPUS].ops; > > break; Yes, I was able to load amd64_edac_mod on my AMD Ryzen Embedded V1807B (further details on my blog - https://ndimcomputing.io/fs-fp5v_ecc_linux.html). > > diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h > > index 1d4b74e9a037..e50226cd53c6 100644 > > --- a/drivers/edac/amd64_edac.h > > +++ b/drivers/edac/amd64_edac.h > > @@ -115,6 +115,8 @@ > > #define PCI_DEVICE_ID_AMD_16H_M30H_NB_F2 0x1582 > > #define PCI_DEVICE_ID_AMD_17H_DF_F0 0x1460 > > #define PCI_DEVICE_ID_AMD_17H_DF_F6 0x1466 > > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F0 0x15e8 > > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F6 0x15ee > > > > These IDs are used for Fam17h Models 10h-2Fh. Can you please change > the names here and in the rest of this patch? > > The format is to use the first supported model in the name, e.g. M11H -> M10H. > > > /* > > * Function 1 - Address Map > > @@ -281,6 +283,7 @@ enum amd_families { > > F16_CPUS, > > F16_M30H_CPUS, > > F17_CPUS, > > + F17_M11H_CPUS, > > NUM_FAMILIES, > > }; > > According to https://en.wikichip.org/wiki/amd/cpuid, family 17h model 10h is not publicly known. Therefore, I would like you to confirm that model 10h uses 0x15e8 (device F0) and 0x15ee (device F6) as I can not find any documentation or test whether ECC works. Raven Ridge BIOS motherboards do not enable ECC, but share the same CPUID (0081_0F10h http://www.cpu-world.com/CPUs/Zen/AMD-Ryzen%205%202400G.html) as AMD Ryzen Embedded V1000 Processor Family. This patch was written due to the fact that amd64_edac_mod erroneously sets the wrong device ids for the AMD Ryzen Embedded V1000 Family (device F0 and F6 fail to load, as they do not share 0x1460 and 0x1466 with the other family 17h processors). Michael
Re: [PATCH] mm: migration: fix migration of huge PMD shared pages
On 08/14/2018 01:48 AM, Kirill A. Shutemov wrote: > On Mon, Aug 13, 2018 at 11:21:41PM +, Mike Kravetz wrote: >> On 08/13/2018 03:58 AM, Kirill A. Shutemov wrote: >>> On Sun, Aug 12, 2018 at 08:41:08PM -0700, Mike Kravetz wrote: I am not %100 sure on the required flushing, so suggestions would be appreciated. This also should go to stable. It has been around for a long time so still looking for an appropriate 'fixes:'. >>> >>> I believe we need flushing. And huge_pmd_unshare() usage in >>> __unmap_hugepage_range() looks suspicious: I don't see how we flush TLB in >>> that case. >> >> Thanks Kirill, >> >> __unmap_hugepage_range() has two callers: >> 1) unmap_hugepage_range, which wraps the call with tlb_gather_mmu and >>tlb_finish_mmu on the range. IIUC, this should cause an appropriate >>TLB flush. >> 2) __unmap_hugepage_range_final via unmap_single_vma. unmap_single_vma >> has three callers: >> - unmap_vmas which assumes the caller will flush the whole range after >> return. >> - zap_page_range wraps the call with tlb_gather_mmu/tlb_finish_mmu >> - zap_page_range_single wraps the call with tlb_gather_mmu/tlb_finish_mmu >> >> So, it appears we are covered. But, I could be missing something. > > My problem here is that the mapping that moved by huge_pmd_unshare() in > not accounted into mmu_gather and can be missed on tlb_finish_mmu(). Ah, I think I now see the issue you are concerned with. When huge_pmd_unshare succeeds we effectively unmap a PUD_SIZE area. The routine __unmap_hugepage_range may only have been passed a range that is a subset of PUD_SIZE. In the case I was trying to address, try_to_unmap_one() the 'range' will certainly be less than PUD_SIZE. Upon further thought, I think that even in the case of try_to_unmap_one we should flush PUD_SIZE range. My first thought would be to embed this flushing within huge_pmd_unshare itself. Perhaps, whenever huge_pmd_unshare succeeds we should do an explicit: flush_cache_range(PUD_SIZE) flush_tlb_range(PUD_SIZE) mmu_notifier_invalidate_range(PUD_SIZE) That would take some of the burden off the callers of huge_pmd_unshare. However, I am not sure if the flushing calls above play nice in all the calling environments. I'll look into it some more, but would appreciate additional comments. -- Mike Kravetz
Re: [PATCH] mm: migration: fix migration of huge PMD shared pages
On 08/14/2018 01:48 AM, Kirill A. Shutemov wrote: > On Mon, Aug 13, 2018 at 11:21:41PM +, Mike Kravetz wrote: >> On 08/13/2018 03:58 AM, Kirill A. Shutemov wrote: >>> On Sun, Aug 12, 2018 at 08:41:08PM -0700, Mike Kravetz wrote: I am not %100 sure on the required flushing, so suggestions would be appreciated. This also should go to stable. It has been around for a long time so still looking for an appropriate 'fixes:'. >>> >>> I believe we need flushing. And huge_pmd_unshare() usage in >>> __unmap_hugepage_range() looks suspicious: I don't see how we flush TLB in >>> that case. >> >> Thanks Kirill, >> >> __unmap_hugepage_range() has two callers: >> 1) unmap_hugepage_range, which wraps the call with tlb_gather_mmu and >>tlb_finish_mmu on the range. IIUC, this should cause an appropriate >>TLB flush. >> 2) __unmap_hugepage_range_final via unmap_single_vma. unmap_single_vma >> has three callers: >> - unmap_vmas which assumes the caller will flush the whole range after >> return. >> - zap_page_range wraps the call with tlb_gather_mmu/tlb_finish_mmu >> - zap_page_range_single wraps the call with tlb_gather_mmu/tlb_finish_mmu >> >> So, it appears we are covered. But, I could be missing something. > > My problem here is that the mapping that moved by huge_pmd_unshare() in > not accounted into mmu_gather and can be missed on tlb_finish_mmu(). Ah, I think I now see the issue you are concerned with. When huge_pmd_unshare succeeds we effectively unmap a PUD_SIZE area. The routine __unmap_hugepage_range may only have been passed a range that is a subset of PUD_SIZE. In the case I was trying to address, try_to_unmap_one() the 'range' will certainly be less than PUD_SIZE. Upon further thought, I think that even in the case of try_to_unmap_one we should flush PUD_SIZE range. My first thought would be to embed this flushing within huge_pmd_unshare itself. Perhaps, whenever huge_pmd_unshare succeeds we should do an explicit: flush_cache_range(PUD_SIZE) flush_tlb_range(PUD_SIZE) mmu_notifier_invalidate_range(PUD_SIZE) That would take some of the burden off the callers of huge_pmd_unshare. However, I am not sure if the flushing calls above play nice in all the calling environments. I'll look into it some more, but would appreciate additional comments. -- Mike Kravetz
Re: general protection fault in send_sigurg_to_task
Hi Bruce, On Tue, 14 Aug 2018 13:50:20 -0700 Dmitry Vyukov wrote: > > On Tue, Aug 14, 2018 at 12:11 PM, J. Bruce Fields > wrote: > > On Mon, Aug 13, 2018 at 06:33:02AM -0700, syzbot wrote: > >> syzbot has found a reproducer for the following crash on: > >> > >> HEAD commit:5ed5da74de9e Add linux-next specific files for 20180813 > >> git tree: linux-next > > > > I fetched linux-next but don't have 5ed5da74de9e. > > +Stephen for the disappeared linux-next commit. That is just the HEAD commit on linux-next that day. If you fetched linux-next after I released next-20180814 yesterday, then it would have a different HEAD commit. If you check out the tag next-20180813, you will get the above HEAD commit. > On the dashboard link you can see that it also happened on a more > recent commit 4e8b38549b50459a22573d756dd1f4e1963c2a8d that I do see > now in linux-next. Which is just the HEAD commit of next-20180814. Linux-next is rebuilt every day based on Linus' tree of the day, followed by merging all the other branches, so its HEAD is different every day. -- Cheers, Stephen Rothwell pgpaplQCyIbqo.pgp Description: OpenPGP digital signature
Re: general protection fault in send_sigurg_to_task
Hi Bruce, On Tue, 14 Aug 2018 13:50:20 -0700 Dmitry Vyukov wrote: > > On Tue, Aug 14, 2018 at 12:11 PM, J. Bruce Fields > wrote: > > On Mon, Aug 13, 2018 at 06:33:02AM -0700, syzbot wrote: > >> syzbot has found a reproducer for the following crash on: > >> > >> HEAD commit:5ed5da74de9e Add linux-next specific files for 20180813 > >> git tree: linux-next > > > > I fetched linux-next but don't have 5ed5da74de9e. > > +Stephen for the disappeared linux-next commit. That is just the HEAD commit on linux-next that day. If you fetched linux-next after I released next-20180814 yesterday, then it would have a different HEAD commit. If you check out the tag next-20180813, you will get the above HEAD commit. > On the dashboard link you can see that it also happened on a more > recent commit 4e8b38549b50459a22573d756dd1f4e1963c2a8d that I do see > now in linux-next. Which is just the HEAD commit of next-20180814. Linux-next is rebuilt every day based on Linus' tree of the day, followed by merging all the other branches, so its HEAD is different every day. -- Cheers, Stephen Rothwell pgpaplQCyIbqo.pgp Description: OpenPGP digital signature
[PATCH] cpufreq: governor: Protect cpufreq governor_data
If cppc_cpufreq.ko is deleted at the same time that tuned-adm is changing profiles, there is a small chance that a race can occur between cpufreq_dbs_governor_exit() and cpufreq_dbs_governor_limits() resulting in a system failure when the latter tries to use policy->governor_data that has been freed by the former. This patch uses gov_dbs_data_mutex to synchronize access. Signed-off-by: Henry Willard --- drivers/cpufreq/cpufreq_governor.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c index 1d50e97d49f1..43416ee55849 100644 --- a/drivers/cpufreq/cpufreq_governor.c +++ b/drivers/cpufreq/cpufreq_governor.c @@ -555,12 +555,20 @@ void cpufreq_dbs_governor_stop(struct cpufreq_policy *policy) void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy) { - struct policy_dbs_info *policy_dbs = policy->governor_data; + struct policy_dbs_info *policy_dbs; + + /* Protect gov->gdbs_data against cpufreq_dbs_governor_exit */ + mutex_lock(_dbs_data_mutex); + policy_dbs = policy->governor_data; + if (!policy_dbs) + goto out; mutex_lock(_dbs->update_mutex); cpufreq_policy_apply_limits(policy); gov_update_sample_delay(policy_dbs, 0); mutex_unlock(_dbs->update_mutex); +out: + mutex_unlock(_dbs_data_mutex); } EXPORT_SYMBOL_GPL(cpufreq_dbs_governor_limits); -- 1.8.3.1
[PATCH] cpufreq: governor: Protect cpufreq governor_data
If cppc_cpufreq.ko is deleted at the same time that tuned-adm is changing profiles, there is a small chance that a race can occur between cpufreq_dbs_governor_exit() and cpufreq_dbs_governor_limits() resulting in a system failure when the latter tries to use policy->governor_data that has been freed by the former. This patch uses gov_dbs_data_mutex to synchronize access. Signed-off-by: Henry Willard --- drivers/cpufreq/cpufreq_governor.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c index 1d50e97d49f1..43416ee55849 100644 --- a/drivers/cpufreq/cpufreq_governor.c +++ b/drivers/cpufreq/cpufreq_governor.c @@ -555,12 +555,20 @@ void cpufreq_dbs_governor_stop(struct cpufreq_policy *policy) void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy) { - struct policy_dbs_info *policy_dbs = policy->governor_data; + struct policy_dbs_info *policy_dbs; + + /* Protect gov->gdbs_data against cpufreq_dbs_governor_exit */ + mutex_lock(_dbs_data_mutex); + policy_dbs = policy->governor_data; + if (!policy_dbs) + goto out; mutex_lock(_dbs->update_mutex); cpufreq_policy_apply_limits(policy); gov_update_sample_delay(policy_dbs, 0); mutex_unlock(_dbs->update_mutex); +out: + mutex_unlock(_dbs_data_mutex); } EXPORT_SYMBOL_GPL(cpufreq_dbs_governor_limits); -- 1.8.3.1
Re: [PATCH 1/2] fpga: Document when fpga_blah_free functions should be used
On Thu, Jul 26, 2018 at 2:26 AM, Federico Vaga wrote: > Hi Alan, > > have you considered the possibility of having something like devm_fpga_[mgr| > bridge|region]_[create|free]() ? Like this, it will be obvious that 'struct > fpga_mgr' will be released automatically without reading any comment (but the > comment is still good), and you use devm_*_free() only to handle error > conditions. Hi Federico, Sorry for the late reply. I was waiting until I had this all implemented. As you've seen, I've posted the devm_fpga_mgr_create implementation [1]. I found I didn't need devm_*_free() for the error conditions, the managed device code handled cleanup nicely without it. Just to be clear, I'm dropping this patch in favor of devm support instead. Thanks again for your suggestions. Alan [1] https://lkml.org/lkml/2018/8/14/954 > > On Wednesday, July 25, 2018 8:15:13 PM CEST Alan Tull wrote: >> Clarify when fpga_(mgr|bridge|region)_free functions should be used. >> The class's dev_release will handle cleanup when the device is released >> so once the mgr/brige/region has been successfully registered, it >> would be a bug to call fpga_(mgr|bridge|region)_free. >> >> Signed-off-by: Alan Tull >> Suggested-by: Florian Fainelli >> Suggested-by: Federico Vaga >> --- >> drivers/fpga/fpga-bridge.c | 10 +- >> drivers/fpga/fpga-mgr.c| 10 +- >> drivers/fpga/fpga-region.c | 10 +- >> 3 files changed, 27 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/fpga/fpga-bridge.c b/drivers/fpga/fpga-bridge.c >> index 24b8f98..528d2149 100644 >> --- a/drivers/fpga/fpga-bridge.c >> +++ b/drivers/fpga/fpga-bridge.c >> @@ -379,7 +379,11 @@ EXPORT_SYMBOL_GPL(fpga_bridge_create); >> >> /** >> * fpga_bridge_free - free a fpga bridge and its id >> - * @bridge: FPGA bridge struct created by fpga_bridge_create >> + * @bridge: FPGA bridge struct created by fpga_bridge_create() >> + * >> + * Free a FPGA bridge. This function should only be called for >> + * freeing a bridge that has not been registered yet (such as in error >> + * paths in a probe function). >> */ >> void fpga_bridge_free(struct fpga_bridge *bridge) >> { >> @@ -414,6 +418,10 @@ EXPORT_SYMBOL_GPL(fpga_bridge_register); >> /** >> * fpga_bridge_unregister - unregister and free a fpga bridge >> * @bridge: FPGA bridge struct created by fpga_bridge_create >> + * >> + * Unregister the bridge device. The class's dev_release will handle >> + * freeing the bridge struct when the device is released so don't >> + * call fpga_bridge_free() after calling fpga_bridge_unregister(). >> */ >> void fpga_bridge_unregister(struct fpga_bridge *bridge) >> { >> diff --git a/drivers/fpga/fpga-mgr.c b/drivers/fpga/fpga-mgr.c >> index a41b07e..9632cbd 100644 >> --- a/drivers/fpga/fpga-mgr.c >> +++ b/drivers/fpga/fpga-mgr.c >> @@ -619,7 +619,11 @@ EXPORT_SYMBOL_GPL(fpga_mgr_create); >> >> /** >> * fpga_mgr_free - deallocate a FPGA manager >> - * @mgr: fpga manager struct created by fpga_mgr_create >> + * @mgr: fpga manager struct created by fpga_mgr_create() >> + * >> + * Free a FPGA manager struct. This function should only be called >> + * for freeing a manager that has not been registered yet (such as in >> + * error paths in a probe function). >> */ >> void fpga_mgr_free(struct fpga_manager *mgr) >> { >> @@ -663,6 +667,10 @@ EXPORT_SYMBOL_GPL(fpga_mgr_register); >> /** >> * fpga_mgr_unregister - unregister and free a FPGA manager >> * @mgr: fpga manager struct >> + * >> + * Unregister the manager device. The class's dev_release will handle >> + * freeing the manager struct when the device is released so don't >> + * call fpga_mgr_free() after calling fpga_mgr_unregister(). >> */ >> void fpga_mgr_unregister(struct fpga_manager *mgr) >> { >> diff --git a/drivers/fpga/fpga-region.c b/drivers/fpga/fpga-region.c >> index 0d65220..7335fa9 100644 >> --- a/drivers/fpga/fpga-region.c >> +++ b/drivers/fpga/fpga-region.c >> @@ -231,7 +231,11 @@ EXPORT_SYMBOL_GPL(fpga_region_create); >> >> /** >> * fpga_region_free - free a struct fpga_region >> - * @region: FPGA region created by fpga_region_create >> + * @region: FPGA region created by fpga_region_create() >> + * >> + * Free a FPGA region struct. This function should only be called for >> + * freeing a region that has not been registered yet (such as in error >> + * paths in a probe function). >> */ >> void fpga_region_free(struct fpga_region *region) >> { >> @@ -255,6 +259,10 @@ EXPORT_SYMBOL_GPL(fpga_region_register); >> /** >> * fpga_region_unregister - unregister and free a FPGA region >> * @region: FPGA region >> + * >> + * Unregister the region device. The class's dev_release will handle >> + * freeing the region so don't call fpga_region_free() after calling >> + * fpga_region_unregister(). >> */ >> void fpga_region_unregister(struct fpga_region *region) >> { > > > > > -- > To unsubscribe from this list: send the line "unsubscribe
Re: [PATCH 1/2] fpga: Document when fpga_blah_free functions should be used
On Thu, Jul 26, 2018 at 2:26 AM, Federico Vaga wrote: > Hi Alan, > > have you considered the possibility of having something like devm_fpga_[mgr| > bridge|region]_[create|free]() ? Like this, it will be obvious that 'struct > fpga_mgr' will be released automatically without reading any comment (but the > comment is still good), and you use devm_*_free() only to handle error > conditions. Hi Federico, Sorry for the late reply. I was waiting until I had this all implemented. As you've seen, I've posted the devm_fpga_mgr_create implementation [1]. I found I didn't need devm_*_free() for the error conditions, the managed device code handled cleanup nicely without it. Just to be clear, I'm dropping this patch in favor of devm support instead. Thanks again for your suggestions. Alan [1] https://lkml.org/lkml/2018/8/14/954 > > On Wednesday, July 25, 2018 8:15:13 PM CEST Alan Tull wrote: >> Clarify when fpga_(mgr|bridge|region)_free functions should be used. >> The class's dev_release will handle cleanup when the device is released >> so once the mgr/brige/region has been successfully registered, it >> would be a bug to call fpga_(mgr|bridge|region)_free. >> >> Signed-off-by: Alan Tull >> Suggested-by: Florian Fainelli >> Suggested-by: Federico Vaga >> --- >> drivers/fpga/fpga-bridge.c | 10 +- >> drivers/fpga/fpga-mgr.c| 10 +- >> drivers/fpga/fpga-region.c | 10 +- >> 3 files changed, 27 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/fpga/fpga-bridge.c b/drivers/fpga/fpga-bridge.c >> index 24b8f98..528d2149 100644 >> --- a/drivers/fpga/fpga-bridge.c >> +++ b/drivers/fpga/fpga-bridge.c >> @@ -379,7 +379,11 @@ EXPORT_SYMBOL_GPL(fpga_bridge_create); >> >> /** >> * fpga_bridge_free - free a fpga bridge and its id >> - * @bridge: FPGA bridge struct created by fpga_bridge_create >> + * @bridge: FPGA bridge struct created by fpga_bridge_create() >> + * >> + * Free a FPGA bridge. This function should only be called for >> + * freeing a bridge that has not been registered yet (such as in error >> + * paths in a probe function). >> */ >> void fpga_bridge_free(struct fpga_bridge *bridge) >> { >> @@ -414,6 +418,10 @@ EXPORT_SYMBOL_GPL(fpga_bridge_register); >> /** >> * fpga_bridge_unregister - unregister and free a fpga bridge >> * @bridge: FPGA bridge struct created by fpga_bridge_create >> + * >> + * Unregister the bridge device. The class's dev_release will handle >> + * freeing the bridge struct when the device is released so don't >> + * call fpga_bridge_free() after calling fpga_bridge_unregister(). >> */ >> void fpga_bridge_unregister(struct fpga_bridge *bridge) >> { >> diff --git a/drivers/fpga/fpga-mgr.c b/drivers/fpga/fpga-mgr.c >> index a41b07e..9632cbd 100644 >> --- a/drivers/fpga/fpga-mgr.c >> +++ b/drivers/fpga/fpga-mgr.c >> @@ -619,7 +619,11 @@ EXPORT_SYMBOL_GPL(fpga_mgr_create); >> >> /** >> * fpga_mgr_free - deallocate a FPGA manager >> - * @mgr: fpga manager struct created by fpga_mgr_create >> + * @mgr: fpga manager struct created by fpga_mgr_create() >> + * >> + * Free a FPGA manager struct. This function should only be called >> + * for freeing a manager that has not been registered yet (such as in >> + * error paths in a probe function). >> */ >> void fpga_mgr_free(struct fpga_manager *mgr) >> { >> @@ -663,6 +667,10 @@ EXPORT_SYMBOL_GPL(fpga_mgr_register); >> /** >> * fpga_mgr_unregister - unregister and free a FPGA manager >> * @mgr: fpga manager struct >> + * >> + * Unregister the manager device. The class's dev_release will handle >> + * freeing the manager struct when the device is released so don't >> + * call fpga_mgr_free() after calling fpga_mgr_unregister(). >> */ >> void fpga_mgr_unregister(struct fpga_manager *mgr) >> { >> diff --git a/drivers/fpga/fpga-region.c b/drivers/fpga/fpga-region.c >> index 0d65220..7335fa9 100644 >> --- a/drivers/fpga/fpga-region.c >> +++ b/drivers/fpga/fpga-region.c >> @@ -231,7 +231,11 @@ EXPORT_SYMBOL_GPL(fpga_region_create); >> >> /** >> * fpga_region_free - free a struct fpga_region >> - * @region: FPGA region created by fpga_region_create >> + * @region: FPGA region created by fpga_region_create() >> + * >> + * Free a FPGA region struct. This function should only be called for >> + * freeing a region that has not been registered yet (such as in error >> + * paths in a probe function). >> */ >> void fpga_region_free(struct fpga_region *region) >> { >> @@ -255,6 +259,10 @@ EXPORT_SYMBOL_GPL(fpga_region_register); >> /** >> * fpga_region_unregister - unregister and free a FPGA region >> * @region: FPGA region >> + * >> + * Unregister the region device. The class's dev_release will handle >> + * freeing the region so don't call fpga_region_free() after calling >> + * fpga_region_unregister(). >> */ >> void fpga_region_unregister(struct fpga_region *region) >> { > > > > > -- > To unsubscribe from this list: send the line "unsubscribe
Re: [PATCH 1/4] regulator: core: If consumers don't call regulator_set_load() assume max
Hi, On Tue, Aug 14, 2018 at 2:59 PM, David Collins wrote: > Hi, > > On 08/14/2018 01:03 PM, Doug Anderson wrote: >> On Tue, Aug 14, 2018 at 11:30 AM, David Collins >> wrote:>>> --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -732,6 +732,7 @@ static int drms_uA_update(struct regulator_dev *rdev) struct regulator *sibling; int current_uA = 0, output_uV, input_uV, err; unsigned int mode; + bool any_unset = false; lockdep_assert_held_once(>mutex); @@ -751,11 +752,17 @@ static int drms_uA_update(struct regulator_dev *rdev) return -EINVAL; /* calc total requested load */ - list_for_each_entry(sibling, >consumer_list, list) + list_for_each_entry(sibling, >consumer_list, list) { current_uA += sibling->uA_load; + if (!sibling->uA_load_set) + any_unset = true; + } current_uA += rdev->constraints->system_load; + if (any_unset) + current_uA = INT_MAX; + >>> >>> This check will incorrectly result in a constant load request of INT_MAX >>> for all regulators that have at least one child regulator. This is the >>> case because such child regulators are present in rdev->consumer_list and >>> because regulator_set_load() requests are not propagated up to parent >>> regulators. Thus, the regulator structs for child regulators will always >>> have uA_load==0 and uA_load_set==false. >> >> Ah, interesting. >> >> ...but doesn't this same bug exist anyway, just in the opposite >> direction? Without my patch we'll try to request a 0 mA load in this >> case which seems just as wrong (or perhaps worse?). I guess on RPMh >> regulator you're "saved" because the RPMh hardware itself knows the >> parent/child relationship and knows to ignore this 0 mA load, but it's >> still a bug in the overall Linux framework... > > I'm not sure what existing bug you are referring to here. Where is a 0 mA > load being requested? Could you list the consumer function calls and the > behavior on the framework side that you're envisioning? Imagine you've got a tree like this: - regulatorParent (1.8 V) - regulatorChildA (1.7 V) - regulatorChildB (1.2 V) - regulatorChildC (1.5 V) ...and this set of calls: regulator_set_load(regulatorChildA, 1000); regulator_set_load(regulatorChildB, 2000); regulator_set_load(regulatorChildC, 3000); regulator_enable(regulatorChildA); regulator_enable(regulatorChildB); regulator_enable(regulatorChildC); With the existing code in Mark's tree then ChildA / ChildB / ChildC will presumably have a load high enough to be in high power mode. However, as you said, loads aren't propagated. ...so "Parent" will see no load requested (which translates to a load request of 0 mA). The "bug" is that when you did the "regulator_enable(regulatorChildA);" then that will propagate up and cause a "regulator_enable(regulatorParent)". In _regulator_enable() you can see drms_uA_update(). That will cause it to set the mode of regulatorParent based on a load of 0 mA. That is the bug. You may respond "but REGULATOR_CHANGE_DRMS isn't set for the parent! See below and for now imagine that REGULATOR_CHANGE_DRMS _is_ set for the parent. With my code in the same situation the parent will end up with a load of "MAX_INT" mA which I think is the bug you pointed out. ...while it would be good to fix this it does seem to be better to set the parent to a mode based on MAX_INT mA rather than 0 mA? > The RPMh hardware never sees requests with units of mA (though its > predecessor RPM SMD did). Instead, mode requests sent to RPMh are raw > PMIC MODE_CTL register values. RPMh then applies max aggregation of these > register values across masters (APPS, modem, etc). Also note that the > RPMh knowledge of parent/child relationships is only utilized in enable > control and voltage headroom management. It does not apply to mode control. Yeah, I know RPMh only sees modes. I was sorta assuming that perhaps RPMh would use its parent/child relationship knowledge to say that if a child is in HPM then the parent should automatically go to HPM. ...but as I dug further I figured out why we're not running into this bug (at least with the regulator setup I have in my tree). All of the "parent" regulators are always configured such that "REGULATOR_CHANGE_DRMS" is not valid. Thus we never end up calling into drms_uA_update() and never update the mode. So tl;dr: your original comment was that my patch had problems whenever one regulator is the supply for another regulator if parent has REGULATOR_CHANGE_DRMS set. I believe I have shown that even without my patch we have problems in this exact same case. >> I have no idea how we ought to propagate regulator_set_load() to >> parents though. That seems like a tough thing to try to handle >> automagically. > > I attempted
Re: [PATCH 1/4] regulator: core: If consumers don't call regulator_set_load() assume max
Hi, On Tue, Aug 14, 2018 at 2:59 PM, David Collins wrote: > Hi, > > On 08/14/2018 01:03 PM, Doug Anderson wrote: >> On Tue, Aug 14, 2018 at 11:30 AM, David Collins >> wrote:>>> --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -732,6 +732,7 @@ static int drms_uA_update(struct regulator_dev *rdev) struct regulator *sibling; int current_uA = 0, output_uV, input_uV, err; unsigned int mode; + bool any_unset = false; lockdep_assert_held_once(>mutex); @@ -751,11 +752,17 @@ static int drms_uA_update(struct regulator_dev *rdev) return -EINVAL; /* calc total requested load */ - list_for_each_entry(sibling, >consumer_list, list) + list_for_each_entry(sibling, >consumer_list, list) { current_uA += sibling->uA_load; + if (!sibling->uA_load_set) + any_unset = true; + } current_uA += rdev->constraints->system_load; + if (any_unset) + current_uA = INT_MAX; + >>> >>> This check will incorrectly result in a constant load request of INT_MAX >>> for all regulators that have at least one child regulator. This is the >>> case because such child regulators are present in rdev->consumer_list and >>> because regulator_set_load() requests are not propagated up to parent >>> regulators. Thus, the regulator structs for child regulators will always >>> have uA_load==0 and uA_load_set==false. >> >> Ah, interesting. >> >> ...but doesn't this same bug exist anyway, just in the opposite >> direction? Without my patch we'll try to request a 0 mA load in this >> case which seems just as wrong (or perhaps worse?). I guess on RPMh >> regulator you're "saved" because the RPMh hardware itself knows the >> parent/child relationship and knows to ignore this 0 mA load, but it's >> still a bug in the overall Linux framework... > > I'm not sure what existing bug you are referring to here. Where is a 0 mA > load being requested? Could you list the consumer function calls and the > behavior on the framework side that you're envisioning? Imagine you've got a tree like this: - regulatorParent (1.8 V) - regulatorChildA (1.7 V) - regulatorChildB (1.2 V) - regulatorChildC (1.5 V) ...and this set of calls: regulator_set_load(regulatorChildA, 1000); regulator_set_load(regulatorChildB, 2000); regulator_set_load(regulatorChildC, 3000); regulator_enable(regulatorChildA); regulator_enable(regulatorChildB); regulator_enable(regulatorChildC); With the existing code in Mark's tree then ChildA / ChildB / ChildC will presumably have a load high enough to be in high power mode. However, as you said, loads aren't propagated. ...so "Parent" will see no load requested (which translates to a load request of 0 mA). The "bug" is that when you did the "regulator_enable(regulatorChildA);" then that will propagate up and cause a "regulator_enable(regulatorParent)". In _regulator_enable() you can see drms_uA_update(). That will cause it to set the mode of regulatorParent based on a load of 0 mA. That is the bug. You may respond "but REGULATOR_CHANGE_DRMS isn't set for the parent! See below and for now imagine that REGULATOR_CHANGE_DRMS _is_ set for the parent. With my code in the same situation the parent will end up with a load of "MAX_INT" mA which I think is the bug you pointed out. ...while it would be good to fix this it does seem to be better to set the parent to a mode based on MAX_INT mA rather than 0 mA? > The RPMh hardware never sees requests with units of mA (though its > predecessor RPM SMD did). Instead, mode requests sent to RPMh are raw > PMIC MODE_CTL register values. RPMh then applies max aggregation of these > register values across masters (APPS, modem, etc). Also note that the > RPMh knowledge of parent/child relationships is only utilized in enable > control and voltage headroom management. It does not apply to mode control. Yeah, I know RPMh only sees modes. I was sorta assuming that perhaps RPMh would use its parent/child relationship knowledge to say that if a child is in HPM then the parent should automatically go to HPM. ...but as I dug further I figured out why we're not running into this bug (at least with the regulator setup I have in my tree). All of the "parent" regulators are always configured such that "REGULATOR_CHANGE_DRMS" is not valid. Thus we never end up calling into drms_uA_update() and never update the mode. So tl;dr: your original comment was that my patch had problems whenever one regulator is the supply for another regulator if parent has REGULATOR_CHANGE_DRMS set. I believe I have shown that even without my patch we have problems in this exact same case. >> I have no idea how we ought to propagate regulator_set_load() to >> parents though. That seems like a tough thing to try to handle >> automagically. > > I attempted
Re: [PATCH] Bugfix for handling of shadow doorbell buffer.
On Tue, Aug 14, 2018 at 04:16:41PM -0700, Linus Torvalds wrote: > On Tue, Aug 14, 2018 at 3:56 PM Keith Busch > wrote: > > > > You just want to ensure the '*dbbuf_db = value' isn't reordered, right? > > The order dependency might be more obvious if done as: > > > > WRITE_ONCE(*dbbuf_db, value); > > > > if (!nvme_dbbuf_need_event(READ_ONCE(*dbbuf_ei), value, old_value)) > > return false; > > > > And 'volatile' is again redundant. > > Yes, using READ_ONCE/WRITE_ONCE obviates the need for volatile, but it > does *not* impose a memory ordering. > > It imposes an ordering on the compiler, but not on the CPU, so you > still want the "mb()" there I mistakenly recalled memory-barriers.txt mentioned order was enforced on the CPU, but that's true only for overlapping memory, which this is not. Thanks for the correction.
Re: [PATCH] Bugfix for handling of shadow doorbell buffer.
On Tue, Aug 14, 2018 at 04:16:41PM -0700, Linus Torvalds wrote: > On Tue, Aug 14, 2018 at 3:56 PM Keith Busch > wrote: > > > > You just want to ensure the '*dbbuf_db = value' isn't reordered, right? > > The order dependency might be more obvious if done as: > > > > WRITE_ONCE(*dbbuf_db, value); > > > > if (!nvme_dbbuf_need_event(READ_ONCE(*dbbuf_ei), value, old_value)) > > return false; > > > > And 'volatile' is again redundant. > > Yes, using READ_ONCE/WRITE_ONCE obviates the need for volatile, but it > does *not* impose a memory ordering. > > It imposes an ordering on the compiler, but not on the CPU, so you > still want the "mb()" there I mistakenly recalled memory-barriers.txt mentioned order was enforced on the CPU, but that's true only for overlapping memory, which this is not. Thanks for the correction.
Re: [PATCH] zram: fix bug storing backing_dev
On Mon, 13 Aug 2018 16:38:25 +0900 Sergey Senozhatsky wrote: > On (08/13/18 15:16), Minchan Kim wrote: > > > The call to strlcpy in backing_dev_store is incorrect. It should take > > > the size of the destination buffer instead of the size of the source > > > buffer. Additionally, ignore the newline character (\n) when reading > > > the new file_name buffer. This makes it possible to set the backing_dev > > > as follows: > > > > > > echo /dev/sdX > /sys/block/zram0/backing_dev > > > > > > Signed-off-by: Peter Kalauskas > > Acked-by: Minchan Kim > > > > Cc: Andrew Morton > > Cc: Sergey Senozhatsky > > CC: LKML > > Cc: [4.14+] > > Thanks for Cc-ing Minchan. > > Reviewed-by: Sergey Senozhatsky > > > > - strlcpy(file_name, buf, len); > > This is quite interesting. The reason it worked before was the fact that > strlcpy() copies 'len - 1' bytes, which is strlen(buf) - 1 in our case, > so it accidentally didn't copy the trailing new line symbol. Which also > means that "echo -n /dev/sdX" most likely was broken. > I can't find the original email on lkml for some reason, but I recreated the patch. The changelog doesn't describe the end-user impact of the bug, which is very desirable when tagging a patch for -stable backporting. Can we have that paragraph please? The implementation might be able to use strim() somehow.
Re: [PATCH] zram: fix bug storing backing_dev
On Mon, 13 Aug 2018 16:38:25 +0900 Sergey Senozhatsky wrote: > On (08/13/18 15:16), Minchan Kim wrote: > > > The call to strlcpy in backing_dev_store is incorrect. It should take > > > the size of the destination buffer instead of the size of the source > > > buffer. Additionally, ignore the newline character (\n) when reading > > > the new file_name buffer. This makes it possible to set the backing_dev > > > as follows: > > > > > > echo /dev/sdX > /sys/block/zram0/backing_dev > > > > > > Signed-off-by: Peter Kalauskas > > Acked-by: Minchan Kim > > > > Cc: Andrew Morton > > Cc: Sergey Senozhatsky > > CC: LKML > > Cc: [4.14+] > > Thanks for Cc-ing Minchan. > > Reviewed-by: Sergey Senozhatsky > > > > - strlcpy(file_name, buf, len); > > This is quite interesting. The reason it worked before was the fact that > strlcpy() copies 'len - 1' bytes, which is strlen(buf) - 1 in our case, > so it accidentally didn't copy the trailing new line symbol. Which also > means that "echo -n /dev/sdX" most likely was broken. > I can't find the original email on lkml for some reason, but I recreated the patch. The changelog doesn't describe the end-user impact of the bug, which is very desirable when tagging a patch for -stable backporting. Can we have that paragraph please? The implementation might be able to use strim() somehow.
[PATCH] gpio: brcmstb: allow 0 width GPIO banks
From: Justin Chen Sometimes we have empty banks within the GPIO block. This commit allows proper handling of 0 width GPIO banks. We handle 0 width GPIO banks by incrementing the bank and number of GPIOs, but not initializing them. This will mean a call into the non-existent GPIOs will return an error. Also remove banks and GPIO information from the dev_info print. This information is misleading since the incremented banks and gpio_base do not reflect the actual GPIOs that got initialized. We leave this information out since it is already printed with dev_dbg. Signed-off-by: Justin Chen --- drivers/gpio/gpio-brcmstb.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c index 16c7f9f..8658910 100644 --- a/drivers/gpio/gpio-brcmstb.c +++ b/drivers/gpio/gpio-brcmstb.c @@ -664,6 +664,18 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) struct brcmstb_gpio_bank *bank; struct gpio_chip *gc; + /* +* If bank_width is 0, then there is an empty bank in the +* register block. Special handling for this case. +*/ + if (bank_width == 0) { + dev_dbg(dev, "Width 0 found: Empty bank @ %d\n", + num_banks); + num_banks++; + gpio_base += MAX_GPIO_PER_BANK; + continue; + } + bank = devm_kzalloc(dev, sizeof(*bank), GFP_KERNEL); if (!bank) { err = -ENOMEM; @@ -740,8 +752,7 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) goto fail; } - dev_info(dev, "Registered %d banks (GPIO(s): %d-%d)\n", - num_banks, priv->gpio_base, gpio_base - 1); + dev_info(dev, "Brcmstb GPIO registered\n"); if (priv->parent_wake_irq && need_wakeup_event) pm_wakeup_event(dev, 0); -- 2.7.4
[PATCH] gpio: brcmstb: allow 0 width GPIO banks
From: Justin Chen Sometimes we have empty banks within the GPIO block. This commit allows proper handling of 0 width GPIO banks. We handle 0 width GPIO banks by incrementing the bank and number of GPIOs, but not initializing them. This will mean a call into the non-existent GPIOs will return an error. Also remove banks and GPIO information from the dev_info print. This information is misleading since the incremented banks and gpio_base do not reflect the actual GPIOs that got initialized. We leave this information out since it is already printed with dev_dbg. Signed-off-by: Justin Chen --- drivers/gpio/gpio-brcmstb.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c index 16c7f9f..8658910 100644 --- a/drivers/gpio/gpio-brcmstb.c +++ b/drivers/gpio/gpio-brcmstb.c @@ -664,6 +664,18 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) struct brcmstb_gpio_bank *bank; struct gpio_chip *gc; + /* +* If bank_width is 0, then there is an empty bank in the +* register block. Special handling for this case. +*/ + if (bank_width == 0) { + dev_dbg(dev, "Width 0 found: Empty bank @ %d\n", + num_banks); + num_banks++; + gpio_base += MAX_GPIO_PER_BANK; + continue; + } + bank = devm_kzalloc(dev, sizeof(*bank), GFP_KERNEL); if (!bank) { err = -ENOMEM; @@ -740,8 +752,7 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) goto fail; } - dev_info(dev, "Registered %d banks (GPIO(s): %d-%d)\n", - num_banks, priv->gpio_base, gpio_base - 1); + dev_info(dev, "Brcmstb GPIO registered\n"); if (priv->parent_wake_irq && need_wakeup_event) pm_wakeup_event(dev, 0); -- 2.7.4