Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Greg Kroah-Hartman
On Wed, Aug 15, 2018 at 02:36:00AM +0200, Sebastian Gottschall wrote:
> if SWAP is disabled in kernel config, the following compile error will raise
> up with this release
> 
> arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1):
> undefined reference to `generic_max_swapfile_size'
> 
> of course this is simple to fix. the function max_swapfile_size must be
> excluded if CONFIG_SWAP is disabled

Now fixed up, thanks.

greg k-h


Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Greg Kroah-Hartman
On Wed, Aug 15, 2018 at 02:36:00AM +0200, Sebastian Gottschall wrote:
> if SWAP is disabled in kernel config, the following compile error will raise
> up with this release
> 
> arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1):
> undefined reference to `generic_max_swapfile_size'
> 
> of course this is simple to fix. the function max_swapfile_size must be
> excluded if CONFIG_SWAP is disabled

Now fixed up, thanks.

greg k-h


Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.

2018-08-14 Thread Borislav Petkov
On Tue, Aug 14, 2018 at 08:14:47PM -0400, Michael Jin wrote:
> Therefore, I would like you to confirm that model 10h uses 0x15e8 (device
> F0) and 0x15ee (device F6) as I can not find any documentation or test
> whether ECC works.

He just asked you to make the change for models 0x10-0x2f. How much more
confirmation do you need?

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 


Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.

2018-08-14 Thread Borislav Petkov
On Tue, Aug 14, 2018 at 08:14:47PM -0400, Michael Jin wrote:
> Therefore, I would like you to confirm that model 10h uses 0x15e8 (device
> F0) and 0x15ee (device F6) as I can not find any documentation or test
> whether ECC works.

He just asked you to make the change for models 0x10-0x2f. How much more
confirmation do you need?

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 


Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Greg Kroah-Hartman
On Tue, Aug 14, 2018 at 11:58:18AM -0700, Nathan Chancellor wrote:
> On Tue, Aug 14, 2018 at 07:16:23PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.9.120 release.
> > There are 107 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu Aug 16 17:14:53 UTC 2018.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 
> > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz
> > or in the git tree and branch at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-4.9.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Merged, compiled with -Werror, and installed onto my OnePlus 6, which is
> my daily driver while my Pixel 2 XL is in for another screen RMA...
> 
> No initial issues noticed in dmesg or general usage.

Thanks for the testing and letting me know.

greg k-h


Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Greg Kroah-Hartman
On Tue, Aug 14, 2018 at 11:58:18AM -0700, Nathan Chancellor wrote:
> On Tue, Aug 14, 2018 at 07:16:23PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.9.120 release.
> > There are 107 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu Aug 16 17:14:53 UTC 2018.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 
> > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz
> > or in the git tree and branch at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-4.9.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Merged, compiled with -Werror, and installed onto my OnePlus 6, which is
> my daily driver while my Pixel 2 XL is in for another screen RMA...
> 
> No initial issues noticed in dmesg or general usage.

Thanks for the testing and letting me know.

greg k-h


Re: [PATCH 1/2] dt-bindings: clk: meson-g12a: Add G12A AO Clock Bindings

2018-08-14 Thread Jian Hu




On 2018/8/15 4:48, Rob Herring wrote:

On Fri, Aug 10, 2018 at 05:54:27PM +0800, Jian Hu wrote:

Add new clock controller compatible and dt-bingdings headers
for the Always-On domain of the g12a SoC

Signed-off-by: Jian Hu 
---
  .../bindings/clock/amlogic,gxbb-aoclkc.txt |  1 +
  include/dt-bindings/clock/g12a-aoclkc.h| 28 ++
  2 files changed, 29 insertions(+)
  create mode 100755 include/dt-bindings/clock/g12a-aoclkc.h


checkpatch says wrong mode.
Yes, I have checked the g12a-aoclk.h file,It is wrong mode. I will chmod 
a-x for it.Thank you for review.




diff --git a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt 
b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt
index 3a88052..6f02288 100644
--- a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt
+++ b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt
@@ -10,6 +10,7 @@ Required Properties:
- GXL (S905X, S905D) : "amlogic,meson-gxl-aoclkc"
- GXM (S912) : "amlogic,meson-gxm-aoclkc"
- AXG (A113D, A113X) : "amlogic,meson-axg-aoclkc"
+   - G12A (S905D2, S905X2) : "amlogic,g12a-aoclkc"
followed by the common "amlogic,meson-gx-aoclkc"
  
  - #clock-cells: should be 1.

diff --git a/include/dt-bindings/clock/g12a-aoclkc.h 
b/include/dt-bindings/clock/g12a-aoclkc.h
new file mode 100755
index 000..6b3f921
--- /dev/null
+++ b/include/dt-bindings/clock/g12a-aoclkc.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause) */
+/*
+ * Copyright (c) 2016 BayLibre, SAS
+ * Author: Neil Armstrong 
+ *
+ * Copyright (c) 2018 Amlogic, inc.
+ * Author: Jian Hu
+ */
+
+#ifndef DT_BINDINGS_CLOCK_AMLOGIC_MESON_G12A_AOCLK
+#define DT_BINDINGS_CLOCK_AMLOGIC_MESON_G12A_AOCLK
+
+#define CLKID_AO_AHB_BUS   0
+#define CLKID_AO_REMOTE1
+#define CLKID_AO_I2C_MASTER2
+#define CLKID_AO_I2C_SLAVE 3
+#define CLKID_AO_UART1 4
+#define CLKID_AO_PROD_I2C  5
+#define CLKID_AO_UART2 6
+#define CLKID_AO_IR_BLASTER7
+#define CLKID_AO_SAR_ADC   8
+#define CLKID_AO_CLK81 9
+#define CLKID_AO_SAR_ADC_SEL   10
+#define CLKID_AO_SAR_ADC_DIV   11
+#define CLKID_AO_SAR_ADC_CLK   12
+#define CLKID_AO_ALT_XTAL  13
+
+#endif
--
1.9.1



.



Re: [PATCH 1/2] dt-bindings: clk: meson-g12a: Add G12A AO Clock Bindings

2018-08-14 Thread Jian Hu




On 2018/8/15 4:48, Rob Herring wrote:

On Fri, Aug 10, 2018 at 05:54:27PM +0800, Jian Hu wrote:

Add new clock controller compatible and dt-bingdings headers
for the Always-On domain of the g12a SoC

Signed-off-by: Jian Hu 
---
  .../bindings/clock/amlogic,gxbb-aoclkc.txt |  1 +
  include/dt-bindings/clock/g12a-aoclkc.h| 28 ++
  2 files changed, 29 insertions(+)
  create mode 100755 include/dt-bindings/clock/g12a-aoclkc.h


checkpatch says wrong mode.
Yes, I have checked the g12a-aoclk.h file,It is wrong mode. I will chmod 
a-x for it.Thank you for review.




diff --git a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt 
b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt
index 3a88052..6f02288 100644
--- a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt
+++ b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt
@@ -10,6 +10,7 @@ Required Properties:
- GXL (S905X, S905D) : "amlogic,meson-gxl-aoclkc"
- GXM (S912) : "amlogic,meson-gxm-aoclkc"
- AXG (A113D, A113X) : "amlogic,meson-axg-aoclkc"
+   - G12A (S905D2, S905X2) : "amlogic,g12a-aoclkc"
followed by the common "amlogic,meson-gx-aoclkc"
  
  - #clock-cells: should be 1.

diff --git a/include/dt-bindings/clock/g12a-aoclkc.h 
b/include/dt-bindings/clock/g12a-aoclkc.h
new file mode 100755
index 000..6b3f921
--- /dev/null
+++ b/include/dt-bindings/clock/g12a-aoclkc.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause) */
+/*
+ * Copyright (c) 2016 BayLibre, SAS
+ * Author: Neil Armstrong 
+ *
+ * Copyright (c) 2018 Amlogic, inc.
+ * Author: Jian Hu
+ */
+
+#ifndef DT_BINDINGS_CLOCK_AMLOGIC_MESON_G12A_AOCLK
+#define DT_BINDINGS_CLOCK_AMLOGIC_MESON_G12A_AOCLK
+
+#define CLKID_AO_AHB_BUS   0
+#define CLKID_AO_REMOTE1
+#define CLKID_AO_I2C_MASTER2
+#define CLKID_AO_I2C_SLAVE 3
+#define CLKID_AO_UART1 4
+#define CLKID_AO_PROD_I2C  5
+#define CLKID_AO_UART2 6
+#define CLKID_AO_IR_BLASTER7
+#define CLKID_AO_SAR_ADC   8
+#define CLKID_AO_CLK81 9
+#define CLKID_AO_SAR_ADC_SEL   10
+#define CLKID_AO_SAR_ADC_DIV   11
+#define CLKID_AO_SAR_ADC_CLK   12
+#define CLKID_AO_ALT_XTAL  13
+
+#endif
--
1.9.1



.



[PATCH 3/5] arm64: dts: mt7622: fix ram size for rfb1

2018-08-14 Thread Ryder Lee
Fix ram size and sort nodes in alphabetical order.

Signed-off-by: Ryder Lee 
---
 arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts | 196 +--
 1 file changed, 98 insertions(+), 98 deletions(-)

diff --git a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts 
b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
index b783764..033d7d1 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
+++ b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
@@ -51,7 +51,7 @@
};
 
memory {
-   reg = <0 0x4000 0 0x3F00>;
+   reg = <0 0x4000 0 0x2000>;
};
 
reg_1p8v: regulator-1p8v {
@@ -81,6 +81,103 @@
};
 };
 
+ {
+   status = "disabled";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+
+   gmac1: mac@1 {
+   compatible = "mediatek,eth-mac";
+   reg = <1>;
+   phy-handle = <>;
+   };
+
+   mdio-bus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   phy5: ethernet-phy@5 {
+   reg = <5>;
+   phy-mode = "sgmii";
+   };
+   };
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default", "state_uhs";
+   pinctrl-0 = <_pins_default>;
+   pinctrl-1 = <_pins_uhs>;
+   status = "okay";
+   bus-width = <8>;
+   max-frequency = <5000>;
+   cap-mmc-highspeed;
+   mmc-hs200-1_8v;
+   vmmc-supply = <_3p3v>;
+   vqmmc-supply = <_1p8v>;
+   assigned-clocks = < CLK_TOP_MSDC30_0_SEL>;
+   assigned-clock-parents = < CLK_TOP_UNIV48M>;
+   non-removable;
+};
+
+ {
+   pinctrl-names = "default", "state_uhs";
+   pinctrl-0 = <_pins_default>;
+   pinctrl-1 = <_pins_uhs>;
+   status = "okay";
+   bus-width = <4>;
+   max-frequency = <5000>;
+   cap-sd-highspeed;
+   r_smpl = <1>;
+   cd-gpios = < 81 GPIO_ACTIVE_LOW>;
+   vmmc-supply = <_3p3v>;
+   vqmmc-supply = <_3p3v>;
+   assigned-clocks = < CLK_TOP_MSDC30_1_SEL>;
+   assigned-clock-parents = < CLK_TOP_UNIV48M>;
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_nand_pins>;
+   status = "disabled";
+};
+
+_flash {
+   pinctrl-names = "default";
+   pinctrl-0 = <_nor_pins>;
+   status = "disabled";
+
+   flash@0 {
+   compatible = "jedec,spi-nor";
+   reg = <0>;
+   };
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_pins>;
@@ -344,103 +441,6 @@
};
 };
 
- {
-   status = "disabled";
-};
-
- {
-   status = "okay";
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>;
-   status = "okay";
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>;
-   status = "okay";
-
-   gmac1: mac@1 {
-   compatible = "mediatek,eth-mac";
-   reg = <1>;
-   phy-handle = <>;
-   };
-
-   mdio-bus {
-   #address-cells = <1>;
-   #size-cells = <0>;
-
-   phy5: ethernet-phy@5 {
-   reg = <5>;
-   phy-mode = "sgmii";
-   };
-   };
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>;
-   status = "okay";
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>;
-   status = "okay";
-};
-
- {
-   pinctrl-names = "default", "state_uhs";
-   pinctrl-0 = <_pins_default>;
-   pinctrl-1 = <_pins_uhs>;
-   status = "okay";
-   bus-width = <8>;
-   max-frequency = <5000>;
-   cap-mmc-highspeed;
-   mmc-hs200-1_8v;
-   vmmc-supply = <_3p3v>;
-   vqmmc-supply = <_1p8v>;
-   assigned-clocks = < CLK_TOP_MSDC30_0_SEL>;
-   assigned-clock-parents = < CLK_TOP_UNIV48M>;
-   non-removable;
-};
-
- {
-   pinctrl-names = "default", "state_uhs";
-   pinctrl-0 = <_pins_default>;
-   pinctrl-1 = <_pins_uhs>;
-   status = "okay";
-   bus-width = <4>;
-   max-frequency = <5000>;
-   cap-sd-highspeed;
-   r_smpl = <1>;
-   cd-gpios = < 81 GPIO_ACTIVE_LOW>;
-   vmmc-supply = <_3p3v>;
-   vqmmc-supply = <_3p3v>;
-   assigned-clocks = < CLK_TOP_MSDC30_1_SEL>;
-   assigned-clock-parents = < CLK_TOP_UNIV48M>;
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_nand_pins>;
-   status = "disabled";
-};
-
-_flash {
-   pinctrl-names = "default";
-   pinctrl-0 = <_nor_pins>;
-   status = "disabled";
-
-   flash@0 {
-   compatible = "jedec,spi-nor";
-   reg = <0>;
- 

[PATCH 3/5] arm64: dts: mt7622: fix ram size for rfb1

2018-08-14 Thread Ryder Lee
Fix ram size and sort nodes in alphabetical order.

Signed-off-by: Ryder Lee 
---
 arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts | 196 +--
 1 file changed, 98 insertions(+), 98 deletions(-)

diff --git a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts 
b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
index b783764..033d7d1 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
+++ b/arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts
@@ -51,7 +51,7 @@
};
 
memory {
-   reg = <0 0x4000 0 0x3F00>;
+   reg = <0 0x4000 0 0x2000>;
};
 
reg_1p8v: regulator-1p8v {
@@ -81,6 +81,103 @@
};
 };
 
+ {
+   status = "disabled";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+
+   gmac1: mac@1 {
+   compatible = "mediatek,eth-mac";
+   reg = <1>;
+   phy-handle = <>;
+   };
+
+   mdio-bus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   phy5: ethernet-phy@5 {
+   reg = <5>;
+   phy-mode = "sgmii";
+   };
+   };
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default", "state_uhs";
+   pinctrl-0 = <_pins_default>;
+   pinctrl-1 = <_pins_uhs>;
+   status = "okay";
+   bus-width = <8>;
+   max-frequency = <5000>;
+   cap-mmc-highspeed;
+   mmc-hs200-1_8v;
+   vmmc-supply = <_3p3v>;
+   vqmmc-supply = <_1p8v>;
+   assigned-clocks = < CLK_TOP_MSDC30_0_SEL>;
+   assigned-clock-parents = < CLK_TOP_UNIV48M>;
+   non-removable;
+};
+
+ {
+   pinctrl-names = "default", "state_uhs";
+   pinctrl-0 = <_pins_default>;
+   pinctrl-1 = <_pins_uhs>;
+   status = "okay";
+   bus-width = <4>;
+   max-frequency = <5000>;
+   cap-sd-highspeed;
+   r_smpl = <1>;
+   cd-gpios = < 81 GPIO_ACTIVE_LOW>;
+   vmmc-supply = <_3p3v>;
+   vqmmc-supply = <_3p3v>;
+   assigned-clocks = < CLK_TOP_MSDC30_1_SEL>;
+   assigned-clock-parents = < CLK_TOP_UNIV48M>;
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_nand_pins>;
+   status = "disabled";
+};
+
+_flash {
+   pinctrl-names = "default";
+   pinctrl-0 = <_nor_pins>;
+   status = "disabled";
+
+   flash@0 {
+   compatible = "jedec,spi-nor";
+   reg = <0>;
+   };
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_pins>;
@@ -344,103 +441,6 @@
};
 };
 
- {
-   status = "disabled";
-};
-
- {
-   status = "okay";
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>;
-   status = "okay";
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>;
-   status = "okay";
-
-   gmac1: mac@1 {
-   compatible = "mediatek,eth-mac";
-   reg = <1>;
-   phy-handle = <>;
-   };
-
-   mdio-bus {
-   #address-cells = <1>;
-   #size-cells = <0>;
-
-   phy5: ethernet-phy@5 {
-   reg = <5>;
-   phy-mode = "sgmii";
-   };
-   };
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>;
-   status = "okay";
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_pins>;
-   status = "okay";
-};
-
- {
-   pinctrl-names = "default", "state_uhs";
-   pinctrl-0 = <_pins_default>;
-   pinctrl-1 = <_pins_uhs>;
-   status = "okay";
-   bus-width = <8>;
-   max-frequency = <5000>;
-   cap-mmc-highspeed;
-   mmc-hs200-1_8v;
-   vmmc-supply = <_3p3v>;
-   vqmmc-supply = <_1p8v>;
-   assigned-clocks = < CLK_TOP_MSDC30_0_SEL>;
-   assigned-clock-parents = < CLK_TOP_UNIV48M>;
-   non-removable;
-};
-
- {
-   pinctrl-names = "default", "state_uhs";
-   pinctrl-0 = <_pins_default>;
-   pinctrl-1 = <_pins_uhs>;
-   status = "okay";
-   bus-width = <4>;
-   max-frequency = <5000>;
-   cap-sd-highspeed;
-   r_smpl = <1>;
-   cd-gpios = < 81 GPIO_ACTIVE_LOW>;
-   vmmc-supply = <_3p3v>;
-   vqmmc-supply = <_3p3v>;
-   assigned-clocks = < CLK_TOP_MSDC30_1_SEL>;
-   assigned-clock-parents = < CLK_TOP_UNIV48M>;
-};
-
- {
-   pinctrl-names = "default";
-   pinctrl-0 = <_nand_pins>;
-   status = "disabled";
-};
-
-_flash {
-   pinctrl-names = "default";
-   pinctrl-0 = <_nor_pins>;
-   status = "disabled";
-
-   flash@0 {
-   compatible = "jedec,spi-nor";
-   reg = <0>;
- 

[PATCH 1/5] arm64: dts: mt7622: add some misc device nodes

2018-08-14 Thread Ryder Lee
Add timer and CCI-400 device nodes for MT7622.

Signed-off-by: Ryder Lee 
---
 arch/arm64/boot/dts/mediatek/mt7622.dtsi | 48 
 1 file changed, 48 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi 
b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
index 9213c96..b235df7 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
@@ -79,6 +79,7 @@
#cooling-cells = <2>;
enable-method = "psci";
clock-frequency = <13>;
+   cci-control-port = <_control2>;
};
 
cpu1: cpu@1 {
@@ -91,6 +92,7 @@
operating-points-v2 = <_opp_table>;
enable-method = "psci";
clock-frequency = <13>;
+   cci-control-port = <_control2>;
};
};
 
@@ -217,6 +219,16 @@
#reset-cells = <1>;
};
 
+   timer: timer@10004000 {
+   compatible = "mediatek,mt7622-timer",
+"mediatek,mt6577-timer";
+   reg = <0 0x10004000 0 0x80>;
+   interrupts = ;
+   clocks = < CLK_INFRA_APXGPT_PD>,
+< CLK_TOP_RTC>;
+   clock-names = "system-clk", "rtc-clk";
+   };
+
scpsys: scpsys@10006000 {
compatible = "mediatek,mt7622-scpsys",
 "syscon";
@@ -317,6 +329,42 @@
  <0 0x1036 0 0x2000>;
};
 
+   cci: cci@1039 {
+   compatible = "arm,cci-400";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   reg = <0 0x1039 0 0x1000>;
+   ranges = <0 0 0x1039 0x1>;
+
+   cci_control0: slave-if@1000 {
+   compatible = "arm,cci-400-ctrl-if";
+   interface-type = "ace-lite";
+   reg = <0x1000 0x1000>;
+   };
+
+   cci_control1: slave-if@4000 {
+   compatible = "arm,cci-400-ctrl-if";
+   interface-type = "ace";
+   reg = <0x4000 0x1000>;
+   };
+
+   cci_control2: slave-if@5000 {
+   compatible = "arm,cci-400-ctrl-if";
+   interface-type = "ace";
+   reg = <0x5000 0x1000>;
+   };
+
+   pmu@9000 {
+   compatible = "arm,cci-400-pmu,r1";
+   reg = <0x9000 0x5000>;
+   interrupts = ,
+,
+,
+,
+;
+   };
+   };
+
auxadc: adc@11001000 {
compatible = "mediatek,mt7622-auxadc";
reg = <0 0x11001000 0 0x1000>;
-- 
1.9.1



[PATCH 1/5] arm64: dts: mt7622: add some misc device nodes

2018-08-14 Thread Ryder Lee
Add timer and CCI-400 device nodes for MT7622.

Signed-off-by: Ryder Lee 
---
 arch/arm64/boot/dts/mediatek/mt7622.dtsi | 48 
 1 file changed, 48 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi 
b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
index 9213c96..b235df7 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
@@ -79,6 +79,7 @@
#cooling-cells = <2>;
enable-method = "psci";
clock-frequency = <13>;
+   cci-control-port = <_control2>;
};
 
cpu1: cpu@1 {
@@ -91,6 +92,7 @@
operating-points-v2 = <_opp_table>;
enable-method = "psci";
clock-frequency = <13>;
+   cci-control-port = <_control2>;
};
};
 
@@ -217,6 +219,16 @@
#reset-cells = <1>;
};
 
+   timer: timer@10004000 {
+   compatible = "mediatek,mt7622-timer",
+"mediatek,mt6577-timer";
+   reg = <0 0x10004000 0 0x80>;
+   interrupts = ;
+   clocks = < CLK_INFRA_APXGPT_PD>,
+< CLK_TOP_RTC>;
+   clock-names = "system-clk", "rtc-clk";
+   };
+
scpsys: scpsys@10006000 {
compatible = "mediatek,mt7622-scpsys",
 "syscon";
@@ -317,6 +329,42 @@
  <0 0x1036 0 0x2000>;
};
 
+   cci: cci@1039 {
+   compatible = "arm,cci-400";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   reg = <0 0x1039 0 0x1000>;
+   ranges = <0 0 0x1039 0x1>;
+
+   cci_control0: slave-if@1000 {
+   compatible = "arm,cci-400-ctrl-if";
+   interface-type = "ace-lite";
+   reg = <0x1000 0x1000>;
+   };
+
+   cci_control1: slave-if@4000 {
+   compatible = "arm,cci-400-ctrl-if";
+   interface-type = "ace";
+   reg = <0x4000 0x1000>;
+   };
+
+   cci_control2: slave-if@5000 {
+   compatible = "arm,cci-400-ctrl-if";
+   interface-type = "ace";
+   reg = <0x5000 0x1000>;
+   };
+
+   pmu@9000 {
+   compatible = "arm,cci-400-pmu,r1";
+   reg = <0x9000 0x5000>;
+   interrupts = ,
+,
+,
+,
+;
+   };
+   };
+
auxadc: adc@11001000 {
compatible = "mediatek,mt7622-auxadc";
reg = <0 0x11001000 0 0x1000>;
-- 
1.9.1



[PATCH 4/5] arm64: dts: mt7622: add bananapi BPI-R64 board

2018-08-14 Thread Ryder Lee
Add support for the bananapi R64 (BPI-R64) development board from
BIPAI KEJI. Detailed hardware information for BPI-R64 which could be
found on http://wiki.banana-pi.org/Banana_Pi_BPI-R64

Signed-off-by: Ryder Lee 
---
 arch/arm64/boot/dts/mediatek/Makefile  |   1 +
 .../boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts  | 508 +
 2 files changed, 509 insertions(+)
 create mode 100644 arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts

diff --git a/arch/arm64/boot/dts/mediatek/Makefile 
b/arch/arm64/boot/dts/mediatek/Makefile
index ac17f60..2a1abe5 100644
--- a/arch/arm64/boot/dts/mediatek/Makefile
+++ b/arch/arm64/boot/dts/mediatek/Makefile
@@ -3,5 +3,6 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt2712-evb.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt6755-evb.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt6795-evb.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt6797-evb.dtb
+dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-bananapi-bpi-r64.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-rfb1.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt8173-evb.dtb
diff --git a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts 
b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
new file mode 100644
index 000..bc504fe
--- /dev/null
+++ b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
@@ -0,0 +1,508 @@
+/*
+ * Copyright (c) 2018 MediaTek Inc.
+ * Author: Ryder Lee 
+ *
+ * SPDX-License-Identifier: (GPL-2.0 OR MIT)
+ */
+
+/dts-v1/;
+#include 
+#include 
+
+#include "mt7622.dtsi"
+#include "mt6380.dtsi"
+
+/ {
+   model = "Bananapi BPI-R64";
+   compatible = "bananapi,bpi-r64", "mediatek,mt7622";
+
+   chosen {
+   bootargs = "earlycon=uart8250,mmio32,0x11002000 
console=ttyS0,115200n1 swiotlb=512";
+   };
+
+   cpus {
+   cpu@0 {
+   proc-supply = <_vcpu_reg>;
+   sram-supply = <_vm_reg>;
+   };
+
+   cpu@1 {
+   proc-supply = <_vcpu_reg>;
+   sram-supply = <_vm_reg>;
+   };
+   };
+
+   gpio-keys {
+   compatible = "gpio-keys";
+   poll-interval = <100>;
+
+   factory {
+   label = "factory";
+   linux,code = ;
+   gpios = < 0 0>;
+   };
+
+   wps {
+   label = "wps";
+   linux,code = ;
+   gpios = < 102 0>;
+   };
+   };
+
+   memory {
+   reg = <0 0x4000 0 0x4000>;
+   };
+
+   reg_1p8v: regulator-1p8v {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-1.8V";
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <180>;
+   regulator-always-on;
+   };
+
+   reg_3p3v: regulator-3p3v {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-3.3V";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+
+   reg_5v: regulator-5v {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-5V";
+   regulator-min-microvolt = <500>;
+   regulator-max-microvolt = <500>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+};
+
+ {
+   status = "disabled";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+
+   gmac1: mac@1 {
+   compatible = "mediatek,eth-mac";
+   reg = <1>;
+   phy-handle = <>;
+   };
+
+   mdio-bus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   phy5: ethernet-phy@5 {
+   reg = <5>;
+   phy-mode = "sgmii";
+   };
+   };
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default", "state_uhs";
+   pinctrl-0 = <_pins_default>;
+   pinctrl-1 = <_pins_uhs>;
+   status = "okay";
+   bus-width = <8>;
+   max-frequency = <5000>;
+   cap-mmc-highspeed;
+   mmc-hs200-1_8v;
+   vmmc-supply = <_3p3v>;
+   vqmmc-supply = <_1p8v>;
+   assigned-clocks = < CLK_TOP_MSDC30_0_SEL>;
+   assigned-clock-parents = < CLK_TOP_UNIV48M>;
+   non-removable;
+};
+
+ {
+   pinctrl-names = "default", "state_uhs";
+   pinctrl-0 = <_pins_default>;
+   pinctrl-1 = <_pins_uhs>;
+   status = "okay";
+   bus-width 

[PATCH 2/5] arm64: dts: mt7622: add a bluetooth 5 device node

2018-08-14 Thread Ryder Lee
Add a built-in bluetooth 5 support for MT7622.

Signed-off-by: Ryder Lee 
---
 arch/arm64/boot/dts/mediatek/mt7622.dtsi | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi 
b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
index b235df7..f8c3495 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
@@ -515,6 +515,13 @@
reg-shift = <2>;
reg-io-width = <4>;
status = "disabled";
+
+   bluetooth {
+   compatible = "mediatek,mt7622-bluetooth";
+   power-domains = < MT7622_POWER_DOMAIN_WB>;
+   clocks = <>;
+   clock-names = "ref";
+   };
};
 
nandc: nfi@1100d000 {
-- 
1.9.1



[PATCH 5/5] dt-bindings: arm: mediatek: add support for bananapi BPI-R64 board

2018-08-14 Thread Ryder Lee
Update binding document for bananapi BPI-R64 board being supported.

Signed-off-by: Ryder Lee 
---
 Documentation/devicetree/bindings/arm/mediatek.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/mediatek.txt 
b/Documentation/devicetree/bindings/arm/mediatek.txt
index 7d21ab3..677fdcf 100644
--- a/Documentation/devicetree/bindings/arm/mediatek.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek.txt
@@ -50,6 +50,9 @@ Supported boards:
 - Reference board variant 1 for MT7622:
 Required root node properties:
   - compatible = "mediatek,mt7622-rfb1", "mediatek,mt7622";
+- Bananapi BPI-R64 for MT7622:
+Required root node properties:
+  - compatible = "bananapi,bpi-r64", "mediatek,mt7622";
 - Reference board for MT7623a with eMMC:
 Required root node properties:
   - compatible = "mediatek,mt7623a-rfb-emmc", "mediatek,mt7623";
-- 
1.9.1



[PATCH 5/5] dt-bindings: arm: mediatek: add support for bananapi BPI-R64 board

2018-08-14 Thread Ryder Lee
Update binding document for bananapi BPI-R64 board being supported.

Signed-off-by: Ryder Lee 
---
 Documentation/devicetree/bindings/arm/mediatek.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/mediatek.txt 
b/Documentation/devicetree/bindings/arm/mediatek.txt
index 7d21ab3..677fdcf 100644
--- a/Documentation/devicetree/bindings/arm/mediatek.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek.txt
@@ -50,6 +50,9 @@ Supported boards:
 - Reference board variant 1 for MT7622:
 Required root node properties:
   - compatible = "mediatek,mt7622-rfb1", "mediatek,mt7622";
+- Bananapi BPI-R64 for MT7622:
+Required root node properties:
+  - compatible = "bananapi,bpi-r64", "mediatek,mt7622";
 - Reference board for MT7623a with eMMC:
 Required root node properties:
   - compatible = "mediatek,mt7623a-rfb-emmc", "mediatek,mt7623";
-- 
1.9.1



[PATCH 4/5] arm64: dts: mt7622: add bananapi BPI-R64 board

2018-08-14 Thread Ryder Lee
Add support for the bananapi R64 (BPI-R64) development board from
BIPAI KEJI. Detailed hardware information for BPI-R64 which could be
found on http://wiki.banana-pi.org/Banana_Pi_BPI-R64

Signed-off-by: Ryder Lee 
---
 arch/arm64/boot/dts/mediatek/Makefile  |   1 +
 .../boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts  | 508 +
 2 files changed, 509 insertions(+)
 create mode 100644 arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts

diff --git a/arch/arm64/boot/dts/mediatek/Makefile 
b/arch/arm64/boot/dts/mediatek/Makefile
index ac17f60..2a1abe5 100644
--- a/arch/arm64/boot/dts/mediatek/Makefile
+++ b/arch/arm64/boot/dts/mediatek/Makefile
@@ -3,5 +3,6 @@ dtb-$(CONFIG_ARCH_MEDIATEK) += mt2712-evb.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt6755-evb.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt6795-evb.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt6797-evb.dtb
+dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-bananapi-bpi-r64.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt7622-rfb1.dtb
 dtb-$(CONFIG_ARCH_MEDIATEK) += mt8173-evb.dtb
diff --git a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts 
b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
new file mode 100644
index 000..bc504fe
--- /dev/null
+++ b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
@@ -0,0 +1,508 @@
+/*
+ * Copyright (c) 2018 MediaTek Inc.
+ * Author: Ryder Lee 
+ *
+ * SPDX-License-Identifier: (GPL-2.0 OR MIT)
+ */
+
+/dts-v1/;
+#include 
+#include 
+
+#include "mt7622.dtsi"
+#include "mt6380.dtsi"
+
+/ {
+   model = "Bananapi BPI-R64";
+   compatible = "bananapi,bpi-r64", "mediatek,mt7622";
+
+   chosen {
+   bootargs = "earlycon=uart8250,mmio32,0x11002000 
console=ttyS0,115200n1 swiotlb=512";
+   };
+
+   cpus {
+   cpu@0 {
+   proc-supply = <_vcpu_reg>;
+   sram-supply = <_vm_reg>;
+   };
+
+   cpu@1 {
+   proc-supply = <_vcpu_reg>;
+   sram-supply = <_vm_reg>;
+   };
+   };
+
+   gpio-keys {
+   compatible = "gpio-keys";
+   poll-interval = <100>;
+
+   factory {
+   label = "factory";
+   linux,code = ;
+   gpios = < 0 0>;
+   };
+
+   wps {
+   label = "wps";
+   linux,code = ;
+   gpios = < 102 0>;
+   };
+   };
+
+   memory {
+   reg = <0 0x4000 0 0x4000>;
+   };
+
+   reg_1p8v: regulator-1p8v {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-1.8V";
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <180>;
+   regulator-always-on;
+   };
+
+   reg_3p3v: regulator-3p3v {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-3.3V";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+
+   reg_5v: regulator-5v {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-5V";
+   regulator-min-microvolt = <500>;
+   regulator-max-microvolt = <500>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+};
+
+ {
+   status = "disabled";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+
+   gmac1: mac@1 {
+   compatible = "mediatek,eth-mac";
+   reg = <1>;
+   phy-handle = <>;
+   };
+
+   mdio-bus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   phy5: ethernet-phy@5 {
+   reg = <5>;
+   phy-mode = "sgmii";
+   };
+   };
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default", "state_uhs";
+   pinctrl-0 = <_pins_default>;
+   pinctrl-1 = <_pins_uhs>;
+   status = "okay";
+   bus-width = <8>;
+   max-frequency = <5000>;
+   cap-mmc-highspeed;
+   mmc-hs200-1_8v;
+   vmmc-supply = <_3p3v>;
+   vqmmc-supply = <_1p8v>;
+   assigned-clocks = < CLK_TOP_MSDC30_0_SEL>;
+   assigned-clock-parents = < CLK_TOP_UNIV48M>;
+   non-removable;
+};
+
+ {
+   pinctrl-names = "default", "state_uhs";
+   pinctrl-0 = <_pins_default>;
+   pinctrl-1 = <_pins_uhs>;
+   status = "okay";
+   bus-width 

[PATCH 2/5] arm64: dts: mt7622: add a bluetooth 5 device node

2018-08-14 Thread Ryder Lee
Add a built-in bluetooth 5 support for MT7622.

Signed-off-by: Ryder Lee 
---
 arch/arm64/boot/dts/mediatek/mt7622.dtsi | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi 
b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
index b235df7..f8c3495 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
@@ -515,6 +515,13 @@
reg-shift = <2>;
reg-io-width = <4>;
status = "disabled";
+
+   bluetooth {
+   compatible = "mediatek,mt7622-bluetooth";
+   power-domains = < MT7622_POWER_DOMAIN_WB>;
+   clocks = <>;
+   clock-names = "ref";
+   };
};
 
nandc: nfi@1100d000 {
-- 
1.9.1



Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference

2018-08-14 Thread Mike Galbraith
On Wed, 2018-08-15 at 11:59 +0800, Dave Young wrote:
> > Does this improve things, and plug the no boot hole?
> 
> Would you mind to tune my patch with some acpi_rsdp checking and add
> some error message in case kexec load failure? Eg. suggest people to use
> append acpi_rsdp for noefi booting etc.

Yeah, -ENODEV is better than hanging, but not very informative.

> I'm still not very satisfied with the code cleanup..

Not surprising, I didn't like it much either (ergo interrogative).

-Mike


Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference

2018-08-14 Thread Mike Galbraith
On Wed, 2018-08-15 at 11:59 +0800, Dave Young wrote:
> > Does this improve things, and plug the no boot hole?
> 
> Would you mind to tune my patch with some acpi_rsdp checking and add
> some error message in case kexec load failure? Eg. suggest people to use
> append acpi_rsdp for noefi booting etc.

Yeah, -ENODEV is better than hanging, but not very informative.

> I'm still not very satisfied with the code cleanup..

Not surprising, I didn't like it much either (ergo interrogative).

-Mike


Re: linux-next: manual merge of the kvm tree with the tip tree

2018-08-14 Thread Stephen Rothwell
Hi all,

On Wed, 8 Aug 2018 13:54:45 +1000 Stephen Rothwell  
wrote:
>
> Paolo pointed out a semantic conflict between the kvm tree and the tip
> tree in
> 
>   arch/x86/kernel/kvm.c
> 
> between commit:
> 
>   368a540e0232 ("x86/kvmclock: Remove memblock dependency")
> 
> from the tip tree and commit:
> 
>   d63bae079b64 ("KVM: X86: Add kvm hypervisor init time platform setup 
> callback")
> 
> from the kvm tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> I applied the following (as suggested) after the automatic merge:
> 
> From: Stephen Rothwell 
> Date: Wed, 8 Aug 2018 13:48:34 +1000
> Subject: [PATCH] kvm: x86: fix semantic conflict in platform init
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  arch/x86/kernel/kvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 0906b8731393..e2499bad3209 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -714,13 +714,13 @@ static void __init kvm_apic_init(void)
>  static void __init kvm_init_platform(void)
>  {
>   x86_platform.apic_post_init = kvm_apic_init;
> + kvmclock_init();
>  }
>  
>  const __initconst struct hypervisor_x86 x86_hyper_kvm = {
>   .name   = "KVM",
>   .detect = kvm_detect,
>   .type   = X86_HYPER_KVM,
> - .init.init_platform = kvmclock_init,
>   .init.guest_late_init   = kvm_guest_init,
>   .init.x2apic_available  = kvm_para_available,
>   .init.init_platform = kvm_init_platform,
> -- 
> 2.18.0

This is now a conflict between Linus' tree and the kvm tree.
-- 
Cheers,
Stephen Rothwell


pgpR6IrpU56nd.pgp
Description: OpenPGP digital signature


Re: linux-next: manual merge of the kvm tree with the tip tree

2018-08-14 Thread Stephen Rothwell
Hi all,

On Wed, 8 Aug 2018 13:54:45 +1000 Stephen Rothwell  
wrote:
>
> Paolo pointed out a semantic conflict between the kvm tree and the tip
> tree in
> 
>   arch/x86/kernel/kvm.c
> 
> between commit:
> 
>   368a540e0232 ("x86/kvmclock: Remove memblock dependency")
> 
> from the tip tree and commit:
> 
>   d63bae079b64 ("KVM: X86: Add kvm hypervisor init time platform setup 
> callback")
> 
> from the kvm tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> I applied the following (as suggested) after the automatic merge:
> 
> From: Stephen Rothwell 
> Date: Wed, 8 Aug 2018 13:48:34 +1000
> Subject: [PATCH] kvm: x86: fix semantic conflict in platform init
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  arch/x86/kernel/kvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 0906b8731393..e2499bad3209 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -714,13 +714,13 @@ static void __init kvm_apic_init(void)
>  static void __init kvm_init_platform(void)
>  {
>   x86_platform.apic_post_init = kvm_apic_init;
> + kvmclock_init();
>  }
>  
>  const __initconst struct hypervisor_x86 x86_hyper_kvm = {
>   .name   = "KVM",
>   .detect = kvm_detect,
>   .type   = X86_HYPER_KVM,
> - .init.init_platform = kvmclock_init,
>   .init.guest_late_init   = kvm_guest_init,
>   .init.x2apic_available  = kvm_para_available,
>   .init.init_platform = kvm_init_platform,
> -- 
> 2.18.0

This is now a conflict between Linus' tree and the kvm tree.
-- 
Cheers,
Stephen Rothwell


pgpR6IrpU56nd.pgp
Description: OpenPGP digital signature


linux-next: manual merge of the kvm tree with Linus' tree

2018-08-14 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the kvm tree got a conflict in:

  arch/x86/include/asm/kvm_host.h

between commit:

  5b76a3cff011 ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on 
vmentry")

from Linus' tree and commit:

  4180bf1b655a ("KVM: X86: Implement "send IPI" hypercall")

from the kvm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/x86/include/asm/kvm_host.h
index acebb808c4b5,c18958ef17d2..
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@@ -1418,7 -1457,10 +1462,11 @@@ int kvm_cpu_get_interrupt(struct kvm_vc
  void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
  void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu);
  
+ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
+   unsigned long ipi_bitmap_high, int min,
+   unsigned long icr, int op_64_bit);
+ 
 +u64 kvm_get_arch_capabilities(void);
  void kvm_define_shared_msr(unsigned index, u32 msr);
  int kvm_set_shared_msr(unsigned index, u64 val, u64 mask);
  


pgp0kiRgw9mGV.pgp
Description: OpenPGP digital signature


linux-next: manual merge of the kvm tree with Linus' tree

2018-08-14 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the kvm tree got a conflict in:

  arch/x86/include/asm/kvm_host.h

between commit:

  5b76a3cff011 ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on 
vmentry")

from Linus' tree and commit:

  4180bf1b655a ("KVM: X86: Implement "send IPI" hypercall")

from the kvm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/x86/include/asm/kvm_host.h
index acebb808c4b5,c18958ef17d2..
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@@ -1418,7 -1457,10 +1462,11 @@@ int kvm_cpu_get_interrupt(struct kvm_vc
  void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
  void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu);
  
+ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
+   unsigned long ipi_bitmap_high, int min,
+   unsigned long icr, int op_64_bit);
+ 
 +u64 kvm_get_arch_capabilities(void);
  void kvm_define_shared_msr(unsigned index, u32 msr);
  int kvm_set_shared_msr(unsigned index, u64 val, u64 mask);
  


pgp0kiRgw9mGV.pgp
Description: OpenPGP digital signature


linux-next: manual merge of the kvm tree with Linus' tree

2018-08-14 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the kvm tree got a conflict in:

  arch/x86/kvm/vmx.c

between commit:

  a399477e52c1 ("x86/KVM/VMX: Add module argument for L1TF mitigation")

from Linus' tree and commit:

  877ad952be3d ("KVM: vmx: Add tlb_remote_flush callback support")

from the kvm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/x86/kvm/vmx.c
index 46b428c0990e,16f9373c01de..
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@@ -188,150 -189,12 +189,156 @@@ module_param(ple_window_max, uint, 0444
  
  extern const ulong vmx_return;
  
 +static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush);
 +static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond);
 +static DEFINE_MUTEX(vmx_l1d_flush_mutex);
 +
 +/* Storage for pre module init parameter parsing */
 +static enum vmx_l1d_flush_state __read_mostly vmentry_l1d_flush_param = 
VMENTER_L1D_FLUSH_AUTO;
 +
 +static const struct {
 +  const char *option;
 +  enum vmx_l1d_flush_state cmd;
 +} vmentry_l1d_param[] = {
 +  {"auto",VMENTER_L1D_FLUSH_AUTO},
 +  {"never",   VMENTER_L1D_FLUSH_NEVER},
 +  {"cond",VMENTER_L1D_FLUSH_COND},
 +  {"always",  VMENTER_L1D_FLUSH_ALWAYS},
 +};
 +
 +#define L1D_CACHE_ORDER 4
 +static void *vmx_l1d_flush_pages;
 +
 +static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 +{
 +  struct page *page;
 +  unsigned int i;
 +
 +  if (!enable_ept) {
 +  l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_EPT_DISABLED;
 +  return 0;
 +  }
 +
 +   if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) {
 + u64 msr;
 +
 + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr);
 + if (msr & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH) {
 + l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED;
 + return 0;
 + }
 +   }
 +
 +  /* If set to auto use the default l1tf mitigation method */
 +  if (l1tf == VMENTER_L1D_FLUSH_AUTO) {
 +  switch (l1tf_mitigation) {
 +  case L1TF_MITIGATION_OFF:
 +  l1tf = VMENTER_L1D_FLUSH_NEVER;
 +  break;
 +  case L1TF_MITIGATION_FLUSH_NOWARN:
 +  case L1TF_MITIGATION_FLUSH:
 +  case L1TF_MITIGATION_FLUSH_NOSMT:
 +  l1tf = VMENTER_L1D_FLUSH_COND;
 +  break;
 +  case L1TF_MITIGATION_FULL:
 +  case L1TF_MITIGATION_FULL_FORCE:
 +  l1tf = VMENTER_L1D_FLUSH_ALWAYS;
 +  break;
 +  }
 +  } else if (l1tf_mitigation == L1TF_MITIGATION_FULL_FORCE) {
 +  l1tf = VMENTER_L1D_FLUSH_ALWAYS;
 +  }
 +
 +  if (l1tf != VMENTER_L1D_FLUSH_NEVER && !vmx_l1d_flush_pages &&
 +  !boot_cpu_has(X86_FEATURE_FLUSH_L1D)) {
 +  page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER);
 +  if (!page)
 +  return -ENOMEM;
 +  vmx_l1d_flush_pages = page_address(page);
 +
 +  /*
 +   * Initialize each page with a different pattern in
 +   * order to protect against KSM in the nested
 +   * virtualization case.
 +   */
 +  for (i = 0; i < 1u << L1D_CACHE_ORDER; ++i) {
 +  memset(vmx_l1d_flush_pages + i * PAGE_SIZE, i + 1,
 + PAGE_SIZE);
 +  }
 +  }
 +
 +  l1tf_vmx_mitigation = l1tf;
 +
 +  if (l1tf != VMENTER_L1D_FLUSH_NEVER)
 +  static_branch_enable(_l1d_should_flush);
 +  else
 +  static_branch_disable(_l1d_should_flush);
 +
 +  if (l1tf == VMENTER_L1D_FLUSH_COND)
 +  static_branch_enable(_l1d_flush_cond);
 +  else
 +  static_branch_disable(_l1d_flush_cond);
 +  return 0;
 +}
 +
 +static int vmentry_l1d_flush_parse(const char *s)
 +{
 +  unsigned int i;
 +
 +  if (s) {
 +  for (i = 0; i < ARRAY_SIZE(vmentry_l1d_param); i++) {
 +  if (sysfs_streq(s, vmentry_l1d_param[i].option))
 +  return vmentry_l1d_param[i].cmd;
 +  }
 +  }
 +  return -EINVAL;
 +}
 +
 +static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
 +{
 +  int l1tf, ret;
 +
 +  if (!boot_cpu_has(X86_BUG_L1TF))
 +  return 0;
 +
 +  l1tf = vmentry_l1d_flush_parse(s);
 +  if (l1tf < 0)
 +  return l1tf;
 +
 +  /*
 +   * Has vmx_init() run already? If not then this is the pre init
 +   * parameter parsing. In that 

linux-next: manual merge of the kvm tree with Linus' tree

2018-08-14 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the kvm tree got a conflict in:

  arch/x86/kvm/vmx.c

between commit:

  a399477e52c1 ("x86/KVM/VMX: Add module argument for L1TF mitigation")

from Linus' tree and commit:

  877ad952be3d ("KVM: vmx: Add tlb_remote_flush callback support")

from the kvm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/x86/kvm/vmx.c
index 46b428c0990e,16f9373c01de..
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@@ -188,150 -189,12 +189,156 @@@ module_param(ple_window_max, uint, 0444
  
  extern const ulong vmx_return;
  
 +static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush);
 +static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond);
 +static DEFINE_MUTEX(vmx_l1d_flush_mutex);
 +
 +/* Storage for pre module init parameter parsing */
 +static enum vmx_l1d_flush_state __read_mostly vmentry_l1d_flush_param = 
VMENTER_L1D_FLUSH_AUTO;
 +
 +static const struct {
 +  const char *option;
 +  enum vmx_l1d_flush_state cmd;
 +} vmentry_l1d_param[] = {
 +  {"auto",VMENTER_L1D_FLUSH_AUTO},
 +  {"never",   VMENTER_L1D_FLUSH_NEVER},
 +  {"cond",VMENTER_L1D_FLUSH_COND},
 +  {"always",  VMENTER_L1D_FLUSH_ALWAYS},
 +};
 +
 +#define L1D_CACHE_ORDER 4
 +static void *vmx_l1d_flush_pages;
 +
 +static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 +{
 +  struct page *page;
 +  unsigned int i;
 +
 +  if (!enable_ept) {
 +  l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_EPT_DISABLED;
 +  return 0;
 +  }
 +
 +   if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) {
 + u64 msr;
 +
 + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr);
 + if (msr & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH) {
 + l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED;
 + return 0;
 + }
 +   }
 +
 +  /* If set to auto use the default l1tf mitigation method */
 +  if (l1tf == VMENTER_L1D_FLUSH_AUTO) {
 +  switch (l1tf_mitigation) {
 +  case L1TF_MITIGATION_OFF:
 +  l1tf = VMENTER_L1D_FLUSH_NEVER;
 +  break;
 +  case L1TF_MITIGATION_FLUSH_NOWARN:
 +  case L1TF_MITIGATION_FLUSH:
 +  case L1TF_MITIGATION_FLUSH_NOSMT:
 +  l1tf = VMENTER_L1D_FLUSH_COND;
 +  break;
 +  case L1TF_MITIGATION_FULL:
 +  case L1TF_MITIGATION_FULL_FORCE:
 +  l1tf = VMENTER_L1D_FLUSH_ALWAYS;
 +  break;
 +  }
 +  } else if (l1tf_mitigation == L1TF_MITIGATION_FULL_FORCE) {
 +  l1tf = VMENTER_L1D_FLUSH_ALWAYS;
 +  }
 +
 +  if (l1tf != VMENTER_L1D_FLUSH_NEVER && !vmx_l1d_flush_pages &&
 +  !boot_cpu_has(X86_FEATURE_FLUSH_L1D)) {
 +  page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER);
 +  if (!page)
 +  return -ENOMEM;
 +  vmx_l1d_flush_pages = page_address(page);
 +
 +  /*
 +   * Initialize each page with a different pattern in
 +   * order to protect against KSM in the nested
 +   * virtualization case.
 +   */
 +  for (i = 0; i < 1u << L1D_CACHE_ORDER; ++i) {
 +  memset(vmx_l1d_flush_pages + i * PAGE_SIZE, i + 1,
 + PAGE_SIZE);
 +  }
 +  }
 +
 +  l1tf_vmx_mitigation = l1tf;
 +
 +  if (l1tf != VMENTER_L1D_FLUSH_NEVER)
 +  static_branch_enable(_l1d_should_flush);
 +  else
 +  static_branch_disable(_l1d_should_flush);
 +
 +  if (l1tf == VMENTER_L1D_FLUSH_COND)
 +  static_branch_enable(_l1d_flush_cond);
 +  else
 +  static_branch_disable(_l1d_flush_cond);
 +  return 0;
 +}
 +
 +static int vmentry_l1d_flush_parse(const char *s)
 +{
 +  unsigned int i;
 +
 +  if (s) {
 +  for (i = 0; i < ARRAY_SIZE(vmentry_l1d_param); i++) {
 +  if (sysfs_streq(s, vmentry_l1d_param[i].option))
 +  return vmentry_l1d_param[i].cmd;
 +  }
 +  }
 +  return -EINVAL;
 +}
 +
 +static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
 +{
 +  int l1tf, ret;
 +
 +  if (!boot_cpu_has(X86_BUG_L1TF))
 +  return 0;
 +
 +  l1tf = vmentry_l1d_flush_parse(s);
 +  if (l1tf < 0)
 +  return l1tf;
 +
 +  /*
 +   * Has vmx_init() run already? If not then this is the pre init
 +   * parameter parsing. In that 

linux-next: manual merge of the ftrace tree with Linus' tree

2018-08-14 Thread Stephen Rothwell
Hi Steven,

Today's linux-next merge of the ftrace tree got a conflict in:

  kernel/events/uprobes.c

between commit:

  788faab70d5a ("perf, tools: Use correct articles in comments")

from Linus' tree and commit:

  38e967ae1e60 ("Uprobes: Simplify uprobe_register() body")

from the ftrace tree.

I fixed it up (the latter moved the uprobe_unregister function - see
below) and can carry the fix as necessary. This is now fixed as far as
linux-next is concerned, but any non trivial conflicts should be mentioned
to your upstream maintainer when your tree is submitted for merging.
You may also want to consider cooperating with the maintainer of the
conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc kernel/events/uprobes.c
index aed1ba569954,c0418ba52ba8..
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@@ -860,7 -856,28 +856,28 @@@ __uprobe_unregister(struct uprobe *upro
  }
  
  /*
-  * uprobe_register - register a probe
 - * uprobe_unregister - unregister a already registered probe.
++ * uprobe_unregister - unregister an already registered probe.
+  * @inode: the file in which the probe has to be removed.
+  * @offset: offset from the start of the file.
+  * @uc: identify which probe if multiple probes are colocated.
+  */
+ void uprobe_unregister(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc)
+ {
+   struct uprobe *uprobe;
+ 
+   uprobe = find_uprobe(inode, offset);
+   if (WARN_ON(!uprobe))
+   return;
+ 
+   down_write(>register_rwsem);
+   __uprobe_unregister(uprobe, uc);
+   up_write(>register_rwsem);
+   put_uprobe(uprobe);
+ }
+ EXPORT_SYMBOL_GPL(uprobe_unregister);
+ 
+ /*
+  * __uprobe_register - register a probe
   * @inode: the file in which the probe has to be placed.
   * @offset: offset from the start of the file.
   * @uc: information on howto handle the probe..


pgpF0XnTROJcL.pgp
Description: OpenPGP digital signature


linux-next: manual merge of the ftrace tree with Linus' tree

2018-08-14 Thread Stephen Rothwell
Hi Steven,

Today's linux-next merge of the ftrace tree got a conflict in:

  kernel/events/uprobes.c

between commit:

  788faab70d5a ("perf, tools: Use correct articles in comments")

from Linus' tree and commit:

  38e967ae1e60 ("Uprobes: Simplify uprobe_register() body")

from the ftrace tree.

I fixed it up (the latter moved the uprobe_unregister function - see
below) and can carry the fix as necessary. This is now fixed as far as
linux-next is concerned, but any non trivial conflicts should be mentioned
to your upstream maintainer when your tree is submitted for merging.
You may also want to consider cooperating with the maintainer of the
conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc kernel/events/uprobes.c
index aed1ba569954,c0418ba52ba8..
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@@ -860,7 -856,28 +856,28 @@@ __uprobe_unregister(struct uprobe *upro
  }
  
  /*
-  * uprobe_register - register a probe
 - * uprobe_unregister - unregister a already registered probe.
++ * uprobe_unregister - unregister an already registered probe.
+  * @inode: the file in which the probe has to be removed.
+  * @offset: offset from the start of the file.
+  * @uc: identify which probe if multiple probes are colocated.
+  */
+ void uprobe_unregister(struct inode *inode, loff_t offset, struct 
uprobe_consumer *uc)
+ {
+   struct uprobe *uprobe;
+ 
+   uprobe = find_uprobe(inode, offset);
+   if (WARN_ON(!uprobe))
+   return;
+ 
+   down_write(>register_rwsem);
+   __uprobe_unregister(uprobe, uc);
+   up_write(>register_rwsem);
+   put_uprobe(uprobe);
+ }
+ EXPORT_SYMBOL_GPL(uprobe_unregister);
+ 
+ /*
+  * __uprobe_register - register a probe
   * @inode: the file in which the probe has to be placed.
   * @offset: offset from the start of the file.
   * @uc: information on howto handle the probe..


pgpF0XnTROJcL.pgp
Description: OpenPGP digital signature


Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference

2018-08-14 Thread Dave Young
Apologize for late reply, I'm occupied with something else.

On 08/10/18 at 07:39pm, Mike Galbraith wrote:
> On Fri, 2018-08-10 at 18:28 +0800, Dave Young wrote:
> > 
> > > @@ -250,8 +253,10 @@ setup_boot_parameters(struct kimage *image, struct 
> > > boot_params *params,
> > >  
> > >  #ifdef CONFIG_EFI
> > >   /* Setup EFI state */
> > > - setup_efi_state(params, params_load_addr, efi_map_offset, efi_map_sz,
> > > + ret = setup_efi_state(params, params_load_addr, efi_map_offset, 
> > > efi_map_sz,
> > >   efi_setup_data_offset);
> > > + if (ret)
> > 
> > Here should check efi_enabled(EFI_BOOT) && ret
> 
> Patch with that works for me.
> 
> > In case efi boot we need the efi info set correctly,  or one need pass
> > acpi_rsdp= in kernel cmdline param.
> > 
> > Still not sure how to allow one to workaround it by using acpi_rsdp=
> > param with kexec_file_load..
> 
> Does this improve things, and plug the no boot hole?

Would you mind to tune my patch with some acpi_rsdp checking and add
some error message in case kexec load failure? Eg. suggest people to use
append acpi_rsdp for noefi booting etc.

I'm still not very satisfied with the code cleanup, ideally we should add a
separate kbuf for efi stuff, so that we can isolate the efi_map_sz
efi_setup_data_offset, and efi_map_offset initialization only when
necessary.  Anyway the cleanup can be a separate patch.

> 
> x86, kdump: cleanup efi setup data handling a bit
> 
> 1. Remove efi specific variables from bzImage64_load() other than the
> one it needs, efi_map_sz, passing it and params_cmdline_sz on to efi
> setup functions, giving them all they need without duplication.
> 
> 2. Only allocate space for efi setup data when a 1:1 mapping is available.
> Bail early with -ENODEV if not available, but is required to boot, and
> acpi_rsdp= was not passed on the command line. 
> 
> 3. Use the proper config dependency to isolate efi setup functions,
> adding a !EFI_RUNTIME_MAP stub for setup_efi_state().
> 
> 4. Change efi functions that cannot fail to void. 
> 
> Signed-off-by: Mike Galbraith 
> ---
>  arch/x86/kernel/kexec-bzimage64.c |   99 
> +-
>  1 file changed, 45 insertions(+), 54 deletions(-)
> 
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -112,35 +112,32 @@ static int setup_e820_entries(struct boo
>   return 0;
>  }
>  
> -#ifdef CONFIG_EFI
> -static int setup_efi_info_memmap(struct boot_params *params,
> +#ifdef CONFIG_EFI_RUNTIME_MAP
> +static void setup_efi_info_memmap(struct boot_params *params,
> unsigned long params_load_addr,
> -   unsigned int efi_map_offset,
> +   unsigned int params_cmdline_sz,
> unsigned int efi_map_sz)
>  {
> - void *efi_map = (void *)params + efi_map_offset;
> - unsigned long efi_map_phys_addr = params_load_addr + efi_map_offset;
> + void *efi_map = (void *)params + params_cmdline_sz;
> + unsigned long efi_map_phys_addr = params_load_addr + params_cmdline_sz;
>   struct efi_info *ei = >efi_info;
>  
> - if (!efi_map_sz)
> - return -EINVAL;
> -
>   efi_runtime_map_copy(efi_map, efi_map_sz);
>  
>   ei->efi_memmap = efi_map_phys_addr & 0x;
>   ei->efi_memmap_hi = efi_map_phys_addr >> 32;
>   ei->efi_memmap_size = efi_map_sz;
> -
> - return 0;
>  }
>  
> -static int
> +static void
>  prepare_add_efi_setup_data(struct boot_params *params,
> -unsigned long params_load_addr,
> -unsigned int efi_setup_data_offset)
> +unsigned long params_load_addr,
> +unsigned int params_cmdline_sz,
> +unsigned int efi_map_sz)
>  {
> + unsigned int data_offset = params_cmdline_sz + ALIGN(efi_map_sz, 16);
>   unsigned long setup_data_phys;
> - struct setup_data *sd = (void *)params + efi_setup_data_offset;
> + struct setup_data *sd = (void *)params + data_offset;
>   struct efi_setup_data *esd = (void *)sd + sizeof(struct setup_data);
>  
>   esd->fw_vendor = efi.fw_vendor;
> @@ -152,33 +149,20 @@ prepare_add_efi_setup_data(struct boot_p
>   sd->len = sizeof(struct efi_setup_data);
>  
>   /* Add setup data */
> - setup_data_phys = params_load_addr + efi_setup_data_offset;
> + setup_data_phys = params_load_addr + data_offset;
>   sd->next = params->hdr.setup_data;
>   params->hdr.setup_data = setup_data_phys;
> -
> - return 0;
>  }
>  
>  static int
>  setup_efi_state(struct boot_params *params, unsigned long params_load_addr,
> - unsigned int efi_map_offset, unsigned int efi_map_sz,
> - unsigned int efi_setup_data_offset)
> + unsigned int params_cmdline_sz, unsigned int efi_map_sz)
>  {
>   struct efi_info *current_ei = _params.efi_info;
>   struct efi_info *ei = 

Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference

2018-08-14 Thread Dave Young
Apologize for late reply, I'm occupied with something else.

On 08/10/18 at 07:39pm, Mike Galbraith wrote:
> On Fri, 2018-08-10 at 18:28 +0800, Dave Young wrote:
> > 
> > > @@ -250,8 +253,10 @@ setup_boot_parameters(struct kimage *image, struct 
> > > boot_params *params,
> > >  
> > >  #ifdef CONFIG_EFI
> > >   /* Setup EFI state */
> > > - setup_efi_state(params, params_load_addr, efi_map_offset, efi_map_sz,
> > > + ret = setup_efi_state(params, params_load_addr, efi_map_offset, 
> > > efi_map_sz,
> > >   efi_setup_data_offset);
> > > + if (ret)
> > 
> > Here should check efi_enabled(EFI_BOOT) && ret
> 
> Patch with that works for me.
> 
> > In case efi boot we need the efi info set correctly,  or one need pass
> > acpi_rsdp= in kernel cmdline param.
> > 
> > Still not sure how to allow one to workaround it by using acpi_rsdp=
> > param with kexec_file_load..
> 
> Does this improve things, and plug the no boot hole?

Would you mind to tune my patch with some acpi_rsdp checking and add
some error message in case kexec load failure? Eg. suggest people to use
append acpi_rsdp for noefi booting etc.

I'm still not very satisfied with the code cleanup, ideally we should add a
separate kbuf for efi stuff, so that we can isolate the efi_map_sz
efi_setup_data_offset, and efi_map_offset initialization only when
necessary.  Anyway the cleanup can be a separate patch.

> 
> x86, kdump: cleanup efi setup data handling a bit
> 
> 1. Remove efi specific variables from bzImage64_load() other than the
> one it needs, efi_map_sz, passing it and params_cmdline_sz on to efi
> setup functions, giving them all they need without duplication.
> 
> 2. Only allocate space for efi setup data when a 1:1 mapping is available.
> Bail early with -ENODEV if not available, but is required to boot, and
> acpi_rsdp= was not passed on the command line. 
> 
> 3. Use the proper config dependency to isolate efi setup functions,
> adding a !EFI_RUNTIME_MAP stub for setup_efi_state().
> 
> 4. Change efi functions that cannot fail to void. 
> 
> Signed-off-by: Mike Galbraith 
> ---
>  arch/x86/kernel/kexec-bzimage64.c |   99 
> +-
>  1 file changed, 45 insertions(+), 54 deletions(-)
> 
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -112,35 +112,32 @@ static int setup_e820_entries(struct boo
>   return 0;
>  }
>  
> -#ifdef CONFIG_EFI
> -static int setup_efi_info_memmap(struct boot_params *params,
> +#ifdef CONFIG_EFI_RUNTIME_MAP
> +static void setup_efi_info_memmap(struct boot_params *params,
> unsigned long params_load_addr,
> -   unsigned int efi_map_offset,
> +   unsigned int params_cmdline_sz,
> unsigned int efi_map_sz)
>  {
> - void *efi_map = (void *)params + efi_map_offset;
> - unsigned long efi_map_phys_addr = params_load_addr + efi_map_offset;
> + void *efi_map = (void *)params + params_cmdline_sz;
> + unsigned long efi_map_phys_addr = params_load_addr + params_cmdline_sz;
>   struct efi_info *ei = >efi_info;
>  
> - if (!efi_map_sz)
> - return -EINVAL;
> -
>   efi_runtime_map_copy(efi_map, efi_map_sz);
>  
>   ei->efi_memmap = efi_map_phys_addr & 0x;
>   ei->efi_memmap_hi = efi_map_phys_addr >> 32;
>   ei->efi_memmap_size = efi_map_sz;
> -
> - return 0;
>  }
>  
> -static int
> +static void
>  prepare_add_efi_setup_data(struct boot_params *params,
> -unsigned long params_load_addr,
> -unsigned int efi_setup_data_offset)
> +unsigned long params_load_addr,
> +unsigned int params_cmdline_sz,
> +unsigned int efi_map_sz)
>  {
> + unsigned int data_offset = params_cmdline_sz + ALIGN(efi_map_sz, 16);
>   unsigned long setup_data_phys;
> - struct setup_data *sd = (void *)params + efi_setup_data_offset;
> + struct setup_data *sd = (void *)params + data_offset;
>   struct efi_setup_data *esd = (void *)sd + sizeof(struct setup_data);
>  
>   esd->fw_vendor = efi.fw_vendor;
> @@ -152,33 +149,20 @@ prepare_add_efi_setup_data(struct boot_p
>   sd->len = sizeof(struct efi_setup_data);
>  
>   /* Add setup data */
> - setup_data_phys = params_load_addr + efi_setup_data_offset;
> + setup_data_phys = params_load_addr + data_offset;
>   sd->next = params->hdr.setup_data;
>   params->hdr.setup_data = setup_data_phys;
> -
> - return 0;
>  }
>  
>  static int
>  setup_efi_state(struct boot_params *params, unsigned long params_load_addr,
> - unsigned int efi_map_offset, unsigned int efi_map_sz,
> - unsigned int efi_setup_data_offset)
> + unsigned int params_cmdline_sz, unsigned int efi_map_sz)
>  {
>   struct efi_info *current_ei = _params.efi_info;
>   struct efi_info *ei = 

Re: [PATCH] modpost: check strdup() return value

2018-08-14 Thread Masahiro Yamada
2018-08-15 5:50 GMT+09:00 Randy Dunlap :
> From: Randy Dunlap 
>
> Fix missing error check for function strdup() in scripts/mod/modpost.c.
>
> Fixes kernel bugzilla #200319:
> https://bugzilla.kernel.org/show_bug.cgi?id=200319
>
> Signed-off-by: Randy Dunlap 
> Cc: Yuexing Wang 
> Cc: Masahiro Yamada 
> ---
>  scripts/mod/modpost.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- linux-next-20180814.orig/scripts/mod/modpost.c
> +++ linux-next-20180814/scripts/mod/modpost.c
> @@ -672,7 +672,7 @@ static void handle_modversions(struct mo
> if (ELF_ST_TYPE(sym->st_info) == STT_SPARC_REGISTER)
> break;
> if (symname[0] == '.') {
> -   char *munged = strdup(symname);
> +   char *munged = NOFAIL(strdup(symname));
> munged[0] = '_';
> munged[1] = toupper(munged[1]);
> symname = munged;
>
>


While you are here,
will you fix a little more?


line 1321:
char *p = malloc(20);

line 1341:
return strdup("");

line 2039:
buf->p = realloc(buf->p, buf->size);



-- 
Best Regards
Masahiro Yamada


Re: [PATCH] modpost: check strdup() return value

2018-08-14 Thread Masahiro Yamada
2018-08-15 5:50 GMT+09:00 Randy Dunlap :
> From: Randy Dunlap 
>
> Fix missing error check for function strdup() in scripts/mod/modpost.c.
>
> Fixes kernel bugzilla #200319:
> https://bugzilla.kernel.org/show_bug.cgi?id=200319
>
> Signed-off-by: Randy Dunlap 
> Cc: Yuexing Wang 
> Cc: Masahiro Yamada 
> ---
>  scripts/mod/modpost.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- linux-next-20180814.orig/scripts/mod/modpost.c
> +++ linux-next-20180814/scripts/mod/modpost.c
> @@ -672,7 +672,7 @@ static void handle_modversions(struct mo
> if (ELF_ST_TYPE(sym->st_info) == STT_SPARC_REGISTER)
> break;
> if (symname[0] == '.') {
> -   char *munged = strdup(symname);
> +   char *munged = NOFAIL(strdup(symname));
> munged[0] = '_';
> munged[1] = toupper(munged[1]);
> symname = munged;
>
>


While you are here,
will you fix a little more?


line 1321:
char *p = malloc(20);

line 1341:
return strdup("");

line 2039:
buf->p = realloc(buf->p, buf->size);



-- 
Best Regards
Masahiro Yamada


Greetings in the name of God, Business proposal in God we trust

2018-08-14 Thread Mrs, Suran Yoda
Greetings in the name of God

Dear Friend


Greetings in the name of God,please let this not sound strange to you
for my only surviving lawyer who would have done this died early this
year.I prayed and got your email id from your country guestbook. I am
Mrs Suran Yoda from London,I am 72 years old,i am suffering from a
long time cancer of the lungs which also affected my brain,from all
indication my conditions is really deteriorating and it is quite
obvious that,according to my doctors they have advised me that i may
not live for the next two months,this is because the cancer stage has
gotten to a very bad stage.I am married to (Dr Andrews Yoda) who
worked with the Embassy of United Kingdom in South

Africa for nine years,Before he died in 2004.

I was bred up from a motherless babies home and was married to my late
husband for Thirty years without a child,my husband died in a fatal
motor accident Before his death we were true believers.Since his death
I decided not to re-marry,I sold all my inherited belongings and
deposited all the sum of $6.5 Million dollars with Bank in South
Africa.Though what disturbs me mostly is the cancer. Having known my
condition I decided to donate this fund to church,i want you as God
fearing person,to also use this money to fund church,orphanages and
widows,I took this decision,before i rest in peace because my time
will so on be up.

The Bible made us to understand that blessed are the hands that
giveth. I took this decision because I don`t have any child that will
inherit this money and my husband's relatives are not Christians and I
don`t want my husband hard earned money to be misused by unbelievers.
I don`t want a situation where these money will be used in an ungodly
manner,hence the reason for taking this bold decision.I am not afraid
of death hence i know where am going.Presently,I'm with my laptop in a
hospital here in London where I have been undergoing treatment for
cancer of the lungs.

As soon as I receive your reply I shall give you the contact of the
Bank.I will also issue you a letter of authority that will prove you
as the new beneficiary of my fund.Please assure me that you will act
accordingly as I stated.Hoping to hear from you soon.

Remain blessed in the name of the Lord.

Yours in Christ,
Mrs Suran Yoda


Greetings in the name of God, Business proposal in God we trust

2018-08-14 Thread Mrs, Suran Yoda
Greetings in the name of God

Dear Friend


Greetings in the name of God,please let this not sound strange to you
for my only surviving lawyer who would have done this died early this
year.I prayed and got your email id from your country guestbook. I am
Mrs Suran Yoda from London,I am 72 years old,i am suffering from a
long time cancer of the lungs which also affected my brain,from all
indication my conditions is really deteriorating and it is quite
obvious that,according to my doctors they have advised me that i may
not live for the next two months,this is because the cancer stage has
gotten to a very bad stage.I am married to (Dr Andrews Yoda) who
worked with the Embassy of United Kingdom in South

Africa for nine years,Before he died in 2004.

I was bred up from a motherless babies home and was married to my late
husband for Thirty years without a child,my husband died in a fatal
motor accident Before his death we were true believers.Since his death
I decided not to re-marry,I sold all my inherited belongings and
deposited all the sum of $6.5 Million dollars with Bank in South
Africa.Though what disturbs me mostly is the cancer. Having known my
condition I decided to donate this fund to church,i want you as God
fearing person,to also use this money to fund church,orphanages and
widows,I took this decision,before i rest in peace because my time
will so on be up.

The Bible made us to understand that blessed are the hands that
giveth. I took this decision because I don`t have any child that will
inherit this money and my husband's relatives are not Christians and I
don`t want my husband hard earned money to be misused by unbelievers.
I don`t want a situation where these money will be used in an ungodly
manner,hence the reason for taking this bold decision.I am not afraid
of death hence i know where am going.Presently,I'm with my laptop in a
hospital here in London where I have been undergoing treatment for
cancer of the lungs.

As soon as I receive your reply I shall give you the contact of the
Bank.I will also issue you a letter of authority that will prove you
as the new beneficiary of my fund.Please assure me that you will act
accordingly as I stated.Hoping to hear from you soon.

Remain blessed in the name of the Lord.

Yours in Christ,
Mrs Suran Yoda


Re: [RFC][PATCH 00/24] tools lib traceevent: Rename pevent to tep for preparation for library

2018-08-14 Thread Steven Rostedt
On Fri, 10 Aug 2018 13:57:06 -0300
Arnaldo Carvalho de Melo  wrote:

> > Arnaldo takes care of that, but I guess pulling from branch is the prefered 
> > way  
> 
> I'll try pulling and building it automatically patch by patch,

Hi Arnaldo,

Were you able to take this? We have another patch series that's ready
to go (it can wait till after the merge window though).

-- Steve


Re: [RFC][PATCH 00/24] tools lib traceevent: Rename pevent to tep for preparation for library

2018-08-14 Thread Steven Rostedt
On Fri, 10 Aug 2018 13:57:06 -0300
Arnaldo Carvalho de Melo  wrote:

> > Arnaldo takes care of that, but I guess pulling from branch is the prefered 
> > way  
> 
> I'll try pulling and building it automatically patch by patch,

Hi Arnaldo,

Were you able to take this? We have another patch series that's ready
to go (it can wait till after the merge window though).

-- Steve


Re: [PATCH V2] i2c: ismt: fix wrong device address when unmap the data buffer

2018-08-14 Thread Dmitry Safonov
+Cc: stable

Hi Greg, JFI:
This one has hit a couple of times on autotests on v4.9 stable.
The fix for BUG() is trivial, so probably worth to ship it to v4.9/v4.4/v3.18.

2017-06-13 5:59 GMT+01:00 Song liwei :
> From: Liwei Song 
>
> Fix the following kernel bug:
>
> kernel BUG at drivers/iommu/intel-iommu.c:3260!
> invalid opcode:  [#5] PREEMPT SMP
> Hardware name: Intel Corp. Harcuvar/Server, BIOS 
> HAVLCRB0.X64.0013.D39.1608311820 08/31/2016
> task: 880175389950 ti: 880176bec000 task.ti: 880176bec000
> RIP: 0010:[]  [] intel_unmap+0x25b/0x260
> RSP: 0018:880176bef5e8  EFLAGS: 00010296
> RAX: 0024 RBX: 8800773c7c88 RCX: ce04
> RDX: 8000 RSI:  RDI: 0009
> RBP: 880176bef638 R08: 0010 R09: 0004
> R10: 880175389c78 R11: 0a4f R12: 8800773c7868
> R13: ac88 R14: 8800773c7818 R15: 0001
> FS:  7fef21258700() GS:88017b5c() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0066d6d8 CR3: 7118c000 CR4: 003406e0
> Stack:
>  ac88 8199867f 880176bef5f8 88010030
>  880176bef668 8800773c7c88 880178288098 8800772c0010
>  8800773c7818 0001 880176bef648 8150a86e
> Call Trace:
>  [] ? printk+0x46/0x48
>  [] intel_unmap_page+0xe/0x10
>  [] ismt_access+0x27b/0x8fa [i2c_ismt]
>  [] ? __pm_runtime_suspend+0xa0/0xa0
>  [] ? pm_suspend_timer_fn+0x80/0x80
>  [] ? __pm_runtime_suspend+0xa0/0xa0
>  [] ? pm_suspend_timer_fn+0x80/0x80
>  [] ? pci_bus_read_dev_vendor_id+0xf0/0xf0
>  [] i2c_smbus_xfer+0xec/0x4b0
>  [] ? vprintk_emit+0x345/0x530
>  [] i2cdev_ioctl_smbus+0x12b/0x240 [i2c_dev]
>  [] ? vprintk_default+0x29/0x40
>  [] i2cdev_ioctl+0x63/0x1ec [i2c_dev]
>  [] do_vfs_ioctl+0x328/0x5d0
>  [] ? vfs_write+0x11c/0x190
>  [] ? rt_up_read+0x19/0x20
>  [] SyS_ioctl+0x81/0xa0
>  [] system_call_fastpath+0x16/0x6e
>
> This happen When run "i2cdetect -y 0" detect SMBus iSMT adapter.
>
> After finished I2C block read/write, when unmap the data buffer,
> a wrong device address was pass to dma_unmap_single().
>
> To fix this, give dma_unmap_single() the "dev" parameter, just like
> what dma_map_single() does, then unmap can find the right devices.
>
> Fixes: 13f35ac14cd0 ("i2c: Adding support for Intel iSMT SMBus 2.0 host 
> controller")
> Signed-off-by: Liwei Song 
> ---
>  drivers/i2c/busses/i2c-ismt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/busses/i2c-ismt.c b/drivers/i2c/busses/i2c-ismt.c
> index f573448..e98e44e 100644
> --- a/drivers/i2c/busses/i2c-ismt.c
> +++ b/drivers/i2c/busses/i2c-ismt.c
> @@ -584,7 +584,7 @@ static int ismt_access(struct i2c_adapter *adap, u16 addr,
>
> /* unmap the data buffer */
> if (dma_size != 0)
> -   dma_unmap_single(>dev, dma_addr, dma_size, 
> dma_direction);
> +   dma_unmap_single(dev, dma_addr, dma_size, dma_direction);
>
> if (unlikely(!time_left)) {
> dev_err(dev, "completion wait timed out\n");
> --
> 2.7.4
>



-- 
 Dmitry


Re: [PATCH V2] i2c: ismt: fix wrong device address when unmap the data buffer

2018-08-14 Thread Dmitry Safonov
+Cc: stable

Hi Greg, JFI:
This one has hit a couple of times on autotests on v4.9 stable.
The fix for BUG() is trivial, so probably worth to ship it to v4.9/v4.4/v3.18.

2017-06-13 5:59 GMT+01:00 Song liwei :
> From: Liwei Song 
>
> Fix the following kernel bug:
>
> kernel BUG at drivers/iommu/intel-iommu.c:3260!
> invalid opcode:  [#5] PREEMPT SMP
> Hardware name: Intel Corp. Harcuvar/Server, BIOS 
> HAVLCRB0.X64.0013.D39.1608311820 08/31/2016
> task: 880175389950 ti: 880176bec000 task.ti: 880176bec000
> RIP: 0010:[]  [] intel_unmap+0x25b/0x260
> RSP: 0018:880176bef5e8  EFLAGS: 00010296
> RAX: 0024 RBX: 8800773c7c88 RCX: ce04
> RDX: 8000 RSI:  RDI: 0009
> RBP: 880176bef638 R08: 0010 R09: 0004
> R10: 880175389c78 R11: 0a4f R12: 8800773c7868
> R13: ac88 R14: 8800773c7818 R15: 0001
> FS:  7fef21258700() GS:88017b5c() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0066d6d8 CR3: 7118c000 CR4: 003406e0
> Stack:
>  ac88 8199867f 880176bef5f8 88010030
>  880176bef668 8800773c7c88 880178288098 8800772c0010
>  8800773c7818 0001 880176bef648 8150a86e
> Call Trace:
>  [] ? printk+0x46/0x48
>  [] intel_unmap_page+0xe/0x10
>  [] ismt_access+0x27b/0x8fa [i2c_ismt]
>  [] ? __pm_runtime_suspend+0xa0/0xa0
>  [] ? pm_suspend_timer_fn+0x80/0x80
>  [] ? __pm_runtime_suspend+0xa0/0xa0
>  [] ? pm_suspend_timer_fn+0x80/0x80
>  [] ? pci_bus_read_dev_vendor_id+0xf0/0xf0
>  [] i2c_smbus_xfer+0xec/0x4b0
>  [] ? vprintk_emit+0x345/0x530
>  [] i2cdev_ioctl_smbus+0x12b/0x240 [i2c_dev]
>  [] ? vprintk_default+0x29/0x40
>  [] i2cdev_ioctl+0x63/0x1ec [i2c_dev]
>  [] do_vfs_ioctl+0x328/0x5d0
>  [] ? vfs_write+0x11c/0x190
>  [] ? rt_up_read+0x19/0x20
>  [] SyS_ioctl+0x81/0xa0
>  [] system_call_fastpath+0x16/0x6e
>
> This happen When run "i2cdetect -y 0" detect SMBus iSMT adapter.
>
> After finished I2C block read/write, when unmap the data buffer,
> a wrong device address was pass to dma_unmap_single().
>
> To fix this, give dma_unmap_single() the "dev" parameter, just like
> what dma_map_single() does, then unmap can find the right devices.
>
> Fixes: 13f35ac14cd0 ("i2c: Adding support for Intel iSMT SMBus 2.0 host 
> controller")
> Signed-off-by: Liwei Song 
> ---
>  drivers/i2c/busses/i2c-ismt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/busses/i2c-ismt.c b/drivers/i2c/busses/i2c-ismt.c
> index f573448..e98e44e 100644
> --- a/drivers/i2c/busses/i2c-ismt.c
> +++ b/drivers/i2c/busses/i2c-ismt.c
> @@ -584,7 +584,7 @@ static int ismt_access(struct i2c_adapter *adap, u16 addr,
>
> /* unmap the data buffer */
> if (dma_size != 0)
> -   dma_unmap_single(>dev, dma_addr, dma_size, 
> dma_direction);
> +   dma_unmap_single(dev, dma_addr, dma_size, dma_direction);
>
> if (unlikely(!time_left)) {
> dev_err(dev, "completion wait timed out\n");
> --
> 2.7.4
>



-- 
 Dmitry


Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum SPI controller documentation

2018-08-14 Thread Baolin Wang
Hi Rob,

On 15 August 2018 at 04:27, Rob Herring  wrote:
> On Thu, Aug 09, 2018 at 11:03:11AM +0800, Baolin Wang wrote:
>> Hi Trent,
>>
>> On 9 August 2018 at 02:57, Trent Piepho  wrote:
>> > On Wed, 2018-08-08 at 11:54 +0100, Mark Brown wrote:
>> >> On Wed, Aug 08, 2018 at 06:35:28PM +0800, Baolin Wang wrote:
>> >> > On 8 August 2018 at 17:50, Mark Brown  wrote:
>> >> > > Right, I don't think we added this yet (if we did I can't see
>> >> > > it).  I'd
>> >> > > add a new field to spi_transfer for this, then other controllers
>> >> > > with
>> >> > > the same support can implement it as well and drivers can start
>> >> > > using
>> >> > > it too.
>> >> > OK. So I will name the new filed as 'word_delay', is it OK for you?
>> >>
>> >> Sounds good, yes.
>> >
>> > Should it be in µs like the existing iter-transfer delay?  I think
>> > perhaps units of cycles of the SPI clock make more sense?
>>
>> Since some SPI controllers just want some interval values (neither µs
>> unit nor cycles unit ) set into hardware, and the hardware will
>> convert to the correct delay time automatically. So I did not force
>> 'word_delay' unit as µs or cycle, and just let the slave devices
>> decide the unit which depends on the SPI hardware requirement.
>
> This needs to be defined units in DT, not decided by each controller.

Do you mean we should introduce one standard property (maybe named as
'word_delay_unit') to define the word_delay unit?
If we really need to specify the unit of word_delay, I think we can
add comments for spi_tansfer  to specify the unit, which will be
better.

diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index a64235e..7a72c0a 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -711,6 +711,8 @@ extern void spi_res_release(struct spi_controller *ctlr,
  * @delay_usecs: microseconds to delay after this transfer before
  * (optionally) changing the chipselect status, then starting
  * the next transfer or completing this @spi_message.
+ * @word_delay: clock cycles to inter word delay after each word size
(set by bits_per_word)
+ * transmission.
  * @transfer_list: transfers are sequenced through @spi_message.transfers
  * @tx_sg: Scatterlist for transmit, currently not for client use
  * @rx_sg: Scatterlist for receive, currently not for client use
@@ -793,6 +795,7 @@ struct spi_transfer {
  u8 bits_per_word;
  u16 delay_usecs;
  u32 speed_hz;
+ u16 word_delay;

  struct list_head transfer_list;
 };


-- 
Baolin Wang
Best Regards


Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum SPI controller documentation

2018-08-14 Thread Baolin Wang
Hi Rob,

On 15 August 2018 at 04:27, Rob Herring  wrote:
> On Thu, Aug 09, 2018 at 11:03:11AM +0800, Baolin Wang wrote:
>> Hi Trent,
>>
>> On 9 August 2018 at 02:57, Trent Piepho  wrote:
>> > On Wed, 2018-08-08 at 11:54 +0100, Mark Brown wrote:
>> >> On Wed, Aug 08, 2018 at 06:35:28PM +0800, Baolin Wang wrote:
>> >> > On 8 August 2018 at 17:50, Mark Brown  wrote:
>> >> > > Right, I don't think we added this yet (if we did I can't see
>> >> > > it).  I'd
>> >> > > add a new field to spi_transfer for this, then other controllers
>> >> > > with
>> >> > > the same support can implement it as well and drivers can start
>> >> > > using
>> >> > > it too.
>> >> > OK. So I will name the new filed as 'word_delay', is it OK for you?
>> >>
>> >> Sounds good, yes.
>> >
>> > Should it be in µs like the existing iter-transfer delay?  I think
>> > perhaps units of cycles of the SPI clock make more sense?
>>
>> Since some SPI controllers just want some interval values (neither µs
>> unit nor cycles unit ) set into hardware, and the hardware will
>> convert to the correct delay time automatically. So I did not force
>> 'word_delay' unit as µs or cycle, and just let the slave devices
>> decide the unit which depends on the SPI hardware requirement.
>
> This needs to be defined units in DT, not decided by each controller.

Do you mean we should introduce one standard property (maybe named as
'word_delay_unit') to define the word_delay unit?
If we really need to specify the unit of word_delay, I think we can
add comments for spi_tansfer  to specify the unit, which will be
better.

diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index a64235e..7a72c0a 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -711,6 +711,8 @@ extern void spi_res_release(struct spi_controller *ctlr,
  * @delay_usecs: microseconds to delay after this transfer before
  * (optionally) changing the chipselect status, then starting
  * the next transfer or completing this @spi_message.
+ * @word_delay: clock cycles to inter word delay after each word size
(set by bits_per_word)
+ * transmission.
  * @transfer_list: transfers are sequenced through @spi_message.transfers
  * @tx_sg: Scatterlist for transmit, currently not for client use
  * @rx_sg: Scatterlist for receive, currently not for client use
@@ -793,6 +795,7 @@ struct spi_transfer {
  u8 bits_per_word;
  u16 delay_usecs;
  u32 speed_hz;
+ u16 word_delay;

  struct list_head transfer_list;
 };


-- 
Baolin Wang
Best Regards


Re: [PATCH] Bugfix for handling of shadow doorbell buffer.

2018-08-14 Thread Linus Torvalds
On Tue, Aug 14, 2018 at 6:35 PM Michal Wnukowski  wrote:
>
> I got confused after comaring disassembly of this code with and
> without volatile keyword. Thanks for the correction.

Note that _usually_, the "volatile" has absolutely no impact. When
there is one read in the source code, it's almost always one access in
the generated code too.

That's particularly true if the read like this access do dbbuf_ei:

if (!nvme_dbbuf_need_event(*dbbuf_ei, value, old_value))

is only used as an argument to a real function call.

But if that function is an inline function (which it is), and the
argument ends up getting used multiple times (which in this case it is
not), then it is possible in theory that gcc ends up saying "instead
of reading the value once, and keep it in a register, I'll just read
it again for the second time".

That happens rarely, but it _can_ happen without the volatile (or the
READ_ONCE()).

The advantage of READ_ONCE() over using "volatile" in a data structure
tends to be that you explicitly mark the memory accesses that you care
about. That's nice documentation for whoever reads the code (and it's
not unheard of that the _same_ data structure is sometimes volatile,
and sometimes not - for example, the data structure might be protected
by a lock - not volatile - but people might use an optimistic lockless
access to the value too - when it ends up being volatile. So then it's
really good to explicitly use READ_ONCE() for the volatile cases where
you show that you know that you're now doing something special that
really depends on memory ordering or other magic "access exactly once"
behavior.


> > I'm assuming that's the actual controller hardware, but it needs a
> > comment about *that* access being ordered too, because if it isn't,
> > then ordering this side is pointless.
>
> The other side in this case is not actual controller hardware, but
> virtual one (the regular hardware should rely on normal MMIO
> doorbells).

Ok, Maybe worth adding a one-line note about the ordering guarantees
by the virtual controller accesses.

   Linus


Re: [PATCH] Bugfix for handling of shadow doorbell buffer.

2018-08-14 Thread Linus Torvalds
On Tue, Aug 14, 2018 at 6:35 PM Michal Wnukowski  wrote:
>
> I got confused after comaring disassembly of this code with and
> without volatile keyword. Thanks for the correction.

Note that _usually_, the "volatile" has absolutely no impact. When
there is one read in the source code, it's almost always one access in
the generated code too.

That's particularly true if the read like this access do dbbuf_ei:

if (!nvme_dbbuf_need_event(*dbbuf_ei, value, old_value))

is only used as an argument to a real function call.

But if that function is an inline function (which it is), and the
argument ends up getting used multiple times (which in this case it is
not), then it is possible in theory that gcc ends up saying "instead
of reading the value once, and keep it in a register, I'll just read
it again for the second time".

That happens rarely, but it _can_ happen without the volatile (or the
READ_ONCE()).

The advantage of READ_ONCE() over using "volatile" in a data structure
tends to be that you explicitly mark the memory accesses that you care
about. That's nice documentation for whoever reads the code (and it's
not unheard of that the _same_ data structure is sometimes volatile,
and sometimes not - for example, the data structure might be protected
by a lock - not volatile - but people might use an optimistic lockless
access to the value too - when it ends up being volatile. So then it's
really good to explicitly use READ_ONCE() for the volatile cases where
you show that you know that you're now doing something special that
really depends on memory ordering or other magic "access exactly once"
behavior.


> > I'm assuming that's the actual controller hardware, but it needs a
> > comment about *that* access being ordered too, because if it isn't,
> > then ordering this side is pointless.
>
> The other side in this case is not actual controller hardware, but
> virtual one (the regular hardware should rely on normal MMIO
> doorbells).

Ok, Maybe worth adding a one-line note about the ordering guarantees
by the virtual controller accesses.

   Linus


Re: linux-next: manual merge of the tip tree with the rdma tree

2018-08-14 Thread Stephen Rothwell
Hi all,

On Mon, 6 Aug 2018 14:53:31 +1000 Stephen Rothwell  
wrote:
>
> Today's linux-next merge of the tip tree got a conflict in:
> 
>   drivers/infiniband/core/rdma_core.c
> 
> between commit:
> 
>   9867f5c6695f ("IB/uverbs: Convert 'bool exclusive' into an enum")
> 
> from the rdma tree and commit:
> 
>   bfc18e389c7a ("atomics/treewide: Rename __atomic_add_unless() => 
> atomic_fetch_add_unless()")
> 
> from the tip tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> -- 
> Cheers,
> Stephen Rothwell
> 
> diff --cc drivers/infiniband/core/rdma_core.c
> index 4235b9ddc2ad,475910ffbcb6..
> --- a/drivers/infiniband/core/rdma_core.c
> +++ b/drivers/infiniband/core/rdma_core.c
> @@@ -122,206 -120,19 +122,206 @@@ static int uverbs_try_lock_object(struc
>* concurrently, setting the counter to zero is enough for releasing
>* this lock.
>*/
>  -if (!exclusive)
>  +switch (mode) {
>  +case UVERBS_LOOKUP_READ:
> - return __atomic_add_unless(>usecnt, 1, -1) == -1 ?
> + return atomic_fetch_add_unless(>usecnt, 1, -1) == -1 ?
>   -EBUSY : 0;
>  +case UVERBS_LOOKUP_WRITE:
>  +/* lock is exclusive */
>  +return atomic_cmpxchg(>usecnt, 0, -1) == 0 ? 0 : -EBUSY;
>  +case UVERBS_LOOKUP_DESTROY:
>  +return 0;
>  +}
>  +return 0;
>  +}
>  +
>  +static void assert_uverbs_usecnt(struct ib_uobject *uobj,
>  + enum rdma_lookup_mode mode)
>  +{
>  +#ifdef CONFIG_LOCKDEP
>  +switch (mode) {
>  +case UVERBS_LOOKUP_READ:
>  +WARN_ON(atomic_read(>usecnt) <= 0);
>  +break;
>  +case UVERBS_LOOKUP_WRITE:
>  +WARN_ON(atomic_read(>usecnt) != -1);
>  +break;
>  +case UVERBS_LOOKUP_DESTROY:
>  +break;
>  +}
>  +#endif
>  +}
>  +
>  +/*
>  + * This must be called with the hw_destroy_rwsem locked for read or write,
>  + * also the uobject itself must be locked for write.
>  + *
>  + * Upon return the HW object is guaranteed to be destroyed.
>  + *
>  + * For RDMA_REMOVE_ABORT, the hw_destroy_rwsem is not required to be held,
>  + * however the type's allocat_commit function cannot have been called and 
> the
>  + * uobject cannot be on the uobjects_lists
>  + *
>  + * For RDMA_REMOVE_DESTROY the caller shold be holding a kref (eg via
>  + * rdma_lookup_get_uobject) and the object is left in a state where the 
> caller
>  + * needs to call rdma_lookup_put_uobject.
>  + *
>  + * For all other destroy modes this function internally unlocks the uobject
>  + * and consumes the kref on the uobj.
>  + */
>  +static int uverbs_destroy_uobject(struct ib_uobject *uobj,
>  +  enum rdma_remove_reason reason)
>  +{
>  +struct ib_uverbs_file *ufile = uobj->ufile;
>  +unsigned long flags;
>  +int ret;
>  +
>  +lockdep_assert_held(>hw_destroy_rwsem);
>  +assert_uverbs_usecnt(uobj, UVERBS_LOOKUP_WRITE);
>  +
>  +if (uobj->object) {
>  +ret = uobj->type->type_class->destroy_hw(uobj, reason);
>  +if (ret) {
>  +if (ib_is_destroy_retryable(ret, reason, uobj))
>  +return ret;
>  +
>  +/* Nothing to be done, dangle the memory and move on */
>  +WARN(true,
>  + "ib_uverbs: failed to remove uobject id %d, driver 
> err=%d",
>  + uobj->id, ret);
>  +}
>  +
>  +uobj->object = NULL;
>  +}
>  +
>  +if (reason == RDMA_REMOVE_ABORT) {
>  +WARN_ON(!list_empty(>list));
>  +WARN_ON(!uobj->context);
>  +uobj->type->type_class->alloc_abort(uobj);
>  +}
>  +
>  +uobj->context = NULL;
>  +
>  +/*
>  + * For DESTROY the usecnt is held write locked, the caller is expected
>  + * to put it unlock and put the object when done with it. Only DESTROY
>  + * can remove the IDR handle.
>  + */
>  +if (reason != RDMA_REMOVE_DESTROY)
>  +atomic_set(>usecnt, 0);
>  +else
>  +uobj->type->type_class->remove_handle(uobj);
>  +
>  +if (!list_empty(>list)) {
>  +spin_lock_irqsave(>uobjects_lock, flags);
>  +list_del_init(>list);
>  +spin_unlock_irqrestore(>uobjects_lock, flags);
>  +
>  +/*
>  + * Pairs with the get in rdma_alloc_commit_uobject(), could
>  + * destroy uobj.
>  + */
>  +uverbs_uobject_put(uobj);
>  +}
>   
>  -/* lock is either WRITE or DESTROY - should 

Re: linux-next: manual merge of the tip tree with the rdma tree

2018-08-14 Thread Stephen Rothwell
Hi all,

On Mon, 6 Aug 2018 14:53:31 +1000 Stephen Rothwell  
wrote:
>
> Today's linux-next merge of the tip tree got a conflict in:
> 
>   drivers/infiniband/core/rdma_core.c
> 
> between commit:
> 
>   9867f5c6695f ("IB/uverbs: Convert 'bool exclusive' into an enum")
> 
> from the rdma tree and commit:
> 
>   bfc18e389c7a ("atomics/treewide: Rename __atomic_add_unless() => 
> atomic_fetch_add_unless()")
> 
> from the tip tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> -- 
> Cheers,
> Stephen Rothwell
> 
> diff --cc drivers/infiniband/core/rdma_core.c
> index 4235b9ddc2ad,475910ffbcb6..
> --- a/drivers/infiniband/core/rdma_core.c
> +++ b/drivers/infiniband/core/rdma_core.c
> @@@ -122,206 -120,19 +122,206 @@@ static int uverbs_try_lock_object(struc
>* concurrently, setting the counter to zero is enough for releasing
>* this lock.
>*/
>  -if (!exclusive)
>  +switch (mode) {
>  +case UVERBS_LOOKUP_READ:
> - return __atomic_add_unless(>usecnt, 1, -1) == -1 ?
> + return atomic_fetch_add_unless(>usecnt, 1, -1) == -1 ?
>   -EBUSY : 0;
>  +case UVERBS_LOOKUP_WRITE:
>  +/* lock is exclusive */
>  +return atomic_cmpxchg(>usecnt, 0, -1) == 0 ? 0 : -EBUSY;
>  +case UVERBS_LOOKUP_DESTROY:
>  +return 0;
>  +}
>  +return 0;
>  +}
>  +
>  +static void assert_uverbs_usecnt(struct ib_uobject *uobj,
>  + enum rdma_lookup_mode mode)
>  +{
>  +#ifdef CONFIG_LOCKDEP
>  +switch (mode) {
>  +case UVERBS_LOOKUP_READ:
>  +WARN_ON(atomic_read(>usecnt) <= 0);
>  +break;
>  +case UVERBS_LOOKUP_WRITE:
>  +WARN_ON(atomic_read(>usecnt) != -1);
>  +break;
>  +case UVERBS_LOOKUP_DESTROY:
>  +break;
>  +}
>  +#endif
>  +}
>  +
>  +/*
>  + * This must be called with the hw_destroy_rwsem locked for read or write,
>  + * also the uobject itself must be locked for write.
>  + *
>  + * Upon return the HW object is guaranteed to be destroyed.
>  + *
>  + * For RDMA_REMOVE_ABORT, the hw_destroy_rwsem is not required to be held,
>  + * however the type's allocat_commit function cannot have been called and 
> the
>  + * uobject cannot be on the uobjects_lists
>  + *
>  + * For RDMA_REMOVE_DESTROY the caller shold be holding a kref (eg via
>  + * rdma_lookup_get_uobject) and the object is left in a state where the 
> caller
>  + * needs to call rdma_lookup_put_uobject.
>  + *
>  + * For all other destroy modes this function internally unlocks the uobject
>  + * and consumes the kref on the uobj.
>  + */
>  +static int uverbs_destroy_uobject(struct ib_uobject *uobj,
>  +  enum rdma_remove_reason reason)
>  +{
>  +struct ib_uverbs_file *ufile = uobj->ufile;
>  +unsigned long flags;
>  +int ret;
>  +
>  +lockdep_assert_held(>hw_destroy_rwsem);
>  +assert_uverbs_usecnt(uobj, UVERBS_LOOKUP_WRITE);
>  +
>  +if (uobj->object) {
>  +ret = uobj->type->type_class->destroy_hw(uobj, reason);
>  +if (ret) {
>  +if (ib_is_destroy_retryable(ret, reason, uobj))
>  +return ret;
>  +
>  +/* Nothing to be done, dangle the memory and move on */
>  +WARN(true,
>  + "ib_uverbs: failed to remove uobject id %d, driver 
> err=%d",
>  + uobj->id, ret);
>  +}
>  +
>  +uobj->object = NULL;
>  +}
>  +
>  +if (reason == RDMA_REMOVE_ABORT) {
>  +WARN_ON(!list_empty(>list));
>  +WARN_ON(!uobj->context);
>  +uobj->type->type_class->alloc_abort(uobj);
>  +}
>  +
>  +uobj->context = NULL;
>  +
>  +/*
>  + * For DESTROY the usecnt is held write locked, the caller is expected
>  + * to put it unlock and put the object when done with it. Only DESTROY
>  + * can remove the IDR handle.
>  + */
>  +if (reason != RDMA_REMOVE_DESTROY)
>  +atomic_set(>usecnt, 0);
>  +else
>  +uobj->type->type_class->remove_handle(uobj);
>  +
>  +if (!list_empty(>list)) {
>  +spin_lock_irqsave(>uobjects_lock, flags);
>  +list_del_init(>list);
>  +spin_unlock_irqrestore(>uobjects_lock, flags);
>  +
>  +/*
>  + * Pairs with the get in rdma_alloc_commit_uobject(), could
>  + * destroy uobj.
>  + */
>  +uverbs_uobject_put(uobj);
>  +}
>   
>  -/* lock is either WRITE or DESTROY - should 

Re: [PATCH v23 3/4] arm64: dts: mt8173: Add GCE node

2018-08-14 Thread houlong wei
On Wed, 2018-07-25 at 09:26 +0800, Houlong Wei wrote:
> This patch adds the device node of the GCE hardware for CMDQ module.
> 
> Signed-off-by: Houlong Wei 
> Signed-off-by: HS Liao 
> ---
>  arch/arm64/boot/dts/mediatek/mt8173.dtsi |   10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/mediatek/mt8173.dtsi 
> b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
> index 94597e3..97b1ec6 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi
> +++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "mt8173-pinfunc.h"
>  
>  / {
> @@ -519,6 +520,15 @@
>   status = "disabled";
>   };
>  
> + gce: mailbox@10212000 {
> + compatible = "mediatek,mt8173-gce";
> + reg = <0 0x10212000 0 0x1000>;
> + interrupts = ;
> + clocks = < CLK_INFRA_GCE>;
> + clock-names = "gce";
> + #mbox-cells = <3>;
> + };
> +
>   mipi_tx0: mipi-dphy@10215000 {
>   compatible = "mediatek,mt8173-mipi-tx";
>   reg = <0 0x10215000 0 0x1000>;

Hi Matthias,
  Could you please review this patch when you are available? Thanks a
lot.



Re: [PATCH v23 3/4] arm64: dts: mt8173: Add GCE node

2018-08-14 Thread houlong wei
On Wed, 2018-07-25 at 09:26 +0800, Houlong Wei wrote:
> This patch adds the device node of the GCE hardware for CMDQ module.
> 
> Signed-off-by: Houlong Wei 
> Signed-off-by: HS Liao 
> ---
>  arch/arm64/boot/dts/mediatek/mt8173.dtsi |   10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/mediatek/mt8173.dtsi 
> b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
> index 94597e3..97b1ec6 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi
> +++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "mt8173-pinfunc.h"
>  
>  / {
> @@ -519,6 +520,15 @@
>   status = "disabled";
>   };
>  
> + gce: mailbox@10212000 {
> + compatible = "mediatek,mt8173-gce";
> + reg = <0 0x10212000 0 0x1000>;
> + interrupts = ;
> + clocks = < CLK_INFRA_GCE>;
> + clock-names = "gce";
> + #mbox-cells = <3>;
> + };
> +
>   mipi_tx0: mipi-dphy@10215000 {
>   compatible = "mediatek,mt8173-mipi-tx";
>   reg = <0 0x10215000 0 0x1000>;

Hi Matthias,
  Could you please review this patch when you are available? Thanks a
lot.



Re: [PATCH v23 4/4] soc: mediatek: Add Mediatek CMDQ helper

2018-08-14 Thread houlong wei
On Wed, 2018-07-25 at 09:26 +0800, Houlong Wei wrote:
> Add Mediatek CMDQ helper to create CMDQ packet and assemble GCE op code.
> 
> Signed-off-by: Houlong Wei 
> Signed-off-by: HS Liao 
> ---
>  drivers/soc/mediatek/Kconfig   |   12 ++
>  drivers/soc/mediatek/Makefile  |1 +
>  drivers/soc/mediatek/mtk-cmdq-helper.c |  291 
> 
>  include/linux/soc/mediatek/mtk-cmdq.h  |  135 +++
>  4 files changed, 439 insertions(+)
>  create mode 100644 drivers/soc/mediatek/mtk-cmdq-helper.c
>  create mode 100644 include/linux/soc/mediatek/mtk-cmdq.h
> 
> diff --git a/drivers/soc/mediatek/Kconfig b/drivers/soc/mediatek/Kconfig
> index a7d0667..17bd759 100644
> --- a/drivers/soc/mediatek/Kconfig
> +++ b/drivers/soc/mediatek/Kconfig
> @@ -4,6 +4,18 @@
>  menu "MediaTek SoC drivers"
>   depends on ARCH_MEDIATEK || COMPILE_TEST
>  
> +config MTK_CMDQ
> + tristate "MediaTek CMDQ Support"
> + depends on ARCH_MEDIATEK || COMPILE_TEST
> + select MAILBOX
> + select MTK_CMDQ_MBOX
> + select MTK_INFRACFG
> + help
> +   Say yes here to add support for the MediaTek Command Queue (CMDQ)
> +   driver. The CMDQ is used to help read/write registers with critical
> +   time limitation, such as updating display configuration during the
> +   vblank.
> +
>  config MTK_INFRACFG
>   bool "MediaTek INFRACFG Support"
>   select REGMAP
> diff --git a/drivers/soc/mediatek/Makefile b/drivers/soc/mediatek/Makefile
> index 12998b0..64ce5ee 100644
> --- a/drivers/soc/mediatek/Makefile
> +++ b/drivers/soc/mediatek/Makefile
> @@ -1,3 +1,4 @@
> +obj-$(CONFIG_MTK_CMDQ) += mtk-cmdq-helper.o
>  obj-$(CONFIG_MTK_INFRACFG) += mtk-infracfg.o
>  obj-$(CONFIG_MTK_PMIC_WRAP) += mtk-pmic-wrap.o
>  obj-$(CONFIG_MTK_SCPSYS) += mtk-scpsys.o
> diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c 
> b/drivers/soc/mediatek/mtk-cmdq-helper.c
> new file mode 100644
> index 000..e4dbb7e
> --- /dev/null
> +++ b/drivers/soc/mediatek/mtk-cmdq-helper.c
> @@ -0,0 +1,291 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// Copyright (c) 2018 MediaTek Inc.
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define CMDQ_ARG_A_WRITE_MASK0x
> +#define CMDQ_WRITE_ENABLE_MASK   BIT(0)
> +#define CMDQ_EOC_IRQ_EN  BIT(0)
> +#define CMDQ_EOC_CMD ((u64)((CMDQ_CODE_EOC << CMDQ_OP_CODE_SHIFT)) \
> + << 32 | CMDQ_EOC_IRQ_EN)
> +
> +static void cmdq_client_timeout(struct timer_list *t)
> +{
> + struct cmdq_client *client = from_timer(client, t, timer);
> +
> + dev_err(client->client.dev, "cmdq timeout!\n");
> +}
> +
> +struct cmdq_client *cmdq_mbox_create(struct device *dev, int index, u32 
> timeout)
> +{
> + struct cmdq_client *client;
> +
> + client = kzalloc(sizeof(*client), GFP_KERNEL);
> + if (!client)
> + return (struct cmdq_client *)-ENOMEM;
> +
> + client->timeout_ms = timeout;
> + if (timeout != CMDQ_NO_TIMEOUT) {
> + spin_lock_init(>lock);
> + timer_setup(>timer, cmdq_client_timeout, 0);
> + }
> + client->pkt_cnt = 0;
> + client->client.dev = dev;
> + client->client.tx_block = false;
> + client->chan = mbox_request_channel(>client, index);
> +
> + if (IS_ERR(client->chan)) {
> + long err = 0;
> +
> + dev_err(dev, "failed to request channel\n");
> + err = PTR_ERR(client->chan);
> + kfree(client);
> +
> + return (struct cmdq_client *)err;
> + }
> +
> + return client;
> +}
> +EXPORT_SYMBOL(cmdq_mbox_create);
> +
> +void cmdq_mbox_destroy(struct cmdq_client *client)
> +{
> + if (client->timeout_ms != CMDQ_NO_TIMEOUT) {
> + spin_lock(>lock);
> + del_timer_sync(>timer);
> + spin_unlock(>lock);
> + }
> + mbox_free_channel(client->chan);
> + kfree(client);
> +}
> +EXPORT_SYMBOL(cmdq_mbox_destroy);
> +
> +int cmdq_pkt_create(struct cmdq_client *client, struct cmdq_pkt **pkt_ptr,
> + size_t size)
> +{
> + struct cmdq_pkt *pkt;
> + struct device *dev;
> + dma_addr_t dma_addr;
> +
> + pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> + if (!pkt)
> + return -ENOMEM;
> + pkt->va_base = kzalloc(size, GFP_KERNEL);
> + if (!pkt->va_base) {
> + kfree(pkt);
> + return -ENOMEM;
> + }
> + pkt->buf_size = size;
> + pkt->cl = (void *)client;
> +
> + dev = client->chan->mbox->dev;
> + dma_addr = dma_map_single(dev, pkt->va_base, pkt->buf_size,
> + DMA_TO_DEVICE);
> + if (dma_mapping_error(dev, dma_addr)) {
> + dev_err(dev, "dma map failed, size=%u\n", (u32)(u64)size);
> + kfree(pkt->va_base);
> + kfree(pkt);
> + return -ENOMEM;
> + }
> +
> + pkt->pa_base = dma_addr;
> + *pkt_ptr = pkt;
> +
> + return 0;
> +}
> 

Re: [PATCH v23 4/4] soc: mediatek: Add Mediatek CMDQ helper

2018-08-14 Thread houlong wei
On Wed, 2018-07-25 at 09:26 +0800, Houlong Wei wrote:
> Add Mediatek CMDQ helper to create CMDQ packet and assemble GCE op code.
> 
> Signed-off-by: Houlong Wei 
> Signed-off-by: HS Liao 
> ---
>  drivers/soc/mediatek/Kconfig   |   12 ++
>  drivers/soc/mediatek/Makefile  |1 +
>  drivers/soc/mediatek/mtk-cmdq-helper.c |  291 
> 
>  include/linux/soc/mediatek/mtk-cmdq.h  |  135 +++
>  4 files changed, 439 insertions(+)
>  create mode 100644 drivers/soc/mediatek/mtk-cmdq-helper.c
>  create mode 100644 include/linux/soc/mediatek/mtk-cmdq.h
> 
> diff --git a/drivers/soc/mediatek/Kconfig b/drivers/soc/mediatek/Kconfig
> index a7d0667..17bd759 100644
> --- a/drivers/soc/mediatek/Kconfig
> +++ b/drivers/soc/mediatek/Kconfig
> @@ -4,6 +4,18 @@
>  menu "MediaTek SoC drivers"
>   depends on ARCH_MEDIATEK || COMPILE_TEST
>  
> +config MTK_CMDQ
> + tristate "MediaTek CMDQ Support"
> + depends on ARCH_MEDIATEK || COMPILE_TEST
> + select MAILBOX
> + select MTK_CMDQ_MBOX
> + select MTK_INFRACFG
> + help
> +   Say yes here to add support for the MediaTek Command Queue (CMDQ)
> +   driver. The CMDQ is used to help read/write registers with critical
> +   time limitation, such as updating display configuration during the
> +   vblank.
> +
>  config MTK_INFRACFG
>   bool "MediaTek INFRACFG Support"
>   select REGMAP
> diff --git a/drivers/soc/mediatek/Makefile b/drivers/soc/mediatek/Makefile
> index 12998b0..64ce5ee 100644
> --- a/drivers/soc/mediatek/Makefile
> +++ b/drivers/soc/mediatek/Makefile
> @@ -1,3 +1,4 @@
> +obj-$(CONFIG_MTK_CMDQ) += mtk-cmdq-helper.o
>  obj-$(CONFIG_MTK_INFRACFG) += mtk-infracfg.o
>  obj-$(CONFIG_MTK_PMIC_WRAP) += mtk-pmic-wrap.o
>  obj-$(CONFIG_MTK_SCPSYS) += mtk-scpsys.o
> diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c 
> b/drivers/soc/mediatek/mtk-cmdq-helper.c
> new file mode 100644
> index 000..e4dbb7e
> --- /dev/null
> +++ b/drivers/soc/mediatek/mtk-cmdq-helper.c
> @@ -0,0 +1,291 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// Copyright (c) 2018 MediaTek Inc.
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define CMDQ_ARG_A_WRITE_MASK0x
> +#define CMDQ_WRITE_ENABLE_MASK   BIT(0)
> +#define CMDQ_EOC_IRQ_EN  BIT(0)
> +#define CMDQ_EOC_CMD ((u64)((CMDQ_CODE_EOC << CMDQ_OP_CODE_SHIFT)) \
> + << 32 | CMDQ_EOC_IRQ_EN)
> +
> +static void cmdq_client_timeout(struct timer_list *t)
> +{
> + struct cmdq_client *client = from_timer(client, t, timer);
> +
> + dev_err(client->client.dev, "cmdq timeout!\n");
> +}
> +
> +struct cmdq_client *cmdq_mbox_create(struct device *dev, int index, u32 
> timeout)
> +{
> + struct cmdq_client *client;
> +
> + client = kzalloc(sizeof(*client), GFP_KERNEL);
> + if (!client)
> + return (struct cmdq_client *)-ENOMEM;
> +
> + client->timeout_ms = timeout;
> + if (timeout != CMDQ_NO_TIMEOUT) {
> + spin_lock_init(>lock);
> + timer_setup(>timer, cmdq_client_timeout, 0);
> + }
> + client->pkt_cnt = 0;
> + client->client.dev = dev;
> + client->client.tx_block = false;
> + client->chan = mbox_request_channel(>client, index);
> +
> + if (IS_ERR(client->chan)) {
> + long err = 0;
> +
> + dev_err(dev, "failed to request channel\n");
> + err = PTR_ERR(client->chan);
> + kfree(client);
> +
> + return (struct cmdq_client *)err;
> + }
> +
> + return client;
> +}
> +EXPORT_SYMBOL(cmdq_mbox_create);
> +
> +void cmdq_mbox_destroy(struct cmdq_client *client)
> +{
> + if (client->timeout_ms != CMDQ_NO_TIMEOUT) {
> + spin_lock(>lock);
> + del_timer_sync(>timer);
> + spin_unlock(>lock);
> + }
> + mbox_free_channel(client->chan);
> + kfree(client);
> +}
> +EXPORT_SYMBOL(cmdq_mbox_destroy);
> +
> +int cmdq_pkt_create(struct cmdq_client *client, struct cmdq_pkt **pkt_ptr,
> + size_t size)
> +{
> + struct cmdq_pkt *pkt;
> + struct device *dev;
> + dma_addr_t dma_addr;
> +
> + pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> + if (!pkt)
> + return -ENOMEM;
> + pkt->va_base = kzalloc(size, GFP_KERNEL);
> + if (!pkt->va_base) {
> + kfree(pkt);
> + return -ENOMEM;
> + }
> + pkt->buf_size = size;
> + pkt->cl = (void *)client;
> +
> + dev = client->chan->mbox->dev;
> + dma_addr = dma_map_single(dev, pkt->va_base, pkt->buf_size,
> + DMA_TO_DEVICE);
> + if (dma_mapping_error(dev, dma_addr)) {
> + dev_err(dev, "dma map failed, size=%u\n", (u32)(u64)size);
> + kfree(pkt->va_base);
> + kfree(pkt);
> + return -ENOMEM;
> + }
> +
> + pkt->pa_base = dma_addr;
> + *pkt_ptr = pkt;
> +
> + return 0;
> +}
> 

Re: linux-next: manual merge of the block tree with the rdma tree

2018-08-14 Thread Stephen Rothwell
Hi all,

On Thu, 26 Jul 2018 13:58:04 +1000 Stephen Rothwell  
wrote:
>
> Today's linux-next merge of the block tree got a conflict in:
> 
>   drivers/nvme/target/rdma.c
> 
> between commit:
> 
>   23f96d1f15a7 ("nvmet-rdma: Simplify ib_post_(send|recv|srq_recv)() calls")
>   202093848cac ("nvmet-rdma: add an error flow for post_recv failures")
> 
> from the rdma tree and commits:
> 
>   2fc464e2162c ("nvmet-rdma: add unlikely check in the fast path")
> 
> from the block tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> -- 
> Cheers,
> Stephen Rothwell
> 
> diff --cc drivers/nvme/target/rdma.c
> index 1a642e214a4c,e7f43d1e1779..
> --- a/drivers/nvme/target/rdma.c
> +++ b/drivers/nvme/target/rdma.c
> @@@ -382,13 -435,22 +435,21 @@@ static void nvmet_rdma_free_rsps(struc
>   static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev,
>   struct nvmet_rdma_cmd *cmd)
>   {
>  -struct ib_recv_wr *bad_wr;
> + int ret;
> + 
>   ib_dma_sync_single_for_device(ndev->device,
>   cmd->sge[0].addr, cmd->sge[0].length,
>   DMA_FROM_DEVICE);
>   
>   if (ndev->srq)
> - return ib_post_srq_recv(ndev->srq, >wr, NULL);
> - return ib_post_recv(cmd->queue->cm_id->qp, >wr, NULL);
>  -ret = ib_post_srq_recv(ndev->srq, >wr, _wr);
> ++ret = ib_post_srq_recv(ndev->srq, >wr, NULL);
> + else
>  -ret = ib_post_recv(cmd->queue->cm_id->qp, >wr, _wr);
> ++ret = ib_post_recv(cmd->queue->cm_id->qp, >wr, NULL);
> + 
> + if (unlikely(ret))
> + pr_err("post_recv cmd failed\n");
> + 
> + return ret;
>   }
>   
>   static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue)
> @@@ -491,7 -553,7 +552,7 @@@ static void nvmet_rdma_queue_response(s
>   rsp->send_sge.addr, rsp->send_sge.length,
>   DMA_TO_DEVICE);
>   
> - if (ib_post_send(cm_id->qp, first_wr, NULL)) {
>  -if (unlikely(ib_post_send(cm_id->qp, first_wr, _wr))) {
> ++if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
>   pr_err("sending cmd response failed\n");
>   nvmet_rdma_release_rsp(rsp);
>   }

This is now a conflict between Linus' tree and the rdma tree.

-- 
Cheers,
Stephen Rothwell


pgpIIARNDrEL8.pgp
Description: OpenPGP digital signature


Re: linux-next: manual merge of the block tree with the rdma tree

2018-08-14 Thread Stephen Rothwell
Hi all,

On Thu, 26 Jul 2018 13:58:04 +1000 Stephen Rothwell  
wrote:
>
> Today's linux-next merge of the block tree got a conflict in:
> 
>   drivers/nvme/target/rdma.c
> 
> between commit:
> 
>   23f96d1f15a7 ("nvmet-rdma: Simplify ib_post_(send|recv|srq_recv)() calls")
>   202093848cac ("nvmet-rdma: add an error flow for post_recv failures")
> 
> from the rdma tree and commits:
> 
>   2fc464e2162c ("nvmet-rdma: add unlikely check in the fast path")
> 
> from the block tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> -- 
> Cheers,
> Stephen Rothwell
> 
> diff --cc drivers/nvme/target/rdma.c
> index 1a642e214a4c,e7f43d1e1779..
> --- a/drivers/nvme/target/rdma.c
> +++ b/drivers/nvme/target/rdma.c
> @@@ -382,13 -435,22 +435,21 @@@ static void nvmet_rdma_free_rsps(struc
>   static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev,
>   struct nvmet_rdma_cmd *cmd)
>   {
>  -struct ib_recv_wr *bad_wr;
> + int ret;
> + 
>   ib_dma_sync_single_for_device(ndev->device,
>   cmd->sge[0].addr, cmd->sge[0].length,
>   DMA_FROM_DEVICE);
>   
>   if (ndev->srq)
> - return ib_post_srq_recv(ndev->srq, >wr, NULL);
> - return ib_post_recv(cmd->queue->cm_id->qp, >wr, NULL);
>  -ret = ib_post_srq_recv(ndev->srq, >wr, _wr);
> ++ret = ib_post_srq_recv(ndev->srq, >wr, NULL);
> + else
>  -ret = ib_post_recv(cmd->queue->cm_id->qp, >wr, _wr);
> ++ret = ib_post_recv(cmd->queue->cm_id->qp, >wr, NULL);
> + 
> + if (unlikely(ret))
> + pr_err("post_recv cmd failed\n");
> + 
> + return ret;
>   }
>   
>   static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue)
> @@@ -491,7 -553,7 +552,7 @@@ static void nvmet_rdma_queue_response(s
>   rsp->send_sge.addr, rsp->send_sge.length,
>   DMA_TO_DEVICE);
>   
> - if (ib_post_send(cm_id->qp, first_wr, NULL)) {
>  -if (unlikely(ib_post_send(cm_id->qp, first_wr, _wr))) {
> ++if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
>   pr_err("sending cmd response failed\n");
>   nvmet_rdma_release_rsp(rsp);
>   }

This is now a conflict between Linus' tree and the rdma tree.

-- 
Cheers,
Stephen Rothwell


pgpIIARNDrEL8.pgp
Description: OpenPGP digital signature


Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum SPI controller documentation

2018-08-14 Thread Baolin Wang
Hi Rob,

On 15 August 2018 at 04:21, Rob Herring  wrote:
> On Tue, Aug 07, 2018 at 06:43:37PM +0800, Baolin Wang wrote:
>> From: Lanqing Liu 
>>
>> This patch adds the binding documentation for Spreadtrum SPI
>> controller device.
>>
>> Signed-off-by: Lanqing Liu 
>> Signed-off-by: Baolin Wang 
>> ---
>>  Documentation/devicetree/bindings/spi/spi-sprd.txt |   31 
>> 
>>  1 file changed, 31 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/spi/spi-sprd.txt
>>
>> diff --git a/Documentation/devicetree/bindings/spi/spi-sprd.txt 
>> b/Documentation/devicetree/bindings/spi/spi-sprd.txt
>> new file mode 100644
>> index 000..06ff746
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/spi/spi-sprd.txt
>> @@ -0,0 +1,31 @@
>> +Spreadtrum SPI Controller
>> +
>> +Required properties:
>> +- compatible: Should be "sprd,sc9860-spi".
>> +- reg: Offset and length of SPI controller register space.
>> +- interrupts: Should contain SPI interrupt.
>> +- clock-names: Should contain following entries:
>> + "spi" for SPI clock,
>> + "source" for SPI source (parent) clock,
>
> Do the h/w block actually get this clock or the driver needs it? In the
> latter case, it should not be in DT.

Yes, we will get the actual SPI source clock form the "source" clock property.

>
>> + "enable" for SPI module enable clock.
>> +- clocks: List of clock input name strings sorted in the same order
>> + as the clock-names property.
>> +- #address-cells: The number of cells required to define a chip select
>> + address on the SPI bus. Should be set to 1.
>> +- #size-cells: Should be set to 0.
>> +
>> +Optional properties:
>> +- sprd,spi-interval: Specify the intervals of two SPI frames, which can be
>> + converted to the delay clock cycles = interval number * 4 + 10.
>
> There are read and write delay properties you can use.

Right. We've agreed introducing another field for spi_transfer to
represent this and will remove 'sprd,spi-interval' from DT.

-- 
Baolin Wang
Best Regards


Re: [PATCH 1/2] dt-bindings: spi: Add Spreadtrum SPI controller documentation

2018-08-14 Thread Baolin Wang
Hi Rob,

On 15 August 2018 at 04:21, Rob Herring  wrote:
> On Tue, Aug 07, 2018 at 06:43:37PM +0800, Baolin Wang wrote:
>> From: Lanqing Liu 
>>
>> This patch adds the binding documentation for Spreadtrum SPI
>> controller device.
>>
>> Signed-off-by: Lanqing Liu 
>> Signed-off-by: Baolin Wang 
>> ---
>>  Documentation/devicetree/bindings/spi/spi-sprd.txt |   31 
>> 
>>  1 file changed, 31 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/spi/spi-sprd.txt
>>
>> diff --git a/Documentation/devicetree/bindings/spi/spi-sprd.txt 
>> b/Documentation/devicetree/bindings/spi/spi-sprd.txt
>> new file mode 100644
>> index 000..06ff746
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/spi/spi-sprd.txt
>> @@ -0,0 +1,31 @@
>> +Spreadtrum SPI Controller
>> +
>> +Required properties:
>> +- compatible: Should be "sprd,sc9860-spi".
>> +- reg: Offset and length of SPI controller register space.
>> +- interrupts: Should contain SPI interrupt.
>> +- clock-names: Should contain following entries:
>> + "spi" for SPI clock,
>> + "source" for SPI source (parent) clock,
>
> Do the h/w block actually get this clock or the driver needs it? In the
> latter case, it should not be in DT.

Yes, we will get the actual SPI source clock form the "source" clock property.

>
>> + "enable" for SPI module enable clock.
>> +- clocks: List of clock input name strings sorted in the same order
>> + as the clock-names property.
>> +- #address-cells: The number of cells required to define a chip select
>> + address on the SPI bus. Should be set to 1.
>> +- #size-cells: Should be set to 0.
>> +
>> +Optional properties:
>> +- sprd,spi-interval: Specify the intervals of two SPI frames, which can be
>> + converted to the delay clock cycles = interval number * 4 + 10.
>
> There are read and write delay properties you can use.

Right. We've agreed introducing another field for spi_transfer to
represent this and will remove 'sprd,spi-interval' from DT.

-- 
Baolin Wang
Best Regards


Re: linux-next: build failure after merge of the block tree

2018-08-14 Thread Stephen Rothwell
Hi all,

On Thu, 26 Jul 2018 14:56:24 +1000 Stephen Rothwell  
wrote:
>
> After merging the block tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> drivers/nvme/target/rdma.c: In function 'nvmet_rdma_find_get_device':
> drivers/nvme/target/rdma.c:894:26: error: 'struct ib_device_attr' has no 
> member named 'max_sge'; did you mean 'max_cqe'?
>  cm_id->device->attrs.max_sge) - 1;
>   ^
> drivers/nvme/target/rdma.c:893:21: note: in expansion of macro 'max'
>   inline_sge_count = max(cm_id->device->attrs.max_sge_rd,
>  ^~~
> In file included from include/linux/list.h:9:0,
>  from include/linux/module.h:9,
>  from drivers/nvme/host/rdma.c:15:
> drivers/nvme/host/rdma.c: In function 'nvme_rdma_find_get_device':
> drivers/nvme/host/rdma.c:381:23: error: 'struct ib_device_attr' has no member 
> named 'max_sge'; did you mean 'max_cqe'?
>   ndev->dev->attrs.max_sge - 1);
>^
> drivers/nvme/host/rdma.c:380:30: note: in expansion of macro 'min'
>   ndev->num_inline_segments = min(NVME_RDMA_MAX_INLINE_SEGMENTS,
>   ^~~
> 
> Caused by commits
> 
>   64a741c1eaa8 ("nvme-rdma: support up to 4 segments of inline data")
>   0d5ee2b2ab4f ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data")
> 
> interacting with commit
> 
>   33023fb85a42 ("IB/core: add max_send_sge and max_recv_sge attributes")
> 
> from the rdma tree.
> 
> I have applied the following merge fix patch for today:
> 
> From: Stephen Rothwell 
> Date: Thu, 26 Jul 2018 14:32:15 +1000
> Subject: [PATCH] nvme-dma: merge fix up for replacement of max_sge
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  drivers/nvme/host/rdma.c   | 2 +-
>  drivers/nvme/target/rdma.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index cfa0319fcd1c..368fe5ac0c6b 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -378,7 +378,7 @@ nvme_rdma_find_get_device(struct rdma_cm_id *cm_id)
>   }
>  
>   ndev->num_inline_segments = min(NVME_RDMA_MAX_INLINE_SEGMENTS,
> - ndev->dev->attrs.max_sge - 1);
> + ndev->dev->attrs.max_send_sge - 1);
>   list_add(>entry, _list);
>  out_unlock:
>   mutex_unlock(_list_mutex);
> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> index 86121a7a19b2..8c30ac7d8078 100644
> --- a/drivers/nvme/target/rdma.c
> +++ b/drivers/nvme/target/rdma.c
> @@ -891,7 +891,7 @@ nvmet_rdma_find_get_device(struct rdma_cm_id *cm_id)
>  
>   inline_page_count = num_pages(port->inline_data_size);
>   inline_sge_count = max(cm_id->device->attrs.max_sge_rd,
> - cm_id->device->attrs.max_sge) - 1;
> + cm_id->device->attrs.max_send_sge) - 1;
>   if (inline_page_count > inline_sge_count) {
>   pr_warn("inline_data_size %d cannot be supported by device %s. 
> Reducing to %lu.\n",
>   port->inline_data_size, cm_id->device->name,
> -- 
> 2.18.0

This is now needed (with the max_send_sge -> max_recv_sge update in
drivers/nvme/target/rdma.c patch) when I merge the rdma tree with
Linus' tree.

-- 
Cheers,
Stephen Rothwell


pgp8nEyBD3iLU.pgp
Description: OpenPGP digital signature


Re: linux-next: build failure after merge of the block tree

2018-08-14 Thread Stephen Rothwell
Hi all,

On Thu, 26 Jul 2018 14:56:24 +1000 Stephen Rothwell  
wrote:
>
> After merging the block tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> drivers/nvme/target/rdma.c: In function 'nvmet_rdma_find_get_device':
> drivers/nvme/target/rdma.c:894:26: error: 'struct ib_device_attr' has no 
> member named 'max_sge'; did you mean 'max_cqe'?
>  cm_id->device->attrs.max_sge) - 1;
>   ^
> drivers/nvme/target/rdma.c:893:21: note: in expansion of macro 'max'
>   inline_sge_count = max(cm_id->device->attrs.max_sge_rd,
>  ^~~
> In file included from include/linux/list.h:9:0,
>  from include/linux/module.h:9,
>  from drivers/nvme/host/rdma.c:15:
> drivers/nvme/host/rdma.c: In function 'nvme_rdma_find_get_device':
> drivers/nvme/host/rdma.c:381:23: error: 'struct ib_device_attr' has no member 
> named 'max_sge'; did you mean 'max_cqe'?
>   ndev->dev->attrs.max_sge - 1);
>^
> drivers/nvme/host/rdma.c:380:30: note: in expansion of macro 'min'
>   ndev->num_inline_segments = min(NVME_RDMA_MAX_INLINE_SEGMENTS,
>   ^~~
> 
> Caused by commits
> 
>   64a741c1eaa8 ("nvme-rdma: support up to 4 segments of inline data")
>   0d5ee2b2ab4f ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data")
> 
> interacting with commit
> 
>   33023fb85a42 ("IB/core: add max_send_sge and max_recv_sge attributes")
> 
> from the rdma tree.
> 
> I have applied the following merge fix patch for today:
> 
> From: Stephen Rothwell 
> Date: Thu, 26 Jul 2018 14:32:15 +1000
> Subject: [PATCH] nvme-dma: merge fix up for replacement of max_sge
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  drivers/nvme/host/rdma.c   | 2 +-
>  drivers/nvme/target/rdma.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index cfa0319fcd1c..368fe5ac0c6b 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -378,7 +378,7 @@ nvme_rdma_find_get_device(struct rdma_cm_id *cm_id)
>   }
>  
>   ndev->num_inline_segments = min(NVME_RDMA_MAX_INLINE_SEGMENTS,
> - ndev->dev->attrs.max_sge - 1);
> + ndev->dev->attrs.max_send_sge - 1);
>   list_add(>entry, _list);
>  out_unlock:
>   mutex_unlock(_list_mutex);
> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> index 86121a7a19b2..8c30ac7d8078 100644
> --- a/drivers/nvme/target/rdma.c
> +++ b/drivers/nvme/target/rdma.c
> @@ -891,7 +891,7 @@ nvmet_rdma_find_get_device(struct rdma_cm_id *cm_id)
>  
>   inline_page_count = num_pages(port->inline_data_size);
>   inline_sge_count = max(cm_id->device->attrs.max_sge_rd,
> - cm_id->device->attrs.max_sge) - 1;
> + cm_id->device->attrs.max_send_sge) - 1;
>   if (inline_page_count > inline_sge_count) {
>   pr_warn("inline_data_size %d cannot be supported by device %s. 
> Reducing to %lu.\n",
>   port->inline_data_size, cm_id->device->name,
> -- 
> 2.18.0

This is now needed (with the max_send_sge -> max_recv_sge update in
drivers/nvme/target/rdma.c patch) when I merge the rdma tree with
Linus' tree.

-- 
Cheers,
Stephen Rothwell


pgp8nEyBD3iLU.pgp
Description: OpenPGP digital signature


Re: [PATCH] Bugfix for handling of shadow doorbell buffer.

2018-08-14 Thread Michal Wnukowski



On 08/14/2018 04:16 PM, Linus Torvalds wrote:
> On Tue, Aug 14, 2018 at 03:17:35PM -0700, Michal Wnukowski wrote:
>>
>> With memory barrier in place, the volatile keyword around *dbbuf_ei is
>> redundant.
> 
> No. The memory barrier enforces _ordering_, but it doesn't enforce
> that the accesses are only done once. So when you do
> 
>>  *dbbuf_db = value;
> 
> to write to dbbuf_db, and
> 
>>*dbbuf_ei
> 
> to read from dbbuf_ei, without the volatile the write (or the read)
> could be done multiple times, which can cause serious confusion.
> 

I got confused after comaring disassembly of this code with and 
without volatile keyword. Thanks for the correction.

> 
> However, there's a more serious problem with your patch:
> 
>> + /*
>> +  * Ensure that the doorbell is updated before reading
>> +  * the EventIdx from memory
>> +  */
>> + mb();
> 
> Good comment. Except what about the other side?
> 
> When you use memory ordering rules, as opposed to locking, there's
> always *two* sides to any access order. There's this "write dbbuf_db"
> vs "read dbbuf_ei" ordering.
> 
> But there's the other side: what about the side that writes dbbuf_ei,
> and reads dbbuf_db?
> 
> I'm assuming that's the actual controller hardware, but it needs a
> comment about *that* access being ordered too, because if it isn't,
> then ordering this side is pointless.
> 

The other side in this case is not actual controller hardware, but 
virtual one (the regular hardware should rely on normal MMIO 
doorbells). I spent some time going through the code of internal 
hypervisor and double-checking all guarantees around memory access 
before asking the same question: "what about the other side?". This 
execution ordering is mentioned in NVMe spec under "Controller 
Architecture", and it turned out that the NVMe driver itself had 
missing memory barrier. 


Thanks,
Michal


Re: [PATCH] Bugfix for handling of shadow doorbell buffer.

2018-08-14 Thread Michal Wnukowski



On 08/14/2018 04:16 PM, Linus Torvalds wrote:
> On Tue, Aug 14, 2018 at 03:17:35PM -0700, Michal Wnukowski wrote:
>>
>> With memory barrier in place, the volatile keyword around *dbbuf_ei is
>> redundant.
> 
> No. The memory barrier enforces _ordering_, but it doesn't enforce
> that the accesses are only done once. So when you do
> 
>>  *dbbuf_db = value;
> 
> to write to dbbuf_db, and
> 
>>*dbbuf_ei
> 
> to read from dbbuf_ei, without the volatile the write (or the read)
> could be done multiple times, which can cause serious confusion.
> 

I got confused after comaring disassembly of this code with and 
without volatile keyword. Thanks for the correction.

> 
> However, there's a more serious problem with your patch:
> 
>> + /*
>> +  * Ensure that the doorbell is updated before reading
>> +  * the EventIdx from memory
>> +  */
>> + mb();
> 
> Good comment. Except what about the other side?
> 
> When you use memory ordering rules, as opposed to locking, there's
> always *two* sides to any access order. There's this "write dbbuf_db"
> vs "read dbbuf_ei" ordering.
> 
> But there's the other side: what about the side that writes dbbuf_ei,
> and reads dbbuf_db?
> 
> I'm assuming that's the actual controller hardware, but it needs a
> comment about *that* access being ordered too, because if it isn't,
> then ordering this side is pointless.
> 

The other side in this case is not actual controller hardware, but 
virtual one (the regular hardware should rely on normal MMIO 
doorbells). I spent some time going through the code of internal 
hypervisor and double-checking all guarantees around memory access 
before asking the same question: "what about the other side?". This 
execution ordering is mentioned in NVMe spec under "Controller 
Architecture", and it turned out that the NVMe driver itself had 
missing memory barrier. 


Thanks,
Michal


Re: [PATCH 1/4] regulator: core: If consumers don't call regulator_set_load() assume max

2018-08-14 Thread David Collins
On 08/14/2018 04:56 PM, Doug Anderson wrote:
> On Tue, Aug 14, 2018 at 2:59 PM, David Collins  
> wrote:
>> On 08/14/2018 01:03 PM, Doug Anderson wrote:
>>> On Tue, Aug 14, 2018 at 11:30 AM, David Collins  
>>> wrote:>>> --- a/drivers/regulator/core.c
> +++ b/drivers/regulator/core.c
> @@ -732,6 +732,7 @@ static int drms_uA_update(struct regulator_dev *rdev)
>   struct regulator *sibling;
>   int current_uA = 0, output_uV, input_uV, err;
>   unsigned int mode;
> + bool any_unset = false;
>
>   lockdep_assert_held_once(>mutex);
>
> @@ -751,11 +752,17 @@ static int drms_uA_update(struct regulator_dev 
> *rdev)
>   return -EINVAL;
>
>   /* calc total requested load */
> - list_for_each_entry(sibling, >consumer_list, list)
> + list_for_each_entry(sibling, >consumer_list, list) {
>   current_uA += sibling->uA_load;
> + if (!sibling->uA_load_set)
> + any_unset = true;
> + }
>
>   current_uA += rdev->constraints->system_load;
>
> + if (any_unset)
> + current_uA = INT_MAX;
> +

 This check will incorrectly result in a constant load request of INT_MAX
 for all regulators that have at least one child regulator.  This is the
 case because such child regulators are present in rdev->consumer_list and
 because regulator_set_load() requests are not propagated up to parent
 regulators.  Thus, the regulator structs for child regulators will always
 have uA_load==0 and uA_load_set==false.
>>>
>>> Ah, interesting.
>>>
>>> ...but doesn't this same bug exist anyway, just in the opposite
>>> direction?  Without my patch we'll try to request a 0 mA load in this
>>> case which seems just as wrong (or perhaps worse?).  I guess on RPMh
>>> regulator you're "saved" because the RPMh hardware itself knows the
>>> parent/child relationship and knows to ignore this 0 mA load, but it's
>>> still a bug in the overall Linux framework...
>>
>> I'm not sure what existing bug you are referring to here.  Where is a 0 mA
>> load being requested?  Could you list the consumer function calls and the
>> behavior on the framework side that you're envisioning?
> 
> Imagine you've got a tree like this:
> 
> - regulatorParent (1.8 V)
>   - regulatorChildA (1.7 V)
>   - regulatorChildB (1.2 V)
>   - regulatorChildC (1.5 V)
> 
> ...and this set of calls:
> 
> regulator_set_load(regulatorChildA, 1000);
> regulator_set_load(regulatorChildB, 2000);
> regulator_set_load(regulatorChildC, 3000);
> 
> regulator_enable(regulatorChildA);
> regulator_enable(regulatorChildB);
> regulator_enable(regulatorChildC);
> 
> 
> With the existing code in Mark's tree then ChildA / ChildB / ChildC
> will presumably have a load high enough to be in high power mode.
> However, as you said, loads aren't propagated.  ...so "Parent" will
> see no load requested (which translates to a load request of 0 mA).
> 
> The "bug" is that when you did the
> "regulator_enable(regulatorChildA);" then that will propagate up and
> cause a "regulator_enable(regulatorParent)".  In _regulator_enable()
> you can see drms_uA_update().  That will cause it to set the mode of
> regulatorParent based on a load of 0 mA.  That is the bug.  You may
> respond "but REGULATOR_CHANGE_DRMS isn't set for the parent!  See
> below and for now imagine that REGULATOR_CHANGE_DRMS _is_ set for the
> parent.
> 
> With my code in the same situation the parent will end up with a load
> of "MAX_INT" mA which I think is the bug you pointed out.  ...while it
> would be good to fix this it does seem to be better to set the parent
> to a mode based on MAX_INT mA rather than 0 mA?

I agree that the existing behavior resulting from a lack of load current
propagation from child to parent regulators is wrong.  However, I do not
agree that replacing it with something else that results in different
wrong behavior is the way forward.

Adding load current propagation would resolve your 0 mA concern but does
not necessary require making MAX_INT the default for all consumers.


>>> I have no idea how we ought to propagate regulator_set_load() to
>>> parents though.  That seems like a tough thing to try to handle
>>> automagically.
>>
>> I attempted it seven years ago with two revisions of a patch series [1]
>> and [2].  The feature wasn't completed though.  Perhaps something along
>> the same lines could be reintroduced now that child regulators are handled
>> the same way as ordinary consumers within the regulator framework.
>>
>> [1]: https://lkml.org/lkml/2011/3/28/246
>> [2]: https://lkml.org/lkml/2011/3/28/530
> 
> Thanks for the links.  My first instinct is just what Mark said there:
> you can't really take the child load and map it accurately to a parent
> load without loads of per-device complexity and even then you'd
> probably get nothing more than an approximation.
> 
> IMO about the 

Re: [PATCH 1/4] regulator: core: If consumers don't call regulator_set_load() assume max

2018-08-14 Thread David Collins
On 08/14/2018 04:56 PM, Doug Anderson wrote:
> On Tue, Aug 14, 2018 at 2:59 PM, David Collins  
> wrote:
>> On 08/14/2018 01:03 PM, Doug Anderson wrote:
>>> On Tue, Aug 14, 2018 at 11:30 AM, David Collins  
>>> wrote:>>> --- a/drivers/regulator/core.c
> +++ b/drivers/regulator/core.c
> @@ -732,6 +732,7 @@ static int drms_uA_update(struct regulator_dev *rdev)
>   struct regulator *sibling;
>   int current_uA = 0, output_uV, input_uV, err;
>   unsigned int mode;
> + bool any_unset = false;
>
>   lockdep_assert_held_once(>mutex);
>
> @@ -751,11 +752,17 @@ static int drms_uA_update(struct regulator_dev 
> *rdev)
>   return -EINVAL;
>
>   /* calc total requested load */
> - list_for_each_entry(sibling, >consumer_list, list)
> + list_for_each_entry(sibling, >consumer_list, list) {
>   current_uA += sibling->uA_load;
> + if (!sibling->uA_load_set)
> + any_unset = true;
> + }
>
>   current_uA += rdev->constraints->system_load;
>
> + if (any_unset)
> + current_uA = INT_MAX;
> +

 This check will incorrectly result in a constant load request of INT_MAX
 for all regulators that have at least one child regulator.  This is the
 case because such child regulators are present in rdev->consumer_list and
 because regulator_set_load() requests are not propagated up to parent
 regulators.  Thus, the regulator structs for child regulators will always
 have uA_load==0 and uA_load_set==false.
>>>
>>> Ah, interesting.
>>>
>>> ...but doesn't this same bug exist anyway, just in the opposite
>>> direction?  Without my patch we'll try to request a 0 mA load in this
>>> case which seems just as wrong (or perhaps worse?).  I guess on RPMh
>>> regulator you're "saved" because the RPMh hardware itself knows the
>>> parent/child relationship and knows to ignore this 0 mA load, but it's
>>> still a bug in the overall Linux framework...
>>
>> I'm not sure what existing bug you are referring to here.  Where is a 0 mA
>> load being requested?  Could you list the consumer function calls and the
>> behavior on the framework side that you're envisioning?
> 
> Imagine you've got a tree like this:
> 
> - regulatorParent (1.8 V)
>   - regulatorChildA (1.7 V)
>   - regulatorChildB (1.2 V)
>   - regulatorChildC (1.5 V)
> 
> ...and this set of calls:
> 
> regulator_set_load(regulatorChildA, 1000);
> regulator_set_load(regulatorChildB, 2000);
> regulator_set_load(regulatorChildC, 3000);
> 
> regulator_enable(regulatorChildA);
> regulator_enable(regulatorChildB);
> regulator_enable(regulatorChildC);
> 
> 
> With the existing code in Mark's tree then ChildA / ChildB / ChildC
> will presumably have a load high enough to be in high power mode.
> However, as you said, loads aren't propagated.  ...so "Parent" will
> see no load requested (which translates to a load request of 0 mA).
> 
> The "bug" is that when you did the
> "regulator_enable(regulatorChildA);" then that will propagate up and
> cause a "regulator_enable(regulatorParent)".  In _regulator_enable()
> you can see drms_uA_update().  That will cause it to set the mode of
> regulatorParent based on a load of 0 mA.  That is the bug.  You may
> respond "but REGULATOR_CHANGE_DRMS isn't set for the parent!  See
> below and for now imagine that REGULATOR_CHANGE_DRMS _is_ set for the
> parent.
> 
> With my code in the same situation the parent will end up with a load
> of "MAX_INT" mA which I think is the bug you pointed out.  ...while it
> would be good to fix this it does seem to be better to set the parent
> to a mode based on MAX_INT mA rather than 0 mA?

I agree that the existing behavior resulting from a lack of load current
propagation from child to parent regulators is wrong.  However, I do not
agree that replacing it with something else that results in different
wrong behavior is the way forward.

Adding load current propagation would resolve your 0 mA concern but does
not necessary require making MAX_INT the default for all consumers.


>>> I have no idea how we ought to propagate regulator_set_load() to
>>> parents though.  That seems like a tough thing to try to handle
>>> automagically.
>>
>> I attempted it seven years ago with two revisions of a patch series [1]
>> and [2].  The feature wasn't completed though.  Perhaps something along
>> the same lines could be reintroduced now that child regulators are handled
>> the same way as ordinary consumers within the regulator framework.
>>
>> [1]: https://lkml.org/lkml/2011/3/28/246
>> [2]: https://lkml.org/lkml/2011/3/28/530
> 
> Thanks for the links.  My first instinct is just what Mark said there:
> you can't really take the child load and map it accurately to a parent
> load without loads of per-device complexity and even then you'd
> probably get nothing more than an approximation.
> 
> IMO about the 

[PATCH v5 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook

2018-08-14 Thread Zhen Lei
.flush_iotlb_all can not just wait for previous tlbi operations to be
completed, but should also invalid all TLBs of the related domain.

Signed-off-by: Zhen Lei 
Reviewed-by: Robin Murphy 
---
 drivers/iommu/arm-smmu-v3.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 1d64710..4402187 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1770,6 +1770,14 @@ static int arm_smmu_map(struct iommu_domain *domain, 
unsigned long iova,
return ops->unmap(ops, iova, size);
 }

+static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
+{
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+   if (smmu_domain->smmu)
+   arm_smmu_tlb_inv_context(smmu_domain);
+}
+
 static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
 {
struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
@@ -1998,7 +2006,7 @@ static void arm_smmu_put_resv_regions(struct device *dev,
.map= arm_smmu_map,
.unmap  = arm_smmu_unmap,
.map_sg = default_iommu_map_sg,
-   .flush_iotlb_all= arm_smmu_iotlb_sync,
+   .flush_iotlb_all= arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
.iova_to_phys   = arm_smmu_iova_to_phys,
.add_device = arm_smmu_add_device,
--
1.8.3




[PATCH v5 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook

2018-08-14 Thread Zhen Lei
.flush_iotlb_all can not just wait for previous tlbi operations to be
completed, but should also invalid all TLBs of the related domain.

Signed-off-by: Zhen Lei 
Reviewed-by: Robin Murphy 
---
 drivers/iommu/arm-smmu-v3.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 1d64710..4402187 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1770,6 +1770,14 @@ static int arm_smmu_map(struct iommu_domain *domain, 
unsigned long iova,
return ops->unmap(ops, iova, size);
 }

+static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
+{
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+   if (smmu_domain->smmu)
+   arm_smmu_tlb_inv_context(smmu_domain);
+}
+
 static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
 {
struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
@@ -1998,7 +2006,7 @@ static void arm_smmu_put_resv_regions(struct device *dev,
.map= arm_smmu_map,
.unmap  = arm_smmu_unmap,
.map_sg = default_iommu_map_sg,
-   .flush_iotlb_all= arm_smmu_iotlb_sync,
+   .flush_iotlb_all= arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
.iova_to_phys   = arm_smmu_iova_to_phys,
.add_device = arm_smmu_add_device,
--
1.8.3




Re: How delete node or property in overlayd dts?

2018-08-14 Thread Frank Rowand
On 08/14/18 07:46, 张波 wrote:
> /delete-node/ /delete-prop/ could be used in dtsi files without device
> tree overlay.
> 
> but with device tree overlay,  /delete-node/ and  /delete-prop/ are not work.
> How to delete property and node in overlay dts?
> 
> for example,
> in basel.dts have following node
> node1 {
> property1;
> property3;
> node2 {
> property2;
> }
> }
> 
> in overlay.dts as following
>  node1 {
> /delete-property/ property1;
> /delete-node/ node2;
> }
> 
> after overlay, property1 and node2 is not deleted.
> 

The /delete-node/ and /delete-prop/ directives are only used by the
dtc compiler within a single compilation.

There is nothing in the format of a devicetree blob to represent the
notion of deleting a property or a node.

You can not delete a property or a node in an overlay dtb.


Re: How delete node or property in overlayd dts?

2018-08-14 Thread Frank Rowand
On 08/14/18 07:46, 张波 wrote:
> /delete-node/ /delete-prop/ could be used in dtsi files without device
> tree overlay.
> 
> but with device tree overlay,  /delete-node/ and  /delete-prop/ are not work.
> How to delete property and node in overlay dts?
> 
> for example,
> in basel.dts have following node
> node1 {
> property1;
> property3;
> node2 {
> property2;
> }
> }
> 
> in overlay.dts as following
>  node1 {
> /delete-property/ property1;
> /delete-node/ node2;
> }
> 
> after overlay, property1 and node2 is not deleted.
> 

The /delete-node/ and /delete-prop/ directives are only used by the
dtc compiler within a single compilation.

There is nothing in the format of a devicetree blob to represent the
notion of deleting a property or a node.

You can not delete a property or a node in an overlay dtb.


Re: [RFC PATCH 1/2] mm: rework memcg kernel stack accounting

2018-08-14 Thread Shakeel Butt
On Tue, Aug 14, 2018 at 5:37 PM Roman Gushchin  wrote:
>
> If CONFIG_VMAP_STACK is set, kernel stacks are allocated
> using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel
> stack pages are charged against corresponding memory cgroups
> on allocation and uncharged on releasing them.
>
> The problem is that we do cache kernel stacks in small
> per-cpu caches and do reuse them for new tasks, which can
> belong to different memory cgroups.
>
> Each stack page still holds a reference to the original cgroup,
> so the cgroup can't be released until the vmap area is released.
>
> To make this happen we need more than two subsequent exits
> without forks in between on the current cpu, which makes it
> very unlikely to happen. As a result, I saw a significant number
> of dying cgroups (in theory, up to 2 * number_of_cpu +
> number_of_tasks), which can't be released even by significant
> memory pressure.
>
> As a cgroup structure can take a significant amount of memory
> (first of all, per-cpu data like memcg statistics), it leads
> to a noticeable waste of memory.
>
> Signed-off-by: Roman Gushchin 

I was also looking into this issue. I was thinking of having a
per-memcg per-cpu stack cache. However this solution seems much
simpler. Can you also add the performance number for a similar simple
benchmark done in ac496bf48d97 ("fork: Optimize task creation by
caching two thread stacks per CPU if CONFIG_VMAP_STACK=y").

Reviewed-by: Shakeel Butt 

> Cc: Johannes Weiner 
> Cc: Michal Hocko 
> Cc: Andy Lutomirski 
> Cc: Konstantin Khlebnikov 
> Cc: Tejun Heo 
> ---
>  kernel/fork.c | 44 ++--
>  1 file changed, 38 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 69b6fea5a181..91872b2b37bd 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -224,9 +224,14 @@ static unsigned long *alloc_thread_stack_node(struct 
> task_struct *tsk, int node)
> return s->addr;
> }
>
> +   /*
> +* Allocated stacks are cached and later reused by new threads,
> +* so memcg accounting is performed manually on assigning/releasing
> +* stacks to tasks. Drop __GFP_ACCOUNT.
> +*/
> stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
>  VMALLOC_START, VMALLOC_END,
> -THREADINFO_GFP,
> +THREADINFO_GFP & ~__GFP_ACCOUNT,
>  PAGE_KERNEL,
>  0, node, __builtin_return_address(0));
>
> @@ -246,12 +251,41 @@ static unsigned long *alloc_thread_stack_node(struct 
> task_struct *tsk, int node)
>  #endif
>  }
>
> +static void memcg_charge_kernel_stack(struct task_struct *tsk)
> +{
> +#ifdef CONFIG_VMAP_STACK
> +   struct vm_struct *vm = task_stack_vm_area(tsk);
> +
> +   if (vm) {
> +   int i;
> +
> +   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
> +   memcg_kmem_charge(vm->pages[i], __GFP_NOFAIL,
> + compound_order(vm->pages[i]));
> +
> +   /* All stack pages belong to the same memcg. */
> +   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
> +THREAD_SIZE / 1024);
> +   }
> +#endif
> +}
> +
>  static inline void free_thread_stack(struct task_struct *tsk)
>  {
>  #ifdef CONFIG_VMAP_STACK
> -   if (task_stack_vm_area(tsk)) {
> +   struct vm_struct *vm = task_stack_vm_area(tsk);
> +
> +   if (vm) {
> int i;
>
> +   /* All stack pages belong to the same memcg. */
> +   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
> +-(int)(THREAD_SIZE / 1024));
> +
> +   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
> +   memcg_kmem_uncharge(vm->pages[i],
> + compound_order(vm->pages[i]));
> +
> for (i = 0; i < NR_CACHED_STACKS; i++) {
> if (this_cpu_cmpxchg(cached_stacks[i],
> NULL, tsk->stack_vm_area) != NULL)
> @@ -352,10 +386,6 @@ static void account_kernel_stack(struct task_struct 
> *tsk, int account)
> NR_KERNEL_STACK_KB,
> PAGE_SIZE / 1024 * account);
> }
> -
> -   /* All stack pages belong to the same memcg. */
> -   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
> -account * (THREAD_SIZE / 1024));
> } else {
> /*
>  * All stack pages are in the same zone and belong to the
> @@ -809,6 +839,8 @@ static struct task_struct *dup_task_struct(struct 
> task_struct *orig, int node)
> if (!stack)
> 

Re: [RFC PATCH 1/2] mm: rework memcg kernel stack accounting

2018-08-14 Thread Shakeel Butt
On Tue, Aug 14, 2018 at 5:37 PM Roman Gushchin  wrote:
>
> If CONFIG_VMAP_STACK is set, kernel stacks are allocated
> using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel
> stack pages are charged against corresponding memory cgroups
> on allocation and uncharged on releasing them.
>
> The problem is that we do cache kernel stacks in small
> per-cpu caches and do reuse them for new tasks, which can
> belong to different memory cgroups.
>
> Each stack page still holds a reference to the original cgroup,
> so the cgroup can't be released until the vmap area is released.
>
> To make this happen we need more than two subsequent exits
> without forks in between on the current cpu, which makes it
> very unlikely to happen. As a result, I saw a significant number
> of dying cgroups (in theory, up to 2 * number_of_cpu +
> number_of_tasks), which can't be released even by significant
> memory pressure.
>
> As a cgroup structure can take a significant amount of memory
> (first of all, per-cpu data like memcg statistics), it leads
> to a noticeable waste of memory.
>
> Signed-off-by: Roman Gushchin 

I was also looking into this issue. I was thinking of having a
per-memcg per-cpu stack cache. However this solution seems much
simpler. Can you also add the performance number for a similar simple
benchmark done in ac496bf48d97 ("fork: Optimize task creation by
caching two thread stacks per CPU if CONFIG_VMAP_STACK=y").

Reviewed-by: Shakeel Butt 

> Cc: Johannes Weiner 
> Cc: Michal Hocko 
> Cc: Andy Lutomirski 
> Cc: Konstantin Khlebnikov 
> Cc: Tejun Heo 
> ---
>  kernel/fork.c | 44 ++--
>  1 file changed, 38 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 69b6fea5a181..91872b2b37bd 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -224,9 +224,14 @@ static unsigned long *alloc_thread_stack_node(struct 
> task_struct *tsk, int node)
> return s->addr;
> }
>
> +   /*
> +* Allocated stacks are cached and later reused by new threads,
> +* so memcg accounting is performed manually on assigning/releasing
> +* stacks to tasks. Drop __GFP_ACCOUNT.
> +*/
> stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
>  VMALLOC_START, VMALLOC_END,
> -THREADINFO_GFP,
> +THREADINFO_GFP & ~__GFP_ACCOUNT,
>  PAGE_KERNEL,
>  0, node, __builtin_return_address(0));
>
> @@ -246,12 +251,41 @@ static unsigned long *alloc_thread_stack_node(struct 
> task_struct *tsk, int node)
>  #endif
>  }
>
> +static void memcg_charge_kernel_stack(struct task_struct *tsk)
> +{
> +#ifdef CONFIG_VMAP_STACK
> +   struct vm_struct *vm = task_stack_vm_area(tsk);
> +
> +   if (vm) {
> +   int i;
> +
> +   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
> +   memcg_kmem_charge(vm->pages[i], __GFP_NOFAIL,
> + compound_order(vm->pages[i]));
> +
> +   /* All stack pages belong to the same memcg. */
> +   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
> +THREAD_SIZE / 1024);
> +   }
> +#endif
> +}
> +
>  static inline void free_thread_stack(struct task_struct *tsk)
>  {
>  #ifdef CONFIG_VMAP_STACK
> -   if (task_stack_vm_area(tsk)) {
> +   struct vm_struct *vm = task_stack_vm_area(tsk);
> +
> +   if (vm) {
> int i;
>
> +   /* All stack pages belong to the same memcg. */
> +   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
> +-(int)(THREAD_SIZE / 1024));
> +
> +   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
> +   memcg_kmem_uncharge(vm->pages[i],
> + compound_order(vm->pages[i]));
> +
> for (i = 0; i < NR_CACHED_STACKS; i++) {
> if (this_cpu_cmpxchg(cached_stacks[i],
> NULL, tsk->stack_vm_area) != NULL)
> @@ -352,10 +386,6 @@ static void account_kernel_stack(struct task_struct 
> *tsk, int account)
> NR_KERNEL_STACK_KB,
> PAGE_SIZE / 1024 * account);
> }
> -
> -   /* All stack pages belong to the same memcg. */
> -   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
> -account * (THREAD_SIZE / 1024));
> } else {
> /*
>  * All stack pages are in the same zone and belong to the
> @@ -809,6 +839,8 @@ static struct task_struct *dup_task_struct(struct 
> task_struct *orig, int node)
> if (!stack)
> 

Re: [BUG] kernel: rcu: a possible sleep-in-atomic-context bug in srcu_read_delay()

2018-08-14 Thread Jia-Ju Bai




On 2018/8/13 20:42, Paul E. McKenney wrote:

On Mon, Aug 13, 2018 at 05:26:49PM +0800, Jia-Ju Bai wrote:


On 2018/8/13 12:18, Paul E. McKenney wrote:

On Mon, Aug 13, 2018 at 11:04:10AM +0800, Jia-Ju Bai wrote:

The kernel may sleep with holding a spinlock.

The function call paths (from bottom to top) in Linux-4.16 are:

[FUNC] schedule_timeout_interruptible
kernel/rcu/rcutorture.c, 523: schedule_timeout_interruptible in
srcu_read_delay
kernel/rcu/rcutorture.c, 1105: [FUNC_PTR]srcu_read_delay in
rcu_torture_timer
kernel/rcu/rcutorture.c, 1104: spin_lock in rcu_torture_timer

Note that [FUNC_PTR] means a function pointer call is used.

I do not find a good way to fix, so I only report.
This is found by my static analysis tool (DSAC).

Interesting.  I would have expected to have gotten a "scheduling while
atomic" error message, which I do not recall seeing.  And I ran a great
deal of rcutorture on v4.16.

So let's see...  As you say, the rcu_torture_timer() function does in
fact acquire rand_lock in 4.16 and 4.17, in which case sleeping would
indeed be illegal.  But let's take a look at srcu_read_delay():

static void
srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)
{
long delay;
const long uspertick = 100 / HZ;
const long longdelay = 10;

/* We want there to be long-running readers, but not all the time. */

delay = torture_random(rrsp) %
(nrealreaders * 2 * longdelay * uspertick);
if (!delay && in_task()) {
schedule_timeout_interruptible(longdelay);
rtrsp->rt_delay_jiffies = longdelay;
} else {
rcu_read_delay(rrsp, rtrsp);
}
}

The call to schedule_timeout_interruptible() cannot happen unless the
in_task() macro returns true, which it won't if the SOFTIRQ_OFFSET bit
is set:

#define in_task()   (!(preempt_count() & \
   (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))

And the SOFTIRQ_OFFSET bit will be set if srcu_read_delay()
is invoked from a timer handler, which is the case for the
call from rcu_torture_timer().  So if that lock is held,
schedule_timeout_interruptible() won't ever be invoked.

Thanks for your reply :)
My tool does not track this bit...
Sorry for this false report.

Not a problem, a few false positives are to be expected.  And it looks
like you have some work to do on your tool -- which is good, because I
would not want you to be bored.  ;-)

Thanx, Paul



Thanks for your advice.
I will improve my tool to produce less false positives :)


Best wishes,
Jia-Ju Bai


Re: [BUG] kernel: rcu: a possible sleep-in-atomic-context bug in srcu_read_delay()

2018-08-14 Thread Jia-Ju Bai




On 2018/8/13 20:42, Paul E. McKenney wrote:

On Mon, Aug 13, 2018 at 05:26:49PM +0800, Jia-Ju Bai wrote:


On 2018/8/13 12:18, Paul E. McKenney wrote:

On Mon, Aug 13, 2018 at 11:04:10AM +0800, Jia-Ju Bai wrote:

The kernel may sleep with holding a spinlock.

The function call paths (from bottom to top) in Linux-4.16 are:

[FUNC] schedule_timeout_interruptible
kernel/rcu/rcutorture.c, 523: schedule_timeout_interruptible in
srcu_read_delay
kernel/rcu/rcutorture.c, 1105: [FUNC_PTR]srcu_read_delay in
rcu_torture_timer
kernel/rcu/rcutorture.c, 1104: spin_lock in rcu_torture_timer

Note that [FUNC_PTR] means a function pointer call is used.

I do not find a good way to fix, so I only report.
This is found by my static analysis tool (DSAC).

Interesting.  I would have expected to have gotten a "scheduling while
atomic" error message, which I do not recall seeing.  And I ran a great
deal of rcutorture on v4.16.

So let's see...  As you say, the rcu_torture_timer() function does in
fact acquire rand_lock in 4.16 and 4.17, in which case sleeping would
indeed be illegal.  But let's take a look at srcu_read_delay():

static void
srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)
{
long delay;
const long uspertick = 100 / HZ;
const long longdelay = 10;

/* We want there to be long-running readers, but not all the time. */

delay = torture_random(rrsp) %
(nrealreaders * 2 * longdelay * uspertick);
if (!delay && in_task()) {
schedule_timeout_interruptible(longdelay);
rtrsp->rt_delay_jiffies = longdelay;
} else {
rcu_read_delay(rrsp, rtrsp);
}
}

The call to schedule_timeout_interruptible() cannot happen unless the
in_task() macro returns true, which it won't if the SOFTIRQ_OFFSET bit
is set:

#define in_task()   (!(preempt_count() & \
   (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))

And the SOFTIRQ_OFFSET bit will be set if srcu_read_delay()
is invoked from a timer handler, which is the case for the
call from rcu_torture_timer().  So if that lock is held,
schedule_timeout_interruptible() won't ever be invoked.

Thanks for your reply :)
My tool does not track this bit...
Sorry for this false report.

Not a problem, a few false positives are to be expected.  And it looks
like you have some work to do on your tool -- which is good, because I
would not want you to be bored.  ;-)

Thanx, Paul



Thanks for your advice.
I will improve my tool to produce less false positives :)


Best wishes,
Jia-Ju Bai


Re: Build failures with gcc 4.5 and older

2018-08-14 Thread Guenter Roeck

On 08/14/2018 04:20 PM, Linus Torvalds wrote:

On Tue, Aug 14, 2018 at 4:02 PM Andrew Morton  wrote:


The m68k build still fails because 0cc3cd21657 ("cpu/hotplug: Boot HT
siblings at least once") was evidently never tested on CONFIG_SMP=n.
How could that come about - the patch is six weeks old??


Ehh, meet the joys of embargoes.

The code was tested (and people even found subtle arm64 problems due
to that testing), but because it couldn't be made public until today,
it didn't go through all the usual infrastructure we depend on.

But:


kernel/cpu.c: In function 'boot_cpu_hotplug_init':
kernel/cpu.c:2275:2: error: 'struct cpuhp_cpu_state' has no member named 
'booted_once'


it should be fixed now in -git.


@@ -490,6 +490,8 @@ struct mm_struct {
  #endif
 } __randomize_layout;

+   int wibble;
+


Can we call this something informative? Like

 int __gcc_4_4_is_garbage_that_shouldnt_be_used;

or something?

That is, if we actually want to really drag out this whole pointless
pain of allowing ancient compilers?

Guys, at some point we need to switch to 4.6. The people who feel the
pain today *will* feel the pain at some point. Just get it over with
already.



For my part I am all for making gcc 4.6 mandatory.

Guenter



Re: Build failures with gcc 4.5 and older

2018-08-14 Thread Guenter Roeck

On 08/14/2018 04:20 PM, Linus Torvalds wrote:

On Tue, Aug 14, 2018 at 4:02 PM Andrew Morton  wrote:


The m68k build still fails because 0cc3cd21657 ("cpu/hotplug: Boot HT
siblings at least once") was evidently never tested on CONFIG_SMP=n.
How could that come about - the patch is six weeks old??


Ehh, meet the joys of embargoes.

The code was tested (and people even found subtle arm64 problems due
to that testing), but because it couldn't be made public until today,
it didn't go through all the usual infrastructure we depend on.

But:


kernel/cpu.c: In function 'boot_cpu_hotplug_init':
kernel/cpu.c:2275:2: error: 'struct cpuhp_cpu_state' has no member named 
'booted_once'


it should be fixed now in -git.


@@ -490,6 +490,8 @@ struct mm_struct {
  #endif
 } __randomize_layout;

+   int wibble;
+


Can we call this something informative? Like

 int __gcc_4_4_is_garbage_that_shouldnt_be_used;

or something?

That is, if we actually want to really drag out this whole pointless
pain of allowing ancient compilers?

Guys, at some point we need to switch to 4.6. The people who feel the
pain today *will* feel the pain at some point. Just get it over with
already.



For my part I am all for making gcc 4.6 mandatory.

Guenter



Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Guenter Roeck

On 08/14/2018 05:36 PM, Sebastian Gottschall wrote:

if SWAP is disabled in kernel config, the following compile error will raise up 
with this release

arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): 
undefined reference to `generic_max_swapfile_size'

of course this is simple to fix. the function max_swapfile_size must be 
excluded if CONFIG_SWAP is disabled



Fixed with upstream commit 792adb90fa72 ("x86/init: fix build with 
CONFIG_SWAP=n").
Also needed in other stable branches (at least in 4.4).

Guenter


Sebastian

Am 14.08.2018 um 19:16 schrieb Greg Kroah-Hartman:

This is the start of the stable review cycle for the 4.9.120 release.
There are 107 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Aug 16 17:14:53 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:

https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-
Pseudo-Shortlog of commits:

Greg Kroah-Hartman 
 Linux 4.9.120-rc1

Josh Poimboeuf 
 x86/microcode: Allow late microcode loading with SMT disabled

Ashok Raj 
 x86/microcode: Do not upload microcode if CPUs are offline

David Woodhouse 
 tools headers: Synchronise x86 cpufeatures.h for L1TF additions

Andi Kleen 
 x86/mm/kmmio: Make the tracer robust against L1TF

Andi Kleen 
 x86/mm/pat: Make set_memory_np() L1TF safe

Andi Kleen 
 x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert

Andi Kleen 
 x86/speculation/l1tf: Invert all not present mappings

Thomas Gleixner 
 cpu/hotplug: Fix SMT supported evaluation

Paolo Bonzini 
 KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry

Paolo Bonzini 
 x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry

Paolo Bonzini 
 x86/speculation: Simplify sysfs report of VMX L1TF vulnerability

Paolo Bonzini 
 KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR

Wanpeng Li 
 KVM: X86: Allow userspace to define the microcode version

Wanpeng Li 
 KVM: X86: Introduce kvm_get_msr_feature()

Tom Lendacky 
 KVM: SVM: Add MSR-based feature support for serializing LFENCE

Tom Lendacky 
 KVM: x86: Add a framework for supporting MSR-based features

Thomas Gleixner 
 Documentation/l1tf: Remove Yonah processors from not vulnerable list

Nicolai Stange 
 x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr()

Nicolai Stange 
 x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d

Nicolai Stange 
 x86: Don't include linux/irq.h from asm/hardirq.h

Nicolai Stange 
 x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d

Nicolai Stange 
 x86/irq: Demote irq_cpustat_t::__softirq_pending to u16

Nicolai Stange 
 x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush()

Nicolai Stange 
 x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond'

Nicolai Stange 
 x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush()

Josh Poimboeuf 
 cpu/hotplug: detect SMT disabled by BIOS

Tony Luck 
 Documentation/l1tf: Fix typos

Nicolai Stange 
 x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content

Thomas Gleixner 
 Documentation: Add section about CPU vulnerabilities

Jiri Kosina 
 x86/bugs, kvm: Introduce boot-time control of L1TF mitigations

Thomas Gleixner 
 cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early

Jiri Kosina 
 cpu/hotplug: Expose SMT control init function

Thomas Gleixner 
 x86/kvm: Allow runtime control of L1D flush

Thomas Gleixner 
 x86/kvm: Serialize L1D flush parameter setter

Thomas Gleixner 
 x86/kvm: Add static key for flush always

Thomas Gleixner 
 x86/kvm: Move l1tf setup function

Thomas Gleixner 
 x86/l1tf: Handle EPT disabled state proper

Thomas Gleixner 
 x86/kvm: Drop L1TF MSR list approach

Thomas Gleixner 
 x86/litf: Introduce vmx status variable

Thomas Gleixner 
 cpu/hotplug: Online siblings when SMT control is turned on

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Add find_msr() helper function

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest 
numbers

Jim Mattson 
 kvm: nVMX: Update MSR load counts on a VMCS switch

Paolo Bonzini 
 x86/KVM/VMX: Add L1D flush logic

Paolo Bonzini 
 x86/KVM/VMX: Add L1D MSR based 

Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Guenter Roeck

On 08/14/2018 05:36 PM, Sebastian Gottschall wrote:

if SWAP is disabled in kernel config, the following compile error will raise up 
with this release

arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): 
undefined reference to `generic_max_swapfile_size'

of course this is simple to fix. the function max_swapfile_size must be 
excluded if CONFIG_SWAP is disabled



Fixed with upstream commit 792adb90fa72 ("x86/init: fix build with 
CONFIG_SWAP=n").
Also needed in other stable branches (at least in 4.4).

Guenter


Sebastian

Am 14.08.2018 um 19:16 schrieb Greg Kroah-Hartman:

This is the start of the stable review cycle for the 4.9.120 release.
There are 107 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Aug 16 17:14:53 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:

https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-
Pseudo-Shortlog of commits:

Greg Kroah-Hartman 
 Linux 4.9.120-rc1

Josh Poimboeuf 
 x86/microcode: Allow late microcode loading with SMT disabled

Ashok Raj 
 x86/microcode: Do not upload microcode if CPUs are offline

David Woodhouse 
 tools headers: Synchronise x86 cpufeatures.h for L1TF additions

Andi Kleen 
 x86/mm/kmmio: Make the tracer robust against L1TF

Andi Kleen 
 x86/mm/pat: Make set_memory_np() L1TF safe

Andi Kleen 
 x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert

Andi Kleen 
 x86/speculation/l1tf: Invert all not present mappings

Thomas Gleixner 
 cpu/hotplug: Fix SMT supported evaluation

Paolo Bonzini 
 KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry

Paolo Bonzini 
 x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry

Paolo Bonzini 
 x86/speculation: Simplify sysfs report of VMX L1TF vulnerability

Paolo Bonzini 
 KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR

Wanpeng Li 
 KVM: X86: Allow userspace to define the microcode version

Wanpeng Li 
 KVM: X86: Introduce kvm_get_msr_feature()

Tom Lendacky 
 KVM: SVM: Add MSR-based feature support for serializing LFENCE

Tom Lendacky 
 KVM: x86: Add a framework for supporting MSR-based features

Thomas Gleixner 
 Documentation/l1tf: Remove Yonah processors from not vulnerable list

Nicolai Stange 
 x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr()

Nicolai Stange 
 x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d

Nicolai Stange 
 x86: Don't include linux/irq.h from asm/hardirq.h

Nicolai Stange 
 x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d

Nicolai Stange 
 x86/irq: Demote irq_cpustat_t::__softirq_pending to u16

Nicolai Stange 
 x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush()

Nicolai Stange 
 x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond'

Nicolai Stange 
 x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush()

Josh Poimboeuf 
 cpu/hotplug: detect SMT disabled by BIOS

Tony Luck 
 Documentation/l1tf: Fix typos

Nicolai Stange 
 x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content

Thomas Gleixner 
 Documentation: Add section about CPU vulnerabilities

Jiri Kosina 
 x86/bugs, kvm: Introduce boot-time control of L1TF mitigations

Thomas Gleixner 
 cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early

Jiri Kosina 
 cpu/hotplug: Expose SMT control init function

Thomas Gleixner 
 x86/kvm: Allow runtime control of L1D flush

Thomas Gleixner 
 x86/kvm: Serialize L1D flush parameter setter

Thomas Gleixner 
 x86/kvm: Add static key for flush always

Thomas Gleixner 
 x86/kvm: Move l1tf setup function

Thomas Gleixner 
 x86/l1tf: Handle EPT disabled state proper

Thomas Gleixner 
 x86/kvm: Drop L1TF MSR list approach

Thomas Gleixner 
 x86/litf: Introduce vmx status variable

Thomas Gleixner 
 cpu/hotplug: Online siblings when SMT control is turned on

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Add find_msr() helper function

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest 
numbers

Jim Mattson 
 kvm: nVMX: Update MSR load counts on a VMCS switch

Paolo Bonzini 
 x86/KVM/VMX: Add L1D flush logic

Paolo Bonzini 
 x86/KVM/VMX: Add L1D MSR based 

Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Nathan Chancellor
On Wed, Aug 15, 2018 at 02:36:00AM +0200, Sebastian Gottschall wrote:
> if SWAP is disabled in kernel config, the following compile error will raise
> up with this release
> 
> arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1):
> undefined reference to `generic_max_swapfile_size'
> 
> of course this is simple to fix. the function max_swapfile_size must be
> excluded if CONFIG_SWAP is disabled
> 
> Sebastian
> 

This is fixed upstream as commmit 792adb90fa72 ("x86/init: fix build
with CONFIG_SWAP=n").

Cheers,
Nathan


Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Nathan Chancellor
On Wed, Aug 15, 2018 at 02:36:00AM +0200, Sebastian Gottschall wrote:
> if SWAP is disabled in kernel config, the following compile error will raise
> up with this release
> 
> arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1):
> undefined reference to `generic_max_swapfile_size'
> 
> of course this is simple to fix. the function max_swapfile_size must be
> excluded if CONFIG_SWAP is disabled
> 
> Sebastian
> 

This is fixed upstream as commmit 792adb90fa72 ("x86/init: fix build
with CONFIG_SWAP=n").

Cheers,
Nathan


Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Sebastian Gottschall
if SWAP is disabled in kernel config, the following compile error will 
raise up with this release


arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): 
undefined reference to `generic_max_swapfile_size'


of course this is simple to fix. the function max_swapfile_size must be 
excluded if CONFIG_SWAP is disabled


Sebastian

Am 14.08.2018 um 19:16 schrieb Greg Kroah-Hartman:

This is the start of the stable review cycle for the 4.9.120 release.
There are 107 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Aug 16 17:14:53 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:

https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz
or in the git tree and branch at:

git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-
Pseudo-Shortlog of commits:

Greg Kroah-Hartman 
 Linux 4.9.120-rc1

Josh Poimboeuf 
 x86/microcode: Allow late microcode loading with SMT disabled

Ashok Raj 
 x86/microcode: Do not upload microcode if CPUs are offline

David Woodhouse 
 tools headers: Synchronise x86 cpufeatures.h for L1TF additions

Andi Kleen 
 x86/mm/kmmio: Make the tracer robust against L1TF

Andi Kleen 
 x86/mm/pat: Make set_memory_np() L1TF safe

Andi Kleen 
 x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert

Andi Kleen 
 x86/speculation/l1tf: Invert all not present mappings

Thomas Gleixner 
 cpu/hotplug: Fix SMT supported evaluation

Paolo Bonzini 
 KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry

Paolo Bonzini 
 x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry

Paolo Bonzini 
 x86/speculation: Simplify sysfs report of VMX L1TF vulnerability

Paolo Bonzini 
 KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR

Wanpeng Li 
 KVM: X86: Allow userspace to define the microcode version

Wanpeng Li 
 KVM: X86: Introduce kvm_get_msr_feature()

Tom Lendacky 
 KVM: SVM: Add MSR-based feature support for serializing LFENCE

Tom Lendacky 
 KVM: x86: Add a framework for supporting MSR-based features

Thomas Gleixner 
 Documentation/l1tf: Remove Yonah processors from not vulnerable list

Nicolai Stange 
 x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr()

Nicolai Stange 
 x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d

Nicolai Stange 
 x86: Don't include linux/irq.h from asm/hardirq.h

Nicolai Stange 
 x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d

Nicolai Stange 
 x86/irq: Demote irq_cpustat_t::__softirq_pending to u16

Nicolai Stange 
 x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush()

Nicolai Stange 
 x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond'

Nicolai Stange 
 x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush()

Josh Poimboeuf 
 cpu/hotplug: detect SMT disabled by BIOS

Tony Luck 
 Documentation/l1tf: Fix typos

Nicolai Stange 
 x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content

Thomas Gleixner 
 Documentation: Add section about CPU vulnerabilities

Jiri Kosina 
 x86/bugs, kvm: Introduce boot-time control of L1TF mitigations

Thomas Gleixner 
 cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early

Jiri Kosina 
 cpu/hotplug: Expose SMT control init function

Thomas Gleixner 
 x86/kvm: Allow runtime control of L1D flush

Thomas Gleixner 
 x86/kvm: Serialize L1D flush parameter setter

Thomas Gleixner 
 x86/kvm: Add static key for flush always

Thomas Gleixner 
 x86/kvm: Move l1tf setup function

Thomas Gleixner 
 x86/l1tf: Handle EPT disabled state proper

Thomas Gleixner 
 x86/kvm: Drop L1TF MSR list approach

Thomas Gleixner 
 x86/litf: Introduce vmx status variable

Thomas Gleixner 
 cpu/hotplug: Online siblings when SMT control is turned on

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Add find_msr() helper function

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest 
numbers

Jim Mattson 
 kvm: nVMX: Update MSR load counts on a VMCS switch

Paolo Bonzini 
 x86/KVM/VMX: Add L1D flush logic

Paolo Bonzini 
 x86/KVM/VMX: Add L1D MSR based flush

Paolo Bonzini 
 x86/KVM/VMX: Add L1D flush algorithm

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Add module argument for L1TF mitigation

Konrad Rzeszutek Wilk 
 x86/KVM: Warn user if 

Re: [PATCH 4.9 000/107] 4.9.120-stable review

2018-08-14 Thread Sebastian Gottschall
if SWAP is disabled in kernel config, the following compile error will 
raise up with this release


arch/x86/built-in.o: in function `max_swapfile_size': (.text+0x3bba1): 
undefined reference to `generic_max_swapfile_size'


of course this is simple to fix. the function max_swapfile_size must be 
excluded if CONFIG_SWAP is disabled


Sebastian

Am 14.08.2018 um 19:16 schrieb Greg Kroah-Hartman:

This is the start of the stable review cycle for the 4.9.120 release.
There are 107 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Aug 16 17:14:53 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:

https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.120-rc1.gz
or in the git tree and branch at:

git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-
Pseudo-Shortlog of commits:

Greg Kroah-Hartman 
 Linux 4.9.120-rc1

Josh Poimboeuf 
 x86/microcode: Allow late microcode loading with SMT disabled

Ashok Raj 
 x86/microcode: Do not upload microcode if CPUs are offline

David Woodhouse 
 tools headers: Synchronise x86 cpufeatures.h for L1TF additions

Andi Kleen 
 x86/mm/kmmio: Make the tracer robust against L1TF

Andi Kleen 
 x86/mm/pat: Make set_memory_np() L1TF safe

Andi Kleen 
 x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert

Andi Kleen 
 x86/speculation/l1tf: Invert all not present mappings

Thomas Gleixner 
 cpu/hotplug: Fix SMT supported evaluation

Paolo Bonzini 
 KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry

Paolo Bonzini 
 x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry

Paolo Bonzini 
 x86/speculation: Simplify sysfs report of VMX L1TF vulnerability

Paolo Bonzini 
 KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR

Wanpeng Li 
 KVM: X86: Allow userspace to define the microcode version

Wanpeng Li 
 KVM: X86: Introduce kvm_get_msr_feature()

Tom Lendacky 
 KVM: SVM: Add MSR-based feature support for serializing LFENCE

Tom Lendacky 
 KVM: x86: Add a framework for supporting MSR-based features

Thomas Gleixner 
 Documentation/l1tf: Remove Yonah processors from not vulnerable list

Nicolai Stange 
 x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr()

Nicolai Stange 
 x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d

Nicolai Stange 
 x86: Don't include linux/irq.h from asm/hardirq.h

Nicolai Stange 
 x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d

Nicolai Stange 
 x86/irq: Demote irq_cpustat_t::__softirq_pending to u16

Nicolai Stange 
 x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush()

Nicolai Stange 
 x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond'

Nicolai Stange 
 x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush()

Josh Poimboeuf 
 cpu/hotplug: detect SMT disabled by BIOS

Tony Luck 
 Documentation/l1tf: Fix typos

Nicolai Stange 
 x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content

Thomas Gleixner 
 Documentation: Add section about CPU vulnerabilities

Jiri Kosina 
 x86/bugs, kvm: Introduce boot-time control of L1TF mitigations

Thomas Gleixner 
 cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early

Jiri Kosina 
 cpu/hotplug: Expose SMT control init function

Thomas Gleixner 
 x86/kvm: Allow runtime control of L1D flush

Thomas Gleixner 
 x86/kvm: Serialize L1D flush parameter setter

Thomas Gleixner 
 x86/kvm: Add static key for flush always

Thomas Gleixner 
 x86/kvm: Move l1tf setup function

Thomas Gleixner 
 x86/l1tf: Handle EPT disabled state proper

Thomas Gleixner 
 x86/kvm: Drop L1TF MSR list approach

Thomas Gleixner 
 x86/litf: Introduce vmx status variable

Thomas Gleixner 
 cpu/hotplug: Online siblings when SMT control is turned on

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Add find_msr() helper function

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest 
numbers

Jim Mattson 
 kvm: nVMX: Update MSR load counts on a VMCS switch

Paolo Bonzini 
 x86/KVM/VMX: Add L1D flush logic

Paolo Bonzini 
 x86/KVM/VMX: Add L1D MSR based flush

Paolo Bonzini 
 x86/KVM/VMX: Add L1D flush algorithm

Konrad Rzeszutek Wilk 
 x86/KVM/VMX: Add module argument for L1TF mitigation

Konrad Rzeszutek Wilk 
 x86/KVM: Warn user if 

Re: [RFC PATCH 2/2] mm: drain memcg stocks on css offlining

2018-08-14 Thread Shakeel Butt
On Tue, Aug 14, 2018 at 5:36 PM Roman Gushchin  wrote:
>
> Memcg charge is batched using per-cpu stocks, so an offline memcg
> can be pinned by a cached charge up to a moment, when a process
> belonging to some other cgroup will charge some memory on the same
> cpu. In other words, cached charges can prevent a memory cgroup
> from being reclaimed for some time, without any clear need.
>
> Let's optimize it by explicit draining of all stocks on css offlining.
> As draining is performed asynchronously, and is skipped if any
> parallel draining is happening, it's cheap.
>
> Signed-off-by: Roman Gushchin 

Seems reasonable.

Reviewed-by: Shakeel Butt 

> Cc: Johannes Weiner 
> Cc: Michal Hocko 
> Cc: Konstantin Khlebnikov 
> Cc: Tejun Heo 
> ---
>  mm/memcontrol.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4e3c1315b1de..cfb64b5b9957 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4575,6 +4575,8 @@ static void mem_cgroup_css_offline(struct 
> cgroup_subsys_state *css)
> memcg_offline_kmem(memcg);
> wb_memcg_offline(memcg);
>
> +   drain_all_stock(memcg);
> +
> mem_cgroup_id_put(memcg);
>  }
>
> --
> 2.14.4
>


Re: [RFC PATCH 2/2] mm: drain memcg stocks on css offlining

2018-08-14 Thread Shakeel Butt
On Tue, Aug 14, 2018 at 5:36 PM Roman Gushchin  wrote:
>
> Memcg charge is batched using per-cpu stocks, so an offline memcg
> can be pinned by a cached charge up to a moment, when a process
> belonging to some other cgroup will charge some memory on the same
> cpu. In other words, cached charges can prevent a memory cgroup
> from being reclaimed for some time, without any clear need.
>
> Let's optimize it by explicit draining of all stocks on css offlining.
> As draining is performed asynchronously, and is skipped if any
> parallel draining is happening, it's cheap.
>
> Signed-off-by: Roman Gushchin 

Seems reasonable.

Reviewed-by: Shakeel Butt 

> Cc: Johannes Weiner 
> Cc: Michal Hocko 
> Cc: Konstantin Khlebnikov 
> Cc: Tejun Heo 
> ---
>  mm/memcontrol.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4e3c1315b1de..cfb64b5b9957 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4575,6 +4575,8 @@ static void mem_cgroup_css_offline(struct 
> cgroup_subsys_state *css)
> memcg_offline_kmem(memcg);
> wb_memcg_offline(memcg);
>
> +   drain_all_stock(memcg);
> +
> mem_cgroup_id_put(memcg);
>  }
>
> --
> 2.14.4
>


[PATCH] kconfig: add build-only configurator targets

2018-08-14 Thread Randy Dunlap
From: Randy Dunlap 

Add build-only targets for build_menuconfig, build_nconfig,
build_xconfig, and build_gconfig.
(targets must end in "config" to qualify in top-level Makefile)

This allows these target to be built without execution (e.g., to
look for errors or warnings) and/or to be built and checked by sparse.

Signed-off-by: Randy Dunlap 
---
 scripts/kconfig/Makefile |8 
 1 file changed, 8 insertions(+)

--- linux-next-20180814.orig/scripts/kconfig/Makefile
+++ linux-next-20180814/scripts/kconfig/Makefile
@@ -33,6 +33,14 @@ config: $(obj)/conf
 nconfig: $(obj)/nconf
$< $(silent) $(Kconfig)
 
+build_menuconfig: $(obj)/mconf
+
+build_nconfig: $(obj)/nconf
+
+build_gconfig: $(obj)/gconf
+
+build_xconfig: $(obj)/qconf
+
 localyesconfig localmodconfig: $(obj)/conf
$(Q)perl $(srctree)/$(src)/streamline_config.pl --$@ $(srctree) 
$(Kconfig) > .tmp.config
$(Q)if [ -f .config ]; then \



[RFC PATCH 1/2] mm: rework memcg kernel stack accounting

2018-08-14 Thread Roman Gushchin
If CONFIG_VMAP_STACK is set, kernel stacks are allocated
using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel
stack pages are charged against corresponding memory cgroups
on allocation and uncharged on releasing them.

The problem is that we do cache kernel stacks in small
per-cpu caches and do reuse them for new tasks, which can
belong to different memory cgroups.

Each stack page still holds a reference to the original cgroup,
so the cgroup can't be released until the vmap area is released.

To make this happen we need more than two subsequent exits
without forks in between on the current cpu, which makes it
very unlikely to happen. As a result, I saw a significant number
of dying cgroups (in theory, up to 2 * number_of_cpu +
number_of_tasks), which can't be released even by significant
memory pressure.

As a cgroup structure can take a significant amount of memory
(first of all, per-cpu data like memcg statistics), it leads
to a noticeable waste of memory.

Signed-off-by: Roman Gushchin 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Andy Lutomirski 
Cc: Konstantin Khlebnikov 
Cc: Tejun Heo 
---
 kernel/fork.c | 44 ++--
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 69b6fea5a181..91872b2b37bd 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -224,9 +224,14 @@ static unsigned long *alloc_thread_stack_node(struct 
task_struct *tsk, int node)
return s->addr;
}
 
+   /*
+* Allocated stacks are cached and later reused by new threads,
+* so memcg accounting is performed manually on assigning/releasing
+* stacks to tasks. Drop __GFP_ACCOUNT.
+*/
stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
 VMALLOC_START, VMALLOC_END,
-THREADINFO_GFP,
+THREADINFO_GFP & ~__GFP_ACCOUNT,
 PAGE_KERNEL,
 0, node, __builtin_return_address(0));
 
@@ -246,12 +251,41 @@ static unsigned long *alloc_thread_stack_node(struct 
task_struct *tsk, int node)
 #endif
 }
 
+static void memcg_charge_kernel_stack(struct task_struct *tsk)
+{
+#ifdef CONFIG_VMAP_STACK
+   struct vm_struct *vm = task_stack_vm_area(tsk);
+
+   if (vm) {
+   int i;
+
+   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
+   memcg_kmem_charge(vm->pages[i], __GFP_NOFAIL,
+ compound_order(vm->pages[i]));
+
+   /* All stack pages belong to the same memcg. */
+   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
+THREAD_SIZE / 1024);
+   }
+#endif
+}
+
 static inline void free_thread_stack(struct task_struct *tsk)
 {
 #ifdef CONFIG_VMAP_STACK
-   if (task_stack_vm_area(tsk)) {
+   struct vm_struct *vm = task_stack_vm_area(tsk);
+
+   if (vm) {
int i;
 
+   /* All stack pages belong to the same memcg. */
+   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
+-(int)(THREAD_SIZE / 1024));
+
+   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
+   memcg_kmem_uncharge(vm->pages[i],
+ compound_order(vm->pages[i]));
+
for (i = 0; i < NR_CACHED_STACKS; i++) {
if (this_cpu_cmpxchg(cached_stacks[i],
NULL, tsk->stack_vm_area) != NULL)
@@ -352,10 +386,6 @@ static void account_kernel_stack(struct task_struct *tsk, 
int account)
NR_KERNEL_STACK_KB,
PAGE_SIZE / 1024 * account);
}
-
-   /* All stack pages belong to the same memcg. */
-   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
-account * (THREAD_SIZE / 1024));
} else {
/*
 * All stack pages are in the same zone and belong to the
@@ -809,6 +839,8 @@ static struct task_struct *dup_task_struct(struct 
task_struct *orig, int node)
if (!stack)
goto free_tsk;
 
+   memcg_charge_kernel_stack(tsk);
+
stack_vm_area = task_stack_vm_area(tsk);
 
err = arch_dup_task_struct(tsk, orig);
-- 
2.14.4



[RFC PATCH 1/2] mm: rework memcg kernel stack accounting

2018-08-14 Thread Roman Gushchin
If CONFIG_VMAP_STACK is set, kernel stacks are allocated
using __vmalloc_node_range() with __GFP_ACCOUNT. So kernel
stack pages are charged against corresponding memory cgroups
on allocation and uncharged on releasing them.

The problem is that we do cache kernel stacks in small
per-cpu caches and do reuse them for new tasks, which can
belong to different memory cgroups.

Each stack page still holds a reference to the original cgroup,
so the cgroup can't be released until the vmap area is released.

To make this happen we need more than two subsequent exits
without forks in between on the current cpu, which makes it
very unlikely to happen. As a result, I saw a significant number
of dying cgroups (in theory, up to 2 * number_of_cpu +
number_of_tasks), which can't be released even by significant
memory pressure.

As a cgroup structure can take a significant amount of memory
(first of all, per-cpu data like memcg statistics), it leads
to a noticeable waste of memory.

Signed-off-by: Roman Gushchin 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Andy Lutomirski 
Cc: Konstantin Khlebnikov 
Cc: Tejun Heo 
---
 kernel/fork.c | 44 ++--
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 69b6fea5a181..91872b2b37bd 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -224,9 +224,14 @@ static unsigned long *alloc_thread_stack_node(struct 
task_struct *tsk, int node)
return s->addr;
}
 
+   /*
+* Allocated stacks are cached and later reused by new threads,
+* so memcg accounting is performed manually on assigning/releasing
+* stacks to tasks. Drop __GFP_ACCOUNT.
+*/
stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
 VMALLOC_START, VMALLOC_END,
-THREADINFO_GFP,
+THREADINFO_GFP & ~__GFP_ACCOUNT,
 PAGE_KERNEL,
 0, node, __builtin_return_address(0));
 
@@ -246,12 +251,41 @@ static unsigned long *alloc_thread_stack_node(struct 
task_struct *tsk, int node)
 #endif
 }
 
+static void memcg_charge_kernel_stack(struct task_struct *tsk)
+{
+#ifdef CONFIG_VMAP_STACK
+   struct vm_struct *vm = task_stack_vm_area(tsk);
+
+   if (vm) {
+   int i;
+
+   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
+   memcg_kmem_charge(vm->pages[i], __GFP_NOFAIL,
+ compound_order(vm->pages[i]));
+
+   /* All stack pages belong to the same memcg. */
+   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
+THREAD_SIZE / 1024);
+   }
+#endif
+}
+
 static inline void free_thread_stack(struct task_struct *tsk)
 {
 #ifdef CONFIG_VMAP_STACK
-   if (task_stack_vm_area(tsk)) {
+   struct vm_struct *vm = task_stack_vm_area(tsk);
+
+   if (vm) {
int i;
 
+   /* All stack pages belong to the same memcg. */
+   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
+-(int)(THREAD_SIZE / 1024));
+
+   for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
+   memcg_kmem_uncharge(vm->pages[i],
+ compound_order(vm->pages[i]));
+
for (i = 0; i < NR_CACHED_STACKS; i++) {
if (this_cpu_cmpxchg(cached_stacks[i],
NULL, tsk->stack_vm_area) != NULL)
@@ -352,10 +386,6 @@ static void account_kernel_stack(struct task_struct *tsk, 
int account)
NR_KERNEL_STACK_KB,
PAGE_SIZE / 1024 * account);
}
-
-   /* All stack pages belong to the same memcg. */
-   mod_memcg_page_state(vm->pages[0], MEMCG_KERNEL_STACK_KB,
-account * (THREAD_SIZE / 1024));
} else {
/*
 * All stack pages are in the same zone and belong to the
@@ -809,6 +839,8 @@ static struct task_struct *dup_task_struct(struct 
task_struct *orig, int node)
if (!stack)
goto free_tsk;
 
+   memcg_charge_kernel_stack(tsk);
+
stack_vm_area = task_stack_vm_area(tsk);
 
err = arch_dup_task_struct(tsk, orig);
-- 
2.14.4



[PATCH] kconfig: add build-only configurator targets

2018-08-14 Thread Randy Dunlap
From: Randy Dunlap 

Add build-only targets for build_menuconfig, build_nconfig,
build_xconfig, and build_gconfig.
(targets must end in "config" to qualify in top-level Makefile)

This allows these target to be built without execution (e.g., to
look for errors or warnings) and/or to be built and checked by sparse.

Signed-off-by: Randy Dunlap 
---
 scripts/kconfig/Makefile |8 
 1 file changed, 8 insertions(+)

--- linux-next-20180814.orig/scripts/kconfig/Makefile
+++ linux-next-20180814/scripts/kconfig/Makefile
@@ -33,6 +33,14 @@ config: $(obj)/conf
 nconfig: $(obj)/nconf
$< $(silent) $(Kconfig)
 
+build_menuconfig: $(obj)/mconf
+
+build_nconfig: $(obj)/nconf
+
+build_gconfig: $(obj)/gconf
+
+build_xconfig: $(obj)/qconf
+
 localyesconfig localmodconfig: $(obj)/conf
$(Q)perl $(srctree)/$(src)/streamline_config.pl --$@ $(srctree) 
$(Kconfig) > .tmp.config
$(Q)if [ -f .config ]; then \



[RFC PATCH 2/2] mm: drain memcg stocks on css offlining

2018-08-14 Thread Roman Gushchin
Memcg charge is batched using per-cpu stocks, so an offline memcg
can be pinned by a cached charge up to a moment, when a process
belonging to some other cgroup will charge some memory on the same
cpu. In other words, cached charges can prevent a memory cgroup
from being reclaimed for some time, without any clear need.

Let's optimize it by explicit draining of all stocks on css offlining.
As draining is performed asynchronously, and is skipped if any
parallel draining is happening, it's cheap.

Signed-off-by: Roman Gushchin 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Konstantin Khlebnikov 
Cc: Tejun Heo 
---
 mm/memcontrol.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4e3c1315b1de..cfb64b5b9957 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4575,6 +4575,8 @@ static void mem_cgroup_css_offline(struct 
cgroup_subsys_state *css)
memcg_offline_kmem(memcg);
wb_memcg_offline(memcg);
 
+   drain_all_stock(memcg);
+
mem_cgroup_id_put(memcg);
 }
 
-- 
2.14.4



[RFC PATCH 2/2] mm: drain memcg stocks on css offlining

2018-08-14 Thread Roman Gushchin
Memcg charge is batched using per-cpu stocks, so an offline memcg
can be pinned by a cached charge up to a moment, when a process
belonging to some other cgroup will charge some memory on the same
cpu. In other words, cached charges can prevent a memory cgroup
from being reclaimed for some time, without any clear need.

Let's optimize it by explicit draining of all stocks on css offlining.
As draining is performed asynchronously, and is skipped if any
parallel draining is happening, it's cheap.

Signed-off-by: Roman Gushchin 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Konstantin Khlebnikov 
Cc: Tejun Heo 
---
 mm/memcontrol.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4e3c1315b1de..cfb64b5b9957 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4575,6 +4575,8 @@ static void mem_cgroup_css_offline(struct 
cgroup_subsys_state *css)
memcg_offline_kmem(memcg);
wb_memcg_offline(memcg);
 
+   drain_all_stock(memcg);
+
mem_cgroup_id_put(memcg);
 }
 
-- 
2.14.4



Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.

2018-08-14 Thread Michael Jin
On Tue, Aug 14, 2018 at 4:26 PM, Ghannam, Yazen  wrote:
>
> > -Original Message-
> > From: Michael Jin 
> > Sent: Friday, August 10, 2018 2:36 PM
> > To: Borislav Petkov ; Ghannam, Yazen
> > ; Mauro Carvalho Chehab
> > 
> > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; Michael Jin
> > 
> There may be some differences between models, but things should generally
> work. It's just a matter of whether or not the Platform enables certain things
> like DRAM ECC, etc.
>
> Does the amd64_edac_mod module load on your platform with just this patch?
>
> > + fam_type = _types[F17_M11H_CPUS];
> > + pvt->ops = _types[F17_M11H_CPUS].ops;
> > + break;
> > + }
> >   fam_type= _types[F17_CPUS];
> >   pvt->ops= _types[F17_CPUS].ops;
> >   break;

Yes, I was able to load amd64_edac_mod on my AMD Ryzen Embedded V1807B
(further details on my blog -
https://ndimcomputing.io/fs-fp5v_ecc_linux.html).

> > diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
> > index 1d4b74e9a037..e50226cd53c6 100644
> > --- a/drivers/edac/amd64_edac.h
> > +++ b/drivers/edac/amd64_edac.h
> > @@ -115,6 +115,8 @@
> >  #define PCI_DEVICE_ID_AMD_16H_M30H_NB_F2 0x1582
> >  #define PCI_DEVICE_ID_AMD_17H_DF_F0  0x1460
> >  #define PCI_DEVICE_ID_AMD_17H_DF_F6  0x1466
> > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F0 0x15e8
> > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F6 0x15ee
> >
>
> These IDs are used for Fam17h Models 10h-2Fh. Can you please change
> the names here and in the rest of this patch?
>
> The format is to use the first supported model in the name, e.g. M11H -> M10H.
>
> >  /*
> >   * Function 1 - Address Map
> > @@ -281,6 +283,7 @@ enum amd_families {
> >   F16_CPUS,
> >   F16_M30H_CPUS,
> >   F17_CPUS,
> > + F17_M11H_CPUS,
> >   NUM_FAMILIES,
> >  };
> >

According to https://en.wikichip.org/wiki/amd/cpuid, family 17h model
10h is not publicly known.

Therefore, I would like you to confirm that model 10h uses 0x15e8
(device F0) and 0x15ee (device F6) as I can not find any documentation
or test whether ECC works.

Raven Ridge BIOS motherboards do not enable ECC, but share the same
CPUID (0081_0F10h
http://www.cpu-world.com/CPUs/Zen/AMD-Ryzen%205%202400G.html) as AMD
Ryzen Embedded V1000 Processor Family.

This patch was written due to the fact that amd64_edac_mod erroneously
sets the wrong device ids for the AMD Ryzen Embedded V1000 Family
(device F0 and F6 fail to load, as they do not share 0x1460 and 0x1466
with the other family 17h processors).

Michael


Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.

2018-08-14 Thread Michael Jin
On Tue, Aug 14, 2018 at 4:26 PM, Ghannam, Yazen  wrote:
>
> > -Original Message-
> > From: Michael Jin 
> > Sent: Friday, August 10, 2018 2:36 PM
> > To: Borislav Petkov ; Ghannam, Yazen
> > ; Mauro Carvalho Chehab
> > 
> > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; Michael Jin
> > 
> There may be some differences between models, but things should generally
> work. It's just a matter of whether or not the Platform enables certain things
> like DRAM ECC, etc.
>
> Does the amd64_edac_mod module load on your platform with just this patch?
>
> > + fam_type = _types[F17_M11H_CPUS];
> > + pvt->ops = _types[F17_M11H_CPUS].ops;
> > + break;
> > + }
> >   fam_type= _types[F17_CPUS];
> >   pvt->ops= _types[F17_CPUS].ops;
> >   break;

Yes, I was able to load amd64_edac_mod on my AMD Ryzen Embedded V1807B
(further details on my blog -
https://ndimcomputing.io/fs-fp5v_ecc_linux.html).

> > diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
> > index 1d4b74e9a037..e50226cd53c6 100644
> > --- a/drivers/edac/amd64_edac.h
> > +++ b/drivers/edac/amd64_edac.h
> > @@ -115,6 +115,8 @@
> >  #define PCI_DEVICE_ID_AMD_16H_M30H_NB_F2 0x1582
> >  #define PCI_DEVICE_ID_AMD_17H_DF_F0  0x1460
> >  #define PCI_DEVICE_ID_AMD_17H_DF_F6  0x1466
> > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F0 0x15e8
> > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F6 0x15ee
> >
>
> These IDs are used for Fam17h Models 10h-2Fh. Can you please change
> the names here and in the rest of this patch?
>
> The format is to use the first supported model in the name, e.g. M11H -> M10H.
>
> >  /*
> >   * Function 1 - Address Map
> > @@ -281,6 +283,7 @@ enum amd_families {
> >   F16_CPUS,
> >   F16_M30H_CPUS,
> >   F17_CPUS,
> > + F17_M11H_CPUS,
> >   NUM_FAMILIES,
> >  };
> >

According to https://en.wikichip.org/wiki/amd/cpuid, family 17h model
10h is not publicly known.

Therefore, I would like you to confirm that model 10h uses 0x15e8
(device F0) and 0x15ee (device F6) as I can not find any documentation
or test whether ECC works.

Raven Ridge BIOS motherboards do not enable ECC, but share the same
CPUID (0081_0F10h
http://www.cpu-world.com/CPUs/Zen/AMD-Ryzen%205%202400G.html) as AMD
Ryzen Embedded V1000 Processor Family.

This patch was written due to the fact that amd64_edac_mod erroneously
sets the wrong device ids for the AMD Ryzen Embedded V1000 Family
(device F0 and F6 fail to load, as they do not share 0x1460 and 0x1466
with the other family 17h processors).

Michael


Re: [PATCH] mm: migration: fix migration of huge PMD shared pages

2018-08-14 Thread Mike Kravetz
On 08/14/2018 01:48 AM, Kirill A. Shutemov wrote:
> On Mon, Aug 13, 2018 at 11:21:41PM +, Mike Kravetz wrote:
>> On 08/13/2018 03:58 AM, Kirill A. Shutemov wrote:
>>> On Sun, Aug 12, 2018 at 08:41:08PM -0700, Mike Kravetz wrote:
 I am not %100 sure on the required flushing, so suggestions would be
 appreciated.  This also should go to stable.  It has been around for
 a long time so still looking for an appropriate 'fixes:'.
>>>
>>> I believe we need flushing. And huge_pmd_unshare() usage in
>>> __unmap_hugepage_range() looks suspicious: I don't see how we flush TLB in
>>> that case.
>>
>> Thanks Kirill,
>>
>> __unmap_hugepage_range() has two callers:
>> 1) unmap_hugepage_range, which wraps the call with tlb_gather_mmu and
>>tlb_finish_mmu on the range.  IIUC, this should cause an appropriate
>>TLB flush.
>> 2) __unmap_hugepage_range_final via unmap_single_vma.  unmap_single_vma
>>   has three callers:
>>   - unmap_vmas which assumes the caller will flush the whole range after
>> return.
>>   - zap_page_range wraps the call with tlb_gather_mmu/tlb_finish_mmu
>>   - zap_page_range_single wraps the call with tlb_gather_mmu/tlb_finish_mmu
>>
>> So, it appears we are covered.  But, I could be missing something.
> 
> My problem here is that the mapping that moved by huge_pmd_unshare() in
> not accounted into mmu_gather and can be missed on tlb_finish_mmu().

Ah, I think I now see the issue you are concerned with.

When huge_pmd_unshare succeeds we effectively unmap a PUD_SIZE area.
The routine __unmap_hugepage_range may only have been passed a range
that is a subset of PUD_SIZE.  In the case I was trying to address,
try_to_unmap_one() the 'range' will certainly be less than PUD_SIZE.
Upon further thought, I think that even in the case of try_to_unmap_one
we should flush PUD_SIZE range.

My first thought would be to embed this flushing within huge_pmd_unshare
itself.  Perhaps, whenever huge_pmd_unshare succeeds we should do an
explicit:
flush_cache_range(PUD_SIZE)
flush_tlb_range(PUD_SIZE)
mmu_notifier_invalidate_range(PUD_SIZE)
That would take some of the burden off the callers of huge_pmd_unshare.
However, I am not sure if the flushing calls above play nice in all the
calling environments.  I'll look into it some more, but would appreciate
additional comments.

-- 
Mike Kravetz


Re: [PATCH] mm: migration: fix migration of huge PMD shared pages

2018-08-14 Thread Mike Kravetz
On 08/14/2018 01:48 AM, Kirill A. Shutemov wrote:
> On Mon, Aug 13, 2018 at 11:21:41PM +, Mike Kravetz wrote:
>> On 08/13/2018 03:58 AM, Kirill A. Shutemov wrote:
>>> On Sun, Aug 12, 2018 at 08:41:08PM -0700, Mike Kravetz wrote:
 I am not %100 sure on the required flushing, so suggestions would be
 appreciated.  This also should go to stable.  It has been around for
 a long time so still looking for an appropriate 'fixes:'.
>>>
>>> I believe we need flushing. And huge_pmd_unshare() usage in
>>> __unmap_hugepage_range() looks suspicious: I don't see how we flush TLB in
>>> that case.
>>
>> Thanks Kirill,
>>
>> __unmap_hugepage_range() has two callers:
>> 1) unmap_hugepage_range, which wraps the call with tlb_gather_mmu and
>>tlb_finish_mmu on the range.  IIUC, this should cause an appropriate
>>TLB flush.
>> 2) __unmap_hugepage_range_final via unmap_single_vma.  unmap_single_vma
>>   has three callers:
>>   - unmap_vmas which assumes the caller will flush the whole range after
>> return.
>>   - zap_page_range wraps the call with tlb_gather_mmu/tlb_finish_mmu
>>   - zap_page_range_single wraps the call with tlb_gather_mmu/tlb_finish_mmu
>>
>> So, it appears we are covered.  But, I could be missing something.
> 
> My problem here is that the mapping that moved by huge_pmd_unshare() in
> not accounted into mmu_gather and can be missed on tlb_finish_mmu().

Ah, I think I now see the issue you are concerned with.

When huge_pmd_unshare succeeds we effectively unmap a PUD_SIZE area.
The routine __unmap_hugepage_range may only have been passed a range
that is a subset of PUD_SIZE.  In the case I was trying to address,
try_to_unmap_one() the 'range' will certainly be less than PUD_SIZE.
Upon further thought, I think that even in the case of try_to_unmap_one
we should flush PUD_SIZE range.

My first thought would be to embed this flushing within huge_pmd_unshare
itself.  Perhaps, whenever huge_pmd_unshare succeeds we should do an
explicit:
flush_cache_range(PUD_SIZE)
flush_tlb_range(PUD_SIZE)
mmu_notifier_invalidate_range(PUD_SIZE)
That would take some of the burden off the callers of huge_pmd_unshare.
However, I am not sure if the flushing calls above play nice in all the
calling environments.  I'll look into it some more, but would appreciate
additional comments.

-- 
Mike Kravetz


Re: general protection fault in send_sigurg_to_task

2018-08-14 Thread Stephen Rothwell
Hi Bruce,

On Tue, 14 Aug 2018 13:50:20 -0700 Dmitry Vyukov  wrote:
>
> On Tue, Aug 14, 2018 at 12:11 PM, J. Bruce Fields  
> wrote:
> > On Mon, Aug 13, 2018 at 06:33:02AM -0700, syzbot wrote:  
> >> syzbot has found a reproducer for the following crash on:
> >>
> >> HEAD commit:5ed5da74de9e Add linux-next specific files for 20180813
> >> git tree:   linux-next  
> >
> > I fetched linux-next but don't have 5ed5da74de9e.  
> 
> +Stephen for the disappeared linux-next commit.

That is just the HEAD commit on linux-next that day.  If you fetched
linux-next after I released next-20180814 yesterday, then it would have
a different HEAD commit.  If you check out the tag next-20180813, you
will get the above HEAD commit.

> On the dashboard link you can see that it also happened on a more
> recent commit 4e8b38549b50459a22573d756dd1f4e1963c2a8d that I do see
> now in linux-next.

Which is just the HEAD commit of next-20180814.

Linux-next is rebuilt every day based on Linus' tree of the day,
followed by merging all the other branches, so its HEAD is different
every day.
-- 
Cheers,
Stephen Rothwell


pgpaplQCyIbqo.pgp
Description: OpenPGP digital signature


Re: general protection fault in send_sigurg_to_task

2018-08-14 Thread Stephen Rothwell
Hi Bruce,

On Tue, 14 Aug 2018 13:50:20 -0700 Dmitry Vyukov  wrote:
>
> On Tue, Aug 14, 2018 at 12:11 PM, J. Bruce Fields  
> wrote:
> > On Mon, Aug 13, 2018 at 06:33:02AM -0700, syzbot wrote:  
> >> syzbot has found a reproducer for the following crash on:
> >>
> >> HEAD commit:5ed5da74de9e Add linux-next specific files for 20180813
> >> git tree:   linux-next  
> >
> > I fetched linux-next but don't have 5ed5da74de9e.  
> 
> +Stephen for the disappeared linux-next commit.

That is just the HEAD commit on linux-next that day.  If you fetched
linux-next after I released next-20180814 yesterday, then it would have
a different HEAD commit.  If you check out the tag next-20180813, you
will get the above HEAD commit.

> On the dashboard link you can see that it also happened on a more
> recent commit 4e8b38549b50459a22573d756dd1f4e1963c2a8d that I do see
> now in linux-next.

Which is just the HEAD commit of next-20180814.

Linux-next is rebuilt every day based on Linus' tree of the day,
followed by merging all the other branches, so its HEAD is different
every day.
-- 
Cheers,
Stephen Rothwell


pgpaplQCyIbqo.pgp
Description: OpenPGP digital signature


[PATCH] cpufreq: governor: Protect cpufreq governor_data

2018-08-14 Thread Henry Willard
If cppc_cpufreq.ko is deleted at the same time that tuned-adm is
changing profiles, there is a small chance that a race can occur
between cpufreq_dbs_governor_exit() and cpufreq_dbs_governor_limits()
resulting in a system failure when the latter tries to use
policy->governor_data that has been freed by the former.

This patch uses gov_dbs_data_mutex to synchronize access.

Signed-off-by: Henry Willard 
---
 drivers/cpufreq/cpufreq_governor.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c 
b/drivers/cpufreq/cpufreq_governor.c
index 1d50e97d49f1..43416ee55849 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -555,12 +555,20 @@ void cpufreq_dbs_governor_stop(struct cpufreq_policy 
*policy)
 
 void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy)
 {
-   struct policy_dbs_info *policy_dbs = policy->governor_data;
+   struct policy_dbs_info *policy_dbs;
+
+   /* Protect gov->gdbs_data against cpufreq_dbs_governor_exit */
+   mutex_lock(_dbs_data_mutex);
+   policy_dbs = policy->governor_data;
+   if (!policy_dbs)
+   goto out;
 
mutex_lock(_dbs->update_mutex);
cpufreq_policy_apply_limits(policy);
gov_update_sample_delay(policy_dbs, 0);
 
mutex_unlock(_dbs->update_mutex);
+out:
+   mutex_unlock(_dbs_data_mutex);
 }
 EXPORT_SYMBOL_GPL(cpufreq_dbs_governor_limits);
-- 
1.8.3.1



[PATCH] cpufreq: governor: Protect cpufreq governor_data

2018-08-14 Thread Henry Willard
If cppc_cpufreq.ko is deleted at the same time that tuned-adm is
changing profiles, there is a small chance that a race can occur
between cpufreq_dbs_governor_exit() and cpufreq_dbs_governor_limits()
resulting in a system failure when the latter tries to use
policy->governor_data that has been freed by the former.

This patch uses gov_dbs_data_mutex to synchronize access.

Signed-off-by: Henry Willard 
---
 drivers/cpufreq/cpufreq_governor.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c 
b/drivers/cpufreq/cpufreq_governor.c
index 1d50e97d49f1..43416ee55849 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -555,12 +555,20 @@ void cpufreq_dbs_governor_stop(struct cpufreq_policy 
*policy)
 
 void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy)
 {
-   struct policy_dbs_info *policy_dbs = policy->governor_data;
+   struct policy_dbs_info *policy_dbs;
+
+   /* Protect gov->gdbs_data against cpufreq_dbs_governor_exit */
+   mutex_lock(_dbs_data_mutex);
+   policy_dbs = policy->governor_data;
+   if (!policy_dbs)
+   goto out;
 
mutex_lock(_dbs->update_mutex);
cpufreq_policy_apply_limits(policy);
gov_update_sample_delay(policy_dbs, 0);
 
mutex_unlock(_dbs->update_mutex);
+out:
+   mutex_unlock(_dbs_data_mutex);
 }
 EXPORT_SYMBOL_GPL(cpufreq_dbs_governor_limits);
-- 
1.8.3.1



Re: [PATCH 1/2] fpga: Document when fpga_blah_free functions should be used

2018-08-14 Thread Alan Tull
On Thu, Jul 26, 2018 at 2:26 AM, Federico Vaga  wrote:
> Hi Alan,
>
> have you considered the possibility of having something like devm_fpga_[mgr|
> bridge|region]_[create|free]() ? Like this, it will be obvious that 'struct
> fpga_mgr' will be released automatically without reading any comment (but the
> comment is still good), and you use devm_*_free() only to handle error
> conditions.

Hi Federico,

Sorry for the late reply.  I was waiting until I had this all
implemented.  As you've seen, I've posted the devm_fpga_mgr_create
implementation [1].  I found I didn't need devm_*_free() for the error
conditions, the managed device code handled cleanup nicely without it.

Just to be clear, I'm dropping this patch in favor of devm support instead.

Thanks again for your suggestions.

Alan

[1] https://lkml.org/lkml/2018/8/14/954

>
> On Wednesday, July 25, 2018 8:15:13 PM CEST Alan Tull wrote:
>> Clarify when fpga_(mgr|bridge|region)_free functions should be used.
>> The class's dev_release will handle cleanup when the device is released
>> so once the mgr/brige/region has been successfully registered, it
>> would be a bug to call fpga_(mgr|bridge|region)_free.
>>
>> Signed-off-by: Alan Tull 
>> Suggested-by: Florian Fainelli 
>> Suggested-by: Federico Vaga 
>> ---
>>  drivers/fpga/fpga-bridge.c | 10 +-
>>  drivers/fpga/fpga-mgr.c| 10 +-
>>  drivers/fpga/fpga-region.c | 10 +-
>>  3 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/fpga/fpga-bridge.c b/drivers/fpga/fpga-bridge.c
>> index 24b8f98..528d2149 100644
>> --- a/drivers/fpga/fpga-bridge.c
>> +++ b/drivers/fpga/fpga-bridge.c
>> @@ -379,7 +379,11 @@ EXPORT_SYMBOL_GPL(fpga_bridge_create);
>>
>>  /**
>>   * fpga_bridge_free - free a fpga bridge and its id
>> - * @bridge:  FPGA bridge struct created by fpga_bridge_create
>> + * @bridge:  FPGA bridge struct created by fpga_bridge_create()
>> + *
>> + * Free a FPGA bridge.  This function should only be called for
>> + * freeing a bridge that has not been registered yet (such as in error
>> + * paths in a probe function).
>>   */
>>  void fpga_bridge_free(struct fpga_bridge *bridge)
>>  {
>> @@ -414,6 +418,10 @@ EXPORT_SYMBOL_GPL(fpga_bridge_register);
>>  /**
>>   * fpga_bridge_unregister - unregister and free a fpga bridge
>>   * @bridge:  FPGA bridge struct created by fpga_bridge_create
>> + *
>> + * Unregister the bridge device.  The class's dev_release will handle
>> + * freeing the bridge struct when the device is released so don't
>> + * call fpga_bridge_free() after calling fpga_bridge_unregister().
>>   */
>>  void fpga_bridge_unregister(struct fpga_bridge *bridge)
>>  {
>> diff --git a/drivers/fpga/fpga-mgr.c b/drivers/fpga/fpga-mgr.c
>> index a41b07e..9632cbd 100644
>> --- a/drivers/fpga/fpga-mgr.c
>> +++ b/drivers/fpga/fpga-mgr.c
>> @@ -619,7 +619,11 @@ EXPORT_SYMBOL_GPL(fpga_mgr_create);
>>
>>  /**
>>   * fpga_mgr_free - deallocate a FPGA manager
>> - * @mgr: fpga manager struct created by fpga_mgr_create
>> + * @mgr: fpga manager struct created by fpga_mgr_create()
>> + *
>> + * Free a FPGA manager struct.  This function should only be called
>> + * for freeing a manager that has not been registered yet (such as in
>> + * error paths in a probe function).
>>   */
>>  void fpga_mgr_free(struct fpga_manager *mgr)
>>  {
>> @@ -663,6 +667,10 @@ EXPORT_SYMBOL_GPL(fpga_mgr_register);
>>  /**
>>   * fpga_mgr_unregister - unregister and free a FPGA manager
>>   * @mgr: fpga manager struct
>> + *
>> + * Unregister the manager device.  The class's dev_release will handle
>> + * freeing the manager struct when the device is released so don't
>> + * call fpga_mgr_free() after calling fpga_mgr_unregister().
>>   */
>>  void fpga_mgr_unregister(struct fpga_manager *mgr)
>>  {
>> diff --git a/drivers/fpga/fpga-region.c b/drivers/fpga/fpga-region.c
>> index 0d65220..7335fa9 100644
>> --- a/drivers/fpga/fpga-region.c
>> +++ b/drivers/fpga/fpga-region.c
>> @@ -231,7 +231,11 @@ EXPORT_SYMBOL_GPL(fpga_region_create);
>>
>>  /**
>>   * fpga_region_free - free a struct fpga_region
>> - * @region: FPGA region created by fpga_region_create
>> + * @region:  FPGA region created by fpga_region_create()
>> + *
>> + * Free a FPGA region struct.  This function should only be called for
>> + * freeing a region that has not been registered yet (such as in error
>> + * paths in a probe function).
>>   */
>>  void fpga_region_free(struct fpga_region *region)
>>  {
>> @@ -255,6 +259,10 @@ EXPORT_SYMBOL_GPL(fpga_region_register);
>>  /**
>>   * fpga_region_unregister - unregister and free a FPGA region
>>   * @region: FPGA region
>> + *
>> + * Unregister the region device.  The class's dev_release will handle
>> + * freeing the region so don't call fpga_region_free() after calling
>> + * fpga_region_unregister().
>>   */
>>  void fpga_region_unregister(struct fpga_region *region)
>>  {
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe 

Re: [PATCH 1/2] fpga: Document when fpga_blah_free functions should be used

2018-08-14 Thread Alan Tull
On Thu, Jul 26, 2018 at 2:26 AM, Federico Vaga  wrote:
> Hi Alan,
>
> have you considered the possibility of having something like devm_fpga_[mgr|
> bridge|region]_[create|free]() ? Like this, it will be obvious that 'struct
> fpga_mgr' will be released automatically without reading any comment (but the
> comment is still good), and you use devm_*_free() only to handle error
> conditions.

Hi Federico,

Sorry for the late reply.  I was waiting until I had this all
implemented.  As you've seen, I've posted the devm_fpga_mgr_create
implementation [1].  I found I didn't need devm_*_free() for the error
conditions, the managed device code handled cleanup nicely without it.

Just to be clear, I'm dropping this patch in favor of devm support instead.

Thanks again for your suggestions.

Alan

[1] https://lkml.org/lkml/2018/8/14/954

>
> On Wednesday, July 25, 2018 8:15:13 PM CEST Alan Tull wrote:
>> Clarify when fpga_(mgr|bridge|region)_free functions should be used.
>> The class's dev_release will handle cleanup when the device is released
>> so once the mgr/brige/region has been successfully registered, it
>> would be a bug to call fpga_(mgr|bridge|region)_free.
>>
>> Signed-off-by: Alan Tull 
>> Suggested-by: Florian Fainelli 
>> Suggested-by: Federico Vaga 
>> ---
>>  drivers/fpga/fpga-bridge.c | 10 +-
>>  drivers/fpga/fpga-mgr.c| 10 +-
>>  drivers/fpga/fpga-region.c | 10 +-
>>  3 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/fpga/fpga-bridge.c b/drivers/fpga/fpga-bridge.c
>> index 24b8f98..528d2149 100644
>> --- a/drivers/fpga/fpga-bridge.c
>> +++ b/drivers/fpga/fpga-bridge.c
>> @@ -379,7 +379,11 @@ EXPORT_SYMBOL_GPL(fpga_bridge_create);
>>
>>  /**
>>   * fpga_bridge_free - free a fpga bridge and its id
>> - * @bridge:  FPGA bridge struct created by fpga_bridge_create
>> + * @bridge:  FPGA bridge struct created by fpga_bridge_create()
>> + *
>> + * Free a FPGA bridge.  This function should only be called for
>> + * freeing a bridge that has not been registered yet (such as in error
>> + * paths in a probe function).
>>   */
>>  void fpga_bridge_free(struct fpga_bridge *bridge)
>>  {
>> @@ -414,6 +418,10 @@ EXPORT_SYMBOL_GPL(fpga_bridge_register);
>>  /**
>>   * fpga_bridge_unregister - unregister and free a fpga bridge
>>   * @bridge:  FPGA bridge struct created by fpga_bridge_create
>> + *
>> + * Unregister the bridge device.  The class's dev_release will handle
>> + * freeing the bridge struct when the device is released so don't
>> + * call fpga_bridge_free() after calling fpga_bridge_unregister().
>>   */
>>  void fpga_bridge_unregister(struct fpga_bridge *bridge)
>>  {
>> diff --git a/drivers/fpga/fpga-mgr.c b/drivers/fpga/fpga-mgr.c
>> index a41b07e..9632cbd 100644
>> --- a/drivers/fpga/fpga-mgr.c
>> +++ b/drivers/fpga/fpga-mgr.c
>> @@ -619,7 +619,11 @@ EXPORT_SYMBOL_GPL(fpga_mgr_create);
>>
>>  /**
>>   * fpga_mgr_free - deallocate a FPGA manager
>> - * @mgr: fpga manager struct created by fpga_mgr_create
>> + * @mgr: fpga manager struct created by fpga_mgr_create()
>> + *
>> + * Free a FPGA manager struct.  This function should only be called
>> + * for freeing a manager that has not been registered yet (such as in
>> + * error paths in a probe function).
>>   */
>>  void fpga_mgr_free(struct fpga_manager *mgr)
>>  {
>> @@ -663,6 +667,10 @@ EXPORT_SYMBOL_GPL(fpga_mgr_register);
>>  /**
>>   * fpga_mgr_unregister - unregister and free a FPGA manager
>>   * @mgr: fpga manager struct
>> + *
>> + * Unregister the manager device.  The class's dev_release will handle
>> + * freeing the manager struct when the device is released so don't
>> + * call fpga_mgr_free() after calling fpga_mgr_unregister().
>>   */
>>  void fpga_mgr_unregister(struct fpga_manager *mgr)
>>  {
>> diff --git a/drivers/fpga/fpga-region.c b/drivers/fpga/fpga-region.c
>> index 0d65220..7335fa9 100644
>> --- a/drivers/fpga/fpga-region.c
>> +++ b/drivers/fpga/fpga-region.c
>> @@ -231,7 +231,11 @@ EXPORT_SYMBOL_GPL(fpga_region_create);
>>
>>  /**
>>   * fpga_region_free - free a struct fpga_region
>> - * @region: FPGA region created by fpga_region_create
>> + * @region:  FPGA region created by fpga_region_create()
>> + *
>> + * Free a FPGA region struct.  This function should only be called for
>> + * freeing a region that has not been registered yet (such as in error
>> + * paths in a probe function).
>>   */
>>  void fpga_region_free(struct fpga_region *region)
>>  {
>> @@ -255,6 +259,10 @@ EXPORT_SYMBOL_GPL(fpga_region_register);
>>  /**
>>   * fpga_region_unregister - unregister and free a FPGA region
>>   * @region: FPGA region
>> + *
>> + * Unregister the region device.  The class's dev_release will handle
>> + * freeing the region so don't call fpga_region_free() after calling
>> + * fpga_region_unregister().
>>   */
>>  void fpga_region_unregister(struct fpga_region *region)
>>  {
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe 

Re: [PATCH 1/4] regulator: core: If consumers don't call regulator_set_load() assume max

2018-08-14 Thread Doug Anderson
Hi,

On Tue, Aug 14, 2018 at 2:59 PM, David Collins  wrote:
> Hi,
>
> On 08/14/2018 01:03 PM, Doug Anderson wrote:
>> On Tue, Aug 14, 2018 at 11:30 AM, David Collins  
>> wrote:>>> --- a/drivers/regulator/core.c
 +++ b/drivers/regulator/core.c
 @@ -732,6 +732,7 @@ static int drms_uA_update(struct regulator_dev *rdev)
   struct regulator *sibling;
   int current_uA = 0, output_uV, input_uV, err;
   unsigned int mode;
 + bool any_unset = false;

   lockdep_assert_held_once(>mutex);

 @@ -751,11 +752,17 @@ static int drms_uA_update(struct regulator_dev *rdev)
   return -EINVAL;

   /* calc total requested load */
 - list_for_each_entry(sibling, >consumer_list, list)
 + list_for_each_entry(sibling, >consumer_list, list) {
   current_uA += sibling->uA_load;
 + if (!sibling->uA_load_set)
 + any_unset = true;
 + }

   current_uA += rdev->constraints->system_load;

 + if (any_unset)
 + current_uA = INT_MAX;
 +
>>>
>>> This check will incorrectly result in a constant load request of INT_MAX
>>> for all regulators that have at least one child regulator.  This is the
>>> case because such child regulators are present in rdev->consumer_list and
>>> because regulator_set_load() requests are not propagated up to parent
>>> regulators.  Thus, the regulator structs for child regulators will always
>>> have uA_load==0 and uA_load_set==false.
>>
>> Ah, interesting.
>>
>> ...but doesn't this same bug exist anyway, just in the opposite
>> direction?  Without my patch we'll try to request a 0 mA load in this
>> case which seems just as wrong (or perhaps worse?).  I guess on RPMh
>> regulator you're "saved" because the RPMh hardware itself knows the
>> parent/child relationship and knows to ignore this 0 mA load, but it's
>> still a bug in the overall Linux framework...
>
> I'm not sure what existing bug you are referring to here.  Where is a 0 mA
> load being requested?  Could you list the consumer function calls and the
> behavior on the framework side that you're envisioning?

Imagine you've got a tree like this:

- regulatorParent (1.8 V)
  - regulatorChildA (1.7 V)
  - regulatorChildB (1.2 V)
  - regulatorChildC (1.5 V)

...and this set of calls:

regulator_set_load(regulatorChildA, 1000);
regulator_set_load(regulatorChildB, 2000);
regulator_set_load(regulatorChildC, 3000);

regulator_enable(regulatorChildA);
regulator_enable(regulatorChildB);
regulator_enable(regulatorChildC);


With the existing code in Mark's tree then ChildA / ChildB / ChildC
will presumably have a load high enough to be in high power mode.
However, as you said, loads aren't propagated.  ...so "Parent" will
see no load requested (which translates to a load request of 0 mA).

The "bug" is that when you did the
"regulator_enable(regulatorChildA);" then that will propagate up and
cause a "regulator_enable(regulatorParent)".  In _regulator_enable()
you can see drms_uA_update().  That will cause it to set the mode of
regulatorParent based on a load of 0 mA.  That is the bug.  You may
respond "but REGULATOR_CHANGE_DRMS isn't set for the parent!  See
below and for now imagine that REGULATOR_CHANGE_DRMS _is_ set for the
parent.

With my code in the same situation the parent will end up with a load
of "MAX_INT" mA which I think is the bug you pointed out.  ...while it
would be good to fix this it does seem to be better to set the parent
to a mode based on MAX_INT mA rather than 0 mA?


> The RPMh hardware never sees requests with units of mA (though its
> predecessor RPM SMD did).  Instead, mode requests sent to RPMh are raw
> PMIC MODE_CTL register values.  RPMh then applies max aggregation of these
> register values across masters (APPS, modem, etc).  Also note that the
> RPMh knowledge of parent/child relationships is only utilized in enable
> control and voltage headroom management.  It does not apply to mode control.

Yeah, I know RPMh only sees modes.  I was sorta assuming that perhaps
RPMh would use its parent/child relationship knowledge to say that if
a child is in HPM then the parent should automatically go to HPM.

...but as I dug further I figured out why we're not running into this
bug (at least with the regulator setup I have in my tree).  All of the
"parent" regulators are always configured such that
"REGULATOR_CHANGE_DRMS" is not valid.  Thus we never end up calling
into drms_uA_update() and never update the mode.


So tl;dr: your original comment was that my patch had problems
whenever one regulator is the supply for another regulator if parent
has REGULATOR_CHANGE_DRMS set.  I believe I have shown that even
without my patch we have problems in this exact same case.


>> I have no idea how we ought to propagate regulator_set_load() to
>> parents though.  That seems like a tough thing to try to handle
>> automagically.
>
> I attempted 

Re: [PATCH 1/4] regulator: core: If consumers don't call regulator_set_load() assume max

2018-08-14 Thread Doug Anderson
Hi,

On Tue, Aug 14, 2018 at 2:59 PM, David Collins  wrote:
> Hi,
>
> On 08/14/2018 01:03 PM, Doug Anderson wrote:
>> On Tue, Aug 14, 2018 at 11:30 AM, David Collins  
>> wrote:>>> --- a/drivers/regulator/core.c
 +++ b/drivers/regulator/core.c
 @@ -732,6 +732,7 @@ static int drms_uA_update(struct regulator_dev *rdev)
   struct regulator *sibling;
   int current_uA = 0, output_uV, input_uV, err;
   unsigned int mode;
 + bool any_unset = false;

   lockdep_assert_held_once(>mutex);

 @@ -751,11 +752,17 @@ static int drms_uA_update(struct regulator_dev *rdev)
   return -EINVAL;

   /* calc total requested load */
 - list_for_each_entry(sibling, >consumer_list, list)
 + list_for_each_entry(sibling, >consumer_list, list) {
   current_uA += sibling->uA_load;
 + if (!sibling->uA_load_set)
 + any_unset = true;
 + }

   current_uA += rdev->constraints->system_load;

 + if (any_unset)
 + current_uA = INT_MAX;
 +
>>>
>>> This check will incorrectly result in a constant load request of INT_MAX
>>> for all regulators that have at least one child regulator.  This is the
>>> case because such child regulators are present in rdev->consumer_list and
>>> because regulator_set_load() requests are not propagated up to parent
>>> regulators.  Thus, the regulator structs for child regulators will always
>>> have uA_load==0 and uA_load_set==false.
>>
>> Ah, interesting.
>>
>> ...but doesn't this same bug exist anyway, just in the opposite
>> direction?  Without my patch we'll try to request a 0 mA load in this
>> case which seems just as wrong (or perhaps worse?).  I guess on RPMh
>> regulator you're "saved" because the RPMh hardware itself knows the
>> parent/child relationship and knows to ignore this 0 mA load, but it's
>> still a bug in the overall Linux framework...
>
> I'm not sure what existing bug you are referring to here.  Where is a 0 mA
> load being requested?  Could you list the consumer function calls and the
> behavior on the framework side that you're envisioning?

Imagine you've got a tree like this:

- regulatorParent (1.8 V)
  - regulatorChildA (1.7 V)
  - regulatorChildB (1.2 V)
  - regulatorChildC (1.5 V)

...and this set of calls:

regulator_set_load(regulatorChildA, 1000);
regulator_set_load(regulatorChildB, 2000);
regulator_set_load(regulatorChildC, 3000);

regulator_enable(regulatorChildA);
regulator_enable(regulatorChildB);
regulator_enable(regulatorChildC);


With the existing code in Mark's tree then ChildA / ChildB / ChildC
will presumably have a load high enough to be in high power mode.
However, as you said, loads aren't propagated.  ...so "Parent" will
see no load requested (which translates to a load request of 0 mA).

The "bug" is that when you did the
"regulator_enable(regulatorChildA);" then that will propagate up and
cause a "regulator_enable(regulatorParent)".  In _regulator_enable()
you can see drms_uA_update().  That will cause it to set the mode of
regulatorParent based on a load of 0 mA.  That is the bug.  You may
respond "but REGULATOR_CHANGE_DRMS isn't set for the parent!  See
below and for now imagine that REGULATOR_CHANGE_DRMS _is_ set for the
parent.

With my code in the same situation the parent will end up with a load
of "MAX_INT" mA which I think is the bug you pointed out.  ...while it
would be good to fix this it does seem to be better to set the parent
to a mode based on MAX_INT mA rather than 0 mA?


> The RPMh hardware never sees requests with units of mA (though its
> predecessor RPM SMD did).  Instead, mode requests sent to RPMh are raw
> PMIC MODE_CTL register values.  RPMh then applies max aggregation of these
> register values across masters (APPS, modem, etc).  Also note that the
> RPMh knowledge of parent/child relationships is only utilized in enable
> control and voltage headroom management.  It does not apply to mode control.

Yeah, I know RPMh only sees modes.  I was sorta assuming that perhaps
RPMh would use its parent/child relationship knowledge to say that if
a child is in HPM then the parent should automatically go to HPM.

...but as I dug further I figured out why we're not running into this
bug (at least with the regulator setup I have in my tree).  All of the
"parent" regulators are always configured such that
"REGULATOR_CHANGE_DRMS" is not valid.  Thus we never end up calling
into drms_uA_update() and never update the mode.


So tl;dr: your original comment was that my patch had problems
whenever one regulator is the supply for another regulator if parent
has REGULATOR_CHANGE_DRMS set.  I believe I have shown that even
without my patch we have problems in this exact same case.


>> I have no idea how we ought to propagate regulator_set_load() to
>> parents though.  That seems like a tough thing to try to handle
>> automagically.
>
> I attempted 

Re: [PATCH] Bugfix for handling of shadow doorbell buffer.

2018-08-14 Thread Keith Busch
On Tue, Aug 14, 2018 at 04:16:41PM -0700, Linus Torvalds wrote:
> On Tue, Aug 14, 2018 at 3:56 PM Keith Busch  
> wrote:
> >
> > You just want to ensure the '*dbbuf_db = value' isn't reordered, right?
> > The order dependency might be more obvious if done as:
> >
> > WRITE_ONCE(*dbbuf_db, value);
> >
> > if (!nvme_dbbuf_need_event(READ_ONCE(*dbbuf_ei), value, old_value))
> > return false;
> >
> > And 'volatile' is again redundant.
> 
> Yes, using READ_ONCE/WRITE_ONCE obviates the need for volatile, but it
> does *not* impose a memory ordering.
> 
> It imposes an ordering on the compiler, but not on the CPU, so you
> still want the "mb()" there

I mistakenly recalled memory-barriers.txt mentioned order was enforced
on the CPU, but that's true only for overlapping memory, which this is
not. Thanks for the correction.


Re: [PATCH] Bugfix for handling of shadow doorbell buffer.

2018-08-14 Thread Keith Busch
On Tue, Aug 14, 2018 at 04:16:41PM -0700, Linus Torvalds wrote:
> On Tue, Aug 14, 2018 at 3:56 PM Keith Busch  
> wrote:
> >
> > You just want to ensure the '*dbbuf_db = value' isn't reordered, right?
> > The order dependency might be more obvious if done as:
> >
> > WRITE_ONCE(*dbbuf_db, value);
> >
> > if (!nvme_dbbuf_need_event(READ_ONCE(*dbbuf_ei), value, old_value))
> > return false;
> >
> > And 'volatile' is again redundant.
> 
> Yes, using READ_ONCE/WRITE_ONCE obviates the need for volatile, but it
> does *not* impose a memory ordering.
> 
> It imposes an ordering on the compiler, but not on the CPU, so you
> still want the "mb()" there

I mistakenly recalled memory-barriers.txt mentioned order was enforced
on the CPU, but that's true only for overlapping memory, which this is
not. Thanks for the correction.


Re: [PATCH] zram: fix bug storing backing_dev

2018-08-14 Thread Andrew Morton
On Mon, 13 Aug 2018 16:38:25 +0900 Sergey Senozhatsky 
 wrote:

> On (08/13/18 15:16), Minchan Kim wrote:
> > > The call to strlcpy in backing_dev_store is incorrect. It should take
> > > the size of the destination buffer instead of the size of the source
> > > buffer.  Additionally, ignore the newline character (\n) when reading
> > > the new file_name buffer. This makes it possible to set the backing_dev
> > > as follows:
> > > 
> > >   echo /dev/sdX > /sys/block/zram0/backing_dev
> > > 
> > > Signed-off-by: Peter Kalauskas 
> > Acked-by: Minchan Kim 
> > 
> > Cc: Andrew Morton 
> > Cc: Sergey Senozhatsky 
> > CC: LKML 
> > Cc: [4.14+]
> 
> Thanks for Cc-ing Minchan.
> 
> Reviewed-by: Sergey Senozhatsky 
> 
> > > - strlcpy(file_name, buf, len);
> 
> This is quite interesting. The reason it worked before was the fact that
> strlcpy() copies 'len - 1' bytes, which is strlen(buf) - 1 in our case,
> so it accidentally didn't copy the trailing new line symbol. Which also
> means that "echo -n /dev/sdX" most likely was broken.
> 

I can't find the original email on lkml for some reason, but I
recreated the patch.

The changelog doesn't describe the end-user impact of the bug, which is
very desirable when tagging a patch for -stable backporting.  Can we
have that paragraph please?

The implementation might be able to use strim() somehow.


Re: [PATCH] zram: fix bug storing backing_dev

2018-08-14 Thread Andrew Morton
On Mon, 13 Aug 2018 16:38:25 +0900 Sergey Senozhatsky 
 wrote:

> On (08/13/18 15:16), Minchan Kim wrote:
> > > The call to strlcpy in backing_dev_store is incorrect. It should take
> > > the size of the destination buffer instead of the size of the source
> > > buffer.  Additionally, ignore the newline character (\n) when reading
> > > the new file_name buffer. This makes it possible to set the backing_dev
> > > as follows:
> > > 
> > >   echo /dev/sdX > /sys/block/zram0/backing_dev
> > > 
> > > Signed-off-by: Peter Kalauskas 
> > Acked-by: Minchan Kim 
> > 
> > Cc: Andrew Morton 
> > Cc: Sergey Senozhatsky 
> > CC: LKML 
> > Cc: [4.14+]
> 
> Thanks for Cc-ing Minchan.
> 
> Reviewed-by: Sergey Senozhatsky 
> 
> > > - strlcpy(file_name, buf, len);
> 
> This is quite interesting. The reason it worked before was the fact that
> strlcpy() copies 'len - 1' bytes, which is strlen(buf) - 1 in our case,
> so it accidentally didn't copy the trailing new line symbol. Which also
> means that "echo -n /dev/sdX" most likely was broken.
> 

I can't find the original email on lkml for some reason, but I
recreated the patch.

The changelog doesn't describe the end-user impact of the bug, which is
very desirable when tagging a patch for -stable backporting.  Can we
have that paragraph please?

The implementation might be able to use strim() somehow.


[PATCH] gpio: brcmstb: allow 0 width GPIO banks

2018-08-14 Thread justinpopo6
From: Justin Chen 

Sometimes we have empty banks within the GPIO block. This commit allows
proper handling of 0 width GPIO banks. We handle 0 width GPIO banks by
incrementing the bank and number of GPIOs, but not initializing them.
This will mean a call into the non-existent GPIOs will return an error.

Also remove banks and GPIO information from the dev_info print. This
information is misleading since the incremented banks and gpio_base do
not reflect the actual GPIOs that got initialized. We leave this
information out since it is already printed with dev_dbg.

Signed-off-by: Justin Chen 
---
 drivers/gpio/gpio-brcmstb.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c
index 16c7f9f..8658910 100644
--- a/drivers/gpio/gpio-brcmstb.c
+++ b/drivers/gpio/gpio-brcmstb.c
@@ -664,6 +664,18 @@ static int brcmstb_gpio_probe(struct platform_device *pdev)
struct brcmstb_gpio_bank *bank;
struct gpio_chip *gc;
 
+   /*
+* If bank_width is 0, then there is an empty bank in the
+* register block. Special handling for this case.
+*/
+   if (bank_width == 0) {
+   dev_dbg(dev, "Width 0 found: Empty bank @ %d\n",
+   num_banks);
+   num_banks++;
+   gpio_base += MAX_GPIO_PER_BANK;
+   continue;
+   }
+
bank = devm_kzalloc(dev, sizeof(*bank), GFP_KERNEL);
if (!bank) {
err = -ENOMEM;
@@ -740,8 +752,7 @@ static int brcmstb_gpio_probe(struct platform_device *pdev)
goto fail;
}
 
-   dev_info(dev, "Registered %d banks (GPIO(s): %d-%d)\n",
-   num_banks, priv->gpio_base, gpio_base - 1);
+   dev_info(dev, "Brcmstb GPIO registered\n");
 
if (priv->parent_wake_irq && need_wakeup_event)
pm_wakeup_event(dev, 0);
-- 
2.7.4



[PATCH] gpio: brcmstb: allow 0 width GPIO banks

2018-08-14 Thread justinpopo6
From: Justin Chen 

Sometimes we have empty banks within the GPIO block. This commit allows
proper handling of 0 width GPIO banks. We handle 0 width GPIO banks by
incrementing the bank and number of GPIOs, but not initializing them.
This will mean a call into the non-existent GPIOs will return an error.

Also remove banks and GPIO information from the dev_info print. This
information is misleading since the incremented banks and gpio_base do
not reflect the actual GPIOs that got initialized. We leave this
information out since it is already printed with dev_dbg.

Signed-off-by: Justin Chen 
---
 drivers/gpio/gpio-brcmstb.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c
index 16c7f9f..8658910 100644
--- a/drivers/gpio/gpio-brcmstb.c
+++ b/drivers/gpio/gpio-brcmstb.c
@@ -664,6 +664,18 @@ static int brcmstb_gpio_probe(struct platform_device *pdev)
struct brcmstb_gpio_bank *bank;
struct gpio_chip *gc;
 
+   /*
+* If bank_width is 0, then there is an empty bank in the
+* register block. Special handling for this case.
+*/
+   if (bank_width == 0) {
+   dev_dbg(dev, "Width 0 found: Empty bank @ %d\n",
+   num_banks);
+   num_banks++;
+   gpio_base += MAX_GPIO_PER_BANK;
+   continue;
+   }
+
bank = devm_kzalloc(dev, sizeof(*bank), GFP_KERNEL);
if (!bank) {
err = -ENOMEM;
@@ -740,8 +752,7 @@ static int brcmstb_gpio_probe(struct platform_device *pdev)
goto fail;
}
 
-   dev_info(dev, "Registered %d banks (GPIO(s): %d-%d)\n",
-   num_banks, priv->gpio_base, gpio_base - 1);
+   dev_info(dev, "Brcmstb GPIO registered\n");
 
if (priv->parent_wake_irq && need_wakeup_event)
pm_wakeup_event(dev, 0);
-- 
2.7.4



  1   2   3   4   5   6   7   8   9   10   >