[PATCH -next] powerpc/book3s64: fix link error with CONFIG_PPC_RADIX_MMU=n

2020-09-05 Thread Yang Yingliang
Fix link error when CONFIG_PPC_RADIX_MMU is disabled:
powerpc64-linux-gnu-ld: arch/powerpc/platforms/pseries/lpar.o:(.toc+0x0): 
undefined reference to `mmu_pid_bits'

Reported-by: Hulk Robot 
Signed-off-by: Yang Yingliang 
---
 arch/powerpc/mm/book3s64/mmu_context.c | 4 ++++
 arch/powerpc/platforms/pseries/lpar.c  | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/arch/powerpc/mm/book3s64/mmu_context.c 
b/arch/powerpc/mm/book3s64/mmu_context.c
index 0ba30b8b935b..a8e292cd88f0 100644
--- a/arch/powerpc/mm/book3s64/mmu_context.c
+++ b/arch/powerpc/mm/book3s64/mmu_context.c
@@ -152,6 +152,7 @@ void hash__setup_new_exec(void)
 
 static int radix__init_new_context(struct mm_struct *mm)
 {
+#ifdef CONFIG_PPC_RADIX_MMU
unsigned long rts_field;
int index, max_id;
 
@@ -177,6 +178,9 @@ static int radix__init_new_context(struct mm_struct *mm)
mm->context.hash_context = NULL;
 
return index;
+#else
+   return -ENOTSUPP;
+#endif
 }
 
 int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
diff --git a/arch/powerpc/platforms/pseries/lpar.c 
b/arch/powerpc/platforms/pseries/lpar.c
index baf24eacd268..e454e218dbba 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -1726,10 +1726,12 @@ void __init hpte_init_pseries(void)
 
 void radix_init_pseries(void)
 {
+#ifdef CONFIG_PPC_RADIX_MMU
pr_info("Using radix MMU under hypervisor\n");
 
pseries_lpar_register_process_table(__pa(process_tb),
0, PRTB_SIZE_SHIFT - 12);
+#endif
 }
 
 #ifdef CONFIG_PPC_SMLPAR
-- 
2.25.1
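
The pattern used in this patch (the function always exists so callers still link, but its body only references feature-specific symbols when the feature is built in) can be sketched outside the kernel roughly as follows. This is an illustration under stated assumptions, not the kernel code: `SKETCH_HAVE_RADIX`, `sketch_pid_bits` and `sketch_init_context` are hypothetical names, and `SKETCH_ENOTSUPP` is a local stand-in for the kernel-internal `ENOTSUPP` value (524).

```c
/* Local stand-in for the kernel-internal ENOTSUPP value (524). */
#define SKETCH_ENOTSUPP 524

/*
 * Hypothetical feature switch standing in for CONFIG_PPC_RADIX_MMU.
 * It is left undefined here, i.e. the feature is "disabled".
 */
/* #define SKETCH_HAVE_RADIX 1 */

#ifdef SKETCH_HAVE_RADIX
/* Only defined when the feature is built in, like mmu_pid_bits. */
static unsigned int sketch_pid_bits = 20;
#endif

/*
 * The function is always compiled, so every caller links. The body
 * only touches the feature-specific symbol when the feature is on;
 * otherwise it reports "not supported".
 */
int sketch_init_context(void)
{
#ifdef SKETCH_HAVE_RADIX
	return (int)((1u << sketch_pid_bits) - 1);
#else
	return -SKETCH_ENOTSUPP;
#endif
}
```

With the switch undefined, `sketch_init_context()` returns `-524` and no reference to `sketch_pid_bits` is emitted, which is what removes the undefined-reference link error. A common alternative is a `static inline` stub in a header under the same `#ifdef`, which keeps the conditional out of the function body.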



[PATCH -next] powerpc/eeh: fix compile warning with CONFIG_PROC_FS=n

2020-09-05 Thread Yang Yingliang
Fix the compile warning:

arch/powerpc/kernel/eeh.c:1639:12: error: 'proc_eeh_show' defined but not used 
[-Werror=unused-function]
 static int proc_eeh_show(struct seq_file *m, void *v)

Reported-by: Hulk Robot 
Signed-off-by: Yang Yingliang 
---
 arch/powerpc/kernel/eeh.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 94682382fc8c..420c3c25c6e7 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1636,6 +1636,7 @@ int eeh_pe_inject_err(struct eeh_pe *pe, int type, int 
func,
 }
 EXPORT_SYMBOL_GPL(eeh_pe_inject_err);
 
+#ifdef CONFIG_PROC_FS
 static int proc_eeh_show(struct seq_file *m, void *v)
 {
if (!eeh_enabled()) {
@@ -1662,6 +1663,7 @@ static int proc_eeh_show(struct seq_file *m, void *v)
 
return 0;
 }
+#endif
 
 #ifdef CONFIG_DEBUG_FS
 static int eeh_enable_dbgfs_set(void *data, u64 val)
-- 
2.25.1
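
A minimal sketch of why the guard silences the warning: a `static` function whose only caller is compiled out becomes "defined but not used", so the definition is placed under the same condition as its caller. The names `SKETCH_HAVE_PROC`, `sketch_show` and `sketch_init` are hypothetical and only stand in for `CONFIG_PROC_FS`, `proc_eeh_show` and its registration site.

```c
#include <stdio.h>

/*
 * Hypothetical feature switch standing in for CONFIG_PROC_FS.
 * It is left undefined here, i.e. the feature is "disabled".
 */
/* #define SKETCH_HAVE_PROC 1 */

#ifdef SKETCH_HAVE_PROC
/* Only compiled when its single caller below is compiled too. */
static int sketch_show(char *buf, size_t len)
{
	return snprintf(buf, len, "enabled\n");
}
#endif

/* Always present; uses the helper only when the feature is on. */
int sketch_init(char *buf, size_t len)
{
#ifdef SKETCH_HAVE_PROC
	return sketch_show(buf, len);
#else
	(void)buf;
	(void)len;
	return 0;
#endif
}
```

Another option, assuming it suits the maintainers, would be to annotate the function with the kernel's `__maybe_unused` attribute instead of adding an `#ifdef`; that keeps the code compiled in every configuration at the cost of leaving dead code in the object file.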



[PATCH v3 4/5] powerpc: apm82181: add Meraki MR24 AP

2020-09-05 Thread Christian Lamparter
This patch adds the device-tree definitions for Meraki MR24
access point devices.

Board: MR24 - Meraki MR24 Cloud Managed Access Point
CPU: APM82181 SoC 800 MHz (PLB=200 OPB=100 EBC=100)
Flash size: 32MiB
RAM Size: 128MiB
Wireless: Atheros AR9380 5.0GHz + Atheros AR9380 2.4GHz
EPHY: 1x Gigabit Atheros AR8035

Ready-to-go images and install instructions can be found @OpenWrt.

Signed-off-by: Chris Blake 
Signed-off-by: Christian Lamparter 
---
rfc v1 -> v2:
- use new led naming scheme
- space-vs-tab snafu cleanup
- remove led-aliases (openwrt specific)
- overhauled commit message
v2 -> v3:
- added interrupt properties for legacy PCI interrupt signalling
  to fix wifi
---
 arch/powerpc/boot/dts/meraki-mr24.dts  | 237 +
 arch/powerpc/platforms/44x/ppc44x_simple.c |   1 +
 2 files changed, 238 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/meraki-mr24.dts

diff --git a/arch/powerpc/boot/dts/meraki-mr24.dts 
b/arch/powerpc/boot/dts/meraki-mr24.dts
new file mode 100644
index 000000000000..f91c243e7678
--- /dev/null
+++ b/arch/powerpc/boot/dts/meraki-mr24.dts
@@ -0,0 +1,237 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Device Tree Source for Meraki MR24 (Ikarem)
+ *
+ * Copyright (C) 2016 Chris Blake 
+ *
+ * Based on Cisco Meraki GPL Release r23-20150601 MR24 DTS
+ */
+
+/dts-v1/;
+
+#include 
+#include "apm82181.dtsi"
+
+/ {
+   model = "Meraki MR24 Access Point";
+   compatible = "meraki,mr24";
+
+   aliases {
+   serial0 = 
+   };
+
+   chosen {
+   stdout-path = "/plb/opb/serial@ef600400";
+   };
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+
+   /* 32 MiB NAND Flash */
+   nand {
+   partition@0 {
+   label = "u-boot";
+   reg = <0x 0x0015>;
+   read-only;
+   };
+
+   partition@15 {
+   /*
+* The u-boot environment size is one NAND
+* block (16KiB). u-boot allocates four NAND
+* blocks (64KiB) in order to have spares
+* around for bad block management
+*/
+   label = "u-boot-env";
+   reg = <0x0015 0x0001>;
+   read-only;
+   };
+
+   partition@16 {
+   /*
+* redundant u-boot environment.
+* has to be kept in sync with the
+* data in "u-boot-env".
+*/
+   label = "u-boot-env-redundant";
+   reg = <0x0016 0x0001>;
+   read-only;
+   };
+
+   partition@17 {
+   label = "oops";
+   reg = <0x0017 0x0001>;
+   };
+
+   partition@18 {
+   label = "ubi";
+   reg = <0x0018 0x01e8>;
+   };
+   };
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+   /* Boot ROM is at 0x52-0x53, do not touch */
+   /* Unknown chip at 0x6e, not sure what it is */
+};
+
+ {
+   status = "okay";
+
+   phy-mode = "rgmii-id";
+   phy-map = <0x2>;
+   phy-address = <0x1>;
+   phy-handle = <>;
+
+   mdio {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   phy: phy@1 {
+   compatible = "ethernet-phy-ieee802.3-c22";
+   reg = <1>;
+   };
+   };
+};
+
+ {
+   leds {
+   compatible = "gpio-leds";
+
+   status: power-green {
+   function = LED_FUNCTION_POWER;
+   color = ;
+   gpios = < 18 GPIO_ACTIVE_LOW>;
+   };
+
+   failsafe: power-amber {
+   function = LED_FUNCTION_FAULT;
+   color = ;
+   gpios = < 19 GPIO_ACTIVE_LOW>;
+   };
+
+   lan {
+   function = LED_FUNCTION_WAN;
+   color = ;
+   gpios = < 17 GPIO_ACTIVE_LOW>;
+   };
+
+   /* signal strength indicator */
+   ssi-0 {
+   function = LED_FUNCTION_INDICATOR;
+   color = ;
+   gpios = < 23 GPIO_ACTIVE_LOW>;
+   };
+
+   ssi-1 {
+   function = LED_FUNCTION_INDICATOR;
+   color = ;
+   gpios = < 22 GPIO_ACTIVE_LOW>;
+   };
+
+   ssi-2 {
+   function 

[PATCH v3 5/5] powerpc: apm82181: integrate bluestone.dts

2020-09-05 Thread Christian Lamparter
This patch tries to integrate the existing bluestone.dts into the
apm82181.dtsi framework.

The original bluestone.dts produces a peculiar warning message.
> bluestone.dts:120.10-125.4: Warning (i2c_bus_reg):
>  /plb/opb/i2c@ef600700/sttm@4C: I2C bus unit address format error, expected 
> "4c"
For now, this has been kept as-is.

Signed-off-by: Christian Lamparter 
---
rfc -> v1:
- no changes
v2 -> v3:
- incorporated pcie@d node-name switch
---
 arch/powerpc/boot/dts/bluestone.dts | 458 +++-
 1 file changed, 104 insertions(+), 354 deletions(-)

diff --git a/arch/powerpc/boot/dts/bluestone.dts 
b/arch/powerpc/boot/dts/bluestone.dts
index aa1ae94cd776..b568fe7ae526 100644
--- a/arch/powerpc/boot/dts/bluestone.dts
+++ b/arch/powerpc/boot/dts/bluestone.dts
@@ -8,388 +8,138 @@
 
 /dts-v1/;
 
+#include "apm82181.dtsi"
+
 / {
-   #address-cells = <2>;
-   #size-cells = <1>;
model = "apm,bluestone";
compatible = "apm,bluestone";
-   dcr-parent = <&{/cpus/cpu@0}>;
 
aliases {
-   ethernet0 = 
serial0 = 
serial1 = 
};
+};
 
-   cpus {
-   #address-cells = <1>;
-   #size-cells = <0>;
-
-   cpu@0 {
-   device_type = "cpu";
-   model = "PowerPC,apm821xx";
-   reg = <0x>;
-   clock-frequency = <0>; /* Filled in by U-Boot */
-   timebase-frequency = <0>; /* Filled in by U-Boot */
-   i-cache-line-size = <32>;
-   d-cache-line-size = <32>;
-   i-cache-size = <32768>;
-   d-cache-size = <32768>;
-   dcr-controller;
-   dcr-access-method = "native";
-   next-level-cache = <>;
-   };
-   };
-
-   memory {
-   device_type = "memory";
-   reg = <0x 0x 0x>; /* Filled in by 
U-Boot */
-   };
-
-   UIC0: interrupt-controller0 {
-   compatible = "ibm,uic";
-   interrupt-controller;
-   cell-index = <0>;
-   dcr-reg = <0x0c0 0x009>;
-   #address-cells = <0>;
-   #size-cells = <0>;
-   #interrupt-cells = <2>;
-   };
-
-   UIC1: interrupt-controller1 {
-   compatible = "ibm,uic";
-   interrupt-controller;
-   cell-index = <1>;
-   dcr-reg = <0x0d0 0x009>;
-   #address-cells = <0>;
-   #size-cells = <0>;
-   #interrupt-cells = <2>;
-   interrupts = <0x1e 0x4 0x1f 0x4>; /* cascade */
-   interrupt-parent = <>;
-   };
+ {
+   status = "okay";
+};
 
-   UIC2: interrupt-controller2 {
-   compatible = "ibm,uic";
-   interrupt-controller;
-   cell-index = <2>;
-   dcr-reg = <0x0e0 0x009>;
-   #address-cells = <0>;
-   #size-cells = <0>;
-   #interrupt-cells = <2>;
-   interrupts = <0xa 0x4 0xb 0x4>; /* cascade */
-   interrupt-parent = <>;
-   };
+ {
+   status = "okay";
+};
 
-   UIC3: interrupt-controller3 {
-   compatible = "ibm,uic";
-   interrupt-controller;
-   cell-index = <3>;
-   dcr-reg = <0x0f0 0x009>;
-   #address-cells = <0>;
-   #size-cells = <0>;
-   #interrupt-cells = <2>;
-   interrupts = <0x10 0x4 0x11 0x4>; /* cascade */
-   interrupt-parent = <>;
-   };
+ {
+   status = "okay";
 
-   OCM: ocm@40004 {
-   compatible = "ibm,ocm";
-   status = "okay";
-   cell-index = <1>;
-   /* configured in U-Boot */
-   reg = <4 0x0004 0x8000>; /* 32K */
-   };
+   compatible = "amd,s29gl512n", "cfi-flash";
+   bank-width = <2>;
+   reg = <0x 0x 0x0040>;
 
-   SDR0: sdr {
-   compatible = "ibm,sdr-apm821xx";
-   dcr-reg = <0x00e 0x002>;
+   partition@0 {
+   label = "kernel";
+   reg = <0x 0x0018>;
};
-
-   CPR0: cpr {
-   compatible = "ibm,cpr-apm821xx";
-   dcr-reg = <0x00c 0x002>;
+   partition@18 {
+   label = "env";
+   reg = <0x0018 0x0002>;
};
-
-   L2C0: l2c {
-   compatible = "ibm,l2-cache-apm82181", "ibm,l2-cache";
-   dcr-reg = <0x020 0x008
-  0x030 0x008>;
-   cache-line-size = <32>;
-   cache-size = <262144>;
-   interrupt-parent = <>;
-   interrupts = <11 1>;
+   partition@1a {
+   label = "u-boot";
+   reg = 

[PATCH v3 3/5] powerpc: apm82181: add WD MyBook Live NAS

2020-09-05 Thread Christian Lamparter
This patch adds the device-tree definitions for
Western Digital MyBook Live NAS devices.

CPU: AMCC PowerPC  APM82181 (PVR=12c41c83) at 800 MHz
 (PLB=200, OPB=100, EBC=100 MHz)
 32 kB I-Cache 32 kB D-Cache, 256 kB L2-Cache, 32 kB OnChip Memory
DRAM:  256 MB (2x NT5TU64M16GG-AC)
FLASH: 512 kB
Ethernet: 1xRGMII - 1 Gbit - Broadcom PHY BCM54610
SATA: 2*SATA (DUO Variant) / 1*SATA (Single Variant)
USB: 1xUSB2.0 (Only DUO)

Technically, this devicetree file is shared by two very
similar devices.

There's the My Book Live and the My Book Live Duo. WD's u-boot
on the device will enable/disable the nodes for the device.
The device boots from u-boot on a 512 KiB NOR flash into a
Linux image stored on one of the hard drives.

Ready-to-go images and install instructions can be found @OpenWrt.org

Signed-off-by: Christian Lamparter 
---
rfc v1 -> v2:
- use new LED naming scheme
- dish out read-only; for essential NOR partitions
- remove openwrt led-aliases
- comment on the location of linux kernel (on the HDD)
- overhauled commit message
v2 -> v3:
- "jedec-probe" should be "jedec,spi-nor"
---
 arch/powerpc/boot/dts/wd-mybooklive.dts| 200 +
 arch/powerpc/platforms/44x/ppc44x_simple.c |   3 +-
 2 files changed, 202 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/boot/dts/wd-mybooklive.dts

diff --git a/arch/powerpc/boot/dts/wd-mybooklive.dts 
b/arch/powerpc/boot/dts/wd-mybooklive.dts
new file mode 100644
index 000000000000..8fe868252cb5
--- /dev/null
+++ b/arch/powerpc/boot/dts/wd-mybooklive.dts
@@ -0,0 +1,200 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright 2008 DENX Software Engineering, Stefan Roese 
+ * (c) Copyright 2010 Western Digital Technologies, Inc. All Rights Reserved.
+ */
+
+/dts-v1/;
+
+#include 
+#include "apm82181.dtsi"
+
+/ {
+   compatible = "wd,mybooklive";
+   model = "MyBook Live";
+
+   aliases {
+   serial0 = 
+   };
+};
+
+ {
+   GPIO1: gpio@e000 {
+   compatible = "wd,mbl-gpio";
+   reg-names = "dat";
+   reg = <0xe000 0x1>;
+   #gpio-cells = <2>;
+   gpio-controller;
+
+   enable-button {
+   /* Defined in u-boot as: NOT_NOR
+* "enables features other than NOR
+* specifically, the buffer at CS2"
+* (button).
+*
+* Note: This option is disabled as
+* it prevents the system from being
+* rebooted successfully.
+*/
+
+   gpio-hog;
+   line-name = "Enable Reset Button, disable NOR";
+   gpios = <1 GPIO_ACTIVE_HIGH>;
+   output-low;
+   };
+   };
+
+   GPIO2: gpio@e010 {
+   compatible = "wd,mbl-gpio";
+   reg-names = "dat";
+   reg = <0xe010 0x1>;
+   #gpio-cells = <2>;
+   gpio-controller;
+   no-output;
+   };
+
+   leds {
+   compatible = "gpio-leds";
+
+   /* There's just one tri-color LED. */
+   failsafe: power-red {
+   function = LED_FUNCTION_FAULT;
+   color = ;
+   gpios = < 4 GPIO_ACTIVE_HIGH>;
+   linux,default-trigger = "panic";
+   };
+
+   power-green {
+   function = LED_FUNCTION_POWER;
+   color = ;
+   gpios = < 5 GPIO_ACTIVE_HIGH>;
+   };
+
+   power-blue {
+   function = LED_FUNCTION_DISK;
+   color = ;
+   gpios = < 6 GPIO_ACTIVE_HIGH>;
+   linux,default-trigger = "disk-activity";
+   };
+   };
+
+   keys {
+   compatible = "gpio-keys-polled";
+   poll-interval = <60>;   /* 3 * 20 = 60ms */
+   autorepeat;
+
+   reset-button {
+   label = "Reset button";
+   linux,code = ;
+   gpios = < 2 GPIO_ACTIVE_LOW>;
+   };
+   };
+
+   usbpwr: usb-regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "Power USB Core";
+   gpios = < 2 GPIO_ACTIVE_LOW>;
+   regulator-min-microvolt = <500>;
+   regulator-max-microvolt = <500>;
+   };
+
+   sata1pwr: sata1-regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "Power Drive Port 1";
+   gpios = < 3 GPIO_ACTIVE_LOW>;
+   regulator-min-microvolt = <1200>;
+   regulator-max-microvolt = <1200>;
+   

[PATCH v3 2/5] powerpc: apm82181: create shared dtsi for APM bluestone

2020-09-05 Thread Christian Lamparter
This patch adds a DTSI file that can be used by various device-tree
files for APM82181-based devices.

Some of the nodes (like UART, PCIE, SATA) are used by u-boot and
need to stick with the old naming conventions.
I've added comments wherever this was the case.

Signed-off-by: Chris Blake 
Signed-off-by: Christian Lamparter 
---
rfc v1 -> v2:
- removed PKA (this CryptoPU will need a driver)
- stick with compatibles, nodes, ... from either
  Bluestone (APM82181) or Canyonlands (PPC460EX).
- add labels for NAND and NOR to help with access.
v2 -> v3:
- nodename of pciex@d was changed to pcie@d..
  due to upstream patch.
- use simple-bus on the ebc, opb and plb nodes
---
 arch/powerpc/boot/dts/apm82181.dtsi | 466 
 1 file changed, 466 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/apm82181.dtsi

diff --git a/arch/powerpc/boot/dts/apm82181.dtsi 
b/arch/powerpc/boot/dts/apm82181.dtsi
new file mode 100644
index 000000000000..60283430978d
--- /dev/null
+++ b/arch/powerpc/boot/dts/apm82181.dtsi
@@ -0,0 +1,466 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Device Tree template include for various APM82181 boards.
+ *
+ * The SoC is an evolution of the PPC460EX predecessor.
+ * This is why dt-nodes from the canyonlands EBC, OPB, USB,
+ * DMA, SATA, EMAC, ... ended up in here.
+ *
+ * Copyright (c) 2010, Applied Micro Circuits Corporation
+ * Author: Tirumala R Marri ,
+ *Christian Lamparter ,
+ *Chris Blake 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+/ {
+   #address-cells = <2>;
+   #size-cells = <1>;
+   dcr-parent = <&{/cpus/cpu@0}>;
+
+   aliases {
+   ethernet0 =  /* needed for BSP u-boot */
+   };
+
+   cpus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   CPU0: cpu@0 {
+   device_type = "cpu";
+   model = "PowerPC,apm82181";
+   reg = <0x>;
+   clock-frequency = <0>; /* Filled in by U-Boot */
+   timebase-frequency = <0>; /* Filled in by U-Boot */
+   i-cache-line-size = <32>;
+   d-cache-line-size = <32>;
+   i-cache-size = <32768>;
+   d-cache-size = <32768>;
+   dcr-controller;
+   dcr-access-method = "native";
+   next-level-cache = <>;
+   };
+   };
+
+   memory {
+   device_type = "memory";
+   reg = <0x 0x 0x>; /* Filled in by 
U-Boot */
+   };
+
+   UIC0: interrupt-controller0 {
+   compatible = "apm,uic-apm82181", "ibm,uic";
+   interrupt-controller;
+   cell-index = <0>;
+   dcr-reg = <0x0c0 0x009>;
+   #address-cells = <0>;
+   #size-cells = <0>;
+   #interrupt-cells = <2>;
+   };
+
+   UIC1: interrupt-controller1 {
+   compatible = "apm,uic-apm82181", "ibm,uic";
+   interrupt-controller;
+   cell-index = <1>;
+   dcr-reg = <0x0d0 0x009>;
+   #address-cells = <0>;
+   #size-cells = <0>;
+   #interrupt-cells = <2>;
+   interrupts = <0x1e IRQ_TYPE_LEVEL_HIGH>,
+<0x1f IRQ_TYPE_LEVEL_HIGH>; /* cascade */
+   interrupt-parent = <>;
+   };
+
+   UIC2: interrupt-controller2 {
+   compatible = "apm,uic-apm82181", "ibm,uic";
+   interrupt-controller;
+   cell-index = <2>;
+   dcr-reg = <0x0e0 0x009>;
+   #address-cells = <0>;
+   #size-cells = <0>;
+   #interrupt-cells = <2>;
+   interrupts = <0x0a IRQ_TYPE_LEVEL_HIGH>,
+<0x0b IRQ_TYPE_LEVEL_HIGH>; /* cascade */
+   interrupt-parent = <>;
+   };
+
+   UIC3: interrupt-controller3 {
+   compatible = "apm,uic-apm82181","ibm,uic";
+   interrupt-controller;
+   cell-index = <3>;
+   dcr-reg = <0x0f0 0x009>;
+   #address-cells = <0>;
+   #size-cells = <0>;
+   #interrupt-cells = <2>;
+   interrupts = <0x10 IRQ_TYPE_LEVEL_HIGH>,
+<0x11 IRQ_TYPE_LEVEL_HIGH>; /* cascade */
+   interrupt-parent = <>;
+   };
+
+   OCM: ocm@40004 {
+   compatible = "ibm,ocm";
+   status = "okay";
+   cell-index = <1>;
+   /* configured in U-Boot */
+   reg = <4 0x0004 0x8000>; /* 32K */
+   };
+
+   SDR0: sdr {
+   compatible = "apm,sdr-apm821xx";
+   dcr-reg = <0x00e 0x002>;
+   };
+
+   CPR0: cpr {
+   

[PATCH v3 0/5] powerpc: apm82181: adding customer devices

2020-09-05 Thread Christian Lamparter
Hello,

I've been holding on to these devices' dts files for a while now.
But ever since the recent purge of the PPC405, I'm feeling
the urge to move forward.

The devices in question have been running with OpenWrt since
around 2016/2017. Back then it was Linux v4.4 and required
many out-of-tree patches (for WIFI, SATA, CRYPTO...) that
have since been integrated. So I think there's nothing else
in the way.

A patch that adds the Meraki vendor-prefix has been sent
separately, as there's also the Meraki MR32 that I'm working
on as well. Here's the link to the patch:


Now, I've looked around in the arch/powerpc for recent .dts
and device submissions to get an understanding of what is
required.
From the looks of it, it seems like every device gets a
skeleton defconfig and a CONFIG_$DEVICE symbol (like:
CONFIG_MERAKI_MR24, CONFIG_WD_MYBOOKLIVE).

Will this be the case? Or would it make sense to further
unite the Bluestone, MR24 and MBL under a common CONFIG_APM82181
and integrate the BLUESTONE device's defconfig into it as well?
(I've stumbled across the special machine compatible
handling of ppc in the Documentation/devicetree/usage-model.rst
already.)

Cheers,
Christian

Note:
If someone has a WD MyBook Live (DUO) and is interested in
giving it a spin with 5.8, I've made a
"build your own Debian system" sort of script that can be
found on GitHub:
(the only remaining patch hack is for debian's make-kpkg crossbuild)

Furthermore, the OpenWrt project currently has images for
the additional apm82181-based devices:
 Cisco Meraki MX60(W) - Needs DSA for the AR8327
 Netgear WNDAP620/WNDAP660 - (Could be next)
 Netgear WNDR4700 - Needs DSA for the AR8327

Note2: I do have a stash of extensive APM82181 related documentation.

-- 
2.28.0



[PATCH v3 1/5] dt-bindings: powerpc: define apm,apm82181 binding

2020-09-05 Thread Christian Lamparter
Make a binding for the various boards based on the
AppliedMicro/APM APM82181 SoC.

Signed-off-by: Christian Lamparter 
---
 .../bindings/powerpc/4xx/apm,apm82181.yaml| 29 +++
 1 file changed, 29 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/powerpc/4xx/apm,apm82181.yaml

diff --git a/Documentation/devicetree/bindings/powerpc/4xx/apm,apm82181.yaml 
b/Documentation/devicetree/bindings/powerpc/4xx/apm,apm82181.yaml
new file mode 100644
index 000000000000..03a3c02fe920
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/4xx/apm,apm82181.yaml
@@ -0,0 +1,29 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/powerpc/4xx/apm,apm82181.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: APM APM82181 device tree bindings
+
+description:
+  AppliedMicro APM82181 Wi-Fi/network SoCs based
+  on the PPC464-CPU architecture.
+
+maintainers:
+  - Christian Lamparter 
+
+properties:
+  $nodename:
+const: '/'
+  compatible:
+oneOf:
+  - description: APM82181 based boards
+items:
+  - enum:
+  - apm,bluestone
+  - meraki,mr24
+  - wd,mybooklive
+  - const: amcc,apm82181
+
+...
-- 
2.28.0



[RFC PATCH 12/12] powerpc/64s: power4 nap fixup in C

2020-09-05 Thread Nicholas Piggin
There is no need for this to be in asm; use the new interrupt entry wrapper.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h   | 14 
 arch/powerpc/include/asm/processor.h   |  1 +
 arch/powerpc/include/asm/thread_info.h |  6 
 arch/powerpc/kernel/exceptions-64s.S   | 45 --
 arch/powerpc/kernel/idle_book3s.S  |  4 +++
 5 files changed, 25 insertions(+), 45 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 3ae3d2f93b61..acfcc7d5779b 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -8,6 +8,16 @@
 #include 
 #include 
 
+static inline void nap_adjust_return(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC_970_NAP
+   if (test_thread_local_flags(_TLF_NAPPING)) {
+   clear_thread_local_flags(_TLF_NAPPING);
+   regs->nip = (unsigned long)power4_idle_nap_return;
+   }
+#endif
+}
+
 #ifdef CONFIG_PPC_BOOK3S_64
 static inline void interrupt_enter_prepare(struct pt_regs *regs)
 {
@@ -33,6 +43,8 @@ static inline void interrupt_async_enter_prepare(struct 
pt_regs *regs)
if (cpu_has_feature(CPU_FTR_CTRL) &&
!test_thread_local_flags(_TLF_RUNLATCH))
__ppc64_runlatch_on();
+
+   nap_adjust_return(regs);
 }
 
 #else /* CONFIG_PPC_BOOK3S_64 */
@@ -72,6 +84,8 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs 
*regs, struct inte
 
this_cpu_set_ftrace_enabled(0);
 
+   nap_adjust_return(regs);
+
nmi_enter();
 }
 
diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index ed0d633ab5aa..3da1dba91386 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -424,6 +424,7 @@ extern unsigned long isa300_idle_stop_mayloss(unsigned long 
psscr_val);
 extern unsigned long isa206_idle_insn_mayloss(unsigned long type);
 #ifdef CONFIG_PPC_970_NAP
 extern void power4_idle_nap(void);
+extern void power4_idle_nap_return(void);
 #endif
 
 extern unsigned long cpuidle_disable;
diff --git a/arch/powerpc/include/asm/thread_info.h 
b/arch/powerpc/include/asm/thread_info.h
index ca6c97025704..9b15f7edb0cb 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -156,6 +156,12 @@ void arch_setup_new_exec(void);
 
 #ifndef __ASSEMBLY__
 
+static inline void clear_thread_local_flags(unsigned int flags)
+{
+   struct thread_info *ti = current_thread_info();
+   ti->local_flags &= ~flags;
+}
+
 static inline bool test_thread_local_flags(unsigned int flags)
 {
struct thread_info *ti = current_thread_info();
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 227bad3a586d..1db6b3438c88 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -692,25 +692,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
ld  r1,GPR1(r1)
 .endm
 
-/*
- * When the idle code in power4_idle puts the CPU into NAP mode,
- * it has to do so in a loop, and relies on the external interrupt
- * and decrementer interrupt entry code to get it out of the loop.
- * It sets the _TLF_NAPPING bit in current_thread_info()->local_flags
- * to signal that it is in the loop and needs help to get out.
- */
-#ifdef CONFIG_PPC_970_NAP
-#define FINISH_NAP \
-BEGIN_FTR_SECTION  \
-   ld  r11, PACA_THREAD_INFO(r13); \
-   ld  r9,TI_LOCAL_FLAGS(r11); \
-   andi.   r10,r9,_TLF_NAPPING;\
-   bnel    power4_fixup_nap;   \
-END_FTR_SECTION_IFSET(CPU_FTR_CAN_NAP)
-#else
-#define FINISH_NAP
-#endif
-
 /*
  * There are a few constraints to be concerned with.
  * - Real mode exceptions code/data must be located at their physical location.
@@ -1250,7 +1231,6 @@ EXC_COMMON_BEGIN(machine_check_common)
 */
GEN_COMMON machine_check
 
-   FINISH_NAP
/* Enable MSR_RI when finished with PACA_EXMC */
li  r10,MSR_RI
mtmsrd  r10,1
@@ -1572,7 +1552,6 @@ EXC_VIRT_BEGIN(hardware_interrupt, 0x4500, 0x100)
 EXC_VIRT_END(hardware_interrupt, 0x4500, 0x100)
 EXC_COMMON_BEGIN(hardware_interrupt_common)
GEN_COMMON hardware_interrupt
-   FINISH_NAP
addi    r3,r1,STACK_FRAME_OVERHEAD
bl  do_IRQ
b   interrupt_return
@@ -1757,7 +1736,6 @@ EXC_VIRT_BEGIN(decrementer, 0x4900, 0x80)
 EXC_VIRT_END(decrementer, 0x4900, 0x80)
 EXC_COMMON_BEGIN(decrementer_common)
GEN_COMMON decrementer
-   FINISH_NAP
addi    r3,r1,STACK_FRAME_OVERHEAD
bl  timer_interrupt
b   interrupt_return
@@ -1842,7 +1820,6 @@ EXC_VIRT_BEGIN(doorbell_super, 0x4a00, 0x100)
 EXC_VIRT_END(doorbell_super, 0x4a00, 0x100)
 EXC_COMMON_BEGIN(doorbell_super_common)
GEN_COMMON doorbell_super
-   FINISH_NAP
addi
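
The C replacement for `FINISH_NAP` in this patch boils down to a test-and-clear of a thread-local flag followed by a rewrite of the saved return address. A small user-space model of that logic is sketched below; the struct layouts, the flag value and all `sketch_` names are hypothetical, and the return address is passed as a plain number rather than being the address of `power4_idle_nap_return`.

```c
#include <stdbool.h>

/* Hypothetical bit value; the real _TLF_NAPPING lives in thread_info.h. */
#define SKETCH_TLF_NAPPING 0x1u

struct sketch_thread {
	unsigned int local_flags;
};

struct sketch_regs {
	unsigned long nip;	/* saved next-instruction pointer */
};

/* Stands in for current_thread_info() in this single-threaded model. */
static struct sketch_thread sketch_current;

static bool sketch_test_flags(unsigned int flags)
{
	return (sketch_current.local_flags & flags) != 0;
}

static void sketch_clear_flags(unsigned int flags)
{
	sketch_current.local_flags &= ~flags;
}

/*
 * If the interrupt arrived while the CPU was in the nap loop, clear
 * the flag and make the interrupt return to the nap exit path
 * instead of back into the loop.
 */
void sketch_nap_adjust_return(struct sketch_regs *regs,
			      unsigned long nap_return)
{
	if (sketch_test_flags(SKETCH_TLF_NAPPING)) {
		sketch_clear_flags(SKETCH_TLF_NAPPING);
		regs->nip = nap_return;
	}
}
```

In this model a second call with the flag already cleared leaves `nip` untouched, which mirrors the one-shot behaviour of the asm macro being removed.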

[RFC PATCH 11/12] powerpc/64s: runlatch interrupt handling in C

2020-09-05 Thread Nicholas Piggin
There is no need for this to be in asm; use the new interrupt entry wrapper.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h | 16 ++--
 arch/powerpc/kernel/exceptions-64s.S | 18 --
 2 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index c26a7c466416..3ae3d2f93b61 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_PPC_BOOK3S_64
 static inline void interrupt_enter_prepare(struct pt_regs *regs)
@@ -25,10 +26,22 @@ static inline void interrupt_enter_prepare(struct pt_regs 
*regs)
}
 }
 
+static inline void interrupt_async_enter_prepare(struct pt_regs *regs)
+{
+   interrupt_enter_prepare(regs);
+
+   if (cpu_has_feature(CPU_FTR_CTRL) &&
+   !test_thread_local_flags(_TLF_RUNLATCH))
+   __ppc64_runlatch_on();
+}
+
 #else /* CONFIG_PPC_BOOK3S_64 */
 static inline void interrupt_enter_prepare(struct pt_regs *regs)
 {
 }
+static inline void interrupt_async_enter_prepare(struct pt_regs *regs)
+{
+}
 #endif /* CONFIG_PPC_BOOK3S_64 */
 
 struct interrupt_nmi_state {
@@ -76,7 +89,6 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs 
*regs, struct inter
 #endif
 }
 
-
 /**
  * DECLARE_INTERRUPT_HANDLER_RAW - Declare raw interrupt handler function
  * @func:  Function name of the entry point
@@ -193,7 +205,7 @@ static __always_inline void ___##func(struct pt_regs 
*regs);\
\
 __visible noinstr void func(struct pt_regs *regs)  \
 {  \
-   interrupt_enter_prepare(regs);  \
+   interrupt_async_enter_prepare(regs);\
\
___##func (regs);   \
 }  \
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 0949dd47be59..227bad3a586d 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -692,14 +692,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
ld  r1,GPR1(r1)
 .endm
 
-#define RUNLATCH_ON\
-BEGIN_FTR_SECTION  \
-   ld  r3, PACA_THREAD_INFO(r13);  \
-   ld  r4,TI_LOCAL_FLAGS(r3);  \
-   andi.   r0,r4,_TLF_RUNLATCH;\
-   beql    ppc64_runlatch_on_trampoline;   \
-END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
-
 /*
  * When the idle code in power4_idle puts the CPU into NAP mode,
  * it has to do so in a loop, and relies on the external interrupt
@@ -1581,7 +1573,6 @@ EXC_VIRT_END(hardware_interrupt, 0x4500, 0x100)
 EXC_COMMON_BEGIN(hardware_interrupt_common)
GEN_COMMON hardware_interrupt
FINISH_NAP
-   RUNLATCH_ON
addi    r3,r1,STACK_FRAME_OVERHEAD
bl  do_IRQ
b   interrupt_return
@@ -1767,7 +1758,6 @@ EXC_VIRT_END(decrementer, 0x4900, 0x80)
 EXC_COMMON_BEGIN(decrementer_common)
GEN_COMMON decrementer
FINISH_NAP
-   RUNLATCH_ON
addi    r3,r1,STACK_FRAME_OVERHEAD
bl  timer_interrupt
b   interrupt_return
@@ -1853,7 +1843,6 @@ EXC_VIRT_END(doorbell_super, 0x4a00, 0x100)
 EXC_COMMON_BEGIN(doorbell_super_common)
GEN_COMMON doorbell_super
FINISH_NAP
-   RUNLATCH_ON
addi    r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_DOORBELL
bl  doorbell_exception
@@ -2208,7 +2197,6 @@ EXC_COMMON_BEGIN(hmi_exception_early_common)
 EXC_COMMON_BEGIN(hmi_exception_common)
GEN_COMMON hmi_exception
FINISH_NAP
-   RUNLATCH_ON
addi    r3,r1,STACK_FRAME_OVERHEAD
bl  handle_hmi_exception
b   interrupt_return
@@ -2238,7 +2226,6 @@ EXC_VIRT_END(h_doorbell, 0x4e80, 0x20)
 EXC_COMMON_BEGIN(h_doorbell_common)
GEN_COMMON h_doorbell
FINISH_NAP
-   RUNLATCH_ON
addi    r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_DOORBELL
bl  doorbell_exception
@@ -2272,7 +2259,6 @@ EXC_VIRT_END(h_virt_irq, 0x4ea0, 0x20)
 EXC_COMMON_BEGIN(h_virt_irq_common)
GEN_COMMON h_virt_irq
FINISH_NAP
-   RUNLATCH_ON
addi    r3,r1,STACK_FRAME_OVERHEAD
bl  do_IRQ
b   interrupt_return
@@ -2319,7 +2305,6 @@ EXC_VIRT_END(performance_monitor, 0x4f00, 0x20)
 EXC_COMMON_BEGIN(performance_monitor_common)
GEN_COMMON performance_monitor
FINISH_NAP
-   RUNLATCH_ON
addi    r3,r1,STACK_FRAME_OVERHEAD
bl  performance_monitor_exception
b   interrupt_return
@@ 
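
The wrapper approach this patch relies on (a macro that emits the public entry point, runs the common prepare step, then calls the handler body) can be sketched in plain C as below. This is a simplified model, not the kernel's `DECLARE_INTERRUPT_HANDLER_ASYNC` family: every `sketch_`/`SKETCH_` name is hypothetical, and the prepare step only counts calls instead of touching the runlatch.

```c
struct sketch_regs {
	unsigned long nip;
};

/* Bookkeeping so the ordering of prepare and body can be observed. */
static int sketch_prepare_calls;
static int sketch_body_saw_prepare;

/* Stands in for interrupt_async_enter_prepare(). */
static void sketch_async_enter_prepare(struct sketch_regs *regs)
{
	(void)regs;
	sketch_prepare_calls++;
}

/*
 * Emit the public entry point `func`, which runs the common prepare
 * step and then the handler body `func##_body`. The body's braces
 * follow the macro invocation.
 */
#define SKETCH_DEFINE_ASYNC_HANDLER(func)			\
static void func##_body(struct sketch_regs *regs);		\
void func(struct sketch_regs *regs)				\
{								\
	sketch_async_enter_prepare(regs);			\
	func##_body(regs);					\
}								\
static void func##_body(struct sketch_regs *regs)

SKETCH_DEFINE_ASYNC_HANDLER(sketch_timer_interrupt)
{
	/* Record how many prepare calls had happened on entry. */
	sketch_body_saw_prepare = sketch_prepare_calls;
	regs->nip += 4;
}
```

Because the prepare call lives in the generated entry point, each handler gets it without a per-handler macro invocation in the asm, which is the effect of dropping `RUNLATCH_ON` from the individual exception bodies.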

[RFC PATCH 10/12] powerpc/64s: move NMI soft-mask handling to C

2020-09-05 Thread Nicholas Piggin
Saving and restoring soft-mask state can now be done in C using the
interrupt handler wrapper functions.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h | 25 
 arch/powerpc/kernel/exceptions-64s.S | 60 
 2 files changed, 25 insertions(+), 60 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 69eb8a432984..c26a7c466416 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -33,12 +33,30 @@ static inline void interrupt_enter_prepare(struct pt_regs 
*regs)
 
 struct interrupt_nmi_state {
 #ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S_64
+   u8 irq_soft_mask;
+   u8 irq_happened;
+#endif
u8 ftrace_enabled;
 #endif
 };
 
 static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct 
interrupt_nmi_state *state)
 {
+#ifdef CONFIG_PPC_BOOK3S_64
+   state->irq_soft_mask = local_paca->irq_soft_mask;
+   state->irq_happened = local_paca->irq_happened;
+   state->ftrace_enabled = this_cpu_get_ftrace_enabled();
+
+   /*
+* Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
+* the right thing, and set IRQ_HARD_DIS. We do not want to reconcile
+* because that goes through irq tracing which we don't want in NMI.
+*/
+   local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
+   local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+#endif
+
this_cpu_set_ftrace_enabled(0);
 
nmi_enter();
@@ -49,6 +67,13 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs 
*regs, struct inter
nmi_exit();
 
this_cpu_set_ftrace_enabled(state->ftrace_enabled);
+
+#ifdef CONFIG_PPC_BOOK3S_64
+   /* Check we didn't change the pending interrupt mask. */
+   WARN_ON_ONCE((state->irq_happened | PACA_IRQ_HARD_DIS) != 
local_paca->irq_happened);
+   local_paca->irq_happened = state->irq_happened;
+   local_paca->irq_soft_mask = state->irq_soft_mask;
+#endif
 }
 
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 121a55c87c02..0949dd47be59 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1010,20 +1010,6 @@ EXC_COMMON_BEGIN(system_reset_common)
ld  r1,PACA_NMI_EMERG_SP(r13)
subi r1,r1,INT_FRAME_SIZE
__GEN_COMMON_BODY system_reset
-   /*
-* Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
-* the right thing. We do not want to reconcile because that goes
-* through irq tracing which we don't want in NMI.
-*
-* Save PACAIRQHAPPENED to RESULT (otherwise unused), and set HARD_DIS
-* as we are running with MSR[EE]=0.
-*/
-   li  r10,IRQS_ALL_DISABLED
-   stb r10,PACAIRQSOFTMASK(r13)
-   lbz r10,PACAIRQHAPPENED(r13)
-   std r10,RESULT(r1)
-   ori r10,r10,PACA_IRQ_HARD_DIS
-   stb r10,PACAIRQHAPPENED(r13)
 
addi r3,r1,STACK_FRAME_OVERHEAD
bl  system_reset_exception
@@ -1039,14 +1025,6 @@ EXC_COMMON_BEGIN(system_reset_common)
subi r10,r10,1
sth r10,PACA_IN_NMI(r13)
 
-   /*
-* Restore soft mask settings.
-*/
-   ld  r10,RESULT(r1)
-   stb r10,PACAIRQHAPPENED(r13)
-   ld  r10,SOFTE(r1)
-   stb r10,PACAIRQSOFTMASK(r13)
-
kuap_restore_amr r9, r10
EXCEPTION_RESTORE_REGS
RFI_TO_USER_OR_KERNEL
@@ -1192,30 +1170,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
li  r10,MSR_RI
mtmsrd  r10,1
 
-   /*
-* Set IRQS_ALL_DISABLED and save PACAIRQHAPPENED (see
-* system_reset_common)
-*/
-   li  r10,IRQS_ALL_DISABLED
-   stb r10,PACAIRQSOFTMASK(r13)
-   lbz r10,PACAIRQHAPPENED(r13)
-   std r10,RESULT(r1)
-   ori r10,r10,PACA_IRQ_HARD_DIS
-   stb r10,PACAIRQHAPPENED(r13)
-
addi r3,r1,STACK_FRAME_OVERHEAD
bl  machine_check_early
std r3,RESULT(r1)   /* Save result */
ld  r12,_MSR(r1)
 
-   /*
-* Restore soft mask settings.
-*/
-   ld  r10,RESULT(r1)
-   stb r10,PACAIRQHAPPENED(r13)
-   ld  r10,SOFTE(r1)
-   stb r10,PACAIRQSOFTMASK(r13)
-
 #ifdef CONFIG_PPC_P7_NAP
/*
 * Check if thread was in power saving mode. We come here when any
@@ -2814,17 +2773,6 @@ EXC_COMMON_BEGIN(soft_nmi_common)
subi r1,r1,INT_FRAME_SIZE
__GEN_COMMON_BODY soft_nmi
 
-   /*
-* Set IRQS_ALL_DISABLED and save PACAIRQHAPPENED (see
-* system_reset_common)
-*/
-   li  r10,IRQS_ALL_DISABLED
-   stb r10,PACAIRQSOFTMASK(r13)
-   lbz r10,PACAIRQHAPPENED(r13)
-   std r10,RESULT(r1)
-   ori r10,r10,PACA_IRQ_HARD_DIS
-   stb r10,PACAIRQHAPPENED(r13)
-

[RFC PATCH 09/12] powerpc: move NMI entry/exit code into wrapper

2020-09-05 Thread Nicholas Piggin
This moves the common NMI entry and exit code into the interrupt handler
wrappers.

This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and
also MCE interrupts on 64e, by adding missing parts of the NMI entry to
them.
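
The entry/exit pairing described above can be modelled in user space as a
minimal sketch. Everything here is a hypothetical stand-in (the globals
cpu_ftrace_enabled and nmi_depth, the demo_nmi handler, and the simplified
helper signatures without pt_regs); it is not the kernel implementation. The
sketch saves the ftrace flag in the enter helper, which is where the 10/12
hunk in this thread places the save for Book3S-64.

```c
#include <assert.h>
#include <stdint.h>

/* Per-interrupt saved state, mirroring the shape of interrupt_nmi_state. */
struct interrupt_nmi_state {
	uint8_t ftrace_enabled;
};

/* Hypothetical models of per-CPU state. */
static uint8_t cpu_ftrace_enabled = 1;
static int nmi_depth;

static void nmi_enter(void) { nmi_depth++; }

static void nmi_exit(void)
{
	assert(nmi_depth > 0);	/* exit must pair with an earlier enter */
	nmi_depth--;
}

/* Save the state the handler will clobber, then enter NMI context. */
static void interrupt_nmi_enter_prepare(struct interrupt_nmi_state *state)
{
	state->ftrace_enabled = cpu_ftrace_enabled;
	cpu_ftrace_enabled = 0;
	nmi_enter();
}

/* Leave NMI context, then restore in the reverse order of entry. */
static void interrupt_nmi_exit_prepare(struct interrupt_nmi_state *state)
{
	nmi_exit();
	cpu_ftrace_enabled = state->ftrace_enabled;
}

static int seen_ftrace_in_handler = -1;
static int seen_depth_in_handler = -1;

/* Handler body: records what it observes while "inside" the NMI. */
static long demo_nmi_body(void)
{
	seen_ftrace_in_handler = cpu_ftrace_enabled;
	seen_depth_in_handler = nmi_depth;
	return 42;
}

/* Wrapper: what the NMI handler macro would generate around the body. */
static long demo_nmi(void)
{
	struct interrupt_nmi_state state;
	long ret;

	interrupt_nmi_enter_prepare(&state);
	ret = demo_nmi_body();
	interrupt_nmi_exit_prepare(&state);

	return ret;
}
```

Keeping the saved state in a stack variable of the wrapper lets nested NMIs
each restore their own snapshot.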

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h | 26 +++
 arch/powerpc/kernel/mce.c| 12 -
 arch/powerpc/kernel/traps.c  | 38 +---
 arch/powerpc/kernel/watchdog.c   | 10 +++-
 4 files changed, 37 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 83fe1d64cf23..69eb8a432984 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -31,6 +31,27 @@ static inline void interrupt_enter_prepare(struct pt_regs 
*regs)
 }
 #endif /* CONFIG_PPC_BOOK3S_64 */
 
+struct interrupt_nmi_state {
+#ifdef CONFIG_PPC64
+   u8 ftrace_enabled;
+#endif
+};
+
+static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct 
interrupt_nmi_state *state)
+{
+   this_cpu_set_ftrace_enabled(0);
+
+   nmi_enter();
+}
+
+static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct 
interrupt_nmi_state *state)
+{
+   nmi_exit();
+
+   this_cpu_set_ftrace_enabled(state->ftrace_enabled);
+}
+
+
 /**
  * DECLARE_INTERRUPT_HANDLER_RAW - Declare raw interrupt handler function
  * @func:  Function name of the entry point
@@ -177,10 +198,15 @@ static __always_inline long ___##func(struct pt_regs 
*regs);  \
\
 __visible noinstr long func(struct pt_regs *regs)  \
 {  \
+   struct interrupt_nmi_state state;   \
long ret;   \
\
+   interrupt_nmi_enter_prepare(regs, &state);  \
+   \
ret = ___##func (regs); \
\
+   interrupt_nmi_exit_prepare(regs, &state);   \
+   \
return ret; \
 }  \
\
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index d0bbcc4fe13c..9f39deed4fca 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -592,13 +592,6 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
 DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
 {
long handled = 0;
-   bool nested = in_nmi();
-   u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
-
-   this_cpu_set_ftrace_enabled(0);
-
-   if (!nested)
-   nmi_enter();
 
hv_nmi_check_nonrecoverable(regs);
 
@@ -608,11 +601,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
if (ppc_md.machine_check_early)
handled = ppc_md.machine_check_early(regs);
 
-   if (!nested)
-   nmi_exit();
-
-   this_cpu_set_ftrace_enabled(ftrace_enabled);
-
return handled;
 }
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 3784578db630..01ddbe5ed5a4 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -443,11 +443,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
 {
unsigned long hsrr0, hsrr1;
bool saved_hsrrs = false;
-   u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
-
-   this_cpu_set_ftrace_enabled(0);
-
-   nmi_enter();
 
/*
 * System reset can interrupt code where HSRRs are live and MSR[RI]=1.
@@ -519,10 +514,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
mtspr(SPRN_HSRR1, hsrr1);
}
 
-   nmi_exit();
-
-   this_cpu_set_ftrace_enabled(ftrace_enabled);
-
/* What should we do here? We could issue a shutdown or hard reset. */
 
return 0;
@@ -828,6 +819,12 @@ int machine_check_generic(struct pt_regs *regs)
 #endif /* everything else */
 
 
+/*
+ * BOOK3S_64 does not call this handler as a non-maskable interrupt
+ * (it uses its own early real-mode handler to handle the MCE proper
+ * and then raises irq_work to call this handler when interrupts are
+ * enabled).
+ */
 #ifdef CONFIG_PPC_BOOK3S_64
 DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
 #else
@@ -836,20 +833,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
 {
int recover = 0;
 
-   /*
-* BOOK3S_64 does not call this 

[RFC PATCH 08/12] powerpc/64: entry cpu time accounting in C

2020-09-05 Thread Nicholas Piggin
There is no need for this to be in asm; use the new interrupt entry wrapper.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h |  4 
 arch/powerpc/include/asm/ppc_asm.h   | 24 
 arch/powerpc/kernel/exceptions-64e.S |  1 -
 arch/powerpc/kernel/exceptions-64s.S |  5 -
 4 files changed, 4 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 511b3304722b..83fe1d64cf23 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #ifdef CONFIG_PPC_BOOK3S_64
@@ -16,6 +17,9 @@ static inline void interrupt_enter_prepare(struct pt_regs 
*regs)
if (user_mode(regs)) {
CT_WARN_ON(ct_state() == CONTEXT_KERNEL);
user_exit_irqoff();
+
+   account_cpu_user_entry();
+   account_stolen_time();
} else {
CT_WARN_ON(ct_state() == CONTEXT_USER);
}
diff --git a/arch/powerpc/include/asm/ppc_asm.h 
b/arch/powerpc/include/asm/ppc_asm.h
index b4cc6608131c..a363c9220ce3 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -25,7 +25,6 @@
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 #define ACCOUNT_CPU_USER_ENTRY(ptr, ra, rb)
 #define ACCOUNT_CPU_USER_EXIT(ptr, ra, rb)
-#define ACCOUNT_STOLEN_TIME
 #else
 #define ACCOUNT_CPU_USER_ENTRY(ptr, ra, rb)\
MFTB(ra);   /* get timebase */  \
@@ -44,29 +43,6 @@
PPC_LL  ra, ACCOUNT_SYSTEM_TIME(ptr);   \
add ra,ra,rb;   /* add on to system time */ \
PPC_STL ra, ACCOUNT_SYSTEM_TIME(ptr)
-
-#ifdef CONFIG_PPC_SPLPAR
-#define ACCOUNT_STOLEN_TIME\
-BEGIN_FW_FTR_SECTION;  \
-   beq 33f;\
-   /* from user - see if there are any DTL entries to process */   \
-   ld  r10,PACALPPACAPTR(r13); /* get ptr to VPA */\
-   ld  r11,PACA_DTL_RIDX(r13); /* get log read index */\
-   addi r10,r10,LPPACA_DTLIDX;  \
-   LDX_BE  r10,0,r10;  /* get log write index */   \
-   cmpd cr1,r11,r10;\
-   beq+ cr1,33f;\
-   bl  accumulate_stolen_time; \
-   ld  r12,_MSR(r1);   \
-   andi.   r10,r12,MSR_PR; /* Restore cr0 (coming from user) */ \
-33:\
-END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
-
-#else  /* CONFIG_PPC_SPLPAR */
-#define ACCOUNT_STOLEN_TIME
-
-#endif /* CONFIG_PPC_SPLPAR */
-
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 
 /*
diff --git a/arch/powerpc/kernel/exceptions-64e.S 
b/arch/powerpc/kernel/exceptions-64e.S
index 5988d61783b5..7a9db1a30e82 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -398,7 +398,6 @@ exc_##n##_common:   
\
std r10,_NIP(r1);   /* save SRR0 to stackframe */   \
std r11,_MSR(r1);   /* save SRR1 to stackframe */   \
beq 2f; /* if from kernel mode */   \
-   ACCOUNT_CPU_USER_ENTRY(r13,r10,r11);/* accounting (uses cr0+eq) */  \
 2: ld  r3,excf+EX_R10(r13);/* get back r10 */  \
ld  r4,excf+EX_R11(r13);/* get back r11 */  \
mfspr   r5,scratch; /* get back r13 */  \
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index b36247ad1f64..121a55c87c02 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -577,7 +577,6 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
kuap_save_amr_and_lock r9, r10, cr1, cr0
.endif
beq 101f/* if from kernel mode  */
-   ACCOUNT_CPU_USER_ENTRY(r13, r9, r10)
 BEGIN_FTR_SECTION
ld  r9,IAREA+EX_PPR(r13)/* Read PPR from paca   */
std r9,_PPR(r1)
@@ -645,10 +644,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
ld  r11,exception_marker@toc(r2)
std r10,RESULT(r1)  /* clear regs->result   */
std r11,STACK_FRAME_OVERHEAD-16(r1) /* mark the frame   */
-
-   .if ISTACK
-   ACCOUNT_STOLEN_TIME
-   .endif
 .endm
 
 /*
-- 
2.23.0



[RFC PATCH 07/12] powerpc/64: move account_stolen_time into its own function

2020-09-05 Thread Nicholas Piggin
This will be used by interrupt entry as well.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/cputime.h | 15 +++
 arch/powerpc/kernel/syscall_64.c   | 10 +-
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/cputime.h 
b/arch/powerpc/include/asm/cputime.h
index ed75d1c318e3..3f61604e1fcf 100644
--- a/arch/powerpc/include/asm/cputime.h
+++ b/arch/powerpc/include/asm/cputime.h
@@ -87,6 +87,18 @@ static notrace inline void account_cpu_user_exit(void)
acct->starttime_user = tb;
 }
 
+static notrace inline void account_stolen_time(void)
+{
+#ifdef CONFIG_PPC_SPLPAR
+   if (IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) &&
+   firmware_has_feature(FW_FEATURE_SPLPAR)) {
+   struct lppaca *lp = local_paca->lppaca_ptr;
+
+   if (unlikely(local_paca->dtl_ridx != be64_to_cpu(lp->dtl_idx)))
+   accumulate_stolen_time();
+   }
+#endif
+}
 
 #endif /* __KERNEL__ */
 #else /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
@@ -96,5 +108,8 @@ static inline void account_cpu_user_entry(void)
 static inline void account_cpu_user_exit(void)
 {
 }
+static notrace inline void account_stolen_time(void)
+{
+}
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 #endif /* __POWERPC_CPUTIME_H */
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index 58eec1c7fdb8..27595aee5777 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -44,15 +44,7 @@ notrace long system_call_exception(long r3, long r4, long r5,
 
account_cpu_user_entry();
 
-#ifdef CONFIG_PPC_SPLPAR
-   if (IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) &&
-   firmware_has_feature(FW_FEATURE_SPLPAR)) {
-   struct lppaca *lp = local_paca->lppaca_ptr;
-
-   if (unlikely(local_paca->dtl_ridx != be64_to_cpu(lp->dtl_idx)))
-   accumulate_stolen_time();
-   }
-#endif
+   account_stolen_time();
 
/*
 * This is not required for the syscall exit path, but makes the
-- 
2.23.0
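
The check that account_stolen_time() performs (compare the local read index
with the big-endian write index published in the lppaca, and only take the
slow path when they differ) can be sketched in user space. This is a minimal
model under stated assumptions: paca_model, lppaca_model and be64_decode are
hypothetical names, and the accumulate step here simply catches the read index
up, whereas the real accumulate_stolen_time() walks the dispatch trace log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: write index stored big-endian, as the hypervisor does. */
struct lppaca_model {
	uint8_t dtl_idx_be[8];
};

/* Hypothetical model of the per-CPU area holding the read index. */
struct paca_model {
	uint64_t dtl_ridx;
	struct lppaca_model *lppaca_ptr;
};

static int accumulate_calls;	/* counts slow-path invocations */

/* Portable big-endian decode, standing in for be64_to_cpu(). */
static uint64_t be64_decode(const uint8_t b[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | b[i];
	return v;
}

/* Slow path (simplified assumption): consume entries up to the write index. */
static void accumulate_stolen_time(struct paca_model *paca)
{
	accumulate_calls++;
	paca->dtl_ridx = be64_decode(paca->lppaca_ptr->dtl_idx_be);
}

/* Fast path: do nothing unless the log has new entries. */
static void account_stolen_time(struct paca_model *paca)
{
	assert(paca && paca->lppaca_ptr);

	if (paca->dtl_ridx != be64_decode(paca->lppaca_ptr->dtl_idx_be))
		accumulate_stolen_time(paca);
}
```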



[RFC PATCH 06/12] powerpc/64s: reconcile interrupts in C

2020-09-05 Thread Nicholas Piggin
There is no need for this to be in asm; use the new interrupt entry wrapper.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h |  4 
 arch/powerpc/kernel/exceptions-64s.S | 26 --
 2 files changed, 4 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 98acfbb2df04..511b3304722b 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -9,6 +9,10 @@
 #ifdef CONFIG_PPC_BOOK3S_64
 static inline void interrupt_enter_prepare(struct pt_regs *regs)
 {
+   if (irq_soft_mask_set_return(IRQS_ALL_DISABLED) == IRQS_ENABLED)
+   trace_hardirqs_off();
+   local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+
if (user_mode(regs)) {
CT_WARN_ON(ct_state() == CONTEXT_KERNEL);
user_exit_irqoff();
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index f6989321136d..b36247ad1f64 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -139,7 +139,6 @@ name:
 #define IKVM_VIRT  .L_IKVM_VIRT_\name\()   /* Virt entry tests KVM */
 #define ISTACK .L_ISTACK_\name\()  /* Set regular kernel stack */
 #define __ISTACK(name) .L_ISTACK_ ## name
-#define IRECONCILE .L_IRECONCILE_\name\()  /* Do RECONCILE_IRQ_STATE */
 #define IKUAP  .L_IKUAP_\name\()   /* Do KUAP lock */
 
 #define INT_DEFINE_BEGIN(n)\
@@ -203,9 +202,6 @@ do_define_int n
.ifndef ISTACK
ISTACK=1
.endif
-   .ifndef IRECONCILE
-   IRECONCILE=1
-   .endif
.ifndef IKUAP
IKUAP=1
.endif
@@ -653,10 +649,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
.if ISTACK
ACCOUNT_STOLEN_TIME
.endif
-
-   .if IRECONCILE
-   RECONCILE_IRQ_STATE(r10, r11)
-   .endif
 .endm
 
 /*
@@ -935,7 +927,6 @@ INT_DEFINE_BEGIN(system_reset)
 */
ISET_RI=0
ISTACK=0
-   IRECONCILE=0
IKVM_REAL=1
 INT_DEFINE_END(system_reset)
 
@@ -1125,7 +1116,6 @@ INT_DEFINE_BEGIN(machine_check_early)
ISTACK=0
IDAR=1
IDSISR=1
-   IRECONCILE=0
IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */
 INT_DEFINE_END(machine_check_early)
 
@@ -1473,7 +1463,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 INT_DEFINE_BEGIN(data_access_slb)
IVEC=0x380
IAREA=PACA_EXSLB
-   IRECONCILE=0
IDAR=1
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
IKVM_SKIP=1
@@ -1502,7 +1491,6 @@ MMU_FTR_SECTION_ELSE
li  r3,-EFAULT
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
std r3,RESULT(r1)
-   RECONCILE_IRQ_STATE(r10, r11)
addi r3,r1,STACK_FRAME_OVERHEAD
bl  do_bad_slb_fault
b   interrupt_return
@@ -1564,7 +1552,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 INT_DEFINE_BEGIN(instruction_access_slb)
IVEC=0x480
IAREA=PACA_EXSLB
-   IRECONCILE=0
IISIDE=1
IDAR=1
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
@@ -1593,7 +1580,6 @@ MMU_FTR_SECTION_ELSE
li  r3,-EFAULT
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
std r3,RESULT(r1)
-   RECONCILE_IRQ_STATE(r10, r11)
addi r3,r1,STACK_FRAME_OVERHEAD
bl  do_bad_slb_fault
b   interrupt_return
@@ -1753,7 +1739,6 @@ EXC_COMMON_BEGIN(program_check_common)
  */
 INT_DEFINE_BEGIN(fp_unavailable)
IVEC=0x800
-   IRECONCILE=0
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
IKVM_REAL=1
 #endif
@@ -1768,7 +1753,6 @@ EXC_VIRT_END(fp_unavailable, 0x4800, 0x100)
 EXC_COMMON_BEGIN(fp_unavailable_common)
GEN_COMMON fp_unavailable
bne 1f  /* if from user, just load it up */
-   RECONCILE_IRQ_STATE(r10, r11)
addi r3,r1,STACK_FRAME_OVERHEAD
bl  kernel_fp_unavailable_exception
 0: trap
@@ -1787,7 +1771,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
b   fast_interrupt_return
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2: /* User process was in a transaction */
-   RECONCILE_IRQ_STATE(r10, r11)
addi r3,r1,STACK_FRAME_OVERHEAD
bl  fp_unavailable_tm
b   interrupt_return
@@ -1852,7 +1835,6 @@ INT_DEFINE_BEGIN(hdecrementer)
IVEC=0x980
IHSRR=1
ISTACK=0
-   IRECONCILE=0
IKVM_REAL=1
IKVM_VIRT=1
 INT_DEFINE_END(hdecrementer)
@@ -2226,7 +2208,6 @@ INT_DEFINE_BEGIN(hmi_exception_early)
IHSRR=1
IREALMODE_COMMON=1
ISTACK=0
-   IRECONCILE=0
IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */
IKVM_REAL=1
 INT_DEFINE_END(hmi_exception_early)
@@ -2400,7 +2381,6 @@ EXC_COMMON_BEGIN(performance_monitor_common)
  */
 INT_DEFINE_BEGIN(altivec_unavailable)

[RFC PATCH 05/12] powerpc/64s: Do context tracking in interrupt entry wrapper

2020-09-05 Thread Nicholas Piggin
Context tracking is currently very broken. Add a helper function that every
normal interrupt handler function calls first thing to track user exits; user
entry is tracked by the interrupt exit prepare functions.

Context tracking is disabled on 64e for now; 64e must move to interrupt exit
in C before it can be enabled again.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/Kconfig  |  2 +-
 arch/powerpc/include/asm/interrupt.h  | 23 +
 arch/powerpc/kernel/ptrace/ptrace.c   |  4 --
 arch/powerpc/kernel/signal.c  |  4 --
 arch/powerpc/kernel/syscall_64.c  | 14 +
 arch/powerpc/kernel/traps.c   | 74 ++-
 arch/powerpc/mm/book3s64/hash_utils.c |  2 -
 arch/powerpc/mm/fault.c   |  3 --
 8 files changed, 55 insertions(+), 71 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1f48bbfb3ce9..3da7bbff46a9 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -189,7 +189,7 @@ config PPC
select HAVE_CBPF_JIT if !PPC64
select HAVE_STACKPROTECTOR  if PPC64 && 
$(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
select HAVE_STACKPROTECTOR  if PPC32 && 
$(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
-   select HAVE_CONTEXT_TRACKING if PPC64
+   select HAVE_CONTEXT_TRACKING if PPC_BOOK3S_64
select HAVE_TIF_NOHZ if PPC64
select HAVE_DEBUG_KMEMLEAK
select HAVE_DEBUG_STACKOVERFLOW
diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 7231949fc1c8..98acfbb2df04 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -6,6 +6,23 @@
 #include 
 #include 
 
+#ifdef CONFIG_PPC_BOOK3S_64
+static inline void interrupt_enter_prepare(struct pt_regs *regs)
+{
+   if (user_mode(regs)) {
+   CT_WARN_ON(ct_state() == CONTEXT_KERNEL);
+   user_exit_irqoff();
+   } else {
+   CT_WARN_ON(ct_state() == CONTEXT_USER);
+   }
+}
+
+#else /* CONFIG_PPC_BOOK3S_64 */
+static inline void interrupt_enter_prepare(struct pt_regs *regs)
+{
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
 /**
  * DECLARE_INTERRUPT_HANDLER_RAW - Declare raw interrupt handler function
  * @func:  Function name of the entry point
@@ -60,6 +77,8 @@ static __always_inline void ___##func(struct pt_regs *regs);  
\
\
 __visible noinstr void func(struct pt_regs *regs)  \
 {  \
+   interrupt_enter_prepare(regs);  \
+   \
___##func (regs);   \
 }  \
\
@@ -90,6 +109,8 @@ __visible noinstr long func(struct pt_regs *regs)
\
 {  \
long ret;   \
\
+   interrupt_enter_prepare(regs);  \
+   \
ret = ___##func (regs); \
\
return ret; \
@@ -118,6 +139,8 @@ static __always_inline void ___##func(struct pt_regs 
*regs);\
\
 __visible noinstr void func(struct pt_regs *regs)  \
 {  \
+   interrupt_enter_prepare(regs);  \
+   \
___##func (regs);   \
 }  \
\
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c 
b/arch/powerpc/kernel/ptrace/ptrace.c
index f6e51be47c6e..8970400e521c 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -290,8 +290,6 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 {
u32 flags;
 
-   user_exit();
-
flags = READ_ONCE(current_thread_info()->flags) &
(_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
 
@@ -368,8 +366,6 @@ void do_syscall_trace_leave(struct pt_regs 

[RFC PATCH 04/12] powerpc: add interrupt_cond_local_irq_enable helper

2020-09-05 Thread Nicholas Piggin
Add a simple helper that synchronous interrupt handlers can use to enable
interrupts if the interrupt was taken in an interrupt-enabled context.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h |  7 +++
 arch/powerpc/kernel/traps.c  | 24 +++-
 arch/powerpc/mm/fault.c  |  4 +---
 3 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 7c7e58541171..7231949fc1c8 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -3,6 +3,7 @@
 #define _ASM_POWERPC_INTERRUPT_H
 
 #include 
+#include 
 #include 
 
 /**
@@ -198,4 +199,10 @@ DECLARE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception);
 void replay_system_reset(void);
 void replay_soft_interrupts(void);
 
+static inline void interrupt_cond_local_irq_enable(struct pt_regs *regs)
+{
+   if (!arch_irq_disabled_regs(regs))
+   local_irq_enable();
+}
+
 #endif /* _ASM_POWERPC_INTERRUPT_H */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 96fa2d7e088c..a850647b7062 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -343,8 +343,8 @@ static bool exception_common(int signr, struct pt_regs 
*regs, int code,
 
show_signal_msg(signr, regs, code, addr);
 
-   if (arch_irqs_disabled() && !arch_irq_disabled_regs(regs))
-   local_irq_enable();
+   if (arch_irqs_disabled())
+   interrupt_cond_local_irq_enable(regs);
 
current->thread.trap_nr = code;
 
@@ -1579,9 +1579,7 @@ DEFINE_INTERRUPT_HANDLER(program_check_exception)
if (!user_mode(regs))
goto sigill;
 
-   /* We restore the interrupt state now */
-   if (!arch_irq_disabled_regs(regs))
-   local_irq_enable();
+   interrupt_cond_local_irq_enable(regs);
 
/* (reason & REASON_ILLEGAL) would be the obvious thing here,
 * but there seems to be a hardware bug on the 405GP (RevD)
@@ -1635,9 +1633,7 @@ DEFINE_INTERRUPT_HANDLER(alignment_exception)
int sig, code, fixed = 0;
unsigned long  reason;
 
-   /* We restore the interrupt state now */
-   if (!arch_irq_disabled_regs(regs))
-   local_irq_enable();
+   interrupt_cond_local_irq_enable(regs);
 
reason = get_reason(regs);
 
@@ -1798,9 +1794,7 @@ DEFINE_INTERRUPT_HANDLER(facility_unavailable_exception)
die("Unexpected facility unavailable exception", regs, SIGABRT);
}
 
-   /* We restore the interrupt state now */
-   if (!arch_irq_disabled_regs(regs))
-   local_irq_enable();
+   interrupt_cond_local_irq_enable(regs);
 
if (status == FSCR_DSCR_LG) {
/*
@@ -2145,9 +2139,7 @@ void SPEFloatingPointException(struct pt_regs *regs)
int code = FPE_FLTUNK;
int err;
 
-   /* We restore the interrupt state now */
-   if (!arch_irq_disabled_regs(regs))
-   local_irq_enable();
+   interrupt_cond_local_irq_enable(regs);
 
flush_spe_to_thread(current);
 
@@ -2194,9 +2186,7 @@ void SPEFloatingPointRoundException(struct pt_regs *regs)
extern int speround_handler(struct pt_regs *regs);
int err;
 
-   /* We restore the interrupt state now */
-   if (!arch_irq_disabled_regs(regs))
-   local_irq_enable();
+   interrupt_cond_local_irq_enable(regs);
 
preempt_disable();
if (regs->msr & MSR_SPE)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 7d63b5512068..fd8e28944293 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -441,9 +441,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned 
long address,
return bad_area_nosemaphore(regs, address);
}
 
-   /* We restore the interrupt state now */
-   if (!arch_irq_disabled_regs(regs))
-   local_irq_enable();
+   interrupt_cond_local_irq_enable(regs);
 
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
-- 
2.23.0
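
The behaviour of the helper added by this patch can be illustrated with a
small user-space model. The struct layout, the irqs_enabled global and the
stub bodies are assumptions made for the sketch; in the kernel the
interrupted context's soft-mask state is read from pt_regs and the current
state is tracked per CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in: records whether the interrupted code had IRQs off. */
struct pt_regs {
	bool irqs_were_disabled;
};

static bool irqs_enabled;	/* models the current CPU interrupt state */

/* Stub: true if the interrupted context was running with IRQs disabled. */
static bool arch_irq_disabled_regs(const struct pt_regs *regs)
{
	return regs->irqs_were_disabled;
}

/* Stub: enable interrupts on the modelled CPU. */
static void local_irq_enable(void)
{
	irqs_enabled = true;
}

/*
 * Re-enable interrupts only when the interrupted context had them enabled,
 * so a handler never runs with IRQs on inside an IRQs-off region.
 */
static void interrupt_cond_local_irq_enable(struct pt_regs *regs)
{
	assert(regs);

	if (!arch_irq_disabled_regs(regs))
		local_irq_enable();
}
```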



[RFC PATCH 03/12] powerpc: interrupt handler wrapper functions

2020-09-05 Thread Nicholas Piggin
Add wrapper functions (derived from x86 macros) for interrupt handler
functions. This allows interrupt entry code to be written in C.
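
The general shape of such a wrapper macro can be sketched as follows. This is
an illustration under stated assumptions, not the macro from the patch:
common_entry_hook, entry_count, last_msr and demo_exception are hypothetical
names, struct pt_regs is reduced to one field, and the body is named
func##_body here to avoid reserved identifiers in a user-space build.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's struct pt_regs. */
struct pt_regs {
	unsigned long msr;
};

static int entry_count;		/* counts runs of the common entry code */

/* Common C entry code that every wrapped handler runs first. */
static inline void common_entry_hook(struct pt_regs *regs)
{
	assert(regs);
	entry_count++;
}

/*
 * Define func() as a thin wrapper: it runs the common entry code, then calls
 * the handler body, which the macro user writes directly after the macro.
 */
#define DEFINE_INTERRUPT_HANDLER(func)				\
static inline void func##_body(struct pt_regs *regs);		\
								\
void func(struct pt_regs *regs)					\
{								\
	common_entry_hook(regs);				\
	func##_body(regs);					\
}								\
								\
static inline void func##_body(struct pt_regs *regs)

static unsigned long last_msr;

/* A handler written with the macro; only the body is spelled out. */
DEFINE_INTERRUPT_HANDLER(demo_exception)
{
	last_msr = regs->msr;
}
```

Because the asm entry code only ever calls func(), new common entry work can
be added to the hook without touching each handler.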

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/asm-prototypes.h |  28 ---
 arch/powerpc/include/asm/bug.h|   1 -
 arch/powerpc/include/asm/hw_irq.h |   9 -
 arch/powerpc/include/asm/interrupt.h  | 201 ++
 arch/powerpc/include/asm/time.h   |   2 +
 arch/powerpc/kernel/dbell.c   |   3 +-
 arch/powerpc/kernel/exceptions-64s.S  |   8 +-
 arch/powerpc/kernel/irq.c |   3 +-
 arch/powerpc/kernel/mce.c |   5 +-
 arch/powerpc/kernel/tau_6xx.c |   2 +-
 arch/powerpc/kernel/time.c|   3 +-
 arch/powerpc/kernel/traps.c   |  78 ++---
 arch/powerpc/kernel/watchdog.c|   7 +-
 arch/powerpc/kvm/book3s_hv_builtin.c  |   1 +
 arch/powerpc/mm/book3s64/hash_utils.c |   3 +-
 arch/powerpc/mm/book3s64/slb.c|   5 +-
 arch/powerpc/mm/fault.c   |  10 +-
 arch/powerpc/platforms/powernv/idle.c |   1 +
 18 files changed, 290 insertions(+), 80 deletions(-)
 create mode 100644 arch/powerpc/include/asm/interrupt.h

diff --git a/arch/powerpc/include/asm/asm-prototypes.h 
b/arch/powerpc/include/asm/asm-prototypes.h
index fffac9de2922..de4dad05e272 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -56,34 +56,6 @@ int exit_vmx_usercopy(void);
 int enter_vmx_ops(void);
 void *exit_vmx_ops(void *dest);
 
-/* Traps */
-long machine_check_early(struct pt_regs *regs);
-long hmi_exception_realmode(struct pt_regs *regs);
-void SMIException(struct pt_regs *regs);
-void handle_hmi_exception(struct pt_regs *regs);
-void instruction_breakpoint_exception(struct pt_regs *regs);
-void RunModeException(struct pt_regs *regs);
-void single_step_exception(struct pt_regs *regs);
-void program_check_exception(struct pt_regs *regs);
-void alignment_exception(struct pt_regs *regs);
-void StackOverflow(struct pt_regs *regs);
-void kernel_fp_unavailable_exception(struct pt_regs *regs);
-void altivec_unavailable_exception(struct pt_regs *regs);
-void vsx_unavailable_exception(struct pt_regs *regs);
-void fp_unavailable_tm(struct pt_regs *regs);
-void altivec_unavailable_tm(struct pt_regs *regs);
-void vsx_unavailable_tm(struct pt_regs *regs);
-void facility_unavailable_exception(struct pt_regs *regs);
-void TAUException(struct pt_regs *regs);
-void altivec_assist_exception(struct pt_regs *regs);
-void unrecoverable_exception(struct pt_regs *regs);
-void kernel_bad_stack(struct pt_regs *regs);
-void system_reset_exception(struct pt_regs *regs);
-void machine_check_exception(struct pt_regs *regs);
-void emulation_assist_interrupt(struct pt_regs *regs);
-long do_slb_fault(struct pt_regs *regs);
-void do_bad_slb_fault(struct pt_regs *regs);
-
 /* signals, syscalls and interrupts */
 long sys_swapcontext(struct ucontext __user *old_ctx,
struct ucontext __user *new_ctx,
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 2fa0cf6c6011..7b89cdbbb789 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -111,7 +111,6 @@
 #ifndef __ASSEMBLY__
 
 struct pt_regs;
-extern long do_page_fault(struct pt_regs *);
 extern long hash__do_page_fault(struct pt_regs *);
 extern void bad_page_fault(struct pt_regs *, unsigned long, int);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
diff --git a/arch/powerpc/include/asm/hw_irq.h 
b/arch/powerpc/include/asm/hw_irq.h
index 3a0db7b0b46e..19420e48bcd5 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -51,15 +51,6 @@
 
 #ifndef __ASSEMBLY__
 
-extern void replay_system_reset(void);
-extern void replay_soft_interrupts(void);
-
-extern void timer_interrupt(struct pt_regs *);
-extern void timer_broadcast_interrupt(void);
-extern void performance_monitor_exception(struct pt_regs *regs);
-extern void WatchdogException(struct pt_regs *regs);
-extern void unknown_exception(struct pt_regs *regs);
-
 #ifdef CONFIG_PPC64
 #include 
 
diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
new file mode 100644
index ..7c7e58541171
--- /dev/null
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -0,0 +1,201 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ASM_POWERPC_INTERRUPT_H
+#define _ASM_POWERPC_INTERRUPT_H
+
+#include 
+#include 
+
+/**
+ * DECLARE_INTERRUPT_HANDLER_RAW - Declare raw interrupt handler function
+ * @func:  Function name of the entry point
+ * @returns:   Returns a value back to asm caller
+ */
+#define DECLARE_INTERRUPT_HANDLER_RAW(func)\
+   __visible long func(struct pt_regs *regs)
+
+/**
+ * DEFINE_INTERRUPT_HANDLER_RAW - Define raw interrupt handler function
+ * @func:  Function name of the entry point
+ 

[RFC PATCH 02/12] powerpc: remove arguments from interrupt handler functions

2020-09-05 Thread Nicholas Piggin
Make interrupt handlers all just take the pt_regs * argument and load
DAR/DSISR etc from that. Make those that return a value return long.

This is done to make the function signatures match more closely, which
will help with a future patch to add wrappers. Explicit arguments could
be re-added for performance in future but that would require more
complex wrapper macros.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/asm-prototypes.h |  4 ++--
 arch/powerpc/include/asm/bug.h|  4 ++--
 arch/powerpc/kernel/exceptions-64e.S  |  2 --
 arch/powerpc/kernel/exceptions-64s.S  | 14 ++
 arch/powerpc/mm/book3s64/hash_utils.c |  8 +---
 arch/powerpc/mm/book3s64/slb.c| 11 +++
 arch/powerpc/mm/fault.c   | 16 +---
 7 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h 
b/arch/powerpc/include/asm/asm-prototypes.h
index de14b1a34d56..fffac9de2922 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -81,8 +81,8 @@ void kernel_bad_stack(struct pt_regs *regs);
 void system_reset_exception(struct pt_regs *regs);
 void machine_check_exception(struct pt_regs *regs);
 void emulation_assist_interrupt(struct pt_regs *regs);
-long do_slb_fault(struct pt_regs *regs, unsigned long ea);
-void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err);
+long do_slb_fault(struct pt_regs *regs);
+void do_bad_slb_fault(struct pt_regs *regs);
 
 /* signals, syscalls and interrupts */
 long sys_swapcontext(struct ucontext __user *old_ctx,
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index d714d83bbc7c..2fa0cf6c6011 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -111,8 +111,8 @@
 #ifndef __ASSEMBLY__
 
 struct pt_regs;
-extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
-extern int hash__do_page_fault(struct pt_regs *, unsigned long, unsigned long);
+extern long do_page_fault(struct pt_regs *);
+extern long hash__do_page_fault(struct pt_regs *);
 extern void bad_page_fault(struct pt_regs *, unsigned long, int);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
 extern void _exception_pkey(struct pt_regs *, unsigned long, int);
diff --git a/arch/powerpc/kernel/exceptions-64e.S 
b/arch/powerpc/kernel/exceptions-64e.S
index d9ed79415100..5988d61783b5 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1012,8 +1012,6 @@ storage_fault_common:
std r14,_DAR(r1)
std r15,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
-   mr  r4,r14
-   mr  r5,r15
ld  r14,PACA_EXGEN+EX_R14(r13)
ld  r15,PACA_EXGEN+EX_R15(r13)
bl  do_page_fault
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index f830b893fe03..1f34cfd1887c 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1437,8 +1437,6 @@ EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
 EXC_VIRT_END(data_access, 0x4300, 0x80)
 EXC_COMMON_BEGIN(data_access_common)
GEN_COMMON data_access
-   ld  r4,_DAR(r1)
-   ld  r5,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
bl  do_hash_fault
@@ -1491,10 +1489,9 @@ EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
 EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
 EXC_COMMON_BEGIN(data_access_slb_common)
GEN_COMMON data_access_slb
-   ld  r4,_DAR(r1)
-   addi r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
/* HPT case, do SLB fault */
+   addi r3,r1,STACK_FRAME_OVERHEAD
bl  do_slb_fault
cmpdi   r3,0
bne- 1f
@@ -1506,8 +1503,6 @@ MMU_FTR_SECTION_ELSE
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
std r3,RESULT(r1)
RECONCILE_IRQ_STATE(r10, r11)
-   ld  r4,_DAR(r1)
-   ld  r5,RESULT(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
bl  do_bad_slb_fault
b   interrupt_return
@@ -1542,8 +1537,6 @@ EXC_VIRT_BEGIN(instruction_access, 0x4400, 0x80)
 EXC_VIRT_END(instruction_access, 0x4400, 0x80)
 EXC_COMMON_BEGIN(instruction_access_common)
GEN_COMMON instruction_access
-   ld  r4,_DAR(r1)
-   ld  r5,_DSISR(r1)
addi    r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
bl  do_hash_fault
@@ -1587,10 +1580,9 @@ EXC_VIRT_BEGIN(instruction_access_slb, 0x4480, 0x80)
 EXC_VIRT_END(instruction_access_slb, 0x4480, 0x80)
 EXC_COMMON_BEGIN(instruction_access_slb_common)
GEN_COMMON instruction_access_slb
-   ld  r4,_DAR(r1)
-   addi    r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
/* HPT case, do SLB fault */
+   addi    r3,r1,STACK_FRAME_OVERHEAD
bl  do_slb_fault
cmpdi   r3,0
bne-   

[RFC PATCH 01/12] powerpc/64s: move the last of the page fault handling logic to C

2020-09-05 Thread Nicholas Piggin
The page fault handling still has some complex logic particularly around
hash table handling, in asm. Implement this in C instead.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/bug.h        |   1 +
 arch/powerpc/kernel/exceptions-64s.S  | 131 +-
 arch/powerpc/mm/book3s64/hash_utils.c |  77 +--
 arch/powerpc/mm/fault.c               |  55 ++-
 4 files changed, 124 insertions(+), 140 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 338f36cd9934..d714d83bbc7c 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -112,6 +112,7 @@
 
 struct pt_regs;
 extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
+extern int hash__do_page_fault(struct pt_regs *, unsigned long, unsigned long);
 extern void bad_page_fault(struct pt_regs *, unsigned long, int);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
 extern void _exception_pkey(struct pt_regs *, unsigned long, int);
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index f7d748b88705..f830b893fe03 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1403,14 +1403,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
  *
  * Handling:
  * - Hash MMU
- *   Go to do_hash_page first to see if the HPT can be filled from an entry in
- *   the Linux page table. Hash faults can hit in kernel mode in a fairly
+ *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
+ *   Linux page table. Hash faults can hit in kernel mode in a fairly
  *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
  *   "non-bolted" regions, e.g., vmalloc space. However these should always be
- *   backed by Linux page tables.
+ *   backed by Linux page table entries.
  *
- *   If none is found, do a Linux page fault. Linux page faults can happen in
- *   kernel mode due to user copy operations of course.
+ *   If no entry is found the Linux page fault handler is invoked (by
+ *   do_hash_fault). Linux page faults can happen in kernel mode due to user
+ *   copy operations of course.
  *
  * - Radix MMU
  *   The hardware loads from the Linux page table directly, so a fault goes
@@ -1438,13 +1439,17 @@ EXC_COMMON_BEGIN(data_access_common)
GEN_COMMON data_access
ld  r4,_DAR(r1)
ld  r5,_DSISR(r1)
+   addi    r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
-   ld  r6,_MSR(r1)
-   li  r3,0x300
-   b   do_hash_page    /* Try to handle as hpte fault */
+   bl  do_hash_fault
 MMU_FTR_SECTION_ELSE
-   b   handle_page_fault
+   bl  do_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+   cmpdi   r3,0
+   beq+    interrupt_return
+   /* We need to restore NVGPRS */
+   REST_NVGPRS(r1)
+   b   interrupt_return
 
GEN_KVM data_access
 
@@ -1539,13 +1544,17 @@ EXC_COMMON_BEGIN(instruction_access_common)
GEN_COMMON instruction_access
ld  r4,_DAR(r1)
ld  r5,_DSISR(r1)
+   addi    r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
-   ld  r6,_MSR(r1)
-   li  r3,0x400
-   b   do_hash_page    /* Try to handle as hpte fault */
+   bl  do_hash_fault
 MMU_FTR_SECTION_ELSE
-   b   handle_page_fault
+   bl  do_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+   cmpdi   r3,0
+   beq+    interrupt_return
+   /* We need to restore NVGPRS */
+   REST_NVGPRS(r1)
+   b   interrupt_return
 
GEN_KVM instruction_access
 
@@ -3197,99 +3206,3 @@ disable_machine_check:
RFI_TO_KERNEL
 1: mtlr    r0
blr
-
-/*
- * Hash table stuff
- */
-   .balign IFETCH_ALIGN_BYTES
-do_hash_page:
-#ifdef CONFIG_PPC_BOOK3S_64
-   lis r0,(DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)@h
-   ori r0,r0,DSISR_BAD_FAULT_64S@l
-   and.    r0,r5,r0    /* weird error? */
-   bne-    handle_page_fault   /* if not, try to insert a HPTE */
-
-   /*
-* If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
-* don't call hash_page, just fail the fault. This is required to
-* prevent re-entrancy problems in the hash code, namely perf
-* interrupts hitting while something holds H_PAGE_BUSY, and taking a
-* hash fault. See the comment in hash_preload().
-*/
-   ld  r11, PACA_THREAD_INFO(r13)
-   lwz r0,TI_PREEMPT(r11)
-   andis.  r0,r0,NMI_MASK@h
-   bne 77f
-
-   /*
-* r3 contains the trap number
-* r4 contains the faulting address
-* r5 contains dsisr
-* r6 msr
-*
-* at return r3 = 0 for success, 1 for page fault, negative for error
-*/
-   bl  __hash_page /* build 

[RFC PATCH 00/12] interrupt entry wrappers

2020-09-05 Thread Nicholas Piggin
This series moves more stuff to C, and fixes context tracking on
64s.
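
As a rough illustration of one change in the series: the page fault
prototypes in asm/bug.h lose their address and error-code arguments and
take only a struct pt_regs pointer, which is why the asm no longer loads
_DAR and _DSISR into r4/r5 before the call. The userspace sketch below
assumes the handler can read both values back from the saved frame. The
struct, field layout, function names and the toy "handled" rule are all
hypothetical stand-ins, not the kernel's code.

```c
/*
 * Hypothetical, cut-down stand-in for struct pt_regs: only the two
 * fields the fault handlers need are modelled here.
 */
struct pt_regs_sketch {
    unsigned long dar;    /* faulting address saved by the entry code */
    unsigned long dsisr;  /* fault status bits saved by the entry code */
};

/* Old-style handler: the asm caller passes address and error code. */
static long fault_handler_old(struct pt_regs_sketch *regs,
                              unsigned long address,
                              unsigned long error_code)
{
    (void)regs;
    /* Toy rule (an assumption): 0 means handled, nonzero means not. */
    return (address != 0 && error_code == 0) ? 0 : 1;
}

/* New-style handler: everything it needs is read from the frame. */
static long fault_handler_new(struct pt_regs_sketch *regs)
{
    return fault_handler_old(regs, regs->dar, regs->dsisr);
}
```

With this shape the asm entry only has to set r3 to the frame pointer
before the branch, matching the removed "ld r4,_DAR(r1)" lines.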

Nicholas Piggin (12):
  powerpc/64s: move the last of the page fault handling logic to C
  powerpc: remove arguments from interrupt handler functions
  powerpc: interrupt handler wrapper functions
  powerpc: add interrupt_cond_local_irq_enable helper
  powerpc/64s: Do context tracking in interrupt entry wrapper
  powerpc/64s: reconcile interrupts in C
  powerpc/64: move account_stolen_time into its own function
  powerpc/64: entry cpu time accounting in C
  powerpc: move NMI entry/exit code into wrapper
  powerpc/64s: move NMI soft-mask handling to C
  powerpc/64s: runlatch interrupt handling in C
  powerpc/64s: power4 nap fixup in C

 arch/powerpc/Kconfig                      |   2 +-
 arch/powerpc/include/asm/asm-prototypes.h |  28 --
 arch/powerpc/include/asm/bug.h            |   2 +-
 arch/powerpc/include/asm/cputime.h        |  15 +
 arch/powerpc/include/asm/hw_irq.h         |   9 -
 arch/powerpc/include/asm/interrupt.h      | 316 ++
 arch/powerpc/include/asm/ppc_asm.h        |  24 --
 arch/powerpc/include/asm/processor.h      |   1 +
 arch/powerpc/include/asm/thread_info.h    |   6 +
 arch/powerpc/include/asm/time.h           |   2 +
 arch/powerpc/kernel/dbell.c               |   3 +-
 arch/powerpc/kernel/exceptions-64e.S      |   3 -
 arch/powerpc/kernel/exceptions-64s.S      | 307 ++---
 arch/powerpc/kernel/idle_book3s.S         |   4 +
 arch/powerpc/kernel/irq.c                 |   3 +-
 arch/powerpc/kernel/mce.c                 |  17 +-
 arch/powerpc/kernel/ptrace/ptrace.c       |   4 -
 arch/powerpc/kernel/signal.c              |   4 -
 arch/powerpc/kernel/syscall_64.c          |  24 +-
 arch/powerpc/kernel/tau_6xx.c             |   2 +-
 arch/powerpc/kernel/time.c                |   3 +-
 arch/powerpc/kernel/traps.c               | 198 ++
 arch/powerpc/kernel/watchdog.c            |  15 +-
 arch/powerpc/kvm/book3s_hv_builtin.c      |   1 +
 arch/powerpc/mm/book3s64/hash_utils.c     |  82 --
 arch/powerpc/mm/book3s64/slb.c            |  12 +-
 arch/powerpc/mm/fault.c                   |  74 -
 arch/powerpc/platforms/powernv/idle.c     |   1 +
 28 files changed, 608 insertions(+), 554 deletions(-)
 create mode 100644 arch/powerpc/include/asm/interrupt.h

-- 
2.23.0



Re: [PATCH 5/5] powerpc: use the generic dma_ops_bypass mode

2020-09-05 Thread Alexey Kardashevskiy




On 31/08/2020 16:40, Christoph Hellwig wrote:

On Sun, Aug 30, 2020 at 11:04:21AM +0200, Cédric Le Goater wrote:

Hello,

On 7/8/20 5:24 PM, Christoph Hellwig wrote:

Use the DMA API bypass mechanism for direct window mappings.  This uses
common code and speeds up the direct mapping case by avoiding indirect
calls just when not using dma ops at all.  It also fixes a problem where
the sync_* methods were using the bypass check for DMA allocations, but
those are part of the streaming ops.

Note that this patch loses the DMA_ATTR_WEAK_ORDERING override, which
has never been well defined, and is only used by a few drivers, which
IIRC never showed up in the typical Cell blade setups that are affected
by the ordering workaround.

Fixes: efd176a04bef ("powerpc/pseries/dma: Allow SWIOTLB")
Signed-off-by: Christoph Hellwig 
---
  arch/powerpc/Kconfig  |  1 +
  arch/powerpc/include/asm/device.h |  5 --
  arch/powerpc/kernel/dma-iommu.c   | 90 ---
  3 files changed, 10 insertions(+), 86 deletions(-)


I am seeing corruptions on a couple of POWER9 systems (boston) when
stressed with IO. stress-ng gives some results but I have first seen
it when compiling the kernel in a guest and this is still the best way
to raise the issue.

These systems have a SAS Adaptec controller:

   0003:01:00.0 Serial Attached SCSI controller: Adaptec Series 8 12G SAS/PCIe 3 (rev 01)

When the failure occurs, the POWERPC EEH interrupt fires and dumps
low-level PHB4 registers, among which:

   [ 2179.251069490,3] PHB#0003[0:3]:   phbErrorStatus = 0280
   [ 2179.251117476,3] PHB#0003[0:3]:  phbFirstErrorStatus = 0200

The bits raised identify a PPC 'TCE' error, which means it is related
to DMAs. See below for more details.


Reverting this patch "fixes" the issue but the problem is probably elsewhere,
in some other layers or in the aacraid driver. How should I proceed
to get more information ?


The aacraid DMA masks look like a mess.



It kind of does, and is. The thing is that after f1565c24b596 the driver 
sets a 32-bit DMA mask, which in turn enables the small DMA window (not 
bypass), and since the aacraid driver has at least one bug with a double 
unmap of the same DMA handle, this somehow leads to EEH (PCI DMA error).



The driver sets a 32-bit mask because it calls dma_get_required_mask() 
_before_ setting the mask, so dma_get_required_mask() does not take the 
dma_alloc_direct() path and calls powerpc's 
dma_iommu_get_required_mask(), which:


1. does the math like this (spot 2 bugs):

mask = 1ULL < (fls_long(tbl->it_offset + tbl->it_size) - 1)

2. but even after fixing that, the driver crashes as f1565c24b596 
removed the call to dma_iommu_bypass_supported() so it enforces IOMMU.



The patch below (the first hunk to be precise) brings the things back to 
where they were (64bit mask). The double unmap bug in the driver is 
still to be investigated.




diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index 569fecd7b5b2..785abccb90fc 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -117,10 +117,18 @@ u64 dma_iommu_get_required_mask(struct device *dev)
struct iommu_table *tbl = get_iommu_table_base(dev);
u64 mask;

+   if (dev_is_pci(dev)) {
+   u64 bypass_mask = dma_direct_get_required_mask(dev);
+
+   if (dma_iommu_bypass_supported(dev, bypass_mask))
+   return bypass_mask;
+   }
+
if (!tbl)
return 0;

-   mask = 1ULL < (fls_long(tbl->it_offset + tbl->it_size) - 1);
+   mask = 1ULL << (fls_long(tbl->it_offset + tbl->it_size) +
+   tbl->it_page_shift - 1);
mask += mask - 1;

return mask;
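
For readers who want to see the two bugs in the mask math in isolation,
here is a small userspace sketch. It is not the kernel code:
fls_long_sketch() is a hand-written stand-in for fls_long(), the function
names are made up, and the window parameters in the note afterwards are
example values chosen for illustration. The first variant reproduces the
old expression, where "<" is a comparison rather than a shift and the
IOMMU page shift is missing; the second follows the corrected expression
from the hunk above.

```c
#include <stdint.h>

/*
 * Stand-in for the kernel's fls_long(): 1-based index of the most
 * significant set bit, or 0 when no bit is set.
 */
static unsigned int fls_long_sketch(unsigned long x)
{
    unsigned int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/*
 * Old expression. '<' compares instead of shifting, so 'mask' starts
 * out as 0 or 1, and the table size in pages is never scaled to bytes.
 */
static uint64_t required_mask_buggy(unsigned long it_offset,
                                    unsigned long it_size)
{
    uint64_t mask = 1ULL < (fls_long_sketch(it_offset + it_size) - 1);

    mask += mask - 1;
    return mask;
}

/*
 * Corrected expression: shift, and add the page shift so the result
 * covers byte addresses. Assumes it_offset + it_size is nonzero.
 */
static uint64_t required_mask_fixed(unsigned long it_offset,
                                    unsigned long it_size,
                                    unsigned int it_page_shift)
{
    uint64_t mask = 1ULL << (fls_long_sketch(it_offset + it_size) +
                             it_page_shift - 1);

    mask += mask - 1;
    return mask;
}
```

With an example table of 0x8000 entries at offset 0 and 64K IOMMU pages
(page shift 16), the buggy variant yields 1 while the fixed variant
yields 0xffffffff.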



--
Alexey


RE: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

2020-09-05 Thread David Laight
From: Christophe Leroy
> Sent: 05 September 2020 08:16
> 
> Le 04/09/2020 à 23:01, David Laight a écrit :
> > From: Alexey Dobriyan
> >> Sent: 04 September 2020 18:58
...
> > What is this strange %fs register you are talking about.
> > Figure 2-4 only has CS, DS, SS and ES.
> >
> 
> Intel added registers FS and GS in the i386

I know, I've got both the 'iAPX 286 Programmer's Reference Manual'
and the '80386 Programmer's Reference Manual' on my shelf.

I don't have the 8088 book though - which I used in 1982.

The old books are a lot easier to read if, for instance,
you are trying to work out how to switch back and forth to real mode
to do BIOS calls.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes

2020-09-05 Thread Gerald Schaefer
On Fri, 4 Sep 2020 18:01:15 +0200
Gerald Schaefer  wrote:

> On Fri, 4 Sep 2020 17:26:47 +0200
> Gerald Schaefer  wrote:
> 
> > On Fri, 4 Sep 2020 12:18:05 +0530
> > Anshuman Khandual  wrote:
> > 
> > > 
> > > 
> > > On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> > > > This patch series includes fixes for debug_vm_pgtable test code so that
> > > > they follow page table updates rules correctly. The first two patches 
> > > > introduce
> > > > changes w.r.t ppc64. The patches are included in this series for 
> > > > completeness. We can
> > > > merge them via ppc64 tree if required.
> > > > 
> > > > Hugetlb test is disabled on ppc64 because that needs larger change to 
> > > > satisfy
> > > > page table update rules.
> > > > 
> > > > These tests are broken w.r.t page table update rules and results in 
> > > > kernel
> > > > crash as below. 
> > > > 
> > > > [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > > > cpu 0x0: Vector: 700 (Program Check) at [c00c6d1e76c0]
> > > > pc: c009a5ec: assert_pte_locked+0x14c/0x380
> > > > lr: c05c: pte_update+0x11c/0x190
> > > > sp: c00c6d1e7950
> > > >msr: 82029033
> > > >   current = 0xc00c6d172c80
> > > >   paca= 0xc3ba   irqmask: 0x03   irq_happened: 0x01
> > > > pid   = 1, comm = swapper/0
> > > > kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > > > [link register   ] c05c pte_update+0x11c/0x190
> > > > [c00c6d1e7950] 0001 (unreliable)
> > > > [c00c6d1e79b0] c05eee14 pte_update+0x44/0x190
> > > > [c00c6d1e7a10] c1a2ca9c pte_advanced_tests+0x160/0x3d8
> > > > [c00c6d1e7ab0] c1a2d4fc debug_vm_pgtable+0x7e8/0x1338
> > > > [c00c6d1e7ba0] c00116ec do_one_initcall+0xac/0x5f0
> > > > [c00c6d1e7c80] c19e4fac kernel_init_freeable+0x4dc/0x5a4
> > > > [c00c6d1e7db0] c0012474 kernel_init+0x24/0x160
> > > > [c00c6d1e7e20] c000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > > > 
> > > > With DEBUG_VM disabled
> > > > 
> > > > [   20.530152] BUG: Kernel NULL pointer dereference on read at 
> > > > 0x
> > > > [   20.530183] Faulting instruction address: 0xc00df330
> > > > cpu 0x33: Vector: 380 (Data SLB Access) at [c00c6d19f700]
> > > > pc: c00df330: memset+0x68/0x104
> > > > lr: c009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > > > sp: c00c6d19f990
> > > >msr: 82009033
> > > >dar: 0
> > > >   current = 0xc00c6d177480
> > > >   paca= 0xc0001ec4f400   irqmask: 0x03   irq_happened: 0x01
> > > > pid   = 1, comm = swapper/0
> > > > [link register   ] c009f6d8 
> > > > hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > > > [c00c6d19f990] c009f748 
> > > > hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> > > > [c00c6d19fa10] c19ebf30 pmd_advanced_tests+0x1f0/0x378
> > > > [c00c6d19fab0] c19ed088 debug_vm_pgtable+0x79c/0x1244
> > > > [c00c6d19fba0] c00116ec do_one_initcall+0xac/0x5f0
> > > > [c00c6d19fc80] c19a4fac kernel_init_freeable+0x4dc/0x5a4
> > > > [c00c6d19fdb0] c0012474 kernel_init+0x24/0x160
> > > > [c00c6d19fe20] c000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > > > 
> > > > Changes from v3:
> > > > * Address review feedback
> > > > * Move page table deposit and withdraw patch after adding pmdlock to 
> > > > avoid bisect failure.
> > > 
> > > This version
> > > 
> > > - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with 
> > > DEBUG_VM_PGTABLE)
> > > - Runs on arm64 and x86 without any regression, at least nothing that I 
> > > have noticed
> > > - Will be great if this could get tested on s390, arc, riscv, ppc32 
> > > platforms as well
> > 
> > When I quickly tested v3, it worked fine, but now it turned out to
> > only work fine "sometimes", both v3 and v4. I need to look into it
> > further, but so far it seems related to the hugetlb_advanced_tests().
> > 
> > I guess there was already some discussion on this test, but we did
> > not receive all of the thread(s). Please always add at least
> > linux-s...@vger.kernel.org and maybe myself and Vasily Gorbik 
> > 
> > for further discussions.
> 
> BTW, with myself I mean the new address gerald.schae...@linux.ibm.com.
> The old gerald.schae...@de.ibm.com seems to work (again), but is not
> very reliable.
> 
> BTW2, a quick test with this change (so far) made the issues on s390
> go away:
> 
> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
> spin_unlock(ptl);
> 
>  #ifndef CONFIG_PPC_BOOK3S_64
> -   hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> +   hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
>  #endif
> 
> spin_lock(&mm->page_table_lock);
> 
> That would more match the "pte_t pointer" usage for hugetlb code,
> i.e. just cast a pmd_t pointer to it. Also 

Re: fsl_espi errors on v5.7.15

2020-09-05 Thread Heiner Kallweit
On Fri 4. Sep 2020 at 01:58, Chris Packham <
chris.pack...@alliedtelesis.co.nz> wrote:

>
>
> On 1/09/20 6:14 pm, Nicholas Piggin wrote:
>
> > Excerpts from Chris Packham's message of September 1, 2020 11:25 am:
>
> >> On 1/09/20 12:33 am, Heiner Kallweit wrote:
>
> >>> On 30.08.2020 23:59, Chris Packham wrote:
>
>  On 31/08/20 9:41 am, Heiner Kallweit wrote:
>
> > On 30.08.2020 23:00, Chris Packham wrote:
>
> >> On 31/08/20 12:30 am, Nicholas Piggin wrote:
>
> >>> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:
>
> >> 
>
> >>
>
>  I've also now seen the RX FIFO not empty error on the T2080RDB
>
> 
>
>  fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>
>  fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>
>  fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>
>  fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>
>  fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't
> empty!
>
>  fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>
> 
>
>  With my current workaround of emptying the RX FIFO. It seems
>
>  survivable. Interestingly it only ever seems to be 1 extra
> byte in the
>
>  RX FIFO and it seems to be after either a READ_SR or a
> READ_FSR.
>
> 
>
>  fsl_espi ffe11.spi: tx 70
>
>  fsl_espi ffe11.spi: rx 03
>
>  fsl_espi ffe11.spi: Extra RX 00
>
>  fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>
>  fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't
> empty!
>
>  fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>
>  fsl_espi ffe11.spi: tx 05
>
>  fsl_espi ffe11.spi: rx 00
>
>  fsl_espi ffe11.spi: Extra RX 03
>
>  fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>
>  fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't
> empty!
>
>  fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>
>  fsl_espi ffe11.spi: tx 05
>
>  fsl_espi ffe11.spi: rx 00
>
>  fsl_espi ffe11.spi: Extra RX 03
>
> 
>
>   From all the Micron SPI-NOR datasheets I've got access
> to it is
>
>  possible to continually read the SR/FSR. But I've no idea why
> it
>
>  happens some times and not others.
>
> >>> So I think I've got a reproduction and I think I've bisected
> the problem
>
> >>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt
> replay in
>
> >>> C"). My day is just finishing now so I haven't applied too
> much scrutiny
>
> >>> to this result. Given the various rabbit holes I've been down
> on this
>
> >>> issue already I'd take this information with a good degree of
> skepticism.
>
> >>>
>
> >> OK, so an easy test should be to re-test with a 5.4 kernel.
>
> >> It doesn't have yet the change you're referring to, and the
> fsl-espi driver
>
> >> is basically the same as in 5.7 (just two small changes in 5.7).
>
> > There's 6cc0c16d82f88 and maybe also other interrupt related
> patches
>
> > around this time that could affect book E, so it's good if that
> exact
>
> > patch is confirmed.
>
>  My confirmation is basically that I can induce the issue in a 5.4
> kernel
>
>  by cherry-picking 3282a3da25bd. I'm also able to "fix" the issue
> in
>
>  5.9-rc2 by reverting that one commit.
>
> 
>
>  In both cases it's not exactly a clean cherry-pick/revert so I also
>
>  confirmed the bisection result by building at 3282a3da25bd (which
> sees
>
>  the issue) and the commit just before (which does not).
>
> >>> Thanks for testing, that confirms it well.
>
> >>>
>
> >>> [snip patch]
>
> >>>
>
>  I still saw the issue with this change applied.
> PPC_IRQ_SOFT_MASK_DEBUG
>
>  didn't report anything (either with or without the change above).
>
> >>> Okay, it was a bit of a shot in the dark. I still can't see what
>
> >>> else has changed.
>
> >>>
>
> >>> What would cause this, a lost interrupt? A spurious interrupt? Or
>
> >>> higher interrupt latency?
>
> >>>
>
> >>> I don't think the patch should cause significantly worse latency,
>
> >>> (it's supposed to be a bit better if anything because it doesn't
> set
>
> >>> up the full interrupt frame). But it's possible.
>
> >> My working theory is that the SPI_DON indication is all about the TX
>
> >> direction and now that the interrupts are faster we're hitting an
> error
>
> >> because there is still RX activity going on. Heiner disagrees with
> my
>
> >> 

Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes

2020-09-05 Thread Gerald Schaefer
On Fri, 4 Sep 2020 17:26:47 +0200
Gerald Schaefer  wrote:

> On Fri, 4 Sep 2020 12:18:05 +0530
> Anshuman Khandual  wrote:
> 
> > 
> > 
> > On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> > > This patch series includes fixes for debug_vm_pgtable test code so that
> > > they follow page table updates rules correctly. The first two patches 
> > > introduce
> > > changes w.r.t ppc64. The patches are included in this series for 
> > > completeness. We can
> > > merge them via ppc64 tree if required.
> > > 
> > > Hugetlb test is disabled on ppc64 because that needs larger change to 
> > > satisfy
> > > page table update rules.
> > > 
> > > These tests are broken w.r.t page table update rules and results in kernel
> > > crash as below. 
> > > 
> > > [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > > cpu 0x0: Vector: 700 (Program Check) at [c00c6d1e76c0]
> > > pc: c009a5ec: assert_pte_locked+0x14c/0x380
> > > lr: c05c: pte_update+0x11c/0x190
> > > sp: c00c6d1e7950
> > >msr: 82029033
> > >   current = 0xc00c6d172c80
> > >   paca= 0xc3ba   irqmask: 0x03   irq_happened: 0x01
> > > pid   = 1, comm = swapper/0
> > > kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > > [link register   ] c05c pte_update+0x11c/0x190
> > > [c00c6d1e7950] 0001 (unreliable)
> > > [c00c6d1e79b0] c05eee14 pte_update+0x44/0x190
> > > [c00c6d1e7a10] c1a2ca9c pte_advanced_tests+0x160/0x3d8
> > > [c00c6d1e7ab0] c1a2d4fc debug_vm_pgtable+0x7e8/0x1338
> > > [c00c6d1e7ba0] c00116ec do_one_initcall+0xac/0x5f0
> > > [c00c6d1e7c80] c19e4fac kernel_init_freeable+0x4dc/0x5a4
> > > [c00c6d1e7db0] c0012474 kernel_init+0x24/0x160
> > > [c00c6d1e7e20] c000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > > 
> > > With DEBUG_VM disabled
> > > 
> > > [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x
> > > [   20.530183] Faulting instruction address: 0xc00df330
> > > cpu 0x33: Vector: 380 (Data SLB Access) at [c00c6d19f700]
> > > pc: c00df330: memset+0x68/0x104
> > > lr: c009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > > sp: c00c6d19f990
> > >msr: 82009033
> > >dar: 0
> > >   current = 0xc00c6d177480
> > >   paca= 0xc0001ec4f400   irqmask: 0x03   irq_happened: 0x01
> > > pid   = 1, comm = swapper/0
> > > [link register   ] c009f6d8 
> > > hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > > [c00c6d19f990] c009f748 
> > > hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> > > [c00c6d19fa10] c19ebf30 pmd_advanced_tests+0x1f0/0x378
> > > [c00c6d19fab0] c19ed088 debug_vm_pgtable+0x79c/0x1244
> > > [c00c6d19fba0] c00116ec do_one_initcall+0xac/0x5f0
> > > [c00c6d19fc80] c19a4fac kernel_init_freeable+0x4dc/0x5a4
> > > [c00c6d19fdb0] c0012474 kernel_init+0x24/0x160
> > > [c00c6d19fe20] c000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > > 
> > > Changes from v3:
> > > * Address review feedback
> > > * Move page table deposit and withdraw patch after adding pmdlock to avoid 
> > > bisect failure.
> > 
> > This version
> > 
> > - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with 
> > DEBUG_VM_PGTABLE)
> > - Runs on arm64 and x86 without any regression, at least nothing that I have 
> > noticed
> > - Will be great if this could get tested on s390, arc, riscv, ppc32 
> > platforms as well
> 
> When I quickly tested v3, it worked fine, but now it turned out to
> only work fine "sometimes", both v3 and v4. I need to look into it
> further, but so far it seems related to the hugetlb_advanced_tests().
> 
> I guess there was already some discussion on this test, but we did
> not receive all of the thread(s). Please always add at least
> linux-s...@vger.kernel.org and maybe myself and Vasily Gorbik 
> 
> for further discussions.

BTW, with myself I mean the new address gerald.schae...@linux.ibm.com.
The old gerald.schae...@de.ibm.com seems to work (again), but is not
very reliable.

BTW2, a quick test with this change (so far) made the issues on s390
go away:

@@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
spin_unlock(ptl);
 
 #ifndef CONFIG_PPC_BOOK3S_64
-   hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+   hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
 #endif
 
spin_lock(&mm->page_table_lock);

That would more match the "pte_t pointer" usage for hugetlb code,
i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
but I think the root cause is the pte_t pointer.

Not entirely sure though if that would really be the correct fix.
I somehow lost whatever little track I had about what these tests
really want to check, and if that would still be valid with that
change.


Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes

2020-09-05 Thread Gerald Schaefer
On Fri, 4 Sep 2020 12:18:05 +0530
Anshuman Khandual  wrote:

> 
> 
> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> > This patch series includes fixes for debug_vm_pgtable test code so that
> > they follow page table updates rules correctly. The first two patches 
> > introduce
> > changes w.r.t ppc64. The patches are included in this series for 
> > completeness. We can
> > merge them via ppc64 tree if required.
> > 
> > Hugetlb test is disabled on ppc64 because that needs larger change to 
> > satisfy
> > page table update rules.
> > 
> > These tests are broken w.r.t page table update rules and results in kernel
> > crash as below. 
> > 
> > [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > cpu 0x0: Vector: 700 (Program Check) at [c00c6d1e76c0]
> > pc: c009a5ec: assert_pte_locked+0x14c/0x380
> > lr: c05c: pte_update+0x11c/0x190
> > sp: c00c6d1e7950
> >msr: 82029033
> >   current = 0xc00c6d172c80
> >   paca= 0xc3ba   irqmask: 0x03   irq_happened: 0x01
> > pid   = 1, comm = swapper/0
> > kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > [link register   ] c05c pte_update+0x11c/0x190
> > [c00c6d1e7950] 0001 (unreliable)
> > [c00c6d1e79b0] c05eee14 pte_update+0x44/0x190
> > [c00c6d1e7a10] c1a2ca9c pte_advanced_tests+0x160/0x3d8
> > [c00c6d1e7ab0] c1a2d4fc debug_vm_pgtable+0x7e8/0x1338
> > [c00c6d1e7ba0] c00116ec do_one_initcall+0xac/0x5f0
> > [c00c6d1e7c80] c19e4fac kernel_init_freeable+0x4dc/0x5a4
> > [c00c6d1e7db0] c0012474 kernel_init+0x24/0x160
> > [c00c6d1e7e20] c000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > 
> > With DEBUG_VM disabled
> > 
> > [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x
> > [   20.530183] Faulting instruction address: 0xc00df330
> > cpu 0x33: Vector: 380 (Data SLB Access) at [c00c6d19f700]
> > pc: c00df330: memset+0x68/0x104
> > lr: c009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > sp: c00c6d19f990
> >msr: 82009033
> >dar: 0
> >   current = 0xc00c6d177480
> >   paca= 0xc0001ec4f400   irqmask: 0x03   irq_happened: 0x01
> > pid   = 1, comm = swapper/0
> > [link register   ] c009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > [c00c6d19f990] c009f748 
> > hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> > [c00c6d19fa10] c19ebf30 pmd_advanced_tests+0x1f0/0x378
> > [c00c6d19fab0] c19ed088 debug_vm_pgtable+0x79c/0x1244
> > [c00c6d19fba0] c00116ec do_one_initcall+0xac/0x5f0
> > [c00c6d19fc80] c19a4fac kernel_init_freeable+0x4dc/0x5a4
> > [c00c6d19fdb0] c0012474 kernel_init+0x24/0x160
> > [c00c6d19fe20] c000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > 
> > Changes from v3:
> > * Address review feedback
> > * Move page table deposit and withdraw patch after adding pmdlock to avoid 
> > bisect failure.
> 
> This version
> 
> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with 
> DEBUG_VM_PGTABLE)
> - Runs on arm64 and x86 without any regression, at least nothing that I have 
> noticed
> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms 
> as well

When I quickly tested v3, it worked fine, but now it turned out to
only work fine "sometimes", both v3 and v4. I need to look into it
further, but so far it seems related to the hugetlb_advanced_tests().

I guess there was already some discussion on this test, but we did
not receive all of the thread(s). Please always add at least
linux-s...@vger.kernel.org and maybe myself and Vasily Gorbik 

for further discussions.

That being said, sorry for duplications, this might already have been
discussed. Preliminary analysis showed that it only seems to go wrong
for certain random vaddr values. I cannot make any sense of that yet,
but what seems strange to me is that the hugetlb_advanced_tests()
take a (real) pte_t pointer as input, and also use that for all
kinds of operations (set_huge_pte_at, huge_ptep_get_and_clear, etc.).

Although all the hugetlb code in the kernel is (mis)using pte_t
pointers instead of the correct pmd/pud_t pointers like THP, that
is just for historic reasons. The pointers will actually never point
to a real pte_t (i.e. page table entry), but of course to a pmd
or pud entry, depending on hugepage size.

What is passed in as ptep to hugetlb_advanced_tests() seems to be
the result from the previous ptep = pte_alloc_map(mm, pmdp, vaddr),
so I would expect that it points to a real page table entry. Need
to investigate further, but IIUC, using such a pointer for adding
large pte entries (i.e. pmd/pud entries) at least feels very wrong
to me, and I assume it is related to the issues we see on s390.

We actually see different issues, e.g. once a panic directly in
hugetlb_advanced_tests() -> 

Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

2020-09-05 Thread Christophe Leroy




Le 04/09/2020 à 23:01, David Laight a écrit :

From: Alexey Dobriyan

Sent: 04 September 2020 18:58

On Fri, Sep 04, 2020 at 08:00:24AM +0200, Ingo Molnar wrote:

* Christoph Hellwig  wrote:

this series removes the last set_fs() used to force a kernel address
space for the uaccess code in the kernel read/write/splice code, and then
stops implementing the address space overrides entirely for x86 and
powerpc.


Cool! For the x86 bits:

   Acked-by: Ingo Molnar 


set_fs() is older than some kernel hackers!

$ cd linux-0.11/
$ find . -type f -name '*.h' | xargs grep -e set_fs -w -n -A3
./include/asm/segment.h:61:extern inline void set_fs(unsigned long val)
./include/asm/segment.h-62-{
./include/asm/segment.h-63- __asm__("mov %0,%%fs"::"a" ((unsigned short) val));
./include/asm/segment.h-64-}


What is this strange %fs register you are talking about.
Figure 2-4 only has CS, DS, SS and ES.



Intel added registers FS and GS in the i386

Christophe