Re: linux-next: build failure after merge of the arm-soc tree

2021-04-09 Thread Hector Martin

Hi Stephen,

On 09/04/2021 19.13, Stephen Rothwell wrote:

Hi all,

After merging the arm-soc tree, today's linux-next build (powerpc
allnoconfig) failed like this:


[...]

Caused by commits

   7c566bb5e4d5 ("asm-generic/io.h:  Add a non-posted variant of ioremap()")
   89897f739d7b ("of/address: Add infrastructure to declare MMIO as non-posted")
(and maybe some others)

I have reverted 86332e9e3477..7d2d16ccf15d for today.


This is fixed in ea9629283839 in the soc tree, which went in a few hours 
ago. Sorry for the noise.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


[GIT PULL] Apple M1 SoC platform bring-up for 5.13

2021-04-08 Thread Hector Martin
 to
drop by #asahi and #asahi-dev on freenode to chat with us, or check
our website for more information on the project:

https://asahilinux.org/

Signed-off-by: Hector Martin 


Arnd Bergmann (1):
  docs: driver-api: device-io: Document I/O access functions

Hector Martin (17):
  dt-bindings: vendor-prefixes: Add apple prefix
  dt-bindings: arm: apple: Add bindings for Apple ARM platforms
  dt-bindings: arm: cpus: Add apple,firestorm & icestorm compatibles
  arm64: cputype: Add CPU implementor & types for the Apple M1 cores
  dt-bindings: timer: arm,arch_timer: Add interrupt-names support
  arm64: arch_timer: Implement support for interrupt-names
  asm-generic/io.h:  Add a non-posted variant of ioremap()
  docs: driver-api: device-io: Document ioremap() variants & access funcs
  arm64: Implement ioremap_np() to map MMIO as nGnRnE
  asm-generic/io.h: implement pci_remap_cfgspace using ioremap_np
  of/address: Add infrastructure to declare MMIO as non-posted
  arm64: Move ICH_ sysreg bits from arm-gic-v3.h to sysreg.h
  dt-bindings: interrupt-controller: Add DT bindings for apple-aic
  irqchip/apple-aic: Add support for the Apple Interrupt Controller
  arm64: Kconfig: Introduce CONFIG_ARCH_APPLE
  dt-bindings: display: Add apple,simple-framebuffer
  arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree

 Documentation/devicetree/bindings/arm/apple.yaml   |  64 ++
 Documentation/devicetree/bindings/arm/cpus.yaml|   2 +
 .../bindings/display/simple-framebuffer.yaml   |   5 +
 .../bindings/interrupt-controller/apple,aic.yaml   |  88 +++
 .../devicetree/bindings/timer/arm,arch_timer.yaml  |  19 +
 .../devicetree/bindings/vendor-prefixes.yaml   |   2 +
 Documentation/driver-api/device-io.rst | 356 +
 Documentation/driver-api/driver-model/devres.rst   |   1 +
 MAINTAINERS|  14 +
 arch/arm64/Kconfig.platforms   |   7 +
 arch/arm64/boot/dts/Makefile   |   1 +
 arch/arm64/boot/dts/apple/Makefile |   2 +
 arch/arm64/boot/dts/apple/t8103-j274.dts   |  45 ++
 arch/arm64/boot/dts/apple/t8103.dtsi   | 135 
 arch/arm64/configs/defconfig   |   1 +
 arch/arm64/include/asm/cputype.h   |   6 +
 arch/arm64/include/asm/io.h|  11 +-
 arch/arm64/include/asm/sysreg.h|  60 ++
 arch/sparc/include/asm/io_64.h |   4 +
 drivers/clocksource/arm_arch_timer.c   |  24 +-
 drivers/irqchip/Kconfig|   8 +
 drivers/irqchip/Makefile   |   1 +
 drivers/irqchip/irq-apple-aic.c| 852 +
 drivers/of/address.c   |  43 +-
 include/asm-generic/io.h   |  22 +-
 include/asm-generic/iomap.h|   9 +
 include/clocksource/arm_arch_timer.h   |   1 +
 .../dt-bindings/interrupt-controller/apple-aic.h   |  15 +
 include/linux/cpuhotplug.h |   1 +
 include/linux/io.h |  18 +-
 include/linux/ioport.h |   1 +
 include/linux/irqchip/arm-gic-v3.h |  56 --
 lib/devres.c   |  22 +
 33 files changed, 1816 insertions(+), 80 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/apple.yaml
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
 create mode 100644 arch/arm64/boot/dts/apple/Makefile
 create mode 100644 arch/arm64/boot/dts/apple/t8103-j274.dts
 create mode 100644 arch/arm64/boot/dts/apple/t8103.dtsi
 create mode 100644 drivers/irqchip/irq-apple-aic.c
 create mode 100644 include/dt-bindings/interrupt-controller/apple-aic.h


Re: [PATCH v4 15/18] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-04-08 Thread Hector Martin

On 08/04/2021 06.09, Will Deacon wrote:

Couple of stale comment nits:


[...]


But with that:

Acked-by: Will Deacon 


Fixed those for the PR, thanks!

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v4 11/18] asm-generic/io.h: implement pci_remap_cfgspace using ioremap_np

2021-04-08 Thread Hector Martin

On 08/04/2021 06.03, Will Deacon wrote:

I would rewrite above as

void __iomem *ret;

ret = ioremap_np(offset, size);
if (ret)
   return ret;

return ioremap(offset, size);


Looks like it might be one of those rare occasions where the GCC ternary if
extension thingy comes in handy:

return ioremap_np(offset, size) ?: ioremap(offset, size);


Today I learned that this one is kosher in kernel code. Handy! Let's go 
with that.
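
For readers who have not met the extension: in GNU C, `a ?: b` behaves like `a ? a : b` but evaluates `a` only once, which matters when `a` is a call with side effects. Below is a minimal userspace sketch of the fallback pattern discussed above. `try_map_np()`, `map_default()` and `map_cfgspace()` are hypothetical stand-ins invented for this illustration, not the kernel's ioremap_np(), ioremap() or pci_remap_cfgspace().

```c
#include <assert.h>
#include <stddef.h>

static int np_calls;		/* how many times the first operand ran */
static char np_window[16];	/* pretend non-posted mapping */
static char default_window[16];	/* pretend default mapping */

/* Hypothetical stand-in for ioremap_np(): NULL when unsupported. */
static void *try_map_np(int supported)
{
	np_calls++;
	return supported ? np_window : NULL;
}

/* Hypothetical stand-in for ioremap(). */
static void *map_default(void)
{
	return default_window;
}

/* GNU C conditional with omitted middle operand (GCC and Clang). */
static void *map_cfgspace(int np_supported)
{
	return try_map_np(np_supported) ?: map_default();
}

/* Usage: the first operand is evaluated exactly once per call. */
static void elvis_selfcheck(void)
{
	np_calls = 0;
	assert(map_cfgspace(1) == np_window);
	assert(map_cfgspace(0) == default_window);
	assert(np_calls == 2);
}
```

The expanded if/return form quoted above behaves the same way; the one-liner simply avoids the temporary.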



Acked-by: Will Deacon 


Thanks!

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v4 15/18] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-04-06 Thread Hector Martin

On 07/04/2021 03.16, Marc Zyngier wrote:

Hi Hector,

On Fri, 02 Apr 2021 10:05:39 +0100,
Hector Martin  wrote:

+   /*
+* In EL1 the non-redirected registers are the guest's,
+* not EL2's, so remap the hwirqs to match.
+*/
+   if (!is_kernel_in_hyp_mode()) {
+   switch (fwspec->param[1]) {
+   case AIC_TMR_GUEST_PHYS:
+   *hwirq = ic->nr_hw + AIC_TMR_HV_PHYS;
+   break;
+   case AIC_TMR_GUEST_VIRT:
+   *hwirq = ic->nr_hw + AIC_TMR_HV_VIRT;
+   break;
+   case AIC_TMR_HV_PHYS:
+   case AIC_TMR_HV_VIRT:
+   return -ENOENT;
+   default:
+   break;
+   }
+   }


Urgh, this is nasty. You are internally remapping the hwirq from one
timer to another in order to avoid accessing the enable register
which happens to be an EL2 only register?


The remapping is to make the IRQs route properly at all.

There are EL2 and EL0 timers, and on GIC each timer goes to its own IRQ. 
But here there are no real IRQs, everything's a FIQ. However, thanks to 
VHE, the EL2 timer shows up as the EL0 timer, and the EL0 timer is 
accessed via EL02 registers, when in EL2. So in EL2/VHE mode, "HV" means 
EL0 and "guest" means EL02, while in EL1, there is no HV and "guest" 
means EL0. And since we figure out which IRQ fired by reading timer 
registers, this is what matters. So I map the guest IRQs to the HV 
hwirqs in EL1 mode, which makes this all work out. Then the timer code 
goes and ends up undoing all this logic again, so we map to separate 
fake "IRQs" only to end up right back at using the same timer registers 
anyway :-)


Really, the ugliness here is that the constant meaning is overloaded. In 
fwspec context they mean what they say on the tin, while in hwirq 
context "HV" means EL0 and "guest" means EL02 (other FIQs would be 
passed through unchanged). Perhaps some additional defines might help 
clarify this? Say, at the top of this file (not in the binding),


/*
 * Pass-through mapping from real timers to the correct registers to
 * access them in EL2/VHE mode. When running in EL1, this gets
 * overridden to access the guest timer using EL0 registers.
 */
#define AIC_TMR_EL0_PHYS AIC_TMR_HV_PHYS
#define AIC_TMR_EL0_VIRT AIC_TMR_HV_VIRT
#define AIC_TMR_EL02_PHYS AIC_TMR_GUEST_PHYS
#define AIC_TMR_EL02_VIRT AIC_TMR_GUEST_VIRT

Then the irqchip/FIQ dispatch side can use the EL* constants, the 
default pass-through mapping is appropriate for VHE/EL2 mode, and 
translation can adjust it for EL1 mode.
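
To make the proposal concrete, here is a rough userspace sketch of the translation step with the extra defines in place. It is only an illustration under assumptions: `aic_fiq_translate()` and its `in_hyp_mode` parameter (standing in for is_kernel_in_hyp_mode()) are made up for this example, and the pass-through default of `nr_hw + fiq` is inferred from the driver comment that FIQ hwirqs are numbered after the true hwirqs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* FIQ timer numbers as they appear in the devicetree binding. */
#define AIC_TMR_HV_PHYS		0
#define AIC_TMR_HV_VIRT		1
#define AIC_TMR_GUEST_PHYS	2
#define AIC_TMR_GUEST_VIRT	3

/* Proposed aliases: which register set serves each hwirq slot. */
#define AIC_TMR_EL0_PHYS	AIC_TMR_HV_PHYS
#define AIC_TMR_EL0_VIRT	AIC_TMR_HV_VIRT
#define AIC_TMR_EL02_PHYS	AIC_TMR_GUEST_PHYS
#define AIC_TMR_EL02_VIRT	AIC_TMR_GUEST_VIRT

/*
 * Hypothetical helper: translate a devicetree FIQ number to a hwirq.
 * Returns 0 and sets *hwirq on success, -ENOENT if the timer cannot
 * be reached from the current exception level.
 */
static int aic_fiq_translate(unsigned int nr_hw, unsigned int fiq,
			     bool in_hyp_mode, unsigned long *hwirq)
{
	unsigned long out = nr_hw + fiq;	/* pass-through (EL2/VHE) */

	if (!in_hyp_mode) {
		switch (fiq) {
		case AIC_TMR_GUEST_PHYS:
			out = nr_hw + AIC_TMR_EL0_PHYS;
			break;
		case AIC_TMR_GUEST_VIRT:
			out = nr_hw + AIC_TMR_EL0_VIRT;
			break;
		case AIC_TMR_HV_PHYS:
		case AIC_TMR_HV_VIRT:
			return -ENOENT;	/* no HV timers when booted in EL1 */
		default:
			break;
		}
	}

	*hwirq = out;
	return 0;
}

/* Usage with 896 hardware IRQs, the count quoted in the driver comment. */
static void translate_selfcheck(void)
{
	unsigned long hwirq = 0;

	assert(aic_fiq_translate(896, AIC_TMR_GUEST_VIRT, true, &hwirq) == 0);
	assert(hwirq == 896 + AIC_TMR_EL02_VIRT);
	assert(aic_fiq_translate(896, AIC_TMR_GUEST_VIRT, false, &hwirq) == 0);
	assert(hwirq == 896 + AIC_TMR_EL0_VIRT);
}
```

With this split, only the translate step knows about fwspec numbering; everything after it deals purely in EL0/EL02 slots.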


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v4 12/18] of/address: Add infrastructure to declare MMIO as non-posted

2021-04-06 Thread Hector Martin

On 07/04/2021 01.47, Rob Herring wrote:

+EXPORT_SYMBOL_GPL(of_mmio_is_nonposted);


Is this needed outside of of/address.c? If not, please make it static
and don't export.


Ah, yes, that was cargo culted from of_dma_is_coherent. Not sure how I 
missed that it's obviously unnecessary. Thanks for pointing it out.




With that,

Reviewed-by: Rob Herring 


Thanks!

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


[PATCH v4 18/18] arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree

2021-04-02 Thread Hector Martin
This currently supports:

* SMP (via spin-tables)
* AIC IRQs
* Serial (with earlycon)
* Framebuffer

A number of properties are dynamic, and based on system firmware
decisions that vary from version to version. These are expected
to be filled in by the loader.

Signed-off-by: Hector Martin 
---
 MAINTAINERS  |   1 +
 arch/arm64/boot/dts/Makefile |   1 +
 arch/arm64/boot/dts/apple/Makefile   |   2 +
 arch/arm64/boot/dts/apple/t8103-j274.dts |  45 
 arch/arm64/boot/dts/apple/t8103.dtsi | 135 +++
 5 files changed, 184 insertions(+)
 create mode 100644 arch/arm64/boot/dts/apple/Makefile
 create mode 100644 arch/arm64/boot/dts/apple/t8103-j274.dts
 create mode 100644 arch/arm64/boot/dts/apple/t8103.dtsi

diff --git a/MAINTAINERS b/MAINTAINERS
index e27332ec1f12..9ac46317840b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1647,6 +1647,7 @@ C:irc://chat.freenode.net/asahi-dev
 T: git https://github.com/AsahiLinux/linux.git
 F: Documentation/devicetree/bindings/arm/apple.yaml
 F: Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
+F: arch/arm64/boot/dts/apple/
 F: drivers/irqchip/irq-apple-aic.c
 F: include/dt-bindings/interrupt-controller/apple-aic.h
 
diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
index f1173cd93594..639e01a4d855 100644
--- a/arch/arm64/boot/dts/Makefile
+++ b/arch/arm64/boot/dts/Makefile
@@ -6,6 +6,7 @@ subdir-y += amazon
 subdir-y += amd
 subdir-y += amlogic
 subdir-y += apm
+subdir-y += apple
 subdir-y += arm
 subdir-y += bitmain
 subdir-y += broadcom
diff --git a/arch/arm64/boot/dts/apple/Makefile b/arch/arm64/boot/dts/apple/Makefile
new file mode 100644
index ..cbbd701ebf05
--- /dev/null
+++ b/arch/arm64/boot/dts/apple/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+dtb-$(CONFIG_ARCH_APPLE) += t8103-j274.dtb
diff --git a/arch/arm64/boot/dts/apple/t8103-j274.dts b/arch/arm64/boot/dts/apple/t8103-j274.dts
new file mode 100644
index ..e0f6775b9878
--- /dev/null
+++ b/arch/arm64/boot/dts/apple/t8103-j274.dts
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0+ OR MIT
+/*
+ * Apple Mac mini (M1, 2020)
+ *
+ * target-type: J274
+ *
+ * Copyright The Asahi Linux Contributors
+ */
+
+/dts-v1/;
+
+#include "t8103.dtsi"
+
+/ {
+   compatible = "apple,j274", "apple,t8103", "apple,arm-platform";
+   model = "Apple Mac mini (M1, 2020)";
+
+   aliases {
+   serial0 = &serial0;
+   };
+
+   chosen {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   stdout-path = "serial0";
+
+   framebuffer0: framebuffer@0 {
+   compatible = "apple,simple-framebuffer", "simple-framebuffer";
+   reg = <0 0 0 0>; /* To be filled by loader */
+   /* Format properties will be added by loader */
+   status = "disabled";
+   };
+   };
+
+   memory@8 {
+   device_type = "memory";
+   reg = <0x8 0 0x2 0>; /* To be filled by loader */
+   };
+};
+
+&serial0 {
+   status = "okay";
+};
diff --git a/arch/arm64/boot/dts/apple/t8103.dtsi b/arch/arm64/boot/dts/apple/t8103.dtsi
new file mode 100644
index ..ff2bcb64bb13
--- /dev/null
+++ b/arch/arm64/boot/dts/apple/t8103.dtsi
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0+ OR MIT
+/*
+ * Apple T8103 "M1" SoC
+ *
+ * Other names: H13G, "Tonga"
+ *
+ * Copyright The Asahi Linux Contributors
+ */
+
+#include 
+#include 
+
+/ {
+   compatible = "apple,t8103", "apple,arm-platform";
+
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   cpus {
+   #address-cells = <2>;
+   #size-cells = <0>;
+
+   cpu0: cpu@0 {
+   compatible = "apple,icestorm";
+   device_type = "cpu";
+   reg = <0x0 0x0>;
+   enable-method = "spin-table";
+   cpu-release-addr = <0 0>; /* To be filled by loader */
+   };
+
+   cpu1: cpu@1 {
+   compatible = "apple,icestorm";
+   device_type = "cpu";
+   reg = <0x0 0x1>;
+   enable-method = "spin-table";
+   cpu-release-addr = <0 0>; /* To be filled by loader */
+   };
+
+   cpu2: cpu@2 {
+   compatible = "apple,icestorm";
+   device_type = "cpu";
+   reg = <0x0 0x2>;
+ 

[PATCH v4 17/18] dt-bindings: display: Add apple,simple-framebuffer

2021-04-02 Thread Hector Martin
Apple SoCs run firmware that sets up a simplefb-compatible framebuffer
for us. Add a compatible for it, and two missing supported formats.

Reviewed-by: Rob Herring 
Reviewed-by: Linus Walleij 
Signed-off-by: Hector Martin 
---
 .../devicetree/bindings/display/simple-framebuffer.yaml  | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/simple-framebuffer.yaml b/Documentation/devicetree/bindings/display/simple-framebuffer.yaml
index eaf8c54fcf50..c2499a7906f5 100644
--- a/Documentation/devicetree/bindings/display/simple-framebuffer.yaml
+++ b/Documentation/devicetree/bindings/display/simple-framebuffer.yaml
@@ -54,6 +54,7 @@ properties:
   compatible:
 items:
   - enum:
+  - apple,simple-framebuffer
   - allwinner,simple-framebuffer
   - amlogic,simple-framebuffer
   - const: simple-framebuffer
@@ -84,9 +85,13 @@ properties:
   Format of the framebuffer:
 * `a8b8g8r8` - 32-bit pixels, d[31:24]=a, d[23:16]=b, d[15:8]=g, d[7:0]=r
 * `r5g6b5` - 16-bit pixels, d[15:11]=r, d[10:5]=g, d[4:0]=b
+* `x2r10g10b10` - 32-bit pixels, d[29:20]=r, d[19:10]=g, d[9:0]=b
+* `x8r8g8b8` - 32-bit pixels, d[23:16]=r, d[15:8]=g, d[7:0]=b
 enum:
   - a8b8g8r8
   - r5g6b5
+  - x2r10g10b10
+  - x8r8g8b8
 
   display:
 $ref: /schemas/types.yaml#/definitions/phandle
-- 
2.30.0
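
The bit layouts of the two formats added by this patch can be spelled out as packing helpers. This is only a sketch to show where each channel lands in the 32-bit pixel; the function names are invented for the illustration and nothing here comes from the simplefb driver.

```c
#include <assert.h>
#include <stdint.h>

/* x8r8g8b8: d[23:16]=r, d[15:8]=g, d[7:0]=b; the top 8 bits are unused. */
static uint32_t pack_x8r8g8b8(uint8_t r, uint8_t g, uint8_t b)
{
	return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* x2r10g10b10: d[29:20]=r, d[19:10]=g, d[9:0]=b; the top 2 bits are unused. */
static uint32_t pack_x2r10g10b10(uint16_t r, uint16_t g, uint16_t b)
{
	return ((uint32_t)(r & 0x3ff) << 20) |
	       ((uint32_t)(g & 0x3ff) << 10) |
	       (uint32_t)(b & 0x3ff);
}

/* Usage: full-intensity red in both formats. */
static void pixel_selfcheck(void)
{
	assert(pack_x8r8g8b8(0xff, 0, 0) == 0x00ff0000u);
	assert(pack_x2r10g10b10(0x3ff, 0, 0) == 0x3ff00000u);
}
```

Channel values wider than 10 bits are masked rather than clamped in the second helper; a real consumer would pick whichever suits its colour pipeline.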



[PATCH v4 16/18] arm64: Kconfig: Introduce CONFIG_ARCH_APPLE

2021-04-02 Thread Hector Martin
This adds a Kconfig option to toggle support for Apple ARM SoCs.
At this time this targets the M1 and later "Apple Silicon" Mac SoCs.

Signed-off-by: Hector Martin 
---
 arch/arm64/Kconfig.platforms | 7 +++
 arch/arm64/configs/defconfig | 1 +
 2 files changed, 8 insertions(+)

diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index cdfd5fed457f..df320a13915a 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -36,6 +36,13 @@ config ARCH_ALPINE
  This enables support for the Annapurna Labs Alpine
  Soc family.
 
+config ARCH_APPLE
+   bool "Apple Silicon SoC family"
+   select APPLE_AIC
+   help
+ This enables support for Apple's in-house ARM SoC family, starting
+ with the Apple M1.
+
 config ARCH_BCM2835
bool "Broadcom BCM2835 family"
select TIMER_OF
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index d612f633b771..54fb257e55f7 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -31,6 +31,7 @@ CONFIG_ARCH_ACTIONS=y
 CONFIG_ARCH_AGILEX=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_ARCH_ALPINE=y
+CONFIG_ARCH_APPLE=y
 CONFIG_ARCH_BCM2835=y
 CONFIG_ARCH_BCM4908=y
 CONFIG_ARCH_BCM_IPROC=y
-- 
2.30.0



[PATCH v4 15/18] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-04-02 Thread Hector Martin
This is the root interrupt controller used on Apple ARM SoCs such as the
M1. This irqchip driver performs multiple functions:

* Handles both IRQs and FIQs

* Drives the AIC peripheral itself (which handles IRQs)

* Dispatches FIQs to downstream hard-wired clients (currently the ARM
  timer).

* Implements a virtual IPI multiplexer to funnel multiple Linux IPIs
  into a single hardware IPI
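
The virtual IPI multiplexer in the last bullet can be pictured with a small userspace model. This is a simplified sketch, not the driver's actual code: the names, the four-CPU limit and the plain `bool` standing in for the hardware IPI line are all assumptions made for the illustration, and the real driver has to deal with ordering and masking details that are left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS		4	/* illustrative */
#define NR_VIPIS	32	/* one bit per virtual IPI */

static _Atomic uint32_t vipi_pending[NR_CPUS];	/* per-CPU pending bits */
static bool hw_ipi_line[NR_CPUS];		/* models the one hardware IPI */
static unsigned int handled[NR_CPUS][NR_VIPIS];	/* dispatch counters */

/* Sender: record which virtual IPI is wanted, then raise the hardware IPI. */
static void vipi_send(unsigned int cpu, unsigned int vipi)
{
	atomic_fetch_or(&vipi_pending[cpu], UINT32_C(1) << vipi);
	hw_ipi_line[cpu] = true;
}

/*
 * Receiver: ack the hardware IPI, atomically take all pending bits and
 * dispatch each one. Returns the number of virtual IPIs handled.
 */
static unsigned int vipi_handle(unsigned int cpu)
{
	uint32_t pending;
	unsigned int vipi, count = 0;

	hw_ipi_line[cpu] = false;
	pending = atomic_exchange(&vipi_pending[cpu], 0);

	for (vipi = 0; vipi < NR_VIPIS; vipi++) {
		if (pending & (UINT32_C(1) << vipi)) {
			handled[cpu][vipi]++;
			count++;
		}
	}
	return count;
}

/* Usage: two different virtual IPIs arrive through one hardware IPI. */
static void vipi_selfcheck(void)
{
	vipi_send(2, 1);
	vipi_send(2, 7);
	assert(vipi_handle(2) == 2);
}
```

Acking before reading the pending word means a send that races with the handler re-raises the line instead of being lost.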

Signed-off-by: Hector Martin 
---
 MAINTAINERS |   2 +
 drivers/irqchip/Kconfig |   8 +
 drivers/irqchip/Makefile|   1 +
 drivers/irqchip/irq-apple-aic.c | 837 
 include/linux/cpuhotplug.h  |   1 +
 5 files changed, 849 insertions(+)
 create mode 100644 drivers/irqchip/irq-apple-aic.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b26a7e23c512..e27332ec1f12 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1647,6 +1647,8 @@ C:irc://chat.freenode.net/asahi-dev
 T: git https://github.com/AsahiLinux/linux.git
 F: Documentation/devicetree/bindings/arm/apple.yaml
 F: Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
+F: drivers/irqchip/irq-apple-aic.c
+F: include/dt-bindings/interrupt-controller/apple-aic.h
 
 ARM/ARTPEC MACHINE SUPPORT
 M: Jesper Nilsson 
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 15536e321df5..d3a14f304ec8 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -577,4 +577,12 @@ config MST_IRQ
help
  Support MStar Interrupt Controller.
 
+config APPLE_AIC
+   bool "Apple Interrupt Controller (AIC)"
+   depends on ARM64
+   default ARCH_APPLE
+   help
+ Support for the Apple Interrupt Controller found on Apple Silicon SoCs,
+ such as the M1.
+
 endmenu
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index c59b95a0532c..eb6a515f0f64 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -113,3 +113,4 @@ obj-$(CONFIG_LOONGSON_PCH_MSI)  += irq-loongson-pch-msi.o
 obj-$(CONFIG_MST_IRQ)  += irq-mst-intc.o
 obj-$(CONFIG_SL28CPLD_INTC)+= irq-sl28cpld.o
 obj-$(CONFIG_MACH_REALTEK_RTL) += irq-realtek-rtl.o
+obj-$(CONFIG_APPLE_AIC)+= irq-apple-aic.o
diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
new file mode 100644
index ..ed16b6cc00d7
--- /dev/null
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -0,0 +1,837 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright The Asahi Linux Contributors
+ *
+ * Based on irq-lpc32xx:
+ *   Copyright 2015-2016 Vladimir Zapolskiy 
+ * Based on irq-bcm2836:
+ *   Copyright 2015 Broadcom
+ */
+
+/*
+ * AIC is a fairly simple interrupt controller with the following features:
+ *
+ * - 896 level-triggered hardware IRQs
+ *   - Single mask bit per IRQ
+ *   - Per-IRQ affinity setting
+ *   - Automatic masking on event delivery (auto-ack)
+ *   - Software triggering (ORed with hw line)
+ * - 2 per-CPU IPIs (meant as "self" and "other", but they are
+ *   interchangeable if not symmetric)
+ * - Automatic prioritization (single event/ack register per CPU, lower IRQs =
+ *   higher priority)
+ * - Automatic masking on ack
+ * - Default "this CPU" register view and explicit per-CPU views
+ *
+ * In addition, this driver also handles FIQs, as these are routed to the same
+ * IRQ vector. These are used for Fast IPIs (TODO), the ARMv8 timer IRQs, and
+ * performance counters (TODO).
+ *
+ * Implementation notes:
+ *
+ * - This driver creates two IRQ domains, one for HW IRQs and internal FIQs,
+ *   and one for IPIs.
+ * - Since Linux needs more than 2 IPIs, we implement a software IRQ controller
+ *   and funnel all IPIs into one per-CPU IPI (the second "self" IPI is unused).
+ * - FIQ hwirq numbers are assigned after true hwirqs, and are per-cpu.
+ * - DT bindings use 3-cell form (like GIC):
+ *   - <0 nr flags> - hwirq #nr
+ *   - <1 nr flags> - FIQ #nr
+ * - nr=0  Physical HV timer
+ * - nr=1  Virtual HV timer
+ * - nr=2  Physical guest timer
+ * - nr=3  Virtual guest timer
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/*
+ * AIC registers (MMIO)
+ */
+
+#define AIC_INFO   0x0004
+#define AIC_INFO_NR_HW GENMASK(15, 0)
+
+#define AIC_CONFIG 0x0010
+
+#define AIC_WHOAMI 0x2000
+#define AIC_EVENT  0x2004
+#define AIC_EVENT_TYPE GENMASK(31, 16)
+#define AIC_EVENT_NUM  GENMASK(15, 0)
+
+#define AIC_EVENT_TYPE_HW  1
+#define AIC_EVENT_TYPE_IPI 4
+#define AIC_EVENT_IPI_OTHER 1
+#define AIC_EVENT_IPI_SELF 2
+
+#define AIC_IPI_SEND   0x2008
+#define AIC_IPI_ACK 0x200c
+#define AIC_IPI

[PATCH v4 14/18] dt-bindings: interrupt-controller: Add DT bindings for apple-aic

2021-04-02 Thread Hector Martin
AIC is the Apple Interrupt Controller found on Apple ARM SoCs, such as
the M1.

Reviewed-by: Linus Walleij 
Reviewed-by: Rob Herring 
Signed-off-by: Hector Martin 
---
 .../interrupt-controller/apple,aic.yaml   | 88 +++
 MAINTAINERS   |  1 +
 .../interrupt-controller/apple-aic.h  | 15 
 3 files changed, 104 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
 create mode 100644 include/dt-bindings/interrupt-controller/apple-aic.h

diff --git a/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml b/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
new file mode 100644
index ..cf6c091a07b1
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
@@ -0,0 +1,88 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/apple,aic.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Apple Interrupt Controller
+
+maintainers:
+  - Hector Martin 
+
+description: |
+  The Apple Interrupt Controller is a simple interrupt controller present on
+  Apple ARM SoC platforms, including various iPhone and iPad devices and the
+  "Apple Silicon" Macs.
+
+  It provides the following features:
+
+  - Level-triggered hardware IRQs wired to SoC blocks
+- Single mask bit per IRQ
+- Per-IRQ affinity setting
+- Automatic masking on event delivery (auto-ack)
+- Software triggering (ORed with hw line)
+  - 2 per-CPU IPIs (meant as "self" and "other", but they are interchangeable
+if not symmetric)
+  - Automatic prioritization (single event/ack register per CPU, lower IRQs =
+higher priority)
+  - Automatic masking on ack
+  - Default "this CPU" register view and explicit per-CPU views
+
+  This device also represents the FIQ interrupt sources on platforms using AIC,
+  which do not go through a discrete interrupt controller.
+
+allOf:
+  - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+  compatible:
+items:
+  - const: apple,t8103-aic
+  - const: apple,aic
+
+  interrupt-controller: true
+
+  '#interrupt-cells':
+const: 3
+description: |
+  The 1st cell contains the interrupt type:
+- 0: Hardware IRQ
+- 1: FIQ
+
+  The 2nd cell contains the interrupt number.
+- HW IRQs: interrupt number
+- FIQs:
+  - 0: physical HV timer
+  - 1: virtual HV timer
+  - 2: physical guest timer
+  - 3: virtual guest timer
+
+  The 3rd cell contains the interrupt flags. This is normally
+  IRQ_TYPE_LEVEL_HIGH (4).
+
+  reg:
+description: |
+  Specifies base physical address and size of the AIC registers.
+maxItems: 1
+
+required:
+  - compatible
+  - '#interrupt-cells'
+  - interrupt-controller
+  - reg
+
+additionalProperties: false
+
+examples:
+  - |
+soc {
+#address-cells = <2>;
+#size-cells = <2>;
+
+aic: interrupt-controller@23b10 {
+compatible = "apple,t8103-aic", "apple,aic";
+#interrupt-cells = <3>;
+interrupt-controller;
+reg = <0x2 0x3b10 0x0 0x8000>;
+};
+};
diff --git a/MAINTAINERS b/MAINTAINERS
index bee9a57e6cec..b26a7e23c512 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1646,6 +1646,7 @@ B:https://github.com/AsahiLinux/linux/issues
 C: irc://chat.freenode.net/asahi-dev
 T: git https://github.com/AsahiLinux/linux.git
 F: Documentation/devicetree/bindings/arm/apple.yaml
+F: Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
 
 ARM/ARTPEC MACHINE SUPPORT
 M: Jesper Nilsson 
diff --git a/include/dt-bindings/interrupt-controller/apple-aic.h b/include/dt-bindings/interrupt-controller/apple-aic.h
new file mode 100644
index ..604f2bb30ac0
--- /dev/null
+++ b/include/dt-bindings/interrupt-controller/apple-aic.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0+ OR MIT */
+#ifndef _DT_BINDINGS_INTERRUPT_CONTROLLER_APPLE_AIC_H
+#define _DT_BINDINGS_INTERRUPT_CONTROLLER_APPLE_AIC_H
+
+#include 
+
+#define AIC_IRQ 0
+#define AIC_FIQ 1
+
+#define AIC_TMR_HV_PHYS 0
+#define AIC_TMR_HV_VIRT 1
+#define AIC_TMR_GUEST_PHYS 2
+#define AIC_TMR_GUEST_VIRT 3
+
+#endif
-- 
2.30.0



[PATCH v4 13/18] arm64: Move ICH_ sysreg bits from arm-gic-v3.h to sysreg.h

2021-04-02 Thread Hector Martin
These definitions are in arm-gic-v3.h for historical reasons which no
longer apply. Move them to sysreg.h so the AIC driver can use them, as
it needs to peek into vGIC registers to deal with the GIC maintenance
interrupt.

Acked-by: Marc Zyngier 
Acked-by: Will Deacon 
Signed-off-by: Hector Martin 
---
 arch/arm64/include/asm/sysreg.h| 60 ++
 include/linux/irqchip/arm-gic-v3.h | 56 
 2 files changed, 60 insertions(+), 56 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index d4a5fca984c3..609dc42ec8c8 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1032,6 +1032,66 @@
 #define TRFCR_ELx_ExTRE BIT(1)
 #define TRFCR_ELx_E0TRE BIT(0)
 
+
+/* GIC Hypervisor interface registers */
+/* ICH_MISR_EL2 bit definitions */
+#define ICH_MISR_EOI   (1 << 0)
+#define ICH_MISR_U (1 << 1)
+
+/* ICH_LR*_EL2 bit definitions */
+#define ICH_LR_VIRTUAL_ID_MASK ((1ULL << 32) - 1)
+
+#define ICH_LR_EOI (1ULL << 41)
+#define ICH_LR_GROUP   (1ULL << 60)
+#define ICH_LR_HW  (1ULL << 61)
+#define ICH_LR_STATE   (3ULL << 62)
+#define ICH_LR_PENDING_BIT (1ULL << 62)
+#define ICH_LR_ACTIVE_BIT  (1ULL << 63)
+#define ICH_LR_PHYS_ID_SHIFT   32
+#define ICH_LR_PHYS_ID_MASK (0x3ffULL << ICH_LR_PHYS_ID_SHIFT)
+#define ICH_LR_PRIORITY_SHIFT  48
+#define ICH_LR_PRIORITY_MASK   (0xffULL << ICH_LR_PRIORITY_SHIFT)
+
+/* ICH_HCR_EL2 bit definitions */
+#define ICH_HCR_EN (1 << 0)
+#define ICH_HCR_UIE (1 << 1)
+#define ICH_HCR_NPIE   (1 << 3)
+#define ICH_HCR_TC (1 << 10)
+#define ICH_HCR_TALL0  (1 << 11)
+#define ICH_HCR_TALL1  (1 << 12)
+#define ICH_HCR_EOIcount_SHIFT 27
+#define ICH_HCR_EOIcount_MASK  (0x1f << ICH_HCR_EOIcount_SHIFT)
+
+/* ICH_VMCR_EL2 bit definitions */
+#define ICH_VMCR_ACK_CTL_SHIFT 2
+#define ICH_VMCR_ACK_CTL_MASK  (1 << ICH_VMCR_ACK_CTL_SHIFT)
+#define ICH_VMCR_FIQ_EN_SHIFT  3
+#define ICH_VMCR_FIQ_EN_MASK   (1 << ICH_VMCR_FIQ_EN_SHIFT)
+#define ICH_VMCR_CBPR_SHIFT 4
+#define ICH_VMCR_CBPR_MASK (1 << ICH_VMCR_CBPR_SHIFT)
+#define ICH_VMCR_EOIM_SHIFT 9
+#define ICH_VMCR_EOIM_MASK (1 << ICH_VMCR_EOIM_SHIFT)
+#define ICH_VMCR_BPR1_SHIFT 18
+#define ICH_VMCR_BPR1_MASK (7 << ICH_VMCR_BPR1_SHIFT)
+#define ICH_VMCR_BPR0_SHIFT 21
+#define ICH_VMCR_BPR0_MASK (7 << ICH_VMCR_BPR0_SHIFT)
+#define ICH_VMCR_PMR_SHIFT 24
+#define ICH_VMCR_PMR_MASK  (0xffUL << ICH_VMCR_PMR_SHIFT)
+#define ICH_VMCR_ENG0_SHIFT 0
+#define ICH_VMCR_ENG0_MASK (1 << ICH_VMCR_ENG0_SHIFT)
+#define ICH_VMCR_ENG1_SHIFT 1
+#define ICH_VMCR_ENG1_MASK (1 << ICH_VMCR_ENG1_SHIFT)
+
+/* ICH_VTR_EL2 bit definitions */
+#define ICH_VTR_PRI_BITS_SHIFT 29
+#define ICH_VTR_PRI_BITS_MASK  (7 << ICH_VTR_PRI_BITS_SHIFT)
+#define ICH_VTR_ID_BITS_SHIFT  23
+#define ICH_VTR_ID_BITS_MASK   (7 << ICH_VTR_ID_BITS_SHIFT)
+#define ICH_VTR_SEIS_SHIFT 22
+#define ICH_VTR_SEIS_MASK  (1 << ICH_VTR_SEIS_SHIFT)
+#define ICH_VTR_A3V_SHIFT  21
+#define ICH_VTR_A3V_MASK   (1 << ICH_VTR_A3V_SHIFT)
+
 #ifdef __ASSEMBLY__
 
.irp num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index f6d092fdb93d..81cbf85f73de 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -575,67 +575,11 @@
 #define ICC_SRE_EL1_DFB (1U << 1)
 #define ICC_SRE_EL1_SRE (1U << 0)
 
-/*
- * Hypervisor interface registers (SRE only)
- */
-#define ICH_LR_VIRTUAL_ID_MASK ((1ULL << 32) - 1)
-
-#define ICH_LR_EOI (1ULL << 41)
-#define ICH_LR_GROUP   (1ULL << 60)
-#define ICH_LR_HW  (1ULL << 61)
-#define ICH_LR_STATE   (3ULL << 62)
-#define ICH_LR_PENDING_BIT (1ULL << 62)
-#define ICH_LR_ACTIVE_BIT  (1ULL << 63)
-#define ICH_LR_PHYS_ID_SHIFT   32
-#define ICH_LR_PHYS_ID_MASK (0x3ffULL << ICH_LR_PHYS_ID_SHIFT)
-#define ICH_LR_PRIORITY_SHIFT  48
-#define ICH_LR_PRIORITY_MASK   (0xffULL << ICH_LR_PRIORITY_SHIFT)
-
 /* These are for GICv2 emulation only */
 #define GICH_LR_VIRTUALID  (0x3ffUL << 0)
 #define GICH_LR_PHYSID_CPUID_SHIFT (10)
 #define GICH_LR_PHYSID_CPUID   (7UL << GICH_LR_PHYSID_CPUID_SHIFT)
 
-#define ICH_MISR_EOI   (1 << 0)
-#define ICH_MISR_U (1 <<

[PATCH v4 12/18] of/address: Add infrastructure to declare MMIO as non-posted

2021-04-02 Thread Hector Martin
This implements the 'nonposted-mmio' boolean property. Placing this
property in a bus marks all direct child devices as requiring
non-posted MMIO mappings. If no such property is found, the default
is posted MMIO.

of_mmio_is_nonposted() performs this check to determine if a given
device has requested non-posted MMIO.

of_address_to_resource() uses this to set the IORESOURCE_MEM_NONPOSTED
flag on resources that require non-posted MMIO.

of_iomap() and of_io_request_and_map() then use this flag to pick the
correct ioremap() variant.

This mechanism is currently restricted to builds that support Apple ARM
platforms, as an optimization.

Reviewed-by: Linus Walleij 
Signed-off-by: Hector Martin 
---
 drivers/of/address.c   | 43 --
 include/linux/of_address.h |  1 +
 2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 73ddf2540f3f..6485cc536e81 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -847,6 +847,9 @@ static int __of_address_to_resource(struct device_node *dev,
return -EINVAL;
memset(r, 0, sizeof(struct resource));
 
+   if (of_mmio_is_nonposted(dev))
+   flags |= IORESOURCE_MEM_NONPOSTED;
+
r->start = taddr;
r->end = taddr + size - 1;
r->flags = flags;
@@ -896,7 +899,10 @@ void __iomem *of_iomap(struct device_node *np, int index)
if (of_address_to_resource(np, index, &res))
return NULL;
 
-   return ioremap(res.start, resource_size(&res));
+   if (res.flags & IORESOURCE_MEM_NONPOSTED)
+   return ioremap_np(res.start, resource_size(&res));
+   else
+   return ioremap(res.start, resource_size(&res));
 }
 EXPORT_SYMBOL(of_iomap);
 
@@ -928,7 +934,11 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
if (!request_mem_region(res.start, resource_size(&res), name))
return IOMEM_ERR_PTR(-EBUSY);
 
-   mem = ioremap(res.start, resource_size(&res));
+   if (res.flags & IORESOURCE_MEM_NONPOSTED)
+   mem = ioremap_np(res.start, resource_size(&res));
+   else
+   mem = ioremap(res.start, resource_size(&res));
+
if (!mem) {
release_mem_region(res.start, resource_size(&res));
return IOMEM_ERR_PTR(-ENOMEM);
@@ -1094,3 +1104,32 @@ bool of_dma_is_coherent(struct device_node *np)
return false;
 }
 EXPORT_SYMBOL_GPL(of_dma_is_coherent);
+
+/**
+ * of_mmio_is_nonposted - Check if device uses non-posted MMIO
+ * @np:device node
+ *
+ * Returns true if the "nonposted-mmio" property was found for
+ * the device's bus.
+ *
+ * This is currently only enabled on builds that support Apple ARM devices, as
+ * an optimization.
+ */
+bool of_mmio_is_nonposted(struct device_node *np)
+{
+   struct device_node *parent;
+   bool nonposted;
+
+   if (!IS_ENABLED(CONFIG_ARCH_APPLE))
+   return false;
+
+   parent = of_get_parent(np);
+   if (!parent)
+   return false;
+
+   nonposted = of_property_read_bool(parent, "nonposted-mmio");
+
+   of_node_put(parent);
+   return nonposted;
+}
+EXPORT_SYMBOL_GPL(of_mmio_is_nonposted);
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 88bc943405cd..88f6333fee6c 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -62,6 +62,7 @@ extern struct of_pci_range *of_pci_range_parser_one(
struct of_pci_range_parser *parser,
struct of_pci_range *range);
 extern bool of_dma_is_coherent(struct device_node *np);
+extern bool of_mmio_is_nonposted(struct device_node *np);
 #else /* CONFIG_OF_ADDRESS */
 static inline void __iomem *of_io_request_and_map(struct device_node *device,
  int index, const char *name)
-- 
2.30.0
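
One point from the commit message is easy to miss: the property applies to direct children of the bus only, because the lookup consults the immediate parent and does not walk further up the tree. A toy userspace model of that rule follows. The `toy_*` types, functions and flag values are invented for this sketch and are not the kernel's struct device_node, struct resource or IORESOURCE_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_RES_MEM		0x1u	/* hypothetical flag values */
#define TOY_RES_MEM_NONPOSTED	0x2u

struct toy_node {
	const struct toy_node *parent;
	bool nonposted_mmio;	/* models the "nonposted-mmio" property */
};

enum toy_mapping { TOY_MAP_POSTED, TOY_MAP_NONPOSTED };

/* Only the direct parent (the bus) is consulted. */
static bool toy_mmio_is_nonposted(const struct toy_node *np)
{
	return np->parent && np->parent->nonposted_mmio;
}

/* Mirrors the resource step: tag the resource when the bus asks for it. */
static unsigned int toy_resource_flags(const struct toy_node *np)
{
	unsigned int flags = TOY_RES_MEM;

	if (toy_mmio_is_nonposted(np))
		flags |= TOY_RES_MEM_NONPOSTED;
	return flags;
}

/* Mirrors the mapping step: pick the variant from the resource flag. */
static enum toy_mapping toy_pick_mapping(unsigned int flags)
{
	return (flags & TOY_RES_MEM_NONPOSTED) ? TOY_MAP_NONPOSTED
					       : TOY_MAP_POSTED;
}

/* Usage: a device on a non-posted bus versus a device under that device. */
static void nonposted_selfcheck(void)
{
	struct toy_node bus = { NULL, true };
	struct toy_node dev = { &bus, false };
	struct toy_node sub = { &dev, false };

	assert(toy_pick_mapping(toy_resource_flags(&dev)) == TOY_MAP_NONPOSTED);
	assert(toy_pick_mapping(toy_resource_flags(&sub)) == TOY_MAP_POSTED);
}
```

A nested bus that also needs non-posted mappings would therefore have to carry the property itself.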



[PATCH v4 11/18] asm-generic/io.h: implement pci_remap_cfgspace using ioremap_np

2021-04-02 Thread Hector Martin
Now that we have ioremap_np(), we can make pci_remap_cfgspace() default
to it, falling back to ioremap() on platforms where it is not available.

Remove the arm64 implementation, since that is now redundant. Future
cleanups should be able to do the same for other arches, and eventually
make the generic pci_remap_cfgspace() unconditional.

Signed-off-by: Hector Martin 
---
 arch/arm64/include/asm/io.h | 10 --
 include/linux/io.h  | 21 +
 2 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 953b8703af60..7fd836bea7eb 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -171,16 +171,6 @@ extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size);
 #define ioremap_wc(addr, size) __ioremap((addr), (size), __pgprot(PROT_NORMAL_NC))
 #define ioremap_np(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRnE))
 
-/*
- * PCI configuration space mapping function.
- *
- * The PCI specification disallows posted write configuration transactions.
- * Add an arch specific pci_remap_cfgspace() definition that is implemented
- * through nGnRnE device memory attribute as recommended by the ARM v8
- * Architecture reference manual Issue A.k B2.8.2 "Device memory".
- */
-#define pci_remap_cfgspace(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRnE))
-
 /*
  * io{read,write}{16,32,64}be() macros
  */
diff --git a/include/linux/io.h b/include/linux/io.h
index d718354ed3e1..6f6b9233f2c3 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -82,20 +82,25 @@ void devm_memunmap(struct device *dev, void *addr);
 #ifdef CONFIG_PCI
 /*
  * The PCI specifications (Rev 3.0, 3.2.5 "Transaction Ordering and
- * Posting") mandate non-posted configuration transactions. There is
- * no ioremap API in the kernel that can guarantee non-posted write
- * semantics across arches so provide a default implementation for
- * mapping PCI config space that defaults to ioremap(); arches
- * should override it if they have memory mapping implementations that
- * guarantee non-posted writes semantics to make the memory mapping
- * compliant with the PCI specification.
+ * Posting") mandate non-posted configuration transactions. This default
+ * implementation attempts to use the ioremap_np() API to provide this
+ * on arches that support it, and falls back to ioremap() on those that
+ * don't. Overriding this function is deprecated; arches that properly
+ * support non-posted accesses should implement ioremap_np() instead, which
+ * this default implementation can then use to return mappings compliant with
+ * the PCI specification.
  */
 #ifndef pci_remap_cfgspace
 #define pci_remap_cfgspace pci_remap_cfgspace
 static inline void __iomem *pci_remap_cfgspace(phys_addr_t offset,
   size_t size)
 {
-   return ioremap(offset, size);
+   void __iomem *ret = ioremap_np(offset, size);
+
+   if (!ret)
+   ret = ioremap(offset, size);
+
+   return ret;
 }
 #endif
 #endif
-- 
2.30.0
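
The fallback used by the new default pci_remap_cfgspace() can be illustrated with a small userspace model. This is a sketch, not kernel code: all model_* names are hypothetical, the mappings are static objects rather than page table entries, and the plain mapping is simply assumed to succeed.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
enum model_map_kind { MODEL_MAP_POSTED, MODEL_MAP_NONPOSTED };

struct model_mapping {
	enum model_map_kind kind;
};

static struct model_mapping model_posted = { MODEL_MAP_POSTED };
static struct model_mapping model_nonposted = { MODEL_MAP_NONPOSTED };

/* Models ioremap(): assumed to succeed in this sketch. */
static struct model_mapping *model_ioremap(void)
{
	return &model_posted;
}

/* Models ioremap_np(): NULL when the arch has no implementation. */
static struct model_mapping *model_ioremap_np(int arch_has_np)
{
	return arch_has_np ? &model_nonposted : NULL;
}

/* Same shape as the patched helper: try non-posted, then fall back. */
static struct model_mapping *model_pci_remap_cfgspace(int arch_has_np)
{
	struct model_mapping *ret = model_ioremap_np(arch_has_np);

	if (!ret)
		ret = model_ioremap();

	return ret;
}
```

Calling model_pci_remap_cfgspace(1) yields the non-posted mapping and model_pci_remap_cfgspace(0) yields the posted one, which is why callers only ever see a NULL return when even the plain mapping fails.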



[PATCH v4 10/18] arm64: Implement ioremap_np() to map MMIO as nGnRnE

2021-04-02 Thread Hector Martin
This is used on Apple ARM platforms, which require most MMIO
(except PCI devices) to be mapped as nGnRnE.

Acked-by: Marc Zyngier 
Acked-by: Will Deacon 
Signed-off-by: Hector Martin 
---
 arch/arm64/include/asm/io.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 5ea8656a2030..953b8703af60 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -169,6 +169,7 @@ extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size);
 
 #define ioremap(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
 #define ioremap_wc(addr, size) __ioremap((addr), (size), __pgprot(PROT_NORMAL_NC))
+#define ioremap_np(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRnE))
 
 /*
  * PCI configuration space mapping function.
-- 
2.30.0



[PATCH v4 09/18] docs: driver-api: device-io: Document ioremap() variants & access funcs

2021-04-02 Thread Hector Martin
This documents the newly introduced ioremap_np() along with all the
other common ioremap() variants, and some higher-level abstractions
available.

Reviewed-by: Linus Walleij 
Signed-off-by: Hector Martin 
---
 Documentation/driver-api/device-io.rst | 218 +
 1 file changed, 218 insertions(+)

diff --git a/Documentation/driver-api/device-io.rst b/Documentation/driver-api/device-io.rst
index b20864b3ddc7..e9f04b1815d1 100644
--- a/Documentation/driver-api/device-io.rst
+++ b/Documentation/driver-api/device-io.rst
@@ -284,6 +284,224 @@ insl, insw, insb, outsl, outsw, outsb
   first byte in the FIFO register corresponds to the first byte in the memory
   buffer regardless of the architecture.
 
+Device memory mapping modes
+===========================
+
+Some architectures support multiple modes for mapping device memory.
+ioremap_*() variants provide a common abstraction around these
+architecture-specific modes, with a shared set of semantics.
+
+ioremap() is the most common mapping type, and is applicable to typical device
+memory (e.g. I/O registers). Other modes can offer weaker or stronger
+guarantees, if supported by the architecture. From most to least common, they
+are as follows:
+
+ioremap()
+---------
+
+The default mode, suitable for most memory-mapped devices, e.g. control
+registers. Memory mapped using ioremap() has the following characteristics:
+
+* Uncached - CPU-side caches are bypassed, and all reads and writes are handled
+  directly by the device
+* No speculative operations - the CPU may not issue a read or write to this
+  memory, unless the instruction that does so has been reached in committed
+  program flow.
+* No reordering - The CPU may not reorder accesses to this memory mapping with
+  respect to each other. On some architectures, this relies on barriers in
+  readl_relaxed()/writel_relaxed().
+* No repetition - The CPU may not issue multiple reads or writes for a single
+  program instruction.
+* No write-combining - Each I/O operation results in one discrete read or write
+  being issued to the device, and multiple writes are not combined into larger
+  writes. This may or may not be enforced when using __raw I/O accessors or
+  pointer dereferences.
+* Non-executable - The CPU is not allowed to speculate instruction execution
+  from this memory (it probably goes without saying, but you're also not
+  allowed to jump into device memory).
+
+On many platforms and buses (e.g. PCI), writes issued through ioremap()
+mappings are posted, which means that the CPU does not wait for the write to
+actually reach the target device before retiring the write instruction.
+
+On many platforms, I/O accesses must be aligned with respect to the access
+size; failure to do so will result in an exception or unpredictable results.
+
+ioremap_wc()
+------------
+
+Maps I/O memory as normal memory with write combining. Unlike ioremap(),
+
+* The CPU may speculatively issue reads from the device that the program
+  didn't actually execute, and may choose to basically read whatever it wants.
+* The CPU may reorder operations as long as the result is consistent from the
+  program's point of view.
+* The CPU may write to the same location multiple times, even when the program
+  issued a single write.
+* The CPU may combine several writes into a single larger write.
+
+This mode is typically used for video framebuffers, where it can increase
+performance of writes. It can also be used for other blocks of memory in
+devices (e.g. buffers or shared memory), but care must be taken as accesses are
+not guaranteed to be ordered with respect to normal ioremap() MMIO register
+accesses without explicit barriers.
+
+On a PCI bus, it is usually safe to use ioremap_wc() on MMIO areas marked as
+``IORESOURCE_PREFETCH``, but it may not be used on those without the flag.
+For on-chip devices, there is no corresponding flag, but a driver can use
+ioremap_wc() on a device that is known to be safe.
+
+ioremap_wt()
+------------
+
+Maps I/O memory as normal memory with write-through caching. Like ioremap_wc(),
+but also,
+
+* The CPU may cache writes issued to and reads from the device, and serve reads
+  from that cache.
+
+This mode is sometimes used for video framebuffers, where drivers still expect
+writes to reach the device in a timely manner (and not be stuck in the CPU
+cache), but reads may be served from the cache for efficiency. However, it is
+rarely useful these days, as framebuffer drivers usually perform writes only,
+for which ioremap_wc() is more efficient (as it doesn't needlessly trash the
+cache). Most drivers should not use this.
+
+ioremap_np()
+------------
+
+Like ioremap(), but explicitly requests non-posted write semantics. On some
+architectures and buses, ioremap() mappings have posted write semantics, which
+means that writes can appear to "complete" from the point of view of the
+CPU before the written data actually arrives at the target devi

[PATCH v4 08/18] docs: driver-api: device-io: Document I/O access functions

2021-04-02 Thread Hector Martin
From: Arnd Bergmann 

This adds more detailed descriptions of the various read/write
primitives available for use with I/O memory/ports.

Reviewed-by: Linus Walleij 
Signed-off-by: Arnd Bergmann 
Signed-off-by: Hector Martin 
---
 Documentation/driver-api/device-io.rst | 138 +
 1 file changed, 138 insertions(+)

diff --git a/Documentation/driver-api/device-io.rst b/Documentation/driver-api/device-io.rst
index 764963876d08..b20864b3ddc7 100644
--- a/Documentation/driver-api/device-io.rst
+++ b/Documentation/driver-api/device-io.rst
@@ -146,6 +146,144 @@ There are also equivalents to memcpy. The ins() and
 outs() functions copy bytes, words or longs to the given
 port.
 
+__iomem pointer tokens
+======================
+
+The data type for an MMIO address is an ``__iomem`` qualified pointer, such as
+``void __iomem *reg``. On most architectures it is a regular pointer that
+points to a virtual memory address and can be offset or dereferenced, but in
+portable code, it must only be passed from and to functions that explicitly
+operated on an ``__iomem`` token, in particular the ioremap() and
+readl()/writel() functions. The 'sparse' semantic code checker can be used to
+verify that this is done correctly.
+
+While on most architectures, ioremap() creates a page table entry for an
+uncached virtual address pointing to the physical MMIO address, some
+architectures require special instructions for MMIO, and the ``__iomem`` pointer
+just encodes the physical address or an offsettable cookie that is interpreted
+by readl()/writel().
+
+Differences between I/O access functions
+
+
+readq(), readl(), readw(), readb(), writeq(), writel(), writew(), writeb()
+
+  These are the most generic accessors, providing serialization against other
+  MMIO accesses and DMA accesses as well as fixed endianness for accessing
+  little-endian PCI devices and on-chip peripherals. Portable device drivers
+  should generally use these for any access to ``__iomem`` pointers.
+
+  Note that posted writes are not strictly ordered against a spinlock, see
+  Documentation/driver-api/io_ordering.rst.
+
+readq_relaxed(), readl_relaxed(), readw_relaxed(), readb_relaxed(),
+writeq_relaxed(), writel_relaxed(), writew_relaxed(), writeb_relaxed()
+
+  On architectures that require an expensive barrier for serializing against
+  DMA, these "relaxed" versions of the MMIO accessors only serialize against
+  each other, but contain a less expensive barrier operation. A device driver
+  might use these in a particularly performance sensitive fast path, with a
+  comment that explains why the usage in a specific location is safe without
+  the extra barriers.
+
+  See memory-barriers.txt for a more detailed discussion on the precise ordering
+  guarantees of the non-relaxed and relaxed versions.
+
+ioread64(), ioread32(), ioread16(), ioread8(),
+iowrite64(), iowrite32(), iowrite16(), iowrite8()
+
+  These are an alternative to the normal readl()/writel() functions, with almost
+  identical behavior, but they can also operate on ``__iomem`` tokens returned
+  for mapping PCI I/O space with pci_iomap() or ioport_map(). On architectures
+  that require special instructions for I/O port access, this adds a small
+  overhead for an indirect function call implemented in lib/iomap.c, while on
+  other architectures, these are simply aliases.
+
+ioread64be(), ioread32be(), ioread16be()
+iowrite64be(), iowrite32be(), iowrite16be()
+
+  These behave in the same way as the ioread32()/iowrite32() family, but with
+  reversed byte order, for accessing devices with big-endian MMIO registers.
+  Device drivers that can operate on either big-endian or little-endian
+  registers may have to implement a custom wrapper function that picks one or
+  the other depending on which device was found.
+
+  Note: On some architectures, the normal readl()/writel() functions
+  traditionally assume that devices are the same endianness as the CPU, while
+  using a hardware byte-reverse on the PCI bus when running a big-endian kernel.
+  Drivers that use readl()/writel() this way are generally not portable, but
+  tend to be limited to a particular SoC.
+
+hi_lo_readq(), lo_hi_readq(), hi_lo_readq_relaxed(), lo_hi_readq_relaxed(),
+ioread64_lo_hi(), ioread64_hi_lo(), ioread64be_lo_hi(), ioread64be_hi_lo(),
+hi_lo_writeq(), lo_hi_writeq(), hi_lo_writeq_relaxed(), lo_hi_writeq_relaxed(),
+iowrite64_lo_hi(), iowrite64_hi_lo(), iowrite64be_lo_hi(), iowrite64be_hi_lo()
+
+  Some device drivers have 64-bit registers that cannot be accessed atomically
+  on 32-bit architectures but allow two consecutive 32-bit accesses instead.
+  Since it depends on the particular device which of the two halves has to be
+  accessed first, a helper is provided for each combination of 64-bit accessors
+  with either low/high or high/low word ordering. A device driver must include
+  either <linux/io-64-nonatomic-lo-hi.h> or <linux/io-64-nonatomic-hi-lo.h> to
+  get the function definit

[PATCH v4 07/18] asm-generic/io.h: Add a non-posted variant of ioremap()

2021-04-02 Thread Hector Martin
ARM64 currently defaults to posted MMIO (nGnRE), but some devices
require the use of non-posted MMIO (nGnRnE). Introduce a new ioremap()
variant to handle this case. ioremap_np() returns NULL on arches that
do not implement this variant.

sparc64 is the only architecture that needs to be touched directly,
because it includes neither of the generic io.h or iomap.h headers.

This adds the IORESOURCE_MEM_NONPOSTED flag, which maps to this
variant and marks a given resource as requiring non-posted mappings.
This is implemented in the resource system because it is a SoC-level
requirement, so existing drivers do not need special-case code to pick
this ioremap variant.

Then this is implemented in devres by introducing devm_ioremap_np(),
and making devm_ioremap_resource() automatically select this variant
when the resource has the IORESOURCE_MEM_NONPOSTED flag set.

Acked-by: Marc Zyngier 
Signed-off-by: Hector Martin 
---
 .../driver-api/driver-model/devres.rst|  1 +
 arch/sparc/include/asm/io_64.h|  4 
 include/asm-generic/io.h  | 22 ++-
 include/asm-generic/iomap.h   |  9 
 include/linux/io.h|  2 ++
 include/linux/ioport.h|  1 +
 lib/devres.c  | 22 +++
 7 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/Documentation/driver-api/driver-model/devres.rst b/Documentation/driver-api/driver-model/devres.rst
index cd8b6e657b94..2f45877a539d 100644
--- a/Documentation/driver-api/driver-model/devres.rst
+++ b/Documentation/driver-api/driver-model/devres.rst
@@ -309,6 +309,7 @@ IOMAP
   devm_ioremap()
   devm_ioremap_uc()
   devm_ioremap_wc()
+  devm_ioremap_np()
   devm_ioremap_resource() : checks resource, requests memory region, ioremaps
   devm_ioremap_resource_wc()
   devm_platform_ioremap_resource() : calls devm_ioremap_resource() for platform device
diff --git a/arch/sparc/include/asm/io_64.h b/arch/sparc/include/asm/io_64.h
index 9bb27e5c22f1..9fbfc9574432 100644
--- a/arch/sparc/include/asm/io_64.h
+++ b/arch/sparc/include/asm/io_64.h
@@ -409,6 +409,10 @@ static inline void __iomem *ioremap(unsigned long offset, unsigned long size)
 #define ioremap_uc(X,Y) ioremap((X),(Y))
 #define ioremap_wc(X,Y) ioremap((X),(Y))
 #define ioremap_wt(X,Y) ioremap((X),(Y))
+static inline void __iomem *ioremap_np(unsigned long offset, unsigned long size)
+{
+   return NULL;
+}
 
 static inline void iounmap(volatile void __iomem *addr)
 {
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index c6af40ce03be..082e0c96db6e 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -942,7 +942,9 @@ static inline void *phys_to_virt(unsigned long address)
  *
  * ioremap_wc() and ioremap_wt() can provide more relaxed caching attributes
  * for specific drivers if the architecture choses to implement them.  If they
- * are not implemented we fall back to plain ioremap.
+ * are not implemented we fall back to plain ioremap. Conversely, ioremap_np()
+ * can provide stricter non-posted write semantics if the architecture
+ * implements them.
  */
 #ifndef CONFIG_MMU
 #ifndef ioremap
@@ -993,6 +995,24 @@ static inline void __iomem *ioremap_uc(phys_addr_t offset, size_t size)
 {
return NULL;
 }
+
+/*
+ * ioremap_np needs an explicit architecture implementation, as it
+ * requests stronger semantics than regular ioremap(). Portable drivers
+ * should instead use one of the higher-level abstractions, like
+ * devm_ioremap_resource(), to choose the correct variant for any given
+ * device and bus. Portable drivers with a good reason to want non-posted
+ * write semantics should always provide an ioremap() fallback in case
+ * ioremap_np() is not available.
+ */
+#ifndef ioremap_np
+#define ioremap_np ioremap_np
+static inline void __iomem *ioremap_np(phys_addr_t offset, size_t size)
+{
+   return NULL;
+}
+#endif
+
 #endif
 
 #ifdef CONFIG_HAS_IOPORT_MAP
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 649224664969..9b3eb6d86200 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -101,6 +101,15 @@ extern void ioport_unmap(void __iomem *);
 #define ioremap_wt ioremap
 #endif
 
+#ifndef ARCH_HAS_IOREMAP_NP
+/* See the comment in asm-generic/io.h about ioremap_np(). */
+#define ioremap_np ioremap_np
+static inline void __iomem *ioremap_np(phys_addr_t offset, size_t size)
+{
+   return NULL;
+}
+#endif
+
 #ifdef CONFIG_PCI
 /* Destroy a virtual mapping cookie for a PCI BAR (memory or IO) */
 struct pci_dev;
diff --git a/include/linux/io.h b/include/linux/io.h
index 8394c56babc2..d718354ed3e1 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -68,6 +68,8 @@ void __iomem *devm_ioremap_uc(struct device *dev, resource_size_t offset

[PATCH v4 06/18] arm64: arch_timer: Implement support for interrupt-names

2021-04-02 Thread Hector Martin
This allows the devicetree to correctly represent the available set of
timers, which varies from device to device, without the need for fake
dummy interrupts for unavailable slots.

Also add the hyp-virt timer/PPI, which is not currently used, but worth
representing.

Reviewed-by: Tony Lindgren 
Reviewed-by: Linus Walleij 
Reviewed-by: Marc Zyngier 
Signed-off-by: Hector Martin 
---
 drivers/clocksource/arm_arch_timer.c | 24 +---
 include/clocksource/arm_arch_timer.h |  1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index d0177824c518..932f95691e27 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -63,6 +63,14 @@ struct arch_timer {
 static u32 arch_timer_rate;
 static int arch_timer_ppi[ARCH_TIMER_MAX_TIMER_PPI];
 
+static const char *arch_timer_ppi_names[ARCH_TIMER_MAX_TIMER_PPI] = {
+   [ARCH_TIMER_PHYS_SECURE_PPI] = "sec-phys",
+   [ARCH_TIMER_PHYS_NONSECURE_PPI] = "phys",
+   [ARCH_TIMER_VIRT_PPI] = "virt",
+   [ARCH_TIMER_HYP_PPI] = "hyp-phys",
+   [ARCH_TIMER_HYP_VIRT_PPI] = "hyp-virt",
+};
+
 static struct clock_event_device __percpu *arch_timer_evt;
 
 static enum arch_timer_ppi_nr arch_timer_uses_ppi = ARCH_TIMER_VIRT_PPI;
@@ -1280,8 +1288,9 @@ static void __init arch_timer_populate_kvm_info(void)
 
 static int __init arch_timer_of_init(struct device_node *np)
 {
-   int i, ret;
+   int i, irq, ret;
u32 rate;
+   bool has_names;
 
if (arch_timers_present & ARCH_TIMER_TYPE_CP15) {
pr_warn("multiple nodes in dt, skipping\n");
@@ -1289,8 +1298,17 @@ static int __init arch_timer_of_init(struct device_node *np)
}
 
arch_timers_present |= ARCH_TIMER_TYPE_CP15;
-   for (i = ARCH_TIMER_PHYS_SECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++)
-   arch_timer_ppi[i] = irq_of_parse_and_map(np, i);
+
+   has_names = of_property_read_bool(np, "interrupt-names");
+
+   for (i = ARCH_TIMER_PHYS_SECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++) {
+   if (has_names)
+   irq = of_irq_get_byname(np, arch_timer_ppi_names[i]);
+   else
+   irq = of_irq_get(np, i);
+   if (irq > 0)
+   arch_timer_ppi[i] = irq;
+   }
 
arch_timer_populate_kvm_info();
 
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 1d68d5613dae..73c7139c866f 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -32,6 +32,7 @@ enum arch_timer_ppi_nr {
ARCH_TIMER_PHYS_NONSECURE_PPI,
ARCH_TIMER_VIRT_PPI,
ARCH_TIMER_HYP_PPI,
+   ARCH_TIMER_HYP_VIRT_PPI,
ARCH_TIMER_MAX_TIMER_PPI
 };
 
-- 
2.30.0
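
The lookup strategy in arch_timer_of_init() above (by name when "interrupt-names" is present, by position otherwise) can be sketched in userspace C. This is a simplified model under stated assumptions: the model_* identifiers and MODEL_* enumerators are hypothetical, a devicetree node is reduced to parallel arrays, and -1 stands in for the negative errno values the real of_irq_get*() helpers can return.

```c
#include <stddef.h>
#include <string.h>

enum model_ppi {
	MODEL_SEC_PHYS, MODEL_PHYS, MODEL_VIRT,
	MODEL_HYP_PHYS, MODEL_HYP_VIRT, MODEL_MAX_PPI
};

/* Same names as arch_timer_ppi_names[] in the patch. */
static const char *const model_ppi_names[MODEL_MAX_PPI] = {
	[MODEL_SEC_PHYS] = "sec-phys",
	[MODEL_PHYS] = "phys",
	[MODEL_VIRT] = "virt",
	[MODEL_HYP_PHYS] = "hyp-phys",
	[MODEL_HYP_VIRT] = "hyp-virt",
};

/* Hypothetical timer node: interrupts plus optional interrupt-names. */
struct model_timer_node {
	size_t count;
	const int *irqs;
	const char *const *names;	/* NULL when interrupt-names is absent */
};

static int model_irq_by_name(const struct model_timer_node *n, const char *name)
{
	for (size_t i = 0; i < n->count; i++)
		if (strcmp(n->names[i], name) == 0)
			return n->irqs[i];
	return -1;
}

static int model_irq_by_index(const struct model_timer_node *n, size_t idx)
{
	return idx < n->count ? n->irqs[idx] : -1;
}

/* Only valid (positive) interrupts overwrite a slot, as in the patch. */
static void model_fill_ppis(const struct model_timer_node *n, int ppi[MODEL_MAX_PPI])
{
	for (int i = 0; i < MODEL_MAX_PPI; i++) {
		int irq = n->names ? model_irq_by_name(n, model_ppi_names[i])
				   : model_irq_by_index(n, (size_t)i);
		if (irq > 0)
			ppi[i] = irq;
	}
}
```

With names, a node that lists only "phys" and "virt" fills exactly those two slots and leaves the others at zero, so no dummy entries are needed for the timers a platform lacks.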



[PATCH v4 05/18] dt-bindings: timer: arm,arch_timer: Add interrupt-names support

2021-04-02 Thread Hector Martin
Not all platforms provide the same set of timers/interrupts, and Linux
only needs one (plus kvm/guest ones); some platforms are working around
this by using dummy fake interrupts. Implementing interrupt-names allows
the devicetree to specify an arbitrary set of available interrupts, so
the timer code can pick the right one.

This also adds the hyp-virt timer/interrupt, which was previously not
expressed in the fixed 4-interrupt form.

Reviewed-by: Linus Walleij 
Acked-by: Marc Zyngier 
Reviewed-by: Tony Lindgren 
Signed-off-by: Hector Martin 
---
 .../bindings/timer/arm,arch_timer.yaml| 19 +++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml b/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
index 2c75105c1398..7f5e3af58255 100644
--- a/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
+++ b/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
@@ -34,11 +34,30 @@ properties:
   - arm,armv8-timer
 
   interrupts:
+minItems: 1
+maxItems: 5
 items:
   - description: secure timer irq
   - description: non-secure timer irq
   - description: virtual timer irq
   - description: hypervisor timer irq
+  - description: hypervisor virtual timer irq
+
+  interrupt-names:
+oneOf:
+  - minItems: 2
+items:
+  - const: phys
+  - const: virt
+  - const: hyp-phys
+  - const: hyp-virt
+  - minItems: 3
+items:
+  - const: sec-phys
+  - const: phys
+  - const: virt
+  - const: hyp-phys
+  - const: hyp-virt
 
   clock-frequency:
 description: The frequency of the main counter, in Hz. Should be present
-- 
2.30.0



[PATCH v4 04/18] arm64: cputype: Add CPU implementor & types for the Apple M1 cores

2021-04-02 Thread Hector Martin
The implementor will be used to condition the FIQ support quirk.

The specific CPU types are not used at the moment, but let's add them
for documentation purposes.

Acked-by: Will Deacon 
Signed-off-by: Hector Martin 
---
 arch/arm64/include/asm/cputype.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index ef5b040dee44..6231e1f0abe7 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -59,6 +59,7 @@
 #define ARM_CPU_IMP_NVIDIA 0x4E
 #define ARM_CPU_IMP_FUJITSU 0x46
 #define ARM_CPU_IMP_HISI   0x48
+#define ARM_CPU_IMP_APPLE  0x61
 
 #define ARM_CPU_PART_AEM_V8 0xD0F
 #define ARM_CPU_PART_FOUNDATION 0xD00
@@ -99,6 +100,9 @@
 
 #define HISI_CPU_PART_TSV110   0xD01
 
+#define APPLE_CPU_PART_M1_ICESTORM 0x022
+#define APPLE_CPU_PART_M1_FIRESTORM 0x023
+
 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
 #define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
@@ -127,6 +131,8 @@
 #define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
 #define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
 #define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
+#define MIDR_APPLE_M1_ICESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM)
+#define MIDR_APPLE_M1_FIRESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM)
 
 /* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */
 #define MIDR_FUJITSU_ERRATUM_010001 MIDR_FUJITSU_A64FX
-- 
2.30.0
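
For readers unfamiliar with how the implementor and part numbers above combine, the following userspace sketch packs them into a MIDR-style model value. It assumes the architectural MIDR_EL1 layout (implementer in bits 31:24, architecture in 19:16, part number in 15:4) and that the architecture field is 0xf; the model_* and MODEL_* names are hypothetical and the kernel's own MIDR_CPU_MODEL() macro is not reproduced here.

```c
#include <stdint.h>

/* Field positions per the architectural MIDR_EL1 layout. */
#define MODEL_MIDR_IMPLEMENTOR_SHIFT	24
#define MODEL_MIDR_ARCHITECTURE_SHIFT	16
#define MODEL_MIDR_PARTNUM_SHIFT	4

/* Keeps implementer, architecture and part number; drops variant/revision. */
#define MODEL_MIDR_CPU_MODEL_MASK	0xff0ffff0u

/* Values taken from the patch above. */
#define MODEL_IMP_APPLE			0x61u
#define MODEL_PART_M1_ICESTORM		0x022u
#define MODEL_PART_M1_FIRESTORM		0x023u

/* Builds the model identifier with the architecture field set to 0xf. */
static uint32_t model_midr_cpu_model(uint32_t imp, uint32_t partnum)
{
	return (imp << MODEL_MIDR_IMPLEMENTOR_SHIFT) |
	       (0xfu << MODEL_MIDR_ARCHITECTURE_SHIFT) |
	       (partnum << MODEL_MIDR_PARTNUM_SHIFT);
}

/* Compares a raw register value against a model, ignoring variant/revision. */
static int model_midr_is_model(uint32_t raw_midr, uint32_t model)
{
	return (raw_midr & MODEL_MIDR_CPU_MODEL_MASK) == model;
}
```

Under these assumptions the two cores map to 0x610f0220 (Icestorm) and 0x610f0230 (Firestorm), and a raw value that differs only in its variant or revision bits still matches the same model.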



[PATCH v4 03/18] dt-bindings: arm: cpus: Add apple,firestorm & icestorm compatibles

2021-04-02 Thread Hector Martin
These are the CPU cores in the "Apple Silicon" M1 SoC.

Reviewed-by: Rob Herring 
Signed-off-by: Hector Martin 
---
 Documentation/devicetree/bindings/arm/cpus.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/cpus.yaml b/Documentation/devicetree/bindings/arm/cpus.yaml
index 26b886b20b27..c299423dc7cb 100644
--- a/Documentation/devicetree/bindings/arm/cpus.yaml
+++ b/Documentation/devicetree/bindings/arm/cpus.yaml
@@ -85,6 +85,8 @@ properties:
 
   compatible:
 enum:
+  - apple,icestorm
+  - apple,firestorm
   - arm,arm710t
   - arm,arm720t
   - arm,arm740t
-- 
2.30.0



[PATCH v4 02/18] dt-bindings: arm: apple: Add bindings for Apple ARM platforms

2021-04-02 Thread Hector Martin
This introduces bindings for all three 2020 Apple M1 devices:

* apple,j274 - Mac mini (M1, 2020)
* apple,j293 - MacBook Pro (13-inch, M1, 2020)
* apple,j313 - MacBook Air (M1, 2020)

Reviewed-by: Linus Walleij 
Reviewed-by: Rob Herring 
Signed-off-by: Hector Martin 
---
 .../devicetree/bindings/arm/apple.yaml| 64 +++
 MAINTAINERS   | 10 +++
 2 files changed, 74 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/apple.yaml

diff --git a/Documentation/devicetree/bindings/arm/apple.yaml b/Documentation/devicetree/bindings/arm/apple.yaml
new file mode 100644
index 000000000000..1e772c85206c
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/apple.yaml
@@ -0,0 +1,64 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/apple.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Apple ARM Machine Device Tree Bindings
+
+maintainers:
+  - Hector Martin 
+
+description: |
+  ARM platforms using SoCs designed by Apple Inc., branded "Apple Silicon".
+
+  This currently includes devices based on the "M1" SoC, starting with the
+  three Mac models released in late 2020:
+
+  - Mac mini (M1, 2020)
+  - MacBook Pro (13-inch, M1, 2020)
+  - MacBook Air (M1, 2020)
+
+  The compatible property should follow this format:
+
+  compatible = "apple,<targettype>", "apple,<socid>", "apple,arm-platform";
+
+  <targettype> represents the board/device and comes from the `target-type`
+  property of the root node of the Apple Device Tree, lowercased. It can be
+  queried on macOS using the following command:
+
+  $ ioreg -d2 -l | grep target-type
+
+  <socid> is the lowercased SoC ID. Apple uses at least *five* different
+  names for their SoCs:
+
+  - Marketing name ("M1")
+  - Internal name ("H13G")
+  - Codename ("Tonga")
+  - SoC ID ("T8103")
+  - Package/IC part number ("APL1102")
+
+  Devicetrees should use the lowercased SoC ID, to avoid confusion if
+  multiple SoCs share the same marketing name. This can be obtained from
+  the `compatible` property of the arm-io node of the Apple Device Tree,
+  which can be queried as follows on macOS:
+
+  $ ioreg -n arm-io | grep compatible
+
+properties:
+  $nodename:
+const: "/"
+  compatible:
+oneOf:
+  - description: Apple M1 SoC based platforms
+items:
+  - enum:
+  - apple,j274 # Mac mini (M1, 2020)
+  - apple,j293 # MacBook Pro (13-inch, M1, 2020)
+  - apple,j313 # MacBook Air (M1, 2020)
+  - const: apple,t8103
+  - const: apple,arm-platform
+
+additionalProperties: true
+
+...
diff --git a/MAINTAINERS b/MAINTAINERS
index 88ad851fb5da..bee9a57e6cec 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1637,6 +1637,16 @@ F:   arch/arm/mach-alpine/
 F: arch/arm64/boot/dts/amazon/
 F: drivers/*/*alpine*
 
+ARM/APPLE MACHINE SUPPORT
+M: Hector Martin 
+L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
+S: Maintained
+W: https://asahilinux.org
+B: https://github.com/AsahiLinux/linux/issues
+C: irc://chat.freenode.net/asahi-dev
+T: git https://github.com/AsahiLinux/linux.git
+F: Documentation/devicetree/bindings/arm/apple.yaml
+
 ARM/ARTPEC MACHINE SUPPORT
 M: Jesper Nilsson 
 M: Lars Persson 
-- 
2.30.0



[PATCH v4 01/18] dt-bindings: vendor-prefixes: Add apple prefix

2021-04-02 Thread Hector Martin
This is different from the legacy AAPL prefix used on PPC, but
consensus is that we prefer `apple` for these new platforms.

Reviewed-by: Krzysztof Kozlowski 
Reviewed-by: Linus Walleij 
Reviewed-by: Rob Herring 
Signed-off-by: Hector Martin 
---
 Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml b/Documentation/devicetree/bindings/vendor-prefixes.yaml
index f6064d84a424..7b59b6d3f526 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
+++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
@@ -103,6 +103,8 @@ patternProperties:
 description: Anvo-Systems Dresden GmbH
   "^apm,.*":
 description: Applied Micro Circuits Corporation (APM)
+  "^apple,.*":
+description: Apple Inc.
   "^aptina,.*":
 description: Aptina Imaging
   "^arasan,.*":
-- 
2.30.0



[PATCH v4 00/18] Apple M1 SoC platform bring-up

2021-04-02 Thread Hector Martin
.rst
* Removed sysreg_apple.h (defines are in drivers now)
* Many cleanups, bug fixes, and reworks to AIC, including some bug
  fixes from Marc's KVM series, kernel as EL1 support, and more.
* Simplified non-posted MMIO DT handling to only apply to direct
  children of a bus, and only use "nonposted-mmio". Removed quirk
  optimization.
* Implemented default pci_remap_cfgspace() in terms of ioremap_np()
  and removed arch-specific pci_remap_cfgspace override for arm64
  (arm32 can come later)
* Replaced license in AIC bindings header with GPL-2.0+ OR MIT
* Other minor typo/style fixes

Arnd Bergmann (1):
  docs: driver-api: device-io: Document I/O access functions

Hector Martin (17):
  dt-bindings: vendor-prefixes: Add apple prefix
  dt-bindings: arm: apple: Add bindings for Apple ARM platforms
  dt-bindings: arm: cpus: Add apple,firestorm & icestorm compatibles
  arm64: cputype: Add CPU implementor & types for the Apple M1 cores
  dt-bindings: timer: arm,arch_timer: Add interrupt-names support
  arm64: arch_timer: Implement support for interrupt-names
  asm-generic/io.h:  Add a non-posted variant of ioremap()
  docs: driver-api: device-io: Document ioremap() variants & access
funcs
  arm64: Implement ioremap_np() to map MMIO as nGnRnE
  asm-generic/io.h: implement pci_remap_cfgspace using ioremap_np
  of/address: Add infrastructure to declare MMIO as non-posted
  arm64: Move ICH_ sysreg bits from arm-gic-v3.h to sysreg.h
  dt-bindings: interrupt-controller: Add DT bindings for apple-aic
  irqchip/apple-aic: Add support for the Apple Interrupt Controller
  arm64: Kconfig: Introduce CONFIG_ARCH_APPLE
  dt-bindings: display: Add apple,simple-framebuffer
  arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree

 .../devicetree/bindings/arm/apple.yaml       |  64 ++
 .../devicetree/bindings/arm/cpus.yaml        |   2 +
 .../bindings/display/simple-framebuffer.yaml |   5 +
 .../interrupt-controller/apple,aic.yaml      |  88 ++
 .../bindings/timer/arm,arch_timer.yaml       |  19 +
 .../devicetree/bindings/vendor-prefixes.yaml |   2 +
 Documentation/driver-api/device-io.rst       | 356
 .../driver-api/driver-model/devres.rst       |   1 +
 MAINTAINERS                                  |  14 +
 arch/arm64/Kconfig.platforms                 |   7 +
 arch/arm64/boot/dts/Makefile                 |   1 +
 arch/arm64/boot/dts/apple/Makefile           |   2 +
 arch/arm64/boot/dts/apple/t8103-j274.dts     |  45 +
 arch/arm64/boot/dts/apple/t8103.dtsi         | 135 +++
 arch/arm64/configs/defconfig                 |   1 +
 arch/arm64/include/asm/cputype.h             |   6 +
 arch/arm64/include/asm/io.h                  |  11 +-
 arch/arm64/include/asm/sysreg.h              |  60 ++
 arch/sparc/include/asm/io_64.h               |   4 +
 drivers/clocksource/arm_arch_timer.c         |  24 +-
 drivers/irqchip/Kconfig                      |   8 +
 drivers/irqchip/Makefile                     |   1 +
 drivers/irqchip/irq-apple-aic.c              | 837 ++
 drivers/of/address.c                         |  43 +-
 include/asm-generic/io.h                     |  22 +-
 include/asm-generic/iomap.h                  |   9 +
 include/clocksource/arm_arch_timer.h         |   1 +
 .../interrupt-controller/apple-aic.h         |  15 +
 include/linux/cpuhotplug.h                   |   1 +
 include/linux/io.h                           |  23 +-
 include/linux/ioport.h                       |   1 +
 include/linux/irqchip/arm-gic-v3.h           |  56 --
 include/linux/of_address.h                   |   1 +
 lib/devres.c                                 |  22 +
 34 files changed, 1807 insertions(+), 80 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/apple.yaml
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
 create mode 100644 arch/arm64/boot/dts/apple/Makefile
 create mode 100644 arch/arm64/boot/dts/apple/t8103-j274.dts
 create mode 100644 arch/arm64/boot/dts/apple/t8103.dtsi
 create mode 100644 drivers/irqchip/irq-apple-aic.c
 create mode 100644 include/dt-bindings/interrupt-controller/apple-aic.h

--
2.30.0



Re: [RFT PATCH v3 16/27] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-04-01 Thread Hector Martin

Hi Will,

On 29/03/2021 21.04, Will Deacon wrote:

One CPU still needs to be able to mutate the flags of another CPU to fire an
IPI; AIUI the per-cpu ops are *not* atomic for concurrent access by multiple
CPUs, and in fact there is no API for that, only for "this CPU".


Huh, I really thought we had an API for that, but you're right. Oh well! But
I'd still suggest a per-cpu atomic_t in that case, rather than the array.


Yeah, after digging into the per-cpu stuff earlier and understanding how 
it works, I agree that a per-cpu atomic makes sense here. Switched it to 
that (which simplified out a bunch of smp_processor_id() calls too). Thanks!



I think a more idiomatic (and portable) way to do this would be to use
the relaxed accessors, but with smp_mb__after_atomic() between them. Do you
have a good reason for _not_ doing it like that?


Not particularly, other than symmetry with the case below.


I think it would be better not to rely on arm64-specific ordering unless
there's a good reason to.


Sounds reasonable, I'll switch to the barrier version.


We do need the return data here, and the release semantics (or another
barrier before it). But the read below can be made relaxed and a barrier
used instead, and then the same pattern above except with a plain
atomic_or().


Yes, I think using atomic_fetch_or() followed by atomic_read() would be
best (obviously with the relevant comments!)


atomic_fetch_or_release is sufficient here (atomic_fetch_or is stronger; 
atomic_fetch_or_relaxed would not be strong enough as this needs to be 
ordered after any writes prior to sending the IPI; in this case release 
semantics also make logical sense).
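
The sender-side ordering discussed above can be sketched in userspace with C11 atomics. This is only a model under stated assumptions: `MODEL_NR_CPUS`, `vipi_flag`, `vipi_enable` and `vipi_mark_pending()` are illustrative names rather than the driver's, and a sequentially consistent fence stands in for smp_mb__after_atomic().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Userspace stand-ins for the per-CPU vIPI pending and enable words. */
static _Atomic uint32_t vipi_flag[MODEL_NR_CPUS];
static _Atomic uint32_t vipi_enable[MODEL_NR_CPUS];

/*
 * Mark vIPI 'irq_bit' pending on 'cpu' and report whether a hardware IPI
 * would have to be sent to that CPU.
 */
static bool vipi_mark_pending(unsigned int cpu, uint32_t irq_bit)
{
	uint32_t pending, enabled;

	assert(cpu < MODEL_NR_CPUS);

	/* Release: earlier shared-memory writes become visible before the flag. */
	pending = atomic_fetch_or_explicit(&vipi_flag[cpu], irq_bit,
					   memory_order_release);

	/* Full fence, standing in for smp_mb__after_atomic() in this model. */
	atomic_thread_fence(memory_order_seq_cst);

	/* With the fence in place, the enable word can be read relaxed. */
	enabled = atomic_load_explicit(&vipi_enable[cpu], memory_order_relaxed);

	/* Kick the target only if the bit was newly set and is unmasked. */
	return !(pending & irq_bit) && (enabled & irq_bit);
}
```

A second call for the same bit returns false because the flag is already pending, which is what keeps redundant hardware IPIs from being sent.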



It is ordered, right? As the comment says, it "needs to be ordered after the
aic_ic_write() above". atomic_fetch_andnot() is *supposed* to be fully
ordered and that should include against the writel_relaxed() on
AIC_IPI_FLAG. On ARM it turns out it's not quite fully ordered, but the
acquire semantics of the read half are sufficient for this case, as they
guarantee the flags are always read after the FIQ has been ACKed.


Sorry, I missed that the answer to my question was already written in the
comment. However, I'm still a bit unsure about whether the memory barriers
give you what you need here. The barrier in atomic_fetch_andnot() will
order the previous aic_ic_write(AIC_IPI_ACK) for the purposes of other
CPUs reading those locations, but it doesn't say anything about when the
interrupt controller actually changes state after the Ack.

Given that the AIC is mapped Device-nGnRnE, the Arm ARM offers:

   | Additionally, for Device-nGnRnE memory, a read or write of a Location
   | in a Memory-mapped peripheral that exhibits side-effects is complete
   | only when the read or write both:
   |
   | * Can begin to affect the state of the Memory-mapped peripheral.
   | * Can trigger all associated side-effects, whether they affect other
   |   peripheral devices, PEs, or memory.

so without AIC documentation I can't tell whether completion of the Ack write
just begins the process of an Ack (in which case we might need something like
a read-back), or whether the write response back from the AIC only occurs once
the Ack has taken effect. Any ideas?


Ahh, you're talking about latency within AIC itself... I obviously don't 
have an authoritative answer to this, though the hardware designer in me 
wants to say this really ought to be single-cycle type stuff that isn't 
internally pipelined in a way that would create races.


I tried to set up an SMP test case for the atomic-to-AIC sequence in 
m1n1, but unfortunately I couldn't hit the race window in deliberately 
racy code (i.e. ack after clearing flags) without widening it even 
further with at least one dummy load in between, and of course I didn't 
experience any races with the proper code either.


What I can say is that a simple set IPI; ack IPI (in adjacent str 
instructions) sequence always yields a cleared IPI, and the converse 
always yields a set IPI. So if there is latency to the operations it 
seems it would at least be the same for sets and acks and would imply 
readbacks block, which should still yield equivalently correct results. 
But of course this is a single-CPU test, so it is not fully 
representative of what could happen in an SMP scenario.


At this point all I can say is I'm inclined to shrug and say we have no 
evidence of this being something that can happen, and it shouldn't in 
sane hardware, and hope for the best :-)


Thanks,
--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 16/27] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-03-26 Thread Hector Martin

On 06/03/2021 00.05, Andy Shevchenko wrote:

+#define pr_fmt(fmt) "%s: " fmt, __func__


This is not needed, really, if you have unique / distinguishable
messages in the first place.
Rather people include module names, which may be useful.


Makes sense, I'll switch to KBUILD_MODNAME.


+#define MASK_BIT(x)	BIT((x) & 0x1f)


GENMASK(4,0)


It's not really a register bitmask, but rather extracting the low bits 
of an index... but sure, GENMASK also expresses that. Changed.
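
For reference, a minimal userspace sketch of what the GENMASK form would look like. The BIT() and GENMASK() definitions here are 32-bit re-creations written for this example, not copies of the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace re-creations of BIT() and GENMASK() for 32-bit values. */
#define BIT(n)		(UINT32_C(1) << (n))
#define GENMASK(h, l)	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* GENMASK(4, 0) selects the low five bits, i.e. the old 0x1f literal. */
static_assert(GENMASK(4, 0) == 0x1f, "GENMASK(4, 0) must equal 0x1f");

/* Bit position of IRQ index 'x' within its 32-bit mask register. */
#define MASK_BIT(x)	BIT((x) & GENMASK(4, 0))
```

With these definitions, MASK_BIT(37) and MASK_BIT(5) name the same bit, since only the low five bits of the index are kept.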



+static atomic_t aic_vipi_flag[AIC_MAX_CPUS];
+static atomic_t aic_vipi_enable[AIC_MAX_CPUS];


Isn't it easier to handle these when they are full width, i.e. 32
items per the array?


I don't think so, it doesn't really buy us anything. It's just a maximum 
beyond which the driver doesn't work in its current state anyway (if the 
number were much larger it'd make sense to dynamically allocate these, 
but not at this point).



+static int aic_irq_set_affinity(struct irq_data *d,
+   const struct cpumask *mask_val, bool force)
+{
+   irq_hw_number_t hwirq = irqd_to_hwirq(d);
+   struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
+   int cpu;
+
+   if (hwirq > ic->nr_hw)


>= ?


Good catch, but this is actually obsolete. Higher IRQs go into the FIQ 
irqchip, so this should never happen (it's a leftover from when they 
were a single one). I'll remove it.


Ack on the other comments, thanks!

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 16/27] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-03-26 Thread Hector Martin
I think a more idiomatic (and portable) way to do this would be to use
the relaxed accessors, but with smp_mb__after_atomic() between them. Do you
have a good reason for _not_ doing it like that?


Not particularly, other than symmetry with the case below.


+   /*
+* This sequence is the mirror of the one in aic_ipi_unmask();
+* see the comment there. Additionally, release semantics
+* ensure that the vIPI flag set is ordered after any shared
+* memory accesses that precede it. This therefore also pairs
+* with the atomic_fetch_andnot in aic_handle_ipi().
+*/
+   pending = atomic_fetch_or_release(irq_bit, &aic_vipi_flag[cpu]);


We do need the return data here, and the release semantics (or another 
barrier before it). But the read below can be made relaxed and a barrier 
used instead, and then the same pattern above except with a plain 
atomic_or().



+   if (!(pending & irq_bit) && (atomic_read_acquire(&aic_vipi_enable[cpu]) & irq_bit))
+   send |= AIC_IPI_SEND_CPU(cpu);
+   }


[...]


+   /*
+* Clear the IPIs we are about to handle. This pairs with the
+* atomic_fetch_or_release() in aic_ipi_send_mask(), and needs to be
+* ordered after the aic_ic_write() above (to avoid dropping vIPIs) and
+* before IPI handling code (to avoid races handling vIPIs before they
+* are signaled). The former is taken care of by the release semantics
+* of the write portion, while the latter is taken care of by the
+* acquire semantics of the read portion.
+*/
+   firing = atomic_fetch_andnot(enabled, &aic_vipi_flag[this_cpu]) & enabled;


Does this also need to be ordered after the Ack? For example, if we have
something like:

CPU 0                           CPU 1

                                aic_ipi_send_mask()
atomic_fetch_andnot(flag)
                                atomic_fetch_or_release(flag)
                                aic_ic_write(AIC_IPI_SEND)
aic_ic_write(AIC_IPI_ACK)

sorry if it's a stupid question, I'm just not sure about the cases in which
the hardware will pend things for you.


It is ordered, right? As the comment says, it "needs to be ordered after 
the aic_ic_write() above". atomic_fetch_andnot() is *supposed* to be 
fully ordered and that should include against the writel_relaxed() on 
AIC_IPI_FLAG. On ARM it turns out it's not quite fully ordered, but the 
acquire semantics of the read half are sufficient for this case, as they 
guarantee the flags are always read after the FIQ has been ACKed.


Cheers,
--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 16/27] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-03-26 Thread Hector Martin

On 08/03/2021 22.31, Marc Zyngier wrote:

+   if ((read_sysreg_s(SYS_ICH_HCR_EL2) & ICH_HCR_EN) &&
+   read_sysreg_s(SYS_ICH_MISR_EL2) != 0) {
+   pr_err("vGIC IRQ fired, disabling.\n");


Please add a _ratelimited here. Whilst debugging KVM on this machine,
I ended up with this firing at such a rate that it was impossible to
do anything useful. Ratelimiting it allowed me to pinpoint the
problem.


Ouch. Done for v4.


+static void aic_fiq_eoi(struct irq_data *d)
+{
+   /* We mask to ack (where we can), so we need to unmask at EOI. */
+   if (!irqd_irq_disabled(d) && !irqd_irq_masked(d))


Ah, be careful here: irqd_irq_masked() doesn't do what you think it
does for per-CPU interrupts. It's been on my list to fix for the rVIC
implementation, but I never got around to doing it, and all decent ICs
hide this from SW by having a HW-managed mask, similar to what is on
the IRQ side.

I can see two possibilities:

- you can track the masked state directly and use that instead of
   these predicates

- you can just drop the masking altogether as this is only useful to a
   hosted hypervisor (KVM), which will have to do its own masking
   behind the scenes anyway



Since you're using the masking in KVM after all, I'm tracking the mask 
state in a percpu variable now. Also folded in your two minor bugfixes 
from the KVM series. Cheers!
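
A minimal userspace sketch of the idea, assuming one software mask flag per CPU that is consulted at EOI time instead of irqd_irq_masked(). All names (`MODEL_NR_CPUS`, `fiq_sw_masked`, `fiq_hw_enabled`, the `model_fiq_*` helpers) are hypothetical and plain arrays stand in for real per-CPU variables.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Software view: has this CPU's FIQ source been masked by the IRQ core? */
static bool fiq_sw_masked[MODEL_NR_CPUS];
/* Stand-in for the hardware enable bit of the per-CPU FIQ source. */
static bool fiq_hw_enabled[MODEL_NR_CPUS];

static void model_fiq_mask(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	fiq_sw_masked[cpu] = true;
	fiq_hw_enabled[cpu] = false;
}

static void model_fiq_unmask(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	fiq_sw_masked[cpu] = false;
	fiq_hw_enabled[cpu] = true;
}

/* Ack by disabling the source in hardware; the software state is untouched. */
static void model_fiq_ack(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	fiq_hw_enabled[cpu] = false;
}

/* EOI re-enables the source only if software did not mask it in the meantime. */
static void model_fiq_eoi(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	if (!fiq_sw_masked[cpu])
		fiq_hw_enabled[cpu] = true;
}
```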



--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 01/27] arm64: Cope with CPUs stuck in VHE mode

2021-03-26 Thread Hector Martin

On 25/03/2021 05.00, Marc Zyngier wrote:

I've come up with this on top of the original patch, spitting a
warning when the right conditions are met. It's pretty ugly, but hey,
so is the HW this runs on.


[...]

Folded into v4 and tested; works fine with `kvm-arm.mode=nvhe`, throwing 
the ugly WARN.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 13/27] arm64: Add Apple vendor-specific system registers

2021-03-26 Thread Hector Martin

On 25/03/2021 04.04, Will Deacon wrote:

On Wed, Mar 24, 2021 at 06:59:21PM +, Mark Rutland wrote:

So far we've kept arch/arm64/ largely devoid of IMP-DEF bits, and it
seems a shame to add something with the sole purpose of collating that,
especially given arch code shouldn't need to touch these if FW and
bootloader have done their jobs right.

Can we put the definitions in the relevant drivers? That would sidestep
any pain with MAINTAINERS, too.


If we can genuinely ignore these in arch code, then sure. I just don't know
how long that is going to be the case, and ending up in a situation where
these are scattered randomly throughout the tree sounds horrible to me.


I thought we would need some in KVM code, but given the direction Marc's 
series ended up in, it seems we won't. So I'm happy keeping these in the 
respective drivers; if this ends up being messy in the future it 
shouldn't be a big deal to refactor it all into one file again.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 08/27] asm-generic/io.h: Add a non-posted variant of ioremap()

2021-03-25 Thread Hector Martin

On 25/03/2021 04.09, Arnd Bergmann wrote:

On Wed, Mar 24, 2021 at 7:12 PM Will Deacon wrote:



+/*
+ * ioremap_np needs an explicit architecture implementation, as it
+ * requests stronger semantics than regular ioremap(). Portable drivers
+ * should instead use one of the higher-level abstractions, like
+ * devm_ioremap_resource(), to choose the correct variant for any given
+ * device and bus. Portable drivers with a good reason to want non-posted
+ * write semantics should always provide an ioremap() fallback in case
+ * ioremap_np() is not available.
+ */
+#ifndef ioremap_np
+#define ioremap_np ioremap_np
+static inline void __iomem *ioremap_np(phys_addr_t offset, size_t size)
+{
+ return NULL;
+}
+#endif


Can we implement the generic pci_remap_cfgspace() in terms of ioremap_np()
if it is supported by the architecture? That way, we could avoid defining
both on arm64.


Good idea. It needs a fallback in case the ioremap_np() fails on most
architectures, but that sounds easy enough.

Since pci_remap_cfgspace() only has custom implementations, it sounds like
we can actually make the generic implementation unconditional in the end,
but that requires adding ioremap_np() on 32-bit as well, and I would keep
that separate from this series.


Sounds good; I'm adding a patch to adjust the generic implementation and 
remove the arm64 one in v4, and we can then complete the cleanup for 
other arches later.
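
The fallback being discussed can be sketched in userspace like this. It is a model only: `model_ioremap()`, `model_ioremap_np()` and `arch_has_ioremap_np` are hypothetical stand-ins, and the real helpers return `void __iomem *` mappings rather than pointers to dummy objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical knob: does the modeled architecture implement ioremap_np()? */
static bool arch_has_ioremap_np;

/* Distinct dummy objects so a caller can tell which mapping type it got. */
static char posted_mapping, nonposted_mapping;

/* Stand-in for ioremap(): always succeeds in this model. */
static void *model_ioremap(uint64_t phys, size_t size)
{
	(void)phys;
	(void)size;
	return &posted_mapping;
}

/* Stand-in for ioremap_np(): NULL when unsupported, like the generic stub. */
static void *model_ioremap_np(uint64_t phys, size_t size)
{
	(void)phys;
	(void)size;
	return arch_has_ioremap_np ? &nonposted_mapping : NULL;
}

/* Prefer a non-posted mapping for config space, else fall back to ioremap(). */
static void *model_pci_remap_cfgspace(uint64_t phys, size_t size)
{
	void *p;

	assert(size > 0);
	p = model_ioremap_np(phys, size);
	return p ? p : model_ioremap(phys, size);
}
```

Architectures without ioremap_np() keep their existing behaviour through the fallback, which is why the generic version could eventually be made unconditional.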


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem

2021-03-11 Thread Hector Martin

On 11/03/2021 23.31, Linus Walleij wrote:

I understand your argument, is your position such that the nature
of the hardware is such that community should leave this hardware
alone and not try to make use of RPMB for say ordinary (self-installed)
Linux distributions?


It's not really that the community should leave this hardware alone, so 
much that I think there is a very small subset of users who will be able 
to benefit from it, and that subset will be happy with a usable 
kernel/userspace interface and some userspace tooling for this purpose, 
including provisioning and such.


Consider the prerequisites for using RPMB usefully here:

* You need (user-controlled) secureboot
* You need secret key storage - so either some kind of CPU-fused key, or 
one protected by a TPM paired with the secureboot (key sealed to PCR 
values and such)
* But if you have a TPM, that can handle secure counters for you already 
AIUI, so you don't need RPMB

* So this means you must be running a non-TPM secureboot system

And so we're back to embedded platforms like Android phones and other 
SoC stuff... user-controlled secureboot is already somewhat rare here, 
and even rarer are the cases where the user controls the whole chain 
including the TEE if any (otherwise it'll be using RPMB already); this 
pretty much excludes all production Android phones except for a few 
designed as completely open systems; we're left with those and a subset 
of dev boards (e.g. the Jetson TX1 I did fuse experiments on). In the 
end, those systems will probably end up with fairly bespoke set-ups for 
any given device or SoC family, for using RPMB.


But then again, if you have a full secureboot system where you control 
the TEE level, wouldn't you want to put the RPMB shenanigans there and 
get some semblance of secure TPM/keystore/attempt throttling 
functionality that is robust against Linux exploits and has a smaller 
attack surface? Systems without EL3 are rare (Apple M1 :-)) so it makes 
more sense to do this on those that do have it. If you're paranoid 
enough to be getting into building your own secure system with 
anti-rollback for retry counters, you should be heading in that direction 
anyway.


And now Linux's RPMB code is useless because you're running the stack in 
the secure monitor instead :-)


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem

2021-03-11 Thread Hector Martin
With RPMB this can be properly protected against because
the next attempt can not be made until after the RPMB
monotonic counter has been increased.


But this is only enforced by software. If you do not have secure boot,
you can just patch software to allow infinite tries without touching the
RPMB. The RPMB doesn't check PINs for you, it doesn't even gate read
access to data in any way. All it does is promise you cannot make the
counter count down, or make the data stored within go back in time.


This is true, I guess the argument is something along the
line that if one link in the chain is weaker, why harden
any other link, the chain will break anyway?


This is how security works, yes :-)

I'm not saying hardening a link in the chain is pointless in every case, 
but in this case it's like heat treating one link in the chain, then 
joining it to the next one with a ziptie. Only once you at least have 
the entire chain of steel does it make sense to start thinking about 
heat treatment.



I am more of the position let's harden this link if we can
and then deal with the others when they come up, i.e.
my concern is this piece of the puzzle, even if it is not
the centerpiece (maybe the centerpiece is secure boot
what do I know).


Well, that's what I'm saying, you do need secureboot for this to make 
sense :-)


RPMB isn't useless and some systems should implement it; but there's no 
real way for the kernel to transparently use it to improve security in 
general (for anyone) without the user being aware. Since any security 
benefit from RPMB must come from integration with user policy, it 
doesn't make sense to "well, just do something else with RPMB because 
it's better than nothing"; just doing "something" doesn't make systems 
more secure. There needs to be a specific, practical use case that we'd 
be trying to solve with RPMB here.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 12/27] of/address: Add infrastructure to declare MMIO as non-posted

2021-03-11 Thread Hector Martin

On 11/03/2021 18.12, Arnd Bergmann wrote:

On Wed, Mar 10, 2021 at 6:01 PM Rob Herring wrote:


On Wed, Mar 10, 2021 at 1:27 AM Hector Martin wrote:


On 10/03/2021 07.06, Rob Herring wrote:

My main concern here is that this creates an inconsistency in the device
tree representation that only works because PCI drivers happen not to
use these code paths. Logically, having "nonposted-mmio" above the PCI
controller would imply that it applies to that bus too. Sure, it doesn't
matter for Linux since it is ignored, but this creates an implicit
exception that PCI buses always use posted modes.


We could be stricter that "nonposted-mmio" must be in the immediate
parent. That's kind of in line with how addressing already works.
Every level has to have 'ranges' to be an MMIO address, and the
address cell size is set by the immediate parent.


Then if a device comes along that due to some twisted fabric logic needs
nonposted nGnRnE mappings for PCIe (even though the actual PCIe ops will
end up posted at the bus anyway)... how do we represent that? Declare
that another "nonposted-mmio" on the PCIe bus means "no, really, use
nonposted mmio for this"?


If we're strict, yes. The PCI host bridge would have to have "nonposted-mmio".


Works for me; then let's just make it non-recursive.

Do you think we can get rid of the Apple-only optimization if we do
this? It would mean only looking at the parent during address
resolution, not recursing all the way to the top, so presumably the
performance impact would be quite minimal.


Works for me.


Incidentally, even though it would now be unused, I'd like to keep the 
apple,arm-platform compatible at this point; we've already been pretty 
close to a use case for it, and I don't want to have to fall back to a 
list of SoC compatibles if we ever need another quirk for all Apple ARM 
SoCs (or break backwards compat). It doesn't really hurt to have it in 
the binding and devicetrees, right?



Yeah, that should be fine. I'd keep an IS_ENABLED() config check
though. Then I'll also know if anyone else needs this.


Ok, makes sense.

Conceptually, I'd like to then see a check that verifies that the
property is only set for nodes whose parent also has it set, since
that is how AXI defines it: A bus can wait for the ack from its
child node, or it can acknowledge the write to its parent early.
However, this breaks down as soon as a bus does the early ack:
all its children by definition use posted writes (as seen by the
CPU), even if they wait for stores that come from other masters.

Does this make sense to you?


Makes sense. This shouldn't really be something the kernel concerns 
itself with at runtime, just something for the dts linting, right?


I assume this isn't representable in json-schema, so it would presumably 
need some ad-hoc validation code.
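
As a rough illustration of both rules (the non-recursive lookup and the lint-style consistency check), here is a userspace sketch. `struct dt_node` is a hypothetical, stripped-down node type invented for this example, and the topmost bus is assumed to be modeled with a NULL parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device tree node, for illustration only. */
struct dt_node {
	const struct dt_node *parent;	/* NULL for the topmost bus */
	bool nonposted_mmio;		/* node carries "nonposted-mmio" */
};

/* Non-recursive rule: only the immediate parent bus decides the mapping type. */
static bool node_uses_nonposted(const struct dt_node *dev)
{
	assert(dev != NULL);
	return dev->parent && dev->parent->nonposted_mmio;
}

/*
 * Lint-style check: a bus may be marked non-posted only if its parent bus,
 * when it has one, is marked non-posted as well.
 */
static bool nonposted_marking_is_consistent(const struct dt_node *bus)
{
	assert(bus != NULL);
	if (!bus->nonposted_mmio || !bus->parent)
		return true;
	return bus->parent->nonposted_mmio;
}
```

The first function is the part the kernel would care about at runtime; the second is the kind of check that would live in devicetree validation tooling.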


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem

2021-03-11 Thread Hector Martin

On 11/03/2021 09.49, Linus Walleij wrote:

The use case for TPM on laptops is similar: it can be used by a
provider to lock down a machine, but it can also be used by the
random user to store keys. Very few users beside James
Bottomley are capable of doing that (I am not) but they exist.
https://blog.hansenpartnership.com/using-your-tpm-as-a-secure-key-store/


I've used a TPM as an SSH key keystore in the past (these days I use 
YubiKeys, but same idea). TPMs are useful because they *do* implement 
policy and cryptographic operations. So you can, in fact, get security 
guarantees out of a TPM without secureboot.


For example, assuming the TPM is secure, it is impossible to clone an 
SSH private key managed by a TPM. This means that any usage has to 
be on-device, which provides inherent rate-limiting. Then, the TPM can 
gate access to the key based on a passphrase, which again provides 
inherent rate-limits on cracking attempts. TPM 2.0 devices also provide 
explicit count limits and time-based throttling for unlocking attempts.


We have much the same story with the Secure Enclave Processor on Apple 
Silicon machines (which I'm working on porting Linux to) - it provides 
policy, and can even authenticate with fingerprints (there is a hardware 
secure channel between the reader and the SEP) as well as passphrases. 
For all intents and purposes it is an Apple-managed TPM (with its own 
secureboot). So it is similarly useful for us to support the SEP for key 
storage, and perhaps even integrate it with kernel subsystems at some 
point. It's useful for our regular users, even though they are unlikely 
to be running with full secureboot on the main CPU (though Apple's 
implementation does allow for a user-controlled secureboot subset, and 
it should be possible to provide hard guarantees there as well, but I 
digress).


All of these things make putting keys into TPMs, YubiKeys, the SEP, etc 
a useful thing for anyone, regardless of whether their machine is locked 
down or not.


This is not the case for RPMB. RPMB *relies* on the software running on 
the other side being trusted. RPMB, alone, provides zero new security 
guarantees, without trusted software communicating with it.


The key initialization story is also a lot thornier in RPMB. TPMs, the 
SEP, and YubiKeys are all designed so that they can be factory-reset 
(losing all key material in the process) by a user with physical access, 
which means that provisioning operations and experiments are risk-free, 
and the only danger is data loss, not making the hardware less useful. 
With the MAC key provisioning for RPMB being a one-time process, it is 
inherently a very risky operation that a user must commit to with great 
care, as they only get one chance, ever. Better have that key backed up 
somewhere (but not somewhere an attacker can get to... see the 
problem?). This is like fusing secureboot keys on SoCs (I remember being 
*very* nervous about hitting Enter on the command to fuse a Tegra X1 
board with a secureboot key for some experiments... these kinds of 
irreversible things are no joke).


Basically, TPMs, SEP, YubiKeys, etc were designed to be generally useful 
and flexible devices for various crypto and authentication use cases. 
RPMB was designed for the sole purpose of plugging the secure storage 
replay exploit for Android phones running TrustZone secure monitors. It 
doesn't really do anything else; it's just a single low-level primitive 
and you need to already have an equivalent design that is only missing 
that piece to get anything from it. And its provisioning model assumes a 
typical OEM device production pipeline and integration with CPU fusing; 
it isn't friendly to Linux hackers messing around with securing LUKS 
unlock attempt counters.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem

2021-03-11 Thread Hector Martin

On 11/03/2021 09.36, Linus Walleij wrote:

It is not intended to store keys in a way that is somehow safer than
other mechanisms. After all, you need to securely store the RPMB key to
begin with; you might as well use that to encrypt a keystore on any
random block device.


The typical use-case mentioned in one reference is to restrict
the number of password/pin attempts and combine that with
secure time to make sure that longer and longer intervals are
required between password attempts.

This seems pretty neat to me.


Yes, but to implement that you don't need any secure storage *at all*. 
If all the RPMB did was authenticate an incrementing counter, you could 
just store the  tuple inside a blob 
of secure (encrypted and MACed) storage on any random Flash device, 
along with the counter value, and thus prevent rollbacks that way (some 
finer design points are needed to deal with power loss protection and 
ordering, but the theory holds).


Basically what I'm saying is that for security *guarantee* purposes, 
AFAICT the storage part of RPMB makes no difference. It is useful in 
practical implementations for various reasons, but if you think you can 
use that secure storage to provide security properties which you 
couldn't do otherwise, you are probably being misled. If you're trying 
to understand what having RPMB gets you over not having it, it helps if 
you ignore all the storage stuff and just view it as a single secure, 
increment-only counter.
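
To make the "secure storage plus one counter" argument concrete, here is a minimal sketch. The types and function names are hypothetical; the blob is assumed to have been decrypted and MAC-checked already, and power-loss ordering is deliberately not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the one primitive that matters: it only counts up. */
struct secure_counter {
	uint64_t value;
};

/* Hypothetical state blob kept on ordinary storage (already authenticated). */
struct state_blob {
	uint64_t generation;		/* counter value the blob was sealed against */
	uint32_t failed_attempts;	/* policy data protected from rollback */
};

static void counter_increment(struct secure_counter *c)
{
	assert(c->value < UINT64_MAX);
	c->value++;
}

/* A blob is current only if it was sealed against the present counter value. */
static bool blob_is_current(const struct state_blob *b,
			    const struct secure_counter *c)
{
	return b->generation == c->value;
}

/* Record a failed attempt: bump the counter, then reseal against the new value. */
static void record_failed_attempt(struct state_blob *b, struct secure_counter *c)
{
	counter_increment(c);
	b->failed_attempts++;
	b->generation = c->value;
}
```

Replaying an older copy of the blob fails the `blob_is_current()` check, and that is the entire anti-rollback property. Nothing in the sketch stops software that simply skips the check, which is why the scheme depends on trusted code.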





But RPMB does not enforce any of this policy for you. RPMB only gives
you a primitive: the ability to have storage that cannot be externally
rolled back. So none of this works unless the entire system is set up to
securely boot all the way until the drive unlock happens, and there are
no other blatant code execution avenues.


This is true for firmware anti-rollback or say secure boot.

But RPMB can also be used for example for restricting the
number of PIN attempts.

A typical attack vector on phones (I think candybar phones
even) was a robot that was punching PIN codes to unlock
the phone, combined with an electronic probe that would
cut the WE (write enable) signal to the flash right after
punching a code. The counter was stored in the flash.

(A bit silly example as this can be countered by reading back
the counter from flash and checking etc, but you get the idea,
various versions of this attack are possible.)

With RPMB this can be properly protected against because
the next attempt can not be made until after the RPMB
monotonic counter has been increased.


But this is only enforced by software. If you do not have secure boot, 
you can just patch software to allow infinite tries without touching the 
RPMB. The RPMB doesn't check PINs for you, it doesn't even gate read 
access to data in any way. All it does is promise you cannot make the 
counter count down, or make the data stored within go back in time.



Of course the system can be compromised in other ways,
(like, maybe it doesn't even have secure boot or even
no encrypted drive) but this is one of the protection
mechanisms that can plug one hole.


This is not how security systems are designed though; you do not "plug 
holes", what you do is cover more attack scenarios, and you do that in 
the order from simplest to hardest.


If we are trying to crack the PIN on a device we have physical access 
to, the simplest and most effective attack is to just run your own 
software on the machine, extract whatever hash or material you need to 
validate PINs, and do it offline.


To protect against that, you first need to move the PIN checking into a 
trust domain where an attacker with physical access can't easily break 
in, which means secure boot.


*Then* the next simplest attack is a secure storage rollback attack, 
which is what I described in that blog post about iOS. And *now* it 
makes sense to start thinking about the RPMB.


But RPMB alone doesn't make any sense on a system without secure boot. 
It doesn't change anything; in both cases the simplest attack is to just 
run your own software.



It is thus a countermeasure to keyboard emulators and other
evil hardware trying to brute force their way past screen
locks and passwords. Such devices exist, sadly.


If you're trying to protect against a "dumb" attack with a keyboard 
emulator that doesn't consider access to physical storage, then you 
don't need RPMB either; you can just put the PIN unlock counter in a 
random file.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem

2021-03-10 Thread Hector Martin

On 10/03/2021 18.48, Linus Walleij wrote:

Disk is encrypted, and RPMB is there to block any exhaustive
password or other authentication token search.


This relies on having a secure boot chain to start with (otherwise you 
can just bypass policy that way; the RPMB is merely storage to give you 
anti-rollback properties, it can't enforce anything itself). So you 
would have to have a laptop with a fully locked down secure boot, which 
can only boot some version of Linux signed by you until, say, LUKS 
decryption. And then the tooling around that needs to be integrated with 
RPMB, to use it as an attempt counter.


But now this ends up having to involve userspace anyway; the kernel key 
stuff doesn't support policy like this, does it? So having the kernel 
automagically use RPMB wouldn't get us there.


I may be wrong on the details here, but as far as I know RPMB is 
strictly equivalent to a simple secure increment-only counter in what it 
buys you. The stuff about writing data to it securely is all a red 
herring - you can implement secure storage elsewhere, and with secure 
storage + a single secure counter, you can implement anti-rollback.
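
To make the "secure storage + a single secure counter" point concrete, here is a minimal userspace sketch. All names (`secure_counter`, `state_blob`, `state_save`, `state_is_current`) are hypothetical, and the blob is assumed to be encrypted and authenticated by some other mechanism; only the anti-rollback check is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an increment-only counter (the property RPMB provides). */
struct secure_counter {
    uint64_t value;
};

/*
 * Hypothetical state blob kept on ordinary, untrusted storage.
 * Assumption: it is encrypted and authenticated elsewhere.
 */
struct state_blob {
    uint64_t version;             /* counter value at the time of the save */
    uint32_t failed_pin_attempts; /* example of policy state worth protecting */
};

/* Save new state: bump the counter first, then stamp the blob with it. */
void state_save(struct secure_counter *c, struct state_blob *blob,
                uint32_t failed_pin_attempts)
{
    c->value++;
    blob->version = c->value;
    blob->failed_pin_attempts = failed_pin_attempts;
}

/* A blob is current only if its version matches the counter; anything older is a rollback. */
bool state_is_current(const struct secure_counter *c,
                      const struct state_blob *blob)
{
    return blob->version == c->value;
}
```

An attacker who restores an older copy of the blob ends up with a version that no longer matches the counter, so the trusted code can refuse it. Nothing here works unless the code doing the check is itself trusted, which is the secure boot requirement again.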


It is not intended to store keys in a way that is somehow safer than 
other mechanisms. After all, you need to securely store the RPMB key to 
begin with; you might as well use that to encrypt a keystore on any 
random block device.



Ideally: the only way to make use of the hardware again would
be to solder off the eMMC, if eMMC is used for RPMB.
If we have RPMB on an NVME or UFS drive, the idea is
to lock that thing such that it becomes useless and needs to
be replaced with a new part in this scenario.

In practice: make it hard, because we know no such jail is
perfect. Make it not worth the effort, make it cheaper for thieves
to just buy a new harddrive to use a stolen laptop, locking
the data that was in it away forever by making the drive
useless for any practical attacks.


But RPMB does not enforce any of this policy for you. RPMB only gives 
you a primitive: the ability to have storage that cannot be externally 
rolled back. So none of this works unless the entire system is set up to 
securely boot all the way until the drive unlock happens, and there are 
no other blatant code execution avenues.


There isn't even any encryption involved in the protocol, so all the 
data stored in the RPMB is public and available to any attacker.


So unless the kernel grows a subsystem/feature to enforce complex key 
policies (with things like use counts, retry times, etc), I don't think 
there's a place to integrate RPMB kernel-side. You still need a trusted 
userspace tool to glue it all together.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem

2021-03-10 Thread Hector Martin

On 10/03/2021 14.14, Sumit Garg wrote:

On Wed, 10 Mar 2021 at 02:47, Hector Martin  wrote:


On 09/03/2021 01.20, Linus Walleij wrote:

I suppose it would be a bit brutal if the kernel would just go in and
appropriate any empty RPMB it finds, but I suspect it is the right way
to make use of this facility given that so many of them are just sitting
there unused. No one will run $CUSTOM_UTILITY any more than they
run the current RPMB tools in mmc-tools.


AIUI the entire thing relies on a shared key that is programmed once
into the RPMB device, which is a permanent operation. This key has to be
secure, usually stored on CPU fuses or derived based on such a root of
trust. To me it would seem ill-advised to attempt to automate this
process and have the kernel do a permanent take-over of any RPMBs it
finds (with what key, for one?) :)



Wouldn't it be a good idea to use DT here to represent whether a
particular RPMB is used as a TEE backup or is available for normal
kernel usage?

In case of normal kernel usage, I think the RPMB key can come from
trusted and encrypted keys subsystem.


Remember that if the key is ever lost, the RPMB is now completely 
useless forever.


This is why, as far as I know, most sane platforms will use hard fused 
values to derive this kind of thing, not any kind of key stored in 
erasable storage.


Also, newly provisioned keys are sent in plain text, which means that 
any kind of "if the RPMB is blank, take it over" automation equates to 
handing over your key to an attacker who removes the RPMB and replaces 
it with a blank one, and then they can go access anything they want on 
the old RPMB device (assuming the key hasn't changed; and if it has 
changed that's conversely a recipe for data loss if something goes wrong).


I really think trying to automate any kind of "default" usage of an RPMB 
is a terrible idea. It needs to be a conscious decision on a 
per-platform basis.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 12/27] of/address: Add infrastructure to declare MMIO as non-posted

2021-03-10 Thread Hector Martin

On 10/03/2021 07.06, Rob Herring wrote:

My main concern here is that this creates an inconsistency in the device
tree representation that only works because PCI drivers happen not to
use these code paths. Logically, having "nonposted-mmio" above the PCI
controller would imply that it applies to that bus too. Sure, it doesn't
matter for Linux since it is ignored, but this creates an implicit
exception that PCI buses always use posted modes.


We could be stricter that "nonposted-mmio" must be in the immediate
parent. That's kind of in line with how addressing already works.
Every level has to have 'ranges' to be an MMIO address, and the
address cell size is set by the immediate parent.


Then if a device comes along that due to some twisted fabric logic needs
nonposted nGnRnE mappings for PCIe (even though the actual PCIe ops will
end up posted at the bus anyway)... how do we represent that? Declare
that another "nonposted-mmio" on the PCIe bus means "no, really, use
nonposted mmio for this"?


If we're strict, yes. The PCI host bridge would have to have "nonposted-mmio".


Works for me; then let's just make it non-recursive.

Do you think we can get rid of the Apple-only optimization if we do 
this? It would mean only looking at the parent during address 
resolution, not recursing all the way to the top, so presumably the 
performance impact would be quite minimal.
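
A rough userspace model of what the non-recursive rule would look like, assuming a hypothetical `struct dt_node` (not the kernel's `struct device_node`) with a flag standing in for the "nonposted-mmio" property:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a devicetree node; not the kernel type. */
struct dt_node {
    const struct dt_node *parent;
    bool nonposted_mmio; /* node carries the "nonposted-mmio" property */
};

/* Non-recursive rule: only the immediate parent bus decides. */
bool mmio_is_nonposted(const struct dt_node *dev)
{
    return dev != NULL && dev->parent != NULL && dev->parent->nonposted_mmio;
}
```

With this rule, a device directly under a "nonposted-mmio" SoC bus gets a non-posted mapping, while a device under a PCIe host bridge that sits on that same bus does not, unless the bridge node carries the property itself.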


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFC PATCH 1/5] rpmb: add Replay Protected Memory Block (RPMB) subsystem

2021-03-09 Thread Hector Martin

On 09/03/2021 01.20, Linus Walleij wrote:

I suppose it would be a bit brutal if the kernel would just go in and
appropriate any empty RPMB it finds, but I suspect it is the right way
to make use of this facility given that so many of them are just sitting
there unused. No one will run $CUSTOM_UTILITY any more than they
run the current RPMB tools in mmc-tools.


AIUI the entire thing relies on a shared key that is programmed once 
into the RPMB device, which is a permanent operation. This key has to be 
secure, usually stored on CPU fuses or derived based on such a root of 
trust. To me it would seem ill-advised to attempt to automate this 
process and have the kernel do a permanent take-over of any RPMBs it 
finds (with what key, for one?) :)


For what it's worth, these days I think Apple uses a separate, dedicated 
secure element for replay protected storage, not RPMB. That seems like a 
sane approach, given that obviously Flash storage vendors cannot be 
trusted to write security-critical firmware. But if all you have is 
RPMB, using it is better than nothing.


The main purpose of the RPMB is, as the name implies, replay protection. 
You can do secure storage on any random flash with encryption, and even 
do full authentication with hash trees, but the problem is no matter how 
fancy your scheme is, attackers can always dump all memory and roll your 
device back to the past. This defeats stuff like PIN code attempt 
limits. So it isn't so much for storing crypto keys or such, but rather 
a way to prevent these attacks.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 26/27] dt-bindings: display: Add apple,simple-framebuffer

2021-03-09 Thread Hector Martin

On 10/03/2021 01.37, Linus Walleij wrote:

On Thu, Mar 4, 2021 at 10:42 PM Hector Martin  wrote:


Apple SoCs run firmware that sets up a simplefb-compatible framebuffer
for us. Add a compatible for it, and two missing supported formats.

Signed-off-by: Hector Martin 


Reviewed-by: Linus Walleij 

Marcan: tell me if you need me to apply this to the drm-misc tree
and I'll fix it.


I think Arnd is okay merging this one through the SoC tree.

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 17/27] arm64: Kconfig: Introduce CONFIG_ARCH_APPLE

2021-03-09 Thread Hector Martin

On 09/03/2021 00.35, Marc Zyngier wrote:

On Thu, 04 Mar 2021 21:38:52 +,
Hector Martin  wrote:


This adds a Kconfig option to toggle support for Apple ARM SoCs.
At this time this targets the M1 and later "Apple Silicon" Mac SoCs.

Signed-off-by: Hector Martin 
---
  arch/arm64/Kconfig.platforms | 8 
  arch/arm64/configs/defconfig | 1 +
  2 files changed, 9 insertions(+)

diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index cdfd5fed457f..c2b5791e3d69 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -36,6 +36,14 @@ config ARCH_ALPINE
  This enables support for the Annapurna Labs Alpine
  Soc family.
  
+config ARCH_APPLE

+   bool "Apple Silicon SoC family"
+   select APPLE_AIC
+   select ARM64_FIQ_SUPPORT


Do we still need this FIQ symbol? I thought it was now gone...


Whoops! Thanks for the catch, this can go away.

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 10/27] docs: driver-api: device-io: Document ioremap() variants & access funcs

2021-03-09 Thread Hector Martin

On 06/03/2021 00.51, Arnd Bergmann wrote:

On Fri, Mar 5, 2021 at 4:09 PM Andy Shevchenko
 wrote:

On Fri, Mar 5, 2021 at 12:25 PM Linus Walleij  wrote:

On Thu, Mar 4, 2021 at 10:40 PM Hector Martin  wrote:


This documents the newly introduced ioremap_np() along with all the
other common ioremap() variants, and some higher-level abstractions
available.

Signed-off-by: Hector Martin 


I like this, I just want one change:

Put the common ioremap() on top in all paragraphs, so the norm
comes before the exceptions.

I.e. it is weird to mention ioremap_np() before mentioning ioremap().


+1 here. That is what I have stumbled upon reading carefully.


In that case, the order should probably be:

ioremap
ioremap_wc
ioremap_wt
ioremap_np
ioremap_uc
ioremap_cache

Going from most common to least common, rather than going from
strongest to weakest.


Yeah, I was dwelling on the issue of ioremap_np being first when I wrote 
that... this alternative works for me, I'll sort it like this then. 
It'll just need some re-wording to make it all flow properly.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 06/27] dt-bindings: timer: arm,arch_timer: Add interrupt-names support

2021-03-09 Thread Hector Martin

On 10/03/2021 01.11, Rob Herring wrote:

On Mon, Mar 8, 2021 at 3:42 PM Marc Zyngier  wrote:


On Mon, 08 Mar 2021 20:38:41 +,
Rob Herring  wrote:


On Fri, Mar 05, 2021 at 06:38:41AM +0900, Hector Martin wrote:

Not all platforms provide the same set of timers/interrupts, and Linux
only needs one (plus kvm/guest ones); some platforms are working around
this by using dummy fake interrupts. Implementing interrupt-names allows
the devicetree to specify an arbitrary set of available interrupts, so
the timer code can pick the right one.

This also adds the hyp-virt timer/interrupt, which was previously not
expressed in the fixed 4-interrupt form.

Signed-off-by: Hector Martin 
---
  .../devicetree/bindings/timer/arm,arch_timer.yaml  | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml 
b/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
index 2c75105c1398..ebe9b0bebe41 100644
--- a/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
+++ b/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
@@ -34,11 +34,25 @@ properties:
- arm,armv8-timer

interrupts:
+minItems: 1
+maxItems: 5
  items:
- description: secure timer irq
- description: non-secure timer irq
- description: virtual timer irq
- description: hypervisor timer irq
+  - description: hypervisor virtual timer irq
+
+  interrupt-names:
+minItems: 1
+maxItems: 5
+items:
+  enum:
+- phys-secure
+- phys
+- virt
+- hyp-phys
+- hyp-virt


phys-secure and hyp-phys are not very consistent. secure-phys or sec-phys
instead?

This allows any order which is not ideal (unfortunately json-schema
doesn't have a way to define order with optional entries in the middle).
How many possible combinations are there which make sense? If that's a
reasonable number, I'd rather see them listed out.


The available interrupts are a function of the number of security
states, privileged exception levels and architecture revisions, as
described in D11.1.1:


- An EL1 physical timer.
- A Non-secure EL2 physical timer.
- An EL3 physical timer.
- An EL1 virtual timer.
- A Non-secure EL2 virtual timer.
- A Secure EL2 virtual timer.
- A Secure EL2 physical timer.


* Single security state, EL1 only, ARMv7 & ARMv8.0+ (assumed NS):
   - physical, virtual

* Single security state, EL1 + EL2, ARMv7 & ARMv8.0 (assumed NS)
   - physical, virtual, hyp physical

* Single security state, EL1 + EL2, ARMv8.1+ (assumed NS)
   - physical, virtual, hyp physical, hyp virtual

* Two security states, EL1 + EL3, ARMv7 & ARMv8.0+:
   - secure physical, physical, virtual

* Two security states, EL1 + EL2 + EL3, ARMv7 & ARMv8.0
   - secure physical, physical, virtual, hyp physical

* Two security states, EL1 + EL2 + EL3, ARMv8.1+
   - secure physical, physical, virtual, hyp physical, hyp virtual

* Two security states, EL1 + EL2 + S-EL2 + EL3, ARMv8.4+
   - secure physical, physical, virtual, hyp physical, hyp virtual,
 secure hyp physical, secure hyp virtual

Nobody has seen the last combination in the wild (that is, outside of
a SW model).

I'm really not convinced we want to express this kind of complexity in
the binding (each of the 7 cases), especially given that we don't
encode the underlying HW architecture level or number of exception
levels anywhere, and have no way to validate such information.


Actually, we can simplify this down to 2 cases:

oneOf:
  - minItems: 2
    items:
      - const: phys
      - const: virt
      - const: hyp-phys
      - const: hyp-virt
  - minItems: 3
    items:
      - const: sec-phys
      - const: phys
      - const: virt
      - const: hyp-phys
      - const: hyp-virt
      - const: sec-hyp-phy
      - const: sec-hyp-virt

And that's below my threshold for not worth the complexity.


This makes sense. Since we aren't using the sec-hyp stuff here, and 
those go at the end of the list, we can omit them from this patch for 
now and add them whenever they're needed for a platform. Does that sound OK?
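
For reference, the lookup that interrupt-names enables is just a search by name rather than by fixed position. A minimal userspace sketch, where `irq_index_by_name` is a hypothetical helper and not the kernel's API:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the position of `want` in an interrupt-names style list,
 * or -1 if the platform does not provide that interrupt.
 */
int irq_index_by_name(const char *const *names, size_t count, const char *want)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(names[i], want) == 0)
            return (int)i;
    }
    return -1;
}
```

A platform that only lists "phys", "virt", "hyp-phys" and "hyp-virt" simply yields -1 for a secure timer name, so no dummy interrupt entries are needed to keep positions stable.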


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 12/27] of/address: Add infrastructure to declare MMIO as non-posted

2021-03-09 Thread Hector Martin

On 10/03/2021 00.48, Rob Herring wrote:

On Mon, Mar 8, 2021 at 2:56 PM Arnd Bergmann  wrote:


On Mon, Mar 8, 2021 at 10:14 PM Rob Herring  wrote:

On Mon, Mar 08, 2021 at 09:29:54PM +0100, Arnd Bergmann wrote:

On Mon, Mar 8, 2021 at 4:56 PM Rob Herring  wrote:


Let's just stick with 'nonposted-mmio', but drop 'posted-mmio'. I'd
rather know if and when we need 'posted-mmio'. It does need to be added
to the DT spec[1] and schema[2] though (GH PRs are fine for both).


I think the reason for having "posted-mmio" is that you cannot properly
define the PCI host controller nodes on the M1 without that: Since
nonposted-mmio applies to all child nodes, this would mean the PCI
memory space gets declared as nonposted by the DT, but the hardware
requires it to be mapped as posted.


I don't think so. PCI devices wouldn't use any of the code paths in
this patch. They would map their memory space with plain ioremap()
which is posted.


My main concern here is that this creates an inconsistency in the device 
tree representation that only works because PCI drivers happen not to 
use these code paths. Logically, having "nonposted-mmio" above the PCI 
controller would imply that it applies to that bus too. Sure, it doesn't 
matter for Linux since it is ignored, but this creates an implicit 
exception that PCI buses always use posted modes.


Then if a device comes along that due to some twisted fabric logic needs 
nonposted nGnRnE mappings for PCIe (even though the actual PCIe ops will 
end up posted at the bus anyway)... how do we represent that? Declare 
that another "nonposted-mmio" on the PCIe bus means "no, really, use 
nonposted mmio for this"?


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 12/27] of/address: Add infrastructure to declare MMIO as non-posted

2021-03-05 Thread Hector Martin

On 06/03/2021 02.39, Rob Herring wrote:

I'm still a little hesitant to add these properties and having some
default. I worry about a similar situation as 'dma-coherent' where the
assumed default of non-coherent on Arm doesn't work for PowerPC which
defaults coherent. More below on this.


The intent of the default here is that it matches what ioremap() does on 
other platforms already (where it does not make any claims of being 
posted, though it could be on some platforms). It could be per-platform 
what that means... but either way it should be what drivers get today 
without asking for anything special.



-   return ioremap(res.start, resource_size(&res));
+   if (res.flags & IORESOURCE_MEM_NONPOSTED)
+   return ioremap_np(res.start, resource_size(&res));
+   else
+   return ioremap(res.start, resource_size(&res));


This and the devm variants all scream for a ioremap_extended()
function. IOW, it would be better if the ioremap flavor was a
parameter. Unless we could implement that just for arm64 first, that's
a lot of refactoring...


I agree, but yeah... that's one big refactor to try to do now...


What's the code path using these functions on the M1 where we need to
return 'posted'? It's just downstream PCI mappings (PCI memory space),
right? Those would never hit these paths because they don't have a DT
node or if they do the memory space is not part of it. So can't the
check just be:

bool of_mmio_is_nonposted(struct device_node *np)
{
 return np && of_machine_is_compatible("apple,arm-platform");
}


Yes; the implementation was trying to be generic, but AIUI we don't need 
this on M1 because the PCI mappings don't go through this codepath, and 
nothing else needs posted mode. My first hack was something not too 
unlike this, then I was going to get rid of apple,arm-platform and just 
have this be a generic mechanism with the properties, but then we added 
the optimization to not do the lookups on other platforms, and now we're 
coming full circle... :-)


If you prefer to handle it this way for now I can do it like this. I 
think we should still have the DT bindings and properties though (even 
if not used), as they do describe the hardware properly, and in the 
future we might want to use them instead of having a quirk.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 12/27] of/address: Add infrastructure to declare MMIO as non-posted

2021-03-05 Thread Hector Martin

On 06/03/2021 01.43, Arnd Bergmann wrote:

- setting ioremap() on PCI buses non-posted only makes them
   slower, not more reliable, because the non-posted flag
   on the bus is discarded by the PCI host bridge.


Note that this doesn't work here *anyway*. The fabric is picky in both 
directions: thou shalt use nGnRnE for on-SoC MMIO and nGnRE for PCIe 
windows, or else, SError.


Since these devices can support *any* PCI device via Thunderbolt, making 
PCI drivers be the oddball ones needing special APIs would mean hundreds 
of changes needed - the vast majority of PCI drivers in the kernel use 
plain ioremap variants that don't have any flags to look at.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 24/27] tty: serial: samsung_tty: Add support for Apple UARTs

2021-03-05 Thread Hector Martin
n say 
overlapping the bits and the macro name in the same columns makes it 
less readable to my eyes.



+#define APPLE_S5L_UTRSTAT_RXTHRESH (1<<4)
+#define APPLE_S5L_UTRSTAT_TXTHRESH (1<<5)
+#define APPLE_S5L_UTRSTAT_RXTO     (1<<9)
+#define APPLE_S5L_UTRSTAT_ALL_FLAGS (0x3f0)


BIT() ?


See above.

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 27/27] arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree

2021-03-05 Thread Hector Martin

On 06/03/2021 00.59, Mark Kettenis wrote:

It may be better to handle the memory reserved by the firmware using a
"/reserved-memory" node.  I think the benefit of that could be that it
communicates the entire range of physical memory to the kernel, which
means it could use large mappings in the page tables.  Unless the
"/reserved-memory" node defines a region that has the "no-map"
property of course.


We actually need no-map, because otherwise the CPU could speculate its 
way into these carveouts (it's not just firmware, there's stuff in here 
the CPU really can't be allowed to touch, e.g. the SEP carveout). It 
also breaks simplefb mapping the framebuffer. I thought of the 
reserved-memory approach, but then figured it wouldn't buy us anything 
for this reason.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 21/27] tty: serial: samsung_tty: IRQ rework

2021-03-05 Thread Hector Martin

On 06/03/2021 01.20, Andy Shevchenko wrote:

I am just splitting an
existing function into two, where one takes the lock and the other does
the work. Do you mean using a different locking function? I'm not
entirely sure what you're suggesting.


Yes, as a prerequisite

spin_lock_irqsave -> spin_lock().


Krzysztof, is this something you want in this series? I was trying to 
avoid logic changes to the non-Apple paths.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 21/27] tty: serial: samsung_tty: IRQ rework

2021-03-05 Thread Hector Martin

On 06/03/2021 00.17, Andy Shevchenko wrote:

Add a separate change that removes flags from the spin lock in the IRQ handler.


This commit should have no functional changes; I am just splitting an 
existing function into two, where one takes the lock and the other does 
the work. Do you mean using a different locking function? I'm not 
entirely sure what you're suggesting.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 12/27] of/address: Add infrastructure to declare MMIO as non-posted

2021-03-05 Thread Hector Martin

On 06/03/2021 00.13, Andy Shevchenko wrote:

@@ -896,7 +899,10 @@ void __iomem *of_iomap(struct device_node *np, int index)
        if (of_address_to_resource(np, index, &res))
                return NULL;

-       return ioremap(res.start, resource_size(&res));
+       if (res.flags & IORESOURCE_MEM_NONPOSTED)
+               return ioremap_np(res.start, resource_size(&res));
+       else
+               return ioremap(res.start, resource_size(&res));


This doesn't sound right. Why is _np so exceptional? Why don't we have
other flavours (it also rings a bell to my previous comment that the
flag in ioresource is not in the right place)?


This is different from other variants, because until now *drivers* have 
made the choice of what ioremap mode to use based on device requirements 
(which means ioremap() 99% of the time, and then framebuffers and other 
memory-ish things use something else). Now we have a *SoC fabric* 
that is calling the shots on what ioremap mode we have to use - and 
*every* non-PCIe driver needs to use ioremap_np() on these SoCs, or they 
break. So it seems a lot cleaner to make the choice for drivers here to 
upgrade ioremap() to ioremap_np() for SoCs that need it.


If we don't do something like this here or in otherwise common code, 
we'd have to have an open-coded "if apple then ioremap_np, else ioremap" 
in every driver that runs on-die devices on these SoCs, even ones that 
are otherwise standard and need few or no Apple-specific quirks.


We're still going to have to patch some drivers to use managed APIs that 
can properly hit this conditional (like I did for samsung_tty) in cases 
where they currently don't, but that's a lot cleaner than an open-coded 
conditional, I think (and often comes with other benefits anyway).


Note that wholesale making ioremap() behave like ioremap_np() at the 
arch level as a SoC quirk is not an option - for external PCIe devices, 
we still need to use ioremap(). We tried this approach initially but it 
doesn't work. Hence we arrived at this solution which describes the 
required mode in the devicetree, at the bus level (which makes sense, 
since that represents the fabric), and then these wrappers can use that 
information, carried over via the bit in struct device, to pick the 
right ioremap mode.
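
As a rough userspace model of that flow (bus property, then resource flag, then mapping mode): the types and helpers below are hypothetical, and only the bit value mirrors IORESOURCE_MEM_NONPOSTED from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the bit used for IORESOURCE_MEM_NONPOSTED in the patch. */
#define RES_MEM_NONPOSTED (1u << 7)

/* Hypothetical mapping modes: plain ioremap() vs. ioremap_np(). */
enum map_mode {
    MAP_DEFAULT,   /* what ioremap() gives you (nGnRE on arm64) */
    MAP_NONPOSTED, /* what ioremap_np() gives you (nGnRnE on arm64) */
};

/* Hypothetical stand-in for struct resource. */
struct mmio_res {
    uint64_t start;
    uint64_t size;
    unsigned int flags;
};

/* Address resolution tags the resource when the bus is declared non-posted. */
struct mmio_res mmio_res_init(uint64_t start, uint64_t size, bool bus_is_nonposted)
{
    struct mmio_res res = { .start = start, .size = size, .flags = 0 };

    if (bus_is_nonposted)
        res.flags |= RES_MEM_NONPOSTED;
    return res;
}

/* The common wrapper, not the driver, picks the mapping mode from the flag. */
enum map_mode pick_map_mode(const struct mmio_res *res)
{
    return (res->flags & RES_MEM_NONPOSTED) ? MAP_NONPOSTED : MAP_DEFAULT;
}
```

The point of routing the decision through the resource flag is that drivers stay unchanged: they keep calling the common helper and the fabric requirement is applied underneath them.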


It doesn't really make sense to include the other variants here, because 
_np is strictly stronger than the default. Downgrading ioremap to any 
other variant would break most drivers, badly. However, upgrading to 
ioremap_np() is always correct (if possibly slower), on platforms where 
it is allowed by the bus. In fact, I bet that on many systems nGnRE 
already behaves like nGnRnE anyway. I don't know why Apple didn't just 
allow nGnRE mappings to work (behaving like nGnRnE) instead of making 
them explode, which is the whole reason we have to do this.



+   while (node) {
+   if (!of_property_read_bool(node, "ranges")) {
+   break;
+   } else if (of_property_read_bool(node, "nonposted-mmio")) {
+   of_node_put(node);
+   return true;
+   } else if (of_property_read_bool(node, "posted-mmio")) {
+   break;
+   }
+   parent = of_get_parent(node);
+   of_node_put(node);
+   node = parent;
+   }


I believe above can be slightly optimized. Don't we have helpers to
traverse to all parents?


Keep in mind the logic here is that it stops on the first instance of 
either property, and does not traverse non-translatable boundaries. Are 
there helpers that can implement this kind of complex logic? It's not a 
simple recursive property lookup.
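
To spell out the three stop conditions, here is a userspace model of the walk above. `struct dt_bus` and its flags are hypothetical stand-ins for the node and its "ranges", "nonposted-mmio" and "posted-mmio" properties, and there is no refcounting as in the real code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bus node; not the kernel's struct device_node. */
struct dt_bus {
    const struct dt_bus *parent;
    bool has_ranges;     /* node has "ranges", i.e. addresses translate upward */
    bool nonposted_mmio; /* node has "nonposted-mmio" */
    bool posted_mmio;    /* node has "posted-mmio" */
};

/* Walk upward from the device's parent bus; the first decisive node wins. */
bool bus_walk_is_nonposted(const struct dt_bus *node)
{
    while (node) {
        if (!node->has_ranges)
            break;       /* non-translatable boundary: stop looking */
        if (node->nonposted_mmio)
            return true; /* nearest explicit "nonposted-mmio" */
        if (node->posted_mmio)
            break;       /* nearest explicit "posted-mmio" overrides ancestors */
        node = node->parent;
    }
    return false;
}
```

A generic "find this property in any ancestor" helper would miss both the "ranges" boundary and the "posted-mmio" override, which is why the loop is open-coded.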


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 08/27] asm-generic/io.h: Add a non-posted variant of ioremap()

2021-03-05 Thread Hector Martin

On 05/03/2021 23.45, Andy Shevchenko wrote:

On Thu, Mar 4, 2021 at 11:40 PM Hector Martin  wrote:


ARM64 currently defaults to posted MMIO (nGnRnE), but some devices
require the use of non-posted MMIO (nGnRE). Introduce a new ioremap()
variant to handle this case. ioremap_np() is aliased to ioremap() by
default on arches that do not implement this variant.


Hmm... But isn't it basically a requirement to those device drivers to
use readX()/writeX() instead of readX_relaxed() / writeX_relaxed()?


No, the write ops used do not matter. It's just that on these Apple SoCs 
the fabric requires the mappings to be nGnRnE, else it just throws 
SErrors on all writes and ignores them.


The difference between _relaxed and not is barrier behavior with regards 
to DMA/memory accesses; this applies regardless of whether the writes 
are E or nE. You can have relaxed accesses with nGnRnE and then you 
would still have race conditions if you do not have a barrier between 
the MMIO and accessing DMA memory. What nGnRnE buys you (on 
platforms/buses where it works properly) is that you do not need a dummy 
read after a write to ensure completion.
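
For anyone unfamiliar with the dummy-read idiom: with a posted mapping the write may still be in flight after the store instruction retires, so drivers that care read the register back to force completion. A userspace sketch with a plain volatile pointer standing in for an MMIO register (real drivers would use writel()/readl(); `reg_write_flush` is a hypothetical name):

```c
#include <stdint.h>

/*
 * Posted-style write followed by a read-back of the same register.
 * On real hardware the read cannot complete until the device has
 * seen the write; here the pointer is just ordinary memory.
 */
void reg_write_flush(volatile uint32_t *reg, uint32_t val)
{
    uint32_t readback;

    *reg = val;
    readback = *reg; /* dummy read to push the write out */
    (void)readback;
}
```

With a working non-posted (nGnRnE) mapping the read-back is unnecessary, because the write itself does not complete until the endpoint acknowledges it. Neither variant replaces the barriers needed between MMIO and DMA memory accesses.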


All of this is to some extent moot on these SoCs; it's not that we need 
the drivers to use nGnRnE for some correctness reason, it's that the 
SoCs force us to use it or else everything breaks, which was the 
motivation for this change. But since on most other SoCs both are valid 
options, this does allow some other drivers/platforms to opt into nGnRnE 
if they have a good reason to do so.


Though you just made me notice two mistakes in the commit description: 
first, it describes the old v2 version, for v3 I made ioremap_np() just 
return NULL on arches that don't implement it. Second, nGnRnE and nGnRE 
are backwards. Oops. I'll fix it for the next version.



  #define IORESOURCE_MEM_32BIT         (3<<3)
  #define IORESOURCE_MEM_SHADOWABLE    (1<<5)  /* dup: IORESOURCE_SHADOWABLE */
  #define IORESOURCE_MEM_EXPANSIONROM  (1<<6)
+#define IORESOURCE_MEM_NONPOSTED     (1<<7)


Not sure it's the right location (in a bit field) for this flag.


Do you have a better suggestion? It seemed logical to put it here, as a 
flag on memory-type I/O resources.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 27/27] arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree

2021-03-05 Thread Hector Martin

On 05/03/2021 20.03, Krzysztof Kozlowski wrote:

+   memory@800000000 {
+   device_type = "memory";
+   reg = <0x8 0 0x2 0>; /* To be filled by loader */


Shouldn't this be 0x8 with ~0x8000 length (or whatever is
more common)? Or did I miss some ranges?


The base model has 8GB of RAM, and RAM always starts at 0x800000000, 
hence that reg property.


It's not actually useful to try to boot Linux like this, because it'll 
step all over device carveouts on both ends and break, but since those 
are potentially dynamic it doesn't really make sense to use a more 
specific example for the dts.


E.g. on my system, with my current firmware version, this ends up 
getting patched to:


reg = <0x8 0x0134c000 0x1 0xda294000>
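
To read these values: assuming two address cells and two size cells (which is what a four-cell reg implies here), each pair of 32-bit cells combines into one 64-bit number. A small sketch, with `dt_cells_to_u64` as a hypothetical helper name:

```c
#include <stdint.h>

/* Combine two 32-bit devicetree cells (high cell first) into a 64-bit value. */
uint64_t dt_cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

So the placeholder <0x8 0 0x2 0> is base 0x800000000 with size 0x200000000 (8 GiB), and the patched example above is base 0x80134c000 with size 0x1da294000.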

Thanks,
--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [RFT PATCH v3 00/27] Apple M1 SoC platform bring-up

2021-03-05 Thread Hector Martin

On 05/03/2021 06.38, Hector Martin wrote:

== Merge notes ==

This patchset depends on both the nVHE changes that are already in
5.12-rc1, as well as the FIQ support work currently being reviewed
at [1]. A tree containing this patchset on top of the required
dependencies is available at [2][3]. Alternatively, you may apply
this series on top of Mark's tree at the arm64-fiq-20210302 tag [4][5].


Important warning: these trees are all based on v5.12-rc1, which has a 
bad bug that causes your filesystems to go kaboom if you use a swap file 
[1].


This doesn't affect M1 since we don't *have* storage, but for folks 
testing for regressions on e.g. Samsung or other ARM boards, please 
make sure you don't use swap files.


[1] 
https://lore.kernel.org/lkml/CAHk-=wjnzdlsp3odxhf9emtyo7gf-qjanlbuh1zk3c4a7x7...@mail.gmail.com/


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCHv2 0/8] arm64: Support FIQ controller registration

2021-03-05 Thread Hector Martin

On 02/03/2021 19.12, Mark Rutland wrote:

I'm hoping that we can get the first 2 patches in as a preparatory cleanup for
the next rc or so, and then the rest of the series can be rebased atop that.
I've pushed the series out to my arm64/fiq branch [4] on kernel.org, also
tagged as arm64-fiq-20210302, atop v5.12-rc1.


Just a reminder to everyone that filesystems under v5.12-rc1 go explodey 
if you use a swap file [1].


This doesn't matter for the M1 bring-up series (we don't *have* storage), but 
it's worth pointing out for other people who might test this.


Modulo that,

Tested-by: Hector Martin 

[1] 
https://lore.kernel.org/lkml/CAHk-=wjnzdlsp3odxhf9emtyo7gf-qjanlbuh1zk3c4a7x7...@mail.gmail.com/


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


[RFT PATCH v3 25/27] tty: serial: samsung_tty: Add earlycon support for Apple UARTs

2021-03-04 Thread Hector Martin
Earlycon support is identical to S3C2410, but Apple SoCs also need
MMIO mapped as nGnRnE. This is handled generically for normal drivers
including the normal UART path here, but earlycon uses fixmap and
runs before that scaffolding is ready.

Since this is the only case where we need this fix, it makes more
sense to do it here in the UART driver instead of introducing a
whole fdt nonposted-mmio resolver just for earlycon/fixmap.

Suggested-by: Arnd Bergmann 
Signed-off-by: Hector Martin 
---
 drivers/tty/serial/samsung_tty.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
index 5ef37c4538ce..80df842bf4c7 100644
--- a/drivers/tty/serial/samsung_tty.c
+++ b/drivers/tty/serial/samsung_tty.c
@@ -3001,6 +3001,23 @@ OF_EARLYCON_DECLARE(s5pv210, "samsung,s5pv210-uart",
s5pv210_early_console_setup);
 OF_EARLYCON_DECLARE(exynos4210, "samsung,exynos4210-uart",
s5pv210_early_console_setup);
+
+/* Apple S5L */
+static int __init apple_s5l_early_console_setup(struct earlycon_device *device,
+   const char *opt)
+{
+   /* Close enough to S3C2410 for earlycon... */
+   device->port.private_data = &s3c2410_early_console_data;
+
+#ifdef CONFIG_ARM64
+   /* ... but we need to override the existing fixmap entry as nGnRnE */
+   __set_fixmap(FIX_EARLYCON_MEM_BASE, device->port.mapbase,
+__pgprot(PROT_DEVICE_nGnRnE));
+#endif
+   return samsung_early_console_setup(device, opt);
+}
+
+OF_EARLYCON_DECLARE(s5l, "apple,s5l-uart", apple_s5l_early_console_setup);
 #endif
 
 MODULE_ALIAS("platform:samsung-uart");
-- 
2.30.0



[RFT PATCH v3 27/27] arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree

2021-03-04 Thread Hector Martin
This currently supports:

* SMP (via spin-tables)
* AIC IRQs
* Serial (with earlycon)
* Framebuffer

A number of properties are dynamic, and based on system firmware
decisions that vary from version to version. These are expected
to be filled in by the loader.
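
For anyone writing a loader: the "To be filled by loader" placeholders in
this devicetree are ordinary cell arrays, and devicetree cells are stored as
big-endian 32-bit values. Below is a minimal sketch of encoding a 64-bit
<addr size> pair for a node with #address-cells = <2> and #size-cells = <2>.
The helper names (put_cell, encode_reg64) are made up for illustration and
are not taken from any existing loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store one 32-bit devicetree cell in big-endian byte order. */
static void put_cell(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value >> 24);
    dst[1] = (uint8_t)(value >> 16);
    dst[2] = (uint8_t)(value >> 8);
    dst[3] = (uint8_t)value;
}

/*
 * Hypothetical loader helper: encode <addr size> for a node that uses
 * #address-cells = <2> and #size-cells = <2>, i.e. four cells (16 bytes).
 */
static void encode_reg64(uint8_t out[16], uint64_t addr, uint64_t size)
{
    assert(out != NULL);
    put_cell(out + 0, (uint32_t)(addr >> 32));  /* address, high cell */
    put_cell(out + 4, (uint32_t)addr);          /* address, low cell */
    put_cell(out + 8, (uint32_t)(size >> 32));  /* size, high cell */
    put_cell(out + 12, (uint32_t)size);         /* size, low cell */
}
```

The resulting 16 bytes are what would replace the placeholder value of a
"reg" property such as the one in the memory node; a two-cell property like
cpu-release-addr would use only the first two put_cell() calls.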

Signed-off-by: Hector Martin 
---
 MAINTAINERS  |   1 +
 arch/arm64/boot/dts/Makefile |   1 +
 arch/arm64/boot/dts/apple/Makefile   |   2 +
 arch/arm64/boot/dts/apple/t8103-j274.dts |  45 
 arch/arm64/boot/dts/apple/t8103.dtsi | 135 +++
 5 files changed, 184 insertions(+)
 create mode 100644 arch/arm64/boot/dts/apple/Makefile
 create mode 100644 arch/arm64/boot/dts/apple/t8103-j274.dts
 create mode 100644 arch/arm64/boot/dts/apple/t8103.dtsi

diff --git a/MAINTAINERS b/MAINTAINERS
index 28bd46f4f7a7..d5e4d93a536a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1647,6 +1647,7 @@ C:irc://chat.freenode.net/asahi-dev
 T: git https://github.com/AsahiLinux/linux.git
 F: Documentation/devicetree/bindings/arm/apple.yaml
 F: Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
+F: arch/arm64/boot/dts/apple/
 F: arch/arm64/include/asm/sysreg_apple.h
 F: drivers/irqchip/irq-apple-aic.c
 F: include/dt-bindings/interrupt-controller/apple-aic.h
diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
index f1173cd93594..639e01a4d855 100644
--- a/arch/arm64/boot/dts/Makefile
+++ b/arch/arm64/boot/dts/Makefile
@@ -6,6 +6,7 @@ subdir-y += amazon
 subdir-y += amd
 subdir-y += amlogic
 subdir-y += apm
+subdir-y += apple
 subdir-y += arm
 subdir-y += bitmain
 subdir-y += broadcom
diff --git a/arch/arm64/boot/dts/apple/Makefile b/arch/arm64/boot/dts/apple/Makefile
new file mode 100644
index 000000000000..cbbd701ebf05
--- /dev/null
+++ b/arch/arm64/boot/dts/apple/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+dtb-$(CONFIG_ARCH_APPLE) += t8103-j274.dtb
diff --git a/arch/arm64/boot/dts/apple/t8103-j274.dts b/arch/arm64/boot/dts/apple/t8103-j274.dts
new file mode 100644
index 000000000000..8afc2ed70361
--- /dev/null
+++ b/arch/arm64/boot/dts/apple/t8103-j274.dts
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0+ OR MIT
+/*
+ * Apple Mac mini (M1, 2020)
+ *
+ * target-type: J274
+ *
+ * Copyright The Asahi Linux Contributors
+ */
+
+/dts-v1/;
+
+#include "t8103.dtsi"
+
+/ {
+   compatible = "apple,j274", "apple,t8103", "apple,arm-platform";
+   model = "Apple Mac mini (M1, 2020)";
+
+   aliases {
+   serial0 = &serial0;
+   };
+
+   chosen {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   stdout-path = "serial0";
+
+   framebuffer0: framebuffer@0 {
+   compatible = "apple,simple-framebuffer", "simple-framebuffer";
+   reg = <0 0 0 0>; /* To be filled by loader */
+   /* Format properties will be added by loader */
+   status = "disabled";
+   };
+   };
+
+   memory@8 {
+   device_type = "memory";
+   reg = <0x8 0 0x2 0>; /* To be filled by loader */
+   };
+};
+
+&serial0 {
+   status = "okay";
+};
diff --git a/arch/arm64/boot/dts/apple/t8103.dtsi b/arch/arm64/boot/dts/apple/t8103.dtsi
new file mode 100644
index 000000000000..aac9e4e6abc5
--- /dev/null
+++ b/arch/arm64/boot/dts/apple/t8103.dtsi
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0+ OR MIT
+/*
+ * Apple T8103 "M1" SoC
+ *
+ * Other names: H13G, "Tonga"
+ *
+ * Copyright The Asahi Linux Contributors
+ */
+
+#include 
+#include 
+
+/ {
+   compatible = "apple,t8103", "apple,arm-platform";
+
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   cpus {
+   #address-cells = <2>;
+   #size-cells = <0>;
+
+   cpu0: cpu@0 {
+   compatible = "apple,icestorm";
+   device_type = "cpu";
+   reg = <0x0 0x0>;
+   enable-method = "spin-table";
+   cpu-release-addr = <0 0>; /* To be filled by loader */
+   };
+
+   cpu1: cpu@1 {
+   compatible = "apple,icestorm";
+   device_type = "cpu";
+   reg = <0x0 0x1>;
+   enable-method = "spin-table";
+   cpu-release-addr = <0 0>; /* To be filled by loader */
+   };
+
+   cpu2: cpu@2 {
+   compatible = "apple,icestorm";
+   device_type = "cpu";
+

[RFT PATCH v3 24/27] tty: serial: samsung_tty: Add support for Apple UARTs

2021-03-04 Thread Hector Martin
Apple SoCs are a distant descendant of Samsung designs and use yet
another variant of their UART style, with different interrupt handling.

In particular, this variant differs from the existing ones in the
following ways:

* It includes a built-in interrupt controller with different registers,
  using only a single platform IRQ

* Internal interrupt sources are treated as edge-triggered, even though
  the IRQ output is level-triggered. This chiefly affects the TX IRQ
  path: the driver can no longer rely on the TX buffer empty IRQ
  immediately firing after TX is enabled, but instead must prime the
  FIFO with data directly.
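
To illustrate the second point, here is a small userspace model of why the
FIFO has to be primed when the TX interrupt source is edge-triggered. This
is only a sketch under simplified assumptions (a 4-byte FIFO that drains
completely between interrupts); all of the model_* names are made up and it
is not the driver code from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FIFO_SIZE 4

/* Hypothetical, simplified UART TX model. */
struct model_uart {
    bool edge_triggered;  /* true: IRQ only on a busy -> empty transition */
    bool tx_irq_enabled;
    bool irq_pending;
    size_t fifo_level;    /* bytes currently sitting in the TX FIFO */
    size_t pending;       /* bytes still waiting in the software buffer */
    size_t sent;          /* bytes handed to the FIFO so far */
};

/* Driver side: move as much pending data as fits into the FIFO. */
static void model_tx_fill(struct model_uart *u)
{
    while (u->pending > 0 && u->fifo_level < MODEL_FIFO_SIZE) {
        u->fifo_level++;
        u->pending--;
        u->sent++;
    }
}

/* Hardware side: shift out the whole FIFO, then update the IRQ state. */
static void model_hw_drain(struct model_uart *u)
{
    bool was_busy = u->fifo_level > 0;

    u->fifo_level = 0;
    if (!u->tx_irq_enabled)
        return;
    if (u->edge_triggered)
        u->irq_pending = was_busy;  /* needs an actual transition */
    else
        u->irq_pending = true;      /* level: an empty FIFO keeps asserting */
}

/* Enable TX, optionally prime the FIFO, then service IRQs until idle. */
static size_t model_transmit(struct model_uart *u, size_t nbytes, bool prime)
{
    assert(u != NULL);
    u->pending = nbytes;
    u->sent = 0;
    u->fifo_level = 0;
    u->tx_irq_enabled = true;

    if (prime)
        model_tx_fill(u);
    model_hw_drain(u);

    while (u->irq_pending && u->pending > 0) {
        u->irq_pending = false;
        model_tx_fill(u);   /* what the TX IRQ handler would do */
        model_hw_drain(u);
    }
    return u->sent;
}
```

In this model a level-triggered port sends everything with or without
priming, while an edge-triggered port sends nothing unless model_transmit()
is called with prime set, which is the behaviour the TX path change is
working around.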

Signed-off-by: Hector Martin 
---
 drivers/tty/serial/Kconfig   |   2 +-
 drivers/tty/serial/samsung_tty.c | 238 +--
 include/linux/serial_s3c.h   |  16 +++
 3 files changed, 247 insertions(+), 9 deletions(-)

diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
index 0c4cd4a348f4..3ba31ea20d8a 100644
--- a/drivers/tty/serial/Kconfig
+++ b/drivers/tty/serial/Kconfig
@@ -236,7 +236,7 @@ config SERIAL_CLPS711X_CONSOLE
 
 config SERIAL_SAMSUNG
tristate "Samsung SoC serial support"
-   depends on PLAT_SAMSUNG || ARCH_S5PV210 || ARCH_EXYNOS || COMPILE_TEST
+   depends on PLAT_SAMSUNG || ARCH_S5PV210 || ARCH_EXYNOS || ARCH_APPLE || COMPILE_TEST
select SERIAL_CORE
help
  Support for the on-chip UARTs on the Samsung S3C24XX series CPUs,
diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
index 26cb05992e9f..5ef37c4538ce 100644
--- a/drivers/tty/serial/samsung_tty.c
+++ b/drivers/tty/serial/samsung_tty.c
@@ -59,6 +59,7 @@
 enum s3c24xx_port_type {
TYPE_S3C24XX,
TYPE_S3C6400,
+   TYPE_APPLE_S5L,
 };
 
 struct s3c24xx_uart_info {
@@ -151,6 +152,8 @@ struct s3c24xx_uart_port {
 #endif
 };
 
+static void s3c24xx_serial_tx_chars(struct s3c24xx_uart_port *ourport);
+
 /* conversion functions */
 
 #define s3c24xx_dev_to_port(__dev) dev_get_drvdata(__dev)
@@ -290,6 +293,9 @@ static void s3c24xx_serial_stop_tx(struct uart_port *port)
case TYPE_S3C6400:
s3c24xx_set_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
break;
+   case TYPE_APPLE_S5L:
+   s3c24xx_clear_bit(port, APPLE_S5L_UCON_TXTHRESH_ENA, S3C2410_UCON);
+   break;
default:
disable_irq_nosync(ourport->tx_irq);
break;
@@ -358,6 +364,9 @@ static void enable_tx_dma(struct s3c24xx_uart_port *ourport)
case TYPE_S3C6400:
s3c24xx_set_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
break;
+   case TYPE_APPLE_S5L:
+   WARN_ON(1); // No DMA
+   break;
default:
disable_irq_nosync(ourport->tx_irq);
break;
@@ -396,12 +405,23 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD,
  S3C64XX_UINTM);
break;
+   case TYPE_APPLE_S5L:
+   ucon |= APPLE_S5L_UCON_TXTHRESH_ENA_MSK;
+   wr_regl(port, S3C2410_UCON, ucon);
+   break;
default:
enable_irq(ourport->tx_irq);
break;
}
 
ourport->tx_mode = S3C24XX_TX_PIO;
+
+   /*
+* The Apple version only has edge triggered TX IRQs, so we need
+* to kick off the process by sending some characters here.
+*/
+   if (ourport->info->type == TYPE_APPLE_S5L)
+   s3c24xx_serial_tx_chars(ourport);
 }
 
 static void s3c24xx_serial_start_tx_pio(struct s3c24xx_uart_port *ourport)
@@ -527,6 +547,10 @@ static void s3c24xx_serial_stop_rx(struct uart_port *port)
s3c24xx_set_bit(port, S3C64XX_UINTM_RXD,
S3C64XX_UINTM);
break;
+   case TYPE_APPLE_S5L:
+   s3c24xx_clear_bit(port, APPLE_S5L_UCON_RXTHRESH_ENA, S3C2410_UCON);
+   s3c24xx_clear_bit(port, APPLE_S5L_UCON_RXTO_ENA, S3C2410_UCON);
+   break;
default:
disable_irq_nosync(ourport->rx_irq);
break;
@@ -664,14 +688,18 @@ static void enable_rx_pio(struct s3c24xx_uart_port *ourport)
 
/* set Rx mode to DMA mode */
ucon = rd_regl(port, S3C2410_UCON);
-   ucon &= ~(S3C64XX_UCON_TIMEOUT_MASK |
-   S3C64XX_UCON_EMPTYINT_EN |
-   S3C64XX_UCON_DMASUS_EN |
-   S3C64XX_UCON_TIMEOUT_EN |
-   S3C64XX_UCON_RXMODE_MASK);
-   ucon |= 0xf << S3C64XX_UCON_TIMEOUT_SHIFT |
-   S3C64XX_UCON_TIMEOUT_EN |
-   S3C64XX_UCON_RXMODE_CPU;
+   ucon &= ~S3C64XX_UCON_RXMODE_MASK;
+   ucon |= S3C64XX_

[RFT PATCH v3 26/27] dt-bindings: display: Add apple,simple-framebuffer

2021-03-04 Thread Hector Martin
Apple SoCs run firmware that sets up a simplefb-compatible framebuffer
for us. Add a compatible for it, and two missing supported formats.
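
For reference, the bit layouts of the two new formats (as documented in the
binding change in this patch) can be written out as packing helpers. This is
just an illustrative sketch; the function names are made up, and the unused
"x" bits are simply left as zero here.

```c
#include <assert.h>
#include <stdint.h>

/* x2r10g10b10: d[29:20]=r, d[19:10]=g, d[9:0]=b; d[31:30] unused. */
static uint32_t pack_x2r10g10b10(uint32_t r, uint32_t g, uint32_t b)
{
    assert(r < 1024 && g < 1024 && b < 1024);  /* 10 bits per channel */
    return (r << 20) | (g << 10) | b;
}

/* x8r8g8b8: d[23:16]=r, d[15:8]=g, d[7:0]=b; d[31:24] unused. */
static uint32_t pack_x8r8g8b8(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}
```

For example, full-intensity red is 0x3ff00000 in x2r10g10b10 and 0x00ff0000
in x8r8g8b8.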

Signed-off-by: Hector Martin 
---
 .../devicetree/bindings/display/simple-framebuffer.yaml  | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/simple-framebuffer.yaml 
b/Documentation/devicetree/bindings/display/simple-framebuffer.yaml
index eaf8c54fcf50..c2499a7906f5 100644
--- a/Documentation/devicetree/bindings/display/simple-framebuffer.yaml
+++ b/Documentation/devicetree/bindings/display/simple-framebuffer.yaml
@@ -54,6 +54,7 @@ properties:
   compatible:
 items:
   - enum:
+  - apple,simple-framebuffer
   - allwinner,simple-framebuffer
   - amlogic,simple-framebuffer
   - const: simple-framebuffer
@@ -84,9 +85,13 @@ properties:
   Format of the framebuffer:
 * `a8b8g8r8` - 32-bit pixels, d[31:24]=a, d[23:16]=b, d[15:8]=g, 
d[7:0]=r
 * `r5g6b5` - 16-bit pixels, d[15:11]=r, d[10:5]=g, d[4:0]=b
+* `x2r10g10b10` - 32-bit pixels, d[29:20]=r, d[19:10]=g, d[9:0]=b
+* `x8r8g8b8` - 32-bit pixels, d[23:16]=r, d[15:8]=g, d[7:0]=b
 enum:
   - a8b8g8r8
   - r5g6b5
+  - x2r10g10b10
+  - x8r8g8b8
 
   display:
 $ref: /schemas/types.yaml#/definitions/phandle
-- 
2.30.0



[RFT PATCH v3 22/27] tty: serial: samsung_tty: Use devm_ioremap_resource

2021-03-04 Thread Hector Martin
This picks up the non-posted I/O mode needed for Apple platforms to
work properly.

This removes the request/release functions, which are no longer
necessary, since devm_ioremap_resource takes care of that already. Most
other drivers already do it this way, anyway.
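
One detail worth spelling out: the failure check in the diff changes from
a NULL test to IS_ERR(), because devm_ioremap_resource() reports failure
through an error pointer rather than NULL. The following is a userspace
sketch of that convention, not kernel code; the model_* helpers are
stand-ins written for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static void *model_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static int model_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

static long model_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static char model_regs[0x100];  /* stands in for the mapped registers */

/* Stand-in for a mapping call that fails with an error pointer. */
static void *model_ioremap_resource(int region_busy)
{
    if (region_busy)
        return model_err_ptr(-EBUSY);
    return model_regs;
}

static int model_init_port(int region_busy)
{
    void *membase = model_ioremap_resource(region_busy);

    /* A "!membase" test would miss this: the error pointer is not NULL. */
    if (model_is_err(membase))
        return (int)model_ptr_err(membase);
    return 0;
}
```

Note that the sketch propagates the encoded error code, whereas the hunk in
this patch keeps the existing "return -EBUSY" on failure.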

Signed-off-by: Hector Martin 
---
 drivers/tty/serial/samsung_tty.c | 25 +++--
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
index 7106eb238d8c..26cb05992e9f 100644
--- a/drivers/tty/serial/samsung_tty.c
+++ b/drivers/tty/serial/samsung_tty.c
@@ -1573,26 +1573,11 @@ static const char *s3c24xx_serial_type(struct uart_port *port)
}
 }
 
-#define MAP_SIZE (0x100)
-
-static void s3c24xx_serial_release_port(struct uart_port *port)
-{
-   release_mem_region(port->mapbase, MAP_SIZE);
-}
-
-static int s3c24xx_serial_request_port(struct uart_port *port)
-{
-   const char *name = s3c24xx_serial_portname(port);
-
-   return request_mem_region(port->mapbase, MAP_SIZE, name) ? 0 : -EBUSY;
-}
-
 static void s3c24xx_serial_config_port(struct uart_port *port, int flags)
 {
struct s3c24xx_uart_info *info = s3c24xx_port_to_info(port);
 
-   if (flags & UART_CONFIG_TYPE &&
-   s3c24xx_serial_request_port(port) == 0)
+   if (flags & UART_CONFIG_TYPE)
port->type = info->port_type;
 }
 
@@ -1645,8 +1630,6 @@ static const struct uart_ops s3c24xx_serial_ops = {
.shutdown   = s3c24xx_serial_shutdown,
.set_termios= s3c24xx_serial_set_termios,
.type   = s3c24xx_serial_type,
-   .release_port   = s3c24xx_serial_release_port,
-   .request_port   = s3c24xx_serial_request_port,
.config_port= s3c24xx_serial_config_port,
.verify_port= s3c24xx_serial_verify_port,
 #if defined(CONFIG_SERIAL_SAMSUNG_CONSOLE) && defined(CONFIG_CONSOLE_POLL)
@@ -1668,8 +1651,6 @@ static const struct uart_ops s3c64xx_serial_ops = {
.shutdown   = s3c64xx_serial_shutdown,
.set_termios= s3c24xx_serial_set_termios,
.type   = s3c24xx_serial_type,
-   .release_port   = s3c24xx_serial_release_port,
-   .request_port   = s3c24xx_serial_request_port,
.config_port= s3c24xx_serial_config_port,
.verify_port= s3c24xx_serial_verify_port,
 #if defined(CONFIG_SERIAL_SAMSUNG_CONSOLE) && defined(CONFIG_CONSOLE_POLL)
@@ -1927,8 +1908,8 @@ static int s3c24xx_serial_init_port(struct s3c24xx_uart_port *ourport,
 
dev_dbg(port->dev, "resource %pR)\n", res);
 
-   port->membase = devm_ioremap(port->dev, res->start, resource_size(res));
-   if (!port->membase) {
+   port->membase = devm_ioremap_resource(port->dev, res);
+   if (IS_ERR(port->membase)) {
dev_err(port->dev, "failed to remap controller address\n");
return -EBUSY;
}
-- 
2.30.0



[RFT PATCH v3 20/27] tty: serial: samsung_tty: Add s3c24xx_port_type

2021-03-04 Thread Hector Martin
This decouples the TTY layer PORT_ types, which are exposed to
userspace, from the driver-internal flag of what kind of port this is.

This removes s3c24xx_serial_has_interrupt_mask, which was just checking
for a specific type anyway.
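
As a rough illustration of the split: behaviour is selected by the internal
enum, while the userspace-visible identifier is carried separately and never
consulted for control flow. The sketch below uses made-up model_* names and
placeholder port_type numbers, not the real PORT_* values.

```c
#include <assert.h>
#include <stddef.h>

/* Driver-internal hardware flavour (mirrors the idea of s3c24xx_port_type). */
enum model_port_type {
    MODEL_TYPE_S3C24XX,
    MODEL_TYPE_S3C6400,
};

struct model_uart_info {
    enum model_port_type type;  /* selects driver behaviour */
    unsigned int port_type;     /* reported to userspace; placeholder values */
};

enum model_mask_method {
    MODEL_MASK_VIA_IRQ_CORE,    /* disable_irq()-style masking */
    MODEL_MASK_VIA_UINTM,       /* mask bit inside the UART itself */
};

static enum model_mask_method
model_tx_mask_method(const struct model_uart_info *info)
{
    assert(info != NULL);
    switch (info->type) {
    case MODEL_TYPE_S3C6400:
        return MODEL_MASK_VIA_UINTM;
    default:
        return MODEL_MASK_VIA_IRQ_CORE;
    }
}
```

Two ports that report different identifiers to userspace can then share the
same internal type and therefore the same code paths, which is also what
makes it straightforward to add a further type later in the series.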

Signed-off-by: Hector Martin 
---
 drivers/tty/serial/samsung_tty.c | 112 +++
 1 file changed, 70 insertions(+), 42 deletions(-)

diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
index 33b421dbeb83..39b2eb165bdc 100644
--- a/drivers/tty/serial/samsung_tty.c
+++ b/drivers/tty/serial/samsung_tty.c
@@ -56,9 +56,15 @@
 /* flag to ignore all characters coming in */
 #define RXSTAT_DUMMY_READ (0x1000)
 
+enum s3c24xx_port_type {
+   TYPE_S3C24XX,
+   TYPE_S3C6400,
+};
+
 struct s3c24xx_uart_info {
char*name;
-   unsigned inttype;
+   enum s3c24xx_port_type  type;
+   unsigned intport_type;
unsigned intfifosize;
unsigned long   rx_fifomask;
unsigned long   rx_fifoshift;
@@ -229,16 +235,6 @@ static int s3c24xx_serial_txempty_nofifo(struct uart_port *port)
return rd_regl(port, S3C2410_UTRSTAT) & S3C2410_UTRSTAT_TXE;
 }
 
-/*
- * s3c64xx and later SoC's include the interrupt mask and status registers in
- * the controller itself, unlike the s3c24xx SoC's which have these registers
- * in the interrupt controller. Check if the port type is s3c64xx or higher.
- */
-static int s3c24xx_serial_has_interrupt_mask(struct uart_port *port)
-{
-   return to_ourport(port)->info->type == PORT_S3C6400;
-}
-
 static void s3c24xx_serial_rx_enable(struct uart_port *port)
 {
struct s3c24xx_uart_port *ourport = to_ourport(port);
@@ -290,10 +286,14 @@ static void s3c24xx_serial_stop_tx(struct uart_port *port)
if (!ourport->tx_enabled)
return;
 
-   if (s3c24xx_serial_has_interrupt_mask(port))
+   switch (ourport->info->type) {
+   case TYPE_S3C6400:
s3c24xx_set_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
-   else
+   break;
+   default:
disable_irq_nosync(ourport->tx_irq);
+   break;
+   }
 
if (dma && dma->tx_chan && ourport->tx_in_progress == S3C24XX_TX_DMA) {
dmaengine_pause(dma->tx_chan);
@@ -354,10 +354,14 @@ static void enable_tx_dma(struct s3c24xx_uart_port *ourport)
u32 ucon;
 
/* Mask Tx interrupt */
-   if (s3c24xx_serial_has_interrupt_mask(port))
+   switch (ourport->info->type) {
+   case TYPE_S3C6400:
s3c24xx_set_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
-   else
+   break;
+   default:
disable_irq_nosync(ourport->tx_irq);
+   break;
+   }
 
/* Enable tx dma mode */
ucon = rd_regl(port, S3C2410_UCON);
@@ -387,11 +391,15 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
wr_regl(port,  S3C2410_UCON, ucon);
 
/* Unmask Tx interrupt */
-   if (s3c24xx_serial_has_interrupt_mask(port))
+   switch (ourport->info->type) {
+   case TYPE_S3C6400:
s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD,
  S3C64XX_UINTM);
-   else
+   break;
+   default:
enable_irq(ourport->tx_irq);
+   break;
+   }
 
ourport->tx_mode = S3C24XX_TX_PIO;
 }
@@ -514,11 +522,15 @@ static void s3c24xx_serial_stop_rx(struct uart_port *port)
 
if (ourport->rx_enabled) {
dev_dbg(port->dev, "stopping rx\n");
-   if (s3c24xx_serial_has_interrupt_mask(port))
+   switch (ourport->info->type) {
+   case TYPE_S3C6400:
s3c24xx_set_bit(port, S3C64XX_UINTM_RXD,
S3C64XX_UINTM);
-   else
+   break;
+   default:
disable_irq_nosync(ourport->rx_irq);
+   break;
+   }
ourport->rx_enabled = 0;
}
if (dma && dma->rx_chan) {
@@ -1543,14 +1555,12 @@ static void s3c24xx_serial_set_termios(struct uart_port *port,
 
 static const char *s3c24xx_serial_type(struct uart_port *port)
 {
-   switch (port->type) {
-   case PORT_S3C2410:
-   return "S3C2410";
-   case PORT_S3C2440:
-   return "S3C2440";
-   case PORT_S3C2412:
-   return "S3C2412";
-   case PORT_S3C6400:
+   struct s3c24xx_uart_port *ourport = to_ourport(port);
+
+   switch (ourport->info->type) {
+   case TYPE_S3C24XX:
+   return "S3C24XX";
+   case TYPE_S3C6400:
return "S3C6400/10&quo

[RFT PATCH v3 21/27] tty: serial: samsung_tty: IRQ rework

2021-03-04 Thread Hector Martin
* Split out s3c24xx_serial_tx_chars from s3c24xx_serial_tx_irq,
  where only the latter acquires the port lock. This will be necessary
  on platforms which have edge-triggered IRQs, as we need to call
  s3c24xx_serial_tx_chars to kick off transmission from outside IRQ
  context, with the port lock held.

* Rename s3c24xx_serial_rx_chars to s3c24xx_serial_rx_irq for
  consistency with the above. All it does now is call two other
  functions anyway.
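
The locking pattern behind the first point, sketched in userspace: the core
routine assumes the lock is already held, and only the IRQ entry point takes
it. A plain flag stands in for the port spinlock here, and the lock_model_*
names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct lock_model_port {
    bool locked;            /* stands in for the port spinlock */
    unsigned int tx_calls;  /* how often the core TX routine ran */
};

static void lock_model_lock(struct lock_model_port *p)
{
    assert(p != NULL && !p->locked);  /* taking it twice would deadlock */
    p->locked = true;
}

static void lock_model_unlock(struct lock_model_port *p)
{
    assert(p != NULL && p->locked);
    p->locked = false;
}

/* Core TX routine: the caller must already hold the lock. */
static void lock_model_tx_chars(struct lock_model_port *p)
{
    assert(p->locked);
    p->tx_calls++;
}

/* IRQ entry point: takes the lock itself, then calls the core routine. */
static void lock_model_tx_irq(struct lock_model_port *p)
{
    lock_model_lock(p);
    lock_model_tx_chars(p);
    lock_model_unlock(p);
}

/* Non-IRQ caller that is entered with the lock already held. */
static void lock_model_kick_tx(struct lock_model_port *p)
{
    lock_model_tx_chars(p);
}
```

Had the core routine kept taking the lock itself, lock_model_kick_tx() would
trip the assertion in lock_model_lock(), which corresponds to a deadlock with
a real spinlock.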

Signed-off-by: Hector Martin 
---
 drivers/tty/serial/samsung_tty.c | 34 +++-
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
index 39b2eb165bdc..7106eb238d8c 100644
--- a/drivers/tty/serial/samsung_tty.c
+++ b/drivers/tty/serial/samsung_tty.c
@@ -827,7 +827,7 @@ static irqreturn_t s3c24xx_serial_rx_chars_pio(void *dev_id)
return IRQ_HANDLED;
 }
 
-static irqreturn_t s3c24xx_serial_rx_chars(int irq, void *dev_id)
+static irqreturn_t s3c24xx_serial_rx_irq(int irq, void *dev_id)
 {
struct s3c24xx_uart_port *ourport = dev_id;
 
@@ -836,16 +836,12 @@ static irqreturn_t s3c24xx_serial_rx_chars(int irq, void *dev_id)
return s3c24xx_serial_rx_chars_pio(dev_id);
 }
 
-static irqreturn_t s3c24xx_serial_tx_chars(int irq, void *id)
+static void s3c24xx_serial_tx_chars(struct s3c24xx_uart_port *ourport)
 {
-   struct s3c24xx_uart_port *ourport = id;
struct uart_port *port = &ourport->port;
struct circ_buf *xmit = &port->state->xmit;
-   unsigned long flags;
int count, dma_count = 0;
 
-   spin_lock_irqsave(&port->lock, flags);
-
count = CIRC_CNT_TO_END(xmit->head, xmit->tail, UART_XMIT_SIZE);
 
if (ourport->dma && ourport->dma->tx_chan &&
@@ -862,7 +858,7 @@ static irqreturn_t s3c24xx_serial_tx_chars(int irq, void *id)
wr_reg(port, S3C2410_UTXH, port->x_char);
port->icount.tx++;
port->x_char = 0;
-   goto out;
+   return;
}
 
/* if there isn't anything more to transmit, or the uart is now
@@ -871,7 +867,7 @@ static irqreturn_t s3c24xx_serial_tx_chars(int irq, void *id)
 
if (uart_circ_empty(xmit) || uart_tx_stopped(port)) {
s3c24xx_serial_stop_tx(port);
-   goto out;
+   return;
}
 
/* try and drain the buffer... */
@@ -893,7 +889,7 @@ static irqreturn_t s3c24xx_serial_tx_chars(int irq, void *id)
 
if (!count && dma_count) {
s3c24xx_serial_start_tx_dma(ourport, dma_count);
-   goto out;
+   return;
}
 
if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS) {
@@ -904,8 +900,18 @@ static irqreturn_t s3c24xx_serial_tx_chars(int irq, void *id)
 
if (uart_circ_empty(xmit))
s3c24xx_serial_stop_tx(port);
+}
+
+static irqreturn_t s3c24xx_serial_tx_irq(int irq, void *id)
+{
+   struct s3c24xx_uart_port *ourport = id;
+   struct uart_port *port = &ourport->port;
+   unsigned long flags;
+
+   spin_lock_irqsave(&port->lock, flags);
+
+   s3c24xx_serial_tx_chars(ourport);
 
-out:
spin_unlock_irqrestore(&port->lock, flags);
return IRQ_HANDLED;
 }
@@ -919,11 +925,11 @@ static irqreturn_t s3c64xx_serial_handle_irq(int irq, void *id)
irqreturn_t ret = IRQ_HANDLED;
 
if (pend & S3C64XX_UINTM_RXD_MSK) {
-   ret = s3c24xx_serial_rx_chars(irq, id);
+   ret = s3c24xx_serial_rx_irq(irq, id);
wr_regl(port, S3C64XX_UINTP, S3C64XX_UINTM_RXD_MSK);
}
if (pend & S3C64XX_UINTM_TXD_MSK) {
-   ret = s3c24xx_serial_tx_chars(irq, id);
+   ret = s3c24xx_serial_tx_irq(irq, id);
wr_regl(port, S3C64XX_UINTP, S3C64XX_UINTM_TXD_MSK);
}
return ret;
@@ -1155,7 +1161,7 @@ static int s3c24xx_serial_startup(struct uart_port *port)
 
ourport->rx_enabled = 1;
 
-   ret = request_irq(ourport->rx_irq, s3c24xx_serial_rx_chars, 0,
+   ret = request_irq(ourport->rx_irq, s3c24xx_serial_rx_irq, 0,
  s3c24xx_serial_portname(port), ourport);
 
if (ret != 0) {
@@ -1169,7 +1175,7 @@ static int s3c24xx_serial_startup(struct uart_port *port)
 
ourport->tx_enabled = 1;
 
-   ret = request_irq(ourport->tx_irq, s3c24xx_serial_tx_chars, 0,
+   ret = request_irq(ourport->tx_irq, s3c24xx_serial_tx_irq, 0,
  s3c24xx_serial_portname(port), ourport);
 
if (ret) {
-- 
2.30.0



[RFT PATCH v3 18/27] tty: serial: samsung_tty: Separate S3C64XX ops structure

2021-03-04 Thread Hector Martin
Instead of patching a single global ops structure depending on the port
type, use a separate s3c64xx_serial_ops for the S3C64XX type. This
allows us to mark the structures as const.

Also split out s3c64xx_serial_shutdown into a separate function now that
we have a separate ops structure; this avoids excessive branching
control flow and mirrors s3c64xx_serial_startup. tx_claimed and
rx_claimed are only used in the S3C24XX functions.
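
The general shape of this change, as a standalone sketch: rather than
patching a function pointer in one shared, writable table at init time, each
variant gets its own const table and the right one is selected per port. The
model_* names below are invented; the IRQ counts only reflect that the
S3C24XX path requests separate RX and TX IRQs while the S3C64XX path uses a
single port IRQ.

```c
#include <assert.h>
#include <stddef.h>

struct model_port {
    int irqs_requested;
};

struct model_uart_ops {
    int (*startup)(struct model_port *port);
    void (*shutdown)(struct model_port *port);
};

static int model_s3c24xx_startup(struct model_port *port)
{
    assert(port != NULL);
    port->irqs_requested = 2;  /* separate RX and TX IRQs */
    return 0;
}

static void model_s3c24xx_shutdown(struct model_port *port)
{
    port->irqs_requested = 0;
}

static int model_s3c64xx_startup(struct model_port *port)
{
    assert(port != NULL);
    port->irqs_requested = 1;  /* one combined port IRQ */
    return 0;
}

static void model_s3c64xx_shutdown(struct model_port *port)
{
    port->irqs_requested = 0;
}

/* Both tables are const: nothing rewrites them after build time. */
static const struct model_uart_ops model_s3c24xx_ops = {
    .startup  = model_s3c24xx_startup,
    .shutdown = model_s3c24xx_shutdown,
};

static const struct model_uart_ops model_s3c64xx_ops = {
    .startup  = model_s3c64xx_startup,
    .shutdown = model_s3c64xx_shutdown,
};

/* Select a table per port instead of editing a shared one. */
static const struct model_uart_ops *model_select_ops(int has_irq_mask)
{
    return has_irq_mask ? &model_s3c64xx_ops : &model_s3c24xx_ops;
}
```

With a single patched table, whichever port type initialised last would
decide the startup routine for every port; separate const tables avoid that
and can live in read-only data.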

Signed-off-by: Hector Martin 
---
 drivers/tty/serial/samsung_tty.c | 71 
 1 file changed, 54 insertions(+), 17 deletions(-)

diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
index 8ae3e03fbd8c..78dc6e9240fb 100644
--- a/drivers/tty/serial/samsung_tty.c
+++ b/drivers/tty/serial/samsung_tty.c
@@ -1098,27 +1098,36 @@ static void s3c24xx_serial_shutdown(struct uart_port *port)
struct s3c24xx_uart_port *ourport = to_ourport(port);
 
if (ourport->tx_claimed) {
-   if (!s3c24xx_serial_has_interrupt_mask(port))
-   free_irq(ourport->tx_irq, ourport);
+   free_irq(ourport->tx_irq, ourport);
ourport->tx_enabled = 0;
ourport->tx_claimed = 0;
ourport->tx_mode = 0;
}
 
if (ourport->rx_claimed) {
-   if (!s3c24xx_serial_has_interrupt_mask(port))
-   free_irq(ourport->rx_irq, ourport);
+   free_irq(ourport->rx_irq, ourport);
ourport->rx_claimed = 0;
ourport->rx_enabled = 0;
}
 
-   /* Clear pending interrupts and mask all interrupts */
-   if (s3c24xx_serial_has_interrupt_mask(port)) {
-   free_irq(port->irq, ourport);
+   if (ourport->dma)
+   s3c24xx_serial_release_dma(ourport);
 
-   wr_regl(port, S3C64XX_UINTP, 0xf);
-   wr_regl(port, S3C64XX_UINTM, 0xf);
-   }
+   ourport->tx_in_progress = 0;
+}
+
+static void s3c64xx_serial_shutdown(struct uart_port *port)
+{
+   struct s3c24xx_uart_port *ourport = to_ourport(port);
+
+   ourport->tx_enabled = 0;
+   ourport->tx_mode = 0;
+   ourport->rx_enabled = 0;
+
+   free_irq(port->irq, ourport);
+
+   wr_regl(port, S3C64XX_UINTP, 0xf);
+   wr_regl(port, S3C64XX_UINTM, 0xf);
 
if (ourport->dma)
s3c24xx_serial_release_dma(ourport);
@@ -1193,9 +1202,7 @@ static int s3c64xx_serial_startup(struct uart_port *port)
 
/* For compatibility with s3c24xx Soc's */
ourport->rx_enabled = 1;
-   ourport->rx_claimed = 1;
ourport->tx_enabled = 0;
-   ourport->tx_claimed = 1;
 
spin_lock_irqsave(&port->lock, flags);
 
@@ -1608,7 +1615,7 @@ static void s3c24xx_serial_put_poll_char(struct uart_port *port,
 unsigned char c);
 #endif
 
-static struct uart_ops s3c24xx_serial_ops = {
+static const struct uart_ops s3c24xx_serial_ops = {
.pm = s3c24xx_serial_pm,
.tx_empty   = s3c24xx_serial_tx_empty,
.get_mctrl  = s3c24xx_serial_get_mctrl,
@@ -1631,6 +1638,29 @@ static struct uart_ops s3c24xx_serial_ops = {
 #endif
 };
 
+static const struct uart_ops s3c64xx_serial_ops = {
+   .pm = s3c24xx_serial_pm,
+   .tx_empty   = s3c24xx_serial_tx_empty,
+   .get_mctrl  = s3c24xx_serial_get_mctrl,
+   .set_mctrl  = s3c24xx_serial_set_mctrl,
+   .stop_tx= s3c24xx_serial_stop_tx,
+   .start_tx   = s3c24xx_serial_start_tx,
+   .stop_rx= s3c24xx_serial_stop_rx,
+   .break_ctl  = s3c24xx_serial_break_ctl,
+   .startup= s3c64xx_serial_startup,
+   .shutdown   = s3c64xx_serial_shutdown,
+   .set_termios= s3c24xx_serial_set_termios,
+   .type   = s3c24xx_serial_type,
+   .release_port   = s3c24xx_serial_release_port,
+   .request_port   = s3c24xx_serial_request_port,
+   .config_port= s3c24xx_serial_config_port,
+   .verify_port= s3c24xx_serial_verify_port,
+#if defined(CONFIG_SERIAL_SAMSUNG_CONSOLE) && defined(CONFIG_CONSOLE_POLL)
+   .poll_get_char = s3c24xx_serial_get_poll_char,
+   .poll_put_char = s3c24xx_serial_put_poll_char,
+#endif
+};
+
 static struct uart_driver s3c24xx_uart_drv = {
.owner  = THIS_MODULE,
.driver_name= "s3c2410_serial",
@@ -1868,10 +1898,6 @@ static int s3c24xx_serial_init_port(struct s3c24xx_uart_port *ourport,
/* setup info for port */
port->dev   = >dev;
 
-   /* Startup sequence is different for s3c64xx and higher SoC's */
-   if (s3c24xx_serial_has_interrupt_mask(port))
-   s3c24xx_serial_ops.startup = s3c64xx_serial_startup;
-
port->uartclk = 1;
 
if (cfg->uart_flags & UPF_CONS_FLOW) {
@@ -2019,6 +2045,17 @@ st

[RFT PATCH v3 19/27] tty: serial: samsung_tty: Add ucon_mask parameter

2021-03-04 Thread Hector Martin
This simplifies the code by removing the only distinction between the
S3C2410 and S3C2440 codepaths.
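
In other words, the per-variant difference becomes data instead of a branch.
A minimal sketch of the resulting reset computation is below; the struct and
function names are made up, and the mask values used in the usage note are
illustrative rather than copied from the register headers.

```c
#include <assert.h>
#include <stddef.h>

struct model_uart_info {
    unsigned long clksel_mask;  /* clock-select bits to preserve */
    unsigned long ucon_mask;    /* extra variant-specific bits to preserve */
};

/*
 * Keep only the bits named by the two masks from the current UCON value,
 * then OR in the configured defaults.
 */
static unsigned long model_reset_ucon(const struct model_uart_info *info,
                                      unsigned long ucon,
                                      unsigned long cfg_ucon)
{
    assert(info != NULL);
    ucon &= (info->clksel_mask | info->ucon_mask);
    return ucon | cfg_ucon;
}
```

With clksel_mask = 0xc00 for both variants, a port whose ucon_mask is 0 turns
a current value of 0xffff plus defaults of 0x5 into 0xc05, while a port with
ucon_mask = 0xf000 yields 0xfc05. A variant that needs nothing extra simply
leaves ucon_mask at zero.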

Signed-off-by: Hector Martin 
---
 drivers/tty/serial/samsung_tty.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
index 78dc6e9240fb..33b421dbeb83 100644
--- a/drivers/tty/serial/samsung_tty.c
+++ b/drivers/tty/serial/samsung_tty.c
@@ -70,6 +70,7 @@ struct s3c24xx_uart_info {
unsigned long   num_clks;
unsigned long   clksel_mask;
unsigned long   clksel_shift;
+   unsigned long   ucon_mask;
 
/* uart port features */
 
@@ -1736,14 +1737,9 @@ static void s3c24xx_serial_resetport(struct uart_port *port,
 {
struct s3c24xx_uart_info *info = s3c24xx_port_to_info(port);
unsigned long ucon = rd_regl(port, S3C2410_UCON);
-   unsigned int ucon_mask;
 
-   ucon_mask = info->clksel_mask;
-   if (info->type == PORT_S3C2440)
-   ucon_mask |= S3C2440_UCON0_DIVMASK;
-
-   ucon &= ucon_mask;
-   wr_regl(port, S3C2410_UCON,  ucon | cfg->ucon);
+   ucon &= (info->clksel_mask | info->ucon_mask);
+   wr_regl(port, S3C2410_UCON, ucon | cfg->ucon);
 
/* reset both fifos */
wr_regl(port, S3C2410_UFCON, cfg->ufcon | S3C2410_UFCON_RESETBOTH);
@@ -2486,6 +2482,7 @@ static struct s3c24xx_serial_drv_data s3c2440_serial_drv_data = {
.num_clks   = 4,
.clksel_mask= S3C2412_UCON_CLKMASK,
.clksel_shift   = S3C2412_UCON_CLKSHIFT,
+   .ucon_mask  = S3C2440_UCON0_DIVMASK,
},
.def_cfg = &(struct s3c2410_uartcfg) {
.ucon   = S3C2410_UCON_DEFAULT,
-- 
2.30.0



[RFT PATCH v3 17/27] arm64: Kconfig: Introduce CONFIG_ARCH_APPLE

2021-03-04 Thread Hector Martin
This adds a Kconfig option to toggle support for Apple ARM SoCs.
At this time this targets the M1 and later "Apple Silicon" Mac SoCs.

Signed-off-by: Hector Martin 
---
 arch/arm64/Kconfig.platforms | 8 
 arch/arm64/configs/defconfig | 1 +
 2 files changed, 9 insertions(+)

diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index cdfd5fed457f..c2b5791e3d69 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -36,6 +36,14 @@ config ARCH_ALPINE
  This enables support for the Annapurna Labs Alpine
  Soc family.
 
+config ARCH_APPLE
+   bool "Apple Silicon SoC family"
+   select APPLE_AIC
+   select ARM64_FIQ_SUPPORT
+   help
+ This enables support for Apple's in-house ARM SoC family, starting
+ with the Apple M1.
+
 config ARCH_BCM2835
bool "Broadcom BCM2835 family"
select TIMER_OF
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index d612f633b771..54fb257e55f7 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -31,6 +31,7 @@ CONFIG_ARCH_ACTIONS=y
 CONFIG_ARCH_AGILEX=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_ARCH_ALPINE=y
+CONFIG_ARCH_APPLE=y
 CONFIG_ARCH_BCM2835=y
 CONFIG_ARCH_BCM4908=y
 CONFIG_ARCH_BCM_IPROC=y
-- 
2.30.0



[RFT PATCH v3 23/27] dt-bindings: serial: samsung: Add apple,s5l-uart compatible

2021-03-04 Thread Hector Martin
Apple mobile devices originally used Samsung SoCs (starting with the
S5L8900), and their current in-house SoCs continue to use compatible
UART peripherals. We'll call this UART variant apple,s5l-uart.

Signed-off-by: Hector Martin 
Reviewed-by: Krzysztof Kozlowski 
Reviewed-by: Linus Walleij 
---
 Documentation/devicetree/bindings/serial/samsung_uart.yaml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/serial/samsung_uart.yaml 
b/Documentation/devicetree/bindings/serial/samsung_uart.yaml
index 21ee627b2ced..a59be11acd4f 100644
--- a/Documentation/devicetree/bindings/serial/samsung_uart.yaml
+++ b/Documentation/devicetree/bindings/serial/samsung_uart.yaml
@@ -4,7 +4,7 @@
 $id: http://devicetree.org/schemas/serial/samsung_uart.yaml#
 $schema: http://devicetree.org/meta-schemas/core.yaml#
 
-title: Samsung S3C, S5P and Exynos SoC UART Controller
+title: Samsung S3C, S5P, Exynos, and S5L (Apple SoC) SoC UART Controller
 
 maintainers:
   - Krzysztof Kozlowski 
@@ -19,6 +19,7 @@ properties:
   compatible:
 items:
   - enum:
+  - apple,s5l-uart
   - samsung,s3c2410-uart
   - samsung,s3c2412-uart
   - samsung,s3c2440-uart
@@ -96,6 +97,7 @@ allOf:
 compatible:
   contains:
 enum:
+  - apple,s5l-uart
   - samsung,exynos4210-uart
 then:
   properties:
-- 
2.30.0



[RFT PATCH v3 15/27] dt-bindings: interrupt-controller: Add DT bindings for apple-aic

2021-03-04 Thread Hector Martin
AIC is the Apple Interrupt Controller found on Apple ARM SoCs, such as
the M1.
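
For readers skimming the binding: an interrupt specifier has three cells
(type, number, flags), where type 0 is a hardware IRQ and type 1 is a FIQ,
and FIQ numbers 0 to 3 select the four timer sources. A small sketch of
decoding such a specifier follows; the MODEL_* and model_* names are
illustrative and are not the macros defined in the new apple-aic.h header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative names for the values of the 1st cell. */
enum model_aic_type {
    MODEL_AIC_IRQ = 0,
    MODEL_AIC_FIQ = 1,
};

struct model_aic_spec {
    uint32_t type;    /* 1st cell: 0 = hardware IRQ, 1 = FIQ */
    uint32_t number;  /* 2nd cell: IRQ number or FIQ source */
    uint32_t flags;   /* 3rd cell: normally IRQ_TYPE_LEVEL_HIGH (4) */
};

/* Return a human-readable description of what a specifier refers to. */
static const char *model_aic_describe(const struct model_aic_spec *spec)
{
    static const char *const fiq_names[] = {
        "physical HV timer",
        "virtual HV timer",
        "physical guest timer",
        "virtual guest timer",
    };

    assert(spec != NULL);
    switch (spec->type) {
    case MODEL_AIC_IRQ:
        return "hardware IRQ";
    case MODEL_AIC_FIQ:
        if (spec->number < sizeof(fiq_names) / sizeof(fiq_names[0]))
            return fiq_names[spec->number];
        return "unknown FIQ";
    default:
        return "invalid type";
    }
}
```

So a specifier of <1 2 4> would refer to the physical guest timer FIQ,
level-high, and <0 N 4> to hardware IRQ number N.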

Signed-off-by: Hector Martin 
Reviewed-by: Linus Walleij 
---
 .../interrupt-controller/apple,aic.yaml   | 88 +++
 MAINTAINERS   |  1 +
 .../interrupt-controller/apple-aic.h  | 15 
 3 files changed, 104 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
 create mode 100644 include/dt-bindings/interrupt-controller/apple-aic.h

diff --git a/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml b/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
new file mode 100644
index 000000000000..cf6c091a07b1
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
@@ -0,0 +1,88 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/apple,aic.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Apple Interrupt Controller
+
+maintainers:
+  - Hector Martin 
+
+description: |
+  The Apple Interrupt Controller is a simple interrupt controller present on
+  Apple ARM SoC platforms, including various iPhone and iPad devices and the
+  "Apple Silicon" Macs.
+
+  It provides the following features:
+
+  - Level-triggered hardware IRQs wired to SoC blocks
+      - Single mask bit per IRQ
+      - Per-IRQ affinity setting
+      - Automatic masking on event delivery (auto-ack)
+      - Software triggering (ORed with hw line)
+  - 2 per-CPU IPIs (meant as "self" and "other", but they are interchangeable
+    if not symmetric)
+  - Automatic prioritization (single event/ack register per CPU, lower IRQs =
+    higher priority)
+  - Automatic masking on ack
+  - Default "this CPU" register view and explicit per-CPU views
+
+  This device also represents the FIQ interrupt sources on platforms using AIC,
+  which do not go through a discrete interrupt controller.
+
+allOf:
+  - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+  compatible:
+    items:
+      - const: apple,t8103-aic
+      - const: apple,aic
+
+  interrupt-controller: true
+
+  '#interrupt-cells':
+    const: 3
+    description: |
+      The 1st cell contains the interrupt type:
+        - 0: Hardware IRQ
+        - 1: FIQ
+
+      The 2nd cell contains the interrupt number.
+        - HW IRQs: interrupt number
+        - FIQs:
+          - 0: physical HV timer
+          - 1: virtual HV timer
+          - 2: physical guest timer
+          - 3: virtual guest timer
+
+      The 3rd cell contains the interrupt flags. This is normally
+      IRQ_TYPE_LEVEL_HIGH (4).
+
+  reg:
+    description: |
+      Specifies base physical address and size of the AIC registers.
+    maxItems: 1
+
+required:
+  - compatible
+  - '#interrupt-cells'
+  - interrupt-controller
+  - reg
+
+additionalProperties: false
+
+examples:
+  - |
+    soc {
+        #address-cells = <2>;
+        #size-cells = <2>;
+
+        aic: interrupt-controller@23b10 {
+            compatible = "apple,t8103-aic", "apple,aic";
+            #interrupt-cells = <3>;
+            interrupt-controller;
+            reg = <0x2 0x3b10 0x0 0x8000>;
+        };
+    };
diff --git a/MAINTAINERS b/MAINTAINERS
index 3a352c687d4b..744e086d28cd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1646,6 +1646,7 @@ B: https://github.com/AsahiLinux/linux/issues
 C: irc://chat.freenode.net/asahi-dev
 T: git https://github.com/AsahiLinux/linux.git
 F: Documentation/devicetree/bindings/arm/apple.yaml
+F: Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
 F: arch/arm64/include/asm/sysreg_apple.h
 
 ARM/ARTPEC MACHINE SUPPORT
diff --git a/include/dt-bindings/interrupt-controller/apple-aic.h 
b/include/dt-bindings/interrupt-controller/apple-aic.h
new file mode 100644
index ..9ac56a7e6d3f
--- /dev/null
+++ b/include/dt-bindings/interrupt-controller/apple-aic.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
+#ifndef _DT_BINDINGS_INTERRUPT_CONTROLLER_APPLE_AIC_H
+#define _DT_BINDINGS_INTERRUPT_CONTROLLER_APPLE_AIC_H
+
+#include 
+
+#define AIC_IRQ 0
+#define AIC_FIQ 1
+
+#define AIC_TMR_HV_PHYS 0
+#define AIC_TMR_HV_VIRT 1
+#define AIC_TMR_GUEST_PHYS 2
+#define AIC_TMR_GUEST_VIRT 3
+
+#endif
-- 
2.30.0



[RFT PATCH v3 11/27] arm64: Implement ioremap_np() to map MMIO as nGnRnE

2021-03-04 Thread Hector Martin
This is used on Apple ARM platforms, which require most MMIO
(except PCI devices) to be mapped as nGnRnE.

Signed-off-by: Hector Martin 
---
 arch/arm64/include/asm/io.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 5ea8656a2030..953b8703af60 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -169,6 +169,7 @@ extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size);
 
 #define ioremap(addr, size)    __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
 #define ioremap_wc(addr, size) __ioremap((addr), (size), __pgprot(PROT_NORMAL_NC))
+#define ioremap_np(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRnE))
 
 /*
  * PCI configuration space mapping function.
-- 
2.30.0



[RFT PATCH v3 13/27] arm64: Add Apple vendor-specific system registers

2021-03-04 Thread Hector Martin
Apple ARM64 SoCs have a ton of vendor-specific registers we're going to
have to deal with, and those don't really belong in sysreg.h with all
the architectural registers. Make a new home for them, and add some
registers which are useful for early bring-up.

Signed-off-by: Hector Martin 
---
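Reader's note, not part of the commit: the IPI request register fields defined in this patch can be combined as sketched below. This is a userspace illustration only. GENMASK64() and FIELD_PUT() are local stand-ins for the kernel's GENMASK() and FIELD_PREP() helpers, ipi_rr_global() is a hypothetical name, and the assumption that the three fields are simply ORed into one register value is an inference from the mask definitions, not something the patch states.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's GENMASK() and FIELD_PREP(). */
#define GENMASK64(h, l)      ((~0ULL >> (63 - (h))) & (~0ULL << (l)))
#define FIELD_PUT(mask, val) (((uint64_t)(val) * ((mask) & -(mask))) & (mask))

/* Field layout copied from the definitions in this patch. */
#define IPI_RR_CPU       GENMASK64(7, 0)
#define IPI_RR_CLUSTER   GENMASK64(23, 16)   /* GLOBAL register only */
#define IPI_RR_TYPE      GENMASK64(29, 28)
#define IPI_RR_IMMEDIATE 0
#define IPI_RR_DEFERRED  2

/* Build a value for SYS_APL_IPI_RR_GLOBAL_EL1 (hypothetical helper). */
static uint64_t ipi_rr_global(unsigned int cluster, unsigned int cpu,
			      unsigned int type)
{
	return FIELD_PUT(IPI_RR_CLUSTER, cluster) |
	       FIELD_PUT(IPI_RR_CPU, cpu) |
	       FIELD_PUT(IPI_RR_TYPE, type);
}
```

Multiplying by the lowest set bit of the mask shifts the value into place without a separate shift constant, which is the same idea FIELD_PREP() uses.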
 MAINTAINERS   |  1 +
 arch/arm64/include/asm/sysreg_apple.h | 69 +++
 2 files changed, 70 insertions(+)
 create mode 100644 arch/arm64/include/asm/sysreg_apple.h

diff --git a/MAINTAINERS b/MAINTAINERS
index aec14fbd61b8..3a352c687d4b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1646,6 +1646,7 @@ B: https://github.com/AsahiLinux/linux/issues
 C: irc://chat.freenode.net/asahi-dev
 T: git https://github.com/AsahiLinux/linux.git
 F: Documentation/devicetree/bindings/arm/apple.yaml
+F: arch/arm64/include/asm/sysreg_apple.h
 
 ARM/ARTPEC MACHINE SUPPORT
 M: Jesper Nilsson 
diff --git a/arch/arm64/include/asm/sysreg_apple.h 
b/arch/arm64/include/asm/sysreg_apple.h
new file mode 100644
index ..48347a51d564
--- /dev/null
+++ b/arch/arm64/include/asm/sysreg_apple.h
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Apple SoC vendor-defined system register definitions
+ *
+ * Copyright The Asahi Linux Contributors
+ *
+ * This file contains only well-understood registers that are useful to
+ * Linux. If you are looking for things to add here, you should visit:
+ *
+ * https://github.com/AsahiLinux/docs/wiki/HW:ARM-System-Registers
+ */
+
+#ifndef __ASM_SYSREG_APPLE_H
+#define __ASM_SYSREG_APPLE_H
+
+#include 
+#include 
+#include 
+
+/*
+ * Keep these registers in encoding order, except for register arrays;
+ * those should be listed in array order starting from the position of
+ * the encoding of the first register.
+ */
+
+#define SYS_APL_PMCR0_EL1  sys_reg(3, 1, 15, 0, 0)
+#define PMCR0_IMODE        GENMASK(10, 8)
+#define PMCR0_IMODE_OFF    0
+#define PMCR0_IMODE_PMI    1
+#define PMCR0_IMODE_AIC    2
+#define PMCR0_IMODE_HALT   3
+#define PMCR0_IMODE_FIQ    4
+#define PMCR0_IACT         BIT(11)
+
+/* IPI request registers */
+#define SYS_APL_IPI_RR_LOCAL_EL1   sys_reg(3, 5, 15, 0, 0)
+#define SYS_APL_IPI_RR_GLOBAL_EL1  sys_reg(3, 5, 15, 0, 1)
+#define IPI_RR_CPU GENMASK(7, 0)
+/* Cluster only used for the GLOBAL register */
+#define IPI_RR_CLUSTER GENMASK(23, 16)
+#define IPI_RR_TYPE        GENMASK(29, 28)
+#define IPI_RR_IMMEDIATE   0
+#define IPI_RR_RETRACT     1
+#define IPI_RR_DEFERRED    2
+#define IPI_RR_NOWAKE      3
+
+/* IPI status register */
+#define SYS_APL_IPI_SR_EL1 sys_reg(3, 5, 15, 1, 1)
+#define IPI_SR_PENDING BIT(0)
+
+/* Guest timer FIQ enable register */
+#define SYS_APL_VM_TMR_FIQ_ENA_EL1 sys_reg(3, 5, 15, 1, 3)
+#define VM_TMR_FIQ_ENABLE_V BIT(0)
+#define VM_TMR_FIQ_ENABLE_P BIT(1)
+
+/* Deferred IPI countdown register */
+#define SYS_APL_IPI_CR_EL1 sys_reg(3, 5, 15, 3, 1)
+
+#define SYS_APL_UPMCR0_EL1 sys_reg(3, 7, 15, 0, 4)
+#define UPMCR0_IMODE   GENMASK(18, 16)
+#define UPMCR0_IMODE_OFF   0
+#define UPMCR0_IMODE_AIC   2
+#define UPMCR0_IMODE_HALT  3
+#define UPMCR0_IMODE_FIQ   4
+
+#define SYS_APL_UPMSR_EL1  sys_reg(3, 7, 15, 6, 4)
+#define UPMSR_IACT BIT(0)
+
+#endif /* __ASM_SYSREG_APPLE_H */
-- 
2.30.0



[RFT PATCH v3 14/27] arm64: move ICH_ sysreg bits from arm-gic-v3.h to sysreg.h

2021-03-04 Thread Hector Martin
These definitions are in arm-gic-v3.h for historical reasons which no
longer apply. Move them to sysreg.h so the AIC driver can use them, as
it needs to peek into vGIC registers to deal with the GIC maintenance
interrupt.

Signed-off-by: Hector Martin 
---
 arch/arm64/include/asm/sysreg.h| 60 ++
 include/linux/irqchip/arm-gic-v3.h | 56 
 2 files changed, 60 insertions(+), 56 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index dfd4edbfe360..645926490ada 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1024,6 +1024,66 @@
 #define TRFCR_ELx_ExTRE BIT(1)
 #define TRFCR_ELx_E0TRE BIT(0)
 
+
+/* GIC Hypervisor interface registers */
+/* ICH_MISR_EL2 bit definitions */
+#define ICH_MISR_EOI   (1 << 0)
+#define ICH_MISR_U (1 << 1)
+
+/* ICH_LR*_EL2 bit definitions */
+#define ICH_LR_VIRTUAL_ID_MASK ((1ULL << 32) - 1)
+
+#define ICH_LR_EOI (1ULL << 41)
+#define ICH_LR_GROUP   (1ULL << 60)
+#define ICH_LR_HW  (1ULL << 61)
+#define ICH_LR_STATE   (3ULL << 62)
+#define ICH_LR_PENDING_BIT (1ULL << 62)
+#define ICH_LR_ACTIVE_BIT  (1ULL << 63)
+#define ICH_LR_PHYS_ID_SHIFT   32
+#define ICH_LR_PHYS_ID_MASK    (0x3ffULL << ICH_LR_PHYS_ID_SHIFT)
+#define ICH_LR_PRIORITY_SHIFT  48
+#define ICH_LR_PRIORITY_MASK   (0xffULL << ICH_LR_PRIORITY_SHIFT)
+
+/* ICH_HCR_EL2 bit definitions */
+#define ICH_HCR_EN (1 << 0)
+#define ICH_HCR_UIE    (1 << 1)
+#define ICH_HCR_NPIE   (1 << 3)
+#define ICH_HCR_TC (1 << 10)
+#define ICH_HCR_TALL0  (1 << 11)
+#define ICH_HCR_TALL1  (1 << 12)
+#define ICH_HCR_EOIcount_SHIFT 27
+#define ICH_HCR_EOIcount_MASK  (0x1f << ICH_HCR_EOIcount_SHIFT)
+
+/* ICH_VMCR_EL2 bit definitions */
+#define ICH_VMCR_ACK_CTL_SHIFT 2
+#define ICH_VMCR_ACK_CTL_MASK  (1 << ICH_VMCR_ACK_CTL_SHIFT)
+#define ICH_VMCR_FIQ_EN_SHIFT  3
+#define ICH_VMCR_FIQ_EN_MASK   (1 << ICH_VMCR_FIQ_EN_SHIFT)
+#define ICH_VMCR_CBPR_SHIFT    4
+#define ICH_VMCR_CBPR_MASK     (1 << ICH_VMCR_CBPR_SHIFT)
+#define ICH_VMCR_EOIM_SHIFT    9
+#define ICH_VMCR_EOIM_MASK     (1 << ICH_VMCR_EOIM_SHIFT)
+#define ICH_VMCR_BPR1_SHIFT    18
+#define ICH_VMCR_BPR1_MASK     (7 << ICH_VMCR_BPR1_SHIFT)
+#define ICH_VMCR_BPR0_SHIFT    21
+#define ICH_VMCR_BPR0_MASK     (7 << ICH_VMCR_BPR0_SHIFT)
+#define ICH_VMCR_PMR_SHIFT     24
+#define ICH_VMCR_PMR_MASK      (0xffUL << ICH_VMCR_PMR_SHIFT)
+#define ICH_VMCR_ENG0_SHIFT    0
+#define ICH_VMCR_ENG0_MASK     (1 << ICH_VMCR_ENG0_SHIFT)
+#define ICH_VMCR_ENG1_SHIFT    1
+#define ICH_VMCR_ENG1_MASK     (1 << ICH_VMCR_ENG1_SHIFT)
+
+/* ICH_VTR_EL2 bit definitions */
+#define ICH_VTR_PRI_BITS_SHIFT 29
+#define ICH_VTR_PRI_BITS_MASK  (7 << ICH_VTR_PRI_BITS_SHIFT)
+#define ICH_VTR_ID_BITS_SHIFT  23
+#define ICH_VTR_ID_BITS_MASK   (7 << ICH_VTR_ID_BITS_SHIFT)
+#define ICH_VTR_SEIS_SHIFT 22
+#define ICH_VTR_SEIS_MASK  (1 << ICH_VTR_SEIS_SHIFT)
+#define ICH_VTR_A3V_SHIFT  21
+#define ICH_VTR_A3V_MASK   (1 << ICH_VTR_A3V_SHIFT)
+
 #ifdef __ASSEMBLY__
 
.irp num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
diff --git a/include/linux/irqchip/arm-gic-v3.h 
b/include/linux/irqchip/arm-gic-v3.h
index f6d092fdb93d..81cbf85f73de 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -575,67 +575,11 @@
 #define ICC_SRE_EL1_DFB (1U << 1)
 #define ICC_SRE_EL1_SRE (1U << 0)
 
-/*
- * Hypervisor interface registers (SRE only)
- */
-#define ICH_LR_VIRTUAL_ID_MASK ((1ULL << 32) - 1)
-
-#define ICH_LR_EOI (1ULL << 41)
-#define ICH_LR_GROUP   (1ULL << 60)
-#define ICH_LR_HW  (1ULL << 61)
-#define ICH_LR_STATE   (3ULL << 62)
-#define ICH_LR_PENDING_BIT (1ULL << 62)
-#define ICH_LR_ACTIVE_BIT  (1ULL << 63)
-#define ICH_LR_PHYS_ID_SHIFT   32
-#define ICH_LR_PHYS_ID_MASK    (0x3ffULL << ICH_LR_PHYS_ID_SHIFT)
-#define ICH_LR_PRIORITY_SHIFT  48
-#define ICH_LR_PRIORITY_MASK   (0xffULL << ICH_LR_PRIORITY_SHIFT)
-
 /* These are for GICv2 emulation only */
 #define GICH_LR_VIRTUALID  (0x3ffUL << 0)
 #define GICH_LR_PHYSID_CPUID_SHIFT (10)
 #define GICH_LR_PHYSID_CPUID   (7UL << GICH_LR_PHYSID_CPUID_SHIFT)
 
-#define ICH_MISR_EOI   (1 << 0)
-#define ICH_MISR_U (1 << 1)
-
-#define ICH_HCR_EN (1 << 

[RFT PATCH v3 16/27] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-03-04 Thread Hector Martin
This is the root interrupt controller used on Apple ARM SoCs such as the
M1. This irqchip driver performs multiple functions:

* Handles both IRQs and FIQs

* Drives the AIC peripheral itself (which handles IRQs)

* Dispatches FIQs to downstream hard-wired clients (currently the ARM
  timer).

* Implements a virtual IPI multiplexer to funnel multiple Linux IPIs
  into a single hardware IPI

Signed-off-by: Hector Martin 
---
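Reader's note, not part of the commit: the "virtual IPI multiplexer" idea from the description above can be modelled in userspace as sketched below. The driver source is cut off in this archive, so this is not the driver's implementation. It is a minimal sketch of the general technique (a per-CPU pending bitmask funnelled through one hardware IPI), and every name in it (vipi_pending, hw_ipi_raised, vipi_send, vipi_handle, NR_CPUS_MODEL) is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4

/* One bit per virtual IPI, one word per target CPU. */
static _Atomic uint32_t vipi_pending[NR_CPUS_MODEL];

/* Models the single hardware IPI line per CPU. */
static bool hw_ipi_raised[NR_CPUS_MODEL];

/* Sender: record which virtual IPI is wanted, then ring the one real IPI. */
static void vipi_send(unsigned int cpu, unsigned int vipi)
{
	atomic_fetch_or(&vipi_pending[cpu], UINT32_C(1) << vipi);
	hw_ipi_raised[cpu] = true;
}

/*
 * Receiver: ack the hardware IPI first, then atomically take all pending
 * virtual IPIs. Returns the bitmask of virtual IPIs to dispatch.
 */
static uint32_t vipi_handle(unsigned int cpu)
{
	hw_ipi_raised[cpu] = false;
	return atomic_exchange(&vipi_pending[cpu], 0);
}
```

Acking before reading the bitmask means a virtual IPI that arrives during handling raises the hardware line again instead of being lost. A real driver also needs memory barriers and per-CPU enable masks, which this sketch leaves out.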
 MAINTAINERS |   2 +
 drivers/irqchip/Kconfig |   8 +
 drivers/irqchip/Makefile|   1 +
 drivers/irqchip/irq-apple-aic.c | 710 
 include/linux/cpuhotplug.h  |   1 +
 5 files changed, 722 insertions(+)
 create mode 100644 drivers/irqchip/irq-apple-aic.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 744e086d28cd..28bd46f4f7a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1648,6 +1648,8 @@ T: git https://github.com/AsahiLinux/linux.git
 F: Documentation/devicetree/bindings/arm/apple.yaml
 F: Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
 F: arch/arm64/include/asm/sysreg_apple.h
+F: drivers/irqchip/irq-apple-aic.c
+F: include/dt-bindings/interrupt-controller/apple-aic.h
 
 ARM/ARTPEC MACHINE SUPPORT
 M: Jesper Nilsson 
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 15536e321df5..d3a14f304ec8 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -577,4 +577,12 @@ config MST_IRQ
help
  Support MStar Interrupt Controller.
 
+config APPLE_AIC
+   bool "Apple Interrupt Controller (AIC)"
+   depends on ARM64
+   default ARCH_APPLE
+   help
+ Support for the Apple Interrupt Controller found on Apple Silicon SoCs,
+ such as the M1.
+
 endmenu
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index c59b95a0532c..eb6a515f0f64 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -113,3 +113,4 @@ obj-$(CONFIG_LOONGSON_PCH_MSI)  += irq-loongson-pch-msi.o
 obj-$(CONFIG_MST_IRQ)          += irq-mst-intc.o
 obj-$(CONFIG_SL28CPLD_INTC)    += irq-sl28cpld.o
 obj-$(CONFIG_MACH_REALTEK_RTL) += irq-realtek-rtl.o
+obj-$(CONFIG_APPLE_AIC)        += irq-apple-aic.o
diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
new file mode 100644
index ..ddc0856f36a5
--- /dev/null
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -0,0 +1,710 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright The Asahi Linux Contributors
+ *
+ * Based on irq-lpc32xx:
+ *   Copyright 2015-2016 Vladimir Zapolskiy 
+ * Based on irq-bcm2836:
+ *   Copyright 2015 Broadcom
+ */
+
+/*
+ * AIC is a fairly simple interrupt controller with the following features:
+ *
+ * - 896 level-triggered hardware IRQs
+ *   - Single mask bit per IRQ
+ *   - Per-IRQ affinity setting
+ *   - Automatic masking on event delivery (auto-ack)
+ *   - Software triggering (ORed with hw line)
+ * - 2 per-CPU IPIs (meant as "self" and "other", but they are
+ *   interchangeable if not symmetric)
+ * - Automatic prioritization (single event/ack register per CPU, lower IRQs =
+ *   higher priority)
+ * - Automatic masking on ack
+ * - Default "this CPU" register view and explicit per-CPU views
+ *
+ * In addition, this driver also handles FIQs, as these are routed to the same
+ * IRQ vector. These are used for Fast IPIs (TODO), the ARMv8 timer IRQs, and
+ * performance counters (TODO).
+ *
+ * Implementation notes:
+ *
+ * - This driver creates two IRQ domains, one for HW IRQs and internal FIQs,
+ *   and one for IPIs.
+ * - Since Linux needs more than 2 IPIs, we implement a software IRQ controller
+ *   and funnel all IPIs into one per-CPU IPI (the second "self" IPI is
+ *   unused).
+ * - FIQ hwirq numbers are assigned after true hwirqs, and are per-cpu.
+ * - DT bindings use 3-cell form (like GIC):
+ *   - <0 nr flags> - hwirq #nr
+ *   - <1 nr flags> - FIQ #nr
+ * - nr=0  Physical HV timer
+ * - nr=1  Virtual HV timer
+ * - nr=2  Physical guest timer
+ * - nr=3  Virtual guest timer
+ *
+ */
+
+#define pr_fmt(fmt) "%s: " fmt, __func__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define AIC_INFO   0x0004
+#define AIC_INFO_NR_HW GENMASK(15, 0)
+
+#define AIC_CONFIG 0x0010
+
+#define AIC_WHOAMI 0x2000
+#define AIC_EVENT  0x2004
+#define AIC_EVENT_TYPE GENMASK(31, 16)
+#define AIC_EVENT_NUM  GENMASK(15, 0)
+
+#define AIC_EVENT_TYPE_HW  1
+#define AIC_EVENT_TYPE_IPI 4
+#define AIC_EVENT_IPI_OTHER1
+#define AIC_EVENT_IPI_SELF 2
+
+#define AIC_IPI_SEND   0x2008
+#define AIC_IPI_ACK0x200c
+#define AIC_IPI_MASK_SET   0x2024
+#define AIC_IPI_MAS

[RFT PATCH v3 08/27] asm-generic/io.h: Add a non-posted variant of ioremap()

2021-03-04 Thread Hector Martin
ARM64 currently defaults to posted MMIO (nGnRE), but some devices
require the use of non-posted MMIO (nGnRnE). Introduce a new ioremap()
variant to handle this case. ioremap_np() returns NULL on arches that
do not implement this variant.

sparc64 is the only architecture that needs to be touched directly,
because it includes neither of the generic io.h or iomap.h headers.

This adds the IORESOURCE_MEM_NONPOSTED flag, which maps to this
variant and marks a given resource as requiring non-posted mappings.
This is implemented in the resource system because it is a SoC-level
requirement, so existing drivers do not need special-case code to pick
this ioremap variant.

Then this is implemented in devres by introducing devm_ioremap_np(),
and making devm_ioremap_resource() automatically select this variant
when the resource has the IORESOURCE_MEM_NONPOSTED flag set.

Signed-off-by: Hector Martin 
---
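Reader's note, not part of the commit: the variant selection described above for devm_ioremap_resource() can be modelled in userspace as sketched below. struct resource_model, enum map_kind and pick_mapping() are hypothetical names, and the flag values are stand-ins for this sketch rather than values quoted from the patch (the include/linux/ioport.h hunk is not shown in this archive).

```c
#include <stddef.h>

/* Stand-in flag values for this sketch only. */
#define IORESOURCE_MEM           0x00000200UL
#define IORESOURCE_MEM_NONPOSTED 0x00000400UL

enum map_kind { MAP_NONE, MAP_POSTED, MAP_NONPOSTED };

/* Reduced model of struct resource: just the fields the choice depends on. */
struct resource_model {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/* Pick the mapping type the way the commit message describes it. */
static enum map_kind pick_mapping(const struct resource_model *res)
{
	if (!res || !(res->flags & IORESOURCE_MEM))
		return MAP_NONE;
	if (res->flags & IORESOURCE_MEM_NONPOSTED)
		return MAP_NONPOSTED;	/* kernel side: devm_ioremap_np() */
	return MAP_POSTED;		/* kernel side: devm_ioremap() */
}
```

Because the decision hangs off a resource flag, a driver that already calls devm_ioremap_resource() gets the right mapping without any platform-specific code of its own.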
 .../driver-api/driver-model/devres.rst|  1 +
 arch/sparc/include/asm/io_64.h|  4 
 include/asm-generic/io.h  | 22 ++-
 include/asm-generic/iomap.h   |  9 
 include/linux/io.h|  2 ++
 include/linux/ioport.h|  1 +
 lib/devres.c  | 22 +++
 7 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/Documentation/driver-api/driver-model/devres.rst 
b/Documentation/driver-api/driver-model/devres.rst
index cd8b6e657b94..2f45877a539d 100644
--- a/Documentation/driver-api/driver-model/devres.rst
+++ b/Documentation/driver-api/driver-model/devres.rst
@@ -309,6 +309,7 @@ IOMAP
   devm_ioremap()
   devm_ioremap_uc()
   devm_ioremap_wc()
+  devm_ioremap_np()
   devm_ioremap_resource() : checks resource, requests memory region, ioremaps
   devm_ioremap_resource_wc()
   devm_platform_ioremap_resource() : calls devm_ioremap_resource() for platform device
diff --git a/arch/sparc/include/asm/io_64.h b/arch/sparc/include/asm/io_64.h
index 9bb27e5c22f1..9fbfc9574432 100644
--- a/arch/sparc/include/asm/io_64.h
+++ b/arch/sparc/include/asm/io_64.h
@@ -409,6 +409,10 @@ static inline void __iomem *ioremap(unsigned long offset, unsigned long size)
 #define ioremap_uc(X,Y)    ioremap((X),(Y))
 #define ioremap_wc(X,Y)    ioremap((X),(Y))
 #define ioremap_wt(X,Y)    ioremap((X),(Y))
+static inline void __iomem *ioremap_np(unsigned long offset, unsigned long size)
+{
+   return NULL;
+}
 
 static inline void iounmap(volatile void __iomem *addr)
 {
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index c6af40ce03be..082e0c96db6e 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -942,7 +942,9 @@ static inline void *phys_to_virt(unsigned long address)
  *
  * ioremap_wc() and ioremap_wt() can provide more relaxed caching attributes
  * for specific drivers if the architecture choses to implement them.  If they
- * are not implemented we fall back to plain ioremap.
+ * are not implemented we fall back to plain ioremap. Conversely, ioremap_np()
+ * can provide stricter non-posted write semantics if the architecture
+ * implements them.
  */
 #ifndef CONFIG_MMU
 #ifndef ioremap
@@ -993,6 +995,24 @@ static inline void __iomem *ioremap_uc(phys_addr_t offset, size_t size)
 {
return NULL;
 }
+
+/*
+ * ioremap_np needs an explicit architecture implementation, as it
+ * requests stronger semantics than regular ioremap(). Portable drivers
+ * should instead use one of the higher-level abstractions, like
+ * devm_ioremap_resource(), to choose the correct variant for any given
+ * device and bus. Portable drivers with a good reason to want non-posted
+ * write semantics should always provide an ioremap() fallback in case
+ * ioremap_np() is not available.
+ */
+#ifndef ioremap_np
+#define ioremap_np ioremap_np
+static inline void __iomem *ioremap_np(phys_addr_t offset, size_t size)
+{
+   return NULL;
+}
+#endif
+
 #endif
 
 #ifdef CONFIG_HAS_IOPORT_MAP
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 649224664969..9b3eb6d86200 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -101,6 +101,15 @@ extern void ioport_unmap(void __iomem *);
 #define ioremap_wt ioremap
 #endif
 
+#ifndef ARCH_HAS_IOREMAP_NP
+/* See the comment in asm-generic/io.h about ioremap_np(). */
+#define ioremap_np ioremap_np
+static inline void __iomem *ioremap_np(phys_addr_t offset, size_t size)
+{
+   return NULL;
+}
+#endif
+
 #ifdef CONFIG_PCI
 /* Destroy a virtual mapping cookie for a PCI BAR (memory or IO) */
 struct pci_dev;
diff --git a/include/linux/io.h b/include/linux/io.h
index 8394c56babc2..d718354ed3e1 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -68,6 +68,8 @@ void __iomem *devm_ioremap_uc(struct device *dev, 
resource_size_t offset

[RFT PATCH v3 12/27] of/address: Add infrastructure to declare MMIO as non-posted

2021-03-04 Thread Hector Martin
This implements the 'nonposted-mmio' and 'posted-mmio' boolean
properties. Placing these properties in a bus marks all child devices as
requiring non-posted or posted MMIO mappings. If no such properties are
found, the default is posted MMIO.

of_mmio_is_nonposted() performs the tree walking to determine if a given
device has requested non-posted MMIO.

of_address_to_resource() uses this to set the IORESOURCE_MEM_NONPOSTED
flag on resources that require non-posted MMIO.

of_iomap() and of_io_request_and_map() then use this flag to pick the
correct ioremap() variant.

This mechanism is currently restricted to Apple ARM platforms, as an
optimization.

Signed-off-by: Hector Martin 
---
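Reader's note, not part of the commit: the bus walk that of_mmio_is_nonposted() performs can be modelled in userspace as sketched below. struct dt_node and mmio_is_nonposted() are hypothetical stand-ins for struct device_node and the real function. The sketch keeps the same order of checks as the patch, but leaves out node reference counting and the Apple-only quirk gate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced model of a devicetree node: parent link plus three properties. */
struct dt_node {
	const struct dt_node *parent;
	bool has_ranges;	/* node has a 'ranges' property */
	bool nonposted_mmio;	/* node has 'nonposted-mmio' */
	bool posted_mmio;	/* node has 'posted-mmio' */
};

/* Walk up from the device's parent bus until a decision can be made. */
static bool mmio_is_nonposted(const struct dt_node *dev)
{
	const struct dt_node *bus;

	for (bus = dev ? dev->parent : NULL; bus; bus = bus->parent) {
		if (!bus->has_ranges)
			return false;	/* non-translatable boundary */
		if (bus->nonposted_mmio)
			return true;
		if (bus->posted_mmio)
			return false;	/* overrides any outer bus */
	}
	return false;			/* default: posted MMIO */
}
```

In this model, a PCIe host bridge marked 'posted-mmio' under an SoC bus marked 'nonposted-mmio' gives posted mappings to PCI devices, while a UART sitting directly on the SoC bus gets a non-posted mapping.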
 drivers/of/address.c   | 72 --
 include/linux/of_address.h |  1 +
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 73ddf2540f3f..6114dceb1ba6 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -847,6 +847,9 @@ static int __of_address_to_resource(struct device_node *dev,
return -EINVAL;
memset(r, 0, sizeof(struct resource));
 
+   if (of_mmio_is_nonposted(dev))
+   flags |= IORESOURCE_MEM_NONPOSTED;
+
r->start = taddr;
r->end = taddr + size - 1;
r->flags = flags;
@@ -896,7 +899,10 @@ void __iomem *of_iomap(struct device_node *np, int index)
    if (of_address_to_resource(np, index, &res))
            return NULL;
 
-   return ioremap(res.start, resource_size(&res));
+   if (res.flags & IORESOURCE_MEM_NONPOSTED)
+           return ioremap_np(res.start, resource_size(&res));
+   else
+           return ioremap(res.start, resource_size(&res));
 }
 EXPORT_SYMBOL(of_iomap);
 
@@ -928,7 +934,11 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
    if (!request_mem_region(res.start, resource_size(&res), name))
            return IOMEM_ERR_PTR(-EBUSY);
 
-   mem = ioremap(res.start, resource_size(&res));
+   if (res.flags & IORESOURCE_MEM_NONPOSTED)
+           mem = ioremap_np(res.start, resource_size(&res));
+   else
+           mem = ioremap(res.start, resource_size(&res));
+
    if (!mem) {
            release_mem_region(res.start, resource_size(&res));
            return IOMEM_ERR_PTR(-ENOMEM);
@@ -1094,3 +1104,61 @@ bool of_dma_is_coherent(struct device_node *np)
return false;
 }
 EXPORT_SYMBOL_GPL(of_dma_is_coherent);
+
+static bool of_nonposted_mmio_quirk(void)
+{
+   if (IS_ENABLED(CONFIG_ARCH_APPLE)) {
+           /* To save cycles, we cache the result for global "Apple ARM" setting */
+           static int quirk_state = -1;
+
+           /* Make quirk cached */
+           if (quirk_state < 0)
+                   quirk_state = of_machine_is_compatible("apple,arm-platform");
+           return !!quirk_state;
+   }
+   return false;
+}
+
+/**
+ * of_mmio_is_nonposted - Check if device uses non-posted MMIO
+ * @np:device node
+ *
+ * Returns true if the "nonposted-mmio" property was found for
+ * the device's bus or a parent. "posted-mmio" has the opposite
+ * effect, terminating recursion and overriding any
+ * "nonposted-mmio" properties in parent buses.
+ *
+ * Recursion terminates if we reach a non-translatable boundary
+ * (a node without a 'ranges' property).
+ *
+ * This is currently only enabled on Apple ARM devices, as an
+ * optimization.
+ */
+bool of_mmio_is_nonposted(struct device_node *np)
+{
+   struct device_node *node;
+   struct device_node *parent;
+
+   if (!of_nonposted_mmio_quirk())
+   return false;
+
+   node = of_get_parent(np);
+
+   while (node) {
+   if (!of_property_read_bool(node, "ranges")) {
+   break;
+   } else if (of_property_read_bool(node, "nonposted-mmio")) {
+   of_node_put(node);
+   return true;
+   } else if (of_property_read_bool(node, "posted-mmio")) {
+   break;
+   }
+   parent = of_get_parent(node);
+   of_node_put(node);
+   node = parent;
+   }
+
+   of_node_put(node);
+   return false;
+}
+EXPORT_SYMBOL_GPL(of_mmio_is_nonposted);
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 88bc943405cd..88f6333fee6c 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -62,6 +62,7 @@ extern struct of_pci_range *of_pci_range_parser_one(
struct of_pci_range_parser *parser,
struct of_pci_range *range);
 extern bool of_dma_is_coherent(struct device_node *np);
+extern bool of_mmio_is_nonposted(struct device_node *np);
 #else /* CONFIG_OF_ADDRESS */
 static inline void __iomem *of_

[RFT PATCH v3 10/27] docs: driver-api: device-io: Document ioremap() variants & access funcs

2021-03-04 Thread Hector Martin
This documents the newly introduced ioremap_np() along with all the
other common ioremap() variants, and some higher-level abstractions
available.

Signed-off-by: Hector Martin 
---
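Reader's note, not part of the commit: the documentation added by this patch says a driver that wants non-posted writes where available should fall back to ioremap() when ioremap_np() returns NULL. A userspace model of that pattern is sketched below. All names here (remap_fn, remap_np_unsupported, remap_posted, map_prefer_nonposted) are hypothetical, and the two remap functions are test stand-ins rather than kernel APIs.

```c
#include <stddef.h>

typedef void *(*remap_fn)(unsigned long phys, size_t size);

/* Stand-in for an architecture without ioremap_np(): always NULL. */
static void *remap_np_unsupported(unsigned long phys, size_t size)
{
	(void)phys;
	(void)size;
	return NULL;
}

/* Stand-in for plain ioremap(): hands out a small fake window. */
static char fake_window[16];

static void *remap_posted(unsigned long phys, size_t size)
{
	(void)phys;
	return size <= sizeof(fake_window) ? fake_window : NULL;
}

/*
 * Try the non-posted mapping first and fall back to the posted one.
 * *nonposted tells the caller which kind it got.
 */
static void *map_prefer_nonposted(remap_fn np, remap_fn posted,
				  unsigned long phys, size_t size,
				  int *nonposted)
{
	void *base = np(phys, size);

	*nonposted = (base != NULL);
	if (!base)
		base = posted(phys, size);
	return base;
}
```

When the fallback path is taken, the driver still has to force write completion itself, for example with the dummy read after a write that the documentation recommends for posted mappings.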
 Documentation/driver-api/device-io.rst | 218 +
 1 file changed, 218 insertions(+)

diff --git a/Documentation/driver-api/device-io.rst 
b/Documentation/driver-api/device-io.rst
index b20864b3ddc7..0e12a1d3592b 100644
--- a/Documentation/driver-api/device-io.rst
+++ b/Documentation/driver-api/device-io.rst
@@ -284,6 +284,224 @@ insl, insw, insb, outsl, outsw, outsb
   first byte in the FIFO register corresponds to the first byte in the memory
   buffer regardless of the architecture.
 
+Device memory mapping modes
+===========================
+
+Some architectures support multiple modes for mapping device memory.
+ioremap_*() variants provide a common abstraction around these
+architecture-specific modes, with a shared set of semantics.
+
+ioremap() is the most common mapping type, and is applicable to typical device
+memory (e.g. I/O registers). Other modes can offer weaker or stronger
+guarantees, if supported by the architecture. In order from strongest to
+weakest, they are as follows:
+
+ioremap_np()
+------------
+
+Like ioremap(), but explicitly requests non-posted write semantics. On some
+architectures and buses, ioremap() mappings have posted write semantics, which
+means that writes can appear to "complete" from the point of view of the
+CPU before the written data actually arrives at the target device. Writes are
+still ordered with respect to other writes and reads from the same device, but
+due to the posted write semantics, this is not the case with respect to other
+devices. ioremap_np() explicitly requests non-posted semantics, which means
+that the write instruction will not appear to complete until the device has
+received (and to some platform-specific extent acknowledged) the written data.
+
+This mapping mode primarily exists to cater for platforms with bus fabrics that
+require this particular mapping mode to work correctly. These platforms set the
+``IORESOURCE_MEM_NONPOSTED`` flag for a resource that requires ioremap_np()
+semantics and portable drivers should use an abstraction that automatically
+selects it where appropriate (see the `Higher-level ioremap abstractions`_
+section below).
+
+The bare ioremap_np() is only available on some architectures; on others, it
+always returns NULL. Drivers should not normally use it, unless they are
+platform-specific or they derive benefit from non-posted writes where
+supported, and can fall back to ioremap() otherwise. The normal approach to
+ensure posted write completion is to do a dummy read after a write as
+explained in `Accessing the device`_, which works with ioremap() on all
+platforms.
+
+ioremap_np() should never be used for PCI drivers. PCI memory space writes are
+always posted, even on architectures that otherwise implement ioremap_np().
+Using ioremap_np() for PCI BARs will at best result in posted write semantics,
+and at worst result in complete breakage.
+
+Note that non-posted write semantics are orthogonal to CPU-side ordering
+guarantees. A CPU may still choose to issue other reads or writes before a
+non-posted write instruction retires. See the previous section on MMIO access
+functions for details on the CPU side of things.
+
+ioremap()
+---------
+
+The default mode, suitable for most memory-mapped devices, e.g. control
+registers. Memory mapped using ioremap() has the following characteristics:
+
+* Uncached - CPU-side caches are bypassed, and all reads and writes are handled
+  directly by the device
+* No speculative operations - the CPU may not issue a read or write to this
+  memory, unless the instruction that does so has been reached in committed
+  program flow.
+* No reordering - The CPU may not reorder accesses to this memory mapping with
+  respect to each other. On some architectures, this relies on barriers in
+  readl_relaxed()/writel_relaxed().
+* No repetition - The CPU may not issue multiple reads or writes for a single
+  program instruction.
+* No write-combining - Each I/O operation results in one discrete read or write
+  being issued to the device, and multiple writes are not combined into larger
+  writes. This may or may not be enforced when using __raw I/O accessors or
+  pointer dereferences.
+* Non-executable - The CPU is not allowed to speculate instruction execution
+  from this memory (it probably goes without saying, but you're also not
+  allowed to jump into device memory).
+
+On many platforms and buses (e.g. PCI), writes issued through ioremap()
+mappings are posted, which means that the CPU does not wait for the write to
+actually reach the target device before retiring the write instruction.
+
+On many platforms, I/O accesses must be aligned with respect to the access
+size; failure to do so will result in an exception or unpredictable results.
+
+

[RFT PATCH v3 02/27] dt-bindings: vendor-prefixes: Add apple prefix

2021-03-04 Thread Hector Martin
This is different from the legacy AAPL prefix used on PPC, but
consensus is that we prefer `apple` for these new platforms.

Signed-off-by: Hector Martin 
Reviewed-by: Krzysztof Kozlowski 
Reviewed-by: Linus Walleij 
---
 Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml 
b/Documentation/devicetree/bindings/vendor-prefixes.yaml
index f6064d84a424..7b59b6d3f526 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
+++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
@@ -103,6 +103,8 @@ patternProperties:
 description: Anvo-Systems Dresden GmbH
   "^apm,.*":
 description: Applied Micro Circuits Corporation (APM)
+  "^apple,.*":
+description: Apple Inc.
   "^aptina,.*":
 description: Aptina Imaging
   "^arasan,.*":
-- 
2.30.0



[RFT PATCH v3 01/27] arm64: Cope with CPUs stuck in VHE mode

2021-03-04 Thread Hector Martin
From: Marc Zyngier 

It seems that the CPU known as Apple M1 has the terrible habit
of being stuck with HCR_EL2.E2H==1, in violation of the architecture.

Try and work around this deplorable state of affairs by detecting
the stuck bit early and short-circuit the nVHE dance. It is still
unknown whether there are many more such nuggets to be found...

Reported-by: Hector Martin 
Signed-off-by: Marc Zyngier 
---
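Reader's note, not part of the commit: the early-boot decision this patch adds to init_el2 can be written out in C as sketched below. The real code is assembly in head.S, so this only models the branch. HCR_E2H as bit 34 of HCR_EL2 follows the Arm architecture definition, and enum boot_path and pick_el2_boot_path() are hypothetical names.

```c
#include <stdint.h>

/* HCR_EL2.E2H is bit 34 in the Arm architecture. */
#define HCR_E2H (UINT64_C(1) << 34)

enum boot_path {
	BOOT_NVHE_START,	/* E2H clear: continue with the usual nVHE setup */
	BOOT_STICK_TO_VHE,	/* E2H stuck at 1: eret into the VHE helper */
};

/* Model of the 'mrs x0, hcr_el2; and x0, x0, #HCR_E2H; cbz x0, 1f' test. */
static enum boot_path pick_el2_boot_path(uint64_t hcr_el2)
{
	return (hcr_el2 & HCR_E2H) ? BOOT_STICK_TO_VHE : BOOT_NVHE_START;
}
```

The check reads the register as it comes out of the bootloader, so a CPU whose E2H bit cannot be cleared is detected before any nVHE-only setup runs.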
 arch/arm64/kernel/head.S | 33 ++---
 arch/arm64/kernel/hyp-stub.S | 28 
 2 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 66b0e0b66e31..673002b11865 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -477,14 +477,13 @@ EXPORT_SYMBOL(kimage_vaddr)
  * booted in EL1 or EL2 respectively.
  */
 SYM_FUNC_START(init_kernel_el)
-   mov_q   x0, INIT_SCTLR_EL1_MMU_OFF
-   msr sctlr_el1, x0
-
mrs x0, CurrentEL
cmp x0, #CurrentEL_EL2
b.eq    init_el2
 
 SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
+   mov_q   x0, INIT_SCTLR_EL1_MMU_OFF
+   msr sctlr_el1, x0
isb
mov_q   x0, INIT_PSTATE_EL1
msr spsr_el1, x0
@@ -504,6 +503,34 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
msr vbar_el2, x0
isb
 
+   /*
+* Fruity CPUs seem to have HCR_EL2.E2H set to RES1,
+* making it impossible to start in nVHE mode. Is that
+* compliant with the architecture? Absolutely not!
+*/
+   mrs x0, hcr_el2
+   and x0, x0, #HCR_E2H
+   cbz x0, 1f
+
+   /* Switching to VHE requires a sane SCTLR_EL1 as a start */
+   mov_q   x0, INIT_SCTLR_EL1_MMU_OFF
+   msr_s   SYS_SCTLR_EL12, x0
+
+   /*
+* Force an eret into a helper "function", and let it return
+* to our original caller... This makes sure that we have
+* initialised the basic PSTATE state.
+*/
+   mov x0, #INIT_PSTATE_EL2
+   msr spsr_el1, x0
+   adr_l   x0, stick_to_vhe
+   msr elr_el1, x0
+   eret
+
+1:
+   mov_q   x0, INIT_SCTLR_EL1_MMU_OFF
+   msr sctlr_el1, x0
+
msr elr_el2, lr
mov w0, #BOOT_CPU_MODE_EL2
eret
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 5eccbd62fec8..c7601030ee82 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -27,12 +27,12 @@ SYM_CODE_START(__hyp_stub_vectors)
 	ventry	el2_fiq_invalid		// FIQ EL2t
 	ventry	el2_error_invalid	// Error EL2t
 
-	ventry	el2_sync_invalid	// Synchronous EL2h
+	ventry	elx_sync		// Synchronous EL2h
 	ventry	el2_irq_invalid		// IRQ EL2h
 	ventry	el2_fiq_invalid		// FIQ EL2h
 	ventry	el2_error_invalid	// Error EL2h
 
-	ventry	el1_sync		// Synchronous 64-bit EL1
+	ventry	elx_sync		// Synchronous 64-bit EL1
 	ventry	el1_irq_invalid		// IRQ 64-bit EL1
 	ventry	el1_fiq_invalid		// FIQ 64-bit EL1
 	ventry	el1_error_invalid	// Error 64-bit EL1
@@ -45,7 +45,7 @@ SYM_CODE_END(__hyp_stub_vectors)
 
 	.align 11
 
-SYM_CODE_START_LOCAL(el1_sync)
+SYM_CODE_START_LOCAL(elx_sync)
 	cmp	x0, #HVC_SET_VECTORS
 	b.ne	1f
 	msr	vbar_el2, x1
@@ -71,7 +71,7 @@ SYM_CODE_START_LOCAL(el1_sync)
 
 9:	mov	x0, xzr
 	eret
-SYM_CODE_END(el1_sync)
+SYM_CODE_END(elx_sync)
 
 // nVHE? No way! Give me the real thing!
 SYM_CODE_START_LOCAL(mutate_to_vhe)
@@ -243,3 +243,23 @@ SYM_FUNC_START(switch_to_vhe)
 #endif
 	ret
 SYM_FUNC_END(switch_to_vhe)
+
+SYM_FUNC_START(stick_to_vhe)
+	/*
+	 * Make sure the switch to VHE cannot fail, by overriding the
+	 * override. This is hilarious.
+	 */
+	adr_l	x1, id_aa64mmfr1_override
+	add	x1, x1, #FTR_OVR_MASK_OFFSET
+	dc	civac, x1
+	dsb	sy
+	isb
+	ldr	x0, [x1]
+	bic	x0, x0, #(0xf << ID_AA64MMFR1_VHE_SHIFT)
+	str	x0, [x1]
+
+	mov	x0, #HVC_VHE_RESTART
+	hvc	#0
+	mov	x0, #BOOT_CPU_MODE_EL2
+	ret
+SYM_FUNC_END(stick_to_vhe)
-- 
2.30.0
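
For readers who do not speak arm64 assembly, the decision this patch adds can be restated as a minimal user-space C sketch. It is not kernel code: the enum and function names below are hypothetical, and only the bit positions (HCR_EL2.E2H at bit 34, the ID_AA64MMFR1_EL1.VH field at bits [11:8]) are taken from the architecture. It shows the "is E2H stuck?" test from init_el2 and the mask update that stick_to_vhe applies to the override word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HCR_EL2.E2H is bit 34 of HCR_EL2. */
#define HCR_E2H			(UINT64_C(1) << 34)
/* ID_AA64MMFR1_EL1.VH occupies bits [11:8]. */
#define ID_AA64MMFR1_VHE_SHIFT	8

/* Illustrative names only; these are not kernel symbols. */
enum boot_path {
	BOOT_PATH_NVHE,		/* E2H clear: continue with the usual nVHE boot */
	BOOT_PATH_STUCK_VHE,	/* E2H already set: take the stick_to_vhe detour */
};

/* Mirrors "and x0, x0, #HCR_E2H; cbz x0, 1f" from the patch. */
static enum boot_path choose_boot_path(uint64_t hcr_el2)
{
	return (hcr_el2 & HCR_E2H) ? BOOT_PATH_STUCK_VHE : BOOT_PATH_NVHE;
}

/* Mirrors the ldr/bic/str sequence: drop the VH field from the override mask. */
static void clear_vhe_override_mask(uint64_t *override_mask)
{
	assert(override_mask != NULL);
	*override_mask &= ~(UINT64_C(0xf) << ID_AA64MMFR1_VHE_SHIFT);
}
```

The sketch leaves out everything that makes the real code delicate (cache maintenance on the override word, the eret into the helper, the HVC restart); it only captures the two bit manipulations.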



[RFT PATCH v3 05/27] arm64: cputype: Add CPU implementor & types for the Apple M1 cores

2021-03-04 Thread Hector Martin
The implementor will be used to condition the FIQ support quirk.

The specific CPU types are not used at the moment, but let's add them
for documentation purposes.

Signed-off-by: Hector Martin 
---
 arch/arm64/include/asm/cputype.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index ef5b040dee44..6231e1f0abe7 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -59,6 +59,7 @@
 #define ARM_CPU_IMP_NVIDIA		0x4E
 #define ARM_CPU_IMP_FUJITSU		0x46
 #define ARM_CPU_IMP_HISI		0x48
+#define ARM_CPU_IMP_APPLE		0x61
 
 #define ARM_CPU_PART_AEM_V8		0xD0F
 #define ARM_CPU_PART_FOUNDATION		0xD00
@@ -99,6 +100,9 @@
 
 #define HISI_CPU_PART_TSV110		0xD01
 
+#define APPLE_CPU_PART_M1_ICESTORM	0x022
+#define APPLE_CPU_PART_M1_FIRESTORM	0x023
+
 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
 #define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
@@ -127,6 +131,8 @@
 #define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
 #define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
 #define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
+#define MIDR_APPLE_M1_ICESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM)
+#define MIDR_APPLE_M1_FIRESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM)
 
 /* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */
 #define MIDR_FUJITSU_ERRATUM_010001		MIDR_FUJITSU_A64FX
-- 
2.30.0
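
To make the new constants concrete, the following stand-alone C sketch shows how an implementor code and part number combine into a MIDR model value, assuming the usual MIDR_EL1 layout (implementer in bits [31:24], architecture in [19:16], part number in [15:4]). The helper names midr_cpu_model() and midr_is_model() are hypothetical stand-ins written for this illustration, not the kernel's macros, and the register value used in the usage note is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ARM_CPU_IMP_APPLE		0x61
#define APPLE_CPU_PART_M1_ICESTORM	0x022
#define APPLE_CPU_PART_M1_FIRESTORM	0x023

/* Field positions within MIDR_EL1. */
#define MIDR_PARTNUM_SHIFT	4
#define MIDR_ARCHITECTURE_SHIFT	16
#define MIDR_IMPLEMENTOR_SHIFT	24

/* Build a model value: implementer, architecture = 0xf, part number. */
static uint32_t midr_cpu_model(uint32_t imp, uint32_t partnum)
{
	assert(imp <= 0xff && partnum <= 0xfff);
	return (imp << MIDR_IMPLEMENTOR_SHIFT) |
	       (UINT32_C(0xf) << MIDR_ARCHITECTURE_SHIFT) |
	       (partnum << MIDR_PARTNUM_SHIFT);
}

/* Compare a raw MIDR against a model, ignoring variant and revision. */
static bool midr_is_model(uint32_t midr, uint32_t model)
{
	const uint32_t mask = (UINT32_C(0xff) << MIDR_IMPLEMENTOR_SHIFT) |
			      (UINT32_C(0xf) << MIDR_ARCHITECTURE_SHIFT) |
			      (UINT32_C(0xfff) << MIDR_PARTNUM_SHIFT);

	return (midr & mask) == model;
}
```

With these definitions, midr_cpu_model(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM) evaluates to 0x610f0230, and a hypothetical raw value such as 0x611f0221 (variant 1, revision 1) still matches the Icestorm model because the variant and revision fields are masked out.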



[RFT PATCH v3 00/27] Apple M1 SoC platform bring-up

2021-03-04 Thread Hector Martin
t [5].

[1] https://github.com/AsahiLinux/m1n1/
[2] https://github.com/AsahiLinux/macvdmtool/
[3] https://github.com/AsahiLinux/vdmtool/
[4] https://github.com/AsahiLinux/docs/wiki/Developer-Quickstart
[5] https://github.com/AsahiLinux/docs/wiki

== Project Blurb ==

Asahi Linux is an open community project dedicated to developing and
maintaining mainline support for Apple Silicon on Linux. Feel free to
drop by #asahi and #asahi-dev on freenode to chat with us, or check
our website for more information on the project:

https://asahilinux.org/

== Changes since v2 ==

* Removed FIQ support patches, as this is now being handled as a
  separate series.
* Added nVHE workaround patch from Marc
* Renamed dts(i) files to better match conventions used in other
  platforms
* Renamed 'm1' in compatible/dts names to 't8103', to be better
  prepared for the chance of multiple SoCs being released under the
  same marketing name.
* Reworded device tree binding text for the platform.
* Changed the default ioremap_np() implementation to return NULL,
  like ioremap_uc().
* Added general documentation for ioremap() variants, including the
  newly introduced one.
* Reworked virtual IPI support in the AIC driver, and attempted
  to thoroughly shave the memory ordering yak.
* Moved GIC registers to sysregs.h instead of including that in the AIC
  driver.
* Added _EL1 suffixes to Apple sysregs.
* Addressed further review comments and feedback.


Arnd Bergmann (1):
  docs: driver-api: device-io: Document I/O access functions

Hector Martin (25):
  dt-bindings: vendor-prefixes: Add apple prefix
  dt-bindings: arm: apple: Add bindings for Apple ARM platforms
  dt-bindings: arm: cpus: Add apple,firestorm & icestorm compatibles
  arm64: cputype: Add CPU implementor & types for the Apple M1 cores
  dt-bindings: timer: arm,arch_timer: Add interrupt-names support
  arm64: arch_timer: implement support for interrupt-names
  asm-generic/io.h:  Add a non-posted variant of ioremap()
  docs: driver-api: device-io: Document ioremap() variants & access
funcs
  arm64: Implement ioremap_np() to map MMIO as nGnRnE
  of/address: Add infrastructure to declare MMIO as non-posted
  arm64: Add Apple vendor-specific system registers
  arm64: move ICH_ sysreg bits from arm-gic-v3.h to sysreg.h
  dt-bindings: interrupt-controller: Add DT bindings for apple-aic
  irqchip/apple-aic: Add support for the Apple Interrupt Controller
  arm64: Kconfig: Introduce CONFIG_ARCH_APPLE
  tty: serial: samsung_tty: Separate S3C64XX ops structure
  tty: serial: samsung_tty: Add ucon_mask parameter
  tty: serial: samsung_tty: Add s3c24xx_port_type
  tty: serial: samsung_tty: IRQ rework
  tty: serial: samsung_tty: Use devm_ioremap_resource
  dt-bindings: serial: samsung: Add apple,s5l-uart compatible
  tty: serial: samsung_tty: Add support for Apple UARTs
  tty: serial: samsung_tty: Add earlycon support for Apple UARTs
  dt-bindings: display: Add apple,simple-framebuffer
  arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree

Marc Zyngier (1):
  arm64: Cope with CPUs stuck in VHE mode

 .../devicetree/bindings/arm/apple.yaml|  64 ++
 .../devicetree/bindings/arm/cpus.yaml |   2 +
 .../bindings/display/simple-framebuffer.yaml  |   5 +
 .../interrupt-controller/apple,aic.yaml   |  88 +++
 .../bindings/serial/samsung_uart.yaml |   4 +-
 .../bindings/timer/arm,arch_timer.yaml|  14 +
 .../devicetree/bindings/vendor-prefixes.yaml  |   2 +
 Documentation/driver-api/device-io.rst| 356 +
 .../driver-api/driver-model/devres.rst|   1 +
 MAINTAINERS   |  15 +
 arch/arm64/Kconfig.platforms  |   8 +
 arch/arm64/boot/dts/Makefile  |   1 +
 arch/arm64/boot/dts/apple/Makefile|   2 +
 arch/arm64/boot/dts/apple/t8103-j274.dts  |  45 ++
 arch/arm64/boot/dts/apple/t8103.dtsi  | 135 
 arch/arm64/configs/defconfig  |   1 +
 arch/arm64/include/asm/cputype.h  |   6 +
 arch/arm64/include/asm/io.h   |   1 +
 arch/arm64/include/asm/sysreg.h   |  60 ++
 arch/arm64/include/asm/sysreg_apple.h |  69 ++
 arch/arm64/kernel/head.S  |  33 +-
 arch/arm64/kernel/hyp-stub.S  |  28 +-
 arch/sparc/include/asm/io_64.h|   4 +
 drivers/clocksource/arm_arch_timer.c  |  24 +-
 drivers/irqchip/Kconfig   |   8 +
 drivers/irqchip/Makefile  |   1 +
 drivers/irqchip/irq-apple-aic.c   | 710 ++
 drivers/of/address.c  |  72 +-
 drivers/tty/serial/Kconfig|   2 +-
 drivers/tty/serial/samsung_tty.c  | 496 +---
 include/asm-generic/io.h  |  22 +-
 include/asm-generic/iomap.h   |   9 +
 include/clocksource/arm_arch_timer.h  |   1 +
 .../interrupt-contro

[RFT PATCH v3 09/27] docs: driver-api: device-io: Document I/O access functions

2021-03-04 Thread Hector Martin
From: Arnd Bergmann 

This adds more detailed descriptions of the various read/write
primitives available for use with I/O memory/ports.

Signed-off-by: Arnd Bergmann 
Signed-off-by: Hector Martin 
---
 Documentation/driver-api/device-io.rst | 138 +
 1 file changed, 138 insertions(+)

diff --git a/Documentation/driver-api/device-io.rst b/Documentation/driver-api/device-io.rst
index 764963876d08..b20864b3ddc7 100644
--- a/Documentation/driver-api/device-io.rst
+++ b/Documentation/driver-api/device-io.rst
@@ -146,6 +146,144 @@ There are also equivalents to memcpy. The ins() and
 outs() functions copy bytes, words or longs to the given
 port.
 
+__iomem pointer tokens
+======================
+
+The data type for an MMIO address is an ``__iomem`` qualified pointer, such as
+``void __iomem *reg``. On most architectures it is a regular pointer that
+points to a virtual memory address and can be offset or dereferenced, but in
+portable code, it must only be passed from and to functions that explicitly
+operate on an ``__iomem`` token, in particular the ioremap() and
+readl()/writel() functions. The 'sparse' semantic code checker can be used to
+verify that this is done correctly.
+
+While on most architectures, ioremap() creates a page table entry for an
+uncached virtual address pointing to the physical MMIO address, some
+architectures require special instructions for MMIO, and the ``__iomem`` pointer
+just encodes the physical address or an offsettable cookie that is interpreted
+by readl()/writel().
+
+Differences between I/O access functions
+========================================
+
+readq(), readl(), readw(), readb(), writeq(), writel(), writew(), writeb()
+
+  These are the most generic accessors, providing serialization against other
+  MMIO accesses and DMA accesses as well as fixed endianness for accessing
+  little-endian PCI devices and on-chip peripherals. Portable device drivers
+  should generally use these for any access to ``__iomem`` pointers.
+
+  Note that posted writes are not strictly ordered against a spinlock, see
+  Documentation/driver-api/io_ordering.rst.
+
+readq_relaxed(), readl_relaxed(), readw_relaxed(), readb_relaxed(),
+writeq_relaxed(), writel_relaxed(), writew_relaxed(), writeb_relaxed()
+
+  On architectures that require an expensive barrier for serializing against
+  DMA, these "relaxed" versions of the MMIO accessors only serialize against
+  each other, but contain a less expensive barrier operation. A device driver
+  might use these in a particularly performance sensitive fast path, with a
+  comment that explains why the usage in a specific location is safe without
+  the extra barriers.
+
+  See memory-barriers.txt for a more detailed discussion on the precise ordering
+  guarantees of the non-relaxed and relaxed versions.
+
+ioread64(), ioread32(), ioread16(), ioread8(),
+iowrite64(), iowrite32(), iowrite16(), iowrite8()
+
+  These are an alternative to the normal readl()/writel() functions, with almost
+  identical behavior, but they can also operate on ``__iomem`` tokens returned
+  for mapping PCI I/O space with pci_iomap() or ioport_map(). On architectures
+  that require special instructions for I/O port access, this adds a small
+  overhead for an indirect function call implemented in lib/iomap.c, while on
+  other architectures, these are simply aliases.
+
+ioread64be(), ioread32be(), ioread16be()
+iowrite64be(), iowrite32be(), iowrite16be()
+
+  These behave in the same way as the ioread32()/iowrite32() family, but with
+  reversed byte order, for accessing devices with big-endian MMIO registers.
+  Device drivers that can operate on either big-endian or little-endian
+  registers may have to implement a custom wrapper function that picks one or
+  the other depending on which device was found.
+
+  Note: On some architectures, the normal readl()/writel() functions
+  traditionally assume that devices are the same endianness as the CPU, while
+  using a hardware byte-reverse on the PCI bus when running a big-endian kernel.
+  Drivers that use readl()/writel() this way are generally not portable, but
+  tend to be limited to a particular SoC.
+
+hi_lo_readq(), lo_hi_readq(), hi_lo_readq_relaxed(), lo_hi_readq_relaxed(),
+ioread64_lo_hi(), ioread64_hi_lo(), ioread64be_lo_hi(), ioread64be_hi_lo(),
+hi_lo_writeq(), lo_hi_writeq(), hi_lo_writeq_relaxed(), lo_hi_writeq_relaxed(),
+iowrite64_lo_hi(), iowrite64_hi_lo(), iowrite64be_lo_hi(), iowrite64be_hi_lo()
+
+  Some device drivers have 64-bit registers that cannot be accessed atomically
+  on 32-bit architectures but allow two consecutive 32-bit accesses instead.
+  Since it depends on the particular device which of the two halves has to be
+  accessed first, a helper is provided for each combination of 64-bit accessors
+  with either low/high or high/low word ordering. A device driver must include
+  either <linux/io-64-nonatomic-lo-hi.h> or <linux/io-64-nonatomic-hi-lo.h> to
+  get the function definitions along with helpers tha
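
The split 64-bit accessors described in the last paragraph of this excerpt can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: struct mock_dev and the mock_* functions are hypothetical stand-ins for a device and for readl(), and the real helpers operate on ``__iomem`` pointers with the ordering guarantees described earlier in the patch. What the sketch does show is the one property that matters here: both variants return the same 64-bit value, but they touch the two halves in opposite order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device with one 64-bit register exposed as two 32-bit halves. */
struct mock_dev {
	uint32_t lo;
	uint32_t hi;
	char order[3];		/* 'l' or 'h' per access, NUL-terminated */
	size_t naccess;
};

/* Stand-in for a 32-bit MMIO read; records which half was touched. */
static uint32_t mock_read32(struct mock_dev *dev, int high_half)
{
	assert(dev != NULL && dev->naccess < 2);
	dev->order[dev->naccess++] = high_half ? 'h' : 'l';
	dev->order[dev->naccess] = '\0';
	return high_half ? dev->hi : dev->lo;
}

/* Low word first, then high word. */
static uint64_t mock_lo_hi_readq(struct mock_dev *dev)
{
	uint32_t lo = mock_read32(dev, 0);
	uint32_t hi = mock_read32(dev, 1);

	return ((uint64_t)hi << 32) | lo;
}

/* High word first, then low word. */
static uint64_t mock_hi_lo_readq(struct mock_dev *dev)
{
	uint32_t hi = mock_read32(dev, 1);
	uint32_t lo = mock_read32(dev, 0);

	return ((uint64_t)hi << 32) | lo;
}
```

Which order is correct depends on the device, for example on which half latches the other, which is why a helper exists for each ordering rather than a single generic fallback.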

[RFT PATCH v3 07/27] arm64: arch_timer: implement support for interrupt-names

2021-03-04 Thread Hector Martin
This allows the devicetree to correctly represent the available set of
timers, which varies from device to device, without the need for fake
dummy interrupts for unavailable slots.

Also add the hyp-virt timer/PPI, which is not currently used, but worth
representing.

Signed-off-by: Hector Martin 
Reviewed-by: Tony Lindgren 
---
 drivers/clocksource/arm_arch_timer.c | 24 +---
 include/clocksource/arm_arch_timer.h |  1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index d0177824c518..ee2501a17697 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -63,6 +63,14 @@ struct arch_timer {
 static u32 arch_timer_rate;
 static int arch_timer_ppi[ARCH_TIMER_MAX_TIMER_PPI];
 
+static const char *arch_timer_ppi_names[ARCH_TIMER_MAX_TIMER_PPI] = {
+	[ARCH_TIMER_PHYS_SECURE_PPI]	= "phys-secure",
+	[ARCH_TIMER_PHYS_NONSECURE_PPI]	= "phys",
+	[ARCH_TIMER_VIRT_PPI]		= "virt",
+	[ARCH_TIMER_HYP_PPI]		= "hyp-phys",
+	[ARCH_TIMER_HYP_VIRT_PPI]	= "hyp-virt",
+};
+
 static struct clock_event_device __percpu *arch_timer_evt;
 
 static enum arch_timer_ppi_nr arch_timer_uses_ppi = ARCH_TIMER_VIRT_PPI;
@@ -1280,8 +1288,9 @@ static void __init arch_timer_populate_kvm_info(void)
 
 static int __init arch_timer_of_init(struct device_node *np)
 {
-	int i, ret;
+	int i, irq, ret;
 	u32 rate;
+	bool has_names;
 
 	if (arch_timers_present & ARCH_TIMER_TYPE_CP15) {
 		pr_warn("multiple nodes in dt, skipping\n");
@@ -1289,8 +1298,17 @@ static int __init arch_timer_of_init(struct device_node *np)
 	}
 
 	arch_timers_present |= ARCH_TIMER_TYPE_CP15;
-	for (i = ARCH_TIMER_PHYS_SECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++)
-		arch_timer_ppi[i] = irq_of_parse_and_map(np, i);
+
+	has_names = of_property_read_bool(np, "interrupt-names");
+
+	for (i = ARCH_TIMER_PHYS_SECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++) {
+		if (has_names)
+			irq = of_irq_get_byname(np, arch_timer_ppi_names[i]);
+		else
+			irq = of_irq_get(np, i);
+		if (irq > 0)
+			arch_timer_ppi[i] = irq;
+	}
 
 	arch_timer_populate_kvm_info();
 
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 1d68d5613dae..73c7139c866f 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -32,6 +32,7 @@ enum arch_timer_ppi_nr {
 	ARCH_TIMER_PHYS_NONSECURE_PPI,
 	ARCH_TIMER_VIRT_PPI,
 	ARCH_TIMER_HYP_PPI,
+	ARCH_TIMER_HYP_VIRT_PPI,
 	ARCH_TIMER_MAX_TIMER_PPI
 };
 
-- 
2.30.0
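
The lookup logic in this patch can be tried outside the kernel with a small model. The sketch below is an illustration under stated assumptions: struct mock_timer_node and the mock_irq_get*() functions are hypothetical replacements for a device node and for of_irq_get()/of_irq_get_byname(), and the interrupt numbers in the usage note are invented. It shows why named lookup removes the need for dummy entries: a node that only provides "phys" and "virt" fills exactly those two slots.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum ppi_slot {
	PPI_PHYS_SECURE,
	PPI_PHYS_NONSECURE,
	PPI_VIRT,
	PPI_HYP,
	PPI_HYP_VIRT,
	PPI_MAX,
};

/* Same strings as arch_timer_ppi_names[] in the patch. */
static const char *const ppi_names[PPI_MAX] = {
	[PPI_PHYS_SECURE]	= "phys-secure",
	[PPI_PHYS_NONSECURE]	= "phys",
	[PPI_VIRT]		= "virt",
	[PPI_HYP]		= "hyp-phys",
	[PPI_HYP_VIRT]		= "hyp-virt",
};

/* Hypothetical stand-in for a timer node's interrupts/interrupt-names. */
struct mock_timer_node {
	size_t count;
	const char *const *names;	/* NULL when interrupt-names is absent */
	const int *irqs;
};

/* Positional lookup; returns 0 when the index is out of range. */
static int mock_irq_get(const struct mock_timer_node *np, size_t index)
{
	return index < np->count ? np->irqs[index] : 0;
}

/* Named lookup; returns 0 when the name is not listed. */
static int mock_irq_get_byname(const struct mock_timer_node *np, const char *name)
{
	for (size_t i = 0; i < np->count; i++)
		if (strcmp(np->names[i], name) == 0)
			return np->irqs[i];
	return 0;
}

/* Fill out[] per slot; slots with no interrupt are set to 0. */
static void parse_timer_irqs(const struct mock_timer_node *np, int out[PPI_MAX])
{
	assert(np != NULL && out != NULL);
	for (size_t i = 0; i < PPI_MAX; i++) {
		int irq = np->names ? mock_irq_get_byname(np, ppi_names[i])
				    : mock_irq_get(np, i);

		out[i] = irq > 0 ? irq : 0;
	}
}
```

For a node with names { "phys", "virt" } and interrupts { 30, 27 }, only the PPI_PHYS_NONSECURE and PPI_VIRT slots are populated; without names, the same function falls back to the fixed positional order.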



[RFT PATCH v3 06/27] dt-bindings: timer: arm,arch_timer: Add interrupt-names support

2021-03-04 Thread Hector Martin
Not all platforms provide the same set of timers/interrupts, and Linux
only needs one (plus kvm/guest ones); some platforms are working around
this by using dummy fake interrupts. Implementing interrupt-names allows
the devicetree to specify an arbitrary set of available interrupts, so
the timer code can pick the right one.

This also adds the hyp-virt timer/interrupt, which was previously not
expressed in the fixed 4-interrupt form.

Signed-off-by: Hector Martin 
---
 .../devicetree/bindings/timer/arm,arch_timer.yaml  | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml b/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
index 2c75105c1398..ebe9b0bebe41 100644
--- a/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
+++ b/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
@@ -34,11 +34,25 @@ properties:
   - arm,armv8-timer
 
   interrupts:
+    minItems: 1
+    maxItems: 5
     items:
       - description: secure timer irq
       - description: non-secure timer irq
       - description: virtual timer irq
       - description: hypervisor timer irq
+      - description: hypervisor virtual timer irq
+
+  interrupt-names:
+    minItems: 1
+    maxItems: 5
+    items:
+      enum:
+        - phys-secure
+        - phys
+        - virt
+        - hyp-phys
+        - hyp-virt
 
   clock-frequency:
     description: The frequency of the main counter, in Hz. Should be present
-- 
2.30.0



[RFT PATCH v3 04/27] dt-bindings: arm: cpus: Add apple,firestorm & icestorm compatibles

2021-03-04 Thread Hector Martin
These are the CPU cores in the "Apple Silicon" M1 SoC.

Signed-off-by: Hector Martin 
---
 Documentation/devicetree/bindings/arm/cpus.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/cpus.yaml b/Documentation/devicetree/bindings/arm/cpus.yaml
index 26b886b20b27..c299423dc7cb 100644
--- a/Documentation/devicetree/bindings/arm/cpus.yaml
+++ b/Documentation/devicetree/bindings/arm/cpus.yaml
@@ -85,6 +85,8 @@ properties:
 
   compatible:
     enum:
+      - apple,icestorm
+      - apple,firestorm
       - arm,arm710t
       - arm,arm720t
       - arm,arm740t
-- 
2.30.0



[RFT PATCH v3 03/27] dt-bindings: arm: apple: Add bindings for Apple ARM platforms

2021-03-04 Thread Hector Martin
This introduces bindings for all three 2020 Apple M1 devices:

* apple,j274 - Mac mini (M1, 2020)
* apple,j293 - MacBook Pro (13-inch, M1, 2020)
* apple,j313 - MacBook Air (M1, 2020)

Signed-off-by: Hector Martin 
---
 .../devicetree/bindings/arm/apple.yaml| 64 +++
 MAINTAINERS   | 10 +++
 2 files changed, 74 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/apple.yaml

diff --git a/Documentation/devicetree/bindings/arm/apple.yaml b/Documentation/devicetree/bindings/arm/apple.yaml
new file mode 100644
index ..1e772c85206c
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/apple.yaml
@@ -0,0 +1,64 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/apple.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Apple ARM Machine Device Tree Bindings
+
+maintainers:
+  - Hector Martin 
+
+description: |
+  ARM platforms using SoCs designed by Apple Inc., branded "Apple Silicon".
+
+  This currently includes devices based on the "M1" SoC, starting with the
+  three Mac models released in late 2020:
+
+  - Mac mini (M1, 2020)
+  - MacBook Pro (13-inch, M1, 2020)
+  - MacBook Air (M1, 2020)
+
+  The compatible property should follow this format:
+
+  compatible = "apple,<targettype>", "apple,<socid>", "apple,arm-platform";
+
+  <targettype> represents the board/device and comes from the `target-type`
+  property of the root node of the Apple Device Tree, lowercased. It can be
+  queried on macOS using the following command:
+
+  $ ioreg -d2 -l | grep target-type
+
+  <socid> is the lowercased SoC ID. Apple uses at least *five* different
+  names for their SoCs:
+
+  - Marketing name ("M1")
+  - Internal name ("H13G")
+  - Codename ("Tonga")
+  - SoC ID ("T8103")
+  - Package/IC part number ("APL1102")
+
+  Devicetrees should use the lowercased SoC ID, to avoid confusion if
+  multiple SoCs share the same marketing name. This can be obtained from
+  the `compatible` property of the arm-io node of the Apple Device Tree,
+  which can be queried as follows on macOS:
+
+  $ ioreg -n arm-io | grep compatible
+
+properties:
+  $nodename:
+const: "/"
+  compatible:
+oneOf:
+  - description: Apple M1 SoC based platforms
+items:
+  - enum:
+  - apple,j274 # Mac mini (M1, 2020)
+  - apple,j293 # MacBook Pro (13-inch, M1, 2020)
+  - apple,j313 # MacBook Air (M1, 2020)
+  - const: apple,t8103
+  - const: apple,arm-platform
+
+additionalProperties: true
+
+...
diff --git a/MAINTAINERS b/MAINTAINERS
index d92f85ca831d..aec14fbd61b8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1637,6 +1637,16 @@ F:   arch/arm/mach-alpine/
 F: arch/arm64/boot/dts/amazon/
 F: drivers/*/*alpine*
 
+ARM/APPLE MACHINE SUPPORT
+M: Hector Martin 
+L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
+S: Maintained
+W: https://asahilinux.org
+B: https://github.com/AsahiLinux/linux/issues
+C: irc://chat.freenode.net/asahi-dev
+T: git https://github.com/AsahiLinux/linux.git
+F: Documentation/devicetree/bindings/arm/apple.yaml
+
 ARM/ARTPEC MACHINE SUPPORT
 M: Jesper Nilsson 
 M: Lars Persson 
-- 
2.30.0



Re: [PATCH v2 00/25] Apple M1 SoC platform bring-up

2021-02-24 Thread Hector Martin

On 22/02/2021 00.20, Hector Martin wrote:

I haven't tested things at EL0 yet, but it looks like the stateful
instructions known to be usable in EL0 (AMX) already default to trap on
this platform, so we should be safe there. Everything else looks like it
probably either shouldn't work in EL0 (I sure hope the address
translation one doesn't...) or is probably stateless. I'll dig deeper
and test EL0 in the future, but so far things look OK (for some
questionable values of OK :) ).


Follow-up: I have EL0 testing scaffolding now, and I found some more 
mutable state (an IMP-DEF, pre-standard version of FEAT_AFP, using a 
separate status register for the bits), but thankfully it traps at EL0 
by default.


And then I found some other mutable IMP-DEF state that does not trap at 
EL0. And which is a 0-day CVE in macOS, because it doesn't 
save/restore/clear it either, nor does it trap there.


E-mailing secur...@apple.com...

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 00/25] Apple M1 SoC platform bring-up

2021-02-23 Thread Hector Martin

On 22/02/2021 05.41, Andy Shevchenko wrote:

Hector, I would like to be cc’ed in the next version


Noted, thanks!

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 15/25] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-02-22 Thread Hector Martin

On 16/02/2021 03.09, Marc Zyngier wrote:

On Mon, 15 Feb 2021 12:17:03 +,
Hector Martin  wrote:

This patch introduces basic UP irqchip support, without SMP/IPI support.


This last comment seems outdated now.


Heh, I forgot to reword this one. Thanks :)


+config APPLE_AIC
+   bool "Apple Interrupt Controller (AIC)"
+   depends on ARM64
+   default ARCH_APPLE
+   select IRQ_DOMAIN
+   select IRQ_DOMAIN_HIERARCHY


arm64 selects GENERIC_IRQ_IPI, which selects IRQ_DOMAIN_HIERARCHY,
which selects IRQ_DOMAIN. So these two lines are superfluous.


Ack, removing these for v3.


+ * In addition, this driver also handles FIQs, as these are routed to the same IRQ vector. These
+ * are used for Fast IPIs (TODO), the ARMv8 timer IRQs, and performance counters (TODO).
+ *


nit: A bit of comment formatting could be helpful.


Wrapped this to 80 columns for v3.


+#include 
+#include 
+#include 
+#include 
+#include 
+#include 


I'd rather you move the ICH_HCR_* definitions to sysreg.h rather than
including the GICv3 stuff. They are only there for historical reasons
(such as supporting KVM on 32bit systems), none of which apply anymore.


Just ICH_HCR, or should I bring all of the ICH_ and ICC_ defines along 
with it?



+   aic_ic_write(ic, AIC_TARGET_CPU + hwirq * 4, BIT(cpu));
+   irq_data_update_effective_affinity(d, cpumask_of(cpu));


It is fine to pick a single CPU out of the whole affinity set, but you
should tell the kernel that this is the case (irqd_set_single_target()).



+
+   irq_set_status_flags(irq, IRQ_LEVEL);


I'm definitely not keen on this override, and the trigger information
should be the one coming from the DT, which is already set for you.
It'd probably be useful to provide an irq_set_type() callback that
returns an error when fed an unsupported trigger.




+   irq_set_noprobe(irq);


This seems to be cargo-culted, and I don't believe this is necessary.



+static const struct irq_domain_ops aic_irq_domain_ops = {
+   .map = aic_irq_domain_map,
+   .unmap = aic_irq_domain_unmap,
+   .xlate = aic_irq_domain_xlate,
+};


You are mixing two APIs: the older OF-specific one, and the newer one
that uses fwnode_handle for hierarchical support. That's OK for older
drivers that were forcefully converted to using generic IPIs, but as
this is a brand new driver, I'd rather it consistently used the new
API. See a proposed rework at [1] (compile tested only).


Applying your fixups for these, thanks! :)


+   atomic_and(~irq_bit, _vipi_mask[this_cpu]);


atomic_andnot()?


+
+   if (!atomic_read(_vipi_mask[this_cpu]))
+   aic_ic_write(ic, AIC_IPI_MASK_SET, AIC_IPI_OTHER);


This is odd. It means that you still perform the MMIO write if the bit
was already clear. I think this could be written as:

	u32 val;
	val = atomic_fetch_andnot(irq_bit, _vipi_mask[this_cpu]);
	if (val && !(val & ~irq_bit))
		aic_ic_write();




	val = atomic_fetch_or(irq_bit, _vipi_mask[this_cpu]);
	if (!val)
		aic_ic_write();


This makes more sense to avoid the redundant MMIO writes. I need to get 
more familiar with all the available atomic ops... lots of useful stuff 
in there I didn't know about.
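
The fetch-and-modify pattern suggested above can be exercised in user space with C11 atomics. This is a sketch, not the driver code: the function names are hypothetical, the bitmap is assumed to hold the set of currently enabled virtual IPIs, and since C11 has no atomic_fetch_andnot(), atomic_fetch_and() with the complemented bit is used instead. Each function returns true only when the hardware register write would actually be needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Disable one virtual IPI. Returns true only when this call cleared the
 * last enabled bit, i.e. when the hardware IPI should now be masked.
 */
static bool vipi_mask(atomic_uint *enabled, unsigned int irq_bit)
{
	unsigned int old;

	assert(enabled != NULL);
	old = atomic_fetch_and(enabled, ~irq_bit);	/* fetch_andnot equivalent */
	return old && !(old & ~irq_bit);
}

/*
 * Enable one virtual IPI. Returns true only when nothing was enabled
 * before, i.e. when the hardware IPI should now be unmasked.
 */
static bool vipi_unmask(atomic_uint *enabled, unsigned int irq_bit)
{
	unsigned int old;

	assert(enabled != NULL);
	old = atomic_fetch_or(enabled, irq_bit);
	return !old;
}
```

Because the decision is made from the value returned by the atomic operation itself, there is no window between "modify" and "read back" in which another CPU could change the bitmap, and repeated mask or unmask calls no longer trigger redundant MMIO writes.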



+   for_each_cpu(cpu, mask) {
+   if (atomic_read(_vipi_mask[cpu]) & irq_bit) {
+   atomic_or(irq_bit, _vipi_flag[cpu]);
+   send |= AIC_IPI_SEND_CPU(cpu);


That's really odd. A masked IPI should be made pending, and delivered
on unmask. I think this all works because we never mask individual
IPIs, as this would otherwise drop interrupts on the floor.


I wasn't really sure whether IPIs are supposed to end up pending like 
that; indeed if that's how it's supposed to work, then I also need logic 
at mask/unmask time to fire off any pending IPIs. I'll do it like that 
for v3.


Now I wonder how other drivers do it... I'm guessing this never gets 
tested, since the IPI code only exercises a fraction of the irq features...
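
The "pending while masked, delivered on unmask" behaviour discussed here can be written down as a tiny state machine. The sketch below is a single-threaded model with hypothetical names; it deliberately ignores atomics and memory ordering, which is where the real driver's difficulty lies, and only illustrates the bookkeeping: a send to a masked vIPI is recorded rather than dropped, and unmasking reports that a hardware IPI should be fired for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-CPU virtual IPI state; both fields are bitmaps indexed by vIPI number. */
struct vipi_state {
	unsigned int enabled;
	unsigned int pending;
};

/* Record the vIPI as pending. Returns true if a hardware IPI should be sent now. */
static bool vipi_send(struct vipi_state *s, unsigned int bit)
{
	assert(s != NULL);
	s->pending |= bit;
	return (s->enabled & bit) != 0;
}

/* Enable the vIPI. Returns true if it was already pending and must be fired. */
static bool vipi_unmask_pending(struct vipi_state *s, unsigned int bit)
{
	assert(s != NULL);
	s->enabled |= bit;
	return (s->pending & bit) != 0;
}

/* Handler side: consume and return every vIPI that is both pending and enabled. */
static unsigned int vipi_handle(struct vipi_state *s)
{
	unsigned int firing;

	assert(s != NULL);
	firing = s->pending & s->enabled;
	s->pending &= ~firing;
	return firing;
}
```

In this model a send to a masked vIPI produces no handler work until vipi_unmask_pending() is called, at which point exactly one delivery happens and the pending bit is cleared.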



+static void aic_handle_ipi(struct pt_regs *regs)
+{
+   int this_cpu = smp_processor_id();
+   int i;
+   unsigned long firing;
+
+   aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_OTHER);
+
+   /*
+* Ensure that we've received and acked the IPI before we load the vIPI
+* flags. This pairs with the second wmb() above.
+*/
+   mb();


I don't get your ordering here.

If you are trying to order against something that has happened on
another CPU (which is pretty likely in the case of an IPI), why isn't
this a smp_mb_before_atomic() (and conversely smp_mb_after_atomic() in
aic_ipi_send_mask())?

Although this looks to me like a good case for _acquire/_release
semantics.


This is trying to order the atomic ops with the IPI IRQ itself, in 
particular the ACK in the preceding line. If they execute in reverse 
order (or more precisely if the ACK t

Re: [PATCH 7/8 v1.5] arm64: Always keep DAIF.[IF] in sync

2021-02-22 Thread Hector Martin

On 20/02/2021 03.26, Mark Rutland wrote:

On Sat, Feb 20, 2021 at 02:25:30AM +0900, Hector Martin wrote:

Apple SoCs (A11 and newer) have some interrupt sources hardwired to the
FIQ line. We implement support for this by simply treating IRQs and FIQs
the same way in the interrupt vectors.

To support these systems, the FIQ mask bit needs to be kept in sync with
the IRQ mask bit, so both kinds of exceptions are masked together. No
other platforms should be delivering FIQ exceptions right now, and we
already unmask FIQ in normal process context, so this should not have an
effect on other systems - if spurious FIQs were arriving, they would
already panic the kernel.


This looks good to me; I've picked this up and pushed out my arm64/fiq
branch [1,2] incorporating this, tagged as arm64-fiq-20210219.

I'll give this version a few days to gather comments before I post a v2.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/fiq
[2] https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/fiq


Thanks! Any chance you can do a rebase on top of torvalds/master? Since 
Marc's nVHE changes went in, we're going to need to add a workaround 
patch for the M1's lack of nVHE mode, which is going to be in the next 
version of my M1 bringup series - but right now that would involve 
telling people to merge two trees to build a base to apply it on, which 
is sub-optimal.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 20/25] tty: serial: samsung_tty: Use devm_ioremap_resource

2021-02-21 Thread Hector Martin

On 21/02/2021 23.59, Marc Zyngier wrote:

Here's what I've been using last time I had to muck with the 4210
stuff:


qemu-system-arm \
-kernel arch/arm/boot/zImage -M smdkc210 \
-append "console=ttySAC0,115200n8 earlycon=smh root=/dev/mmcblk0p2 rootwait" \
-nographic -semihosting -smp 2 \
-dtb arch/arm/boot/dts/exynos4210-smdkv310.dtb \
-drive if=sd,driver=null-co -drive if=sd,driver=null-co \
-drive if=sd,file=../vminstall/bullseye32/MsiKFRxxujYIkiKT.img,format=raw


where the last line points to a standard Debian image created
separately.


Hah, exynos4210-smdkv310.dtb is what did it. And here I was thinking 
something with "c210" in the name would be more likely to work with qemu 
machine "smdkc210"... :-)


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 00/25] Apple M1 SoC platform bring-up

2021-02-21 Thread Hector Martin

On 18/02/2021 23.36, Mark Rutland wrote:

IIUC, the CPUs in these parts have some IMP-DEF instructions that can be
used at EL0 which might have some IMP-DEF state. Our general expectation
is that FW should configure such things to trap, but I don't know
whether the M1 FW does that, and I fear that this will end up being a
problem for us -- even if that doesn't affect EL1/EL2, IMP-DEF state is
an interesting covert channel between EL0 tasks, and not generally safe
to use thanks to context-switch and idle, so I'd like to make sure we
can catch usage and make it SIGILL.

Do you happen to know whether all of that is configured to trap, and if
not, is it possible to adjust the bootloader to ensure it is?


Very good point!

If only they were IMP-DEF... they're straight in Unallocated space. I 
spent some time the other day exhaustively searching the chunk of the 
encoding space where it looks like all these "fun" additions are, at 
EL2, and I documented what I found here:

https://github.com/AsahiLinux/docs/wiki/HW:Apple-Instructions

I haven't tested things at EL0 yet, but it looks like the stateful 
instructions known to be usable in EL0 (AMX) already default to trap on 
this platform, so we should be safe there. Everything else looks like it 
probably either shouldn't work in EL0 (I sure hope the address 
translation one doesn't...) or is probably stateless. I'll dig deeper 
and test EL0 in the future, but so far things look OK (for some 
questionable values of OK :) ).


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 25/25] arm64: apple: Add initial Mac Mini 2020 (M1) devicetree

2021-02-21 Thread Hector Martin

On 16/02/2021 04.29, Krzysztof Kozlowski wrote:

On Mon, Feb 15, 2021 at 09:17:13PM +0900, Hector Martin wrote:

+   memory@8 {
+   device_type = "memory";
+   reg = <0 0 0 0>; /* To be filled by loader */


dtc and dtschema might complain, so could you set here fake memory
address 0x8? Would that work for your bootloader?


Yeah, the bootloader just replaces the entire property anyway. I'll fill 
in some dummy values (the real usable memory range is to some extent 
dynamic and depends on firmware).
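To illustrate what replacing the property involves, here is a minimal sketch of how a loader might encode a reg value for this node, given the #address-cells = <2> and #size-cells = <2> declared in the patch. The helper names are hypothetical and the actual bootloader code is not part of this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Store one 32-bit devicetree cell in big-endian byte order. */
static void put_cell(uint8_t *out, uint32_t v)
{
	out[0] = (uint8_t)(v >> 24);
	out[1] = (uint8_t)(v >> 16);
	out[2] = (uint8_t)(v >> 8);
	out[3] = (uint8_t)v;
}

/*
 * Hypothetical helper: encode a <base size> pair for a node whose parent
 * uses #address-cells = <2> and #size-cells = <2>, i.e. four cells.
 * Returns the number of bytes written (always 16).
 */
static size_t encode_memory_reg(uint8_t out[16], uint64_t base, uint64_t size)
{
	put_cell(out + 0,  (uint32_t)(base >> 32)); /* base, high cell */
	put_cell(out + 4,  (uint32_t)base);         /* base, low cell  */
	put_cell(out + 8,  (uint32_t)(size >> 32)); /* size, high cell */
	put_cell(out + 12, (uint32_t)size);         /* size, low cell  */
	return 16;
}
```

Because the whole 16-byte value is rewritten at boot, whatever dummy cells the .dts carries only need to keep dtc and dtschema quiet.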



+   };
+};
+
+ {
+   status = "okay";
+};
diff --git a/arch/arm64/boot/dts/apple/apple-m1.dtsi b/arch/arm64/boot/dts/apple/apple-m1.dtsi
new file mode 100644
index ..45c87771b057
--- /dev/null
+++ b/arch/arm64/boot/dts/apple/apple-m1.dtsi
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: GPL-2.0+ OR MIT
+/*
+ * Copyright The Asahi Linux Contributors
+ */
+
+#include 
+#include 
+
+/ {
+   compatible = "apple,m1", "apple,arm-platform";
+
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   cpus {
+   #address-cells = <2>;
+   #size-cells = <0>;
+
+   cpu0: cpu@0 {
+   compatible = "apple,icestorm";
+   device_type = "cpu";
+   reg = <0x0 0x0>;
+   enable-method = "spin-table";
+   cpu-release-addr = <0 0>; /* To be filled by loader */
+   };


New line after every device node, please.


Added newlines after all the CPU nodes.


With these minor changes, fine for me:
Reviewed-by: Krzysztof Kozlowski 


Thanks!

v3 will rename this file to apple/t8103.dtsi and the board file to 
t8103-j274.dts to better match other platforms (and to use the proper 
SoC ID for the M1); please let me know if you're okay keeping the 
Reviewed-by for that.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 20/25] tty: serial: samsung_tty: Use devm_ioremap_resource

2021-02-21 Thread Hector Martin

On 21/02/2021 04.17, Marc Zyngier wrote:

On 2021-02-20 19:13, Krzysztof Kozlowski wrote:

On Thu, Feb 18, 2021 at 11:01:21PM +0900, Hector Martin wrote:

On 16/02/2021 03.51, Krzysztof Kozlowski wrote:

Also fix a bug checking the return value, which should use IS_ERR().


No, no, no. We never, never combine fixing bugs with some rework.
However devm_ioremap() returns NULL so where is the error?


Sorry, this was a commit message mistake. The code is correct and so is the
patch: just the NULL check is correct for the previous variant and IS_ERR is
correct for devm_ioremap_resource. I confused myself while writing the
commit message after the fact.
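A small userspace model of the two conventions may help here. The ERR_PTR/IS_ERR/PTR_ERR definitions below are stand-ins written in the style of the kernel helpers, and the mock_* and probe_* functions are hypothetical; none of this is code from the driver.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static char fake_mmio[16];

/* Mock of a devm_ioremap()-style call: returns NULL on failure. */
static void *mock_ioremap(int fail)
{
	return fail ? NULL : fake_mmio;
}

/* Mock of a devm_ioremap_resource()-style call: returns ERR_PTR(). */
static void *mock_ioremap_resource(int fail)
{
	return fail ? ERR_PTR(-ENOMEM) : fake_mmio;
}

/* Correct check for the NULL-returning variant. */
static int probe_old(int fail)
{
	void *base = mock_ioremap(fail);

	if (!base)
		return -ENOMEM;
	return 0;
}

/* Correct check for the ERR_PTR-returning variant. */
static int probe_new(int fail)
{
	void *base = mock_ioremap_resource(fail);

	if (IS_ERR(base))
		return (int)PTR_ERR(base);
	return 0;
}
```

In this model a failed mock_ioremap_resource() returns a non-NULL pointer, so a leftover NULL check would silently accept it. That is why the check has to change together with the call.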


Did you test your patches on existing platforms? If not, please mark all
of them as RFT on next submission, so Greg does not pick them too fast.


I unfortunately don't have any Exynos devices where I could test the code (I
have a couple but no serial connections, and I have no idea if mainline would
run on them). I'll mark v3 as RFT.


If you have one of Odroid boards with Exynos, then you can nicely test
Exynos. Others - depends, on board.
Anyway I can test them for you. I just want to be sure that Greg waits
for this testing.


Worst case, QEMU has some Exynos4210 emulation that is usable.


That's a good point; better than nothing, certainly.

Does anyone have a known good example of booting an exynos kernel under 
qemu? I tried building a plain 5.11 arm exynos_defconfig and booting it, 
but without much luck:


$ qemu-system-arm -kernel arch/arm/boot/zImage \
    -append "console=ttySAC0,115200n8 debug" \
    -dtb arch/arm/boot/dts/exynos4210-universal_c210.dtb \
    -nographic -serial mon:stdio -M smdkc210 -smp 2


(I also tried without the -dtb option, in case qemu provides something 
usable)


Of course I'll still mark v3 as RFT, I just thought I might as well try 
qemu.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 19/25] tty: serial: samsung_tty: IRQ rework

2021-02-21 Thread Hector Martin

On 21/02/2021 04.11, Krzysztof Kozlowski wrote:

On Thu, Feb 18, 2021 at 10:53:10PM +0900, Hector Martin wrote:

This should've gone in the next patch. A previous reviewer told me to put
declarations at the top of the file, so I put it there and moved this one
along with it, but I'll keep it to the addition only for v3.


Maybe I missed something in the context but it looked like
forward declaration s3c24xx_serial_tx_chars() was not needed? In such
case no need to move it.


It's needed in patch #22 in this series; having it in this patch was a 
mistake I made while splitting up the changes. I have moved that line to 
the Apple support patch for v3.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


[PATCH 7/8 v1.5] arm64: Always keep DAIF.[IF] in sync

2021-02-19 Thread Hector Martin
Apple SoCs (A11 and newer) have some interrupt sources hardwired to the
FIQ line. We implement support for this by simply treating IRQs and FIQs
the same way in the interrupt vectors.

To support these systems, the FIQ mask bit needs to be kept in sync with
the IRQ mask bit, so both kinds of exceptions are masked together. No
other platforms should be delivering FIQ exceptions right now, and we
already unmask FIQ in normal process context, so this should not have an
effect on other systems - if spurious FIQs were arriving, they would
already panic the kernel.

Signed-off-by: Hector Martin 
Cc: Mark Rutland 
Cc: Catalin Marinas 
Cc: James Morse 
Cc: Marc Zyngier 
Cc: Thomas Gleixner 
Cc: Will Deacon 

---
 arch/arm64/include/asm/arch_gicv3.h |  2 +-
 arch/arm64/include/asm/assembler.h  |  8 
 arch/arm64/include/asm/daifflags.h  | 10 +-
 arch/arm64/include/asm/irqflags.h   | 16 +++-
 arch/arm64/kernel/entry.S   | 12 +++-
 arch/arm64/kernel/process.c |  2 +-
 arch/arm64/kernel/smp.c |  1 +
 7 files changed, 26 insertions(+), 25 deletions(-)

This is the updated patch after addressing the comments in the original
v2 review; we're moving it to this series now, so please review it in
this context.
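For readers following the immediates in the diff, here is a small userspace model of the DAIF mask bits. It assumes the architectural layout (D, A, I, F in PSTATE bits 9 to 6, with the daifset/daifclr immediate selecting them as 8, 4, 2, 1); the helper functions are illustrative and not part of the patch.

```c
#include <stdint.h>

/* PSTATE mask bits, matching the PSR_*_BIT names used in the diff. */
#define PSR_F_BIT 0x00000040u
#define PSR_I_BIT 0x00000080u
#define PSR_A_BIT 0x00000100u
#define PSR_D_BIT 0x00000200u

/* Model of "msr daifset, #imm": set the mask bits selected by imm. */
static uint32_t daifset(uint32_t daif, unsigned int imm)
{
	return daif | ((imm & 0xfu) << 6);
}

/* Model of "msr daifclr, #imm": clear the mask bits selected by imm. */
static uint32_t daifclr(uint32_t daif, unsigned int imm)
{
	return daif & ~((imm & 0xfu) << 6);
}

/* True when I and F are both masked or both unmasked. */
static int daif_if_in_sync(uint32_t daif)
{
	return !(daif & PSR_I_BIT) == !(daif & PSR_F_BIT);
}
```

With this model, daifset with #2 masks only IRQ and leaves the two bits out of sync, while #3 masks IRQ and FIQ together. Likewise, clearing #(8 | 4) unmasks D and A while leaving both I and F untouched.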

diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index 880b9054d75c..934b9be582d2 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -173,7 +173,7 @@ static inline void gic_pmr_mask_irqs(void)

 static inline void gic_arch_enable_irqs(void)
 {
-   asm volatile ("msr daifclr, #2" : : : "memory");
+   asm volatile ("msr daifclr, #3" : : : "memory");
 }

 #endif /* __ASSEMBLY__ */
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index bf125c591116..53ff8c71eed7 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -40,9 +40,9 @@
msr daif, \flags
.endm

-   /* IRQ is the lowest priority flag, unconditionally unmask the rest. */
-   .macro enable_da_f
-   msr daifclr, #(8 | 4 | 1)
+   /* IRQ/FIQ are the lowest priority flags, unconditionally unmask the rest. */
+   .macro enable_da
+   msr daifclr, #(8 | 4)
.endm

 /*
@@ -50,7 +50,7 @@
  */
.macro  save_and_disable_irq, flags
mrs \flags, daif
-   msr daifset, #2
+   msr daifset, #3
.endm

.macro  restore_irq, flags
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index 1c26d7baa67f..5eb7af9c4557 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -13,8 +13,8 @@
 #include 

 #define DAIF_PROCCTX   0
-#define DAIF_PROCCTX_NOIRQ PSR_I_BIT
-#define DAIF_ERRCTX(PSR_I_BIT | PSR_A_BIT)
+#define DAIF_PROCCTX_NOIRQ (PSR_I_BIT | PSR_F_BIT)
+#define DAIF_ERRCTX(PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
 #define DAIF_MASK  (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)


@@ -47,7 +47,7 @@ static inline unsigned long local_daif_save_flags(void)
if (system_uses_irq_prio_masking()) {
/* If IRQs are masked with PMR, reflect it in the flags */
if (read_sysreg_s(SYS_ICC_PMR_EL1) != GIC_PRIO_IRQON)
-   flags |= PSR_I_BIT;
+   flags |= PSR_I_BIT | PSR_F_BIT;
}

return flags;
@@ -69,7 +69,7 @@ static inline void local_daif_restore(unsigned long flags)
bool irq_disabled = flags & PSR_I_BIT;

WARN_ON(system_has_prio_mask_debugging() &&
-   !(read_sysreg(daif) & PSR_I_BIT));
+   (read_sysreg(daif) & (PSR_I_BIT | PSR_F_BIT)) != (PSR_I_BIT | PSR_F_BIT));

if (!irq_disabled) {
trace_hardirqs_on();
@@ -86,7 +86,7 @@ static inline void local_daif_restore(unsigned long flags)
 * If interrupts are disabled but we can take
 * asynchronous errors, we can take NMIs
 */
-   flags &= ~PSR_I_BIT;
+   flags &= ~(PSR_I_BIT | PSR_F_BIT);
pmr = GIC_PRIO_IRQOFF;
} else {
pmr = GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET;
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index ff328e5bbb75..b57b9b1e4344 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -12,15 +12,13 @@

 /*
  * Aarch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and
- * FIQ exceptions, in the 'daif' register. We mask and unmask them in 'dai'
+ * FIQ exceptions, in the 'daif' register. We mask and unmask them in 'daif'
  * order:
  * Masking debug exceptions causes all other exceptions to be masked too/
- * 

Re: [PATCH 7/8 v1.5] arm64: Always keep DAIF.[IF] in sync

2021-02-19 Thread Hector Martin

On 20/02/2021 02.21, Hector Martin wrote:

Apple SoCs (A11 and newer) have some interrupt sources hardwired to the
FIQ line. We implement support for this by simply treating IRQs and FIQs
the same way in the interrupt vectors.

To support these systems, the FIQ mask bit needs to be kept in sync with
the IRQ mask bit, so both kinds of exceptions are masked together. No
other platforms should be delivering FIQ exceptions right now, and we
already unmask FIQ in normal process context, so this should not have an
effect on other systems - if spurious FIQs were arriving, they would
already panic the kernel.

Signed-off-by: Hector Martin 
Cc: Mark Rutland 
Cc: Catalin Marinas 
Cc: James Morse 
Cc: Marc Zyngier 
Cc: Thomas Gleixner 
Cc: Will Deacon 

Argh, sorry, I botched the threading. Got caught by git send-email 
prompting me on the dry-run, but not after I added a --to... Resending.


--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH 0/8] arm64: Support FIQ controller registration

2021-02-19 Thread Hector Martin

Hi Mark,

Thanks for tackling this side of the problem!

On 19/02/2021 20.38, Mark Rutland wrote:

The only functional difference here is that if an IRQ
is somehow taken prior to set_handle_irq() the default handler will directly
panic() rather than the vector branching to NULL.


That sounds like the right thing to do, certainly.
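The difference can be sketched with a function pointer that starts out at a default handler instead of NULL. This is a userspace model only: the names below are hypothetical, and recording a flag stands in for the real panic().

```c
#include <stddef.h>

typedef void (*irq_handler_fn)(void);

static int panicked; /* set when the default handler is reached */
static int handled;  /* counts calls to the registered handler  */

/* Stand-in for a default handler that would call panic(). */
static void default_handle_irq(void)
{
	panicked = 1;
}

/* The root handler pointer is never NULL: it starts at the default. */
static irq_handler_fn root_irq_handler = default_handle_irq;

/* Model of registering the interrupt controller's root handler. */
static void model_set_handle_irq(irq_handler_fn fn)
{
	root_irq_handler = fn;
}

/* Example handler an irqchip driver might register. */
static void real_handler(void)
{
	handled++;
}

/* Model of the exception vector dispatching an IRQ. */
static void take_irq(void)
{
	root_irq_handler();
}
```

An interrupt taken before registration lands in default_handle_irq() and fails loudly with a clear message, rather than jumping through a NULL pointer.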


The penultimate patch is cherry-picked from the v2 M1 series, and as per
discussion there [3] will need a few additional fixups. I've included it for
now as the DAIF.IF alignment is necessary for the FIQ exception handling added
in the final patch.



The final patch adds the low-level FIQ exception handling and registration
mechanism atop the prior rework.

I'm hoping that we can somehow queue the first 6 patches of this series as a
base for the M1 support. With that we can either cherry-pick a later version of
the DAIF.IF patch here, or the M1 support series can take the FIQ handling
patch. I've pushed the series out to my arm64/fiq branch [4] on kernel.org,
atop v5.11.


Looks good! I cherry picked my updated version of the DAIF.IF patch into 
your series at [1] (3322522d), and then rebased the M1 series on top of 
it (with the change to use set_handle_fiq(), minus all the other 
obsoleted FIQ stuff) at [2]. It all boots and works as expected.


I think it makes sense for you to take the DAIF.IF patch, as it goes 
along with this series. Then we can base the M1 series off of it. If you 
think that works, I can send it off as a one-off reply to the version in 
this series and we can review it here if you want, or otherwise feel 
free to cherry-pick it into a v2 (CC as appropriate).


If this all makes sense, the v3 of the M1 series will then be based off 
of this patchset as in [2], and I'll link to your tree in the cover 
letter so others know where to apply it. Arnd (CCed) is going to be 
merging that one via the SoC tree, so as long as we coordinate a stable 
base once everything is reviewed and ready to merge, I believe it should 
all work out fine on the way up.


Just for completeness, the current DAIF.IF patch in the context of the 
original series is at [3] (4dd6330f), in case that's useful to someone 
for some reason (since there were conflicts due to the refactoring 
happening before it, it changed a bit).


[1] https://github.com/AsahiLinux/linux/tree/fiq
[2] https://github.com/AsahiLinux/linux/tree/upstream-bringup-v3
[3] https://github.com/AsahiLinux/linux/tree/upstream-bringup-v2.5

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 08/25] arm64: Always keep DAIF.[IF] in sync

2021-02-18 Thread Hector Martin

On 18/02/2021 23.22, Mark Rutland wrote:

I think that for consistency we always want to keep IRQ and FIQ in-sync,
even when using GIC priorities. So when handling a pseudo-NMI we should
unmask DAIF.DA and leave DAIF.IF masked.


In that case there's one more, in daifflags.h:local_daif_restore():

/*
 * If interrupts are disabled but we can take
 * asynchronous errors, we can take NMIs
 */
flags &= ~PSR_I_BIT;
pmr = GIC_PRIO_IRQOFF;


And a minor related one: should init_gic_priority_masking() WARN if FIQ is
masked too? This probably goes with the above.


I think it should, yes.


Done for v3 then. Thanks!

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub


Re: [PATCH v2 22/25] tty: serial: samsung_tty: Add support for Apple UARTs

2021-02-18 Thread Hector Martin

On 16/02/2021 04.13, Krzysztof Kozlowski wrote:

On Mon, Feb 15, 2021 at 09:17:10PM +0900, Hector Martin wrote:

@@ -389,10 +396,12 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
ucon = rd_regl(port, S3C2410_UCON);
ucon &= ~(S3C64XX_UCON_TXMODE_MASK);
ucon |= S3C64XX_UCON_TXMODE_CPU;
-   wr_regl(port,  S3C2410_UCON, ucon);
  
  	/* Unmask Tx interrupt */

switch (ourport->info->type) {
+   case TYPE_APPLE_S5L:
+   ucon |= APPLE_S5L_UCON_TXTHRESH_ENA_MSK;
+   break;
case TYPE_S3C6400:
s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
break;
@@ -401,7 +410,16 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
break;
}
  
+	wr_regl(port,  S3C2410_UCON, ucon);


You are now configuring the PIO mode after unmasking interrupt. I don't
think it's a good idea to change the order... and if it were, it
would deserve a separate patch.


For v3 I moved the wr_regl back and just write it again in the 
TYPE_APPLE_S5L branch; that way, setting the PIO mode and unmasking the 
interrupt are two discrete operations on S5L, like they are on other types.
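Roughly, the ordering described above looks like the following userspace model. The MODEL_* bit values and function names are placeholders for illustration and are not taken from the driver headers; the fake register just logs writes so the order can be inspected.

```c
#include <stdint.h>

/* Placeholder bit definitions for this model only. */
#define MODEL_TXMODE_MASK  (3u << 2)
#define MODEL_TXMODE_CPU   (1u << 2)
#define MODEL_TXTHRESH_ENA (1u << 13)

static uint32_t model_ucon;   /* fake UCON register             */
static uint32_t write_log[4]; /* values written, in order       */
static int write_count;       /* number of register writes seen */

/* Model of wr_regl(port, UCON, v): record and apply the write. */
static void model_wr_ucon(uint32_t v)
{
	model_ucon = v;
	if (write_count < 4)
		write_log[write_count] = v;
	write_count++;
}

/* Model of the S5L path of enable_tx_pio() as described for v3. */
static void model_enable_tx_pio_s5l(void)
{
	uint32_t ucon = model_ucon;

	/* Step 1: select PIO (CPU) transmit mode and write it out. */
	ucon &= ~MODEL_TXMODE_MASK;
	ucon |= MODEL_TXMODE_CPU;
	model_wr_ucon(ucon);

	/* Step 2: unmask the Tx interrupt with a second, separate write. */
	ucon |= MODEL_TXTHRESH_ENA;
	model_wr_ucon(ucon);
}
```

The cost is one extra register write on S5L, in exchange for keeping the same mode-then-unmask ordering as the other port types.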



/* Keep all interrupts masked and cleared */
switch (ourport->info->type) {
+   case TYPE_APPLE_S5L: {


Usually you put TYPE_APPLE at the end of switch, so please keep it
consistent. Can be first or last - just everywhere the same, unless you
have a fall-through on purpose.


Good point, thanks, moved it for v3. It was actually inconsistent in 
more places, I made all the orders the same (the enum order, and 
default: always goes at the end).



@@ -2179,6 +2329,32 @@ static int s3c24xx_serial_resume_noirq(struct device *dev)
if (port) {
/* restore IRQ mask */
switch (ourport->info->type) {
+   case TYPE_APPLE_S5L: {
+   unsigned int ucon;
+
+   clk_prepare_enable(ourport->clk);
+   if (!IS_ERR(ourport->baudclk))
+   clk_prepare_enable(ourport->baudclk);


We should start checking the return values of clk operations. I know
that existing code does it only in few places, so basically you are not
making it worse...


Added error checking for these for v3, thanks.
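As a sketch of the kind of checking meant here (not the actual v3 code, which is not shown in this thread), the pattern is to test each enable call and undo the earlier one on failure. The model_clk type and functions are hypothetical stand-ins for the clk API, and a NULL baudclk models the "no baud clock" case that the driver tests with IS_ERR().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct clk, for illustration only. */
struct model_clk {
	int fail;    /* non-zero: enabling this clock fails */
	int enabled; /* set while the clock is enabled      */
};

/* Model of clk_prepare_enable(): 0 on success, negative on error. */
static int model_clk_prepare_enable(struct model_clk *c)
{
	if (c->fail)
		return -5; /* arbitrary negative error code */
	c->enabled = 1;
	return 0;
}

/* Model of clk_disable_unprepare(). */
static void model_clk_disable_unprepare(struct model_clk *c)
{
	c->enabled = 0;
}

/* Enable both clocks; if the second fails, undo the first. */
static int model_resume_clocks(struct model_clk *clk,
			       struct model_clk *baudclk)
{
	int ret = model_clk_prepare_enable(clk);

	if (ret)
		return ret;

	if (baudclk) {
		ret = model_clk_prepare_enable(baudclk);
		if (ret) {
			model_clk_disable_unprepare(clk);
			return ret;
		}
	}
	return 0;
}
```

Propagating the error from the resume path means a clock failure shows up as a failed resume instead of a UART that is silently dead.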


+#define S5L_SERIAL_DRV_DATA ((kernel_ulong_t)_serial_drv_data)
+#else
+#define S5L_SERIAL_DRV_DATA ((kernel_ulong_t)NULL)
+#endif
+
+


Only one line break.


Fixed in v3.

Thank you for the reviews!

--
Hector Martin (mar...@marcan.st)
Public Key: https://mrcn.st/pub

