On 09.07.19 22:48, 'Pratyush Yadav' via Jailhouse wrote:
> A System Memory Management Unit (SMMU) performs a task analogous to a
> CPU's MMU, translating addresses for device requests from system I/O
> devices before the requests are passed into the system interconnect.
> 
> Implement a driver for SMMU v3 that maps and unmaps memory for specified
> stream ids.
> 
> The guest cells are assigned stream IDs in their configs and only those
> assigned stream IDs can be used by the cells. There is no checking in
> place to make sure two cells do not use the same stream IDs. This must
> be taken care of when creating the cell configs.
> 
> This driver is implemented based on the following assumptions:
> - Running on a Little endian 64 bit core compatible with ARM v8
>   architecture.
> - SMMU supporting only AARCH64 mode.
> - SMMU AARCH 64 stage 2 translation configurations are compatible with
>   ARMv8 VMSA. So re-using the translation tables of CPU for SMMU.
> 
> This driver is loosely based on the Linux kernel SMMU v3 driver.
> 
> Signed-off-by: Pratyush Yadav <[email protected]>
> Signed-off-by: Lokesh Vutla <[email protected]>
> ---
> v3:
> Address comments by Jean:
> - Fix typo in opening comment.
> - Change STRTAB_STE_DWORDS to (1 << STRTAB_STE_DWORDS_BITS).
> - Remove STRTAB_CTXDESC_S1CTXPTR_SHIFT.
> - Fix queue_error(). Read GERROR register now.
> - Add gerr_mask to struct arm_smmu_queue.
> - Add function arm_smmu_cmdq_skip_err() to skip current command when
>   there is a command queue error.
> - Sync consumer when checking for queue_full() in
>   arm_smmu_cmdq_insert_cmd().
> - Change dsb(ish) to dsb(ishst).
> - Remove local variable vmid from arm_smmu_cmdq_build_cmd().
> - Remove an irrelevant comment from arm_smmu_cmdq_build_cmd().
> - Drop queue_insert_raw() and move its contents to
>   arm_smmu_cmdq_insert_cmd().
> - Invalidate L1 descriptors in arm_smmu_init_l2_strtab() and
>   arm_smmu_uninit_l2_strtab().
> - Fix valid bit being dropped in arm_smmu_write_strtab_ent().
> - Drop double semicolons in arm_smmu_init_one_queue().
> - Drop unneeded return in arm_smmu_uninit_l2_strtab().
> - Issue CMDQ_OP_TLBI_S12_VMALL on cell_exit().
> - Add a warning when VMID16 is not supported and the cell id is greater
>   than 255.
> Other fixes:
> - Fix re-using i as the counter in arm_smmuv3_cell_exit(). Use j
>   instead.
> 
> v2:
> - Split the driver into two parts
> 
>  hypervisor/arch/arm64/Kbuild         |    2 +-
>  hypervisor/arch/arm64/smmu-v3.c      | 1183 ++++++++++++++++++++++++++
>  hypervisor/include/jailhouse/entry.h |    1 +
>  3 files changed, 1185 insertions(+), 1 deletion(-)
>  create mode 100644 hypervisor/arch/arm64/smmu-v3.c
> 
> diff --git a/hypervisor/arch/arm64/Kbuild b/hypervisor/arch/arm64/Kbuild
> index 7283a008..323b78b6 100644
> --- a/hypervisor/arch/arm64/Kbuild
> +++ b/hypervisor/arch/arm64/Kbuild
> @@ -20,4 +20,4 @@ always := lib.a
>  # irqchip (common-objs-y), <generic units>
>  
>  lib-y := $(common-objs-y)
> -lib-y += entry.o setup.o control.o mmio.o paging.o caches.o traps.o
> +lib-y += entry.o setup.o control.o mmio.o paging.o caches.o traps.o smmu-v3.o
> diff --git a/hypervisor/arch/arm64/smmu-v3.c b/hypervisor/arch/arm64/smmu-v3.c
> new file mode 100644
> index 00000000..cde384e7
> --- /dev/null
> +++ b/hypervisor/arch/arm64/smmu-v3.c
> @@ -0,0 +1,1183 @@
> +/*
> + * Jailhouse AArch64 support
> + *
> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com/
> + *
> + * Authors:
> + *  Lokesh Vutla <[email protected]>
> + *  Pratyush Yadav <[email protected]>
> + *
> + * An emulated SMMU is presented to inmates by trapping access to MMIO
> + * registers to enable stage 1 translations. Accesses to the SMMU memory mapped
> + * registers are trapped and then routed to the emulated SMMU. This is not
> + * emulation in the sense that we fully emulate the device top to bottom. The
> + * emulation is used to provide an interface to the SMMU that the hypervisor
> + * can control to make sure the inmates are not doing anything they should not.
> + * The actual translations are done by hardware.
> + *
> + * Emulation is needed because both stage 1 and stage 2 parameters are
> + * configured in a single data structure, the stream table entry. For this
> + * reason, the inmates can't be allowed to directly control the stream table
> + * entries, and by extension, the stream table.
> + *
> + * The guest cells are assigned stream IDs in their configs and only those
> + * assigned stream IDs can be used by the cells. There is no checking in place
> + * to make sure two cells do not use the same stream IDs. This must be taken
> + * care of when creating the cell configs.
> + *
> + * This driver is implemented based on the following assumptions:
> + * - Running on a Little endian 64 bit core compatible with ARM v8 architecture.
> + * - SMMU supporting only AARCH64 mode.
> + * - SMMU AARCH 64 stage 2 translation configurations are compatible with ARMv8
> + *   VMSA. So re-using the translation tables of CPU for SMMU.
> + *
> + * This driver is loosely based on the Linux kernel SMMU v3 driver.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include <jailhouse/control.h>
> +#include <jailhouse/paging.h>
> +#include <jailhouse/printk.h>
> +#include <jailhouse/string.h>
> +#include <asm/control.h>
> +#include <jailhouse/unit.h>
> +#include <asm/iommu.h>
> +#include <jailhouse/cell.h>
> +#include <jailhouse/mmio.h>
> +
> +/* Offset of addr from start of the page. */
> +#define PAGE_OFFSET(addr)            ((addr) & PAGE_OFFS_MASK)
> +
> +#define LOWER_32_BITS(n)             ((u32)(n))
> +#define UPPER_32_BITS(n)             ((n) >> 32)
> +
> +/* MMIO registers */
> +#define ARM_SMMU_IDR0                        0x0
> +#define IDR0_ST_LVL                  BIT_MASK(28, 27)
> +#define IDR0_TTENDIAN                        BIT_MASK(22, 21)
> +#define IDR0_VATOS                   (1 << 20)
> +#define IDR0_VMID16                  (1 << 18)
> +#define IDR0_PRI                     (1 << 16)
> +#define IDR0_ATOS                    (1 << 15)
> +#define IDR0_MSI                     (1 << 13)
> +#define IDR0_ASID16                  (1 << 12)
> +#define IDR0_NS1ATS                  (1 << 11)
> +#define IDR0_ATS                     (1 << 10)
> +#define IDR0_S2P                     (1 << 0)
> +#define IDR0_S1P                     (1 << 1)
> +#define IDR0_HTTU                    BIT_MASK(7, 6)
> +#define IDR0_COHACC                  (1 << 4)
> +#define IDR0_TTF                     BIT_MASK(3, 2)
> +
> +#define IDR0_TTF_AARCH64             2
> +#define IDR0_TTENDIAN_LE             2
> +#define IDR0_ST_LVL_2LVL             1
> +
> +#define ARM_SMMU_VMID8_MAX_VMID              255
> +
> +#define ARM_SMMU_IDR1                        0x4
> +#define IDR1_TABLES_PRESET           (1 << 30)
> +#define IDR1_QUEUES_PRESET           (1 << 29)
> +#define IDR1_REL                     (1 << 28)
> +#define IDR1_CMDQS                   BIT_MASK(25, 21)
> +#define IDR1_EVTQS                   BIT_MASK(20, 16)
> +#define IDR1_SSIDSIZE                        BIT_MASK(10, 6)
> +#define IDR1_SIDSIZE                 BIT_MASK(5, 0)
> +
> +#define ARM_SMMU_IDR2                        0x8
> +#define ARM_SMMU_IDR3                        0xC
> +#define ARM_SMMU_IDR4                        0x10
> +#define ARM_SMMU_IDR5                        0x14
> +
> +#define ARM_SMMU_CR0                 0x20
> +#define CR0_CMDQEN                   (1 << 3)
> +#define CR0_EVTQEN                   (1 << 2)
> +#define CR0_SMMUEN                   (1 << 0)
> +
> +#define ARM_SMMU_CR0ACK                      0x24
> +
> +#define ARM_SMMU_CR1                 0x28
> +#define CR1_TABLE_SH                 BIT_MASK(11, 10)
> +#define CR1_TABLE_OC                 BIT_MASK(9, 8)
> +#define CR1_TABLE_IC                 BIT_MASK(7, 6)
> +#define CR1_QUEUE_SH                 BIT_MASK(5, 4)
> +#define CR1_QUEUE_OC                 BIT_MASK(3, 2)
> +#define CR1_QUEUE_IC                 BIT_MASK(1, 0)
> +/* CR1 cacheability fields don't quite follow the usual TCR-style encoding */
> +#define CR1_CACHE_NC                 0
> +#define CR1_CACHE_WB                 1
> +#define CR1_CACHE_WT                 2
> +
> +#define ARM_SMMU_CR2                 0x2c
> +#define CR2_PTM                              (1 << 2)
> +#define CR2_RECINVSID                        (1 << 1)
> +#define CR2_E2H                              (1 << 0)
> +
> +#define ARM_SMMU_STRTAB_BASE         0x80
> +#define STRTAB_BASE_RA                       (1UL << 62)
> +#define STRTAB_BASE_ADDR_MASK                BIT_MASK(51, 6)
> +
> +#define ARM_SMMU_STRTAB_BASE_CFG     0x88
> +#define STRTAB_BASE_CFG_FMT          BIT_MASK(17, 16)
> +#define STRTAB_BASE_CFG_FMT_LINEAR   0
> +#define STRTAB_BASE_CFG_FMT_2LVL     1
> +#define STRTAB_BASE_CFG_SPLIT                BIT_MASK(10, 6)
> +#define STRTAB_BASE_CFG_LOG2SIZE     BIT_MASK(5, 0)
> +
> +#define ARM_SMMU_CMDQ_BASE           0x90
> +#define ARM_SMMU_CMDQ_PROD           0x98
> +#define ARM_SMMU_CMDQ_CONS           0x9c
> +
> +#define ARM_SMMU_EVTQ_BASE           0xa0
> +#define ARM_SMMU_EVTQ_PROD           0x100a8
> +#define ARM_SMMU_EVTQ_CONS           0x100ac
> +#define ARM_SMMU_EVTQ_IRQ_CFG0               0xb0
> +#define ARM_SMMU_EVTQ_IRQ_CFG1               0xb8
> +#define ARM_SMMU_EVTQ_IRQ_CFG2               0xbc
> +
> +#define ARM_SMMU_GERROR                      0x60
> +#define GERROR_CMDQ_ERR                      (1 << 0)
> +#define GERROR_EVTQ_ABT_ERR          (1 << 2)
> +
> +#define ARM_SMMU_GERRORN             0x64
> +#define ARM_SMMU_IRQ_CTRL            0x50
> +#define ARM_SMMU_IRQ_CTRLACK         0x54
> +#define ARM_SMMU_GERROR_IRQ_CFG0     0x68
> +#define ARM_SMMU_EVTQ_IRQ_CFG0               0xb0
> +
> +/* Common memory attribute values */
> +#define ARM_SMMU_SH_NSH                      0
> +#define ARM_SMMU_SH_OSH                      2
> +#define ARM_SMMU_SH_ISH                      3
> +#define ARM_SMMU_MEMATTR_DEVICE_nGnRE        0x1
> +#define ARM_SMMU_MEMATTR_OIWB                0xf
> +
> +#define Q_IDX(reg, shift)            ((reg) & ((1 << (shift)) - 1))
> +#define Q_WRP(reg, shift)            ((reg) & (1 << (shift)))
> +#define Q_OVERFLOW_FLAG                      (1 << 31)
> +#define Q_OVF(reg)                   ((reg) & Q_OVERFLOW_FLAG)
> +#define Q_EMPTY(prod, cons, shift)   \
> +                     (Q_IDX((prod), (shift)) == Q_IDX((cons), (shift)) && \
> +                      Q_WRP((prod), (shift)) == Q_WRP((cons), (shift)))
> +#define Q_FULL(prod, cons, shift)    \
> +                     (Q_IDX((prod), (shift)) == Q_IDX((cons), (shift)) && \
> +                      Q_WRP((prod), (shift)) != Q_WRP((cons), (shift)))
> +
> +#define Q_BASE_RWA                   (1UL << 62)
> +#define Q_BASE_ADDR_MASK             BIT_MASK(51, 5)
> +#define Q_BASE_LOG2SIZE                      BIT_MASK(4, 0)
> +
> +/*
> + * Stream table.
> + *
> + * Linear: Enough to cover 1 << IDR1.SIDSIZE entries
> + * 2lvl: 128k L1 entries,
> + *       256 lazy entries per table (each table covers a PCI bus)
> + */
> +#define STRTAB_L1_SZ_SHIFT           20
> +#define STRTAB_SPLIT                 8
> +
> +#define STRTAB_L1_DESC_DWORDS                1
> +#define STRTAB_L1_DESC_SIZE          (STRTAB_L1_DESC_DWORDS << 3)
> +#define STRTAB_L1_DESC_SPAN          BIT_MASK(4, 0)
> +#define STRTAB_L1_DESC_L2PTR_MASK    BIT_MASK(51, 6)
> +
> +#define STRTAB_STE_DWORDS_BITS               3
> +#define STRTAB_STE_DWORDS            (1 << STRTAB_STE_DWORDS_BITS)
> +#define STRTAB_STE_SIZE                      (STRTAB_STE_DWORDS << 3)
> +#define STRTAB_STE_0_V                       (1UL << 0)
> +#define STRTAB_STE_0_CFG             BIT_MASK(3, 1)
> +#define STRTAB_STE_0_CFG_ABORT               0
> +#define STRTAB_STE_0_CFG_BYPASS              4
> +#define STRTAB_STE_0_CFG_S1_TRANS    5
> +#define STRTAB_STE_0_CFG_S2_TRANS    6
> +#define STRTAB_STE_0_S1CTXPTR                BIT_MASK(51, 6)
> +#define STRTAB_STE_0_S1CDMAX         BIT_MASK(63, 59)
> +#define STRTAB_STE_1_S1DSS           BIT_MASK(1, 0)
> +#define STRTAB_STE_1_S1CIR           BIT_MASK(3, 2)
> +#define STRTAB_STE_1_S1COR           BIT_MASK(5, 4)
> +#define STRTAB_STE_1_S1CSH           BIT_MASK(7, 6)
> +#define STRTAB_STE_1_S1STALLD                (1UL << 27)
> +#define STRTAB_CTXDESC_DWORDS                8
> +#define STRTAB_CTXDESC_S1CTXPTR_SHIFT        6
> +
> +#define STRTAB_STE_1_SHCFG           BIT_MASK(45, 44)
> +#define STRTAB_STE_1_SHCFG_INCOMING  1UL
> +
> +#define STRTAB_STE_2_S2VMID          BIT_MASK(15, 0)
> +#define STRTAB_STE_2_VTCR            BIT_MASK(50, 32)
> +#define STRTAB_STE_2_S2AA64          (1UL << 51)
> +#define STRTAB_STE_2_S2ENDI          (1UL << 52)
> +#define STRTAB_STE_2_S2PTW           (1UL << 54)
> +#define STRTAB_STE_2_S2R             (1UL << 58)
> +
> +#define STRTAB_STE_3_S2TTB_MASK              BIT_MASK(51, 4)
> +
> +#define CTXDESC_1_TTB0                       BIT_MASK(51, 4)
> +#define CTXDESC_2_TTB1                       BIT_MASK(51, 4)
> +#define CTXDESC_TTB0_SHIFT           4
> +#define CTXDESC_TTB1_SHIFT           4
> +
> +/* Command queue */
> +#define CMDQ_ENT_DWORDS                      2
> +#define CMDQ_ENT_SIZE                        (CMDQ_ENT_DWORDS << 3)
> +#define CMDQ_MAX_SZ_SHIFT            8
> +
> +#define CMDQ_CONS_ERR                        BIT_MASK(30, 24)
> +#define CMDQ_ERR_CERROR_NONE_IDX     0
> +#define CMDQ_ERR_CERROR_ILL_IDX              1
> +#define CMDQ_ERR_CERROR_ABT_IDX              2
> +
> +#define CMDQ_0_OP                    BIT_MASK(7, 0)
> +#define CMDQ_0_SSV                   (1UL << 11)
> +
> +#define CMDQ_PREFETCH_0_SSID         BIT_MASK(31, 12)
> +#define CMDQ_PREFETCH_0_SID          BIT_MASK(63, 32)
> +#define CMDQ_PREFETCH_1_SIZE         BIT_MASK(4, 0)
> +#define CMDQ_PREFETCH_1_ADDR_MASK    BIT_MASK(63, 12)
> +
> +#define CMDQ_CFGI_0_SID                      BIT_MASK(63, 32)
> +#define CMDQ_CFGI_1_LEAF             (1UL << 0)
> +#define CMDQ_CFGI_1_RANGE            BIT_MASK(4, 0)
> +
> +#define CMDQ_TLBI_0_VMID             BIT_MASK(47, 32)
> +#define CMDQ_TLBI_0_ASID             BIT_MASK(63, 48)
> +#define CMDQ_TLBI_1_LEAF             (1UL << 0)
> +#define CMDQ_TLBI_1_VA_MASK          BIT_MASK(63, 12)
> +#define CMDQ_TLBI_1_IPA_MASK         BIT_MASK(51, 12)
> +
> +#define CMDQ_PRI_0_SSID                      BIT_MASK(31, 12)
> +#define CMDQ_PRI_0_SID                       BIT_MASK(63, 32)
> +#define CMDQ_PRI_1_GRPID             BIT_MASK(8, 0)
> +#define CMDQ_PRI_1_RESP                      BIT_MASK(13, 12)
> +
> +#define CMDQ_SYNC_0_CS                       BIT_MASK(13, 12)
> +#define CMDQ_SYNC_0_CS_NONE          0
> +#define CMDQ_SYNC_0_CS_IRQ           1
> +#define CMDQ_SYNC_0_CS_SEV           2
> +#define CMDQ_SYNC_0_MSH                      BIT_MASK(23, 22)
> +#define CMDQ_SYNC_0_MSIATTR          BIT_MASK(27, 24)
> +#define CMDQ_SYNC_0_MSIDATA          BIT_MASK(63, 32)
> +#define CMDQ_SYNC_1_MSIADDR_MASK     BIT_MASK(51, 2)
> +
> +/* Event queue */
> +#define EVTQ_ENT_DWORDS                      4
> +#define EVTQ_MAX_SZ_SHIFT            7
> +
> +#define EVTQ_0_ID                    BIT_MASK(7, 0)
> +
> +#define ARM_SMMU_SYNC_TIMEOUT                1000000
> +
> +#define FIELD_PREP(mask, val)        \
> +                     (((u64)(val) << (__builtin_ffsl((mask)) - 1)) & (mask))
> +#define FIELD_GET(mask, reg) \
> +                     (((reg) & (mask)) >> (__builtin_ffsl((mask)) - 1))
> +#define FIELD_CLEAR(mask, reg)       \
> +                     ((reg) & (~(mask)))
> +
> +#define CMDQ_OP_PREFETCH_CFG 0x1
> +#define CMDQ_OP_PREFETCH_ADDR        0x2
> +#define CMDQ_OP_CFGI_STE     0x3
> +#define CMDQ_OP_CFGI_ALL     0x4
> +#define CMDQ_OP_TLBI_NH_ASID 0x11
> +#define CMDQ_OP_TLBI_NH_VA   0x12
> +#define CMDQ_OP_TLBI_EL2_ALL 0x20
> +#define CMDQ_OP_TLBI_S12_VMALL       0x28
> +#define CMDQ_OP_TLBI_S2_IPA  0x2a
> +#define CMDQ_OP_TLBI_NSNH_ALL        0x30
> +#define CMDQ_OP_CMD_SYNC     0x46
> +#define ARM_SMMU_FEAT_2_LVL_STRTAB   (1 << 0)
> +
> +/* High-level queue structures */
> +struct arm_smmu_cmdq_ent {
> +     /* Common fields */
> +     u8                              opcode;
> +     bool                            substream_valid;
> +
> +     /* Command-specific fields */
> +     union {
> +             struct {
> +                     u32                     sid;
> +                     u8                      size;
> +                     u64                     addr;
> +             } prefetch;
> +
> +             struct {
> +                     u32                     sid;
> +                     union {
> +                             bool            leaf;
> +                             u8              span;
> +                     };
> +             } cfgi;
> +
> +             struct {
> +                     u16                     asid;
> +                     u16                     vmid;
> +                     bool                    leaf;
> +                     u64                     addr;
> +             } tlbi;
> +
> +             struct {
> +                     u32                     msidata;
> +                     u64                     msiaddr;
> +             } sync;
> +     };
> +};
> +
> +struct arm_smmu_queue {
> +     u64     *base;
> +     u64     base_dma;
> +     u64     q_base;
> +     u64     ent_dwords;
> +     u32     max_n_shift;
> +     u32     prod;
> +     u32     cons;
> +     u32     *prod_reg;
> +     u32     *cons_reg;
> +     u32     gerr_mask;
> +};
> +
> +struct arm_smmu_cmdq {
> +     struct arm_smmu_queue           q;
> +     spinlock_t                      lock;
> +};
> +
> +struct arm_smmu_evtq {
> +     struct arm_smmu_queue           q;
> +};
> +
> +/* High-level stream table structures */
> +struct arm_smmu_strtab_l1_desc {
> +     u8      span;
> +     __u64   *l2ptr;

The type-usage policy in Jailhouse is that we only have the __types for
guest-facing APIs, but we stick with the underscore-free types for
hypervisor-internal usage.
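Concretely, for the descriptor above that would be (and the same for the other
__u64 uses throughout this file):

```diff
-	__u64	*l2ptr;
+	u64	*l2ptr;
```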

> +     u64     l2ptr_dma;
> +     u32     active_stes;
> +};
> +
> +struct arm_smmu_strtab_cfg {
> +     __u64                           *strtab;
> +     u64                             strtab_dma;
> +     struct arm_smmu_strtab_l1_desc  *l1_desc;
> +     unsigned int                    num_l1_ents;
> +     u64                             strtab_base;
> +     u32                             strtab_base_cfg;
> +};
> +
> +/* An SMMUv3 instance */
> +struct arm_smmu_device {
> +     void                            *base;
> +     u32                             features;
> +     struct arm_smmu_cmdq            cmdq;
> +     struct arm_smmu_evtq            evtq;
> +     unsigned int                    sid_bits;
> +     struct arm_smmu_strtab_cfg      strtab_cfg;
> +} smmu[JAILHOUSE_MAX_IOMMU_UNITS];
> +
> +/* Low-level queue manipulation functions */
> +static bool queue_full(struct arm_smmu_queue *q)
> +{
> +     u32 shift = q->max_n_shift;
> +
> +     return Q_FULL(q->prod, q->cons, shift);
> +}
> +
> +static bool queue_empty(struct arm_smmu_queue *q)
> +{
> +     u32 shift = q->max_n_shift;
> +
> +     return Q_EMPTY(q->prod, q->cons, shift);
> +}
> +
> +static void queue_sync_cons(struct arm_smmu_queue *q)
> +{
> +     q->cons = mmio_read32(q->cons_reg);
> +}
> +
> +static bool queue_error(struct arm_smmu_device *smmu, struct arm_smmu_queue *q)
> +{
> +     u32 gerror, gerrorn;
> +
> +     gerror = mmio_read32(smmu->base + ARM_SMMU_GERROR);
> +     gerrorn = mmio_read32(smmu->base + ARM_SMMU_GERRORN);
> +
> +     return (gerror ^ gerrorn) & q->gerr_mask;
> +}
> +
> +static void queue_inc_prod(struct arm_smmu_queue *q)
> +{
> +     u32 shift = q->max_n_shift;
> +     u32 prod = (Q_WRP(q->prod, shift) | Q_IDX(q->prod, shift)) + 1;
> +
> +     q->prod = Q_OVF(q->prod) | Q_WRP(prod, shift) | Q_IDX(prod, shift);
> +     mmio_write32(q->prod_reg, q->prod);
> +}
> +
> +static void queue_write(__u64 *dst, __u64 *src, u32 n_dwords)
> +{
> +     int i;
> +
> +     for (i = 0; i < n_dwords; ++i)
> +             *dst++ = *src++;
> +     dsb(ishst);
> +}
> +
> +static __u64 *queue_entry(struct arm_smmu_queue *q, u32 reg)
> +{
> +     return q->base + (Q_IDX(reg, q->max_n_shift) * q->ent_dwords);
> +}
> +
> +/* High-level queue accessors */
> +static int arm_smmu_cmdq_build_cmd(__u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> +{
> +     memset(cmd, 0, CMDQ_ENT_SIZE);
> +     cmd[0] |= FIELD_PREP(CMDQ_0_OP, ent->opcode);
> +
> +     switch (ent->opcode) {
> +     case CMDQ_OP_TLBI_EL2_ALL:
> +     case CMDQ_OP_TLBI_NSNH_ALL:
> +             break;
> +     case CMDQ_OP_PREFETCH_ADDR:
> +             cmd[1] |= FIELD_PREP(CMDQ_PREFETCH_1_SIZE, ent->prefetch.size);
> +             cmd[1] |= ent->prefetch.addr & CMDQ_PREFETCH_1_ADDR_MASK;
> +             /* Fallthrough */
> +     case CMDQ_OP_PREFETCH_CFG:
> +             cmd[0] |= FIELD_PREP(CMDQ_PREFETCH_0_SID, ent->prefetch.sid);
> +             break;
> +     case CMDQ_OP_CFGI_STE:
> +             cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SID, ent->cfgi.sid);
> +             cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_LEAF, ent->cfgi.leaf);
> +             break;
> +     case CMDQ_OP_CFGI_ALL:
> +             /* Cover the entire SID range */
> +             cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_RANGE, 31);
> +             break;
> +     case CMDQ_OP_TLBI_NH_VA:
> +             cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid);
> +             cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
> +             cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
> +             cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK;
> +             break;
> +     case CMDQ_OP_TLBI_S2_IPA:
> +             cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
> +             cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
> +             cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_IPA_MASK;
> +             break;
> +     case CMDQ_OP_TLBI_NH_ASID:
> +             cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid);
> +             /* Fallthrough */
> +     case CMDQ_OP_TLBI_S12_VMALL:
> +             cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
> +             break;
> +     case CMDQ_OP_CMD_SYNC:
> +             if (ent->sync.msiaddr)
> +                     cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ);
> +             else
> +                     cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
> +             cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH) |
> +                       FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB) |
> +                       FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
> +             cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
> +             break;
> +     default:
> +             return -ENOENT;
> +     }
> +
> +     return 0;
> +}
> +
> +static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
> +{
> +     struct arm_smmu_queue *q;
> +     u64 cmd[CMDQ_ENT_DWORDS];
> +     u32 gerrorn;
> +     struct arm_smmu_cmdq_ent cmd_sync = {
> +             .opcode = CMDQ_OP_CMD_SYNC,
> +     };
> +
> +     q = &smmu->cmdq.q;
> +
> +     printk("WARN: Command queue error 0x%x detected. Skipping command.\n",
> +            (u32)FIELD_GET(CMDQ_CONS_ERR, q->cons));

Keep in mind that report-only but guest-triggerable errors can flood the
console. The cleaner alternative might be panicking the guest, which we do not
support yet.

> +     /*
> +      * Convert the faulty command to sync and clear the error so
> +      * command consumption can continue.
> +      */
> +     arm_smmu_cmdq_build_cmd(cmd, &cmd_sync);
> +     queue_write(queue_entry(q, q->cons), cmd, q->ent_dwords);
> +
> +     gerrorn = mmio_read32(smmu->base + ARM_SMMU_GERRORN);
> +
> +     gerrorn ^= GERROR_CMDQ_ERR;
> +     mmio_write32(smmu->base + ARM_SMMU_GERRORN, gerrorn);
> +}
> +
> +static void arm_smmu_cmdq_insert_cmd(struct arm_smmu_device *smmu, __u64 *cmd)
> +{
> +     struct arm_smmu_queue *q = &smmu->cmdq.q;
> +
> +     while (queue_full(q)) {
> +             queue_sync_cons(q);
> +     }

Single-line block, no braces needed.
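That is, simply:

```diff
-	while (queue_full(q)) {
-		queue_sync_cons(q);
-	}
+	while (queue_full(q))
+		queue_sync_cons(q);
```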

> +
> +     queue_write(queue_entry(q, q->prod), cmd, q->ent_dwords);
> +     queue_inc_prod(q);
> +     while (!queue_empty(q)) {
> +             queue_sync_cons(q);
> +
> +             if (queue_error(smmu, q)) {
> +                     arm_smmu_cmdq_skip_err(smmu);
> +             }

Same here.

> +     }
> +}
> +
> +static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
> +                                 struct arm_smmu_cmdq_ent *ent)
> +{
> +     u64 cmd[CMDQ_ENT_DWORDS];
> +
> +     if (arm_smmu_cmdq_build_cmd(cmd, ent)) {
> +             printk("WARN: SMMU ignoring unknown CMDQ opcode 0x%x\n",
> +                      ent->opcode);
> +             return;

And again.

> +     }
> +
> +     spin_lock(&smmu->cmdq.lock);
> +     arm_smmu_cmdq_insert_cmd(smmu, cmd);
> +     spin_unlock(&smmu->cmdq.lock);
> +}
> +
> +static void arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
> +{
> +     struct arm_smmu_cmdq_ent ent = { .opcode = CMDQ_OP_CMD_SYNC };
> +     u64 cmd[CMDQ_ENT_DWORDS];
> +
> +     arm_smmu_cmdq_build_cmd(cmd, &ent);
> +
> +     spin_lock(&smmu->cmdq.lock);
> +     arm_smmu_cmdq_insert_cmd(smmu, cmd);
> +     spin_unlock(&smmu->cmdq.lock);
> +}
> +
> +/* Stream table manipulation functions */
> +static void
> +arm_smmu_write_strtab_l1_desc(__u64 *dst, struct arm_smmu_strtab_l1_desc *desc)
> +{
> +     u64 val = 0;
> +
> +     val |= FIELD_PREP(STRTAB_L1_DESC_SPAN, desc->span);
> +     val |= desc->l2ptr_dma & STRTAB_L1_DESC_L2PTR_MASK;
> +
> +     /* Assuming running on Little endian cpu */
> +     *dst = val;
> +     dsb(ishst);
> +}
> +
> +static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
> +{
> +     struct arm_smmu_cmdq_ent cmd = {
> +             .opcode = CMDQ_OP_CFGI_STE,
> +             .cfgi   = {
> +                     .sid    = sid,
> +                     .leaf   = true,
> +             },
> +     };
> +
> +     arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +     arm_smmu_cmdq_issue_sync(smmu);
> +}
> +
> +static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
> +                                   __u64 *guest_ste, __u64 *dst,
> +                                   bool bypass, u32 vmid)
> +{
> +     struct paging_structures *pg_structs = &this_cell()->arch.mm;
> +     u64 val, vttbr;
> +
> +     val = 0;
> +
> +     /* Bypass */
> +     if (bypass) {
> +             val = STRTAB_STE_0_V;
> +             val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
> +             dst[1] = FIELD_PREP(STRTAB_STE_1_SHCFG,
> +                                 STRTAB_STE_1_SHCFG_INCOMING);
> +             dst[2] = FIELD_PREP(STRTAB_STE_2_S2VMID, vmid);
> +             dst[0] = val;
> +             dsb(ishst);
> +             if (smmu) {
> +                     arm_smmu_sync_ste_for_sid(smmu, sid);
> +             }

Single-line conditional block.

> +             return;
> +     }
> +
> +     if (!(smmu->features & IDR0_VMID16) && vmid > ARM_SMMU_VMID8_MAX_VMID) {
> +             printk("ERROR: 16 bit VMID not supported\n");
> +             return;

Should this failure be warn-only, or should it rather fail loudly, i.e. with
upward reporting of an error code? As far as I can see, it should only hit us
during reconfigurations, so it could be reported to the requester without
panicking.

> +     }
> +
> +     dst[2] = FIELD_PREP(STRTAB_STE_2_S2VMID, vmid) |
> +              FIELD_PREP(STRTAB_STE_2_VTCR, VTCR_CELL) |
> +              STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
> +              STRTAB_STE_2_S2R;
> +
> +     vttbr = paging_hvirt2phys(pg_structs->root_table);
> +     dst[3] = vttbr & STRTAB_STE_3_S2TTB_MASK;
> +
> +     val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
> +     val |= STRTAB_STE_0_V;
> +
> +     arm_smmu_sync_ste_for_sid(smmu, sid);
> +     dst[0] = val;
> +     dsb(ishst);
> +     arm_smmu_sync_ste_for_sid(smmu, sid);
> +}
> +
> +static void arm_smmu_init_bypass_stes(u64 *strtab, unsigned int nent)
> +{
> +     unsigned int i;
> +
> +     for (i = 0; i < nent; ++i) {
> +             arm_smmu_write_strtab_ent(NULL, -1, NULL, strtab, true,
> +                                       (u32)this_cell()->config->id);
> +             strtab += STRTAB_STE_DWORDS;
> +     }
> +}
> +
> +static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
> +{
> +     void *strtab;
> +     u64 reg;
> +     u32 size;
> +     struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +
> +     size = (1 << smmu->sid_bits) * STRTAB_STE_SIZE;
> +     strtab = page_alloc_aligned(&mem_pool, PAGES(size));
> +     if (!strtab) {
> +             printk("ERROR: SMMU failed to allocate l1 stream table (%u bytes)\n",
> +                    size);

We typically do not print ENOMEM errors.
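So this could just be:

```diff
-	if (!strtab) {
-		printk("ERROR: SMMU failed to allocate l1 stream table (%u bytes)\n",
-		       size);
-		return -ENOMEM;
-	}
+	if (!strtab)
+		return -ENOMEM;
```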

> +             return -ENOMEM;
> +     }
> +     cfg->strtab_dma = paging_hvirt2phys(strtab);
> +     cfg->strtab = strtab;
> +     cfg->num_l1_ents = 1 << smmu->sid_bits;
> +
> +     /* Configure strtab_base_cfg for a linear table covering all SIDs */
> +     reg  = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_LINEAR);
> +     reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
> +     cfg->strtab_base_cfg = reg;
> +
> +     arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
> +     return 0;
> +}
> +
> +static int arm_smmu_init_l1_strtab(struct arm_smmu_device *smmu)
> +{
> +     struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +     u32 size = sizeof(*cfg->l1_desc) * cfg->num_l1_ents;
> +     void *strtab = smmu->strtab_cfg.strtab;
> +     unsigned int i;
> +
> +     cfg->l1_desc = page_alloc(&mem_pool, PAGES(size));
> +     if (!cfg->l1_desc) {
> +             printk("ERROR: SMMU failed to allocate l1 stream table desc\n");

Same as above.

> +             return -ENOMEM;
> +     }
> +
> +     for (i = 0; i < cfg->num_l1_ents; ++i) {
> +             memset(&cfg->l1_desc[i], 0, sizeof(*cfg->l1_desc));
> +             arm_smmu_write_strtab_l1_desc(strtab, &cfg->l1_desc[i]);
> +             strtab += STRTAB_L1_DESC_SIZE;
> +     }
> +
> +     return 0;
> +}
> +
> +static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
> +{
> +     struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +     u32 size, l1size;
> +     void *strtab;
> +     u64 reg;
> +     int ret;
> +
> +     /* Calculate the L1 size, capped to the SIDSIZE. */
> +     size = STRTAB_L1_SZ_SHIFT - 3;
> +     size = MIN(size, smmu->sid_bits - STRTAB_SPLIT);
> +     cfg->num_l1_ents = 1 << size;
> +
> +     size += STRTAB_SPLIT;
> +     if (size < smmu->sid_bits)
> +             printk("WARN: SMMU 2-level strtab only covers %u/%u bits of SID\n",
> +                    size, smmu->sid_bits);

What does that mean for the user? Or the guest?

> +
> +     l1size = cfg->num_l1_ents * STRTAB_L1_DESC_SIZE;
> +     strtab = page_alloc_aligned(&mem_pool, PAGES(l1size));
> +     if (!strtab) {
> +             printk("ERROR: SMMU failed to allocate l1 stream table (%u bytes)\n",
> +                    l1size);
> +             return -ENOMEM;
> +     }
> +     cfg->strtab_dma = paging_hvirt2phys(strtab);
> +     cfg->strtab = strtab;
> +
> +     /* Configure strtab_base_cfg for 2 levels */
> +     reg  = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_2LVL);
> +     reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, size);
> +     reg |= FIELD_PREP(STRTAB_BASE_CFG_SPLIT, STRTAB_SPLIT);
> +     cfg->strtab_base_cfg = reg;
> +
> +     ret = arm_smmu_init_l1_strtab(smmu);
> +
> +     if (ret) {
> +             page_free(&mem_pool, strtab, PAGES(l1size));
> +             return ret;
> +     }
> +
> +     return 0;
> +}
> +
> +static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
> +{
> +     u64 reg;
> +     int ret;
> +
> +     if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
> +             ret = arm_smmu_init_strtab_2lvl(smmu);
> +     else
> +             ret = arm_smmu_init_strtab_linear(smmu);
> +
> +     if (ret)
> +             return ret;
> +
> +     /* Set the strtab base address */
> +     reg  = smmu->strtab_cfg.strtab_dma & STRTAB_BASE_ADDR_MASK;
> +     reg |= STRTAB_BASE_RA;
> +     smmu->strtab_cfg.strtab_base = reg;
> +
> +     return 0;
> +}
> +
> +static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
> +                                struct arm_smmu_queue *q,
> +                                unsigned long prod_off,
> +                                unsigned long cons_off,
> +                                unsigned long dwords,
> +                                unsigned int gerr_mask)
> +{
> +     /* Queue size is capped to 4K. So allocate 1 page */
> +     q->base = page_alloc(&mem_pool, 1);
> +     if (!q->base) {
> +             printk("ERROR: SMMU failed to allocate queue\n");
> +             return -ENOMEM;
> +     }
> +     q->base_dma = paging_hvirt2phys(q->base);
> +
> +     q->prod_reg     = smmu->base + prod_off;
> +     q->cons_reg     = smmu->base + cons_off;
> +     q->ent_dwords   = dwords;
> +
> +     q->q_base  = Q_BASE_RWA;
> +     q->q_base |= q->base_dma & Q_BASE_ADDR_MASK;
> +     q->q_base |= FIELD_PREP(Q_BASE_LOG2SIZE, q->max_n_shift);
> +
> +     q->gerr_mask = gerr_mask;
> +
> +     q->prod = q->cons = 0;
> +     return 0;
> +}
> +
> +static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
> +{
> +     int ret;
> +
> +     /* cmdq */
> +     ret = arm_smmu_init_one_queue(smmu, &smmu->cmdq.q, ARM_SMMU_CMDQ_PROD,
> +                                   ARM_SMMU_CMDQ_CONS, CMDQ_ENT_DWORDS,
> +                                   GERROR_CMDQ_ERR);
> +     if (ret)
> +             return ret;
> +
> +     /* evtq */
> +     ret = arm_smmu_init_one_queue(smmu, &smmu->evtq.q, ARM_SMMU_EVTQ_PROD,
> +                                   ARM_SMMU_EVTQ_CONS, EVTQ_ENT_DWORDS,
> +                                   GERROR_EVTQ_ABT_ERR);
> +     if (ret)
> +             return ret;
> +
> +     return ret;
> +}
> +
> +static int arm_smmu_init_structures(struct arm_smmu_device *smmu)
> +{
> +     int ret;
> +
> +     ret = arm_smmu_init_queues(smmu);
> +     if (ret)
> +             return ret;
> +
> +     return arm_smmu_init_strtab(smmu);
> +}
> +
> +static int arm_smmu_write_reg_sync(struct arm_smmu_device *smmu, u32 val,
> +                                unsigned int reg_off, unsigned int ack_off)
> +{
> +     u32 i, timeout = ARM_SMMU_SYNC_TIMEOUT;
> +
> +     mmio_write32(smmu->base + reg_off, val);
> +     for (i = 0; i < timeout; i++) {
> +             if (mmio_read32(smmu->base + ack_off) == val)
> +                     return 0;
> +     }
> +
> +     return -EINVAL;
> +}
> +
> +static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
> +{
> +     int ret;
> +
> +     ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
> +     if (ret)
> +             printk("ERROR: SMMU failed to clear cr0\n");
> +
> +     return ret;
> +}
> +
> +static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
> +{
> +     int ret;
> +     u32 reg, enables;
> +     struct arm_smmu_cmdq_ent cmd;
> +
> +     /* Clear CR0 and sync (disables SMMU and queue processing) */
> +     reg = mmio_read32(smmu->base + ARM_SMMU_CR0);
> +     if (reg & CR0_SMMUEN)
> +             printk("ERROR: SMMU currently enabled! Resetting...\n");
> +
> +     ret = arm_smmu_device_disable(smmu);
> +     if (ret)
> +             return ret;
> +
> +     /* CR1 (table and queue memory attributes) */
> +     reg = FIELD_PREP(CR1_TABLE_SH, ARM_SMMU_SH_ISH) |
> +           FIELD_PREP(CR1_TABLE_OC, CR1_CACHE_WB) |
> +           FIELD_PREP(CR1_TABLE_IC, CR1_CACHE_WB) |
> +           FIELD_PREP(CR1_QUEUE_SH, ARM_SMMU_SH_ISH) |
> +           FIELD_PREP(CR1_QUEUE_OC, CR1_CACHE_WB) |
> +           FIELD_PREP(CR1_QUEUE_IC, CR1_CACHE_WB);
> +     mmio_write32(smmu->base + ARM_SMMU_CR1, reg);
> +
> +     /* Stream table */
> +     mmio_write64(smmu->base + ARM_SMMU_STRTAB_BASE,
> +                  smmu->strtab_cfg.strtab_base);
> +     mmio_write32(smmu->base + ARM_SMMU_STRTAB_BASE_CFG,
> +                  smmu->strtab_cfg.strtab_base_cfg);
> +
> +     /* Command queue */
> +     mmio_write64(smmu->base + ARM_SMMU_CMDQ_BASE, smmu->cmdq.q.q_base);
> +     mmio_write32(smmu->base + ARM_SMMU_CMDQ_PROD, smmu->cmdq.q.prod);
> +     mmio_write32(smmu->base + ARM_SMMU_CMDQ_CONS, smmu->cmdq.q.cons);
> +
> +     enables = CR0_CMDQEN;
> +     ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
> +                                   ARM_SMMU_CR0ACK);
> +     if (ret) {
> +             printk("ERROR: SMMU failed to enable command queue\n");

I would rather recommend trace_error() over sprinkling printk all over, provided
we are in management code paths (cell init/exit etc.).

> +             return ret;
> +     }
> +
> +     /* Invalidate any cached configuration */
> +     cmd.opcode = CMDQ_OP_CFGI_ALL;
> +     arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +     arm_smmu_cmdq_issue_sync(smmu);
> +
> +     cmd.opcode = CMDQ_OP_TLBI_NSNH_ALL;
> +     arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +
> +     /* Invalidate any stale TLB entries */
> +     cmd.opcode = CMDQ_OP_TLBI_EL2_ALL;
> +     arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +     arm_smmu_cmdq_issue_sync(smmu);
> +
> +     /* Event queue */
> +     mmio_write64(smmu->base + ARM_SMMU_EVTQ_BASE, smmu->evtq.q.q_base);
> +     mmio_write32(smmu->base + ARM_SMMU_EVTQ_PROD, smmu->evtq.q.prod);
> +     mmio_write32(smmu->base + ARM_SMMU_EVTQ_CONS, smmu->evtq.q.cons);
> +
> +     enables |= CR0_EVTQEN;
> +     ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
> +                                   ARM_SMMU_CR0ACK);
> +     if (ret) {
> +             printk("ERROR: SMMU failed to enable event queue\n");

Same here - unless the called function does some reporting already. More cases
below, I won't highlight them.

> +             return ret;
> +     }
> +
> +     /* ToDo: Add support for PRI queue and IRQs  */
> +
> +     enables |= CR0_SMMUEN;
> +     ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
> +                                   ARM_SMMU_CR0ACK);
> +     if (ret) {
> +             printk("ERROR: SMMU failed to enable SMMU interface\n");
> +             return ret;
> +     }
> +
> +     return 0;
> +}
> +
> +static int arm_smmu_device_init_features(struct arm_smmu_device *smmu)
> +{
> +     u32 reg;
> +
> +     /* IDR0 */
> +     reg = mmio_read32(smmu->base + ARM_SMMU_IDR0);
> +
> +     smmu->features = 0;
> +     /* 2-level structures */
> +     if (FIELD_GET(IDR0_ST_LVL, reg) == IDR0_ST_LVL_2LVL)
> +             smmu->features |= ARM_SMMU_FEAT_2_LVL_STRTAB;
> +
> +     if (!(reg & IDR0_S2P)) {
> +             printk("ERROR: SMMU stage2 translations not supported\n");
> +             return -ENXIO;

Why not something like ENODEV (with trace_error)?

> +     }
> +
> +     if (FIELD_GET(IDR0_S1P, reg)) {
> +             smmu->features |= IDR0_S1P;
> +     }
> +
> +     if (FIELD_GET(IDR0_VMID16, reg)) {
> +             smmu->features |= IDR0_VMID16;
> +     }
> +
> +     /* IDR1 */
> +     reg = mmio_read32(smmu->base + ARM_SMMU_IDR1);
> +     if (reg & (IDR1_TABLES_PRESET | IDR1_QUEUES_PRESET | IDR1_REL)) {
> +             printk("ERROR: SMMU embedded implementation not supported\n");
> +             return -ENXIO;
> +     }
> +
> +     /* Queue sizes, capped at 4k */
> +     smmu->cmdq.q.max_n_shift = MIN(CMDQ_MAX_SZ_SHIFT,
> +                                    FIELD_GET(IDR1_CMDQS, reg));
> +     if (!smmu->cmdq.q.max_n_shift) {
> +             printk("ERROR: SMMU unit-length command queue not supported\n");
> +             return -ENXIO;
> +     }
> +     smmu->evtq.q.max_n_shift = MIN(EVTQ_MAX_SZ_SHIFT,
> +                                    FIELD_GET(IDR1_EVTQS, reg));
> +
> +     /* SID sizes */
> +     smmu->sid_bits = FIELD_GET(IDR1_SIDSIZE, reg);
> +
> +     /*
> +      * If the SMMU supports fewer bits than would fill a single L2 stream
> +      * table, use a linear table instead.
> +      */
> +     if (smmu->sid_bits <= STRTAB_SPLIT)
> +             smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
> +
> +     return 0;
> +}
> +
> +static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
> +{
> +     struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +     struct arm_smmu_strtab_l1_desc *desc;
> +     struct arm_smmu_cmdq_ent cmd;
> +     void *strtab;
> +     u32 size;
> +
> +     desc = &cfg->l1_desc[sid >> STRTAB_SPLIT];
> +     if (desc->l2ptr) {
> +             desc->active_stes++;
> +             return 0;
> +     }
> +
> +     size = 1 << (STRTAB_SPLIT + STRTAB_STE_DWORDS_BITS + 3);
> +     strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS];
> +
> +     desc->span = STRTAB_SPLIT + 1;
> +     desc->l2ptr = page_alloc_aligned(&mem_pool, PAGES(size));
> +     if (!desc->l2ptr) {
> +             printk("ERROR: SMMU failed to allocate l2 stream table (%u bytes)\n",
> +                    size);
> +             return -ENOMEM;
> +     }
> +     desc->l2ptr_dma = paging_hvirt2phys(desc->l2ptr);
> +     desc->active_stes = 1;
> +     arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
> +     arm_smmu_write_strtab_l1_desc(strtab, desc);
> +
> +     /* Invalidate cached L1 descriptors. */
> +     cmd.opcode = CMDQ_OP_CFGI_STE;
> +     cmd.cfgi.sid = sid;
> +     cmd.cfgi.leaf = false;
> +     arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +
> +     return 0;
> +}
> +
> +static void arm_smmu_uninit_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
> +{
> +     struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +     struct arm_smmu_strtab_l1_desc *desc;
> +     struct arm_smmu_cmdq_ent cmd;
> +     void *strtab;
> +     u32 size;
> +
> +     desc = &cfg->l1_desc[sid >> STRTAB_SPLIT];
> +
> +     desc->active_stes--;
> +     if (desc->active_stes)
> +             return;
> +
> +     desc->l2ptr = NULL;
> +     desc->l2ptr_dma = 0;
> +     desc->span = 0;
> +     strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS];
> +     arm_smmu_write_strtab_l1_desc(strtab, desc);
> +
> +     /* Invalidate cached L1 descriptors. */
> +     cmd.opcode = CMDQ_OP_CFGI_STE;
> +     cmd.cfgi.sid = sid;
> +     cmd.cfgi.leaf = false;
> +     arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +
> +     size = 1 << (STRTAB_SPLIT + STRTAB_STE_DWORDS_BITS + 3);
> +     page_free(&mem_pool, desc->l2ptr, PAGES(size));
> +}
> +
> +static __u64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> +{
> +     __u64 *step;
> +     struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +
> +     if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
> +             struct arm_smmu_strtab_l1_desc *l1_desc;
> +             int idx;
> +
> +             /* Two-level walk */
> +             idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
> +             l1_desc = &cfg->l1_desc[idx];
> +             idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
> +             step = &l1_desc->l2ptr[idx];
> +     } else {
> +             /* Simple linear lookup */
> +             step = &cfg->strtab[sid * STRTAB_STE_DWORDS];
> +     }
> +
> +     return step;
> +}
> +
> +static int arm_smmu_init_ste(struct arm_smmu_device *smmu, u32 sid, u32 vmid)
> +{
> +     __u64 *step;
> +
> +     if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
> +             arm_smmu_init_l2_strtab(smmu, sid);
> +
> +     step = arm_smmu_get_step_for_sid(smmu, sid);
> +     arm_smmu_write_strtab_ent(smmu, sid, NULL, step, false, vmid);
> +
> +     return 0;
> +}
> +
> +static void arm_smmu_uninit_ste(struct arm_smmu_device *smmu, u32 sid, u32 vmid)
> +{
> +     __u64 *step;
> +
> +     step = arm_smmu_get_step_for_sid(smmu, sid);
> +     arm_smmu_write_strtab_ent(smmu, sid, NULL, step, true, vmid);
> +
> +     if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
> +             arm_smmu_uninit_l2_strtab(smmu, sid);
> +}
> +
> +static int arm_smmuv3_cell_init(struct cell *cell)
> +{
> +     struct jailhouse_iommu *iommu;
> +     struct arm_smmu_cmdq_ent cmd;
> +     int ret, i, j, sid;
> +
> +     for (i = 0; i < JAILHOUSE_MAX_IOMMU_UNITS; i++) {
> +             iommu = &system_config->platform_info.arm.iommu_units[i];
> +             if (iommu->type != JAILHOUSE_IOMMU_SMMUV3)
> +                     continue;
> +
> +             for_each_stream_id(sid, cell->config, j) {
> +                     ret = arm_smmu_init_ste(&smmu[i], sid, cell->config->id);
> +                     if (ret) {
> +                             printk("ERROR: SMMU INIT ste failed: sid = %d\n",
> +                                    sid);
> +                             return ret;

Do we need any rollback in case only one of many calls fails?

> +                     }
> +             }
> +     }
> +
> +     cmd.opcode      = CMDQ_OP_TLBI_S12_VMALL;
> +     cmd.tlbi.vmid   = cell->config->id;
> +     arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +     arm_smmu_cmdq_issue_sync(smmu);
> +
> +     return 0;
> +}
> +
> +static void arm_smmuv3_cell_exit(struct cell *cell)
> +{
> +     struct jailhouse_iommu *iommu;
> +     struct arm_smmu_cmdq_ent cmd;
> +     int i, j, sid;
> +
> +     for (i = 0; i < JAILHOUSE_MAX_IOMMU_UNITS; i++) {
> +             iommu = &system_config->platform_info.arm.iommu_units[i];
> +             if (iommu->type != JAILHOUSE_IOMMU_SMMUV3)
> +                     continue;
> +
> +             for_each_stream_id(sid, cell->config, j) {
> +                     arm_smmu_uninit_ste(&smmu[i], sid, cell->config->id);
> +             }
> +     }
> +
> +     cmd.opcode      = CMDQ_OP_TLBI_S12_VMALL;
> +     cmd.tlbi.vmid   = cell->config->id;
> +     arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +     arm_smmu_cmdq_issue_sync(smmu);
> +}
> +
> +static int arm_smmuv3_init(void)
> +{
> +     struct jailhouse_iommu *iommu;
> +     int ret, i;
> +
> +     for (i = 0; i < JAILHOUSE_MAX_IOMMU_UNITS; i++) {
> +             iommu = &system_config->platform_info.arm.iommu_units[i];
> +             if (iommu->type != JAILHOUSE_IOMMU_SMMUV3)
> +                     continue;
> +
> +             smmu[i].base = paging_map_device(iommu->base, iommu->size);
> +
> +             /* ToDo: irq allocation*/
> +
> +             ret = arm_smmu_device_init_features(&smmu[i]);
> +             if (ret)
> +                     return ret;
> +
> +             ret = arm_smmu_init_structures(&smmu[i]);
> +             if (ret)
> +                     return ret;
> +
> +             /* Reset the device */
> +             ret = arm_smmu_device_reset(&smmu[i]);
> +             if (ret)
> +                     return ret;
> +     }
> +
> +     return arm_smmuv3_cell_init(&root_cell);
> +}
> +
> +DEFINE_UNIT_MMIO_COUNT_REGIONS_STUB(arm_smmuv3);
> +DEFINE_UNIT_SHUTDOWN_STUB(arm_smmuv3);
> +DEFINE_UNIT(arm_smmuv3, "ARM SMMU v3");
> diff --git a/hypervisor/include/jailhouse/entry.h b/hypervisor/include/jailhouse/entry.h
> index 26360a6e..da1c9da2 100644
> --- a/hypervisor/include/jailhouse/entry.h
> +++ b/hypervisor/include/jailhouse/entry.h
> @@ -21,6 +21,7 @@
>  #define EPERM                1
>  #define ENOENT               2
>  #define EIO          5
> +#define ENXIO                6
>  #define E2BIG                7
>  #define ENOMEM               12
>  #define EBUSY                16
> 

Primarily had style comments as I'm not familiar with the SMMU itself yet. At a
higher level, you should check again if rollbacks on errors are missing and if
error reporting can be switched to standard patterns (trace_error, error code
forwarding).

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
