On 09.07.19 10:25, Pratyush Yadav wrote: > > > On 09/07/19 12:01 AM, Jean-philippe Brucker wrote: >> Hi Pratyush, Lokesh, >> >> Nice work! Good to see new SMMUv3 support. I'm quite familiar with the >> Linux driver and have a little time to spare, so I have a few comments below. > > Thanks for the review Jean. Replies inline. > > Jan, > > If you have any further comments, let me know. Otherwise I'll send a v3 > addressing Jean's comments. Also, please review the stage 1 emulation > patch as well.
I suspect I won't find the time to review in depth the next days. So please don't wait for me. Jan > >> On 05/07/2019 15:42, 'Pratyush Yadav' via Jailhouse wrote: >> [...] >>> +++ b/hypervisor/arch/arm64/smmu-v3.c >>> @@ -0,0 +1,1120 @@ >>> +/* >>> + * Jailhouse AArch64 support >>> + * >>> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com/ >>> + * >>> + * Authors: >>> + * Lokesh Vutla <[email protected]> >>> + * Pratyush Yadav <[email protected]> >>> + * >>> + * An emulated SMMU is presented to inmates by trapping access to MMIO >>> + * registers to enable stage 1 translations. Accesses to the SMMU memory >>> mapped >>> + * registers are trapped and then routed to the emulated SMMU. This is not >>> + * emulation in the sense that we fully emulate the device top to bottom. >>> The >>> + * emulation is used to provied an interface to the SMMU that the >>> hypervisor >> >> provide > > Fixed. > >>> + * can control to make sure the inmates are not doing anything they should >>> not. >>> + * The actual translations are done by hardware. >>> + * >>> + * Emulation is needed because both stage 1 and stage 2 parameters are >>> + * configured in a single data structure, the stream table entry. For this >>> + * reason, the inmates can't be allowed to directly control the stream >>> table >>> + * entries, and by extension, the stream table. >>> + * >>> + * The guest cells are assigned stream IDs in their configs and only those >>> + * assigned stream IDs can be used by the cells. There is no checking in >>> place >>> + * to make sure two cells do not use the same stream IDs. This must be >>> taken >>> + * care of when creating the cell configs. >>> + * >>> + * This driver is implemented based on the following assumptions: >>> + * - Running on a Little endian 64 bit core compatible with ARM v8 >>> architecture. >>> + * - SMMU supporting only AARCH64 mode. 
>>> + * - SMMU AARCH 64 stage 2 translation configurations are compatible with >>> ARMv8 >>> + * VMSA. So re using the translation tables of CPU for SMMU. >>> + * >>> + * This driver is loosely based on the Linux kernel SMMU v3 driver. >>> + * >>> + * This work is licensed under the terms of the GNU GPL, version 2. See >>> the >>> + * COPYING file in the top-level directory. >>> + */ >> [...] >>> +/* >>> + * Stream table. >>> + * >>> + * Linear: Enough to cover 1 << IDR1.SIDSIZE entries >>> + * 2lvl: 128k L1 entries, >>> + * 256 lazy entries per table (each table covers a PCI bus) >>> + */ >>> +#define STRTAB_L1_SZ_SHIFT 20 >>> +#define STRTAB_SPLIT 8 >>> + >>> +#define STRTAB_L1_DESC_DWORDS 1 >>> +#define STRTAB_L1_DESC_SIZE (STRTAB_L1_DESC_DWORDS << 3) >>> +#define STRTAB_L1_DESC_SPAN BIT_MASK(4, 0) >>> +#define STRTAB_L1_DESC_L2PTR_MASK BIT_MASK(51, 6) >>> + >>> +#define STRTAB_STE_DWORDS 8 >> >> nit: that could be >> #define STRTAB_STE_DWORDS (1 << STRTAB_STE_DWORDS_BITS) > > Done. > >>> +#define STRTAB_STE_DWORDS_BITS 3 >>> +#define STRTAB_STE_SIZE (STRTAB_STE_DWORDS << 3) >>> +#define STRTAB_STE_0_V (1UL << 0) >>> +#define STRTAB_STE_0_CFG BIT_MASK(3, 1) >>> +#define STRTAB_STE_0_CFG_ABORT 0 >>> +#define STRTAB_STE_0_CFG_BYPASS 4 >>> +#define STRTAB_STE_0_CFG_S1_TRANS 5 >>> +#define STRTAB_STE_0_CFG_S2_TRANS 6 >>> +#define STRTAB_STE_0_S1CTXPTR BIT_MASK(51, 6) >>> +#define STRTAB_STE_0_S1CDMAX BIT_MASK(63, 59) >>> +#define STRTAB_STE_1_S1DSS BIT_MASK(1, 0) >>> +#define STRTAB_STE_1_S1CIR BIT_MASK(3, 2) >>> +#define STRTAB_STE_1_S1COR BIT_MASK(5, 4) >>> +#define STRTAB_STE_1_S1CSH BIT_MASK(7, 6) >>> +#define STRTAB_STE_1_S1STALLD (1UL << 27) >>> +#define STRTAB_CTXDESC_DWORDS 8 >>> +#define STRTAB_CTXDESC_S1CTXPTR_SHIFT 6 >> >> STRTAB_STE_0_S1CTXPTR_SHIFT? But you already have S1CTXPTR above, so I'm >> not sure these two lines are useful. > > They aren't. This define was left in as something else I was doing while > trying to figure stage 1 emulation out. 
Removed. > >> [...] >> >>> +static bool queue_error(struct arm_smmu_queue *q) >>> +{ >>> + return mmio_read32(q->cons_reg) & 0x1; >> >> This doesn't seem right, the error field in SMMU_CMDQ_CONS is at [30:24] >> (CMDQ_CONS_ERR). You could also get the value from q->cons since you read >> this after queue_sync_cons(). > > Hm, this doesn't seem right to me either. Not sure why it was added > especially with the hard-coded 0x1. Probably an artifact left over from > debugging. > > Anyway, reading CMDQ_CONS_ERR to check if an error happened is not the > correct behavior either. The spec says the value is UNKNOWN when > CMDQ_ERR global error is not active. So we should be reading > SMMU_GERROR.CMDQ_ERR instead. > > Will fix. > >>> +} >>> + >>> +static void queue_inc_prod(struct arm_smmu_queue *q) >>> +{ >>> + u32 shift = q->max_n_shift; >>> + u32 prod = (Q_WRP(q->prod, shift) | Q_IDX(q->prod, shift)) + 1; >>> + >>> + q->prod = Q_OVF(q->prod) | Q_WRP(prod, shift) | Q_IDX(prod, shift); >>> + mmio_write32(q->prod_reg, q->prod); >>> +} >>> + >>> +static void queue_write(__u64 *dst, __u64 *src, u32 n_dwords) >>> +{ >>> + int i; >>> + >>> + for (i = 0; i < n_dwords; ++i) >>> + *dst++ = *src++; >>> + dsb(ish); >> >> dsb(ishst) should be sufficient > > Will fix here and in all other cases. > >>> +} >>> + >>> +static __u64 *queue_entry(struct arm_smmu_queue *q) >>> +{ >>> + return q->base + (Q_IDX(q->prod, q->max_n_shift) * q->ent_dwords); >>> +} >>> + >>> +static int queue_insert_raw(struct arm_smmu_queue *q, __u64 *ent) >>> +{ >>> + while (queue_full(q)) >>> + {} >> >> It seems like queue_full() only accesses the shadow version of cons and >> prod, so I think you need a queue_sync_cons() in this loop. But given that >> you drain the queue for each command below, queue_full() could only happen >> if the queue is in error state, in which case this will loop indefinitely. 
> > I'll sync the consumer in the body of the loop, just in case there is a > change in the future that doesn't drain the queue. > > Also, why would queue_full() happen when the queue is in error state? In > that function, we don't check the error code, only the write index and > wrap. AFAIK, neither of those are affected when an error is active. > > So we should also add queue_error() check here, correct? > >>> + >>> + queue_write(queue_entry(q), ent, q->ent_dwords); >>> + queue_inc_prod(q); >>> + while (!queue_empty(q) && !queue_error(q)) { >>> + queue_sync_cons(q); >>> + } >>> + return 0; >>> +} >>> + >>> +/* High-level queue accessors */ >>> +static int arm_smmu_cmdq_build_cmd(__u64 *cmd, struct arm_smmu_cmdq_ent >>> *ent) >>> +{ >>> + u64 vmid = (u64)this_cell()->config->id; >> >> A little redundant since you're setting ent->tlbi.vmid already. > > Removed. > >>> + memset(cmd, 0, CMDQ_ENT_SIZE); >>> + cmd[0] |= FIELD_PREP(CMDQ_0_OP, ent->opcode); >>> + >>> + switch (ent->opcode) { >>> + case CMDQ_OP_TLBI_EL2_ALL: >>> + case CMDQ_OP_TLBI_NSNH_ALL: >>> + break; >>> + case CMDQ_OP_PREFETCH_ADDR: >>> + cmd[1] |= FIELD_PREP(CMDQ_PREFETCH_1_SIZE, ent->prefetch.size); >>> + cmd[1] |= ent->prefetch.addr & CMDQ_PREFETCH_1_ADDR_MASK; >>> + /* Fallthrough */ >>> + case CMDQ_OP_PREFETCH_CFG: >>> + cmd[0] |= FIELD_PREP(CMDQ_PREFETCH_0_SID, ent->prefetch.sid); >>> + break; >>> + case CMDQ_OP_CFGI_STE: >>> + cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SID, ent->cfgi.sid); >>> + cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_LEAF, ent->cfgi.leaf); >>> + break; >>> + case CMDQ_OP_CFGI_ALL: >>> + /* Cover the entire SID range */ >>> + cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_RANGE, 31); >>> + break; >>> + case CMDQ_OP_TLBI_NH_VA: >>> + cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid); >>> + cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid); >>> + cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf); >>> + cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK; >>> + break; >>> + case CMDQ_OP_TLBI_S2_IPA: 
>>> + cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, vmid); >>> + cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf); >>> + cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_IPA_MASK; >>> + break; >>> + case CMDQ_OP_TLBI_NH_ASID: >>> + cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid); >>> + /* Fallthrough */ >>> + case CMDQ_OP_TLBI_S12_VMALL: >>> + cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, vmid); >>> + break; >>> + case CMDQ_OP_CMD_SYNC: >>> + if (ent->sync.msiaddr) >>> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, >>> CMDQ_SYNC_0_CS_IRQ); >>> + else >>> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, >>> CMDQ_SYNC_0_CS_SEV); >>> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH); >>> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, >>> ARM_SMMU_MEMATTR_OIWB); >>> + /* >>> + * Commands are written little-endian, but we want the SMMU to >>> + * receive MSIData, and thus write it back to memory, in CPU >>> + * byte order, so big-endian needs an extra byteswap here. >>> + */ >> >> The comment doesn't apply anymore, does it? > > Why not? I'm guessing it's because the driver is written with the > assumption it is running on a little endian core. > > In that case, yes, it does not apply. This was taken verbatim from the > Linux driver, so probably why this comment stayed around. > >>> + cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata); >>> + cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK; >>> + break; >>> + default: >>> + return -ENOENT; >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +static void arm_smmu_cmdq_insert_cmd(struct arm_smmu_device *smmu, __u64 >>> *cmd) >>> +{ >>> + struct arm_smmu_queue *q = &smmu->cmdq.q; >>> + >>> + queue_insert_raw(q, cmd); >> >> nit: you could drop queue_insert_raw() and insert its content here > > Done. 
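Since queue_full()/queue_empty() and the prod/cons wrap arithmetic came up several times above, here is a standalone sketch of the SMMUv3 queue-pointer scheme for readers following along: each pointer carries an index plus one wrap bit just above it, and full/empty fall out of comparing the two. The 16-entry size and helper names are illustrative only (the real driver derives the shift from IDR1), not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative 16-entry queue; the driver derives this from SMMU_IDR1 */
#define Q_SHIFT 4
#define Q_IDX(p)  ((p) & ((1u << Q_SHIFT) - 1))
#define Q_WRP(p)  ((p) & (1u << Q_SHIFT))

static bool q_empty(uint32_t prod, uint32_t cons)
{
	/* same index and same wrap bit: the consumer has caught up */
	return Q_IDX(prod) == Q_IDX(cons) && Q_WRP(prod) == Q_WRP(cons);
}

static bool q_full(uint32_t prod, uint32_t cons)
{
	/* same index but wrap bits differ: the producer has lapped the consumer */
	return Q_IDX(prod) == Q_IDX(cons) && Q_WRP(prod) != Q_WRP(cons);
}

static uint32_t q_inc(uint32_t p)
{
	/* the +1 carries out of the index field and flips the wrap bit */
	uint32_t next = (Q_WRP(p) | Q_IDX(p)) + 1;

	return Q_WRP(next) | Q_IDX(next);
}
```

This is why queue_full() can only report a stale answer until the shadow cons is refreshed from hardware, which is the point of putting queue_sync_cons() inside the busy-wait loop.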
> >>> +} >>> + >>> +static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu, >>> + struct arm_smmu_cmdq_ent *ent) >>> +{ >>> + u64 cmd[CMDQ_ENT_DWORDS]; >>> + >>> + if (arm_smmu_cmdq_build_cmd(cmd, ent)) { >>> + printk("WARN: SMMU ignoring unknown CMDQ opcode 0x%x\n", >>> + ent->opcode); >>> + return; >>> + } >>> + >>> + spin_lock(&smmu->cmdq.lock); >>> + arm_smmu_cmdq_insert_cmd(smmu, cmd); >>> + spin_unlock(&smmu->cmdq.lock); >>> +} >>> + >>> +static void arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu) >>> +{ >>> + struct arm_smmu_cmdq_ent ent = { .opcode = CMDQ_OP_CMD_SYNC }; >>> + u64 cmd[CMDQ_ENT_DWORDS]; >>> + >>> + arm_smmu_cmdq_build_cmd(cmd, &ent); >>> + >>> + spin_lock(&smmu->cmdq.lock); >>> + arm_smmu_cmdq_insert_cmd(smmu, cmd); >>> + spin_unlock(&smmu->cmdq.lock); >>> +} >>> + >>> +/* Stream table manipulation functions */ >>> +static void >>> +arm_smmu_write_strtab_l1_desc(__u64 *dst, struct arm_smmu_strtab_l1_desc >>> *desc) >>> +{ >>> + u64 val = 0; >>> + >>> + val |= FIELD_PREP(STRTAB_L1_DESC_SPAN, desc->span); >>> + val |= desc->l2ptr_dma & STRTAB_L1_DESC_L2PTR_MASK; >>> + >>> + /* Assuming running on Little endian cpu */ >>> + *dst = val; >>> + dsb(ish); >> >> dsb(ishst)? It might be a good idea to add a small comment before each >> memory barrier. I'm not sure what this one does for example, does it order >> against the subsequent write to SMMU_STRTAB_BASE? If that's the case then >> you could move the barrier closer to that write, so it doesn't have to be >> done for each L1 descriptor. > > The idea was to have a data barrier whenever there was any updates to > any of the data structures that the SMMU uses. > > Since someone is likely to invalidate the L1 descriptor (though we don't > do it right now, we should), it makes sense to be sure the write went > through before telling the SMMU to read it back. 
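For reference on the barrier discussion above: the value arm_smmu_write_strtab_l1_desc() publishes is a single dword packing the span and the L2 table pointer. A standalone sketch of that packing, with mask values taken from the defines quoted earlier (the function name and the GENMASK helper are mine, not from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Bits h..l set, equivalent to the patch's BIT_MASK(h, l) (assumed) */
#define GENMASK_ULL(h, l) \
	(((~0ull) >> (63 - (h))) & ((~0ull) << (l)))

#define STRTAB_L1_DESC_SPAN       GENMASK_ULL(4, 0)
#define STRTAB_L1_DESC_L2PTR_MASK GENMASK_ULL(51, 6)

static uint64_t l1_desc_pack(uint64_t l2ptr_dma, unsigned int span)
{
	/* span = log2(number of STEs in the L2 table) + 1 */
	uint64_t val = span & STRTAB_L1_DESC_SPAN;

	/* mask also enforces the 64-byte alignment of the L2 table */
	val |= l2ptr_dma & STRTAB_L1_DESC_L2PTR_MASK;
	return val;
}
```

The dsb() then only needs to order this one store against whatever makes it visible to the SMMU (the STRTAB_BASE write at init, or a CFGI command later), which is why placing and documenting each barrier precisely matters.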
> >>> +} >>> + >>> +static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 >>> sid) >>> +{ >>> + struct arm_smmu_cmdq_ent cmd = { >>> + .opcode = CMDQ_OP_CFGI_STE, >>> + .cfgi = { >>> + .sid = sid, >>> + .leaf = true, >> >> You probably need leaf = false, because the SMMU could cache l1 stream >> table descriptors. I think the Linux driver gets away with it because it >> only modifies l1 descriptors before making the stream table visible to the >> SMMU through SMMU_STRTAB_BASE (I haven't verified that claim). > > Hm, I don't think the Linux driver only modifies L1 descriptors before > making the stream table visible to the SMMU. > > arm_smmu_add_device() calls arm_smmu_init_l2_strtab() which modifies the > L1 descriptor. Since a device can be added after the SMMU is > initialised, the Linux driver makes changes to L1 descriptors after the > SMMU is up and the stream table is visible. > > Either way, we should be invalidating the L1 descriptors when we > actually update them. And the best place for that would be > arm_smmu_write_strtab_l1_desc(). > > Unfortunately, I don't think there is a command to invalidate just a L1 > descriptor. So we'll probably have to hack around it by invalidating the > first STE for the L1 descriptor with leaf = false. > > Any better ideas? > >>> + }, >>> + }; >>> + >>> + arm_smmu_cmdq_issue_cmd(smmu, &cmd); >>> + arm_smmu_cmdq_issue_sync(smmu); >>> +} >>> + >>> +static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 >>> sid, >>> + __u64 *guest_ste, __u64 *dst, >>> + bool bypass, u32 vmid) >>> +{ >>> + struct paging_structures *pg_structs = &this_cell()->arch.mm; >>> + u64 val = dst[0], vttbr, mask; >>> + >>> + /* Bypass */ >>> + if (bypass) { >>> + dst[0] = STRTAB_STE_0_V; >> >> Hm, won't the "dst[0] = val" below clear V? > > Yes it will. This probably got messed up in a bit of last-minute > refactoring I did. Fixed. > >> STE writes in this function might need a little more care. 
At this point >> for example, the SMMU could issue C_BAD_STE events because the STE is >> marked valid but is in inconsistent state. It might be best to do the same >> as the Linux driver: ensure dst[0].V is clear, write dst[1-3] first, issue >> a sync(), then write dst[0] in a single access. > > All right. The next patch also touches this function. That will need > fixing too. > >>> + dsb(ish); >>> + val = FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS); >>> + dst[0] = val; >>> + dst[1] = FIELD_PREP(STRTAB_STE_1_SHCFG, >>> + STRTAB_STE_1_SHCFG_INCOMING); >>> + dst[2] = vmid; >>> + dsb(ish); >>> + if (smmu) { >>> + arm_smmu_sync_ste_for_sid(smmu, sid); >>> + } >>> + return; >>> + } >>> + >>> + dst[2] = FIELD_PREP(STRTAB_STE_2_S2VMID, vmid) | >> >> One more restriction on the hardware is that the VMID size used by the CPU >> must be smaller or equal the size of the VMID supported by the SMMU. If >> the CPU is using 16-bit VMID (TCR.VS == 1) but SMMU_IDR0.VMID16 == 0, then >> setting the top VMID bits here is illegal. > > We're pulling the VMID from cell->config->id, which is the cell id. This > is a linearly allocated number, so you'd need at least 257 cells to set > any of the top bits. In case of Jailhouse, that would mean at least 257 > CPUs. I don't think anyone is using Jailhouse with 256+ CPUs, so this is > not a big problem for now. > > And say we do overflow. Say we do have more than 257 cells running. What > do you set the VMID to then? You can't wrap the value back because that > would conflict with another cell. The only course then would be to > terminate the cell, or not use the SMMU for it at all. > > Not using the SMMU could work, but it means there would be complications > when the cell expects stage 1 translations. > > Again, to me it looks like a problem that is not likely to occur. Jan, > your thoughts about it? 
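On the VMID-width point, a cheap guard would be to reject VMIDs wider than what the SMMU advertises rather than silently truncating. A sketch, assuming the IDR0.VMID16 bit position used by the SMMUv3 spec and the Linux driver (bit 18); the check function itself is invented for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define IDR0_VMID16 (1u << 18)	/* SMMU supports 16-bit VMIDs (assumed layout) */

/* Hypothetical helper, not part of the patch: returns 0 if vmid fits the
 * SMMU's VMID width, -ERANGE otherwise. */
static int smmu_check_vmid(uint32_t idr0, uint32_t vmid)
{
	unsigned int bits = (idr0 & IDR0_VMID16) ? 16 : 8;

	if (vmid >> bits)
		return -ERANGE;
	return 0;
}
```

Running this once per cell at STE setup would turn the "more than 256 cells on an 8-bit-VMID SMMU" corner case into a clean error instead of a stream silently landing in another cell's translation regime.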
> >>> + FIELD_PREP(STRTAB_STE_2_VTCR, VTCR_CELL) | >>> + STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 | >>> + STRTAB_STE_2_S2R; >> >>> + >>> + vttbr = paging_hvirt2phys(pg_structs->root_table); >>> + dst[3] = vttbr & STRTAB_STE_3_S2TTB_MASK; >>> + >>> + dst[0] |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS); >>> + dst[0] |= STRTAB_STE_0_V; >> >> Here as well, it would be better to sync the rest of the STE, then write >> dst[0] in a single access. > > Will fix. > >>> + dsb(ish); >> >> Not sure that's needed, the Linux driver doesn't have a dsb() here > > Why not? Do you not want to make sure the writes actually go through > before you sync the STE? > >>> + >>> + arm_smmu_sync_ste_for_sid(smmu, sid); >>> +} >>> + >>> +static void arm_smmu_init_bypass_stes(u64 *strtab, unsigned int nent) >>> +{ >>> + unsigned int i; >>> + >>> + for (i = 0; i < nent; ++i) { >>> + arm_smmu_write_strtab_ent(NULL, -1, NULL, strtab, true, >>> + (u32)this_cell()->config->id); >>> + strtab += STRTAB_STE_DWORDS; >>> + } >>> +} >>> + >>> +static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu) >>> +{ >>> + void *strtab; >>> + u64 reg; >>> + u32 size; >>> + struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; >>> + >>> + size = (1 << smmu->sid_bits) * STRTAB_STE_SIZE; >>> + strtab = page_alloc_aligned(&mem_pool, PAGES(size)); >>> + if (!strtab) { >>> + printk("ERROR: SMMU failed to allocate l1 stream table (%u >>> bytes)\n", >>> + size); >>> + return -ENOMEM; >>> + } >>> + cfg->strtab_dma = paging_hvirt2phys(strtab); >>> + cfg->strtab = strtab; >>> + cfg->num_l1_ents = 1 << smmu->sid_bits; >>> + >>> + /* Configure strtab_base_cfg for a linear table covering all SIDs */ >>> + reg = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_LINEAR); >>> + reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits); >>> + cfg->strtab_base_cfg = reg; >>> + >>> + arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents); >>> + return 0; >>> +} >>> + >>> +static int 
arm_smmu_init_l1_strtab(struct arm_smmu_device *smmu) >>> +{ >>> + struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; >>> + u32 size = sizeof(*cfg->l1_desc) * cfg->num_l1_ents; >>> + void *strtab = smmu->strtab_cfg.strtab; >>> + unsigned int i; >>> + >>> + cfg->l1_desc = page_alloc(&mem_pool, PAGES(size)); >>> + if (!cfg->l1_desc) { >>> + printk("ERROR: SMMU failed to allocate l1 stream table desc\n"); >>> + return -ENOMEM; >>> + } >>> + >>> + for (i = 0; i < cfg->num_l1_ents; ++i) { >>> + memset(&cfg->l1_desc[i], 0, sizeof(*cfg->l1_desc)); >> >> I might be wrong, but I think pages obtained from mem_pool are initialized >> to zero, so you don't need this. > > Yes, mem_pool uses the flag PAGE_SCRUB_ON_FREE (which means pages > obtained from mem_pool will be zeroed out on free), but it is not > documented anywhere that pages allocated via mem_pool will be zeroed > out. I'd rather stay on the safer side here. > >>> + arm_smmu_write_strtab_l1_desc(strtab, &cfg->l1_desc[i]); >>> + cfg->l1_desc[i].active_stes = 0; >> >> Already initialized to zero > > Fixed. > >>> + strtab += STRTAB_L1_DESC_SIZE; >>> + } >>> + >>> + return 0; >>> +} >> >> [...] >>> +static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu, >>> + struct arm_smmu_queue *q, >>> + unsigned long prod_off, >>> + unsigned long cons_off, >>> + unsigned long dwords) >>> +{ >>> + /* Queue size is capped to 4K. So allocate 1 page */ >>> + q->base = page_alloc(&mem_pool, 1); >>> + if (!q->base) { >>> + printk("ERROR: SMMU failed to allocate queue\n"); >>> + return -ENOMEM; >>> + } >>> + q->base_dma = paging_hvirt2phys(q->base);; >> >> (nit: ;;) > > Fixed. > >> [...] 
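For readers mapping the l1_desc indexing back to stream IDs: with STRTAB_SPLIT == 8, a SID splits into an L1 index (upper bits) and an offset into the 256-entry L2 table (low 8 bits). A trivial standalone sketch (helper names are mine):

```c
#include <assert.h>
#include <stdint.h>

#define STRTAB_SPLIT 8

static uint32_t sid_l1_index(uint32_t sid)
{
	return sid >> STRTAB_SPLIT;		/* selects the L1 descriptor */
}

static uint32_t sid_l2_index(uint32_t sid)
{
	return sid & ((1u << STRTAB_SPLIT) - 1);	/* STE within the L2 table */
}
```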
>>> +static int arm_smmu_device_reset(struct arm_smmu_device *smmu) >>> +{ >>> + int ret; >>> + u32 reg, enables; >>> + struct arm_smmu_cmdq_ent cmd; >>> + >>> + /* Clear CR0 and sync (disables SMMU and queue processing) */ >>> + reg = mmio_read32(smmu->base + ARM_SMMU_CR0); >>> + if (reg & CR0_SMMUEN) >>> + printk("ERROR: SMMU currently enabled! Resetting...\n"); >> >> I'm guessing the SMMU is disabled in the root cell before we enable >> jailhouse? > > If you want stage 2 only, it is. But if you want stage 1 + 2, you need > to enable it in Linux on boot so any devices that need to go through the > SMMU are properly initialized. > > Then when Jailhouse comes up, we need to reset it. > >> [...] >> >>> +static void arm_smmu_uninit_l2_strtab(struct arm_smmu_device *smmu, u32 >>> sid) >>> +{ >>> + struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; >>> + struct arm_smmu_strtab_l1_desc *desc; >>> + void *strtab; >>> + u32 size; >>> + >>> + desc = &cfg->l1_desc[sid >> STRTAB_SPLIT]; >>> + >>> + desc->active_stes--; >>> + if (desc->active_stes) >>> + return; >>> + >>> + size = 1 << (STRTAB_SPLIT + STRTAB_STE_DWORDS_BITS + 3); >>> + page_free(&mem_pool, desc->l2ptr, PAGES(size)); >> >> Might be safer to free the page after clearing the l1_desc and issuing a >> CMDQ_CFGI_STE? Here the SMMU can still access the page. > > This function is only called during cell_exit(). Will the SMMU have any > transactions while a cell is exiting? > >>> + desc->l2ptr = NULL; >>> + desc->l2ptr_dma = 0; >>> + desc->span = 0; >>> + strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS]; >>> + arm_smmu_write_strtab_l1_desc(strtab, desc); >>> + >>> + return; >> >> nit: not needed > > Fixed. > >> [...] 
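On the free-before-invalidate concern above: the amount being freed is 256 STEs of 64 bytes each, which the shift arithmetic in the patch encodes. That arithmetic can be checked standalone (constant names copied from the patch, the helper function is mine):

```c
#include <assert.h>
#include <stdint.h>

#define STRTAB_SPLIT		8	/* 256 STEs per L2 table */
#define STRTAB_STE_DWORDS_BITS	3	/* 8 dwords per STE */

static uint32_t l2_strtab_bytes(void)
{
	/* 1 << (8 + 3 + 3) = 256 entries * 8 dwords * 8 bytes = 16 KiB */
	return 1u << (STRTAB_SPLIT + STRTAB_STE_DWORDS_BITS + 3);
}
```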
>>> +static int arm_smmuv3_cell_init(struct cell *cell) >>> +{ >>> + struct jailhouse_iommu *iommu; >>> + struct arm_smmu_cmdq_ent cmd; >>> + int ret, i, j, sid; >>> + >>> + for (i = 0; i < JAILHOUSE_MAX_IOMMU_UNITS; i++) { >>> + iommu = &system_config->platform_info.arm.iommu_units[i]; >>> + if (iommu->type != JAILHOUSE_IOMMU_SMMUV3) >>> + continue; >>> + >>> + for_each_stream_id(sid, cell->config, j) { >>> + ret = arm_smmu_init_ste(&smmu[i], sid, >>> cell->config->id); >>> + if (ret) { >>> + printk("ERROR: SMMU INIT ste failed: sid = >>> %d\n", >>> + sid); >>> + return ret; >>> + } >>> + } >>> + } >>> + >>> + cmd.opcode = CMDQ_OP_TLBI_S12_VMALL; >>> + cmd.tlbi.vmid = cell->config->id; >>> + arm_smmu_cmdq_issue_cmd(smmu, &cmd); >>> + arm_smmu_cmdq_issue_sync(smmu); >> >> By the way, shouldn't there be a CMDQ_OP_TLBI_S2_IPA whenever we do >> arch_paging_flush_page_tlbs() on the CPU side? > > I don't think so. Firstly, arch_paging_flush_page_tlbs() is called only > for changes to hypervisor paging structs. Right now, the hypervisor does > not use the SMMU at all. It is only for guest use. > > And the guest memory mappings are static and will remain constant from > start till exit. So we should only need to flush these entries when > starting a cell (which we do), and when exiting a cell. > > This makes me realise, we should be calling CMDQ_OP_TLBI_S12_VMALL on > cell_exit() too. > >> Thanks, >> Jean >> > -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
