Re: [PATCH v2 5/6] arm64: iommu: smmu-v3: Add data structure initialization and stage 2 for SMMUv3

Jan Kiszka Tue, 09 Jul 2019 05:20:52 -0700

On 09.07.19 10:25, Pratyush Yadav wrote:
> 
> 
> On 09/07/19 12:01 AM, Jean-philippe Brucker wrote:
>> Hi Pratyush, Lokesh,
>>
>> Nice work! Good to see new SMMUv3 support. I'm quite familiar with the
>> Linux driver and have a little time to spare, so I have a few comments below.
> 
> Thanks for the review Jean. Replies inline.
> 
> Jan,
> 
> If you have any further comments, let me know. Otherwise I'll send a v3
> addressing Jean's comments. Also, please review the stage 1 emulation
> patch as well.


I suspect I won't find the time to review in depth the next days. So please
don't wait for me.

Jan

> 
>> On 05/07/2019 15:42, 'Pratyush Yadav' via Jailhouse wrote:
>> [...]
>>> +++ b/hypervisor/arch/arm64/smmu-v3.c
>>> @@ -0,0 +1,1120 @@
>>> +/*
>>> + * Jailhouse AArch64 support
>>> + *
>>> + * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com/
>>> + *
>>> + * Authors:
>>> + *  Lokesh Vutla <[email protected]>
>>> + *  Pratyush Yadav <[email protected]>
>>> + *
>>> + * An emulated SMMU is presented to inmates by trapping access to MMIO
>>> + * registers to enable stage 1 translations. Accesses to the SMMU memory 
>>> mapped
>>> + * registers are trapped and then routed to the emulated SMMU. This is not
>>> + * emulation in the sense that we fully emulate the device top to bottom. 
>>> The
>>> + * emulation is used to provied an interface to the SMMU that the 
>>> hypervisor
>>
>>                            provide
> 
> Fixed.
> 
>>> + * can control to make sure the inmates are not doing anything they should 
>>> not.
>>> + * The actual translations are done by hardware.
>>> + *
>>> + * Emulation is needed because both stage 1 and stage 2 parameters are
>>> + * configured in a single data structure, the stream table entry. For this
>>> + * reason, the inmates can't be allowed to directly control the stream 
>>> table
>>> + * entries, and by extension, the stream table.
>>> + *
>>> + * The guest cells are assigned stream IDs in their configs and only those
>>> + * assigned stream IDs can be used by the cells. There is no checking in 
>>> place
>>> + * to make sure two cells do not use the same stream IDs. This must be 
>>> taken
>>> + * care of when creating the cell configs.
>>> + *
>>> + * This driver is implemented based on the following assumptions:
>>> + * - Running on a Little endian 64 bit core compatible with ARM v8 
>>> architecture.
>>> + * - SMMU supporting only AARCH64 mode.
>>> + * - SMMU AARCH 64 stage 2 translation configurations are compatible with 
>>> ARMv8
>>> + *   VMSA. So re using the translation tables of CPU for SMMU.
>>> + *
>>> + * This driver is loosely based on the Linux kernel SMMU v3 driver.
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2.  See 
>>> the
>>> + * COPYING file in the top-level directory.
>>> + */
>> [...]
>>> +/*
>>> + * Stream table.
>>> + *
>>> + * Linear: Enough to cover 1 << IDR1.SIDSIZE entries
>>> + * 2lvl: 128k L1 entries,
>>> + *       256 lazy entries per table (each table covers a PCI bus)
>>> + */
>>> +#define STRTAB_L1_SZ_SHIFT         20
>>> +#define STRTAB_SPLIT                       8
>>> +
>>> +#define STRTAB_L1_DESC_DWORDS              1
>>> +#define STRTAB_L1_DESC_SIZE                (STRTAB_L1_DESC_DWORDS << 3)
>>> +#define STRTAB_L1_DESC_SPAN                BIT_MASK(4, 0)
>>> +#define STRTAB_L1_DESC_L2PTR_MASK  BIT_MASK(51, 6)
>>> +
>>> +#define STRTAB_STE_DWORDS          8
>>
>> nit: that could be
>> #define STRTAB_STE_DWORDS (1 << STRTAB_STE_DWORDS_BITS)
> 
> Done.
> 
>>> +#define STRTAB_STE_DWORDS_BITS             3
>>> +#define STRTAB_STE_SIZE                    (STRTAB_STE_DWORDS << 3)
>>> +#define STRTAB_STE_0_V                     (1UL << 0)
>>> +#define STRTAB_STE_0_CFG           BIT_MASK(3, 1)
>>> +#define STRTAB_STE_0_CFG_ABORT             0
>>> +#define STRTAB_STE_0_CFG_BYPASS            4
>>> +#define STRTAB_STE_0_CFG_S1_TRANS  5
>>> +#define STRTAB_STE_0_CFG_S2_TRANS  6
>>> +#define STRTAB_STE_0_S1CTXPTR              BIT_MASK(51, 6)
>>> +#define STRTAB_STE_0_S1CDMAX               BIT_MASK(63, 59)
>>> +#define STRTAB_STE_1_S1DSS         BIT_MASK(1, 0)
>>> +#define STRTAB_STE_1_S1CIR         BIT_MASK(3, 2)
>>> +#define STRTAB_STE_1_S1COR         BIT_MASK(5, 4)
>>> +#define STRTAB_STE_1_S1CSH         BIT_MASK(7, 6)
>>> +#define STRTAB_STE_1_S1STALLD              (1UL << 27)
>>> +#define STRTAB_CTXDESC_DWORDS              8
>>> +#define STRTAB_CTXDESC_S1CTXPTR_SHIFT      6
>>
>> STRTAB_STE_0_S1CTXPTR_SHIFT? But you already have S1CTXPTR above, so I'm
>> not sure these two lines are useful.
> 
> They aren't. This define was left in as something else I was doing while
> trying to figure stage 1 emulation out. Removed.
> 
>> [...]
>>
>>> +static bool queue_error(struct arm_smmu_queue *q)
>>> +{
>>> +   return mmio_read32(q->cons_reg) & 0x1;
>>
>> This doesn't seem right, the error field in SMMU_CMDQ_CONS is at [30:24]
>> (CMDQ_CONS_ERR). You could also get the value from q->cons since you read
>> this after queue_sync_cons().
> 
> Hm, this doesn't seem right to me either. Not sure why it was added
> especially with the hard-coded 0x1. Probably an artifact left over from
> debugging.
> 
> Anyway, reading CMDQ_CONS_ERR to check if an error happened is not the
> correct behavior either. The spec says the value is UNKNOWN when
> CMDQ_ERR global error is not active. So we should be reading
> SMMU_GERROR.CMDQ_ERR instead.
> 
> Will fix.
> 
>>> +}
>>> +
>>> +static void queue_inc_prod(struct arm_smmu_queue *q)
>>> +{
>>> +   u32 shift = q->max_n_shift;
>>> +   u32 prod = (Q_WRP(q->prod, shift) | Q_IDX(q->prod, shift)) + 1;
>>> +
>>> +   q->prod = Q_OVF(q->prod) | Q_WRP(prod, shift) | Q_IDX(prod, shift);
>>> +   mmio_write32(q->prod_reg, q->prod);
>>> +}
>>> +
>>> +static void queue_write(__u64 *dst, __u64 *src, u32 n_dwords)
>>> +{
>>> +   int i;
>>> +
>>> +   for (i = 0; i < n_dwords; ++i)
>>> +           *dst++ = *src++;
>>> +   dsb(ish);
>>
>> dsb(ishst) should be sufficient
> 
> Will fix here and in all other cases.
> 
>>> +}
>>> +
>>> +static __u64 *queue_entry(struct arm_smmu_queue *q)
>>> +{
>>> +   return q->base + (Q_IDX(q->prod, q->max_n_shift) * q->ent_dwords);
>>> +}
>>> +
>>> +static int queue_insert_raw(struct arm_smmu_queue *q, __u64 *ent)
>>> +{
>>> +   while (queue_full(q))
>>> +   {}
>>
>> It seems like queue_full() only accesses the shadow version of cons and
>> prod, so I think you need a queue_sync_cons() in this loop. But given that
>> you drain the queue for each command below, queue_full() could only happen
>> if the queue is in error state, in which case this will loop indefinitely.
> 
> I'll sync the consumer in the body of the loop, just in case there is a
> change in the future that doesn't drain the queue.
> 
> Also, why would queue_full() happen when the queue is in error state? In
> that function, we don't check the error code, only the write index and
> wrap. AFAIK, neither of those are affected when an error is active.
> 
> So we should also add queue_error() check here, correct?
> 
>>> +
>>> +   queue_write(queue_entry(q), ent, q->ent_dwords);
>>> +   queue_inc_prod(q);
>>> +   while (!queue_empty(q) && !queue_error(q)) {
>>> +           queue_sync_cons(q);
>>> +   }
>>> +   return 0;
>>> +}
>>> +
>>> +/* High-level queue accessors */
>>> +static int arm_smmu_cmdq_build_cmd(__u64 *cmd, struct arm_smmu_cmdq_ent 
>>> *ent)
>>> +{
>>> +   u64 vmid = (u64)this_cell()->config->id;
>>
>> A little redundant since you're setting ent->tlbi.vmid already.
> 
> Removed.
> 
>>> +   memset(cmd, 0, CMDQ_ENT_SIZE);
>>> +   cmd[0] |= FIELD_PREP(CMDQ_0_OP, ent->opcode);
>>> +
>>> +   switch (ent->opcode) {
>>> +   case CMDQ_OP_TLBI_EL2_ALL:
>>> +   case CMDQ_OP_TLBI_NSNH_ALL:
>>> +           break;
>>> +   case CMDQ_OP_PREFETCH_ADDR:
>>> +           cmd[1] |= FIELD_PREP(CMDQ_PREFETCH_1_SIZE, ent->prefetch.size);
>>> +           cmd[1] |= ent->prefetch.addr & CMDQ_PREFETCH_1_ADDR_MASK;
>>> +           /* Fallthrough */
>>> +   case CMDQ_OP_PREFETCH_CFG:
>>> +           cmd[0] |= FIELD_PREP(CMDQ_PREFETCH_0_SID, ent->prefetch.sid);
>>> +           break;
>>> +   case CMDQ_OP_CFGI_STE:
>>> +           cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SID, ent->cfgi.sid);
>>> +           cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_LEAF, ent->cfgi.leaf);
>>> +           break;
>>> +   case CMDQ_OP_CFGI_ALL:
>>> +           /* Cover the entire SID range */
>>> +           cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_RANGE, 31);
>>> +           break;
>>> +   case CMDQ_OP_TLBI_NH_VA:
>>> +           cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid);
>>> +           cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
>>> +           cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
>>> +           cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK;
>>> +           break;
>>> +   case CMDQ_OP_TLBI_S2_IPA:
>>> +           cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, vmid);
>>> +           cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
>>> +           cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_IPA_MASK;
>>> +           break;
>>> +   case CMDQ_OP_TLBI_NH_ASID:
>>> +           cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid);
>>> +           /* Fallthrough */
>>> +   case CMDQ_OP_TLBI_S12_VMALL:
>>> +           cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, vmid);
>>> +           break;
>>> +   case CMDQ_OP_CMD_SYNC:
>>> +           if (ent->sync.msiaddr)
>>> +                   cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, 
>>> CMDQ_SYNC_0_CS_IRQ);
>>> +           else
>>> +                   cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, 
>>> CMDQ_SYNC_0_CS_SEV);
>>> +           cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
>>> +           cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, 
>>> ARM_SMMU_MEMATTR_OIWB);
>>> +           /*
>>> +            * Commands are written little-endian, but we want the SMMU to
>>> +            * receive MSIData, and thus write it back to memory, in CPU
>>> +            * byte order, so big-endian needs an extra byteswap here.
>>> +            */
>>
>> The comment doesn't apply anymore, does it?
> 
> Why not? I'm guessing it's because the driver is written with the
> assumption it is running on a little endian core.
> 
> In that case, yes, it does not apply. This was taken verbatim from the
> Linux driver, so probably why this comment stayed around.
> 
>>> +           cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
>>> +           cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
>>> +           break;
>>> +   default:
>>> +           return -ENOENT;
>>> +   }
>>> +
>>> +   return 0;
>>> +}
>>> +
>>> +static void arm_smmu_cmdq_insert_cmd(struct arm_smmu_device *smmu, __u64 
>>> *cmd)
>>> +{
>>> +   struct arm_smmu_queue *q = &smmu->cmdq.q;
>>> +
>>> +   queue_insert_raw(q, cmd);
>>
>> nit: you could drop queue_insert_raw() and insert its content here
> 
> Done.
> 
>>> +}
>>> +
>>> +static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
>>> +                               struct arm_smmu_cmdq_ent *ent)
>>> +{
>>> +   u64 cmd[CMDQ_ENT_DWORDS];
>>> +
>>> +   if (arm_smmu_cmdq_build_cmd(cmd, ent)) {
>>> +           printk("WARN: SMMU ignoring unknown CMDQ opcode 0x%x\n",
>>> +                    ent->opcode);
>>> +           return;
>>> +   }
>>> +
>>> +   spin_lock(&smmu->cmdq.lock);
>>> +   arm_smmu_cmdq_insert_cmd(smmu, cmd);
>>> +   spin_unlock(&smmu->cmdq.lock);
>>> +}
>>> +
>>> +static void arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
>>> +{
>>> +   struct arm_smmu_cmdq_ent ent = { .opcode = CMDQ_OP_CMD_SYNC };
>>> +   u64 cmd[CMDQ_ENT_DWORDS];
>>> +
>>> +   arm_smmu_cmdq_build_cmd(cmd, &ent);
>>> +
>>> +   spin_lock(&smmu->cmdq.lock);
>>> +   arm_smmu_cmdq_insert_cmd(smmu, cmd);
>>> +   spin_unlock(&smmu->cmdq.lock);
>>> +}
>>> +
>>> +/* Stream table manipulation functions */
>>> +static void
>>> +arm_smmu_write_strtab_l1_desc(__u64 *dst, struct arm_smmu_strtab_l1_desc 
>>> *desc)
>>> +{
>>> +   u64 val = 0;
>>> +
>>> +   val |= FIELD_PREP(STRTAB_L1_DESC_SPAN, desc->span);
>>> +   val |= desc->l2ptr_dma & STRTAB_L1_DESC_L2PTR_MASK;
>>> +
>>> +   /* Assuming running on Little endian cpu */
>>> +   *dst = val;
>>> +   dsb(ish);
>>
>> dsb(ishst)? It might be a good idea to add a small comment before each
>> memory barrier. I'm not sure what this one does for example, does it order
>> against the subsequent write to SMMU_STRTAB_BASE? If that's the case then
>> you could move the barrier closer to that write, so it doesn't have to be
>> done for each L1 descriptor.
> 
> The idea was to have a data barrier whenever there was any updates to
> any of the data structures that the SMMU uses.
> 
> Since someone is likely to invalidate the L1 descriptor (though we don't
> do it right now, we should), it makes sense to be sure the write went
> through before telling the SMMU to read it back.
> 
>>> +}
>>> +
>>> +static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 
>>> sid)
>>> +{
>>> +   struct arm_smmu_cmdq_ent cmd = {
>>> +           .opcode = CMDQ_OP_CFGI_STE,
>>> +           .cfgi   = {
>>> +                   .sid    = sid,
>>> +                   .leaf   = true,
>>
>> You probably need leaf = false, because the SMMU could cache l1 stream
>> table descriptors. I think the Linux driver gets away with it because it
>> only modifies l1 descriptors before making the stream table visible to the
>> SMMU through SMMU_STRTAB_BASE (I haven't verified that claim).
> 
> Hm, I don't think the Linux driver only modifies L1 descriptors before
> making the stream table visible to the SMMU.
> 
> arm_smmu_add_device() calls arm_smmu_init_l2_strtab() which modifies the
> L1 descriptor. Since a device can be added after the SMMU is
> initialised, the Linux driver makes changes to L1 descriptors after the
> SMMU is up and the stream table is visible.
> 
> Either way, we should be invalidating the L1 descriptors when we
> actually update them. And the best place for that would be
> arm_smmu_write_strtab_l1_desc().
> 
> Unfortunately, I don't think there is a command to invalidate just a L1
> descriptor. So we'll probably have to hack around it by invalidating the
> first STE for the L1 descriptor with leaf = false.
> 
> Any better ideas?
> 
>>> +           },
>>> +   };
>>> +
>>> +   arm_smmu_cmdq_issue_cmd(smmu, &cmd);
>>> +   arm_smmu_cmdq_issue_sync(smmu);
>>> +}
>>> +
>>> +static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 
>>> sid,
>>> +                                 __u64 *guest_ste, __u64 *dst,
>>> +                                 bool bypass, u32 vmid)
>>> +{
>>> +   struct paging_structures *pg_structs = &this_cell()->arch.mm;
>>> +   u64 val = dst[0], vttbr, mask;
>>> +
>>> +   /* Bypass */
>>> +   if (bypass) {
>>> +           dst[0] = STRTAB_STE_0_V;
>>
>> Hm, won't the "dst[0] = val" below clear V?
> 
> Yes it will. This probably got messed up in a bit of last-minute
> refactoring I did. Fixed.
> 
>> STE writes in this function might need a little more care. At this point
>> for example, the SMMU could issue C_BAD_STE events because the STE is
>> marked valid but is in inconsistent state. It might be best to do the same
>> as the Linux driver: ensure dst[0].V is clear, write dst[1-3] first, issue
>> a sync(), then write dst[0] in a single access.
> 
> All right. The next patch also touches this function. That will need
> fixing too.
> 
>>> +           dsb(ish);
>>> +           val = FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
>>> +           dst[0] = val;
>>> +           dst[1] = FIELD_PREP(STRTAB_STE_1_SHCFG,
>>> +                               STRTAB_STE_1_SHCFG_INCOMING);
>>> +           dst[2] = vmid;
>>> +           dsb(ish);
>>> +           if (smmu) {
>>> +                   arm_smmu_sync_ste_for_sid(smmu, sid);
>>> +           }
>>> +           return;
>>> +   }
>>> +
>>> +   dst[2] = FIELD_PREP(STRTAB_STE_2_S2VMID, vmid) |
>>
>> One more restriction on the hardware is that the VMID size used by the CPU
>> must be smaller or equal the size of the VMID supported by the SMMU. If
>> the CPU is using 16-bit VMID (TCR.VS == 1) but SMMU_IDR0.VMID16 == 0, then
>> setting the top VMID bits here is illegal.
> 
> We're pulling the VMID from cell->config->id, which is the cell id. This
> is a linearly allocated number, so you'd need at least 257 cells to set
> any of the top bits. In case of Jailhouse, that would mean at least 257
> CPUs. I don't think anyone is using Jailhouse with 256+ CPUs, so this is
> not a big problem for now.
> 
> And say we do overflow. Say we do have more than 257 cells running. What
> do you set the VMID to then? You can't wrap the value back because that
> would conflict with another cell. The only course then would be to
> terminate the cell, or not use the SMMU for it at all.
> 
> Not using the SMMU could work, but it means there would be complications
> when the cell expects stage 1 translations.
> 
> Again, to me it looks like a problem that is not likely to occur. Jan,
> your thoughts about it?
> 
>>> +            FIELD_PREP(STRTAB_STE_2_VTCR, VTCR_CELL) |
>>> +            STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
>>> +            STRTAB_STE_2_S2R;
>>
>>> +
>>> +   vttbr = paging_hvirt2phys(pg_structs->root_table);
>>> +   dst[3] = vttbr & STRTAB_STE_3_S2TTB_MASK;
>>> +
>>> +   dst[0] |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
>>> +   dst[0] |= STRTAB_STE_0_V;
>>
>> Here as well, it would be better to sync the rest of the STE, then write
>> dst[0] in a single access.
> 
> Will fix.
> 
>>> +   dsb(ish);
>>
>> Not sure that's needed, the Linux driver doesn't have a dsb() here
> 
> Why not? Do you not want to make sure the writes actually go through
> before you sync the STE?
> 
>>> +
>>> +   arm_smmu_sync_ste_for_sid(smmu, sid);
>>> +}
>>> +
>>> +static void arm_smmu_init_bypass_stes(u64 *strtab, unsigned int nent)
>>> +{
>>> +   unsigned int i;
>>> +
>>> +   for (i = 0; i < nent; ++i) {
>>> +           arm_smmu_write_strtab_ent(NULL, -1, NULL, strtab, true,
>>> +                                     (u32)this_cell()->config->id);
>>> +           strtab += STRTAB_STE_DWORDS;
>>> +   }
>>> +}
>>> +
>>> +static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
>>> +{
>>> +   void *strtab;
>>> +   u64 reg;
>>> +   u32 size;
>>> +   struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
>>> +
>>> +   size = (1 << smmu->sid_bits) * STRTAB_STE_SIZE;
>>> +   strtab = page_alloc_aligned(&mem_pool, PAGES(size));
>>> +   if (!strtab) {
>>> +           printk("ERROR: SMMU failed to allocate l1 stream table (%u 
>>> bytes)\n",
>>> +                  size);
>>> +           return -ENOMEM;
>>> +   }
>>> +   cfg->strtab_dma = paging_hvirt2phys(strtab);
>>> +   cfg->strtab = strtab;
>>> +   cfg->num_l1_ents = 1 << smmu->sid_bits;
>>> +
>>> +   /* Configure strtab_base_cfg for a linear table covering all SIDs */
>>> +   reg  = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_LINEAR);
>>> +   reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
>>> +   cfg->strtab_base_cfg = reg;
>>> +
>>> +   arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
>>> +   return 0;
>>> +}
>>> +
>>> +static int arm_smmu_init_l1_strtab(struct arm_smmu_device *smmu)
>>> +{
>>> +   struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
>>> +   u32 size = sizeof(*cfg->l1_desc) * cfg->num_l1_ents;
>>> +   void *strtab = smmu->strtab_cfg.strtab;
>>> +   unsigned int i;
>>> +
>>> +   cfg->l1_desc = page_alloc(&mem_pool, PAGES(size));
>>> +   if (!cfg->l1_desc) {
>>> +           printk("ERROR: SMMU failed to allocate l1 stream table desc\n");
>>> +           return -ENOMEM;
>>> +   }
>>> +
>>> +   for (i = 0; i < cfg->num_l1_ents; ++i) {
>>> +           memset(&cfg->l1_desc[i], 0, sizeof(*cfg->l1_desc));
>>
>> I might be wrong, but I think pages obtained from mem_pool are initialized
>> to zero, so you don't need this.
> 
> Yes, mem_pool uses the flag PAGE_SCRUB_ON_FREE (which means pages
> obtained from mem_pool will be zeroed out on free), but it is not
> documented anywhere that pages allocated via mem_pool will be zeroed
> out. I'd rather stay on the safer side here.
> 
>>> +           arm_smmu_write_strtab_l1_desc(strtab, &cfg->l1_desc[i]);
>>> +           cfg->l1_desc[i].active_stes = 0;
>>
>> Already initialized to zero
> 
> Fixed.
> 
>>> +           strtab += STRTAB_L1_DESC_SIZE;
>>> +   }
>>> +
>>> +   return 0;
>>> +}
>>
>> [...]
>>> +static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
>>> +                              struct arm_smmu_queue *q,
>>> +                              unsigned long prod_off,
>>> +                              unsigned long cons_off,
>>> +                              unsigned long dwords)
>>> +{
>>> +   /* Queue size is capped to 4K. So allocate 1 page */
>>> +   q->base = page_alloc(&mem_pool, 1);
>>> +   if (!q->base) {
>>> +           printk("ERROR: SMMU failed to allocate queue\n");
>>> +           return -ENOMEM;
>>> +   }
>>> +   q->base_dma = paging_hvirt2phys(q->base);;
>>
>> (nit: ;;)
> 
> Fixed.
> 
>> [...]
>>> +static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
>>> +{
>>> +   int ret;
>>> +   u32 reg, enables;
>>> +   struct arm_smmu_cmdq_ent cmd;
>>> +
>>> +   /* Clear CR0 and sync (disables SMMU and queue processing) */
>>> +   reg = mmio_read32(smmu->base + ARM_SMMU_CR0);
>>> +   if (reg & CR0_SMMUEN)
>>> +           printk("ERROR: SMMU currently enabled! Resetting...\n");
>>
>> I'm guessing the SMMU is disabled in the root cell before we enable 
>> jailhouse?
> 
> If you want stage 2 only, it is. But if you want stage 1 + 2, you need
> to enable it in Linux on boot so any devices that need to go through the
> SMMU are properly initialized.
> 
> Then when Jailhouse comes up, we need to reset it.
> 
>> [...]
>>
>>> +static void arm_smmu_uninit_l2_strtab(struct arm_smmu_device *smmu, u32 
>>> sid)
>>> +{
>>> +   struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
>>> +   struct arm_smmu_strtab_l1_desc *desc;
>>> +   void *strtab;
>>> +   u32 size;
>>> +
>>> +   desc = &cfg->l1_desc[sid >> STRTAB_SPLIT];
>>> +
>>> +   desc->active_stes--;
>>> +   if (desc->active_stes)
>>> +           return;
>>> +
>>> +   size = 1 << (STRTAB_SPLIT + STRTAB_STE_DWORDS_BITS + 3);
>>> +   page_free(&mem_pool, desc->l2ptr, PAGES(size));
>>
>> Might be safer to free the page after clearing the l1_desc and issuing a
>> CMDQ_CFGI_STE? Here the SMMU can still access the page.
> 
> This function is only called during cell_exit(). Will the SMMU have any
> transactions while a cell is exiting?
> 
>>> +   desc->l2ptr = NULL;
>>> +   desc->l2ptr_dma = 0;
>>> +   desc->span = 0;
>>> +   strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS];
>>> +   arm_smmu_write_strtab_l1_desc(strtab, desc);
>>> +
>>> +   return;
>>
>> nit: not needed
> 
> Fixed.
> 
>> [...]
>>> +static int arm_smmuv3_cell_init(struct cell *cell)
>>> +{
>>> +   struct jailhouse_iommu *iommu;
>>> +   struct arm_smmu_cmdq_ent cmd;
>>> +   int ret, i, j, sid;
>>> +
>>> +   for (i = 0; i < JAILHOUSE_MAX_IOMMU_UNITS; i++) {
>>> +           iommu = &system_config->platform_info.arm.iommu_units[i];
>>> +           if (iommu->type != JAILHOUSE_IOMMU_SMMUV3)
>>> +                   continue;
>>> +
>>> +           for_each_stream_id(sid, cell->config, j) {
>>> +                   ret = arm_smmu_init_ste(&smmu[i], sid, 
>>> cell->config->id);
>>> +                   if (ret) {arch_paging_flush_page_tlbs
>>> +                           printk("ERROR: SMMU INIT ste failed: sid = 
>>> %d\n",
>>> +                                  sid);
>>> +                           return ret;
>>> +                   }
>>> +           }
>>> +   }
>>> +
>>> +   cmd.opcode      = CMDQ_OP_TLBI_S12_VMALL;
>>> +   cmd.tlbi.vmid   = cell->config->id;
>>> +   arm_smmu_cmdq_issue_cmd(smmu, &cmd);
>>> +   arm_smmu_cmdq_issue_sync(smmu);
>>
>> By the way, shouldn't there be a CMDQ_OP_TLBI_S2_IPA whenever we do
>> arch_paging_flush_page_tlbs() on the CPU side?
> 
> I don't think so. Firstly, arch_paging_flush_page_tlbs() is called only
> for changes to hypervisor paging structs. Right now, the hypervisor does
> not use the SMMU at all. It is only for guest use.
> 
> And the guest memory mappings are static and will remain constant from
> start till exit. So we should only need to flush these entries when
> starting a cell (which we do), and when exiting a cell.
> 
> This makes me realise, we should be calling CMDQ_OP_TLBI_S12_VMALL on
> cell_exit() too.
> 
>> Thanks,
>> Jean
>>
> 

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/de802594-927d-4588-af42-1c51c1f8c915%40siemens.com.
For more options, visit https://groups.google.com/d/optout.

Re: [PATCH v2 5/6] arm64: iommu: smmu-v3: Add data structure initialization and stage 2 for SMMUv3

Reply via email to