On 15/08/2019 14:57, Will Deacon wrote:
On Thu, Aug 15, 2019 at 01:43:11PM +0100, Robin Murphy wrote:
On 14/08/2019 18:56, Will Deacon wrote:
Commit b6b65ca20bc9 ("iommu/io-pgtable-arm: Add support for non-strict
mode") added an unconditional call to io_pgtable_tlb_sync() immediately
after the case where we replace a block entry with a table entry during
an unmap() call. This is redundant, since the IOMMU API will call
iommu_tlb_sync() on this path and the patch in question mentions this:
| To save having to reason about it too much, make sure the invalidation
| in arm_lpae_split_blk_unmap() just performs its own unconditional sync
| to minimise the window in which we're technically violating the break-
| before-make requirement on a live mapping. This might work out redundant
| with an outer-level sync for strict unmaps, but we'll never be splitting
| blocks on a DMA fastpath anyway.
However, this sync gets in the way of deferred TLB invalidation for leaf
entries and is at best a questionable, unproven hack. Remove it.
Hey, that's my questionable, unproven hack! :P
I thought you'd like to remain anonymous, but I can credit you if you like?
It's not entirely clear to me how this gets in the way though - AFAICS the
intent of tlb_flush_leaf exactly matches the desired operation here, so
couldn't these just wait to be converted in patch #8?
Good point. I think there are two things:
1. Initially, I didn't plan to have tlb_flush_leaf() at all because
I didn't think it would be needed. Then I ran into the v7s CONT
stuff and ended up needing it after all (I think it's the only
user). So that's an oversight.
2. If we do the tlb_flush_leaf() here, then we could potentially
put a hole in the ongoing gather structure, but I suppose we
could do both a tlb_add_page() *and* a tlb_flush_leaf() to get
So yes, I probably could move this back if the sync is necessary but...
In principle the concern is that if the caller splits a block with
iommu_unmap_fast(), there's no guarantee of seeing an iommu_tlb_sync()
before returning to the caller, and thus there's the potential to run into a
TLB conflict on a subsequent access even if the endpoint was "good" and
didn't make any accesses *during* the unmap call.
... this just feels pretty theoretical to me. The fact of the matter is
that we're unable to do break before make because we can't reliably tolerate
faults. If the hardware actually requires BBM for correctness, then we
should probably explore proper solutions (e.g. quirks, avoiding block
mappings, handling faults) rather than emitting a random sync and hoping
for the best.
Did you add the sync just in case, or was it based on a real crash?
Nope, just a theoretical best-effort thing, which I'm certainly not
going to lose sleep over either way - I just felt compelled to question
the rationale which didn't seem to fit. Realistically, this
partial-unmap case is not well-defined in IOMMU API terms, and other
drivers don't handle it consistently. I think VFIO explicitly rejects
partial unmaps, so if we see them at all it's only likely to be from
GPU/SVA type users who in principle ought to be able to tolerate
transient faults from BBM anyway.
iommu mailing list