On 2020/6/27 5:29 AM, Peter Xu wrote:
Hi, Eugenio,

(CCing Eric, Yan and Michael too)

On Fri, Jun 26, 2020 at 08:41:22AM +0200, Eugenio Pérez wrote:
diff --git a/memory.c b/memory.c
index 2f15a4b250..7f789710d2 100644
--- a/memory.c
+++ b/memory.c
@@ -1915,8 +1915,6 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
- assert(entry->iova >= notifier->start && entry_end <= notifier->end);
I can understand that removing the assertion should solve the issue; however,
imho the major issue is not this single assertion but the whole addr_mask
issue behind it with virtio...

I don't get it here; it looks to me like the range was from the guest IOMMU drivers.

For normal IOTLB invalidations, we were trying our best to always make
IOMMUTLBEntry contain a valid addr_mask of the form 2**N-1.  E.g., that's what
we're doing with the loop in vtd_address_space_unmap().

I'm sure such an assumption can work for any type of IOMMU.
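
To illustrate the splitting discussed above, here is a minimal sketch of how an
arbitrary range can be cut into naturally aligned power-of-two chunks, so that
every chunk has an addr_mask of the form 2**N-1.  This is not the code from
vtd_address_space_unmap(); the Chunk type and the helpers pow2_floor() and
split_range() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the (iova, addr_mask) pair of an IOTLB entry. */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;   /* always 2**N - 1 in this sketch */
} Chunk;

/* Largest power of two that is <= x (x must be non-zero). */
static uint64_t pow2_floor(uint64_t x)
{
    uint64_t p = 1;

    assert(x != 0);
    while (p <= x / 2) {
        p <<= 1;
    }
    return p;
}

/*
 * Split [start, start + size) into chunks whose size is a power of two
 * and whose start is aligned to that size.  At most max chunks are
 * written to out; the return value is the number of chunks needed.
 */
static size_t split_range(uint64_t start, uint64_t size,
                          Chunk *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL || max == 0);
    while (size) {
        uint64_t chunk = pow2_floor(size);
        /* Lowest set bit of start limits the alignment (0 means no limit). */
        uint64_t low = start & (~start + 1);

        if (low && low < chunk) {
            chunk = low;
        }
        if (n < max) {
            out[n].iova = start;
            out[n].addr_mask = chunk - 1;
        }
        n++;
        start += chunk;
        size -= chunk;
    }
    return n;
}
```

With this sketch, a range starting at 0x1000 with size 0x7000 needs three
notifications (sizes 0x1000, 0x2000 and 0x4000), which is the kind of extra
roundtrip that a (start, len) interface would avoid.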

But this is not the first time that we may want to break this assumption for
virtio, making the IOTLB a tuple of (start, len) so that len no longer has to
be an address mask.  That seems to be more efficient for things like vhost
because IOTLBs there are not page based, so always guaranteeing the addr_mask
is inefficient: it takes quite a lot more roundtrips to invalidate the same
range.  Here we've encountered another issue, triggering the assertion with
virtio-net, but only with the old RHEL7 guest.

I'm wondering whether we can make the IOTLB invalidation configurable by
letting the backend of the notifier specify, in some way, whether it can handle
an arbitrary address range.  So we still have the guaranteed addr_masks by
default (since I still don't think totally breaking the addr_mask restriction
is wise...), however we can allow special backends to take advantage of
arbitrary (start, len) ranges for reasons like performance.

To do that, a quick idea is to introduce a flag IOMMU_NOTIFIER_ARBITRARY_MASK
to IOMMUNotifierFlag, to declare that the iommu notifier (and its backend) can
take an arbitrary address mask; the mask can then be any value and effectively
becomes a length rather than an addr_mask.  Then for every iommu notify() we
can directly deliver whatever we've got from the upper layer to this notifier.
With the new flag, vhost can do iommu_notifier_init() with
UNMAP|ARBITRARY_MASK so it declares this capability.  Then, whether for device
IOTLB or normal IOTLB, we skip the complicated procedure of splitting a big
range into small ranges that each have a strict addr_mask, and directly deliver
the message to the iommu notifier.  E.g., we can skip the loop in
vtd_address_space_unmap() if the notifier has the ARBITRARY flag set.
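
A rough sketch of what the flag-based dispatch could look like, written as a
standalone toy rather than against the real QEMU types.  Only the idea of an
ARBITRARY_MASK flag comes from the proposal above; the Notifier struct, the
NOTIFIER_* constants, notify() and unmap_range() are made-up names for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags; they only mirror the idea of UNMAP|ARBITRARY_MASK. */
enum {
    NOTIFIER_UNMAP          = 1 << 0,
    NOTIFIER_ARBITRARY_MASK = 1 << 1,
};

/* Toy notifier that just records what it was told. */
typedef struct {
    unsigned flags;
    unsigned calls;       /* number of notifications delivered */
    uint64_t last_iova;
    uint64_t last_mask;   /* size - 1 of the last notification */
} Notifier;

static void notify(Notifier *n, uint64_t iova, uint64_t mask)
{
    n->calls++;
    n->last_iova = iova;
    n->last_mask = mask;
}

/* Largest power of two that is <= x (x must be non-zero). */
static uint64_t pow2_floor(uint64_t x)
{
    uint64_t p = 1;

    while (p <= x / 2) {
        p <<= 1;
    }
    return p;
}

/* Invalidate [start, start + size) for one notifier. */
static void unmap_range(Notifier *n, uint64_t start, uint64_t size)
{
    assert(size > 0);

    if (n->flags & NOTIFIER_ARBITRARY_MASK) {
        /* Backend accepts any length: deliver the range in one go. */
        notify(n, start, size - 1);
        return;
    }

    /* Default: keep the 2**N-1 guarantee by splitting the range. */
    while (size) {
        uint64_t chunk = pow2_floor(size);
        uint64_t low = start & (~start + 1);   /* alignment of start */

        if (low && low < chunk) {
            chunk = low;
        }
        notify(n, start, chunk - 1);
        start += chunk;
        size -= chunk;
    }
}
```

In this toy, a notifier created with UNMAP|ARBITRARY_MASK sees a single call
for a 0x7000-byte range at 0x1000, while a notifier without the flag sees
three aligned chunks.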

I'm not sure coupling IOMMU capability to notifier is the best choice.

How about just converting every notifier to use a range [start, end], and
moving the checks (e.g. the assert) into the actual notifier implementation
(vhost or vfio)?
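
As a sketch of that alternative, assuming an inclusive [start, end] range is
handed to every notifier, the alignment check could live in the backend that
needs it.  The function names below (range_is_aligned_pow2(),
strict_backend_notify(), relaxed_backend_notify()) are hypothetical and do not
correspond to existing vfio or vhost code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the inclusive range [start, end] has a power-of-two size and
 * start is aligned to that size, i.e. (end - start) is a valid addr_mask.
 */
static bool range_is_aligned_pow2(uint64_t start, uint64_t end)
{
    uint64_t mask;

    assert(end >= start);
    mask = end - start;                       /* size - 1 */
    return (mask & (mask + 1)) == 0 &&        /* size is a power of two */
           (start & mask) == 0;               /* start is size-aligned */
}

/* A backend that requires strict masks rejects anything else. */
static int strict_backend_notify(uint64_t start, uint64_t end)
{
    if (!range_is_aligned_pow2(start, end)) {
        return -1;                            /* caller must split first */
    }
    return 0;
}

/* A backend that handles arbitrary ranges accepts everything. */
static int relaxed_backend_notify(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return 0;
}
```

Whether a strict backend should fail, assert, or split the range itself is a
design decision this sketch leaves open.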


Then, the assert() is not accurate either, and may become something like:

diff --git a/memory.c b/memory.c
index 2f15a4b250..99d0492509 100644
--- a/memory.c
+++ b/memory.c
@@ -1906,6 +1906,7 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
      IOMMUNotifierFlag request_flags;
      hwaddr entry_end = entry->iova + entry->addr_mask;
+    IOMMUTLBEntry tmp = *entry;

       * Skip the notification if the notification does not overlap
@@ -1915,7 +1916,13 @@ void memory_region_notify_one(IOMMUNotifier *notifier,

-    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
+    if (notifier->notifier_flags & IOMMU_NOTIFIER_ARBITRARY_MASK) {
+        tmp.iova = MAX(tmp.iova, notifier->start);
+        tmp.addr_mask = MIN(tmp.addr_mask, notifier->end);
+        assert(tmp.iova <= tmp.addr_mask);
+    } else {
+        assert(entry->iova >= notifier->start && entry_end <= notifier->end);
+    }

      if (entry->perm & IOMMU_RW) {
          request_flags = IOMMU_NOTIFIER_MAP;
@@ -1924,7 +1931,7 @@ void memory_region_notify_one(IOMMUNotifier *notifier,

      if (notifier->notifier_flags & request_flags) {
-        notifier->notify(notifier, entry);
+        notifier->notify(notifier, &tmp);

Then we can keep the assert() for e.g. vfio, however vhost can skip it and even
get some further performance boosts.  Does that make sense?
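
One detail in the draft diff above may be worth double-checking: addr_mask is
an offset from iova, while notifier->end is an absolute address, so
MIN(tmp.addr_mask, notifier->end) and the following assert compare values of
different kinds.  Below is a minimal standalone sketch of a clamp that keeps
the two apart, assuming inclusive bounds and no wraparound of
iova + addr_mask.  The Entry type and clamp_entry() are hypothetical names, not
QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of IOMMUTLBEntry used here. */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;   /* here: length - 1, not necessarily 2**N - 1 */
} Entry;

/*
 * Clamp *e to the inclusive window [start, end].
 * Returns false (leaving *e untouched) if the two do not overlap.
 * Assumes e->iova + e->addr_mask does not wrap around.
 */
static bool clamp_entry(Entry *e, uint64_t start, uint64_t end)
{
    uint64_t entry_end = e->iova + e->addr_mask;
    uint64_t new_start, new_end;

    assert(end >= start);
    if (entry_end < start || e->iova > end) {
        return false;
    }

    new_start = e->iova > start ? e->iova : start;
    new_end = entry_end < end ? entry_end : end;

    e->iova = new_start;
    e->addr_mask = new_end - new_start;   /* offset relative to new_start */
    return true;
}
```

For example, an entry covering 0x1000..0x7fff clamped to a notifier window of
0x2000..0x4fff becomes iova 0x2000 with addr_mask 0x2fff.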

