On 2022-06-09 13:49, Jason Gunthorpe wrote:
On Thu, Jun 09, 2022 at 03:08:10PM +0800, Lu Baolu wrote:
The IOMMU page tables are updated using iommu_map/unmap() interfaces.
Currently, there is no mandatory requirement for drivers to use locks
to ensure concurrent updates to page tables, because it's assumed that
overlapping IOVA ranges do not have concurrent updates. Therefore the
IOMMU drivers only need to take care of concurrent updates to level
page table entries.

But enabling new features challenges this assumption. For example, the
hardware assisted dirty page tracking feature requires scanning page
tables in interfaces other than mapping and unmapping. This might result
in a use-after-free scenario in which a level page table has been freed
by the unmap() interface, while another thread is scanning the next level
page table.

This adds RCU-protected page free support so that the pages are really
freed and reused after a RCU grace period. Hence, the page tables are
safe for scanning within a rcu_read_lock critical region. Considering
that scanning the page table is a rare case, this also adds a domain
flag and the RCU-protected page free is only used when this flat is set.

Signed-off-by: Lu Baolu <baolu...@linux.intel.com>
  include/linux/iommu.h |  9 +++++++++
  drivers/iommu/iommu.c | 23 +++++++++++++++++++++++
  2 files changed, 32 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5e1afe169549..6f68eabb8567 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -95,6 +95,7 @@ struct iommu_domain {
        void *handler_token;
        struct iommu_domain_geometry geometry;
        struct iommu_dma_cookie *iova_cookie;
+       unsigned long concurrent_traversal:1;
static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
@@ -657,6 +658,12 @@ static inline void dev_iommu_priv_set(struct device *dev, 
void *priv)
        dev->iommu->priv = priv;
+static inline void domain_set_concurrent_traversal(struct iommu_domain *domain,
+                                                  bool value)
+       domain->concurrent_traversal = value;

?? If you want it to be a driver opt in I would just add a flags to
the domain ops. "DOMAIN_FLAG_RCU_FREE_PAGES"

Is there a significant benefit to keeping both paths, or could we get away with just always using RCU? Realistically, pagetable pages aren't likely to be freed all that frequently, except perhaps at domain teardown, but that shouldn't really be performance-critical, and I guess we could stick an RCU sync point in iommu_domain_free() if we're really worried about releasing larger quantities of pages back to the allocator ASAP?

It's already a driver opt-in to use the iommu_iotlb_gather freelist in the first place, and right now the ones that do are also the ones that do lock-free table walks so will ultimately all want this as well.


+void iommu_free_pgtbl_pages(struct iommu_domain *domain,
+                           struct list_head *pages)
+       struct page *page, *next;
+       if (!domain->concurrent_traversal) {
+               put_pages_list(pages);
+               return;
+       }
+       list_for_each_entry_safe(page, next, pages, lru) {
+               list_del(&page->lru);
+               call_rcu(&page->rcu_head, pgtble_page_free_rcu);
+       }

It seems OK, but I wonder if there is benifit to using
put_pages_list() from the rcu callback

iommu mailing list

Reply via email to