On Wed, Jan 20, 2016 at 01:14:35PM -0800, Shaohua Li wrote:
> > My understanding from the above is that the only issue with our
> > patchset was not dealing with pfn_limit. I can just fix that and
> > repost, sounds good?
>
> Sure, please do it. As for the patches, I'm not comfortable with the
> per-cpu deferred invalidation. One important benefit of the IOMMU is
> isolation. Deferred invalidation already loosens the isolation, and
> per-cpu invalidation loosens it further. It would be better if we
> could flush all per-cpu invalidation entries once one cpu hits its
> per-cpu limit. You should also look at CPU hotplug: we don't want to
> lose the invalidation entries if a cpu is hot removed.
I'll look into these.
> The per-cpu iova implementation looks unnecessarily complicated. I
> know you are referring to the paper, but the whole point is batched
> allocation/free.
Batched allocation/free isn't enough. It still creates spinlock
contention, even with per-cpu invalidation (which gets rid of
async_umap_flush_lock). Here are sample results from our memcached
test (throughput of querying 16 memcached instances on a 16-core box
with an Intel XL710 NIC):
batched alloc/free, iommu=on:
313,161 memcached transactions/sec (= 29% of iommu=off)
batched alloc/free + per-cpu invalidations, iommu=on:
434,590 memcached transactions/sec (= 40% of iommu=off)
perf report:
  61.15%  0.33%  swapper  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
          |
          ---_raw_spin_lock_irqsave
             |
             |--87.81%-- free_iova_array
             |--11.71%-- alloc_iova
In contrast, the per-cpu magazine cache in our patchset enables iova
allocation/free to complete without accessing the iova allocator at
all. So we don't touch the rbtree spinlock, and also complete iova
allocation in constant time, which avoids the linear-time allocations
that the iova allocator suffers from. (These were described in the
paper "Efficient intra-operating system protection against harmful
DMAs", presented at the USENIX FAST 2015 conference.) The end result:
magazines cache + per-cpu invalidations, iommu=on:
1,067,586 memcached transactions/sec (= 98% of iommu=off)
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu