I saw that Will has already sent the pull request. But, FWIW, we are
seeing roughly the same performance as with the v1 patchset. For the PCI
NIC, Zhou again found that the performance drop with SMMU enabled
improves from ~15% to ~8%, and for the integrated storage controller (a
platform device) we still see a drop of about 50%, depending on data
rates (Leizhen has been working on fixing this).
Thanks for confirming. Following Joerg's suggestion that the storage
workloads may still depend on rbtree performance - it had slipped my
mind that even with small block sizes those could well be grouped into
scatterlists large enough to trigger a >64-page IOVA allocation - I've
taken the liberty of cooking up a simplified version of Leizhen's rbtree
optimisation series in the iommu/iova branch of my tree. I'll follow up
on that after the merge window, but if anyone wants to play with it in
the meantime feel free.
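For anyone following along, the reasoning above can be sketched out. In Linux's IOVA allocator (drivers/iommu/iova.c), allocations up to a size limit are served from fast per-CPU caches, while larger ones fall back to the rbtree search whose cost Leizhen's series targets; so even small-block I/O, once coalesced into a large enough scatterlist, can end up on the slow path. The threshold constant and helper names below are illustrative assumptions, not the kernel's actual identifiers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache limit for this sketch: allocations larger than this
 * many pages bypass the per-CPU IOVA caches and take the rbtree path.
 * (The real limit is derived from IOVA_RANGE_CACHE_MAX_SIZE in
 * drivers/iommu/iova.c; the exact value here is an assumption.) */
#define RCACHE_MAX_PAGES 64

/* True if an allocation of 'pages' can be served from the fast per-CPU
 * cache; false means it falls back to the slower rbtree search. */
static bool iova_uses_rcache(size_t pages)
{
    return pages <= RCACHE_MAX_PAGES;
}

/* Model coalescing small I/O blocks into one scatterlist mapping:
 * the IOVA allocation covers the total page count, not each block. */
static size_t sg_total_pages(size_t nr_segments, size_t pages_per_segment)
{
    return nr_segments * pages_per_segment;
}
```

This is why a storage workload issuing, say, 128 single-page segments in one scatterlist still triggers a >64-page allocation and exercises the rbtree, even though each individual block is tiny.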
Just a reminder that we also saw poor performance with our integrated
NIC on your v1 patchset (I can push for v2 patchset testing, but would
expect the same result).
We might now be able to include an LSI 3108 PCI SAS card in our testing
as well, to give a broader set of results.
John
Robin.
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu