Re: [PATCH 1/3] iommu: match the original algorithm
On Wed, Nov 27, 2019 at 10:01 AM John Garry wrote: > > On 21/11/2019 00:13, Cong Wang wrote: > > The IOVA cache algorithm implemented in IOMMU code does not > > exactly match the original algorithm described in the paper. > > > > Particularly, it doesn't need to free the loaded empty magazine > > when trying to put it back to global depot. > > > > This patch makes it exactly match the original algorithm. > > > > I haven't gone into the details, but this patch alone is giving this: > > root@(none)$ [ 123.857024] kmemleak: 8 new suspected memory leaks (see > /sys/kernel/debug/kmemleak) Ah, thanks for catching it! I see what I missed, I should pre-allocate those empty entries in order to make it work correctly. I didn't catch this because this was tested on a production machine where we can't afford CONFIG_DEBUG_KMEMLEAK=y for obvious performance concerns. Anyway, I will fix this and send v2. Thanks! ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu: match the original algorithm
> On Nov 27, 2019, at 1:01 PM, John Garry wrote: > > I haven't gone into the details, but this patch alone is giving this: > > root@(none)$ [ 123.857024] kmemleak: 8 new suspected memory leaks (see > /sys/kernel/debug/kmemleak) > > root@(none)$ cat /sys/kernel/debug/kmemleak > unreferenced object 0x002339843000 (size 2048): > comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) > hex dump (first 32 bytes): >00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > backtrace: >[<1d2710bf>] kmem_cache_alloc+0x188/0x260 >[] init_iova_domain+0x1e8/0x2a8 >[<2646fc92>] iommu_setup_dma_ops+0x200/0x710 >[ ] arch_setup_dma_ops+0x80/0x128 >[<994e1e43>] acpi_dma_configure+0x11c/0x140 >[ ] pci_dma_configure+0xe0/0x108 >[ ] really_probe+0x210/0x548 >[<87884b1b>] driver_probe_device+0x7c/0x148 >[<10af2936>] device_driver_attach+0x94/0xa0 >[ ] __driver_attach+0xa4/0x110 >[ ] bus_for_each_dev+0xe8/0x158 >[ ] driver_attach+0x30/0x40 >[<3cf39ba8>] bus_add_driver+0x234/0x2f0 >[<43830a45>] driver_register+0xbc/0x1d0 >[ ] __pci_register_driver+0xb0/0xc8 >[ ] sas_v3_pci_driver_init+0x20/0x28 > unreferenced object 0x002339844000 (size 2048): > comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) > > [snip] > > And I don't feel like continuing until it's resolved Thanks for talking a hit by this before me. It is frustrating that people tend not to test their patches properly with things like kmemleak. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/3] iommu: match the original algorithm
On 21/11/2019 00:13, Cong Wang wrote: The IOVA cache algorithm implemented in IOMMU code does not exactly match the original algorithm described in the paper. Particularly, it doesn't need to free the loaded empty magazine when trying to put it back to global depot. This patch makes it exactly match the original algorithm. I haven't gone into the details, but this patch alone is giving this: root@(none)$ [ 123.857024] kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak) root@(none)$ cat /sys/kernel/debug/kmemleak unreferenced object 0x002339843000 (size 2048): comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 backtrace: [<1d2710bf>] kmem_cache_alloc+0x188/0x260 [] init_iova_domain+0x1e8/0x2a8 [<2646fc92>] iommu_setup_dma_ops+0x200/0x710 [ ] arch_setup_dma_ops+0x80/0x128 [<994e1e43>] acpi_dma_configure+0x11c/0x140 [ ] pci_dma_configure+0xe0/0x108 [ ] really_probe+0x210/0x548 [<87884b1b>] driver_probe_device+0x7c/0x148 [<10af2936>] device_driver_attach+0x94/0xa0 [ ] __driver_attach+0xa4/0x110 [ ] bus_for_each_dev+0xe8/0x158 [ ] driver_attach+0x30/0x40 [<3cf39ba8>] bus_add_driver+0x234/0x2f0 [<43830a45>] driver_register+0xbc/0x1d0 [ ] __pci_register_driver+0xb0/0xc8 [ ] sas_v3_pci_driver_init+0x20/0x28 unreferenced object 0x002339844000 (size 2048): comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) [snip] And I don't feel like continuing until it's resolved Thanks, John Cc: Joerg Roedel Signed-off-by: Cong Wang --- drivers/iommu/iova.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 41c605b0058f..92f72a85e62a 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -900,7 +900,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad, if (!iova_magazine_full(cpu_rcache->loaded)) { can_insert = true; - } else if (!iova_magazine_full(cpu_rcache->prev)) { + } else if (iova_magazine_empty(cpu_rcache->prev)) { swap(cpu_rcache->prev, cpu_rcache->loaded); can_insert = true; } else { @@ -909,8 +909,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad, if (new_mag) { spin_lock(>lock); if (rcache->depot_size < MAX_GLOBAL_MAGS) { - rcache->depot[rcache->depot_size++] = - cpu_rcache->loaded; + swap(rcache->depot[rcache->depot_size], cpu_rcache->prev); + swap(cpu_rcache->prev, cpu_rcache->loaded); + rcache->depot_size++; } else { mag_to_free = cpu_rcache->loaded; } @@ -963,14 +964,15 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache, if (!iova_magazine_empty(cpu_rcache->loaded)) { has_pfn = true; - } else if (!iova_magazine_empty(cpu_rcache->prev)) { + } else if (iova_magazine_full(cpu_rcache->prev)) { swap(cpu_rcache->prev, cpu_rcache->loaded); has_pfn = true; } else { spin_lock(>lock); if (rcache->depot_size > 0) { - iova_magazine_free(cpu_rcache->loaded); - cpu_rcache->loaded = rcache->depot[--rcache->depot_size]; + swap(rcache->depot[rcache->depot_size - 1], cpu_rcache->prev); + swap(cpu_rcache->prev, cpu_rcache->loaded); + rcache->depot_size--; has_pfn = true; } spin_unlock(>lock); ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/3] iommu: match the original algorithm
The IOVA cache algorithm implemented in IOMMU code does not exactly match the original algorithm described in the paper. Particularly, it doesn't need to free the loaded empty magazine when trying to put it back to global depot. This patch makes it exactly match the original algorithm. Cc: Joerg Roedel Signed-off-by: Cong Wang --- drivers/iommu/iova.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 41c605b0058f..92f72a85e62a 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -900,7 +900,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad, if (!iova_magazine_full(cpu_rcache->loaded)) { can_insert = true; - } else if (!iova_magazine_full(cpu_rcache->prev)) { + } else if (iova_magazine_empty(cpu_rcache->prev)) { swap(cpu_rcache->prev, cpu_rcache->loaded); can_insert = true; } else { @@ -909,8 +909,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad, if (new_mag) { spin_lock(>lock); if (rcache->depot_size < MAX_GLOBAL_MAGS) { - rcache->depot[rcache->depot_size++] = - cpu_rcache->loaded; + swap(rcache->depot[rcache->depot_size], cpu_rcache->prev); + swap(cpu_rcache->prev, cpu_rcache->loaded); + rcache->depot_size++; } else { mag_to_free = cpu_rcache->loaded; } @@ -963,14 +964,15 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache, if (!iova_magazine_empty(cpu_rcache->loaded)) { has_pfn = true; - } else if (!iova_magazine_empty(cpu_rcache->prev)) { + } else if (iova_magazine_full(cpu_rcache->prev)) { swap(cpu_rcache->prev, cpu_rcache->loaded); has_pfn = true; } else { spin_lock(>lock); if (rcache->depot_size > 0) { - iova_magazine_free(cpu_rcache->loaded); - cpu_rcache->loaded = rcache->depot[--rcache->depot_size]; + swap(rcache->depot[rcache->depot_size - 1], cpu_rcache->prev); + swap(cpu_rcache->prev, cpu_rcache->loaded); + rcache->depot_size--; has_pfn = true; } spin_unlock(>lock); -- 2.21.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu