Re: [PATCH 1/3] iommu: match the original algorithm

2019-11-28 Thread Cong Wang
On Wed, Nov 27, 2019 at 10:01 AM John Garry  wrote:
>
> On 21/11/2019 00:13, Cong Wang wrote:
> > The IOVA cache algorithm implemented in IOMMU code does not
> > exactly match the original algorithm described in the paper.
> >
> > Particularly, it doesn't need to free the loaded empty magazine
> > when trying to put it back to global depot.
> >
> > This patch makes it exactly match the original algorithm.
> >
>
> I haven't gone into the details, but this patch alone is giving this:
>
> root@(none)$ [  123.857024] kmemleak: 8 new suspected memory leaks (see
> /sys/kernel/debug/kmemleak)

Ah, thanks for catching it! I see what I missed, I should pre-allocate those
empty entries in order to make it work correctly.

I didn't catch this because this was tested on a production machine where
we can't afford CONFIG_DEBUG_KMEMLEAK=y for obvious performance
concerns.

Anyway, I will fix this and send v2.

Thanks!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/3] iommu: match the original algorithm

2019-11-27 Thread Qian Cai



> On Nov 27, 2019, at 1:01 PM, John Garry  wrote:
> 
> I haven't gone into the details, but this patch alone is giving this:
> 
> root@(none)$ [  123.857024] kmemleak: 8 new suspected memory leaks (see 
> /sys/kernel/debug/kmemleak)
> 
> root@(none)$ cat /sys/kernel/debug/kmemleak
> unreferenced object 0x002339843000 (size 2048):
>  comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s)
>  hex dump (first 32 bytes):
>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>  backtrace:
>[<1d2710bf>] kmem_cache_alloc+0x188/0x260
>[] init_iova_domain+0x1e8/0x2a8
>[<2646fc92>] iommu_setup_dma_ops+0x200/0x710
>[] arch_setup_dma_ops+0x80/0x128
>[<994e1e43>] acpi_dma_configure+0x11c/0x140
>[] pci_dma_configure+0xe0/0x108
>[] really_probe+0x210/0x548
>[<87884b1b>] driver_probe_device+0x7c/0x148
>[<10af2936>] device_driver_attach+0x94/0xa0
>[] __driver_attach+0xa4/0x110
>[] bus_for_each_dev+0xe8/0x158
>[] driver_attach+0x30/0x40
>[<3cf39ba8>] bus_add_driver+0x234/0x2f0
>[<43830a45>] driver_register+0xbc/0x1d0
>[] __pci_register_driver+0xb0/0xc8
>[] sas_v3_pci_driver_init+0x20/0x28
> unreferenced object 0x002339844000 (size 2048):
>  comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s)
> 
> [snip]
> 
> And I don't feel like continuing until it's resolved

Thanks for talking a hit by this before me. It is frustrating that people tend 
not to test their patches properly  with things like kmemleak.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/3] iommu: match the original algorithm

2019-11-27 Thread John Garry

On 21/11/2019 00:13, Cong Wang wrote:

The IOVA cache algorithm implemented in IOMMU code does not
exactly match the original algorithm described in the paper.

Particularly, it doesn't need to free the loaded empty magazine
when trying to put it back to global depot.

This patch makes it exactly match the original algorithm.



I haven't gone into the details, but this patch alone is giving this:

root@(none)$ [  123.857024] kmemleak: 8 new suspected memory leaks (see 
/sys/kernel/debug/kmemleak)


root@(none)$ cat /sys/kernel/debug/kmemleak
unreferenced object 0x002339843000 (size 2048):
  comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s)
  hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
[<1d2710bf>] kmem_cache_alloc+0x188/0x260
[] init_iova_domain+0x1e8/0x2a8
[<2646fc92>] iommu_setup_dma_ops+0x200/0x710
[] arch_setup_dma_ops+0x80/0x128
[<994e1e43>] acpi_dma_configure+0x11c/0x140
[] pci_dma_configure+0xe0/0x108
[] really_probe+0x210/0x548
[<87884b1b>] driver_probe_device+0x7c/0x148
[<10af2936>] device_driver_attach+0x94/0xa0
[] __driver_attach+0xa4/0x110
[] bus_for_each_dev+0xe8/0x158
[] driver_attach+0x30/0x40
[<3cf39ba8>] bus_add_driver+0x234/0x2f0
[<43830a45>] driver_register+0xbc/0x1d0
[] __pci_register_driver+0xb0/0xc8
[] sas_v3_pci_driver_init+0x20/0x28
unreferenced object 0x002339844000 (size 2048):
  comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s)

[snip]

And I don't feel like continuing until it's resolved

Thanks,
John


Cc: Joerg Roedel 
Signed-off-by: Cong Wang 
---
  drivers/iommu/iova.c | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 41c605b0058f..92f72a85e62a 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -900,7 +900,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
  
  	if (!iova_magazine_full(cpu_rcache->loaded)) {

can_insert = true;
-   } else if (!iova_magazine_full(cpu_rcache->prev)) {
+   } else if (iova_magazine_empty(cpu_rcache->prev)) {
swap(cpu_rcache->prev, cpu_rcache->loaded);
can_insert = true;
} else {
@@ -909,8 +909,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
if (new_mag) {
spin_lock(>lock);
if (rcache->depot_size < MAX_GLOBAL_MAGS) {
-   rcache->depot[rcache->depot_size++] =
-   cpu_rcache->loaded;
+   swap(rcache->depot[rcache->depot_size], 
cpu_rcache->prev);
+   swap(cpu_rcache->prev, cpu_rcache->loaded);
+   rcache->depot_size++;
} else {
mag_to_free = cpu_rcache->loaded;
}
@@ -963,14 +964,15 @@ static unsigned long __iova_rcache_get(struct iova_rcache 
*rcache,
  
  	if (!iova_magazine_empty(cpu_rcache->loaded)) {

has_pfn = true;
-   } else if (!iova_magazine_empty(cpu_rcache->prev)) {
+   } else if (iova_magazine_full(cpu_rcache->prev)) {
swap(cpu_rcache->prev, cpu_rcache->loaded);
has_pfn = true;
} else {
spin_lock(>lock);
if (rcache->depot_size > 0) {
-   iova_magazine_free(cpu_rcache->loaded);
-   cpu_rcache->loaded = 
rcache->depot[--rcache->depot_size];
+   swap(rcache->depot[rcache->depot_size - 1], 
cpu_rcache->prev);
+   swap(cpu_rcache->prev, cpu_rcache->loaded);
+   rcache->depot_size--;
has_pfn = true;
}
spin_unlock(>lock);



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/3] iommu: match the original algorithm

2019-11-20 Thread Cong Wang
The IOVA cache algorithm implemented in IOMMU code does not
exactly match the original algorithm described in the paper.

Particularly, it doesn't need to free the loaded empty magazine
when trying to put it back to global depot.

This patch makes it exactly match the original algorithm.

Cc: Joerg Roedel 
Signed-off-by: Cong Wang 
---
 drivers/iommu/iova.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 41c605b0058f..92f72a85e62a 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -900,7 +900,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 
if (!iova_magazine_full(cpu_rcache->loaded)) {
can_insert = true;
-   } else if (!iova_magazine_full(cpu_rcache->prev)) {
+   } else if (iova_magazine_empty(cpu_rcache->prev)) {
swap(cpu_rcache->prev, cpu_rcache->loaded);
can_insert = true;
} else {
@@ -909,8 +909,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
if (new_mag) {
spin_lock(>lock);
if (rcache->depot_size < MAX_GLOBAL_MAGS) {
-   rcache->depot[rcache->depot_size++] =
-   cpu_rcache->loaded;
+   swap(rcache->depot[rcache->depot_size], 
cpu_rcache->prev);
+   swap(cpu_rcache->prev, cpu_rcache->loaded);
+   rcache->depot_size++;
} else {
mag_to_free = cpu_rcache->loaded;
}
@@ -963,14 +964,15 @@ static unsigned long __iova_rcache_get(struct iova_rcache 
*rcache,
 
if (!iova_magazine_empty(cpu_rcache->loaded)) {
has_pfn = true;
-   } else if (!iova_magazine_empty(cpu_rcache->prev)) {
+   } else if (iova_magazine_full(cpu_rcache->prev)) {
swap(cpu_rcache->prev, cpu_rcache->loaded);
has_pfn = true;
} else {
spin_lock(>lock);
if (rcache->depot_size > 0) {
-   iova_magazine_free(cpu_rcache->loaded);
-   cpu_rcache->loaded = 
rcache->depot[--rcache->depot_size];
+   swap(rcache->depot[rcache->depot_size - 1], 
cpu_rcache->prev);
+   swap(cpu_rcache->prev, cpu_rcache->loaded);
+   rcache->depot_size--;
has_pfn = true;
}
spin_unlock(>lock);
-- 
2.21.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu