Thanks, I will let it run for a couple of days and should have answers next week.
Given how hard this is to reproduce, could you build a debugging patch with
per-agent ref counting?
We would run with it, so that we have more information if it still fails.
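Roughly the kind of bookkeeping I have in mind, as a userspace sketch only.
The struct and function names here are made up for illustration and are not
the real ones from mad.c; a kernel version would presumably use atomic
counters and printk instead.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a MAD agent; not the real struct from mad.c. */
struct dbg_agent {
	const char *name;	/* label used in the debug dump */
	int refs;		/* references currently held on this agent */
	long gets;		/* total references ever taken */
	long puts;		/* total references ever dropped */
};

/* Take a reference and record it. */
static void dbg_agent_get(struct dbg_agent *a)
{
	a->refs++;
	a->gets++;
}

/* Drop a reference; an underflow here means an unbalanced put. */
static void dbg_agent_put(struct dbg_agent *a)
{
	assert(a->refs > 0);
	a->refs--;
	a->puts++;
}

/* Print one line per agent and return the number of references still held. */
static int dbg_agent_dump(const struct dbg_agent *agents, int n)
{
	int i, held = 0;

	for (i = 0; i < n; i++) {
		printf("%s: refs=%d gets=%ld puts=%ld\n", agents[i].name,
		       agents[i].refs, agents[i].gets, agents[i].puts);
		held += agents[i].refs;
	}
	return held;
}
```

Dumping these counters at unload time would at least tell us which agent is
still holding references when the leak shows up.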
Quoting Sean Hefty <[EMAIL PROTECTED]>:
Subject: RE: [PATCH] mad.c memory leak
>Couldn't pinpoint it yet.
>It happens about every two days, after stress testing the stack for some hours:
>lots of queries, umad tools, IPoIB, restarting the SM ...
One area that has changed lately is user_mad. Looking at the code, I don't think
that queued receive MADs are released when the user closes the file. Can you see
if the following patch fixes the problem? (I didn't even compile-test this.)
Free queued receive MADs when closing.
Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
---
Index: user_mad.c
===================================================================
--- user_mad.c (revision 5693)
+++ user_mad.c (working copy)
@@ -701,8 +701,11 @@ static int ib_umad_close(struct inode *i
 	already_dead = file->agents_dead;
 	file->agents_dead = 1;
-	list_for_each_entry_safe(packet, tmp, &file->recv_list, list)
+	list_for_each_entry_safe(packet, tmp, &file->recv_list, list) {
+		if (packet->recv_wc)
+			ib_free_recv_mad(packet->recv_wc);
 		kfree(packet);
+	}
 	list_del(&file->port_list);
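
For anyone following along, the effect of the change can be modelled in
userspace. This is only a sketch under stated assumptions: the model_* names
are hypothetical, calloc/free stand in for both the kernel allocator and
ib_free_recv_mad, and a plain singly linked list stands in for the kernel
list. It shows that freeing only the packet would leave the attached receive
MAD allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model types; the real ones live in user_mad.c. */
struct model_wc {
	int dummy;
};

struct model_packet {
	struct model_wc *recv_wc;	/* NULL when no receive MAD is attached */
	struct model_packet *next;
};

static int live_allocs;	/* allocations not yet freed */

static void *model_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (!p)
		return;
	assert(live_allocs > 0);	/* catch unbalanced frees */
	free(p);
	live_allocs--;
}

/* Queue a packet at the head; with_wc selects whether it carries a receive MAD. */
static int model_queue(struct model_packet **head, int with_wc)
{
	struct model_packet *pkt = model_alloc(sizeof(*pkt));

	if (!pkt)
		return -1;
	if (with_wc) {
		pkt->recv_wc = model_alloc(sizeof(*pkt->recv_wc));
		if (!pkt->recv_wc) {
			model_free(pkt);
			return -1;
		}
	}
	pkt->next = *head;
	*head = pkt;
	return 0;
}

/* Close path as in the patch: free the receive MAD first, then the packet. */
static void model_close(struct model_packet **head)
{
	struct model_packet *pkt = *head, *tmp;

	while (pkt) {
		tmp = pkt->next;	/* save next before freeing, like the _safe iterator */
		if (pkt->recv_wc)
			model_free(pkt->recv_wc);
		model_free(pkt);
		pkt = tmp;
	}
	*head = NULL;
}
```

Dropping the recv_wc branch from model_close leaves live_allocs above zero
after close, which is the leak the patch is meant to address.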
--
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies