-----Original Message-----
From: Max Staufer [mailto:max.stau...@gmx.net]
Sent: Friday, September 13, 2013 7:06 AM
To: Rolf vandeVaart; de...@open-mpi.org
Subject: Re: [OMPI devel] Nearly unlimited growth of pml free list
Hi Rolf,
I applied your patch. The full output is rather big (more than 10 MB even
gzipped), which is not good for the mailing list, but the head and tail are
below for a 7- and an 8-processor run.
It seems that the send_requests list is growing fast: roughly 4000-fold in
just 10 minutes.
Do you know of a method to bound the list so that it does not grow
excessively?
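(For reference, the only related knobs I know of are the ob1 free-list MCA
parameters; the values below are purely illustrative:

    mpirun --mca pml_ob1_free_list_num 64 \
           --mca pml_ob1_free_list_inc 64 \
           --mca pml_ob1_free_list_max 4096 ...

but as I describe below, capping pml_ob1_free_list_max just makes the
application stop once the cap is reached.)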
thanks
Max
7 Processor run
------------------
[gpu207.dev-env.lan:11236] Iteration = 0 sleeping
[gpu207.dev-env.lan:11236] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_ranges_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11236] rdma_pending=0, pckt_pending=0, recv_pending=0, send_pending=0, comm_pending=0
[the same block repeats unchanged for the next several samples]
......
[gpu207.dev-env.lan:11243] Freelist=send_ranges_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_requests, numAlloc=16324, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11243] rdma_pending=0, pckt_pending=0, recv_pending=0, send_pending=0, comm_pending=0
[gpu207.dev-env.lan:11243] Iteration = 0 sleeping
[gpu207.dev-env.lan:11243] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_ranges_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=send_requests, numAlloc=16324, maxAlloc=-1
[gpu207.dev-env.lan:11243] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11243] rdma_pending=0, pckt_pending=0, recv_pending=0, send_pending=0, comm_pending=0
[the same block repeats unchanged for the next several samples; send_requests stays at numAlloc=16324]
8 Processor run
--------------------
[gpu207.dev-env.lan:11315] Iteration = 0 sleeping
[gpu207.dev-env.lan:11315] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_ranges_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=send_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] Freelist=recv_requests, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11315] rdma_pending=0, pckt_pending=0, recv_pending=0, send_pending=0, comm_pending=0
[the same block repeats unchanged for the next several samples]
...
[gpu207.dev-env.lan:11322] Iteration = 0 sleeping
[gpu207.dev-env.lan:11322] Freelist=rdma_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_frags, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=pending_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_ranges_pckts, numAlloc=4, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=send_requests, numAlloc=16708, maxAlloc=-1
[gpu207.dev-env.lan:11322] Freelist=recv_requests, numAlloc=68, maxAlloc=-1
[gpu207.dev-env.lan:11322] rdma_pending=0, pckt_pending=0, recv_pending=0, send_pending=0, comm_pending=0
[the same block repeats unchanged; send_requests stays at numAlloc=16708]
On 12.09.2013 17:04, Rolf vandeVaart wrote:
Can you apply this patch and try again? It will print out the sizes of the
free lists after every 100 calls into mca_pml_ob1_send. It would be
interesting to see which one is growing.
This might give us some clues.
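The idea, roughly, is the following (a sketch of the instrumentation, not
the exact diff, placed at the top of mca_pml_ob1_send(); the free-list
globals and the fl_num_allocated/fl_max_to_alloc field names are what I
believe the 1.6 series uses, so treat them as assumptions):

    /* Sketch only, not the exact patch: count calls into the ob1 send
     * path and dump free-list sizes on every 100th call.  The field
     * names are assumed from the ompi_free_list_t layout. */
    static int send_call_count = 0;

    if (0 == (++send_call_count % 100)) {
        opal_output(0, "Freelist=send_requests, numAlloc=%d, maxAlloc=%d",
                    (int)mca_pml_base_send_requests.fl_num_allocated,
                    (int)mca_pml_base_send_requests.fl_max_to_alloc);
        opal_output(0, "Freelist=recv_requests, numAlloc=%d, maxAlloc=%d",
                    (int)mca_pml_base_recv_requests.fl_num_allocated,
                    (int)mca_pml_base_recv_requests.fl_max_to_alloc);
    }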
Rolf
-----Original Message-----
From: Max Staufer [mailto:max.stau...@gmx.net]
Sent: Thursday, September 12, 2013 3:53 AM
To: Rolf vandeVaart
Subject: Re: [OMPI devel] Nearly unlimited growth of pml free list
Hi Rolf,
the heap snapshots I take tell me where and when the memory was allocated,
and a simple source trace of the allocation tells me that the calling
routine was mca_pml_ob1_send, and that all of the ~100000 individual
allocations during the run happened because of an MPI_ALLREDUCE call in
exactly one place in the code.
The tool I use for this is MemoryScape, but I think Valgrind can tell you
the same thing. However, I have not been able to reproduce the problem in a
simpler program yet; I suspect it has something to do with the locking
mechanism of the list elements. I don't know enough about OMPI to comment
on that, but it looks like the list is growing because all elements are
locked.
Really, any help is appreciated
Max
PS: If I mimic MPI_ALLREDUCE with 2*Nproc SEND and RECV calls (aggregating
on proc 0 and then sending the result back out to all procs), I get the
same kind of behaviour.
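Concretely, the mimic looks roughly like this (a minimal sketch of what I
mean; the function name, datatype and tags are just illustrative):

    /* Minimal sketch of the "mimicked allreduce": gather to rank 0,
     * reduce there, then fan the result back out with point-to-point
     * calls.  Illustrative only. */
    #include <mpi.h>

    void fake_allreduce_sum(double *val, MPI_Comm comm)
    {
        int rank, nproc;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &nproc);

        if (0 == rank) {
            double tmp;
            for (int i = 1; i < nproc; i++) {   /* aggregate on proc 0 */
                MPI_Recv(&tmp, 1, MPI_DOUBLE, i, 0, comm,
                         MPI_STATUS_IGNORE);
                *val += tmp;
            }
            for (int i = 1; i < nproc; i++)     /* fan result back out */
                MPI_Send(val, 1, MPI_DOUBLE, i, 1, comm);
        } else {
            MPI_Send(val, 1, MPI_DOUBLE, 0, 0, comm);
            MPI_Recv(val, 1, MPI_DOUBLE, 0, 1, comm, MPI_STATUS_IGNORE);
        }
    }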
On 11.09.2013 17:12, Rolf vandeVaart wrote:
Hi Max:
You say that the function keeps "allocating memory in the pml free list."
How do you know that is happening?
Do you know which free list it is happening on? There are something like 8
free lists associated with the pml ob1, so it would be interesting to know
which one you observe growing.
Rolf
-----Original Message-----
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Max
Staufer
Sent: Wednesday, September 11, 2013 10:23 AM
To: de...@open-mpi.org
Subject: [OMPI devel] Nearly unlimited growth of pml free list
Hi All,
as I already asked this on the users list and was told that is not the
right place for it: I came across a misbehaviour of Open MPI versions 1.4.5
and 1.6.5 alike.
The mca_pml_ob1_send function keeps allocating memory in the pml free list,
and it does so indefinitely. In my case the list grew to about 100 GB.
I can control the maximum using the pml_ob1_free_list_max parameter, but
then the application just stops working when that number of list entries is
reached.
The interesting part is that the growth happens in only a single place in
the code, which is a RECURSIVE SUBROUTINE, and the function called there is
an MPI_ALLREDUCE(... MPI_SUM).
Apparently it is not easy to create a test program that shows the same
behaviour; recursion alone is not enough.
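For illustration only, the call pattern has this shape (sketched in C here,
the real code is a Fortran RECURSIVE SUBROUTINE), and as said, a simple
program of this shape does not reproduce the growth:

    /* Shape of the triggering pattern only -- a simple program like
     * this does NOT reproduce the free-list growth by itself. */
    #include <mpi.h>

    void walk(int depth, MPI_Comm comm)
    {
        double local = 1.0, global = 0.0;

        /* the one call site where the pml free list grows */
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);

        if (depth > 0)
            walk(depth - 1, comm);   /* recursion */
    }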
Is there an mca parameter that limits the total list size without making
the application stop? Or is there a way to enforce the lock on the
free-list entries?
Thanks for all the help
Max
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel