Some updates on this problem.

The code I'm using to test/produce this behavior is an MPI program. MPI is used for convenience of job startup and collection of results; the actual test/benchmark uses straight RDMA CM & ibverbs. What I'm doing is timing how long it takes to join and bring up a multicast group, with a varying number of processes and existing groups. One rank joins with a '0' address to get a real address and MPI_Bcast's that address to the other ranks, which then join the group. Meanwhile the root rank repeatedly sends a small ping message to the group. Every other rank measures the time from its rdma_join_multicast() call to the arrival of the join event, and from that call to when it first receives a message on the group. Once a group is up, the process repeats N times, leaving all the groups joined.
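
Per group, each non-root rank does roughly the following (a simplified C sketch, not the actual benchmark code -- cm_id/channel setup, the MPI_Bcast of the address, and most error handling are omitted, and wait_for_first_ping() is a placeholder for polling the CQ until the root's ping arrives):

    #include <stdio.h>
    #include <mpi.h>
    #include <rdma/rdma_cma.h>

    /* placeholder: poll the UD QP's CQ until the root's ping shows up */
    extern void wait_for_first_ping(struct rdma_cm_id *cm_id);

    static void time_one_join(struct rdma_cm_id *cm_id,
                              struct rdma_event_channel *channel,
                              struct sockaddr *mc_addr)
    {
        struct rdma_cm_event *event;
        double t_call, t_join, t_msg;

        t_call = MPI_Wtime();
        if (rdma_join_multicast(cm_id, mc_addr, NULL)) {
            perror("rdma_join_multicast");
            return;
        }

        /* block until the CM reports that the join completed */
        rdma_get_cm_event(channel, &event);
        if (event->event != RDMA_CM_EVENT_MULTICAST_JOIN) {
            fprintf(stderr, "unexpected CM event %d\n", event->event);
            rdma_ack_cm_event(event);
            return;
        }
        t_join = MPI_Wtime();
        rdma_ack_cm_event(event);

        wait_for_first_ping(cm_id);
        t_msg = MPI_Wtime();

        printf("join: %f s  first message: %f s\n",
               t_join - t_call, t_msg - t_call);
    }

The two times printed there are the ones I report below.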

I'm now running OFED v1.2; the behavior has not changed, though I've noticed some other cases. First, if nothing on the network has used multicast for a while, my benchmark can only join a total of 4 groups. After that first run, no matter how many more times I run it, I can join 14 groups as described below.

Now the more interesting part. I can now run on a 128-node machine using OpenSM running on a node (before, I was running on an 8-node machine which I'm told uses the Cisco SM on a Topspin switch). On this machine, if I run my benchmark with two processes per node instead of one (e.g. mpirun -np 16 on 8 nodes), I'm able to join more than 750 groups simultaneously from one QP in each process. Stranger still, running the same thing on the 8-node machine I can join only 4 groups.

While doing this I noticed that the time from calling rdma_join_multicast() to the arrival of the join event stayed fairly constant (around 0.001 sec), while the time from the join call to actually receiving messages on the group steadily increased from around 0.1 sec to around 2.7 sec with 750+ groups joined. Furthermore, this time does not drop back to 0.1 sec if I stop the benchmark and run it (or any of my other multicast code) again. That would be understandable within a single program run, but the fact that the behavior persists across runs concerns me -- it feels like a bug, though I don't have anything concrete here.

Sorry for the long email -- I'm trying to provide as much detail as possible so this can get fixed. I'm really not sure where to start looking on my own, so even some hints on where the problem(s) might lie would be useful.

Andrew

Andrew Friedley wrote:
I've run into a problem where it appears that I cannot join more than 14 multicast groups from a single HCA. I'm using the RDMA CM UD/multicast interface from an OFED v1.2 nightly build, and using a '0' address when joining so that the SM allocates an unused address. The first 14 rdma_join_multicast() calls succeed, a MULTICAST_JOIN event comes through for each of them, and everything works. But the 15th call to rdma_join_multicast() returns -1 and sets errno to 99 (EADDRNOTAVAIL, 'Cannot assign requested address').
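
The join loop looks roughly like this (a simplified sketch, not the exact code; expressing the '0' address as an all-zero AF_INET6 sockaddr here is just one way to do it):

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <netinet/in.h>
    #include <rdma/rdma_cma.h>

    /* Join 'ngroups' groups on one cm_id/QP, letting the SM pick each
     * address by passing an all-zero sockaddr.  In the runs described
     * above, the 15th call fails with errno 99 (EADDRNOTAVAIL). */
    static int join_groups(struct rdma_cm_id *cm_id, int ngroups)
    {
        struct sockaddr_in6 mc_addr;
        int i;

        memset(&mc_addr, 0, sizeof(mc_addr));
        mc_addr.sin6_family = AF_INET6;   /* address itself left zero */

        for (i = 0; i < ngroups; i++) {
            if (rdma_join_multicast(cm_id, (struct sockaddr *) &mc_addr,
                                    NULL)) {
                fprintf(stderr, "join %d: %s (errno %d)\n",
                        i + 1, strerror(errno), errno);
                return -1;
            }
            /* still need to wait for RDMA_CM_EVENT_MULTICAST_JOIN on
             * the id's event channel before the group is usable */
        }
        return 0;
    }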

Note that I'm using a single QP per process to do all the joins. Things get weirder if I run two instances of my program on the same node -- as soon as the total number of groups joined between the two instances reaches 14, neither instance can join any more groups. Also, my code currently hangs when this happens -- if I kill one of the two instances and start a third (while leaving the other hung, still holding some number of groups), the third instance cannot join ANY groups. The behavior resets when I kill all instances.

Two instances running on separate nodes (on the same network) do not appear to interfere with each other as described above; they still error out on the 15th join.

This feels like a bug to me; regardless, a limit of 14 groups is WAY too low. Any ideas what might be going on, or how I can work around it?

Andrew
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
