On 6/18/08, Marcel Heinz <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Marcel Heinz wrote:
> > [Multicast bandwidth of only ~250MByte/s]
>
> I'm still fighting with this issue. I've played around with my code to
> track down possible causes of the effects, but it looks like I know even
> less than before.
>
> These are the old results (the client is always running on host A; the
> hosts running the servers are listed right of the '->'):
>
> A -> [none]: 258.867MB/s
> A -> A:      951.288MB/s
> A -> B:      258.863MB/s
> A -> A,B:    952.234MB/s
> A -> B,B:    143.878MB/s
>
> I've tested with the client attaching its QP to the mcast group (but not
> posting any receive WRs to it). This alone already changed something:
>
> A -> [none]: 303.440MB/s
> A -> A:       99.121MB/s
> A -> B:      303.426MB/s
> A -> A,B:     99.207MB/s
> A -> B,B:    143.866MB/s
>
> The same effect can also be reproduced by starting another instance on
> host A which just attaches a new QP to the mc group and then does nothing
> else, so it is not related to the particular QP I use for sending.
>
> After that, I checked what happens if I post just one WR to the client's
> receive queue (still attached to the mc group) and don't care about it
> any more. I'd expect that for that first WR, the behavior would be the
> same as in the scenario with another server instance running on that
> host, and that afterwards it would behave like in the "attached but no
> WRs" scenario above. But this is not the case:
>
> A -> [none]: 383.125MB/s
> A -> A:      104.039MB/s
> A -> B:      383.130MB/s
> A -> A,B:    104.038MB/s
> A -> B,B:    143.920MB/s
>
> I don't know why, but the overall rate stays at 383MB/s instead of going
> back to the 300MB/s from the previous test. I've tried to confirm these
> numbers by posting more recv WRs (this time also polling the recv CQ), so
> that I could really benchmark the two different phases. I chose 1000000
> and got:
>
>              Phase I       Phase II
> A -> [none]: 975.614MB/s   382.953MB/s
> A -> B:      975.614MB/s   382.953MB/s
> A -> A,B:    144.615MB/s   144.615MB/s
> A -> B,B:    143.911MB/s   143.852MB/s
> A -> A:      144.701MB/s   103.944MB/s
>
> At least my expectations for phase I were met, but I have no idea what
> could cause such effects. The fact that performance increases when there
> is a "local" QP attached to the group indicates that this is a problem on
> the local side, not involving the switch at all, but I can't be sure.
>
> Does anyone have an explanation for these numbers? Or any ideas what else
> I should check?
>
> Regards,
> Marcel
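For reference, a minimal libibverbs sketch of the receiver-side variation
Marcel describes above (attaching a UD QP to the multicast group and posting
zero, one, or many receive WRs), assuming the usual device/PD/CQ/QP setup has
already been done; the function name and parameters are illustrative and not
taken from his actual benchmark:

#include <stdint.h>
#include <infiniband/verbs.h>

/* Hypothetical fragment: `qp` is an already-created UD QP in a receive-ready
 * state, `pd`/`cq` are its protection domain and recv CQ, and `mgid`/`mlid`
 * describe the multicast group (all assumed to come from existing setup). */
int join_and_post(struct ibv_qp *qp, struct ibv_pd *pd, struct ibv_cq *cq,
                  union ibv_gid *mgid, uint16_t mlid, int num_wrs)
{
    static char buf[2048];   /* room for payload plus the 40-byte GRH */
    struct ibv_mr *mr;
    int i;

    /* Attach the QP to the multicast group; with no recv WRs posted this
     * alone already changes the sender-side numbers in the tests above. */
    if (ibv_attach_mcast(qp, mgid, mlid))
        return -1;

    mr = ibv_reg_mr(pd, buf, sizeof(buf), IBV_ACCESS_LOCAL_WRITE);
    if (!mr)
        return -1;

    /* Post zero, one, or many receive WRs, matching the three scenarios. */
    for (i = 0; i < num_wrs; i++) {
        struct ibv_sge sge = {
            .addr   = (uintptr_t) buf,
            .length = sizeof(buf),
            .lkey   = mr->lkey,
        };
        struct ibv_recv_wr wr = {
            .wr_id   = i,
            .sg_list = &sge,
            .num_sge = 1,
        };
        struct ibv_recv_wr *bad_wr;

        if (ibv_post_recv(qp, &wr, &bad_wr))
            return -1;
    }

    /* Reap completions so the two phases can be timed separately. */
    for (i = 0; i < num_wrs; ) {
        struct ibv_wc wc;
        int n = ibv_poll_cq(cq, 1, &wc);

        if (n < 0 || (n > 0 && wc.status != IBV_WC_SUCCESS))
            return -1;
        i += n;
    }

    return 0;
}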
Hi Marcel,

We also ran tests with multicast traffic and found that with Arbel HCAs there
is a performance penalty when:

1. Sending MC packets with NO QP attached on the local Arbel HCA.
2. Receiving MC packets with more than one QP attached on a single HCA, which
   causes back pressure that slows down the sender.

Hope this helps,
Olga
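The send path that Olga's first point refers to might look roughly like the
hypothetical sketch below: a UD send addressed to the multicast group's
LID/GID with remote QPN 0xFFFFFF. The pre-initialized qp/pd/mr/buffer
parameters, the port number, and the group qkey are assumed to come from the
sender's existing setup; whether a QP on the sending HCA is also attached to
the group (via ibv_attach_mcast) is the condition that point 1 describes.

#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

int send_one_mcast(struct ibv_qp *qp, struct ibv_pd *pd, struct ibv_mr *mr,
                   void *buf, uint32_t len, uint16_t mlid, union ibv_gid mgid,
                   uint8_t port, uint32_t qkey)
{
    /* Address handle aimed at the multicast group (LID/GID from the SA join). */
    struct ibv_ah_attr ah_attr = {
        .dlid          = mlid,
        .sl            = 0,
        .is_global     = 1,
        .grh.dgid      = mgid,
        .grh.hop_limit = 1,
        .port_num      = port,
    };
    struct ibv_ah *ah = ibv_create_ah(pd, &ah_attr);
    struct ibv_sge sge;
    struct ibv_send_wr wr;
    struct ibv_send_wr *bad_wr;

    if (!ah)
        return -1;

    sge.addr   = (uintptr_t) buf;
    sge.length = len;
    sge.lkey   = mr->lkey;

    memset(&wr, 0, sizeof(wr));
    wr.wr_id             = 1;
    wr.sg_list           = &sge;
    wr.num_sge           = 1;
    wr.opcode            = IBV_WR_SEND;
    wr.send_flags        = IBV_SEND_SIGNALED;
    wr.wr.ud.ah          = ah;
    wr.wr.ud.remote_qpn  = 0xffffff;   /* delivered to all QPs attached to the group */
    wr.wr.ud.remote_qkey = qkey;

    return ibv_post_send(qp, &wr, &bad_wr);
}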
