Hi Marcel,

On Thu, 2008-05-29 at 16:32 +0200, Marcel Heinz wrote:
> Hi,
>
> Hal Rosenstock wrote:
> > On Thu, 2008-05-29 at 15:35 +0200, Marcel Heinz wrote:
> >>Hal Rosenstock wrote:
> >>>On Thu, 2008-05-29 at 11:19 +0200, Marcel Heinz wrote:
> >>>>Especially if I take into account
> >>>>that with my own benchmark, I can get ~950MB/s when I start another
> >>>>receiver on the same host as the sender. Note that both of the
> >>>>receivers, the local and the remote one, are seeing all packets at that
> >>>>rate, so the HCAs and the switch must be able to handle multicast
> >>>>packets with this throughput.
> >>>
> >>>
> >>>Perhaps this is a static rate issue.
> >>>
> >>>What SM is being used ?
> >>
> >>It's OpenSM 3.1.7. I had also made some tests with OpenSM 3.2.1, but
> >>this didn't change anything.
> >
> >
> > Can you validate either the PathRecord or MCMemberRecord returned or the
> > static rate applied to the multicast QP in the various scenarios ? If it
> > is the same, this is not the problem but if it's different then we're on
> > to something here.
> >
>
> This is what happened:
>
> 1. The server on host B is started and creates the MC group, OpenSM
> returns:
>
> | May 29 15:54:34 699610 [B6D71B90] 0x08 -> MCMember Record dump:
> | MGID....................0xff12000000000000 : 0x00010002deadbeef
> | PortGid.................0xfe80000000000000 : 0x0002c9020025abdd
> | qkey....................0xABCD
> | mlid....................0xC000
> | mtu.....................0x84
> | TClass..................0x0
> | pkey....................0x7FFF
> | rate....................0x86
> | pkt_life................0x80
> | SLFlowLabelHopLimit.....0x0
> | ScopeState..............0x21
> | ProxyJoin...............0x0
>
> 2. The client on host A is started and joins the group as
> SendOnlyNonMember, OpenSM returns:
>
> | May 29 15:54:45 381972 [B5D6FB90] 0x08 -> MCMember Record dump:
> | MGID....................0xff12000000000000 : 0x00010002deadbeef
> | PortGid.................0xfe80000000000000 : 0x0002c9020025abed
> | qkey....................0xABCD
> | mlid....................0xC000
> | mtu.....................0x84
> | TClass..................0x0
> | pkey....................0x7FFF
> | rate....................0x86
> | pkt_life................0x80
> | SLFlowLabelHopLimit.....0x0
> | ScopeState..............0x4
> | ProxyJoin...............0x0
>
> Now I have 255MB/s between host A and B.
>
> 3. I start another server on host A, it joins the group and
> OpenSM returns:
>
> | May 29 15:54:56 129971 [B6570B90] 0x08 -> MCMember Record dump:
> | MGID....................0xff12000000000000 : 0x00010002deadbeef
> | PortGid.................0xfe80000000000000 : 0x0002c9020025abed
> | qkey....................0xABCD
> | mlid....................0xC000
> | mtu.....................0x84
> | TClass..................0x0
> | pkey....................0x7FFF
> | rate....................0x86
> | pkt_life................0x80
> | SLFlowLabelHopLimit.....0x0
> | ScopeState..............0x25
> | ProxyJoin...............0x0
>
> Now, all 3 instances measure 950MB/s throughput.
>
> The returned MCMember Records are absolutely identical except
> for the PortGid and the membership state.

Rate 0x86 is exactly 20 Gbps.

> How can I find out the static rate applied to the multicast QP?

Given the above, I don't see this as a likely suspect, but you should be
able to query the ah used for sending and look in the ah_attr for
static_rate.

-- Hal

> Regards,
> Marcel
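
For anyone reading the dumps above: the mtu, rate and pkt_life bytes each pack a 2-bit selector in the top bits and a 6-bit code in the low bits, which is how 0x86 comes out as "exactly 20 Gbps". Below is a minimal Python sketch of that decoding. The lookup tables are transcribed from my reading of the IBA encodings, and the helper names (split_selector, describe_rate, describe_mtu) are made up for illustration; they are not part of OpenSM or any library.

```python
# Sketch only: decode the selector/value bytes from an MCMemberRecord dump.
# Tables are assumptions based on the IBA encodings; helper names are hypothetical.

# Low 6 bits of the rate field -> link rate in Gbit/s.
RATE_GBPS = {2: 2.5, 3: 10, 4: 30, 5: 5, 6: 20, 7: 40, 8: 60, 9: 80, 10: 120}

# Low 6 bits of the mtu field -> MTU in bytes.
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}

# Top 2 bits -> how the value is to be interpreted.
SELECTOR = {0: "greater than", 1: "less than", 2: "exactly", 3: "largest available"}


def split_selector(field):
    """Split an 8-bit field into (selector, code)."""
    return (field >> 6) & 0x3, field & 0x3F


def describe_rate(field):
    """Render a rate byte such as 0x86 as human-readable text."""
    sel, code = split_selector(field)
    return f"{SELECTOR[sel]} {RATE_GBPS[code]} Gbps"


def describe_mtu(field):
    """Render an mtu byte such as 0x84 as human-readable text."""
    sel, code = split_selector(field)
    return f"{SELECTOR[sel]} {MTU_BYTES[code]} bytes"


if __name__ == "__main__":
    # Values taken from the three record dumps in this thread.
    print("rate 0x86:", describe_rate(0x86))
    print("mtu  0x84:", describe_mtu(0x84))
```

With the values from the three dumps this prints "exactly 20 Gbps" for rate 0x86 and "exactly 2048 bytes" for mtu 0x84. Both are the same for all three joins, which is consistent with the SM-side records not explaining the 255MB/s vs. 950MB/s difference.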
