On 6/12/08, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
>
> Hi Olga,
>
> On Thu, 2008-06-12 at 09:46 +0300, Olga Shern wrote:
> > Hi All,
> >
> > We have found something that seems like Infiniband Spec hole,
>
> What's the spec hole ?

According to the InfiniBand spec, a partial member cannot "talk" to another
partial member, only to a full member. Therefore, if a partial member sends
an MC packet, all other partial members of this partition will generate a
Bad P_Key trap. This means the behavior we see conforms to the InfiniBand
spec, but it is very problematic.

> > This issue is system issue that prevents from partial P_Key setup to
> > go into production.
>
> Indeed :-(
>
> > Short Setup & test description:
> > ------------------------------------------
> > * Node A: P_Key XXX (full member)
> > * Node B, C, D, E, F: P_Key XXx (partial member)
> >
> > 1. Send ping from B -> A : ping is OK
> > 2. Send ping from C -> A : ping is OK
> > 3. Send ping from B -> C : no ping also OK
> > * Get traps Bad P_Key in SM - from all HCA in the fabric both for
> > test 1 & 2 (one time) and also for test 3 (all the time).
> >
> > Probably the ARP request that is MC traffic generate the trap in HCA,
> > for test 1 & 2 we have only one ARP but for test 3 we send ARP all the
> > time because we do not get any ARP reply.
> >
> > * The trap number SM get is 257 (HCA trap) if we will do P_Key
> > switch enforcement we will probably get 259
>
> Is this with OpenSM or VSM ?

We tested it with Voltaire SM, but it should behave the same with OpenSM.

> -- Hal
>
> > * We get trap also from the originator of the MC traffic even
> > though that receive switch relay error counter is increased (when out
> > port==in port), the switch does not drop the packet ?
> >
> > Additional questions/issues:
> > * Do we have a way to suppress port traps from SMA ?? i.e. that
> > the port will not generate traps that can "kill the SM" - as its look
> > this is bug in the spec where we can't send any mc traffic (even ARP)
> > when we have partial members and we do not have a way to suppress the
> > traps.
> >
> > * What will happen in the HCA when we get many traps (mc packets
> > from many nodes) and they need to keep all events until SM will
> > acknowledge? - Is there limitation in the number of on-going
> > traps (any HCA specific issues)?
> >
> > Best Regards
> >
> > Olga

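For readers following the thread, the matching rule described above can be
sketched in a few lines of Python. This is only an illustration, not code
from OpenSM, Voltaire SM, or any HCA firmware. It assumes the usual P_Key
layout (top bit of the 16-bit value marks full membership, low 15 bits
identify the partition). The base value 0x0001 and the names pkeys_match and
rejecting_receivers are hypothetical; the thread elides the real P_Key as
"XXX".

```python
FULL_MEMBER_BIT = 0x8000  # top bit set = full member, clear = partial member
BASE_MASK = 0x7FFF        # low 15 bits identify the partition


def is_full(pkey):
    """True if the P_Key carries full membership."""
    return bool(pkey & FULL_MEMBER_BIT)


def pkeys_match(a, b):
    """Same partition, and at least one side is a full member."""
    same_partition = (a & BASE_MASK) == (b & BASE_MASK)
    return same_partition and (is_full(a) or is_full(b))


# Hypothetical base value standing in for the elided "XXX" in the test setup.
BASE = 0x0001
nodes = {"A": BASE | FULL_MEMBER_BIT}           # Node A: full member
nodes.update({name: BASE for name in "BCDEF"})  # Nodes B..F: partial members


def rejecting_receivers(sender):
    """Nodes other than the sender that would fail the P_Key check
    on a multicast packet (for example an ARP request) from `sender`."""
    return sorted(
        name
        for name, pkey in nodes.items()
        if name != sender and not pkeys_match(nodes[sender], pkey)
    )


print(rejecting_receivers("B"))  # partial-member sender
print(rejecting_receivers("A"))  # full-member sender
```

Under these assumptions, a multicast packet from B passes the check only at
A, and every other partial member rejects it, which is where the Bad P_Key
traps in tests 1 to 3 would come from. The sketch leaves out the sender
itself, although the thread reports that the originator also raises a trap.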
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
