On Wed, Jul 22, 2009 at 5:59 PM, Ira Weiny<[email protected]> wrote:
> Check your multicast group membership and forwarding tables on the switches.
>
> We have had similar issues and have found that some nodes fail to join the 
> multicast groups for various reasons.

Also, such failures should show up in the opensm log and at least give a
hint of the cause (e.g. rate, MTU).
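
A rough sketch of that log check, in Python. The log path and the keyword list below are assumptions for illustration, not something OpenSM guarantees, so adjust both to what your SM actually writes:

```python
import re
import sys

# Assumed keywords; tune them to match what your OpenSM version logs.
KEYWORDS = ("mcmr", "multicast", "mgrp", "mtu", "rate")


def find_join_hints(lines, keywords=KEYWORDS):
    """Return lines containing "ERR" that also mention any keyword (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [ln.rstrip("\n") for ln in lines if "ERR" in ln and pattern.search(ln)]


if __name__ == "__main__":
    # Assumed log location; pass a different path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/opensm.log"
    with open(path, errors="replace") as log:
        for hit in find_join_hints(log):
            print(hit)
```

This is only a coarse filter: it narrows the log down to error lines that may relate to multicast joins, which still have to be read by hand.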

-- Hal

>
> Ira
>
> On Wed, 22 Jul 2009 15:55:42 -0600
> Todd Bowman <[email protected]> wrote:
>
>> I need a little direction to help solve an IPoIB issue.
>> Software: OFED 1.3 and 1.4 stacks, running OpenSM
>>
>>
>> Problem:
>> IPoIB connections fail, meaning a node cannot ping some or all of the other
>> IPoIB nodes.  IB itself is still up; we can run IB tests successfully.  So
>> far the only resolution is to restart the IB stack.  Cluster size seems to
>> be irrelevant: it has happened on clusters ranging from around 64 nodes to
>> thousands.
>>
>>
>> My first instinct is that some information needed to create an IPoIB
>> connection has been lost from the SM/SA, but I'm not sure what that
>> information is or how to verify that it is gone.
>>
>> Thanks in advance,
>>
>> Todd
>>
>
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> [email protected]
> _______________________________________________
> general mailing list
> [email protected]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>