Hal Rosenstock wrote:

On Fri, 2005-09-02 at 08:28, Sean Hubbell wrote:
Again, thanks Hal. Yes, I can perform an ibping on all of the nodes, so connectivity and the ports in between are up. I am almost positive now that this has something to do with IPoIB. I am going to try to ping each node and then look at the ARP table. Do you know of anything I can do to look specifically at the IPoIB "exchange"?

If ibping works, UD unicast is working (and would work for that part of
IPoIB). What I suspect is not working is multicast. I suspect some issue
with the IPoIB broadcast group. So can you comment on the topology and
provide an OpenSM log from a run in verbose mode? [Also, can you bring all
the ib<n> interfaces down and then up and see if connectivity is restored?
Also, is the SM running on a node that is also running IPoIB?]
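
Bouncing the ib<n> interfaces as suggested above could be scripted roughly as follows. This is only a sketch: the helper names list_ib_ifaces and bounce_cmds are made up for illustration, it assumes the IPoIB interfaces show up as ib* entries under /sys/class/net, and it assumes "ip link set" is available (ifconfig <if> down/up would do the same job). It only prints the commands, so nothing is changed until the output is piped to a root shell.

```shell
#!/bin/sh
# Hypothetical helper: list interface names matching ib* under a
# sysfs-style directory (default /sys/class/net).
list_ib_ifaces() {
    dir="${1:-/sys/class/net}"
    for path in "$dir"/ib*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$path" ] || continue
        basename "$path"
    done
}

# Hypothetical helper: print a down/up command pair for each interface.
# This is a dry run; it does not touch the interfaces itself.
bounce_cmds() {
    for ifc in $(list_ib_ifaces "$1"); do
        echo "ip link set $ifc down"
        echo "ip link set $ifc up"
    done
}

# To actually bounce the interfaces (as root):
#   bounce_cmds | sh
```

Review the printed commands first; any non-IPoIB interface whose name happens to start with "ib" would be picked up by the pattern as well.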

If not, you can debug this using the following:

The current topology of the system is 4 Dell PowerEdge 2.8 GHz machines with hyperthreading. There is also another Dell, and one day there will be 48 other nodes that are blades in 4 other chassis. There are 12 InfiniBand switches, which basically use three switches to route to the other switches.

The log file I cannot send. I can go through it and answer any questions that you have. I realize this is stupid, but this is well above me.

I am not sure about the Subnet Manager. How can I tell where it is running?

1. Using ibroute, you can display the multicast tables in the switches.
Using ibtracert you can trace the route of a multicast group.

       Multicast examples:
               ibroute -M 4    # dump all non empty mlids of switch with lid 4
               ibroute -M 4 0xc010 0xc020      # same, but with range
               ibroute -M -n 4 # simple dump format

       Multicast example:
               ibtracert -m 0xc000 4 16        # show multicast path of mlid 0xc000 between lids 4 and 16
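
With 12 switches, the ibroute and ibtracert invocations above may be easier to generate in a loop. The sketch below is only an illustration: the function names are made up, and the LIDs and MLID in the usage comments are placeholders, not values from this fabric. It prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print one "ibroute -M <lid>" command per switch LID.
mcast_dump_cmds() {
    for lid in "$@"; do
        echo "ibroute -M $lid"
    done
}

# Hypothetical helper: print an ibtracert command tracing multicast
# group <mlid> from <src_lid> to each remaining LID argument.
mcast_trace_cmds() {
    mlid="$1"
    src="$2"
    shift 2
    for dst in "$@"; do
        echo "ibtracert -m $mlid $src $dst"
    done
}

# Example with placeholder values (run the output as root):
#   mcast_dump_cmds 4 7 9 | sh
#   mcast_trace_cmds 0xc000 4 16 17 | sh
```

A switch whose multicast table has no entry for the broadcast group's MLID, or a trace that stops partway, would point at where the multicast path is broken.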

I will try these.

2. There are 2 levels of debug tracing in IPoIB. You can enable these in
the build with CONFIG_INFINIBAND_IPOIB_DEBUG and
CONFIG_INFINIBAND_IPOIB_DEBUG_DATA.
Sorry for my ignorance, but how would one go about doing this?
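
For reference, these are kernel configuration options, so they are set in the kernel's .config (for example via "make menuconfig", under the InfiniBand drivers) before rebuilding the IPoIB module. A minimal sketch for checking whether an existing config file has them enabled is below; the function name check_ipoib_debug is hypothetical, and the default path .config is an assumption about where the kernel tree's config lives.

```shell
#!/bin/sh
# Hypothetical helper: report whether each IPoIB debug option is set to
# "y" in a kernel config file (default: .config in the current directory).
check_ipoib_debug() {
    cfg="${1:-.config}"
    for opt in CONFIG_INFINIBAND_IPOIB_DEBUG CONFIG_INFINIBAND_IPOIB_DEBUG_DATA; do
        # Anchoring on "=y" keeps the first option from matching the second.
        if grep -q "^${opt}=y" "$cfg" 2>/dev/null; then
            echo "$opt enabled"
        else
            echo "$opt not enabled"
        fi
    done
}

# Example, run from the top of the kernel source tree:
#   check_ipoib_debug .config
```

Once the driver is built with CONFIG_INFINIBAND_IPOIB_DEBUG, the amount of tracing is controlled by the ib_ipoib module parameters debug_level and mcast_debug_level (and data_debug_level for the DEBUG_DATA option). These can be set at module load time or through files under /sys/module/ib_ipoib/, though the exact sysfs path depends on the kernel version.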

Sean
_______________________________________________
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
