Hello,
  We have an InfiniBand cluster in a fat-tree configuration with 8 core
switches and 12 leaf switches.  The compute nodes are all in enclosures
connected to the 12 leaf switches.  However, we also have a number of
non-compute nodes (admin, login and storage nodes) that are connected
directly to the core switches.  Initially we were getting credit-loop
issues, so we switched from Min Hop to UPDN routing.  Now, however, 90%
of our IB traffic seems to be routed through a single core switch.  I
have tried supplying a root GUID file with the -a option, but that
results in the following error:

Nov 28 16:47:19 319442 [45007960] 0x01 -> __osm_pr_rcv_get_path_parms: ERR 1F07: Dead end on path to LID 0x6F from switch for GUID 0x00066a00d9000ac8
Nov 28 16:47:22 319469 [43C05960] 0x01 -> __osm_pr_rcv_get_path_parms: ERR 1F07: Dead end on path to LID 0x6F from switch for GUID 0x00066a00d9000ac8
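
To check whether the dead end is limited to one destination LID and one
switch, the ERR 1F07 messages can be tallied with a short script.  This is
only a rough sketch: the function name summarize_dead_ends is made up for
illustration, and the pattern is written against nothing more than the two
messages quoted above, so other log variants may not match.

```python
import re
from collections import Counter

# Pattern for the "ERR 1F07" dead-end message as quoted in this post.
# Assumption: other OpenSM versions may word the message differently.
ERR_1F07 = re.compile(
    r"ERR 1F07: Dead end on path to LID (0x[0-9A-Fa-f]+)\s+"
    r"from switch for GUID (0x[0-9A-Fa-f]+)"
)


def summarize_dead_ends(log_text):
    """Count (destination LID, switch GUID) pairs reported by ERR 1F07."""
    # Collapse all whitespace first so that messages wrapped across
    # several lines (as mail clients tend to do) still match.
    flat = " ".join(log_text.split())
    return Counter(ERR_1F07.findall(flat))


if __name__ == "__main__":
    sample = """
    Nov 28 16:47:19 319442 [45007960] 0x01 -> __osm_pr_rcv_get_path_parms:
    ERR 1F07: Dead end on path to LID 0x6F from switch for GUID
    0x00066a00d9000ac8
    Nov 28 16:47:22 319469 [43C05960] 0x01 ->
    __osm_pr_rcv_get_path_parms: ERR 1F07: Dead end on path to LID 0x6F
    from switch for GUID 0x00066a00d9000ac8
    """
    for (lid, guid), count in summarize_dead_ends(sample).items():
        print(f"LID {lid} unreachable from switch {guid}: {count} report(s)")
```

In both messages above, the destination is LID 0x6F and the switch is GUID
0x00066a00d9000ac8.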

Is there any way we can handle this hardware config via subnet management?

Thanks,

Reid O.