On Mon, 2005-11-07 at 21:05, Sayantan Sur wrote:
> Hi,
>
> I am using OpenSM (svn rev 3984 and with 3882). It is unable to bring up
> the subnet and "hangs". This behavior is observed with machines are
> connected back-to-back as well as with any switch. My kernel version is
> 2.6.13.1, machines are Opteron (on Tyan S295 motherboard). I have
> included the log file. Maybe someone can tell if I am doing anything wrong?

Is the InfiniBand support from the 2.6.13.1 kernel, or has it been replaced
with OpenIB svn at the revs indicated (or do those apply only to OpenSM)?
If only OpenSM is from svn, I would recommend updating at least user_mad.c,
as a number of problems have been fixed in it. There will be some backport
issues to 2.6.13.1 to deal with, but they have all been discussed on the list.

> [EMAIL PROTECTED]:~] lsmod | grep ^ib
> ib_ucm                 22280  0
> ib_cm                  37616  1 ib_ucm
> ib_uverbs              40984  0
> ib_umad                17824  2
> ib_mthca              124320  0
> ib_mad                 42660  3 ib_cm,ib_umad,ib_mthca
> ib_core                56320  6
> ib_ucm,ib_cm,ib_uverbs,ib_umad,ib_mthca,ib_mad
>
> [EMAIL PROTECTED]:tmp] ls -l /dev/infiniband/
> total 0
> crw-rw---- 1 root root 231,  64 2005-11-08 02:23 issm0
> crw-rw---- 1 root root 231,  65 2005-11-08 02:23 issm1
> crw-rw-rw- 1 root root 231, 224 2005-11-08 02:23 ucm0
> crw-rw---- 1 root root 231,   0 2005-11-08 02:23 umad0
> crw-rw---- 1 root root 231,   1 2005-11-08 02:23 umad1
> crw-rw-rw- 1 root root 231, 192 2005-11-08 02:23 uverbs0
>
>
> <====

Was opensm started with -V?

> Nov 08 02:59:33 576837 [AB454D00] -> OpenSM Rev:openib-1.1.0
> Nov 08 02:59:33 576979 [0000] -> OpenSM Rev:openib-1.1.0
>
> Nov 08 02:59:33 577953 [AB454D00] -> osm_report_notice: Reporting
> Generic Notice type:3 num:66 from LID:0x0000
> GID:0xfe80000000000000,0x0000000000000000
> Nov 08 02:59:33 578017 [AB454D00] -> osm_report_notice: Reporting
> Generic Notice type:3 num:66 from LID:0x0000
> GID:0xfe80000000000000,0x0000000000000000
> Nov 08 02:59:33 581289 [AB454D00] -> osm_vendor_get_all_port_attr:
> assign CA mthca0 port 1 guid (0x2c902004002e9) as the default port.
> Nov 08 02:59:33 581326 [AB454D00] -> osm_vendor_bind: Binding to port
> 0x2c902004002e9.
> Nov 08 02:59:33 583680 [AB454D00] -> osm_vendor_bind: Binding to port
> 0x2c902004002e9.
> Nov 08 02:59:33 987191 [40C05960] -> umad_receiver: ERR 5409: send
> completed with error (method=0x1 attr=0x11 trans_id=0x1234) -- dropping.
> Nov 08 02:59:33 987227 [40C05960] -> umad_receiver: ERR 5411: DR SMP hop
> ptr 0 hop count 0 DR SLID 0x0 DR DLID 0x0
> Nov 08 02:59:33 987243 [40C05960] -> __osm_sm_mad_ctrl_send_err_cb: ERR
> 3113: MAD completed in error (IB_TIMEOUT).
> Nov 08 02:59:33 987303 [40C05960] -> SMP dump:
>     base_ver................0x1
>     mgmt_class..............0x81
>     class_ver...............0x1
>     method..................0x1 (SubnGet)
>     D bit...................0x0
>     status..................0x0
>     hop_ptr.................0x0
>     hop_count...............0x0
>     trans_id................0x1234
>     attr_id.................0x11 (NodeInfo)
>     resv....................0x0
>     attr_mod................0x0
>     m_key...................0x0000000000000000
>     dr_slid.................0xFFFF
>     dr_dlid.................0xFFFF
>
>     Initial path: [0]
>     Return path: [0]
>     Reserved: [0][0][0][0][0][0][0]
>
>     00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00
>
>     00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00
>
>     00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00
>
>     00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00
>
> Nov 08 02:59:33 987391 [40401960] -> __osm_state_mgr_is_sm_port_down:
> ERR 3308: SM port GUID unknown.

Since the Gets are timing out, there is no response to the SubnGet of
NodeInfo for the local node, and that response is what sets the SM port
GUID. Anything relevant in dmesg?

-- Hal

> Nov 08 02:59:33 987408 [0000] -> SM port is down.
>
> Nov 08 02:59:33 987485 [40401960] -> __osm_sm_state_mgr_signal_error:
> ERR 3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in state
> IB_SMINFO_STATE_DISCOVERING
> ===>

_______________________________________________
openib-general mailing list
http://openib.org/mailman/listinfo/openib-general
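
For triaging a log like the one quoted above, it can help to list the ERR entries in the order they occurred, since the first one (here the timed-out SubnGet of NodeInfo) is usually the one to chase. The following is a minimal Python sketch, not an OpenSM tool: the function name `extract_errors` is hypothetical, and the pattern assumes entries of the form `function_name: ERR NNNN:` with a four-digit code, as seen in this log. It also strips mail-style "> " quoting and rejoins wrapped lines.

```python
import re

# Assumed entry shape, taken from the quoted log: "some_function: ERR 3113: ..."
ERR_RE = re.compile(r"(\w+): ERR ([0-9A-Fa-f]{4}):")

def extract_errors(log_text):
    """Return (function, code) pairs for each ERR entry, in log order."""
    # Drop leading "> " quote markers left by the mail client.
    unquoted = re.sub(r"(?m)^[ \t]*>+[ \t]?", "", log_text)
    # Rejoin wrapped lines so "ERR" and its code end up adjacent.
    joined = " ".join(unquoted.split())
    return [(m.group(1), m.group(2)) for m in ERR_RE.finditer(joined)]

if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python extract_errors.py < osm.log
    for func, code in extract_errors(sys.stdin.read()):
        print(code, func)
```

Run against the log in this thread, the earliest entry comes from umad_receiver, which matches the diagnosis that the local Get never completes.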
