On Tue, 2008-06-17 at 11:07 -0400, Michael Di Domenico wrote:
> Can anyone tell me what is likely causing this? The SM seems to be in
> a loop entering/exiting,

Are you saying the SM exits all on its own? If so, is there some script
that restarts it?

> and the "Unknown remote side" comes up with a different list of ports
> each time it cycles.

That's weird. How different are the ports? "Unknown remote side" is
reported for a port that is not physically DOWN but is perhaps
unresponsive (or slow to respond).

> I thought it was bad cables, but since it keeps changing, that seems
> unlikely.
>
> Thanks
> - Michael
>
>
> Jun 17 04:05:51 681669 [AAF0D060] -> OpenSM Rev:openib-2.0.5 OpenIB
> svn 9905

This looks close to OFED 1.1. Any chance of updating to a more recent
OpenSM version?

> Jun 17 04:05:51 681723 [AAF0D060] -> OpenSM Rev:openib-2.0.5 OpenIB
> svn 9905
> Jun 17 04:05:51 686837 [AAF0D060] -> osm_vendor_bind: Binding to port
> 0x2c9030000792e
> Jun 17 04:05:51 689709 [AAF0D060] -> osm_vendor_bind: Binding to port
> 0x2c9030000792e
> Jun 17 04:05:52 785232 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff00510d port 4. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785273 [46409940] -> Directed Path Dump of 4 hop path:
> Path = [0][2][3][11][13]
> Jun 17 04:05:52 785281 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff00510d port 6. Adding to
> light sweep sampling list

Can you check the ports indicated and the switch to see if they are
responsive?

-- Hal

> Jun 17 04:05:52 785290 [46409940] -> Directed Path Dump of 4 hop path:
> Path = [0][2][3][11][13]
> Jun 17 04:05:52 785296 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff00510d port 8. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785305 [46409940] -> Directed Path Dump of 4 hop path:
> Path = [0][2][3][11][13]
> Jun 17 04:05:52 785312 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff00510d port 10. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785319 [46409940] -> Directed Path Dump of 4 hop path:
> Path = [0][2][3][11][13]
> Jun 17 04:05:52 785325 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff00510d port 12. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785334 [46409940] -> Directed Path Dump of 4 hop path:
> Path = [0][2][3][11][13]
> Jun 17 04:05:52 785352 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff005118 port 8. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785359 [46409940] -> Directed Path Dump of 5 hop path:
> Path = [0][2][3][11][13][4]
> Jun 17 04:05:52 785424 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff00507a port 8. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785431 [46409940] -> Directed Path Dump of 5 hop path:
> Path = [0][2][3][11][13][6]
> Jun 17 04:05:52 785444 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff005094 port 8. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785454 [46409940] -> Directed Path Dump of 2 hop path:
> Path = [0][2][3]
> Jun 17 04:05:52 785467 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff005095 port 10. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785475 [46409940] -> Directed Path Dump of 5 hop path:
> Path = [0][2][3][11][8][2]
> Jun 17 04:05:52 785501 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050a0 port 8. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785509 [46409940] -> Directed Path Dump of 5 hop path:
> Path = [0][2][3][11][13][8]
> Jun 17 04:05:52 785536 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050af port 8. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785545 [46409940] -> Directed Path Dump of 5 hop path:
> Path = [0][2][3][11][13][C]
> Jun 17 04:05:52 785553 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050b0 port 8. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785561 [46409940] -> Directed Path Dump of 5 hop path:
> Path = [0][2][3][11][13][A]
> Jun 17 04:05:52 785602 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050c1 port 4. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785631 [46409940] -> Directed Path Dump of 4 hop path:
> Path = [0][2][3][11][D]
> Jun 17 04:05:52 785645 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050c4 port 4. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785651 [46409940] -> Directed Path Dump of 3 hop path:
> Path = [0][2][3][8]
> Jun 17 04:05:52 785662 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050c6 port 4. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785671 [46409940] -> Directed Path Dump of 5 hop path:
> Path = [0][2][3][11][D][4]
> Jun 17 04:05:52 785686 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050d2 port 2. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785694 [46409940] -> Directed Path Dump of 4 hop path:
> Path = [0][2][3][11][8]
> Jun 17 04:05:52 785702 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050d2 port 12. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785710 [46409940] -> Directed Path Dump of 4 hop path:
> Path = [0][2][3][11][8]
> Jun 17 04:05:52 785741 [46409940] -> osm_drop_mgr_process: ERR 0108:
> Unknown remote side for node 0x000b8cffff0050fa port 10. Adding to
> light sweep sampling list
> Jun 17 04:05:52 785748 [46409940] -> Directed Path Dump of 5 hop path:
> Path = [0][2][3][11][8][C]
> Jun 17 04:05:52 785774 [46409940] -> Entering MASTER state
> Jun 17 04:05:52 786126 [46409940] -> osm_report_notice: Reporting
> Generic Notice type:3 num:66 from LID:0x0000
> GID:0x000000000000e80f,0x0002c9030000792e
> Jun 17 04:05:52 786254 [46409940] -> osm_report_notice: Reporting
> Generic Notice type:3 num:66 from LID:0x0000
> GID:0x000000000000e80f,0x0002c9030000792e
> Jun 17 04:05:56 691253 [AAF0D060] -> Exiting SM
>
> _______________________________________________
> general mailing list
> [email protected]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
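
One way to answer "how different are the ports?" is to pull the (node GUID, port) pairs out of each SM run in the log and compare them. The following is only a rough sketch: it assumes the ERR 0108 wording shown in the log above and that each run ends with an "Exiting SM" line. The function names (`normalize`, `ports_per_cycle`, `report_counts`) are made up for illustration and are not part of OpenSM.

```python
import re
from collections import Counter

# Matches the ERR 0108 message as it is worded in the log above.
ERR_0108 = re.compile(
    r"ERR 0108:\s*Unknown remote side for node (0x[0-9a-fA-F]+) port (\d+)"
)


def normalize(text):
    """Undo mail quoting and wrapping.

    Drops tokens made only of '>' characters (quote markers) and joins
    everything with single spaces. The '->' in log lines is kept.
    """
    return " ".join(tok for tok in text.split() if tok.strip(">"))


def ports_per_cycle(log_text, end_marker="Exiting SM"):
    """Return one set of (node_guid, port) pairs per SM run.

    Assumption: every run is terminated by a line containing end_marker.
    Runs without any ERR 0108 entries are skipped.
    """
    cycles = []
    for chunk in normalize(log_text).split(end_marker):
        pairs = {(guid.lower(), int(port)) for guid, port in ERR_0108.findall(chunk)}
        if pairs:
            cycles.append(pairs)
    return cycles


def report_counts(cycles):
    """Count in how many runs each (node_guid, port) pair was reported."""
    return Counter(pair for cycle in cycles for pair in cycle)
```

Pairs whose count equals the number of runs are reported every time and are the first ones worth checking on the switch; pairs that show up only once point more towards slow or intermittent responses than a fixed bad cable.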
