On Fri, 2005-06-03 at 15:03, Troy Benjegerdes wrote:
> On Fri, Jun 03, 2005 at 01:52:31PM -0400, Hal Rosenstock wrote:
> > Hi Troy,
> >
> > On Thu, 2005-06-02 at 19:23, Troy Benjegerdes wrote:
> > > I'm having intermittent problems with opensm.. It seems after a while
> > > IPoIB stops working and if I restart opensm, it starts spitting out
> > > errors.
> >
> > Please try the following workaround and let me know if this makes things
> > better.
> >
> > -- Hal
> >
> > Index: libvendor/osm_vendor_ibumad.c
> > ===================================================================
> > --- libvendor/osm_vendor_ibumad.c (revision 2520)
> > +++ libvendor/osm_vendor_ibumad.c (working copy)
> > @@ -402,7 +402,7 @@
> >
> >  	p_vend->p_log = p_log;
> >  	p_vend->timeout = timeout;
> > -	p_vend->max_retries = OSM_DEFAULT_RETRY_COUNT;
> > +	p_vend->max_retries = 1;
> >
> >  	p_vend->umad_port_id = -1;
> >  	p_vend->issmfd = -1;
>
> No, it doesn't seem to help. To get anything to work at all, I seem to
> need to reload all the IB modules on every machine I want to use ipoib
> on.
>
> There have been two times now I've been able to see about 4 ping
> packets, and then one of the arp entries seems to go away.
>
> (On the sm machine, also the machine I am trying to ping)
> 10.40.5.213              (incomplete)                    ib0
>
> (on another machine, trying to ping from..)
> 10.40.137.12     ether   00:00:04:04:FE:80:00C           ib0

That may be another issue. Are all your links active, and does OpenSM
appear to be behaving better now?

-- Hal

_______________________________________________
openib-general mailing list
http://openib.org/mailman/listinfo/openib-general
