On Wed, 2005-11-02 at 09:50, Eitan Zahavi wrote:
> Hi Hal,
>
> Yael is working on the exact same problem. She is probably going to
> complete it tomorrow.
>
> The issue was both the vl15 cl_unregister but we are also facing some
> issues as the umad receiver never exits.

Yes, I've also been working on making the umad receiver exit. This has
also been a lower priority and I don't have a completed solution yet.

-- Hal

> When MADs are arriving after the dispatcher is destroyed they cause a
> segfault.
>
> Hope it will be all fixed by the weekend.
>
> EZ
>
> Eitan Zahavi
> Design Technology Director
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
>
> > -----Original Message-----
> > From: Hal Rosenstock [mailto:[EMAIL PROTECTED]]
> > Sent: Wednesday, November 02, 2005 4:20 PM
> > To: Michael S. Tsirkin
> > Cc: [email protected]
> > Subject: [openib-general] Re: openib segfaults when openib is not loaded
> >
> > On Wed, 2005-11-02 at 09:14, Michael S. Tsirkin wrote:
> > > Hi!
> > > If I try to load opensm without loading any of openib modules,
> > > opensm crashes on exit.
> > > Has anyone else seen this?
> > >
> > > # /usr/local/bin/opensm
> > > -------------------------------------------------
> > > OpenSM Rev:openib-1.1.0
> > > Command Line Arguments:
> > > Log File: /var/log/osm.log
> > > -------------------------------------------------
> > > OpenSM Rev:openib-1.1.0
> > >
> > > ibwarn: [8954] umad_init: can't read ABI version from /sys/class/infiniband_mad/abi_version (No such file or directory): is ib_umad module loaded?
> > >
> > > Error from osm_vendor_get_all_port_attr (ffffffff)
> > > Error: Could not get port guid
> > > Exiting SM
> > >
> > > Segmentation fault (core dumped)
> >
> > Yes, this seg fault is caused by the following:
> > osm_opensm_destroy shuts down the dispatcher and subsequent to this
> > osm_vl15_destroy attempts to unregister with the dispatcher (although
> > this has already been done).
> >
> > osm_opensm.c::osm_opensm_destroy
> >
> >   /* shut down the dispatcher - so no new messages cross */
> >   cl_disp_shutdown( &p_osm->disp );
> >
> >   /* cleanup all messages on VL15 fifo that were not sent yet */
> >   osm_vl15_shutdown( &p_osm->vl15, &p_osm->mad_pool );
> >
> >   /* lock the whole thing so we do not get any requests etc */
> >   cl_plock_excl_acquire( &p_osm->lock );
> >
> >   /* do the destruction in reverse order as init */
> >   updn_destroy( p_osm->p_updn_ucast_routing );
> >   osm_sa_destroy( &p_osm->sa );
> >   osm_sm_destroy( &p_osm->sm );
> >   osm_db_destroy( &p_osm->db );
> >   osm_vl15_destroy( &p_osm->vl15, &p_osm->mad_pool );
> >
> >
> > My workaround has been to remove this from
> > osm_vl15intf.c::osm_vl15_destroy but I'm not sure this is the best long
> > term fix as yet. I hadn't searched out whether there were other paths
> > that were different from this flow.
> >
> > This seems lower priority to me than some other issues I'm still sorting
> > through but I will get back to this unless someone else gets to it first
> > or thinks that the workaround I have should be made permanent.
> >
> > -- Hal

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
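
The teardown ordering described in the thread (dispatcher shut down first, then a later destroy step trying to unregister from it) can be illustrated with a small, self-contained C sketch. This is not the complib or OpenSM code: toy_disp_t, toy_handle_t and the toy_disp_* functions are hypothetical stand-ins invented for this illustration, and the assumption that shutdown already drops every registration is taken only from the remark above that the unregister "has already been done". The sketch shows one possible alternative to deleting the unregister call outright: make the unregister a harmless no-op once the dispatcher is down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real complib cl_disp_* types. */
typedef struct {
    bool running;        /* false once the dispatcher has been shut down */
    int  registrations;  /* number of live client registrations */
} toy_disp_t;

typedef struct {
    toy_disp_t *disp;    /* NULL means "not registered" */
} toy_handle_t;

void toy_disp_init(toy_disp_t *d)
{
    d->running = true;
    d->registrations = 0;
}

/* Register a client; returns an unregistered handle if the dispatcher is down. */
toy_handle_t toy_disp_register(toy_disp_t *d)
{
    toy_handle_t h = { NULL };
    if (d->running) {
        d->registrations++;
        h.disp = d;
    }
    return h;
}

/* Assumption: shutdown itself drops every registration. */
void toy_disp_shutdown(toy_disp_t *d)
{
    d->running = false;
    d->registrations = 0;
}

/*
 * Guarded unregister: safe to call after shutdown and safe to call twice.
 * Returns true only if a live registration was actually released.
 */
bool toy_disp_unregister(toy_handle_t *h)
{
    bool released = false;

    if (h->disp == NULL)
        return false;               /* never registered, or already released */

    if (h->disp->running && h->disp->registrations > 0) {
        h->disp->registrations--;
        released = true;
    }
    h->disp = NULL;                 /* invalidate so a second call is a no-op */
    return released;
}
```

With a guard of this kind, a destroy routine could keep its unregister call regardless of whether the dispatcher was shut down earlier in the sequence. Whether the real dispatcher can be guarded this way depends on its internals, which are not shown in this thread.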
