Hi Tom,

On Thu, 2005-01-27 at 12:53, Tom Duffy wrote:
> I hit control-c to kill osm and got:
>
> Jan 27 18:47:09 [44808960] -> osm_mad_pool_get: [
> opensm[4627]: *** exception handler: died with signal 11
> Segmentation fault

It looks to me like the following happened: one thread was shutting down
the OSM (osm_opensm_destroy was called and got at least as far as
destroying the SA; the MAD pool is destroyed after that), while another
thread attempted a get from the MAD pool. I'm not sure what would prevent
this from occurring.

I am looking into this crash further and am trying to reproduce it.

-- Hal

> Here is the last 100 lines of the osm.log
>
> [EMAIL PROTECTED] bin]# tail -100 /var/log/osm.log
> Jan 27 18:47:04 [43005960] -> __osm_sm_mad_ctrl_retire_trans_mad: Retiring
> MAD with TID = 0x2bf9.
> Jan 27 18:47:04 [43005960] -> osm_mad_pool_put: [
> Jan 27 18:47:04 [43005960] -> osm_mad_pool_put: Releasing p_madw = 0x56d9c0,
> p_mad = 0x599140.
> Jan 27 18:47:04 [43005960] -> osm_vendor_put: [
> Jan 27 18:47:04 [43005960] -> osm_vendor_put: Retiring UMAD 0x599140.
> Jan 27 18:47:04 [43005960] -> osm_vendor_put: ]
> Jan 27 18:47:04 [43005960] -> osm_mad_pool_put: ]
> Jan 27 18:47:04 [43005960] -> __osm_sm_mad_ctrl_retire_trans_mad: 0 QP0 MADs
> outstanding.
> Jan 27 18:47:04 [43005960] -> __osm_sm_mad_ctrl_retire_trans_mad: Posting
> Dispatcher message OSM_MSG_NO_SMPS_OUTSTANDING.
> Jan 27 18:47:04 [43005960] -> __osm_sm_mad_ctrl_retire_trans_mad: ]
> Jan 27 18:47:04 [43005960] -> __osm_sm_mad_ctrl_disp_done_callback: ]
> Jan 27 18:47:04 [43005960] -> osm_state_mgr_process: [
> Jan 27 18:47:04 [43005960] -> osm_state_mgr_process: Received signal
> OSM_SIGNAL_NO_PENDING_TRANSACTIONS in state OSM_SM_STATE_SWEEP_LIGHT.
> Jan 27 18:47:04 [43005960] -> __osm_state_mgr_light_sweep_done_msg:
>
>
> ******************************************************************
> ********************** LIGHT SWEEP COMPLETE **********************
> ******************************************************************
>
>
> Jan 27 18:47:04 [43005960] -> osm_state_mgr_process: Received signal
> OSM_SIGNAL_IDLE_TIME_PROCESS in state OSM_SM_STATE_PROCESS_REQUEST.
> Jan 27 18:47:04 [43005960] -> __process_idle_time_queue_start: [
> Jan 27 18:47:04 [43005960] -> __process_idle_time_queue_start: ]
> Jan 27 18:47:04 [43005960] -> osm_state_mgr_process: ]
> Jan 27 18:47:09 [9597F060] -> osm_vl15_shutdown: [
> Jan 27 18:47:09 [9597F060] -> osm_vl15_shutdown: ]
> Jan 27 18:47:09 [9597F060] -> osm_vendor_set_sm: [
> Jan 27 18:47:09 [9597F060] -> osm_vendor_set_sm: ]
> Jan 27 18:47:09 [9597F060] -> osm_sm_destroy: [
> Jan 27 18:47:09 [44007960] -> __osm_sm_sweeper: Off schedule sweep signalled.
> Jan 27 18:47:09 [44007960] -> __osm_sm_sweeper: ]
> Jan 27 18:47:09 [9597F060] -> osm_trap_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> cl_event_wheel_destroy: [
> Jan 27 18:47:09 [9597F060] -> cl_event_wheel_dump: [
> Jan 27 18:47:09 [9597F060] -> cl_event_wheel_dump: event_wheel ptr:0x5575f8
> Jan 27 18:47:09 [9597F060] -> cl_event_wheel_dump: ]
> Jan 27 18:47:09 [9597F060] -> cl_event_wheel_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_trap_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_sminfo_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_sminfo_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_ni_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_ni_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_pi_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_pi_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_si_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_si_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_nd_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_nd_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_lid_mgr_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_lid_mgr_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_ucast_mgr_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_ucast_mgr_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_link_mgr_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_link_mgr_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_drop_mgr_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_drop_mgr_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_lft_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_lft_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_mft_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_mft_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_slvl_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_slvl_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_vla_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_vla_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_pkey_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_pkey_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_state_mgr_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_state_mgr_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_sm_state_mgr_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_sm_state_mgr_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_mcast_mgr_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_mcast_mgr_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_sm_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_sa_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_nr_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_nr_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_pir_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_pir_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_lr_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_lr_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_pr_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_pr_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_smir_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_smir_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_mcmr_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_mcmr_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_sr_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_sr_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_infr_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_infr_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_vlarb_rec_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_vlarb_rec_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_slvl_rec_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_slvl_rec_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_pkey_rec_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_pkey_rec_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_lftr_rcv_destroy: [
> Jan 27 18:47:09 [9597F060] -> osm_lftr_rcv_destroy: ]
> Jan 27 18:47:09 [9597F060] -> osm_sa_destroy: ]
>
>
> ______________________________________________________________________
>
> _______________________________________________
> openib-general mailing list
> [email protected]
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
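
The suspected race described in this reply (one thread destroying the MAD
pool while another thread calls get) can be illustrated with a small
sketch. It is written in Python rather than OpenSM's C, and the names
MadPool, PoolDestroyed, worker and shutdown are hypothetical, not OpenSM
APIs. It shows two possible guards, under the assumption that a fix would
(a) stop and join the threads that use the pool before destroying it and
(b) make get fail cleanly once the pool is marked destroyed. Whether either
guard matches the actual cause of this crash is not confirmed.

```python
import threading


class PoolDestroyed(Exception):
    """Raised when get() is called after the pool has been destroyed."""


class MadPool:
    """Toy stand-in for a MAD pool (hypothetical, not the osm_mad_pool API)."""

    def __init__(self, size):
        self._lock = threading.Lock()
        # Placeholder buffers standing in for pooled MADs.
        self._free = [bytearray(256) for _ in range(size)]
        self._destroyed = False

    def get(self):
        with self._lock:
            # Guard (b): refuse to touch pool state after teardown.
            if self._destroyed:
                raise PoolDestroyed("get() called after destroy()")
            if not self._free:
                return None  # pool temporarily empty
            return self._free.pop()

    def put(self, buf):
        with self._lock:
            # Buffers returned after teardown are simply dropped.
            if not self._destroyed:
                self._free.append(buf)

    def destroy(self):
        with self._lock:
            self._destroyed = True
            self._free.clear()


def worker(pool, stop_event):
    """Borrow and return buffers until asked to stop."""
    while not stop_event.is_set():
        buf = pool.get()
        if buf is not None:
            pool.put(buf)


def shutdown(workers, stop_event, pool):
    """Guard (a): stop and join every pool user before destroying the pool."""
    stop_event.set()
    for w in workers:
        w.join()
    pool.destroy()
```

With this ordering no worker can be inside get while destroy runs, because
shutdown only calls destroy after every worker has been joined. The
destroyed flag is a second line of defence: a late caller gets an exception
it can handle instead of touching freed state.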
