On 12/14/2011 10:18 PM, Ira Weiny wrote:
>
> In addition print transaction ID of all DR PATH dumps to make sure we know
> which MAD's they refer to.

One note on this approach: it splits the logging of send errors between
the vendor layer and the SM rather than keeping it all at one layer of
the implementation. That's the tradeoff for not fixing the bug in
umad_receiver in how the DR path is printed with ERR 5411.

> Signed-off-by: Ira Weiny <[email protected]>
> ---
>  libvendor/osm_vendor_ibumad.c |    2 --
>  opensm/osm_helper.c           |    5 +++--
>  opensm/osm_sm_mad_ctrl.c      |   16 ++++++++++++++--
>  3 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/libvendor/osm_vendor_ibumad.c b/libvendor/osm_vendor_ibumad.c
> index e2ebd8e..b2872c8 100644
> --- a/libvendor/osm_vendor_ibumad.c
> +++ b/libvendor/osm_vendor_ibumad.c
> @@ -348,8 +348,6 @@ static void *umad_receiver(void *p_ptr)
>                                  ", Hop Ptr: 0x%X\n",
>                                  mad->method, cl_ntoh16(mad->attr_id),
>                                  cl_ntoh64(mad->trans_id), smp->hop_ptr);
> -                        osm_dump_smp_dr_path(p_vend->p_log, smp,
> -                                             OSM_LOG_ERROR);

If you're going in this direction, why not also remove the logging of
error 5411 above it, which means eliminating the else clause there?
Isn't that redundant with your change below to sm_mad_ctrl_send_err_cb?

Also, shouldn't another related change be made to umad_receiver?
Where it is:
        if (mad->mgmt_class != IB_MCLASS_SUBN_DIR) {
it should now be:
        if ((mad->mgmt_class != IB_MCLASS_SUBN_DIR) &&
            (mad->mgmt_class != IB_MCLASS_SUBN_LID)) {
to go along with the SM classes being logged in the SM send_err callback
rather than at the umad layer.

-- Hal

>                  }
>
>                  if (!(p_req_madw = get_madw(p_vend, &mad->trans_id))) {
> diff --git a/opensm/osm_helper.c b/opensm/osm_helper.c
> index f9f3d9d..b968679 100644
> --- a/opensm/osm_helper.c
> +++ b/opensm/osm_helper.c
> @@ -2059,8 +2059,9 @@ void osm_dump_smp_dr_path(IN osm_log_t * p_log, IN const ib_smp_t * p_smp,
>          char buf[BUF_SIZE];
>          unsigned n;
>
> -        n = sprintf(buf, "Received SMP on a %u hop path: "
> -                    "Initial path = ", p_smp->hop_count);
> +        n = sprintf(buf, " DR SMP (TID 0x%" PRIx64 ") on a %u hop path: "
> +                    "Initial path = ",
> +                    cl_ntoh64(p_smp->trans_id), p_smp->hop_count);
>          n += sprint_uint8_arr(buf + n, sizeof(buf) - n,
>                                p_smp->initial_path,
>                                p_smp->hop_count + 1);
> diff --git a/opensm/osm_sm_mad_ctrl.c b/opensm/osm_sm_mad_ctrl.c
> index ee92c66..a3b444a 100644
> --- a/opensm/osm_sm_mad_ctrl.c
> +++ b/opensm/osm_sm_mad_ctrl.c
> @@ -704,6 +704,7 @@ Exit:
>   */
>  static void sm_mad_ctrl_send_err_cb(IN void *context, IN osm_madw_t * p_madw)
>  {
> +        char lidstr[8];
>          osm_sm_mad_ctrl_t *p_ctrl = context;
>          ib_api_status_t status;
>          ib_smp_t *p_smp;
> @@ -713,13 +714,24 @@ static void sm_mad_ctrl_send_err_cb(IN void *context, IN osm_madw_t * p_madw)
>          CL_ASSERT(p_madw);
>
>          p_smp = osm_madw_get_smp_ptr(p_madw);
> +
> +        if (p_smp->mgmt_class == IB_MCLASS_SUBN_DIR)
> +                lidstr[0] = '\0';
> +        else
> +                snprintf(lidstr, 8, " DLID %u",
> +                         cl_ntoh16(p_madw->mad_addr.dest_lid));
> +
>          OSM_LOG(p_ctrl->p_log, OSM_LOG_ERROR, "ERR 3113: "
>                  "MAD completed in error (%s): "
> -                "%s(%s), attr_mod 0x%x, TID 0x%" PRIx64 "\n",
> +                "%s(%s), attr_mod 0x%x, TID 0x%" PRIx64 " %s\n",
>                  ib_get_err_str(p_madw->status),
>                  ib_get_sm_method_str(p_smp->method),
>                  ib_get_sm_attr_str(p_smp->attr_id), cl_ntoh32(p_smp->attr_mod),
> -                cl_ntoh64(p_smp->trans_id));
> +                cl_ntoh64(p_smp->trans_id),
> +                lidstr);
> +
> +        if (p_smp->mgmt_class == IB_MCLASS_SUBN_DIR)
> +                osm_dump_smp_dr_path(p_ctrl->p_log, p_smp, OSM_LOG_ERROR);
>
>          /*
>             If this was a SubnSet MAD, then this error might
indicate a problem
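
For reference, here is a minimal standalone sketch of the two class checks discussed in this thread. It is not OpenSM code: the helper names vendor_should_log_send_err and format_dlid_suffix are hypothetical, the two management class values are redefined locally so the file compiles on its own, the LID is assumed to be already in host byte order (the patch converts it with cl_ntoh16), and the 16-byte buffer suggested in the comment is my own choice. The longest possible suffix, " DLID 65535", needs 12 bytes including the terminator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Management class values from the InfiniBand specification,
 * redefined here only so the sketch is self-contained. */
#define IB_MCLASS_SUBN_LID 0x01 /* LID-routed subnet management */
#define IB_MCLASS_SUBN_DIR 0x81 /* directed-route subnet management */

/*
 * Hypothetical helper: returns nonzero when the vendor (umad) layer
 * should still log a send error itself, i.e. when the MAD is not in
 * either SM class. SM-class errors are assumed to be reported by the
 * SM send_err callback instead.
 */
static int vendor_should_log_send_err(uint8_t mgmt_class)
{
	return mgmt_class != IB_MCLASS_SUBN_DIR &&
	       mgmt_class != IB_MCLASS_SUBN_LID;
}

/*
 * Hypothetical helper: builds the optional " DLID <n>" suffix.
 * Directed-route SMPs get an empty string because their route is
 * described by the DR path dump rather than by a destination LID.
 * Returns the length snprintf wanted to write, so a return value
 * >= len tells the caller the suffix was truncated.
 * A 16-byte buffer is enough for any 16-bit LID.
 */
static int format_dlid_suffix(char *buf, size_t len,
			      uint8_t mgmt_class, uint16_t dest_lid_host)
{
	assert(buf != NULL && len > 0);

	if (mgmt_class == IB_MCLASS_SUBN_DIR) {
		buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, len, " DLID %u", (unsigned)dest_lid_host);
}
```

A caller can compare the return value of format_dlid_suffix against the buffer size to detect truncation. With an 8-byte buffer the call returns 11 for LID 65535, so only single-digit LIDs fit in that size.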
