On 08:49 Thu 20 Dec, Hal Rosenstock wrote:
> > >>>>
> > >>>> Anyone else seeing the same?
> > >>> -G requires a SA path record lookup so this could be an issue with that
> > >>> timing out in some cases (assuming the port is active and the SM is
> > >>> operational).
> > >> I'm seeing the same problem.
> > >> Sometimes the query works, and sometimes it doesn't.
> > >> I also see that when the query fails, OpenSM doesn't get PathRecord
> > >> query at all.
> > >>
> > >> Hal, can you elaborate on "that timing out in some cases" issue?
> > >
> > > I just meant that the SM not responding (for an unknown reason right
> > > now) would yield this effect.
> > >
> > >> Adding Jack for the libibmad issue:
> > >>
> > >> I see that the ib_path_query() in libibmad/sa.c sometimes fails
> > >> when calling safe_sa_call().
> > >
> > > This could just be more detail on the same thing in terms of the
> > > (smpquery) client which is layered on top of libibmad: the SA path query
> > > timeout.
> > > I would suggest running OpenSM in verbose mode (both instances are with
> > > OpenSM) and seeing if it responds to the PathRecord query used by this
> > > form of smpquery and continue troubleshooting from there based on the
> > > result.
> >
> > This is actually what I was saying here.
> > I have *debugged* smpquery, and saw that the failing function is
> > ib_path_query() in libibmad/sa.c
> > As I've mentioned, I did run it with OpenSM in verbose mode, and saw
> > that when smpquery fails, OpenSM log does not have any PathRecord request.
> > When smpquery passes, I see the PathRecord request and response in the
> > OpenSM log.
>
> OK; that wasn't clear before but is now (that the failure appears to be
> a client and not SM issue) :-) FWIW, I don't know what has changed that
> would affect this so it could be a latent bug as opposed to a
> regression.

Right, there were no changes in this area during this period, so most
likely an existing issue just got triggered. I'm not sure, but I think I
saw something like this in the past and assumed at the time that it was
a cabling issue.

Yevgeny, Arthur, could you rerun smpquery with -dddd (for a lot of debug
output)?

Sasha
