On Thu, 02 Apr 2009 at 08:07:20PM +0200, Bernd Schubert wrote:
> Hello,
>
> I'm fighting (as usual) with some Lustre problems and I think this time it is
> IB related. In the logs of some systems I see messages like these:
>
> ib_mthca 0000:0d:00.0: Async event 16 for bogus QP 00da0407
>
> Anyone knows what is the meaning of that? The kernel modules are from
> OFED-1.3.1.

Hi Bernd,

we are also using 1.3.1 and Lustre, as you have seen recently at our site ;-)

I get messages like these only when large computing jobs are running over
IPoIB. I believe this is an issue with the send/receive buffers, because I see
dropped packets on the IPoIB interface. Those jobs usually work fine ("usually"
because the application itself is buggy), so I consider these messages rather
harmless.

regards, P

-- 
Pawel Dziekonski <[email protected]>
Wroclaw Centre for Networking & Supercomputing, HPC Department
Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND
phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl
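
P.S. To check whether the dropped packets mentioned above coincide with the
"bogus QP" messages, the kernel's per-interface counters can be polled. This is
only a rough sketch: it assumes the IPoIB interface is named ib0 (adjust to
your setup) and that the standard Linux sysfs statistics files are present.
The function name and its parameters are made up for illustration.

```python
import os


def read_drop_counters(iface="ib0", sysfs_root="/sys/class/net"):
    """Return the rx/tx dropped-packet counters for one network interface.

    iface and sysfs_root are assumptions; "ib0" is just a common IPoIB name.
    """
    stats_dir = os.path.join(sysfs_root, iface, "statistics")
    counters = {}
    for name in ("rx_dropped", "tx_dropped"):
        # Each statistics file holds a single decimal number.
        with open(os.path.join(stats_dir, name)) as f:
            counters[name] = int(f.read().strip())
    return counters


if __name__ == "__main__":
    try:
        print(read_drop_counters())
    except FileNotFoundError:
        print("interface ib0 not found; adjust the interface name")
```

Running it before and after a large job and comparing the two results should
show whether the counters grow while the messages appear in the log.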
