On Wed, 2009-07-15 at 11:22 -0400, Robin Humble wrote:
> all kernels are compiled with the rhel5 kernel tree's standard OFED.
> I think 1.3.2 is what's in rhel5.3/centos5.3?

Yeah, something like that IIRC.

> the error messages are just on the initial mount of the first lustre fs.
> subsequent mounts of other lustre fs's don't get any messages, so it
> seems like it's just an extremely noisy protocol/version negotiation
> the first time the 1.8.1 lnet fires up and tries to talk to 1.6.7.2
> servers??

Maybe one of our LNET experts has some additional information to
offer.

> another data point is that the above errors don't happen with
> 2.6.18-128.1.14.el5 patched with 1.8.0.1 and using the same in-kernel
> OFED, so it's probably something that's happened between 1.8.0.1 and
> 1.8.1-pre.
> or I guess it could be a rhel change between 2.6.18-128.1.14.el5 and
> 2.6.18-128.1.16.el5, but that seems less likely.
> I can spin up a 2.6.18-128.1.14.el5 with b_release_1_8_1 if you like...

Yeah, it would be a useful troubleshooting data point to see whether the
same kernel on the clients and servers, with the different Lustre
versions, has the same problem.  That would either pin the problem on
the difference in OFED stacks or rule that difference out.

> cool. thanks for the explanation.

NP.

b.

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
