[Lustre-discuss] o2ib possible network problems -- solved

Ms. Megan Larko Mon, 22 Sep 2008 13:17:42 -0700

Hello All,

I honestly do not know how it happened, but the value in
/proc/sys/lustre/timeout on the OSS box was set to 100, while all
other systems were set to 1000. I changed the value on the OSS to
1000 and every error message on all of the related systems stopped.
I got the idea to re-check from an e-mail message sent by Brian
Murrell, archived on os-dir, referring to bug 16237. Brian listed the
timeout value as another thing to check.


Interestingly enough, the readahead (blockdev --report /dev/sdX) on
the same OSS was set to 672. I have no idea where that came from
either. All of the other systems have a reported readahead value of
256. I changed the readahead value on the OSS box first (blockdev
--setra 256 /dev/sdX), but the error messages did not stop until I
fixed the value in /proc/sys/lustre/timeout.
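
A mismatch like this is easy to miss by eye, so here is a minimal
sketch of a check that could be run on each node. The function name
check_setting is hypothetical; the path and the value 1000 are simply
the ones from this message, not a recommendation.

```shell
#!/bin/sh
# check_setting FILE EXPECTED
# Compare the value stored in FILE with EXPECTED and report the result.
# Returns 0 on match, 1 on mismatch, 2 if FILE cannot be read.
check_setting() {
    file=$1
    expected=$2
    if [ ! -r "$file" ]; then
        echo "UNREADABLE $file"
        return 2
    fi
    actual=$(cat "$file")
    if [ "$actual" = "$expected" ]; then
        echo "OK $file=$actual"
    else
        echo "MISMATCH $file=$actual (expected $expected)"
        return 1
    fi
}

# Example call (commented out so that sourcing this file has no side effects):
# check_setting /proc/sys/lustre/timeout 1000
#
# Readahead is not a /proc file; it can be read per device with:
# blockdev --getra /dev/sdX
```

On the OSS described above, the example call would have reported a
mismatch (100 instead of 1000) before the fix.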

How could my /proc have such odd values in it?

I will see if the change holds for now.   I may have to do something
to make it persistent for future reboots.
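
One possible way to make the timeout persistent, sketched under
assumptions: as far as I recall from the Lustre manual, lctl
conf_param run on the MGS stores a filesystem-wide setting that
survives reboots. The filesystem name "testfs" and the wrapper
function persist_timeout are hypothetical; by default the function
only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# persist_timeout FSNAME SECONDS
# Build the lctl command that would store the timeout permanently.
# Prints the command by default; set RUN=1 to execute it (on the MGS node).
persist_timeout() {
    fsname=$1
    seconds=$2
    cmd="lctl conf_param ${fsname}.sys.timeout=${seconds}"
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
}

# Example call (hypothetical filesystem name, commented out on purpose):
# persist_timeout testfs 1000
```

Whether this is the right mechanism for a given Lustre version should
be checked against that version's documentation before running it.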

Cheers!
megan
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
