Perhaps the machine is not hung completely, but is only unable to support logins.
There can be an issue if the login process attempts to access Lustre, e.g. because the home directory is on Lustre, or perhaps when a directory on Lustre is early in your $PATH. I've seen logins and existing shell sessions routinely hang in my setting when Lustre servers encountered severe problems, while home directories were on Lustre and Lustre was mounted on the login node.

I'm sure there are details there that a Lustre expert could fill in; maybe there are some fail-soft mechanisms that are designed to prevent hangs by returning appropriate error codes. So this may be more an issue of the login mechanisms being unable to recover when attempts to access an expected file or directory give some particular I/O error.

David

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:lustre-discuss-
> [EMAIL PROTECTED] On Behalf Of Brian J. Murrell
> Sent: Tuesday, April 29, 2008 4:53 AM
> To: lustre
> Subject: Re: [Lustre-discuss] Client is not accesible when OSS/OST server
> is down
>
> On Tue, 2008-04-29 at 12:21 +0900, jungdoo yoon wrote:
> >
> > When OSS/OST server is down, lustre client is
> > not accesible by any means ( ssh, telnet ).
> > I guess it is hung over lustre partition.
>
> Unless you are using Lustre for your root and/or usr filesystem and/or
> for swap, Lustre should not hang a machine completely.
>
> When a machine is hung as you describe, we generally want to see a stack
> trace from all of the processes. Given that you cannot access the
> machine once it's in this state, you will need to get stack traces from
> the console. You can do this by sending a BREAK-t. You will want to be
> able to capture the output from the serial console as there will be
> much.
>
> b.

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss