On Tue, 2009-04-07 at 09:20 +0200, Arne Wiebalck wrote:
> Brian,

Arne,

> what about if you have multiple clients, all having transactions with
> the OSS open. Now the OSS goes down and comes back. From what I
> understand, the server goes into recovery and rejects new connections
> before recovery is finished (correct?).

Correct.

> What if all but one client
> reconnect, i.e. you lose one client: are the transactions of the
> successfully reconnected clients replayed or are they discarded?

If the lost client has a transaction that needs to be replayed, all of
the transactions up to that missing transaction are replayed but all
subsequent transactions are discarded and when the recovery timer
expires, recovery is aborted.

The semantics of this will change when VBR becomes available, in
1.8.something, where something might be 0 even. In that case, only
transactions actually dependent on the missing transactions will be
discarded.

> Independent from the load? I think the 'official' statement was that the
> cluster has to be quiescent, i.e. no client activity. Is that (still)
> true?

Yes, that is the official statement and I don't think any further
testing has been done to change that statement, officially, but I think
the general feeling is that quiescence should not be necessary, but we
just don't have the scientific testing to be assured of that. So if you
want to be safe, quiesce the filesystem first. :-)

b.
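
To make the two replay behaviours above concrete, here is a toy model in Python. It is only a sketch and not Lustre code: the `Txn` type, the `depends_on` field and both function names are hypothetical, and how real dependencies between transactions are tracked is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    """Hypothetical stand-in for one transaction awaiting replay."""
    transno: int                          # replay order
    client: str                           # client that holds the transaction
    depends_on: frozenset = frozenset()   # assumed dependency model


def replay_in_order(txns, reconnected):
    """Current behaviour as described: replay up to the first missing
    transaction, then discard everything after it."""
    replayed, not_replayed = [], []
    gap = False
    for t in sorted(txns, key=lambda t: t.transno):
        if t.client not in reconnected:
            gap = True                    # first missing transaction found
        (not_replayed if gap else replayed).append(t.transno)
    return replayed, not_replayed


def replay_with_dependencies(txns, reconnected):
    """Behaviour described for VBR: only transactions that depend on a
    missing transaction are discarded."""
    replayed, not_replayed, lost = [], [], set()
    for t in sorted(txns, key=lambda t: t.transno):
        if t.client not in reconnected or (t.depends_on & lost):
            lost.add(t.transno)           # missing, or depends on a missing one
            not_replayed.append(t.transno)
        else:
            replayed.append(t.transno)
    return replayed, not_replayed


# Client C never reconnects; transaction 4 depends on C's transaction 3.
txns = [
    Txn(1, "A"),
    Txn(2, "B"),
    Txn(3, "C"),
    Txn(4, "A", frozenset({3})),
    Txn(5, "B"),
]
reconnected = {"A", "B"}

print(replay_in_order(txns, reconnected))
print(replay_with_dependencies(txns, reconnected))
```

In this example transaction 5 is independent of the lost client, so it is thrown away in the first model but would survive in the second.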
_______________________________________________ Lustre-discuss mailing list [email protected] http://lists.lustre.org/mailman/listinfo/lustre-discuss
