Oleg Drokin wrote:
> Hello!
>
> On Feb 10, 2009, at 12:46 PM, Simon Kelley wrote:
>> If, by "the complete event" you mean the "received cancel for unknown
>> cookie", there's not much more to tell. Grepping through the last
>> month's server logs shows that there are bursts of typically between 3
>> and 7 messages, at the same time and from the same client. After a
>> gap, the same thing but from a different client. The number can be as
>> low as one, and up to ten. They look to be related to client workload,
>> at a guess.
>
> Ok, so you do not see a pattern of this unknown cookie message followed
> by eviction in some time like 100 seconds? That's what my question is
> about.
>
> Bye,
> Oleg
>
No, there are plenty of examples of the unknown cookie message with no
eviction or other problem. It's possible that there is a pattern where
there's a run of "unknown lock cookie" messages for a particular client
and then, a couple of minutes later, "lock callback timer expired:
evicting client" messages for a _different_ client, but the signal is
not strong.

It does look like the clusters of "unknown lock cookie" may be related
to striping. If a file striped across several OSTs experiences the
problem, then there is a cluster of the messages all referencing the
same client node, one from each OST.

I'm working on reproducing the problem in a controlled way and getting
the information you asked for.

Cheers,

Simon.

_______________________________________________
Lustre-discuss mailing list
http://lists.lustre.org/mailman/listinfo/lustre-discuss
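
For reference, the two checks discussed in this thread (grouping "unknown cookie" messages into per-client bursts, and looking for an eviction within roughly 100 seconds of such a message) could be sketched as below. This is only an illustrative sketch under stated assumptions: the `(timestamp, client, kind)` event tuples, the `kind` labels, the function names, and the `gap` and `window` defaults are all hypothetical and do not reflect the real server log format or any Lustre tool. Parsing the actual log lines into such tuples is left out on purpose.

```python
from collections import defaultdict


def group_bursts(events, gap=1.0):
    """Group "unknown_cookie" events into per-client bursts.

    events: iterable of (timestamp_seconds, client, kind) tuples
            (a hypothetical, pre-parsed form of the log lines).
    gap:    assumed maximum spacing in seconds between two messages
            that still belong to the same burst.
    Returns a list of (client, [timestamps]) entries.
    """
    per_client = defaultdict(list)
    for ts, client, kind in sorted(events):
        if kind == "unknown_cookie":
            per_client[client].append(ts)

    bursts = []
    for client, stamps in per_client.items():
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] <= gap:
                current.append(ts)
            else:
                bursts.append((client, current))
                current = [ts]
        bursts.append((client, current))
    return bursts


def find_followups(events, window=100.0, same_client=True):
    """Pair each "unknown_cookie" event with any "eviction" event that
    follows it within `window` seconds.

    With same_client=False, evictions of a different client are also
    paired, which matches the weaker pattern mentioned in the reply.
    """
    events = list(events)
    cookies = [e for e in events if e[2] == "unknown_cookie"]
    evictions = [e for e in events if e[2] == "eviction"]

    pairs = []
    for cookie in cookies:
        for eviction in evictions:
            delay = eviction[0] - cookie[0]
            if 0 <= delay <= window and (
                not same_client or eviction[1] == cookie[1]
            ):
                pairs.append((cookie, eviction))
    return pairs
```

Running `find_followups` once with `same_client=True` and once with `same_client=False` would separate the pattern Oleg asks about from the cross-client one, and widening `window` to a couple of minutes would cover the delay described in the reply.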
