On 2010-05-21, at 5:49, Stefano Elmopi <[email protected]> wrote:
>
> I realized that the system time differed considerably across the
> machines; there were at least a few hours of difference.
> I was running tests and had not been paying attention to time
> synchronization, but I have now aligned the time on all servers and
> configured the ntpd service, and the problem no longer occurs.
> I can only imagine that the cause of the problem was the time
> misalignment.

The client and server clocks should have nothing to do with the functioning of Lustre, so it is surprising that this would be the cause.

> On 20 May 2010, at 13:28, Johann Lombardi wrote:
>
>> On Thu, May 20, 2010 at 12:29:41PM +0200, Stefano Elmopi wrote:
>>> Hi Andreas,
>>> My version of Lustre is 1.8.3.
>>> Sorry for my bad English, but I used the wrong word; "crash" is not
>>> the right word.
>>> Let me try to explain better: I start copying a large file onto the
>>> file system, and while the copy is in progress I reboot the OSS
>>> server, and the copy process enters the "- stalled -" state.
>>> I expected that once the server came back online, the copy process
>>> would resume normally and complete the copy of the file; instead the
>>> copy process fails.
>>> So it is the copy process that goes wrong; Lustre itself continues
>>> to work fine.
>>
>> May 19 13:46:31 mdt01prdpom kernel: LustreError: 167-0: This client
>> was evicted by lustre01-OST0000; in progress operations using this
>> service will fail.
>>
>> The cp process failed because the client got evicted by the OSS.
>> We need to look at the OSS logs to figure out the root cause of
>> the eviction.
>>
>> Johann
>
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
