Hi all, our Lustre FS exhibits an interesting performance problem that I'd like to discuss, as some of you may have seen this kind of thing before and someone may have a quick explanation of what's going on.
We are running Lustre 1.6.5.1. The problem shows up when we read a shared file from multiple nodes immediately after it has been written from the same set of nodes. 512 processes write a checkpoint (1.5 GB from each node) into a shared file: each process seeks to position RANK*1.5GB and writes its 1.5 GB in 1.44 MB chunks. Writing works fine and delivers the full file system performance. The data is written using write(), with no flags other than O_CREAT and O_WRONLY.

Once the checkpoint is written, the program is terminated, restarted, and reads back the same portion of the file. For some reason this almost immediate reading of the same data, on the same node that just wrote it, is very slow. If we a) change the set of nodes or b) wait a day, we get the full read performance with the same executable and the same shared file.

Is there a reason why an immediate read after a write on the same node from/to a shared file is slow? Is there any additional communication, e.g. is the client flushing the buffer cache before the first read? Our statistics show that the average time to complete a 1.44 MB read request increases during the runtime of the program; at some point it hits an upper limit (a saturation point) and stays there. Is there some kind of queue that is filling up in this write/read scenario? Is there anything tunable in /proc/fs/lustre?

Regards,
Michael

--
Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room A 208
Phone:  (+49) 351 463-34217
Fax:    (+49) 351 463-37773
e-mail: [email protected]
WWW:    http://www.tu-dresden.de/zih
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
