> If I use about 10 clients, then the write performance of my
> system is 2 GB/s, and the read performance of my system is
> also 2 GB/s. These are the results when I run either reads or
> writes, but not both at the same time.

Unfortunately these numbers are meaningless without an idea of
the storage system and the access patterns.

> But, when I have 10 clients doing reads, and a different 10
> clients doing writes, the write performance barely drops, but
> the read performance drops to about 150 MB/s.

This may be entirely the right thing to happen.

> I am using the deadline scheduler.

That's a good choice in many cases to avoid starvation problems
with CFQ.

> Can anyone suggest why this is occurring, and how to address
> it?

Well, depending on the storage system and access patterns, the
right figure might be 150MB/s and the wrong figure the 2GB/s.

You are not even saying whether the 10 and 10 are reading/writing
to the same file(s) or different ones (which might matter a great
deal for MDS/OSS/client inode sync), or how many client systems
those threads run on.

Anyhow, it could be because of Lustre caching writes but not
reads (usually), which often results in writes being faster than
reads (but usually by less than that). It could be that your
storage system does not have the IOPS to do better, or flusher
issues, disk host adapter issues (lots have buggy elevators or
caching policies), or too much multithreading. It could be that
the prefetcher of the (awful) Linux page cache needs huge
read-aheads to work well for sequential loads. It could be the
clients caching instead, or limitations in the client networking,
or in the switches you are using or their setup.

Run 'iostat -xd 1' on the OSSes to get a better idea of what is
going on, and also ideally on the MDS. Also run on OSSes and MDS:

  watch -n1 cat /proc/meminfo

and observe how 'Dirty' and 'Writeback' go.

Try also to run concurrent send/receive (using 2 'nuttcp'
instances) between clients and servers to check that the
networking can handle both directions simultaneously.
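
If you would rather log the 'Dirty' and 'Writeback' figures than
watch them scroll by, here is a minimal sketch of the same
observation in Python. The function names (parse_meminfo,
dirty_writeback, watch) are made up for illustration and are not
part of any Lustre or kernel tooling. It only assumes the usual
"Name:   value kB" layout of /proc/meminfo.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_*) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_writeback(text):
    """Return (Dirty, Writeback) in kB, or 0 for a field that is absent."""
    fields = parse_meminfo(text)
    return fields.get("Dirty", 0), fields.get("Writeback", 0)


def watch(path="/proc/meminfo", samples=10, interval=1.0):
    """Print Dirty/Writeback once per interval (hypothetical helper)."""
    for _ in range(samples):
        with open(path) as f:
            dirty, writeback = dirty_writeback(f.read())
        print(f"Dirty={dirty} kB  Writeback={writeback} kB")
        time.sleep(interval)
```

Calling watch(samples=60) on an OSS while the mixed read/write
test runs gives one line per second that can be redirected to a
file and compared against the 'iostat' output afterwards.
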
Some network chipsets (cheaper ones) can only process N packets
per second, where N is less than what is necessary to run both
transmitter and receiver at full speed simultaneously (as a rule
the transmitter wins, so if you have those in your OSSes, that is
bad news for reads).

> In my idealized world, the read and write performance should
> be proportional to the initial read and write performance, and
> the ratio of the number of read and write clients.

That's movingly innocent and optimistic. Storage and system
performance is extremely anisotropic, especially with Lustre and
large storage systems in general.
_______________________________________________
Lustre-discuss mailing list
http://lists.lustre.org/mailman/listinfo/lustre-discuss
