Do you have the OSS read cache enabled? Check out https://bugzilla.lustre.org/show_bug.cgi?id=20778 and https://bugzilla.lustre.org/show_bug.cgi?id=18571
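
If it is on, you can inspect and tune it on the OSSs with lctl. A rough sketch only - the parameter names below are the 1.8 obdfilter tunables as I recall them, so verify what actually exists under /proc/fs/lustre/obdfilter/*/ on your servers before relying on this:

  # show the current OSS read cache settings
  lctl get_param obdfilter.*.read_cache_enable
  lctl get_param obdfilter.*.writethrough_cache_enable
  lctl get_param obdfilter.*.readcache_max_filesize

  # cap caching to small files (32M is just an example), or turn it off entirely
  lctl set_param obdfilter.*.readcache_max_filesize=32M
  lctl set_param obdfilter.*.read_cache_enable=0
  lctl set_param obdfilter.*.writethrough_cache_enable=0

Keep in mind that set_param only changes the running value and won't survive a remount, so put it somewhere persistent if it turns out to help. As a quick sanity check that the memory really is reclaimable page cache rather than slab, you could also try on an OSS:

  grep -E 'Buffers|Cached|Slab' /proc/meminfo
  sync
  echo 1 > /proc/sys/vm/drop_caches     # drops clean page cache only
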
David Simas wrote:
> Hello,
>
> We have a Lustre 1.8.1 file system about 60 TB in size running on
> RHEL 5 x86_64. (I can provide hardware details if anyone thinks
> they'd be relevant.) We are seeing memory problems after several
> days of sustained I/O into that file system. We are writing from a
> small number of clients (4 - 5) at an average rate of 50 MB/s, with
> peaks of 350 MB/s. We read all the data at least twice before
> deleting them. During this operation, we notice the value of
> "buffers" reported in '/proc/meminfo' on the OSSs involved
> increasing monotonically until it apparently takes up all of the
> system's memory - 32 GB. Then 'kswapd' starts consuming a large
> amount of CPU, the load increases (100+), and the system, including
> Lustre, slows to a crawl and becomes quite useless. If we stop
> Lustre I/O at this point, 'kswapd' and the system load calm down,
> but the "buffers" value does not decrease. Any I/O on the system
> after that (dd if=/dev/urandom of=/tmp/test ...) will cause 'kswapd'
> to run away again. We have observed the monotonically increasing
> "buffers" condition with non-Lustre I/O on systems running the
> Lustre 1.8.1 kernel (2.6.18-128.1.14.el5_lustre.1.8.1), but we
> haven't gotten them to the point where 'kswapd' goes wild.
>
> Has anybody else seen anything like this?
>
> David Simas
> SLAC

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
