On 2010-08-20, at 07:21, John Hammond wrote:
> Indeed, thanks.  On Ranger, the compute nodes use compact flash drives for /, 
> and so they depend on tmpfs's for /tmp, /var/run, /var/log, and of course 
> /dev/shm.  So cleaning up these ram backed filesystems as much as practical 
> before asking for any hugepages is also a win.
> 
> Also, in imitation of the systems that pre-allocate all needed hugepages at 
> boot time, we are considering the idea of first pre-allocating a large chunk 
> of memory (say 7/8) in hugepages, then mounting the Lustre filesystems, then 
> releasing the hugepages.  The hope is that Lustre's persistent structures 
> will be fit into a more compact region of memory thereby.

As discussed in https://bugzilla.lustre.org/show_bug.cgi?id=14323 that I 
previously referenced, the Lustre tunables are based on the total number of 
pages, and do not take huge pages into account.

Also, if the hugepages are released, there is no guarantee that you will be 
able to allocate them all again due to small pinned memory structures 
_somewhere_ in the middle of each huge page.
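
One rough way to see how many huge-page-sized free blocks remain is to read
/proc/buddyinfo, which lists free block counts per allocation order for each
zone.  The sketch below is only an illustration: the function name is made
up, and the default of order 9 assumes 4 KiB base pages and 2 MiB huge pages,
which may not match your nodes.  The sample text is invented, not measured
data.

```python
def free_blocks_at_or_above(buddyinfo_text, min_order=9):
    """Count free blocks of order >= min_order for each (node, zone).

    Assumption: order 9 corresponds to 2 MiB with 4 KiB base pages.
    """
    result = {}
    for line in buddyinfo_text.splitlines():
        parts = line.split()
        # Expected shape: "Node 0, zone Normal c0 c1 c2 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zone = parts[3]
        counts = [int(c) for c in parts[4:]]  # index == allocation order
        result[(node, zone)] = sum(counts[min_order:])
    return result


# Invented sample in the /proc/buddyinfo layout (orders 0..10).
SAMPLE = (
    "Node 0, zone      DMA      1      1      1      0      2"
    "      1      1      0      1      1      3\n"
    "Node 0, zone   Normal    100     50     20     10      5"
    "      4      3      2      1      2      1\n"
)

for key, blocks in sorted(free_blocks_at_or_above(SAMPLE).items()):
    print(key, blocks)
```

On a live node you would pass in the contents of /proc/buddyinfo instead of
SAMPLE, before and after releasing the hugepages, and compare the counts.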

If you are running a prologue/epilogue script, then you should tune the Lustre 
cache size based on the number of huge pages that will be used.  The last time 
this was investigated, there was no way for Lustre to know from within the 
kernel how many huge pages were allocated without patching the kernel.  If 
that has changed in newer kernels, it would be possible to adjust the cache 
size dynamically based on this.
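
As a sketch of what such a prologue step might compute: read /proc/meminfo,
subtract the memory reserved for huge pages, and size the cache from what is
left.  The function names, the 0.5 fraction, and the sample numbers are all
hypothetical, and the tunable in the printed command (llite max_cached_mb) is
an assumption that should be checked against your Lustre version.  The sketch
also assumes MemTotal includes the memory backing the huge pages.

```python
def parse_meminfo(text):
    """Return {field: int}; sizes are in kB as the kernel reports them."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields:
            info[name.strip()] = int(fields[0])
    return info


def suggested_cache_mb(info, fraction=0.5):
    """Cache size in MB as a fraction of memory NOT reserved for huge pages.

    fraction=0.5 is an arbitrary illustrative choice, not a recommendation.
    """
    huge_kb = info.get("HugePages_Total", 0) * info.get("Hugepagesize", 0)
    small_kb = max(info["MemTotal"] - huge_kb, 0)
    return int(small_kb * fraction) // 1024


# Invented sample: 32 GiB node with 8192 huge pages of 2 MiB each.
SAMPLE_MEMINFO = (
    "MemTotal:       33554432 kB\n"
    "HugePages_Total:    8192\n"
    "Hugepagesize:       2048 kB\n"
)

mb = suggested_cache_mb(parse_meminfo(SAMPLE_MEMINFO))
# Only print the command; applying it is left to the site's own scripts.
print("lctl set_param llite.*.max_cached_mb=%d" % mb)
```

The script only prints a command so that nothing is changed on the node until
the value has been reviewed.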


> The main obstacle in testing all of this is that benchmarking the gains 
> gotten by a particular approach is difficult, since we have not yet found an 
> easy way of producing external fragmentation of physical memory in short 
> order.  Suggestions are welcome.

Running something like "slocate" across multiple filesystems will fill all of 
RAM with inodes/dentries, and if you pin some of these in memory (e.g. start a 
shell with some deep directory as CWD), you should quickly be able to fragment 
your memory with unfreeable inode/dentry allocations.
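
If slocate is not convenient, the same idea can be sketched in a few lines:
stat every entry under a tree to populate the inode/dentry caches, and hold
some directories open so their entries stay pinned.  This is an untested
illustration of the approach, not a benchmark; the function name and the
pin_every and max_pins parameters are made up for the example.

```python
import os


def fill_and_pin(root, pin_every=1000, max_pins=256):
    """Stat everything under root; keep every pin_every-th directory open.

    Returns the list of open directory file descriptors.  While they stay
    open, the corresponding dentries and inodes cannot be reclaimed.
    """
    pinned = []
    seen = 0
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda e: None):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))  # pull inode into cache
            except OSError:
                pass  # entry vanished or is unreadable; skip it
        seen += 1
        if seen % pin_every == 0 and len(pinned) < max_pins:
            try:
                pinned.append(os.open(dirpath, os.O_RDONLY))
            except OSError:
                pass
    return pinned
```

Call it on one or more large filesystems and keep the process alive (for
example, sleeping) while requesting hugepages; closing the descriptors with
os.close() releases the pins.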

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.

