Thank you for your explanation. I think I understand what you mean. I will test on a small cluster and measure the number of files/locks in use.
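For the lock measurement, something like the following sketch is what I have in mind (assuming 'lctl' is in PATH and the node exposes the ldlm.namespaces.*.lock_count parameter, as recent Lustre releases do; the parsing is illustrative, not a supported tool):

    #!/usr/bin/env python3
    # Minimal sketch: sum LDLM lock counts on an MDS via lctl.
    # Assumes 'lctl' is in PATH and ldlm.namespaces.*.lock_count is
    # exposed on this node (true on recent Lustre releases).
    import subprocess

    def total_ldlm_locks() -> int:
        # 'lctl get_param' prints lines like:
        #   ldlm.namespaces.mdt-lustre-MDT0000_UUID.lock_count=123456
        out = subprocess.run(
            ["lctl", "get_param", "ldlm.namespaces.*.lock_count"],
            capture_output=True, text=True, check=True,
        ).stdout
        return sum(int(line.split("=", 1)[1])
                   for line in out.splitlines() if "=" in line)

    if __name__ == "__main__":
        locks = total_ldlm_locks()
        # The manual estimates roughly 2KB of MDS memory per file/lock in use.
        print(f"{locks} LDLM locks ~= {locks * 2 / 1024:.0f} MB at 2KB each")

Sampling this periodically during production workloads should show how far our real lock counts are from the manual's example values.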
Andreas Dilger <[email protected]> wrote on Mon, Mar 11, 2024 at 17:34:

> All of the numbers in this example are estimates/approximations to give an
> idea of the amount of memory that the MDS may need under normal operating
> circumstances. However, the MDS will also continue to function with more
> or less memory. The actual amount of memory in use will vary very
> significantly with application type, workload, etc., and the numbers "256"
> and "100,000" are purely examples of how many files might be in use.
>
> I'm not sure you can "test" those numbers, because whatever number of
> files you test with will be the number of files actually in use. You could
> potentially _measure_ the number of files/locks in use on a large cluster,
> but again this will be highly site and application dependent.
>
> Cheers, Andreas
>
> On Mar 11, 2024, at 01:24, Amin Brick Mover <[email protected]> wrote:
>
> Hi, Andreas.
>
> Thank you for your reply.
>
> Can I consider 256 files per core an empirical parameter, and does the
> value '256' need testing based on hardware conditions? Additionally, in
> the formula "12 interactive clients * 100,000 files * 2KB = 2400 MB", is
> the number '100,000' files also an empirical parameter? Do I need to test
> it, or can I use the values '256' and '100,000' directly?
>
> Andreas Dilger <[email protected]> wrote on Mon, Mar 11, 2024 at 05:47:
>
>> These numbers are just estimates; you can use values more suitable to
>> your workload.
>>
>> Similarly, 32-core clients may be on the low side these days. NVIDIA DGX
>> nodes have 256 cores, though you may not have 1024 of them.
>>
>> The net answer is that having 64GB+ of RAM is inexpensive these days and
>> improves MDS performance, especially if you compare it to the cost of
>> client nodes that would sit waiting for filesystem access if the MDS is
>> short of RAM. Better to have too much RAM on the MDS than too little.
>>
>> Cheers, Andreas
>>
>> On Mar 4, 2024, at 00:56, Amin Brick Mover via lustre-discuss
>> <[email protected]> wrote:
>>
>> In the Lustre Manual, section 5.5.2.1, the example says:
>>
>>   For example, for a single MDT on an MDS with 1,024 compute nodes, 12
>>   interactive login nodes, and a 20 million file working set (of which
>>   9 million files are cached on the clients at one time):
>>
>>   Operating system overhead = 4096 MB (RHEL8)
>>   File system journal = 4096 MB
>>   1024 * 32-core clients * 256 files/core * 2KB = 16384 MB
>>   12 interactive clients * 100,000 files * 2KB = 2400 MB
>>   20 million file working set * 1.5KB/file = 30720 MB
>>
>> I'm curious: how were the two numbers, 256 files/core and 100,000 files,
>> determined? Why?
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud
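For anyone following along, the manual's arithmetic is easy to redo with site-specific inputs. Here is a small Python sketch using the manual's own example values (note the manual rounds the interactive-client line, 2343.75 MB, up to 2400 MB, and states the 30 GB working set as 30720 MB; the '256' and '100,000' inputs are examples, not constants):

    #!/usr/bin/env python3
    # Reproduce the MDS memory sizing example from Lustre Manual 5.5.2.1.
    # Substitute numbers measured at your own site for the example inputs.
    KB_PER_MB = 1024

    os_overhead_mb = 4096                               # RHEL8 base
    journal_mb     = 4096                               # filesystem journal
    compute_mb     = 1024 * 32 * 256 * 2 / KB_PER_MB    # 16384 MB
    interactive_mb = 12 * 100_000 * 2 / KB_PER_MB       # ~2344 MB (manual: 2400)
    workingset_mb  = 20_000_000 * 1.5 / KB_PER_MB       # ~29297 MB (manual: 30720)

    total_mb = (os_overhead_mb + journal_mb + compute_mb
                + interactive_mb + workingset_mb)
    print(f"estimated MDS RAM: {total_mb:.0f} MB (~{total_mb / 1024:.0f} GB)")

With the example inputs this lands around 55 GB, consistent with the advice above that 64GB+ of MDS RAM is a sensible floor.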
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
