I would recommend *against* mounting all 175 OSTs at the same time.  There are 
(or at least were*) some issues with the MGS registration RPCs timing out when 
too many config changes happen at once.  Your "mount and wait 2 sec" is more 
robust and doesn't take very much time (a few minutes) vs. having to restart if 
some of the OSTs have problems registering.  Also, the config logs will have 
the OSTs in a nice order, which doesn't affect any functionality, but makes it 
easier for the admin to see if some device is connected in "lctl dl" output.

Cheers, Andreas


[*] some fixes have landed over time to improve registration RPC resend.

On Jan 8, 2024, at 11:57, Thomas Roth via lustre-discuss wrote:

Yes, sorry, I meant the actual procedure of mounting the OSTs for the first 
time.

Last year I did that with 175 OSTs - replacements for EOL hardware. All OSTs 
had been formatted with a specific index, so probably creating a suitable 
/etc/fstab everywhere and sending a 'mount -a -t lustre' to all OSTs 
simultaneously would have worked.

But why the hurry? Instead, I logged in to my new OSS, mounted the OSTs with 2 
sec between each mount command, watched the OSS log, watched the MDS log, saw 
the expected log messages, proceeded to the new OSS - all fine ;-)  Such a 
leisurely approach takes its time, of course.
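
The staggered mount described above could be scripted roughly as follows. 
This is only a sketch: the function name mount_osts_staggered, the DELAY and 
DRY_RUN variables, and the device and mount point paths are hypothetical, and 
it does not replace watching the OSS and MDS logs between mounts.

```shell
#!/bin/sh
# Sketch: mount OST targets one at a time, pausing between mounts so the
# MGS sees the registrations one after another.
# Arguments are pairs: <device> <mountpoint> [<device> <mountpoint> ...]
# DELAY   (hypothetical) seconds to wait after each mount, default 2
# DRY_RUN (hypothetical) if non-empty, only print the mount commands
mount_osts_staggered() {
    delay="${DELAY:-2}"
    while [ "$#" -ge 2 ]; do
        dev="$1"
        mnt="$2"
        shift 2
        if [ -n "${DRY_RUN:-}" ]; then
            echo "mount -t lustre $dev $mnt"
        else
            # Stop at the first failure so the problem can be inspected
            # before any further OSTs try to register.
            if ! mount -t lustre "$dev" "$mnt"; then
                echo "mount failed for $dev, stopping" >&2
                return 1
            fi
            sleep "$delay"
        fi
    done
}

# Example (paths are placeholders):
#   DRY_RUN=1 mount_osts_staggered /dev/sdb /mnt/ost0 /dev/sdc /mnt/ost1
```

Stopping at the first failed mount keeps the remaining OSTs unregistered, 
which matches the idea of checking each step before moving on.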

Once all OSTs were happily incorporated, we set max_create_count (set to 0 
before) to some finite value and started file migration. As long as the 
migration is faster than the users' file creations, the result should be 
evenly filled OSTs with a good mixture of files (file sizes, ages, types).
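
For illustration, the max_create_count step might look like the sketch below. 
The helper name create_count_cmd and the file system name "testfs" are 
hypothetical. The parameter path osp.<fsname>-OSTxxxx*.max_create_count is 
written as I recall it from the Lustre manual, so please check it against 
your Lustre version before use. The helper only prints the command, which 
would then be run on the MDS.

```shell
#!/bin/sh
# Sketch: build the lctl command that sets max_create_count for one OST.
# Arguments: <fsname> <OST index, decimal> <value>
# A value of 0 stops new object creation on that OST; a finite value
# allows it again. OST names use a 4-digit hex index (10 -> OST000a).
create_count_cmd() {
    printf 'lctl set_param osp.%s-OST%04x*.max_create_count=%s\n' \
        "$1" "$2" "$3"
}

# Example, to be run on the MDS (values are placeholders):
#   eval "$(create_count_cmd testfs 10 0)"       # before first mount
#   eval "$(create_count_cmd testfs 10 10000)"   # once the OST is in place
#
# Migration off an old OST could then be driven from a client with, e.g.:
#   lfs find /mnt/testfs --ost 3 | lfs_migrate -y
```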


Cheers
Thomas

On 1/8/24 19:07, Andreas Dilger wrote:
The need to rebalance depends on how full the existing OSTs are.  My 
recommendation if you know that the data will continue to grow is to add new 
OSTs when the existing ones are at 60-70% full, and add them in larger groups 
rather than one at a time.
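
To see which OSTs have reached that range, a small filter over "lfs df" 
output may help. This is a sketch only: the function name osts_above and the 
mount point are hypothetical, and it assumes the usual "lfs df" column layout 
with the Use% value in the fifth column.

```shell
#!/bin/sh
# Sketch: read "lfs df" output on stdin and print every OST whose Use%
# is at or above the given threshold (in percent).
# Assumes lines of the form:
#   <fsname>-OSTxxxx_UUID <1K-blocks> <Used> <Available> <Use%> <mount>
osts_above() {
    awk -v t="$1" '
        $1 ~ /-OST[0-9a-f]+_UUID$/ {
            pct = $5
            sub(/%/, "", pct)          # strip the trailing percent sign
            if (pct + 0 >= t + 0)
                print $1, pct
        }'
}

# Example (mount point is a placeholder):
#   lfs df /mnt/testfs | osts_above 60
```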

Cheers, Andreas

On Jan 8, 2024, at 09:29, Thomas Roth via lustre-discuss wrote:

Just mount the OSTs one by one, preferably not while your system is heavily 
loaded. Follow what happens in the MDS log and the OSS log.
And try to rebalance the OSTs' fill levels afterwards - very empty OSTs will 
attract all new files, which might be hot and direct your users' fire at your 
new OSS only.

Regards,
Thomas

On 1/8/24 15:38, Backer via lustre-discuss wrote:
Hi,

Good morning and happy new year!

I have a quick question on extending a Lustre file system. The extension is 
performed online. I am looking for any best practices or anything to watch out 
for while doing the file system extension. The extension is done by adding new 
OSSs and many OSTs within these servers.

Really appreciate your help on this.

Regards,
_______________________________________________
lustre-discuss mailing list
[email protected]<mailto:[email protected]>
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud
