Hi! I'd like to give the community an update:
- We had stability issues with the kernel-based NFS server exporting our Lustre filesystem. We could trigger the problem with a VS Code setup. The immediate indication was rising system load on the NFS server, which was tied to nfsd kernel threads going into the "D" state (uninterruptible disk sleep).
- The only way to recover was to reboot the NFS server. This sometimes killed our Lustre MDS.
- We migrated the NFS gateway servers to the user-space NFS-ganesha. For a few weeks now, we have not seen any issues with the NFS shares, neither with our test case nor otherwise.

Best regards,
--
- Simppa -

Mr. Simppa Äkäslompolo
High performance computing specialist
Doctor of Science (Tech.)
Aalto Scientific Computing
School of Science, Aalto University, Finland
+358-50-5311327
https://scicomp.aalto.fi/

________________________________________
From: Äkäslompolo Simppa <[email protected]>
Sent: Thursday, April 3, 2025 13:02
To: OpenSFS Administration via lustre-discuss
Subject: Anyone exporting lustre via NFS-ganesha?

Hi!

We are exporting our Lustre filesystem via NFS gateway nodes. There has been some instability. We are currently exploring whether we could replace the standard in-kernel NFS server with the user-space NFS-ganesha:
https://github.com/nfs-ganesha/nfs-ganesha

Setting it up seems to be more difficult than expected, and the documentation isn't great. If someone could share their notes, we would be very interested in hearing more!

The current thing we are fighting is id-mapping issues, but it doesn't seem to be the last problem :-/

Thanks!

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
