Hello,

Did you deploy Lustre and Slurm on GCP in the same zone?
What does `df` return on the master node where the client is mounted?
What does `sudo modprobe lustre` return on the master node?

Usually the Lustre client is installed by installing `kmod-lustre-client`
and `lustre-client`; see
http://wiki.lustre.org/Installing_the_Lustre_Software#Lustre_Client_Software_Installation

To add the Lustre client install to the compute, login, and master nodes
in slurm-gcp, a function can be added to `startup-script.sh` and called in
the NFS mount section, which starts at line 1039 of `startup-script.sh`.

On Sun, Aug 4, 2019 at 8:18 AM Eyal Estrin <[email protected]> wrote:

> Hi all,
>
> 1. I am trying to deploy Slurm HPC cluster based on Google Cloud Platform,
>    with Lustre file system, as instructed below:
>    https://codelabs.developers.google.com/codelabs/hpc-slurm-on-gcp/#0
>    https://cloud.google.com/blog/products/storage-data-transfer/introducing-lustre-file-system-cloud-deployment-manager-scripts
>    https://github.com/GoogleCloudPlatform/deploymentmanager-samples/tree/master/community/lustre
> 2. I have created VPC Peering between the Slurm network and the Lustre
>    cluster network
> 3. I have created Firewall rules for allowing all ports and protocols
>    between the Slurm network and the Lustre cluster network
> 4. I have added DNS records for all the Lustre cluster machines inside the
>    Slurm master node /etc/hosts
> 5. I have installed the following Lustre client pre-requirements on the
>    Slurm master node:
>      sudo yum install kernel kernel-devel kernel-headers
>      kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel
> 6. I have created the /etc/yum.repos.d/lustre.repo with the following
>    content:
>      [lustre-server]
>      name=CentOS-$releasever - Lustre
>      baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/server/
>      gpgcheck=0
>      [e2fsprogs]
>      name=CentOS-$releasever - Ldiskfs
>      baseurl=https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/
>      gpgcheck=0
>      [lustre-client]
>      name=CentOS-$releasever - Lustre
>      baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/client/
>      gpgcheck=0
> 7. I have installed the Lustre client packages on the Slurm master node,
>    using the following command:
>      sudo yum install e2fsprogs lustre-client
> 8. I used the following commands to create a mount point for the Lustre
>    file system from within the Slurm master node:
>      sudo mkdir -p /lustre
>      sudo chmod 777 -R /lustre
> 9. Due to the fact that on the Slurm master node on Google Cloud Platform,
>    my logged-in account is not Root account, but a Google G Suite account, the
>    only way to perform mount and create a test file inside the mount point
>    /lustre, is to use the following Sudo commands:
>      sudo mount -t lustre lustre-mds1:/lustre /lustre
>      sudo touch /lustre/1.txt
>
> I have couple of problems with the above process:
>
> A. Even though the mount point (/lustre) has chmod of 777, the folder is
>    still owned by Root user and group, and I am still unable to write files
>    into the /Lustre mount point - How do I allow Google G Suite accounts the
>    privilege to read/write/delete files from the /Lustre mount point?
>
> B. How do I add the following packages as part of the Slurm deployment
>    package on both the Slurm master node and on all Slurm compute nodes
>    (https://github.com/SchedMD/slurm-gcp)?
>      sudo yum install kernel kernel-devel kernel-headers
>      kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel
>      sudo yum install e2fsprogs lustre-client
>    Note: For the Lustre client installation, I need to add the
>    /etc/yum.repos.d/lustre.repo with specific content (as instructed here:
>    http://wiki.lustre.org/Installing_the_Lustre_Software)
>
> Thanks,
>
> Eyal Estrin
>
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
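
For question B, the function suggested at the top of this reply might look
roughly like the sketch below. This is only an illustration under several
assumptions: the function names (`write_lustre_repo`, `install_lustre_client`,
`mount_lustre`) are hypothetical and not part of slurm-gcp, the base URL is
copied from the repo file in the quoted message and may have moved since, and
the script is assumed to run as root, as a GCE startup script does.

```shell
#!/bin/bash
# Hypothetical helpers for startup-script.sh; names are illustrative only.

# Base URL taken from the repo file in the quoted message (may be outdated).
LUSTRE_CLIENT_BASEURL="https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/client/"

# Write a yum repo definition for the Lustre client packages.
# The target path can be overridden, which makes the function easy to test.
write_lustre_repo() {
    local repo_file="${1:-/etc/yum.repos.d/lustre.repo}"
    cat > "$repo_file" <<EOF
[lustre-client]
name=CentOS-\$releasever - Lustre
baseurl=${LUSTRE_CLIENT_BASEURL}
gpgcheck=0
EOF
}

# Install the client packages and load the kernel module.
# Assumes root privileges, so no sudo is used here.
install_lustre_client() {
    write_lustre_repo
    yum install -y kmod-lustre-client lustre-client
    modprobe lustre
}

# Mount the file system; host name and fsname are from the quoted message.
mount_lustre() {
    local mount_point="${1:-/lustre}"
    mkdir -p "$mount_point"
    mount -t lustre lustre-mds1:/lustre "$mount_point"
}
```

Calling `install_lustre_client` followed by `mount_lustre` from the NFS mount
section would then run the same steps on every node type that executes the
startup script. If `modprobe lustre` fails there, the running kernel most
likely does not match the kernel the client module was built for.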
