Hi all,

1. I am trying to deploy a Slurm HPC cluster on Google Cloud Platform with a Lustre file system, following these instructions:
   https://codelabs.developers.google.com/codelabs/hpc-slurm-on-gcp/#0
   https://cloud.google.com/blog/products/storage-data-transfer/introducing-lustre-file-system-cloud-deployment-manager-scripts
   https://github.com/GoogleCloudPlatform/deploymentmanager-samples/tree/master/community/lustre

2. I have created VPC Peering between the Slurm network and the Lustre cluster network.

3. I have created firewall rules allowing all ports and protocols between the Slurm network and the Lustre cluster network.

4. I have added entries for all the Lustre cluster machines to /etc/hosts on the Slurm master node.

5. I have installed the following Lustre client prerequisites on the Slurm master node:

   sudo yum install kernel kernel-devel kernel-headers kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel

6. I have created /etc/yum.repos.d/lustre.repo with the following content:

   [lustre-server]
   name=CentOS-$releasever - Lustre
   baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/server/
   gpgcheck=0

   [e2fsprogs]
   name=CentOS-$releasever - Ldiskfs
   baseurl=https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/
   gpgcheck=0

   [lustre-client]
   name=CentOS-$releasever - Lustre
   baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/client/
   gpgcheck=0

7. I have installed the Lustre client packages on the Slurm master node, using the following command:

   sudo yum install e2fsprogs lustre-client

8. I used the following commands to create a mount point for the Lustre file system on the Slurm master node:

   sudo mkdir -p /lustre
   sudo chmod 777 -R /lustre

9. Because my logged-in account on the Slurm master node is a Google G Suite account rather than the root account, the only way to mount the file system and create a test file inside the mount point /lustre is with sudo:

   sudo mount -t lustre lustre-mds1:/lustre /lustre
   sudo touch /lustre/1.txt

I have a couple of problems with the above process:

A. Even though the mount point (/lustre) has a mode of 777, the directory is still owned by the root user and group, and I am still unable to write files into the /lustre mount point without sudo. How do I give Google G Suite accounts the privilege to read, write and delete files in the /lustre mount point?
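On problem A, one possible explanation is that the chmod in step 8 was applied to the local directory before the mount. Once a file system is mounted on /lustre, the path shows the owner and mode of that file system's own root directory, so permissions set beforehand are hidden. The following is a minimal sketch under that assumption: it creates a per-user directory after the mount. The names make_user_dir and SUDO are hypothetical, and the sketch also assumes the user's UID and GID are known to the Lustre servers, which has not been verified here.

```shell
#!/bin/sh
# Sketch only: give one user a writable directory on an already-mounted
# file system. make_user_dir and SUDO are illustrative names, not part of
# Lustre or slurm-gcp.
SUDO=${SUDO-sudo}

make_user_dir() {
    mount_root=$1   # e.g. /lustre, after "mount -t lustre ..." has succeeded
    dir_name=$2     # directory to create under the mount root
    owner=$3        # owner in chown syntax, e.g. "1000:1000" or "user:group"
    $SUDO mkdir -p "$mount_root/$dir_name" || return 1
    $SUDO chown "$owner" "$mount_root/$dir_name" || return 1
    # Owner gets full access, group gets read-only access, others get none.
    $SUDO chmod 750 "$mount_root/$dir_name"
}

# Usage (assumption: run on the Slurm master node after mounting):
#   make_user_dir /lustre "$(id -un)" "$(id -u):$(id -g)"
#   touch "/lustre/$(id -un)/1.txt"    # no sudo needed afterwards
```

If writes still fail after this, the mode and owner reported by `ls -ld /lustre` after mounting, compared with the output of `id`, would show whether the problem is ownership or identity mapping.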
B. How do I add the following packages as part of the Slurm deployment package on both the Slurm master node and on all Slurm compute nodes (https://github.com/SchedMD/slurm-gcp)?

   sudo yum install kernel kernel-devel kernel-headers kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel
   sudo yum install e2fsprogs lustre-client

Note: For the Lustre client installation, I need to add /etc/yum.repos.d/lustre.repo with specific content (as instructed here: http://wiki.lustre.org/Installing_the_Lustre_Software).

Thanks,
Eyal Estrin
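For problem B, the manual steps above could be collected into one script that runs as root on each node. This is only a sketch: the function names write_lustre_repo and install_lustre_client are hypothetical, and how such a script is hooked into the slurm-gcp deployment is not shown because that mechanism is not confirmed here. The repository contents are the ones from the repo file described earlier in this message.

```shell
#!/bin/sh
# Sketch only: function names are illustrative; the hook into the slurm-gcp
# deployment is deliberately left out.

write_lustre_repo() {
    repo_file=${1:-/etc/yum.repos.d/lustre.repo}
    # The quoted 'EOF' keeps $releasever literal so that yum expands it,
    # not the shell.
    cat > "$repo_file" <<'EOF'
[lustre-server]
name=CentOS-$releasever - Lustre
baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/server/
gpgcheck=0

[e2fsprogs]
name=CentOS-$releasever - Ldiskfs
baseurl=https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/
gpgcheck=0

[lustre-client]
name=CentOS-$releasever - Lustre
baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/client/
gpgcheck=0
EOF
}

install_lustre_client() {
    # -y answers yum's confirmation prompt so the script can run unattended.
    yum install -y kernel kernel-devel kernel-headers kernel-abi-whitelists \
        kernel-tools kernel-tools-libs kernel-tools-libs-devel &&
    yum install -y e2fsprogs lustre-client
}

# Usage (as root on the master node and on each compute node):
#   write_lustre_repo && install_lustre_client
```

Writing the repo file from a function with an optional path argument makes it possible to check the generated file in a scratch location before touching /etc/yum.repos.d.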
_______________________________________________ lustre-discuss mailing list [email protected] http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
