Thanks for the reply, Andreas.

> - the local.sh config and llmount.sh are only for basic testing and
>   development. That isn't how you would use lustre for production deployment.

Yes, I agree. My goal with llmount.sh was to make sure that I have the VMs
and networking set up correctly, and then to use that setup as a base for
further development.

> - after "ping" did you try "lctl ping" between the various VMs? Do you have
>   firewall rules that block connection?

The firewall is disabled on all 4 VMs. For some reason, 'lctl ping' was
giving me unreliable, unreproducible results. (That said, I have realized
that 'lctl ping' and 'lctl dl' are reliable ways to check whether the VMs
are correctly set up for Lustre.)

Anyway, after some trial and error, I managed to get it to work without
llmount.sh. I created new VMs and manually formatted and mounted the Lustre
filesystem on them. (To avoid disturbing the sensitive stability of the
setup, I haven't tried llmount.sh again. :-) )

On Sun, May 06, 2018 at 12:34:25PM +0000, Dilger, Andreas wrote:
> Two notes:
> - the local.sh config and llmount.sh are only for basic testing and
>   development. That isn't how you would use lustre for production deployment.
> - after "ping" did you try "lctl ping" between the various VMs? Do you have
>   firewall rules that block connection?
>
> Cheers, Andreas
>
> > On May 5, 2018, at 21:14, Rohan Garg <[email protected]> wrote:
> >
> > Hi,
> >
> > I'm trying to set up a virtual cluster (with 4 VirtualBox VMs: 1
> > MGS, 1 MDS, 1 Client, and 1 OSS) using Lustre. The VMs are running
> > CentOS-7. I have built Lustre from the master branch.
> >
> > The VM's have a NAT interface (eth0), and a host-only network
> > interface (eth1).
> >
> > Client: eth0: 10.0.2.15, eth1: 192.168.50.7, hostname: ct-client1
> > OSS: eth0: 10.0.2.15, eth1: 192.168.50.5, hostname: ct-oss1
> > MDS: eth0: 10.0.2.15, eth1: 192.168.50.9, hostname: ct-mds1
> > MGS: eth0: 10.0.2.15, eth1: 192.168.50.11, hostname: ct-mgs1
> >
> > - All the VMs have SELinux disabled.
> > - All the VMs can ping each other and can use password-less ssh among
> >   themselves.
> > - All the 4 VM's have the following line in /etc/modprobe.d/lnet.conf:
> >
> > options lnet networks="tcp(eth1)"
> >
> > I modified the cfg/local.sh file and added the following entries to make
> > it use the correct hostnames.
> >
> > MDSCOUNT=1
> > mds_HOST=ct-mds1
> > MDSDEV1=/dev/sdb
> >
> > mgs_HOST=ct-mgs1
> > MGSDEV=/dev/sdb
> >
> > OSTCOUNT=1
> > ost_HOST=ct-oss1
> > OSTDEV1=/dev/sdb
> >
> > The issue is that I can't get the llmount.sh script to mount the
> > filesystem on the client and run successfully. The script exits with the
> > following messages:
> >
> > ...
> > Started lustre-OST0000
> > Starting client: ct-client1.lfs.local: -o user_xattr,flock
> > ct-mgs1:/lustre /mnt/lustre
> > CMD: ct-client1.lfs.local mkdir -p /mnt/lustre
> > CMD: ct-client1.lfs.local mount -t lustre -o user_xattr,flock
> > ct-mgs1:/lustre /mnt/lustre
> > mount.lustre: mount ct-mgs1:/lustre at /mnt/lustre failed: No such file or
> > directory
> > Is the MGS specification correct?
> > Is the filesystem name correct?
> > If upgrading, is the copied client log valid? (see upgrade docs)
> >
> > (Trying to run the last mount command manually also gives the same
> > error.)
> >
> > After the llmount script exits, I can check the output of "lctl list_nids"
> > on the 3 server VMs.
> >
> > OSS: 192.168.50.5@tcp
> > MDS: 192.168.50.9@tcp
> > MGS: 192.168.50.11@tcp
> >
> > Here's the dmesg output from the client:
> >
> > [311.259776] Lustre: Lustre: Build Version: 2.11.51_20_g9ac477c
> > [312.792145] Lustre: 1836:0:(gss_svc_upcall.c:1185:gss_init_svc_upcall())
> > Init channel is not opened by lsvcgssd, following request might be dropped
> > until lsvcgssd is active
> > [312.792162] Lustre: 1836:0:(gss_mech_switch.c:71:lgss_mech_register())
> > Register gssnull mechanism
> > [312.792174] Key type lgssc registered
> > [312.868636] Lustre: Echo OBD driver; http://www.lustre.org/
> > [325.835994] LustreError: 3302:0:(ldlm_lib.c:488:client_obd_setup()) can't
> > add initial connection
> > [325.836737] LustreError: 3302:0:(obd_config.c:559:class_setup()) setup
> > MGC192.168.50.11@tcp failed (-2)
> > [325.837248] LustreError: 3302:0:(obd_mount.c:202:lustre_start_simple())
> > MGC192.168.50.11@tcp setup error -2
> > [325.837765] LustreError: 3302:0:(obd_mount.c:1583:lustre_fill_super())
> > Unable to mount (-2)
> >
> > I'm not sure if I'm missing something in my config. Any help is appreciated.
> >
> > Thanks,
> > Rohan
> > _______________________________________________
> > lustre-discuss mailing list
> > [email protected]
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
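
For anyone who lands on this thread later, the manual route (format and mount each target by hand instead of going through llmount.sh) can be sketched roughly as below. This is a dry-run script that only prints the commands per node; it is not a record of the exact commands that were used. The MGS NID, /dev/sdb, the "lustre" filesystem name and /mnt/lustre are taken from the original post. The server-side mount point /mnt/lustre-target and the index values are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the per-node commands, executes nothing.
MGS_NID="192.168.50.11@tcp"    # MGS NID from "lctl list_nids" in the original post
FSNAME="lustre"                # filesystem name from the client mount target
DEV="/dev/sdb"                 # block device used on every server VM in the post
CLIENT_MNT="/mnt/lustre"       # client mount point from the llmount.sh output
SRV_MNT="/mnt/lustre-target"   # ASSUMPTION: hypothetical server-side mount point

plan() {
    cat <<EOF
# ct-mgs1: format and start the MGS first
mkfs.lustre --mgs $DEV
mkdir -p $SRV_MNT
mount -t lustre $DEV $SRV_MNT

# ct-mds1: check LNet reachability, then format and start the MDT
lctl ping $MGS_NID
mkfs.lustre --fsname=$FSNAME --mgsnode=$MGS_NID --mdt --index=0 $DEV
mkdir -p $SRV_MNT
mount -t lustre $DEV $SRV_MNT

# ct-oss1: same for the OST
lctl ping $MGS_NID
mkfs.lustre --fsname=$FSNAME --mgsnode=$MGS_NID --ost --index=0 $DEV
mkdir -p $SRV_MNT
mount -t lustre $DEV $SRV_MNT

# ct-client1: mount by NID, then list the configured devices
lctl ping $MGS_NID
mkdir -p $CLIENT_MNT
mount -t lustre $MGS_NID:/$FSNAME $CLIENT_MNT
lctl dl
EOF
}

plan
```

Running the script as-is is harmless, since it only echoes the plan. Note that mkfs.lustre destroys whatever is on the device, so the printed commands are worth reviewing before running any of them on a real VM. Using the NID (192.168.50.11@tcp) in the client mount, rather than the hostname, takes name resolution out of the picture while debugging.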
