Two notes:
- The local.sh config and llmount.sh are only for basic testing and development. That isn't how you would use Lustre for a production deployment.
- After "ping", did you try "lctl ping" between the various VMs? Do you have firewall rules that block connections?
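On the second point: LNet connections use TCP port 988 by default, so firewalld on a stock CentOS 7 install may well be blocking them even though plain ICMP ping works. A quick check from the client (the MGS NID is taken from the list_nids output quoted below; the firewalld commands are a sketch, adjust to your setup):

    # from ct-client1: ping the MGS at the LNet level, not just ICMP
    lctl ping 192.168.50.11@tcp

    # if that fails, open the LNet port on each server VM
    firewall-cmd --permanent --add-port=988/tcp
    firewall-cmd --reload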
Cheers, Andreas

> On May 5, 2018, at 21:14, Rohan Garg <[email protected]> wrote:
>
> Hi,
>
> I'm trying to set up a virtual cluster (with 4 VirtualBox VMs: 1
> MGS, 1 MDS, 1 Client, and 1 OSS) using Lustre. The VMs are running
> CentOS-7. I have built Lustre from the master branch.
>
> The VMs have a NAT interface (eth0) and a host-only network
> interface (eth1).
>
> Client: eth0: 10.0.2.15, eth1: 192.168.50.7, hostname: ct-client1
> OSS: eth0: 10.0.2.15, eth1: 192.168.50.5, hostname: ct-oss1
> MDS: eth0: 10.0.2.15, eth1: 192.168.50.9, hostname: ct-mds1
> MGS: eth0: 10.0.2.15, eth1: 192.168.50.11, hostname: ct-mgs1
>
> - All the VMs have SELinux disabled.
> - All the VMs can ping each other and can use password-less ssh among
>   themselves.
> - All 4 VMs have the following line in /etc/modprobe.d/lnet.conf:
>
>     options lnet networks="tcp(eth1)"
>
> I modified the cfg/local.sh file and added the following entries to make
> it use the correct hostnames:
>
>     MDSCOUNT=1
>     mds_HOST=ct-mds1
>     MDSDEV1=/dev/sdb
>
>     mgs_HOST=ct-mgs1
>     MGSDEV=/dev/sdb
>
>     OSTCOUNT=1
>     ost_HOST=ct-oss1
>     OSTDEV1=/dev/sdb
>
> The issue is that I can't get the llmount.sh script to mount the
> filesystem on the client and run successfully. The script exits with the
> following messages:
>
>     ...
>     Started lustre-OST0000
>     Starting client: ct-client1.lfs.local: -o user_xattr,flock ct-mgs1:/lustre /mnt/lustre
>     CMD: ct-client1.lfs.local mkdir -p /mnt/lustre
>     CMD: ct-client1.lfs.local mount -t lustre -o user_xattr,flock ct-mgs1:/lustre /mnt/lustre
>     mount.lustre: mount ct-mgs1:/lustre at /mnt/lustre failed: No such file or directory
>     Is the MGS specification correct?
>     Is the filesystem name correct?
>     If upgrading, is the copied client log valid? (see upgrade docs)
>
> (Trying to run the last mount command manually also gives the same
> error.)
>
> After the llmount.sh script exits, I can check the output of "lctl list_nids"
> on the 3 server VMs:
>
>     OSS: 192.168.50.5@tcp
>     MDS: 192.168.50.9@tcp
>     MGS: 192.168.50.11@tcp
>
> Here's the dmesg output from the client:
>
>     [311.259776] Lustre: Lustre: Build Version: 2.11.51_20_g9ac477c
>     [312.792145] Lustre: 1836:0:(gss_svc_upcall.c:1185:gss_init_svc_upcall()) Init channel is not opened by lsvcgssd, following request might be dropped until lsvcgssd is active
>     [312.792162] Lustre: 1836:0:(gss_mech_switch.c:71:lgss_mech_register()) Register gssnull mechanism
>     [312.792174] Key type lgssc registered
>     [312.868636] Lustre: Echo OBD driver; http://www.lustre.org/
>     [325.835994] LustreError: 3302:0:(ldlm_lib.c:488:client_obd_setup()) can't add initial connection
>     [325.836737] LustreError: 3302:0:(obd_config.c:559:class_setup()) setup MGC192.168.50.11@tcp failed (-2)
>     [325.837248] LustreError: 3302:0:(obd_mount.c:202:lustre_start_simple()) MGC192.168.50.11@tcp setup error -2
>     [325.837765] LustreError: 3302:0:(obd_mount.c:1583:lustre_fill_super()) Unable to mount (-2)
>
> I'm not sure if I'm missing something in my config. Any help is appreciated.
>
> Thanks,
> Rohan
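One more check worth doing: the list_nids output above covers only the three server VMs, not the client. The client errors end in -2 (ENOENT) while setting up the MGC device, which would be consistent with the client-side LNet not having a usable NID on eth1. A minimal sanity check on the client, assuming the addresses from the mail above:

    # on ct-client1, after the lustre modules are loaded:
    lctl list_nids                  # expected: 192.168.50.7@tcp
    lctl ping 192.168.50.11@tcp     # should succeed if LNet connectivity to the MGS works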
