Thanks for replying, Arman.
/var/log/messages still complains about the error, as below:

Aug 29 15:01:59 MGS-1 kernel: LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.

However, adding a mapping in /etc/hosts now allows others to connect to the MGS. It looks like a workaround, but things are working as of now.

It still fails if you try to configure the MDT with an IP.

Thanks again.

Warm Regards,
Abhay Dandekar


On Mon, Aug 25, 2014 at 5:00 PM, Arman Khalatyan <[email protected]> wrote:

> Hi Abhay,
> Could you please check the lnet status?
> lctl list_nids, or pings..
> Is you firewall enabled?
> BTW, i move all my servers to 2.5.x branch, that was fixing most of my
> troubles...
> a.
>
>
> On Tue, Aug 19, 2014 at 12:38 PM, Abhay Dandekar
> <[email protected]> wrote:
> > I came across a similar situation.
> >
> > Below is the log of machine state. These steps worked on some setups while
> > on some it didnt.
> >
> > Armaan,
> >
> > Were you able to get over the problem ? Any workaround ?
> >
> > Thanks in advance for all your help.
> >
> >
> > Warm Regards,
> > Abhay Dandekar
> >
> >
> > ---------- Forwarded message ----------
> > From: Abhay Dandekar <[email protected]>
> > Date: Wed, Aug 6, 2014 at 12:18 AM
> > Subject: Lustre configuration failure : lwp-MDT0000: Communicating with
> > 0@lo, operation mds_connect failed with -11.
> > To: [email protected]
> >
> >
> >
> > Hi All,
> >
> > I have come across an lustre installation failure where the MGS is always
> > trying to reach "lo" config instead of configured ethernet.
> >
> > These same steps worked on a different machine, somehow they are failing
> > here.
> >
> > Here are the logs
> >
> > Lustre installation is success with all the packages installed without any
> > error.
> >
> > 0. Lustre version
> >
> > Aug 5 23:07:37 lfs-server kernel: LNet: HW CPU cores: 1, npartitions: 1
> > Aug 5 23:07:37 lfs-server modprobe: FATAL: Error inserting crc32c_intel (/lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/kernel/arch/x86/crypto/crc32c-intel.ko): No such device
> > Aug 5 23:07:37 lfs-server kernel: alg: No test for crc32 (crc32-table)
> > Aug 5 23:07:37 lfs-server kernel: alg: No test for adler32 (adler32-zlib)
> > Aug 5 23:07:41 lfs-server modprobe: FATAL: Error inserting padlock_sha (/lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/kernel/drivers/crypto/padlock-sha.ko): No such device
> > Aug 5 23:07:41 lfs-server kernel: padlock: VIA PadLock Hash Engine not detected.
> > Aug 5 23:07:45 lfs-server kernel: Lustre: Lustre: Build Version: 2.5.2-RC2--PRISTINE-2.6.32-431.17.1.el6_lustre.x86_64
> > Aug 5 23:07:45 lfs-server kernel: LNet: Added LNI 192.168.122.50@tcp [8/256/0/180]
> > Aug 5 23:07:45 lfs-server kernel: LNet: Accept secure, port 988
> >
> >
> > 1. Mkfs
> >
> > [root@lfs-server ~]# mkfs.lustre --fsname=lustre --mgs --mdt --index=0 /dev/sdb
> >
> > Permanent disk data:
> > Target: lustre:MDT0000
> > Index: 0
> > Lustre FS: lustre
> > Mount type: ldiskfs
> > Flags: 0x65
> > (MDT MGS first_time update )
> > Persistent mount opts: user_xattr,errors=remount-ro
> > Parameters:
> >
> > checking for existing Lustre data: not found
> > device size = 10240MB
> > formatting backing filesystem ldiskfs on /dev/sdb
> > target name lustre:MDT0000
> > 4k blocks 2621440
> > options -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F
> > mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0000 -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F /dev/sdb 2621440
> > Aug 5 17:16:47 lfs-server kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=on. Opts:
> > Writing CONFIGS/mountdata
> > [root@lfs-server ~]#
> >
> > 2. Mount
> >
> > [root@lfs-server ~]# mount -t lustre /dev/sdb /mnt/mgs
> > Aug 5 17:18:01 lfs-server kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=on. Opts:
> > Aug 5 17:18:01 lfs-server kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=on. Opts:
> > Aug 5 17:18:02 lfs-server kernel: Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space
> > Aug 5 17:18:02 lfs-server kernel: Lustre: lustre-MDT0000: new disk, initializing
> > Aug 5 17:18:02 lfs-server kernel: Lustre: MGS: non-config logname received: params
> > Aug 5 17:18:02 lfs-server kernel: LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
> > [root@lfs-server ~]#
> >
> >
> > 3. Unmount
> > [root@lfs-server ~]# umount /dev/sdb
> > Aug 5 17:19:46 lfs-server kernel: Lustre: Failing over lustre-MDT0000
> > Aug 5 17:19:52 lfs-server kernel: Lustre: 1338:0:(client.c:1908:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1407239386/real 1407239386] req@ffff88003d795c00 x1475596948340888/t0(0) o251->MGC192.168.122.50@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1407239392 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
> > [root@lfs-server ~]# Aug 5 17:19:53 lfs-server kernel: Lustre: server umount lustre-MDT0000 complete
> >
> > [root@lfs-server ~]#
> >
> >
> > 4. [root@mgs ~]# cat /etc/modprobe.d/lustre.conf
> > options lnet networks=tcp(eth0)
> > [root@mgs ~]#
> >
> > 5.Even the lnet configuration is in place, it does not pick up the required
> > eth0.
> >
> > [root@mgs ~]# lctl dl
> > 0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 8
> > 1 UP mgs MGS MGS 5
> > 2 UP mgc MGC192.168.122.50@tcp c6ea84c0-b3b2-9d25-8126-32d85956ae4d 5
> > 3 UP mds MDS MDS_uuid 3
> > 4 UP lod lustre-MDT0000-mdtlov lustre-MDT0000-mdtlov_UUID 4
> > 5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 5
> > 6 UP mdd lustre-MDD0000 lustre-MDD0000_UUID 4
> > 7 UP qmt lustre-QMT0000 lustre-QMT0000_UUID 4
> > 8 UP lwp lustre-MDT0000-lwp-MDT0000 lustre-MDT0000-lwp-MDT0000_UUID 5
> > [root@mgs ~]#
> >
> > Any pointers to go ahead ??
> >
> >
> > Warm Regards,
> > Abhay Dandekar
> >
>
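
For anyone checking the /etc/hosts workaround described at the top of this message, here is a minimal sketch of how the mapping could be inspected. It assumes the server name is MGS-1 and that it is expected to map to 192.168.122.50, the address in the NID from the logs; the thread does not show the actual entry that was added. hosts_lookup and is_loopback are hypothetical helper names, not Lustre tools.

```shell
#!/bin/sh
# hosts_lookup NAME [FILE]: print the first address that a hosts-format
# file maps NAME to (defaults to /etc/hosts). Prints nothing if NAME is
# not listed. Hypothetical helper, not part of Lustre.
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                 # strip comments from the line
        { for (i = 2; i <= NF; i++)        # fields 2..NF are host names
              if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# is_loopback ADDR: succeed if ADDR looks like an IPv4 or IPv6 loopback
# address. An empty ADDR is treated as "not loopback".
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}
```

For example, `is_loopback "$(hosts_lookup MGS-1)"` succeeds only when the name maps to a loopback address, which may be one thing worth ruling out alongside the `lctl list_nids` check suggested in the thread.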
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
