Hi Andreas,

Sorry to bother you, but modifying /etc/hosts still does not solve the problem.
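In case I misunderstood the change you meant, this is the /etc/hosts layout I believe you were describing (a sketch only — the IP and hostname are from my setup, and the localhost lines are the distro defaults): the loopback addresses map only to localhost, and the node's hostname maps only to its real NIC address.

```
# /etc/hosts (sketch): keep loopback and the node name strictly separate
127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.122.50  mgs-new-test
```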
Just to give some more info: I am trying to set up a virtual cluster of Lustre nodes. Here is my /etc/hosts:

[root@mgs-new-test ~]# cat /etc/hosts
192.168.122.50 mgs-new-test
192.168.122.50 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
::0 mgs-new-test
[root@mgs-new-test ~]#

And here is the latest /var/log/messages:

Aug 20 11:32:32 mgs-new-test kernel: EXT4-fs (vda1): mounted filesystem with ordered data mode. Opts:
Aug 20 11:32:32 mgs-new-test kernel: Adding 417784k swap on /dev/mapper/vg_mgsnewtest-lv_swap. Priority:-1 extents:1 across:417784k
Aug 20 11:32:32 mgs-new-test kernel: NET: Registered protocol family 10
Aug 20 11:32:32 mgs-new-test kernel: lo: Disabled Privacy Extensions
Aug 20 11:33:25 mgs-new-test kernel: LNet: HW CPU cores: 1, npartitions: 1
Aug 20 11:33:25 mgs-new-test kernel: alg: No test for adler32 (adler32-zlib)
Aug 20 11:33:25 mgs-new-test kernel: alg: No test for crc32 (crc32-table)
Aug 20 11:33:29 mgs-new-test modprobe: FATAL: Error inserting padlock_sha (/lib/modules/2.6.32-431.20.3.el6_lustre.x86_64/kernel/drivers/crypto/padlock-sha.ko): No such device
Aug 20 11:33:29 mgs-new-test kernel: padlock: VIA PadLock Hash Engine not detected.
Aug 20 11:33:33 mgs-new-test kernel: Lustre: Lustre: Build Version: 2.6.0-RC2--PRISTINE-2.6.32-431.20.3.el6_lustre.x86_64
Aug 20 11:33:33 mgs-new-test kernel: LNet: Added LNI 192.168.122.50@tcp [8/256/0/180]
Aug 20 11:33:33 mgs-new-test kernel: LNet: Accept secure, port 988
Aug 20 11:34:41 mgs-new-test kernel: LDISKFS-fs (vdb): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 20 11:34:53 mgs-new-test kernel: LDISKFS-fs (vdb): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 20 11:34:54 mgs-new-test kernel: LDISKFS-fs (vdb): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 20 11:34:54 mgs-new-test kernel: Lustre: ctl-mylustre-MDT0000: No data found on store. Initialize space
Aug 20 11:34:54 mgs-new-test kernel: Lustre: mylustre-MDT0000: new disk, initializing
Aug 20 11:34:54 mgs-new-test kernel: LustreError: 11-0: *mylustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.*

Any pointers on where else I need to make changes?

Thanks in advance.

Warm Regards,
Abhay Dandekar

> On Wed, Aug 20, 2014 at 3:23 AM, Andreas Dilger <adil...@dilger.ca> wrote:
>
>> Often this problem is because the hostname in /etc/hosts is actually
>> mapped to localhost on the node itself.
>>
>> Unfortunately, this is how some systems are set up by default.
>>
>> Cheers, Andreas
>>
>> On Aug 19, 2014, at 12:39, "Abhay Dandekar" <dandekar.ab...@gmail.com> wrote:
>>
>> I came across a similar situation.
>>
>> Below is a log of the machine state. These steps worked on some setups,
>> while on others they didn't.
>>
>> Armaan,
>>
>> Were you able to get past the problem? Any workaround?
>>
>> Thanks in advance for all your help.
>>
>> Warm Regards,
>> Abhay Dandekar
>>
>> ---------- Forwarded message ----------
>> From: Abhay Dandekar <dandekar.ab...@gmail.com>
>> Date: Wed, Aug 6, 2014 at 12:18 AM
>> Subject: Lustre configuration failure : lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
>> To: lustre-discuss@lists.lustre.org
>>
>> Hi All,
>>
>> I have come across a Lustre installation failure where the MGS is always
>> trying to reach the "lo" interface instead of the configured ethernet
>> interface.
>>
>> These same steps worked on a different machine; somehow they are failing
>> here.
>>
>> Here are the logs. The Lustre installation succeeded, with all packages
>> installed without any error.
>>
>> 0. Lustre version
>>
>> Aug 5 23:07:37 lfs-server kernel: LNet: HW CPU cores: 1, npartitions: 1
>> Aug 5 23:07:37 lfs-server modprobe: FATAL: Error inserting crc32c_intel (/lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/kernel/arch/x86/crypto/crc32c-intel.ko): No such device
>> Aug 5 23:07:37 lfs-server kernel: alg: No test for crc32 (crc32-table)
>> Aug 5 23:07:37 lfs-server kernel: alg: No test for adler32 (adler32-zlib)
>> Aug 5 23:07:41 lfs-server modprobe: FATAL: Error inserting padlock_sha (/lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/kernel/drivers/crypto/padlock-sha.ko): No such device
>> Aug 5 23:07:41 lfs-server kernel: padlock: VIA PadLock Hash Engine not detected.
>> Aug 5 23:07:45 lfs-server kernel: Lustre: Lustre: Build Version: 2.5.2-RC2--PRISTINE-2.6.32-431.17.1.el6_lustre.x86_64
>> Aug 5 23:07:45 lfs-server kernel: LNet: Added LNI 192.168.122.50@tcp [8/256/0/180]
>> Aug 5 23:07:45 lfs-server kernel: LNet: Accept secure, port 988
>>
>> 1. Mkfs
>>
>> [root@lfs-server ~]# mkfs.lustre --fsname=lustre --mgs --mdt --index=0 /dev/sdb
>>
>> Permanent disk data:
>> Target:     lustre:MDT0000
>> Index:      0
>> Lustre FS:  lustre
>> Mount type: ldiskfs
>> Flags:      0x65
>>             (MDT MGS first_time update )
>> Persistent mount opts: user_xattr,errors=remount-ro
>> Parameters:
>>
>> checking for existing Lustre data: not found
>> device size = 10240MB
>> formatting backing filesystem ldiskfs on /dev/sdb
>> target name  lustre:MDT0000
>> 4k blocks    2621440
>> options      -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F
>> mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0000 -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F /dev/sdb 2621440
>> Aug 5 17:16:47 lfs-server kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=on. Opts:
>> Writing CONFIGS/mountdata
>> [root@lfs-server ~]#
>>
>> 2. Mount
>>
>> [root@lfs-server ~]# mount -t lustre /dev/sdb /mnt/mgs
>> Aug 5 17:18:01 lfs-server kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=on. Opts:
>> Aug 5 17:18:01 lfs-server kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=on. Opts:
>> Aug 5 17:18:02 lfs-server kernel: Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space
>> Aug 5 17:18:02 lfs-server kernel: Lustre: lustre-MDT0000: new disk, initializing
>> Aug 5 17:18:02 lfs-server kernel: Lustre: MGS: non-config logname received: params
>> Aug 5 17:18:02 lfs-server kernel: LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
>> [root@lfs-server ~]#
>>
>> 3. Unmount
>>
>> [root@lfs-server ~]# umount /dev/sdb
>> Aug 5 17:19:46 lfs-server kernel: Lustre: Failing over lustre-MDT0000
>> Aug 5 17:19:52 lfs-server kernel: Lustre: 1338:0:(client.c:1908:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1407239386/real 1407239386] req@ffff88003d795c00 x1475596948340888/t0(0) o251->MGC192.168.122.50@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1407239392 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> [root@lfs-server ~]# Aug 5 17:19:53 lfs-server kernel: Lustre: server umount lustre-MDT0000 complete
>> [root@lfs-server ~]#
>>
>> 4.
>>
>> [root@mgs ~]# cat /etc/modprobe.d/lustre.conf
>> options lnet networks=tcp(eth0)
>> [root@mgs ~]#
>>
>> 5. Even though the LNet configuration is in place, it does not pick up the required eth0.
>>
>> [root@mgs ~]# lctl dl
>>   0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 8
>>   1 UP mgs MGS MGS 5
>>   2 UP mgc MGC192.168.122.50@tcp c6ea84c0-b3b2-9d25-8126-32d85956ae4d 5
>>   3 UP mds MDS MDS_uuid 3
>>   4 UP lod lustre-MDT0000-mdtlov lustre-MDT0000-mdtlov_UUID 4
>>   5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 5
>>   6 UP mdd lustre-MDD0000 lustre-MDD0000_UUID 4
>>   7 UP qmt lustre-QMT0000 lustre-QMT0000_UUID 4
>>   8 UP lwp lustre-MDT0000-lwp-MDT0000 lustre-MDT0000-lwp-MDT0000_UUID 5
>> [root@mgs ~]#
>>
>> Any pointers on how to proceed?
>>
>> Warm Regards,
>> Abhay Dandekar
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
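One more data point: the pattern Andreas described can be spotted mechanically. This is only a sketch, not part of any Lustre tooling — the hostname and the sample file below are copied from my logs earlier in this thread. It flags any /etc/hosts line that maps the node's own hostname to a loopback address, or a real address to "localhost".

```shell
#!/bin/sh
# Sketch: flag suspicious /etc/hosts lines. Either direction of the mix-up
# (hostname -> loopback, or real IP -> localhost) can leave the MGC talking
# to 0@lo instead of the real NID.

check_hosts() {
    # $1 = hosts file to inspect, $2 = this node's hostname
    awk -v host="$2" '
        /^[[:space:]]*(#|$)/ { next }    # skip comments and blank lines
        {
            loop = ($1 ~ /^127\./ || $1 == "::1" || $1 == "::0" || $1 == "::")
            for (i = 2; i <= NF; i++) {
                if (loop && $i == host)         { print "BAD: " $0; next }
                if (!loop && $i == "localhost") { print "BAD: " $0; next }
            }
        }' "$1"
}

# Reproduce the /etc/hosts from this thread and check it:
cat > /tmp/hosts.sample <<'EOF'
192.168.122.50 mgs-new-test
192.168.122.50 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
::0 mgs-new-test
EOF
check_hosts /tmp/hosts.sample mgs-new-test
```

On the file above this prints two BAD lines: the one mapping 192.168.122.50 to localhost, and the `::0 mgs-new-test` entry.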