What do your netmasks look like on each network?

From: lustre-discuss <[email protected]> on behalf of sohamm <[email protected]>
Date: Monday, July 18, 2016 at 1:56 AM
To: "[email protected]" <[email protected]>
Subject: Re: [lustre-discuss] lustre-discuss Digest, Vol 124, Issue 17
Hi Thomas,

Below are the results of the commands you suggested.

From Client

[root@dev1 ~]# lctl ping 192.168.200.52@o2ib
failed to ping 192.168.200.52@o2ib: Input/output error
[root@dev1 ~]# lctl ping 192.168.111.52@tcp
12345-0@lo
12345-192.168.200.52@o2ib
12345-192.168.111.52@tcp
[root@dev1 ~]# mount -t lustre 192.168.111.52@tcp:/mylustre /lustre
mount.lustre: mount 192.168.111.52@tcp:/mylustre at /lustre failed: Input/output error
Is the MGS running?
mount: mounting 192.168.111.52@tcp:/mylustre on /lustre failed: Invalid argument

cat /var/log/messages | tail
Jul 18 01:37:04 dev1 user.warn kernel: [2250504.401397] ib1: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
Jul 18 01:37:26 dev1 user.warn kernel: [2250526.257309] LNet: No route to 12345-192.168.200.52@o2ib via <?> (all routers down)
Jul 18 01:37:36 dev1 user.warn kernel: [2250536.481862] ib1: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
Jul 18 01:41:53 dev1 user.warn kernel: [2250792.947299] LNet: No route to 12345-192.168.200.52@o2ib via <?> (all routers down)

From MGS

[root@lustre_mgs01_vm03 ~]# lctl ping 192.168.111.102@tcp
12345-0@lo
12345-192.168.111.102@tcp

Please let me know what else I can try. It looks like I am missing something in the IB config. Do I need a router set up as part of LNet? If I am able to ping the MGS from the client on the TCP network, shouldn't the mount still work?
Thanks

On Sun, Jul 17, 2016 at 1:07 PM, <[email protected]> wrote:

Send lustre-discuss mailing list submissions to
[email protected]

To subscribe or unsubscribe via the World Wide Web, visit
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
or, via email, send a message with subject or body 'help' to
[email protected]

You can reach the person managing the list at
[email protected]

When replying, please edit your Subject line so it is more specific than "Re: Contents of lustre-discuss digest..."

Today's Topics:

1. llapi_file_get_stripe() and /proc/fs/lustre/osc/ entries (John Bauer)
2. luster client mount issues (sohamm)
3. Re: luster client mount issues (Thomas Roth)

----------------------------------------------------------------------

Message: 1
Date: Sat, 16 Jul 2016 15:11:22 -0500
From: John Bauer <[email protected]>
To: "[email protected]" <[email protected]>
Subject: [lustre-discuss] llapi_file_get_stripe() and /proc/fs/lustre/osc/ entries
Message-ID: <[email protected]>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

I am using *llapi_file_get_stripe()* to get the OST indexes that a file is striped on. That part is working fine. But there are multiple Lustre file systems on the node, resulting in multiple *OST0000* entries in the directory /proc/fs/lustre/osc. Is there something in the *struct lov_user_ost_data* or *struct lov_user_md* that would indicate which of the following directories pertains to the file's OST?
dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp1-OST0000-osc-ffff880287ae4c00
dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp2-OST0000-osc-ffff881034d99000
dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp6-OST0000-osc-ffff881003cd7800
dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp7-OST0000-osc-ffff880ffe051c00
dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp8-OST0000-osc-ffff880ffe054c00
dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp9-OST0000-osc-ffff880fcf179400

Thanks

--
I/O Doctors, LLC
507-766-0378
[email protected]

------------------------------

Message: 2
Date: Sat, 16 Jul 2016 14:34:35 -0700
From: sohamm <[email protected]>
To: [email protected]
Subject: [lustre-discuss] luster client mount issues
Message-ID: <cakgc+ebq+mcdbsrc7ft4gd+zmz6fbazhavhsqtpgoshyrjq...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi

I am trying to mount a Lustre client. Below are the steps and the relevant information surrounding the issue. Please let me know if I am missing something.

Thanks
Div

*Mgs:*

[root@lustre_mgs01_vm03 ~]# cat /etc/modprobe.d/lustre.conf
options lnet networks=o2ib(ib0),tcp0(eth0)
[root@lustre_mgs01_vm03 ~]# modprobe lnet
[root@lustre_mgs01_vm03 ~]# lsmod | grep lnet
lnet 449065 0
libcfs 405839 1 lnet
[root@lustre_mgs01_vm03 ~]# lctl network up
LNET configured
[root@lustre_mgs01_vm03 ~]# lctl list_nids
192.168.200.52@o2ib
192.168.111.52@tcp

*On Client:*

I am able to ping the MGS on both the tcp and ib networks.

[root@dev1~]# ping 192.168.111.52
PING 192.168.111.52 (192.168.111.52) 56(84) bytes of data.
64 bytes from 192.168.111.52: icmp_req=1 ttl=64 time=5.81 ms
64 bytes from 192.168.111.52: icmp_req=2 ttl=64 time=0.802 ms
64 bytes from 192.168.111.52: icmp_req=3 ttl=64 time=0.780 ms
^C
--- 192.168.111.52 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.780/2.464/5.811/2.366 ms
[root@dev1 ~]# ping 192.168.200.52
PING 192.168.200.52 (192.168.200.52) 56(84) bytes of data.
64 bytes from 192.168.200.52: icmp_req=1 ttl=64 time=24.4 ms
64 bytes from 192.168.200.52: icmp_req=2 ttl=64 time=2.14 ms
64 bytes from 192.168.200.52: icmp_req=3 ttl=64 time=0.782 ms
64 bytes from 192.168.200.52: icmp_req=4 ttl=64 time=9.30 ms
^C
--- 192.168.200.52 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms

*client mount commands*

mount -t lustre 192.168.111.52@tcp:/mylustre /lustre (or)
mount -t lustre 192.168.111.52@tcp0:/mylustre /lustre (or)
mount -t lustre 192.168.200.52@ob2:/mylustre /lustre

*cat /var/log/messages | tail -40*

Jul 16 17:03:17 dev1 user.err kernel: [2133277.466013] LustreError: 162-5: Missing mount data: check that /sbin/mount.lustre is installed.
Jul 16 17:03:17 dev1 user.err kernel: [2133277.466064] LustreError: 13627:0:(obd_mount.c:1325:lustre_fill_super()) Unable to mount (-22)
Jul 16 17:03:23 dev1 user.warn kernel: [2133282.680519] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1468702998/real 1468702998] req@ffff8801e0bc3c00 x1539427524411444/t0(0) o250->MGC192.168.111.52
Jul 16 17:03:24 dev1 user.err kernel: [2133283.680193] LustreError: 13628:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801e0bc7000 x1539427524411448/t0(0) o101->MGC192.168.111.52@tcp @192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
Jul 16 17:03:31 dev1 user.err kernel: [2133290.760978] LustreError: 13657:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801b7159800 x1539427524411456/t0(0) o101->MGC192.168.111.52@tcp @192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
Jul 16 17:03:43 dev1 user.warn kernel: [2133302.681412] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1468703023/real 1468703023] req@ffff8801d6bfc800 x1539427524411460/t0(0) o250->MGC192.168.111
Jul 16 17:04:08 dev1 user.warn kernel: [2133327.681402] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1468703048/real 1468703048] req@ffff8801d6bfec00 x1539427524411464/t0(0) o250->MGC192.168.111
Jul 16 17:04:15 dev1 user.err kernel: [2133334.680175] LustreError: 13628:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801e0bc7000 x1539427524411452/t0(0) o101->MGC192.168.111.52@tcp @192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
Jul 16 17:04:15 dev1 user.err kernel: [2133334.680316] LustreError: 15c-8: MGC192.168.111.52@tcp: The configuration from log 'mylustre-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other e
Jul 16 17:04:15 dev1 user.err kernel: [2133334.680357] LustreError: 13628:0:(llite_lib.c:1046:ll_fill_super()) Unable to process log: -5
Jul 16 17:04:15 dev1 user.warn kernel: [2133334.680881] Lustre: Unmounted mylustre-client
Jul 16 17:04:15 dev1 user.err kernel: [2133334.731730] LustreError: 13628:0:(obd_mount.c:1325:lustre_fill_super()) Unable to mount (-5)

------------------------------

Message: 3
Date: Sun, 17 Jul 2016 10:19:18 +0200
From: Thomas Roth <[email protected]>
To: <[email protected]>
Subject: Re: [lustre-discuss] luster client mount issues
Message-ID: <[email protected]>
Content-Type: text/plain; charset="windows-1252"; format=flowed

Hi,

try 'lctl ping' from your clients to the MDS to check if you get through on lnet, e.g.

lctl ping 192.168.200.52@o2ib

or

lctl ping 192.168.111.52@tcp

and vice versa from the MDS to the clients' nids.

Regards,
Thomas

On 07/16/2016 11:34 PM, sohamm wrote:
> Hi
>
> I am trying to mount a Lustre client. Below are the steps and the relevant
> information surrounding the issue. Please let me know if I am missing
> something.
> [...]

--
--------------------------------------------------------------------
Thomas Roth
Department: HPC
Location: SB3 1.262
Phone: +49-6159-71 1453
Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gsi.de

Gesellschaft mit beschränkter Haftung
Sitz der Gesellschaft: Darmstadt
Handelsregister: Amtsgericht Darmstadt, HRB 1528

Geschäftsführung: Professor Dr. Karlheinz Langanke
Ursula Weyrich
Jörg Blaurock

Vorsitzender des Aufsichtsrates: St Dr. Georg Schütte
Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt

------------------------------

Subject: Digest Footer

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

------------------------------

End of lustre-discuss Digest, Vol 124, Issue 17
***********************************************
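
The NIDs that appear in this thread (192.168.111.52@tcp, 192.168.111.52@tcp0, 192.168.200.52@o2ib and 192.168.200.52@ob2) can be compared mechanically before they are handed to mount or lctl ping. What follows is only a minimal Python sketch and not part of Lustre: the names parse_nid, check_nid and KNOWN_TYPES are hypothetical, and KNOWN_TYPES lists only the network types seen in the lctl output above, not everything LNet supports. It assumes that a missing network number means 0, which is consistent with the MGS being configured with tcp0(eth0) and reporting 192.168.111.52@tcp.

```python
# Assumption: only the LNet network types that show up in this thread.
KNOWN_TYPES = {"tcp", "o2ib", "lo"}


def parse_nid(nid):
    """Split 'addr@net' into (addr, net_type, net_number)."""
    addr, sep, net = nid.partition("@")
    if not sep or not addr or not net:
        raise ValueError(f"not a NID: {nid!r}")
    # Trailing digits are the network number; the rest is the type.
    net_type = net.rstrip("0123456789")
    number = net[len(net_type):]
    # Assumption: an omitted number is treated as 0 (tcp == tcp0).
    return addr, net_type, int(number) if number else 0


def check_nid(nid):
    """Return a one-line verdict for a NID string."""
    addr, net_type, number = parse_nid(nid)
    if net_type not in KNOWN_TYPES:
        return f"{nid}: unrecognized network type {net_type!r}"
    return f"{nid}: ok ({net_type}{number})"


if __name__ == "__main__":
    for nid in ("192.168.111.52@tcp", "192.168.111.52@tcp0",
                "192.168.200.52@o2ib", "192.168.200.52@ob2"):
        print(check_nid(nid))
```

With the four NIDs above, the two tcp forms parse to the same network and only the @ob2 form is reported as unrecognized. Whether that is a typo in the mail or in the command that was actually run cannot be told from the thread.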
