Hi there,

OS: CentOS 7.4
Lustre version: Intel® Manager for Lustre* software 4.0.3.0
Interconnect: Mellanox OFED, ConnectX-5
On one of my Lustre clients, df returns Input/output errors for the Lustre mount points. The mount points do not appear in the df listing, although /etc/mtab shows that Lustre is mounted.

df -h output:

df: ‘/home’: Input/output error
df: ‘/vol1’: Input/output error
df: ‘/cm/shared’: Input/output error
Filesystem Size Used Avail Use% Mounted on

cat /etc/mtab | grep lustre

10.51.22.11@o2ib:10.51.22.10@o2ib:/lustre/home /home lustre rw,flock,lazystatfs 0 0
10.51.22.11@o2ib:10.51.22.10@o2ib:/lustre /vol1 lustre rw,flock,lazystatfs 0 0
10.51.22.11@o2ib:10.51.22.10@o2ib:/lustre/cmshared /cm/shared lustre rw,flock,lazystatfs 0 0

When I cd into a mount point I can reach the Lustre filesystem, and I can create and delete files and folders. But when I cd to the location of a large file and run ls -lah, the Lustre client hangs.

dmesg output:

[84276.460557] Lustre: 5617:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1536408434/real 1536408489] req@ffff882f31697800 x1610952588839712/t0(0) o8->[email protected]@o2ib:28/4 lens 520/544 e 0 to 1 dl 1536408714 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[84276.460565] Lustre: 5617:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 910 previous similar messages
[84386.986467] LustreError: 122750:0:(llite_lib.c:1772:ll_statfs_internal()) obd_statfs fails: rc = -5
[84386.986471] LustreError: 122750:0:(llite_lib.c:1772:ll_statfs_internal()) Skipped 29 previous similar messages
[84704.429967] LNet: 5429:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.52.23.5@o2ib: 4379575 seconds
[84704.429970] LNet: 5429:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 863 previous similar messages
[84881.004949] Lustre: 5617:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1536409034/real 1536409095] req@ffff882f2a6e5700 x1610952588854608/t0(0) o8->[email protected]@o2ib:28/4 lens 520/544 e 0 to 1 dl 1536409314 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[84881.004957] Lustre: 5617:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 863 previous similar messages
[85065.953686] LustreError: 123635:0:(llite_lib.c:1772:ll_statfs_internal()) obd_statfs fails: rc = -5
[85065.953689] LustreError: 123635:0:(llite_lib.c:1772:ll_statfs_internal()) Skipped 26 previous similar messages

fstab mount options:

lustre flock,_netdev,x-systemd.requires=lnet.service 0 0

The ib_* benchmark tests give the usual results. Where should I check?

Best regards.
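One comparison that may help narrow this down: the mount sources above use 10.51.22.x@o2ib, while the LNet timeout is reported for 10.52.23.5@o2ib. The following is only a minimal shell sketch for listing both sets of NIDs side by side. The names extract_nids and report are hypothetical helpers, not Lustre tools, and the pattern assumes IPv4 NIDs on an o2ib network.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct IPv4 o2ib NID found on stdin.
extract_nids() {
    grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+@o2ib[0-9]*' | sort -u
}

# Hypothetical helper: show the NIDs used by the Lustre mounts next to
# the NIDs that LNet reports as timing out. Reading dmesg may need root.
report() {
    echo "NIDs in Lustre mount sources:"
    grep ' lustre ' /etc/mtab | extract_nids
    echo "NIDs with LNet tx timeouts:"
    dmesg | grep 'Timed out tx' | extract_nids
}
```

After sourcing the file, running report prints both lists. Each NID can then be checked from the client with lctl ping, for example lctl ping 10.52.23.5@o2ib, assuming the Lustre client utilities are installed.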
