Hi Zhenshi, if the client mount is still hanging but no session is connected, you probably have a PID blocked on I/O against the cephfs mount. I run into that now and then, and the only solution is to reboot the server, as you won't be able to kill a process with pending I/O.
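If you want to confirm that before rebooting: processes stuck in uninterruptible sleep (state D) are the usual sign. Something like this should list them (a generic Linux check, not Ceph-specific; an untested sketch):

    # List processes in uninterruptible sleep (state D), i.e. blocked on I/O.
    # The wchan column hints at the kernel function each one is waiting in
    # (often a ceph_* symbol when a cephfs mount is stuck).
    ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /^D/'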
Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


On Wed, Aug 8, 2018 at 11:17 AM Zhenshi Zhou <[email protected]> wrote:

> Hi Webert,
> That command shows the current sessions, whereas the server where I got
> those files (osdc, mdsc, monc) has been disconnected for a long time,
> so I cannot get useful information from the command you provided.
>
> Thanks
>
> On Wed, Aug 8, 2018 at 10:10 PM, Webert de Souza Lima
> <[email protected]> wrote:
>
>> You could also see open sessions at the MDS server by issuing `ceph
>> daemon mds.XX session ls`.
>>
>> Regards,
>>
>> Webert Lima
>> DevOps Engineer at MAV Tecnologia
>> *Belo Horizonte - Brasil*
>> *IRC NICK - WebertRLZ*
>>
>>
>> On Wed, Aug 8, 2018 at 5:08 AM Zhenshi Zhou <[email protected]> wrote:
>>
>>> Hi, I found an old server which still has cephfs mounted and has the
>>> debug files.
>>> # cat osdc
>>> REQUESTS 0 homeless 0
>>> LINGER REQUESTS
>>> BACKOFFS
>>> # cat monc
>>> have monmap 2 want 3+
>>> have osdmap 3507
>>> have fsmap.user 0
>>> have mdsmap 55 want 56+
>>> fs_cluster_id -1
>>> # cat mdsc
>>> 194 mds0 getattr #10000036ae3
>>>
>>> What does it mean?
>>>
>>> On Wed, Aug 8, 2018 at 1:58 PM, Zhenshi Zhou <[email protected]> wrote:
>>>
>>>> I restarted the client server, so there's no file in that
>>>> directory now. I will take care of it if the client hangs next time.
>>>>
>>>> Thanks
>>>>
>>>> On Wed, Aug 8, 2018 at 11:23 AM, Yan, Zheng <[email protected]> wrote:
>>>>
>>>>> On Wed, Aug 8, 2018 at 11:02 AM Zhenshi Zhou <[email protected]>
>>>>> wrote:
>>>>> >
>>>>> > Hi,
>>>>> > I checked all my ceph servers and none of them has cephfs mounted
>>>>> (maybe I umounted it after testing). As a result, the cluster didn't
>>>>> encounter a memory deadlock. Besides, I checked the monitoring system,
>>>>> and memory and cpu usage were at normal levels while the clients hung.
>>>>> > Back to my question, there must be something else causing the
>>>>> client hang.
>>>>> >
>>>>>
>>>>> Check if there are hung requests in
>>>>> /sys/kernel/debug/ceph/xxxx/{osdc,mdsc}
>>>>> [a sketch of this check follows after the quoted thread below].
>>>>>
>>>>> > On Wed, Aug 8, 2018 at 4:16 AM, Zhenshi Zhou <[email protected]> wrote:
>>>>> >>
>>>>> >> Hi, I'm not sure whether a cephfs mount that isn't used for any
>>>>> operation within the mounted directory would still be affected by
>>>>> cache flushing. I mounted cephfs on the osd servers only for testing
>>>>> and then left it there. Anyway, I will umount it.
>>>>> >>
>>>>> >> Thanks
>>>>> >>
>>>>> >> On Wed, Aug 8, 2018 at 3:37 AM, John Spray <[email protected]> wrote:
>>>>> >>>
>>>>> >>> On Tue, Aug 7, 2018 at 5:42 PM Reed Dier <[email protected]>
>>>>> wrote:
>>>>> >>> >
>>>>> >>> > This is the first I am hearing about this as well.
>>>>> >>>
>>>>> >>> This is not a Ceph-specific thing -- it can also affect similar
>>>>> >>> systems like Lustre.
>>>>> >>>
>>>>> >>> The classic case is when, under some memory pressure, the kernel
>>>>> >>> tries to free memory by flushing the client's page cache, but doing
>>>>> >>> the flush means allocating more memory on the server, making the
>>>>> >>> memory pressure worse, until the whole thing just seizes up.
>>>>> >>>
>>>>> >>> John
>>>>> >>>
>>>>> >>> > Granted, I am using ceph-fuse rather than the kernel client at
>>>>> this point, but that isn't etched in stone.
>>>>> >>> >
>>>>> >>> > Curious if there is more to share.
>>>>> >>> >
>>>>> >>> > Reed
>>>>> >>> >
>>>>> >>> > On Aug 7, 2018, at 9:47 AM, Webert de Souza Lima
>>>>> <[email protected]> wrote:
>>>>> >>> >
>>>>> >>> > On Tue, Aug 7, 2018 at 7:51 PM, Yan, Zheng <[email protected]> wrote:
>>>>> >>> >>
>>>>> >>> >> On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou
>>>>> <[email protected]> wrote:
>>>>> >>> >> this can cause memory deadlock. you should avoid doing this
>>>>> >>> >>
>>>>> >>> >> > On Tue, Aug 7, 2018 at 7:12 PM, Yan, Zheng <[email protected]> wrote:
>>>>> >>> >> >>
>>>>> >>> >> >> did you mount cephfs on the same machines that run ceph-osd?
>>>>> >>> >> >>
>>>>> >>> >
>>>>> >>> >
>>>>> >>> > I didn't know about this. I run this setup in production. :P
>>>>> >>> >
>>>>> >>> > Regards,
>>>>> >>> >
>>>>> >>> > Webert Lima
>>>>> >>> > DevOps Engineer at MAV Tecnologia
>>>>> >>> > Belo Horizonte - Brasil
>>>>> >>> > IRC NICK - WebertRLZ
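P.S. A minimal sketch of the debugfs check Yan suggested above, run as root and assuming debugfs is mounted at /sys/kernel/debug; the xxxx in his path stands for the per-mount directory (named after the cluster fsid and client id):

    # Dump in-flight requests for every kernel cephfs mount on this host.
    # An empty osdc/mdsc listing means no requests are stuck in flight.
    for d in /sys/kernel/debug/ceph/*/; do
        echo "== $d =="
        for f in osdc mdsc monc; do
            echo "-- $f --"
            cat "$d$f"
        done
    done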
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
