Sorry for the delay; I've been traveling.

On Sun, May 17, 2015 at 3:49 PM, Francois Lafont <flafdiv...@free.fr> wrote:
> Hi,
>
> Sorry for my late answer.
>
> Gregory Farnum wrote:
>
>>> 1. Is this kind of freeze normal? Can I avoid these freezes with a
>>> more recent version of the kernel in the client?
>>
>> Yes, it's normal. Although you should have been able to do a lazy
>> and/or force umount. :)
>
> Ah, I haven't tried it.
> Maybe I'm wrong, but I think a "lazy" or a "force" umount wouldn't
> succeed. I'll test it if I can reproduce the freeze.
>
>> You can't avoid the freeze with a newer client. :(
>>
>> If you notice the problem quickly enough, you should be able to
>> reconnect everything by rebooting the MDS -- although if the MDS
>> hasn't failed the client then things shouldn't be blocking, so
>> actually that probably won't help you.
>
> Yes, the MDS was completely OK, and after the hard reboot of the
> client, the client had access to the CephFS again with exactly the
> same MDS service on the cluster side (no restart, etc.).
>
>>> 2. Can I avoid these freezes with ceph-fuse instead of the kernel
>>> cephfs module? But in this case, the cephfs performance will be
>>> worse. Am I wrong?
>>
>> No, ceph-fuse will suffer the same blockage, although obviously in
>> userspace it's a bit easier to clean up.
>
> Yes, I suppose that after "kill" commands, I would be able to remount
> the CephFS without any reboot, wouldn't I?
>
>> Depending on your workload it will be anywhere from slightly faster
>> to a lot slower. Though you'll also get updates faster/more easily. ;)
>
> Yes, I imagine that with ceph-fuse I have a completely up-to-date
> CephFS client (in user space), whereas with the kernel CephFS client
> I have just the version available in the current kernel of my client
> node (3.16 in my case).
>
>>> 3. Is there a parameter in ceph.conf to tell the MDS to be more
>>> patient before closing the "stale session" of a client?
>>
>> Yes. You'll need to increase the "mds session timeout" value on the
>> MDS; it currently defaults to 60 seconds. You can increase it to
>> whatever value you like. The tradeoff here is that if you have a
>> client die, anything it had "capabilities" on (for read/write access)
>> will be unavailable for anybody who's doing something that might
>> conflict with those capabilities.
>
> OK, thanks for the warning; it seems logical.
>
>> If you've got a new enough MDS (Hammer, probably, but you can check)
>
> Yes, I use Hammer.
>
>> then you can use the admin socket to boot specific sessions, so it
>> may suit you to set very large timeouts and manually zap any client
>> which actually goes away badly (rather than getting disconnected by
>> the network).
>
> OK, I see. According to the online documentation, the way to close
> a CephFS client session is:
>
>     ceph daemon mds.$id session ls    # to get the $session_id and the $address
>     ceph osd blacklist add $address
>     ceph osd dump                     # to get the $epoch
>     ceph daemon mds.$id osdmap barrier $epoch
>     ceph daemon mds.$id session evict $session_id
>
> Is this correct?
>
> With the commands above, could I reproduce the client freeze in my
> testing cluster?

Yes, I believe so.
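To make that concrete, here is the same sequence with hypothetical
values filled in (an MDS daemon id of "a"; the client address, epoch,
and session id all come from the earlier commands' output, so yours
will differ):

    ceph daemon mds.a session ls                      # note the client's session id and address
    ceph osd blacklist add 192.168.0.10:0/3710147553  # hypothetical client address from "session ls"
    ceph osd dump | head -1                           # the first line shows the current epoch
    ceph daemon mds.a osdmap barrier 412              # hypothetical epoch from "ceph osd dump"
    ceph daemon mds.a session evict 4115              # hypothetical session id from "session ls"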
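And if you do go the large-timeout route, that's just a ceph.conf
change on the MDS host (300 is only an example value, not a
recommendation):

    [mds]
    mds session timeout = 300    ; seconds; the default is 60

followed by an MDS restart, or injected at runtime with something
like "ceph tell mds.a injectargs '--mds_session_timeout 300'".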
> I'll try, because it's convenient to be able to reproduce the problem
> just with command lines (without really stopping the network on the
> client, etc.). I would like to test whether, with ceph-fuse, I can
> easily restore the situation of my client.
>
>>> I'm in a testing period and a hard reboot of my cephfs clients would
>>> be quite annoying for me. Thanks in advance for your help.
>>
>> Yeah. Unfortunately there's a basic tradeoff in strictly-consistent
>> (aka POSIX) network filesystems here: if the network goes away, you
>> can't be consistent any more because the disconnected client can make
>> conflicting changes. And you can't tell exactly when the network
>> disappeared.
>
> And could it be conceivable one day (for instance with an option) to
> be able to change the behavior of CephFS to be *not* strictly
> consistent, like NFS for instance? It seems to me it could improve the
> performance of CephFS, and CephFS could be more flexible concerning
> short network failures (not really sure about this second point). OK,
> it's just a remark from a simple and unqualified ceph-user ;) but it
> seems to me that NFS isn't strictly consistent and generally this
> isn't a problem in many use cases. Am I wrong?

Mmm, this is something we're pretty resistant to. In particular, NFS
just doesn't make any effort to be consistent when there are multiple
writers, and CephFS works *really hard* to behave properly in that
case. For many use cases it's not a big deal, but for others it is,
and we target some of them.
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com