Hi,

Sorry for my late reply.

Gregory Farnum wrote:

>> 1. Is this kind of freeze normal? Can I avoid these freezes with a
>> more recent version of the kernel in the client?
> 
> Yes, it's normal. Although you should have been able to do a lazy
> and/or force umount. :)

Ah, I haven't tried it.
Maybe I'm wrong, but I think a "lazy" or a "force" umount wouldn't
succeed. I'll try to test whether I can reproduce the freeze.

> You can't avoid the freeze with a newer client. :(
> 
> If you notice the problem quickly enough, you should be able to
> reconnect everything by rebooting the MDS — although if the MDS hasn't
> failed the client then things shouldn't be blocking, so actually that
> probably won't help you.

Yes, the MDS was completely OK, and after the hard reboot of the client,
the client had access to the cephfs again, with exactly the same MDS service
on the cluster side (no restart, etc.).

>> 2. Can I avoid these freezes with ceph-fuse instead of the kernel
>> cephfs module? But in this case, the cephfs performance will be
>> worse. Am I wrong?
> 
> No, ceph-fuse will suffer the same blockage, although obviously in
> userspace it's a bit easier to clean up.

Yes, I suppose that after some "kill" commands, I would be able to remount
the cephfs without any reboot etc. Is that correct?

> Depending on your workload it
> will be slightly faster to a lot slower. Though you'll also get
> updates faster/more easily. ;)

Yes, I imagine that with "ceph-fuse" I get a completely up-to-date
cephfs client (in userspace), whereas with the kernel cephfs client
I have only the version available in the current kernel of my client
node (3.16 in my case).

>> 3. Is there a parameter in ceph.conf to tell mds to be more patient
>> before closing the "stale session" of a client?
> 
> Yes. You'll need to increase the "mds session timeout" value on the
> MDS; it currently defaults to 60 seconds. You can increase that to
> whatever values you like. The tradeoff here is that if you have a
> client die, anything it had "capabilities" on (for read/write access)
> will be unavailable for anybody who's doing something that might
> conflict with those capabilities.

Ok, thanks for the warning, it seems logical.
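If I understand correctly, the setting would go in ceph.conf on the MDS
side, something like this (the 300-second value is just an example I
picked, not a recommendation):

```ini
[mds]
# Default is 60 seconds; raise it so a short network hiccup doesn't
# get the client's session closed as "stale".
mds session timeout = 300
```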

> If you've got a new enough MDS (Hammer, probably, but you can check)

Yes, I use Hammer.

> then you can use the admin socket to boot specific sessions, so it may
> suit you to set very large timeouts and manually zap any client which
> actually goes away badly (rather than getting disconnected by the
> network).

Ok, I see. According to the online documentation, the way to close
a cephfs client session is:

ceph daemon mds.$id session ls             # to get the $session_id and $address
ceph osd blacklist add $address
ceph osd dump                              # to get the $epoch
ceph daemon mds.$id osdmap barrier $epoch
ceph daemon mds.$id session evict $session_id

Is it correct?
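By the way, if I want to script the eviction one day, I imagine picking the
$session_id out of the `session ls` JSON with something like this (the "id"
and "inst" field names are my assumption from the output I've seen, and the
sample output below is made up — check the real output of your MDS):

```python
import json

# Illustrative (made-up) output of `ceph daemon mds.$id session ls`;
# the field names "id" and "inst" are assumptions, not guaranteed.
session_ls_output = """
[
    {"id": 4305,
     "inst": "client.4305 192.168.0.10:0/2305923799",
     "state": "open"}
]
"""

def find_session_id(session_ls_json, client_addr):
    """Return the id of the first session whose 'inst' field
    contains client_addr, or None if no session matches."""
    for session in json.loads(session_ls_json):
        if client_addr in session.get("inst", ""):
            return session["id"]
    return None

print(find_session_id(session_ls_output, "192.168.0.10"))  # -> 4305
```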

With the commands above, could I reproduce the client freeze in my testing
cluster?

I'll try, because it's convenient to be able to reproduce the problem just
with command lines (without really stopping the network on the client, etc.).
I would like to test whether, with ceph-fuse, I can easily restore the
situation of my client.

>> I'm in a testing period and a hard reboot of my cephfs clients would
>> be quite annoying for me. Thanks in advance for your help.
> 
> Yeah. Unfortunately there's a basic tradeoff in strictly-consistent
> (aka POSIX) network filesystems here: if the network goes away, you
> can't be consistent any more because the disconnected client can make
> conflicting changes. And you can't tell exactly when the network
> disappeared.

And could it be conceivable one day (for instance via an option) to be
able to change the behavior of cephfs to be *not* strictly consistent,
like NFS for instance? It seems to me it could improve the performance of
cephfs, and cephfs could be more tolerant of short network failures
(not really sure about this second point). OK, it's just a remark from a
simple and unqualified ceph-user ;) but it seems to me that NFS isn't strictly
consistent and generally this isn't a problem in many use cases. Am I wrong?

> So while we hope to make this less painful in the future, the network
> dying that badly is a failure case that you need to be aware of
> meaning that the client might have conflicting information. If it
> *does* have conflicting info, the best we can do about it is be
> polite, return a bunch of error codes, and unmount gracefully. We'll
> get there eventually but it's a lot of work.

Yes, I can imagine the amount of work...
Thanks a lot, Greg, for your answer. ;)

-- 
François Lafont
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
