Sorry for the delay; I've been traveling.

On Sun, May 17, 2015 at 3:49 PM, Francois Lafont <flafdiv...@free.fr> wrote:
> Hi,
>
> Sorry for my late answer.
>
> Gregory Farnum wrote:
>
>>> 1. Is this kind of freeze normal? Can I avoid these freezes with a
>>> more recent version of the kernel in the client?
>>
>> Yes, it's normal. Although you should have been able to do a lazy
>> and/or force umount. :)
>
> Ah, I haven't tried it.
> Maybe I'm wrong, but I think a "lazy" or a "force" umount wouldn't
> succeed. I'll test whether I can reproduce the freeze.
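
For reference, the commands I had in mind are along these lines (just a
sketch; /mnt/cephfs is an assumed mount point):

  umount -l /mnt/cephfs    # lazy: detach the mount point now, clean up later
  umount -f /mnt/cephfs    # force: can help when the server is unreachable

Whether they actually get you out of a hung mount depends on what exactly is
blocked in the kernel, so it's worth testing.
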
>
>> You can't avoid the freeze with a newer client. :(
>>
>> If you notice the problem quickly enough, you should be able to
>> reconnect everything by rebooting the MDS — although if the MDS hasn't
>> failed the client then things shouldn't be blocking, so actually that
>> probably won't help you.
>
> Yes, the mds was completely OK and, after the hard reboot of the client,
> the client had access to the cephfs again with exactly the same mds service
> on the cluster side (no restart, etc.).
>
>>> 2. Can I avoid these freezes with ceph-fuse instead of the kernel
>>> cephfs module? But in this case, the cephfs performance will be
>>> worse. Am I wrong?
>>
>> No, ceph-fuse will suffer the same blockage, although obviously in
>> userspace it's a bit easier to clean up.
>
> Yes, I suppose that after some "kill" commands I would be able to remount
> the cephfs without any reboot, right?
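
Right. With ceph-fuse the recovery is all in userspace, so something along
these lines should work (just a sketch; the mount point and monitor address
are assumptions for illustration):

  pkill -9 ceph-fuse                    # kill the stuck client process
  fusermount -uz /mnt/cephfs            # lazily detach the dead FUSE mount
  ceph-fuse -m mon1:6789 /mnt/cephfs    # remount

No reboot should be needed.
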
>
>> Depending on your workload it
>> will range from slightly faster to a lot slower. Though you'll also get
>> updates faster/more easily. ;)
>
> Yes, I imagine that with "ceph-fuse" I have a fully up-to-date cephfs client
> (in user space), whereas with the kernel cephfs client I have just the
> version available in the current kernel of my client node (3.16 in my case).
>
>>> 3. Is there a parameter in ceph.conf to tell mds to be more patient
>>> before closing the "stale session" of a client?
>>
>> Yes. You'll need to increase the "mds session timeout" value on the
>> MDS; it currently defaults to 60 seconds. You can increase that to
>> whatever value you like. The tradeoff here is that if you have a
>> client die, anything it had "capabilities" on (for read/write access)
>> will be unavailable to anybody who's doing something that might
>> conflict with those capabilities.
>
> Ok, thanks for the warning, it seems logical.
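
In case it helps, that setting goes in the [mds] section of ceph.conf on the
MDS host (a sketch; 300 seconds is just an arbitrary example value):

  [mds]
  mds session timeout = 300

and then restart the MDS. I believe you can also change it at runtime via the
admin socket, e.g. "ceph daemon mds.$id config set mds_session_timeout 300".
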
>
>> If you've got a new enough MDS (Hammer, probably, but you can check)
>
> Yes, I use Hammer.
>
>> then you can use the admin socket to boot specific sessions, so it may
>> suit you to set very large timeouts and manually zap any client which
>> actually goes away badly (rather than getting disconnected by the
>> network).
>
> Ok, I see. According to the online documentation, the way to close
> a cephfs client session is:
>
> ceph daemon mds.$id session ls             # to get the $session_id and the $address
> ceph osd blacklist add $address
> ceph osd dump                              # to get the $epoch
> ceph daemon mds.$id osdmap barrier $epoch
> ceph daemon mds.$id session evict $session_id
>
> Is it correct?
>
> With the commands above, could I reproduce the client freeze in my testing
> cluster?

Yes, I believe so.

>
> I'll try, because it's convenient to be able to reproduce the problem with
> command lines alone (without really stopping the network on the client,
> etc.). I would like to test whether, with ceph-fuse, I can easily restore
> the situation of my client.
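
If you do reproduce it that way, keep in mind that the blacklist entry will
keep that client locked out until it expires, so to restore things you would
do something like (again just a sketch, reusing the $address from "session ls"):

  ceph osd blacklist rm $address             # lift the blacklist
  # then unmount/kill the old client mount and remount as usual

As far as I know the evicted session itself can't be resumed; the client has
to come back with a fresh mount.
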
>
>>> I'm in a testing period and a hard reboot of my cephfs clients would
>>> be quite annoying for me. Thanks in advance for your help.
>>
>> Yeah. Unfortunately there's a basic tradeoff in strictly-consistent
>> (aka POSIX) network filesystems here: if the network goes away, you
>> can't be consistent any more because the disconnected client can make
>> conflicting changes. And you can't tell exactly when the network
>> disappeared.
>
> And would it be conceivable one day (for instance via an option) to be able
> to change the behavior of cephfs to be *not* strictly consistent, like NFS
> for instance? It seems to me that could improve the performance of cephfs,
> and that cephfs could be more tolerant of short network failures (not really
> sure about this second point). OK, it's just a remark from a simple and
> unqualified ceph-user ;) but it seems to me that NFS isn't strictly
> consistent, and generally that is not a problem in many use cases. Am I
> wrong?

Mmm, this is something we're pretty resistant to. In particular NFS
just doesn't make any efforts to be consistent when there are multiple
writers, and CephFS works *really hard* to behave properly in that
case. For many use cases it's not a big deal, but for others it is,
and we target some of them.
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
