Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-19 Thread Webert de Souza Lima
Hi Daniel,

Thanks for clarifying.
I'll have a look at dirfrag option.

Regards,
Webert Lima

On Sat, May 19, 2018, 01:18, Daniel Baumann wrote:

> On 05/19/2018 01:13 AM, Webert de Souza Lima wrote:
> > New question: will it make any difference to the balancing if, instead
> > of having the MAIL directory in the root of cephfs with the domains'
> > subtrees inside it, I discard the parent directory and put all the
> > subtrees right in the cephfs root?
>
> the balancing between the MDSs is influenced by which directories are
> accessed: the currently accessed directory trees are divided between the
> MDSs (also check the dirfrag option in the docs). assuming you have the
> same access pattern, the "fragmentation" between the MDSs happens at
> these "target directories", so it doesn't matter whether these
> directories are further up or down in the same filesystem tree.
>
> in the multi-MDS scenario where the MDS serving rank 0 fails, the
> effects at the moment of the failure are the same for any cephfs client
> accessing a directory/file (as described in an earlier mail), regardless
> of which level the directory/file is at within the filesystem.
>
> Regards,
> Daniel


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Daniel Baumann
On 05/19/2018 01:13 AM, Webert de Souza Lima wrote:
> New question: will it make any difference to the balancing if, instead of
> having the MAIL directory in the root of cephfs with the domains' subtrees
> inside it, I discard the parent directory and put all the subtrees right
> in the cephfs root?

the balancing between the MDSs is influenced by which directories are
accessed: the currently accessed directory trees are divided between the
MDSs (also check the dirfrag option in the docs). assuming you have the
same access pattern, the "fragmentation" between the MDSs happens at these
"target directories", so it doesn't matter whether these directories are
further up or down in the same filesystem tree.
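
as a rough, untested sketch (assuming the Luminous CLI and the default
option names), dirfrag can be enabled and tuned along these lines:

  # filesystems created before Luminous may need the flag set explicitly;
  # newer filesystems have directory fragmentation on by default
  $ ceph fs set <fs_name> allow_dirfrags true

  # ceph.conf thresholds that control when a fragment is split or merged
  # (the values below are, I believe, the defaults)
  [mds]
  mds bal split size = 10000
  mds bal merge size = 50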

in the multi-MDS scenario where the MDS serving rank 0 fails, the effects
at the moment of the failure are the same for any cephfs client accessing
a directory/file (as described in an earlier mail), regardless of which
level the directory/file is at within the filesystem.
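
to see which daemon currently holds which rank (and which standbys are
around), something like this should do, assuming the Luminous CLI:

  $ ceph mds stat
  $ ceph fs status <fs_name>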

Regards,
Daniel


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Webert de Souza Lima
Hi Patrick,

On Fri, May 18, 2018 at 6:20 PM, Patrick Donnelly wrote:

> Each MDS may have multiple subtrees they are authoritative for. Each
> MDS may also replicate metadata from another MDS as a form of load
> balancing.


Ok, it's good to know that it actually does some load balancing. Thanks.
New question: will it make any difference to the balancing if, instead of
having the MAIL directory in the root of cephfs with the domains' subtrees
inside it, I discard the parent directory and put all the subtrees right in
the cephfs root?


> standby-replay daemons are not available to take over for ranks other
> than the one they follow. So, you would want to have a standby-replay
> daemon for each rank, or just have normal standbys. It will likely
> depend on the size of your MDSs (cache size) and available hardware.
>
> It's best if you see if the normal balancer (especially in v12.2.6
> [1]) can handle the load for you without trying to micromanage things
> via pins. You can use pinning to isolate metadata load from other
> ranks as a stop-gap measure.
>

Ok, I will start with the simplest setup. This can be changed after
deployment if it turns out to be necessary.

On Fri, May 18, 2018 at 6:38 PM, Daniel Baumann wrote:

> jftr, having 3 active MDSs and 3 standby-replay MDSs resulted in a longer
> downtime for us in May 2017, due to http://tracker.ceph.com/issues/21749
>
> we're not using standby-replay MDSs anymore, only "normal" standbys, and
> haven't had any problems since (we were running kraken then, and upgraded
> to luminous last fall).
>

Thank you very much for your feedback, Daniel. I'll go with the regular
standby daemons, then.

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Daniel Baumann
On 05/18/2018 11:19 PM, Patrick Donnelly wrote:
> So, you would want to have a standby-replay
> daemon for each rank, or just have normal standbys. It will likely
> depend on the size of your MDSs (cache size) and available hardware.

jftr, having 3 active MDSs and 3 standby-replay MDSs resulted in a longer
downtime for us in May 2017, due to http://tracker.ceph.com/issues/21749

(http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/thread.html#21390
- thanks again for the help back then, still much appreciated)

we're not using standby-replay MDSs anymore, only "normal" standbys, and
haven't had any problems since (we were running kraken then, and upgraded
to luminous last fall).

Regards,
Daniel


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Patrick Donnelly
Hello Webert,

On Fri, May 18, 2018 at 1:10 PM, Webert de Souza Lima wrote:
> Hi,
>
> We're migrating from a Jewel / filestore based cephfs architecture to a
> Luminous / bluestore based one.
>
> One MUST HAVE is multiple active MDS daemons. I'm still lacking knowledge
> of how this actually works.
> After reading the docs and the ML we learned that they work by sort of
> dividing the responsibilities, each one with its own exclusive directory
> subtree (please correct me if I'm wrong).

Each MDS may have multiple subtrees they are authoritative for. Each
MDS may also replicate metadata from another MDS as a form of load
balancing.
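
If you want to see which subtrees a given MDS is currently authoritative
for, the admin socket can show that; a sketch, assuming a daemon named
mds.a and the Luminous admin socket interface:

  $ ceph daemon mds.a get subtrees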

> Question 1: I'd like to know whether it is viable to have 4 MDS daemons,
> with 3 active and 1 standby (or standby-replay, if that's still possible
> with multi-MDS).

standby-replay daemons are not available to take over for ranks other
than the one they follow. So, you would want to have a standby-replay
daemon for each rank, or just have normal standbys. It will likely
depend on the size of your MDSs (cache size) and available hardware.
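
As a sketch only (assuming Luminous-era option names and hypothetical
daemon names), a "one standby-replay per rank" layout in ceph.conf could
look roughly like this:

  [mds.standby-a]
  mds standby replay = true
  mds standby for rank = 0

  [mds.standby-b]
  mds standby replay = true
  mds standby for rank = 1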

> Basically, what we have is 2 subtrees used by dovecot: INDEX and MAIL.
> Their trees are almost identical, but INDEX stores all dovecot metadata,
> with heavy IO going on, and MAIL stores the actual email files, with many
> more writes than reads.
>
> I don't know yet which one will bottleneck the MDS servers the most, so I
> wonder if I can take metrics on MDS usage per pool once it's deployed.
> Question 2: If the metadata workloads are very different, I wonder if I
> can isolate them, e.g. by pinning MDS servers X and Y to one of the
> directories.

It's best if you see if the normal balancer (especially in v12.2.6
[1]) can handle the load for you without trying to micromanage things
via pins. You can use pinning to isolate metadata load from other
ranks as a stop-gap measure.

[1] https://github.com/ceph/ceph/pull/21412
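
If you do end up needing pins, the export pin is set as an extended
attribute on the directory itself; a sketch, assuming a cephfs mount at
/mnt/cephfs and that rank 1 should own INDEX:

  # pin INDEX and everything below it to MDS rank 1
  $ setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/INDEX

  # a value of -1 removes the pin (inherit from parent / normal balancer)
  $ setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/INDEX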

-- 
Patrick Donnelly


[ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Webert de Souza Lima
Hi,

We're migrating from a Jewel / filestore based cephfs architecture to a
Luminous / bluestore based one.

One MUST HAVE is multiple active MDS daemons. I'm still lacking knowledge
of how this actually works.
After reading the docs and the ML we learned that they work by sort of
dividing the responsibilities, each one with its own exclusive directory
subtree (please correct me if I'm wrong).

Question 1: I'd like to know whether it is viable to have 4 MDS daemons,
with 3 active and 1 standby (or standby-replay, if that's still possible
with multi-MDS).
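
(For reference, my understanding is that the multi-active part itself is
enabled roughly like this, assuming the Luminous CLI and a filesystem
named "cephfs", with any remaining daemon then acting as standby:

  $ ceph fs set cephfs allow_multimds true
  $ ceph fs set cephfs max_mds 3

please correct me if that's off.)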

Basically, what we have is 2 subtrees used by dovecot: INDEX and MAIL.
Their trees are almost identical, but INDEX stores all dovecot metadata,
with heavy IO going on, and MAIL stores the actual email files, with many
more writes than reads.

I don't know yet which one will bottleneck the MDS servers the most, so I
wonder if I can take metrics on MDS usage per pool once it's deployed.
Question 2: If the metadata workloads are very different, I wonder if I can
isolate them, e.g. by pinning MDS servers X and Y to one of the directories.
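
(I assume I could at least watch the per-MDS perf counters through the
admin socket, e.g. something along the lines of

  $ ceph daemon mds.<name> perf dump
  $ ceph daemonperf mds.<name>

but I'm not sure those map cleanly to per-pool usage.)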

Cache tiering is deprecated, so:
Question 3: how should I think about a read cache mechanism in Luminous
with bluestore, mainly to keep newly created files hot (emails that have
just arrived and will probably be fetched by the user within a few seconds
via IMAP/POP3)?

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*