Re: [ceph-users] (yet another) multi active mds advise needed
Hi Daniel,

Thanks for clarifying. I'll have a look at the dirfrag option.

Regards,
Webert Lima

On Sat, May 19, 2018 at 01:18, Daniel Baumann wrote:
> On 05/19/2018 01:13 AM, Webert de Souza Lima wrote:
> > New question: will it make any difference in the balancing if instead of
> > having the MAIL directory in the root of cephfs and the domains'
> > subtrees inside it, I discard the parent dir and put all the subtrees
> > right in the cephfs root?
>
> the balancing between the MDS is influenced by which directories are
> accessed; the currently accessed directory-trees are divided between the
> MDS's (also check the dirfrag option in the docs). assuming you have the
> same access pattern, the "fragmentation" between the MDS's happens at
> these "target-directories", so it doesn't matter if these directories
> are further up or down in the same filesystem tree.
>
> in the multi-MDS scenario where the MDS serving rank 0 fails, the
> effects at the moment of the failure for any cephfs client accessing a
> directory/file are the same (as described in an earlier mail),
> regardless of which level the directory/file is at within the filesystem.
>
> Regards,
> Daniel

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] (yet another) multi active mds advise needed
On 05/19/2018 01:13 AM, Webert de Souza Lima wrote:
> New question: will it make any difference in the balancing if instead of
> having the MAIL directory in the root of cephfs and the domains'
> subtrees inside it, I discard the parent dir and put all the subtrees
> right in the cephfs root?

the balancing between the MDS is influenced by which directories are accessed; the currently accessed directory-trees are divided between the MDS's (also check the dirfrag option in the docs). assuming you have the same access pattern, the "fragmentation" between the MDS's happens at these "target-directories", so it doesn't matter if these directories are further up or down in the same filesystem tree.

in the multi-MDS scenario where the MDS serving rank 0 fails, the effects at the moment of the failure for any cephfs client accessing a directory/file are the same (as described in an earlier mail), regardless of which level the directory/file is at within the filesystem.

Regards,
Daniel
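[For reference, the dirfrag behaviour mentioned above is governed by a few mds options. A ceph.conf sketch with the Luminous-era option names; the values shown are the documented defaults, listed only for illustration:]

```ini
[mds]
# hard cap on entries in a single dirfrag before writes are throttled
mds_bal_fragment_size_max = 100000
# split a directory fragment once it exceeds this many entries
mds_bal_split_size = 10000
# merge sibling fragments back together below this many entries
mds_bal_merge_size = 50
```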
Re: [ceph-users] (yet another) multi active mds advise needed
Hi Patrick,

On Fri, May 18, 2018 at 6:20 PM Patrick Donnelly wrote:
> Each MDS may have multiple subtrees they are authoritative for. Each
> MDS may also replicate metadata from another MDS as a form of load
> balancing.

OK, it's good to know that it actually does some load balancing. Thanks.

New question: will it make any difference in the balancing if instead of having the MAIL directory in the root of cephfs and the domains' subtrees inside it, I discard the parent dir and put all the subtrees right in the cephfs root?

> standby-replay daemons are not available to take over for ranks other
> than the one it follows. So, you would want to have a standby-replay
> daemon for each rank or just have normal standbys. It will likely
> depend on the size of your MDS (cache size) and available hardware.
>
> It's best if you see if the normal balancer (especially in v12.2.6
> [1]) can handle the load for you without trying to micromanage things
> via pins. You can use pinning to isolate metadata load from other
> ranks as a stop-gap measure.

OK, I will start with the simplest way. This can be changed after deployment if it comes to be the case.

On Fri, May 18, 2018 at 6:38 PM Daniel Baumann wrote:
> jftr, having 3 active mds and 3 standby-replay resulted in May 2017 in a
> longer downtime for us due to http://tracker.ceph.com/issues/21749
>
> we're not using standby-replay MDS's anymore but only "normal" standby,
> and haven't had any problems since (running kraken then, upgraded
> to luminous last fall).

Thank you very much for your feedback, Daniel. I'll go for the regular standby daemons, then.

Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
Re: [ceph-users] (yet another) multi active mds advise needed
On 05/18/2018 11:19 PM, Patrick Donnelly wrote:
> So, you would want to have a standby-replay
> daemon for each rank or just have normal standbys. It will likely
> depend on the size of your MDS (cache size) and available hardware.

jftr, having 3 active mds and 3 standby-replay resulted in May 2017 in a longer downtime for us due to http://tracker.ceph.com/issues/21749

(http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/thread.html#21390 - thanks again for the help back then, still much appreciated)

we're not using standby-replay MDS's anymore but only "normal" standby, and haven't had any problems since (running kraken then, upgraded to luminous last fall).

Regards,
Daniel
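[On Luminous, standby-replay is a per-daemon setting, so the choice between standby-replay and plain standby discussed above is made in ceph.conf. A sketch; the daemon names `mds.a`/`mds.b` are placeholders:]

```ini
# each standby-replay daemon follows exactly one rank
[mds.a]
mds_standby_replay = true
mds_standby_for_rank = 0

[mds.b]
mds_standby_replay = true
mds_standby_for_rank = 1
```

[A daemon with neither option set acts as a "normal" standby, eligible to take over any failed rank.]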
Re: [ceph-users] (yet another) multi active mds advise needed
Hello Webert,

On Fri, May 18, 2018 at 1:10 PM, Webert de Souza Lima wrote:
> Hi,
>
> We're migrating from a Jewel / filestore based cephfs architecture to a
> Luminous / bluestore based one.
>
> One MUST HAVE is multiple Active MDS daemons. I'm still lacking knowledge of
> how it actually works.
> After reading the docs and ML we learned that they work by sort of dividing
> the responsibilities, each with its own and only directory subtree. (please
> correct me if I'm wrong).

Each MDS may have multiple subtrees they are authoritative for. Each MDS may also replicate metadata from another MDS as a form of load balancing.

> Question 1: I'd like to know if it is viable to have 4 MDS daemons, being 3
> Active and 1 Standby (or Standby-Replay if that's still possible with
> multi-mds).

standby-replay daemons are not available to take over for ranks other than the one they follow. So, you would want to have a standby-replay daemon for each rank or just have normal standbys. It will likely depend on the size of your MDS (cache size) and available hardware.

> Basically, what we have is 2 subtrees used by dovecot: INDEX and MAIL.
> Their tree is almost identical but INDEX stores all dovecot metadata with
> heavy IO going on and MAIL stores actual email files, with much more writes
> than reads.
>
> I don't know by now which one could bottleneck the MDS servers most so I
> wonder if I can take metrics on MDS usage per pool when it's deployed.
> Question 2: If the metadata workloads are very different I wonder if I can
> isolate them, like pinning MDS servers X and Y to one of the directories.

It's best if you see if the normal balancer (especially in v12.2.6 [1]) can handle the load for you without trying to micromanage things via pins. You can use pinning to isolate metadata load from other ranks as a stop-gap measure.
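[For completeness, the stop-gap pinning mentioned above is done with the `ceph.dir.pin` extended attribute on a directory. A sketch against a live cluster, assuming the filesystem is mounted at /mnt/cephfs with the INDEX/MAIL layout from the original mail:]

```shell
# pin dovecot's metadata-heavy tree to rank 0 and the mail store to rank 1;
# descendants inherit the pin unless they set their own
setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/INDEX
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/MAIL

# -v -1 removes the pin and returns the subtree to the normal balancer
setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/INDEX
```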
[1] https://github.com/ceph/ceph/pull/21412

--
Patrick Donnelly
[ceph-users] (yet another) multi active mds advise needed
Hi,

We're migrating from a Jewel / filestore based cephfs architecture to a Luminous / bluestore based one.

One MUST HAVE is multiple Active MDS daemons. I'm still lacking knowledge of how it actually works. After reading the docs and ML we learned that they work by sort of dividing the responsibilities, each with its own and only directory subtree. (please correct me if I'm wrong)

Question 1: I'd like to know if it is viable to have 4 MDS daemons, being 3 Active and 1 Standby (or Standby-Replay if that's still possible with multi-mds).

Basically, what we have is 2 subtrees used by dovecot: INDEX and MAIL. Their trees are almost identical, but INDEX stores all dovecot metadata with heavy IO going on, and MAIL stores the actual email files, with many more writes than reads.

I don't know yet which one could bottleneck the MDS servers most, so I wonder if I can take metrics on MDS usage per pool when it's deployed. Question 2: If the metadata workloads are very different, I wonder if I can isolate them, like pinning MDS servers X and Y to one of the directories.

Cache Tier is deprecated, so, Question 3: how can I think of a read cache mechanism in Luminous with bluestore, mainly to keep newly created files (emails that just arrived and will probably be fetched by the user in a few seconds via IMAP/POP3)?

Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
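[For readers landing here from the archives: the 3-active/1-standby layout in question 1 maps to something like the following on Luminous. A CLI sketch; `cephfs` is a placeholder filesystem name, and with four MDS daemons running, whichever one is not assigned a rank acts as the standby automatically:]

```shell
# Luminous requires explicitly allowing multiple active MDS first
ceph fs set cephfs allow_multimds true

# raise the number of active ranks from 1 to 3
ceph fs set cephfs max_mds 3

# confirm the rank assignment and which daemon is left as standby
ceph fs status cephfs
```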