Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Wido den Hollander
On 11/29/19 6:28 AM, jes...@krogh.cc wrote: > Hi Nathan > > Is that true? > > The time it takes to reallocate the primary PG delivers “downtime” by > design, right? Seen from a writing client's perspective. > That is true. When an OSD goes down it will take a few seconds for its Placement
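
As a rough sketch (assuming default upstream values, and an OSD named osd.0 as a placeholder), the delay described here is bounded by a few heartbeat timeouts that can be inspected via the daemon's admin socket:

    # Run on the host carrying osd.0; the values in the comments are the
    # assumed upstream defaults, not measurements from this cluster.
    ceph daemon osd.0 config get osd_heartbeat_interval     # how often OSDs ping their peers (~6s)
    ceph daemon osd.0 config get osd_heartbeat_grace        # missed-ping window before peers report it down (~20s)
    ceph daemon osd.0 config get mon_osd_down_out_interval  # delay before a down OSD is marked out and rebalancing starts (~600s)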

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread h...@portsip.cn
Hi Nathan, We built a Ceph cluster with 3 nodes: node-3: osd-2, mon-b; node-4: osd-0, mon-a, mds-myfs-a, mgr; node-5: osd-1, mon-c, mds-myfs-b. The cluster was created by Rook. Observed behaviour: after one node goes down abnormally (e.g. a direct poweroff), trying to mount the CephFS volume takes more than 40
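
A hedged sketch of how the MDS failover that gates this mount delay could be observed, assuming the filesystem is named myfs (inferred from the mds-myfs-a/-b daemon names) and that the daemons' admin sockets are reachable, e.g. by exec-ing into the Rook pods:

    ceph fs status myfs                                  # which MDS is active, which is standby
    ceph daemon mds.myfs-a config get mds_beacon_grace   # assumed default ~15s before an unresponsive MDS is replaced
    ceph -w                                              # stream cluster events to time the actual failover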

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread jesper
Hi Nathan, Is that true? The time it takes to reallocate the primary PG delivers “downtime” by design, right? Seen from a writing client's perspective. Jesper Sent from myMail for iOS Friday, 29 November 2019, 06.24 +0100 from pen...@portsip.com: > Hi Nathan, > > Thanks for the help.

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Peng Bo
Hi Nathan, Thanks for the help. My colleague will provide more details. BR On Fri, Nov 29, 2019 at 12:57 PM Nathan Fish wrote: > If correctly configured, your cluster should have zero downtime from a > single OSD or node failure. What is your crush map? Are you using > replica or EC? If your

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Nathan Fish
If correctly configured, your cluster should have zero downtime from a single OSD or node failure. What is your crush map? Are you using replica or EC? If your 'min_size' is not smaller than 'size', then you will lose availability. On Thu, Nov 28, 2019 at 10:50 PM Peng Bo wrote: > > Hi all, > >
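
For readers following along, a short sketch of how those questions can be answered on the cluster itself (the pool name is a placeholder):

    ceph osd crush tree                # CRUSH hierarchy: hosts, OSDs, device classes
    ceph osd crush rule dump           # the CRUSH rules the pools use
    ceph osd pool ls detail            # per-pool replicated/EC type plus size and min_size
    ceph osd pool get <pool> min_size  # <pool> stands in for one of your pool names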

[ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Peng Bo
Hi all, We are working on using Ceph to build our HA system; the goal is that the system keeps providing service even when a Ceph node is down or an OSD is lost. Currently, in our tests, once a node/OSD goes down the Ceph cluster needs about 40 seconds to sync data, and our system can't
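
As an aside not in the original post, a hedged way to see where that ~40-second window goes while a node is down:

    ceph status          # overall health and degraded/peering PG counts
    ceph health detail   # which PGs are degraded or undersized, and why
    ceph pg stat         # quick summary of PG states while recovery runs
    ceph osd tree        # which OSDs the monitors currently consider up or down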

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread Paul Emmerich
Can confirm that disabling power saving helps. I've also seen latency improvements with sysctl -w net.ipv4.tcp_low_latency=1. Another thing that sometimes helps is disabling the write cache of your SSDs (hdparm -W 0); it depends on the disk model, though. Paul -- Paul Emmerich Looking for help
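
For readers skimming the archive, the three knobs mentioned above as a sketch (the device name is a placeholder, the cpupower invocation is just one common way to pin the governor, and whether each change helps depends on the hardware):

    cpupower frequency-set -g performance   # disable CPU power saving via the performance governor
    sysctl -w net.ipv4.tcp_low_latency=1    # trade some throughput for lower TCP latency
    hdparm -W 0 /dev/sdX                    # disable the SSD's volatile write cache; model-dependent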

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread David Majchrzak, ODERLAND Webbhotell AB
Paul, Absolutely. I said I was looking at those settings, and most didn't make any sense to me in a production environment (we've been running Ceph since Dumpling). However, we only have one cluster on BlueStore, and I wanted to get some opinions on whether anything other than the defaults in ceph.conf or

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread Paul Emmerich
Please don't run this config in production. Disabling checksumming is a bad idea, and disabling authentication is also pretty bad. There are also a few options in there that no longer exist (osd op threads) or are no longer relevant (max open files). In general, you should not blindly copy config
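
A hedged way to check whether an option copied from an old blog post still exists in Nautilus before putting it in ceph.conf:

    ceph config help osd_op_threads   # errors out, since this option was removed
    ceph config ls | grep -i osd_op   # list the option names that do exist in this release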

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread Wido den Hollander
On 11/28/19 12:56 PM, David Majchrzak, ODERLAND Webbhotell AB wrote: > Hi! > > We've deployed a new flash only ceph cluster running Nautilus and I'm > currently looking at any tunables we should set to get the most out of > our NVMe SSDs. > > I've been looking a bit at the options from the

[ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread David Majchrzak, ODERLAND Webbhotell AB
Hi! We've deployed a new flash-only Ceph cluster running Nautilus and I'm currently looking at any tunables we should set to get the most out of our NVMe SSDs. I've been looking a bit at the options from the blog post here:
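
Not part of the original post, but as a hedged baseline before touching any tunables, one might check what a BlueStore OSD is already doing (osd.0 is a placeholder daemon name):

    ceph daemon osd.0 config diff                    # which options already deviate from the defaults
    ceph daemon osd.0 config get osd_memory_target   # BlueStore cache autotuning target (assumed default ~4 GiB)
    ceph daemon osd.0 perf dump bluestore            # BlueStore latency and throughput counters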