[ceph-users] Monitor handle_auth_bad_method

2020-01-17 Thread Justin Engwer
Hi, I'm a home user of Ceph. Most of the time I can look at the mailing lists and articles and figure things out on my own. Unfortunately, I've run into an issue I can't troubleshoot myself. Starting one of my monitors yields this error: 2020-01-17 15:34:13.497 7fca3d006040 0 mon.kvm2@-1(probing)
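handle_auth_bad_method at mon startup usually points to an authentication mismatch between the starting monitor and the rest of the quorum (e.g. cephx vs. none, or a stale monitor keyring). A minimal first-pass check, assuming a standard cephx deployment; the mon name kvm2 is taken from the log line above, and paths may differ on your cluster:

    # Confirm which auth methods the cluster expects
    ceph config get mon auth_cluster_required    # typically "cephx"
    grep auth /etc/ceph/ceph.conf

    # The mon. key must be identical on every monitor; compare the
    # failing mon's keyring with what the cluster has on record
    cat /var/lib/ceph/mon/ceph-kvm2/keyring
    ceph auth get mon.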

Re: [ceph-users] Slow Performance - Sequential IO

2020-01-17 Thread Anthony Brandelli (abrandel)
Not been able to make any headway on this after some significant effort.
- Tested all 48 SSDs with fio directly; all tested within 10% of each other for 4k IOPS in rand|seq read|write.
- Disabled all CPU power saving.
- Tested with both rbd cache enabled and disabled on the client.
- Tested with drive
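For reference, the per-device fio baseline described above would look roughly like this; a sketch only, assuming a spare device (/dev/sdX is a placeholder, and the run is destructive to whatever is on it):

    # 4k random-write IOPS directly against one SSD (destroys data!)
    fio --name=randwrite --filename=/dev/sdX --direct=1 \
        --ioengine=libaio --bs=4k --rw=randwrite \
        --iodepth=32 --numjobs=1 --runtime=60 --time_based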

Re: [ceph-users] Slow Performance - Sequential IO

2020-01-17 Thread Christian Balzer
Hello, I had very odd results in the past with the fio rbd engine and would suggest testing things in the environment you're going to deploy in, end to end. That said, without any caching and coalescing of writes, sequential 4k writes will hit the same set of OSDs for 4MB worth of data, thus
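For comparison, a minimal fio job using the rbd engine looks roughly like this; the pool and image names are placeholders and the image must already exist:

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd            ; placeholder pool
    rbdname=fio-test    ; placeholder image
    direct=1
    bs=4k
    runtime=60
    time_based=1

    [seq-write]
    rw=write
    iodepth=32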

Re: [ceph-users] Default Pools

2020-01-17 Thread Daniele Riccucci
Hello, I'm still a bit confused by the .rgw.root and the default.rgw.{control,meta,log} pools. I recently removed the RGW daemon I had running and the aforementioned pools, however after a rebalance I suddenly find them again in the output of: $ ceph osd pool ls cephfs_data cephfs_metadata
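Those pools are recreated automatically as soon as anything initializes the RGW layer (a radosgw process, or a mgr module probing it), which may explain their reappearance. A hedged way to confirm they are empty and remove them again:

    # Check whether the recreated pools actually hold any data
    rados df | grep rgw

    # Pool deletion must be explicitly enabled on the mons first
    ceph config set mon mon_allow_pool_delete true
    ceph osd pool delete .rgw.root .rgw.root --yes-i-really-really-mean-it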

[ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-17 Thread Janek Bevendorff
Hi, We have a CephFS in our cluster with 3 MDS daemons to which > 300 clients connect at any given time. The FS contains about 80 TB of data and many millions of files, so it is important that metadata operations work smoothly even when listing large directories. Previously, we had massive stability
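When an MDS stalls like this, the usual first step is to look at what it is actually stuck on. A sketch, assuming admin-socket access on the active MDS host (mds.NAME is a placeholder for the daemon name):

    ceph fs status
    ceph health detail

    # On the active MDS host:
    ceph daemon mds.NAME dump_ops_in_flight   # requests stuck in the MDS
    ceph daemon mds.NAME objecter_requests    # requests stuck on the OSDs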

Re: [ceph-users] Beginner questions

2020-01-17 Thread Frank Schilder
I would strongly advise against 2+1 EC pools for production if stability is your main concern. There was a discussion towards the end of last year addressing this in more detail. Short story, if you don't have at least 8-10 nodes (in the short run), EC is not suitable. You cannot maintain a
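For context, this is what a 2+1 profile looks like and why it stops accepting writes with a single host down; the profile and pool names below are placeholders:

    ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=host
    ceph osd pool create ecpool 64 64 erasure ec21

    # EC pools default to min_size = k+1 = 3, so losing one of
    # three hosts leaves only k shards and I/O blocks
    ceph osd pool get ecpool min_size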

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-17 Thread Igor Fedotov
hmmm.. Just in case - I suggest checking for H/W errors with dmesg. There is also some (though not much) chance that this is another incarnation of the following bug: https://tracker.ceph.com/issues/22464 https://github.com/ceph/ceph/pull/24649 The corresponding PR works around it for main
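A quick sketch of that check (the device name is a placeholder):

    # Scan the kernel log for media/controller errors
    dmesg -T | egrep -i 'error|fail|ata|nvme|i/o'

    # Per-drive health, if smartmontools is installed
    smartctl -a /dev/sdX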

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-17 Thread Stefan Priebe - Profihost AG
Hi Igor, On 17.01.20 at 12:10, Igor Fedotov wrote: > hmmm.. > > Just in case - I suggest checking for H/W errors with dmesg. This happens on around 80 nodes - I don't expect all of those to have unidentified HW errors. Also, all of them are monitored - no dmesg output contains any errors. > Also

Re: [ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-17 Thread Yan, Zheng
On Fri, Jan 17, 2020 at 4:47 PM Janek Bevendorff wrote: > > Hi, > > We have a CephFS in our cluster with 3 MDS daemons to which > 300 clients > connect at any given time. The FS contains about 80 TB of data and many > millions of files, so it is important that metadata operations work > smoothly even when

Re: [ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-17 Thread Janek Bevendorff
Thanks. I will do that. Right now we are seeing quite a bit of lag when listing folders, which is probably due to another client heavily using the system. Unfortunately, it's rather hard to debug at the moment, since the suspected client has to use our Ganesha bridge instead of connecting to the Ceph
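One hedged way to spot a client hammering the MDS is the per-session list, which shows each client's held caps and metadata; mds.NAME is a placeholder. Note that a Ganesha gateway appears as a single session, so per-user attribution behind it is lost:

    ceph daemon mds.NAME session ls

    # Or across all MDS ranks:
    ceph tell mds.* session ls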

Re: [ceph-users] Weird mount issue (Ubuntu 18.04, Ceph 14.2.5 & 14.2.6)

2020-01-17 Thread Jeff Layton
Actually, scratch that. I went ahead and opened this: https://tracker.ceph.com/issues/43649 Feel free to watch that one for updates. On Fri, 2020-01-17 at 07:43 -0500, Jeff Layton wrote: > No problem. Can you let me know the tracker bug number once you've > opened it? > > Thanks, > Jeff > >

Re: [ceph-users] Weird mount issue (Ubuntu 18.04, Ceph 14.2.5 & 14.2.6)

2020-01-17 Thread Jeff Layton
On Fri, 2020-01-17 at 17:10 +0100, Ilya Dryomov wrote: > On Fri, Jan 17, 2020 at 2:21 AM Aaron wrote: > > No worries, can definitely do that. > > > > Cheers > > Aaron > > > > On Thu, Jan 16, 2020 at 8:08 PM Jeff Layton wrote: > > > On Thu, 2020-01-16 at 18:42 -0500, Jeff Layton wrote: > > > >

Re: [ceph-users] Beginner questions

2020-01-17 Thread Dave Hall
Frank, Thank you for your input. It is good to know that the cluster will go read-only if a node goes down. Our circumstance is probably a bit unusual, which is why I'm considering the 2+1 solution. We have a researcher who will be collecting extremely large amounts of data in real
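A small operational note on the node-down case: for planned maintenance the usual approach is to stop rebalancing while the node is briefly out, though writes to a 2+1 pool will still pause until it returns. A sketch:

    # Before taking the node down
    ceph osd set noout

    # ... perform maintenance, bring the OSDs back up ...

    ceph osd unset noout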

Re: [ceph-users] Weird mount issue (Ubuntu 18.04, Ceph 14.2.5 & 14.2.6)

2020-01-17 Thread Ilya Dryomov
On Fri, Jan 17, 2020 at 2:21 AM Aaron wrote: > > No worries, can definitely do that. > > Cheers > Aaron > > On Thu, Jan 16, 2020 at 8:08 PM Jeff Layton wrote: >> >> On Thu, 2020-01-16 at 18:42 -0500, Jeff Layton wrote: >> > On Wed, 2020-01-15 at 08:05 -0500, Aaron wrote: >> > > Seeing a weird