Re: [ceph-users] dashboard returns 401 on successful auth

2019-06-06 Thread Nathan Fish
I have filed this bug: https://tracker.ceph.com/issues/40051 On Thu, Jun 6, 2019 at 12:52 PM Drew Weaver wrote: > > Hello, > > > > I was able to get Nautilus running on my cluster. > > > > When I try to log in to the dashboard with the user I created, if I enter the > correct credentials, in the log I

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Nathan Fish
I have run into a similar hang on 'ls .snap' recently: https://tracker.ceph.com/issues/40101#note-2 On Wed, Jun 12, 2019 at 9:33 AM Yan, Zheng wrote: > > On Wed, Jun 12, 2019 at 3:26 PM Hector Martin wrote: > > > > Hi list, > > > > I have a setup where two clients mount the same filesystem and

[ceph-users] Degraded pgs during async randwrites

2019-05-06 Thread Nathan Fish
Hello all, I'm testing out a new cluster that we hope to put into production soon. Performance has overall been great, but there's one benchmark that not only stresses the cluster but causes PGs to go degraded: async randwrites. The benchmark: # The file was previously laid out with dd'd random data
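
The benchmark command itself is truncated in the archive; a sketch of the kind of async random-write job being described, with the filename, size, and queue depth as assumptions rather than the original values, would be:

    # assumed reconstruction, not the original command: 4k async random
    # writes via libaio against a file pre-filled with dd'd random data
    fio --name=randwrite --filename=/mnt/cephfs/testfile --size=10G \
        --rw=randwrite --bs=4k --ioengine=libaio --iodepth=64 \
        --direct=1 --numjobs=4 --time_based --runtime=300 --group_reporting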

Re: [ceph-users] What does the differences in osd benchmarks mean?

2019-06-27 Thread Nathan Fish
Are these dual-socket machines? Perhaps NUMA is involved? On Thu., Jun. 27, 2019, 4:56 a.m. Lars Täuber wrote: > Hi! > > In our cluster I ran some benchmarks. > The results are always similar but strange to me. > I don't know what the results mean. > The cluster consists of 7 (nearly)

Re: [ceph-users] shutdown down all monitors

2019-07-11 Thread Nathan Fish
The monitors determine quorum, so stopping all monitors will immediately stop IO to prevent split-brain. I would not recommend shutting down all mons at once in production, though it *should* come back up fine. If you really need to, shut them down in a certain order, and bring them back up in the

Re: [ceph-users] Pool stats issue with upgrades to nautilus

2019-07-12 Thread Nathan Fish
Excellent! I have been checking the tracker (https://tracker.ceph.com/versions/574) every day, and there hadn't been any movement for weeks. On Fri, Jul 12, 2019 at 11:29 AM Sage Weil wrote: > > On Fri, 12 Jul 2019, Nathan Fish wrote: > > Thanks. Speaking of 14.2.2, is ther

Re: [ceph-users] Pool stats issue with upgrades to nautilus

2019-07-12 Thread Nathan Fish
Thanks. Speaking of 14.2.2, is there a timeline for it? We really want some of the fixes in it as soon as possible. On Fri, Jul 12, 2019 at 11:22 AM Sage Weil wrote: > > Hi everyone, > > All current Nautilus releases have an issue where deploying a single new > (Nautilus) BlueStore OSD on an

Re: [ceph-users] increase pg_num error

2019-07-01 Thread Nathan Fish
I ran into this recently. Try running "ceph osd require-osd-release nautilus". This drops backwards compatibility with pre-Nautilus OSDs and allows changing settings, such as pg_num, that depend on Nautilus-only behaviour. On Mon, Jul 1, 2019 at 4:24 AM Sylvain PORTIER wrote: > > Hi all, > > I am using ceph 14.2.1 (Nautilus) > > I am unable to increase the
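
A minimal sketch of the fix, assuming a pool named "mypool" and a target of 256 PGs:

    ceph osd require-osd-release nautilus
    ceph osd pool set mypool pg_num 256    # should now take effect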

Re: [ceph-users] What's the best practice for Erasure Coding

2019-07-08 Thread Nathan Fish
This is very interesting, thank you. I'm curious, what is the reason for avoiding k's with large prime factors? If I set k=5, what happens? On Mon, Jul 8, 2019 at 8:56 AM Lei Liu wrote: > > Hi Frank, > > Thanks for sharing valuable experience. > > Frank Schilder wrote on Mon, Jul 8, 2019 at 4:36 PM: >> >> Hi

Re: [ceph-users] bluestore write iops calculation

2019-08-02 Thread Nathan Fish
Any EC pool with m=1 is fragile. By default, min_size = k+1, so you'd immediately stop IO the moment you lose a single OSD. min_size can be lowered to k, but that can cause data loss and corruption. You should set m=2 at a minimum. 4+2 doesn't take much more space than 4+1, and it's far safer. On
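
A sketch of the recommended 4+2 layout; the profile and pool names here are assumptions:

    ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
    ceph osd pool create ecpool 128 128 erasure ec-4-2
    ceph osd pool get ecpool min_size    # defaults to k+1 = 5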

Re: [ceph-users] cephfs quota setfattr permission denied

2019-07-31 Thread Nathan Fish
The client key which is used to mount the FS needs the 'p' permission to set xattrs. eg: ceph fs authorize cephfs client.foo / rwsp That might be your problem. On Wed, Jul 31, 2019 at 5:43 AM Mattia Belluco wrote: > > Dear ceph users, > > We have been recently trying to use the two quota
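
A sketch of the whole sequence, with the client name, mount point, and quota values as assumptions:

    # grant the 'p' flag so the client may set quota/layout xattrs
    ceph fs authorize cephfs client.foo / rwsp
    # then, from a mount made with that key:
    setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/project   # 100 GiB
    setfattr -n ceph.quota.max_files -v 100000 /mnt/cephfs/project
    getfattr -n ceph.quota.max_bytes /mnt/cephfs/project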

Re: [ceph-users] Nautilus 14.2.1 / 14.2.2 crash

2019-07-19 Thread Nathan Fish
Nigel Williams wrote: > > > On Sat, 20 Jul 2019 at 04:28, Nathan Fish wrote: >> >> On further investigation, it seems to be this bug: >> http://tracker.ceph.com/issues/38724 > > > We just upgraded to 14.2.2, and had a dozen OSDs at 14.2.2 go down with this bug, &

[ceph-users] Nautilus 14.2.1 / 14.2.2 crash

2019-07-19 Thread Nathan Fish
I came in this morning and started to upgrade to 14.2.2, only to notice that 3 OSDs had crashed overnight - exactly 1 on each of 3 hosts. Apparently there was no data loss, which implies they crashed at different times, far enough apart to rebuild? Still digging through logs to find exactly when

Re: [ceph-users] Nautilus 14.2.1 / 14.2.2 crash

2019-07-19 Thread Nathan Fish
On further investigation, it seems to be this bug: http://tracker.ceph.com/issues/38724 On Fri, Jul 19, 2019 at 1:38 PM Nathan Fish wrote: > > I came in this morning and started to upgrade to 14.2.2, only to > notice that 3 OSDs had crashed overnight - exactly 1 on each of 3 > hosts

Re: [ceph-users] MDS fails repeatedly while handling many concurrent meta data operations

2019-07-23 Thread Nathan Fish
configured? Are they HDDs? Do you have WAL and/or DB devices on SSDs? Is the metadata pool on SSDs? On Tue, Jul 23, 2019 at 4:06 PM Janek Bevendorff wrote: > > Thanks for your reply. > > On 23/07/2019 21:03, Nathan Fish wrote: > > What Ceph version? Do the clients match? What CPUs

Re: [ceph-users] MDS fails repeatedly while handling many concurrent meta data operations

2019-07-23 Thread Nathan Fish
What Ceph version? Do the clients match? What CPUs do the MDS servers have, and how is their CPU usage when this occurs? While migrating to a Nautilus cluster recently, we had up to 14 million inodes open, and we increased the cache limit to 16GiB. Other than warnings about oversized cache, this

Re: [ceph-users] Nautilus 14.2.1 / 14.2.2 crash

2019-07-23 Thread Nathan Fish
ll had some crash? > > ,Thanks > > On Sat, 20 Jul 2019 10:09:08 +0800 Nigel Williams > wrote > > > On Sat, 20 Jul 2019 at 04:28, Nathan Fish wrote: > > On further investigation, it seems to be this bug: > http://tracker.ceph.com/issues/38724 > >

Re: [ceph-users] Ceph durability during outages

2019-07-24 Thread Nathan Fish
ous > for data integrity to recover at less than min_size? > > Jean-Philippe Méthot > Openstack system administrator > PlanetHoster inc. > > > > > On Jul 24, 2019, at 13:49, Nathan Fish wrote: > > 2/3 monitors is enough to

Re: [ceph-users] Ceph durability during outages

2019-07-24 Thread Nathan Fish
2/3 monitors is enough to maintain quorum, as is any majority. However, EC pools have a default min_size of k+1 chunks. This can be adjusted to k, but that has its own dangers. I assume you are using failure domain = "host"? As you had k=6, m=2, and lost 2 failure domains, you had k chunks left,
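
A sketch of inspecting and, if absolutely necessary, temporarily lowering min_size on the EC pool (pool name assumed; as noted above, running at min_size=k removes the last safety margin):

    ceph osd pool get ecpool min_size      # k+1 = 7 for a 6+2 profile
    ceph osd pool set ecpool min_size 6    # risky: allows IO with zero spare chunks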

Re: [ceph-users] Nautilus: significant increase in cephfs metadata pool usage

2019-07-25 Thread Nathan Fish
I have seen significant increases (1GB -> 8GB) proportional to number of inodes open, just like the MDS cache grows. These went away once the stat-heavy workloads (multiple parallel rsyncs) stopped. I disabled autoscale warnings on the metadata pools due to this fluctuation. On Thu, Jul 25, 2019
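
Silencing the autoscaler on the metadata pool, as described, is a per-pool setting; the pool name here is an assumption:

    ceph osd pool autoscale-status
    ceph osd pool set cephfs_metadata pg_autoscale_mode off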

Re: [ceph-users] MDS / CephFS behaviour with unusual directory layout

2019-07-26 Thread Nathan Fish
open, the MDS cache goes to about 40GiB but it remains stable. MDS CPU usage goes to about 400% (4 cores worth, spread across 6-8 processes). Hope you find this useful. On Fri, Jul 26, 2019 at 11:05 AM Stefan Kooman wrote: > > Quoting Nathan Fish (lordci...@gmail.com): > > MDS CPU load

Re: [ceph-users] MDS / CephFS behaviour with unusual directory layout

2019-07-26 Thread Nathan Fish
Yes, definitely enable standby-replay. I saw sub-second failovers with standby-replay, but when I restarted the new rank 0 (previously 0-s) while the standby was syncing up to become 0-s, the failover took several minutes. This was with ~30GiB of cache. On Fri, Jul 26, 2019 at 12:41 PM Burkhard
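
On Nautilus, standby-replay is enabled per filesystem (filesystem name assumed):

    ceph fs set cephfs allow_standby_replay true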

Re: [ceph-users] MDS / CephFS behaviour with unusual directory layout

2019-07-26 Thread Nathan Fish
MDS CPU load is proportional to metadata ops/second. MDS RAM cache is proportional to # of files (including directories) in the working set. Metadata pool size is proportional to total # of files, plus everything in the RAM cache. I have seen that the metadata pool can balloon 8x between being

Re: [ceph-users] Bluestore runs out of space and dies

2019-10-31 Thread Nathan Fish
ortunate that we need to give up the whole per cent (1 % is too much for 4Tb drives). > > On 31/10/2019 15:04, Nathan Fish wrote: > > The best way to prevent this on a testing cluster with tiny virtual > > drives is probably to lower the various full_ratios significantly.

Re: [ceph-users] Full FLash NVME Cluster recommendation

2019-11-15 Thread Nathan Fish
> numbers? I’ve also read a claim that each BlueStore will use 3-4 cores >, > They’re listening to me though about splitting the card into multiple OSDs. > > > On Nov 15, 2019, at 7:38 AM, Nathan Fish wrote: > > > > In order to get optimal performance out of NVMe, you

Re: [ceph-users] Scaling out

2019-11-21 Thread Nathan Fish
The default crush rule uses "host" as the failure domain, so in order to deploy on one host you will need to make a crush rule that specifies "osd". Then simply adding more hosts with osds will result in automatic rebalancing. Once you have enough hosts to satisfy the crush rule (3 for replicated
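
A sketch of such a rule and a pool that uses it; the rule and pool names are assumptions:

    # replicate across OSDs rather than hosts
    ceph osd crush rule create-replicated single-host default osd
    ceph osd pool create mypool 64 64 replicated single-host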

Re: [ceph-users] Replace bad db for bluestore

2019-11-21 Thread Nathan Fish
You should design your cluster and crush rules such that a failure of a single OSD is not a problem. Preferably such that losing any 1 host isn't a problem either. On Thu, Nov 21, 2019 at 6:32 AM zhanrzh...@teamsun.com.cn wrote: > > Hi,all > Suppose the db of bluestore can't read/write,are

Re: [ceph-users] Replace bad db for bluestore

2019-11-21 Thread Nathan Fish
A power outage shouldn't corrupt your db unless you are doing dangerous async writes. And sharing an SSD for several OSDs on the same host is normal, but not an issue given that you have planned for the failure of hosts. On Thu, Nov 21, 2019 at 9:57 AM 展荣臻(信泰) wrote: > > > > In general db is

Re: [ceph-users] Separate disk sets for high IO?

2019-12-16 Thread Nathan Fish
ound any so far? > > > - Original Message ----- > From: "Nathan Fish" > To: "Marc Roos" > Cc: "ceph-users" , "Philip Brown" > > Sent: Monday, December 16, 2019 2:07:44 PM > Subject: Re: [ceph-users] Separate disk sets for high IO? &

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Nathan Fish
MDS cache size scales with the number of files recently opened by clients. If you have RAM to spare, increase "mds cache memory limit". I have raised mine from the default of 1GiB to 32GiB. My rough estimate is 2.5kiB per inode in recent use. On Thu, Dec 5, 2019 at 10:39 AM Ranjan Ghosh wrote:
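
A sketch of the same change via the config database (32 GiB shown; size it to the RAM you can actually spare on the MDS host):

    ceph config set mds mds_cache_memory_limit 34359738368   # 32 GiB, in bytes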

Re: [ceph-users] Separate disk sets for high IO?

2019-12-16 Thread Nathan Fish
Indeed, you can set device class to pretty much arbitrary strings and specify them. By default, 'hdd', 'ssd', and I think 'nvme' are autodetected - though my Optanes showed up as 'ssd'. On Mon, Dec 16, 2019 at 4:58 PM Marc Roos wrote: > > > > You can classify osd's, eg as ssd. And you can assign
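
A sketch of overriding a device class and building a rule on it; the OSD id, class, and rule name are assumptions:

    ceph osd crush rm-device-class osd.12
    ceph osd crush set-device-class nvme osd.12
    ceph osd crush rule create-replicated fast-rule default host nvme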

Re: [ceph-users] Large OMAP Object

2019-11-20 Thread Nathan Fish
It's a warning, not an error, and if you consider it to not be a problem, I believe you can change osd_deep_scrub_large_omap_object_value_sum_threshold back to 2M. On Wed, Nov 20, 2019 at 11:37 AM wrote: > > All; > > Since I haven't heard otherwise, I have to assume that the only way to get >
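
A sketch of raising the threshold named above. Note that the related key-count threshold, osd_deep_scrub_large_omap_object_key_threshold (whose default was reduced from 2M to 200k keys, if I recall correctly), may be the one actually firing the warning; the values below are examples only:

    ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 2000000
    ceph config set osd osd_deep_scrub_large_omap_object_value_sum_threshold 2147483648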

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Nathan Fish
If correctly configured, your cluster should have zero downtime from a single OSD or node failure. What is your crush map? Are you using replica or EC? If your 'min_size' is not smaller than 'size', then you will lose availability. On Thu, Nov 28, 2019 at 10:50 PM Peng Bo wrote: > > Hi all, > >

Re: [ceph-users] Bluestore runs out of space and dies

2019-10-31 Thread Nathan Fish
The best way to prevent this on a testing cluster with tiny virtual drives is probably to lower the various full_ratios significantly. On Thu, Oct 31, 2019 at 7:17 AM Paul Emmerich wrote: > > BlueStore doesn't handle running out of space gracefully because that > doesn't happen on a real disk
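
For a throwaway test cluster, the ratios can be lowered at runtime so OSDs refuse writes well before BlueStore is truly out of space. The values below are examples only; do not do this in production:

    ceph osd set-nearfull-ratio 0.70
    ceph osd set-backfillfull-ratio 0.75
    ceph osd set-full-ratio 0.80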

Re: [ceph-users] NFS

2019-10-03 Thread Nathan Fish
We have tried running nfs-ganesha (2.7 - 2.8.1) with FSAL_CEPH backed by a Nautilus CephFS. Performance when doing metadata operations (i.e. anything with small files) is very slow. On Thu, Oct 3, 2019 at 10:34 AM Marc Roos wrote: > > > How should a multi tenant RGW config look like, I am not able

Re: [ceph-users] sharing single SSD across multiple HD based OSDs

2019-12-09 Thread Nathan Fish
You can loop over creating fixed-size LVs on the SSD, then loop over creating OSDs assigned to each of them. That is what we did; it wasn't bad. On Mon, Dec 9, 2019 at 9:32 PM Philip Brown wrote: > > I have a bunch of hard drives I want to use as OSDs, with ceph nautilus. > >
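
A sketch of that loop, with the VG name, DB size, and device names as assumptions:

    #!/bin/bash
    # one fixed-size DB LV per HDD on a shared SSD volume group
    hdds=(sdb sdc sdd sde)
    for i in "${!hdds[@]}"; do
        lvcreate -L 60G -n db-"$i" ssd-vg
        ceph-volume lvm create --data /dev/"${hdds[$i]}" --block.db ssd-vg/db-"$i"
    done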

Re: [ceph-users] Problem : "1 pools have many more objects per pg than average"

2020-01-22 Thread Nathan Fish
Injectargs causes an immediate runtime change; rebooting the mon would negate the change. On Wed., Jan. 22, 2020, 4:41 p.m. St-Germain, Sylvain (SSC/SPC), < sylvain.st-germ...@canada.ca> wrote: > / Problem /// > > I've got a Warning on my cluster
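
The distinction in commands, using the option tied to this particular warning, mon_pg_warn_max_object_skew, with an example value (how it is targeted, mon vs. mgr, varies by release, so treat this as a pattern rather than a recipe):

    # runtime only: lost when the daemon restarts
    ceph tell mon.* injectargs '--mon_pg_warn_max_object_skew 20'
    # persistent: written to the cluster's config database
    ceph config set global mon_pg_warn_max_object_skew 20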

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-26 Thread Nathan Fish
I would start by viewing "ceph status", drive IO with: "iostat -x 1 /dev/sd{a..z}" and the CPU/RAM usage of the active MDS. If "ceph status" warns that the MDS cache is oversized, that may be an easy fix. On Thu, Dec 26, 2019 at 7:33 AM renjianxinlover wrote: > hello, >recently, after