Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Ranjan Ghosh
Ah, I understand now. Makes a lot of sense. Well, we have a LOT of small files, so that might be the reason. I'll keep an eye on whether the message shows up again. Thank you! Ranjan On 05.12.19 at 19:40, Patrick Donnelly wrote: > On Thu, Dec 5, 2019 at 9:45 AM Ranjan Ghosh wrote: >> Ah,

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Philip Brown
Hmm... I reread the docs in and around https://docs.ceph.com/docs/master/rbd/iscsi-targets/ and it mentions iSCSI multipathing through multiple Ceph storage gateways... but it doesn't seem to say anything about needing multiple POOLS. When you wrote, "1 pool per storage class

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Ranjan Ghosh
Hi, Ah, that seems to have fixed it. Hope it stays that way. I've raised it to 4 GB. Thanks to you both! Although I have to say that the message is IMHO *very* misleading: "1 MDSs report oversized cache" sounds to me like the cache is too large (i.e. wasting RAM unnecessarily). Shouldn't the

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Paul Emmerich
ceph-iscsi doesn't support round-robin multi-pathing; so you need at least one LUN per gateway to utilize all of them. Please see https://docs.ceph.com for basics about RBDs and pools. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Philip Brown
Interesting. I thought when you defined a pool, and then defined an RBD within that pool.. that any auto-replication stayed within that pool? So what kind of "load balancing" do you mean? I'm confused. - Original Message - From: "Paul Emmerich" To: "Philip Brown" Cc: "ceph-users"

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Nathan Fish
MDS cache size scales with the number of files recently opened by clients. If you have RAM to spare, increase "mds cache memory limit". I have raised mine from the default of 1GiB to 32GiB. My rough estimate is 2.5kiB per inode in recent use. On Thu, Dec 5, 2019 at 10:39 AM Ranjan Ghosh wrote:
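A minimal sketch of how that limit is usually inspected and raised on Nautilus; the MDS name and the 4 GiB value are placeholders, adjust to the RAM you can actually spare:

  # Current cache usage and the configured limit (substitute your MDS name)
  ceph daemon mds.<name> cache status
  ceph config get mds mds_cache_memory_limit

  # Raise the limit cluster-wide; the value is in bytes (here 4 GiB)
  ceph config set mds mds_cache_memory_limit 4294967296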

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Eugen Block
Hi, can you provide more details? ceph daemon mds.<name> cache status ceph config show mds.<name> | grep mds_cache_memory_limit Regards, Eugen Quoting Ranjan Ghosh: Okay, now, after I settled the issue with the oneshot service thanks to the amazing help of Paul and Richard (thanks again!), I still

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Ranjan Ghosh
Okay, now, after I settled the issue with the oneshot service thanks to the amazing help of Paul and Richard (thanks again!), I still wonder: What could I do about that MDS warning: === health: HEALTH_WARN 1 MDSs report oversized cache === If anybody has any ideas? I tried googling it, of

Re: [ceph-users] What does the ceph-volume@simple-crazyhexstuff SystemD service do? And what to do about oversized MDS cache?

2019-12-05 Thread Ranjan Ghosh
Hi Richard, Ah, I think I understand now, brilliant. It's *supposed* to do exactly that: mount it once on boot and then just exit. So everything is working as intended. Great. Thanks Ranjan On 05.12.19 at 15:18, Richard wrote: > On 2019-12-05 7:19 AM, Ranjan Ghosh wrote: >> Why is my

Re: [ceph-users] Shall host weight auto reduce on hdd failure?

2019-12-05 Thread Milan Kupcevic
On 2019-12-05 02:33, Janne Johansson wrote: > On Thu 5 Dec 2019 at 00:28, Milan Kupcevic > mailto:milan_kupce...@harvard.edu>> wrote: > > > > There is plenty of space to take more than a few failed nodes. But the > question was about what is going on inside a node with a few failed >

Re: [ceph-users] What does the ceph-volume@simple-crazyhexstuff SystemD service do? And what to do about oversized MDS cache?

2019-12-05 Thread Ranjan Ghosh
Hi Paul, thanks for the explanation. I didn't know about the JSON file yet. That's certainly good to know. What I still don't understand, though: Why is my service marked inactive/dead? Shouldn't it be running? If I run: systemctl start

Re: [ceph-users] What does the ceph-volume@simple-crazyhexstuff SystemD service do? And what to do about oversized MDS cache?

2019-12-05 Thread Paul Emmerich
The ceph-volume services make sure that the right partitions are mounted at /var/lib/ceph/osd/ceph-X. In "simple" mode the service gets the necessary information from a JSON file (long-hex-string.json) in /etc/ceph. ceph-volume simple scan/activate create the JSON file and systemd unit. ceph-disk
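As a rough sketch of that workflow (the device path is an example, and on recent builds the JSON metadata typically lands under /etc/ceph/osd/):

  # Record an existing ceph-disk OSD partition as JSON metadata
  ceph-volume simple scan /dev/sdb1

  # Create and enable the systemd units that mount /var/lib/ceph/osd/ceph-X at boot
  ceph-volume simple activate --all

  # Inspect what was captured
  cat /etc/ceph/osd/*.json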

[ceph-users] What does the ceph-volume@simple-crazyhexstuff SystemD service do? And what to do about oversized MDS cache?

2019-12-05 Thread Ranjan Ghosh
Hi all, After upgrading to Ubuntu 19.10 and consequently from Mimic to Nautilus, I had a mini-shock when my OSDs didn't come up. Okay, I should have read the docs more closely, I had to do: # ceph-volume simple scan /dev/sdb1 # ceph-volume simple activate --all Hooray. The OSDs came back to

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-12-05 Thread Florian Haas
On 02/12/2019 16:48, Florian Haas wrote: > Doc patch PR is here, for anyone who would feels inclined to review: > > https://github.com/ceph/ceph/pull/31893 Landed, here's the new documentation: https://docs.ceph.com/docs/master/rbd/rbd-exclusive-locks/ Thanks everyone for chiming in, and

Re: [ceph-users] Is a scrub error (read_error) on a primary osd safe to repair?

2019-12-05 Thread Caspar Smit
Konstantin, Thanks for your answer, I will run a ceph pg repair. Could you maybe elaborate globally on how this repair process works? Does it just try to re-read the read_error osd? IIRC there was a time when a ceph pg repair wasn't considered 'safe' because it just copied the primary osd shard
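A hedged sketch of the usual repair workflow (the PG id is a placeholder; on BlueStore OSDs the repair chooses the authoritative copy using per-object checksums rather than blindly copying the primary shard, which is what the old warnings were about):

  # See which PGs are inconsistent and which shard reported the read_error
  ceph health detail
  rados list-inconsistent-obj <pgid> --format=json-pretty

  # Trigger the repair on that PG
  ceph pg repair <pgid>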

Re: [ceph-users] Shall host weight auto reduce on hdd failure?

2019-12-04 Thread Janne Johansson
On Thu 5 Dec 2019 at 00:28, Milan Kupcevic < milan_kupce...@harvard.edu> wrote: > > > There is plenty of space to take more than a few failed nodes. But the > question was about what is going on inside a node with a few failed > drives. Current Ceph behavior keeps increasing number of placement

Re: [ceph-users] Is a scrub error (read_error) on a primary osd safe to repair?

2019-12-04 Thread Konstantin Shalygin
I tried to dig in the mailinglist archives but couldn't find a clear answer to the following situation: Ceph encountered a scrub error resulting in HEALTH_ERR Two PG's are active+clean+inconsistent. When investigating the PG i see a "read_error" on the primary OSD. Both PG's are replicated PG's

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-04 Thread Yan, Zheng
On Thu, Dec 5, 2019 at 4:40 AM Stefan Kooman wrote: > > Quoting Stefan Kooman (ste...@bit.nl): > > and it crashed again (and again) ... until we stopped the mds and > > deleted the mds0_openfiles.0 from the metadata pool. > > > > Here is the (debug) output: > > > > A specific workload that

Re: [ceph-users] Shall host weight auto reduce on hdd failure?

2019-12-04 Thread Milan Kupcevic
On 2019-12-04 04:11, Janne Johansson wrote: > On Wed 4 Dec 2019 at 01:37, Milan Kupcevic > mailto:milan_kupce...@harvard.edu>> wrote: > > This cluster can handle this case at this moment as it has got plenty of > free space. I wonder how this is going to play out when we get to 90% of >

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-04 Thread Stefan Kooman
Quoting Stefan Kooman (ste...@bit.nl): > and it crashed again (and again) ... until we stopped the mds and > deleted the mds0_openfiles.0 from the metadata pool. > > Here is the (debug) output: > > A specific workload that *might* have triggered this: recursively deleting a > long > list

Re: [ceph-users] best pool usage for vmware backing

2019-12-04 Thread Paul Emmerich
1 pool per storage class (e.g., SSD and HDD), at least one RBD per gateway per pool for load balancing (failover-only load balancing policy). Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io
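A hypothetical layout following that advice, with made-up rule/pool/image names, two iSCSI gateways and example PG counts:

  # One CRUSH rule and one pool per device class
  ceph osd crush rule create-replicated rule-ssd default host ssd
  ceph osd crush rule create-replicated rule-hdd default host hdd
  ceph osd pool create vmware-ssd 128 128 replicated rule-ssd
  ceph osd pool create vmware-hdd 512 512 replicated rule-hdd
  ceph osd pool application enable vmware-ssd rbd
  ceph osd pool application enable vmware-hdd rbd

  # At least one RBD image (i.e. one LUN) per gateway per pool; repeat for the hdd pool
  rbd create vmware-ssd/gw1-lun0 --size 2T
  rbd create vmware-ssd/gw2-lun0 --size 2T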

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-04 Thread Stefan Kooman
Hi, Quoting Stefan Kooman (ste...@bit.nl): > > please apply following patch, thanks. > > > > diff --git a/src/mds/OpenFileTable.cc b/src/mds/OpenFileTable.cc > > index c0f72d581d..2ca737470d 100644 > > --- a/src/mds/OpenFileTable.cc > > +++ b/src/mds/OpenFileTable.cc > > @@ -470,7 +470,11 @@

[ceph-users] best pool usage for vmware backing

2019-12-04 Thread Philip Brown
Let's say that you had roughly 60 OSDs that you wanted to use to provide storage for VMware, through RBDs served over iSCSI. Target VM types are completely mixed: web front ends, app tier, a few databases, and the kitchen sink. Estimated number of VMs: 50-200. How would people recommend

Re: [ceph-users] v13.2.7 osds crash in build_incremental_map_msg

2019-12-04 Thread Neha Ojha
We'll get https://github.com/ceph/ceph/pull/32000 out in 13.2.8 as quickly as possible. Neha On Wed, Dec 4, 2019 at 6:56 AM Dan van der Ster wrote: > > My advice is to wait. > > We built a 13.2.7 + https://github.com/ceph/ceph/pull/26448 cherry > picked and the OSDs no longer crash. > > My vote

Re: [ceph-users] v13.2.7 osds crash in build_incremental_map_msg

2019-12-04 Thread Dan van der Ster
My advice is to wait. We built a 13.2.7 + https://github.com/ceph/ceph/pull/26448 cherry picked and the OSDs no longer crash. My vote would be for a quick 13.2.8. -- Dan On Wed, Dec 4, 2019 at 2:41 PM Frank Schilder wrote: > > Is this issue now a no-go for updating to 13.2.7 or are there only

Re: [ceph-users] v13.2.7 osds crash in build_incremental_map_msg

2019-12-04 Thread Frank Schilder
Is this issue now a no-go for updating to 13.2.7 or are there only some specific unsafe scenarios? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: ceph-users on behalf of Dan van der Ster Sent: 03 December

[ceph-users] Is a scrub error (read_error) on a primary osd safe to repair?

2019-12-04 Thread Caspar Smit
Hi all, I tried to dig in the mailinglist archives but couldn't find a clear answer to the following situation: Ceph encountered a scrub error resulting in HEALTH_ERR Two PG's are active+clean+inconsistent. When investigating the PG i see a "read_error" on the primary OSD. Both PG's are

Re: [ceph-users] RGW performance with low object sizes

2019-12-04 Thread Christian
> > >> There's a bug in the current stable Nautilus release that causes a loop > and/or crash in get_obj_data::flush (you should be able to see it gobbling > up CPU in perf top). This is the related issue: > https://tracker.ceph.com/issues/39660 -- it should be fixed as soon as > 14.2.5 is

Re: [ceph-users] Failed to encode map errors

2019-12-04 Thread John Hearns
The version is Nautilus. There is a small mismatch in some of the OSD version numbers, but this has been running for a long time and we have not seen this behaviour. It is also worth saying that I removed (ahem) then replaced the key for an osd yesterday. Thanks to Wido for suggesting the fix

Re: [ceph-users] SSDs behind Hardware Raid

2019-12-04 Thread Stolte, Felix
smime.p7m Description: S/MIME encrypted message

Re: [ceph-users] SSDs behind Hardware Raid

2019-12-04 Thread Janne Johansson
On Wed 4 Dec 2019 at 09:57, Marc Roos wrote: > > But I guess that in 'ceph osd tree' the ssd's were then also displayed > as hdd? > Probably, and the difference in performance would come from the different BlueStore cache defaults that hdd OSDs get versus ssd OSDs. -- May the most significant bit of
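Those defaults can be checked per OSD; a small sketch (osd.0 and the 3 GiB override are examples, and option names can vary slightly between releases):

  # What the OSD is actually running with, defaults included
  ceph config show-with-defaults osd.0 | grep bluestore_cache_size

  # Defaults differ by device class (hdd vs ssd); an explicit override is one workaround
  ceph config set osd.0 bluestore_cache_size 3221225472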

Re: [ceph-users] Shall host weight auto reduce on hdd failure?

2019-12-04 Thread Janne Johansson
On Wed 4 Dec 2019 at 01:37, Milan Kupcevic < milan_kupce...@harvard.edu> wrote: > This cluster can handle this case at this moment as it has got plenty of > free space. I wonder how this is going to play out when we get to 90% of > usage on the whole cluster. A single backplane failure in a node

Re: [ceph-users] SSDs behind Hardware Raid

2019-12-04 Thread Marc Roos
But I guess that in 'ceph osd tree' the ssd's were then also displayed as hdd? -Original Message- From: Stolte, Felix [mailto:f.sto...@fz-juelich.de] Sent: Wednesday, 4 December 2019 9:12 To: ceph-users Subject: [ceph-users] SSDs behind Hardware Raid Hi guys, maybe this is common
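If the RAID controller makes the SSDs enumerate as rotational devices, the device class can be corrected by hand; a sketch with a placeholder OSD id:

  ceph osd tree                            # check the CLASS column
  ceph osd crush rm-device-class osd.12
  ceph osd crush set-device-class ssd osd.12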

Re: [ceph-users] Failed to encode map errors

2019-12-04 Thread Stefan Kooman
Quoting John Hearns (j...@kheironmed.com): > And me again for the second time in one day. > > ceph -w is now showing messages like this: > > 2019-12-03 15:17:22.426988 osd.6 [WRN] failed to encode map e28961 with > expected crc I have seen messages like this when there are daemons running with
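A quick way to confirm that suspicion is to compare the running daemon versions across the cluster, e.g.:

  ceph versions            # summary of the release each daemon type is running
  ceph tell osd.* version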

[ceph-users] SSDs behind Hardware Raid

2019-12-04 Thread Stolte, Felix
smime.p7m Description: S/MIME encrypted message

Re: [ceph-users] Failed to encode map errors

2019-12-03 Thread Martin Verges
Hello, what versions of Ceph are you running? -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.ver...@croit.io Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht Munich HRB

[ceph-users] Shall host weight auto reduce on hdd failure?

2019-12-03 Thread Milan Kupcevic
On hdd failure the number of placement groups on the rest of the osds on the same host goes up. I would expect equal distribution of failed placement groups across the cluster, not just on the troubled host. Shall the host weight auto reduce whenever an osd gets out? Exhibit 1: Attached osd-df-tree

Re: [ceph-users] Revert a CephFS snapshot?

2019-12-03 Thread Luis Henriques
On Tue, Dec 03, 2019 at 02:09:30PM -0500, Jeff Layton wrote: > On Tue, 2019-12-03 at 07:59 -0800, Robert LeBlanc wrote: > > On Thu, Nov 14, 2019 at 11:48 AM Sage Weil wrote: > > > On Thu, 14 Nov 2019, Patrick Donnelly wrote: > > > > On Wed, Nov 13, 2019 at 6:36 PM Jerry Lee > > > > wrote: > > >

Re: [ceph-users] osds way ahead of gateway version?

2019-12-03 Thread Gregory Farnum
Unfortunately RGW doesn't test against extended version differences like this and I don't think it's compatible across more than one major release. Basically it's careful to support upgrades between long-term stable releases but nothing else is expected to work. That said, getting off of Giant

[ceph-users] osds way ahead of gateway version?

2019-12-03 Thread Philip Brown
I'm in a situation where it would be extremely strategically advantageous to run some OSDs on Luminous (so we can try out BlueStore) while the gateways stay on Giant. Is this a terrible terrible thing, or can we reasonably get away with it? Points of interest: 1. I plan to make a new pool for

Re: [ceph-users] RGW performance with low object sizes

2019-12-03 Thread Paul Emmerich
On Tue, Dec 3, 2019 at 6:43 PM Robert LeBlanc wrote: > > On Tue, Dec 3, 2019 at 9:11 AM Ed Fisher wrote: >> >> >> >> On Dec 3, 2019, at 10:28 AM, Robert LeBlanc wrote: >> >> Did you make progress on this? We have a ton of < 64K objects as well and >> are struggling to get good performance out

[ceph-users] Failed to encode map errors

2019-12-03 Thread John Hearns
And me again for the second time in one day. ceph -w is now showing messages like this: 2019-12-03 15:17:22.426988 osd.6 [WRN] failed to encode map e28961 with expected crc Any advice please?

Re: [ceph-users] RGW performance with low object sizes

2019-12-03 Thread Robert LeBlanc
On Tue, Dec 3, 2019 at 9:11 AM Ed Fisher wrote: > > > On Dec 3, 2019, at 10:28 AM, Robert LeBlanc wrote: > > Did you make progress on this? We have a ton of < 64K objects as well and > are struggling to get good performance out of our RGW. Sometimes we have > RGW instances that are just

Re: [ceph-users] RGW performance with low object sizes

2019-12-03 Thread Ed Fisher
> On Dec 3, 2019, at 10:28 AM, Robert LeBlanc wrote: > > Did you make progress on this? We have a ton of < 64K objects as well and are > struggling to get good performance out of our RGW. Sometimes we have RGW > instances that are just gobbling up CPU even when there are no requests to >

Re: [ceph-users] RGW performance with low object sizes

2019-12-03 Thread Robert LeBlanc
On Tue, Nov 19, 2019 at 9:34 AM Christian wrote: > Hi, > > I used https://github.com/dvassallo/s3-benchmark to measure some >> performance values for the rgws and got some unexpected results. >> Everything above 64K has excellent performance but below it drops down to >> a fraction of the speed

Re: [ceph-users] Missing Ceph perf-counters in Ceph-Dashboard or Prometheus/InfluxDB...?

2019-12-03 Thread Ernesto Puerta
Thanks, Benjeman! I created this pad (https://pad.ceph.com/p/perf-counters-to-expose) so we can list them there. An alternative approach could also be to allow for whitelisting some perf-counters, so they would become exported no matter their priority. This would allow users to customize which

Re: [ceph-users] Revert a CephFS snapshot?

2019-12-03 Thread Robert LeBlanc
On Thu, Nov 14, 2019 at 11:48 AM Sage Weil wrote: > On Thu, 14 Nov 2019, Patrick Donnelly wrote: > > On Wed, Nov 13, 2019 at 6:36 PM Jerry Lee > wrote: > > > > > > On Thu, 14 Nov 2019 at 07:07, Patrick Donnelly > wrote: > > > > > > > > On Wed, Nov 13, 2019 at 2:30 AM Jerry Lee > wrote: > > >

Re: [ceph-users] v13.2.7 osds crash in build_incremental_map_msg

2019-12-03 Thread Dan van der Ster
I created https://tracker.ceph.com/issues/43106 and we're downgrading our osds back to 13.2.6. -- dan On Tue, Dec 3, 2019 at 4:09 PM Dan van der Ster wrote: > > Hi all, > > We're midway through an update from 13.2.6 to 13.2.7 and started > getting OSDs crashing regularly like this [1]. > Does

[ceph-users] RGW bucket stats - strange behavior & slow performance requiring RGW restarts

2019-12-03 Thread David Monschein
Hi all, I've been observing some strange behavior with my object storage cluster running Nautilus 14.2.4. We currently have around 1800 buckets (A small percentage of those buckets are actively used), with a total of 13.86M objects. We have 20 RGWs right now, 10 for regular S3 access, and 10 for

[ceph-users] v13.2.7 osds crash in build_incremental_map_msg

2019-12-03 Thread Dan van der Ster
Hi all, We're midway through an update from 13.2.6 to 13.2.7 and started getting OSDs crashing regularly like this [1]. Does anyone obviously know what the issue is? (Maybe https://github.com/ceph/ceph/pull/26448/files ?) Or is it some temporary problem while we still have v13.2.6 and v13.2.7

Re: [ceph-users] HA and data recovery of CEPH

2019-12-03 Thread Wido den Hollander
On 12/3/19 3:07 PM, Aleksey Gutikov wrote: That is true. When an OSD goes down it will take a few seconds for it's Placement Groups to re-peer with the other OSDs. During that period writes to those PGs will stall for a couple of seconds. I wouldn't say it's 40s, but it can take ~10s.

Re: [ceph-users] HA and data recovery of CEPH

2019-12-03 Thread Aleksey Gutikov
That is true. When an OSD goes down it will take a few seconds for it's Placement Groups to re-peer with the other OSDs. During that period writes to those PGs will stall for a couple of seconds. I wouldn't say it's 40s, but it can take ~10s. Hello, According to my experience, in case of

Re: [ceph-users] Osd auth del

2019-12-03 Thread John Hearns
Thank you. ceph auth add did work. I did try ceph auth get-or-create, but it does not read from an input file; it will generate a new key. On Tue, 3 Dec 2019 at 13:50, Willem Jan Withagen wrote: > On 3-12-2019 11:43, Wido den Hollander wrote: > > > > > > On 12/3/19 11:40 AM, John Hearns
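For the archives, a hedged sketch of that recovery (the caps shown are the usual OSD profile, and the keyring path assumes a default OSD layout on the OSD's host):

  # Re-import the OSD's existing key with standard caps
  ceph auth add osd.3 osd 'allow *' mon 'allow profile osd' mgr 'allow profile osd' \
      -i /var/lib/ceph/osd/ceph-3/keyring

  # Verify
  ceph auth get osd.3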

Re: [ceph-users] Osd auth del

2019-12-03 Thread Willem Jan Withagen
On 3-12-2019 11:43, Wido den Hollander wrote: On 12/3/19 11:40 AM, John Hearns wrote: I had a fat fingered moment yesterday I typed ceph auth del osd.3 Where osd.3 is an otherwise healthy little osd I have not set noout or down on osd.3 yet This is a Nautilus

Re: [ceph-users] Missing Ceph perf-counters in Ceph-Dashboard or Prometheus/InfluxDB...?

2019-12-03 Thread Benjeman Meekhof
I'd like to see a few of the cache tier counters exposed. You get some info on cache activity in 'ceph -s' so it makes sense from my perspective to have similar availability in exposed counters. There's a tracker for this request (opened by me a while ago): https://tracker.ceph.com/issues/37156

[ceph-users] Missing Ceph perf-counters in Ceph-Dashboard or Prometheus/InfluxDB...?

2019-12-03 Thread Ernesto Puerta
Hi Cephers, As a result of this tracker (https://tracker.ceph.com/issues/42961) Neha and I were wondering if there would be other perf-counters deemed by users/operators as worthy to be exposed via ceph-mgr modules for monitoring purposes. The default behaviour is that only perf-counters with

Re: [ceph-users] Osd auth del

2019-12-03 Thread Wido den Hollander
On 12/3/19 11:40 AM, John Hearns wrote: I had a fat fingered moment yesterday I typed ceph auth del osd.3 Where osd.3 is an otherwise healthy little osd I have not set noout or down on osd.3 yet This is a Nautilus cluster. ceph health reports everything is OK

[ceph-users] Osd auth del

2019-12-03 Thread John Hearns
I had a fat-fingered moment yesterday: I typed ceph auth del osd.3, where osd.3 is an otherwise healthy little osd. I have not set noout or down on osd.3 yet. This is a Nautilus cluster. ceph health reports everything is OK. However ceph tell osd.* version hangs when it

[ceph-users] ceph-fuse problem...

2019-12-02 Thread GBS Servers
OK, I have a new problem... in an LXC container. [root@centos02 ~]# ceph-fuse -m 192.168.1.101:6789 /mnt/cephfs/ 2019-12-02 19:20:14.831971 7f5906ac9f00 -1 init, newargv = 0x5594fc9fca80 newargc=11 ceph-fuse[1148]: starting ceph client fuse: device not found, try 'modprobe fuse' first ceph-fuse[1148]:
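The 'modprobe fuse' hint suggests the container has no access to /dev/fuse. One way to grant it for a plain LXC container is sketched below; the config path, container name and keys are assumptions and differ between LXC versions (and on Proxmox):

  # On the LXC host
  modprobe fuse
  echo 'lxc.cgroup.devices.allow = c 10:229 rwm' >> /var/lib/lxc/centos02/config
  echo 'lxc.mount.entry = /dev/fuse dev/fuse none bind,create=file 0 0' >> /var/lib/lxc/centos02/config
  lxc-stop -n centos02 && lxc-start -n centos02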

[ceph-users] rados_ioctx_selfmanaged_snap_set_write_ctx examples

2019-12-02 Thread nokia ceph
Hi Team, We would like to create multiple snapshots inside the ceph cluster, initiate the request from a librados client, and came across this rados API: rados_ioctx_selfmanaged_snap_set_write_ctx. Can someone give us sample code on how to use this API? Thanks, Muthu

[ceph-users] ceph-fuse problem...

2019-12-02 Thread GBS Servers
Hi. I'm having a problem with ceph-fuse in LXC containers: [root@centos01 ceph]# ceph-fuse -m 192.168.1.101:6789 /mnt/cephfs/ 2019-12-02 18:00:09.923237 7f4890f17f00 -1 init, newargv = 0x55fabbfa49c0 newargc=11 ceph-fuse[1623]: starting ceph client ceph-fuse[1623]: ceph mount failed with (22) Invalid

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-12-02 Thread Florian Haas
On 19/11/2019 22:42, Florian Haas wrote: > On 19/11/2019 22:34, Jason Dillaman wrote: >>> Oh totally, I wasn't arguing it was a bad idea for it to do what it >>> does! I just got confused by the fact that our mon logs showed what >>> looked like a (failed) attempt to blacklist an entire client IP

Re: [ceph-users] createosd problem...

2019-12-02 Thread Alwin Antreich
On Mon, Dec 02, 2019 at 11:57:34AM +0100, GBS Servers wrote: > How to check? > > Thanks. > > Mon, 2 Dec 2019 at 10:38, Alwin Antreich wrote: > > > Hello, > > > > On Mon, Dec 02, 2019 at 08:17:49AM +0100, GBS Servers wrote: > > > Hi, im have problem with create new osd: > > > > > >
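One common check for the 'osd new' failure quoted below is whether the bootstrap-osd key on the node still matches the one the cluster has; a hedged sketch:

  cat /var/lib/ceph/bootstrap-osd/ceph.keyring
  ceph auth get client.bootstrap-osd

  # If they differ, re-export the cluster's key over the local file
  ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring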

Re: [ceph-users] createosd problem...

2019-12-02 Thread GBS Servers
How to check? Thanks. Mon, 2 Dec 2019 at 10:38, Alwin Antreich wrote: > Hello, > > On Mon, Dec 02, 2019 at 08:17:49AM +0100, GBS Servers wrote: > > Hi, im have problem with create new osd: > > > > > stdin: ceph --cluster ceph --name client.bootstrap-osd --keyring > >

Re: [ceph-users] Impact of a small DB size with Bluestore

2019-12-02 Thread Igor Fedotov
Hi Lars, I've also seen interim space usage bursts during my experiments, up to 2x the max level size when the topmost RocksDB level is L3 (i.e. 25GB max). So I think 2x (which results in 60-64 GB for the DB) is a good choice when your DB is expected to be small or medium sized. Not sure this
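To see how a given DB sizing is holding up in practice, the BlueFS counters and the health output can be checked; a sketch with a placeholder OSD id:

  # DB device capacity/usage and any bytes that spilled over to the slow device
  ceph daemon osd.0 perf dump bluefs | grep -E '(db|slow)_(total|used)_bytes'

  # Nautilus raises a BLUEFS_SPILLOVER warning when the DB overflows
  ceph health detail | grep -i spillover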

Re: [ceph-users] createosd problem...

2019-12-02 Thread Alwin Antreich
Hello, On Mon, Dec 02, 2019 at 08:17:49AM +0100, GBS Servers wrote: > Hi, im have problem with create new osd: > > stdin: ceph --cluster ceph --name client.bootstrap-osd --keyring > /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new > b9e52bda-7f05-44e0-a69b-1d47755343cf > Dec 2 08:09:03

Re: [ceph-users] Impact of a small DB size with Bluestore

2019-12-01 Thread Lars Täuber
Hi, Tue, 26 Nov 2019 13:57:51 + Simon Ironside ==> ceph-users@lists.ceph.com : > Mattia Belluco said back in May: > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-May/035086.html > > "when RocksDB needs to compact a layer it rewrites it > *before* deleting the old data; if you'd

[ceph-users] createosd problem...

2019-12-01 Thread GBS Servers
Hi, I'm having a problem creating a new OSD: root@serv1:~# pveceph createosd /dev/sdb create OSD on /dev/sdb (bluestore) wipe disk/partition: /dev/sdb 200+0 records in 200+0 records out 209715200 bytes (210 MB, 200 MiB) copied, 0.282059 s, 744 MB/s Creating new GPT entries. GPT data structures

Re: [ceph-users] [ceph-user ] HA and data recovery of CEPH

2019-11-29 Thread Romit Misra
Hi, Peng, There are certain more observation that might help you further. 1. If you are using multiple pools, separate the pools in terms of the crush mapping as well as if possible on the Hardware Hosted. 2. It is not mandated to have all the pools separated, but say pools whose

Re: [ceph-users] scrub errors on rgw data pool

2019-11-29 Thread M Ranga Swami Reddy
Primary OSD crashes with below assert: 12.2.11/src/osd/ReplicatedBackend.cc:1445 assert(peer_missing.count( fromshard)) == here I have 2 OSDs with bluestore backend and 1 osd with filestore backend. On Mon, Nov 25, 2019 at 3:34 PM M Ranga Swami Reddy wrote: > Hello - We are using the ceph

[ceph-users] Can I add existing rgw users to a tenant

2019-11-29 Thread Wei Zhao
Hello: We want to use rgw tenant as a group. But can I add existing rgw users to a new tenant?

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Wido den Hollander
On 11/29/19 6:28 AM, jes...@krogh.cc wrote: > Hi Nathan > > Is that true? > > The time it takes to reallocate the primary pg delivers “downtime” by > design. right? Seen from a writing clients perspective > That is true. When an OSD goes down it will take a few seconds for its Placement

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread h...@portsip.cn
Hi Nathan, We built a ceph cluster with 3 nodes. node-3: osd-2, mon-b; node-4: osd-0, mon-a, mds-myfs-a, mgr; node-5: osd-1, mon-c, mds-myfs-b. The ceph cluster was created by rook. Test phenomenon: after one node goes down unexpectedly (like a direct power-off), trying to mount the cephfs volume takes more than 40

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread jesper
Hi Nathan, Is that true? The time it takes to reallocate the primary pg delivers “downtime” by design, right? Seen from a writing client's perspective. Jesper Sent from myMail for iOS Friday, 29 November 2019, 06.24 +0100 from pen...@portsip.com : >Hi Nathan, > >Thanks for the help.

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Peng Bo
Hi Nathan, Thanks for the help. My colleague will provide more details. BR On Fri, Nov 29, 2019 at 12:57 PM Nathan Fish wrote: > If correctly configured, your cluster should have zero downtime from a > single OSD or node failure. What is your crush map? Are you using > replica or EC? If your

Re: [ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Nathan Fish
If correctly configured, your cluster should have zero downtime from a single OSD or node failure. What is your crush map? Are you using replica or EC? If your 'min_size' is not smaller than 'size', then you will lose availability. On Thu, Nov 28, 2019 at 10:50 PM Peng Bo wrote: > > Hi all, > >
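The pieces Nathan asks about can be read straight off the cluster; a minimal, read-only set of checks:

  ceph osd pool ls detail     # size, min_size and crush_rule per pool
  ceph osd crush rule dump    # failure domain of each rule
  ceph osd tree               # how hosts and OSDs are laid out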

[ceph-users] HA and data recovery of CEPH

2019-11-28 Thread Peng Bo
Hi all, We are working on using CEPH to build our HA system; the purpose is that the system should always provide service even if a node of CEPH is down or an OSD is lost. Currently, as we practiced, once a node/OSD is down, the CEPH cluster needs to take about 40 seconds to sync data, and our system can't

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread Paul Emmerich
Can confirm that disabling power saving helps. I've also seen latency improvements with sysctl -w net.ipv4.tcp_low_latency=1 Another thing that sometimes helps is disabling the write cache of your SSDs (hdparm -W 0), depends on the disk model, though. Paul -- Paul Emmerich Looking for help
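Collected as commands, roughly as they might be applied (hedged: cpupower is only one way to disable power saving, and the hdparm effect is very model-dependent, so benchmark before and after):

  cpupower frequency-set -g performance   # pin CPUs to the performance governor
  sysctl -w net.ipv4.tcp_low_latency=1
  hdparm -W 0 /dev/sdX                    # disable the drive's volatile write cache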

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread David Majchrzak, ODERLAND Webbhotell AB
Paul, Absolutely, I said I was looking at those settings and most didn't make any sense to me in a production environment (we've been running ceph since Dumpling). However we only have 1 cluster on Bluestore and I wanted to get some opinions if anything other than the defaults in ceph.conf or

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread Paul Emmerich
Please don't run this config in production. Disabling checksumming is a bad idea, disabling authentication is also pretty bad. There are also a few options in there that no longer exist (osd op threads) or are no longer relevant (max open files), in general, you should not blindly copy config

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread Wido den Hollander
On 11/28/19 12:56 PM, David Majchrzak, ODERLAND Webbhotell AB wrote: > Hi! > > We've deployed a new flash only ceph cluster running Nautilus and I'm > currently looking at any tunables we should set to get the most out of > our NVMe SSDs. > > I've been looking a bit at the options from the

[ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread David Majchrzak, ODERLAND Webbhotell AB
Hi! We've deployed a new flash only ceph cluster running Nautilus and I'm currently looking at any tunables we should set to get the most out of our NVMe SSDs. I've been looking a bit at the options from the blog post here:

[ceph-users] Help on diag needed : heartbeat_failed

2019-11-26 Thread Vincent Godin
We encounter a strange behavior on our Mimic 13.2.6 cluster. At any time, and without any load, some OSDs become unreachable from only some hosts. It lasts 10 minutes and then the problem vanishes. It's not always the same OSDs and the same hosts. There is no network failure on any of the hosts (because

Re: [ceph-users] Impact of a small DB size with Bluestore

2019-11-26 Thread Simon Ironside
Agree this needs tidied up in the docs. New users have little chance of getting it right relying on the docs alone. It's been discussed at length here several times in various threads but it doesn't always seem we reach the same conclusion so reading here doesn't guarantee understanding this

Re: [ceph-users] Impact of a small DB size with Bluestore

2019-11-26 Thread Janne Johansson
It's mentioned here among other places https://books.google.se/books?id=vuiLDwAAQBAJ&pg=PA79&lpg=PA79&dq=rocksdb+sizes+3+30+300+g&source=bl&ots=TlH4GR0E8P&sig=ACfU3U0QOJQZ05POZL9DQFBVwTapML81Ew&hl=en&sa=X&ved=2ahUKEwiPscq57YfmAhVkwosKHY1bB1YQ6AEwAnoECAoQAQ#v=onepage&q=rocksdb%20sizes%203%2030%20300%20g&f=false The 4% was a quick

Re: [ceph-users] Impact of a small DB size with Bluestore

2019-11-26 Thread Vincent Godin
The documentation tells you to size the DB to 4% of the disk data, i.e. 240GB for a 6 TB disk. Please give more explanation when your answer disagrees with the documentation! On Mon, 25 Nov 2019 at 11:00, Konstantin Shalygin wrote: > > I have an Ceph cluster which was designed for file store. Each

[ceph-users] pg_autoscaler is not working

2019-11-26 Thread Thomas Schneider
Hi, I enabled pg_autoscaler on a specific pool ssd. I failed to increase pg_num / pgp_num on pools ssd to 1024: root@ld3955:~# ceph osd pool autoscale-status  POOL   SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  cephfs_metadata 

Re: [ceph-users] scrub errors on rgw data pool

2019-11-25 Thread M Ranga Swami Reddy
Thanks for the reply. Have you migrated all filestore OSDs from the filestore backend to the bluestore backend? Or have you upgraded from Luminous 12.2.11 to 14.x? What helped here? On Tue, Nov 26, 2019 at 8:03 AM Fyodor Ustinov wrote: > Hi! > > I had similar errors in pools on SSD until I upgraded to

Re: [ceph-users] rbd image size

2019-11-25 Thread Konstantin Shalygin
Hello , I use ceph as block storage in kubernetes. I want to get the rbd usage by command "rbd diff image_id | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }’”, but I found it is a lot bigger than the value which I got by command “df -h” in the pod. I do not know the reason and need
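One common explanation for such a gap is thin provisioning: blocks freed inside the pod's filesystem are not returned to the RBD image unless discard/fstrim runs. A sketch of how that is typically checked (pool, image and mount point are placeholders):

  # Space actually allocated in the image vs. its provisioned size
  rbd du <pool>/<image>

  # Inside the pod/host where the filesystem is mounted, release deleted blocks
  fstrim /path/to/mountpoint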

Re: [ceph-users] scrub errors on rgw data pool

2019-11-25 Thread Fyodor Ustinov
Hi! I had similar errors in pools on SSD until I upgraded to nautilus (clean bluestore installation) - Original Message - > From: "M Ranga Swami Reddy" > To: "ceph-users" , "ceph-devel" > > Sent: Monday, 25 November, 2019 12:04:46 > Subject: [ceph-users] scrub errors on rgw data pool

Re: [ceph-users] Upgrading and lost OSDs

2019-11-25 Thread Brent Kennedy
Update on this: I was able to link the block softlink back to the original for each offline drive. I used ceph-bluestore-tool with show-label (“ceph-bluestore-tool show-label --dev /dev/disk/by-partuuid/”) on each drive (apparently the newer commands link them as ceph-uuid, but these
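For anyone hitting the same thing, a hedged sketch of the repair described above (the partition UUID and OSD id are placeholders):

  # Confirm which OSD the device belongs to
  ceph-bluestore-tool show-label --dev /dev/disk/by-partuuid/<uuid>

  # Restore the missing symlink and its ownership, then bring the OSD back
  ln -s /dev/disk/by-partuuid/<uuid> /var/lib/ceph/osd/ceph-<id>/block
  chown -h ceph:ceph /var/lib/ceph/osd/ceph-<id>/block
  systemctl start ceph-osd@<id>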

Re: [ceph-users] RBD Mirror DR Testing

2019-11-25 Thread Jason Dillaman
On Mon, Nov 25, 2019 at 12:24 PM Vikas Rana wrote: > > Hi All, > I believe we forgot to take the snapshot in the previous test. Here's the > output from current test where we took snapshot on Primary side but the > snapshot did not replicated to DR side? > VTIER1 is the Primary box with cluster

Re: [ceph-users] RBD Mirror DR Testing

2019-11-25 Thread Vikas Rana
Hi All, I believe we forgot to take the snapshot in the previous test. Here's the output from current test where we took snapshot on Primary side but the snapshot did not replicated to DR side? VTIER1 is the Primary box with cluster ceph. Vtier2a is the DR box with cluster name cephdr.
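With journal-based RBD mirroring, a snapshot taken on the primary should appear on the DR image once the journal is replayed; a sketch of how that is usually verified (pool/image names are placeholders, cluster names taken from the message):

  # Snapshots as seen on the primary ("ceph") and the DR site ("cephdr")
  rbd --cluster ceph snap ls <pool>/<image>
  rbd --cluster cephdr snap ls <pool>/<image>

  # Replication state and journal lag as seen from the DR side
  rbd --cluster cephdr mirror image status <pool>/<image>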

Re: [ceph-users] rbd image size

2019-11-25 Thread Marc Roos
Is there a point to sending such a signature (twice) to a public mailing list, having its emails stored on several mailing list websites?

[ceph-users] rbd image size

2019-11-25 Thread 陈旭
Hello , I use ceph as block storage in kubernetes. I want to get the rbd usage by command "rbd diff image_id | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }’”, but I found it is a lot bigger than the value which I got by command “df -h” in the pod. I do not know the reason and need

[ceph-users] scrub errors on rgw data pool

2019-11-25 Thread M Ranga Swami Reddy
Hello - We are using the ceph 12.2.11 version (upgraded from Jewel 10.2.12 to 12.2.11). In this cluster, we are having mix of filestore and bluestore OSD backends. Recently we are seeing the scrub errors on rgw buckets.data pool every day, after scrub operation performed by Ceph. If we run the PG

Re: [ceph-users] Impact of a small DB size with Bluestore

2019-11-25 Thread Konstantin Shalygin
I have an Ceph cluster which was designed for file store. Each host have 5 SSDs write intensive of 400GB and 20 HDD of 6TB. So each HDD have a WAL of 5 GB on SSD If i want to put Bluestore on this cluster, i can only allocate ~75GB of WAL and DB on SSD for each HDD which is far below the 4% limit

[ceph-users] Impact of a small DB size with Bluestore

2019-11-25 Thread Vincent Godin
I have a Ceph cluster which was designed for filestore. Each host has 5 write-intensive SSDs of 400GB and 20 HDDs of 6TB, so each HDD has a 5 GB WAL on SSD. If I want to put Bluestore on this cluster, I can only allocate ~75GB of WAL and DB on SSD for each HDD, which is far below the 4% limit

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-11-24 Thread Stefan Kooman
Hi, Quoting Yan, Zheng (uker...@gmail.com): > > > I double checked the code, but didn't find any clue. Can you compile > > > mds with a debug patch? > > > > Sure, I'll try to do my best to get a properly packaged Ceph Mimic > > 13.2.6 with the debug patch in it (and / or get help to get it

[ceph-users] Cannot increate pg_num / pgp_num on a pool

2019-11-24 Thread Thomas
Hi, I failed to increase pg_num / pgp_num on pools ssd to 1024: root@ld3976:~# ceph osd pool get ssd pg_num pg_num: 512 root@ld3976:~# ceph osd pool get ssd pgp_num pgp_num: 512 root@ld3976:~# ceph osd pool set ssd pg_num 1024 root@ld3976:~# ceph osd pool get ssd pg_num pg_num: 512 When I check
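On Nautilus this is usually not a hard failure: pg_num is raised gradually, with the requested value recorded as a target, and the pg_autoscaler, if enabled on the pool, can also pull it back to its own recommendation. A sketch of what is typically checked, assuming the pool is named ssd:

  # Shows pg_num_target / pgp_num_target while a change is still being applied
  ceph osd pool ls detail | grep ssd
  ceph osd pool get ssd pg_autoscale_mode

  # To force 1024 PGs, take the autoscaler out of the picture first
  ceph osd pool set ssd pg_autoscale_mode off
  ceph osd pool set ssd pg_num 1024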
