Re: [ceph-users] How to just delete PGs stuck incomplete on EC pool

2019-03-01 Thread jesper
Saturday, 2 March 2019, 04.20 +0100 from satha...@gmail.com : >56 OSD, 6-node 12.2.5 cluster on Proxmox > >We had multiple drives fail (about 30%) within a few days of each other, likely >faster than the cluster could recover. How did so many drives break? Jesper

[ceph-users] How to just delete PGs stuck incomplete on EC pool

2019-03-01 Thread Daniel K
56 OSD, 6-node 12.2.5 cluster on Proxmox We had multiple drives fail (about 30%) within a few days of each other, likely faster than the cluster could recover. After the dust settled, we have 2 out of 896 pgs stuck inactive. The failed drives are completely inaccessible, so I can't mount them and
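For readers hitting the same state, a hedged sketch of how such PGs are usually inspected before anything destructive is attempted (the PG ID 1.2f is a placeholder):

    ceph pg dump_stuck inactive      # list the stuck PGs
    ceph pg 1.2f query               # see which OSDs the PG needs and why it is blocked
    # last resort, only if the data on the lost drives is written off:
    ceph osd force-create-pg 1.2f    # recreates the PG as empty, discarding its contents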

Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy

2019-03-01 Thread David Turner
True, but not before you unmap it from the previous server. It's like physically connecting a hard drive to two servers at the same time. Neither knows what the other is doing to it and can corrupt your data. You should always make sure to unmap an rbd before mapping it to another server. On Fri,
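As a rough sketch of the order of operations being described (device path, mountpoint and image name are placeholders):

    # on the server that currently has the image:
    umount /mnt/rbd0
    rbd unmap /dev/rbd0
    # only afterwards, on the new server:
    rbd map rbd/myimage
    mount /dev/rbd0 /mnt/rbd0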

Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy

2019-03-01 Thread solarflow99
It has to be mounted from somewhere; if that server goes offline, you need to mount it from somewhere else, right? On Thu, Feb 28, 2019 at 11:15 PM David Turner wrote: > Why are you making the same rbd to multiple servers? > > On Wed, Feb 27, 2019, 9:50 AM Ilya Dryomov wrote: > >> On Wed, Feb

[ceph-users] Problems creating a balancer plan

2019-03-01 Thread Massimo Sgaravatto
Hi I already used the balancer in my ceph luminous cluster a while ago when all the OSDs were using filestore. Now, after having added some bluestore OSDs, if I try to create a plan: [root@ceph-mon-01 ~]# ceph balancer status { "active": false, "plans": [], "mode": "crush-compat" }
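For context, the usual balancer workflow once the status looks sane is roughly the following (the plan name myplan is arbitrary):

    ceph balancer mode crush-compat   # or 'upmap' if every client is luminous or newer
    ceph balancer eval                # score the current distribution
    ceph balancer optimize myplan     # build a plan
    ceph balancer eval myplan         # score the proposed distribution
    ceph balancer execute myplan      # apply it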

Re: [ceph-users] Questions about rbd-mirror and clones

2019-03-01 Thread Jason Dillaman
On Fri, Mar 1, 2019 at 1:09 PM Anthony D'Atri wrote: > > Thanks! I'll do some experimentation. Our orchestration service might need > an overhaul to manage parents / children together. > > Related question, in this thread: > > https://www.spinics.net/lists/ceph-users/msg35257.html > > there

Re: [ceph-users] Erasure coded pools and ceph failure domain setup

2019-03-01 Thread Ravi Patel
Hello, My question is how crush distributes chunks throughout the cluster with erasure coded pools. Currently, we have 4 OSD nodes with 36 drives (OSD daemons) per node. If we use ceph_failure_domain=host, then we are necessarily limited to k=3,m=1, or k=2,m=2. We would like to explore k>3, m>2
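For reference, a sketch of an EC profile that lowers the failure domain to osd so that k+m can exceed the node count (profile and pool names are made up; with this setting a single host failure can take out more than one chunk of a PG):

    ceph osd erasure-code-profile set ec-k4-m2 k=4 m=2 crush-failure-domain=osd
    ceph osd pool create ecpool 256 256 erasure ec-k4-m2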

[ceph-users] NFS-Ganesha CEPH_FSAL ceph.quota.max_bytes not enforced

2019-03-01 Thread David C
Hi All, Exporting cephfs with the CEPH_FSAL, I set the following on a dir: setfattr -n ceph.quota.max_bytes -v 1 /dir setfattr -n ceph.quota.max_files -v 10 /dir From an NFSv4 client, the quota.max_bytes appears to be completely ignored, I can go GBs over the quota in the dir. The
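Two things worth checking in this situation: whether the attributes actually landed on the directory, and the fact that CephFS quotas are enforced cooperatively by the client, so a gateway that does not check them will happily write past the limit. Assuming the same /dir path:

    getfattr -n ceph.quota.max_bytes /dir
    getfattr -n ceph.quota.max_files /dir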

Re: [ceph-users] PG Calculations Issue

2019-03-01 Thread Matthew H
I believe the question was in regards to which formula to use. There are two different formulas here [1] and here [2]. The difference is the additional steps used to calculate the appropriate PG count for a pool. In Nautilus though, this is mostly moot as the mgr service now has a module to
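For reference, the commonly cited rule of thumb is total PGs ≈ (number of OSDs × 100) / replica count, rounded up to the next power of two and then split across pools by their expected share of data; e.g. 100 OSDs with size 3 gives about 3333, i.e. 4096. In Nautilus the same estimate can be delegated to the autoscaler (pool name is a placeholder):

    ceph mgr module enable pg_autoscaler
    ceph osd pool set mypool pg_autoscale_mode on   # or 'warn' to only report suggestions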

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Vitaliy Filippov
+1, I also think it's strange that deleting an OSD by "osd out -> osd purge" causes two rebalances instead of one. -- With best regards, Vitaliy Filippov

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Alexandru Cucu
More on the subject can be found here: https://ceph.com/geen-categorie/difference-between-ceph-osd-reweight-and-ceph-osd-crush-reweight/ On Fri, Mar 1, 2019 at 2:22 PM Darius Kasparavičius wrote: > > Hi, > > Setting crush weight to 0 removes the osds weight from crushmap, by > modifying hosts

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Darius Kasparavičius
Hi, Setting crush weight to 0 removes the OSD's weight from the crushmap by modifying the host's total weight, which forces rebalancing of data across the whole cluster. Setting an OSD to out only modifies the "REWEIGHT" status, which rebalances data inside the same host. On Fri, Mar 1, 2019 at 12:25 PM Paul
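In command form, the two knobs being contrasted are (osd.12 is a placeholder):

    ceph osd crush reweight osd.12 0   # changes the CRUSH weight item for the OSD
    ceph osd reweight 12 0             # changes only the REWEIGHT override (0..1);
                                       # 'ceph osd out 12' sets the same value to 0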

[ceph-users] ceph bug #24445 hitting version 12.2.4

2019-03-01 Thread M Ranga Swami Reddy
Hi - we are using ceph 12.2.4 and are hitting bug #24445, which caused a 10 min IO pause on the ceph cluster. Is this bug fixed? bug: https://tracker.ceph.com/issues/24445/ Thanks Swami

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Fyodor Ustinov
Hi! Hmm. "ceph osd crush reweight" and "ceph osd reweight" - do they do the same thing or not? I use "ceph osd crush reweight" - Original Message - From: "Paul Emmerich" To: "Fyodor Ustinov" Cc: "David Turner" , "ceph-users" Sent: Friday, 1 March, 2019 12:24:37 Subject: Re: [ceph-users] Right way

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Igor Fedotov
resending, not sure the prev email reached the mailing list... Hi Chen, thanks for the update. Will prepare patch to periodically reset StupidAllocator today. And just to let you know below is an e-mail from AdamK from RH which might explain the issue with the allocator. Also please note

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Paul Emmerich
On Fri, Mar 1, 2019 at 11:17 AM Fyodor Ustinov wrote: > May be. But "ceph out + ceph osd purge" causes double relocation, and "ceph > reweight 0 + ceph osd purge" - causes only one. No, the commands "ceph osd out X" and "ceph osd reweight X 0" do the exact same thing: both set reweight to 0.
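For anyone following along, a sketch of the "drain first, then remove" sequence the double-rebalance discussion revolves around (osd.12 is a placeholder; assumes systemd-managed OSDs):

    ceph osd crush reweight osd.12 0           # drain: data migrates away once
    # wait until all PGs are active+clean again
    ceph osd out 12
    systemctl stop ceph-osd@12                 # on the OSD's host
    ceph osd purge 12 --yes-i-really-mean-it   # removes the OSD from crush, auth and the osd map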

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Igor Fedotov
Hi Chen, thanks for the update. Will prepare patch to periodically reset StupidAllocator today. And just to let you know below is an e-mail from AdamK from RH which might explain the issue with the allocator. Also please note that StupidAllocator might not perform full defragmentation in
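For anyone tracking the same symptom, the latency in question can be watched per OSD without restarting anything (osd.0 is a placeholder; the daemon command must be run on that OSD's host):

    ceph osd perf                  # apply/commit latency for every OSD
    ceph daemon osd.0 perf dump    # full counter dump for a single OSD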

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Fyodor Ustinov
Hi! Maybe. But "ceph out + ceph osd purge" causes double relocation, and "ceph reweight 0 + ceph osd purge" causes only one. - Original Message - From: "Paul Emmerich" To: "Fyodor Ustinov" Cc: "David Turner" , "ceph-users" Sent: Friday, 1 March, 2019 11:54:20 Subject: Re:

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Paul Emmerich
"out" is internally implemented as "reweight 0" Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Fri, Mar 1, 2019 at 10:48 AM Fyodor Ustinov wrote: > > Hi! > > As

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Fyodor Ustinov
Hi! As far as I understand, reweight also does not lead to the situation "a period where one copy / shard is missing". - Original Message - From: "Paul Emmerich" To: "Fyodor Ustinov" Cc: "David Turner" , "ceph-users" Sent: Friday, 1 March, 2019 11:32:54 Subject: Re: [ceph-users]

Re: [ceph-users] ceph osd pg-upmap-items not working

2019-03-01 Thread xie.xingguo
> Backports should be available in v12.2.11. s/v12.2.11/v12.2.12/ Sorry for the typo. Original mail From: Xie Xingguo 10072465 To: d...@vanderster.com ; Cc: ceph-users@lists.ceph.com ; Date: 2019-03-01 17:09 Subject: Re: [ceph-users] ceph osd pg-upmap-items not working

Re: [ceph-users] Right way to delete OSD from cluster?

2019-03-01 Thread Paul Emmerich
On Fri, Mar 1, 2019 at 8:55 AM Fyodor Ustinov wrote: > > Hi! > > Yes. But I am a little surprised by what is written in the documentation: the point of this is that you don't have a period where one copy/shard is missing if you wait for it to take it out. Yeah, there'll be an unnecessary small

Re: [ceph-users] ceph osd pg-upmap-items not working

2019-03-01 Thread xie.xingguo
See https://github.com/ceph/ceph/pull/26179 Backports should be available in v12.2.11. Or you can manually do it by simply adopting https://github.com/ceph/ceph/pull/26127 if you are eager to get out of the trap right now. Original mail From: Dan van der Ster To: Kári
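For context, once every client speaks luminous or newer, upmap entries are managed like this (PG and OSD IDs are placeholders):

    ceph osd set-require-min-compat-client luminous
    ceph osd pg-upmap-items 1.7 121 34   # move the copy of PG 1.7 from osd.121 to osd.34
    ceph osd rm-pg-upmap-items 1.7       # drop the exception again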

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Alexandre DERUMIER
Hi, some news: it seems that it's finally stable for me since 1 week (around 0.7 ms average commit latency). http://odisoweb1.odiso.net/osdstable.png The biggest change was on 18/02, when I finished rebuilding all my OSDs, with 2 OSDs of 3TB for 1 NVMe of 6TB. (previously I only have done