Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-09 Thread Christian Balzer
On Thu, 8 Jan 2015 21:17:12 -0700 Robert LeBlanc wrote: On Thu, Jan 8, 2015 at 8:31 PM, Christian Balzer ch...@gol.com wrote: On Thu, 8 Jan 2015 11:41:37 -0700 Robert LeBlanc wrote: Which of course currently means a strongly consistent lockup in these scenarios. ^o^ That is one way of

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-09 Thread Robert LeBlanc
On Thu, Jan 8, 2015 at 8:31 PM, Christian Balzer ch...@gol.com wrote: On Thu, 8 Jan 2015 11:41:37 -0700 Robert LeBlanc wrote: Which of course currently means a strongly consistent lockup in these scenarios. ^o^ That is one way of putting it Slightly off-topic and snarky, that strong

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-08 Thread Christian Balzer
On Thu, 8 Jan 2015 11:41:37 -0700 Robert LeBlanc wrote: On Wed, Jan 7, 2015 at 10:55 PM, Christian Balzer ch...@gol.com wrote: Which of course begs the question of why not have min_size at 1 permanently, so that in the (hopefully rare) case of losing 2 OSDs at the same time your cluster

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-08 Thread Gregory Farnum
On Wed, Jan 7, 2015 at 9:55 PM, Christian Balzer ch...@gol.com wrote: On Wed, 7 Jan 2015 17:07:46 -0800 Craig Lewis wrote: On Mon, Dec 29, 2014 at 4:49 PM, Alexandre Oliva ol...@gnu.org wrote: However, I suspect that temporarily setting min size to a lower number could be enough for the

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-08 Thread Robert LeBlanc
On Wed, Jan 7, 2015 at 10:55 PM, Christian Balzer ch...@gol.com wrote: Which of course begs the question of why not have min_size at 1 permanently, so that in the (hopefully rare) case of losing 2 OSDs at the same time your cluster still keeps working (as it should with a size of 3). The
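
For readers following the min_size question quoted above, here is a minimal sketch of the rule being argued about, assuming the simplified model that a replicated PG serves I/O only while at least min_size replicas are up. The function name pg_accepts_io and the loop are illustrative only and are not Ceph code. The model deliberately ignores peering, so it says nothing about whether a PG that is already incomplete would recover.

```python
def pg_accepts_io(replicas_up: int, min_size: int) -> bool:
    """Simplified model (assumption): I/O is served only while
    at least min_size replicas of the PG are up."""
    return replicas_up >= min_size


size = 3  # replication factor mentioned in the thread

for min_size in (2, 1):
    for osds_lost in range(size + 1):
        replicas_up = size - osds_lost
        state = "serves I/O" if pg_accepts_io(replicas_up, min_size) else "blocks I/O"
        print(f"size={size} min_size={min_size} lost={osds_lost}: {state}")
```

Under this model, losing 2 of 3 replicas blocks I/O with min_size 2 but not with min_size 1, which is the trade-off between availability and write safety that the thread goes on to debate.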

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-07 Thread Craig Lewis
On Mon, Dec 29, 2014 at 4:49 PM, Alexandre Oliva ol...@gnu.org wrote: However, I suspect that temporarily setting min size to a lower number could be enough for the PGs to recover. If ceph osd pool set <pool> min_size 1 doesn't get the PGs going, I suppose restarting at least one of the OSDs

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-07 Thread Christian Balzer
On Wed, 7 Jan 2015 17:07:46 -0800 Craig Lewis wrote: On Mon, Dec 29, 2014 at 4:49 PM, Alexandre Oliva ol...@gnu.org wrote: However, I suspect that temporarily setting min size to a lower number could be enough for the PGs to recover. If ceph osd pool set <pool> min_size 1 doesn't get the

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-30 Thread Christian Eichelmann
Hi Nico and all others who answered, After some more attempts to get the PGs into a working state (I tried force_create_pg, which put them in the creating state. But that was obviously not true, since after rebooting one of the OSDs containing them they went back to incomplete), I decided to

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-30 Thread Eneko Lacunza
Hi Christian, Have you tried to migrate the disk from the old storage (pool) to the new one? I think it should show the same problem, but I think it'd be a much easier path to recover than the posix copy. How full is your storage? Maybe you can customize the crushmap, so that some OSDs

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-30 Thread Christian Eichelmann
Hi Eneko, I tried an rbd cp before, but that was hanging as well. But I couldn't find out whether the source image or the destination image was causing the hang. That's why I decided to try a posix copy. Our cluster is still nearly empty (12TB / 867TB). But as far as I understood (If not, somebody

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-30 Thread Eneko Lacunza
Hi Christian, Do the new pool's PGs also show as incomplete? Did you notice anything remarkable in the ceph logs when formatting the new pool's image? On 30/12/14 12:31, Christian Eichelmann wrote: Hi Eneko, I tried an rbd cp before, but that was hanging as well. But I couldn't find out if the source

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-30 Thread Christian Eichelmann
Hi Eneko, nope, the new pool has all PGs active+clean, no errors during image creation. The format command just hangs, without error. On 30.12.2014 12:33, Eneko Lacunza wrote: Hi Christian, Do the new pool's PGs also show as incomplete? Did you notice anything remarkable in the ceph logs in the

[ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-29 Thread Christian Eichelmann
Hi all, we have a ceph cluster, currently with 360 OSDs in 11 systems. Last week we replaced one OSD system with a new one. During that, we had a lot of problems with OSDs crashing on all of our systems. But that is not our current problem. After we got everything up and running again, we

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-29 Thread Nico Schottelius
Hey Christian, Christian Eichelmann [Mon, Dec 29, 2014 at 10:56:59AM +0100]: [incomplete PG / RBD hanging, osd lost also not helping] that is very interesting to hear, because we had a similar situation with ceph 0.80.7 and had to re-create a pool, after I deleted 3 pg directories to allow OSDs

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-29 Thread Chad William Seys
Hi Christian, I had a similar problem about a month ago. After trying lots of helpful suggestions, I found none of them worked and I could only delete the affected pools and start over. I opened a feature request in the tracker: http://tracker.ceph.com/issues/10098 If you find a way, let

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2014-12-29 Thread Alexandre Oliva
On Dec 29, 2014, Christian Eichelmann christian.eichelm...@1und1.de wrote: After we got everything up and running again, we still have 3 PGs in the state incomplete. I was checking one of them directly on the systems (replication factor is 3). I have run into this myself at least twice