Sent: Wednesday, January 11, 2017 2:50 PM
To: Shinobu Kinjo
Cc: Ceph Users
Subject: Re: [ceph-users] PGs stuck active+remapped and osds lose data?!
Yes, but all I want to know is whether my way of changing the tunables is
right or not.
> On 11 Jan 2017, at 13:11, Shinobu Kinjo wrote:
>
> Please refer to Jens's message.
>
> Regards,
On Wed, Jan 11, 2017 at 8:53 PM, Marcus Müller wrote:
Ok, thank you. I thought I had to set ceph to a tunables profile. If I'm
right, then I just have to export the current crush map, edit it, and import it
again, like:
ceph osd getcrushmap -o /tmp/crush
crushtool -i /tmp/crush --set-choose-total-tries 100 -o /tmp/crush.new
ceph osd setcrushmap -i /tmp/crush.new
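Before importing, it may be worth sanity-checking the edited map offline with
crushtool's test mode. A minimal sketch; the rule number, replica count, and
sample range here are assumptions, so adjust them to your pools:

# Map 3 replicas for a range of sample inputs through the new map and
# report any placements that come up short (rule 0 is an assumption).
crushtool -i /tmp/crush.new --test --rule 0 --num-rep 3 \
    --min-x 0 --max-x 1023 --show-bad-mappings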
Your current problem has nothing to do with clients and neither does
choose_total_tries.
Try setting just this value to 100 and see if your situation improves.
Ultimately you need to take a good look at your cluster configuration
and how your crush map is configured to deal with that.
Yeah, Sam is correct. I've not looked at the crushmap. But I should have
noticed what the trouble is by looking at `ceph osd tree`. That's my
bad, sorry for that.
Again please refer to:
http://www.anchor.com.au/blog/2013/02/pulling-apart-cephs-crush-algorithm/
Regards,
On Wed, Jan 11, 2017 at
Hi Sam,
another idea: I have two HDDs here and already wanted to add them to ceph5,
which would mean a new crush map anyway. Could this problem be solved by doing
that?
> On 10 Jan 2017, at 17:50, Samuel Just wrote:
>
> Shinobu isn't correct, you have 9/9 osds up and running.
Ok, thanks. Then I will change the tunables.
As far as I can see, this would already help me: ceph osd crush tunables bobtail
Even though we run ceph hammer this would work according to the documentation,
am I right?
And: I'm using librados for our clients (hammer too); could this change create
problems for the clients?
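For what it's worth, the tunables currently in effect can be inspected before
and after switching profiles. A short sketch, assuming the stock CLI:

# Show the tunables the cluster is using now (choose_total_tries among them).
ceph osd crush show-tunables
# Switch the whole profile, if that is the chosen route.
ceph osd crush tunables bobtail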
Shinobu isn't correct, you have 9/9 osds up and running. up does not
equal acting because crush is having trouble fulfilling the weights in
your crushmap and the acting set is being padded out with an extra osd
which happens to have the data to keep you up to the right number of
replicas. Please
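A quick way to see the difference per PG is the pg map output, which prints
both sets side by side. A sketch, with pg 9.7 taken from the report above;
the output line is illustrative:

# Compare the CRUSH-computed "up" set with the current "acting" set.
ceph pg map 9.7
# e.g.: osdmap e3114 pg 9.7 (9.7) -> up [7,3] acting [7,3,0]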
Ok, I understand, but how can I debug why they are not running as they should?
I thought everything was fine because ceph -s said they are up and
running.
I would think of a problem with the crush map.
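One place to start is the hierarchy and the weights CRUSH has to satisfy.
A sketch:

# Show the bucket tree and weights CRUSH is trying to fulfil; very
# uneven host weights are a common reason the acting set cannot
# match the up set.
ceph osd tree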
On 10 Jan 2017, at 08:06, Shinobu Kinjo wrote:
e.g.,
OSD7 / 3 / 0 are in the same acting set. They should be up, if they
are properly running.
# 9.7
>"up": [
>7,
>3
>],
>"acting": [
>7,
>3,
>0
>],
Here is an example:
"up": [
    1,
    0,
    2
],
"acting": [
    1,
    0,
    2
],
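If jq is available, the two sets can be pulled straight out of the query
output. A sketch, assuming the hammer-era JSON keeps up and acting at the
top level:

# Print only the up and acting sets for one PG.
ceph pg 9.7 query | jq '{up: .up, acting: .acting}'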
>
> That's not perfectly correct.
>
> OSD.0/1/2 seem to be down.
Sorry but where do you see this? I think this indicates that they are up:
osdmap e3114: 9 osds: 9 up, 9 in; 4 remapped pgs?
On 10 Jan 2017, at 07:50, Shinobu Kinjo wrote:
On Tue, Jan 10, 2017 at 3:44 PM, Marcus Müller wrote:
All osds are currently up:
health HEALTH_WARN
4 pgs stuck unclean
recovery 4482/58798254 objects degraded (0.008%)
recovery 420522/58798254 objects misplaced (0.715%)
noscrub,nodeep-scrub flag(s) set
monmap e9: 5 mons at
Looking at the ``ceph -s`` output you originally provided, all OSDs are up.
> osdmap e3114: 9 osds: 9 up, 9 in; 4 remapped pgs
But looking at ``pg query``, OSD.0 / 1 are not up. Are they somehow
related to this?:
> Ceph1, ceph2 and ceph3 are vms on one physical host
Are those OSDs running on vms?
> pg 9.7 is stuck unclean for 512936.160212, current state active+remapped,
> last acting [7,3,0]
> pg 7.84 is stuck unclean for 512623.894574, current state active+remapped,
> last acting [4,8,1]
> pg 8.1b is stuck unclean for 513164.616377, current state active+remapped,
> last acting [4,7,2]
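The same PGs can also be listed on demand rather than from health detail.
A sketch:

# List PGs stuck unclean, with their up and acting sets.
ceph pg dump_stuck unclean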
On Tue, Jan 10, 2017 at 10:22 AM, Marcus Müller
wrote:
> Trying google with "ceph pg stuck in active and remapped" points to a
> couple of posts on this ML typically indicating that it's a problem with the
> CRUSH map and ceph being unable to satisfy the mapping rules.
There is currently a thread about this very issue on the ceph-devel
mailing list (check the archives for "PG stuck unclean after
rebalance-by-weight" in the last few days).
Have a read of
http://www.anchor.com.au/blog/2013/02/pulling-apart-cephs-crush-algorithm/
and try bumping choose_total_tries up
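If editing the decompiled map is preferred over crushtool's setter flag, the
tunable is a single line in the text form. A sketch with the same effect as
--set-choose-total-tries:

# Decompile to text, raise the tunable, recompile.
crushtool -d /tmp/crush -o /tmp/crush.txt
# In /tmp/crush.txt, change (or add) the line:
#   tunable choose_total_tries 100
crushtool -c /tmp/crush.txt -o /tmp/crush.new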
> Trying google with "ceph pg stuck in active and remapped" points to a couple
> of posts on this ML typically indicating that it's a problem with the CRUSH
> map and ceph being unable to satisfy the mapping rules. Your ceph -s output
> indicates that you're using replication of size 3 in your
On Tue, Jan 10, 2017 at 8:23 AM, Marcus Müller
wrote:
Hi all,
Recently I added a new node with new osds to my cluster, which of course
resulted in backfilling. At the end, there are 4 pgs left in the state
active+remapped and I don't know what to do.
Here is what my cluster currently looks like:
ceph -s
    health HEALTH_WARN
           4 pgs stuck unclean