Hi Dan,

That did the trick. Thanks!

Regards,
Erwin

> On 24 Feb 2022, at 20:25, Dan van der Ster <[email protected]> wrote:
>
> Hi Erwin,
>
> This may be one of the rare cases where the default
> choose_total_tries = 50 is too low.
> You can try increasing it to 75 or 100 and see if crush can find 3 up OSDs.
>
> Here's the basic recipe:
>
> # ceph osd getcrushmap -o crush.map
> # crushtool -d crush.map -o crush.txt
> # vi crush.txt # and change to tunable choose_total_tries 100
> # crushtool -c crush.txt -o crush.map2
> # ceph osd setcrushmap -i crush.map2
>
> Cheers, dan
>
> On Thu, Feb 24, 2022 at 6:29 PM Erwin Lubbers <[email protected]> wrote:
>>
>> Hi all,
>>
>> I have one active+clean+remapped PG on a 152-OSD Octopus (15.2.15) cluster
>> with evenly balanced OSDs (around 40% usage). The cluster has three replicas
>> spread across three datacenters (A+B+C).
>>
>> All PGs are available in each datacenter (as defined in the crush map), but
>> only this one (which is in a pool containing 2048 PGs) is up on OSD.34 and
>> OSD.42 and acting on OSD.34, OSD.42 and OSD.38.
>>
>> OSD.34 is located in datacenter A, OSD.42 in B and OSD.38 in A again, but
>> it should be in C.
>>
>> I restarted all OSDs, monitors, managers and servers. I marked out the OSDs
>> that the PG is acting on and brought them back in a minute later. In all
>> cases the PG holds the same state after backfilling, but one of the A
>> replicas switches to another OSD in the A datacenter. I turned the balancer
>> off and on. But nothing seems to recover the PG to active+clean.
>>
>> Any suggestions?
>>
>> Regards,
>> Erwin
>> _______________________________________________
>> ceph-users mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]
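
For anyone who would rather script the "vi crush.txt" step in the recipe above than edit by hand, here is a minimal sketch. It assumes the decompiled crush.txt already contains a "tunable choose_total_tries" line. The helper names set_total_tries and check_mappings are hypothetical (they are not part of Ceph), and the rule id and replica count passed to the check are placeholders to be replaced with the values for the affected pool.

```shell
#!/bin/sh
# Hypothetical helpers; only crushtool itself comes from Ceph.

# set_total_tries FILE TRIES
# Rewrite the "tunable choose_total_tries" line in a decompiled CRUSH map.
# Fails (returns 1) if the file has no such line, rather than guessing
# where a new tunable line should go.
set_total_tries() {
    file="$1"
    tries="$2"
    if ! grep -q '^tunable choose_total_tries ' "$file"; then
        echo "no 'tunable choose_total_tries' line in $file" >&2
        return 1
    fi
    # -i.bak keeps a backup copy and works with both GNU and BSD sed.
    sed -i.bak "s/^tunable choose_total_tries .*/tunable choose_total_tries $tries/" "$file"
}

# check_mappings COMPILED_MAP RULE_ID NUM_REP
# Ask crushtool to test the compiled map offline and print any inputs for
# which the rule could not find NUM_REP OSDs. RULE_ID and NUM_REP are
# assumptions here; use the values for the pool in question.
check_mappings() {
    crushtool -i "$1" --test --show-bad-mappings --rule "$2" --num-rep "$3"
}
```

A possible sequence would be: set_total_tries crush.txt 100, then crushtool -c crush.txt -o crush.map2, then check_mappings crush.map2 <rule-id> 3, and only after that ceph osd setcrushmap -i crush.map2. If the check prints no bad mappings, that suggests the rule can place all replicas with the new value.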
