Re: [ceph-users] 1 mon unable to join the quorum
I'm not sure I completely understand your "test". What exactly are you trying to achieve and what documentation are you following?

On Fri, Mar 30, 2018 at 10:49 PM, Julien Lavesque wrote:
> Brad,
>
> Thanks for your answer
>
> On 30/03/2018 02:09, Brad Hubbard wrote:
>>
>> 2018-03-19 11:03:50.819493 7f842ed47640 0 mon.controller02 does not
>> exist in monmap, will attempt to join an existing cluster
>> 2018-03-19 11:03:50.820323 7f842ed47640 0 starting mon.controller02
>> rank -1 at 172.18.8.6:6789/0 mon_data
>> /var/lib/ceph/mon/ceph-controller02 fsid
>> f37f31b1-92c5-47c8-9834-1757a677d020
>>
>> We are called 'mon.controller02' and we cannot find our name in the
>> local copy of the monmap.
>>
>> 2018-03-19 11:03:52.346318 7f842735d700 10
>> mon.controller02@-1(probing) e68 ready to join, but i'm not in the
>> monmap or my addr is blank, trying to join
>>
>> Our name is not in the copy of the monmap we got from peer controller01
>> either.
>
> During our test we deleted the controller02 monitor completely and added
> it again.
>
> The log you have is from when controller02 was added (so it wasn't in
> the monmap before).
>
>> $ cat ../controller02-mon_status.log
>> [root@controller02 ~]# ceph --admin-daemon
>> /var/run/ceph/ceph-mon.controller02.asok mon_status
>> {
>>     "name": "controller02",
>>     "rank": 1,
>>     "state": "electing",
>>     "election_epoch": 32749,
>>     "quorum": [],
>>     "outside_quorum": [],
>>     "extra_probe_peers": [],
>>     "sync_provider": [],
>>     "monmap": {
>>         "epoch": 71,
>>         "fsid": "f37f31b1-92c5-47c8-9834-1757a677d020",
>>         "modified": "2018-03-29 10:48:06.371157",
>>         "created": "0.00",
>>         "mons": [
>>             { "rank": 0, "name": "controller01", "addr": "172.18.8.5:6789\/0" },
>>             { "rank": 1, "name": "controller02", "addr": "172.18.8.6:6789\/0" },
>>             { "rank": 2, "name": "controller03", "addr": "172.18.8.7:6789\/0" }
>>         ]
>>     }
>> }
>>
>> In the monmaps we are called 'controller02', not 'mon.controller02'.
>> These names need to be identical.
>
> The cluster has been deployed using ceph-ansible with the servers'
> hostnames. All monitors are called mon.controller0x in the monmap and
> all 3 monitors have the same configuration.
>
> We have the same behavior creating a monmap from scratch:
>
> [root@controller03 ~]# monmaptool --create --add controller01
> 172.18.8.5:6789 --add controller02 172.18.8.6:6789 --add controller03
> 172.18.8.7:6789 --fsid f37f31b1-92c5-47c8-9834-1757a677d020 --clobber
> test-monmap
> monmaptool: monmap file test-monmap
> monmaptool: set fsid to f37f31b1-92c5-47c8-9834-1757a677d020
> monmaptool: writing epoch 0 to test-monmap (3 monitors)
>
> [root@controller03 ~]# monmaptool --print test-monmap
> monmaptool: monmap file test-monmap
> epoch 0
> fsid f37f31b1-92c5-47c8-9834-1757a677d020
> last_changed 2018-03-30 14:42:18.809719
> created 2018-03-30 14:42:18.809719
> 0: 172.18.8.5:6789/0 mon.controller01
> 1: 172.18.8.6:6789/0 mon.controller02
> 2: 172.18.8.7:6789/0 mon.controller03
>
>> On Thu, Mar 29, 2018 at 7:23 PM, Julien Lavesque wrote:
>>>
>>> Hi Brad,
>>>
>>> The results have been uploaded on the tracker
>>> (https://tracker.ceph.com/issues/23403)
>>>
>>> Julien
>>>
>>> On 29/03/2018 07:54, Brad Hubbard wrote:
>>>> Can you update with the result of the following commands from all
>>>> of the MONs?
>>>>
>>>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok mon_status
>>>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok quorum_status
>>>>
>>>> On Thu, Mar 29, 2018 at 3:11 PM, Gauvain Pocentek wrote:
>>>>>
>>>>> Hello Ceph users,
>>>>>
>>>>> We are having a problem on a ceph cluster running Jewel: one of the
>>>>> mons left the quorum, and we have not been able to make it join
>>>>> again. The two other monitors are running just fine, but obviously
>>>>> we need this third one.
>>>>>
>>>>> The problem happened before Jewel, when the cluster was running
>>>>> Infernalis. We upgraded hoping that it would solve the problem, but
>>>>> no luck.
>>>>>
>>>>> We've validated several things: no network problem, no clock skew,
>>>>> same OS and ceph version everywhere. We've also removed the mon
>>>>> completely, and recreated it. We also tried to run an additional
>>>>> mon on one of the OSD machines, and this mon didn't join the quorum
>>>>> either.
>>>>>
>>>>> We've opened https://tracker.ceph.com/issues/23403 with logs from
>>>>> the 3 mons during a fresh startup of the problematic mon.
>>>>>
>>>>> Is there anything we could try to do to resolve this issue? We are
>>>>> getting out of ideas. We'd appreciate any suggestion!
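Not something from the thread, but for reference: a minimal sketch of one way to force a known-good monmap onto the stalled monitor, assuming default paths, systemd units, and the mon name 'controller02' from the logs above. Verify every path and name against your own deployment before running any of this.

```shell
# 1. Grab the current monmap from the surviving quorum
#    (run anywhere with admin access to controller01/03):
ceph mon getmap -o /tmp/monmap

# 2. Stop the stalled monitor on controller02:
systemctl stop ceph-mon@controller02

# 3. Inject the map directly into its store, then restart it:
ceph-mon -i controller02 --inject-monmap /tmp/monmap
systemctl start ceph-mon@controller02

# 4. Check whether it now moves past the 'probing'/'electing' states:
ceph --admin-daemon /var/run/ceph/ceph-mon.controller02.asok mon_status
```

If the mon still cycles through elections after this, the problem is likely not the map contents but something at the election/messaging layer, which matches what the tracker logs suggest.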
Re: [ceph-users] Backfilling on Luminous
Somehow I missed replying to this. The random split would be enabled for all new PGs, or the PGs that get mapped to new OSDs. For existing OSDs, one has to use ceph-objectstore-tool's apply-layout-settings command on each OSD while the OSD is offline.

If you want to pre-split PGs using 'expected_num_objects' at the time of pool creation, be aware of this fix: http://tracker.ceph.com/issues/22530

Thanks,
-Pavan.

From: David Turner
Date: Tuesday, March 20, 2018 at 1:50 PM
To: Pavan Rallabhandi
Cc: ceph-users
Subject: EXT: Re: [ceph-users] Backfilling on Luminous

@Pavan, I did not know about 'filestore split rand factor'. It looks like it was added in Jewel and I must have missed it. To disable it, would I just set it to 0 and restart all of the OSDs? That isn't an option at the moment, but restarting the OSDs after this backfilling is done is definitely doable.

On Mon, Mar 19, 2018 at 5:28 PM Pavan Rallabhandi wrote:

David,

Pretty sure you must be aware of the filestore random split on existing OSD PGs, `filestore split rand factor`; maybe you could try that too.

Thanks,
-Pavan.

From: ceph-users on behalf of David Turner
Date: Monday, March 19, 2018 at 1:36 PM
To: Caspar Smit
Cc: ceph-users
Subject: EXT: Re: [ceph-users] Backfilling on Luminous

Sorry for being away. I set all of my backfilling to VERY slow settings over the weekend and things have been stable, but incredibly slow (1% recovery, from 3% misplaced down to 2%, all weekend). I'm back on it now and well rested.

@Caspar, swap isn't being used on these nodes and all of the affected OSDs are filestore.

@Dan, I think you hit the nail on the head. I didn't know that logging was added for subfolder splitting in Luminous!!! That's AMAZING. We are seeing consistent subfolder splitting all across the cluster. The majority of the crashed OSDs have a split starting before the crash and then a comment about it in the crash dump.

Looks like I just need to write a daemon to watch for splitting to start and throttle recovery until it's done. I had injected the following timeout settings, but they didn't seem to affect anything. I may need to have placed them in ceph.conf and let the OSDs pick up the new settings as they crashed, but I didn't really want different settings on some OSDs in the cluster.

osd_op_thread_suicide_timeout=1200 (from 180)
osd-recovery-thread-timeout=300 (from 30)

My game plan for now is to watch for splitting in the log, increase recovery sleep, decrease osd_recovery_max_active, and watch for splitting to finish before setting them back to more aggressive settings. After this cluster is done backfilling I'm going to do my best to reproduce this scenario in a test environment and open a ticket to hopefully fix why this is happening so detrimentally.

On Fri, Mar 16, 2018 at 4:00 AM Caspar Smit wrote:

Hi David,

What about memory usage?

1] 23 OSD nodes: 15x 10TB Seagate Ironwolf filestore with journals on Intel DC P3700, 70% full cluster, Dual Socket E5-2620 v4 @ 2.10GHz, 128GB RAM.

If you upgrade to bluestore, memory usage will likely increase. 15x 10TB ~= 150GB RAM needed, especially in recovery/backfilling scenarios like these.

Kind regards,
Caspar

2018-03-15 21:53 GMT+01:00 Dan van der Ster:

Did you use perf top or iotop to try to identify where the osd is stuck? Did you try increasing the op thread suicide timeout from 180s? Splitting should log at the beginning and end of an op, so it should be clear if it's taking longer than the timeout.

.. Dan

On Mar 15, 2018 9:23 PM, "David Turner" wrote:

I am aware of the filestore splitting happening. I manually split all of the subfolders a couple weeks ago on this cluster, but every time we have backfilling the newly moved PGs have a chance to split before the backfilling is done.

When that has happened in the past it caused some blocked requests and would flap OSDs if we didn't increase the osd_heartbeat_grace, but it has never consistently killed the OSDs during the task. Maybe that's new in Luminous due to some of the priority and timeout settings. This problem in general seems unrelated to the subfolder splitting, though, since it started to happen very quickly into the backfilling process, definitely before many of the recently moved PGs would have reached that point. I've also confirmed that the OSDs that are dying are not just stuck on a process (like it
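The watcher David describes could be sketched roughly like this. This is not an existing tool: the log glob, the grep pattern for the Luminous split messages, the polling interval, and the slow/fast recovery values are all assumptions to be checked against your own cluster before use.

```shell
#!/bin/sh
# Hypothetical split-watchdog sketch: poll the local OSD logs for filestore
# subfolder-split messages and throttle recovery while any are in flight.
# PATTERN is an assumption -- check what your Luminous OSDs actually log
# for splits and adjust it accordingly.
LOGS='/var/log/ceph/ceph-osd.*.log'
PATTERN='split'

throttle() {
    # Slow recovery down cluster-wide while a split appears to be running.
    ceph tell 'osd.*' injectargs '--osd_recovery_sleep 1 --osd_recovery_max_active 1'
}

restore() {
    # Back to the more aggressive backfill settings.
    ceph tell 'osd.*' injectargs '--osd_recovery_sleep 0 --osd_recovery_max_active 3'
}

while true; do
    # Crude heuristic: look for split messages in the tail of each log.
    if tail -q -n 200 $LOGS 2>/dev/null | grep -q "$PATTERN"; then
        throttle
    else
        restore
    fi
    sleep 30
done
```

A real version would want per-OSD state (split start vs. end markers) rather than a tail-and-grep heuristic, so recovery is not restored while a long split is still mid-flight.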
Re: [ceph-users] rgw make container private again
As usual, I found the solution after a while. The metadata field is not deleted as it should be per the API docs, but it can be changed. So I just changed it with

curl -X POST -i -H "X-Auth-Token: " -H "X-Container-Read: :*" https://endpoint.url/swift/v1/containername

and now the metadata field looks like this:

X-Container-Read: :*

Essentially this behaves the same as when there's no X-Container-Read at all. But overall this is still an issue - what should've taken 5 seconds to just uncheck a box in the Horizon interface turned into a couple hours of debugging. Can anyone who uses the same version check if this issue is reproducible? If so, this seems ticket-worthy.

2018-03-30 17:40 GMT+03:00 Vladimir Prokofev:
> CEPH 12.2.2, RGW.
> I'm using it as an object storage endpoint for Openstack.
>
> Recently while browsing the object storage from Horizon, I accidentally
> marked a container as public. The issue is - I can't make it private again!
> The docs state that to do it I should simply delete the X-Container-Read
> metadata, but I just can't!
>
> Examples:
>
> private container headers (only relevant output, some other empty container):
> X-Container-Bytes-Used-Actual: 0
> X-Storage-Policy: default-placement
>
> public container headers (only relevant output):
> X-Container-Bytes-Used-Actual: 114688
> X-Container-Read: .r:*,.rlistings
> X-Storage-Policy: default-placement
>
> As you can see, there's now an X-Container-Read header.
>
> I've tried to make it private again with the swift client and curl, but
> with no success. Here are some curl examples.
>
> Updating works! If I do
> curl -X POST -i -H "X-Auth-Token: " -H "X-Container-Read: .r:test" https://endpoint.url/swift/v1/containername
> the metadata will become
> X-Container-Read: .r:test
>
> But if I do
> curl -X POST -i -H "X-Auth-Token: " -H "X-Remove-Container-Read: x" https://endpoint.url/swift/v1/containername
> nothing happens; the metadata field remains there.
>
> So is this a broken API in RGW, or am I missing something? Maybe there's
> some explicit warning that after becoming public you can't make a
> container private again?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
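For reference, the same ":*" workaround can be applied with the python-swiftclient CLI instead of raw curl. The container name is a placeholder and the OS_* auth environment variables are assumed to be set; note that on this RGW version passing an empty read ACL may hit the same no-op behavior as X-Remove-Container-Read, which is why a harmless placeholder ACL is set instead.

```shell
# ':*' is a read ACL with an empty role list, which in practice behaves
# like no ACL at all (same trick as the curl command above).
swift post --read-acl ':*' containername

# Verify the resulting ACL afterwards:
swift stat containername | grep -i 'Read ACL'
```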
[ceph-users] rgw make container private again
CEPH 12.2.2, RGW. I'm using it as an object storage endpoint for Openstack.

Recently while browsing the object storage from Horizon, I accidentally marked a container as public. The issue is - I can't make it private again! The docs state that to do it I should simply delete the X-Container-Read metadata, but I just can't!

Examples:

private container headers (only relevant output, some other empty container):
X-Container-Bytes-Used-Actual: 0
X-Storage-Policy: default-placement

public container headers (only relevant output):
X-Container-Bytes-Used-Actual: 114688
X-Container-Read: .r:*,.rlistings
X-Storage-Policy: default-placement

As you can see, there's now an X-Container-Read header.

I've tried to make it private again with the swift client and curl, but with no success. Here are some curl examples.

Updating works! If I do
curl -X POST -i -H "X-Auth-Token: " -H "X-Container-Read: .r:test" https://endpoint.url/swift/v1/containername
the metadata will become
X-Container-Read: .r:test

But if I do
curl -X POST -i -H "X-Auth-Token: " -H "X-Remove-Container-Read: x" https://endpoint.url/swift/v1/containername
nothing happens; the metadata field remains there.

So is this a broken API in RGW, or am I missing something? Maybe there's some explicit warning that after becoming public you can't make a container private again?
Re: [ceph-users] 1 mon unable to join the quorum
Brad,

Thanks for your answer

On 30/03/2018 02:09, Brad Hubbard wrote:
> 2018-03-19 11:03:50.819493 7f842ed47640 0 mon.controller02 does not
> exist in monmap, will attempt to join an existing cluster
> 2018-03-19 11:03:50.820323 7f842ed47640 0 starting mon.controller02
> rank -1 at 172.18.8.6:6789/0 mon_data
> /var/lib/ceph/mon/ceph-controller02 fsid
> f37f31b1-92c5-47c8-9834-1757a677d020
>
> We are called 'mon.controller02' and we cannot find our name in the
> local copy of the monmap.
>
> 2018-03-19 11:03:52.346318 7f842735d700 10
> mon.controller02@-1(probing) e68 ready to join, but i'm not in the
> monmap or my addr is blank, trying to join
>
> Our name is not in the copy of the monmap we got from peer controller01
> either.

During our test we deleted the controller02 monitor completely and added it again. The log you have is from when controller02 was added (so it wasn't in the monmap before).

> $ cat ../controller02-mon_status.log
> [root@controller02 ~]# ceph --admin-daemon /var/run/ceph/ceph-mon.controller02.asok mon_status
> {
>     "name": "controller02",
>     "rank": 1,
>     "state": "electing",
>     "election_epoch": 32749,
>     "quorum": [],
>     "outside_quorum": [],
>     "extra_probe_peers": [],
>     "sync_provider": [],
>     "monmap": {
>         "epoch": 71,
>         "fsid": "f37f31b1-92c5-47c8-9834-1757a677d020",
>         "modified": "2018-03-29 10:48:06.371157",
>         "created": "0.00",
>         "mons": [
>             { "rank": 0, "name": "controller01", "addr": "172.18.8.5:6789\/0" },
>             { "rank": 1, "name": "controller02", "addr": "172.18.8.6:6789\/0" },
>             { "rank": 2, "name": "controller03", "addr": "172.18.8.7:6789\/0" }
>         ]
>     }
> }
>
> In the monmaps we are called 'controller02', not 'mon.controller02'.
> These names need to be identical.

The cluster has been deployed using ceph-ansible with the servers' hostnames. All monitors are called mon.controller0x in the monmap and all 3 monitors have the same configuration.

We have the same behavior creating a monmap from scratch:

[root@controller03 ~]# monmaptool --create --add controller01 172.18.8.5:6789 --add controller02 172.18.8.6:6789 --add controller03 172.18.8.7:6789 --fsid f37f31b1-92c5-47c8-9834-1757a677d020 --clobber test-monmap
monmaptool: monmap file test-monmap
monmaptool: set fsid to f37f31b1-92c5-47c8-9834-1757a677d020
monmaptool: writing epoch 0 to test-monmap (3 monitors)

[root@controller03 ~]# monmaptool --print test-monmap
monmaptool: monmap file test-monmap
epoch 0
fsid f37f31b1-92c5-47c8-9834-1757a677d020
last_changed 2018-03-30 14:42:18.809719
created 2018-03-30 14:42:18.809719
0: 172.18.8.5:6789/0 mon.controller01
1: 172.18.8.6:6789/0 mon.controller02
2: 172.18.8.7:6789/0 mon.controller03

On Thu, Mar 29, 2018 at 7:23 PM, Julien Lavesque wrote:
> Hi Brad,
>
> The results have been uploaded on the tracker
> (https://tracker.ceph.com/issues/23403)
>
> Julien
>
> On 29/03/2018 07:54, Brad Hubbard wrote:
>> Can you update with the result of the following commands from all of
>> the MONs?
>>
>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok mon_status
>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok quorum_status
>>
>> On Thu, Mar 29, 2018 at 3:11 PM, Gauvain Pocentek wrote:
>>> Hello Ceph users,
>>>
>>> We are having a problem on a ceph cluster running Jewel: one of the
>>> mons left the quorum, and we have not been able to make it join again.
>>> The two other monitors are running just fine, but obviously we need
>>> this third one.
>>>
>>> The problem happened before Jewel, when the cluster was running
>>> Infernalis. We upgraded hoping that it would solve the problem, but
>>> no luck.
>>>
>>> We've validated several things: no network problem, no clock skew,
>>> same OS and ceph version everywhere. We've also removed the mon
>>> completely, and recreated it. We also tried to run an additional mon
>>> on one of the OSD machines, and this mon didn't join the quorum either.
>>>
>>> We've opened https://tracker.ceph.com/issues/23403 with logs from the
>>> 3 mons during a fresh startup of the problematic mon.
>>>
>>> Is there anything we could try to do to resolve this issue? We are
>>> getting out of ideas. We'd appreciate any suggestion!
>>>
>>> Gauvain Pocentek
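The two admin-socket queries Brad asked for can be collected from all three controllers in one pass. A minimal sketch, assuming root ssh access to the controller hostnames used in this thread and the default admin-socket paths:

```shell
# Run mon_status and quorum_status on each monitor via its admin socket.
# Hostnames and socket paths are taken from the thread; adjust as needed.
for host in controller01 controller02 controller03; do
    echo "=== $host ==="
    ssh "root@$host" "ceph --admin-daemon /var/run/ceph/ceph-mon.$host.asok mon_status"
    ssh "root@$host" "ceph --admin-daemon /var/run/ceph/ceph-mon.$host.asok quorum_status"
done
```

Comparing the "state", "election_epoch", and monmap "epoch" fields across the three outputs is usually the quickest way to see which mon disagrees with the others.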