[ceph-users] ceph-monstore-tool missing in 12.1.1 on Xenial?

2017-07-30 Thread Daniel K
All 3 of my mons crashed while I was adding OSDs and now error out with:

 (/build/ceph-12.1.1/src/mon/OSDMonitor.cc: 3018: FAILED
assert(osdmap.get_up_osd_features() & CEPH_FEATURE_MON_STATEFUL_SUB)


I've resorted to just rebuilding the mon DB and making 3 new mon daemons,
using the steps here:
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/
under "Recovery using OSDs" but I am not finding the ceph-monstore-tool
anywhere.

Is there a different package I need to install or did this tool get
replaced with something else in Luminous?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bug in OSD Maps

2017-07-30 Thread Stuart Harland
I know this thread has been silent for a while, but for various reasons I have 
been forced to work on this specific issue this weekend.

As it turns out, you were partly right: the fix is to use 
ceph-objectstore-tool, not to remove the PG in question but to inject the 
missing OSD map epoch. Once the OSD has the required epoch, it starts 
successfully and resumes downloading OSD maps through the normal mechanism.

As an example, osd id 123 on storage1 with missing epoch 9876:

On a monitor:
  ceph osd getmap 9876 > e9876

SCP (or other mechanism) the file e9876 from monitor to storage1

Then forcibly inject the epoch into the stopped OSD (our cluster is named 
txc1, so adjust the cluster name and paths to match yours).

  sudo ceph-objectstore-tool --cluster=txc1 \
    --data-path /var/lib/ceph/osd/txc1-123 \
    --journal-path /var/lib/ceph/osd/txc1-123/journal \
    --op set-osdmap --file /path/to/e9876 --epoch 9876 --force
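
For anyone repeating this on several OSDs, the injection command above could 
be generated by a small dry-run helper like the sketch below. The function 
name print_set_osdmap_cmd and its argument order are made up for 
illustration, and it assumes the data directory and journal follow the same 
/var/lib/ceph/osd/<cluster>-<id> layout as in my example. It only prints the 
command so it can be checked before being run by hand against a stopped OSD.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the ceph-objectstore-tool command from
# the example above instead of executing it.
# Assumption: the OSD data directory is /var/lib/ceph/osd/<cluster>-<id> and
# the journal is at <data dir>/journal, as in the example.
print_set_osdmap_cmd() {
    cluster=$1   # cluster name, e.g. txc1
    osd_id=$2    # id of the stopped OSD, e.g. 123
    epoch=$3     # missing OSD map epoch, e.g. 9876
    map_file=$4  # map file fetched from a monitor and copied to this host

    data_path="/var/lib/ceph/osd/${cluster}-${osd_id}"
    printf 'sudo ceph-objectstore-tool --cluster=%s --data-path %s --journal-path %s --op set-osdmap --file %s --epoch %s --force\n' \
        "$cluster" "$data_path" "${data_path}/journal" "$map_file" "$epoch"
}

# Reproduces the command for osd 123, epoch 9876 on cluster txc1.
print_set_osdmap_cmd txc1 123 9876 /path/to/e9876
```

Printing rather than executing is deliberate: --force bypasses safety checks, 
so it seems wiser to read the generated line first.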

I wanted to share this for posterity, as I cannot be the only person who has 
run across this, and documentation on it appears to be limited (what 
documentation of ceph-objectstore-tool there is is slightly inconsistent with 
how the tool actually behaves). Thanks also to Wido for the poke in the right 
direction elsewhere; he filled in the missing bits.

Regards,

Stuart 


 − Stuart Harland: 
Infrastructure Engineer
Email: s.harl...@livelinktechnology.net 

Tel: +44 (0) 207 183 1411



LiveLink Technology Ltd
McCormack House
56A East Street
Havant
PO9 1BS




> On 26 May 2017, at 22:53, Gregory Farnum wrote:
> 
> Yeah, not sure. It might just be that the restarting is newly exposing old 
> issues, but I don't see how. I gather from skimming that ticket that it was a 
> disk state bug earlier on that was going undetected until Jewel, which is why 
> I was wondering about the upgrades.
> 
