The issue, Sage, is that we have to deal with the cluster being
re-expanded.  If we start with 5 monitors and scale back to 3 by
running the "ceph mon remove N" command after stopping each monitor,
and we do not restart the remaining monitors, we cannot re-add the
monitors that were previously removed.  They will suicide at startup.
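
For concreteness, the sequence we run looks roughly like this (the mon
IDs d and e are placeholders for the two monitors being removed):

        service ceph stop mon.d
        ceph mon remove d
        service ceph stop mon.e
        ceph mon remove e
        # mon.a, mon.b, mon.c keep running and are never restarted

Later, when we try to bring mon.d back at its original IP, the freshly
rebuilt daemon is the one that suicides.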

On Mon, Jun 24, 2013 at 4:22 PM, Sage Weil <[email protected]> wrote:
> On Mon, 24 Jun 2013, Mandell Degerness wrote:
>> Hmm.  This is a bit ugly from our perspective, but not fatal to your
>> design (just our implementation).  At the time we run the rm, the
>> cluster is smaller and so the restart of each monitor is not fatal to
>> the cluster.  The problem is on our side in terms of guaranteeing
>> order of behaviors.
>
> Sorry, I'm still confused about where the monitor gets restarted.  It
> doesn't matter if the removed monitor is stopped or failed/gone; 'ceph mon
> rm ...' will remove it from the monmap and quorum.  It sounds like you're
> suggesting that the surviving monitors need to be restarted, but they do
> not, as long as enough of them are alive to form a quorum and pass the
> decree that the mon cluster is smaller.  So 5 -> 2 would be problematic,
> but 5 -> 3 (assuming there are 3 currently up) will work without
> restarts...
>
> sage
>
>
>>
>> On Mon, Jun 24, 2013 at 1:54 PM, Sage Weil <[email protected]> wrote:
>> > On Mon, 24 Jun 2013, Mandell Degerness wrote:
>> >> I'm testing the change (actually restarting the monitors after the
>> >> monitor removal), but this brings up the reason why we didn't want
>> >> to do this in the first place:  When reducing the number of monitors
>> >> from 5 to 3, we are guaranteed to have a service outage for the time
>> >> it takes to restart at least one of the monitors (and, possibly, for
>> >> two of the restarts, now that I think on it).  In theory, the
>> >> stop/start cycle is very short and should complete in a reasonable
>> >> time.  What I'm concerned about, however, is the case that something
>> >> is wrong with our re-written config file.  In that case, the outage is
>> >> immediate and will last until the problem is corrected on the first
>> >> server to have the monitor restarted.
>> >
>> > I'm jumping into this thread late, but: why would you follow the second
>> > removal procedure (the one for broken clusters)?  To go from 5->3 mons, you should
>> > just stop 2 of the mons and do 'ceph mon rm <addr1>' 'ceph mon rm
>> > <addr2>'.
>> >
>> > sage
>> >
>> >>
>> >> On Mon, Jun 24, 2013 at 10:07 AM, John Nielsen <[email protected]> wrote:
>> >> > On Jun 21, 2013, at 5:00 PM, Mandell Degerness 
>> >> > <[email protected]> wrote:
>> >> >
>> >> >> There is a scenario where we would want to remove a monitor and, at a
>> >> >> later date, re-add the monitor (using the same IP address).  Is there
>> >> >> a supported way to do this?  I tried deleting the monitor directory
>> >> >> and rebuilding from scratch following the add monitor procedures from
>> >> >> the web, but the monitor still suicides when started.
>> >> >
>> >> >
>> >> > I assume you're already referencing this:
>> >> > http://ceph.com/docs/master/rados/operations/add-or-rm-mons/
>> >> >
>> >> > I have done what you describe before. There were a couple of hiccups;
>> >> > let's see if I remember the specifics:
>> >> >
>> >> > Remove:
>> >> > Follow the first two steps under "removing a monitor (manual)" at the
>> >> > link above:
>> >> >         service ceph stop mon.N
>> >> >         ceph mon remove N
>> >> > Comment out the monitor entry in ceph.conf on ALL mon, osd and client
>> >> > hosts.
>> >> > Restart services as required to make everyone happy with the smaller
>> >> > set of monitors.
>> >> >
>> >> > Re-add:
>> >> > Wipe the old monitor's directory and re-create it.
>> >> > Follow the steps for "adding a monitor (manual)" at the link above.
>> >> > Instead of adding a new entry you can just un-comment your old ones in
>> >> > ceph.conf. You can also start the monitor with "service ceph start
>> >> > mon.N" on the appropriate host instead of running it yourself (step 8).
>> >> > Note that you DO need to run ceph-mon as specified in step 5. I was
>> >> > initially confused about the '--mkfs' flag there: it doesn't refer to
>> >> > the OS's filesystem; use a directory or mountpoint that is
>> >> > already prepared/mounted.
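>> >> >
>> >> > Roughly, the re-add boils down to the following (N and the paths are
>> >> > placeholders; the doc page above is the authoritative reference):
>> >> >
>> >> >         ceph mon getmap -o /tmp/monmap
>> >> >         ceph auth get mon. -o /tmp/keyring
>> >> >         ceph-mon -i N --mkfs --monmap /tmp/monmap --keyring /tmp/keyring
>> >> >         ceph mon add N <ip>[:<port>]
>> >> >         service ceph start mon.N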
>> >> >
>> >> > HTH. If you run into trouble, post exactly the steps you followed and
>> >> > additional details about your setup.
>> >> >
>> >> > JN
>> >> >
>> >> _______________________________________________
>> >> ceph-users mailing list
>> >> [email protected]
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>
>> >>
>>
>>