Re: [ceph-users] faster switch to another mds

2019-02-26 Thread Marc Roos
On Tue, Feb 19, 2019 at 11:39 AM Fyodor Ustinov wrote:
>
> Hi!
>
> From documentation:
>
> mds beacon grace
> Description: The interval without beacons before Ceph declares an MDS laggy (and possibly replace

Re: [ceph-users] faster switch to another mds

2019-02-20 Thread David Turner
If I'm not mistaken, if you stop them at the same time during a reboot on a node with both mds and mon, the mons might receive it, but wait to finish their own election vote before doing anything about it. If you're trying to keep optimal uptime for your mds, then stopping it first and on its own

Re: [ceph-users] faster switch to another mds

2019-02-20 Thread Patrick Donnelly
On Tue, Feb 19, 2019 at 11:39 AM Fyodor Ustinov wrote:
>
> Hi!
>
> From documentation:
>
> mds beacon grace
> Description: The interval without beacons before Ceph declares an MDS
> laggy (and possibly replace it).
> Type: Float
> Default: 15
>
> I do not understand, 15 - are is secon

Re: [ceph-users] faster switch to another mds

2019-02-19 Thread Fyodor Ustinov
sers"
Sent: Tuesday, 19 February, 2019 20:57:49
Subject: Re: [ceph-users] faster switch to another mds

It's also been mentioned a few times that when MDS and MON are on the same host, the downtime for MDS is longer if both daemons stop at about the same time. It's been suggested

Re: [ceph-users] faster switch to another mds

2019-02-19 Thread David Turner
It's also been mentioned a few times that when MDS and MON are on the same host, the downtime for MDS is longer if both daemons stop at about the same time. It's been suggested to stop the MDS daemon, wait for `ceph mds stat` to reflect the change, and then restart the rest of the server. HT
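
The procedure suggested above (stop the MDS on its own, wait for `ceph mds stat` to reflect the change, then restart the rest of the server) could be scripted roughly as follows. This is a minimal sketch, not a tested tool. The systemd unit name `ceph-mds@<id>`, the polling interval, and the timeout are assumptions, and "reflect the change" is approximated here as "the output of `ceph mds stat` differs from what it was before the stop".

```python
import subprocess
import time


def ceph_mds_stat():
    """Return the raw text printed by `ceph mds stat`."""
    result = subprocess.run(
        ["ceph", "mds", "stat"], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def wait_for_change(read_stat, baseline, timeout=60.0, interval=1.0,
                    sleep=time.sleep):
    """Poll read_stat() until its output differs from baseline.

    Returns the new output. Raises TimeoutError once roughly `timeout`
    seconds of polling have passed (counted as polls * interval).
    """
    waited = 0.0
    while waited <= timeout:
        current = read_stat()
        if current != baseline:
            return current
        sleep(interval)
        waited += interval
    raise TimeoutError("ceph mds stat did not change within %.0f s" % timeout)


def stop_mds_then_wait(mds_id):
    """Stop the local MDS by itself and wait for the MDS status to change."""
    before = ceph_mds_stat()
    # Assumption: the MDS runs as the systemd unit ceph-mds@<id>.
    subprocess.run(["systemctl", "stop", "ceph-mds@%s" % mds_id], check=True)
    return wait_for_change(ceph_mds_stat, before)
```

Calling `stop_mds_then_wait("<id>")` with the local daemon's id would return the new status text; only after it returns would the rest of the node be rebooted.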

Re: [ceph-users] faster switch to another mds

2019-02-11 Thread Gregory Farnum
You can't tell from the client log here, but probably the MDS itself was failing over to a new instance during that interval. There's not much experience with it, but you could experiment with faster failover by reducing the mds beacon and grace times. This may or may not work reliably... On Sat,
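
As a hedged illustration of the experiment suggested above: the two settings involved are `mds_beacon_interval` and `mds_beacon_grace`. The sketch below only builds `ceph config set` command lines (which assumes a release with the centralized config store, Mimic or later) and does not run them. The values 2 and 8 are arbitrary examples, not recommendations, and setting the grace value under `global` rests on the assumption that the monitors read it as well as the MDS daemons.

```python
def beacon_tuning_commands(interval, grace):
    """Build `ceph config set` argv lists for a faster-failover experiment.

    interval: seconds between MDS beacons (mds_beacon_interval)
    grace:    seconds without beacons before an MDS is declared laggy
              (mds_beacon_grace)
    """
    if interval <= 0:
        raise ValueError("beacon interval must be positive")
    # A grace period no longer than the beacon interval would mark a
    # healthy MDS as laggy between two normal beacons.
    if grace <= interval:
        raise ValueError("grace must be larger than the beacon interval")
    return [
        ["ceph", "config", "set", "mds", "mds_beacon_interval", str(interval)],
        # Assumption: the monitors also consult this value, hence "global".
        ["ceph", "config", "set", "global", "mds_beacon_grace", str(grace)],
    ]


# Print the commands for review instead of executing them.
for argv in beacon_tuning_commands(2, 8):
    print(" ".join(argv))
```

Printing rather than executing is deliberate: as noted above, there is not much experience with lowered values, so each command is better reviewed and applied by hand on a test cluster first.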

[ceph-users] faster switch to another mds

2019-02-09 Thread Fyodor Ustinov
Hi! I have a Ceph cluster with 3 nodes running mon/mgr/mds servers. I rebooted one node and saw this in the client log:

Feb 09 20:29:14 ceph-nfs1 kernel: libceph: mon2 10.5.105.40:6789 socket closed (con state OPEN)
Feb 09 20:29:14 ceph-nfs1 kernel: libceph: mon2 10.5.105.40:6789 session lost, hunting for