On Wednesday, 28 March 2018 at 15:57 -0700, Jean-Charles Lopez wrote:
> if I read you correctly you have 3 MONs in each data center. This
> means that when the link goes down you will lose quorum, making the
> cluster unavailable.

Oh yes, sure. With three MONs on each side, a 3/3 split cannot reach
the required majority of 4 out of 6, so I'm planning to add this 7th
MON.
I'm not sure the problem is related, though, since there is very little
activity on the cluster during the event, and the "hunting for new mon"
messages persist for a very long time after the connection is up again.
There are messages on the various MONs calling for elections, and the
primary MON finally wins once the link is back:

2018-03-28 19:07:18.554186 7ffacd6df700  0 log_channel(cluster) log [INF] : mon.brome calling monitor election
2018-03-28 19:07:18.554295 7ffacd6df700  1 mon.brome@0(electing).elector(1551) init, last seen epoch 1551, mid-election, bumping
...
2018-03-28 19:08:13.703667 7ffacfee4700  0 log_channel(cluster) log [INF] : mon.brome is new leader, mons brome,chlore,fluor,soufre in quorum (ranks 0,1,2,5)
...
2018-03-28 19:08:13.804470 7ffac8ed6700  0 log_channel(cluster) log [WRN] : Health check failed: 2/6 mons down, quorum brome,chlore,fluor,soufre (MON_DOWN)
2018-03-28 19:08:13.834336 7ffac8ed6700  0 log_channel(cluster) log [WRN] : overall HEALTH_WARN 2/6 mons down, quorum brome,chlore,fluor,soufre
2018-03-28 19:08:16.243895 7ffacd6df700  0 log_channel(cluster) log [INF] : mon.brome calling monitor election
2018-03-28 19:08:16.244011 7ffacd6df700  1 mon.brome@0(electing).elector(1577) init, last seen epoch 1577, mid-election, bumping
2018-03-28 19:08:17.106483 7ffacd6df700  0 log_channel(cluster) log [INF] : mon.brome is new leader, mons brome,chlore,fluor,oxygene,phosphore,soufre in quorum (ranks 0,1,2,3,4,5)
...
2018-03-28 19:08:17.178867 7ffacd6df700  0 log_channel(cluster) log [INF] : Cluster is now healthy
2018-03-28 19:08:17.227849 7ffac8ed6700  0 log_channel(cluster) log [INF] : overall HEALTH_OK

> If my perception is correct, you’d have to start a 7th MON somewhere
> else accessible from both sites for your cluster to maintain quorum
> during this event.
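Yes, that's the plan (probably on a small VM reachable from both
sites). Roughly the usual manual procedure, I think (the host name
"argon" and the paths below are just placeholders):

mkdir /var/lib/ceph/mon/ceph-argon        # data dir for the new MON
ceph auth get mon. -o /tmp/mon.keyring    # fetch the mon. keyring
ceph mon getmap -o /tmp/monmap            # fetch the current monmap
ceph-mon -i argon --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
chown -R ceph:ceph /var/lib/ceph/mon/ceph-argon
systemctl start ceph-mon@argon            # quorum then needs 4 of 7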

The actual problem I noticed occurs AFTER the event.
Since my setup is low-end (especially the routing + PPP + IPsec between
the datacenters), this may be a situation most people have not run
into, and I suspect the change to the kernel routing table is what
triggers the lasting problem in the cephfs kernel client.
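For reference, this is the kind of state I plan to capture on the
client the next time it happens, before remounting (assuming debugfs
is mounted; the fsid/client directory name varies):

ip route show                          # what pppd left as the default route
grep . /sys/kernel/debug/ceph/*/monc   # which MON the kernel client is bound to
dmesg | grep -E 'libceph|ceph:' | tail -n 20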

> Regards
> JC
> 
> > On Mar 28, 2018, at 15:40, Nicolas Huillard <nhuill...@dolomede.fr> wrote:
> > 
> > Hi all,
> > 
> > I didn't find much information regarding this kernel client loop in
> > the ML. Here are my observations, which I'll try to investigate
> > further.
> > 
> > My setup:
> > * 2 datacenters connected using an IPsec tunnel configured for
> >   routing (2 subnets)
> > * connection to the WAN using PPPoE and the pppd kernel module
> > * the PPP connection lasts exactly 7 days, after which the provider
> >   kills it and my PPP client restarts it (the WAN/inter-cluster
> >   communication is thus down for ~30s)
> > * 3 MON + n×OSD + MGR + MDS in each datacenter
> > * 2 client servers using the cephfs kernel module; one of them in
> >   each datacenter runs the pppd client and the IPsec endpoint
> >   (Pacemaker manages this front-end aspect of the cluster)
> > * a single cephfs mount, which is not managed by Pacemaker (an
> >   example fstab entry is sketched below)
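> > 
> > For reference, the mount is a plain fstab entry along these lines
> > (mount point and secret file path are illustrative):
> > 
> > 172.21.0.16,172.21.0.18,172.21.0.20,172.22.0.16,172.22.0.18,172.22.0.20:/  /mnt/cephfs  ceph  name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev  0 0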
> > 
> > Observations:
> > * when the ppp0 connection stops, pppd restores the default route
> >   from "using the PPP tunnel" to "using a virtual IP which happens
> >   to be on the same host" (but could move to the other peer):
> > 
> > Mar 28 19:07:09 neon pppd[5543]: restoring old default route to eth0 [172.21.0.254]
> > 
> > * IPsec et al. react cleanly (remove the tunnel, recreate it when
> >   PPP is up again):
> > 
> > Mar 28 19:07:43 neon pppd[5543]: Connected to 02:....:42 via interface eth1.835
> > Mar 28 19:07:43 neon pppd[5543]: CHAP authentication succeeded
> > Mar 28 19:07:43 neon pppd[5543]: peer from calling number 02:....:42 authorized
> > Mar 28 19:07:43 neon pppd[5543]: replacing old default route to eth0 [172.21.0.254]
> > 
> > * 20s after the PPP link is up and IPsec is restored, libceph starts
> >   to complain (neon is the client/gateway on 172.21.0.0/16 which
> >   lost its PPP; sodium is the remote side of the same IPsec tunnel):
> > 
> > Mar 28 19:08:03 neon kernel: [1232455.656828] libceph: mon1 172.21.0.18:6789 socket closed (con state OPEN)
> > Mar 28 19:08:12 neon kernel: [1232463.846633] ceph: mds0 caps stale
> > Mar 28 19:08:16 neon kernel: [1232468.128577] ceph: mds0 caps went stale, renewing
> > Mar 28 19:08:16 neon kernel: [1232468.128581] ceph: mds0 caps stale
> > Mar 28 19:08:30 neon kernel: [1232482.601183] libceph: mon3 172.22.0.16:6789 session established
> > Mar 28 19:09:01 neon kernel: [1232513.256059] libceph: mon3 172.22.0.16:6789 session lost, hunting for new mon
> > Mar 28 19:09:01 neon kernel: [1232513.321176] libceph: mon5 172.22.0.20:6789 session established
> > Mar 28 19:09:32 neon kernel: [1232543.977003] libceph: mon5 172.22.0.20:6789 session lost, hunting for new mon
> > Mar 28 19:09:32 neon kernel: [1232543.979567] libceph: mon2 172.21.0.20:6789 session established
> > Mar 28 19:09:39 neon kernel: [1232551.435001] ceph: mds0 caps renewed
> > Mar 28 19:10:02 neon kernel: [1232574.697885] libceph: mon2 172.21.0.20:6789 session lost, hunting for new mon
> > Mar 28 19:10:02 neon kernel: [1232574.763614] libceph: mon4 172.22.0.18:6789 session established
> > Mar 28 19:10:33 neon kernel: [1232605.418776] libceph: mon4 172.22.0.18:6789 session lost, hunting for new mon
> > Mar 28 19:10:33 neon kernel: [1232605.420896] libceph: mon0 172.21.0.16:6789 session established
> > Mar 28 19:11:04 neon kernel: [1232636.139720] libceph: mon0 172.21.0.16:6789 session lost, hunting for new mon
> > Mar 28 19:11:04 neon kernel: [1232636.205717] libceph: mon3 172.22.0.16:6789 session established
> > 
> > Mar 28 19:07:40 sodium kernel: [1211268.708716] libceph: mon0 172.21.0.16:6789 session lost, hunting for new mon
> > Mar 28 19:07:44 sodium kernel: [1211272.208735] libceph: mon5 172.22.0.20:6789 socket closed (con state OPEN)
> > Mar 28 19:07:53 sodium kernel: [1211281.683700] libceph: mon2 172.21.0.20:6789 socket closed (con state OPEN)
> > Mar 28 19:08:18 sodium kernel: [1211306.856489] libceph: mon5 172.22.0.20:6789 session established
> > Mar 28 19:08:49 sodium kernel: [1211337.575101] libceph: mon5 172.22.0.20:6789 session lost, hunting for new mon
> > Mar 28 19:08:49 sodium kernel: [1211337.640884] libceph: mon0 172.21.0.16:6789 session established
> > Mar 28 19:09:20 sodium kernel: [1211368.296187] libceph: mon0 172.21.0.16:6789 session lost, hunting for new mon
> > Mar 28 19:09:20 sodium kernel: [1211368.299194] libceph: mon4 172.22.0.18:6789 session established
> > Mar 28 19:09:50 sodium kernel: [1211399.017229] libceph: mon4 172.22.0.18:6789 session lost, hunting for new mon
> > Mar 28 19:09:50 sodium kernel: [1211399.019655] libceph: mon5 172.22.0.20:6789 session established
> > 
> > * the active MDS happens to be on sodium's side (172.22.*), whereas
> >   the primary MON happens to be on neon's side (172.21.*), which
> >   explains the dissimilar messages
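> > 
> > (A quick way to double-check who is where, run on any MON host;
> > output omitted here:)
> > 
> > ceph quorum_status | grep quorum_leader_name
> > ceph mds stat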
> > 
> > * the remote peer for sodium also had a cephfs mount, and also
> >   complained (lithium just experienced a lack of response from its
> >   gateway, a virtual IP on sodium, whose IPsec tunnel to neon was
> >   down):
> > 
> > Mar 28 19:07:35 lithium kernel: [603537.458858] libceph: mon2 172.21.0.20:6789 session lost, hunting for new mon
> > Mar 28 19:07:44 lithium kernel: [603545.702369] libceph: mon5 172.22.0.20:6789 socket closed (con state OPEN)
> > Mar 28 19:07:58 lithium kernel: [603560.177718] libceph: mon2 172.21.0.20:6789 socket closed (con state OPEN)
> > Mar 28 19:08:16 lithium kernel: [603577.883001] ceph: mds0 caps went stale, renewing
> > Mar 28 19:08:16 lithium kernel: [603577.883004] ceph: mds0 caps stale
> > Mar 28 19:08:16 lithium kernel: [603577.883483] ceph: mds0 caps renewed
> > Mar 28 19:08:19 lithium kernel: [603581.559718] libceph: mon0 172.21.0.16:6789 session established
> > Mar 28 19:08:52 lithium kernel: [603614.261565] libceph: mon0 172.21.0.16:6789 session lost, hunting for new mon
> > Mar 28 19:08:52 lithium kernel: [603614.263845] libceph: mon3 172.22.0.16:6789 session established
> > Mar 28 19:09:23 lithium kernel: [603644.982627] libceph: mon3 172.22.0.16:6789 session lost, hunting for new mon
> > Mar 28 19:09:23 lithium kernel: [603644.984546] libceph: mon5 172.22.0.20:6789 session established
> > Mar 28 19:09:54 lithium kernel: [603675.703696] libceph: mon5 172.22.0.20:6789 session lost, hunting for new mon
> > Mar 28 19:09:54 lithium kernel: [603675.773835] libceph: mon2 172.21.0.20:6789 session established
> > 
> > * all those "hunting for" messages stop when I unmount/remount the
> >   cephfs filesystem (served by the kernel module, not FUSE; the
> >   remount itself is sketched after the log below):
> > 
> > Mar 28 20:38:27 neon kernel: [1237879.172749] libceph: mon2 172.21.0.20:6789 session lost, hunting for new mon
> > Mar 28 20:38:27 neon kernel: [1237879.238902] libceph: mon4 172.22.0.18:6789 session established
> > Mar 28 20:38:57 neon kernel: [1237909.893569] libceph: mon4 172.22.0.18:6789 session lost, hunting for new mon
> > Mar 28 20:38:57 neon kernel: [1237909.895888] libceph: mon1 172.21.0.18:6789 session established
> > Mar 28 20:39:31 neon kernel: [1237942.989104] libceph: mon0 172.21.0.16:6789 session established
> > Mar 28 20:39:31 neon kernel: [1237942.990244] libceph: client114108 fsid 819889bd-de05-4bf5-ab43-da16d93f9308
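> > 
> > The remount itself is nothing fancy, something like this (same mount
> > point as in the fstab sketch above; "umount -l" may be needed if
> > something still holds the mount):
> > 
> > umount /mnt/cephfs
> > mount /mnt/cephfs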
> > 
> > I suspect that all this is related to the kernel routing table,
> > which is altered by pppd, restored to its original value, then
> > updated again when the PPP link re-opens. I have experienced similar
> > problems with daemons like dnsmasq, ntpd, etc., and the only
> > solution seems to be to restart those daemons. I may have to
> > unmount/remount cephfs to get the same effect. I'll also
> > try cephfs/FUSE.
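> > 
> > For the FUSE test I expect something like this to be enough (mount
> > point is illustrative; the MON list comes from /etc/ceph/ceph.conf):
> > 
> > ceph-fuse /mnt/cephfs-fuse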
> > 
> > Did anyone dig into the cause of this flurry of messages?

-- 
Nicolas Huillard