[ceph-users] Troubles MDS

Georg Höllrigl Wed, 16 Apr 2014 07:12:29 -0700

Hello,

I'm running Ceph MDS with one active and one standby server. A day ago one of the MDS daemons crashed and I restarted it.

Tonight it crashed again, and a few hours later the second MDS crashed as well.


#ceph -v
ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)

At the moment CephFS is dead, with the following health status:

#ceph -s
    cluster b04fc583-9e71-48b7-a741-92f4dff4cfef
     health HEALTH_WARN mds cluster is degraded; mds c is laggy
     monmap e3: 3 mons at {ceph-m-01=10.0.0.176:6789/0,ceph-m-02=10.0.1.107:6789/0,ceph-m-03=10.0.1.108:6789/0}, election epoch 6274, quorum 0,1,2 ceph-m-01,ceph-m-02,ceph-m-03
     mdsmap e2055: 1/1/1 up {0=ceph-m-03=up:rejoin(laggy or crashed)}
     osdmap e3752: 39 osds: 39 up, 39 in
      pgmap v3277576: 8328 pgs, 17 pools, 6461 GB data, 17066 kobjects
            13066 GB used, 78176 GB / 91243 GB avail
                8328 active+clean
  client io 1193 B/s rd, 0 op/s

I couldn't find any useful information in the log files or in the documentation. Any ideas how to get CephFS up and running again?


Here is part of the MDS log:

2014-04-16 14:07:05.603501 7ff184c64700  1 mds.0.server reconnect gave up on client.7846580 10.0.1.152:0/14639

2014-04-16 14:07:05.603525 7ff184c64700  1 mds.0.46 reconnect_done

2014-04-16 14:07:05.674990 7ff186d69700  1 mds.0.46 handle_mds_map i am now mds.0.46

2014-04-16 14:07:05.674996 7ff186d69700  1 mds.0.46 handle_mds_map state change up:reconnect --> up:rejoin

2014-04-16 14:07:05.674998 7ff186d69700  1 mds.0.46 rejoin_start

2014-04-16 14:07:22.347521 7ff17f825700  0 -- 10.0.1.107:6815/17325 >> 10.0.1.68:0/4128280551 pipe(0x5e2ac80 sd=930 :6815 s=2 pgs=153 cs=1 l=0 c=0x5e2e160).fault with nothing to send, going to standby


Any ideas on how to resolve the "laggy or crashed" state?


Georg
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
