The lease timeout means this (peon) monitor hasn't heard from the leader monitor in too long; its read lease on the system state has expired. So it calls a new election since that means the leader is down or misbehaving. Do the other monitors have a similar problem at this stage?
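To check whether the other monitors hit the same thing, something like the following could be run over each monitor's log. This is only a rough sketch: the function names are made up for illustration, and the regular expression is modelled on the single lease_timeout line quoted in this thread, so it may need adjusting for other log formats or debug levels. It also assumes the logs are in the usual place, /var/log/ceph/ceph-mon.<id>.log.

```python
import re
from collections import Counter

# Pattern modelled on the one lease_timeout line quoted in this thread, e.g.:
# 2018-01-16 16:33:17.269954 7f0d042be700 1 mon.ceph-mon3@2(peon).paxos(paxos
#   active c 84337..84838) lease_timeout -- calling new election
# (a single line in the real log file; wrapped here only for readability).
LEASE_TIMEOUT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+\S+\s+\d+\s+"
    r"mon\.(?P<mon>[^@\s]+)@(?P<rank>\d+)\((?P<state>\w+)\)\.paxos\(.*\)\s+"
    r"lease_timeout"
)


def find_lease_timeouts(lines):
    """Return (timestamp, monitor name, state) for every lease_timeout line."""
    hits = []
    for line in lines:
        m = LEASE_TIMEOUT.match(line.strip())
        if m:
            hits.append((m.group("ts"), m.group("mon"), m.group("state")))
    return hits


def count_by_monitor(hits):
    """Count lease_timeout events per monitor name (hypothetical helper)."""
    return Counter(mon for _ts, mon, _state in hits)
```

Feeding it the lines of each monitor's log (for example `find_lease_timeouts(open(path))`) and comparing the timestamps across monitors should show whether only one peon is timing out or all of them are.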
The manager freezing until you restart it is a separate bug, but I'm not sure what the dashboard/mgr people will want to see there. John?
-Greg

On Sun, Jan 28, 2018 at 9:11 AM Karun Josy <[email protected]> wrote:

> Still the issue is continuing. Any one else has noticed it ?
>
> When this happens, the Ceph Dashboard GUI gets stuck and we have to
> restart the manager daemon to make it work again
>
> Karun Josy
>
> On Wed, Jan 17, 2018 at 6:16 AM, Karun Josy <[email protected]> wrote:
>
>> Hello,
>>
>> In one of our cluster set up, there is frequent monitor elections
>> happening.
>> In the logs of one of the monitor, there is "lease_timeout" message
>> before that happens. Can anyone help me to figure it out ?
>> (When this happens, the Ceph Dashboard GUI gets stuck and we have to
>> restart the manager daemon to make it work again)
>>
>> Ceph version : Luminous 12.2.2
>>
>> Log :
>> =========================
>>
>> 2018-01-16 16:33:08.001937 7f0cfbaad700 4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/compaction_job.cc:1173] [default] [JOB 885] Compacted 1@0 + 1@1 files to L1 => 20046585 bytes
>> 2018-01-16 16:33:08.015891 7f0cfbaad700 4 rocksdb: (Original Log Time 2018/01/16-16:33:08.015826) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/compaction_job.cc:621] [default] compacted to: base level 1 max bytes base 268435456 files[0 1 0 0 0 0 0] max score 0.07, MB/sec: 32.7 rd, 30.9 wr, level 1, files in(1, 1) out(1) MB in(1.3, 18.9) out(19.1), read-write-amplify(31.0) write-amplify(15.1) OK, records in: 4305, records dropped: 515
>>
>> 2018-01-16 16:33:08.015897 7f0cfbaad700 4 rocksdb: (Original Log Time 2018/01/16-16:33:08.015840) EVENT_LOG_v1 {"time_micros": 1516149188015833, "job": 885, "event": "compaction_finished", "compaction_time_micros": 647876, "output_level": 1, "num_output_files": 1, "total_output_size": 20046585, "num_input_records": 4305, "num_output_records": 3790, "num_subcompactions": 1, "num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 0, "lsm_state": [0, 1, 0, 0, 0, 0, 0]}
>> 2018-01-16 16:33:08.016131 7f0cfbaad700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1516149188016128, "job": 885, "event": "table_file_deletion", "file_number": 2419}
>> 2018-01-16 16:33:08.018147 7f0cfbaad700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1516149188018146, "job": 885, "event": "table_file_deletion", "file_number": 2417}
>> 2018-01-16 16:33:11.051010 7f0d042be700 0 mon.ceph-mon3@2(peon).data_health(436) update_stats avail 84% total 20918 MB, used 2179 MB, avail 17653 MB
>> 2018-01-16 16:33:17.269954 7f0d042be700 1 mon.ceph-mon3@2(peon).paxos(paxos active c 84337..84838) lease_timeout -- calling new election
>> 2018-01-16 16:33:17.291096 7f0d01ab9700 0 log_channel(cluster) log [INF] : mon.ceph-sgp-mon3 calling new monitor election
>> 2018-01-16 16:33:17.291182 7f0d01ab9700 1 mon.ceph-mon3@2(electing).elector(436) init, last seen epoch 436
>> 2018-01-16 16:33:20.834853 7f0d01ab9700 1 mon.ceph-mon3@2(peon).log v23189 check_sub sending message to client.65755 10.255.0.95:0/2603001850 with 8 entries (version 23189)
>>
>> Karun
>>
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
