Hi, everyone.
Recently, while running a stress test, one of the monitors in my Ceph
cluster was marked down; all the monitors repeatedly called new elections, and
client I/O could not complete. There were three monitors in the cluster: rg3-ceph36,
rg3-ceph40, and rg3-ceph45. It was always rg3-ceph40 that was marked down, even
though its process was still running.
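For reference, this is roughly how I raise the monitor debug levels (a sketch of a ceph.conf fragment; the same levels can also be injected at runtime with `ceph tell mon.* injectargs '--debug-mon 20/0 --debug-paxos 20/0'`):

```ini
# ceph.conf fragment (sketch): raise monitor debug logging.
# "20/0" means log level 20 written to the log file, level 0 kept in memory.
[mon]
    debug mon = 20/0
    debug paxos = 20/0
```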
Here is rg3-ceph40’s monitor log with debug_mon and debug_paxos set to 20/0:
2017-02-03 16:33:25.738290 7f4f301ca700 5
mon.rg3-ceph40@1(electing).elector(7009) handle_ack from mon.2
2017-02-03 16:33:25.738294 7f4f301ca700 5
mon.rg3-ceph40@1(electing).elector(7009) so far i have
{1=37154696925806591,2=37154696925806591}
2017-02-03 16:33:28.033563 7f4f309cb700 11 mon.rg3-ceph40@1(electing) e1 tick
2017-02-03 16:33:28.033584 7f4f309cb700 20 mon.rg3-ceph40@1(electing) e1
sync_trim_providers
2017-02-03 16:33:30.737928 7f4f309cb700 5
mon.rg3-ceph40@1(electing).elector(7009) election timer expired
2017-02-03 16:33:30.737953 7f4f309cb700 10
mon.rg3-ceph40@1(electing).elector(7009) bump_epoch 7009 to 7010
2017-02-03 16:33:30.740784 7f4f309cb700 10 mon.rg3-ceph40@1(electing) e1
join_election
2017-02-03 16:33:30.740802 7f4f309cb700 10 mon.rg3-ceph40@1(electing) e1 _reset
2017-02-03 16:33:30.740805 7f4f309cb700 10 mon.rg3-ceph40@1(electing) e1
cancel_probe_timeout (none scheduled)
2017-02-03 16:33:30.740807 7f4f309cb700 10 mon.rg3-ceph40@1(electing) e1
timecheck_finish
2017-02-03 16:33:30.740810 7f4f309cb700 15 mon.rg3-ceph40@1(electing) e1
health_tick_stop
2017-02-03 16:33:30.740812 7f4f309cb700 15 mon.rg3-ceph40@1(electing) e1
health_interval_stop
2017-02-03 16:33:30.740814 7f4f309cb700 10 mon.rg3-ceph40@1(electing) e1
scrub_reset
2017-02-03 16:33:30.740816 7f4f309cb700 10
mon.rg3-ceph40@1(electing).paxos(paxos recovering c 200550..201284) restart --
canceling timeouts
2017-02-03 16:33:30.740823 7f4f309cb700 10
mon.rg3-ceph40@1(electing).paxosservice(pgmap 85794..86329) restart
2017-02-03 16:33:30.740827 7f4f309cb700 10
mon.rg3-ceph40@1(electing).paxosservice(mdsmap 1..1) restart
2017-02-03 16:33:30.740830 7f4f309cb700 10
mon.rg3-ceph40@1(electing).paxosservice(osdmap 25125..25724) restart
2017-02-03 16:33:30.740832 7f4f309cb700 10
mon.rg3-ceph40@1(electing).paxosservice(logm 98223..98787) restart
2017-02-03 16:33:30.740834 7f4f309cb700 10
mon.rg3-ceph40@1(electing).paxosservice(monmap 1..1) restart
2017-02-03 16:33:30.740836 7f4f309cb700 10
mon.rg3-ceph40@1(electing).paxosservice(auth 501..623) restart
2017-02-03 16:33:30.740872 7f4f309cb700 10 mon.rg3-ceph40@1(electing) e1
win_election epoch 7010 quorum 1,2 features 37154696925806591
2017-02-03 16:33:30.740889 7f4f309cb700 0 log_channel(cluster) log [INF] :
mon.rg3-ceph40@1 won leader election with quorum 1,2
2017-02-03 16:33:30.741119 7f4f309cb700 10 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) leader_init -- starting paxos recovery
2017-02-03 16:33:30.742301 7f4f309cb700 10 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) get_new_proposal_number = 350501
2017-02-03 16:33:30.742315 7f4f309cb700 10 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) collect with pn 350501
2017-02-03 16:33:30.742328 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(monmap 1..1) election_finished
2017-02-03 16:33:30.742332 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(monmap 1..1) _active - not active
2017-02-03 16:33:30.742334 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(pgmap 85794..86329) election_finished
2017-02-03 16:33:30.742336 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(pgmap 85794..86329) _active - not active
2017-02-03 16:33:30.742338 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(mdsmap 1..1) election_finished
2017-02-03 16:33:30.742340 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(mdsmap 1..1) _active - not active
2017-02-03 16:33:30.742341 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(osdmap 25125..25724) election_finished
2017-02-03 16:33:30.742343 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(osdmap 25125..25724) _active - not active
2017-02-03 16:33:30.742345 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(logm 98223..98787) election_finished
2017-02-03 16:33:30.742346 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(logm 98223..98787) _active - not active
2017-02-03 16:33:30.742348 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(auth 501..623) election_finished
2017-02-03 16:33:30.742350 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(auth 501..623) _active - not active
2017-02-03 16:33:30.742352 7f4f309cb700 10
mon.rg3-ceph40@1(leader).data_health(7010) start_epoch epoch 7010
2017-02-03 16:33:30.742361 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
timecheck_finish
2017-02-03 16:33:30.742363 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
resend_routed_requests
2017-02-03 16:33:30.742377 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1 requeue
for self tid 3383 log(1 entries) v1
2017-02-03 16:33:30.742383 7f4f309cb700 20 mon.rg3-ceph40@1(leader) e1 have
connection
2017-02-03 16:33:30.742386 7f4f309cb700 20 mon.rg3-ceph40@1(leader) e1
ms_dispatch existing session MonSession: mon.1 10.205.198.85:6789/0 is
openallow * for mon.1 10.205.198.85:6789/0
2017-02-03 16:33:30.742396 7f4f309cb700 20 mon.rg3-ceph40@1(leader) e1 caps
allow *
2017-02-03 16:33:30.742400 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(logm 98223..98787) dispatch log(1
entries) v1 from mon.1 10.205.198.85:6789/0
2017-02-03 16:33:30.742407 7f4f309cb700 5 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) is_readable = 0 - now=2017-02-03 16:33:30.742408
lease_expire=0.000000 has v0 lc 201284
2017-02-03 16:33:30.742420 7f4f309cb700 10
mon.rg3-ceph40@1(leader).paxosservice(logm 98223..98787) waiting for paxos ->
readable (v0)
2017-02-03 16:33:30.742423 7f4f309cb700 5 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) is_readable = 0 - now=2017-02-03 16:33:30.742424
lease_expire=0.000000 has v0 lc 201284
2017-02-03 16:33:30.742432 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
register_cluster_logger
2017-02-03 16:33:30.742438 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
timecheck_start
2017-02-03 16:33:30.742440 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
timecheck_start_round curr 0
2017-02-03 16:33:30.742442 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
timecheck_start_round new 1
2017-02-03 16:33:30.742444 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1 timecheck
2017-02-03 16:33:30.742445 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
timecheck start timecheck epoch 7010 round 1
2017-02-03 16:33:30.742452 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
timecheck send time_check( ping e 7010 r 1 ) v1 to mon.2 10.205.198.149:6789/0
2017-02-03 16:33:30.742462 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
timecheck_start_round setting up next event
2017-02-03 16:33:30.742467 7f4f309cb700 15 mon.rg3-ceph40@1(leader) e1
health_tick_start
2017-02-03 16:33:30.742469 7f4f309cb700 15 mon.rg3-ceph40@1(leader) e1
health_tick_stop
2017-02-03 16:33:30.742472 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
do_health_to_clog_interval
2017-02-03 16:33:30.742474 7f4f309cb700 10 mon.rg3-ceph40@1(leader) e1
do_health_to_clog (force)
2017-02-03 16:33:30.743637 7f4f309cb700 10
mon.rg3-ceph40@1(leader).data_health(7010) get_health
2017-02-03 16:33:30.743661 7f4f309cb700 0 log_channel(cluster) log [INF] :
HEALTH_WARN; 1 mons down, quorum 1,2 rg3-ceph40,rg3-ceph45
2017-02-03 16:33:30.743701 7f4f309cb700 15 mon.rg3-ceph40@1(leader) e1
health_interval_start
2017-02-03 16:33:30.743704 7f4f309cb700 15 mon.rg3-ceph40@1(leader) e1
health_interval_stop
2017-02-03 16:33:30.743705 7f4f309cb700 20 mon.rg3-ceph40@1(leader) e1
health_interval_calc_next_update now: 2017-02-03 16:33:30.743705, next:
2017-02-03 17:00:00.000000, interval: 3600
2017-02-03 16:33:30.743724 7f4f301ca700 20 mon.rg3-ceph40@1(leader) e1 have
connection
2017-02-03 16:33:30.743732 7f4f301ca700 20 mon.rg3-ceph40@1(leader) e1
ms_dispatch existing session MonSession: mon.1 10.205.198.85:6789/0 is
openallow * for mon.1 10.205.198.85:6789/0
2017-02-03 16:33:30.743744 7f4f301ca700 20 mon.rg3-ceph40@1(leader) e1 caps
allow *
2017-02-03 16:33:30.743747 7f4f301ca700 10
mon.rg3-ceph40@1(leader).paxosservice(logm 98223..98787) dispatch log(1
entries) v1 from mon.1 10.205.198.85:6789/0
2017-02-03 16:33:30.743756 7f4f301ca700 5 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) is_readable = 0 - now=2017-02-03 16:33:30.743756
lease_expire=0.000000 has v0 lc 201284
2017-02-03 16:33:30.743766 7f4f301ca700 10
mon.rg3-ceph40@1(leader).paxosservice(logm 98223..98787) waiting for paxos ->
readable (v0)
2017-02-03 16:33:30.743770 7f4f301ca700 5 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) is_readable = 0 - now=2017-02-03 16:33:30.743775
lease_expire=0.000000 has v0 lc 201284
2017-02-03 16:33:30.743786 7f4f301ca700 20 mon.rg3-ceph40@1(leader) e1 have
connection
2017-02-03 16:33:30.743789 7f4f301ca700 20 mon.rg3-ceph40@1(leader) e1
ms_dispatch existing session MonSession: mon.1 10.205.198.85:6789/0 is
openallow * for mon.1 10.205.198.85:6789/0
2017-02-03 16:33:30.743796 7f4f301ca700 20 mon.rg3-ceph40@1(leader) e1 caps
allow *
2017-02-03 16:33:30.743798 7f4f301ca700 10
mon.rg3-ceph40@1(leader).paxosservice(logm 98223..98787) dispatch log(1
entries) v1 from mon.1 10.205.198.85:6789/0
2017-02-03 16:33:30.743803 7f4f301ca700 5 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) is_readable = 0 - now=2017-02-03 16:33:30.743804
lease_expire=0.000000 has v0 lc 201284
2017-02-03 16:33:30.743809 7f4f301ca700 10
mon.rg3-ceph40@1(leader).paxosservice(logm 98223..98787) waiting for paxos ->
readable (v0)
2017-02-03 16:33:30.743812 7f4f301ca700 5 mon.rg3-ceph40@1(leader).paxos(paxos
recovering c 200550..201284) is_readable = 0 - now=2017-02-03 16:33:30.743814
lease_expire=0.000000 has v0 lc 201284
2017-02-03 16:33:30.789898 7f4f301ca700 20 mon.rg3-ceph40@1(leader) e1 have
connection
2017-02-03 16:33:30.789910 7f4f301ca700 20 mon.rg3-ceph40@1(leader) e1
ms_dispatch existing session MonSession: mon.2 10.205.198.149:6789/0 is
openallow * for mon.2 10.205.198.149:6789/0
I then shut down rg3-ceph40's monitor with /etc/init.d/ceph stop mon. The other
two monitors stopped calling new elections; however, I/O was still stuck. The
following is the monitor log of rg3-ceph36:
2017-02-03 17:03:02.279230 7fe0be819700 10 mon.rg3-ceph36@0(leader).pg v86564
encode_pending v 86565
2017-02-03 17:03:02.337531 7fe0be819700 10 mon.rg3-ceph36@0(leader).log v99039
encode_full log v 99039
2017-02-03 17:03:02.337612 7fe0be819700 10 mon.rg3-ceph36@0(leader).log v99039
encode_pending v99040
2017-02-03 17:03:02.354173 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader) e1
refresh_from_paxos
2017-02-03 17:03:02.354261 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86564
update_from_paxos read_incremental
2017-02-03 17:03:02.354766 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
read_pgmap_meta
2017-02-03 17:03:02.354874 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
map_pg_creates to 0 pgs -- no change
2017-02-03 17:03:02.354881 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
send_pg_creates to 0 pgs
2017-02-03 17:03:02.354885 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
update_logger
2017-02-03 17:03:02.355051 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99039
update_from_paxos
2017-02-03 17:03:02.355061 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99039
update_from_paxos version 99039 summary v 99039
2017-02-03 17:03:02.355180 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).auth v625
update_from_paxos
2017-02-03 17:03:02.355188 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
map_pg_creates to 0 pgs -- no change
2017-02-03 17:03:02.355190 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
send_pg_creates to 0 pgs
2017-02-03 17:03:02.355234 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
create_pending v 86566
2017-02-03 17:03:02.355238 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.67 10.205.198.148:6812/99322
2017-02-03 17:03:02.355259 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.12 10.205.198.147:6802/96650
2017-02-03 17:03:02.355272 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.80 10.205.198.147:6816/99711
2017-02-03 17:03:02.355283 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.72 10.205.198.82:6804/2609
2017-02-03 17:03:02.355292 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.17 10.205.198.82:6802/2275
2017-02-03 17:03:02.355302 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.22 10.205.198.149:6802/11553
2017-02-03 17:03:02.355324 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.83 10.205.198.145:6816/83116
2017-02-03 17:03:02.355335 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.86 10.205.198.146:6806/92691
2017-02-03 17:03:02.355362 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.71 10.205.198.82:6808/3461
2017-02-03 17:03:02.355377 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.32 10.205.198.81:6806/48193
2017-02-03 17:03:02.355395 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.30 10.205.198.83:6812/1188
2017-02-03 17:03:02.355418 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.65 10.205.198.145:6810/81841
2017-02-03 17:03:02.355448 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.33 10.205.198.149:6808/12810
2017-02-03 17:03:02.355466 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.34 10.205.198.149:6816/14873
2017-02-03 17:03:02.355477 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.3 10.205.198.81:6802/47424
2017-02-03 17:03:02.355488 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.20 10.205.198.83:6806/101823
2017-02-03 17:03:02.355498 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.55 10.205.198.148:6816/100258
2017-02-03 17:03:02.355509 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.9 10.205.198.145:6806/80779
2017-02-03 17:03:02.355519 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.84 10.205.198.146:6804/92333
2017-02-03 17:03:02.355528 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.54 10.205.198.145:6814/82758
2017-02-03 17:03:02.355538 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.87 10.205.198.146:6812/94054
2017-02-03 17:03:02.355563 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).pg v86565
_updated_stats for osd.68 10.205.198.148:6802/97249
2017-02-03 17:03:02.355603 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
check_osd_map already seen 25728 >= 25728
2017-02-03 17:03:02.355611 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
update_logger
2017-02-03 17:03:02.355635 7fe0bfc1f700 0 log_channel(cluster) log [INF] :
pgmap v86565: 4096 pgs: 4096 active+clean; 296 MB data, 14135 MB used, 290 TB /
306 TB avail; 5268 B/s rd, 6 op/s
2017-02-03 17:03:02.379079 7fe0be018700 10 mon.rg3-ceph36@0(leader).log v99039
preprocess_query log(1 entries) v1 from mon.0 10.205.198.81:6789/0
2017-02-03 17:03:02.379089 7fe0be018700 10 mon.rg3-ceph36@0(leader).log v99039
preprocess_log log(1 entries) v1 from mon.0
2017-02-03 17:03:02.412442 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader) e1
refresh_from_paxos
2017-02-03 17:03:02.412681 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
update_from_paxos
2017-02-03 17:03:02.412689 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
update_from_paxos version 99040 summary v 99039
2017-02-03 17:03:02.412709 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
update_from_paxos latest full 99039
2017-02-03 17:03:02.412743 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).log v99040
update_from_paxos applying incremental log 99040 2017-02-03 17:03:01.280909
mon.0 10.205.198.81:6789/0 13620 : cluster [INF] pgmap v86564: 4096 pgs: 4096
active+clean; 296 MB data, 13947 MB used, 290 TB / 306 TB avail
2017-02-03 17:03:02.412822 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
check_subs
2017-02-03 17:03:02.412962 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).auth v625
update_from_paxos
2017-02-03 17:03:02.412969 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
map_pg_creates to 0 pgs -- no change
2017-02-03 17:03:02.412972 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
send_pg_creates to 0 pgs
2017-02-03 17:03:02.413011 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
create_pending v 99041
2017-02-03 17:03:02.413015 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).log v99040
_updated_log for mon.0 10.205.198.81:6789/0
2017-02-03 17:03:02.413042 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
preprocess_query log(1 entries) v1 from mon.0 10.205.198.81:6789/0
2017-02-03 17:03:02.413048 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
preprocess_log log(1 entries) v1 from mon.0
2017-02-03 17:03:02.413055 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
prepare_update log(1 entries) v1 from mon.0 10.205.198.81:6789/0
2017-02-03 17:03:02.413059 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
prepare_log log(1 entries) v1 from mon.0
2017-02-03 17:03:02.413063 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99040
logging 2017-02-03 17:03:02.355639 mon.0 10.205.198.81:6789/0 13621 : cluster
[INF] pgmap v86565: 4096 pgs: 4096 active+clean; 296 MB data, 14135 MB used,
290 TB / 306 TB avail; 5268 B/s rd, 6 op/s
2017-02-03 17:03:03.412444 7fe0be819700 10 mon.rg3-ceph36@0(leader).log v99040
encode_full log v 99040
2017-02-03 17:03:03.412548 7fe0be819700 10 mon.rg3-ceph36@0(leader).log v99040
encode_pending v99041
2017-02-03 17:03:03.479045 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader) e1
refresh_from_paxos
2017-02-03 17:03:03.479292 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99041
update_from_paxos
2017-02-03 17:03:03.479299 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99041
update_from_paxos version 99041 summary v 99040
2017-02-03 17:03:03.479320 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99041
update_from_paxos latest full 99040
2017-02-03 17:03:03.479348 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).log v99041
update_from_paxos applying incremental log 99041 2017-02-03 17:03:02.355639
mon.0 10.205.198.81:6789/0 13621 : cluster [INF] pgmap v86565: 4096 pgs: 4096
active+clean; 296 MB data, 14135 MB used, 290 TB / 306 TB avail; 5268 B/s rd, 6
op/s
2017-02-03 17:03:03.479423 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99041
check_subs
2017-02-03 17:03:03.479572 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).auth v625
update_from_paxos
2017-02-03 17:03:03.479581 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
map_pg_creates to 0 pgs -- no change
2017-02-03 17:03:03.479584 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).pg v86565
send_pg_creates to 0 pgs
2017-02-03 17:03:03.479631 7fe0bfc1f700 10 mon.rg3-ceph36@0(leader).log v99041
create_pending v 99042
2017-02-03 17:03:03.479638 7fe0bfc1f700 7 mon.rg3-ceph36@0(leader).log v99041
_updated_log for mon.0 10.205.198.81:6789/0
2017-02-03 17:03:04.489266 7fe0be819700 10 mon.rg3-ceph36@0(leader).pg v86565
check_down_pgs
2017-02-03 17:03:04.489440 7fe0be819700 10 mon.rg3-ceph36@0(leader).pg v86565
v86565: 4096 pgs: 4096 active+clean; 296 MB data, 14135 MB used, 290 TB / 306
TB avail; 5268 B/s rd, 6 op/s
2017-02-03 17:03:04.489473 7fe0be819700 10 mon.rg3-ceph36@0(leader).osd e25728
e25728: 90 osds: 90 up, 90 in
2017-02-03 17:03:04.489530 7fe0be819700 10 mon.rg3-ceph36@0(leader).osd e25728
min_last_epoch_clean 25728
2017-02-03 17:03:04.489533 7fe0be819700 10 mon.rg3-ceph36@0(leader).log v99041
log
2017-02-03 17:03:04.489538 7fe0be819700 10 mon.rg3-ceph36@0(leader).auth v625
auth
2017-02-03 17:03:05.814732 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1
handle_subscribe mon_subscribe({monmap=2+,osd_pg_creates=0}) v2
2017-02-03 17:03:05.814760 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1
check_sub monmap next 2 have 1
2017-02-03 17:03:08.567850 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1
handle_subscribe mon_subscribe({monmap=2+,osd_pg_creates=0}) v2
2017-02-03 17:03:08.567880 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1
check_sub monmap next 2 have 1
2017-02-03 17:03:09.489665 7fe0be819700 10 mon.rg3-ceph36@0(leader).pg v86565
v86565: 4096 pgs: 4096 active+clean; 296 MB data, 14135 MB used, 290 TB / 306
TB avail; 5268 B/s rd, 6 op/s
2017-02-03 17:03:09.489708 7fe0be819700 10 mon.rg3-ceph36@0(leader).osd e25728
e25728: 90 osds: 90 up, 90 in
2017-02-03 17:03:09.489750 7fe0be819700 10 mon.rg3-ceph36@0(leader).osd e25728
min_last_epoch_clean 25728
2017-02-03 17:03:09.489753 7fe0be819700 10 mon.rg3-ceph36@0(leader).log v99041
log
2017-02-03 17:03:09.489758 7fe0be819700 10 mon.rg3-ceph36@0(leader).auth v625
auth
2017-02-03 17:03:10.940004 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1 received
forwarded message from osd.29 10.205.198.82:6812/4405 via mon.2
10.205.198.149:6789/0
2017-02-03 17:03:10.940036 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1 caps
are allow *
2017-02-03 17:03:10.940040 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1 entity
name 'osd.29' type 4
2017-02-03 17:03:10.940042 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1 mesg
0x7fe0bb1a4c00 from 10.205.198.149:6789/0
2017-02-03 17:03:10.940065 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
preprocess_query pg_stats(0 pgs tid 16094 v 0) v1 from osd.29
10.205.198.82:6812/4405
2017-02-03 17:03:10.940082 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
prepare_update pg_stats(0 pgs tid 16094 v 0) v1 from osd.29
10.205.198.82:6812/4405
2017-02-03 17:03:10.940088 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
prepare_pg_stats pg_stats(0 pgs tid 16094 v 0) v1 from osd.29
2017-02-03 17:03:10.940101 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
message contains no new osd|pg stats
2017-02-03 17:03:10.986247 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1
handle_subscribe mon_subscribe({monmap=2+,osd_pg_creates=0}) v2
2017-02-03 17:03:10.986265 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1
check_sub monmap next 2 have 1
2017-02-03 17:03:11.344862 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1 received
forwarded message from osd.77 10.205.198.145:6812/82199 via mon.2
10.205.198.149:6789/0
2017-02-03 17:03:11.344881 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1 caps
are allow *
2017-02-03 17:03:11.344885 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1 entity
name 'osd.77' type 4
2017-02-03 17:03:11.344887 7fe0be018700 10 mon.rg3-ceph36@0(leader) e1 mesg
0x7fe0bb1a5b00 from 10.205.198.149:6789/0
2017-02-03 17:03:11.344904 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
preprocess_query pg_stats(0 pgs tid 16272 v 0) v1 from osd.77
10.205.198.145:6812/82199
2017-02-03 17:03:11.344926 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
prepare_update pg_stats(0 pgs tid 16272 v 0) v1 from osd.77
10.205.198.145:6812/82199
2017-02-03 17:03:11.344931 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
prepare_pg_stats pg_stats(0 pgs tid 16272 v 0) v1 from osd.77
2017-02-03 17:03:11.344941 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
message contains no new osd|pg stats
2017-02-03 17:03:11.833360 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
preprocess_query pg_stats(0 pgs tid 17368 v 0) v1 from osd.49
10.205.198.85:6812/13922
2017-02-03 17:03:11.833385 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
prepare_update pg_stats(0 pgs tid 17368 v 0) v1 from osd.49
10.205.198.85:6812/13922
2017-02-03 17:03:11.833390 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
prepare_pg_stats pg_stats(0 pgs tid 17368 v 0) v1 from osd.49
2017-02-03 17:03:11.833398 7fe0be018700 10 mon.rg3-ceph36@0(leader).pg v86565
message contains no new osd|pg stats
I’m running the Hammer release, 0.94.5. Any help would be appreciated. Thank you.
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com