Hi list,
A few days ago we had some problems with our Ceph cluster, and now we have some
OSDs that crash on start, logging messages like this right before the crash:
2017-06-09 15:35:02.226430 7fb46d9e4700 -1 log_channel(cluster) log [ERR] :
trim_object Snap 4aae0 not in clones
I can start those OSDs if I set 'osd recovery max active = 0', but then the
PGs on those OSDs stay stuck in a degraded/unclean state.
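In case it's relevant, this is roughly how we toggle that setting at runtime and
check which PGs stay stuck (osd.199 below is just an example ID; injected values
revert when the daemon restarts, so the option can also go under [osd] in ceph.conf):
# ceph tell osd.199 injectargs '--osd_recovery_max_active 0'
# ceph pg dump_stuck unclean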
Now, one of the problems we faced was missing clones, which we solved with
ceph-objectstore-tool's remove-clone-metadata option, but it doesn't seem to
work in this case (PG and object taken from the sample log posted below):
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-199 --journal-path
/var/lib/ceph/osd/ceph-199/journal --pgid 3.1cd
rbd_data.6e30d2d518fee6d.000000000000008f remove-clone-metadata $((0x4aae0))
Clone 4aae0 not present
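In case it helps with debugging, we can also dump the object's on-disk SnapSet and
decode it. This is only a sketch: it assumes the attribute is named 'snapset'
(list-attrs should confirm the exact name on the OSD) and that ceph-dencoder knows
the SnapSet type:
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-199 --journal-path \
  /var/lib/ceph/osd/ceph-199/journal --pgid 3.1cd \
  rbd_data.6e30d2d518fee6d.000000000000008f list-attrs
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-199 --journal-path \
  /var/lib/ceph/osd/ceph-199/journal --pgid 3.1cd \
  rbd_data.6e30d2d518fee6d.000000000000008f get-attr snapset > /tmp/snapset.bin
# ceph-dencoder type SnapSet import /tmp/snapset.bin decode dump_json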
Any hints on how we can debug/fix this problem?
Thanks in advance.
Our cluster:
Hammer (0.94.10)
5 mons
576 osds
Our clients:
OpenStack, some Hammer (0.94.10), some Giant (0.87.2)
(We're in the process of upgrading everything to Hammer and then to Jewel.)
Our ceph.conf:
[global]
fsid = 5ecc1509-4344-4087-9a1c-ac68fb085a75
mon initial members = ceph17, ceph19, ceph20, cephmon01, cephmon02
mon host = 172.17.22.17,172.17.22.19,172.17.22.20,172.17.23.251,172.17.23.252
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
filestore xattr use omap = true
osd pool default size = 2
osd pool default min size = 1
public network = 172.17.0.0/16
cluster network = 172.18.0.0/16
osd pool default pg num = 2048
osd pool default pgp num = 2048
mon osd down out subtree limit = host
rbd default format = 2
osd op threads = 2
osd disk threads = 1
osd max backfills = 2
osd recovery threads = 1
osd recovery max active = 2
osd recovery op priority = 2
[mon]
mon compact on start = true
[osd]
osd crush update on start = false
Here's a sample log, with [...] replacing very long text/lines, but I can
provide full logs if needed:
2017-06-08 16:47:14.519092 7fa61bee4700 -1 log_channel(cluster) log [ERR] :
trim_object Snap 4aae0 not in clones
2017-06-08 16:47:16.197600 7fa62ee75700 0 osd.199 pg_epoch: 1779479 pg[3.1cd(
v 1779479'105504641 (1779407'105501638,1779479'105504641] local-les=1779478
n=1713
ec=33 les/c 1779478/1779479 1779477/1779477/1779477) [199,134] r=0 lpr=1779477
luod=1779479'105504639 crt=1779479'105504637 lcod 1779479'105504638 mlcod
1779479'105504638 active+clean
snaptrimq=[4aae0~1,4aae2~1,4aae4~1,4aae6~1,[...],4bc5f~4,4bc64~2,4bc68~2]]
removing repgather(0xf9d09c0 1779479'105504639
rep_tid=1753 committed?=1 applied?=1 lock=0
op=osd_op(client.215404673.0:6234090 rbd_data.6e30d2d518fee6d.000000000000008f
[set-alloc-hint object_size 8388608
write_size 8388608,write 2101248~4096] 3.56a5e1cd snapc 4a86e=[]
ack+ondisk+write+known_if_redirected e1779479) v4)
2017-06-08 16:47:16.197834 7fa62ee75700 0 osd.199 pg_epoch: 1779479 pg[3.1cd(
v 1779479'105504641 (1779407'105501638,1779479'105504641] local-les=1779478
n=1713
ec=33 les/c 1779478/1779479 1779477/1779477/1779477) [199,134] r=0 lpr=1779477
luod=1779479'105504639 crt=1779479'105504637 lcod 1779479'105504638 mlcod
1779479'105504638 active+clean
snaptrimq=[4aae0~1,4aae2~1,4aae4~1,[...],4bc5f~4,4bc64~2,4bc68~2]] q front
is repgather(0x10908a00 0'0 rep_tid=1740 committed?=0
applied?=0 lock=0)
2017-06-08 16:47:16.267386 7fa62ee75700 -1 osd/ReplicatedPG.cc: In function
'void ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)' thread 7fa62ee75700
time
2017-06-08 16:47:16.197953
osd/ReplicatedPG.cc: 7477: FAILED assert(repop_queue.front() == repop)
ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85)
[0xbdf735]
2: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x12e1) [0x853111]
3: (ReplicatedPG::repop_all_committed(ReplicatedPG::RepGather*)+0xcc)
[0x85341c]
4: (Context::complete(int)+0x9) [0x6c6329]
5: (ReplicatedBackend::op_commit(ReplicatedBackend::InProgressOp*)+0x1dc)
[0xa11b4c]
6: (Context::complete(int)+0x9) [0x6c6329]
7: (ReplicatedPG::BlessedContext::finish(int)+0x94) [0x8ad2d4]
8: (Context::complete(int)+0x9) [0x6c6329]
9: (Finisher::finisher_thread_entry()+0x168) [0xb02168]
10: (()+0x7dc5) [0x7fa63ce39dc5]
11: (clone()+0x6d) [0x7fa63b91b76d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
--- begin dump of recent events ---
-10000> 2017-06-09 15:35:27.341080 7fb4862fb700 5 osd.199 1779827 tick
-9999> 2017-06-09 15:35:27.341102 7fb4862fb700 20 osd.199 1779827
scrub_time_permit should run between 0 - 24 now 15 = yes
-9998> 2017-06-09 15:35:27.341121 7fb4862fb700 20 osd.199 1779827
scrub_load_below_threshold loadavg 1.35 >= max 0.5 = no, load too high
-9997> 2017-06-09 15:35:27.341130 7fb4862fb700 20 osd.199 1779827 sched_scrub
load_is_low=0
[...]
-5> 2017-06-09 15:35:30.894979 7fb480edd700 15 osd.199 pg_epoch: 1779827
pg[3.92f( v 1779827'125690903 (1779818'125687817,1779827'125690903]
local-les=1779826
n=3186 ec=33 les/c 1779826/1779827 1779825/1779825/1779825) [199,564] r=0
lpr=1779825 lua=1779827'125690899 crt=1779827'125690898 lcod 1779827'125690902
mlcod
1779827'125690898 active+clean] publish_stats_to_osd 1779827:186528651
-4> 2017-06-09 15:35:30.894986 7fb480edd700 10 osd.199 pg_epoch: 1779827
pg[3.92f( v 1779827'125690903 (1779818'125687817,1779827'125690903]
local-les=1779826
n=3186 ec=33 les/c 1779826/1779827 1779825/1779825/1779825) [199,564] r=0
lpr=1779825 lua=1779827'125690899 crt=1779827'125690898 lcod 1779827'125690902
mlcod
1779827'125690898 active+clean] sending commit on repgather(0xf05b600
1779827'125690903 rep_tid=1162 committed?=1 applied?=0 lock=0
op=osd_op(client.260852320.0:1973188 rbd_data.3d86d73dbe422e.00000000000000ef
[set-alloc-hint object_size 8388608 write_size 8388608,write 3444736~4096]
3.ef04392f
snapc 4abd1=[] ack+ondisk+write+known_if_redirected e1779827) v4) 0xf918c00
-3> 2017-06-09 15:35:30.895055 7fb480edd700 1 -- 172.17.22.31:6800/209017
--> 172.17.21.25:0/1016405 -- osd_op_reply(1973188
rbd_data.3d86d73dbe422e.00000000000000ef [set-alloc-hint object_size 8388608
write_size 8388608,write 3444736~4096] v1779827'125690903 uv125690903 ondisk =
0)
v6 -- ?+0 0xf918c00 con 0xee5f080
-2> 2017-06-09 15:35:30.895066 7fb480edd700 5 -- op tracker -- seq: 3953,
time: 2017-06-09 15:35:30.895066, event: commit_sent, op:
osd_op(client.260852320.0:1973188 rbd_data.3d86d73dbe422e.00000000000000ef
[set-alloc-hint object_size 8388608 write_size 8388608,write 3444736~4096]
3.ef04392f
snapc 4abd1=[] ack+ondisk+write+known_if_redirected e1779827)
-1> 2017-06-09 15:35:30.906430 7fb46d9e4700 10 osd.199 pg_epoch: 1779827
pg[3.506( v 1779827'139421210 (1779816'139418118,1779827'139421210]
local-les=1779826
n=3131 ec=33 les/c 1779826/1779827 1779825/1779825/1779825) [199,171] r=0
lpr=1779825 crt=1779827'139421199 lcod 1779827'139421209 mlcod
1779827'139421209
active+clean+scrubbing+deep] be_deep_scrub
3/6e748506/rbd_data.9f52fa7790c71b.0000000000000122/head seed 4294967295
0> 2017-06-09 15:35:30.919706 7fb4816de700 -1 *** Caught signal (Aborted)
**
in thread 7fb4816de700
ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
1: /usr/bin/ceph-osd() [0xadc182]
2: (()+0xf370) [0x7fb48ec9b370]
3: (gsignal()+0x37) [0x7fb48d6b31d7]
4: (abort()+0x148) [0x7fb48d6b48c8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fb48dfb79d5]
6: (()+0x5e946) [0x7fb48dfb5946]
7: (()+0x5e973) [0x7fb48dfb5973]
8: (()+0x5eb93) [0x7fb48dfb5b93]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x27a) [0xbdf92a]
10: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x12e1) [0x853111]
11: (ReplicatedPG::repop_all_applied(ReplicatedPG::RepGather*)+0x195)
[0x8532e5]
12: (Context::complete(int)+0x9) [0x6c6329]
13: (ReplicatedBackend::op_applied(ReplicatedBackend::InProgressOp*)+0x21c)
[0xa118dc]
14: (Context::complete(int)+0x9) [0x6c6329]
15: (ReplicatedPG::BlessedContext::finish(int)+0x94) [0x8ad2d4]
16: (Context::complete(int)+0x9) [0x6c6329]
17: (void finish_contexts<Context>(CephContext*, std::list<Context*,
std::allocator<Context*> >&, int)+0xac) [0x709e3c]
18: (C_ContextsBase<Context, Context>::complete(int)+0x9) [0x6c6c19]
19: (Finisher::finisher_thread_entry()+0x168) [0xb02168]
20: (()+0x7dc5) [0x7fb48ec93dc5]
21: (clone()+0x6d) [0x7fb48d77576d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/20 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 keyvaluestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.199.log
--- end dump of recent events ---
Cheers,
--
Ricardo J. Barberis
Linux User #250625: http://counter.li.org/
LFS User #5121: http://www.linuxfromscratch.org/
Senior SysAdmin / IT Architect - www.DonWeb.com