Hi, our mon has suddenly started acting up and is dying in a crash loop with
the following:

2019-10-04 14:00:24.339583 lease_expire=0.000000 has v0 lc 4549352
    -3> 2019-10-04 14:00:24.335 7f6e5d461700  5 mon.km-fsn-1-dc4-m1-797678@0(leader).paxos(paxos active c 4548623..4549352) is_readable = 1 - now=2019-10-04 14:00:24.339620 lease_expire=0.000000 has v0 lc 4549352
    -2> 2019-10-04 14:00:24.343 7f6e5d461700 -1 mon.km-fsn-1-dc4-m1-797678@0(leader).osd e257349 get_full_from_pinned_map closest pinned map ver 252615 not available! error: (2) No such file or directory
    -1> 2019-10-04 14:00:24.343 7f6e5d461700 -1 /build/ceph-14.2.4/src/mon/OSDMonitor.cc: In function 'int OSDMonitor::get_full_from_pinned_map(version_t, ceph::bufferlist&)' thread 7f6e5d461700 time 2019-10-04 14:00:24.347580
/build/ceph-14.2.4/src/mon/OSDMonitor.cc: 3932: FAILED ceph_assert(err == 0)

 ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f6e68eb064e]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f6e68eb0829]
 3: (OSDMonitor::get_full_from_pinned_map(unsigned long, ceph::buffer::v14_2_0::list&)+0x80b) [0x72802b]
 4: (OSDMonitor::get_version_full(unsigned long, unsigned long, ceph::buffer::v14_2_0::list&)+0x3d2) [0x728c82]
 5: (OSDMonitor::encode_trim_extra(std::shared_ptr<MonitorDBStore::Transaction>, unsigned long)+0x8c) [0x717c3c]
 6: (PaxosService::maybe_trim()+0x473) [0x707443]
 7: (Monitor::tick()+0xa9) [0x5ecf39]
 8: (C_MonContext::finish(int)+0x39) [0x5c3f29]
 9: (Context::complete(int)+0x9) [0x6070d9]
 10: (SafeTimer::timer_thread()+0x190) [0x7f6e68f45580]
 11: (SafeTimerThread::entry()+0xd) [0x7f6e68f46e4d]
 12: (()+0x76ba) [0x7f6e67cab6ba]
 13: (clone()+0x6d) [0x7f6e674d441d]

     0> 2019-10-04 14:00:24.347 7f6e5d461700 -1 *** Caught signal (Aborted) **
 in thread 7f6e5d461700 thread_name:safe_timer

 ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
 1: (()+0x11390) [0x7f6e67cb5390]
 2: (gsignal()+0x38) [0x7f6e67402428]
 3: (abort()+0x16a) [0x7f6e6740402a]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a3) [0x7f6e68eb069f]
 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f6e68eb0829]
 6: (OSDMonitor::get_full_from_pinned_map(unsigned long, ceph::buffer::v14_2_0::list&)+0x80b) [0x72802b]
 7: (OSDMonitor::get_version_full(unsigned long, unsigned long, ceph::buffer::v14_2_0::list&)+0x3d2) [0x728c82]
 8: (OSDMonitor::encode_trim_extra(std::shared_ptr<MonitorDBStore::Transaction>, unsigned long)+0x8c) [0x717c3c]
 9: (PaxosService::maybe_trim()+0x473) [0x707443]
 10: (Monitor::tick()+0xa9) [0x5ecf39]
 11: (C_MonContext::finish(int)+0x39) [0x5c3f29]
 12: (Context::complete(int)+0x9) [0x6070d9]
 13: (SafeTimer::timer_thread()+0x190) [0x7f6e68f45580]
 14: (SafeTimerThread::entry()+0xd) [0x7f6e68f46e4d]
 15: (()+0x76ba) [0x7f6e67cab6ba]
 16: (clone()+0x6d) [0x7f6e674d441d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


The mon had been running fine for 2 months until now. This is a crashed
cluster that is currently in recovery.
Any suggestions?