Some additional info:
Today at 18:57:40, PG 3.1 [19,5,28] had a scrub date of
"2013-03-28 08:38:12.858041", and osd.28 was recovering.
Ten minutes later (at 19:07:40), that same PG 3.1 had a scrub date of
today.
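(For context, the scrub stamp can be read from the pg query output,
roughly like this; the exact JSON layout may differ between versions:

  $ ceph pg 3.1 query | grep last_scrub_stamp
)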
But at 19:41:04 I saw an error in syslog:
osd.10 52042 heartbeat_check: no reply from osd.28 since 2013-04-17
19:40:43.565511
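If I understand the heartbeat mechanism correctly, that message is
emitted once an OSD has not answered pings for "osd heartbeat grace"
seconds (20 by default), which matches the ~21 s gap between 19:40:43
and 19:41:04. Assuming the default is in effect, the relevant ceph.conf
setting would be:

  [osd]
      osd heartbeat grace = 20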
So, since 19:47:44, PG 3.1 [19,5] has been in the "active+degraded"
state, its scrub date has reverted to "2013-03-28 08:38:12.858041";
and of course osd.28 is DOWN, as its process aborted:
0> 2013-04-17 19:40:46.791010 7f6658f5a700 -1 *** Caught signal (Aborted) **
 in thread 7f6658f5a700
ceph version 0.56.4-4-gd89ab0e (d89ab0ea6fa8d0961cad82f6a81eccbd3bbd3f55)
1: /usr/bin/ceph-osd() [0x7a6289]
2: (()+0xeff0) [0x7f666b488ff0]
3: (gsignal()+0x35) [0x7f6669f121b5]
4: (abort()+0x180) [0x7f6669f14fc0]
5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f666a7a6dc5]
6: (()+0xcb166) [0x7f666a7a5166]
7: (()+0xcb193) [0x7f666a7a5193]
8: (()+0xcb28e) [0x7f666a7a528e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7c9) [0x8f9549]
10: (ReplicatedPG::_scrub(ScrubMap&)+0x1a78) [0x57a038]
11: (PG::scrub_compare_maps()+0xeb8) [0x696c18]
12: (PG::chunky_scrub()+0x2d9) [0x6c37f9]
13: (PG::scrub()+0x145) [0x6c4e55]
14: (OSD::ScrubWQ::_process(PG*)+0xc) [0x64048c]
15: (ThreadPool::worker(ThreadPool::WorkThread*)+0x879) [0x815179]
16: (ThreadPool::WorkThread::entry()+0x10) [0x817980]
17: (()+0x68ca) [0x7f666b4808ca]
18: (clone()+0x6d) [0x7f6669fafb6d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
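For decoding the trace, assuming /usr/bin/ceph-osd still carries its
symbols, something like addr2line should map the frame addresses back
to file:line (-C demangles, -f prints the function name, -e names the
binary):

  $ addr2line -Cfe /usr/bin/ceph-osd 0x57a038 0x696c18

(0x57a038 and 0x696c18 are the ReplicatedPG::_scrub and
PG::scrub_compare_maps frames from the trace above.)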
What I don't understand is why the OSD process crashes instead of
marking that PG "corrupted". Is that PG really corrupted, or is this
just an OSD bug?
Thanks,
Olivier