Hi,

This is one we've seen before: issue #326.

        http://tracker.newdream.net/issues/326

Was that the first (and only?) osd to fail?

What kind of workload were you subjecting the cluster to?  Just the file 
system?  RBD?  Anything unusual?

Also, can you confirm what version of the code you were running?  The osd 
log at /var/log/ceph/osd.*.log should have a version number and sha1 id, 
something like

ceph version 0.22~rc (3cd9d853cd58c79dc12427be8488e57970abda04)
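
If it saves you some digging, here is a rough Python sketch for pulling that
line out of the osd logs. It assumes the version line looks like the example
above ("ceph version", a version string, then a hex sha1 in parentheses). The
name find_ceph_versions and the regex are just illustrations I made up, not
part of ceph.

```python
import glob
import re

# Assumed line format, modelled on the example above:
#   ceph version <version> (<hex sha1>)
VERSION_RE = re.compile(r"ceph version (\S+) \(([0-9a-f]+)\)")


def find_ceph_versions(lines):
    """Return distinct (version, sha1) pairs, in order of first appearance."""
    seen = []
    for line in lines:
        match = VERSION_RE.search(line)
        if match and match.groups() not in seen:
            seen.append(match.groups())
    return seen


if __name__ == "__main__":
    # Scan every osd log under the default log directory.
    for path in sorted(glob.glob("/var/log/ceph/osd.*.log")):
        with open(path, errors="replace") as log:
            for version, sha1 in find_ceph_versions(log):
                print(f"{path}: {version} ({sha1})")
```

If a log shows more than one pair, the daemon was restarted on a different
build at some point, which would be worth mentioning too.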

Thanks!
sage


On Mon, 6 Sep 2010, Leander Yu wrote:

> Hi all,
> I have set up a 10 osd + 2 mds + 3 mon ceph cluster. It ran OK at
> the beginning. However, after one day, some of the osds crashed with
> the following assert failures.
> I am using the unstable trunk. ceph.conf is attached.
> 
> -------------- osd 3 -----------------
> osd/PG.h: In function 'void PG::IndexedLog::index(PG::Log::Entry&)':
> osd/PG.h:429: FAILED assert(caller_ops.count(e.reqid) == 0)
>  1: (OSD::_process_pg_info(unsigned int, int, PG::Info&, PG::Log&,
> PG::Missing&, std::map<int, MOSDPGInfo*, std::less<int>,
> std::allocator<std::pair<int const, MOSDPGInfo*> > >*, int&)+0xb06)
> [0x4cf426]
>  2: (OSD::handle_pg_log(MOSDPGLog*)+0xa9) [0x4cf999]
>  3: (OSD::_dispatch(Message*)+0x3ed) [0x4e7dfd]
>  4: (OSD::ms_dispatch(Message*)+0x39) [0x4e86c9]
>  5: (SimpleMessenger::dispatch_entry()+0x789) [0x46b5f9]
>  6: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45849c]
>  7: (Thread::_entry_func(void*)+0xa) [0x46c0ca]
>  8: (()+0x6a3a) [0x7f69fd39ea3a]
>  9: (clone()+0x6d) [0x7f69fc5bc77d]
> 
> -------------- osd 7 --------------------
> osd/ReplicatedPG.cc: In function 'void ReplicatedPG::sub_op_pull(MOSDSubOp*)':
> osd/ReplicatedPG.cc:3021: FAILED assert(r == 0)
>  1: (OSD::dequeue_op(PG*)+0x344) [0x4e6fd4]
>  2: (ThreadPool::worker()+0x28f) [0x5b5a9f]
>  3: (ThreadPool::WorkThread::entry()+0xd) [0x4f0acd]
>  4: (Thread::_entry_func(void*)+0xa) [0x46c0ca]
>  5: (()+0x6a3a) [0x7efff4f12a3a]
>  6: (clone()+0x6d) [0x7efff413077d]
> 
> Please let me know if you need more information. I am keeping the
> environment as it is so I can collect more data for debugging.
> 
> Thanks.
> 
