Re: [ceph-users] BlueStore.cc: 11208: ceph_abort_msg("unexpected error")
https://tracker.ceph.com/issues/38724

On Fri, Aug 23, 2019 at 10:18 PM Paul Emmerich wrote:
>
> I've seen that before (but never on Nautilus), there's already an
> issue at tracker.ceph.com but I don't recall the id or title.
>
> Paul

--
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] BlueStore.cc: 11208: ceph_abort_msg("unexpected error")
I've seen that before (but never on Nautilus), there's already an issue at tracker.ceph.com but I don't recall the id or title.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Fri, Aug 23, 2019 at 1:47 PM Lars Täuber wrote:
>
> Hi Paul,
>
> a result of fgrep is attached.
> Can you do something with it?
>
> I can't read it. Maybe this is the relevant part:
> " bluestore(/var/lib/ceph/osd/first-16) _txc_add_transaction error (39)
> Directory not empty not handled on operation 21 (op 1, counting from 0)"
>
> Later I tried it again and the osd is working again.
>
> It feels like I hit a bug!?
>
> Huge thanks for your help.
>
> Cheers,
> Lars
Re: [ceph-users] BlueStore.cc: 11208: ceph_abort_msg("unexpected error")
Hi Paul,

a result of fgrep is attached.
Can you do something with it?

I can't read it. Maybe this is the relevant part:
" bluestore(/var/lib/ceph/osd/first-16) _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1, counting from 0)"

Later I tried it again and the osd is working again.

It feels like I hit a bug!?

Huge thanks for your help.

Cheers,
Lars

Fri, 23 Aug 2019 13:36:00 +0200
Paul Emmerich ==> Lars Täuber :
> Filter the log for "7f266bdc9700" which is the id of the crashed
> thread, it should contain more information on the transaction that
> caused the crash.
>
> Paul

7f266bdc9700.log.gz
Description: application/gzip
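For readers who want to pull the fields out of a line like the one quoted above, here is a minimal sketch in Python. It only splits the message into its parts; it does not interpret them. The names TxcError and parse_txc_error are hypothetical helpers made up for this illustration, not part of Ceph. On Linux, errno 39 is ENOTEMPTY, which matches the "Directory not empty" text. Operation 21 appears to correspond to the remove-collection op (OP_RMCOLL) in Ceph's transaction op enum, which would fit the PG deletion path in the backtrace, but treat that mapping as an assumption and check it against the source for your version.

```python
import re
from typing import NamedTuple, Optional


class TxcError(NamedTuple):
    """Fields of a '_txc_add_transaction error' log message (hypothetical helper)."""
    osd_path: str   # data directory of the OSD that logged the error
    errno: int      # numeric error code, e.g. 39
    message: str    # error text, e.g. "Directory not empty"
    op_code: int    # transaction op type that failed, e.g. 21
    op_index: int   # zero-based position of that op within the transaction


# Pattern written against the single message quoted in this thread;
# other Ceph versions may word the message differently.
_PATTERN = re.compile(
    r"bluestore\((?P<path>[^)]+)\) _txc_add_transaction error "
    r"\((?P<errno>\d+)\) (?P<msg>.+?) not handled on operation "
    r"(?P<op>\d+) \(op (?P<idx>\d+), counting from 0\)"
)


def parse_txc_error(line: str) -> Optional[TxcError]:
    """Return the parsed fields, or None if the line does not match."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    return TxcError(
        osd_path=m["path"],
        errno=int(m["errno"]),
        message=m["msg"],
        op_code=int(m["op"]),
        op_index=int(m["idx"]),
    )
```

Fed the line from this thread, the sketch should report errno 39 and op code 21 at index 1, i.e. the second operation in the transaction.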
Re: [ceph-users] BlueStore.cc: 11208: ceph_abort_msg("unexpected error")
Filter the log for "7f266bdc9700" which is the id of the crashed thread, it should contain more information on the transaction that caused the crash.

Paul

--
Paul Emmerich

On Fri, Aug 23, 2019 at 9:29 AM Lars Täuber wrote:
>
> Hi there!
>
> In our test cluster there is an OSD that won't start anymore.
>
> Here is a short part of the log:
>
> -1> 2019-08-23 08:56:13.316 7f266bdc9700 -1 /tmp/release/Debian/WORKDIR/ceph-14.2.2/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)' thread 7f266bdc9700 time 2019-08-23 08:56:13.318938
> /tmp/release/Debian/WORKDIR/ceph-14.2.2/src/os/bluestore/BlueStore.cc: 11208: ceph_abort_msg("unexpected error")
>
> ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
>
> The log is so huge that I don't know which part may be of interest. The excerpt is the part I think is most useful.
> Is there anybody able to read and explain this?
>
> Thanks in advance,
> Lars
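The filtering step suggested in this message can be sketched in a few lines of Python, roughly equivalent to running fgrep with the thread id against the OSD log. The function name lines_for_thread is a hypothetical helper made up for this illustration, and the sketch assumes the thread id appears literally in every relevant log line, as it does in the excerpt quoted above. It also handles gzip-compressed logs, since rotated logs are often compressed.

```python
import gzip
from pathlib import Path
from typing import Iterator


def lines_for_thread(log_path: str, thread_id: str) -> Iterator[str]:
    """Yield every log line that mentions thread_id (hypothetical helper).

    Works on plain-text logs and on .gz files; undecodable bytes are
    replaced rather than raising, since logs can contain binary debris.
    """
    path = Path(log_path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if thread_id in line:
                yield line.rstrip("\n")
```

Calling it with the path of the OSD log and "7f266bdc9700" yields only the lines written by the crashed thread, which is a much smaller set to read through than the full log.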
[ceph-users] BlueStore.cc: 11208: ceph_abort_msg("unexpected error")
Hi there!

In our test cluster there is an OSD that won't start anymore.

Here is a short part of the log:

    -1> 2019-08-23 08:56:13.316 7f266bdc9700 -1 /tmp/release/Debian/WORKDIR/ceph-14.2.2/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)' thread 7f266bdc9700 time 2019-08-23 08:56:13.318938
/tmp/release/Debian/WORKDIR/ceph-14.2.2/src/os/bluestore/BlueStore.cc: 11208: ceph_abort_msg("unexpected error")

 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string, std::allocator > const&)+0xdf) [0x564406ac153a]
 2: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x2830) [0x5644070e48d0]
 3: (BlueStore::queue_transactions(boost::intrusive_ptr&, std::vector >&, boost::intrusive_ptr, ThreadPool::TPHandle*)+0x42a) [0x5644070ec33a]
 4: (ObjectStore::queue_transaction(boost::intrusive_ptr&, ObjectStore::Transaction&&, boost::intrusive_ptr, ThreadPool::TPHandle*)+0x7f) [0x564406cd620f]
 5: (PG::_delete_some(ObjectStore::Transaction*)+0x945) [0x564406d32d85]
 6: (PG::RecoveryState::Deleting::react(PG::DeleteSome const&)+0x71) [0x564406d337d1]
 7: (boost::statechart::simple_state, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x109) [0x564406d81ec9]
 8: (boost::statechart::state_machine, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x564406d4e7cb]
 9: (PG::do_peering_event(std::shared_ptr, PG::RecoveryCtx*)+0x2af) [0x564406d3f39f]
 10: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr, ThreadPool::TPHandle&)+0x1b4) [0x564406c7e644]
 11: (OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&)+0xc4) [0x564406c7e8c4]
 12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x7d7) [0x564406c72667]
 13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b4) [0x56440724f7d4]
 14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5644072521d0]
 15: (()+0x7fa3) [0x7f26862f6fa3]
 16: (clone()+0x3f) [0x7f2685ea64cf]

The log is so huge that I don't know which part may be of interest. The excerpt above is the part I think is most useful.
Is there anybody able to read and explain this?

Thanks in advance,
Lars