[ceph-users] Bluestore OSD died - error (39) Directory not empty not handled on operation 21

Stephen Lord Tue, 05 Apr 2016 10:28:41 -0700

I was experimenting with BlueStore OSDs and appear to have found a fairly 
consistent way to crash them.


Reducing the number of copies in a pool from 3 to 1 has now twice caused a 
mass crash of the OSDs backing that pool. In one case it was a cache tier; in 
the other it was just a pool hosting rbd images.

From the log file of one of the OSDs:

2016-04-05 12:09:54.272475 7f5a58027700  0 bluestore(/var/lib/ceph/osd/ceph-43) 
 error (39) Directory not empty not handled on operation 21 (op 1, counting 
from 0)
2016-04-05 12:09:54.272489 7f5a58027700  0 bluestore(/var/lib/ceph/osd/ceph-43) 
 transaction dump:
{
    "ops": [
        {
            "op_num": 0,
            "op_name": "remove",
            "collection": "2.354_head",
            "oid": "#2:2ac00000::::head#"
        },
        {
            "op_num": 1,
            "op_name": "rmcoll",
            "collection": "2.354_head"
        }
    ]
}


2016-04-05 12:09:54.275114 7f5a58027700 -1 os/bluestore/BlueStore.cc: In 
function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*, 
ObjectStore::Transaction*)' thread 7f5a58027700 time 2016-04-05 12:09:54.272532
os/bluestore/BlueStore.cc: 4357: FAILED assert(0 == "unexpected error")

 ceph version 10.1.0 (96ae8bd25f31862dbd5302f304ebf8bf1166aba6)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) 
[0x7f5a82e74a55]
 2: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, 
ObjectStore::Transaction*)+0x77a) [0x7f5a82b02eba]
 3: (BlueStore::queue_transactions(ObjectStore::Sequencer*, 
std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> 
>&, std::shared_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a5) [0x7f5a82b056e5]
 4: (ObjectStore::queue_transactions(ObjectStore::Sequencer*, 
std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> 
>&, Context*, Context*, Context*, Context*, std::shared_ptr<TrackedOp>)+0x2a6) 
[0x7f5a82aad0b6]
 5: (OSD::RemoveWQ::_process(std::pair<boost::intrusive_ptr<PG>, 
std::shared_ptr<DeletingState> >, ThreadPool::TPHandle&)+0x6e4) [0x7f5a827debb4]
 6: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, 
std::shared_ptr<DeletingState> >, std::pair<boost::intrusive_ptr<PG>, 
std::shared_ptr<DeletingState> > >::_void_process(void*, 
ThreadPool::TPHandle&)+0x11a) [0x7f5a8283a15a]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa7e) [0x7f5a82e65a9e]
 8: (ThreadPool::WorkThread::entry()+0x10) [0x7f5a82e66980]
 9: (()+0x7dc5) [0x7f5a80dbedc5]
 10: (clone()+0x6d) [0x7f5a7f44a28d]
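
For anyone reading the dump, here is a minimal Python sketch that picks out the 
op the log complains about. It only parses the JSON shown above; the helper 
name failed_op is my own and not part of Ceph. The log says "op 1, counting 
from 0", which is the rmcoll on 2.354_head, issued right after a remove in the 
same collection.

```python
import json

# Transaction dump copied from the OSD log above.
TRANSACTION_DUMP = """
{
    "ops": [
        {
            "op_num": 0,
            "op_name": "remove",
            "collection": "2.354_head",
            "oid": "#2:2ac00000::::head#"
        },
        {
            "op_num": 1,
            "op_name": "rmcoll",
            "collection": "2.354_head"
        }
    ]
}
"""


def failed_op(dump_text, failed_index):
    """Return the op whose op_num matches the index reported in the log.

    Hypothetical helper for illustration; returns None if no op matches.
    """
    for op in json.loads(dump_text)["ops"]:
        if op["op_num"] == failed_index:
            return op
    return None


# The log reports "op 1, counting from 0".
op = failed_op(TRANSACTION_DUMP, 1)
print(op["op_name"], op["collection"])  # prints: rmcoll 2.354_head
```

So the transaction removes one object and then tries to remove the collection, 
and the collection removal is the step that returns "Directory not empty".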

In both cases, a replicated pool with 3 copies was created, some content was 
added, and then the number of copies was set down to 1. Not a common thing to 
do, I know, but this works on FileStore OSDs.
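
The sequence can be sketched roughly as below. This is an approximation of the 
steps, not the exact commands that were run: the pool name, PG count, object 
name and source file are all placeholders I made up. The script only prints 
the ceph and rados commands rather than executing them, since running them 
against BlueStore OSDs on 10.1.0 may take the OSDs down. Try it only on a 
throwaway test cluster.

```python
import shlex


def repro_commands(pool="bluestore-test", pg_num=64,
                   obj="obj1", src="/etc/hosts"):
    """Build the command sequence for the steps described above.

    All argument defaults are hypothetical placeholders.
    """
    return [
        # Create a replicated pool and make sure it keeps 3 copies.
        ["ceph", "osd", "pool", "create", pool, str(pg_num)],
        ["ceph", "osd", "pool", "set", pool, "size", "3"],
        # Add some content to the pool.
        ["rados", "-p", pool, "put", obj, src],
        # Drop the number of copies to 1; this is the step that
        # preceded the OSD crashes.
        ["ceph", "osd", "pool", "set", pool, "size", "1"],
    ]


cmds = repro_commands()
for cmd in cmds:
    # Print only; nothing is executed here.
    print(" ".join(shlex.quote(part) for part in cmd))
```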

This cluster was deployed using the Red Hat 7 Jewel (10.1) RPMs from 
download.ceph.com.

Steve



