Hi,
While trying to bring an OSD back into the test cluster, after it had
dropped out for an unknown reason, we see a RocksDB segmentation fault
during "compaction". I increased debugging to 20/20 for the OSD and RocksDB
subsystems; part of the log file is below:
... 49477, 49476, 49475, 49474, 49473, 49472, 49471, 49470, 49469, 49468,
49467], "files_L1": [49465], "score": 1138.25, "input_data_size": 82872298}
-1> 2018-01-12 08:48:23.915753 7f91eaf89e40 1 freelist init
0> 2018-01-12 08:48:45.630418 7f91eaf89e40 -1 *** Caught signal
(Segmentation fault) **
in thread 7f91eaf89e40 thread_name:ceph-osd
ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
1: (()+0xa65824) [0x55a124693824]
2: (()+0x11390) [0x7f91e9238390]
3: (()+0x1f8af) [0x7f91eab658af]
4: (rocksdb::BlockBasedTable::PutDataBlockToCache(rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::ImmutableCFOptions const&, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, rocksdb::Block*, unsigned int, rocksdb::Slice const&, unsigned long, bool, rocksdb::Cache::Priority)+0x1d9) [0x55a124a64e49]
5: (rocksdb::BlockBasedTable::MaybeLoadDataBlockToCache(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::Slice, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool)+0x3b7) [0x55a124a66827]
6: (rocksdb::BlockBasedTable::NewDataBlockIterator(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::BlockIter*, bool, rocksdb::Status)+0x2ac) [0x55a124a66b6c]
7: (rocksdb::BlockBasedTable::BlockEntryIteratorState::NewSecondaryIterator(rocksdb::Slice const&)+0x97) [0x55a124a6f2e7]
8: (()+0xe6c48e) [0x55a124a9a48e]
9: (()+0xe6ca06) [0x55a124a9aa06]
10: (rocksdb::MergingIterator::Seek(rocksdb::Slice const&)+0x126) [0x55a124a7bc86]
11: (rocksdb::DBIter::Seek(rocksdb::Slice const&)+0x20a) [0x55a124b1bdaa]
12: (RocksDBStore::RocksDBWholeSpaceIteratorImpl::lower_bound(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x46) [0x55a1245d4676]
13: (BitmapFreelistManager::init(unsigned long)+0x2dc) [0x55a12463976c]
14: (BlueStore::_open_fm(bool)+0xc00) [0x55a124526c50]
15: (BlueStore::_mount(bool)+0x3dc) [0x55a12459aa1c]
16: (OSD::init()+0x3e2) [0x55a1241064e2]
17: (main()+0x2f07) [0x55a1240181d7]
18: (__libc_start_main()+0xf0) [0x7f91e81be830]
19: (_start()+0x29) [0x55a1240a37f9]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
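For anyone reproducing: the debug levels were raised in ceph.conf along these lines (a sketch using the standard option syntax; the subsystems and 20/20 values are as mentioned above):

```ini
[osd]
debug osd = 20/20
debug rocksdb = 20/20
```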
The disk in question is very old (powered on for ~8 years), so it might be
that part of the data is corrupt. Would RocksDB fail with an error like this
in that case?
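My (possibly wrong) understanding is that RocksDB stores a checksum with every SST block and verifies it on read, so plain bit-rot would normally surface as a "block checksum mismatch" Corruption error rather than a segfault. A toy sketch of that idea, assuming nothing about Ceph's or RocksDB's actual code (real RocksDB defaults to CRC32C; plain CRC32 here, and `verify_block` is a made-up name):

```python
import zlib

def verify_block(data: bytes, stored_crc: int) -> bool:
    # On read, compare the block's checksum against the stored one;
    # a mismatch means the block is corrupt and should be rejected
    # with an error, not dereferenced.
    return (zlib.crc32(data) & 0xFFFFFFFF) == stored_crc

block = b"some sst block payload"
crc = zlib.crc32(block) & 0xFFFFFFFF

assert verify_block(block, crc)            # intact block passes
assert not verify_block(block + b"!", crc) # flipped/extra bytes are caught
```

So if it were just a rotten sector I would have expected a corruption error in the log instead of a SIGSEGV, which is part of why this crash puzzles us.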
Gr. Stefan
P.S. We're trying to learn as much as possible when things do not go according
to plan. There is much more debug info available in case anyone is interested.
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / [email protected]
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com