> On 11.10.2019 at 14:07, Igor Fedotov <ifedo...@suse.de> wrote:
>
> Hi!
>
> Originally your issue looked like the ones from
> https://tracker.ceph.com/issues/42223
>
> It looks like some key information for the FreeListManager is missing
> from RocksDB. Once you have an affected OSD present, we can check the
> content of RocksDB to verify this hypothesis; please let me know if you
> want the guideline for that.
>
> The last log is different; the key record is probably:
>
> -2> 2019-10-09 23:03:47.011 7fb4295a7700 -1 rocksdb: submit_common error:
> Corruption: block checksum mismatch: expected 2181709173, got 2130853119
> in db/204514.sst offset 0 size 61648 code = 2 Rocksdb transaction:
>
> which most probably denotes data corruption in the DB. Unfortunately,
> for now I can't say whether this is related to the original issue or not.
>
> This time it resembles the issue shared on this mailing list a while ago
> by Stefan Priebe. The post caption is "Bluestore OSDs keep crashing in
> BlueStore.cc: 8808: FAILED assert(r == 0)".
>
> So first of all I'd suggest distinguishing these issues for now and
> trying to troubleshoot them separately.
>
> As for the first case, I'm wondering if you have any OSDs still failing
> this way, i.e. asserting in the allocator and showing 0 extents loaded:
> "_open_alloc loaded 0 B in 0 extents". If so, let's check the DB content
> first.
>
> For the second case, what I'm wondering most is whether the issue is
> permanent for a specific OSD, or whether it disappears after an OSD/node
> restart as it did in Stefan's case?
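(For reference: a minimal sketch of the DB check Igor describes, assuming
the affected OSD is osd.71 as elsewhere in this thread and that the OSD is
stopped first, since ceph-kvstore-tool needs exclusive access to the
store; the systemd unit name is an assumption about the deployment:

    # stop the OSD before touching its RocksDB
    systemctl stop ceph-osd@71

    # list the freelist keys (prefix "b"); an empty listing would match
    # the "_open_alloc loaded 0 B in 0 extents" symptom
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-71 list b

The bluestore-kv backend and store path follow the commands quoted later
in this thread.)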
Just a note: it came back after some days. I'm still waiting for a Ceph
release which fixes the issue, v12.2.13...

Stefan

> Thanks,
> Igor
>
> On 10/10/2019 1:59 PM, cephuser2345 user wrote:
>> Hi Igor,
>> since the last OSD crash we have had some 4 more. We tried to check
>> RocksDB with ceph-kvstore-tool:
>>
>> ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-71 compact
>> ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-71 repair
>> ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-71 destructive-repair
>>
>> Nothing helped; we had to redeploy the OSD by removing it from the
>> cluster and reinstalling.
>>
>> We updated to Ceph 14.2.4 two weeks or more ago, and OSDs are still
>> failing in the same way. I managed to capture the first fault by using
>> "ceph crash ls" and have added the log+meta to this email.
>> Can these logs shed some light?
>>
>>> On Thu, Sep 12, 2019 at 7:20 PM Igor Fedotov <ifedo...@suse.de> wrote:
>>>> Hi,
>>>>
>>>> this line:
>>>>
>>>> -2> 2019-09-12 16:38:15.101 7fcd02fd1f80  1
>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc loaded 0 B in 0 extents
>>>>
>>>> tells me that the OSD is unable to load the free list manager
>>>> properly, i.e. the list of free/allocated blocks is unavailable.
>>>>
>>>> You might want to set "debug bluestore = 10" and check the additional
>>>> log output between these two lines:
>>>>
>>>> -3> 2019-09-12 16:38:15.093 7fcd02fd1f80  1
>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc opening allocation
>>>> metadata
>>>> -2> 2019-09-12 16:38:15.101 7fcd02fd1f80  1
>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc loaded 0 B in 0 extents
>>>>
>>>> And/or check the RocksDB records with the "b" prefix using
>>>> ceph-kvstore-tool.
>>>>
>>>> Igor
>>>>
>>>> P.S.
>>>> Sorry, I might be unresponsive for the next two weeks as I'm going on
>>>> vacation.
>>>>
>>>> On 9/12/2019 7:04 PM, cephuser2345 user wrote:
>>>>> Hi,
>>>>> we have updated the Ceph version from 14.2.2 to version 14.2.3.
>>>>> The OSD tree shows:
>>>>>
>>>>> -21       76.68713     host osd048
>>>>>  66   hdd 12.78119         osd.66    up 1.00000 1.00000
>>>>>  67   hdd 12.78119         osd.67    up 1.00000 1.00000
>>>>>  68   hdd 12.78119         osd.68    up 1.00000 1.00000
>>>>>  69   hdd 12.78119         osd.69    up 1.00000 1.00000
>>>>>  70   hdd 12.78119         osd.70    up 1.00000 1.00000
>>>>>  71   hdd 12.78119         osd.71  down       0 1.00000
>>>>>
>>>>> We cannot get the OSD up, and the error is happening on a lot of
>>>>> OSDs. Can you please assist? :) I've added a txt log.
>>>>>
>>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc opening allocation
>>>>> metadata
>>>>> -2> 2019-09-12 16:38:15.101 7fcd02fd1f80  1
>>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc loaded 0 B in 0 extents
>>>>> -1> 2019-09-12 16:38:15.101 7fcd02fd1f80 -1
>>>>> /build/ceph-14.2.3/src/os/bluestore/fastbmap_allocator_impl.h: In
>>>>> function 'void AllocatorLevel02<T>::_mark_allocated(uint64_t,
>>>>> uint64_t) [with L1 = AllocatorLevel01Loose; uint64_t = long unsigned
>>>>> int]' thread 7fcd02fd1f80 time 2019-09-12 16:38:15.102539
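(For reference: a minimal sketch of the "debug bluestore = 10" capture
Igor suggests above. The foreground invocation and log path are
assumptions about the environment, not part of the original advice:

    # either set it persistently in ceph.conf for the failing OSD:
    #   [osd.71]
    #   debug bluestore = 10

    # or pass the override when starting the daemon in the foreground,
    # keeping a copy of the output for later inspection
    ceph-osd -f -i 71 --debug-bluestore 10 2>&1 | tee /tmp/osd.71.debug.log

The lines of interest are everything between "_open_alloc opening
allocation metadata" and "_open_alloc loaded ...".)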
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com