> On 11.10.2019, at 14:07, Igor Fedotov <ifedo...@suse.de> wrote:
> 
> 
> Hi!
> 
> Originally your issue looked like the ones from 
> https://tracker.ceph.com/issues/42223
> 
> It looks as if some key information for the FreeListManager is missing from RocksDB.
> 
> Once you have it at hand we can check the contents of RocksDB to confirm 
> this hypothesis; please let me know if you want the guideline for that.
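> 
> The guideline, in short (the OSD path and id are only examples, and the OSD must 
> be stopped first): list the freelist records, which live under the "b" prefix, with
> 
> ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-71 list b
> 
> If that prints nothing, it supports the hypothesis that the FreeListManager 
> information is missing.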
> 
> 
> 
> The last log is different; the key record is probably:
> 
> -2> 2019-10-09 23:03:47.011 7fb4295a7700 -1 rocksdb: submit_common error: 
> Corruption: block checksum mismatch: expected 2181709173, got 2130853119  in 
> db/204514.sst offset 0 size 61648 code = 2 Rocksdb transaction: 
> 
> which most probably denotes data corruption in the DB. Unfortunately, for now I 
> can't say whether this is related to the original issue or not.
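> 
> If you want to sanity-check the store as a whole, a read-only consistency check 
> can be run with ceph-bluestore-tool (a sketch only; adjust the path to the 
> failing OSD and stop the OSD beforehand):
> 
> ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-71
> 
> Adding --deep makes the check more thorough but considerably slower.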
> 
> This time it resembles the issue shared on this mailing list a while ago by 
> Stefan Priebe. The post is titled "Bluestore OSDs keep crashing in 
> BlueStore.cc: 8808: FAILED assert(r == 0)".
> 
> So first of all I'd suggest treating these as two separate issues for now and 
> troubleshooting them separately.
> 
> 
> 
> As for the first case, I'm wondering whether you have any OSDs still failing this 
> way, i.e. asserting in the allocator and showing 0 extents loaded: "_open_alloc 
> loaded 0 B in 0 extents".
> 
> If so, let's check the DB content first.
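> 
> A quick way to spot such OSDs (assuming logs are in the default location) is to 
> grep the OSD logs for that message:
> 
> grep "_open_alloc loaded 0 B in 0 extents" /var/log/ceph/ceph-osd.*.log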
> 
> 
> 
> For the second case, what I'm wondering about most is whether the issue is 
> permanent for a specific OSD or whether it disappears after an OSD/node restart, 
> as it did in Stefan's case.
> 

Just a note: it came back after a few days. I'm still waiting for a Ceph 
release that fixes the issue (v12.2.13)...


Stefan
> 
> Thanks,
> 
> Igor
> 
> 
> 
> On 10/10/2019 1:59 PM, cephuser2345 user wrote:
>> Hi Igor,
>> since the last OSD crash we have had about 4 more. We tried to check RocksDB with 
>> ceph-kvstore-tool:
>> ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-71 compact
>> ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-71 repair
>> ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-71 destructive-repair
>> 
>> Nothing helped; we had to redeploy the OSD by removing it from the cluster 
>> and reinstalling it.
>> 
>> We updated to Ceph 14.2.4 two weeks or more ago, and OSDs are still failing 
>> in the same way.
>> I managed to capture the first fault by using "ceph crash ls" and added 
>> the log+meta to this email.
>> Can these logs shed some light?
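>> 
>> For reference, the report can be pulled up again via the crash module (the 
>> crash ID below is a placeholder):
>> 
>> ceph crash ls
>> ceph crash info <crash-id>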
>> 
>>>
>>> On Thu, Sep 12, 2019 at 7:20 PM Igor Fedotov <ifedo...@suse.de> wrote:
>>>> Hi,
>>>> 
>>>> this line:
>>>> 
>>>>     -2> 2019-09-12 16:38:15.101 7fcd02fd1f80  1 
>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc loaded 0 B in 0 extents
>>>> 
>>>> tells me that the OSD is unable to load the free list manager properly, i.e. the 
>>>> list of free/allocated blocks is unavailable.
>>>> 
>>>> You might want to set "debug bluestore = 10" and check the additional log 
>>>> output between these two lines:
>>>> 
>>>>     -3> 2019-09-12 16:38:15.093 7fcd02fd1f80  1 
>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc opening allocation 
>>>> metadata
>>>>     -2> 2019-09-12 16:38:15.101 7fcd02fd1f80  1 
>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc loaded 0 B in 0 extents
>>>> 
>>>> And/or check the RocksDB records with the "b" prefix using 
>>>> ceph-kvstore-tool.
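>>>> 
>>>> For the debug setting, a minimal sketch (OSD id taken from your log) would be 
>>>> to add this to ceph.conf on the OSD node and restart the OSD:
>>>> 
>>>> [osd.71]
>>>>     debug bluestore = 10
>>>> 
>>>> or, roughly, for a one-off foreground run:
>>>> 
>>>> ceph-osd -f -i 71 --debug-bluestore=10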
>>>> 
>>>> 
>>>> 
>>>> Igor
>>>> 
>>>> 
>>>> 
>>>> P.S.
>>>> 
>>>> Sorry, I might be unresponsive for the next two weeks as I'm going on 
>>>> vacation.
>>>> 
>>>> 
>>>> 
>>>> On 9/12/2019 7:04 PM, cephuser2345 user wrote:
>>>>> Hi,
>>>>> we have updated Ceph from version 14.2.2 to version 14.2.3.
>>>>> The relevant part of "ceph osd tree" now shows:
>>>>> 
>>>>>   -21        76.68713     host osd048                         
>>>>>  66   hdd  12.78119         osd.66      up  1.00000 1.00000 
>>>>>  67   hdd  12.78119         osd.67      up  1.00000 1.00000 
>>>>>  68   hdd  12.78119         osd.68      up  1.00000 1.00000 
>>>>>  69   hdd  12.78119         osd.69      up  1.00000 1.00000 
>>>>>  70   hdd  12.78119         osd.70      up  1.00000 1.00000 
>>>>>  71   hdd  12.78119         osd.71    down        0 1.00000 
>>>>> 
>>>>> We cannot get the OSD up; it keeps hitting this error, and it is happening on 
>>>>> a lot of OSDs.
>>>>> Can you please assist? :) I have attached the txt log; the relevant lines are:
>>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc opening allocation 
>>>>> metadata
>>>>>     -2> 2019-09-12 16:38:15.101 7fcd02fd1f80  1 
>>>>> bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc loaded 0 B in 0 extents
>>>>>     -1> 2019-09-12 16:38:15.101 7fcd02fd1f80 -1 
>>>>> /build/ceph-14.2.3/src/os/bluestore/fastbmap_allocator_impl.h: In 
>>>>> function 'void AllocatorLevel02<T>::_mark_allocated(uint64_t, uint64_t) 
>>>>> [with L1 = AllocatorLevel01Loose; uint64_t = long unsigned int]' thread 
>>>>> 7fcd02fd1f80 time 2019-09-12 16:38:15.102539
>>>>> 
>>>>> 
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
