On 2/9/19 5:40 PM, Brad Hubbard wrote:
> On Sun, Feb 10, 2019 at 1:56 AM Ruben Rodriguez <ru...@fsf.org> wrote:
>>
>> Hi there,
>>
>> Running 12.2.11-1xenial on a machine with 6 SSD OSD with bluestore.
>>
>> Today we had two disks fail out of the controller, and after a reboot
>> they both seemed to come back fine but ceph-osd was only able to start
>> in one of them. The other one gets this:
>>
>> 2019-02-08 18:53:00.703376 7f64f948ce00 -1
>> bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
>> checksum at blob offset 0x0, got 0x95104dfc, expected 0xb9e3e26d, device
>> location [0x4000~1000], logical extent 0x0~1000, object
>> #-1:7b3f43c4:::osd_superblock:0#
>> 2019-02-08 18:53:00.703406 7f64f948ce00 -1 osd.3 0 OSD::init() : unable
>> to read osd superblock
>>
>> Note that there are no actual IO errors being shown by the controller in
>> dmesg, and that the disk is readable. The metadata FS is mounted and
>> looks normal.
>>
>> I tried running "ceph-bluestore-tool repair --path
>> /var/lib/ceph/osd/ceph-3 --deep 1" and that gets many instances of:
> 
> Running this with debug_bluestore=30 might give more information on
> the nature of the IO error.

I had already collected the logs with debug info, and nothing
significant was listed there. I applied this patch
https://github.com/ceph/ceph/pull/26247 and it allowed me to move
forward. There was an osd map corruption issue that I had to handle by
hand, but after that the osd started fine. Once it had started and
backfills had finished, the bluestore_ignore_data_csum flag was no longer
needed, so I reverted to standard packages.
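
For anyone reading the "_verify_csum bad crc32c/0x1000" line quoted
above: as I read it, the data is checksummed with CRC-32C in
0x1000-byte chunks, and a chunk whose computed value differs from the
stored one is reported with "got" and "expected". Below is a minimal
sketch of that kind of per-chunk check. It is not Ceph code: it uses
the standard CRC-32C parameters, while Ceph's seed and finalization
may differ, so it will not reproduce the values in the log.
find_bad_blocks and its arguments are names made up for illustration.

```python
# Sketch only: standard CRC-32C (Castagnoli), not Ceph's own code path.
CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial


def _make_table():
    """Build the 256-entry lookup table for byte-at-a-time CRC-32C."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Standard CRC-32C: initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def find_bad_blocks(data: bytes, expected, block_size=0x1000):
    """Hypothetical helper: compare each block's CRC to its stored value.

    Returns a list of (offset, got, expected) tuples for mismatches.
    """
    bad = []
    for i, want in enumerate(expected):
        offset = i * block_size
        got = crc32c(data[offset:offset + block_size])
        if got != want:
            bad.append((offset, got, want))
    return bad
```

Given the bytes and the stored per-chunk checksums, find_bad_blocks
returns an empty list when everything matches, and otherwise one
(offset, got, expected) tuple per mismatching chunk.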

-- 
Ruben Rodriguez | Chief Technology Officer, Free Software Foundation
GPG Key: 05EF 1D2F FE61 747D 1FC8  27C3 7FAC 7D26 472F 4409
https://fsf.org | https://gnu.org

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
