Hi,
This:
> fsck failed: (5) Input/output error

Sounds like a hardware issue.
Did you have a look at "dmesg"?
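
For example, something like this might show whether the kernel logged I/O errors for the disk behind osd.7 (a rough sketch; /dev/sdX is only a placeholder, find the real device with `ceph-volume lvm list` or `lsblk`):

	# kernel messages hinting at failing hardware
	dmesg -T | egrep -i 'i/o error|medium error|blk_update_request'

	# optionally, SMART health of the suspected disk (needs smartmontools)
	smartctl -a /dev/sdX | egrep -i 'health|reallocated|pending|uncorrect'

If those show errors for that disk, the "Bad table magic number" in db/002182.sst is most likely just a symptom of the failing drive.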

HTH,
Mehmet

On 21 December 2021 17:47:35 CET, Sebastian Mazza <sebast...@macforce.at> wrote:
>Hi all,
>
>after a reboot of a cluster, 3 OSDs cannot be started. The OSDs exit with the
>following error message:
>       2021-12-21T01:01:02.209+0100 7fd368cebf00  4 rocksdb: [db_impl/db_impl.cc:396] Shutdown: canceling all background work
>       2021-12-21T01:01:02.209+0100 7fd368cebf00  4 rocksdb: [db_impl/db_impl.cc:573] Shutdown complete
>       2021-12-21T01:01:02.209+0100 7fd368cebf00 -1 rocksdb: Corruption: Bad table magic number: expected 9863518390377041911, found 0 in db/002182.sst
>       2021-12-21T01:01:02.213+0100 7fd368cebf00 -1 bluestore(/var/lib/ceph/osd/ceph-7) _open_db erroring opening db: 
>       2021-12-21T01:01:02.213+0100 7fd368cebf00  1 bluefs umount
>       2021-12-21T01:01:02.213+0100 7fd368cebf00  1 bdev(0x559bbe0ea800 /var/lib/ceph/osd/ceph-7/block) close
>       2021-12-21T01:01:02.293+0100 7fd368cebf00  1 bdev(0x559bbe0ea400 /var/lib/ceph/osd/ceph-7/block) close
>       2021-12-21T01:01:02.537+0100 7fd368cebf00 -1 osd.7 0 OSD:init: unable to mount object store
>       2021-12-21T01:01:02.537+0100 7fd368cebf00 -1  ** ERROR: osd init failed: (5) Input/output error
>
>
>I found a similar problem in this Mailing list: 
>https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/MJLVS7UPJ5AZKOYN3K2VQW7WIOEQGC5V/#MABLFA4FHG6SX7YN4S6BGSCP6DOAX6UE
>
>In this thread, Francois was able to successfully repair his OSD data with
>`ceph-bluestore-tool fsck`. I tried to run:
>`ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-7 -l /var/log/ceph/bluestore-tool-fsck-osd-7.log --log-level 20 > /var/log/ceph/bluestore-tool-fsck-osd-7.out 2>&1`
>But that results in:
>       2021-12-21T16:44:18.455+0100 7fc54ef7a240 -1 rocksdb: Corruption: Bad table magic number: expected 9863518390377041911, found 0 in db/002182.sst
>       2021-12-21T16:44:18.455+0100 7fc54ef7a240 -1 bluestore(/var/lib/ceph/osd/ceph-7) _open_db erroring opening db: 
>       fsck failed: (5) Input/output error
>
>I also tried to run `ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-7 repair`.
>But that also fails with:
>       2021-12-21T17:34:06.780+0100 7f35765f7240  0 bluestore(/var/lib/ceph/osd/ceph-7) _open_db_and_around read-only:0 repair:0
>       2021-12-21T17:34:06.780+0100 7f35765f7240  1 bdev(0x55fce5a1a800 /var/lib/ceph/osd/ceph-7/block) open path /var/lib/ceph/osd/ceph-7/block
>       2021-12-21T17:34:06.780+0100 7f35765f7240  1 bdev(0x55fce5a1a800 /var/lib/ceph/osd/ceph-7/block) open size 12000134430720 (0xae9ffc00000, 11 TiB) block_size 4096 (4 KiB) rotational discard not supported
>       2021-12-21T17:34:06.780+0100 7f35765f7240  1 bluestore(/var/lib/ceph/osd/ceph-7) _set_cache_sizes cache_size 1073741824 meta 0.45 kv 0.45 data 0.06
>       2021-12-21T17:34:06.780+0100 7f35765f7240  1 bdev(0x55fce5a1ac00 /var/lib/ceph/osd/ceph-7/block) open path /var/lib/ceph/osd/ceph-7/block
>       2021-12-21T17:34:06.780+0100 7f35765f7240  1 bdev(0x55fce5a1ac00 /var/lib/ceph/osd/ceph-7/block) open size 12000134430720 (0xae9ffc00000, 11 TiB) block_size 4096 (4 KiB) rotational discard not supported
>       2021-12-21T17:34:06.780+0100 7f35765f7240  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-7/block size 11 TiB
>       2021-12-21T17:34:06.780+0100 7f35765f7240  1 bluefs mount
>       2021-12-21T17:34:06.780+0100 7f35765f7240  1 bluefs _init_alloc shared, id 1, capacity 0xae9ffc00000, block size 0x10000
>       2021-12-21T17:34:06.904+0100 7f35765f7240  1 bluefs mount shared_bdev_used = 0
>       2021-12-21T17:34:06.904+0100 7f35765f7240  1 bluestore(/var/lib/ceph/osd/ceph-7) _prepare_db_environment set db_paths to db,11400127709184 db.slow,11400127709184
>       2021-12-21T17:34:06.908+0100 7f35765f7240 -1 rocksdb: Corruption: Bad table magic number: expected 9863518390377041911, found 0 in db/002182.sst
>       2021-12-21T17:34:06.908+0100 7f35765f7240 -1 bluestore(/var/lib/ceph/osd/ceph-7) _open_db erroring opening db: 
>       2021-12-21T17:34:06.908+0100 7f35765f7240  1 bluefs umount
>       2021-12-21T17:34:06.908+0100 7f35765f7240  1 bdev(0x55fce5a1ac00 /var/lib/ceph/osd/ceph-7/block) close
>       2021-12-21T17:34:07.072+0100 7f35765f7240  1 bdev(0x55fce5a1a800 /var/lib/ceph/osd/ceph-7/block) close
>
>
>The cluster is not in production, so I can remove all corrupt pools and delete
>the OSDs. However, I would like to understand what went wrong in order to
>avoid such a situation in the future.
>
>I am providing the OSD logs from around the time of the server reboot at the
>following link: https://we.tl/t-fArHXTmSM7
>
>Ceph version: 16.2.6
>
>
>Thanks,
>Sebastian
>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
