After ~3 uneventful weeks following the upgrade from 15.2.17 to 16.2.14, I've started
seeing OSD crashes with "cur >= fnode.size" and "cur >= p.length" assertions. This
appears to be resolved in the next Pacific point release, due later this month, but
until then I'd love to keep the OSDs from flapping.

> $ for crash in $(ceph crash ls | grep osd | awk '{print $1}') ; do ceph crash 
> info $crash | egrep "(assert_condition|crash_id)" ; done
>     "assert_condition": "cur >= fnode.size",
>     "crash_id": 
> "2024-01-03T09:07:55.698213Z_348af2d3-d4a7-4c27-9f71-70e6dc7c1af7",
>     "assert_condition": "cur >= p.length",
>     "crash_id": 
> "2024-01-03T14:21:55.794692Z_4557c416-ffca-4165-aa91-d63698d41454",
>     "assert_condition": "cur >= fnode.size",
>     "crash_id": 
> "2024-01-03T22:53:43.010010Z_15dc2b2a-30fb-4355-84b9-2f9560f08ea7",
>     "assert_condition": "cur >= p.length",
>     "crash_id": 
> "2024-01-04T02:34:34.408976Z_2954a2c2-25d2-478e-92ad-d79c42d3ba43",
>     "assert_condition": "cur2 >= p.length",
>     "crash_id": 
> "2024-01-04T21:57:07.100877Z_12f89c2c-4209-4f5a-b243-f0445ba629d2",
>     "assert_condition": "cur >= p.length",
>     "crash_id": 
> "2024-01-05T00:35:08.561753Z_a189d967-ab02-4c61-bf68-1229222fd259",
>     "assert_condition": "cur >= fnode.size",
>     "crash_id": 
> "2024-01-05T04:11:48.625086Z_a598cbaf-2c4f-4824-9939-1271eeba13ea",
>     "assert_condition": "cur >= p.length",
>     "crash_id": 
> "2024-01-05T13:49:34.911210Z_953e38b9-8ae4-4cfe-8f22-d4b7cdf65cea",
>     "assert_condition": "cur >= p.length",
>     "crash_id": 
> "2024-01-05T13:54:25.732770Z_4924b1c0-309c-4471-8c5d-c3aaea49166c",
>     "assert_condition": "cur >= p.length",
>     "crash_id": 
> "2024-01-05T16:35:16.485416Z_0bca3d2a-2451-4275-a049-a65c58c1aff1",
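To confirm these all belong to the same known bug rather than several distinct ones, it helped me to tally the assertions. A minimal sketch, operating on a captured copy of the loop's output above (the sample file and its path here are hypothetical; in practice, redirect the real loop's output into it):

```shell
# Hypothetical capture of the `ceph crash info` loop output above.
cat > /tmp/crash_asserts.txt <<'EOF'
    "assert_condition": "cur >= fnode.size",
    "assert_condition": "cur >= p.length",
    "assert_condition": "cur2 >= p.length",
    "assert_condition": "cur >= p.length",
EOF

# Extract each assert_condition value and count occurrences.
sed -n 's/.*"assert_condition": "\(.*\)",/\1/p' /tmp/crash_asserts.txt \
  | sort | uniq -c | sort -rn
```

On my crash list this shows only the two (three, counting cur2) assertion strings from the tracker, which is what made me reasonably confident it's the known issue.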

As noted in
https://lists.ceph.io/hyperkitty/list/[email protected]/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/

> You can apparently work around the issue by setting 
> 'bluestore_volume_selection_policy' config parameter to rocksdb_original.

However, after setting that parameter with `ceph config set osd.$osd
bluestore_volume_selection_policy rocksdb_original`, it doesn't appear to take effect:

> $ ceph config show-with-defaults osd.0  | grep 
> bluestore_volume_selection_policy
> bluestore_volume_selection_policy                           use_some_extra

> $ ceph config set osd.0 bluestore_volume_selection_policy rocksdb_original
> $ ceph config show osd.0  | grep bluestore_volume_selection_policy
> bluestore_volume_selection_policy   use_some_extra   default   mon

This, I assume, should reflect the new setting; however, it still shows the
default "use_some_extra" value.

But then this output seems to imply that the config is set:
> $ ceph config dump | grep bluestore_volume_selection_policy
>     osd.0                dev       bluestore_volume_selection_policy       
> rocksdb_original                                              *
> [snip]
>     osd.9                dev       bluestore_volume_selection_policy       
> rocksdb_original                                              *
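Since the dump is long, I've been sanity-checking that every OSD row actually carries the override by scanning a captured copy of the output. A sketch against a sample (the file and its two rows are hypothetical stand-ins; pipe the real `ceph config dump` output instead):

```shell
# Hypothetical capture of `ceph config dump` rows, whitespace-collapsed
# (columns: WHO MASK NAME VALUE ...); the real command needs a live cluster.
cat > /tmp/config_dump.txt <<'EOF'
osd.0 dev bluestore_volume_selection_policy rocksdb_original *
osd.9 dev bluestore_volume_selection_policy rocksdb_original *
EOF

# Flag any osd.* row whose stored value is not rocksdb_original.
awk '$1 ~ /^osd\./ && $3 == "bluestore_volume_selection_policy" && $4 != "rocksdb_original" { bad++ }
     END { print (bad ? "override missing on some OSDs" : "all OSD overrides present") }' /tmp/config_dump.txt
```

In my case every OSD shows rocksdb_original in the dump, which is exactly why the "use_some_extra" in `ceph config show` confuses me.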

Does this need to be set in ceph.conf, or is there another setting that also
needs to be set?
Even after bouncing the OSD daemon, `ceph config show` still reports
"use_some_extra".
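To rule out the central config path entirely, my understanding (an assumption on my part, not something I've verified against this bug) is that the option can also be pinned locally in ceph.conf on each OSD host, which the daemons read at startup:

```ini
# /etc/ceph/ceph.conf on each OSD host -- local fallback; my assumption is
# that a local file value takes precedence over the mon config database.
[osd]
bluestore_volume_selection_policy = rocksdb_original
```

I haven't tried this yet and would welcome confirmation that it's a sane workaround.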

I'd appreciate any help anyone can offer to bridge the gap between now and the
next point release.

Thanks,
Reed
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
