Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-03-11 Thread Eugen Block
Hi all, we had some assistance outside of this mailing list with our SSD crash issue, which is not resolved yet (http://tracker.ceph.com/issues/38395), but there's one thing I'd like to ask the list. I noticed that a lot of the OSD crashes show a correlation to MON elections. For the
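
One way to check the suspected correlation between OSD crashes and MON elections would be to compare the two sets of timestamps. The following is only a minimal sketch in Python, assuming the timestamps have already been extracted from the logs by some other means; the function name, the 60-second window, and the sample values are all hypothetical and not taken from the cluster discussed in this thread.

```python
from datetime import datetime, timedelta

def crashes_near_elections(crashes, elections, window=timedelta(seconds=60)):
    """Return the crash timestamps that lie within `window` of any election."""
    return [c for c in crashes
            if any(abs(c - e) <= window for e in elections)]

# Hypothetical sample data (assumption, for illustration only).
elections = [datetime(2019, 3, 1, 10, 0, 0), datetime(2019, 3, 2, 14, 30, 0)]
crashes = [
    datetime(2019, 3, 1, 10, 0, 20),   # 20s after the first election
    datetime(2019, 3, 1, 18, 45, 0),   # far from any election
    datetime(2019, 3, 2, 14, 29, 30),  # 30s before the second election
]

near = crashes_near_elections(crashes, elections)
print(f"{len(near)} of {len(crashes)} crashes within 60s of an election")
```

A high share of crashes inside the window would support the observation, although it would not by itself show which event causes the other.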

Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Igor Fedotov
On 2/7/2019 6:06 PM, Eugen Block wrote: First, you should upgrade to 12.2.11 (or bring the mentioned patch in by other means) to fix the rename procedure, which will prevent new inconsistent objects from appearing in the DB. This should at least reduce the OSD crash frequency. We'll have to wait until

Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Eugen Block
First, you should upgrade to 12.2.11 (or bring the mentioned patch in by other means) to fix the rename procedure, which will prevent new inconsistent objects from appearing in the DB. This should at least reduce the OSD crash frequency. We'll have to wait until 12.2.11 is available for openSUSE, I'm

Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Igor Fedotov
Eugen, first, you should upgrade to 12.2.11 (or bring the mentioned patch in by other means) to fix the rename procedure, which will prevent new inconsistent objects from appearing in the DB. This should at least reduce the OSD crash frequency. Second, theoretically, previous crashes could result

Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Eugen Block
Hi Igor, thanks for the quick response! Just to make sure I don't misunderstand, and because it's a production cluster: before anything else I should run fsck on that OSD? Depending on the result we'll decide how to continue, right? Is there anything else to be enabled for that command or

Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Igor Fedotov
Hi Eugen, it looks like this isn't [1] but rather https://tracker.ceph.com/issues/38049 and https://tracker.ceph.com/issues/36541 (= https://tracker.ceph.com/issues/36638 for luminous). Hence it's not fixed in 12.2.10; the target release is 12.2.11. Also, please note that the patch makes it possible to avoid new

[ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Eugen Block
Hi list, I found this thread [1] about crashing SSD OSDs. Although that was about an upgrade to 12.2.7, we just hit (probably) the same issue after our update to 12.2.10 two days ago in a production cluster. Just half an hour ago I saw one OSD (SSD) crashing (for the first time):