[ceph-users] Re: Ceph 16.2.14: osd crash, bdev() _aio_thread got r=-1 ((1) Operation not permitted)

2023-12-05 Thread Zakhar Kirpichenko
Thank you, Tyler. Unfortunately (or fortunately?) the drive is fine in this case: there were no errors reported by the kernel at the time, and I successfully managed to run a bunch of tests on the drive for many hours before rebooting the host. The drive has worked without any issues for 3 days

[ceph-users] Re: Ceph 16.2.14: osd crash, bdev() _aio_thread got r=-1 ((1) Operation not permitted)

2023-12-05 Thread Tyler Stachecki
On Tue, Dec 5, 2023 at 10:13 AM Zakhar Kirpichenko wrote: > > Any input from anyone? > > /Z It's not clear whether or not these issues are related. I see three things in this e-mail chain: 1) bdev() _aio_thread with EPERM, as in the subject of this e-mail chain 2) bdev() _aio_thread with the

[ceph-users] Re: Ceph 16.2.14: osd crash, bdev() _aio_thread got r=-1 ((1) Operation not permitted)

2023-12-05 Thread Zakhar Kirpichenko
Any input from anyone? /Z On Mon, 4 Dec 2023 at 12:52, Zakhar Kirpichenko wrote: > Hi, > > Just to reiterate, I'm referring to an OSD crash loop because of the > following error: > > "2023-12-03T04:00:36.686+ 7f08520e2700 -1 bdev(0x55f02a28a400 > /var/lib/ceph/osd/ceph-56/block)

[ceph-users] Re: Ceph 16.2.14: osd crash, bdev() _aio_thread got r=-1 ((1) Operation not permitted)

2023-12-04 Thread Zakhar Kirpichenko
Hi, Just to reiterate, I'm referring to an OSD crash loop because of the following error: "2023-12-03T04:00:36.686+ 7f08520e2700 -1 bdev(0x55f02a28a400 /var/lib/ceph/osd/ceph-56/block) _aio_thread got r=-1 ((1) Operation not permitted)". More relevant log entries:
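
As a reading aid for the error line quoted above, here is a minimal sketch of how the fields in that line could be pulled apart and the numeric code mapped to its symbolic errno name. The regular expression and the function name `parse_aio_error` are assumptions made for illustration; they are not taken from Ceph or from the thread. The sketch only decodes the message and says nothing about why the OSD received the error.

```python
import errno
import re

# Log line copied from the message above; the timestamp offset is
# truncated in the archive and is left as-is.
LINE = ('2023-12-03T04:00:36.686+ 7f08520e2700 -1 bdev(0x55f02a28a400 '
        '/var/lib/ceph/osd/ceph-56/block) _aio_thread got r=-1 '
        '((1) Operation not permitted)')

# Hypothetical pattern written for this one message layout.
PATTERN = re.compile(
    r'bdev\((?P<handle>0x[0-9a-f]+) (?P<path>[^)]+)\) '
    r'_aio_thread got r=(?P<ret>-?\d+) \(\((?P<code>\d+)\) (?P<text>[^)]*)\)'
)


def parse_aio_error(line):
    """Return the device path, return value and errno details, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group('code'))
    return {
        'path': match.group('path'),
        'ret': int(match.group('ret')),
        'errno': code,
        # errno.errorcode maps numeric codes to names such as 'EPERM'.
        'name': errno.errorcode.get(code, 'UNKNOWN'),
        'text': match.group('text'),
    }


if __name__ == '__main__':
    print(parse_aio_error(LINE))
```

Run against the quoted line, the sketch reports errno 1 under the name EPERM, which matches the "EPERM" wording used elsewhere in the thread.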

[ceph-users] Re: Ceph 16.2.14: osd crash, bdev() _aio_thread got r=-1 ((1) Operation not permitted)

2023-12-03 Thread Zakhar Kirpichenko
Thanks! The bug I referenced is the reason for the 1st OSD crash, but not for the subsequent crashes. The reason for those is described where you . I'm asking for help with that one. /Z On Sun, 3 Dec 2023 at 15:31, Kai Stian Olstad wrote: > On Sun, Dec 03, 2023 at 06:53:08AM +0200, Zakhar

[ceph-users] Re: Ceph 16.2.14: osd crash, bdev() _aio_thread got r=-1 ((1) Operation not permitted)

2023-12-03 Thread Kai Stian Olstad
On Sun, Dec 03, 2023 at 06:53:08AM +0200, Zakhar Kirpichenko wrote: One of our 16.2.14 cluster OSDs crashed again because of the dreaded https://tracker.ceph.com/issues/53906 bug. It would be good to understand what has triggered this condition and how it can be resolved without rebooting