Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Jeff Layton
On Thu, 2019-08-15 at 16:45 +0900, Hector Martin wrote:
> On 15/08/2019 03.40, Jeff Layton wrote:
> > On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote:
> > > Jeff, the oops seems to be a NULL dereference in ceph_lock_message().
> > > Please take a look.
> > >
> >
> > (sorry for duplicate mai

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Hector Martin
On 15/08/2019 03.40, Jeff Layton wrote:
> On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote:
> > Jeff, the oops seems to be a NULL dereference in ceph_lock_message().
> > Please take a look.
>
> (sorry for duplicate mail -- the other one ended up in moderation)
>
> Thanks Ilya,
>
> That function is pretty st

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-14 Thread Jeff Layton
On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote:
> On Tue, Aug 13, 2019 at 1:06 PM Hector Martin wrote:
> > I just had a minor CephFS meltdown caused by underprovisioned RAM on the
> > MDS servers. This is a CephFS with two ranks; I manually failed over the
> > first rank and the new MDS ser

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-14 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 1:06 PM Hector Martin wrote:
>
> I just had a minor CephFS meltdown caused by underprovisioned RAM on the
> MDS servers. This is a CephFS with two ranks; I manually failed over the
> first rank and the new MDS server ran out of RAM in the rejoin phase
> (ceph-mds didn't get