Re: [ceph-users] MDS: journaler.pq decode error

2018-06-21 Thread John Spray
On Thu, Jun 21, 2018 at 4:39 PM Benjeman Meekhof wrote:
> I do have one follow-up related question: While doing this I took
> offline all the standby MDS, and max_mds on our cluster is at 1. Were
> I to enable multiple MDS would they all actively split up processing
> the purge queue?

When

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-21 Thread Benjeman Meekhof
I do have one related follow-up question: while doing this I took all the standby MDS offline, and max_mds on our cluster is at 1. Were I to enable multiple MDS, would they all actively split up processing the purge queue? We have never yet allowed multiple active MDS but plan to

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-21 Thread Benjeman Meekhof
Thanks very much, John! Skipping over the corrupt entry by setting a new expire_pos seems to have worked. The journal expire_pos is now advancing and pools are being purged. It has a little while to go to catch up to the current write_pos, but the journal inspect command gives an 'OK' for overall

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-21 Thread John Spray
On Wed, Jun 20, 2018 at 2:17 PM Benjeman Meekhof wrote:
> Thanks for the response. I was also hoping to be able to debug better
> once we got onto Mimic. We just finished that upgrade yesterday and
> cephfs-journal-tool does find a corruption in the purge queue though
> our MDS continues to

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-20 Thread Benjeman Meekhof
Thanks for the response. I was also hoping to be able to debug better once we got onto Mimic. We just finished that upgrade yesterday, and cephfs-journal-tool does find a corruption in the purge queue, though our MDS continues to start up and the filesystem appears to be functional as usual. How

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-15 Thread John Spray
On Fri, Jun 15, 2018 at 2:55 PM, Benjeman Meekhof wrote:
> Have seen some posts and issue trackers related to this topic in the
> past but haven't been able to put it together to resolve the issue I'm
> having. All on Luminous 12.2.5 (upgraded over time from past
> releases). We are going to

[ceph-users] MDS: journaler.pq decode error

2018-06-15 Thread Benjeman Meekhof
Have seen some posts and issue trackers related to this topic in the past but haven't been able to put it together to resolve the issue I'm having. All on Luminous 12.2.5 (upgraded over time from past releases). We are going to upgrade to Mimic in the near future if that would somehow resolve the