Since my problem is going to be archived on the Internet I'll keep following up, so the next person with this problem might save some time.
The seek failed because ext4 can't seek to 23TB; switching to an xfs mount to create this file succeeded. Here is what I wound up doing to fix this:

* Bring down all MDSes so they stop flapping
* Back up journal (as seen in previous message)
* Apply journal manually
* Reset journal manually
* Clear session table
* Clear other tables (not sure I needed to do this)
* Mark FS down
* Mark the rank 0 MDS as failed
* Reset the FS (yes, I really mean it)
* Restart MDSes
* Finally get some sleep

If anybody has any idea what may have caused this situation, I am keenly interested. If not, hopefully I at least helped someone else.

________________________________
From: Pickett, Neale T
Sent: Monday, April 1, 2019 12:31
To: ceph-users@lists.ceph.com
Subject: Re: MDS allocates all memory (>500G) replaying, OOM-killed, repeat

We decided to go ahead and try truncating the journal, but before we did, we tried to back it up. However, there are ridiculous values in the header. The export can't write a journal this large because (I presume) my ext4 filesystem can't seek to this position in the (sparse) file.

I would not be surprised to learn that memory allocation is trying to do something similar, hence the allocation of all available memory. This seems like a new kind of journal corruption that isn't being reported correctly.
[root@lima /]# time cephfs-journal-tool --cluster=prodstore journal export backup.bin
journal is 24652730602129~673601102
2019-04-01 17:49:52.776977 7fdcb999e040 -1 Error 22 ((22) Invalid argument) seeking to 0x166be9401291
Error ((22) Invalid argument)

real    0m27.832s
user    0m2.028s
sys     0m3.438s

[root@lima /]# cephfs-journal-tool --cluster=prodstore event get summary
Events by type:
  EXPORT: 187
  IMPORTFINISH: 182
  IMPORTSTART: 182
  OPEN: 3133
  SUBTREEMAP: 129
  UPDATE: 42185
Errors: 0

[root@lima /]# cephfs-journal-tool --cluster=prodstore header get
{
    "magic": "ceph fs volume v011",
    "write_pos": 24653404029749,
    "expire_pos": 24652730602129,
    "trimmed_pos": 24652730597376,
    "stream_format": 1,
    "layout": {
        "stripe_unit": 4194304,
        "stripe_count": 1,
        "object_size": 4194304,
        "pool_id": 2,
        "pool_ns": ""
    }
}

[root@lima /]# printf "%x\n" "24653404029749"
166c1163c335
[root@lima /]# printf "%x\n" "24652730602129"
166be9401291
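For anyone who wants to sanity-check the numbers above, here is a minimal Python sketch of the offset arithmetic. The three positions are copied from the `header get` output. The 16 TiB per-file limit is an assumption: it is the ext4 limit with 4 KiB blocks, and the block size of the filesystem in question was never stated. The `describe` helper is just an illustrative name.

```python
# Positions copied from the `header get` output in this message.
write_pos = 24653404029749
expire_pos = 24652730602129
trimmed_pos = 24652730597376

TIB = 2**40
MIB = 2**20

# Assumption: ext4 with 4 KiB blocks, where a single file tops out at 16 TiB.
EXT4_MAX_FILE_SIZE = 16 * TIB


def describe(name, pos):
    """Show one journal position in decimal, hex, and TiB."""
    return f"{name}: {pos} = 0x{pos:x} = {pos / TIB:.2f} TiB"


for name, pos in [
    ("write_pos", write_pos),
    ("expire_pos", expire_pos),
    ("trimmed_pos", trimmed_pos),
]:
    print(describe(name, pos))

# Data between expire_pos and write_pos is what still has to be replayed.
live_bytes = write_pos - expire_pos
print(f"unexpired journal data: {live_bytes / MIB:.1f} MiB")

# The export seeks to expire_pos inside the (sparse) backup file.
print("seek target beyond assumed ext4 limit:", expire_pos > EXT4_MAX_FILE_SIZE)
```

If the assumption holds, the picture is a few hundred MiB of live journal sitting at an offset of over 22 TiB, which would explain why the seek returned EINVAL on ext4 but worked on xfs. Note that `write_pos - expire_pos` is close to, but not identical to, the length the export printed (673601102).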