Since my problem is going to be archived on the Internet, I'll keep following 
up so the next person who hits this might save some time.


The seek failure was because ext4 can't seek to 23 TB; creating this file on 
an xfs mount instead succeeded.
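
For anyone else who hits this, the limit is easy to demonstrate with a sparse 
file. The mount points below are hypothetical; on ext4 with 4 KiB blocks a 
single file tops out at 16 TiB, while xfs allows far larger files:

# On an ext4 mount this fails with EFBIG ("File too large"):
truncate -s 23T /mnt/ext4/journal-test.bin

# The same sparse file on an xfs mount is created without complaint:
truncate -s 23T /mnt/xfs/journal-test.bin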


Here is what I wound up doing to fix this (a rough command sketch follows the 
list):


  *   Bring down all MDSes so they stop flapping
  *   Back up journal (as seen in previous message)
  *   Apply journal manually
  *   Reset journal manually
  *   Clear session table
  *   Clear other tables (not sure I needed to do this)
  *   Mark FS down
  *   Mark the rank 0 MDS as failed
  *   Reset the FS (yes, I really mean it)
  *   Restart MDSes
  *   Finally get some sleep
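
For the archive, here is roughly the command sequence those steps map to. This 
is a sketch from memory rather than a transcript: the filesystem name is a 
placeholder, the syntax is what Luminous-era releases accept, and the CephFS 
disaster recovery docs should be read before running any of it.

# stop every MDS daemon (run on each MDS host) so they stop flapping
systemctl stop ceph-mds.target

# back up the journal (same command as in the previous message)
cephfs-journal-tool --cluster=prodstore journal export backup.bin

# apply the journal manually, then reset it
cephfs-journal-tool --cluster=prodstore event recover_dentries summary
cephfs-journal-tool --cluster=prodstore journal reset

# clear the session table, and (possibly unnecessarily) the other tables
cephfs-table-tool --cluster=prodstore all reset session
cephfs-table-tool --cluster=prodstore all reset snap
cephfs-table-tool --cluster=prodstore all reset inode

# mark the FS down, mark rank 0 failed, and reset the FS (yes, really)
ceph --cluster=prodstore fs set <fs_name> cluster_down true
ceph --cluster=prodstore mds fail 0
ceph --cluster=prodstore fs reset <fs_name> --yes-i-really-mean-it

# restart the MDS daemons, then finally get some sleep
systemctl start ceph-mds.target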

If anybody has any idea what may have caused this situation, I am keenly 
interested. If not, hopefully I at least helped someone else.


________________________________
From: Pickett, Neale T
Sent: Monday, April 1, 2019 12:31
To: ceph-users@lists.ceph.com
Subject: Re: MDS allocates all memory (>500G) replaying, OOM-killed, repeat


We decided to go ahead and try truncating the journal, but we wanted to back 
it up first. However, there are ridiculous values in the header: the tool 
can't write out a journal this large because (I presume) my ext4 filesystem 
can't seek to that position in the (sparse) backup file.


I would not be surprised to learn that the MDS's memory allocation during 
replay is attempting something similar with these bogus positions, hence it 
consuming all available memory. This seems like a new kind of journal 
corruption that isn't being reported correctly.

[root@lima /]# time cephfs-journal-tool --cluster=prodstore journal export 
backup.bin
journal is 24652730602129~673601102
2019-04-01 17:49:52.776977 7fdcb999e040 -1 Error 22 ((22) Invalid argument) 
seeking to 0x166be9401291
Error ((22) Invalid argument)

real    0m27.832s
user    0m2.028s
sys     0m3.438s
[root@lima /]# cephfs-journal-tool --cluster=prodstore event get summary
Events by type:
  EXPORT: 187
  IMPORTFINISH: 182
  IMPORTSTART: 182
  OPEN: 3133
  SUBTREEMAP: 129
  UPDATE: 42185
Errors: 0
[root@lima /]# cephfs-journal-tool --cluster=prodstore header get
{
    "magic": "ceph fs volume v011",
    "write_pos": 24653404029749,
    "expire_pos": 24652730602129,
    "trimmed_pos": 24652730597376,
    "stream_format": 1,
    "layout": {
        "stripe_unit": 4194304,
        "stripe_count": 1,
        "object_size": 4194304,
        "pool_id": 2,
        "pool_ns": ""
    }
}

[root@lima /]# printf "%x\n" "24653404029749"
166c1163c335
[root@lima /]# printf "%x\n" "24652730602129"
166be9401291
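
For scale (my own arithmetic, not output from the cluster): the expire_pos 
works out to roughly 22 TiB, well past the 16 TiB per-file limit of ext4 with 
4 KiB blocks, which is presumably why the seek in the export above fails:

# expire_pos expressed in whole TiB (integer division)
echo $(( 24652730602129 / (1024 ** 4) ))    # prints 22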
