Hi all,
I’m running a Ceph cluster managed by Rook on Kubernetes, and my CephFS
metadata/journal appears to be in a bad state. I’d like to get advice
before I attempt any further destructive metadata repair operations.
Below is a summary of the situation.
*Environment*
- Orchestrator: Rook/Ceph on Kubernetes (namespace: rook-ceph)
- Ceph version: 18.2.2 (Reef, stable) for mon/mgr/osd/mds
- Filesystem:
- Name: rookfs
- One metadata pool
- One data pool
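For reference, everything below was run from inside the Rook toolbox pod,
roughly like this (the deployment name is the stock Rook toolbox name, so
adjust if yours differs):
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
# inside the pod, the usual ceph / rados / cephfs-* tools are available
ceph -s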
*Cluster health*
ceph status shows:
- Health: HEALTH_WARN
- Warnings:
- 1 MDS reports slow metadata IOs
- 1 MDS reports slow requests
- Reduced data availability: some PGs in stale state
- A large number of daemons have recently crashed
MDS section reports:
- 1/1 MDS daemons up, 1 hot standby
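If it helps, the warning text above comes from ceph -s / ceph health detail,
and the recent crashes can be enumerated with the crash module; these are
all read-only:
ceph health detail          # full text of the slow metadata IO / slow request warnings
ceph crash ls               # list of recently crashed daemons
ceph crash info <crash-id>  # backtrace for a specific crash; I can post these if useful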
*CephFS state and MDS status*
ceph fs dump for rookfs shows:
- Filesystem rookfs is marked damaged.
- max_mds = 1.
- in set is empty.
- up set is {0=<mds_id>}.
- Flags mention allow_standby_replay.
So, the filesystem is marked damaged in the fsmap, while one MDS is still
up:active on rank 0.
ceph tell mds.* status confirms that the MDS for rookfs is:
- state: up:active
- fs_name: rookfs
- whoami: 0
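For completeness, the fsmap and MDS state above are from:
ceph fs dump            # fsmap: damaged flag, max_mds, in/up sets, flags
ceph tell mds.* status  # per-daemon view: state, fs_name, whoami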
I ran the following commands on the CephFS journal:
1. Journal reset:
cephfs-journal-tool --rank=rookfs:0 journal reset
This completed and indicated a new journal start offset.
2. Journal inspection:
cephfs-journal-tool --rank=rookfs:0 journal inspect
Output:
- Bad entry start ptr (...) at certain offsets
- Overall journal integrity: DAMAGED
- Corrupt regions reported, including a range up to ffffffffffffffff
So even after the reset, cephfs-journal-tool still reports the journal as
DAMAGED, with corrupt regions.
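If raw output helps, I can also post the journal header and an event summary
from after the reset; both are read-only:
cephfs-journal-tool --rank=rookfs:0 header get         # journal header (start/write offsets) after the reset
cephfs-journal-tool --rank=rookfs:0 event get summary  # summary of whatever events are still decodable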
Listing the metadata pool shows at least the mds_snaptable object, so the
metadata pool is not empty.
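That check was just a rados listing of the metadata pool; I’ve left the pool
name as a placeholder here since it is the Rook-generated one:
rados -p <metadata-pool> ls | head
# mds_snaptable is present in the listing, so the pool is not empty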
*Current behaviour*
- ceph fs status is sometimes very slow or appears to hang.
- Ceph health reports:
- “MDSs report slow metadata IOs”
- “MDSs report slow requests”
- Stale PGs in the cluster
- The filesystem rookfs is marked damaged in ceph fs dump, but the
MDS is still up:active on rank 0.
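In case it is relevant, the stale PGs can be enumerated and queried with
(read-only, and I can post the output if useful):
ceph pg dump_stuck stale  # which PGs are stuck in the stale state
ceph pg <pgid> query      # detail for one of them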
Any guidance or best practices for handling this kind of journal corruption
and damaged filesystem in a Rook/Kubernetes setup would be greatly
appreciated, including precautions you would strongly recommend before
running the heavy-repair commands.
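To be concrete about what I mean by “heavy-repair commands”: the sequence
from the CephFS disaster-recovery documentation, roughly as sketched below.
The placeholders and exact flags would still need to be checked against the
18.2.x docs before running anything; this is the part I’m hesitant to run
without advice.
# 0. take the filesystem offline so no MDS is replaying while repairing
ceph fs fail rookfs
# 1. back up what is left of the journal before touching it again
cephfs-journal-tool --rank=rookfs:0 journal export backup.bin
# 2. flush recoverable entries back into the metadata pool, then reset
cephfs-journal-tool --rank=rookfs:0 event recover_dentries summary
cephfs-journal-tool --rank=rookfs:0 journal reset
cephfs-table-tool rookfs:all reset session  # session table wipe; exact rank-spec syntax per the man page
# 3. clear the damaged flag on rank 0 and let an MDS take the rank again
ceph mds repaired rookfs:0
# 4. only as a last resort: full metadata rebuild from the data pool with
#    cephfs-data-scan (init / scan_extents / scan_inodes / scan_links),
#    strictly following the disaster-recovery documentation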
Best regards,
Anthony
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]