Hi all,

I got the MDS up. However, after quite some time it's sitting there with almost no
CPU load:

top - 21:40:02 up  2:49,  1 user,  load average: 0.00, 0.02, 0.34
Tasks: 606 total,   1 running, 247 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.1 sy,  0.0 ni, 99.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
GiB Mem :    503.7 total,     12.3 free,    490.3 used,      1.1 buff/cache
GiB Swap:   3577.0 total,   3367.0 free,    210.0 used.      2.9 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  59495 ceph      20   0  685.8g 477.9g   0.0g S   1.0 94.9  53:47.57 ceph-mds

I'm not sure if it's doing anything at all. Only messages like these keep
showing up in the log:

2025-01-10T21:38:08.459+0100 7f87ccd5f700  1 heartbeat_map is_healthy 'MDSRank' 
had timed out after 15.000000954s
2025-01-10T21:38:08.459+0100 7f87ccd5f700  0 mds.beacon.ceph-12 Skipping beacon 
heartbeat to monitors (last acked 3019.23s ago); MDS internal heartbeat is not 
healthy!
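
One way to see whether the daemon is at least still doing work might be its admin
socket on the MDS host (a sketch; the daemon id ceph-12 is taken from the log above):

# ceph daemon mds.ceph-12 objecter_requests    # in-flight RADOS ops; non-empty output means it is still reading/writing metadata
# ceph daemon mds.ceph-12 dump_ops_in_flight   # MDS operations currently being processed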

The MDS cluster looks healthy from this output:

# ceph fs status
con-fs2 - 1554 clients
=======
RANK  STATE     MDS       ACTIVITY     DNS    INOS   DIRS   CAPS  
 0    active  ceph-15  Reqs:    0 /s   255k   248k  5434   1678   
 1    active  ceph-14  Reqs:    2 /s   402k   396k  26.7k   144k  
 2    active  ceph-12  Reqs:    0 /s  86.9M  86.9M  46.2k  3909   
 3    active  ceph-08  Reqs:    0 /s   637k   630k  2663   7457   
 4    active  ceph-11  Reqs:    0 /s  1496k  1492k   113k   103k  
 5    active  ceph-16  Reqs:    2 /s   775k   769k  65.3k  12.9k  
 6    active  ceph-24  Reqs:    0 /s   130k   113k  7294   8670   
 7    active  ceph-13  Reqs:   65 /s  3619k  3609k   469k  47.2k  
        POOL           TYPE     USED  AVAIL  
   con-fs2-meta1     metadata  4078G  7269G  
   con-fs2-meta2       data       0   7258G  
    con-fs2-data       data    1225T  2476T  
con-fs2-data-ec-ssd    data     794G  22.6T  
   con-fs2-data2       data    5747T  2253T  
STANDBY MDS  
  ceph-09    
  ceph-10    
  ceph-23    
  ceph-17    
MDS version: ceph version 16.2.15 (618f440892089921c3e944a991122ddc44e60516) 
pacific (stable)

Did it mark itself out of the cluster, and is it now waiting for the MON to fail it?
Please help.
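
For reference, the MONs' view of the rank and the beacon grace can be checked roughly
like this (a sketch; adjust the daemon name as needed):

# ceph fs dump | grep ceph-12             # the entry shows "laggy since ..." if the MONs are missing beacons
# ceph config get mds mds_beacon_grace    # seconds without an acked beacon before the MONs consider failing the MDS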

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <fr...@dtu.dk>
Sent: Friday, January 10, 2025 8:51 PM
To: Spencer Macphee
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Help needed, ceph fs down due to large stray dir

Hi all,

I seem to have gotten the MDS up to the point where it reports stats. Do these
messages mean anything:

2025-01-10T20:50:25.256+0100 7f87ccd5f700  1 heartbeat_map is_healthy 'MDSRank' 
had timed out after 15.000000954s
2025-01-10T20:50:25.256+0100 7f87ccd5f700  0 mds.beacon.ceph-12 Skipping beacon 
heartbeat to monitors (last acked 156.027s ago); MDS internal heartbeat is not 
healthy!

I hope it doesn't get failed by some kind of timeout now.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Spencer Macphee <spencerofsyd...@gmail.com>
Sent: Friday, January 10, 2025 7:16 PM
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Help needed, ceph fs down due to large stray dir

I had a similar issue some months ago; the MDS ended up using around 300 gigabytes
of RAM for a similar number of strays.

You can get an idea of the strays kicking around by checking the omapkeys of
the stray objects in the CephFS metadata pool. Strays are tracked in objects
600.00000000, 601.00000000, 602.00000000, and so on. That would also give you
an indication of whether it's progressing across restarts; see the sketch below.
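
For example (a sketch; con-fs2-meta1 stands in for your CephFS metadata pool, and the
exact object names can differ for non-zero ranks):

# rados -p con-fs2-meta1 listomapkeys 600.00000000 | wc -l    # number of stray dentries in this stray directory object
(repeat for 601.00000000, 602.00000000, and so on, and compare the counts over time)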

On Fri, Jan 10, 2025 at 1:30 PM Frank Schilder <fr...@dtu.dk> wrote:
Hi all,

we seem to have a serious issue with our file system; the Ceph version is latest
Pacific. After a large cleanup operation we had an MDS rank with 100 million stray
entries (yes, one hundred million). Today we restarted this daemon, which triggers
cleanup of the stray entries. This seems to lead to a restart loop due to OOM: the
rank becomes active and then starts pulling DNS and INOS entries into cache until
all memory is exhausted.

I have no idea whether it at least makes progress removing the stray items or whether
it starts from scratch every time. If it needs to pull as many DNS/INOS entries into
cache as there are stray items, we don't have a server at hand with enough RAM.
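
For context, the configured cache target can be compared with what the daemon
actually uses (a sketch; mds.ceph-12 is just an example daemon id):

# ceph config get mds mds_cache_memory_limit    # configured MDS cache memory target, in bytes
# ceph daemon mds.ceph-12 cache status          # cache memory currently used, as reported by the MDS itself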

Q1: Is the MDS at least making progress in every restart iteration?
Q2: If not, how do we get this rank up again?
Q3: If we can't get this rank up soon, can we at least move directories away
from this rank by pinning them to another rank?

Currently, the rank in question reports .mds_cache.num_strays=0 in perf dump.
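
For reference, that counter can be read on the MDS host roughly like this (a sketch;
mds.ceph-12 as an example daemon id, and jq just for picking out the field):

# ceph daemon mds.ceph-12 perf dump | jq .mds_cache.num_strays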

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
