An MDS-wide lock is acquired before the cache dump starts and is released
once the dump is complete.
So a temporary MDS freeze during the cache dump is expected.
On Fri, May 26, 2023 at 12:51 PM Emmanuel Jaep
wrote:
Hi Milind,
I finally managed to dump the cache and find the file.
It generated a 1.5 GB file with about 7 million lines, so it is hard to
tell what is out of the ordinary.
Furthermore, I noticed that dumping the cache actually stopped the MDS
while it ran. Is this normal behavior?
Best,
Emmanuel
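For a dump of this size, searching for a known path is usually more practical than reading the file end to end. The following is only a minimal sketch: it assumes the dump is plain text with one cache entry per line and that the path of interest appears literally in the entry. `dump_grep` is a hypothetical helper name, not a Ceph command.

```shell
# Hypothetical helper: print the first N lines of a dump file that contain
# a literal string, each prefixed with its line number.
# Usage: dump_grep <dump-file> <literal-string> [max-matches]
dump_grep() {
    dump_file=$1
    needle=$2
    max_matches=${3:-20}
    # -F: treat the needle as a fixed string, not a regular expression
    # -n: prefix each match with its line number in the dump
    # -m: stop after max_matches matches, which helps on a multi-GB file
    grep -F -n -m "$max_matches" -- "$needle" "$dump_file"
}
```

It could be called as `dump_grep /tmp/dump.txt /some/directory 5`, where `/some/directory` stands in for the path being investigated.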
On Th
Try the command with the --id argument:
# ceph --id admin --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt
I presume that your keyring has an appropriate entry for the client.admin
user.
On Wed, May 24, 2023 at 5:10 PM Emmanuel Jaep
wrote:
Absolutely! :-)
root@icadmin011:/tmp# ceph --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt
root@icadmin011:/tmp# ll
total 48
drwxrwxrwt 12 root root 4096 May 24 13:23 ./
drwxr-xr-x 18 root root 4096 Jun 9 2022 ../
drwxrwxrwt 2 root root 4096 May 4 12:43 .ICE-unix/
drwxrwxrwt
I hope the daemon mds.icadmin011 is running on the same machine where you
are looking for /tmp/dump.txt, since the file is created on the system
on which that daemon is running.
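One way to check this is to look for the daemon's admin socket on the local host, since "ceph daemon" talks to that socket. This is a rough sketch that assumes the admin_socket option is left at its default of /var/run/ceph/$cluster-$name.asok. `asok_path` and `daemon_is_local` are hypothetical helper names, not Ceph commands.

```shell
# Hypothetical helper: print the admin socket path for a daemon.
# ASSUMPTION: admin_socket uses the default /var/run/ceph/$cluster-$name.asok
# Usage: asok_path <cluster> <daemon-name>
asok_path() {
    cluster=$1
    daemon=$2
    printf '/var/run/ceph/%s-%s.asok\n' "$cluster" "$daemon"
}

# Hypothetical helper: succeed only if that socket exists on this host.
# Usage: daemon_is_local <cluster> <daemon-name>
daemon_is_local() {
    [ -S "$(asok_path "$1" "$2")" ]
}
```

For example, `daemon_is_local floki mds.icadmin011 && echo "daemon is on this host"` would print the message only on the machine where the socket is present.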
On Wed, May 24, 2023 at 2:16 PM Emmanuel Jaep
wrote:
Hi Milind,
you are absolutely right.
The dump_ops_in_flight is giving a good hint about what's happening:
{
"ops": [
{
"description": "internal op exportdir:mds.5:975673",
"initiated_at": "2023-05-23T17:49:53.030611+0200",
"age": 60596.355186077999,
Emmanuel,
You probably missed the "daemon" keyword after the "ceph" command name.
Here's the docs for pacific:
https://docs.ceph.com/en/pacific/cephfs/troubleshooting/
So, your command should've been:
# ceph daemon mds.icadmin011 dump cache /tmp/dump.txt
You could also dump the ops in flight with