Success!

I remembered I had a server I'd taken out of the cluster to
investigate some issues, which had some good-quality 800GB Intel DC
SSDs. I dedicated an entire drive to swap, tuned up min_free_kbytes,
added an MDS to that server and let it run. It took 3-4 hours, but
the MDS eventually came back online. It used the 128GB of RAM and
about 250GB of the swap.
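
For anyone finding this thread later, here is a minimal sketch of the
steps above, written as a dry run that only prints commands. It is not
the exact set of commands used here. The device path /dev/sdX and the
min_free_kbytes value are placeholders, and mapping "the beacon" to the
mds_beacon_grace option set via "ceph config set" is an assumption;
older releases may need a different way to set it.

```python
import shlex


def recovery_commands(swap_device, min_free_kbytes, beacon_grace=100000):
    """Build (but do not run) the commands for the swap-backed MDS replay.

    swap_device and min_free_kbytes are site-specific placeholders.
    beacon_grace is assumed to correspond to the mds_beacon_grace option.
    """
    if min_free_kbytes <= 0:
        raise ValueError("min_free_kbytes must be a positive integer")
    return [
        # Dedicate the whole drive to swap (destroys existing data on it).
        ["mkswap", swap_device],
        ["swapon", swap_device],
        # Keep more memory free so the kernel can reclaim under pressure.
        ["sysctl", "-w", f"vm.min_free_kbytes={min_free_kbytes}"],
        # Assumption: raise the beacon grace so the MDS is not marked
        # failed while it is busy replaying and not sending beacons.
        ["ceph", "config", "set", "global", "mds_beacon_grace",
         str(beacon_grace)],
    ]


if __name__ == "__main__":
    # Hypothetical values; review every command before running any of them.
    for cmd in recovery_commands("/dev/sdX", 4194304):
        print(shlex.join(cmd))
```

Printing rather than executing is deliberate: mkswap is destructive, so
each line should be checked against the actual host first. Remember to
set the beacon value back once the MDS is active again.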

Dan, thanks so much for steering me down this path. I would more than
likely have started hacking away at the journal otherwise!

Frank, thanks for pointing me towards that other thread. I used your
min_free_kbytes tip.

I now need to consider updating. I wonder whether the risk-averse
CephFS operator would go for the latest Nautilus or the latest Octopus.
It used to be that the newest CephFS code was the most stable, but I
don't know if that's still the case.

Thanks again,
David

On Thu, Oct 22, 2020 at 7:06 PM Frank Schilder <fr...@dtu.dk> wrote:
>
> The post was titled "mds behind on trimming - replay until memory exhausted".
>
> > Load up with swap and try the up:replay route.
> > Set the beacon to 100000 until it finishes.
>
> Good point! The MDS will not send beacons for a long time. Same was necessary 
> in the other case.
>
> Good luck!
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
