On Tue, Dec 23, 2014 at 3:34 AM, Max Power <[email protected]> wrote:
> I understand that the status "osd full" should never be reached. As I am
> new to ceph I want to be prepared for this case. I tried two different
> scenarios and here are my experiences:

For a real cluster, you should be monitoring your cluster and taking
immediate action as soon as an OSD reaches the nearfull state. Waiting
until OSDs are toofull is too late. For a test cluster, it's a great
learning experience. :-)

> The first one is to completely fill the storage (for me: writing files
> to a rados blockdevice). I discovered that the writing client (dd for
> example) gets completely stuck then. And this prevents me from stopping
> the process (SIGTERM, SIGKILL). At the moment I restart the whole
> computer to prevent writing to the cluster. Then I unmap the rbd device
> and set the full ratio a bit higher (0.95 to 0.97). I do a mount on my
> adminnode and delete files till everything is okay again.
> Is this the best practice?

It is a design feature of Ceph that all cluster reads and writes stop
until the toofull situation is resolved. The route you took is one of two
ways to recover. The other route you found in your replica test.

> Is it possible to prevent the system from running into an "osd full"
> state? I could make the block devices smaller than the cluster can save.
> But it's hard to calculate this exactly.

If you continue to add data to the cluster after it's nearfull, then
you're going to hit toofull. Once you hit nearfull, you need to delete
existing data or add more OSDs.

You've probably noticed that some OSDs are using more space than others.
You can try to even them out with `ceph osd reweight` or `ceph osd crush
reweight`, but that's a delaying tactic. When I hit nearfull, I place an
order for new hardware, then use `ceph osd reweight` until it arrives.

> The next scenario is to change a pool size from say 2 to 3 replicas.
> While the cluster copies the objects it gets stuck as an osd reaches its
> limit. Normally the osd process quits then and I cannot restart it (even
> after setting the replicas back). The only possibility is to manually
> delete complete PG folders after exploring them with 'pg dump'. Is this
> the only way to get it back working again?

There are some other configs that might have come into play here. You
might have run into osd_failsafe_nearfull_ratio or osd_failsafe_full_ratio.
You could try bumping those up a bit, and see if that lets the process stay
up long enough to start reducing replicas. Since osd_failsafe_full_ratio is
already 0.97, I wouldn't take it any higher than 0.98. Ceph triggers on
"greater-than" percentages, so 0.99 will let you fill a disk to 100% full.
If you get a disk to 100% full, the only way to clean up is to start
deleting PG directories.
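For reference, all of the ratios discussed above have ceph.conf
counterparts. Here's a sketch of the relevant settings with what I believe
are the defaults; double-check against `config show` on your own cluster
before relying on them:

```ini
[global]
    ; Cluster-wide thresholds enforced by the monitors.
    mon osd nearfull ratio = .85   ; HEALTH_WARN: add OSDs or delete data
    mon osd full ratio     = .95   ; cluster stops accepting writes

[osd]
    ; Per-OSD safety valves, checked by the OSD daemon itself.
    ; I wouldn't push the failsafe full ratio past 0.98; since Ceph
    ; triggers on greater-than, 0.99 would allow a disk to hit 100%.
    osd failsafe nearfull ratio = .90
    osd failsafe full ratio     = .97
```

On a running cluster you shouldn't need a restart: the cluster-wide
ratio can be raised with `ceph pg set_full_ratio 0.97`, and the failsafe
values can be injected into live OSDs with `ceph tell osd.* injectargs`.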
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
