Hi,

Your crushmap has issues.

You don't have any root bucket, and you have duplicate entries. Currently you
store data on a single OSD.


You can fix the crushmap manually by decompiling, editing, and recompiling it:

http://docs.ceph.com/docs/hammer/rados/operations/crush-map/#editing-a-crush-map

(If you have production data, take a backup first.)
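The workflow looks roughly like this (filenames are arbitrary):

# ceph osd getcrushmap -o crushmap.bin
# crushtool -d crushmap.bin -o crushmap.txt
# ... edit crushmap.txt: add a root bucket, remove the duplicate entries ...
# crushtool -c crushmap.txt -o crushmap-new.bin
# ceph osd setcrushmap -i crushmap-new.bin

In the edit step, a root bucket would look something like the sketch below
(rack names taken from your 'ceph osd tree'; the id just has to be a free
negative number, and the items should match whatever hierarchy you keep):

root default {
    id -10
    alg straw
    hash 0  # rjenkins1
    item ip-10-0-9-146-rack weight 0.044
    item ip-10-0-9-210-rack weight 0.044
    item ip-10-0-9-122-rack weight 0.044
}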


Étienne


________________________________
From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Mandar Naik 
<mandar.p...@gmail.com>
Sent: Wednesday, August 16, 2017 09:39
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph cluster in error state (full) with raw usage 32% 
of total capacity

Hi,
I just wanted to send a friendly reminder about this issue. I would appreciate
it if someone could help me out here. Also, please let me know in case more
information is required.

On Thu, Aug 10, 2017 at 2:41 PM, Mandar Naik <mandar.p...@gmail.com> wrote:
Hi Peter,
Thanks a lot for the reply. Please find the 'ceph osd df' output below:

# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 2 0.04399  1.00000 46056M 35576k 46021M  0.08 0.00   0
 1 0.04399  1.00000 46056M 40148k 46017M  0.09 0.00 384
 0 0.04399  1.00000 46056M 43851M  2205M 95.21 2.99 192
 0 0.04399  1.00000 46056M 43851M  2205M 95.21 2.99 192
 1 0.04399  1.00000 46056M 40148k 46017M  0.09 0.00 384
 2 0.04399  1.00000 46056M 35576k 46021M  0.08 0.00   0
              TOTAL   134G 43925M 94244M 31.79
MIN/MAX VAR: 0.00/2.99  STDDEV: 44.85

I set up this cluster by manipulating the CRUSH map using the CLI. I had a
default root before, but I was under the impression that, since every rack sat
under a single root bucket, that was why the entire cluster was being marked
down when one of the OSDs hit 95% full. So I removed the root bucket, but that
still did not help. No CRUSH rule refers to the root bucket in the
above-mentioned case.
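(For context, removing a root via the CLI looks roughly like this, assuming
the usual root bucket name 'default':

# ceph osd crush unlink ip-10-0-9-122-rack default
# ... same for the other two racks, then ...
# ceph osd crush remove default
)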

Yes, I added one OSD under two racks by linking the host bucket from one rack
into another using the following command:

"osd crush link <name> <args> [<args>...] :  link existing entry for <name>
under location <args>"
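Concretely, for one of the hosts the invocation would have looked something
like this (reconstructed from the 'ceph osd tree' output below; similarly for
the other two hosts):

# ceph osd crush link ip-10-0-9-122 rack=rep-rack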


On Thu, Aug 10, 2017 at 1:40 PM, Peter Maloney
<peter.malo...@brockmann-consult.de> wrote:
I think a `ceph osd df` would be useful.

And how did you set up such a cluster? I don't see a root, and you have each 
osd in there more than once...is that even possible?



On 08/10/17 08:46, Mandar Naik wrote:

Hi,

I am evaluating a Ceph cluster for a solution where Ceph could be used for
provisioning pools that are either stored locally on a node or replicated
across the cluster. This way Ceph could be used as a single solution for
writing both local and replicated data. Local storage helps avoid the storage
cost that comes with a replication factor of more than one, and also provides
availability as long as the data host is alive.


So I tried an experiment with a Ceph cluster where one CRUSH rule replicates
data across nodes and the other rules each point to a CRUSH bucket that
contains only a local Ceph OSD. The cluster configuration is pasted below.


Here I observed that if one of the disks is full (95%), the entire cluster
goes into an error state and stops accepting new writes from/to other nodes.
So the Ceph cluster became unusable even though it's only 32% full. Writes are
blocked even for pools which do not touch the full OSD.


I have tried playing around with the CRUSH hierarchy, but it did not help. So
is it possible to store data in the above manner with Ceph? If yes, could we
get the cluster back into a usable state after one of the nodes is full?



# ceph df
GLOBAL:
    SIZE     AVAIL      RAW USED     %RAW USED
    134G     94247M       43922M         31.79


# ceph -s
    cluster ba658a02-757d-4e3c-7fb3-dc4bf944322f
     health HEALTH_ERR
            1 full osd(s)
            full,sortbitwise,require_jewel_osds flag(s) set
     monmap e3: 3 mons at {ip-10-0-9-122=10.0.9.122:6789/0,ip-10-0-9-146=10.0.9.146:6789/0,ip-10-0-9-210=10.0.9.210:6789/0}
            election epoch 14, quorum 0,1,2 ip-10-0-9-122,ip-10-0-9-146,ip-10-0-9-210
     osdmap e93: 3 osds: 3 up, 3 in
            flags full,sortbitwise,require_jewel_osds
      pgmap v630: 384 pgs, 6 pools, 43772 MB data, 18640 objects
            43922 MB used, 94247 MB / 134 GB avail
                 384 active+clean


# ceph osd tree
ID WEIGHT  TYPE NAME               UP/DOWN REWEIGHT PRIMARY-AFFINITY
-9 0.04399 rack ip-10-0-9-146-rack
-8 0.04399     host ip-10-0-9-146
 2 0.04399         osd.2                up  1.00000          1.00000
-7 0.04399 rack ip-10-0-9-210-rack
-6 0.04399     host ip-10-0-9-210
 1 0.04399         osd.1                up  1.00000          1.00000
-5 0.04399 rack ip-10-0-9-122-rack
-3 0.04399     host ip-10-0-9-122
 0 0.04399         osd.0                up  1.00000          1.00000
-4 0.13197 rack rep-rack
-3 0.04399     host ip-10-0-9-122
 0 0.04399         osd.0                up  1.00000          1.00000
-6 0.04399     host ip-10-0-9-210
 1 0.04399         osd.1                up  1.00000          1.00000
-8 0.04399     host ip-10-0-9-146
 2 0.04399         osd.2                up  1.00000          1.00000


# ceph osd crush rule list
[
    "rep_ruleset",
    "ip-10-0-9-122_ruleset",
    "ip-10-0-9-210_ruleset",
    "ip-10-0-9-146_ruleset"
]


# ceph osd crush rule dump rep_ruleset
{
    "rule_id": 0,
    "rule_name": "rep_ruleset",
    "ruleset": 0,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -4,
            "item_name": "rep-rack"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
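For reference, the same rule in decompiled crushmap text form would look like
this:

rule rep_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take rep-rack
    step chooseleaf firstn 0 type host
    step emit
}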


# ceph osd crush rule dump ip-10-0-9-122_ruleset
{
    "rule_id": 1,
    "rule_name": "ip-10-0-9-122_ruleset",
    "ruleset": 1,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -5,
            "item_name": "ip-10-0-9-122-rack"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}


--
Thanks,
Mandar Naik.






--

--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.malo...@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------



--
Thanks,
Mandar Naik.



--
Thanks,
Mandar Naik.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
