Hi Peter,
Thanks a lot for the reply. Here is the `ceph osd df` output:
# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 2 0.04399  1.00000 46056M 35576k 46021M  0.08 0.00   0
 1 0.04399  1.00000 46056M 40148k 46017M  0.09 0.00 384
 0 0.04399  1.00000 46056M 43851M  2205M 95.21 2.99 192
 0 0.04399  1.00000 46056M 43851M  2205M 95.21 2.99 192
 1 0.04399  1.00000 46056M 40148k 46017M  0.09 0.00 384
 2 0.04399  1.00000 46056M 35576k 46021M  0.08 0.00   0
              TOTAL   134G 43925M 94244M 31.79
MIN/MAX VAR: 0.00/2.99  STDDEV: 44.85
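
As a sanity check on the figures above, here is a minimal Python sketch that re-derives the cluster-wide usage and the over-threshold OSD from the pasted rows. It is only arithmetic on the listing, not a Ceph API call. The unit conversion (1 M = 1024 k) and the `FULL_RATIO` name are assumptions made for illustration; the 95% value is the threshold mentioned in this thread.

```python
# Rows copied from the `ceph osd df` listing above. Each OSD shows up twice,
# presumably because its host bucket is linked under two racks, so the rows
# are de-duplicated by OSD ID before summing.
ROWS = """
2 0.04399 1.00000 46056M 35576k 46021M 0.08 0.00 0
1 0.04399 1.00000 46056M 40148k 46017M 0.09 0.00 384
0 0.04399 1.00000 46056M 43851M 2205M 95.21 2.99 192
0 0.04399 1.00000 46056M 43851M 2205M 95.21 2.99 192
1 0.04399 1.00000 46056M 40148k 46017M 0.09 0.00 384
2 0.04399 1.00000 46056M 35576k 46021M 0.08 0.00 0
"""

FULL_RATIO = 0.95  # assumption: the 95% "full" threshold discussed here


def to_mib(field):
    """Convert a size such as '46056M' or '35576k' to MiB (assumes binary units)."""
    units = {"k": 1 / 1024, "M": 1.0, "G": 1024.0}
    return float(field[:-1]) * units[field[-1]]


def parse(rows):
    """Return {osd_id: {'size': MiB, 'used': MiB}}, keeping one entry per OSD."""
    osds = {}
    for line in rows.strip().splitlines():
        fields = line.split()
        osds[int(fields[0])] = {
            "size": to_mib(fields[3]),
            "used": to_mib(fields[4]),
        }
    return osds


osds = parse(ROWS)
total_size = sum(o["size"] for o in osds.values())
total_used = sum(o["used"] for o in osds.values())
cluster_pct = 100 * total_used / total_size

# OSDs whose own usage is at or above the threshold.
full = sorted(i for i, o in osds.items() if o["used"] / o["size"] >= FULL_RATIO)

print(f"cluster usage: {cluster_pct:.2f}%")
print(f"OSDs at or above {FULL_RATIO:.0%}: {full}")
```

With the rows above, only osd.0 crosses the threshold while the cluster as a whole sits at roughly 31.79%, which matches the TOTAL line.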
I set up this cluster by manipulating the CRUSH map through the CLI. I had
a default root before, but it gave me the impression that, because every
rack was under a single root bucket, Ceph was marking the entire cluster
down when one of the OSDs reached 95% full. So I removed the root bucket,
but that did not help. No CRUSH rule refers to a root bucket in the
configuration above.
Yes, I placed an OSD under two racks by linking its host bucket from one
rack to another, using the following command:
    "osd crush link <name> <args> [<args>...] : link existing entry for
    <name> under location <args>"
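
To make the resulting layout easier to reason about, here is a small Python sketch of the hierarchy from the `ceph osd tree` output quoted below. It is a hypothetical, simplified model that only walks bucket membership; it is not the CRUSH placement algorithm and does not talk to Ceph. The `BUCKETS` and `RULE_TAKE` names are made up for illustration. Only the `take` buckets of `rep_ruleset` and `ip-10-0-9-122_ruleset` come from the rule dumps; the other two entries are assumed by analogy.

```python
# Bucket -> children, transcribed from the `ceph osd tree` output.
# A host that was linked appears under two racks.
BUCKETS = {
    "ip-10-0-9-146-rack": ["ip-10-0-9-146"],
    "ip-10-0-9-210-rack": ["ip-10-0-9-210"],
    "ip-10-0-9-122-rack": ["ip-10-0-9-122"],
    "rep-rack": ["ip-10-0-9-122", "ip-10-0-9-210", "ip-10-0-9-146"],
    "ip-10-0-9-146": ["osd.2"],
    "ip-10-0-9-210": ["osd.1"],
    "ip-10-0-9-122": ["osd.0"],
}

# Rule -> bucket named in its "take" step.
RULE_TAKE = {
    "rep_ruleset": "rep-rack",                      # from the rule dump
    "ip-10-0-9-122_ruleset": "ip-10-0-9-122-rack",  # from the rule dump
    "ip-10-0-9-210_ruleset": "ip-10-0-9-210-rack",  # assumption, not dumped
    "ip-10-0-9-146_ruleset": "ip-10-0-9-146-rack",  # assumption, not dumped
}


def reachable_osds(bucket):
    """Collect every leaf (OSD) found below the given bucket."""
    leaves = set()
    for child in BUCKETS.get(bucket, []):
        if child in BUCKETS:
            leaves |= reachable_osds(child)
        else:
            leaves.add(child)
    return leaves


def parents(item):
    """List every bucket that directly contains the given item."""
    return sorted(b for b, children in BUCKETS.items() if item in children)


for rule, take in RULE_TAKE.items():
    print(f"{rule}: take {take} -> {sorted(reachable_osds(take))}")

print("parents of ip-10-0-9-122:", parents("ip-10-0-9-122"))
```

In this model the per-node rules never reach osd.0 unless they take `ip-10-0-9-122-rack`, yet the `ceph -s` output lists `full` among the osdmap flags. That would be consistent with writes being blocked for every pool regardless of which OSDs its rule can reach, though I have not confirmed that this is the cause.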
On Thu, Aug 10, 2017 at 1:40 PM, Peter Maloney <
[email protected]> wrote:
> I think a `ceph osd df` would be useful.
>
> And how did you set up such a cluster? I don't see a root, and you have
> each osd in there more than once...is that even possible?
>
>
>
> On 08/10/17 08:46, Mandar Naik wrote:
>
> Hi,
>
> I am evaluating a Ceph cluster for a solution where Ceph could be used
> for provisioning pools that are either stored local to a node or
> replicated across the cluster. This way Ceph could serve as a single
> solution for writing both local and replicated data. Local storage
> avoids the storage cost that comes with a replication factor greater
> than one, and still provides availability as long as the data host is
> alive.
>
> So I tried an experiment with a Ceph cluster where one CRUSH rule
> replicates data across nodes and the other points only to a CRUSH bucket
> that has a local Ceph OSD. The cluster configuration is pasted below.
>
> Here I observed that if one of the disks is full (95%), the entire
> cluster goes into an error state and stops accepting new writes from/to
> other nodes. So the Ceph cluster became unusable even though it is only
> 32% full. Writes are blocked even for pools that do not touch the full
> OSD.
>
> I have tried playing around with the CRUSH hierarchy, but it did not
> help. So is it possible to store data in this manner with Ceph? If yes,
> can we get the cluster back into a usable state after one of the nodes
> is full?
>
> # ceph df
> GLOBAL:
>     SIZE     AVAIL      RAW USED     %RAW USED
>     134G     94247M     43922M       31.79
>
> # ceph -s
>     cluster ba658a02-757d-4e3c-7fb3-dc4bf944322f
>      health HEALTH_ERR
>             1 full osd(s)
>             full,sortbitwise,require_jewel_osds flag(s) set
>      monmap e3: 3 mons at {ip-10-0-9-122=10.0.9.122:6789/0,ip-10-0-9-146=10.0.9.146:6789/0,ip-10-0-9-210=10.0.9.210:6789/0}
>             election epoch 14, quorum 0,1,2 ip-10-0-9-122,ip-10-0-9-146,ip-10-0-9-210
>      osdmap e93: 3 osds: 3 up, 3 in
>             flags full,sortbitwise,require_jewel_osds
>       pgmap v630: 384 pgs, 6 pools, 43772 MB data, 18640 objects
>             43922 MB used, 94247 MB / 134 GB avail
>                  384 active+clean
>
> # ceph osd tree
> ID WEIGHT  TYPE NAME                  UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -9 0.04399 rack ip-10-0-9-146-rack
> -8 0.04399     host ip-10-0-9-146
>  2 0.04399         osd.2                   up  1.00000          1.00000
> -7 0.04399 rack ip-10-0-9-210-rack
> -6 0.04399     host ip-10-0-9-210
>  1 0.04399         osd.1                   up  1.00000          1.00000
> -5 0.04399 rack ip-10-0-9-122-rack
> -3 0.04399     host ip-10-0-9-122
>  0 0.04399         osd.0                   up  1.00000          1.00000
> -4 0.13197 rack rep-rack
> -3 0.04399     host ip-10-0-9-122
>  0 0.04399         osd.0                   up  1.00000          1.00000
> -6 0.04399     host ip-10-0-9-210
>  1 0.04399         osd.1                   up  1.00000          1.00000
> -8 0.04399     host ip-10-0-9-146
>  2 0.04399         osd.2                   up  1.00000          1.00000
>
> # ceph osd crush rule list
> [
>     "rep_ruleset",
>     "ip-10-0-9-122_ruleset",
>     "ip-10-0-9-210_ruleset",
>     "ip-10-0-9-146_ruleset"
> ]
>
> # ceph osd crush rule dump rep_ruleset
> {
>     "rule_id": 0,
>     "rule_name": "rep_ruleset",
>     "ruleset": 0,
>     "type": 1,
>     "min_size": 1,
>     "max_size": 10,
>     "steps": [
>         {
>             "op": "take",
>             "item": -4,
>             "item_name": "rep-rack"
>         },
>         {
>             "op": "chooseleaf_firstn",
>             "num": 0,
>             "type": "host"
>         },
>         {
>             "op": "emit"
>         }
>     ]
> }
>
> # ceph osd crush rule dump ip-10-0-9-122_ruleset
> {
>     "rule_id": 1,
>     "rule_name": "ip-10-0-9-122_ruleset",
>     "ruleset": 1,
>     "type": 1,
>     "min_size": 1,
>     "max_size": 10,
>     "steps": [
>         {
>             "op": "take",
>             "item": -5,
>             "item_name": "ip-10-0-9-122-rack"
>         },
>         {
>             "op": "chooseleaf_firstn",
>             "num": 0,
>             "type": "host"
>         },
>         {
>             "op": "emit"
>         }
>     ]
> }
>
> --
> Thanks,
> Mandar Naik.
>
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> --
>
> --------------------------------------------
> Peter Maloney
> Brockmann Consult
> Max-Planck-Str. 2
> 21502 Geesthacht
> Germany
> Tel: +49 4152 889 300
> Fax: +49 4152 889 333
> E-mail: [email protected]
> Internet: http://www.brockmann-consult.de
> --------------------------------------------
>
>
--
Thanks,
Mandar Naik.