Dimitar
Is it fixed?
- Is the pool size (replica count) on your cluster 2?
- You could consider running "ceph pg repair {pgid}" or "ceph osd lost 4" (the
latter is a somewhat dangerous command).
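
A minimal sketch of how the repair suggestion could be prepared, assuming the
output of "ceph health detail" has been saved to a file first. The function
names stale_pgs and print_repair_cmds and the file path in the usage note are
hypothetical. The script only prints the repair commands so they can be
reviewed before anything is run:

```shell
#!/bin/sh
# List the PG ids reported as "stuck stale" in saved `ceph health detail` output.
# Assumes one PG per line, as the ceph CLI prints it (mail clients may wrap lines).
stale_pgs() {
    awk '$1 == "pg" && /is stuck stale/ { print $2 }' "$1"
}

# Dry run: print, rather than execute, one `ceph pg repair` command per stale PG.
print_repair_cmds() {
    stale_pgs "$1" | while read -r pgid; do
        echo "ceph pg repair $pgid"
    done
}
```

Usage would be something like "ceph health detail > /tmp/health.txt" followed
by "print_repair_cmds /tmp/health.txt". The pool size can be read with
"ceph osd pool get {pool-name} size". Whether a repair helps for PGs whose only
acting OSD is already gone is not certain, so treat this as a starting point.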
****************************************************************
Karan Singh
Systems Specialist , Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/
****************************************************************
> On 22 Feb 2016, at 10:10, Dimitar Boichev <[email protected]>
> wrote:
>
> Anyone ?
>
> Regards.
>
> From: ceph-users [mailto:[email protected]] On Behalf Of
> Dimitar Boichev
> Sent: Thursday, February 18, 2016 5:06 PM
> To: [email protected]
> Subject: [ceph-users] osd not removed from crush map after ceph osd crush
> remove
>
> Hello,
> I am running a tiny cluster of 2 nodes.
> ceph -v
> ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
>
> One osd died and I added a new osd (not replacing the old one).
> After that I wanted to remove the failed osd completely from the cluster.
> Here is what I did:
> ceph osd reweight osd.4 0.0
> ceph osd crush reweight osd.4 0.0
> ceph osd out osd.4
> ceph osd crush remove osd.4
> ceph auth del osd.4
> ceph osd rm osd.4
>
>
> But after the rebalancing I ended up with 155 PGs in stale+active+clean
> state.
>
> @storage1:/tmp# ceph -s
> cluster 7a9120b9-df42-4308-b7b1-e1f3d0f1e7b3
> health HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are
> blocked > 32 sec; nodeep-scrub flag(s) set
> monmap e1: 1 mons at {storage1=192.168.10.3:6789/0}, election epoch 1,
> quorum 0 storage1
> osdmap e1064: 6 osds: 6 up, 6 in
> flags nodeep-scrub
> pgmap v26760322: 712 pgs, 8 pools, 532 GB data, 155 kobjects
> 1209 GB used, 14210 GB / 15419 GB avail
> 155 stale+active+clean
> 557 active+clean
> client io 91925 B/s wr, 5 op/s
>
> I know about the single-monitor problem. I just want to get the cluster back to
> a healthy state first; then I will add the third storage node and go up to 3
> monitors.
>
> The problem is as follows:
> @storage1:/tmp# ceph pg map 2.3a
> osdmap e1064 pg 2.3a (2.3a) -> up [6] acting [6]
> @storage1:/tmp# ceph pg 2.3a query
> Error ENOENT: i don't have pgid 2.3a
>
>
> @storage1:/tmp# ceph health detail
> HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are blocked > 32
> sec; 1 osds have slow requests; nodeep-scrub flag(s) set
> pg 7.2a is stuck stale for 8887559.656879, current state stale+active+clean,
> last acting [4]
> pg 5.28 is stuck stale for 8887559.656886, current state stale+active+clean,
> last acting [4]
> pg 7.2b is stuck stale for 8887559.656889, current state stale+active+clean,
> last acting [4]
> pg 7.2c is stuck stale for 8887559.656892, current state stale+active+clean,
> last acting [4]
> pg 0.2b is stuck stale for 8887559.656893, current state stale+active+clean,
> last acting [4]
> pg 6.2c is stuck stale for 8887559.656894, current state stale+active+clean,
> last acting [4]
> pg 6.2f is stuck stale for 8887559.656893, current state stale+active+clean,
> last acting [4]
> pg 2.2b is stuck stale for 8887559.656896, current state stale+active+clean,
> last acting [4]
> pg 2.25 is stuck stale for 8887559.656896, current state stale+active+clean,
> last acting [4]
> pg 6.20 is stuck stale for 8887559.656898, current state stale+active+clean,
> last acting [4]
> pg 5.21 is stuck stale for 8887559.656898, current state stale+active+clean,
> last acting [4]
> pg 0.24 is stuck stale for 8887559.656904, current state stale+active+clean,
> last acting [4]
> pg 2.21 is stuck stale for 8887559.656904, current state stale+active+clean,
> last acting [4]
> pg 5.27 is stuck stale for 8887559.656906, current state stale+active+clean,
> last acting [4]
> pg 2.23 is stuck stale for 8887559.656908, current state stale+active+clean,
> last acting [4]
> pg 6.26 is stuck stale for 8887559.656909, current state stale+active+clean,
> last acting [4]
> pg 7.27 is stuck stale for 8887559.656913, current state stale+active+clean,
> last acting [4]
> pg 7.18 is stuck stale for 8887559.656914, current state stale+active+clean,
> last acting [4]
> pg 0.1e is stuck stale for 8887559.656914, current state stale+active+clean,
> last acting [4]
> pg 6.18 is stuck stale for 8887559.656919, current state stale+active+clean,
> last acting [4]
> pg 2.1f is stuck stale for 8887559.656919, current state stale+active+clean,
> last acting [4]
> pg 7.1b is stuck stale for 8887559.656922, current state stale+active+clean,
> last acting [4]
> pg 0.1b is stuck stale for 8887559.656919, current state stale+active+clean,
> last acting [4]
> pg 6.1d is stuck stale for 8887559.656925, current state stale+active+clean,
> last acting [4]
> pg 2.18 is stuck stale for 8887559.656920, current state stale+active+clean,
> last acting [4]
> pg 7.1d is stuck stale for 8887559.656926, current state stale+active+clean,
> last acting [4]
> pg 5.1c is stuck stale for 8887559.656921, current state stale+active+clean,
> last acting [4]
> pg 5.1d is stuck stale for 8887559.656920, current state stale+active+clean,
> last acting [4]
> pg 6.11 is stuck stale for 8887559.656922, current state stale+active+clean,
> last acting [4]
> pg 5.13 is stuck stale for 8887559.656919, current state stale+active+clean,
> last acting [4]
> pg 0.16 is stuck stale for 8887559.656924, current state stale+active+clean,
> last acting [4]
> pg 6.10 is stuck stale for 8887559.656928, current state stale+active+clean,
> last acting [4]
> pg 2.17 is stuck stale for 8887559.656927, current state stale+active+clean,
> last acting [4]
> pg 7.12 is stuck stale for 8887559.656932, current state stale+active+clean,
> last acting [4]
> pg 0.12 is stuck stale for 8887559.656929, current state stale+active+clean,
> last acting [4]
> pg 6.14 is stuck stale for 8887559.656935, current state stale+active+clean,
> last acting [4]
> pg 0.11 is stuck stale for 8887559.656932, current state stale+active+clean,
> last acting [4]
> pg 7.16 is stuck stale for 8887559.656936, current state stale+active+clean,
> last acting [4]
> pg 0.10 is stuck stale for 8887559.656936, current state stale+active+clean,
> last acting [4]
> pg 2.d is stuck stale for 8887559.656933, current state stale+active+clean,
> last acting [4]
> pg 6.9 is stuck stale for 8887559.656939, current state stale+active+clean,
> last acting [4]
> pg 7.9 is stuck stale for 8887559.656939, current state stale+active+clean,
> last acting [4]
> pg 0.d is stuck stale for 8887559.656940, current state stale+active+clean,
> last acting [4]
> pg 7.a is stuck stale for 8887559.656944, current state stale+active+clean,
> last acting [4]
> pg 0.c is stuck stale for 8887559.656941, current state stale+active+clean,
> last acting [4]
> pg 2.e is stuck stale for 8887559.656947, current state stale+active+clean,
> last acting [4]
> pg 6.a is stuck stale for 8887559.656953, current state stale+active+clean,
> last acting [4]
> pg 0.b is stuck stale for 8887559.656949, current state stale+active+clean,
> last acting [4]
> pg 2.9 is stuck stale for 8887559.656954, current state stale+active+clean,
> last acting [4]
> pg 5.f is stuck stale for 8887559.656953, current state stale+active+clean,
> last acting [4]
> pg 7.d is stuck stale for 8887559.656958, current state stale+active+clean,
> last acting [4]
> pg 6.f is stuck stale for 8887559.656957, current state stale+active+clean,
> last acting [4]
> pg 3.4 is stuck stale for 8887559.656957, current state stale+active+clean,
> last acting [4]
> pg 5.3 is stuck stale for 8887559.656956, current state stale+active+clean,
> last acting [4]
> pg 2.4 is stuck stale for 8887559.656961, current state stale+active+clean,
> last acting [4]
> pg 6.0 is stuck stale for 8887559.656966, current state stale+active+clean,
> last acting [4]
> pg 3.6 is stuck stale for 8887559.656965, current state stale+active+clean,
> last acting [4]
> pg 3.7 is stuck stale for 8887559.656964, current state stale+active+clean,
> last acting [4]
> pg 2.6 is stuck stale for 8887559.656970, current state stale+active+clean,
> last acting [4]
> pg 0.3 is stuck stale for 8887559.656965, current state stale+active+clean,
> last acting [4]
> pg 5.6 is stuck stale for 8887559.656970, current state stale+active+clean,
> last acting [4]
> pg 7.4 is stuck stale for 8887559.656975, current state stale+active+clean,
> last acting [4]
> pg 3.1 is stuck stale for 8887559.656970, current state stale+active+clean,
> last acting [4]
> pg 6.4 is stuck stale for 8887559.656975, current state stale+active+clean,
> last acting [4]
> pg 5.4 is stuck stale for 8887559.656972, current state stale+active+clean,
> last acting [4]
> pg 2.3 is stuck stale for 8887559.656977, current state stale+active+clean,
> last acting [4]
> pg 5.5 is stuck stale for 8887559.656977, current state stale+active+clean,
> last acting [4]
> pg 3.3 is stuck stale for 8887559.656982, current state stale+active+clean,
> last acting [4]
> pg 5.7a is stuck stale for 8887559.657309, current state stale+active+clean,
> last acting [4]
> pg 6.78 is stuck stale for 8887559.657308, current state stale+active+clean,
> last acting [4]
> pg 5.78 is stuck stale for 8887559.657311, current state stale+active+clean,
> last acting [4]
> pg 5.79 is stuck stale for 8887559.657311, current state stale+active+clean,
> last acting [4]
> pg 6.7c is stuck stale for 8887559.657313, current state stale+active+clean,
> last acting [4]
> pg 7.7e is stuck stale for 8887559.657312, current state stale+active+clean,
> last acting [4]
> pg 6.7e is stuck stale for 8887559.657315, current state stale+active+clean,
> last acting [4]
> pg 7.70 is stuck stale for 8887559.657316, current state stale+active+clean,
> last acting [4]
> pg 6.73 is stuck stale for 8887559.657316, current state stale+active+clean,
> last acting [4]
> pg 5.77 is stuck stale for 8887559.657317, current state stale+active+clean,
> last acting [4]
> pg 5.74 is stuck stale for 8887559.657319, current state stale+active+clean,
> last acting [4]
> pg 5.75 is stuck stale for 8887559.657321, current state stale+active+clean,
> last acting [4]
> pg 7.68 is stuck stale for 8887559.657322, current state stale+active+clean,
> last acting [4]
> pg 6.68 is stuck stale for 8887559.657324, current state stale+active+clean,
> last acting [4]
> pg 7.6b is stuck stale for 8887559.657326, current state stale+active+clean,
> last acting [4]
> pg 6.6d is stuck stale for 8887559.657328, current state stale+active+clean,
> last acting [4]
> pg 5.6e is stuck stale for 8887559.657330, current state stale+active+clean,
> last acting [4]
> pg 6.6c is stuck stale for 8887559.657330, current state stale+active+clean,
> last acting [4]
> pg 7.6f is stuck stale for 8887559.657331, current state stale+active+clean,
> last acting [4]
> pg 7.60 is stuck stale for 8887559.657333, current state stale+active+clean,
> last acting [4]
> pg 6.60 is stuck stale for 8887559.657333, current state stale+active+clean,
> last acting [4]
> pg 7.62 is stuck stale for 8887559.657334, current state stale+active+clean,
> last acting [4]
> pg 6.65 is stuck stale for 8887559.657334, current state stale+active+clean,
> last acting [4]
> pg 7.64 is stuck stale for 8887559.657339, current state stale+active+clean,
> last acting [4]
> pg 5.67 is stuck stale for 8887559.657338, current state stale+active+clean,
> last acting [4]
> pg 7.66 is stuck stale for 8887559.657340, current state stale+active+clean,
> last acting [4]
> pg 6.66 is stuck stale for 8887559.657340, current state stale+active+clean,
> last acting [4]
> pg 7.67 is stuck stale for 8887559.657345, current state stale+active+clean,
> last acting [4]
> pg 6.59 is stuck stale for 8887559.657344, current state stale+active+clean,
> last acting [4]
> pg 7.58 is stuck stale for 8887559.657348, current state stale+active+clean,
> last acting [4]
> pg 6.58 is stuck stale for 8887559.657348, current state stale+active+clean,
> last acting [4]
> pg 7.59 is stuck stale for 8887559.657352, current state stale+active+clean,
> last acting [4]
> pg 6.5b is stuck stale for 8887559.657353, current state stale+active+clean,
> last acting [4]
> pg 5.59 is stuck stale for 8887559.657348, current state stale+active+clean,
> last acting [4]
> pg 6.5a is stuck stale for 8887559.657356, current state stale+active+clean,
> last acting [4]
> pg 5.5e is stuck stale for 8887559.657352, current state stale+active+clean,
> last acting [4]
> pg 6.5d is stuck stale for 8887559.657358, current state stale+active+clean,
> last acting [4]
> pg 6.5f is stuck stale for 8887559.657356, current state stale+active+clean,
> last acting [4]
> pg 7.51 is stuck stale for 8887559.657356, current state stale+active+clean,
> last acting [4]
> pg 7.52 is stuck stale for 8887559.657356, current state stale+active+clean,
> last acting [4]
> pg 7.53 is stuck stale for 8887559.657358, current state stale+active+clean,
> last acting [4]
> pg 6.55 is stuck stale for 8887559.657359, current state stale+active+clean,
> last acting [4]
> pg 7.54 is stuck stale for 8887559.657364, current state stale+active+clean,
> last acting [4]
> pg 6.54 is stuck stale for 8887559.657364, current state stale+active+clean,
> last acting [4]
> pg 6.57 is stuck stale for 8887559.657365, current state stale+active+clean,
> last acting [4]
> pg 7.56 is stuck stale for 8887559.657369, current state stale+active+clean,
> last acting [4]
> pg 5.55 is stuck stale for 8887559.657371, current state stale+active+clean,
> last acting [4]
> pg 7.48 is stuck stale for 8887559.657372, current state stale+active+clean,
> last acting [4]
> pg 6.49 is stuck stale for 8887559.657375, current state stale+active+clean,
> last acting [4]
> pg 5.4a is stuck stale for 8887559.657376, current state stale+active+clean,
> last acting [4]
> pg 6.48 is stuck stale for 8887559.657379, current state stale+active+clean,
> last acting [4]
> pg 7.4a is stuck stale for 8887559.657380, current state stale+active+clean,
> last acting [4]
> pg 6.4a is stuck stale for 8887559.657383, current state stale+active+clean,
> last acting [4]
> pg 6.4d is stuck stale for 8887559.657385, current state stale+active+clean,
> last acting [4]
> pg 7.4d is stuck stale for 8887559.657387, current state stale+active+clean,
> last acting [4]
> pg 6.4c is stuck stale for 8887559.657389, current state stale+active+clean,
> last acting [4]
> pg 6.4e is stuck stale for 8887559.657391, current state stale+active+clean,
> last acting [4]
> pg 5.42 is stuck stale for 8887559.657391, current state stale+active+clean,
> last acting [4]
> pg 6.43 is stuck stale for 8887559.657393, current state stale+active+clean,
> last acting [4]
> pg 5.41 is stuck stale for 8887559.657393, current state stale+active+clean,
> last acting [4]
> pg 5.47 is stuck stale for 8887559.657394, current state stale+active+clean,
> last acting [4]
> pg 7.46 is stuck stale for 8887559.657396, current state stale+active+clean,
> last acting [4]
> pg 6.39 is stuck stale for 8887559.657398, current state stale+active+clean,
> last acting [4]
> pg 5.3a is stuck stale for 8887559.657399, current state stale+active+clean,
> last acting [4]
> pg 2.3e is stuck stale for 8887559.657399, current state stale+active+clean,
> last acting [4]
> pg 0.3c is stuck stale for 8887559.657402, current state stale+active+clean,
> last acting [4]
> pg 7.3c is stuck stale for 8887559.657404, current state stale+active+clean,
> last acting [4]
> pg 7.3d is stuck stale for 8887559.657405, current state stale+active+clean,
> last acting [4]
> pg 0.39 is stuck stale for 8887559.657402, current state stale+active+clean,
> last acting [4]
> pg 5.3c is stuck stale for 8887559.657405, current state stale+active+clean,
> last acting [4]
> pg 2.3a is stuck stale for 8887559.657406, current state stale+active+clean,
> last acting [4]
> pg 0.38 is stuck stale for 8887559.657409, current state stale+active+clean,
> last acting [4]
> pg 2.35 is stuck stale for 8887559.657411, current state stale+active+clean,
> last acting [4]
> pg 0.37 is stuck stale for 8887559.657412, current state stale+active+clean,
> last acting [4]
> pg 5.32 is stuck stale for 8887559.657413, current state stale+active+clean,
> last acting [4]
> pg 2.34 is stuck stale for 8887559.657416, current state stale+active+clean,
> last acting [4]
> pg 0.36 is stuck stale for 8887559.657416, current state stale+active+clean,
> last acting [4]
> pg 7.32 is stuck stale for 8887559.657419, current state stale+active+clean,
> last acting [4]
> pg 6.33 is stuck stale for 8887559.657420, current state stale+active+clean,
> last acting [4]
> pg 0.35 is stuck stale for 8887559.657423, current state stale+active+clean,
> last acting [4]
> pg 6.35 is stuck stale for 8887559.657423, current state stale+active+clean,
> last acting [4]
> pg 5.36 is stuck stale for 8887559.657424, current state stale+active+clean,
> last acting [4]
> pg 2.30 is stuck stale for 8887559.657427, current state stale+active+clean,
> last acting [4]
> pg 5.37 is stuck stale for 8887559.657429, current state stale+active+clean,
> last acting [4]
> pg 7.36 is stuck stale for 8887559.657430, current state stale+active+clean,
> last acting [4]
> pg 6.37 is stuck stale for 8887559.657432, current state stale+active+clean,
> last acting [4]
> pg 6.28 is stuck stale for 8887559.657427, current state stale+active+clean,
> last acting [4]
>
>
> This stays that way, and I think the reason is what I found when I downloaded
> and decompiled the crush map:
> @storage1:/tmp# crushtool -d /tmp/crushmap
> # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 device4
> device 5 osd.5
> device 6 osd.6
>
>
>
> Is there a way to remove this device 4 (aka osd.4) from the map so that ceph can
> make another copy from the other location shown in “ceph pg map 2.3a”?
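
One way to check which ids in the decompiled map no longer point at a real
OSD, as a minimal sketch. It assumes the map was written to a text file with
"crushtool -d /tmp/crushmap -o /tmp/crushmap.txt"; the output path and the
function name placeholder_devices are hypothetical:

```shell
#!/bin/sh
# Print the ids of placeholder entries ("device N deviceN") in a decompiled
# CRUSH map, i.e. device lines whose name does not start with "osd.".
placeholder_devices() {
    awk '$1 == "device" && $3 !~ /^osd\./ { print $2 }' "$1"
}
```

Running "placeholder_devices /tmp/crushmap.txt" lists every id that has only a
placeholder name. As far as I know, an entry of this form is how crushtool
shows an unused id in the device list, so it may not be what keeps the PGs
stale.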
>
> Regards.
>
> Dimitar Boichev
> SysAdmin Team Lead
> AXSMarine Sofia
> Phone: +359 889 22 55 42
> Skype: dimitar.boichev.axsmarine
> E-mail: [email protected]
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com