Vasiliy, your OSDs are marked as 'down' but 'in'.
"Ceph OSDs have two known states that can be combined. *Up* and *Down* only tells you whether the OSD is actively involved in the cluster. OSD states also are expressed in terms of cluster replication: *In* and *Out*. Only when a Ceph OSD is tagged as *Out* does the self-healing process occur" Bob On Fri, Nov 27, 2015 at 6:15 AM, Mart van Santen <m...@greenhost.nl> wrote: > > Dear Vasilily, > > > > On 11/27/2015 02:00 PM, Irek Fasikhov wrote: > > You have time to synchronize? > > С уважением, Фасихов Ирек Нургаязович > Моб.: +79229045757 > > 2015-11-27 15:57 GMT+03:00 Vasiliy Angapov <anga...@gmail.com>: > >> > It seams that you played around with crushmap, and done something wrong. >> > Compare the look of 'ceph osd tree' and crushmap. There are some 'osd' >> devices renamed to 'device' think threre is you problem. >> Is this a mistake actually? What I did is removed a bunch of OSDs from >> my cluster that's why the numeration is sparse. But is it an issue to >> a have a sparse numeration of OSDs? >> > > I think this is normal and should be no problem. I had this also > previously. > > >> > Hi. >> > Vasiliy, Yes it is a problem with crusmap. Look at height: >> > -3 14.56000 host slpeah001 >> > -2 14.56000 host slpeah002 >> What exactly is wrong here? >> > > I do not know how the weight of the hosts contribute to determine were to > store the 3-th copy of the PG. As you explained, you have enough space on > all hosts, but maybe if the weights of the hosts do not count up and the > crushmap maybe come to the conclusion it is not able to place the PGs. What > you can try, is to artificially raise the weights of these hosts, to see if > it starts mapping the thirth copies for the pg's onto the available host. > > I had a similiar problem in the past, this was solved by upgrading to the > latest crush tunables. But be aware, that can create massive datamovement > behavior. 
>
>> I also found out that my OSD logs are full of records like these:
>>
>> 2015-11-26 08:31:19.273268 7fe4f49b1700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
>> 2015-11-26 08:31:19.273276 7fe4f49b1700  0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000 sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a520).accept: got bad authorizer
>> 2015-11-26 08:31:24.273207 7fe4f49b1700  0 auth: could not find secret_id=2924
>> 2015-11-26 08:31:24.273225 7fe4f49b1700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
>> 2015-11-26 08:31:24.273231 7fe4f49b1700  0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x3f90b000 sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a3c0).accept: got bad authorizer
>> 2015-11-26 08:31:29.273199 7fe4f49b1700  0 auth: could not find secret_id=2924
>> 2015-11-26 08:31:29.273215 7fe4f49b1700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
>> 2015-11-26 08:31:29.273222 7fe4f49b1700  0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000 sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a260).accept: got bad authorizer
>> 2015-11-26 08:31:34.273469 7fe4f49b1700  0 auth: could not find secret_id=2924
>> 2015-11-26 08:31:34.273482 7fe4f49b1700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
>> 2015-11-26 08:31:34.273486 7fe4f49b1700  0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x3f90b000 sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a100).accept: got bad authorizer
>> 2015-11-26 08:31:39.273310 7fe4f49b1700  0 auth: could not find secret_id=2924
>> 2015-11-26 08:31:39.273331 7fe4f49b1700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
>> 2015-11-26 08:31:39.273342 7fe4f49b1700  0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fcc000 sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee19fa0).accept: got bad authorizer
>> 2015-11-26 08:31:44.273753 7fe4f49b1700  0 auth: could not find secret_id=2924
>> 2015-11-26 08:31:44.273769 7fe4f49b1700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
>> 2015-11-26 08:31:44.273776 7fe4f49b1700  0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fcc000 sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee189a0).accept: got bad authorizer
>> 2015-11-26 08:31:49.273412 7fe4f49b1700  0 auth: could not find secret_id=2924
>> 2015-11-26 08:31:49.273431 7fe4f49b1700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
>> 2015-11-26 08:31:49.273455 7fe4f49b1700  0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000 sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee19080).accept: got bad authorizer
>> 2015-11-26 08:31:54.273293 7fe4f49b1700  0 auth: could not find secret_id=2924
>>
>> What does it mean? Google says it might be a time-sync issue, but my
>> clocks are perfectly synchronized...
>
> Normally you get a warning in "ceph status" if time is out of sync.
> Nevertheless, you can try restarting the OSDs. I had timing issues in the
> past and discovered that it sometimes helps to restart the daemons *after*
> syncing the clocks, so that they pick up the new time. But that was mostly
> the case with monitors.
>
> Regards,
>
> Mart
>
>> 2015-11-26 21:05 GMT+08:00 Irek Fasikhov <malm...@gmail.com>:
>> > Hi.
>> > Vasiliy, yes, it is a problem with the crushmap. Look at the weights:
>> > " -3 14.56000 host slpeah001
>> >   -2 14.56000 host slpeah002 "
>> >
>> > Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
>> > Mobile: +79229045757
>> >
>> > 2015-11-26 13:16 GMT+03:00 ЦИТ РТ-Курамшин Камиль Фидаилевич
>> > <kamil.kurams...@tatar.ru>:
>> >>
>> >> It seems that you played around with the crushmap and did something wrong.
>> >> Compare the output of 'ceph osd tree' with the crushmap. Some 'osd'
>> >> devices were renamed to 'device'; I think that is your problem.
>> >>
>> >> Sent from a mobile device.
>> >>
>> >> -----Original Message-----
>> >> From: Vasiliy Angapov <anga...@gmail.com>
>> >> To: ceph-users <ceph-users@lists.ceph.com>
>> >> Sent: Thu, 26 Nov 2015 7:53
>> >> Subject: [ceph-users] Undersized pgs problem
>> >>
>> >> Hi, colleagues!
>> >>
>> >> I have a small 4-node Ceph cluster (0.94.2); all pools have size 3,
>> >> min_size 1. Tonight one host failed, and the cluster was unable to
>> >> rebalance, saying there are a lot of undersized PGs.
>> >>
>> >> root@slpeah002:[~]:# ceph -s
>> >>     cluster 78eef61a-3e9c-447c-a3ec-ce84c617d728
>> >>      health HEALTH_WARN
>> >>             1486 pgs degraded
>> >>             1486 pgs stuck degraded
>> >>             2257 pgs stuck unclean
>> >>             1486 pgs stuck undersized
>> >>             1486 pgs undersized
>> >>             recovery 80429/555185 objects degraded (14.487%)
>> >>             recovery 40079/555185 objects misplaced (7.219%)
>> >>             4/20 in osds are down
>> >>             1 mons down, quorum 1,2 slpeah002,slpeah007
>> >>      monmap e7: 3 mons at {slpeah001=192.168.254.11:6780/0,slpeah002=192.168.254.12:6780/0,slpeah007=172.31.252.46:6789/0}
>> >>             election epoch 710, quorum 1,2 slpeah002,slpeah007
>> >>      osdmap e14062: 20 osds: 16 up, 20 in; 771 remapped pgs
>> >>       pgmap v7021316: 4160 pgs, 5 pools, 1045 GB data, 180 kobjects
>> >>             3366 GB used, 93471 GB / 96838 GB avail
>> >>             80429/555185 objects degraded (14.487%)
>> >>             40079/555185 objects misplaced (7.219%)
>> >>                 1903 active+clean
>> >>                 1486 active+undersized+degraded
>> >>                  771 active+remapped
>> >>   client io 0 B/s rd, 246 kB/s wr, 67 op/s
>> >>
>> >> root@slpeah002:[~]:# ceph osd tree
>> >> ID  WEIGHT   TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> >>  -1 94.63998 root default
>> >>  -9 32.75999     host slpeah007
>> >>  72  5.45999         osd.72          up  1.00000          1.00000
>> >>  73  5.45999         osd.73          up  1.00000          1.00000
>> >>  74  5.45999         osd.74          up  1.00000          1.00000
>> >>  75  5.45999         osd.75          up  1.00000          1.00000
>> >>  76  5.45999         osd.76          up  1.00000          1.00000
>> >>  77  5.45999         osd.77          up  1.00000          1.00000
>> >> -10 32.75999     host slpeah008
>> >>  78  5.45999         osd.78          up  1.00000          1.00000
>> >>  79  5.45999         osd.79          up  1.00000          1.00000
>> >>  80  5.45999         osd.80          up  1.00000          1.00000
>> >>  81  5.45999         osd.81          up  1.00000          1.00000
>> >>  82  5.45999         osd.82          up  1.00000          1.00000
>> >>  83  5.45999         osd.83          up  1.00000          1.00000
>> >>  -3 14.56000     host slpeah001
>> >>   1  3.64000         osd.1         down  1.00000          1.00000
>> >>  33  3.64000         osd.33        down  1.00000          1.00000
>> >>  34  3.64000         osd.34        down  1.00000          1.00000
>> >>  35  3.64000         osd.35        down  1.00000          1.00000
>> >>  -2 14.56000     host slpeah002
>> >>   0  3.64000         osd.0           up  1.00000          1.00000
>> >>  36  3.64000         osd.36          up  1.00000          1.00000
>> >>  37  3.64000         osd.37          up  1.00000          1.00000
>> >>  38  3.64000         osd.38          up  1.00000          1.00000
>> >>
>> >> Crushmap:
>> >>
>> >> # begin crush map
>> >> tunable choose_local_tries 0
>> >> tunable choose_local_fallback_tries 0
>> >> tunable choose_total_tries 50
>> >> tunable chooseleaf_descend_once 1
>> >> tunable chooseleaf_vary_r 1
>> >> tunable straw_calc_version 1
>> >> tunable allowed_bucket_algs 54
>> >>
>> >> # devices
>> >> device 0 osd.0
>> >> device 1 osd.1
>> >> device 2 device2
>> >> device 3 device3
>> >> device 4 device4
>> >> device 5 device5
>> >> device 6 device6
>> >> device 7 device7
>> >> device 8 device8
>> >> device 9 device9
>> >> device 10 device10
>> >> device 11 device11
>> >> device 12 device12
>> >> device 13 device13
>> >> device 14 device14
>> >> device 15 device15
>> >> device 16 device16
>> >> device 17 device17
>> >> device 18 device18
>> >> device 19 device19
>> >> device 20 device20
>> >> device 21 device21
>> >> device 22 device22
>> >> device 23 device23
>> >> device 24 device24
>> >> device 25 device25
>> >> device 26 device26
>> >> device 27 device27
>> >> device 28 device28
>> >> device 29 device29
>> >> device 30 device30
>> >> device 31 device31
>> >> device 32 device32
>> >> device 33 osd.33
>> >> device 34 osd.34
>> >> device 35 osd.35
>> >> device 36 osd.36
>> >> device 37 osd.37
>> >> device 38 osd.38
>> >> device 39 device39
>> >> device 40 device40
>> >> device 41 device41
>> >> device 42 device42
>> >> device 43 device43
>> >> device 44 device44
>> >> device 45 device45
>> >> device 46 device46
>> >> device 47 device47
>> >> device 48 device48
>> >> device 49 device49
>> >> device 50 device50
>> >> device 51 device51
>> >> device 52 device52
>> >> device 53 device53
>> >> device 54 device54
>> >> device 55 device55
>> >> device 56 device56
>> >> device 57 device57
>> >> device 58 device58
>> >> device 59 device59
>> >> device 60 device60
>> >> device 61 device61
>> >> device 62 device62
>> >> device 63 device63
>> >> device 64 device64
>> >> device 65 device65
>> >> device 66 device66
>> >> device 67 device67
>> >> device 68 device68
>> >> device 69 device69
>> >> device 70 device70
>> >> device 71 device71
>> >> device 72 osd.72
>> >> device 73 osd.73
>> >> device 74 osd.74
>> >> device 75 osd.75
>> >> device 76 osd.76
>> >> device 77 osd.77
>> >> device 78 osd.78
>> >> device 79 osd.79
>> >> device 80 osd.80
>> >> device 81 osd.81
>> >> device 82 osd.82
>> >> device 83 osd.83
>> >>
>> >> # types
>> >> type 0 osd
>> >> type 1 host
>> >> type 2 chassis
>> >> type 3 rack
>> >> type 4 row
>> >> type 5 pdu
>> >> type 6 pod
>> >> type 7 room
>> >> type 8 datacenter
>> >> type 9 region
>> >> type 10 root
>> >>
>> >> # buckets
>> >> host slpeah007 {
>> >>     id -9    # do not change unnecessarily
>> >>     # weight 32.760
>> >>     alg straw
>> >>     hash 0   # rjenkins1
>> >>     item osd.72 weight 5.460
>> >>     item osd.73 weight 5.460
>> >>     item osd.74 weight 5.460
>> >>     item osd.75 weight 5.460
>> >>     item osd.76 weight 5.460
>> >>     item osd.77 weight 5.460
>> >> }
>> >> host slpeah008 {
>> >>     id -10   # do not change unnecessarily
>> >>     # weight 32.760
>> >>     alg straw
>> >>     hash 0   # rjenkins1
>> >>     item osd.78 weight 5.460
>> >>     item osd.79 weight 5.460
>> >>     item osd.80 weight 5.460
>> >>     item osd.81 weight 5.460
>> >>     item osd.82 weight 5.460
>> >>     item osd.83 weight 5.460
>> >> }
>> >> host slpeah001 {
>> >>     id -3    # do not change unnecessarily
>> >>     # weight 14.560
>> >>     alg straw
>> >>     hash 0   # rjenkins1
>> >>     item osd.1 weight 3.640
>> >>     item osd.33 weight 3.640
>> >>     item osd.34 weight 3.640
>> >>     item osd.35 weight 3.640
>> >> }
>> >> host slpeah002 {
>> >>     id -2    # do not change unnecessarily
>> >>     # weight 14.560
>> >>     alg straw
>> >>     hash 0   # rjenkins1
>> >>     item osd.0 weight 3.640
>> >>     item osd.36 weight 3.640
>> >>     item osd.37 weight 3.640
>> >>     item osd.38 weight 3.640
>> >> }
>> >> root default {
>> >>     id -1    # do not change unnecessarily
>> >>     # weight 94.640
>> >>     alg straw
>> >>     hash 0   # rjenkins1
>> >>     item slpeah007 weight 32.760
>> >>     item slpeah008 weight 32.760
>> >>     item slpeah001 weight 14.560
>> >>     item slpeah002 weight 14.560
>> >> }
>> >>
>> >> # rules
>> >> rule default {
>> >>     ruleset 0
>> >>     type replicated
>> >>     min_size 1
>> >>     max_size 10
>> >>     step take default
>> >>     step chooseleaf firstn 0 type host
>> >>     step emit
>> >> }
>> >>
>> >> # end crush map
>> >>
>> >> This is odd, because the pools have size 3 and I have 3 hosts alive, so
>> >> why is it saying that undersized PGs are present? It makes me feel like
>> >> CRUSH is not working properly.
>> >> There is not much data in the cluster currently, about 3 TB, and as you
>> >> can see from the osd tree, each host has a minimum of 14 TB of disk
>> >> space on its OSDs.
>> >> So I'm a bit stuck now...
>> >> How can I find the source of the trouble?
>> >>
>> >> Thanks in advance!
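[Editor's note] The figures quoted above are internally consistent, which rules out a simple weight typo as the cause. A quick sanity check using only numbers from this thread (the `host_osd_weights` mapping is an editorial construction, not Ceph output):

```python
# Degraded/misplaced ratios from the 'ceph -s' output quoted above:
# 80429 degraded and 40079 misplaced objects out of 555185 total.
degraded, misplaced, total = 80429, 40079, 555185
print(round(100 * degraded / total, 3))   # matches the reported 14.487%
print(round(100 * misplaced / total, 3))  # matches the reported 7.219%

# Each host bucket's weight in the crushmap is simply the sum of its OSD
# item weights, matching the WEIGHT column of 'ceph osd tree'.
host_osd_weights = {
    "slpeah007": [5.46] * 6,  # six 5.46 OSDs -> 32.76
    "slpeah008": [5.46] * 6,
    "slpeah001": [3.64] * 4,  # four 3.64 OSDs -> 14.56
    "slpeah002": [3.64] * 4,
}
host_weights = {h: round(sum(w), 2) for h, w in host_osd_weights.items()}
print(host_weights)
```

Since the weights add up correctly, the undersized PGs point elsewhere, consistent with Bob's observation that the failed host's OSDs are 'down' but still 'in': until they are marked 'out', no recovery is triggered for their copies.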
>> >> _______________________________________________
>> >> ceph-users mailing list
>> >> ceph-users@lists.ceph.com
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> --
> Mart van Santen
> Greenhost
> E: m...@greenhost.nl
> T: +31 20 4890444
> W: https://greenhost.nl
>
> A PGP signature can be attached to this e-mail;
> you need PGP software to verify it.
> My public key is available in keyserver(s);
> see: http://tinyurl.com/openpgp-manual
>
> PGP Fingerprint: CA85 EB11 2B70 042D AF66 B29A 6437 01A1 10A3 D3A5