Re: [ceph-users] ceph df shows 100% used

2018-01-22 Thread Webert de Souza Lima
Hi,

On Fri, Jan 19, 2018 at 8:31 PM, zhangbingyin wrote:

> 'MAX AVAIL' in the 'ceph df' output represents the amount of data that can
> be used before the first OSD becomes full, and not the sum of all free
> space across a set of OSDs.
>

Thank you very much. I figured this out by the end of the day. That is the
answer. I'm not sure this is in the ceph.com docs, though.
Now I know the problem is indeed solved (by doing a proper reweight).

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph df shows 100% used

2018-01-19 Thread QR


'MAX AVAIL' in the 'ceph df' output represents the amount of data that can be 
used before the first OSD becomes full, and not the sum of all free space 
across a set of OSDs.
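A toy model (my own sketch, not Ceph's actual code; the numbers are made up for illustration) shows why a single nearly-full OSD caps MAX AVAIL even when plenty of raw space is free elsewhere:

```python
def projected_max_avail(osds, replicas=2):
    """Toy model of 'MAX AVAIL': new data spreads in proportion to
    CRUSH weight, so the pool is 'full' as soon as the fullest OSD is.

    osds: list of (crush_weight, size, used) tuples, sizes in GB.
    """
    total_weight = sum(w for w, _, _ in osds)
    # For each OSD: how much more raw data the whole set could absorb
    # before *this* OSD runs out of space.
    limits = [(size - used) / (weight / total_weight)
              for weight, size, used in osds]
    raw = min(limits)       # the first-full OSD is the cap
    return raw / replicas   # convert raw capacity to logical capacity

# Three equal-weight OSDs of 1000G each; one is nearly full (970G used).
osds = [(1.0, 1000, 100), (1.0, 1000, 100), (1.0, 1000, 970)]
# ~2830G raw is free in total, yet the projected limit is only
# (1000 - 970) / (1/3) = 90G raw, i.e. ~45G logical with size=2.
print(round(projected_max_avail(osds), 2))
```

This is why reweighting (or emptying) the fullest OSD immediately raises MAX AVAIL, as seen in this thread.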
-------- Original message --------
From: Webert de Souza Lima
To: ceph-users
Sent: Friday, January 19, 2018, 20:20
Subject: Re: [ceph-users] ceph df shows 100% used

While it seemed to be solved yesterday, today the %USED has grown a lot
again. See:
~# ceph osd df tree http://termbin.com/0zhk

~# ceph df detail
http://termbin.com/thox

94% USED while there is about 21TB worth of data; size = 2 means ~42TB RAW
usage, but the OSDs in that root sum to ~70TB of available space.

Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Thu, Jan 18, 2018 at 8:21 PM, Webert de Souza Lima  
wrote:
With the help of robbat2 and llua on the IRC channel I was able to resolve this
situation by taking down the 2-OSD-only host.
After crush-reweighting OSDs 8 and 23 from host mia1-master-fe02 to 0, ceph df
showed the expected storage capacity usage (about 70%).


With this in mind, those guys told me that it is due to the cluster being
uneven and unable to balance properly. It makes sense and it worked.
But to me it is still very unexpected behaviour for ceph to say that the
pools are 100% full and Available Space is 0.
There were 3 hosts and repl. size = 2; if the host with only 2 OSDs were full
(it wasn't), ceph could still use space from OSDs on the other hosts.
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


Re: [ceph-users] ceph df shows 100% used

2018-01-19 Thread Webert de Souza Lima
While it seemed to be solved yesterday, today the %USED has grown a lot
again. See:

~# ceph osd df tree
http://termbin.com/0zhk

~# ceph df detail
http://termbin.com/thox

94% USED while there is about 21TB worth of data; size = 2 means ~42TB RAW
usage, but the OSDs in that root sum to ~70TB of available space.
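The arithmetic in the paragraph above can be checked directly (a sanity check of the figures quoted in this thread, not ceph output):

```python
# With size=2, 21 TB of logical data should occupy ~42 TB raw,
# i.e. about 60% of the ~70 TB in that root -- nowhere near the
# 94% USED that `ceph df` reported.
data_tb, size, root_tb = 21, 2, 70
raw_tb = data_tb * size
print(raw_tb, "TB raw,", round(100 * raw_tb / root_tb), "% of root")
```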



Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*

On Thu, Jan 18, 2018 at 8:21 PM, Webert de Souza Lima  wrote:

> With the help of robbat2 and llua on the IRC channel I was able to resolve
> this situation by taking down the 2-OSD-only host.
> After crush-reweighting OSDs 8 and 23 from host mia1-master-fe02 to 0,
> ceph df showed the expected storage capacity usage (about 70%).
>
>
> With this in mind, those guys told me that it is due to the cluster
> being uneven and unable to balance properly. It makes sense and it worked.
> But to me it is still very unexpected behaviour for ceph to say that
> the pools are 100% full and Available Space is 0.
>
> There were 3 hosts and repl. size = 2; if the host with only 2 OSDs were
> full (it wasn't), ceph could still use space from OSDs on the other hosts.
>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> *Belo Horizonte - Brasil*
> *IRC NICK - WebertRLZ*
>


Re: [ceph-users] ceph df shows 100% used

2018-01-18 Thread Webert de Souza Lima
With the help of robbat2 and llua on the IRC channel I was able to resolve this
situation by taking down the 2-OSD-only host.
After crush-reweighting OSDs 8 and 23 from host mia1-master-fe02 to 0, ceph
df showed the expected storage capacity usage (about 70%).


With this in mind, those guys told me that it is due to the cluster
being uneven and unable to balance properly. It makes sense and it worked.
But to me it is still very unexpected behaviour for ceph to say that the
pools are 100% full and Available Space is 0.

There were 3 hosts and repl. size = 2; if the host with only 2 OSDs were
full (it wasn't), ceph could still use space from OSDs on the other hosts.
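For reference, the reweight step described above would look roughly like this on Jewel (OSD ids taken from this thread; a sketch to be run on a monitor/admin node, not a recipe):

```shell
# Zero the CRUSH weights of the two OSDs on the small host so CRUSH
# stops placing data on them.
ceph osd crush reweight osd.8 0
ceph osd crush reweight osd.23 0

# Wait for backfill to finish, then re-check the capacity figures.
ceph -s
ceph df detail
```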

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


Re: [ceph-users] ceph df shows 100% used

2018-01-18 Thread Webert de Souza Lima
Hi David, thanks for replying.


On Thu, Jan 18, 2018 at 5:03 PM David Turner  wrote:

> You can have overall space available in your cluster because not all of
> your disks are in the same crush root.  You have multiple roots
> corresponding to multiple crush rulesets.  All pools using crush ruleset 0
> are full because all of the osds in that crush rule are full.
>


So I did check this. The usage of the OSDs belonging to that root
(default) was about 60%.
All the pools using crush ruleset 0 were showing 100% full, yet there was
only 1 near-full OSD in that crush rule. That's what is so weird about it.

On Thu, Jan 18, 2018 at 8:05 PM, David Turner  wrote:

> `ceph osd df` is a good command for you to see what's going on.  Compare
> the osd numbers with `ceph osd tree`.
>

I am sorry, I forgot to send this output; here it is. I had added 2 OSDs to
that crush root, borrowed from the host mia1-master-ds05, to see if the
available space would get higher, but it didn't.
So adding new OSDs to it had no effect.

ceph osd df tree

ID  WEIGHT   REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
 -9 13.5     -        14621G  2341G 12279G 16.02 0.31   0 root databases
 -8  6.5     -         7182G   835G  6346G 11.64 0.22   0     host mia1-master-ds05
 20  3.0      1.0      3463G   380G  3082G 10.99 0.21 260         osd.20
 17  3.5      1.0      3719G   455G  3263G 12.24 0.24 286         osd.17
-10  7.0     -         7438G  1505G  5932G 20.24 0.39   0     host mia1-master-fe01
 21  3.5      1.0      3719G   714G  3004G 19.22 0.37 269         osd.21
 22  3.5      1.0      3719G   791G  2928G 21.27 0.41 295         osd.22
 -3  2.39996 -         2830G  1647G  1182G 58.22 1.12   0 root databases-ssd
 -5  1.19998 -         1415G   823G   591G 58.22 1.12   0     host mia1-master-ds02-ssd
 24  0.3      1.0       471G   278G   193G 58.96 1.14 173         osd.24
 25  0.3      1.0       471G   276G   194G 58.68 1.13 172         osd.25
 26  0.3      1.0       471G   269G   202G 57.03 1.10 167         osd.26
 -6  1.19998 -         1415G   823G   591G 58.22 1.12   0     host mia1-master-ds03-ssd
 27  0.3      1.0       471G   244G   227G 51.87 1.00 152         osd.27
 28  0.3      1.0       471G   281G   190G 59.63 1.15 175         osd.28
 29  0.3      1.0       471G   297G   173G 63.17 1.22 185         osd.29
 -1 71.69997 -        76072G 44464G 31607G 58.45 1.13   0 root default
 -2 26.59998 -        29575G 17334G 12240G 58.61 1.13   0     host mia1-master-ds01
  0  3.2      1.0      3602G  1907G  1695G 52.94 1.02  90         osd.0
  1  3.2      1.0      3630G  2721G   908G 74.97 1.45 112         osd.1
  2  3.2      1.0      3723G  2373G  1349G 63.75 1.23  98         osd.2
  3  3.2      1.0      3723G  1781G  1941G 47.85 0.92 105         osd.3
  4  3.2      1.0      3723G  1880G  1843G 50.49 0.97  95         osd.4
  5  3.2      1.0      3723G  2465G  1257G 66.22 1.28 111         osd.5
  6  3.7      1.0      3723G  1722G  2001G 46.25 0.89 109         osd.6
  7  3.7      1.0      3723G  2481G  1241G 66.65 1.29 126         osd.7
 -4  8.5     -         9311G  8540G   770G 91.72 1.77   0     host mia1-master-fe02
  8  5.5      0.7      5587G  5419G   167G 97.00 1.87 189         osd.8
 23  3.0      1.0      3724G  3120G   603G 83.79 1.62 128         osd.23
 -7 29.5     -        29747G 17821G 11926G 59.91 1.16   0     host mia1-master-ds04
  9  3.7      1.0      3718G  2493G  1224G 67.07 1.29 114         osd.9
 10  3.7      1.0      3718G  2454G  1264G 66.00 1.27  90         osd.10
 11  3.7      1.0      3718G  2202G  1516G 59.22 1.14 116         osd.11
 12  3.7      1.0      3718G  2290G  1427G 61.61 1.19 113         osd.12
 13  3.7      1.0      3718G  2015G  1703G 54.19 1.05 112         osd.13
 14  3.7      1.0      3718G  1264G  2454G 34.00 0.66 101         osd.14
 15  3.7      1.0      3718G  2195G  1522G 59.05 1.14 104         osd.15
 16  3.7      1.0      3718G  2905G   813G 78.13 1.51 130         osd.16
-11  7.0     -         7438G   768G  6669G 10.33 0.20   0     host mia1-master-ds05-borrowed-osds
 18  3.5      1.0      3719G   393G  3325G 10.59 0.20 262         osd.18
 19  3.5      1.0      3719G   374G  3344G 10.07 0.19 256         osd.19
              TOTAL   93524G 48454G 45069G 51.81
MIN/MAX VAR: 0.19/1.87  STDDEV: 22.02



Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*

On Thu, Jan 18, 2018 at 8:05 PM, David Turner  wrote:

> `ceph osd df` is a good command for you to see what's going on.  Compare
> the osd numbers with `ceph osd tree`.
>
>
>>
>> On Thu, Jan 18, 2018 at 3:34 PM Webert de Souza Lima <
>> webert.b...@gmail.com> wrote:
>>
>>> Sorry I forgot, this is a ceph jewel 10.2.10
>>>
>>>
>>> Regards,
>>>
>>> Webert Lima
>>> DevOps Engineer at MAV Tecnologia
>>> *Belo Horizonte - Brasil*
>>> *IRC NICK - WebertRLZ*
>>

Re: [ceph-users] ceph df shows 100% used

2018-01-18 Thread David Turner
Your hosts are also not balanced within your default root. Your failure
domain is host, but one of your hosts has 8.5TB of storage compared to
26.6TB and 29.6TB on the others. You only have size=2 (along with min_size=1,
which is bad for a lot of reasons), so it should still be able to place data
mostly between ds01 and ds04 and ignore fe02, since it doesn't have much space
at all. Anyway, `ceph osd df` will be good output to see what the
distribution between osds looks like.

 -1 64.69997 root default
 -2 26.59998 host mia1-master-ds01
  0  3.2 osd.0  up  1.0  1.0
  1  3.2 osd.1  up  1.0  1.0
  2  3.2 osd.2  up  1.0  1.0
  3  3.2 osd.3  up  1.0  1.0
  4  3.2 osd.4  up  1.0  1.0
  5  3.2 osd.5  up  1.0  1.0
  6  3.7 osd.6  up  1.0  1.0
  7  3.7 osd.7  up  1.0  1.0
 -4  8.5 host mia1-master-fe02
  8  5.5 osd.8  up  1.0  1.0
 23  3.0 osd.23 up  1.0  1.0
 -7 29.5 host mia1-master-ds04
  9  3.7 osd.9  up  1.0  1.0
 10  3.7 osd.10 up  1.0  1.0
 11  3.7 osd.11 up  1.0  1.0
 12  3.7 osd.12 up  1.0  1.0
 13  3.7 osd.13 up  1.0  1.0
 14  3.7 osd.14 up  1.0  1.0
 15  3.7 osd.15 up  1.0  1.0
 16  3.7 osd.16 up  1.0  1.0
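The imbalance called out above can be quantified with a back-of-the-envelope bound (my own toy calculation, not anything Ceph computes): with failure domain = host and size=2, every logical byte consumes space on two distinct hosts, so usable capacity is limited by how much can be paired up across hosts.

```python
def two_replica_host_bound(host_tb):
    """Upper bound on logical data with 2 replicas on distinct hosts.

    Classic pairing bound: you are limited either by half the total
    capacity, or by how much the other hosts can pair against the
    biggest one. Ignores CRUSH's probabilistic (uneven) placement.
    """
    total, biggest = sum(host_tb), max(host_tb)
    return min(total / 2, total - biggest)

# Host capacities (TB) quoted in the message above: ds01, fe02, ds04.
print(round(two_replica_host_bound([26.6, 8.5, 29.6]), 2))  # 32.35
```

In other words, the small fe02 host barely changes the theoretical bound here; the real problem in this thread was uneven placement filling its OSDs first.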



On Thu, Jan 18, 2018 at 5:05 PM David Turner  wrote:

> `ceph osd df` is a good command for you to see what's going on.  Compare
> the osd numbers with `ceph osd tree`.
>
> On Thu, Jan 18, 2018 at 5:03 PM David Turner 
> wrote:
>
>> You can have overall space available in your cluster because not all of
>> your disks are in the same crush root.  You have multiple roots
>> corresponding to multiple crush rulesets.  All pools using crush ruleset 0
>> are full because all of the osds in that crush rule are full.
>>
>> On Thu, Jan 18, 2018 at 3:34 PM Webert de Souza Lima <
>> webert.b...@gmail.com> wrote:
>>
>>> Sorry I forgot, this is a ceph jewel 10.2.10
>>>
>>>
>>> Regards,
>>>
>>> Webert Lima
>>> DevOps Engineer at MAV Tecnologia
>>> *Belo Horizonte - Brasil*
>>> *IRC NICK - WebertRLZ*
>>


Re: [ceph-users] ceph df shows 100% used

2018-01-18 Thread David Turner
`ceph osd df` is a good command for you to see what's going on.  Compare
the osd numbers with `ceph osd tree`.

On Thu, Jan 18, 2018 at 5:03 PM David Turner  wrote:

> You can have overall space available in your cluster because not all of
> your disks are in the same crush root.  You have multiple roots
> corresponding to multiple crush rulesets.  All pools using crush ruleset 0
> are full because all of the osds in that crush rule are full.
>
> On Thu, Jan 18, 2018 at 3:34 PM Webert de Souza Lima <
> webert.b...@gmail.com> wrote:
>
>> Sorry I forgot, this is a ceph jewel 10.2.10
>>
>>
>> Regards,
>>
>> Webert Lima
>> DevOps Engineer at MAV Tecnologia
>> *Belo Horizonte - Brasil*
>> *IRC NICK - WebertRLZ*
>


Re: [ceph-users] ceph df shows 100% used

2018-01-18 Thread David Turner
You can have overall space available in your cluster because not all of
your disks are in the same crush root.  You have multiple roots
corresponding to multiple crush rulesets.  All pools using crush ruleset 0
are full because all of the osds in that crush rule are full.

On Thu, Jan 18, 2018 at 3:34 PM Webert de Souza Lima 
wrote:

> Sorry I forgot, this is a ceph jewel 10.2.10
>
>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> *Belo Horizonte - Brasil*
> *IRC NICK - WebertRLZ*
>


Re: [ceph-users] ceph df shows 100% used

2018-01-18 Thread Webert de Souza Lima
Sorry I forgot, this is a ceph jewel 10.2.10


Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


Re: [ceph-users] ceph df shows 100% used

2018-01-18 Thread Webert de Souza Lima
Also, there is no quota set for the pools

Here is "ceph osd pool get xxx all": http://termbin.com/ix0n


Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


[ceph-users] ceph df shows 100% used

2018-01-18 Thread Webert de Souza Lima
Hello,

I'm running a radosgw service that is nearly unusable (very slow to write new
objects), and I suspect it's because ceph df is showing 100% usage for some
pools, though I don't know where that information comes from.

Pools:
~# ceph osd pool ls detail -> http://termbin.com/lsd0

Crush Rules (the important one is rule 0):
~# ceph osd crush rule dump -> http://termbin.com/wkpo

OSD Tree:
~# ceph osd tree -> http://termbin.com/87vt

Ceph DF, which shows 100% usage:
~# ceph df detail -> http://termbin.com/15mz

Ceph status, which shows 45600 GB / 93524 GB avail:
~# ceph -s -> http://termbin.com/wycq


Any thoughts?

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*