Re: [ceph-users] Qemu RBD image usage

2019-12-09 Thread Marc Roos
 
This should get you started with using rbd.


[The libvirt <disk> XML example that followed here lost its element tags in 
the list archive; only the text nodes "WDC" and "WD40EFRX-68WT0N0" (a drive 
vendor/product string) survive. A reconstructed sketch follows the secret 
setup below.]

cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <usage type='ceph'>
    <name>client.rbd.vps secret</name>
  </usage>
</secret>
EOF

virsh secret-define --file secret.xml

virsh secret-set-value --secret <uuid printed by secret-define> \
  --base64 `ceph auth get-key client.rbd.vps 2>/dev/null`
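
You can confirm the secret is registered with 'virsh secret-list'. With the 
secret in place, the disk stanza the archive stripped above looks roughly like 
this (a sketch with placeholders, not the original values, assuming the 
standard libvirt network-disk syntax):

cat > rbd-disk.xml <<EOF
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <auth username='rbd.vps'>
    <secret type='ceph' uuid='UUID-FROM-SECRET-DEFINE'/>
  </auth>
  <source protocol='rbd' name='POOL/IMAGE'>
    <host name='MON-HOST' port='6789'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
EOF

virsh attach-device DOMAIN rbd-disk.xml --persistent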



-Original Message-
To: ceph-users@lists.ceph.com
Cc: d...@ceph.io
Subject: [ceph-users] Qemu RBD image usage

Hi all,
   I want to attach another RBD image to the Qemu VM to be used as a 
disk.
   However, it always fails.  The VM definition xml is attached.
   Could anyone tell me what I did wrong?
   || nstcc3@nstcloudcc3:~$ sudo virsh start ubuntu_18_04_mysql 
--console
   || error: Failed to start domain ubuntu_18_04_mysql
   || error: internal error: process exited while connecting to monitor:
   || 2019-12-09T16:24:30.284454Z qemu-system-x86_64: -drive
   || 
file=rbd:rwl_mysql/mysql_image:auth_supported=none:mon_host=nstcloudcc4\
:6789,format=raw,if=none,id=drive-virtio-disk1:
   || error connecting: Operation not supported


   The cluster info is below:
   || ceph@nstcloudcc3:~$ ceph --version
   || ceph version 14.0.0-16935-g9b6ef711f3 
(9b6ef711f3a40898756457cb287bf291f45943f0) octopus (dev)
   || ceph@nstcloudcc3:~$ ceph -s
   ||   cluster:
   || id: e31502ff-1fb4-40b7-89a8-2b85a77a3b09
   || health: HEALTH_OK
   ||  
   ||   services:
   || mon: 1 daemons, quorum nstcloudcc4 (age 2h)
   || mgr: nstcloudcc4(active, since 2h)
   || osd: 4 osds: 4 up (since 2h), 4 in (since 2h)
   ||  
   ||   data:
   || pools:   1 pools, 128 pgs
   || objects: 6 objects, 6.3 KiB
   || usage:   4.0 GiB used, 7.3 TiB / 7.3 TiB avail
   || pgs: 128 active+clean
   ||  
   || ceph@nstcloudcc3:~$
   || ceph@nstcloudcc3:~$ rbd info rwl_mysql/mysql_image
   || rbd image 'mysql_image':
   || size 100 GiB in 25600 objects
   || order 22 (4 MiB objects)
   || snapshot_count: 0
   || id: 110feda39b1c
   || block_name_prefix: rbd_data.110feda39b1c
   || format: 2
   || features: layering, exclusive-lock, object-map, fast-diff, 
deep-flatten
   || op_features: 
   || flags: 
   || create_timestamp: Mon Dec  9 23:48:17 2019
   || access_timestamp: Mon Dec  9 23:48:17 2019
   || modify_timestamp: Mon Dec  9 23:48:17 2019

B.R.
Changcheng


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG Balancer Upmap mode not working

2019-12-09 Thread Lars Täuber
Hi Anthony!

Mon, 9 Dec 2019 17:11:12 -0800
Anthony D'Atri ==> ceph-users:
> > How is that possible? I don't know how much more proof I need to present 
> > that there's a bug.  
> 
> FWIW, your pastes are hard to read with all the ? in them.  Pasting 
> non-7-bit-ASCII?

I don't see many "?" in his posts. Maybe a display issue?

> > |I increased PGs and see no difference.  
> 
> From what pgp_num to what new value?  Numbers that are not a power of 2 can 
> contribute to the sort of problem you describe.  Do you have host CRUSH fault 
> domain?
> 

Does the fault domain play a role in this situation? I can't see why. 
This would only matter if the OSDs weren't evenly distributed across the 
hosts.
Philippe, can you post your 'ceph osd tree'?

> > Raising PGs to 100 is an old statement anyway, anything 60+ should be fine. 
> >   
> 
> Fine in what regard?  To be sure, Wido’s advice means a *ratio* of at least 
> 100.  ratio = (pgp_num * replication) / #osds
> 
> The target used to be 200, a commit around 12.2.1 retconned that to 100.  
> Best I can tell the rationale is memory usage at the expense of performance.
> 
> Is your original excerpt complete? I.e., do you only have 24 OSDs?  Across how 
> many nodes?
> 
> The old guidance for tiny clusters:
> 
> • Less than 5 OSDs set pg_num to 128
> 
> • Between 5 and 10 OSDs set pg_num to 512
> 
> • Between 10 and 50 OSDs set pg_num to 1024

This is what I thought too. But in this post
https://lists.ceph.io/hyperkitty/list/ceph-us...@ceph.io/message/TR6CJQKSMOHNGOMQO4JBDMGEL2RMWE36/
[Why are the mailing lists ceph.io and ceph.com not merged? It's hard to find 
the link to messages this way.]
Konstantin suggested reducing to pg_num=512. The cluster had 35 OSDs.
It is still merging the PGs very slowly.

In the meantime I have added 5 more OSDs and am thinking about raising the 
pg_num back to 1024.
I wonder how fewer PGs can balance better than 512.

I'm in a similar situation to Philippe with my cluster.
ceph osd df class hdd:
[…]
MIN/MAX VAR: 0.73/1.21  STDDEV: 6.27

Attached is a picture of the dashboard showing the tiny bars of the data distribution. 
The nearly empty OSDs are SSDs used for their own pool.
I think there might be a bug in the balancing algorithm.

Thanks,
Lars

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] sharing single SSD across multiple HD based OSDs

2019-12-09 Thread Nathan Fish
You can loop over the creation of fixed-size LVs on the SSD, then
loop over creating OSDs assigned to each of them. That is what we did;
it wasn't bad.
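
Something along these lines works (a rough sketch only; the device names, the 
VG/LV names and the 60G size are placeholders, and it co-locates the WAL with 
the DB on the SSD LV, which is what BlueStore does when no separate 
--block.wal is given):

SSD=/dev/sdx
HDDS="/dev/sdc /dev/sdd /dev/sde /dev/sdf"
DB_SIZE=60G

# one PV/VG on the shared SSD
pvcreate "$SSD"
vgcreate ceph-db "$SSD"

# one fixed-size LV per HDD, then one OSD per HDD/LV pair
i=0
for hdd in $HDDS; do
    lvcreate -L "$DB_SIZE" -n "db-$i" ceph-db
    ceph-volume lvm create --data "$hdd" --block.db "ceph-db/db-$i"
    i=$((i + 1))
done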

On Mon, Dec 9, 2019 at 9:32 PM Philip Brown  wrote:
>
> I have a bunch of hard drives I want to use as OSDs, with ceph nautilus.
>
> ceph-volume lvm create makes straight raw dev usage relatively easy, since 
> you can just do
>
> ceph-volume lvm create --data /dev/sdc
>
> or whatever.
> It's nice that it takes care of all the LVM jiggery-pokery automatically.
>
> but what if you have a single SSD, let's call it /dev/sdx, and I want to use 
> it for the WAL for
> /dev/sdc, sdd, sde, sdf, and so on?
>
> Do you have to associate each OSD with a unique WAL dev, or can they "share"?
>
> Do I really have to MANUALLY go carve up /dev/sdx into slices, LVM or 
> otherwise, and then hand-manage the slicing?
>
> ceph-volume lvm create --data /dev/sdc --block.wal /dev/sdx1
> ceph-volume lvm create --data /dev/sdd --block.wal /dev/sdx2
> ceph-volume lvm create --data /dev/sde --block.wal /dev/sdx3
> ?
>
> Can I not get away with some simpler usage?
>
>
>
> --
> Philip Brown| Sr. Linux System Administrator | Medata, Inc.
> 5 Peters Canyon Rd Suite 250
> Irvine CA 92606
> Office 714.918.1310| Fax 714.918.1325
> pbr...@medata.com| www.medata.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] sharing single SSD across multiple HD based OSDs

2019-12-09 Thread Philip Brown
I have a bunch of hard drives I want to use as OSDs, with ceph nautilus.

ceph-volume lvm create makes straight raw dev usage relatively easy, since you 
can just do

ceph-volume lvm create --data /dev/sdc

or whatever.
It's nice that it takes care of all the LVM jiggery-pokery automatically.

but what if you have a single SSD, let's call it /dev/sdx, and I want to use 
it for the WAL for
/dev/sdc, sdd, sde, sdf, and so on?

Do you have to associate each OSD with a unique WAL dev, or can they "share"?

Do I really have to MANUALLY go carve up /dev/sdx into slices, LVM or 
otherwise, and then hand-manage the slicing?

ceph-volume lvm create --data /dev/sdc --block.wal /dev/sdx1
ceph-volume lvm create --data /dev/sdd --block.wal /dev/sdx2
ceph-volume lvm create --data /dev/sde --block.wal /dev/sdx3
?

Can I not get away with some simpler usage?



--
Philip Brown| Sr. Linux System Administrator | Medata, Inc. 
5 Peters Canyon Rd Suite 250 
Irvine CA 92606 
Office 714.918.1310| Fax 714.918.1325 
pbr...@medata.com| www.medata.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG Balancer Upmap mode not working

2019-12-09 Thread Anthony D'Atri
> How is that possible? I don't know how much more proof I need to present that 
> there's a bug.

FWIW, your pastes are hard to read with all the ? in them.  Pasting 
non-7-bit-ASCII?

> |I increased PGs and see no difference.

From what pgp_num to what new value?  Numbers that are not a power of 2 can 
contribute to the sort of problem you describe.  Do you have host CRUSH fault 
domain?

> Raising PGs to 100 is an old statement anyway, anything 60+ should be fine. 

Fine in what regard?  To be sure, Wido’s advice means a *ratio* of at least 
100.  ratio = (pgp_num * replication) / #osds
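(Worked example with hypothetical numbers: 1024 pgp_num * 3 replicas / 24 OSDs 
is 128 PGs per OSD, which clears that 100 target; 512 * 3 / 24 = 64 would not.)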

The target used to be 200; a commit around 12.2.1 retconned that to 100.  Best 
I can tell, the rationale is memory usage at the expense of performance.

Is your original excerpt complete? I.e., do you only have 24 OSDs?  Across how 
many nodes?

The old guidance for tiny clusters:

• Less than 5 OSDs set pg_num to 128

• Between 5 and 10 OSDs set pg_num to 512

• Between 10 and 50 OSDs set pg_num to 1024


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Annoying PGs not deep-scrubbed in time messages in Nautilus.

2019-12-09 Thread Robert LeBlanc
On Mon, Dec 9, 2019 at 11:58 AM Paul Emmerich 
wrote:

> solved it: the warning is of course generated by ceph-mgr and not ceph-mon.
>
> So for my problem that means: should have injected the option in ceph-mgr.
> That's why it obviously worked when setting it on the pool...
>
> The solution for you is to simply put the option under global and restart
> ceph-mgr (or use daemon config set; it doesn't support changing config via
> ceph tell for some reason)
>
>
> Paul
>
> On Mon, Dec 9, 2019 at 8:32 PM Paul Emmerich 
> wrote:
>
>>
>>
>> On Mon, Dec 9, 2019 at 5:17 PM Robert LeBlanc 
>> wrote:
>>
>>> I've increased the deep_scrub interval on the OSDs on our Nautilus
>>> cluster with the following added to the [osd] section:
>>>
>>
>> should have read the beginning of your email; you'll need to set the
>> option on the mons as well because they generate the warning. So your
>> problem might be completely different from what I'm seeing here
>>
>
>
>>
>>
>> Paul
>>
>>
>>>
>>> osd_deep_scrub_interval = 260
>>>
>>> And I started seeing
>>>
>>> 1518 pgs not deep-scrubbed in time
>>>
>>> in ceph -s. So I added
>>>
>>> mon_warn_pg_not_deep_scrubbed_ratio = 1
>>>
>>> since the default would start warning with a whole week left to scrub.
>>> But the messages persist. The cluster has been running for a month with
>>> these settings. Here is an example of the output. As you can see, some of
>>> these are not even two weeks old, nowhere close to the 75% of 4 weeks.
>>>
>>> pg 6.1f49 not deep-scrubbed since 2019-11-09 23:04:55.370373
>>>pg 6.1f47 not deep-scrubbed since 2019-11-18 16:10:52.561204
>>>pg 6.1f44 not deep-scrubbed since 2019-11-18 15:48:16.825569
>>>pg 6.1f36 not deep-scrubbed since 2019-11-20 05:39:00.309340
>>>pg 6.1f31 not deep-scrubbed since 2019-11-27 02:48:45.347680
>>>pg 6.1f30 not deep-scrubbed since 2019-11-11 21:34:15.795622
>>>pg 6.1f2d not deep-scrubbed since 2019-11-24 11:37:39.502829
>>>pg 6.1f27 not deep-scrubbed since 2019-11-25 07:38:58.689315
>>>pg 6.1f25 not deep-scrubbed since 2019-11-20 00:13:43.048569
>>>pg 6.1f1a not deep-scrubbed since 2019-11-09 15:08:43.51
>>>pg 6.1f19 not deep-scrubbed since 2019-11-25 10:24:47.884332
>>>1468 more pgs...
>>> Mon Dec  9 08:12:01 PST 2019
>>>
>>> There is very little data on the cluster, so it's not a problem of
>>> deep-scrubs taking too long:
>>>
>>> $ ceph df
>>> RAW STORAGE:
>>>CLASS SIZEAVAIL   USEDRAW USED %RAW USED
>>>hdd   6.3 PiB 6.1 PiB 153 TiB  154 TiB  2.39
>>>nvme  5.8 TiB 5.6 TiB 138 GiB  197 GiB  3.33
>>>TOTAL 6.3 PiB 6.2 PiB 154 TiB  154 TiB  2.39
>>>
>>> POOLS:
>>>POOL   ID STORED  OBJECTS USED
>>>%USED MAX AVAIL
>>>.rgw.root   1 3.0 KiB   7 3.0 KiB
>>> 0   1.8 PiB
>>>default.rgw.control 2 0 B   8 0 B
>>> 0   1.8 PiB
>>>default.rgw.meta3 7.4 KiB  24 7.4 KiB
>>> 0   1.8 PiB
>>>default.rgw.log 4  11 GiB 341  11 GiB
>>> 0   1.8 PiB
>>>default.rgw.buckets.data6 100 TiB  41.84M 100 TiB
>>>  1.82   4.2 PiB
>>>default.rgw.buckets.index   7  33 GiB 574  33 GiB
>>> 0   1.8 PiB
>>>default.rgw.buckets.non-ec  8 8.1 MiB  22 8.1 MiB
>>> 0   1.8 PiB
>>>
>>> Please help me figure out what I'm doing wrong with these settings.
>>>
>>
Paul,

Thanks, I did set both options in the global section on the mons and restarted
them, but that didn't help. Setting the scrub interval in the global
section and restarting the mgr fixed it.


Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Annoying PGs not deep-scrubbed in time messages in Nautilus.

2019-12-09 Thread Paul Emmerich
Solved it: the warning is of course generated by ceph-mgr, not ceph-mon.

So for my problem that means I should have injected the option into ceph-mgr.
That's why it obviously worked when setting it on the pool...

The solution for you is to simply put the option under [global] and restart
ceph-mgr (or use daemon config set; it doesn't support changing config via
ceph tell for some reason).
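
A sketch of what that looks like (assuming the interval is meant to be 4 weeks 
in seconds and that the mgr id matches the short hostname):

# in ceph.conf on the mgr host(s)
[global]
osd_deep_scrub_interval = 2419200

# then restart the active mgr
systemctl restart ceph-mgr@$(hostname -s)

# or inject it into the running mgr through its admin socket instead:
ceph daemon mgr.$(hostname -s) config set osd_deep_scrub_interval 2419200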


Paul

On Mon, Dec 9, 2019 at 8:32 PM Paul Emmerich  wrote:

>
>
> On Mon, Dec 9, 2019 at 5:17 PM Robert LeBlanc 
> wrote:
>
>> I've increased the deep_scrub interval on the OSDs on our Nautilus
>> cluster with the following added to the [osd] section:
>>
>
> should have read the beginning of your email; you'll need to set the
> option on the mons as well because they generate the warning. So your
> problem might be completely different from what I'm seeing here
>


>
>
> Paul
>
>
>>
>> osd_deep_scrub_interval = 260
>>
>> And I started seeing
>>
>> 1518 pgs not deep-scrubbed in time
>>
>> in ceph -s. So I added
>>
>> mon_warn_pg_not_deep_scrubbed_ratio = 1
>>
>> since the default would start warning with a whole week left to scrub.
>> But the messages persist. The cluster has been running for a month with
>> these settings. Here is an example of the output. As you can see, some of
>> these are not even two weeks old, nowhere close to the 75% of 4 weeks.
>>
>> pg 6.1f49 not deep-scrubbed since 2019-11-09 23:04:55.370373
>>pg 6.1f47 not deep-scrubbed since 2019-11-18 16:10:52.561204
>>pg 6.1f44 not deep-scrubbed since 2019-11-18 15:48:16.825569
>>pg 6.1f36 not deep-scrubbed since 2019-11-20 05:39:00.309340
>>pg 6.1f31 not deep-scrubbed since 2019-11-27 02:48:45.347680
>>pg 6.1f30 not deep-scrubbed since 2019-11-11 21:34:15.795622
>>pg 6.1f2d not deep-scrubbed since 2019-11-24 11:37:39.502829
>>pg 6.1f27 not deep-scrubbed since 2019-11-25 07:38:58.689315
>>pg 6.1f25 not deep-scrubbed since 2019-11-20 00:13:43.048569
>>pg 6.1f1a not deep-scrubbed since 2019-11-09 15:08:43.51
>>pg 6.1f19 not deep-scrubbed since 2019-11-25 10:24:47.884332
>>1468 more pgs...
>> Mon Dec  9 08:12:01 PST 2019
>>
>> There is very little data on the cluster, so it's not a problem of
>> deep-scrubs taking too long:
>>
>> $ ceph df
>> RAW STORAGE:
>>CLASS SIZEAVAIL   USEDRAW USED %RAW USED
>>hdd   6.3 PiB 6.1 PiB 153 TiB  154 TiB  2.39
>>nvme  5.8 TiB 5.6 TiB 138 GiB  197 GiB  3.33
>>TOTAL 6.3 PiB 6.2 PiB 154 TiB  154 TiB  2.39
>>
>> POOLS:
>>POOL   ID STORED  OBJECTS USED
>>%USED MAX AVAIL
>>.rgw.root   1 3.0 KiB   7 3.0 KiB
>> 0   1.8 PiB
>>default.rgw.control 2 0 B   8 0 B
>> 0   1.8 PiB
>>default.rgw.meta3 7.4 KiB  24 7.4 KiB
>> 0   1.8 PiB
>>default.rgw.log 4  11 GiB 341  11 GiB
>> 0   1.8 PiB
>>default.rgw.buckets.data6 100 TiB  41.84M 100 TiB
>>  1.82   4.2 PiB
>>default.rgw.buckets.index   7  33 GiB 574  33 GiB
>> 0   1.8 PiB
>>default.rgw.buckets.non-ec  8 8.1 MiB  22 8.1 MiB
>> 0   1.8 PiB
>>
>> Please help me figure out what I'm doing wrong with these settings.
>>
>> Thanks,
>> Robert LeBlanc
>> 
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Annoying PGs not deep-scrubbed in time messages in Nautilus.

2019-12-09 Thread Paul Emmerich
On Mon, Dec 9, 2019 at 5:17 PM Robert LeBlanc  wrote:

> I've increased the deep_scrub interval on the OSDs on our Nautilus cluster
> with the following added to the [osd] section:
>

should have read the beginning of your email; you'll need to set the option
on the mons as well because they generate the warning. So your problem
might be completely different from what I'm seeing here


Paul


>
> osd_deep_scrub_interval = 260
>
> And I started seeing
>
> 1518 pgs not deep-scrubbed in time
>
> in ceph -s. So I added
>
> mon_warn_pg_not_deep_scrubbed_ratio = 1
>
> since the default would start warning with a whole week left to scrub. But
> the messages persist. The cluster has been running for a month with these
> settings. Here is an example of the output. As you can see, some of these
> are not even two weeks old, nowhere close to the 75% of 4 weeks.
>
> pg 6.1f49 not deep-scrubbed since 2019-11-09 23:04:55.370373
>pg 6.1f47 not deep-scrubbed since 2019-11-18 16:10:52.561204
>pg 6.1f44 not deep-scrubbed since 2019-11-18 15:48:16.825569
>pg 6.1f36 not deep-scrubbed since 2019-11-20 05:39:00.309340
>pg 6.1f31 not deep-scrubbed since 2019-11-27 02:48:45.347680
>pg 6.1f30 not deep-scrubbed since 2019-11-11 21:34:15.795622
>pg 6.1f2d not deep-scrubbed since 2019-11-24 11:37:39.502829
>pg 6.1f27 not deep-scrubbed since 2019-11-25 07:38:58.689315
>pg 6.1f25 not deep-scrubbed since 2019-11-20 00:13:43.048569
>pg 6.1f1a not deep-scrubbed since 2019-11-09 15:08:43.51
>pg 6.1f19 not deep-scrubbed since 2019-11-25 10:24:47.884332
>1468 more pgs...
> Mon Dec  9 08:12:01 PST 2019
>
> There is very little data on the cluster, so it's not a problem of
> deep-scrubs taking too long:
>
> $ ceph df
> RAW STORAGE:
>CLASS SIZEAVAIL   USEDRAW USED %RAW USED
>hdd   6.3 PiB 6.1 PiB 153 TiB  154 TiB  2.39
>nvme  5.8 TiB 5.6 TiB 138 GiB  197 GiB  3.33
>TOTAL 6.3 PiB 6.2 PiB 154 TiB  154 TiB  2.39
>
> POOLS:
>POOL   ID STORED  OBJECTS USED
>%USED MAX AVAIL
>.rgw.root   1 3.0 KiB   7 3.0 KiB
> 0   1.8 PiB
>default.rgw.control 2 0 B   8 0 B
> 0   1.8 PiB
>default.rgw.meta3 7.4 KiB  24 7.4 KiB
> 0   1.8 PiB
>default.rgw.log 4  11 GiB 341  11 GiB
> 0   1.8 PiB
>default.rgw.buckets.data6 100 TiB  41.84M 100 TiB
>  1.82   4.2 PiB
>default.rgw.buckets.index   7  33 GiB 574  33 GiB
> 0   1.8 PiB
>default.rgw.buckets.non-ec  8 8.1 MiB  22 8.1 MiB
> 0   1.8 PiB
>
> Please help me figure out what I'm doing wrong with these settings.
>
> Thanks,
> Robert LeBlanc
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Annoying PGs not deep-scrubbed in time messages in Nautilus.

2019-12-09 Thread Paul Emmerich
Hi,

nice coincidence that you mention that today; I've just debugged the exact
same problem on a setup where deep_scrub_interval was increased.

The solution was to set the deep_scrub_interval directly on all pools
instead (which was better for this particular setup anyway):

ceph osd pool set <pool> deep_scrub_interval <seconds>
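
To apply it to every pool in one go, something like this works (a sketch; the 
value is in seconds, 4 weeks shown):

for p in $(ceph osd pool ls); do
    ceph osd pool set "$p" deep_scrub_interval 2419200
done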

Here's the code that generates the warning:
https://github.com/ceph/ceph/blob/v14.2.4/src/mon/PGMap.cc#L3058

* There's no obvious bug in the code, no reason why it shouldn't work with
the option unless "pool->opts.get(pool_opts_t::DEEP_SCRUB_INTERVAL, x)"
returns the wrong thing if it's not configured for a pool
* I've used "config diff" to check that all mons use the correct value for
deep_scrub_interval
* mon_warn_pg_not_deep_scrubbed_ratio is a little bit odd because the
warning will trigger at (mon_warn_pg_not_deep_scrubbed_ratio + 1) *
deep_scrub_interval, which is somewhat unexpected, so by default at 125% of
the configured interval
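
(Applying that formula to Robert's mon_warn_pg_not_deep_scrubbed_ratio = 1 
would push the warning threshold out to 200% of the configured interval.)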



Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Mon, Dec 9, 2019 at 5:17 PM Robert LeBlanc  wrote:

> I've increased the deep_scrub interval on the OSDs on our Nautilus cluster
> with the following added to the [osd] section:
>
> osd_deep_scrub_interval = 260
>
> And I started seeing
>
> 1518 pgs not deep-scrubbed in time
>
> in ceph -s. So I added
>
> mon_warn_pg_not_deep_scrubbed_ratio = 1
>
> since the default would start warning with a whole week left to scrub. But
> the messages persist. The cluster has been running for a month with these
> settings. Here is an example of the output. As you can see, some of these
> are not even two weeks old, nowhere close to the 75% of 4 weeks.
>
> pg 6.1f49 not deep-scrubbed since 2019-11-09 23:04:55.370373
>pg 6.1f47 not deep-scrubbed since 2019-11-18 16:10:52.561204
>pg 6.1f44 not deep-scrubbed since 2019-11-18 15:48:16.825569
>pg 6.1f36 not deep-scrubbed since 2019-11-20 05:39:00.309340
>pg 6.1f31 not deep-scrubbed since 2019-11-27 02:48:45.347680
>pg 6.1f30 not deep-scrubbed since 2019-11-11 21:34:15.795622
>pg 6.1f2d not deep-scrubbed since 2019-11-24 11:37:39.502829
>pg 6.1f27 not deep-scrubbed since 2019-11-25 07:38:58.689315
>pg 6.1f25 not deep-scrubbed since 2019-11-20 00:13:43.048569
>pg 6.1f1a not deep-scrubbed since 2019-11-09 15:08:43.51
>pg 6.1f19 not deep-scrubbed since 2019-11-25 10:24:47.884332
>1468 more pgs...
> Mon Dec  9 08:12:01 PST 2019
>
> There is very little data on the cluster, so it's not a problem of
> deep-scrubs taking too long:
>
> $ ceph df
> RAW STORAGE:
>CLASS SIZEAVAIL   USEDRAW USED %RAW USED
>hdd   6.3 PiB 6.1 PiB 153 TiB  154 TiB  2.39
>nvme  5.8 TiB 5.6 TiB 138 GiB  197 GiB  3.33
>TOTAL 6.3 PiB 6.2 PiB 154 TiB  154 TiB  2.39
>
> POOLS:
>POOL   ID STORED  OBJECTS USED
>%USED MAX AVAIL
>.rgw.root   1 3.0 KiB   7 3.0 KiB
> 0   1.8 PiB
>default.rgw.control 2 0 B   8 0 B
> 0   1.8 PiB
>default.rgw.meta3 7.4 KiB  24 7.4 KiB
> 0   1.8 PiB
>default.rgw.log 4  11 GiB 341  11 GiB
> 0   1.8 PiB
>default.rgw.buckets.data6 100 TiB  41.84M 100 TiB
>  1.82   4.2 PiB
>default.rgw.buckets.index   7  33 GiB 574  33 GiB
> 0   1.8 PiB
>default.rgw.buckets.non-ec  8 8.1 MiB  22 8.1 MiB
> 0   1.8 PiB
>
> Please help me figure out what I'm doing wrong with these settings.
>
> Thanks,
> Robert LeBlanc
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Qemu RBD image usage

2019-12-09 Thread Liu, Changcheng
Hi all,
   I want to attach another RBD image to the Qemu VM to be used as a disk.
   However, it always fails.  The VM definition xml is attached.
   Could anyone tell me what I did wrong?
   || nstcc3@nstcloudcc3:~$ sudo virsh start ubuntu_18_04_mysql --console
   || error: Failed to start domain ubuntu_18_04_mysql
   || error: internal error: process exited while connecting to monitor:
   || 2019-12-09T16:24:30.284454Z qemu-system-x86_64: -drive
   || 
file=rbd:rwl_mysql/mysql_image:auth_supported=none:mon_host=nstcloudcc4\:6789,format=raw,if=none,id=drive-virtio-disk1:
   || error connecting: Operation not supported


   The cluster info is below:
   || ceph@nstcloudcc3:~$ ceph --version
   || ceph version 14.0.0-16935-g9b6ef711f3 
(9b6ef711f3a40898756457cb287bf291f45943f0) octopus (dev)
   || ceph@nstcloudcc3:~$ ceph -s
   ||   cluster:
   || id: e31502ff-1fb4-40b7-89a8-2b85a77a3b09
   || health: HEALTH_OK
   ||  
   ||   services:
   || mon: 1 daemons, quorum nstcloudcc4 (age 2h)
   || mgr: nstcloudcc4(active, since 2h)
   || osd: 4 osds: 4 up (since 2h), 4 in (since 2h)
   ||  
   ||   data:
   || pools:   1 pools, 128 pgs
   || objects: 6 objects, 6.3 KiB
   || usage:   4.0 GiB used, 7.3 TiB / 7.3 TiB avail
   || pgs: 128 active+clean
   ||  
   || ceph@nstcloudcc3:~$
   || ceph@nstcloudcc3:~$ rbd info rwl_mysql/mysql_image
   || rbd image 'mysql_image':
   || size 100 GiB in 25600 objects
   || order 22 (4 MiB objects)
   || snapshot_count: 0
   || id: 110feda39b1c
   || block_name_prefix: rbd_data.110feda39b1c
   || format: 2
   || features: layering, exclusive-lock, object-map, fast-diff, 
deep-flatten
   || op_features: 
   || flags: 
   || create_timestamp: Mon Dec  9 23:48:17 2019
   || access_timestamp: Mon Dec  9 23:48:17 2019
   || modify_timestamp: Mon Dec  9 23:48:17 2019

B.R.
Changcheng


ubuntu_18_04_mysql.xml
Description: XML document
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Annoying PGs not deep-scrubbed in time messages in Nautilus.

2019-12-09 Thread Robert LeBlanc
I've increased the deep_scrub interval on the OSDs on our Nautilus cluster
with the following added to the [osd] section:

osd_deep_scrub_interval = 260

And I started seeing

1518 pgs not deep-scrubbed in time

in ceph -s. So I added

mon_warn_pg_not_deep_scrubbed_ratio = 1

since the default would start warning with a whole week left to scrub. But
the messages persist. The cluster has been running for a month with these
settings. Here is an example of the output. As you can see, some of these
are not even two weeks old, nowhere close to the 75% of 4 weeks.

pg 6.1f49 not deep-scrubbed since 2019-11-09 23:04:55.370373
   pg 6.1f47 not deep-scrubbed since 2019-11-18 16:10:52.561204
   pg 6.1f44 not deep-scrubbed since 2019-11-18 15:48:16.825569
   pg 6.1f36 not deep-scrubbed since 2019-11-20 05:39:00.309340
   pg 6.1f31 not deep-scrubbed since 2019-11-27 02:48:45.347680
   pg 6.1f30 not deep-scrubbed since 2019-11-11 21:34:15.795622
   pg 6.1f2d not deep-scrubbed since 2019-11-24 11:37:39.502829
   pg 6.1f27 not deep-scrubbed since 2019-11-25 07:38:58.689315
   pg 6.1f25 not deep-scrubbed since 2019-11-20 00:13:43.048569
   pg 6.1f1a not deep-scrubbed since 2019-11-09 15:08:43.51
   pg 6.1f19 not deep-scrubbed since 2019-11-25 10:24:47.884332
   1468 more pgs...
Mon Dec  9 08:12:01 PST 2019

There is very little data on the cluster, so it's not a problem of
deep-scrubs taking too long:

$ ceph df
RAW STORAGE:
   CLASS SIZEAVAIL   USEDRAW USED %RAW USED
   hdd   6.3 PiB 6.1 PiB 153 TiB  154 TiB  2.39
   nvme  5.8 TiB 5.6 TiB 138 GiB  197 GiB  3.33
   TOTAL 6.3 PiB 6.2 PiB 154 TiB  154 TiB  2.39

POOLS:
   POOL   ID STORED  OBJECTS USED
   %USED MAX AVAIL
   .rgw.root   1 3.0 KiB   7 3.0 KiB
0   1.8 PiB
   default.rgw.control 2 0 B   8 0 B
0   1.8 PiB
   default.rgw.meta3 7.4 KiB  24 7.4 KiB
0   1.8 PiB
   default.rgw.log 4  11 GiB 341  11 GiB
0   1.8 PiB
   default.rgw.buckets.data6 100 TiB  41.84M 100 TiB
 1.82   4.2 PiB
   default.rgw.buckets.index   7  33 GiB 574  33 GiB
0   1.8 PiB
   default.rgw.buckets.non-ec  8 8.1 MiB  22 8.1 MiB
0   1.8 PiB

Please help me figure out what I'm doing wrong with these settings.

Thanks,
Robert LeBlanc

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cluster in ERR status when rebalancing

2019-12-09 Thread Paul Emmerich
This is a (harmless) bug that has existed since Mimic and will be fixed in
14.2.5 (I think?). The health error will clear up without any intervention.


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Mon, Dec 9, 2019 at 12:03 PM Eugen Block  wrote:

> Hi,
>
> since we upgraded our cluster to Nautilus we also see those messages
> sometimes when it's rebalancing. There are several reports about this
> [1] [2], we didn't see it in Luminous. But eventually the rebalancing
> finished and the error message cleared, so I'd say there's (probably)
> nothing to worry about if there aren't any other issues.
>
> Regards,
> Eugen
>
>
> [1] https://tracker.ceph.com/issues/39555
> [2] https://tracker.ceph.com/issues/41255
>
>
> Zitat von Simone Lazzaris :
>
> > Hi all;
> > Long story short, I have a cluster of 26 OSDs in 3 nodes (8+9+9). One
> > of the disks is showing
> > some read errors, so I've added an OSD in the faulty node (OSD.26)
> > and set the (re)weight of
> > the faulty OSD (OSD.12) to zero.
> >
> > The cluster is now rebalancing, which is fine, but I have now 2 PG
> > in "backfill_toofull" state, so
> > the cluster health is "ERR":
> >
> >   cluster:
> > id: 9ec27b0f-acfd-40a3-b35d-db301ac5ce8c
> > health: HEALTH_ERR
> > Degraded data redundancy (low space): 2 pgs backfill_toofull
> >
> >   services:
> > mon: 3 daemons, quorum s1,s2,s3 (age 7d)
> > mgr: s1(active, since 7d), standbys: s2, s3
> > osd: 27 osds: 27 up (since 2h), 26 in (since 2h); 262 remapped pgs
> > rgw: 3 daemons active (s1, s2, s3)
> >
> >   data:
> > pools:   10 pools, 1200 pgs
> > objects: 11.72M objects, 37 TiB
> > usage:   57 TiB used, 42 TiB / 98 TiB avail
> > pgs: 2618510/35167194 objects misplaced (7.446%)
> >  938 active+clean
> >  216 active+remapped+backfill_wait
> >  44  active+remapped+backfilling
> >  2   active+remapped+backfill_wait+backfill_toofull
> >
> >   io:
> > recovery: 163 MiB/s, 50 objects/s
> >
> >   progress:
> > Rebalancing after osd.12 marked out
> >   [=.]
> >
> > As you can see, there is plenty of space and none of my OSD  is in
> > full or near full state:
> >
> >
> ++--+---+---++-++-+---+
> > | id | host |  used | avail | wr ops | wr data | rd ops | rd data |
> >  state   |
> >
> ++--+---+---++-++-+---+
> > | 0  |  s1  | 2415G | 1310G |0   | 0   |0   | 0   |
> > exists,up |
> > | 1  |  s2  | 2009G | 1716G |0   | 0   |0   | 0   |
> > exists,up |
> > | 2  |  s3  | 2183G | 1542G |0   | 0   |0   | 0   |
> > exists,up |
> > | 3  |  s1  | 2680G | 1045G |0   | 0   |0   | 0   |
> > exists,up |
> > | 4  |  s2  | 2063G | 1662G |0   | 0   |0   | 0   |
> > exists,up |
> > | 5  |  s3  | 2269G | 1456G |0   | 0   |0   | 0   |
> > exists,up |
> > | 6  |  s1  | 2523G | 1202G |0   | 0   |0   | 0   |
> > exists,up |
> > | 7  |  s2  | 1973G | 1752G |0   | 0   |0   | 0   |
> > exists,up |
> > | 8  |  s3  | 2007G | 1718G |0   | 0   |1   | 0   |
> > exists,up |
> > | 9  |  s1  | 2485G | 1240G |0   | 0   |0   | 0   |
> > exists,up |
> > | 10 |  s2  | 2385G | 1340G |0   | 0   |0   | 0   |
> > exists,up |
> > | 11 |  s3  | 2079G | 1646G |0   | 0   |0   | 0   |
> > exists,up |
> > | 12 |  s1  | 2272G | 1453G |0   | 0   |0   | 0   |
> > exists,up |
> > | 13 |  s2  | 2381G | 1344G |0   | 0   |0   | 0   |
> > exists,up |
> > | 14 |  s3  | 1923G | 1802G |0   | 0   |0   | 0   |
> > exists,up |
> > | 15 |  s1  | 2617G | 1108G |0   | 0   |0   | 0   |
> > exists,up |
> > | 16 |  s2  | 2099G | 1626G |0   | 0   |0   | 0   |
> > exists,up |
> > | 17 |  s3  | 2336G | 1389G |0   | 0   |0   | 0   |
> > exists,up |
> > | 18 |  s1  | 2435G | 1290G |0   | 0   |0   | 0   |
> > exists,up |
> > | 19 |  s2  | 2198G | 1527G |0   | 0   |0   | 0   |
> > exists,up |
> > | 20 |  s3  | 2159G | 1566G |0   | 0   |0   | 0   |
> > exists,up |
> > | 21 |  s1  | 2128G | 1597G |0   | 0   |0   | 0   |
> > exists,up |
> > | 22 |  s3  | 2064G | 1661G |0   | 0   |0   | 0   |
> > exists,up |
> > | 23 |  s2  | 1943G | 1782G |0   | 0   |0   | 0   |
> > exists,up |
> > | 24 |  s3  | 2168G | 1557G |0   | 0   |0   | 0   |
> > exists,up |
> > | 25 |  s2  | 2113G | 1612G |0   | 0   |0   | 0   |
> > exists,up |
> > | 26 |  s1  | 68.9G | 3657G |0   | 0   |0   | 0   |
> > exist

Re: [ceph-users] Cluster in ERR status when rebalancing

2019-12-09 Thread Eugen Block

Hi,

since we upgraded our cluster to Nautilus we also see those messages  
sometimes when it's rebalancing. There are several reports about this  
[1] [2], we didn't see it in Luminous. But eventually the rebalancing  
finished and the error message cleared, so I'd say there's (probably)  
nothing to worry about if there aren't any other issues.


Regards,
Eugen


[1] https://tracker.ceph.com/issues/39555
[2] https://tracker.ceph.com/issues/41255


Zitat von Simone Lazzaris :


Hi all;
Long story short, I have a cluster of 26 OSDs in 3 nodes (8+9+9). One
of the disks is showing
some read errors, so I've added an OSD in the faulty node (OSD.26)
and set the (re)weight of
the faulty OSD (OSD.12) to zero.

The cluster is now rebalancing, which is fine, but I have now 2 PG  
in "backfill_toofull" state, so

the cluster health is "ERR":

  cluster:
id: 9ec27b0f-acfd-40a3-b35d-db301ac5ce8c
health: HEALTH_ERR
Degraded data redundancy (low space): 2 pgs backfill_toofull

  services:
mon: 3 daemons, quorum s1,s2,s3 (age 7d)
mgr: s1(active, since 7d), standbys: s2, s3
osd: 27 osds: 27 up (since 2h), 26 in (since 2h); 262 remapped pgs
rgw: 3 daemons active (s1, s2, s3)

  data:
pools:   10 pools, 1200 pgs
objects: 11.72M objects, 37 TiB
usage:   57 TiB used, 42 TiB / 98 TiB avail
pgs: 2618510/35167194 objects misplaced (7.446%)
 938 active+clean
 216 active+remapped+backfill_wait
 44  active+remapped+backfilling
 2   active+remapped+backfill_wait+backfill_toofull

  io:
recovery: 163 MiB/s, 50 objects/s

  progress:
Rebalancing after osd.12 marked out
  [=.]

As you can see, there is plenty of space and none of my OSD  is in  
full or near full state:


++--+---+---++-++-+---+
| id | host |  used | avail | wr ops | wr data | rd ops | rd data |   
 state   |

++--+---+---++-++-+---+
| 0  |  s1  | 2415G | 1310G |0   | 0   |0   | 0   |  
exists,up |
| 1  |  s2  | 2009G | 1716G |0   | 0   |0   | 0   |  
exists,up |
| 2  |  s3  | 2183G | 1542G |0   | 0   |0   | 0   |  
exists,up |
| 3  |  s1  | 2680G | 1045G |0   | 0   |0   | 0   |  
exists,up |
| 4  |  s2  | 2063G | 1662G |0   | 0   |0   | 0   |  
exists,up |
| 5  |  s3  | 2269G | 1456G |0   | 0   |0   | 0   |  
exists,up |
| 6  |  s1  | 2523G | 1202G |0   | 0   |0   | 0   |  
exists,up |
| 7  |  s2  | 1973G | 1752G |0   | 0   |0   | 0   |  
exists,up |
| 8  |  s3  | 2007G | 1718G |0   | 0   |1   | 0   |  
exists,up |
| 9  |  s1  | 2485G | 1240G |0   | 0   |0   | 0   |  
exists,up |
| 10 |  s2  | 2385G | 1340G |0   | 0   |0   | 0   |  
exists,up |
| 11 |  s3  | 2079G | 1646G |0   | 0   |0   | 0   |  
exists,up |
| 12 |  s1  | 2272G | 1453G |0   | 0   |0   | 0   |  
exists,up |
| 13 |  s2  | 2381G | 1344G |0   | 0   |0   | 0   |  
exists,up |
| 14 |  s3  | 1923G | 1802G |0   | 0   |0   | 0   |  
exists,up |
| 15 |  s1  | 2617G | 1108G |0   | 0   |0   | 0   |  
exists,up |
| 16 |  s2  | 2099G | 1626G |0   | 0   |0   | 0   |  
exists,up |
| 17 |  s3  | 2336G | 1389G |0   | 0   |0   | 0   |  
exists,up |
| 18 |  s1  | 2435G | 1290G |0   | 0   |0   | 0   |  
exists,up |
| 19 |  s2  | 2198G | 1527G |0   | 0   |0   | 0   |  
exists,up |
| 20 |  s3  | 2159G | 1566G |0   | 0   |0   | 0   |  
exists,up |
| 21 |  s1  | 2128G | 1597G |0   | 0   |0   | 0   |  
exists,up |
| 22 |  s3  | 2064G | 1661G |0   | 0   |0   | 0   |  
exists,up |
| 23 |  s2  | 1943G | 1782G |0   | 0   |0   | 0   |  
exists,up |
| 24 |  s3  | 2168G | 1557G |0   | 0   |0   | 0   |  
exists,up |
| 25 |  s2  | 2113G | 1612G |0   | 0   |0   | 0   |  
exists,up |
| 26 |  s1  | 68.9G | 3657G |0   | 0   |0   | 0   |  
exists,up |

++--+---+---++-++-+---+



root@s1:~# ceph pg dump|egrep 'toofull|PG_STAT'
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES
OMAP_BYTES* OMAP_KEYS* LOG  DISK_LOG STATE
   STATE_STAMP
VERSION   REPORTED   UP UP_PRIMARY ACTING  
ACTING_PRIMARY LAST_SCRUB
SCRUB_STAMPLAST_DEEP_SCRUB DEEP_SCRUB_STAMP   
 SNAPTRIMQ_LEN
6.212 0  00 0   0  
38145321727   0  0 3023 3023
active+remapped+backfill_wait+backfill_toofull 2019-12-09  
11:11:39.093042  13598'212053
13713:1179718  [6,19,24]  6  [13,0

Re: [ceph-users] Cluster in ERR status when rebalancing

2019-12-09 Thread Simone Lazzaris
On Monday, 9 December 2019 at 11:46:34 CET, huang jun wrote:
> what about the pool's backfill_full_ratio value?
> 
That value, as far as I can see, is 0.9000, which no OSD has reached:
root@s1:~# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZERAW USE DATAOMAPMETAAVAIL   %USE  
VAR  PGS 
STATUS 
 0   hdd 3.63869  1.0 3.6 TiB 2.4 TiB 2.3 TiB 1.1 GiB 7.0 GiB 1.3 TiB 64.66 
1.12 149 up 
 3   hdd 3.63869  1.0 3.6 TiB 2.6 TiB 2.6 TiB 2.8 GiB 7.4 GiB 1.0 TiB 72.12 
1.25 164 up 
 6   hdd 3.63869  1.0 3.6 TiB 2.5 TiB 2.5 TiB 442 MiB 6.9 GiB 1.2 TiB 67.75 
1.18 157 up 
 9   hdd 3.63869  1.0 3.6 TiB 2.4 TiB 2.4 TiB 1.3 GiB 6.9 GiB 1.2 TiB 66.91 
1.16 154 up 
12   hdd 3.638690 0 B 0 B 0 B 0 B 0 B 0 B 0 
   0 131 up 
15   hdd 3.63869  1.0 3.6 TiB 2.5 TiB 2.5 TiB 1.7 GiB 7.0 GiB 1.1 TiB 69.93 
1.22 154 up 
18   hdd 3.63869  1.0 3.6 TiB 2.4 TiB 2.4 TiB 1.4 GiB 6.9 GiB 1.3 TiB 65.15 
1.13 147 up 
21   hdd 3.63869  1.0 3.6 TiB 2.1 TiB 2.1 TiB 900 MiB 6.5 GiB 1.5 TiB 57.46 
1.00 136 up 
26   hdd 3.63869  1.0 3.6 TiB 107 GiB 106 GiB 533 MiB 1.1 GiB 3.5 TiB  2.88 
0.05   8 up 
 1   hdd 3.63869  1.0 3.6 TiB 2.0 TiB 2.0 TiB 615 MiB 5.1 GiB 1.7 TiB 53.93 
0.94 129 up 
 4   hdd 3.63869  1.0 3.6 TiB 2.0 TiB 2.0 TiB  30 MiB 5.7 GiB 1.6 TiB 55.38 
0.96 127 up 
 7   hdd 3.63869  1.0 3.6 TiB 1.9 TiB 1.9 TiB 1.3 MiB 5.4 GiB 1.7 TiB 52.97 
0.92 125 up 
10   hdd 3.63869  1.0 3.6 TiB 2.3 TiB 2.3 TiB 486 KiB 6.0 GiB 1.3 TiB 64.13 
1.12 148 up 
13   hdd 3.63869  1.0 3.6 TiB 2.3 TiB 2.3 TiB 707 MiB 5.9 GiB 1.3 TiB 63.90 
1.11 150 up 
16   hdd 3.63869  1.0 3.6 TiB 2.1 TiB 2.0 TiB 981 KiB 5.7 GiB 1.6 TiB 56.38 
0.98 134 up 
19   hdd 3.63869  1.0 3.6 TiB 2.1 TiB 2.1 TiB 536 MiB 6.2 GiB 1.5 TiB 58.78 
1.02 135 up 
23   hdd 3.63869  1.0 3.6 TiB 1.9 TiB 1.9 TiB 579 MiB 6.2 GiB 1.8 TiB 51.72 
0.90 122 up 
25   hdd 3.63869  1.0 3.6 TiB 2.1 TiB 2.0 TiB 564 MiB 6.6 GiB 1.6 TiB 56.48 
0.98 130 up 
 2   hdd 3.63869  1.0 3.6 TiB 2.1 TiB 2.1 TiB 358 MiB 6.4 GiB 1.5 TiB 58.47 
1.02 137 up 
 5   hdd 3.63869  1.0 3.6 TiB 2.2 TiB 2.2 TiB 1.4 GiB 6.7 GiB 1.4 TiB 60.67 
1.06 140 up 
 8   hdd 3.63869  1.0 3.6 TiB 2.0 TiB 2.0 TiB 376 MiB 6.1 GiB 1.7 TiB 53.88 
0.94 125 up 
11   hdd 3.63869  1.0 3.6 TiB 2.0 TiB 2.0 TiB 2.0 GiB 6.2 GiB 1.6 TiB 55.48 
0.97 132 up 
14   hdd 3.63869  1.0 3.6 TiB 1.9 TiB 1.9 TiB 990 MiB 5.7 GiB 1.8 TiB 51.64 
0.90 124 up 
17   hdd 3.63869  1.0 3.6 TiB 2.3 TiB 2.3 TiB 182 MiB 6.8 GiB 1.4 TiB 62.70 
1.09 146 up 
20   hdd 3.63869  1.0 3.6 TiB 2.1 TiB 2.1 TiB 901 MiB 6.4 GiB 1.5 TiB 57.73 
1.00 134 up 
22   hdd 3.63869  1.0 3.6 TiB 2.0 TiB 2.0 TiB 621 MiB 6.0 GiB 1.6 TiB 55.15 
0.96 128 up 
24   hdd 3.63869  1.0 3.6 TiB 2.1 TiB 2.1 TiB 425 MiB 6.4 GiB 1.5 TiB 58.21 
1.01 134 up 
TOTAL  98 TiB  57 TiB  56 TiB  21 GiB 166 GiB  42 TiB 57.48 

MIN/MAX VAR: 0.05/1.25  STDDEV: 12.26
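
For reference, a quick way to double-check the cluster-wide thresholds 
(a sketch, assuming Nautilus):

ceph osd dump | grep -i ratio

which should print full_ratio, backfillfull_ratio and nearfull_ratio.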



*Simone Lazzaris*
*Qcom S.p.A.*
simone.lazza...@qcom.it[1] | www.qcom.it[2]
* LinkedIn[3]* | *Facebook*[4]




[1] mailto:simone.lazza...@qcom.it
[2] https://www.qcom.it
[3] https://www.linkedin.com/company/qcom-spa
[4] http://www.facebook.com/qcomspa
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cluster in ERR status when rebalancing

2019-12-09 Thread huang jun
what about the pool's backfill_full_ratio value?

Simone Lazzaris wrote on Monday, December 9, 2019 at 6:38 PM:
>
> Hi all;
>
> Long story short, I have a cluster of 26 OSDs in 3 nodes (8+9+9). One of the 
> disks is showing some read errors, so I've added an OSD in the faulty node 
> (OSD.26) and set the (re)weight of the faulty OSD (OSD.12) to zero.
>
>
>
> The cluster is now rebalancing, which is fine, but I have now 2 PG in 
> "backfill_toofull" state, so the cluster health is "ERR":
>
>
>
> cluster:
>
> id: 9ec27b0f-acfd-40a3-b35d-db301ac5ce8c
>
> health: HEALTH_ERR
>
> Degraded data redundancy (low space): 2 pgs backfill_toofull
>
> services:
>
> mon: 3 daemons, quorum s1,s2,s3 (age 7d)
>
> mgr: s1(active, since 7d), standbys: s2, s3
>
> osd: 27 osds: 27 up (since 2h), 26 in (since 2h); 262 remapped pgs
>
> rgw: 3 daemons active (s1, s2, s3)
>
> data:
>
> pools: 10 pools, 1200 pgs
>
> objects: 11.72M objects, 37 TiB
>
> usage: 57 TiB used, 42 TiB / 98 TiB avail
>
> pgs: 2618510/35167194 objects misplaced (7.446%)
>
> 938 active+clean
>
> 216 active+remapped+backfill_wait
>
> 44 active+remapped+backfilling
>
> 2 active+remapped+backfill_wait+backfill_toofull
>
> io:
>
> recovery: 163 MiB/s, 50 objects/s
>
> progress:
>
> Rebalancing after osd.12 marked out
>
> [=.]
>
> As you can see, there is plenty of space and none of my OSD is in full or 
> near full state:
>
>
>
> ++--+---+---++-++-+---+
>
> | id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
>
> ++--+---+---++-++-+---+
>
> | 0 | s1 | 2415G | 1310G | 0 | 0 | 0 | 0 | exists,up |
>
> | 1 | s2 | 2009G | 1716G | 0 | 0 | 0 | 0 | exists,up |
>
> | 2 | s3 | 2183G | 1542G | 0 | 0 | 0 | 0 | exists,up |
>
> | 3 | s1 | 2680G | 1045G | 0 | 0 | 0 | 0 | exists,up |
>
> | 4 | s2 | 2063G | 1662G | 0 | 0 | 0 | 0 | exists,up |
>
> | 5 | s3 | 2269G | 1456G | 0 | 0 | 0 | 0 | exists,up |
>
> | 6 | s1 | 2523G | 1202G | 0 | 0 | 0 | 0 | exists,up |
>
> | 7 | s2 | 1973G | 1752G | 0 | 0 | 0 | 0 | exists,up |
>
> | 8 | s3 | 2007G | 1718G | 0 | 0 | 1 | 0 | exists,up |
>
> | 9 | s1 | 2485G | 1240G | 0 | 0 | 0 | 0 | exists,up |
>
> | 10 | s2 | 2385G | 1340G | 0 | 0 | 0 | 0 | exists,up |
>
> | 11 | s3 | 2079G | 1646G | 0 | 0 | 0 | 0 | exists,up |
>
> | 12 | s1 | 2272G | 1453G | 0 | 0 | 0 | 0 | exists,up |
>
> | 13 | s2 | 2381G | 1344G | 0 | 0 | 0 | 0 | exists,up |
>
> | 14 | s3 | 1923G | 1802G | 0 | 0 | 0 | 0 | exists,up |
>
> | 15 | s1 | 2617G | 1108G | 0 | 0 | 0 | 0 | exists,up |
>
> | 16 | s2 | 2099G | 1626G | 0 | 0 | 0 | 0 | exists,up |
>
> | 17 | s3 | 2336G | 1389G | 0 | 0 | 0 | 0 | exists,up |
>
> | 18 | s1 | 2435G | 1290G | 0 | 0 | 0 | 0 | exists,up |
>
> | 19 | s2 | 2198G | 1527G | 0 | 0 | 0 | 0 | exists,up |
>
> | 20 | s3 | 2159G | 1566G | 0 | 0 | 0 | 0 | exists,up |
>
> | 21 | s1 | 2128G | 1597G | 0 | 0 | 0 | 0 | exists,up |
>
> | 22 | s3 | 2064G | 1661G | 0 | 0 | 0 | 0 | exists,up |
>
> | 23 | s2 | 1943G | 1782G | 0 | 0 | 0 | 0 | exists,up |
>
> | 24 | s3 | 2168G | 1557G | 0 | 0 | 0 | 0 | exists,up |
>
> | 25 | s2 | 2113G | 1612G | 0 | 0 | 0 | 0 | exists,up |
>
> | 26 | s1 | 68.9G | 3657G | 0 | 0 | 0 | 0 | exists,up |
>
> ++--+---+---++-++-+---+
>
>
> Why is this happening? I thought that maybe the 2 PG marked as toofull 
> involved either the OSD.12 (which is emptying) or the 26 (the new one) but it 
> seems that this is not the case:
>
>
>
> root@s1:~# ceph pg dump|egrep 'toofull|PG_STAT'
>
> PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES 
> OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG STATE STATE_STAMP VERSION REPORTED UP 
> UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP LAST_DEEP_SCRUB 
> DEEP_SCRUB_STAMP SNAPTRIMQ_LEN
>
> 6.212 0 0 0 0 0 38145321727 0 0 3023 3023 
> active+remapped+backfill_wait+backfill_toofull 2019-12-09 11:11:39.093042 
> 13598'212053 13713:1179718 [6,19,24] 6 [13,0,24] 13 13549'211985 2019-12-08 
> 19:46:10.461113 11644'211779 2019-12-06 07:37:42.864325 0
>
> 6.bc 11057 0 0 22114 0 37733931136 0 0 3032 3032 
> active+remapped+backfill_wait+backfill_toofull 2019-12-09 10:42:25.534277 
> 13549'212110 13713:1229839 [15,25,17] 15 [19,18,17] 19 13549'211983 
> 2019-12-08 11:02:45.846031 11644'211854 2019-12-06 06:22:43.565313 0
>
>
>
> Any hints? I'm not worried because I think the cluster will heal 
> itself, but this is neither clear nor logical.
>
>
>
> --
>
> Simone Lazzaris
> Staff R&D
>
> Qcom S.p.A.
> Via Roggia Vignola, 9 | 24047 Treviglio (BG)
> T +39 0363 47905 | D +39 0363 1970352
> simone.lazza...@qcom.it | www.qcom.it
>
> Qcom Official Pages LinkedIn | Facebook
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@l

[ceph-users] Cluster in ERR status when rebalancing

2019-12-09 Thread Simone Lazzaris
Hi all;
Long story short, I have a cluster of 26 OSDs in 3 nodes (8+9+9). One of the 
disks is showing 
some read errors, so I've added an OSD in the faulty node (OSD.26) and set the 
(re)weight of 
the faulty OSD (OSD.12) to zero.

The cluster is now rebalancing, which is fine, but I have now 2 PG in 
"backfill_toofull" state, so 
the cluster health is "ERR":

  cluster:
id: 9ec27b0f-acfd-40a3-b35d-db301ac5ce8c
health: HEALTH_ERR
Degraded data redundancy (low space): 2 pgs backfill_toofull
 
  services:
mon: 3 daemons, quorum s1,s2,s3 (age 7d)
mgr: s1(active, since 7d), standbys: s2, s3
osd: 27 osds: 27 up (since 2h), 26 in (since 2h); 262 remapped pgs
rgw: 3 daemons active (s1, s2, s3)
 
  data:
pools:   10 pools, 1200 pgs
objects: 11.72M objects, 37 TiB
usage:   57 TiB used, 42 TiB / 98 TiB avail
pgs: 2618510/35167194 objects misplaced (7.446%)
 938 active+clean
 216 active+remapped+backfill_wait
 44  active+remapped+backfilling
 2   active+remapped+backfill_wait+backfill_toofull
 
  io:
recovery: 163 MiB/s, 50 objects/s
 
  progress:
Rebalancing after osd.12 marked out
  [=.]
 
As you can see, there is plenty of space and none of my OSD  is in full or near 
full state:

++--+---+---++-++-+---+
| id | host |  used | avail | wr ops | wr data | rd ops | rd data |   state   |
++--+---+---++-++-+---+
| 0  |  s1  | 2415G | 1310G |0   | 0   |0   | 0   | exists,up |
| 1  |  s2  | 2009G | 1716G |0   | 0   |0   | 0   | exists,up |
| 2  |  s3  | 2183G | 1542G |0   | 0   |0   | 0   | exists,up |
| 3  |  s1  | 2680G | 1045G |0   | 0   |0   | 0   | exists,up |
| 4  |  s2  | 2063G | 1662G |0   | 0   |0   | 0   | exists,up |
| 5  |  s3  | 2269G | 1456G |0   | 0   |0   | 0   | exists,up |
| 6  |  s1  | 2523G | 1202G |0   | 0   |0   | 0   | exists,up |
| 7  |  s2  | 1973G | 1752G |0   | 0   |0   | 0   | exists,up |
| 8  |  s3  | 2007G | 1718G |0   | 0   |1   | 0   | exists,up |
| 9  |  s1  | 2485G | 1240G |0   | 0   |0   | 0   | exists,up |
| 10 |  s2  | 2385G | 1340G |0   | 0   |0   | 0   | exists,up |
| 11 |  s3  | 2079G | 1646G |0   | 0   |0   | 0   | exists,up |
| 12 |  s1  | 2272G | 1453G |0   | 0   |0   | 0   | exists,up |
| 13 |  s2  | 2381G | 1344G |0   | 0   |0   | 0   | exists,up |
| 14 |  s3  | 1923G | 1802G |0   | 0   |0   | 0   | exists,up |
| 15 |  s1  | 2617G | 1108G |0   | 0   |0   | 0   | exists,up |
| 16 |  s2  | 2099G | 1626G |0   | 0   |0   | 0   | exists,up |
| 17 |  s3  | 2336G | 1389G |0   | 0   |0   | 0   | exists,up |
| 18 |  s1  | 2435G | 1290G |0   | 0   |0   | 0   | exists,up |
| 19 |  s2  | 2198G | 1527G |0   | 0   |0   | 0   | exists,up |
| 20 |  s3  | 2159G | 1566G |0   | 0   |0   | 0   | exists,up |
| 21 |  s1  | 2128G | 1597G |0   | 0   |0   | 0   | exists,up |
| 22 |  s3  | 2064G | 1661G |0   | 0   |0   | 0   | exists,up |
| 23 |  s2  | 1943G | 1782G |0   | 0   |0   | 0   | exists,up |
| 24 |  s3  | 2168G | 1557G |0   | 0   |0   | 0   | exists,up |
| 25 |  s2  | 2113G | 1612G |0   | 0   |0   | 0   | exists,up |
| 26 |  s1  | 68.9G | 3657G |0   | 0   |0   | 0   | exists,up |
++--+---+---++-++-+---+



root@s1:~# ceph pg dump|egrep 'toofull|PG_STAT'
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES   
OMAP_BYTES* OMAP_KEYS* LOG  DISK_LOG STATE  
STATE_STAMP
VERSION   REPORTED   UP UP_PRIMARY ACTING ACTING_PRIMARY 
LAST_SCRUB
SCRUB_STAMPLAST_DEEP_SCRUB DEEP_SCRUB_STAMP   
SNAPTRIMQ_LEN 
6.212 0  00 0   0 38145321727   
0  0 3023 3023 
active+remapped+backfill_wait+backfill_toofull 2019-12-09 11:11:39.093042  
13598'212053  
13713:1179718  [6,19,24]  6  [13,0,24] 13  13549'211985 
2019-12-08 19:46:10.461113
11644'211779 2019-12-06 07:37:42.864325 0 
6.bc  11057  00 22114   0 37733931136   
0  0 3032 3032 
active+remapped+backfill_wait+backfill_toofull 2019-12-09 10:42:25.534277  
13549'212110  
13713:1229839 [15,25,17] 15 [19,18,17] 19  13549'211983 
2019-12-08 11:02:45.846031
11644'211854 2019-12-06 06:22:43.565313 0 

Any hints? I'm not worried because I think the cluster will heal itself, but 
this is neither clear nor logical.

Re: [ceph-users] Missing Ceph perf-counters in Ceph-Dashboard or Prometheus/InfluxDB...?

2019-12-09 Thread Stefan Kooman
Quoting Ernesto Puerta (epuer...@redhat.com):

> The default behaviour is that only perf-counters with priority
> PRIO_USEFUL (5) or higher are exposed (via `get_all_perf_counters` API
> call) to ceph-mgr modules (including Dashboard, DiskPrediction or
> Prometheus/InfluxDB/Telegraf exporters).
> 
> While changing that is rather trivial, it could make sense to get
> users' feedback and come up with a list of missing perf-counters to be
> exposed.

I made https://tracker.ceph.com/issues/4188 a while ago: missing metrics
in all modules except prometheus.
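
For anyone comparing, the full counter set and the priorities used for that 
filtering can be inspected straight from a daemon's admin socket (a sketch; 
osd.0 is just an example daemon):

ceph daemon osd.0 perf schema   # describes each counter; recent releases include its priority
ceph daemon osd.0 perf dump     # current values for all counters, regardless of priority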

Gr. Stefan

-- 
| BIT BV  https://www.bit.nl/Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com