Hi David,
Sorry for the late reply. The OSD sync has completed, but even so the
available size on the fourth OSD keeps reducing. Is there any option to check
or fix this?
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 0 3.29749 1.00000  3376G  2320G  1056G  68.71 1.10 144
 1 3.26869 1.00000  3347G  1871G  1475G  55.92 0.89 134
 2 3.27339 1.00000  3351G  1699G  1652G  50.69 0.81 134
 3 3.24089 1.00000  3318G  1865G  1452G  56.22 0.90 142
 4 3.24089 1.00000  3318G  2839G   478G  85.57 1.37 158
 5 3.32669 1.00000  3406G  2249G  1156G  66.04 1.06 136
 6 3.27800 1.00000  3356G  1924G  1432G  57.33 0.92 139
 7 3.20470 1.00000  3281G  1949G  1331G  59.42 0.95 141
            TOTAL   26757G 16720G 10037G 62.49
MIN/MAX VAR: 0.81/1.37  STDDEV: 10.26
Regards
Prabu GJ
---- On Mon, 13 Nov 2017 00:27:47 +0530 David Turner
<[email protected]> wrote ----
You cannot reduce the PG count for a pool. So there isn't anything you can
really do for this unless you create a new FS with better PG counts and migrate
your data into it.
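If you do go that route, the rough shape of it would be something like the
following sketch (pool and filesystem names here are just placeholders, the
PG counts are examples, and on Jewel/Luminous you would first need to enable
multiple filesystems; copying the data across is a separate step):

    ceph fs flag set enable_multiple true --yes-i-really-mean-it
    ceph osd pool create downloads_data_new 256 256
    ceph osd pool create downloads_metadata_new 64 64
    ceph fs new downloads_new downloads_metadata_new downloads_data_new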
The problem with having more PGs than you need is in the memory footprint for
the osd daemon. There are warning thresholds for having too many PGs per osd.
Also, in future expansions, if you need to add pools, you might not be able
to create them with the proper number of PGs because of older pools that have
far too many.
It would still be nice to see the output from those commands I asked about.
The built-in reweighting commands, such as reweight-by-utilization, might
help your data distribution.
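If you try it, there is a dry-run variant that reports what would change
before you commit anything (available since Jewel); the optional argument is
the overload threshold as a percentage of average utilization (default 120):

    ceph osd test-reweight-by-utilization        # dry run: show proposed reweights
    ceph osd reweight-by-utilization 110         # reweight OSDs >10% above average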
On Sun, Nov 12, 2017, 11:41 AM gjprabu <[email protected]> wrote:
Hi David,
Thanks for your valuable reply. Once the backfilling for the new OSD is
complete, we will consider increasing the replica value ASAP. Is it possible
to decrease the metadata PG count? If the PG count for metadata is the same
as the data pool's count, what kind of issue may occur?
Regards
PrabuGJ
---- On Sun, 12 Nov 2017 21:25:05 +0530 David Turner
<[email protected]> wrote ----
What's the output of `ceph df` to see if your PG counts are good or not? Like
everyone else has said, the space on the original osds can't be expected to
free up until the backfill from adding the new osd has finished.
You don't have anything in your cluster health to indicate that your cluster
will not be able to finish this backfilling operation on its own.
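If you want to double-check, these will show anything stuck as well as the
backfill progressing:

    ceph health detail
    ceph -s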
You might find this URL helpful in calculating your PG counts.
http://ceph.com/pgcalc/

As a side note, it is generally better to keep your PG counts as powers of 2
(16, 64, 256, etc.). When you do not have a power-of-2 count, some of your
PGs will take up twice as much space as others. In your case with 250, you
have 244 PGs that are the same size and 6 PGs that are twice the size of
those 244 PGs. Bumping that up to 256 will even things out.
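To do that in place, set both pg_num and pgp_num (pgp_num is what actually
triggers the rebalancing, and the split will move some data around):

    ceph osd pool set downloads_data pg_num 256
    ceph osd pool set downloads_data pgp_num 256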
Assuming that the metadata pool is for a CephFS volume, you do not need
nearly so many PGs for that pool. Also, I would recommend changing at least
the metadata pool to a replica size of 3. If we can talk you into 3 replicas
for everything else, great! But if not, at least do the metadata pool. If you
lose an object in the data pool, you just lose that file. If you lose an
object in the metadata pool, you might lose access to the entire CephFS
volume.
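For the replica change, something like this (min_size 2 is the usual
companion setting so you never serve I/O from a single copy; expect some
backfill when the size goes up):

    ceph osd pool set downloads_metadata size 3
    ceph osd pool set downloads_metadata min_size 2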
On Sun, Nov 12, 2017, 9:39 AM gjprabu <[email protected]> wrote:
Hi Cassiano,
Thanks for your valuable feedback; we will wait some time until the new
OSD sync completes. Also, will increasing the PG count solve the issue? In
our setup, the PG number for both the data and metadata pools is 250. Is this
correct for 7 OSDs with 2 replicas? The currently stored data size is 17TB.
ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 0 3.29749 1.00000  3376G  2814G   562G  83.35 1.23 165
 1 3.26869 1.00000  3347G  1923G  1423G  57.48 0.85 152
 2 3.27339 1.00000  3351G  1980G  1371G  59.10 0.88 161
 3 3.24089 1.00000  3318G  2131G  1187G  64.23 0.95 168
 4 3.24089 1.00000  3318G  2998G   319G  90.36 1.34 176
 5 3.32669 1.00000  3406G  2476G   930G  72.68 1.08 165
 6 3.27800 1.00000  3356G  1518G  1838G  45.24 0.67 166
            TOTAL   23476G 15843G  7632G 67.49
MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
ceph osd tree
ID WEIGHT   TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 22.92604 root default
-2  3.29749     host intcfs-osd1
 0  3.29749         osd.0             up  1.00000          1.00000
-3  3.26869     host intcfs-osd2
 1  3.26869         osd.1             up  1.00000          1.00000
-4  3.27339     host intcfs-osd3
 2  3.27339         osd.2             up  1.00000          1.00000
-5  3.24089     host intcfs-osd4
 3  3.24089         osd.3             up  1.00000          1.00000
-6  3.24089     host intcfs-osd5
 4  3.24089         osd.4             up  1.00000          1.00000
-7  3.32669     host intcfs-osd6
 5  3.32669         osd.5             up  1.00000          1.00000
-8  3.27800     host intcfs-osd7
 6  3.27800         osd.6             up  1.00000          1.00000
ceph osd pool ls detail
pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags hashpspool
crash_replay_interval 45 stripe_width 0
pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags hashpspool
stripe_width 0
Regards
Prabu GJ
---- On Sun, 12 Nov 2017 19:20:34 +0530 Cassiano Pilipavicius
<[email protected]> wrote ----
I am also not an expert, but it looks like you have big data volumes on a few
PGs. From what I've seen, a PG's data is only deleted from the old OSD once
it has been completely copied to the new OSD.
So if one PG holds 100G, for example, the space on the old OSD will only be
released once that PG is fully copied to the new OSD.
If you have a busy cluster/network, it may take a good while. Maybe just wait
a little and check from time to time; the space will eventually be released.
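To check from time to time, re-running these shows the recovery progressing
and the space being released:

    ceph -s         # objects degraded/misplaced and recovery rate
    ceph osd df     # per-OSD usage and available space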
On 11/12/2017 11:44 AM, Sébastien VIGNERON wrote:
I’m not an expert either, so if someone on the list has some ideas on this
problem, don’t be shy, share them with us.
For now, I only have the hypothesis that the OSD space will be recovered as
soon as the recovery process is complete.
Hope everything will get back in order soon (before reaching 95% or above).
I saw some messages on the list about the fstrim tool, which can help reclaim
unused free space, but I don’t know if it applies to your case.
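If it does apply (it mainly helps when the OSD disks are thin-provisioned or
virtual and the underlying storage supports discard), the usual form would be
to trim an OSD's mount point, e.g. for osd.4 assuming the default mount
location:

    fstrim -v /var/lib/ceph/osd/ceph-4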
Cordialement / Best regards,
Sébastien VIGNERON
CRIANN,
Ingénieur / Engineer
Technopôle du Madrillet
745, avenue de l'Université
76800 Saint-Etienne du Rouvray - France
tél. +33 2 32 91 42 91
fax. +33 2 32 91 42 92
http://www.criann.fr
mailto:[email protected]
support: [email protected]
On 12 Nov 2017 at 13:29, gjprabu <[email protected]> wrote:
Hi Sebastien,
Below are the query details. I am not much of an expert and am still
learning. The PGs were not in a stuck state before adding the OSD, and they
are slowly moving to the active+clean state. This morning there were around
53 PGs in active+undersized+degraded+remapped+wait_backfill, and now there
are only 21, so I hope it is progressing; I can see the used space keep
increasing on the newly added OSD (osd.6).
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 0 3.29749 1.00000  3376G  2814G   562G  83.35 1.23 165  (available space not reduced after adding the new OSD)
 1 3.26869 1.00000  3347G  1923G  1423G  57.48 0.85 152
 2 3.27339 1.00000  3351G  1980G  1371G  59.10 0.88 161
 3 3.24089 1.00000  3318G  2131G  1187G  64.23 0.95 168
 4 3.24089 1.00000  3318G  2998G   319G  90.36 1.34 176  (available space not reduced after adding the new OSD)
 5 3.32669 1.00000  3406G  2476G   930G  72.68 1.08 165  (available space not reduced after adding the new OSD)
 6 3.27800 1.00000  3356G  1518G  1838G  45.24 0.67 166
            TOTAL   23476G 15843G  7632G 67.49
MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
...
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com