Yes, increasing the PG count for the data pool is what you want to do when you add OSDs to your cluster.
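
Something like this should do it (a sketch from memory; the pool name is taken from the `ceph osd pool ls detail` output you posted earlier, and on your release pgp_num has to be raised separately after pg_num):

    # raise the placement group count for the data pool
    ceph osd pool set downloads_data pg_num 512
    # then raise pgp_num so the new PGs actually start remapping
    ceph osd pool set downloads_data pgp_num 512

Expect a fair amount of backfill while data remaps; some people step pg_num up in smaller increments to spread that load out.
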
On Wed, Nov 22, 2017, 9:25 AM gjprabu <[email protected]> wrote:

> Hi David,
>
> Thanks, we will check the OSD weight settings. We are not using rbd and
> will delete that pool. As per the PG calculation, for 8 OSDs we should
> have 512 PGs, but unfortunately we set 256 for metadata and 256 for
> data. Is it OK to increase the PG count on the data pool alone? And if
> we add more OSDs, will we need to increase the PG count again? Please
> suggest.
>
> Regards
> Prabu GJ
>
>
> ---- On Tue, 21 Nov 2017 21:43:13 +0530 David Turner
> <[email protected]> wrote ----
>
> Your rbd pool can be removed (unless you're planning to use it), which
> will delete those PGs from your cluster/OSDs. Also, all of your
> backfilling has finished and settled. Now you just need to work on
> balancing the weights of the OSDs in your cluster.
>
> There are multiple ways to balance the usage of a cluster: changing the
> crush weight of an OSD, changing the reweight of an OSD, doing the
> latter with `ceph osd reweight-by-utilization`, using Cern's modified
> version of that tool (which can weight things up as well as down), etc.
> I use a method that changes the crush weight of the OSD, but does so by
> downloading the crush map and using crushtool to generate a balanced
> map in one go. A very popular method on the list is a cron job that
> makes very small modifications in the background and keeps things
> balanced by utilization.
>
> You should be able to find plenty of references on the ML or in blog
> posts about these options. The takeaway is that the CRUSH algorithm is
> putting too much data on osd.4 and not enough on osd.2 (those are the
> extremes, but others are nearly as skewed), and you need to modify the
> weight and/or reweight of the OSDs to help the algorithm balance that
> out.
>
> On Tue, Nov 21, 2017 at 12:11 AM gjprabu <[email protected]> wrote:
>
> Hi David,
>
> This is our current status.
>
> ~]# ceph status
>     cluster b466e09c-f7ae-4e89-99a7-99d30eba0a13
>      health HEALTH_WARN
>             mds0: Client integ-hm3 failing to respond to cache pressure
>             mds0: Client integ-hm9-bkp failing to respond to cache pressure
>             mds0: Client me-build1-bkp failing to respond to cache pressure
>      monmap e2: 3 mons at {intcfs-mon1=192.168.113.113:6789/0,intcfs-mon2=192.168.113.114:6789/0,intcfs-mon3=192.168.113.72:6789/0}
>             election epoch 16, quorum 0,1,2 intcfs-mon3,intcfs-mon1,intcfs-mon2
>       fsmap e177798: 1/1/1 up {0=intcfs-osd1=up:active}, 1 up:standby
>      osdmap e4388: 8 osds: 8 up, 8 in
>             flags sortbitwise
>       pgmap v24129785: 564 pgs, 3 pools, 6885 GB data, 17138 kobjects
>             14023 GB used, 12734 GB / 26757 GB avail
>                  560 active+clean
>                    3 active+clean+scrubbing
>                    1 active+clean+scrubbing+deep
>   client io 47187 kB/s rd, 965 kB/s wr, 125 op/s rd, 525 op/s wr
>
> ]# ceph df
> GLOBAL:
>     SIZE       AVAIL      RAW USED     %RAW USED
>     26757G     12735G       14022G         52.41
> POOLS:
>     NAME                   ID     USED       %USED     MAX AVAIL     OBJECTS
>     rbd                    0           0         0         3787G            0
>     downloads_data         3       6885G     51.46         3787G     16047944
>     downloads_metadata     4      84773k         0         3787G      1501805
>
> Regards
> Prabu GJ
>
> ---- On Mon, 20 Nov 2017 21:35:17 +0530 David Turner
> <[email protected]> wrote ----
>
> What is your current `ceph status` and `ceph df`? The status of your
> cluster has likely changed a bit in the last week.
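>
> For the pool cleanup and the rebalancing, a rough sketch (commands from
> memory; run the dry-run first, let backfill settle, and repeat in small
> steps while watching `ceph osd df`):
>
>     # drop the unused rbd pool (only if you are sure nothing uses it)
>     ceph osd pool delete rbd rbd --yes-i-really-really-mean-it
>
>     # dry run: show which OSDs reweight-by-utilization would touch
>     ceph osd test-reweight-by-utilization
>
>     # apply the small reweight step it proposed
>     ceph osd reweight-by-utilization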
>
> On Mon, Nov 20, 2017 at 6:00 AM gjprabu <[email protected]> wrote:
>
> Hi David,
>
> Sorry for the late reply. The OSD sync has completed, but the fourth
> OSD's available space keeps reducing. Is there any option to check or
> fix this?
>
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
>  0 3.29749  1.00000  3376G  2320G  1056G 68.71 1.10 144
>  1 3.26869  1.00000  3347G  1871G  1475G 55.92 0.89 134
>  2 3.27339  1.00000  3351G  1699G  1652G 50.69 0.81 134
>  3 3.24089  1.00000  3318G  1865G  1452G 56.22 0.90 142
>  4 3.24089  1.00000  3318G  2839G   478G 85.57 1.37 158
>  5 3.32669  1.00000  3406G  2249G  1156G 66.04 1.06 136
>  6 3.27800  1.00000  3356G  1924G  1432G 57.33 0.92 139
>  7 3.20470  1.00000  3281G  1949G  1331G 59.42 0.95 141
>               TOTAL 26757G 16720G 10037G 62.49
> MIN/MAX VAR: 0.81/1.37  STDDEV: 10.26
>
> Regards
> Prabu GJ
>
>
> ---- On Mon, 13 Nov 2017 00:27:47 +0530 David Turner
> <[email protected]> wrote ----
>
> You cannot reduce the PG count for a pool, so there isn't anything you
> can really do about this unless you create a new FS with better PG
> counts and migrate your data into it.
>
> The problem with having more PGs than you need is the memory footprint
> of the OSD daemon. There are warning thresholds for having too many PGs
> per OSD. Also, in future expansions, if you need to add pools, you
> might not be able to create them with the proper number of PGs because
> of older pools that have far too many.
>
> It would still be nice to see the output of the commands I asked about.
>
> The built-in reweighting scripts might help your data distribution:
> reweight-by-utilization.
>
> On Sun, Nov 12, 2017, 11:41 AM gjprabu <[email protected]> wrote:
>
> Hi David,
>
> Thanks for your valuable reply. Once the backfilling for the new OSD
> completes, we will consider increasing the replica value ASAP. Is it
> possible to decrease the metadata PG count? If the metadata pool has
> the same PG count as the data pool, what kind of issue may occur?
>
> Regards
> PrabuGJ
>
>
> ---- On Sun, 12 Nov 2017 21:25:05 +0530 David Turner
> <[email protected]> wrote ----
>
> What's the output of `ceph df`, to see whether your PG counts are good
> or not? Like everyone else has said, the space on the original OSDs
> can't be expected to free up until the backfill from adding the new OSD
> has finished.
>
> You don't have anything in your cluster health to indicate that your
> cluster will not be able to finish this backfilling operation on its
> own.
>
> You might find this URL helpful for calculating your PG counts:
> http://ceph.com/pgcalc/  As a side note, it is generally better to keep
> your PG counts as base 2 numbers (16, 64, 256, etc.). When you do not
> have a base 2 number, some of your PGs will take up twice as much space
> as others. In your case with 250, you have 244 PGs that are the same
> size and 6 PGs that are twice that size. Bumping it up to 256 will even
> things out.
>
> Assuming the metadata pool is for a CephFS volume, you do not need
> nearly so many PGs for that pool. Also, I would recommend changing at
> least the metadata pool to 3 replica_size. If we can talk you into 3
> replicas for everything else, great! But if not, at least do the
> metadata pool. If you lose an object in the data pool, you just lose
> that file. If you lose an object in the metadata pool, you might lose
> access to the entire CephFS volume.
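>
> A minimal sketch of that change (pool names from the `ceph osd pool ls
> detail` output further down; the min_size value is my suggestion, not
> something you must use):
>
>     # bump the CephFS metadata pool to 3 copies
>     ceph osd pool set downloads_metadata size 3
>     # require 2 copies online before acking writes (my assumption)
>     ceph osd pool set downloads_metadata min_size 2
>
> Expect some backfill while the third copies are created.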
>
> On Sun, Nov 12, 2017, 9:39 AM gjprabu <[email protected]> wrote:
>
> Hi Cassiano,
>
> Thanks for your valuable feedback; we will wait for some time until the
> new OSD sync completes. Also, will increasing the PG count solve the
> issue? In our setup the PG number for both the data and metadata pools
> is 250. Is this correct for 7 OSDs with 2 replicas? The currently
> stored data size is 17TB.
>
> ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
>  0 3.29749  1.00000  3376G  2814G   562G 83.35 1.23 165
>  1 3.26869  1.00000  3347G  1923G  1423G 57.48 0.85 152
>  2 3.27339  1.00000  3351G  1980G  1371G 59.10 0.88 161
>  3 3.24089  1.00000  3318G  2131G  1187G 64.23 0.95 168
>  4 3.24089  1.00000  3318G  2998G   319G 90.36 1.34 176
>  5 3.32669  1.00000  3406G  2476G   930G 72.68 1.08 165
>  6 3.27800  1.00000  3356G  1518G  1838G 45.24 0.67 166
>               TOTAL 23476G 15843G  7632G 67.49
> MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
>
> ceph osd tree
> ID WEIGHT   TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 22.92604 root default
> -2  3.29749     host intcfs-osd1
>  0  3.29749         osd.0             up  1.00000          1.00000
> -3  3.26869     host intcfs-osd2
>  1  3.26869         osd.1             up  1.00000          1.00000
> -4  3.27339     host intcfs-osd3
>  2  3.27339         osd.2             up  1.00000          1.00000
> -5  3.24089     host intcfs-osd4
>  3  3.24089         osd.3             up  1.00000          1.00000
> -6  3.24089     host intcfs-osd5
>  4  3.24089         osd.4             up  1.00000          1.00000
> -7  3.32669     host intcfs-osd6
>  5  3.32669         osd.5             up  1.00000          1.00000
> -8  3.27800     host intcfs-osd7
>  6  3.27800         osd.6             up  1.00000          1.00000
>
> ceph osd pool ls detail
> pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
> pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags hashpspool crash_replay_interval 45 stripe_width 0
> pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags hashpspool stripe_width 0
>
> Regards
> Prabu GJ
>
> ---- On Sun, 12 Nov 2017 19:20:34 +0530 Cassiano Pilipavicius
> <[email protected]> wrote ----
>
> I am also not an expert, but it looks like you have big data volumes on
> few PGs. From what I've seen, PG data is only deleted from the old OSD
> once it has been completely copied to the new OSD.
>
> So if one PG holds 100G, for example, the space is released on the old
> OSD only when that PG has been fully copied to the new OSD.
>
> If you have a busy cluster/network, it may take a good while. Maybe
> just wait a little, check from time to time, and the space will
> eventually be released.
>
> On 11/12/2017 11:44 AM, Sébastien VIGNERON wrote:
>
> I'm not an expert either, so if someone on the list has some ideas on
> this problem, don't be shy, share them with us.
>
> For now, I only have the hypothesis that the OSD space will be
> recovered as soon as the recovery process is complete. Hope everything
> gets back in order soon (before reaching 95% or above).
>
> I saw some messages on the list about the fstrim tool, which can help
> reclaim unused free space, but I don't know whether it applies to your
> case.
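>
> One way to keep an eye on the recovery (a sketch; the exact PG state
> names vary between releases):
>
>     # overall cluster and recovery state
>     ceph -s
>     # count PGs still waiting on or doing backfill
>     ceph pg dump pgs_brief | grep -c backfill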
>
> Cordialement / Best regards,
>
> Sébastien VIGNERON
> CRIANN,
> Ingénieur / Engineer
> Technopôle du Madrillet
> 745, avenue de l'Université
> 76800 Saint-Etienne du Rouvray - France
> tél. +33 2 32 91 42 91
> fax. +33 2 32 91 42 92
> http://www.criann.fr
> mailto:[email protected]
> support: [email protected]
>
> On 12 Nov 2017, at 13:29, gjprabu <[email protected]> wrote:
>
> Hi Sebastien,
>
> Below are the query details. I am not that much of an expert and am
> still learning. The PGs were not in a stuck state before adding the
> OSD, and they are slowly clearing to active+clean. This morning there
> were around 53 PGs in active+undersized+degraded+remapped+wait_backfill
> and now there are only 21, so I hope it is progressing, and I see the
> used space keep increasing on the newly added OSD (osd.6).
>
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
>  0 3.29749  1.00000  3376G  2814G   562G 83.35 1.23 165  (available space not reduced after adding new OSD)
>  1 3.26869  1.00000  3347G  1923G  1423G 57.48 0.85 152
>  2 3.27339  1.00000  3351G  1980G  1371G 59.10 0.88 161
>  3 3.24089  1.00000  3318G  2131G  1187G 64.23 0.95 168
>  4 3.24089  1.00000  3318G  2998G   319G 90.36 1.34 176  (available space not reduced after adding new OSD)
>  5 3.32669  1.00000  3406G  2476G   930G 72.68 1.08 165  (available space not reduced after adding new OSD)
>  6 3.27800  1.00000  3356G  1518G  1838G 45.24 0.67 166
>               TOTAL 23476G 15843G  7632G 67.49
> MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
>
> ...
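>
> I have been checking them with something like this (not sure whether it
> is the right way):
>
>     # list PGs that are not yet active+clean
>     ceph pg dump_stuck unclean
>     # health detail also summarises the degraded/misplaced objects
>     ceph health detail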

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
