Unfortunately, any snapshots created prior to 12.2.2 against a separate data pool were incorrectly associated with the base image pool instead of the data pool. Was the base RBD pool used only for data-pool-associated images (i.e. can all the snapshots that exist within the pool be safely deleted)?
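[To make that audit concrete: a minimal sketch of how one might check whether an image in the base pool is backed by a separate data pool. On a live cluster the input would come from `rbd info <base_pool>/<image>`; the image name and the `rbd info` layout below are assumed Luminous-style samples, not output from this cluster.]

```shell
#!/bin/sh
# Sketch: detect whether a base-pool RBD image keeps its data in a separate
# (e.g. erasure-coded) data pool. On a real cluster, feed this from
# `rbd info <base_pool>/<image>`; the sample below is assumed output for a
# hypothetical image.
sample='rbd image "vm-101-disk-1":
        size 50GiB in 12800 objects
        order 22 (4MiB objects)
        data_pool: ecpool
        block_name_prefix: rbd_data.55.0123456789ab'

data_pool=$(printf '%s\n' "$sample" | awk '/data_pool:/ {print $2}')
if [ -n "$data_pool" ]; then
    echo "data-pool associated image (data in: $data_pool)"
else
    echo "plain image (data in the base pool)"
fi
# prints: data-pool associated image (data in: ecpool)
```

Images whose `rbd info` output shows no `data_pool:` line store their data directly in the base pool, which is the distinction Jason's question turns on.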
On Mon, Jan 29, 2018 at 11:50 AM, Karun Josy <[email protected]> wrote:
>
> The problem we are experiencing is described here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1497332
> However, we are running 12.2.2.
>
> Across our 6 Ceph clusters, the one with the problem started at version
> 12.2.0, then was upgraded to .1 and then to .2. The other 5 installations
> started at version 12.2.1 and were then updated to .2.
>
> Karun Josy
>
> On Mon, Jan 29, 2018 at 7:01 PM, Karun Josy <[email protected]> wrote:
>
>> Thank you for your response.
>>
>> We don't think the cluster is merely behind on snap trimming. We just
>> don't think snaptrim is occurring at all.
>>
>> We have 6 individual Ceph clusters. When we delete old snapshots for
>> clients, we can see space being made available. On this particular one,
>> however, with 300 virtual machines and 28 TB of data (this is our largest
>> Ceph cluster), I can delete hundreds of snapshots and not a single
>> gigabyte becomes available afterwards.
>>
>> In our other 5, smaller Ceph clusters, we see hundreds of gigabytes
>> becoming available again after massive deletions of snapshots.
>>
>> The Luminous GUI also never shows snaptrimming occurring in the EC pool.
>> On the other 5 Luminous clusters, the GUI does show snaptrimming
>> occurring for the EC pool, and within minutes we can see the additional
>> space becoming available.
>>
>> This isn't a case of the trimming queue being behind schedule. The
>> system shows there is never any trimming scheduled in the queue at all.
>>
>> However, when running rbd du on particular virtual machines, we can see
>> that snapshots we delete are indeed no longer listed in its output.
>> So they do seem to be deleting, but the space is not being reclaimed.
>>
>> All clusters are the same hardware. Some have more disks and servers
>> than others.
>> The only major difference is that this particular cluster, the one with
>> the problem, had the noscrub and nodeep-scrub flags set for many weeks.
>>
>> Karun Josy
>>
>> On Mon, Jan 29, 2018 at 6:27 PM, David Turner <[email protected]> wrote:
>>
>>> I don't know why you keep asking the same question about snap trimming.
>>> You haven't shown any evidence that your cluster is behind on that.
>>> Have you looked into fstrim inside of your VMs?
>>>
>>> On Mon, Jan 29, 2018, 4:30 AM Karun Josy <[email protected]> wrote:
>>>
>>>> The fast-diff feature is not enabled for the RBD images.
>>>> Could that be a reason for trimming not happening?
>>>>
>>>> Karun Josy
>>>>
>>>> On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy <[email protected]> wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> Thank you for your reply! I really appreciate it.
>>>>>
>>>>> The images are in pool id 55. It is an erasure-coded pool.
>>>>>
>>>>> ---------------
>>>>> $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>> 0
>>>>> $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>> 0
>>>>> $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>> 0
>>>>> --------------
>>>>>
>>>>> The current snap_trim_sleep value is the default,
>>>>> "osd_snap_trim_sleep": "0.000000". I assume that means there is no
>>>>> delay (I can't find any documentation related to it). Will changing
>>>>> its value initiate snaptrimming, e.g.:
>>>>> ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05'
>>>>>
>>>>> Also, we are using an RBD user with the profile below; it is used
>>>>> while deleting snapshots:
>>>>> -------
>>>>> caps: [mon] profile rbd
>>>>> caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, profile rbd-read-only pool=templates
>>>>> -------
>>>>>
>>>>> Could that be a reason?
>>>>>
>>>>> Also, can you let me know which logs to check while deleting
>>>>> snapshots, to see if snaptrimming is happening?
>>>>> I am sorry, I feel like I am pestering you too much. But in the
>>>>> mailing lists I can see you have dealt with similar issues with
>>>>> snapshots, so I think you can help me figure this mess out.
>>>>>
>>>>> Karun Josy
>>>>>
>>>>> On Sat, Jan 27, 2018 at 7:15 PM, David Turner <[email protected]> wrote:
>>>>>
>>>>>> Prove* a positive
>>>>>>
>>>>>> On Sat, Jan 27, 2018, 8:45 AM David Turner <[email protected]> wrote:
>>>>>>
>>>>>>> Unless you have things in your snap_trimq, your problem isn't snap
>>>>>>> trimming. That is currently how you can check snap trimming, and
>>>>>>> you say you're caught up.
>>>>>>>
>>>>>>> Are you certain that you are querying the correct pool for the
>>>>>>> images you are snapshotting? You showed that you tested 4 different
>>>>>>> pools. You should only need to check the pool with the images you
>>>>>>> are dealing with.
>>>>>>>
>>>>>>> You can inversely prove a positive by changing your snap_trim
>>>>>>> settings to not do any cleanup and seeing if the appropriate PGs
>>>>>>> have anything in their queue.
>>>>>>>
>>>>>>> On Sat, Jan 27, 2018, 12:06 AM Karun Josy <[email protected]> wrote:
>>>>>>>
>>>>>>>> Are scrubbing and deep scrubbing necessary for the snaptrim
>>>>>>>> operation to happen?
>>>>>>>>
>>>>>>>> Karun Josy
>>>>>>>>
>>>>>>>> On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Thank you for your quick response!
>>>>>>>>>
>>>>>>>>> I used the command to fetch the snap_trimq from many PGs; however,
>>>>>>>>> it seems they don't have anything in the queue?
>>>>>>>>>
>>>>>>>>> For example:
>>>>>>>>> ====================
>>>>>>>>> $ echo $(( $(ceph pg 55.4a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>> 0
>>>>>>>>> $ echo $(( $(ceph pg 55.5a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>> 0
>>>>>>>>> $ echo $(( $(ceph pg 55.88 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>> 0
>>>>>>>>> $ echo $(( $(ceph pg 55.55 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>> 0
>>>>>>>>> $ echo $(( $(ceph pg 54.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>> 0
>>>>>>>>> $ echo $(( $(ceph pg 34.1d query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>> 0
>>>>>>>>> $ echo $(( $(ceph pg 1.3f query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>> 0
>>>>>>>>> =====================
>>>>>>>>>
>>>>>>>>> While going through the PG query output, I also find that these
>>>>>>>>> PGs have no value in the purged_snaps section.
>>>>>>>>> For example:
>>>>>>>>> ceph pg 55.80 query
>>>>>>>>> ---
>>>>>>>>> {
>>>>>>>>>     "peer": "83(3)",
>>>>>>>>>     "pgid": "55.80s3",
>>>>>>>>>     "last_update": "43360'15121927",
>>>>>>>>>     "last_complete": "43345'15073146",
>>>>>>>>>     "log_tail": "43335'15064480",
>>>>>>>>>     "last_user_version": 15066124,
>>>>>>>>>     "last_backfill": "MAX",
>>>>>>>>>     "last_backfill_bitwise": 1,
>>>>>>>>>     "purged_snaps": [],
>>>>>>>>>     "history": {
>>>>>>>>>         "epoch_created": 5950,
>>>>>>>>>         "epoch_pool_created": 5950,
>>>>>>>>>         "last_epoch_started": 43339,
>>>>>>>>>         "last_interval_started": 43338,
>>>>>>>>>         "last_epoch_clean": 43340,
>>>>>>>>>         "last_interval_clean": 43338,
>>>>>>>>>         "last_epoch_split": 0,
>>>>>>>>>         "last_epoch_marked_full": 42032,
>>>>>>>>>         "same_up_since": 43338,
>>>>>>>>>         "same_interval_since": 43338,
>>>>>>>>>         "same_primary_since": 43276,
>>>>>>>>>         "last_scrub": "35299'13072533",
>>>>>>>>>         "last_scrub_stamp": "2018-01-18 14:01:19.557972",
>>>>>>>>>         "last_deep_scrub": "31372'12176860",
>>>>>>>>>         "last_deep_scrub_stamp": "2018-01-15 12:21:17.025305",
>>>>>>>>>         "last_clean_scrub_stamp": "2018-01-18 14:01:19.557972"
>>>>>>>>>     },
>>>>>>>>>
>>>>>>>>> Not sure if it is related.
>>>>>>>>>
>>>>>>>>> The cluster is not open to any new clients; however, we see a
>>>>>>>>> steady growth in space usage every day. In the worst-case
>>>>>>>>> scenario, it might grow faster than we can add more space, which
>>>>>>>>> would be dangerous.
>>>>>>>>>
>>>>>>>>> Any help is really appreciated.
>>>>>>>>>
>>>>>>>>> Karun Josy
>>>>>>>>>
>>>>>>>>> On Fri, Jan 26, 2018 at 8:23 PM, David Turner <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> "snap_trimq": "[]",
>>>>>>>>>>
>>>>>>>>>> That is exactly what you're looking at to see how many objects a
>>>>>>>>>> PG still has that need to be cleaned up. I think something like
>>>>>>>>>> this should give you the number of objects in the snap_trimq for
>>>>>>>>>> a PG.
>>>>>>>>>>
>>>>>>>>>> echo $(( $(ceph pg $pg query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>
>>>>>>>>>> Note, I'm not at a computer and am typing this from my phone, so
>>>>>>>>>> it's not pretty and I know of a few ways to do that better, but
>>>>>>>>>> it should work all the same.
>>>>>>>>>>
>>>>>>>>>> For your needs, a visual inspection of several PGs should be
>>>>>>>>>> sufficient to see if there is anything in the snap_trimq to begin
>>>>>>>>>> with.
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 26, 2018, 9:18 AM Karun Josy <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for the response. To be honest, I am afraid this is
>>>>>>>>>>> going to be an issue in our cluster. It seems snaptrim has not
>>>>>>>>>>> been running for some time now, maybe because we were expanding
>>>>>>>>>>> the cluster, adding nodes, for the past few weeks.
>>>>>>>>>>>
>>>>>>>>>>> I would be really glad if you can guide me on how to overcome
>>>>>>>>>>> this. The cluster has about 30 TB of data and 11 million
>>>>>>>>>>> objects, with about 100 disks spread across 16 nodes. The
>>>>>>>>>>> version is 12.2.2. Searching through the mailing lists, I can
>>>>>>>>>>> see many cases where performance was affected while
>>>>>>>>>>> snaptrimming.
>>>>>>>>>>>
>>>>>>>>>>> Can you help me figure out:
>>>>>>>>>>>
>>>>>>>>>>> - How to find the snaptrim queue of a PG?
>>>>>>>>>>> - Can snaptrim be started on just 1 PG?
>>>>>>>>>>> - How can I make sure cluster I/O performance is not affected?
>>>>>>>>>>>   I read about osd_snap_trim_sleep; how can it be changed?
>>>>>>>>>>>   Is this the command:
>>>>>>>>>>>   ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.005'
>>>>>>>>>>>   If yes, what is the recommended value that we can use?
>>>>>>>>>>>
>>>>>>>>>>> Also, what other parameters should we be concerned about? I
>>>>>>>>>>> would really appreciate any suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Below is a brief extract of a PG query:
>>>>>>>>>>> ----------------------------
>>>>>>>>>>> ceph pg 55.77 query
>>>>>>>>>>> {
>>>>>>>>>>>     "state": "active+clean",
>>>>>>>>>>>     "snap_trimq": "[]",
>>>>>>>>>>> ---
>>>>>>>>>>>     "pgid": "55.77s7",
>>>>>>>>>>>     "last_update": "43353'17222404",
>>>>>>>>>>>     "last_complete": "42773'16814984",
>>>>>>>>>>>     "log_tail": "42763'16812644",
>>>>>>>>>>>     "last_user_version": 16814144,
>>>>>>>>>>>     "last_backfill": "MAX",
>>>>>>>>>>>     "last_backfill_bitwise": 1,
>>>>>>>>>>>     "purged_snaps": [],
>>>>>>>>>>>     "history": {
>>>>>>>>>>>         "epoch_created": 5950,
>>>>>>>>>>> ---
>>>>>>>>>>>
>>>>>>>>>>> Karun Josy
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 26, 2018 at 6:36 PM, David Turner <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You may find the information in this ML thread useful:
>>>>>>>>>>>> https://www.spinics.net/lists/ceph-users/msg41279.html
>>>>>>>>>>>>
>>>>>>>>>>>> It talks about a couple of ways to track your snaptrim queue.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 26, 2018 at 2:09 AM Karun Josy <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> We have set the noscrub and nodeep-scrub flags on a Ceph
>>>>>>>>>>>>> cluster. When we are deleting snapshots, we are not seeing
>>>>>>>>>>>>> any change in used space.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I understand that Ceph OSDs delete data asynchronously, so
>>>>>>>>>>>>> deleting a snapshot doesn't free up the disk space
>>>>>>>>>>>>> immediately. But we are not seeing any change for some time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What could be the possible reason? Any suggestions would be
>>>>>>>>>>>>> really helpful, as the cluster size seems to be growing each
>>>>>>>>>>>>> day even though snapshots are deleted.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Karun
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Jason
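[For reference, the snap_trimq check passed around in this thread can be tried offline. A minimal sketch, using assumed Luminous-style `ceph pg <pgid> query` lines as sample input; note that snap_trimq entries are intervals of the form first~length, so this counts queued intervals rather than individual snapshots:]

```shell
#!/bin/sh
# Count snap_trimq intervals from a "snap_trimq" line of `ceph pg <pgid> query`
# output. An empty queue "[]" yields 0; "[1~3,8~2]" yields 2 intervals
# (which together cover 5 snapshots).
count_trimq() {
    printf '%s\n' "$1" | cut -d'[' -f2 | cut -d']' -f1 | tr ',' '\n' | grep -c '~'
}

count_trimq '    "snap_trimq": "[]",'          # prints 0
count_trimq '    "snap_trimq": "[1~3,8~2]",'   # prints 2
```

On a live cluster one would feed it from `ceph pg $pg query | grep snap_trimq`, matching the one-liner in the thread; and as discussed above, osd_snap_trim_sleep can be raised at runtime with `ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05'` to throttle trimming.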
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
