The problem we are experiencing is described here: https://bugzilla.redhat.com/show_bug.cgi?id=1497332
However, we are running 12.2.2. Across our 6 Ceph clusters, the one with the
problem started at version 12.2.0, then was upgraded to .1 and then to .2.
The other 5 Ceph installations started at version 12.2.1 and were then
updated to .2.

Karun Josy

On Mon, Jan 29, 2018 at 7:01 PM, Karun Josy <[email protected]> wrote:

> Thank you for your response.
>
> We don't think there is an issue with the cluster being behind on snap
> trimming. We just don't think snaptrim is occurring at all.
>
> We have 6 individual Ceph clusters. When we delete old snapshots for
> clients, we can see space being made available. In this particular one,
> however, with 300 virtual machines and 28 TB of data (this is our largest
> Ceph cluster), I can delete hundreds of snapshots and not a single
> gigabyte becomes available afterwards.
>
> In our other 5, smaller Ceph clusters, we can see hundreds of gigabytes
> becoming available again after massive deletions of snapshots.
>
> The Luminous GUI also never shows snaptrimming occurring in the EC pool,
> while the GUIs of the other 5 Luminous clusters will show snaptrimming
> occurring for their EC pools, and within minutes we can see the
> additional space becoming available.
>
> This isn't an issue of the trimming queue being behind schedule. The
> system shows there is never any trimming scheduled in the queue.
>
> However, when using rbd du on particular virtual machines, we can see
> that snapshots we delete are indeed no longer listed in its output.
>
> So they seem to be deleting, but the space is not being reclaimed.
>
> All clusters are the same hardware; some have more disks and servers than
> others. The only major difference is that the Ceph cluster with this
> problem had the noscrub and nodeep-scrub flags set for many weeks.
>
> Karun Josy
>
> On Mon, Jan 29, 2018 at 6:27 PM, David Turner <[email protected]> wrote:
>
>> I don't know why you keep asking the same question about snap trimming.
>> You haven't shown any evidence that your cluster is behind on that. Have
>> you looked into fstrim inside of your VMs?
>>
>> On Mon, Jan 29, 2018, 4:30 AM Karun Josy <[email protected]> wrote:
>>
>>> The fast-diff map is not enabled for the RBD images.
>>> Can that be a reason for trimming not happening?
>>>
>>> Karun Josy
>>>
>>> On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy <[email protected]> wrote:
>>>
>>>> Hi David,
>>>>
>>>> Thank you for your reply! I really appreciate it.
>>>>
>>>> The images are in pool id 55. It is an erasure-coded pool.
>>>>
>>>> ---------------
>>>> $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>> 0
>>>> $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>> 0
>>>> $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>> 0
>>>> --------------
>>>>
>>>> The current snap_trim_sleep value is the default,
>>>> "osd_snap_trim_sleep": "0.000000". I assume that means there is no
>>>> delay. (I can't find any documentation related to it.)
>>>> Will changing its value initiate snaptrimming, like:
>>>> ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05'
>>>>
>>>> Also, we are using an rbd user with the below profile. It is used
>>>> while deleting snapshots:
>>>> -------
>>>> caps: [mon] profile rbd
>>>> caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, profile rbd-read-only pool=templates
>>>> -------
>>>>
>>>> Can that be a reason?
>>>>
>>>> Also, can you let me know which logs to check while deleting snapshots
>>>> to see if it is snaptrimming?
>>>> I am sorry, I feel like I am pestering you too much. But in the mailing
>>>> lists I can see you have dealt with similar issues with snapshots, so I
>>>> think you can help me figure this mess out.
>>>>
>>>> Karun Josy
>>>>
>>>> On Sat, Jan 27, 2018 at 7:15 PM, David Turner <[email protected]> wrote:
>>>>
>>>>> Prove* a positive
>>>>>
>>>>> On Sat, Jan 27, 2018, 8:45 AM David Turner <[email protected]> wrote:
>>>>>
>>>>>> Unless you have things in your snap_trimq, your problem isn't snap
>>>>>> trimming. That is currently how you can check snap trimming, and you
>>>>>> say you're caught up.
>>>>>>
>>>>>> Are you certain that you are querying the correct pool for the
>>>>>> images you are snapshotting? You showed that you tested 4 different
>>>>>> pools. You should only need to check the pool with the images you
>>>>>> are dealing with.
>>>>>>
>>>>>> You can inversely prove a positive by changing your snap_trim
>>>>>> settings to not do any cleanup and seeing if the appropriate PGs
>>>>>> have anything in their queue.
>>>>>>
>>>>>> On Sat, Jan 27, 2018, 12:06 AM Karun Josy <[email protected]> wrote:
>>>>>>
>>>>>>> Are scrubbing and deep scrubbing necessary for the snaptrim
>>>>>>> operation to happen?
>>>>>>>
>>>>>>> Karun Josy
>>>>>>>
>>>>>>> On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thank you for your quick response!
>>>>>>>>
>>>>>>>> I used the command to fetch the snap_trimq from many PGs; however,
>>>>>>>> it seems they don't have any in the queue?
>>>>>>>>
>>>>>>>> For example:
>>>>>>>> ====================
>>>>>>>> $ echo $(( $(ceph pg 55.4a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>> 0
>>>>>>>> $ echo $(( $(ceph pg 55.5a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>> 0
>>>>>>>> $ echo $(( $(ceph pg 55.88 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>> 0
>>>>>>>> $ echo $(( $(ceph pg 55.55 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>> 0
>>>>>>>> $ echo $(( $(ceph pg 54.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>> 0
>>>>>>>> $ echo $(( $(ceph pg 34.1d query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>> 0
>>>>>>>> $ echo $(( $(ceph pg 1.3f query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>> 0
>>>>>>>> =====================
>>>>>>>>
>>>>>>>> While going through the PG query, I find that these PGs have no
>>>>>>>> value in the purged_snaps section either.
>>>>>>>> For example:
>>>>>>>> ceph pg 55.80 query
>>>>>>>> ---
>>>>>>>> {
>>>>>>>>     "peer": "83(3)",
>>>>>>>>     "pgid": "55.80s3",
>>>>>>>>     "last_update": "43360'15121927",
>>>>>>>>     "last_complete": "43345'15073146",
>>>>>>>>     "log_tail": "43335'15064480",
>>>>>>>>     "last_user_version": 15066124,
>>>>>>>>     "last_backfill": "MAX",
>>>>>>>>     "last_backfill_bitwise": 1,
>>>>>>>>     "purged_snaps": [],
>>>>>>>>     "history": {
>>>>>>>>         "epoch_created": 5950,
>>>>>>>>         "epoch_pool_created": 5950,
>>>>>>>>         "last_epoch_started": 43339,
>>>>>>>>         "last_interval_started": 43338,
>>>>>>>>         "last_epoch_clean": 43340,
>>>>>>>>         "last_interval_clean": 43338,
>>>>>>>>         "last_epoch_split": 0,
>>>>>>>>         "last_epoch_marked_full": 42032,
>>>>>>>>         "same_up_since": 43338,
>>>>>>>>         "same_interval_since": 43338,
>>>>>>>>         "same_primary_since": 43276,
>>>>>>>>         "last_scrub": "35299'13072533",
>>>>>>>>         "last_scrub_stamp": "2018-01-18 14:01:19.557972",
>>>>>>>>         "last_deep_scrub": "31372'12176860",
>>>>>>>>         "last_deep_scrub_stamp": "2018-01-15 12:21:17.025305",
>>>>>>>>         "last_clean_scrub_stamp": "2018-01-18 14:01:19.557972"
>>>>>>>>     },
>>>>>>>>
>>>>>>>> Not sure if it is related.
>>>>>>>>
>>>>>>>> The cluster is not open to any new clients; however, we see a
>>>>>>>> steady growth of space usage every day. In the worst-case scenario
>>>>>>>> it might grow faster than we can add more space, which would be
>>>>>>>> dangerous.
>>>>>>>>
>>>>>>>> Any help is really appreciated.
>>>>>>>>
>>>>>>>> Karun Josy
>>>>>>>>
>>>>>>>> On Fri, Jan 26, 2018 at 8:23 PM, David Turner <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> "snap_trimq": "[]",
>>>>>>>>>
>>>>>>>>> That is exactly what you're looking for to see how many objects a
>>>>>>>>> PG still has that need to be cleaned up. I think something like
>>>>>>>>> this should give you the number of objects in the snap_trimq for
>>>>>>>>> a PG.
>>>>>>>>>
>>>>>>>>> echo $(( $(ceph pg $pg query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>
>>>>>>>>> Note, I'm not at a computer and am typing this from my phone, so
>>>>>>>>> it's not pretty and I know of a few ways to do that better, but it
>>>>>>>>> should work all the same.
>>>>>>>>>
>>>>>>>>> For your needs, a visual inspection of several PGs should be
>>>>>>>>> sufficient to see if there is anything in the snap_trimq to begin
>>>>>>>>> with.
>>>>>>>>>
>>>>>>>>> On Fri, Jan 26, 2018, 9:18 AM Karun Josy <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi David,
>>>>>>>>>>
>>>>>>>>>> Thank you for the response. To be honest, I am afraid it is going
>>>>>>>>>> to be an issue in our cluster.
>>>>>>>>>> It seems snaptrim has not been going on for some time now, maybe
>>>>>>>>>> because we were expanding the cluster, adding nodes, for the past
>>>>>>>>>> few weeks.
>>>>>>>>>>
>>>>>>>>>> I would be really glad if you can guide me on how to overcome
>>>>>>>>>> this. The cluster has about 30 TB of data and 11 million objects,
>>>>>>>>>> with about 100 disks spread across 16 nodes. The version is
>>>>>>>>>> 12.2.2.
>>>>>>>>>> Searching through the mailing lists, I can see many cases where
>>>>>>>>>> performance was affected while snaptrimming.
>>>>>>>>>>
>>>>>>>>>> Can you help me figure out these:
>>>>>>>>>>
>>>>>>>>>> - How to find the snaptrim queue of a PG.
>>>>>>>>>> - Can snaptrim be started on just 1 PG?
>>>>>>>>>> - How can I make sure cluster IO performance is not affected?
>>>>>>>>>>   I read about osd_snap_trim_sleep; how can it be changed?
>>>>>>>>>>   Is this the command:
>>>>>>>>>>   ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.005'
>>>>>>>>>>   If yes, what is the recommended value that we can use?
>>>>>>>>>>
>>>>>>>>>> Also, what other parameters should we be concerned about? I would
>>>>>>>>>> really appreciate any suggestions.
>>>>>>>>>>
>>>>>>>>>> Below is a brief extract of a PG query:
>>>>>>>>>> ----------------------------
>>>>>>>>>> ceph pg 55.77 query
>>>>>>>>>> {
>>>>>>>>>>     "state": "active+clean",
>>>>>>>>>>     "snap_trimq": "[]",
>>>>>>>>>> ---
>>>>>>>>>>     "pgid": "55.77s7",
>>>>>>>>>>     "last_update": "43353'17222404",
>>>>>>>>>>     "last_complete": "42773'16814984",
>>>>>>>>>>     "log_tail": "42763'16812644",
>>>>>>>>>>     "last_user_version": 16814144,
>>>>>>>>>>     "last_backfill": "MAX",
>>>>>>>>>>     "last_backfill_bitwise": 1,
>>>>>>>>>>     "purged_snaps": [],
>>>>>>>>>>     "history": {
>>>>>>>>>>         "epoch_created": 5950,
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> Karun Josy
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 26, 2018 at 6:36 PM, David Turner <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> You may find the information in this ML thread useful:
>>>>>>>>>>> https://www.spinics.net/lists/ceph-users/msg41279.html
>>>>>>>>>>>
>>>>>>>>>>> It talks about a couple of ways to track your snaptrim queue.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 26, 2018 at 2:09 AM Karun Josy <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> We have set the noscrub and nodeep-scrub flags on a Ceph
>>>>>>>>>>>> cluster. When we delete snapshots, we are not seeing any change
>>>>>>>>>>>> in used space.
>>>>>>>>>>>>
>>>>>>>>>>>> I understand that Ceph OSDs delete data asynchronously, so
>>>>>>>>>>>> deleting a snapshot doesn't free up the disk space immediately.
>>>>>>>>>>>> But we have not seen any change for some time.
>>>>>>>>>>>>
>>>>>>>>>>>> What can be the possible reason? Any suggestions would be
>>>>>>>>>>>> really helpful, as the cluster size seems to be growing each
>>>>>>>>>>>> day even though snapshots are deleted.
>>>>>>>>>>>>
>>>>>>>>>>>> Karun
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
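For anyone landing on this thread: the grep/cut pipeline quoted above can be replaced with a small parser that is easier to sanity-check. This is a sketch, not a supported tool; it assumes `ceph pg <pgid> query` emits JSON whose `snap_trimq` field is a string-rendered interval set like "[1~3,a~2]" (hex snapids, with each interval written as first~count), matching the `"snap_trimq": "[]"` output shown in the thread.

```python
import json
import re

def snap_trimq_len(pg_query_output: str) -> int:
    """Count snapids queued for trimming in one PG.

    Assumes the `snap_trimq` field of `ceph pg <pgid> query` is rendered
    as "[first~count,first~count,...]" in hex, or "[]" when empty.
    """
    trimq = json.loads(pg_query_output).get("snap_trimq", "[]")
    # Each "first~count" interval contributes `count` queued snapids.
    return sum(int(count, 16) for count in re.findall(r"~([0-9a-f]+)", trimq))
```

Feeding it the output of `ceph pg 55.77 query` for each PG of the pool and seeing a persistent 0 everywhere matches what is reported in this thread: nothing is ever queued for trimming.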
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
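A complementary check to the per-PG queries discussed above is to look for PGs currently in a trimming state at all (Luminous reports `snaptrim` and `snaptrim_wait` as PG state components). The sketch below tallies them; it assumes `ceph pg dump pgs_brief -f json` returns a JSON array of objects carrying "pgid" and "state" keys, which may differ between releases.

```python
import json
from collections import Counter

def snaptrim_pg_states(pg_dump_output: str) -> Counter:
    """Tally PG states that include snaptrim or snaptrim_wait.

    Assumes a JSON array of {"pgid": ..., "state": ...} entries, as
    produced (on Luminous) by `ceph pg dump pgs_brief -f json`.
    """
    pgs = json.loads(pg_dump_output)
    return Counter(pg["state"] for pg in pgs if "snaptrim" in pg["state"])
```

An empty Counter right after deleting hundreds of snapshots would be consistent with the symptom described in this thread: trimming never even starts.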
