OK, at least it should be pretty straightforward to correct programmatically. I can throw together a quick program to clean your pools, but you will need to compile it yourself (with the librbd1-devel package installed), since unfortunately the rados Python API doesn't provide access to self-managed snapshots. You should also delete all existing image snapshots before running the code, since their state is suspect.
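The program itself is not included in this message. As a rough, minimal sketch of the kind of tool being described, and assuming the cleanup boils down to removing leaked self-managed snapshot IDs from the affected pool via librados' C++ selfmanaged_snap_remove() call (the call the Python binding does not expose), something along these lines could be compiled. The snap_clean.cc file name, the snap-clean binary name, and the idea of passing snapshot IDs on the command line are illustrative assumptions only; determining which IDs are actually safe to remove still has to be done against the cluster's own metadata.

----------------
// snap_clean.cc -- hypothetical sketch, NOT the actual cleanup tool referenced above.
// Removes self-managed snapshot IDs (passed on the command line) from a pool
// using the librados C++ API, since the Python rados binding lacks this call.
//
// Build (assumes the librados development headers are installed, e.g. pulled
// in alongside librbd1-devel):
//   g++ -std=c++11 -o snap-clean snap_clean.cc -lrados
#include <rados/librados.hpp>
#include <cstdint>
#include <cstdlib>
#include <iostream>

int main(int argc, char **argv)
{
  if (argc < 3) {
    std::cerr << "usage: " << argv[0] << " <pool> <snap-id> [<snap-id> ...]" << std::endl;
    return 1;
  }

  librados::Rados cluster;
  // Connect as client.admin using the default ceph.conf search path.
  if (cluster.init("admin") < 0 ||
      cluster.conf_read_file(nullptr) < 0 ||
      cluster.connect() < 0) {
    std::cerr << "failed to connect to the cluster" << std::endl;
    return 1;
  }

  librados::IoCtx ioctx;
  if (cluster.ioctx_create(argv[1], ioctx) < 0) {
    std::cerr << "failed to open pool " << argv[1] << std::endl;
    cluster.shutdown();
    return 1;
  }

  // Ask the OSDs to delete each self-managed snapshot ID given on the
  // command line; the space is then reclaimed asynchronously by snap trimming.
  for (int i = 2; i < argc; ++i) {
    uint64_t snap_id = std::strtoull(argv[i], nullptr, 0);
    int r = ioctx.selfmanaged_snap_remove(snap_id);
    std::cout << "selfmanaged_snap_remove(" << snap_id << ") = " << r << std::endl;
  }

  cluster.shutdown();
  return 0;
}
----------------

Again, treat that only as a sketch of the approach, not as the fix for this particular bug: the real cleanup first needs to work out which snapshot IDs were incorrectly registered against the base pool before removing anything.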
On Tue, Jan 30, 2018 at 10:37 AM, Karun Josy <karunjo...@gmail.com> wrote:
> Hi Jason,
>
> >> Was the base RBD pool used only for data-pool associated images
> Yes, it is only used for storing metadata of ecpool.
>
> We use 2 pools for erasure coding:
>
> ecpool - erasure coded datapool
> vm - replicated pool to store metadata
>
> Karun Josy
>
> On Tue, Jan 30, 2018 at 8:00 PM, Jason Dillaman <jdill...@redhat.com> wrote:
>
>> Unfortunately, any snapshots created prior to 12.2.2 against a separate
>> data pool were incorrectly associated to the base image pool instead of
>> the data pool. Was the base RBD pool used only for data-pool associated
>> images (i.e. all the snapshots that exist within the pool can be safely
>> deleted)?
>>
>> On Mon, Jan 29, 2018 at 11:50 AM, Karun Josy <karunjo...@gmail.com> wrote:
>>
>>> The problem we are experiencing is described here:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1497332
>>>
>>> However, we are running 12.2.2.
>>>
>>> Across our 6 ceph clusters, this one with the problem was first version
>>> 12.2.0, then upgraded to .1 and then to .2.
>>>
>>> The other 5 ceph installations started as version 12.2.1 and then
>>> updated to .2.
>>>
>>> Karun Josy
>>>
>>> On Mon, Jan 29, 2018 at 7:01 PM, Karun Josy <karunjo...@gmail.com> wrote:
>>>
>>>> Thank you for your response.
>>>>
>>>> We don't think there is an issue with the cluster being behind on snap
>>>> trimming. We just don't think snaptrim is occurring at all.
>>>>
>>>> We have 6 individual ceph clusters. When we delete old snapshots for
>>>> clients, we can see space being made available. In this particular one,
>>>> however, with 300 virtual machines and 28 TB of data (this is our
>>>> largest ceph), I can delete hundreds of snapshots and not a single
>>>> gigabyte becomes available after doing that.
>>>>
>>>> In our other 5, smaller Ceph clusters, we can see hundreds of gigabytes
>>>> becoming available again after doing massive deletions of snapshots.
>>>>
>>>> The Luminous GUI also never shows "snaptrimming" occurring in the EC
>>>> pool, while the other 5 Luminous clusters' GUIs will show snaptrimming
>>>> occurring for the EC pool. Within minutes we can see the additional
>>>> space becoming available.
>>>>
>>>> This isn't an issue of the trimming queue being behind schedule. The
>>>> system shows there is no trimming scheduled in the queue, ever.
>>>>
>>>> However, when using ceph du on particular virtual machines, we can see
>>>> that snapshots we delete are indeed no longer listed in ceph du's
>>>> output.
>>>>
>>>> So, they seem to be deleting. But the space is not being reclaimed.
>>>>
>>>> All clusters are on the same hardware. Some have more disks and servers
>>>> than others. The only major difference is that this particular Ceph
>>>> with this problem had the noscrub and nodeep-scrub flags set for many
>>>> weeks.
>>>>
>>>> Karun Josy
>>>>
>>>> On Mon, Jan 29, 2018 at 6:27 PM, David Turner <drakonst...@gmail.com> wrote:
>>>>
>>>>> I don't know why you keep asking the same question about snap
>>>>> trimming. You haven't shown any evidence that your cluster is behind
>>>>> on that. Have you looked into fstrim inside of your VMs?
>>>>>
>>>>> On Mon, Jan 29, 2018, 4:30 AM Karun Josy <karunjo...@gmail.com> wrote:
>>>>>
>>>>>> fast-diff map is not enabled for RBD images.
>>>>>> Can it be a reason for trimming not happening?
>>>>>>
>>>>>> Karun Josy
>>>>>>
>>>>>> On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Thank you for your reply! I really appreciate it.
>>>>>>>
>>>>>>> The images are in pool id 55. It is an erasure coded pool.
>>>>>>>
>>>>>>> ---------------
>>>>>>> $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>> 0
>>>>>>> $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>> 0
>>>>>>> $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>> 0
>>>>>>> ---------------
>>>>>>>
>>>>>>> The current snap_trim_sleep value is the default,
>>>>>>> "osd_snap_trim_sleep": "0.000000". I assume it means there is no
>>>>>>> delay. (Can't find any documentation related to it.)
>>>>>>> Will changing its value initiate snaptrimming, like
>>>>>>> ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05'
>>>>>>>
>>>>>>> Also, we are using an rbd user with the below profile. It is used
>>>>>>> while deleting snapshots.
>>>>>>> -------
>>>>>>> caps: [mon] profile rbd
>>>>>>> caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, profile rbd-read-only pool=templates
>>>>>>> -------
>>>>>>>
>>>>>>> Can it be a reason?
>>>>>>>
>>>>>>> Also, can you let me know which logs to check while deleting
>>>>>>> snapshots to see if it is snaptrimming?
>>>>>>> I am sorry, I feel like I am pestering you too much.
>>>>>>> But in the mailing lists I can see you have dealt with similar
>>>>>>> issues with snapshots, so I think you can help me figure this mess
>>>>>>> out.
>>>>>>>
>>>>>>> Karun Josy
>>>>>>>
>>>>>>> On Sat, Jan 27, 2018 at 7:15 PM, David Turner <drakonst...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Prove* a positive
>>>>>>>>
>>>>>>>> On Sat, Jan 27, 2018, 8:45 AM David Turner <drakonst...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Unless you have things in your snap_trimq, your problem isn't snap
>>>>>>>>> trimming. That is currently how you can check snap trimming, and
>>>>>>>>> you say you're caught up.
>>>>>>>>>
>>>>>>>>> Are you certain that you are querying the correct pool for the
>>>>>>>>> images you are snapshotting? You showed that you tested 4
>>>>>>>>> different pools. You should only need to check the pool with the
>>>>>>>>> images you are dealing with.
>>>>>>>>>
>>>>>>>>> You can inversely price a positive by changing your snap_trim
>>>>>>>>> settings to not do any cleanup and see if the appropriate PGs have
>>>>>>>>> anything in their q.
>>>>>>>>>
>>>>>>>>> On Sat, Jan 27, 2018, 12:06 AM Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Is scrubbing and deep scrubbing necessary for the snaptrim
>>>>>>>>>> operation to happen?
>>>>>>>>>>
>>>>>>>>>> Karun Josy
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you for your quick response!
>>>>>>>>>>>
>>>>>>>>>>> I used the command to fetch the snap_trimq from many PGs,
>>>>>>>>>>> however it seems they don't have anything in the queue?
>>>>>>>>>>>
>>>>>>>>>>> For eg:
>>>>>>>>>>> ====================
>>>>>>>>>>> $ echo $(( $(ceph pg 55.4a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 55.5a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 55.88 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 55.55 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 54.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 34.1d query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 1.3f query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> ====================
>>>>>>>>>>>
>>>>>>>>>>> While going through the PG query, I find that these PGs have no
>>>>>>>>>>> value in the purged_snaps section either.
>>>>>>>>>>> For eg:
>>>>>>>>>>> ceph pg 55.80 query
>>>>>>>>>>> --
>>>>>>>>>>> ---
>>>>>>>>>>> ---
>>>>>>>>>>> {
>>>>>>>>>>>     "peer": "83(3)",
>>>>>>>>>>>     "pgid": "55.80s3",
>>>>>>>>>>>     "last_update": "43360'15121927",
>>>>>>>>>>>     "last_complete": "43345'15073146",
>>>>>>>>>>>     "log_tail": "43335'15064480",
>>>>>>>>>>>     "last_user_version": 15066124,
>>>>>>>>>>>     "last_backfill": "MAX",
>>>>>>>>>>>     "last_backfill_bitwise": 1,
>>>>>>>>>>>     "purged_snaps": [],
>>>>>>>>>>>     "history": {
>>>>>>>>>>>         "epoch_created": 5950,
>>>>>>>>>>>         "epoch_pool_created": 5950,
>>>>>>>>>>>         "last_epoch_started": 43339,
>>>>>>>>>>>         "last_interval_started": 43338,
>>>>>>>>>>>         "last_epoch_clean": 43340,
>>>>>>>>>>>         "last_interval_clean": 43338,
>>>>>>>>>>>         "last_epoch_split": 0,
>>>>>>>>>>>         "last_epoch_marked_full": 42032,
>>>>>>>>>>>         "same_up_since": 43338,
>>>>>>>>>>>         "same_interval_since": 43338,
>>>>>>>>>>>         "same_primary_since": 43276,
>>>>>>>>>>>         "last_scrub": "35299'13072533",
>>>>>>>>>>>         "last_scrub_stamp": "2018-01-18 14:01:19.557972",
>>>>>>>>>>>         "last_deep_scrub": "31372'12176860",
>>>>>>>>>>>         "last_deep_scrub_stamp": "2018-01-15 12:21:17.025305",
>>>>>>>>>>>         "last_clean_scrub_stamp": "2018-01-18 14:01:19.557972"
>>>>>>>>>>>     },
>>>>>>>>>>>
>>>>>>>>>>> Not sure if it is related.
>>>>>>>>>>>
>>>>>>>>>>> The cluster is not open to any new clients. However, we see a
>>>>>>>>>>> steady growth of space usage every day.
>>>>>>>>>>> And in the worst-case scenario, it might grow faster than we can
>>>>>>>>>>> add more space, which will be dangerous.
>>>>>>>>>>>
>>>>>>>>>>> Any help is really appreciated.
>>>>>>>>>>>
>>>>>>>>>>> Karun Josy
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:23 PM, David Turner <drakonst...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> "snap_trimq": "[]",
>>>>>>>>>>>>
>>>>>>>>>>>> That is exactly what you're looking for to see how many objects
>>>>>>>>>>>> a PG still has that need to be cleaned up. I think something
>>>>>>>>>>>> like this should give you the number of objects in the
>>>>>>>>>>>> snap_trimq for a PG.
>>>>>>>>>>>>
>>>>>>>>>>>> echo $(( $(ceph pg $pg query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>>>
>>>>>>>>>>>> Note, I'm not at a computer and typing this from my phone, so
>>>>>>>>>>>> it's not pretty and I know of a few ways to do that better, but
>>>>>>>>>>>> that should work all the same.
>>>>>>>>>>>>
>>>>>>>>>>>> For your needs a visual inspection of several PGs should be
>>>>>>>>>>>> sufficient to see if there is anything in the snap_trimq to
>>>>>>>>>>>> begin with.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 26, 2018, 9:18 AM Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for the response. To be honest, I am afraid it is
>>>>>>>>>>>>> going to be an issue in our cluster.
>>>>>>>>>>>>> It seems snaptrim has not been going on for some time now,
>>>>>>>>>>>>> maybe because we were expanding the cluster, adding nodes, for
>>>>>>>>>>>>> the past few weeks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would be really glad if you can guide me on how to overcome
>>>>>>>>>>>>> this.
>>>>>>>>>>>>> The cluster has about 30 TB of data and 11 million objects,
>>>>>>>>>>>>> with about 100 disks spread across 16 nodes. The version is
>>>>>>>>>>>>> 12.2.2.
>>>>>>>>>>>>> Searching through the mailing lists I can see many cases where
>>>>>>>>>>>>> performance was affected while snaptrimming.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you help me figure out these:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - How to find the snaptrim queue of a PG.
>>>>>>>>>>>>> - Can snaptrim be started on just 1 PG?
>>>>>>>>>>>>> - How can I make sure cluster IO performance is not affected?
>>>>>>>>>>>>>   I read about osd_snap_trim_sleep; how can it be changed?
>>>>>>>>>>>>>   Is this the command: ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.005'
>>>>>>>>>>>>>
>>>>>>>>>>>>>   If yes, what is the recommended value that we can use?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, what other parameters should we be concerned about? I
>>>>>>>>>>>>> would really appreciate any suggestions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Below is a brief extract of a PG queried:
>>>>>>>>>>>>> ----------------------------
>>>>>>>>>>>>> ceph pg 55.77 query
>>>>>>>>>>>>> {
>>>>>>>>>>>>>     "state": "active+clean",
>>>>>>>>>>>>>     "snap_trimq": "[]",
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> ----
>>>>>>>>>>>>>
>>>>>>>>>>>>>     "pgid": "55.77s7",
>>>>>>>>>>>>>     "last_update": "43353'17222404",
>>>>>>>>>>>>>     "last_complete": "42773'16814984",
>>>>>>>>>>>>>     "log_tail": "42763'16812644",
>>>>>>>>>>>>>     "last_user_version": 16814144,
>>>>>>>>>>>>>     "last_backfill": "MAX",
>>>>>>>>>>>>>     "last_backfill_bitwise": 1,
>>>>>>>>>>>>>     "purged_snaps": [],
>>>>>>>>>>>>>     "history": {
>>>>>>>>>>>>>         "epoch_created": 5950,
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>
>>>>>>>>>>>>> Karun Josy
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 6:36 PM, David Turner <drakonst...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> You may find the information in this ML thread useful.
>>>>>>>>>>>>>> https://www.spinics.net/lists/ceph-users/msg41279.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It talks about a couple of ways to track your snaptrim queue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 2:09 AM Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We have set the noscrub and nodeep-scrub flags on a ceph
>>>>>>>>>>>>>>> cluster.
>>>>>>>>>>>>>>> When we are deleting snapshots we are not seeing any change
>>>>>>>>>>>>>>> in space usage.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand that Ceph OSDs delete data asynchronously, so
>>>>>>>>>>>>>>> deleting a snapshot doesn't free up the disk space
>>>>>>>>>>>>>>> immediately. But we are not seeing any change for some time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What could the possible reason be? Any suggestions would be
>>>>>>>>>>>>>>> really helpful, as the cluster size seems to be growing each
>>>>>>>>>>>>>>> day even though snapshots are deleted.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Karun
>>
>> --
>> Jason
>

--
Jason