OK, at least it should be pretty straightforward to correct programmatically. I can throw together a quick program to clean your pools, but you will need to compile it yourself (with the librbd1-devel package installed), since unfortunately the rados Python API doesn't provide access to self-managed snapshots. You should also delete all existing image snapshots before running the code, since their state is suspect.
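The program itself is not included in this message. As a rough, minimal sketch of the kind of tool being described, and assuming the cleanup boils down to removing leaked self-managed snapshot IDs from the affected pool via librados' C++ selfmanaged_snap_remove() call (the call the Python binding does not expose), something along these lines could be compiled. The snap_clean.cc file name, the snap-clean binary name, and the idea of passing snapshot IDs on the command line are illustrative assumptions only; determining which IDs are actually safe to remove still has to be done against the cluster's own metadata.

----------------
// snap_clean.cc -- hypothetical sketch, NOT the actual cleanup tool referenced above.
// Removes self-managed snapshot IDs (passed on the command line) from a pool
// using the librados C++ API, since the Python rados binding lacks this call.
//
// Build (assumes the librados development headers are installed, e.g. pulled
// in alongside librbd1-devel):
//   g++ -std=c++11 -o snap-clean snap_clean.cc -lrados
#include <rados/librados.hpp>
#include <cstdint>
#include <cstdlib>
#include <iostream>

int main(int argc, char **argv)
{
  if (argc < 3) {
    std::cerr << "usage: " << argv[0] << " <pool> <snap-id> [<snap-id> ...]" << std::endl;
    return 1;
  }

  librados::Rados cluster;
  // Connect as client.admin using the default ceph.conf search path.
  if (cluster.init("admin") < 0 ||
      cluster.conf_read_file(nullptr) < 0 ||
      cluster.connect() < 0) {
    std::cerr << "failed to connect to the cluster" << std::endl;
    return 1;
  }

  librados::IoCtx ioctx;
  if (cluster.ioctx_create(argv[1], ioctx) < 0) {
    std::cerr << "failed to open pool " << argv[1] << std::endl;
    cluster.shutdown();
    return 1;
  }

  // Ask the OSDs to delete each self-managed snapshot ID given on the
  // command line; the space is then reclaimed asynchronously by snap trimming.
  for (int i = 2; i < argc; ++i) {
    uint64_t snap_id = std::strtoull(argv[i], nullptr, 0);
    int r = ioctx.selfmanaged_snap_remove(snap_id);
    std::cout << "selfmanaged_snap_remove(" << snap_id << ") = " << r << std::endl;
  }

  cluster.shutdown();
  return 0;
}
----------------

Again, treat that only as a sketch of the approach, not as the fix for this particular bug: the real cleanup first needs to work out which snapshot IDs were incorrectly registered against the base pool before removing anything.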
On Tue, Jan 30, 2018 at 10:37 AM, Karun Josy <karunjo...@gmail.com> wrote:
> Hi Jason,
>
> >> Was the base RBD pool used only for data-pool associated images
> Yes, it is only used for storing metadata of ecpool.
>
> We use 2 pools for erasure coding:
>
> ecpool - erasure coded datapool
> vm - replicated pool to store metadata
>
> Karun Josy
>
> On Tue, Jan 30, 2018 at 8:00 PM, Jason Dillaman <jdill...@redhat.com> wrote:
>
>> Unfortunately, any snapshots created prior to 12.2.2 against a separate
>> data pool were incorrectly associated to the base image pool instead of
>> the data pool. Was the base RBD pool used only for data-pool associated
>> images (i.e. all the snapshots that exist within the pool can be safely
>> deleted)?
>>
>> On Mon, Jan 29, 2018 at 11:50 AM, Karun Josy <karunjo...@gmail.com> wrote:
>>
>>> The problem we are experiencing is described here:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1497332
>>>
>>> However, we are running 12.2.2.
>>>
>>> Across our 6 ceph clusters, this one with the problem was first version
>>> 12.2.0, then upgraded to .1 and then to .2.
>>>
>>> The other 5 ceph installations started as version 12.2.1 and then
>>> updated to .2.
>>>
>>> Karun Josy
>>>
>>> On Mon, Jan 29, 2018 at 7:01 PM, Karun Josy <karunjo...@gmail.com> wrote:
>>>
>>>> Thank you for your response.
>>>>
>>>> We don't think there is an issue with the cluster being behind on snap
>>>> trimming. We just don't think snaptrim is occurring at all.
>>>>
>>>> We have 6 individual ceph clusters. When we delete old snapshots for
>>>> clients, we can see space being made available. In this particular one,
>>>> however, with 300 virtual machines and 28 TB of data (this is our
>>>> largest ceph), I can delete hundreds of snapshots and not a single
>>>> gigabyte becomes available after doing that.
>>>>
>>>> In our other 5, smaller Ceph clusters, we can see hundreds of gigabytes
>>>> becoming available again after doing massive deletions of snapshots.
>>>>
>>>> The Luminous GUI also never shows "snaptrimming" occurring in the EC
>>>> pool, while the other 5 Luminous clusters' GUIs will show snaptrimming
>>>> occurring for the EC pool. Within minutes we can see the additional
>>>> space becoming available.
>>>>
>>>> This isn't an issue of the trimming queue being behind schedule. The
>>>> system shows there is no trimming scheduled in the queue, ever.
>>>>
>>>> However, when using ceph du on particular virtual machines, we can see
>>>> that snapshots we delete are indeed no longer listed in ceph du's
>>>> output.
>>>>
>>>> So, they seem to be deleting. But the space is not being reclaimed.
>>>>
>>>> All clusters are on the same hardware. Some have more disks and servers
>>>> than others. The only major difference is that this particular Ceph
>>>> with this problem had the noscrub and nodeep-scrub flags set for many
>>>> weeks.
>>>>
>>>> Karun Josy
>>>>
>>>> On Mon, Jan 29, 2018 at 6:27 PM, David Turner <drakonst...@gmail.com> wrote:
>>>>
>>>>> I don't know why you keep asking the same question about snap
>>>>> trimming. You haven't shown any evidence that your cluster is behind
>>>>> on that. Have you looked into fstrim inside of your VMs?
>>>>>
>>>>> On Mon, Jan 29, 2018, 4:30 AM Karun Josy <karunjo...@gmail.com> wrote:
>>>>>
>>>>>> fast-diff map is not enabled for RBD images.
>>>>>> Can it be a reason for trimming not happening?
>>>>>>
>>>>>> Karun Josy
>>>>>>
>>>>>> On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Thank you for your reply! I really appreciate it.
>>>>>>>
>>>>>>> The images are in pool id 55. It is an erasure coded pool.
>>>>>>>
>>>>>>> ---------------
>>>>>>> $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>> 0
>>>>>>> $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>> 0
>>>>>>> $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>> 0
>>>>>>> ---------------
>>>>>>>
>>>>>>> The current snap_trim_sleep value is the default,
>>>>>>> "osd_snap_trim_sleep": "0.000000". I assume it means there is no
>>>>>>> delay. (Can't find any documentation related to it.)
>>>>>>> Will changing its value initiate snaptrimming, like
>>>>>>> ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05'
>>>>>>>
>>>>>>> Also, we are using an rbd user with the below profile. It is used
>>>>>>> while deleting snapshots.
>>>>>>> -------
>>>>>>> caps: [mon] profile rbd
>>>>>>> caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, profile rbd-read-only pool=templates
>>>>>>> -------
>>>>>>>
>>>>>>> Can it be a reason?
>>>>>>>
>>>>>>> Also, can you let me know which logs to check while deleting
>>>>>>> snapshots to see if it is snaptrimming?
>>>>>>> I am sorry, I feel like I am pestering you too much.
>>>>>>> But in the mailing lists I can see you have dealt with similar
>>>>>>> issues with snapshots, so I think you can help me figure this mess
>>>>>>> out.
>>>>>>>
>>>>>>> Karun Josy
>>>>>>>
>>>>>>> On Sat, Jan 27, 2018 at 7:15 PM, David Turner <drakonst...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Prove* a positive
>>>>>>>>
>>>>>>>> On Sat, Jan 27, 2018, 8:45 AM David Turner <drakonst...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Unless you have things in your snap_trimq, your problem isn't snap
>>>>>>>>> trimming. That is currently how you can check snap trimming, and
>>>>>>>>> you say you're caught up.
>>>>>>>>>
>>>>>>>>> Are you certain that you are querying the correct pool for the
>>>>>>>>> images you are snapshotting? You showed that you tested 4
>>>>>>>>> different pools. You should only need to check the pool with the
>>>>>>>>> images you are dealing with.
>>>>>>>>>
>>>>>>>>> You can inversely price a positive by changing your snap_trim
>>>>>>>>> settings to not do any cleanup and see if the appropriate PGs have
>>>>>>>>> anything in their q.
>>>>>>>>>
>>>>>>>>> On Sat, Jan 27, 2018, 12:06 AM Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Is scrubbing and deep scrubbing necessary for the snaptrim
>>>>>>>>>> operation to happen?
>>>>>>>>>>
>>>>>>>>>> Karun Josy
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you for your quick response!
>>>>>>>>>>>
>>>>>>>>>>> I used the command to fetch the snap_trimq from many PGs,
>>>>>>>>>>> however it seems they don't have anything in the queue?
>>>>>>>>>>>
>>>>>>>>>>> For eg:
>>>>>>>>>>> ====================
>>>>>>>>>>> $ echo $(( $(ceph pg 55.4a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 55.5a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 55.88 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 55.55 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 54.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 34.1d query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> $ echo $(( $(ceph pg 1.3f query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>> 0
>>>>>>>>>>> ====================
>>>>>>>>>>>
>>>>>>>>>>> While going through the PG query, I find that these PGs have no
>>>>>>>>>>> value in the purged_snaps section either.
>>>>>>>>>>> For eg:
>>>>>>>>>>> ceph pg 55.80 query
>>>>>>>>>>> --
>>>>>>>>>>> ---
>>>>>>>>>>> ---
>>>>>>>>>>> {
>>>>>>>>>>>     "peer": "83(3)",
>>>>>>>>>>>     "pgid": "55.80s3",
>>>>>>>>>>>     "last_update": "43360'15121927",
>>>>>>>>>>>     "last_complete": "43345'15073146",
>>>>>>>>>>>     "log_tail": "43335'15064480",
>>>>>>>>>>>     "last_user_version": 15066124,
>>>>>>>>>>>     "last_backfill": "MAX",
>>>>>>>>>>>     "last_backfill_bitwise": 1,
>>>>>>>>>>>     "purged_snaps": [],
>>>>>>>>>>>     "history": {
>>>>>>>>>>>         "epoch_created": 5950,
>>>>>>>>>>>         "epoch_pool_created": 5950,
>>>>>>>>>>>         "last_epoch_started": 43339,
>>>>>>>>>>>         "last_interval_started": 43338,
>>>>>>>>>>>         "last_epoch_clean": 43340,
>>>>>>>>>>>         "last_interval_clean": 43338,
>>>>>>>>>>>         "last_epoch_split": 0,
>>>>>>>>>>>         "last_epoch_marked_full": 42032,
>>>>>>>>>>>         "same_up_since": 43338,
>>>>>>>>>>>         "same_interval_since": 43338,
>>>>>>>>>>>         "same_primary_since": 43276,
>>>>>>>>>>>         "last_scrub": "35299'13072533",
>>>>>>>>>>>         "last_scrub_stamp": "2018-01-18 14:01:19.557972",
>>>>>>>>>>>         "last_deep_scrub": "31372'12176860",
>>>>>>>>>>>         "last_deep_scrub_stamp": "2018-01-15 12:21:17.025305",
>>>>>>>>>>>         "last_clean_scrub_stamp": "2018-01-18 14:01:19.557972"
>>>>>>>>>>>     },
>>>>>>>>>>>
>>>>>>>>>>> Not sure if it is related.
>>>>>>>>>>>
>>>>>>>>>>> The cluster is not open to any new clients. However, we see a
>>>>>>>>>>> steady growth of space usage every day.
>>>>>>>>>>> And in the worst-case scenario, it might grow faster than we can
>>>>>>>>>>> add more space, which will be dangerous.
>>>>>>>>>>>
>>>>>>>>>>> Any help is really appreciated.
>>>>>>>>>>>
>>>>>>>>>>> Karun Josy
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:23 PM, David Turner <drakonst...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> "snap_trimq": "[]",
>>>>>>>>>>>>
>>>>>>>>>>>> That is exactly what you're looking for to see how many objects
>>>>>>>>>>>> a PG still has that need to be cleaned up. I think something
>>>>>>>>>>>> like this should give you the number of objects in the
>>>>>>>>>>>> snap_trimq for a PG.
>>>>>>>>>>>>
>>>>>>>>>>>> echo $(( $(ceph pg $pg query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
>>>>>>>>>>>>
>>>>>>>>>>>> Note, I'm not at a computer and typing this from my phone, so
>>>>>>>>>>>> it's not pretty and I know of a few ways to do that better, but
>>>>>>>>>>>> that should work all the same.
>>>>>>>>>>>>
>>>>>>>>>>>> For your needs a visual inspection of several PGs should be
>>>>>>>>>>>> sufficient to see if there is anything in the snap_trimq to
>>>>>>>>>>>> begin with.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 26, 2018, 9:18 AM Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for the response. To be honest, I am afraid it is
>>>>>>>>>>>>> going to be an issue in our cluster.
>>>>>>>>>>>>> It seems snaptrim has not been going on for some time now,
>>>>>>>>>>>>> maybe because we were expanding the cluster, adding nodes, for
>>>>>>>>>>>>> the past few weeks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would be really glad if you can guide me on how to overcome
>>>>>>>>>>>>> this.
>>>>>>>>>>>>> The cluster has about 30 TB of data and 11 million objects,
>>>>>>>>>>>>> with about 100 disks spread across 16 nodes. The version is
>>>>>>>>>>>>> 12.2.2.
>>>>>>>>>>>>> Searching through the mailing lists I can see many cases where
>>>>>>>>>>>>> performance was affected while snaptrimming.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you help me figure out these:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - How to find the snaptrim queue of a PG.
>>>>>>>>>>>>> - Can snaptrim be started on just 1 PG?
>>>>>>>>>>>>> - How can I make sure cluster IO performance is not affected?
>>>>>>>>>>>>>   I read about osd_snap_trim_sleep; how can it be changed?
>>>>>>>>>>>>>   Is this the command: ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.005'
>>>>>>>>>>>>>
>>>>>>>>>>>>>   If yes, what is the recommended value that we can use?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, what other parameters should we be concerned about? I
>>>>>>>>>>>>> would really appreciate any suggestions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Below is a brief extract of a PG queried:
>>>>>>>>>>>>> ----------------------------
>>>>>>>>>>>>> ceph pg 55.77 query
>>>>>>>>>>>>> {
>>>>>>>>>>>>>     "state": "active+clean",
>>>>>>>>>>>>>     "snap_trimq": "[]",
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> ----
>>>>>>>>>>>>>
>>>>>>>>>>>>>     "pgid": "55.77s7",
>>>>>>>>>>>>>     "last_update": "43353'17222404",
>>>>>>>>>>>>>     "last_complete": "42773'16814984",
>>>>>>>>>>>>>     "log_tail": "42763'16812644",
>>>>>>>>>>>>>     "last_user_version": 16814144,
>>>>>>>>>>>>>     "last_backfill": "MAX",
>>>>>>>>>>>>>     "last_backfill_bitwise": 1,
>>>>>>>>>>>>>     "purged_snaps": [],
>>>>>>>>>>>>>     "history": {
>>>>>>>>>>>>>         "epoch_created": 5950,
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>
>>>>>>>>>>>>> Karun Josy
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 6:36 PM, David Turner <drakonst...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> You may find the information in this ML thread useful.
>>>>>>>>>>>>>> https://www.spinics.net/lists/ceph-users/msg41279.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It talks about a couple of ways to track your snaptrim queue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 2:09 AM Karun Josy <karunjo...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We have set the noscrub and nodeep-scrub flags on a ceph
>>>>>>>>>>>>>>> cluster.
>>>>>>>>>>>>>>> When we are deleting snapshots we are not seeing any change
>>>>>>>>>>>>>>> in space usage.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand that Ceph OSDs delete data asynchronously, so
>>>>>>>>>>>>>>> deleting a snapshot doesn't free up the disk space
>>>>>>>>>>>>>>> immediately. But we are not seeing any change for some time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What could the possible reason be? Any suggestions would be
>>>>>>>>>>>>>>> really helpful, as the cluster size seems to be growing each
>>>>>>>>>>>>>>> day even though snapshots are deleted.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Karun
>>
>> --
>> Jason
>

--
Jason