David,

We would love some testing of the tool. Are you set up to compile and deploy 
Ceph changes? If your discrepancy is not caused by multipart objects leaked 
due to retries, the tool will report that nothing was fixed, and that is still 
a useful test.

Another variation of the multipart leak comes from multipart sessions that are 
never aborted or completed. That is something existing Ceph tools can already 
assist with. My fix is for objects that do not show up in standard bucket 
listings but are still accounted for in bucket stats.
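For the never-aborted variant, stale sessions can be found through the normal 
S3 API (ListMultipartUploads) and then passed to AbortMultipartUpload. As a 
rough sketch of the selection step only, assuming upload records shaped 
loosely like the S3 listing response (the field names and the 7-day cutoff 
here are illustrative, not from the tool):

```python
from datetime import datetime, timedelta, timezone

def stale_multipart_sessions(uploads, max_age=timedelta(days=7)):
    """Return upload sessions initiated more than max_age ago.

    `uploads` is a list of dicts shaped loosely like entries from an S3
    ListMultipartUploads response: {"Key": ..., "UploadId": ...,
    "Initiated": datetime}.  Hypothetical input shape; adapt to your client.
    """
    cutoff = datetime.now(timezone.utc) - max_age
    return [u for u in uploads if u["Initiated"] < cutoff]
```

Each returned (Key, UploadId) pair could then be fed to an abort call.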

On Aug 31, 2017, at 4:26 PM, David Turner 
<[email protected]> wrote:

Jewel 10.2.7.  I found a discrepancy in object counts for a multisite 
configuration, and it looks like orphaned multipart files might be causing 
it.  It doesn't look like this PR has received much attention.  Is there 
anything I can do to help you with testing/confirming a use case for this 
tool?

On Tue, Aug 29, 2017 at 5:28 PM William Schroeder 
<[email protected]> wrote:
Hello!

Our team finally had a chance to take another look at the problem identified by 
Brian Felton in http://tracker.ceph.com/issues/16767.  Basically, if any 
multipart objects are retried before an Abort or Complete, they remain on the 
system, taking up space and leaving their accounting in “radosgw-admin bucket 
stats”.  The problem is confirmed in Hammer and Jewel.

This past week, we put together some experimental code that removes those 
parts.  I am not sure if this code has any unintended consequences, so **I 
would greatly appreciate reviews of the new tool**!  I have tested it 
successfully against objects created and leaked in the ceph-demo Docker image 
for Jewel.  Here is a pull request with the patch:

https://github.com/ceph/ceph/pull/17349

Basically, we added a new subcommand for “bucket” called “fixmpleak”.  It 
lists objects in the “multipart” namespace, identifies those in the list that 
are not associated with any current .meta file, and then deletes them with a 
delete op, which corrects the accounting and reclaims the space on the OSDs.
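A minimal sketch of that selection step, assuming a simplified naming scheme 
in which a live session is represented by a "<key>.<uploadid>.meta" entry and 
each part by "<key>.<uploadid>.<part>"; the real RGW object names carry more 
structure, so treat this as illustrative only, not the tool's actual code:

```python
def leaked_parts(multipart_objects):
    """Identify multipart-namespace objects with no matching .meta object.

    `multipart_objects` is a list of object names from the bucket's
    "multipart" namespace.  Simplified assumption: live sessions appear as
    "<key>.<uploadid>.meta" and parts as "<key>.<uploadid>.<part>".
    """
    # Sessions that still have a .meta object are live; keep their parts.
    live = {name[:-len(".meta")] for name in multipart_objects
            if name.endswith(".meta")}
    # Anything else whose "<key>.<uploadid>" prefix has no .meta is leaked.
    return [name for name in multipart_objects
            if not name.endswith(".meta")
            and name.rsplit(".", 1)[0] not in live]
```

In the actual tool, each leaked name would then be handed to a delete op.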

This is not a preventative measure, which would be a lot more complicated, but 
we plan to run this tool hourly against all our buckets to keep things clean.
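For reference, an hourly pass could look something like the crontab entry 
below. The fixmpleak subcommand is the one from the PR above; the --bucket 
flag and the use of jq to parse "radosgw-admin bucket list" output are 
assumptions to verify against your build before relying on this:

```shell
# Hypothetical crontab entry: once an hour, run fixmpleak on every bucket.
# Assumes jq is installed and radosgw-admin bucket list emits a JSON array.
0 * * * * for b in $(radosgw-admin bucket list | jq -r '.[]'); do radosgw-admin bucket fixmpleak --bucket="$b"; done
```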

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com