I've been working on a new tool that detects leaked rados objects. It will 
take some time for it to be merged into an official release, or even into the 
master branch, but if anyone would like to try it, it is in the 
wip-rgw-orphans branch.

At the moment I recommend not removing any object that the tool reports, but 
rather moving it to a different pool for backup (using the rados tool's cp 
command).
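
As a rough illustration of the backup step, here is a minimal Python sketch
that builds the copy command for one reported object. It assumes the rados
tool's "cp <obj-name> [target-obj]" command and its --target-pool option; the
function name and the backup pool name are made up for the example, and none
of this is part of the orphans tool itself.

```python
def backup_copy_command(src_pool, dst_pool, obj_name):
    """Build the argv for copying one object into a backup pool.

    Assumption: `rados -p <pool> cp <obj> <target-obj> --target-pool=<pool>`
    is available in the installed rados tool. The object keeps its name in
    the backup pool so it can be copied back later.
    """
    return [
        "rados",
        "-p", src_pool,
        "cp", obj_name, obj_name,
        "--target-pool=" + dst_pool,
    ]


# Hypothetical pool and object names, for illustration only.
cmd = backup_copy_command(".rgw.buckets", ".rgw.buckets.backup", "example_shadow_obj")
print(" ".join(cmd))
```

Passing the returned list to subprocess.run(cmd, check=True) would perform the
copy. Note that cp leaves the original object in place, so this covers only
the backup half of a move.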

The tool works in a few stages:
(1) list all the rados objects in the specified pool and store them in the 
repository
(2) list all bucket instances in the system and store them in the repository
(3) iterate through the bucket instances in the repository and list their 
(logical) objects; for each object, store the expected rados objects that 
build it
(4) compare the data from (1) and (3): stat each object that is in (1) but not 
in (3), and report it if it is older than $start_time - $stale_period
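
The comparison in stage (4) can be sketched roughly as follows. This is only
an in-memory Python illustration of the logic described above, not the tool's
actual code: the function name and arguments are hypothetical, and a plain
dict of modification times stands in for the per-object stat call.

```python
def find_orphan_candidates(pool_objects, expected_objects, mtimes,
                           start_time, stale_secs):
    """Return names that are in the pool listing but not expected by any bucket.

    pool_objects:     rados object names from stage (1)
    expected_objects: rados object names derived in stage (3)
    mtimes:           name -> modification time in seconds (stands in for stat)
    Only objects older than start_time - stale_secs are reported, so that
    objects written while the job is running are not flagged.
    """
    cutoff = start_time - stale_secs
    expected = set(expected_objects)
    return sorted(
        name for name in pool_objects
        if name not in expected and mtimes[name] < cutoff
    )


# Made-up data: one object is referenced, one is old and unreferenced,
# one is unreferenced but too recent to report.
pool = ["obj_head", "obj_shadow_1", "obj_shadow_2"]
expected = ["obj_head"]
mtimes = {"obj_head": 10, "obj_shadow_1": 100, "obj_shadow_2": 950}
print(find_orphan_candidates(pool, expected, mtimes, start_time=1000, stale_secs=100))
```

The real tool keeps the data from stages (1) to (3) in its repository rather
than in memory, which is what makes it workable on pools with very many
objects.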

There are lots of things that can go wrong with this, so we really need to 
be careful here.

The tool can be run by the following command:

$ radosgw-admin orphans find --pool=<data pool> --job-id=<name> \
    [--num-shards=<num shards>] [--orphan-stale-secs=<seconds>]

The tool can be stopped and restarted, and it will continue from the stage 
where it stopped. Note that some of the stages (specifically 1 and 2) will 
restart from the beginning of the stage, due to a system limitation.
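
To make the restart behaviour concrete, here is a small Python sketch of how
a resume point could be chosen. The names and the idea of a saved position
within a stage are assumptions for illustration; the tool's real job state
may look quite different.

```python
# Stages that cannot pick up mid-way and start over when the job is resumed.
RESTARTING_STAGES = {1, 2}


def resume_point(saved_stage, saved_position):
    """Return (stage, position) at which a restarted job would continue.

    saved_stage:    the stage the job was in when it stopped
    saved_position: hypothetical progress marker within that stage
    """
    if saved_stage in RESTARTING_STAGES:
        # Same stage, but from its beginning.
        return saved_stage, 0
    return saved_stage, saved_position


print(resume_point(2, 57))
print(resume_point(3, 57))
```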

In order to clean up a job's data:

$ radosgw-admin orphans finish --job-id=<name>

Note that jobs run in the radosgw-admin process context; the tool does not 
schedule a job on the radosgw process.

Please let me know of any issue you find.

Thanks,
Yehuda

----- Original Message -----
> From: "Ben Hines" <bhi...@gmail.com>
> To: "Ben" <b@benjackson.email>
> Cc: "Yehuda Sadeh-Weinraub" <yeh...@redhat.com>, "ceph-users" 
> <ceph-us...@ceph.com>
> Sent: Thursday, April 30, 2015 3:00:16 PM
> Subject: Re: [ceph-users] Shadow Files
> 
> Going to hold off on our 94.1 update for this issue
> 
> Hopefully this can make it into a 94.2 or a v95 git release.
> 
> -Ben
> 
> On Mon, Apr 27, 2015 at 2:32 PM, Ben < b@benjackson.email > wrote:
> 
> 
> How long are you thinking here?
> 
> We added more storage to our cluster to overcome these issues, and we can't
> keep throwing storage at it until the issues are fixed.
> 
> 
> On 28/04/15 01:49, Yehuda Sadeh-Weinraub wrote:
> 
> 
> It will get to the ceph mainline eventually. We're still reviewing and
> testing the fix, and there's more work to be done on the cleanup tool.
> 
> Yehuda
> 
> ----- Original Message -----
> 
> 
> From: "Ben" <b@benjackson.email>
> To: "Yehuda Sadeh-Weinraub" < yeh...@redhat.com >
> Cc: "ceph-users" < ceph-us...@ceph.com >
> Sent: Sunday, April 26, 2015 11:02:23 PM
> Subject: Re: [ceph-users] Shadow Files
> 
> Are these fixes going to make it into the repository versions of ceph,
> or will we be required to compile and install manually?
> 
> On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote:
> 
> 
> Yeah, that's definitely something that we'd address soon.
> 
> Yehuda
> 
> ----- Original Message -----
> 
> 
> From: "Ben" <b@benjackson.email>
> To: "Ben Hines" < bhi...@gmail.com >, "Yehuda Sadeh-Weinraub"
> < yeh...@redhat.com >
> Cc: "ceph-users" < ceph-us...@ceph.com >
> Sent: Friday, April 24, 2015 5:14:11 PM
> Subject: Re: [ceph-users] Shadow Files
> 
> Definitely need something to help clear out these old shadow files.
> 
> I'm sure our cluster has around 100TB of these shadow files.
> 
> I've written a script to go through known objects to get prefixes of objects
> that should exist to compare to ones that shouldn't, but the time it takes
> to do this over millions and millions of objects is just too long.
> 
> On 25/04/15 09:53, Ben Hines wrote:
> 
> 
> 
> When these are fixed it would be great to get good steps for listing /
> cleaning up any orphaned objects. I have suspicions this is affecting us.
> 
> thanks-
> 
> -Ben
> 
> On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub < yeh...@redhat.com > wrote:
> 
> 
> These ones:
> 
> http://tracker.ceph.com/issues/10295
> http://tracker.ceph.com/issues/11447
> 
> ----- Original Message -----
> 
> 
> From: "Ben Jackson" <b@benjackson.email>
> To: "Yehuda Sadeh-Weinraub" < yeh...@redhat.com >
> Cc: "ceph-users" < ceph-us...@ceph.com >
> Sent: Friday, April 24, 2015 3:06:02 PM
> Subject: Re: [ceph-users] Shadow Files
> 
> We were firefly, then we upgraded to giant, now we are on hammer.
> 
> What issues?
> 
> On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub < yeh...@redhat.com > wrote:
> 
> 
> What version are you running? There are two different issues that we were
> fixing this week, and we should have that upstream pretty soon.
> 
> Yehuda
> 
> ----- Original Message -----
> 
> 
> From: "Ben" <b@benjackson.email>
> To: "ceph-users" < ceph-us...@ceph.com >
> Cc: "Yehuda Sadeh-Weinraub" < yeh...@redhat.com >
> Sent: Thursday, April 23, 2015 7:42:06 PM
> Subject: [ceph-users] Shadow Files
> 
> We are still experiencing a problem with our gateway not properly
> clearing out shadow files.
> 
> I have done numerous tests where I have:
> -Uploaded a file of 1.5GB in size using s3browser application
> -Done an object stat on the file to get its prefix
> -Done rados ls -p .rgw.buckets | grep <prefix> to count the number of
> shadow files associated (in this case it is around 290 shadow files)
> -Deleted said file with s3browser
> -Performed a gc list, which shows the ~290 files listed
> -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep <prefix> to
> recount the shadow files only to be left with 290 files still there
> 
> From log output /var/log/ceph/radosgw.log, I can see the following when
> clicking DELETE (this appears 290 times)
> 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule->part_size=0
> 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule->part_size=0
> 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule->part_size=0
> 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule->part_size=0
> 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule->part_size=0
> 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule->part_size=0
> 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule->part_size=0
> 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule->part_size=0
> 
> In this same log, I also see the gc process saying it is removing said
> file (these records appear 290 times too)
> 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
> 2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
> 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
> 2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
> 2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
> 2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
> 2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
> 
> So even though it appears that the GC is processing its removal, the
> shadow files remain!
> 
> Please help!
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 