Hi Yehuda,

First run:

/opt/ceph/bin/radosgw-admin  --pool=.rgw.buckets --job-id=testing
ERROR: failed to open log pool ret=-2
job not found

Do I have to precreate some pool?


On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub <[email protected]> wrote:
>
> I've been working on a new tool that would detect leaked rados objects. It 
> will take some time for it to be merged into an official release, or even 
> into the master branch, but if anyone likes to play with it, it is in the 
> wip-rgw-orphans branch.
>
> At the moment I recommend not removing any object that the tool reports, but 
> rather moving it to a different pool for backup (using the rados tool's cp 
> command).
>
> The tool works in a few stages:
> (1) list all the rados objects in the specified pool, store in repository
> (2) list all bucket instances in the system, store in repository
> (3) iterate through bucket instances in repository, list (logical) objects, 
> for each object store the expected rados objects that build it
> (4) compare data from (1) and (3), each object that is in (1), but not in 
> (3), stat, if older than $start_time - $stale_period, report it
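
Stage (4) above amounts to a set difference followed by an age filter. A minimal Python sketch of that comparison, assuming the results of stages (1) and (3) are already in memory. The function name, the dict/set shapes and the mtime in seconds are illustrative assumptions, not the tool's actual code:

```python
def find_orphans(rados_objects, expected_objects, start_time, stale_period):
    """Return names that are in the pool listing but not expected by any bucket.

    rados_objects    -- dict mapping rados object name -> mtime (seconds), stage (1)
    expected_objects -- set of rados object names derived in stage (3)
    start_time       -- time the job started (seconds)
    stale_period     -- only report objects older than start_time - stale_period
    """
    cutoff = start_time - stale_period
    return sorted(
        name
        for name, mtime in rados_objects.items()
        if name not in expected_objects and mtime < cutoff
    )
```

The age filter is what keeps objects written while the scan is running from being reported as leaked.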
>
> There are lots of things that can go wrong with this, so we really need 
> to be careful here.
>
> The tool can be run by the following command:
>
> $ radosgw-admin orphans find --pool=<data pool> --job-id=<name>  
> [--num-shards=<num shards>] [--orphan-stale-secs=<seconds>]
>
> The tool can be stopped and restarted, and it will continue from the stage 
> where it stopped. Note that some of the stages will restart from the 
> beginning of that stage, due to system limitations (specifically stages 1 
> and 2).
>
> In order to clean up a job's data:
>
> $ radosgw-admin orphans finish --job-id=<name>
>
> Note that a job runs in the context of the radosgw-admin process; it does 
> not schedule a job on the radosgw process.
>
> Please let me know of any issue you find.
>
> Thanks,
> Yehuda
>
> ----- Original Message -----
>> From: "Ben Hines" <[email protected]>
>> To: "Ben" <[email protected]>
>> Cc: "Yehuda Sadeh-Weinraub" <[email protected]>, "ceph-users" 
>> <[email protected]>
>> Sent: Thursday, April 30, 2015 3:00:16 PM
>> Subject: Re: [ceph-users] Shadow Files
>>
>> Going to hold off on our 94.1 update for this issue
>>
>> Hopefully this can make it into a 94.2 or a v95 git release.
>>
>> -Ben
>>
>> On Mon, Apr 27, 2015 at 2:32 PM, Ben < [email protected] > wrote:
>>
>>
>> How long are you thinking here?
>>
>> We added more storage to our cluster to overcome these issues, and we can't
>> keep throwing storage at it until the issues are fixed.
>>
>>
>> On 28/04/15 01:49, Yehuda Sadeh-Weinraub wrote:
>>
>>
>> It will get to the ceph mainline eventually. We're still reviewing and
>> testing the fix, and there's more work to be done on the cleanup tool.
>>
>> Yehuda
>>
>> ----- Original Message -----
>>
>>
>> From: "Ben" <[email protected]>
>> To: "Yehuda Sadeh-Weinraub" < [email protected] >
>> Cc: "ceph-users" < [email protected] >
>> Sent: Sunday, April 26, 2015 11:02:23 PM
>> Subject: Re: [ceph-users] Shadow Files
>>
>> Are these fixes going to make it into the repository versions of ceph,
>> or will we be required to compile and install manually?
>>
>> On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote:
>>
>>
>> Yeah, that's definitely something that we'd address soon.
>>
>> Yehuda
>>
>> ----- Original Message -----
>>
>>
>> From: "Ben" <[email protected]>
>> To: "Ben Hines" < [email protected] >, "Yehuda Sadeh-Weinraub"
>> < [email protected] >
>> Cc: "ceph-users" < [email protected] >
>> Sent: Friday, April 24, 2015 5:14:11 PM
>> Subject: Re: [ceph-users] Shadow Files
>>
>> Definitely need something to help clear out these old shadow files.
>>
>> I'm sure our cluster has around 100TB of these shadow files.
>>
>> I've written a script to go through known objects to get prefixes of
>> objects that should exist to compare to ones that shouldn't, but the time
>> it takes to do this over millions and millions of objects is just too long.
>>
>> On 25/04/15 09:53, Ben Hines wrote:
>>
>>
>>
>> When these are fixed it would be great to get good steps for listing /
>> cleaning up any orphaned objects. I have suspicions this is affecting us.
>>
>> thanks-
>>
>> -Ben
>>
>> On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub < [email protected] > wrote:
>>
>>
>> These ones:
>>
>> http://tracker.ceph.com/issues/10295
>> http://tracker.ceph.com/issues/11447
>>
>> ----- Original Message -----
>>
>>
>> From: "Ben Jackson" <[email protected]>
>> To: "Yehuda Sadeh-Weinraub" < [email protected] >
>> Cc: "ceph-users" < [email protected] >
>> Sent: Friday, April 24, 2015 3:06:02 PM
>> Subject: Re: [ceph-users] Shadow Files
>>
>> We were firefly, then we upgraded to giant, now we are on hammer.
>>
>> What issues?
>>
>> On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub < [email protected] > wrote:
>>
>>
>> What version are you running? There are two different issues that we were
>> fixing this week, and we should have that upstream pretty soon.
>>
>> Yehuda
>>
>> ----- Original Message -----
>>
>>
>> From: "Ben" <[email protected]>
>> To: "ceph-users" < [email protected] >
>> Cc: "Yehuda Sadeh-Weinraub" < [email protected] >
>> Sent: Thursday, April 23, 2015 7:42:06 PM
>> Subject: [ceph-users] Shadow Files
>>
>> We are still experiencing a problem with our gateway not properly
>> clearing out shadow files.
>>
>> I have done numerous tests where I have:
>> -Uploaded a file of 1.5GB in size using s3browser application
>> -Done an object stat on the file to get its prefix
>> -Done rados ls -p .rgw.buckets | grep <prefix> to count the number of
>> shadow files associated (in this case it is around 290 shadow files)
>> -Deleted said file with s3browser
>> -Performed a gc list, which shows the ~290 files listed
>> -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep <prefix> to
>> recount the shadow files only to be left with 290 files still there
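
The before/after count in the steps above can be scripted so the two runs are directly comparable. A rough Python sketch, assuming the rados CLI is on the PATH; the function names are made up for illustration, and only the `rados ls -p <pool>` invocation comes from the steps themselves:

```python
import subprocess


def list_pool(pool):
    """Return the raw output of `rados ls -p <pool>` (one object name per line)."""
    result = subprocess.run(
        ["rados", "ls", "-p", pool],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def count_with_prefix(listing, prefix):
    """Count listed object names containing the prefix, like `grep <prefix> | wc -l`."""
    return sum(1 for line in listing.splitlines() if prefix in line)
```

Calling count_with_prefix(list_pool(".rgw.buckets"), prefix) once after the upload and again after the gc interval gives the two numbers to compare.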
>>
>> From log output /var/log/ceph/radosgw.log, I can see the following when
>> clicking DELETE (this appears 290 times)
>> 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule->part_size=0
>> 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule->part_size=0
>> 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule->part_size=0
>> 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule->part_size=0
>> 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule->part_size=0
>> 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule->part_size=0
>> 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule->part_size=0
>> 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule->part_size=0
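
One quick sanity check on the excerpt above is to pull out the ofs= values and look at the spacing between consecutive entries. A small Python sketch; the helper names are made up here, and the regex only assumes the "ofs=<number>" shape visible in these lines:

```python
import re

# Match "ofs=" only when it starts a word, so stripe_ofs= and part_ofs= are skipped.
# The optional whitespace tolerates entries that were wrapped by a mail client.
OFS_RE = re.compile(r"\bofs=\s*(\d+)")


def stripe_offsets(log_text):
    """Return the ofs= values found in the log text, in order of appearance."""
    return [int(m.group(1)) for m in OFS_RE.finditer(log_text)]


def offset_steps(offsets):
    """Return the distinct differences between consecutive offsets, sorted."""
    return sorted({b - a for a, b in zip(offsets, offsets[1:])})
```

For the eight entries quoted here, every step is 4194304 bytes (4 MiB), starting from 4718592.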
>>
>> In this same log, I also see the gc process saying it is removing said
>> file (these records appear 290 times too)
>> 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
>> 2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
>> 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
>> 2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
>> 2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
>> 2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
>> 2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:<objectname>
>>
>> So even though it appears that the GC is processing its removal, the
>> shadow files remain!
>>
>> Please help!
>> _______________________________________________
>> ceph-users mailing list
>> [email protected]
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com