Re: [ceph-users] Shadow Files
- Original Message -
From: Daniel Hoffman daniel.hoff...@13andrew.com
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com
Sent: Sunday, May 10, 2015 5:03:22 PM
Subject: Re: [ceph-users] Shadow Files

Any updates on when this is going to be released?

Daniel

On Wed, May 6, 2015 at 3:51 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:

Yes, so it seems. The librados::nobjects_begin() call expects at least a Hammer (0.94) backend. Probably need to add a try/catch there to catch this issue, and maybe see if using a different API would be more compatible with older backends.

Yehuda

I cleaned up the commits a bit, but it needs to be reviewed, and it'll be nice to get some more testing on it before it goes into an official release. There's still the issue of running it against a firefly backend. I looked at backporting it to firefly, but it's not going to be trivial work, so I think the better use of time would be to get the hammer one to work against a firefly backend. There are some librados API quirks that we need to flush out first.

Yehuda
Re: [ceph-users] Shadow Files
Thanks. Can you please let me know the suitable/best git version/tree to pull to compile and use this feature/patch?

Thanks

On Tue, May 12, 2015 at 4:38 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:

From: Daniel Hoffman daniel.hoff...@13andrew.com
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com
Sent: Sunday, May 10, 2015 5:03:22 PM
Subject: Re: [ceph-users] Shadow Files

Any updates on when this is going to be released?

Daniel

On Wed, May 6, 2015 at 3:51 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:

Yes, so it seems. The librados::nobjects_begin() call expects at least a Hammer (0.94) backend. Probably need to add a try/catch there to catch this issue, and maybe see if using a different API would be more compatible with older backends.

Yehuda

I cleaned up the commits a bit, but it needs to be reviewed, and it'll be nice to get some more testing on it before it goes into an official release. There's still the issue of running it against a firefly backend. I looked at backporting it to firefly, but it's not going to be trivial work, so I think the better use of time would be to get the hammer one to work against a firefly backend. There are some librados API quirks that we need to flush out first.

Yehuda
Re: [ceph-users] Shadow Files
It's the wip-rgw-orphans branch.

- Original Message -
From: Daniel Hoffman daniel.hoff...@13andrew.com
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Ben b@benjackson.email, David Zafman dzaf...@redhat.com, ceph-users ceph-us...@ceph.com
Sent: Monday, May 11, 2015 4:30:11 PM
Subject: Re: [ceph-users] Shadow Files

Thanks. Can you please let me know the suitable/best git version/tree to pull to compile and use this feature/patch?

Thanks

On Tue, May 12, 2015 at 4:38 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:

From: Daniel Hoffman daniel.hoff...@13andrew.com
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com
Sent: Sunday, May 10, 2015 5:03:22 PM
Subject: Re: [ceph-users] Shadow Files

Any updates on when this is going to be released?

Daniel

On Wed, May 6, 2015 at 3:51 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:

Yes, so it seems. The librados::nobjects_begin() call expects at least a Hammer (0.94) backend. Probably need to add a try/catch there to catch this issue, and maybe see if using a different API would be more compatible with older backends.

Yehuda

I cleaned up the commits a bit, but it needs to be reviewed, and it'll be nice to get some more testing on it before it goes into an official release. There's still the issue of running it against a firefly backend. I looked at backporting it to firefly, but it's not going to be trivial work, so I think the better use of time would be to get the hammer one to work against a firefly backend. There are some librados API quirks that we need to flush out first.

Yehuda
Re: [ceph-users] Shadow Files
Any updates on when this is going to be released? Daniel On Wed, May 6, 2015 at 3:51 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: Yes, so it seems. The librados::nobjects_begin() call expects at least a Hammer (0.94) backend. Probably need to add a try/catch there to catch this issue, and maybe see if using a different api would be better compatible with older backends. Yehuda - Original Message - From: Anthony Alba ascanio.al...@gmail.com To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com Sent: Tuesday, May 5, 2015 10:14:38 AM Subject: Re: [ceph-users] Shadow Files Unfortunately it immediately aborted (running against a 0.80.9 Ceph). Does Ceph also have to be a 0.94 level? last error was -3 2015-05-06 01:11:11.710947 7f311dd15880 0 run(): building index of all objects in pool -2 2015-05-06 01:11:11.710995 7f311dd15880 1 -- 10.200.3.92:0/1001510 -- 10.200.3.32:6800/1870 -- osd_op(client.4065115.0:27 ^A/ [pgnls start_epoch 0] 11.0 ack+read +known_if_redirected e952) v5 -- ?+0 0x39a4e80 con 0x39a4aa0 -1 2015-05-06 01:11:11.712125 7f31026f4700 1 -- 10.200.3.92:0/1001510 == osd.1 10.200.3.32:6800/1870 1 osd_op_reply(27 [pgnls start_epoch 0] v934'6252 uv6252 ondisk = -22 ((22) Invalid argument)) v6 167+0+0 (3260127617 0 0) 0x7f30c4000a90 con 0x39a4aa0 0 2015-05-06 01:11:11.712652 7f311dd15880 -1 *** Caught signal (Aborted) ** in thread 7f311dd15880 2015-05-06 01:11:11.710947 7f311dd15880 0 run(): building index of all objects in pool terminate called after throwing an instance of 'std::runtime_error' what(): rados returned (22) Invalid argument *** Caught signal (Aborted) ** in thread 7f311dd15880 ceph version 0.94-1339-gc905d51 (c905d517c2c778a88b006302996591b60d167cb6) 1: radosgw-admin() [0x61e604] 2: (()+0xf130) [0x7f311a59f130] 3: (gsignal()+0x37) [0x7f31195d85d7] 4: (abort()+0x148) [0x7f31195d9cc8] 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f3119edc9b5] 6: (()+0x5e926) [0x7f3119eda926] 7: 
(()+0x5e953) [0x7f3119eda953] 8: (()+0x5eb73) [0x7f3119edab73] 9: (()+0x4d116) [0x7f311b606116] 10: (librados::IoCtx::nobjects_begin()+0x2e) [0x7f311b60c60e] 11: (RGWOrphanSearch::build_all_oids_index()+0x62) [0x516a02] 12: (RGWOrphanSearch::run()+0x1e3) [0x51ad23] 13: (main()+0xa430) [0x4fbc30] 14: (__libc_start_main()+0xf5) [0x7f31195c4af5] 15: radosgw-admin() [0x5028d9] 2015-05-06 01:11:11.712652 7f311dd15880 -1 *** Caught signal (Aborted) ** in thread 7f311dd15880 ceph version 0.94-1339-gc905d51 (c905d517c2c778a88b006302996591b60d167cb6) 1: radosgw-admin() [0x61e604] 2: (()+0xf130) [0x7f311a59f130] On Tue, May 5, 2015 at 10:41 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: Can you try creating the .log pool? Yehda - Original Message - From: Anthony Alba ascanio.al...@gmail.com To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com Sent: Tuesday, May 5, 2015 3:37:15 AM Subject: Re: [ceph-users] Shadow Files ...sorry clicked send to quickly /opt/ceph/bin/radosgw-admin orphans find --pool=.rgw.buckets --job-id=abcd ERROR: failed to open log pool ret=-2 job not found On Tue, May 5, 2015 at 6:36 PM, Anthony Alba ascanio.al...@gmail.com wrote: Hi Yehuda, First run: /opt/ceph/bin/radosgw-admin --pool=.rgw.buckets --job-id=testing ERROR: failed to open log pool ret=-2 job not found Do I have to precreate some pool? On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: I've been working on a new tool that would detect leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone likes to play with it, it is in the wip-rgw-orphans branch. At the moment I recommend to not remove any object that the tool reports, but rather move it to a different pool for backup (using the rados tool cp command). 
The tool works in a few stages:

(1) list all the rados objects in the specified pool, store in repository
(2) list all bucket instances in the system, store in repository
(3) iterate through bucket instances in repository, list (logical) objects, and for each object store the expected rados objects that build it
(4) compare data from (1) and (3); for each object that is in (1) but not in (3), stat it, and if it is older than $start_time - $stale_period, report it

There can be lots of things that can go wrong with this, so we really need to be careful here. The tool can be run by the following command:

$ radosgw-admin orphans find --pool=<data pool> --job-id=<name> [--num-shards=<num
Re: [ceph-users] Shadow Files
Unfortunately it immediately aborted (running against a 0.80.9 Ceph). Does Ceph also have to be at the 0.94 level?

last error was

-3 2015-05-06 01:11:11.710947 7f311dd15880 0 run(): building index of all objects in pool
-2 2015-05-06 01:11:11.710995 7f311dd15880 1 -- 10.200.3.92:0/1001510 -- 10.200.3.32:6800/1870 -- osd_op(client.4065115.0:27 ^A/ [pgnls start_epoch 0] 11.0 ack+read +known_if_redirected e952) v5 -- ?+0 0x39a4e80 con 0x39a4aa0
-1 2015-05-06 01:11:11.712125 7f31026f4700 1 -- 10.200.3.92:0/1001510 == osd.1 10.200.3.32:6800/1870 1 osd_op_reply(27 [pgnls start_epoch 0] v934'6252 uv6252 ondisk = -22 ((22) Invalid argument)) v6 167+0+0 (3260127617 0 0) 0x7f30c4000a90 con 0x39a4aa0
0 2015-05-06 01:11:11.712652 7f311dd15880 -1 *** Caught signal (Aborted) ** in thread 7f311dd15880

2015-05-06 01:11:11.710947 7f311dd15880 0 run(): building index of all objects in pool
terminate called after throwing an instance of 'std::runtime_error'
what(): rados returned (22) Invalid argument
*** Caught signal (Aborted) ** in thread 7f311dd15880
ceph version 0.94-1339-gc905d51 (c905d517c2c778a88b006302996591b60d167cb6)
1: radosgw-admin() [0x61e604]
2: (()+0xf130) [0x7f311a59f130]
3: (gsignal()+0x37) [0x7f31195d85d7]
4: (abort()+0x148) [0x7f31195d9cc8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f3119edc9b5]
6: (()+0x5e926) [0x7f3119eda926]
7: (()+0x5e953) [0x7f3119eda953]
8: (()+0x5eb73) [0x7f3119edab73]
9: (()+0x4d116) [0x7f311b606116]
10: (librados::IoCtx::nobjects_begin()+0x2e) [0x7f311b60c60e]
11: (RGWOrphanSearch::build_all_oids_index()+0x62) [0x516a02]
12: (RGWOrphanSearch::run()+0x1e3) [0x51ad23]
13: (main()+0xa430) [0x4fbc30]
14: (__libc_start_main()+0xf5) [0x7f31195c4af5]
15: radosgw-admin() [0x5028d9]
2015-05-06 01:11:11.712652 7f311dd15880 -1 *** Caught signal (Aborted) ** in thread 7f311dd15880
ceph version 0.94-1339-gc905d51 (c905d517c2c778a88b006302996591b60d167cb6)
1: radosgw-admin() [0x61e604]
2: (()+0xf130) [0x7f311a59f130]

On Tue, May 5, 2015 at
10:41 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: Can you try creating the .log pool? Yehda - Original Message - From: Anthony Alba ascanio.al...@gmail.com To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com Sent: Tuesday, May 5, 2015 3:37:15 AM Subject: Re: [ceph-users] Shadow Files ...sorry clicked send to quickly /opt/ceph/bin/radosgw-admin orphans find --pool=.rgw.buckets --job-id=abcd ERROR: failed to open log pool ret=-2 job not found On Tue, May 5, 2015 at 6:36 PM, Anthony Alba ascanio.al...@gmail.com wrote: Hi Yehuda, First run: /opt/ceph/bin/radosgw-admin --pool=.rgw.buckets --job-id=testing ERROR: failed to open log pool ret=-2 job not found Do I have to precreate some pool? On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: I've been working on a new tool that would detect leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone likes to play with it, it is in the wip-rgw-orphans branch. At the moment I recommend to not remove any object that the tool reports, but rather move it to a different pool for backup (using the rados tool cp command). The tool works in a few stages: (1) list all the rados objects in the specified pool, store in repository (2) list all bucket instances in the system, store in repository (3) iterate through bucket instances in repository, list (logical) objects, for each object store the expected rados objects that build it (4) compare data from (1) and (3), each object that is in (1), but not in (3), stat, if older than $start_time - $stale_period, report it There can be lot's of things that can go wrong with this, so we really need to be careful here. 
The tool can be run by the following command: $ radosgw-admin orphans find --pool=data pool --job-id=name [--num-shards=num shards] [--orphan-stale-secs=seconds] The tool can be stopped, and restarted, and it will continue from the stage where it stopped. Note that some of the stages will restart from the beginning (of the stages), due to system limitation (specifically 1, 2). In order to clean up a job's data: $ radosgw-admin orphans finish --job-id=name Note that the jobs run in the radosgw-admin process context, it does not schedule a job on the radosgw process. Please let me know of any issue you find. Thanks, Yehuda - Original Message - From: Ben Hines bhi...@gmail.com To: Ben b@benjackson.email Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com Sent: Thursday, April 30, 2015 3:00:16 PM Subject: Re: [ceph-users] Shadow Files Going to hold off on our 94.1 update for this issue Hopefully this can make it into a 94.2 or a v95 git release
Re: [ceph-users] Shadow Files
Yes, so it seems. The librados::nobjects_begin() call expects at least a Hammer (0.94) backend. Probably need to add a try/catch there to catch this issue, and maybe see if using a different api would be better compatible with older backends. Yehuda - Original Message - From: Anthony Alba ascanio.al...@gmail.com To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com Sent: Tuesday, May 5, 2015 10:14:38 AM Subject: Re: [ceph-users] Shadow Files Unfortunately it immediately aborted (running against a 0.80.9 Ceph). Does Ceph also have to be a 0.94 level? last error was -3 2015-05-06 01:11:11.710947 7f311dd15880 0 run(): building index of all objects in pool -2 2015-05-06 01:11:11.710995 7f311dd15880 1 -- 10.200.3.92:0/1001510 -- 10.200.3.32:6800/1870 -- osd_op(client.4065115.0:27 ^A/ [pgnls start_epoch 0] 11.0 ack+read +known_if_redirected e952) v5 -- ?+0 0x39a4e80 con 0x39a4aa0 -1 2015-05-06 01:11:11.712125 7f31026f4700 1 -- 10.200.3.92:0/1001510 == osd.1 10.200.3.32:6800/1870 1 osd_op_reply(27 [pgnls start_epoch 0] v934'6252 uv6252 ondisk = -22 ((22) Invalid argument)) v6 167+0+0 (3260127617 0 0) 0x7f30c4000a90 con 0x39a4aa0 0 2015-05-06 01:11:11.712652 7f311dd15880 -1 *** Caught signal (Aborted) ** in thread 7f311dd15880 2015-05-06 01:11:11.710947 7f311dd15880 0 run(): building index of all objects in pool terminate called after throwing an instance of 'std::runtime_error' what(): rados returned (22) Invalid argument *** Caught signal (Aborted) ** in thread 7f311dd15880 ceph version 0.94-1339-gc905d51 (c905d517c2c778a88b006302996591b60d167cb6) 1: radosgw-admin() [0x61e604] 2: (()+0xf130) [0x7f311a59f130] 3: (gsignal()+0x37) [0x7f31195d85d7] 4: (abort()+0x148) [0x7f31195d9cc8] 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f3119edc9b5] 6: (()+0x5e926) [0x7f3119eda926] 7: (()+0x5e953) [0x7f3119eda953] 8: (()+0x5eb73) [0x7f3119edab73] 9: (()+0x4d116) [0x7f311b606116] 10: (librados::IoCtx::nobjects_begin()+0x2e) 
[0x7f311b60c60e] 11: (RGWOrphanSearch::build_all_oids_index()+0x62) [0x516a02] 12: (RGWOrphanSearch::run()+0x1e3) [0x51ad23] 13: (main()+0xa430) [0x4fbc30] 14: (__libc_start_main()+0xf5) [0x7f31195c4af5] 15: radosgw-admin() [0x5028d9] 2015-05-06 01:11:11.712652 7f311dd15880 -1 *** Caught signal (Aborted) ** in thread 7f311dd15880 ceph version 0.94-1339-gc905d51 (c905d517c2c778a88b006302996591b60d167cb6) 1: radosgw-admin() [0x61e604] 2: (()+0xf130) [0x7f311a59f130] On Tue, May 5, 2015 at 10:41 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: Can you try creating the .log pool? Yehda - Original Message - From: Anthony Alba ascanio.al...@gmail.com To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com Sent: Tuesday, May 5, 2015 3:37:15 AM Subject: Re: [ceph-users] Shadow Files ...sorry clicked send to quickly /opt/ceph/bin/radosgw-admin orphans find --pool=.rgw.buckets --job-id=abcd ERROR: failed to open log pool ret=-2 job not found On Tue, May 5, 2015 at 6:36 PM, Anthony Alba ascanio.al...@gmail.com wrote: Hi Yehuda, First run: /opt/ceph/bin/radosgw-admin --pool=.rgw.buckets --job-id=testing ERROR: failed to open log pool ret=-2 job not found Do I have to precreate some pool? On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: I've been working on a new tool that would detect leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone likes to play with it, it is in the wip-rgw-orphans branch. At the moment I recommend to not remove any object that the tool reports, but rather move it to a different pool for backup (using the rados tool cp command). 
The tool works in a few stages: (1) list all the rados objects in the specified pool, store in repository (2) list all bucket instances in the system, store in repository (3) iterate through bucket instances in repository, list (logical) objects, for each object store the expected rados objects that build it (4) compare data from (1) and (3), each object that is in (1), but not in (3), stat, if older than $start_time - $stale_period, report it There can be lot's of things that can go wrong with this, so we really need to be careful here. The tool can be run by the following command: $ radosgw-admin orphans find --pool=data pool --job-id=name [--num-shards=num shards] [--orphan-stale-secs=seconds] The tool can be stopped, and restarted, and it will continue from the stage where it stopped. Note that some of the stages will restart from the beginning (of the stages), due to system limitation (specifically 1, 2
Re: [ceph-users] Shadow Files
...sorry, clicked send too quickly

/opt/ceph/bin/radosgw-admin orphans find --pool=.rgw.buckets --job-id=abcd
ERROR: failed to open log pool ret=-2
job not found

On Tue, May 5, 2015 at 6:36 PM, Anthony Alba ascanio.al...@gmail.com wrote:

Hi Yehuda,

First run:

/opt/ceph/bin/radosgw-admin --pool=.rgw.buckets --job-id=testing
ERROR: failed to open log pool ret=-2
job not found

Do I have to precreate some pool?

On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:

I've been working on a new tool that would detect leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone would like to play with it, it is in the wip-rgw-orphans branch.

At the moment I recommend not removing any object that the tool reports, but rather moving it to a different pool for backup (using the rados tool cp command).

The tool works in a few stages:

(1) list all the rados objects in the specified pool, store in repository
(2) list all bucket instances in the system, store in repository
(3) iterate through bucket instances in repository, list (logical) objects, and for each object store the expected rados objects that build it
(4) compare data from (1) and (3); for each object that is in (1) but not in (3), stat it, and if it is older than $start_time - $stale_period, report it

There can be lots of things that can go wrong with this, so we really need to be careful here.

The tool can be run by the following command:

$ radosgw-admin orphans find --pool=<data pool> --job-id=<name> [--num-shards=<num shards>] [--orphan-stale-secs=<seconds>]

The tool can be stopped and restarted, and it will continue from the stage where it stopped. Note that some of the stages will restart from the beginning (of the stage), due to system limitations (specifically 1, 2).
In order to clean up a job's data: $ radosgw-admin orphans finish --job-id=name Note that the jobs run in the radosgw-admin process context, it does not schedule a job on the radosgw process. Please let me know of any issue you find. Thanks, Yehuda - Original Message - From: Ben Hines bhi...@gmail.com To: Ben b@benjackson.email Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com Sent: Thursday, April 30, 2015 3:00:16 PM Subject: Re: [ceph-users] Shadow Files Going to hold off on our 94.1 update for this issue Hopefully this can make it into a 94.2 or a v95 git release. -Ben On Mon, Apr 27, 2015 at 2:32 PM, Ben b@benjackson.email wrote: How long are you thinking here? We added more storage to our cluster to overcome these issues, and we can't keep throwing storage at it until the issues are fixed. On 28/04/15 01:49, Yehuda Sadeh-Weinraub wrote: It will get to the ceph mainline eventually. We're still reviewing and testing the fix, and there's more work to be done on the cleanup tool. Yehuda - Original Message - From: Ben b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Sunday, April 26, 2015 11:02:23 PM Subject: Re: [ceph-users] Shadow Files Are these fixes going to make it into the repository versions of ceph, or will we be required to compile and install manually? On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote: Yeah, that's definitely something that we'd address soon. Yehuda - Original Message - From: Ben b@benjackson.email To: Ben Hines bhi...@gmail.com , Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 5:14:11 PM Subject: Re: [ceph-users] Shadow Files Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. 
I've written a script to go through known objects to get prefixes of objects that should exist to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long. On 25/04/15 09:53, Ben Hines wrote: When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us. thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda
Re: [ceph-users] Shadow Files
Hi Yehuda,

First run:

/opt/ceph/bin/radosgw-admin --pool=.rgw.buckets --job-id=testing
ERROR: failed to open log pool ret=-2
job not found

Do I have to precreate some pool?

On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:

I've been working on a new tool that would detect leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone would like to play with it, it is in the wip-rgw-orphans branch.

At the moment I recommend not removing any object that the tool reports, but rather moving it to a different pool for backup (using the rados tool cp command).

The tool works in a few stages:

(1) list all the rados objects in the specified pool, store in repository
(2) list all bucket instances in the system, store in repository
(3) iterate through bucket instances in repository, list (logical) objects, and for each object store the expected rados objects that build it
(4) compare data from (1) and (3); for each object that is in (1) but not in (3), stat it, and if it is older than $start_time - $stale_period, report it

There can be lots of things that can go wrong with this, so we really need to be careful here.

The tool can be run by the following command:

$ radosgw-admin orphans find --pool=<data pool> --job-id=<name> [--num-shards=<num shards>] [--orphan-stale-secs=<seconds>]

The tool can be stopped and restarted, and it will continue from the stage where it stopped. Note that some of the stages will restart from the beginning (of the stage), due to system limitations (specifically 1, 2).

In order to clean up a job's data:

$ radosgw-admin orphans finish --job-id=<name>

Note that the job runs in the radosgw-admin process context; it does not schedule a job on the radosgw process.

Please let me know of any issue you find.
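As a rough illustration of stage (4) in the description above, the comparison can be thought of as a set difference followed by an age filter. The sketch below is not the code in the wip-rgw-orphans branch. The function name, the sample object names, and the in-memory dict and set standing in for the repository are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def find_orphan_candidates(pool_objects, expected_objects, start_time, stale_period):
    """Return rados object names that no bucket accounts for and that are stale.

    pool_objects: dict of object name -> last-modified time (stage 1, plus the stat)
    expected_objects: set of object names derived from the bucket listings (stage 3)
    Both containers are hypothetical stand-ins for the tool's repository.
    """
    # Anything modified after this cutoff may belong to an upload still in flight.
    cutoff = start_time - stale_period
    return sorted(
        name
        for name, mtime in pool_objects.items()
        if name not in expected_objects and mtime < cutoff
    )


# Made-up data: one live object with a shadow part, one leaked part, one fresh part.
start = datetime(2015, 5, 5, 8, 0, 0)
pool = {
    "obj_head": datetime(2015, 4, 1),
    "obj_shadow_1": datetime(2015, 4, 1),
    "leaked_shadow_1": datetime(2015, 4, 1),
    "fresh_upload_part": datetime(2015, 5, 5, 7, 59, 0),
}
expected = {"obj_head", "obj_shadow_1"}

candidates = find_orphan_candidates(pool, expected, start, timedelta(days=1))
print(candidates)
```

The age filter is what keeps a part written just before the scan from being reported, which is presumably why the stale period exists. In line with the recommendation above, anything reported this way would be copied to a backup pool rather than deleted.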
Thanks, Yehuda - Original Message - From: Ben Hines bhi...@gmail.com To: Ben b@benjackson.email Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com Sent: Thursday, April 30, 2015 3:00:16 PM Subject: Re: [ceph-users] Shadow Files Going to hold off on our 94.1 update for this issue Hopefully this can make it into a 94.2 or a v95 git release. -Ben On Mon, Apr 27, 2015 at 2:32 PM, Ben b@benjackson.email wrote: How long are you thinking here? We added more storage to our cluster to overcome these issues, and we can't keep throwing storage at it until the issues are fixed. On 28/04/15 01:49, Yehuda Sadeh-Weinraub wrote: It will get to the ceph mainline eventually. We're still reviewing and testing the fix, and there's more work to be done on the cleanup tool. Yehuda - Original Message - From: Ben b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Sunday, April 26, 2015 11:02:23 PM Subject: Re: [ceph-users] Shadow Files Are these fixes going to make it into the repository versions of ceph, or will we be required to compile and install manually? On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote: Yeah, that's definitely something that we'd address soon. Yehuda - Original Message - From: Ben b@benjackson.email To: Ben Hines bhi...@gmail.com , Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 5:14:11 PM Subject: Re: [ceph-users] Shadow Files Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. I've written a script to go through known objects to get prefixes of objects that should exist to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long. On 25/04/15 09:53, Ben Hines wrote: When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. 
I have suspicions this is affecting us. thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still
Re: [ceph-users] Shadow Files
I've been working on a new tool that would detect leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone would like to play with it, it is in the wip-rgw-orphans branch.

At the moment I recommend not removing any object that the tool reports, but rather moving it to a different pool for backup (using the rados tool cp command).

The tool works in a few stages:

(1) list all the rados objects in the specified pool, store in repository
(2) list all bucket instances in the system, store in repository
(3) iterate through bucket instances in repository, list (logical) objects, and for each object store the expected rados objects that build it
(4) compare data from (1) and (3); for each object that is in (1) but not in (3), stat it, and if it is older than $start_time - $stale_period, report it

There can be lots of things that can go wrong with this, so we really need to be careful here.

The tool can be run by the following command:

$ radosgw-admin orphans find --pool=<data pool> --job-id=<name> [--num-shards=<num shards>] [--orphan-stale-secs=<seconds>]

The tool can be stopped and restarted, and it will continue from the stage where it stopped. Note that some of the stages will restart from the beginning (of the stage), due to system limitations (specifically 1, 2).

In order to clean up a job's data:

$ radosgw-admin orphans finish --job-id=<name>

Note that the job runs in the radosgw-admin process context; it does not schedule a job on the radosgw process.

Please let me know of any issue you find.

Thanks,
Yehuda

- Original Message -
From: Ben Hines bhi...@gmail.com
To: Ben b@benjackson.email
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com
Sent: Thursday, April 30, 2015 3:00:16 PM
Subject: Re: [ceph-users] Shadow Files

Going to hold off on our 94.1 update for this issue.

Hopefully this can make it into a 94.2 or a v95 git release.
-Ben On Mon, Apr 27, 2015 at 2:32 PM, Ben b@benjackson.email wrote: How long are you thinking here? We added more storage to our cluster to overcome these issues, and we can't keep throwing storage at it until the issues are fixed. On 28/04/15 01:49, Yehuda Sadeh-Weinraub wrote: It will get to the ceph mainline eventually. We're still reviewing and testing the fix, and there's more work to be done on the cleanup tool. Yehuda - Original Message - From: Ben b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Sunday, April 26, 2015 11:02:23 PM Subject: Re: [ceph-users] Shadow Files Are these fixes going to make it into the repository versions of ceph, or will we be required to compile and install manually? On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote: Yeah, that's definitely something that we'd address soon. Yehuda - Original Message - From: Ben b@benjackson.email To: Ben Hines bhi...@gmail.com , Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 5:14:11 PM Subject: Re: [ceph-users] Shadow Files Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. I've written a script to go through known objects to get prefixes of objects that should exist to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long. On 25/04/15 09:53, Ben Hines wrote: When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us. 
thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with out gateway not properly clearing out shadow files. I have done numerous tests where I have: -Uploaded a file of 1.5GB in size using s3browser application -Done an object stat on the file to get its prefix -Done
Re: [ceph-users] Shadow Files
How long are you thinking here? We added more storage to our cluster to overcome these issues, and we can't keep throwing storage at it until the issues are fixed. On 28/04/15 01:49, Yehuda Sadeh-Weinraub wrote: It will get to the ceph mainline eventually. We're still reviewing and testing the fix, and there's more work to be done on the cleanup tool. Yehuda - Original Message - From: Ben b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Sunday, April 26, 2015 11:02:23 PM Subject: Re: [ceph-users] Shadow Files Are these fixes going to make it into the repository versions of ceph, or will we be required to compile and install manually? On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote: Yeah, that's definitely something that we'd address soon. Yehuda - Original Message - From: Ben b@benjackson.email To: Ben Hines bhi...@gmail.com, Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 5:14:11 PM Subject: Re: [ceph-users] Shadow Files Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. I've written a script to go through known objects to get prefixes of objects that should exist to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long. On 25/04/15 09:53, Ben Hines wrote: When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us. 
thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with out gateway not properly clearing out shadow files. I have done numerous tests where I have: -Uploaded a file of 1.5GB in size using s3browser application -Done an object stat on the file to get its prefix -Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files) -Deleted said file with s3browser -Performed a gc list, which shows the ~290 files listed -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times) 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 
stripe_ofs=13107200 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule-part_size=0 In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too) 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.932103
Re: [ceph-users] Shadow Files
Are these fixes going to make it into the repository versions of ceph, or will we be required to compile and install manually? On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote: Yeah, that's definitely something that we'd address soon. Yehuda - Original Message - From: Ben b@benjackson.email To: Ben Hines bhi...@gmail.com, Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 5:14:11 PM Subject: Re: [ceph-users] Shadow Files Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. I've written a script to go through known objects to get prefixes of objects that should exist to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long. On 25/04/15 09:53, Ben Hines wrote: When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us. thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with our gateway not properly clearing out shadow files.
I have done numerous tests where I have: -Uploaded a file of 1.5GB in size using s3browser application -Done an object stat on the file to get its prefix -Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files) -Deleted said file with s3browser -Performed a gc list, which shows the ~290 files listed -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times) 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule-part_size=0 In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too) 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.928572 7f15be0ee700 0 
gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname So even though it appears that the GC is processing its removal, the shadow files remain! Please help! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com
Re: [ceph-users] Shadow Files
It will get to the ceph mainline eventually. We're still reviewing and testing the fix, and there's more work to be done on the cleanup tool. Yehuda - Original Message - From: Ben b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Sunday, April 26, 2015 11:02:23 PM Subject: Re: [ceph-users] Shadow Files Are these fixes going to make it into the repository versions of ceph, or will we be required to compile and install manually? On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote: Yeah, that's definitely something that we'd address soon. Yehuda - Original Message - From: Ben b@benjackson.email To: Ben Hines bhi...@gmail.com, Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 5:14:11 PM Subject: Re: [ceph-users] Shadow Files Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. I've written a script to go through known objects to get prefixes of objects that should exist to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long. On 25/04/15 09:53, Ben Hines wrote: When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us. thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? 
There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with out gateway not properly clearing out shadow files. I have done numerous tests where I have: -Uploaded a file of 1.5GB in size using s3browser application -Done an object stat on the file to get its prefix -Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files) -Deleted said file with s3browser -Performed a gc list, which shows the ~290 files listed -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times) 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 
stripe_ofs=29884416 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule-part_size=0 In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too) 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc
Re: [ceph-users] Shadow Files
Yeah, that's definitely something that we'd address soon. Yehuda - Original Message - From: Ben b@benjackson.email To: Ben Hines bhi...@gmail.com, Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 5:14:11 PM Subject: Re: [ceph-users] Shadow Files Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. I've written a script to go through known objects to get prefixes of objects that should exist to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long. On 25/04/15 09:53, Ben Hines wrote: When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us. thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with our gateway not properly clearing out shadow files.
I have done numerous tests where I have: -Uploaded a file of 1.5GB in size using s3browser application -Done an object stat on the file to get its prefix -Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files) -Deleted said file with s3browser -Performed a gc list, which shows the ~290 files listed -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times) 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule-part_size=0 In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too) 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.928572 7f15be0ee700 0 
gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname So even though it appears that the GC is processing its removal, the shadow files remain! Please help! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
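Editor's note: the RGWObjManifest::operator++() lines quoted above carry one ofs value per stripe, so the spacing between stripes can be read straight from the log. What follows is only a minimal Python sketch, assuming the log format is exactly as pasted in this thread. The names stripe_offsets and stripe_deltas are made up for illustration and are not part of Ceph.

```python
import re

# Matches the "ofs=<n>" field of the manifest iterator lines quoted in the thread.
OFS_RE = re.compile(r"RGWObjManifest::operator\+\+\(\): result: ofs=(\d+)")


def stripe_offsets(log_lines):
    """Return the ofs values found in the given log lines, in order."""
    offsets = []
    for line in log_lines:
        match = OFS_RE.search(line)
        if match:
            offsets.append(int(match.group(1)))
    return offsets


def stripe_deltas(offsets):
    """Return the byte distance between each pair of consecutive offsets."""
    return [later - earlier for earlier, later in zip(offsets, offsets[1:])]


# First three log lines from the original report.
SAMPLE = [
    "2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): "
    "result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule-part_size=0",
    "2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): "
    "result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule-part_size=0",
    "2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): "
    "result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule-part_size=0",
]

if __name__ == "__main__":
    print(stripe_deltas(stripe_offsets(SAMPLE)))
```

On the three sample lines the step between consecutive ofs values is a constant 4194304 bytes (4 MiB), so counting these lines gives a rough count of the stripes the delete walked over.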
Re: [ceph-users] Shadow Files
They definitely sound like the issues we are experiencing. When do you think an update will be available? On 25 Apr 2015 8:10 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with our gateway not properly clearing out shadow files.
I have done numerous tests where I have: -Uploaded a file of 1.5GB in size using s3browser application -Done an object stat on the file to get its prefix -Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files) -Deleted said file with s3browser -Performed a gc list, which shows the ~290 files listed -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times) 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule-part_size=0 In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too) 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.928572 7f15be0ee700 0 
gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname So even though it appears that the GC is processing its removal, the shadow files remain! Please help! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Shadow Files
Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. I've written a script to go through known objects to get prefixes of objects that should exist to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long. On 25/04/15 09:53, Ben Hines wrote: When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us. thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with our gateway not properly clearing out shadow files.
I have done numerous tests where I have: -Uploaded a file of 1.5GB in size using s3browser application -Done an object stat on the file to get its prefix -Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files) -Deleted said file with s3browser -Performed a gc list, which shows the ~290 files listed -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times) 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule-part_size=0 In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too) 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.928572 7f15be0ee700 0 
gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname So even though it appears that the GC is processing its removal, the shadow files remain! Please help
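Editor's note: the comparison script described at the top of this message (prefixes that should exist versus shadow objects that are actually in the pool) can be sketched as a single streaming pass with a set lookup. This is an illustration only, not the script used in the thread. It assumes shadow object names contain the marker "__shadow_" followed by the prefix and a trailing "_<n>" stripe number; check that against real "rados ls" output before relying on it. The helpers shadow_prefix and orphan_candidates are hypothetical names.

```python
def shadow_prefix(name, marker="__shadow_"):
    """Return the prefix part of a shadow object name, or None if not a shadow object.

    Assumed layout (not confirmed by the thread): <bucket>__shadow_<prefix>_<n>
    """
    _, found, tail = name.partition(marker)
    if not found:
        return None
    prefix, sep, _ = tail.rpartition("_")
    return prefix if sep else tail


def orphan_candidates(pool_listing, known_prefixes):
    """Yield shadow object names whose prefix is not in known_prefixes.

    pool_listing can be any iterable of names, e.g. lines read from a saved
    "rados ls -p .rgw.buckets" dump, so the listing is never held in memory.
    """
    known = set(known_prefixes)  # O(1) membership test per object
    for name in pool_listing:
        prefix = shadow_prefix(name.strip())
        if prefix is not None and prefix not in known:
            yield name.strip()


if __name__ == "__main__":
    # Made-up names, for illustration only.
    listing = ["b1_head", "b1__shadow_.abc_1", "b1__shadow_.abc_2", "b1__shadow_.zzz_1"]
    print(list(orphan_candidates(listing, {".abc"})))
```

Collecting the known prefixes once into a set and streaming the pool listing keeps the work to one pass over the objects, instead of one grep over the whole pool per object. Anything it reports is a candidate only and should be verified before deletion.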
Re: [ceph-users] Shadow Files
We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with out gateway not properly clearing out shadow files. I have done numerous tests where I have: -Uploaded a file of 1.5GB in size using s3browser application -Done an object stat on the file to get its prefix -Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files) -Deleted said file with s3browser -Performed a gc list, which shows the ~290 files listed -Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times) 2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): 
result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule-part_size=0 2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule-part_size=0 In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too) 2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname 2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname So even though it appears that the GC is processing its removal, the shadow files remain! Please help! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
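Editor's note: the symptom reported here (gc::process says it removed an object, yet a later "rados ls" still lists it) can be checked mechanically by cross-referencing the two outputs. A minimal Python sketch, assuming the gc log lines look exactly like the ones pasted above ("gc::process: removing <pool>:<object>"). The function names and the object names in the example are hypothetical.

```python
import re

# Pool and object are separated by the first colon after "removing ".
GC_RE = re.compile(r"gc::process: removing (\S+?):(\S+)")


def gc_removed(log_lines):
    """Return the set of (pool, object) pairs the gc log claims to have removed."""
    removed = set()
    for line in log_lines:
        match = GC_RE.search(line)
        if match:
            removed.add((match.group(1), match.group(2)))
    return removed


def still_present(removed, pool, pool_listing):
    """Return sorted object names that gc logged as removed but are still listed."""
    listed = {name.strip() for name in pool_listing}
    return sorted(obj for p, obj in removed if p == pool and obj in listed)


if __name__ == "__main__":
    # Made-up object names standing in for the "objectname" placeholder above.
    log = [
        "2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objA",
        "2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:objB",
    ]
    listing_after_gc = ["objB", "objC"]
    print(still_present(gc_removed(log), ".rgw.buckets", listing_after_gc))
```

Feeding it the radosgw log and a fresh pool listing taken after the gc run gives the exact objects that survived, which is easier to attach to a tracker issue than a count of 290.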
Re: [ceph-users] Shadow Files
When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us. thanks- -Ben On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: These ones: http://tracker.ceph.com/issues/10295 http://tracker.ceph.com/issues/11447 - Original Message - From: Ben Jackson b@benjackson.email To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: ceph-users ceph-us...@ceph.com Sent: Friday, April 24, 2015 3:06:02 PM Subject: Re: [ceph-users] Shadow Files We were firefly, then we upgraded to giant, now we are on hammer. What issues? On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon. Yehuda - Original Message - From: Ben b@benjackson.email To: ceph-users ceph-us...@ceph.com Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com Sent: Thursday, April 23, 2015 7:42:06 PM Subject: [ceph-users] Shadow Files We are still experiencing a problem with our gateway not properly clearing out shadow files.
I have done numerous tests where I have:
- Uploaded a file of 1.5GB in size using s3browser application
- Done an object stat on the file to get its prefix
- Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files)
- Deleted said file with s3browser
- Performed a gc list, which shows the ~290 files listed
- Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there

From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times)

2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule->part_size=0

In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too)

2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname

So even though it appears that the GC is processing its removal, the shadow files remain!

Please help!

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
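The recount in the steps above (list the pool, filter by the object's prefix, compare before and after the wait) can be scripted so the two listings are compared directly. A minimal sketch, assuming each `rados ls -p .rgw.buckets` run was captured as a list of lines; the function names and the sample prefix `abc123` are made up for illustration and are not part of any Ceph tool.

```python
def shadow_objects(rados_ls_lines, prefix):
    """Return the object names that contain `prefix`, like `grep prefix`."""
    return {line.strip() for line in rados_ls_lines if prefix in line}


def leftover_after_delete(before_lines, after_lines, prefix):
    """Objects matching `prefix` that appear in both listings.

    An empty result means everything that matched before the delete is gone.
    """
    before = shadow_objects(before_lines, prefix)
    after = shadow_objects(after_lines, prefix)
    return sorted(before & after)


# Hypothetical listings taken before the delete and after the GC wait.
before = [
    "default.1.1__shadow_.abc123_1",
    "default.1.1__shadow_.abc123_2",
    "default.1.1_unrelated-object",
]
after = [
    "default.1.1__shadow_.abc123_2",
    "default.1.1_unrelated-object",
]

print(leftover_after_delete(before, after, "abc123"))
```

If the printed list is still non-empty once the configured GC wait has passed, those are the names worth checking against the gc list output.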
Re: [ceph-users] Shadow Files
These ones:
http://tracker.ceph.com/issues/10295
http://tracker.ceph.com/issues/11447

- Original Message -
From: Ben Jackson b@benjackson.email
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: ceph-users ceph-us...@ceph.com
Sent: Friday, April 24, 2015 3:06:02 PM
Subject: Re: [ceph-users] Shadow Files

We were firefly, then we upgraded to giant, now we are on hammer. What issues?

On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:

What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon.

Yehuda

- Original Message -
From: Ben b@benjackson.email
To: ceph-users ceph-us...@ceph.com
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com
Sent: Thursday, April 23, 2015 7:42:06 PM
Subject: [ceph-users] Shadow Files

We are still experiencing a problem with our gateway not properly clearing out shadow files.

I have done numerous tests where I have:
- Uploaded a file of 1.5GB in size using s3browser application
- Done an object stat on the file to get its prefix
- Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files)
- Deleted said file with s3browser
- Performed a gc list, which shows the ~290 files listed
- Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there

From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times)

2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule->part_size=0

In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too)

2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname

So even though it appears that the GC is processing its removal, the shadow files remain!

Please help!

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Shadow Files
What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon.

Yehuda

- Original Message -
From: Ben b@benjackson.email
To: ceph-users ceph-us...@ceph.com
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com
Sent: Thursday, April 23, 2015 7:42:06 PM
Subject: [ceph-users] Shadow Files

We are still experiencing a problem with our gateway not properly clearing out shadow files.

I have done numerous tests where I have:
- Uploaded a file of 1.5GB in size using s3browser application
- Done an object stat on the file to get its prefix
- Done rados ls -p .rgw.buckets | grep prefix to count the number of shadow files associated (in this case it is around 290 shadow files)
- Deleted said file with s3browser
- Performed a gc list, which shows the ~290 files listed
- Waited 24 hours to redo the rados ls -p .rgw.buckets | grep prefix to recount the shadow files only to be left with 290 files still there

From log output /var/log/ceph/radosgw.log, I can see the following when clicking DELETE (this appears 290 times)

2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996570 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=17301504 stripe_ofs=17301504 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996576 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=21495808 stripe_ofs=21495808 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996581 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=25690112 stripe_ofs=25690112 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996586 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=29884416 stripe_ofs=29884416 part_ofs=0 rule->part_size=0
2015-04-24 10:43:29.996592 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=34078720 stripe_ofs=34078720 part_ofs=0 rule->part_size=0

In this same log, I also see the gc process saying it is removing said file (these records appear 290 times too)

2015-04-23 14:16:27.926952 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.928572 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.929636 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.930448 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.931226 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.932103 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname
2015-04-23 14:16:27.933470 7f15be0ee700 0 gc::process: removing .rgw.buckets:objectname

So even though it appears that the GC is processing its removal, the shadow files remain!

Please help!

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
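The `ofs=` values in the quoted log excerpt advance by a constant step, which shows the stride the gateway uses while walking the object's manifest on delete. A small sketch that pulls the offsets out of such lines and reports the distinct steps; the regular expression and function names are my own, written against the lines quoted above only, and the sample lines are shortened after `part_ofs=0`.

```python
import re

# Matches only the part of the line that carries the offsets.
OFS_RE = re.compile(
    r"RGWObjManifest::operator\+\+\(\): result: ofs=(\d+) stripe_ofs=(\d+)"
)


def manifest_offsets(log_lines):
    """Extract the ofs= values, in order, from radosgw log lines."""
    offsets = []
    for line in log_lines:
        match = OFS_RE.search(line)
        if match:
            offsets.append(int(match.group(1)))
    return offsets


def strides(offsets):
    """Distinct differences between consecutive offsets."""
    return sorted({b - a for a, b in zip(offsets, offsets[1:])})


sample = [
    "2015-04-24 10:43:29.996523 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=4718592 stripe_ofs=4718592 part_ofs=0",
    "2015-04-24 10:43:29.996557 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=8912896 stripe_ofs=8912896 part_ofs=0",
    "2015-04-24 10:43:29.996564 7f0b0afb5700 0 RGWObjManifest::operator++(): result: ofs=13107200 stripe_ofs=13107200 part_ofs=0",
]

# 4194304 bytes is 4 MiB, so each step in the excerpt covers one 4 MiB stripe.
print(strides(manifest_offsets(sample)))  # prints [4194304]
```

Counting the extracted offsets for one delete gives a number that can be compared with the shadow-object count from the pool listing.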
Re: [ceph-users] Shadow files
- Original Message -
From: Ben b@benjackson.email
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Craig Lewis cle...@centraldesktop.com, ceph-users ceph-us...@ceph.com
Sent: Tuesday, March 17, 2015 7:28:28 PM
Subject: Re: [ceph-users] Shadow files

None of this helps with trying to remove defunct shadow files which number in the 10s of millions.

Did it at least reflect that the garbage collection system works?

Is there a quick way to see which shadow files are safe to delete easily?

There's no easy process. If you know that a lot of the removed data is on buckets that shouldn't exist anymore then you could start by trying to identify that. You could do that by:

$ radosgw-admin metadata list bucket

then, for each bucket:

$ radosgw-admin metadata get bucket:<bucket name>

This will give you the bucket markers of all existing buckets. Each data object (head and shadow objects) is prefixed by bucket markers. Objects that don't have valid bucket markers can be removed. Note that I would first list all objects, then get the list of valid bucket markers, as the operation is racy and new buckets can be created in the meantime.

We did discuss a new garbage cleanup tool that will address your specific issue, and we have a design for it, but it's not there yet.

Yehuda

Remembering that there are MILLIONS of objects. We have a 320TB cluster which is 272TB full. Of this, we should only actually be seeing 190TB. There is 80TB of shadow files that should no longer exist.

On 2015-03-18 02:00, Yehuda Sadeh-Weinraub wrote:

- Original Message -
From: Ben b@benjackson.email
To: Craig Lewis cle...@centraldesktop.com
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com
Sent: Monday, March 16, 2015 3:38:42 PM
Subject: Re: [ceph-users] Shadow files

That's the thing. The peaks and troughs are in USERS BUCKETS only. The actual cluster usage does not go up and down, it just goes up up up. I would expect to see peaks and troughs much the same as the user buckets peaks and troughs on the overall cluster disk usage. But this is not the case.

We upgraded the cluster and radosgws to GIANT (0.87.1) yesterday, and now we are seeing a large number of misplaced(??) objects being moved around. Does this mean it has found all the shadow files that shouldn't exist anymore, and is deleting them? If so I would expect to start seeing overall cluster usage drop, but this hasn't happened yet.

No, I don't think so. Sounds like your cluster is recovering, and it happens in a completely different layer.

Any ideas?

Try running:

$ radosgw-admin gc list --include-all

This should be showing all the shadow objects that are pending for delete. Note that if you have a non-default radosgw configuration, make sure you run radosgw-admin using the same user and config that radosgw is running (e.g., add -n client.user appropriately), otherwise it might not look at the correct zone data.

You could create an object, identify the shadow objects for that object, remove it, check to see that the gc list command shows these shadow objects. Then, wait the configured time (2 hours?), and see if it was removed.

Yehuda

On 2015-03-17 06:12, Craig Lewis wrote:

Out of curiosity, what's the frequency of the peaks and troughs?

RadosGW has configs on how long it should wait after deleting before garbage collecting, how long between GC runs, and how many objects it can GC per run. The defaults are 2 hours, 1 hour, and 32 respectively. Search http://docs.ceph.com/docs/master/radosgw/config-ref/ [2] for rgw gc.

If your peaks and troughs have a frequency less than 1 hour, then GC is going to delay and alias the disk usage w.r.t. the object count.

If you have millions of objects, you probably need to tweak those values. If RGW is only GCing 32 objects an hour, it's never going to catch up.

Now that I think about it, I bet I'm having issues here too. I delete more than (32*24) objects per day...

On Sun, Mar 15, 2015 at 4:41 PM, Ben b@benjackson.email wrote:

It is either a problem with Ceph, Civetweb or something else in our configuration. But deletes in user buckets are still leaving a high number of old shadow files. Since we have millions and millions of objects, it is hard to reconcile what should and shouldn't exist.

Looking at our cluster usage, there are no troughs, it is just a rising peak. But when looking at users' data usage, we can see peaks and troughs as you would expect as data is deleted and added.

Our ceph version is 0.80.9.

Any ideas, please?

On 2015-03-13 02:25, Yehuda Sadeh-Weinraub wrote:

- Original Message -
From: Ben b@benjackson.email
To: ceph-us...@ceph.com
Sent: Wednesday, March 11, 2015 8:46:25 PM
Subject: Re: [ceph-users] Shadow files

Anyone
Re: [ceph-users] Shadow files
- Original Message -
From: Abhishek L abhishek.lekshma...@gmail.com
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com
Sent: Wednesday, March 18, 2015 10:54:37 AM
Subject: Re: [ceph-users] Shadow files

Yehuda Sadeh-Weinraub writes:

Is there a quick way to see which shadow files are safe to delete easily?

There's no easy process. If you know that a lot of the removed data is on buckets that shouldn't exist anymore then you could start by trying to identify that. You could do that by:

$ radosgw-admin metadata list bucket

then, for each bucket:

$ radosgw-admin metadata get bucket:<bucket name>

This will give you the bucket markers of all existing buckets. Each data object (head and shadow objects) is prefixed by bucket markers. Objects that don't have valid bucket markers can be removed. Note that I would first list all objects, then get the list of valid bucket markers, as the operation is racy and new buckets can be created in the meantime.

We did discuss a new garbage cleanup tool that will address your specific issue, and we have a design for it, but it's not there yet.

Could you share the design/ideas for making the cleanup tool? After an initial search I could only find two issues:

[1] http://tracker.ceph.com/issues/10342

It is sketched in here (10342), probably needs to be better formatted and documented.

Yehuda

[2] http://tracker.ceph.com/issues/9604

though there are not many details there to get started.

--
Abhishek

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Shadow files
Yehuda Sadeh-Weinraub writes:

Is there a quick way to see which shadow files are safe to delete easily?

There's no easy process. If you know that a lot of the removed data is on buckets that shouldn't exist anymore then you could start by trying to identify that. You could do that by:

$ radosgw-admin metadata list bucket

then, for each bucket:

$ radosgw-admin metadata get bucket:<bucket name>

This will give you the bucket markers of all existing buckets. Each data object (head and shadow objects) is prefixed by bucket markers. Objects that don't have valid bucket markers can be removed. Note that I would first list all objects, then get the list of valid bucket markers, as the operation is racy and new buckets can be created in the meantime.

We did discuss a new garbage cleanup tool that will address your specific issue, and we have a design for it, but it's not there yet.

Could you share the design/ideas for making the cleanup tool? After an initial search I could only find two issues:

[1] http://tracker.ceph.com/issues/10342
[2] http://tracker.ceph.com/issues/9604

though there are not many details there to get started.

--
Abhishek

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
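The bucket-marker check quoted above boils down to a prefix test: take the pool listing first, collect the valid markers afterwards, and flag any object whose name does not start with a valid marker. A minimal sketch under stated assumptions: the marker is taken to be followed by an underscore in the object name, the marker values and object names below are invented, and the function name is hypothetical. The result is a list of candidates to review, not a delete list.

```python
def orphan_candidates(object_names, valid_markers):
    """Return object names that do not start with any valid bucket marker.

    `object_names` should come from a pool listing taken BEFORE the markers
    were collected, so buckets created in between are not flagged by mistake.
    """
    if not valid_markers:
        # With no markers every object would look orphaned; refuse instead.
        raise ValueError("no valid bucket markers supplied")
    # Appending "_" keeps marker "default.10.1" from matching "default.10.12_...".
    prefixes = tuple(marker + "_" for marker in valid_markers)
    return [name for name in object_names if not name.startswith(prefixes)]


# Hypothetical pool listing and marker set.
objects = [
    "default.10.1_photo.jpg",
    "default.10.1__shadow_.x_1",
    "default.10.12_report.pdf",
    "default.99.7__shadow_.y_1",
]
markers = {"default.10.1"}

print(orphan_candidates(objects, markers))
```

Anything else stored in the same pool that is not named after a bucket marker would also show up in the output, so each candidate needs checking before it is removed.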
Re: [ceph-users] Shadow files
None of this helps with trying to remove defunct shadow files which number in the 10s of millions.

Is there a quick way to see which shadow files are safe to delete easily? Remembering that there are MILLIONS of objects.

We have a 320TB cluster which is 272TB full. Of this, we should only actually be seeing 190TB. There is 80TB of shadow files that should no longer exist.

On 2015-03-18 02:00, Yehuda Sadeh-Weinraub wrote:

- Original Message -
From: Ben b@benjackson.email
To: Craig Lewis cle...@centraldesktop.com
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com
Sent: Monday, March 16, 2015 3:38:42 PM
Subject: Re: [ceph-users] Shadow files

That's the thing. The peaks and troughs are in USERS BUCKETS only. The actual cluster usage does not go up and down, it just goes up up up. I would expect to see peaks and troughs much the same as the user buckets peaks and troughs on the overall cluster disk usage. But this is not the case.

We upgraded the cluster and radosgws to GIANT (0.87.1) yesterday, and now we are seeing a large number of misplaced(??) objects being moved around. Does this mean it has found all the shadow files that shouldn't exist anymore, and is deleting them? If so I would expect to start seeing overall cluster usage drop, but this hasn't happened yet.

No, I don't think so. Sounds like your cluster is recovering, and it happens in a completely different layer.

Any ideas?

Try running:

$ radosgw-admin gc list --include-all

This should be showing all the shadow objects that are pending for delete. Note that if you have a non-default radosgw configuration, make sure you run radosgw-admin using the same user and config that radosgw is running (e.g., add -n client.user appropriately), otherwise it might not look at the correct zone data.

You could create an object, identify the shadow objects for that object, remove it, check to see that the gc list command shows these shadow objects. Then, wait the configured time (2 hours?), and see if it was removed.

Yehuda

On 2015-03-17 06:12, Craig Lewis wrote:

Out of curiosity, what's the frequency of the peaks and troughs?

RadosGW has configs on how long it should wait after deleting before garbage collecting, how long between GC runs, and how many objects it can GC per run. The defaults are 2 hours, 1 hour, and 32 respectively. Search http://docs.ceph.com/docs/master/radosgw/config-ref/ [2] for rgw gc.

If your peaks and troughs have a frequency less than 1 hour, then GC is going to delay and alias the disk usage w.r.t. the object count.

If you have millions of objects, you probably need to tweak those values. If RGW is only GCing 32 objects an hour, it's never going to catch up.

Now that I think about it, I bet I'm having issues here too. I delete more than (32*24) objects per day...

On Sun, Mar 15, 2015 at 4:41 PM, Ben b@benjackson.email wrote:

It is either a problem with Ceph, Civetweb or something else in our configuration. But deletes in user buckets are still leaving a high number of old shadow files. Since we have millions and millions of objects, it is hard to reconcile what should and shouldn't exist.

Looking at our cluster usage, there are no troughs, it is just a rising peak. But when looking at users' data usage, we can see peaks and troughs as you would expect as data is deleted and added.

Our ceph version is 0.80.9.

Any ideas, please?

On 2015-03-13 02:25, Yehuda Sadeh-Weinraub wrote:

- Original Message -
From: Ben b@benjackson.email
To: ceph-us...@ceph.com
Sent: Wednesday, March 11, 2015 8:46:25 PM
Subject: Re: [ceph-users] Shadow files

Anyone got any info on this? Is it safe to delete shadow files?

It depends. Shadow files are badly named objects that represent part of the object's data. They are only safe to remove if you know that the corresponding objects no longer exist.

Yehuda

On 2015-03-11 10:03, Ben wrote:

We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be?

We are using civetweb with radosgw, with tengine ssl proxy in front of it.

Any advice please

Thanks

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [1]

Links:
--
[1] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[2] http://docs.ceph.com/docs/master/radosgw/config-ref/

___ ceph-users mailing list ceph-users
Re: [ceph-users] Shadow files
Out of curiosity, what's the frequency of the peaks and troughs?

RadosGW has configs on how long it should wait after deleting before garbage collecting, how long between GC runs, and how many objects it can GC per run. The defaults are 2 hours, 1 hour, and 32 respectively. Search http://docs.ceph.com/docs/master/radosgw/config-ref/ for rgw gc.

If your peaks and troughs have a frequency less than 1 hour, then GC is going to delay and alias the disk usage w.r.t. the object count.

If you have millions of objects, you probably need to tweak those values. If RGW is only GCing 32 objects an hour, it's never going to catch up.

Now that I think about it, I bet I'm having issues here too. I delete more than (32*24) objects per day...

On Sun, Mar 15, 2015 at 4:41 PM, Ben b@benjackson.email wrote:

It is either a problem with Ceph, Civetweb or something else in our configuration. But deletes in user buckets are still leaving a high number of old shadow files. Since we have millions and millions of objects, it is hard to reconcile what should and shouldn't exist.

Looking at our cluster usage, there are no troughs, it is just a rising peak. But when looking at users' data usage, we can see peaks and troughs as you would expect as data is deleted and added.

Our ceph version is 0.80.9.

Any ideas, please?

On 2015-03-13 02:25, Yehuda Sadeh-Weinraub wrote:

- Original Message -
From: Ben b@benjackson.email
To: ceph-us...@ceph.com
Sent: Wednesday, March 11, 2015 8:46:25 PM
Subject: Re: [ceph-users] Shadow files

Anyone got any info on this? Is it safe to delete shadow files?

It depends. Shadow files are badly named objects that represent part of the object's data. They are only safe to remove if you know that the corresponding objects no longer exist.

Yehuda

On 2015-03-11 10:03, Ben wrote:

We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be?

We are using civetweb with radosgw, with tengine ssl proxy in front of it.

Any advice please

Thanks

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
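The two time-based defaults in Craig's message above (2 hours before a deleted object becomes eligible, 1 hour between GC runs) bound how soon space can come back after a delete. A rough sketch of that arithmetic, assuming GC cycles start on schedule and each cycle gets through its backlog, which is the very thing in question in this thread; the function name is invented for illustration.

```python
from datetime import datetime, timedelta


def gc_window(deleted_at,
              min_wait=timedelta(hours=2),
              period=timedelta(hours=1)):
    """Return (earliest, latest) time a GC run should pick up a deleted object.

    earliest: the delete time plus the minimum wait.
    latest:   one full GC period later, in case a run has just been missed.
    """
    earliest = deleted_at + min_wait
    latest = earliest + period
    return earliest, latest


deleted = datetime(2015, 3, 16, 10, 0)
earliest, latest = gc_window(deleted)
print(earliest, latest)
```

Under these assumptions a delete at 10:00 should be collected between 12:00 and 13:00, so shadow objects that are still present a day later point at something other than the wait settings.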
Re: [ceph-users] Shadow files
On Mon, Mar 16, 2015 at 12:12 PM, Craig Lewis cle...@centraldesktop.com wrote:

Out of curiosity, what's the frequency of the peaks and troughs?

RadosGW has configs on how long it should wait after deleting before garbage collecting, how long between GC runs, and how many objects it can GC per run. The defaults are 2 hours, 1 hour, and 32 respectively. Search http://docs.ceph.com/docs/master/radosgw/config-ref/ for rgw gc.

If your peaks and troughs have a frequency less than 1 hour, then GC is going to delay and alias the disk usage w.r.t. the object count.

If you have millions of objects, you probably need to tweak those values. If RGW is only GCing 32 objects an hour, it's never going to catch up.

Now that I think about it, I bet I'm having issues here too. I delete more than (32*24) objects per day...

Uh, that's not quite what rgw_gc_max_objs means. That param configures how the garbage control data objects and internal classes are sharded, and each grouping will only delete one object at a time. So it controls the parallelism, but not the total number of objects!

Also, Yehuda says that changing this can be a bit dangerous because it currently needs to be consistent across any program doing or generating GC work.

-Greg

On Sun, Mar 15, 2015 at 4:41 PM, Ben b@benjackson.email wrote:

It is either a problem with Ceph, Civetweb or something else in our configuration. But deletes in user buckets are still leaving a high number of old shadow files. Since we have millions and millions of objects, it is hard to reconcile what should and shouldn't exist.

Looking at our cluster usage, there are no troughs, it is just a rising peak. But when looking at users' data usage, we can see peaks and troughs as you would expect as data is deleted and added.

Our ceph version is 0.80.9.

Any ideas, please?

On 2015-03-13 02:25, Yehuda Sadeh-Weinraub wrote:

- Original Message -
From: Ben b@benjackson.email
To: ceph-us...@ceph.com
Sent: Wednesday, March 11, 2015 8:46:25 PM
Subject: Re: [ceph-users] Shadow files

Anyone got any info on this? Is it safe to delete shadow files?

It depends. Shadow files are badly named objects that represent part of the object's data. They are only safe to remove if you know that the corresponding objects no longer exist.

Yehuda

On 2015-03-11 10:03, Ben wrote:

We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be?

We are using civetweb with radosgw, with tengine ssl proxy in front of it.

Any advice please

Thanks

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
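Greg's point above is that `rgw_gc_max_objs` sets how many shards the pending GC entries are spread over, not how many entries can be processed. A toy sketch of that idea only: `zlib.crc32` stands in for whatever hash radosgw really uses, and the tag strings and function names are invented, so this illustrates the concept rather than the actual implementation.

```python
import zlib
from collections import defaultdict


def shard_for(tag, max_objs=32):
    """Map a GC entry tag to one of `max_objs` shards (stand-in hash)."""
    return zlib.crc32(tag.encode("utf-8")) % max_objs


def distribute(tags, max_objs=32):
    """Group tags by shard; every tag lands in exactly one shard."""
    shards = defaultdict(list)
    for tag in tags:
        shards[shard_for(tag, max_objs)].append(tag)
    return dict(shards)


# 1000 hypothetical pending entries spread over the default 32 shards.
tags = ["tag-%d" % i for i in range(1000)]
shards = distribute(tags)

# The shard count limits how many groups can be worked on in parallel,
# but all 1000 entries are still queued somewhere.
print(len(tags), sum(len(entries) for entries in shards.values()))  # prints 1000 1000
```

Because the shard an entry lands in depends on `max_objs`, two programs using different values would look for the same entry in different places, which fits the warning that the setting needs to be consistent everywhere.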
Re: [ceph-users] Shadow files
It is either a problem with Ceph, Civetweb or something else in our configuration. But deletes in user buckets are still leaving a high number of old shadow files. Since we have millions and millions of objects, it is hard to reconcile what should and shouldn't exist.

Looking at our cluster usage, there are no troughs, it is just a rising peak. But when looking at users' data usage, we can see peaks and troughs as you would expect as data is deleted and added.

Our ceph version is 0.80.9.

Any ideas, please?

On 2015-03-13 02:25, Yehuda Sadeh-Weinraub wrote:

- Original Message -
From: Ben b@benjackson.email
To: ceph-us...@ceph.com
Sent: Wednesday, March 11, 2015 8:46:25 PM
Subject: Re: [ceph-users] Shadow files

Anyone got any info on this? Is it safe to delete shadow files?

It depends. Shadow files are badly named objects that represent part of the object's data. They are only safe to remove if you know that the corresponding objects no longer exist.

Yehuda

On 2015-03-11 10:03, Ben wrote:

We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be?

We are using civetweb with radosgw, with tengine ssl proxy in front of it.

Any advice please

Thanks

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Shadow files
- Original Message -
From: Ben b@benjackson.email
To: ceph-us...@ceph.com
Sent: Wednesday, March 11, 2015 8:46:25 PM
Subject: Re: [ceph-users] Shadow files

Anyone got any info on this? Is it safe to delete shadow files?

It depends. Shadow files are badly named objects that represent part of the object's data. They are only safe to remove if you know that the corresponding objects no longer exist.

Yehuda

On 2015-03-11 10:03, Ben wrote:

We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be?

We are using civetweb with radosgw, with tengine ssl proxy in front of it.

Any advice please

Thanks

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Shadow files
Our cluster has millions of objects in it. There has to be an easy way to reconcile shadow files against objects that no longer exist?

We are in a critical position now because we have millions of objects, a large number of TB of data, and are closing in on 42 of our 112 OSDs being near full (89% utilisation).

On 2015-03-13 02:25, Yehuda Sadeh-Weinraub wrote:

- Original Message -
From: Ben b@benjackson.email
To: ceph-us...@ceph.com
Sent: Wednesday, March 11, 2015 8:46:25 PM
Subject: Re: [ceph-users] Shadow files

Anyone got any info on this? Is it safe to delete shadow files?

It depends. Shadow files are badly named objects that represent part of the object's data. They are only safe to remove if you know that the corresponding objects no longer exist.

Yehuda

On 2015-03-11 10:03, Ben wrote:

We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be?

We are using civetweb with radosgw, with tengine ssl proxy in front of it.

Any advice please

Thanks

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Shadow files
Hi I just want to tell you there is a rgw object visualisation that could help you in our tool called inkscope available on github Best regards Envoyé de mon Galaxy Ace4 Orange Message d'origine De : Italo Santos okd...@gmail.com Date :12/03/2015 21:26 (GMT+01:00) À : Ben b@benjackson.email Cc : Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-us...@ceph.com Objet : Re: [ceph-users] Shadow files Hello Ben, I’m facing with the same issue - #10295http://tracker.ceph.com/issues/10295 and I remove the object directly from that rados successfully. But is very important map all object before do that. I recommend you take a look to the links bellow to understand more about the objects name: Translating a RadosGW object name into a filename on disk https://www.mail-archive.com/ceph-users@lists.ceph.com/msg12161.html http://www.spinics.net/lists/ceph-devel/msg20426.html Regards. Italo Santos http://italosantos.com.br/ On Thursday, March 12, 2015 at 12:25 PM, Yehuda Sadeh-Weinraub wrote: - Original Message - From: Ben b@benjackson.emailmailto:b@benjackson.email To: ceph-us...@ceph.commailto:ceph-us...@ceph.com Sent: Wednesday, March 11, 2015 8:46:25 PM Subject: Re: [ceph-users] Shadow files Anyone got any info on this? Is it safe to delete shadow files? It depends. Shadow files are badly named objects that represent part of the objects data. They are only safe to remove if you know that the corresponding objects no longer exist. Yehuda On 2015-03-11 10:03, Ben wrote: We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be? 
We are using civetweb with radosgw, with a tengine SSL proxy in front of it. Any advice, please? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Shadow files
Hello Ben, I'm facing the same issue, #10295 (http://tracker.ceph.com/issues/10295), and I removed the objects directly from RADOS successfully. But it is very important to map all objects before doing that. I recommend you take a look at the links below to understand more about the object names: Translating a RadosGW object name into a filename on disk https://www.mail-archive.com/ceph-users@lists.ceph.com/msg12161.html http://www.spinics.net/lists/ceph-devel/msg20426.html Regards. Italo Santos http://italosantos.com.br/ On Thursday, March 12, 2015 at 12:25 PM, Yehuda Sadeh-Weinraub wrote: - Original Message - From: Ben b@benjackson.email To: ceph-us...@ceph.com Sent: Wednesday, March 11, 2015 8:46:25 PM Subject: Re: [ceph-users] Shadow files Anyone got any info on this? Is it safe to delete shadow files? It depends. Shadow files are badly named objects that represent part of an object's data. They are only safe to remove if you know that the corresponding objects no longer exist. Yehuda On 2015-03-11 10:03, Ben wrote: We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be? We are using civetweb with radosgw, with a tengine SSL proxy in front of it. Any advice, please? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
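The advice above to map all objects before removing anything could start with a simple inventory of the pool listing. The sketch below is only an illustration under stated assumptions: it assumes the names were dumped one per line (for example from `rados -p <pool> ls`), and that the substrings "__shadow_" and "__multipart_" identify shadow and multipart pieces. The function names and sample lines are hypothetical, and the linked posts remain the reference for how names are actually built.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def classify(name: str) -> str:
    """Bucket an object name by kind (the substrings are assumptions)."""
    if "__shadow_" in name:
        return "shadow"
    if "__multipart_" in name:
        return "multipart"
    return "other"


def build_inventory(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group object names, given one name per line, by their kind."""
    inventory: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        name = line.rstrip("\n")
        if name:  # skip blank lines in the listing
            inventory[classify(name)].append(name)
    return dict(inventory)


if __name__ == "__main__":
    # Hypothetical listing, for illustration only.
    listing = ["m.1__shadow_a_1\n", "m.1_photo.jpg\n", "m.1__multipart_b.1\n"]
    for kind, names in build_inventory(listing).items():
        print(kind, len(names))
```

Keeping the inventory in a file alongside the date it was taken gives something to compare against before and after any manual removal.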
Re: [ceph-users] Shadow files
Anyone got any info on this? Is it safe to delete shadow files? On 2015-03-11 10:03, Ben wrote: We have a large number of shadow files in our cluster that aren't being deleted automatically as data is deleted. Is it safe to delete these files? Is there something we need to be aware of when deleting them? Is there a script that we can run that will delete these safely? Is there something wrong with our cluster that it isn't deleting these files when it should be? We are using civetweb with radosgw, with a tengine SSL proxy in front of it. Any advice, please? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com