Re: [ceph-users] Shadow Files
Unfortunately it immediately aborted (running against a 0.80.9 cluster). Does the Ceph cluster also have to be at the 0.94 level? The last errors were:

 -3> 2015-05-06 01:11:11.710947 7f311dd15880  0 run(): building index of all objects in pool
 -2> 2015-05-06 01:11:11.710995 7f311dd15880  1 -- 10.200.3.92:0/1001510 --> 10.200.3.32:6800/1870 -- osd_op(client.4065115.0:27 ^A/ [pgnls start_epoch 0] 11.0 ack+read+known_if_redirected e952) v5 -- ?+0 0x39a4e80 con 0x39a4aa0
 -1> 2015-05-06 01:11:11.712125 7f31026f4700  1 -- 10.200.3.92:0/1001510 <== osd.1 10.200.3.32:6800/1870 1 osd_op_reply(27  [pgnls start_epoch 0] v934'6252 uv6252 ondisk = -22 ((22) Invalid argument)) v6 167+0+0 (3260127617 0 0) 0x7f30c4000a90 con 0x39a4aa0
  0> 2015-05-06 01:11:11.712652 7f311dd15880 -1 *** Caught signal (Aborted) ** in thread 7f311dd15880

terminate called after throwing an instance of 'std::runtime_error'
  what(): rados returned (22) Invalid argument
*** Caught signal (Aborted) ** in thread 7f311dd15880
 ceph version 0.94-1339-gc905d51 (c905d517c2c778a88b006302996591b60d167cb6)
 1: radosgw-admin() [0x61e604]
 2: (()+0xf130) [0x7f311a59f130]
 3: (gsignal()+0x37) [0x7f31195d85d7]
 4: (abort()+0x148) [0x7f31195d9cc8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f3119edc9b5]
 6: (()+0x5e926) [0x7f3119eda926]
 7: (()+0x5e953) [0x7f3119eda953]
 8: (()+0x5eb73) [0x7f3119edab73]
 9: (()+0x4d116) [0x7f311b606116]
 10: (librados::IoCtx::nobjects_begin()+0x2e) [0x7f311b60c60e]
 11: (RGWOrphanSearch::build_all_oids_index()+0x62) [0x516a02]
 12: (RGWOrphanSearch::run()+0x1e3) [0x51ad23]
 13: (main()+0xa430) [0x4fbc30]
 14: (__libc_start_main()+0xf5) [0x7f31195c4af5]
 15: radosgw-admin() [0x5028d9]

On Tue, May 5, 2015 at 10:41 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
Can you try creating the .log pool?

Yehuda

- Original Message -
From: Anthony Alba ascanio.al...@gmail.com
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com
Sent: Tuesday, May 5, 2015 3:37:15 AM
Subject: Re: [ceph-users] Shadow Files

...sorry, clicked send too quickly.

/opt/ceph/bin/radosgw-admin orphans find --pool=.rgw.buckets --job-id=abcd
ERROR: failed to open log pool ret=-2
job not found

On Tue, May 5, 2015 at 6:36 PM, Anthony Alba ascanio.al...@gmail.com wrote:
Hi Yehuda,

First run:
/opt/ceph/bin/radosgw-admin --pool=.rgw.buckets --job-id=testing
ERROR: failed to open log pool ret=-2
job not found

Do I have to precreate some pool?

On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
I've been working on a new tool that detects leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone would like to play with it, it is in the wip-rgw-orphans branch. At the moment I recommend not removing any object that the tool reports, but rather moving it to a different pool for backup (using the rados cp command).
The tool works in a few stages:
(1) list all the rados objects in the specified pool, store in repository
(2) list all bucket instances in the system, store in repository
(3) iterate through the bucket instances in the repository, list their (logical) objects, and for each object store the expected rados objects that build it
(4) compare the data from (1) and (3); for each object that is in (1) but not in (3), stat it, and if it is older than $start_time - $stale_period, report it

There are lots of things that can go wrong with this, so we really need to be careful here. The tool can be run by the following command:

$ radosgw-admin orphans find --pool=<data pool> --job-id=<name> [--num-shards=<num shards>] [--orphan-stale-secs=<seconds>]

The tool can be stopped and restarted, and it will continue from the stage where it stopped. Note that some of the stages will restart from the beginning (of the stage) due to system limitations (specifically 1 and 2). In order to clean up a job's data:

$ radosgw-admin orphans finish --job-id=<name>

Note that the job runs in the radosgw-admin process context; it does not schedule a job on the radosgw process. Please let me know of any issues you find.

Thanks,
Yehuda

- Original Message -
From: Ben Hines bhi...@gmail.com
To: Ben b@benjackson.email
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com
Sent: Thursday, April 30, 2015 3:00:16 PM
Subject: Re: [ceph-users] Shadow Files

Going to hold off on our 94.1 update for this issue. Hopefully this can make it into a 94.2 or a v95 git release.
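[Putting the pieces of this thread together, a minimal end-to-end sketch; the pool name and job id are examples, the backup pool is an assumption, and rados get/put is used here as a cautious stand-in for the rados cp mentioned above:]

    ceph osd pool create .log 8                      # the log pool the tool needs, per Yehuda's suggestion
    radosgw-admin orphans find --pool=.rgw.buckets --job-id=cleanup1
    # back up a reported object before considering removal
    rados -p .rgw.buckets get <reported-object> /tmp/obj.bak
    rados -p .rgw.buckets.backup put <reported-object> /tmp/obj.bak   # assumes a pre-created backup pool
    radosgw-admin orphans finish --job-id=cleanup1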
Re: [ceph-users] Shadow Files
...sorry, clicked send too quickly.

/opt/ceph/bin/radosgw-admin orphans find --pool=.rgw.buckets --job-id=abcd
ERROR: failed to open log pool ret=-2
job not found

On Tue, May 5, 2015 at 6:36 PM, Anthony Alba ascanio.al...@gmail.com wrote:
Hi Yehuda,

First run:
/opt/ceph/bin/radosgw-admin --pool=.rgw.buckets --job-id=testing
ERROR: failed to open log pool ret=-2
job not found

Do I have to precreate some pool?

On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
I've been working on a new tool that detects leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone would like to play with it, it is in the wip-rgw-orphans branch. At the moment I recommend not removing any object that the tool reports, but rather moving it to a different pool for backup (using the rados cp command).

The tool works in a few stages:
(1) list all the rados objects in the specified pool, store in repository
(2) list all bucket instances in the system, store in repository
(3) iterate through the bucket instances in the repository, list their (logical) objects, and for each object store the expected rados objects that build it
(4) compare the data from (1) and (3); for each object that is in (1) but not in (3), stat it, and if it is older than $start_time - $stale_period, report it

There are lots of things that can go wrong with this, so we really need to be careful here. The tool can be run by the following command:

$ radosgw-admin orphans find --pool=<data pool> --job-id=<name> [--num-shards=<num shards>] [--orphan-stale-secs=<seconds>]

The tool can be stopped and restarted, and it will continue from the stage where it stopped. Note that some of the stages will restart from the beginning (of the stage) due to system limitations (specifically 1 and 2). In order to clean up a job's data:

$ radosgw-admin orphans finish --job-id=<name>

Note that the job runs in the radosgw-admin process context; it does not schedule a job on the radosgw process. Please let me know of any issues you find.

Thanks,
Yehuda

- Original Message -
From: Ben Hines bhi...@gmail.com
To: Ben b@benjackson.email
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com
Sent: Thursday, April 30, 2015 3:00:16 PM
Subject: Re: [ceph-users] Shadow Files

Going to hold off on our 94.1 update for this issue. Hopefully this can make it into a 94.2 or a v95 git release.

-Ben

On Mon, Apr 27, 2015 at 2:32 PM, Ben b@benjackson.email wrote:
How long are you thinking here? We added more storage to our cluster to overcome these issues, and we can't keep throwing storage at it until the issues are fixed.

On 28/04/15 01:49, Yehuda Sadeh-Weinraub wrote:
It will get to the ceph mainline eventually. We're still reviewing and testing the fix, and there's more work to be done on the cleanup tool.

Yehuda

- Original Message -
From: Ben b@benjackson.email
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: ceph-users ceph-us...@ceph.com
Sent: Sunday, April 26, 2015 11:02:23 PM
Subject: Re: [ceph-users] Shadow Files

Are these fixes going to make it into the repository versions of ceph, or will we be required to compile and install manually?

On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote:
Yeah, that's definitely something that we'd address soon.
Yehuda

- Original Message -
From: Ben b@benjackson.email
To: Ben Hines bhi...@gmail.com, Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: ceph-users ceph-us...@ceph.com
Sent: Friday, April 24, 2015 5:14:11 PM
Subject: Re: [ceph-users] Shadow Files

Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files. I've written a script to go through known objects to get prefixes of objects that should exist, to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long.

On 25/04/15 09:53, Ben Hines wrote:
When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us.

thanks-
-Ben

On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
These ones:
http://tracker.ceph.com/issues/10295
http://tracker.ceph.com/issues/11447

- Original Message -
From: Ben Jackson b@benjackson.email
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: ceph-users ceph-us...@ceph.com
Sent: Friday, April 24, 2015 3:06:02 PM
Subject: Re: [ceph-users] Shadow Files

We were firefly, then we upgraded to giant, now we are on hammer. What issues?

On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon.

Yehuda
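[A rough sketch of the manual comparison approach described above; the pool name and the __shadow_/__multipart_ name fragments match what RGW clusters of this era typically show, but treat them as assumptions:]

    # dump all rados object names once, then count leaked-candidate name patterns
    rados -p .rgw.buckets ls > /tmp/all-rados-objects.txt
    grep -E '__shadow_|__multipart_' /tmp/all-rados-objects.txt | wc -l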
Re: [ceph-users] Shadow Files
Hi Yehuda,

First run:
/opt/ceph/bin/radosgw-admin --pool=.rgw.buckets --job-id=testing
ERROR: failed to open log pool ret=-2
job not found

Do I have to precreate some pool?

On Tue, May 5, 2015 at 8:17 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
I've been working on a new tool that detects leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone would like to play with it, it is in the wip-rgw-orphans branch. At the moment I recommend not removing any object that the tool reports, but rather moving it to a different pool for backup (using the rados cp command).

The tool works in a few stages:
(1) list all the rados objects in the specified pool, store in repository
(2) list all bucket instances in the system, store in repository
(3) iterate through the bucket instances in the repository, list their (logical) objects, and for each object store the expected rados objects that build it
(4) compare the data from (1) and (3); for each object that is in (1) but not in (3), stat it, and if it is older than $start_time - $stale_period, report it

There are lots of things that can go wrong with this, so we really need to be careful here. The tool can be run by the following command:

$ radosgw-admin orphans find --pool=<data pool> --job-id=<name> [--num-shards=<num shards>] [--orphan-stale-secs=<seconds>]

The tool can be stopped and restarted, and it will continue from the stage where it stopped. Note that some of the stages will restart from the beginning (of the stage) due to system limitations (specifically 1 and 2). In order to clean up a job's data:

$ radosgw-admin orphans finish --job-id=<name>

Note that the job runs in the radosgw-admin process context; it does not schedule a job on the radosgw process. Please let me know of any issues you find.

Thanks,
Yehuda

- Original Message -
From: Ben Hines bhi...@gmail.com
To: Ben b@benjackson.email
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com, ceph-users ceph-us...@ceph.com
Sent: Thursday, April 30, 2015 3:00:16 PM
Subject: Re: [ceph-users] Shadow Files

Going to hold off on our 94.1 update for this issue. Hopefully this can make it into a 94.2 or a v95 git release.

-Ben

On Mon, Apr 27, 2015 at 2:32 PM, Ben b@benjackson.email wrote:
How long are you thinking here? We added more storage to our cluster to overcome these issues, and we can't keep throwing storage at it until the issues are fixed.

On 28/04/15 01:49, Yehuda Sadeh-Weinraub wrote:
It will get to the ceph mainline eventually. We're still reviewing and testing the fix, and there's more work to be done on the cleanup tool.

Yehuda

- Original Message -
From: Ben b@benjackson.email
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: ceph-users ceph-us...@ceph.com
Sent: Sunday, April 26, 2015 11:02:23 PM
Subject: Re: [ceph-users] Shadow Files

Are these fixes going to make it into the repository versions of ceph, or will we be required to compile and install manually?

On 2015-04-26 02:29, Yehuda Sadeh-Weinraub wrote:
Yeah, that's definitely something that we'd address soon.

Yehuda

- Original Message -
From: Ben b@benjackson.email
To: Ben Hines bhi...@gmail.com, Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: ceph-users ceph-us...@ceph.com
Sent: Friday, April 24, 2015 5:14:11 PM
Subject: Re: [ceph-users] Shadow Files

Definitely need something to help clear out these old shadow files. I'm sure our cluster has around 100TB of these shadow files.
I've written a script to go through known objects to get prefixes of objects that should exist, to compare to ones that shouldn't, but the time it takes to do this over millions and millions of objects is just too long.

On 25/04/15 09:53, Ben Hines wrote:
When these are fixed it would be great to get good steps for listing / cleaning up any orphaned objects. I have suspicions this is affecting us.

thanks-
-Ben

On Fri, Apr 24, 2015 at 3:10 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
These ones:
http://tracker.ceph.com/issues/10295
http://tracker.ceph.com/issues/11447

- Original Message -
From: Ben Jackson b@benjackson.email
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: ceph-users ceph-us...@ceph.com
Sent: Friday, April 24, 2015 3:06:02 PM
Subject: Re: [ceph-users] Shadow Files

We were firefly, then we upgraded to giant, now we are on hammer. What issues?

On 25 Apr 2015 2:12 am, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
What version are you running? There are two different issues that we were fixing this week, and we should have that upstream pretty soon.

Yehuda

- Original Message -
From: Ben b@benjackson.email
To: ceph-users ceph-us...@ceph.com
Cc: Yehuda Sadeh-Weinraub yeh...@redhat.com
Sent: Thursday, April 23, 2015 7:42:06 PM
Subject: [ceph-users] Shadow Files

We are still
[ceph-users] Calamari: No Cluster - Hosts - Info?
Running Calamari v1.2.3.1 and hit an oddity: the cluster has registered successfully and all graphs display. Only the KEY/VALUE tables from Manage Cluster Hosts (click the 'i' icon) are all empty.

Manually running e.g.
salt '*' grains.item num_cpus
works, and
salt '*' ceph.get_heartbeats
works. /var/lib/graphite is being populated.

I have another Calamari instance at v1.2.1 which shows the grains correctly. Any suggestions why the KEY/VALUE table is empty?

- Anthony
[ceph-users] OSD Startup Best Practice: gpt/udev or SysVInit/systemd ?
Hi Cephers,

What is your best practice for starting up OSDs? I am trying to determine the most robust technique on CentOS 7, where I have too much choice: udev/gpt/uuid, /etc/init.d/ceph, or /etc/systemd/system/ceph-osd@X.

1. Use udev/gpt/UUID: no OSD sections in /etc/ceph/mycluster.conf and no premounts in /etc/fstab. Let udev + ceph-disk-activate do their magic.

2. Use /etc/init.d/ceph start osd, or systemctl start ceph-osd@N:
   a. do you change the partition UUID so that udev doesn't kick in?
   b. do you keep [osd.N] sections in /etc/ceph/mycluster.conf?
   c. do you premount all journals/OSDs in /etc/fstab?

The problem with this approach, though very explicit and robust, is that it is hard to maintain /etc/fstab on the OSD hosts.

- Anthony
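[For comparison, a minimal sketch of option 2 with explicit configuration; the host name, UUID, and mount options are hypothetical:]

    # /etc/ceph/mycluster.conf
    [osd.3]
        host = osd-host-1

    # /etc/fstab  (mount point follows the $cluster-$id convention)
    UUID=2f5e8c4a-0000-0000-0000-000000000000  /var/lib/ceph/osd/mycluster-3  xfs  noatime,inode64  0 2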
Re: [ceph-users] v0.90 released
Hi Sage,

Has the repo metadata been regenerated? One of my reposync jobs can only see up to 0.89, using http://ceph.com/rpm-testing.

Thanks
Anthony

On Sat, Dec 20, 2014 at 6:22 AM, Sage Weil sw...@redhat.com wrote:

This is the last development release before Christmas. There are some API cleanups for librados and librbd, and lots of bug fixes across the board for the OSD, MDS, RGW, and CRUSH. The OSD also gets support for discard (potentially helpful on SSDs, although it is off by default), and there are several improvements to ceph-disk. The next two development releases will be getting a slew of new functionality for hammer. Stay tuned!

Upgrading
---------

* Previously, the formatted output of 'ceph pg stat -f ...' was a full pg dump that included all metadata about all PGs in the system. It is now a concise summary of high-level PG stats, just like the unformatted 'ceph pg stat' command.

* All JSON dumps of floating point values were incorrectly surrounding the value with quotes. These quotes have been removed. Any consumer of structured JSON output that was consuming the floating point values was previously having to interpret the quoted string and will most likely need to be fixed to take the unquoted number.

Notable Changes
---------------

* arch: fix NEON feature detection (#10185 Loic Dachary)
* build: adjust build deps for yasm, virtualenv (Jianpeng Ma)
* build: improve build dependency tooling (Loic Dachary)
* ceph-disk: call partx/partprobe consistently (#9721 Loic Dachary)
* ceph-disk: fix dmcrypt key permissions (Loic Dachary)
* ceph-disk: fix umount race condition (#10096 Blaine Gardner)
* ceph-disk: init=none option (Loic Dachary)
* ceph-monstore-tool: fix shutdown (#10093 Loic Dachary)
* ceph-objectstore-tool: fix import (#10090 David Zafman)
* ceph-objectstore-tool: many improvements and tests (David Zafman)
* ceph.spec: package rbd-replay-prep (Ken Dreyer)
* common: add 'perf reset ...' admin command (Jianpeng Ma)
* common: do not unlock rwlock on destruction (Federico Simoncelli)
* common: fix block device discard check (#10296 Sage Weil)
* common: remove broken CEPH_LOCKDEP option (Kefu Chai)
* crush: fix tree bucket behavior (Rongze Zhu)
* doc: add build-doc guidelines for Fedora and CentOS/RHEL (Nilamdyuti Goswami)
* doc: enable rbd cache on openstack deployments (Sebastien Han)
* doc: improved installation notes on CentOS/RHEL installs (John Wilkins)
* doc: misc cleanups (Adam Spiers, Sebastien Han, Nilamdyuti Goswami, Ken Dreyer, John Wilkins)
* doc: new man pages (Nilamdyuti Goswami)
* doc: update release descriptions (Ken Dreyer)
* doc: update sepia hardware inventory (Sandon Van Ness)
* librados: only export public API symbols (Jason Dillaman)
* libradosstriper: fix stat strtoll (Dongmao Zhang)
* libradosstriper: fix trunc method (#10129 Sebastien Ponce)
* librbd: fix list_children from invalid pool ioctxs (#10123 Jason Dillaman)
* librbd: only export public API symbols (Jason Dillaman)
* many coverity fixes (Danny Al-Gaaf)
* mds: 'flush journal' admin command (John Spray)
* mds: fix MDLog IO callback deadlock (John Spray)
* mds: fix deadlock during journal probe vs purge (#10229 Yan, Zheng)
* mds: fix race trimming log segments (Yan, Zheng)
* mds: store backtrace for stray dir (Yan, Zheng)
* mds: verify backtrace when fetching dirfrag (#9557 Yan, Zheng)
* mon: add max pgs per osd warning (Sage Weil)
* mon: fix *_ratio units and types (Sage Weil)
* mon: fix JSON dumps to dump floats as floats and not strings (Sage Weil)
* mon: fix formatter 'pg stat' command output (Sage Weil)
* msgr: async: several fixes (Haomai Wang)
* msgr: simple: fix rare deadlock (Greg Farnum)
* osd: batch pg log trim (Xinze Chi)
* osd: clean up internal ObjectStore interface (Sage Weil)
* osd: do not abort deep scrub on missing hinfo (#10018 Loic Dachary)
* osd: fix ghobject_t formatted output to include shard (#10063 Loic Dachary)
* osd: fix osd peer check on scrub messages (#9555 Sage Weil)
* osd: fix pgls filter ops (#9439 David Zafman)
* osd: flush snapshots from cache tier immediately (Sage Weil)
* osd: keyvaluestore: fix getattr semantics (Haomai Wang)
* osd: keyvaluestore: fix key ordering (#10119 Haomai Wang)
* osd: limit in-flight read requests (Jason Dillaman)
* osd: log when scrub or repair starts (Loic Dachary)
* osd: support for discard for journal trim (Jianpeng Ma)
* qa: fix osd create dup tests (#10083 Loic Dachary)
* rgw: add location header when object is in another region (VRan Liu)
* rgw: check timestamp on s3 keystone auth (#10062 Abhishek Lekshmanan)
* rgw: make sysvinit script set ulimit -n properly (Sage Weil)
* systemd: better systemd unit files (Owen Synge)
* tests: ability to run unit tests under docker (Loic Dachary)

Getting Ceph
------------

* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.90.tar.gz
* For packages,
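[To see the first upgrading note in practice, compare the two forms; the output shapes are paraphrased from the note, not captured from a real cluster:]

    $ ceph pg stat                  # unformatted: concise one-line summary (unchanged)
    $ ceph pg stat -f json-pretty   # formatted: now the same concise summary, not a full pg dump,
                                    # with floats unquoted (e.g. 0.5, not "0.5")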
[ceph-users] Rgw leaving hundreds of shadow multiparty objects
Hello Cephers,

I am observing in .87 and .89 that rgw occupies a lot more disk space than the objects, and .rgw.buckets has thousands of _shadow and _multipart objects. After deleting the S3 objects, the rados objects still remain.

radosgw-admin gc list is empty.
radosgw-admin gc process doesn't change anything.

Any suggestions to clear these orphans?
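[A few read-only commands that help quantify the discrepancy before touching anything; the bucket name is a placeholder:]

    radosgw-admin bucket stats --bucket=<mybucket>   # logical S3 usage as rgw sees it
    rados df                                         # raw per-pool usage, incl. .rgw.buckets
    radosgw-admin gc list --include-all              # show even not-yet-expired gc entries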
Re: [ceph-users] normalizing radosgw
Suggestion: can we look at normalizing custom cluster names as well, for RHEL-like systems?

/etc/init.d/ceph: how do you pass --cluster myname at system startup? The systemd service file uses EnvironmentFile. What about adding

. /etc/sysconfig/ceph    # near the top

where /etc/sysconfig/ceph contains

cluster=myname

For ceph-radosgw even this does not work. You need

export CEPH_CONF=/etc/ceph/myname.conf

as well.

Anthony
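[For reference, a consolidated sketch of the sysconfig approach described above; the file contents are an assumption, not a shipped default:]

    # /etc/sysconfig/ceph  (hypothetical)
    CLUSTER=myname
    CEPH_CONF=/etc/ceph/myname.conf

    # near the top of the init script
    [ -f /etc/sysconfig/ceph ] && . /etc/sysconfig/ceph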
[ceph-users] Giant or Firefly for production
Hi Cephers,

Have any of you decided to put Giant into production instead of Firefly? Any gotchas?

Regards
Anthony
[ceph-users] s3-tests with giant/radosgw, many failures with fastcgi
I am seeing a lot of failures with Giant/radosgw and s3-tests, particularly with fastcgi. I am using community-patched apache and mod_fastcgi; civetweb is doing much better.

1. Both setups hang at s3tests.functional.test_headers.test_object_create_bad_contentlength_mismatch_above. I have to exclude this test.

2. radosgw/civetweb: civetweb is doing very well.

Ran 297 tests in 68.624s
FAILED (SKIP=4, errors=2, failures=4)

FAIL: s3tests.functional.test_headers.test_object_create_bad_contenttype_unreadable
  AssertionError: S3ResponseError not raised
FAIL: s3tests.functional.test_headers.test_object_create_bad_authorization_unreadable
  AssertionError: 400 != 403
FAIL: s3tests.functional.test_headers.test_bucket_create_bad_authorization_unreadable
  AssertionError: 400 != 403
FAIL: s3tests.functional.test_s3.test_bucket_list_maxkeys_unreadable
  AssertionError: S3ResponseError not raised

3. Apache/mod_fastcgi is a total disaster.

Ran 297 tests in 46.210s
FAILED (SKIP=4, errors=2, failures=87)

Lots of FAILs like the following:

FAIL: s3tests.functional.test_headers.test_object_create_bad_md5_invalid
  AssertionError: '' != 'Bad Request'
FAIL: s3tests.functional.test_headers.test_object_create_bad_md5_wrong
  AssertionError: '' != 'Bad Request'
FAIL: s3tests.functional.test_headers.test_object_create_bad_md5_empty
  AssertionError: '' != 'Bad Request'
FAIL: s3tests.functional.test_headers.test_object_create_bad_md5_unreadable
  AssertionError: '' != 'Forbidden'
FAIL: s3tests.functional.test_headers.test_object_create_bad_contentlength_empty
  AssertionError: '' != 'Bad Request'
FAIL: s3tests.functional.test_headers.test_object_create_bad_contentlength_none
  AssertionError: '' != 'Length Required'
FAIL: s3tests.functional.test_headers.test_object_create_bad_contenttype_unreadable
  AssertionError: S3ResponseError not raised
FAIL: s3tests.functional.test_headers.test_object_create_bad_authorization_invalid
  AssertionError: '' != 'Bad Request'
FAIL: s3tests.functional.test_headers.test_object_create_bad_authorization_unreadable
  AssertionError: 400 != 403
FAIL: s3tests.functional.test_headers.test_object_create_bad_authorization_empty
  AssertionError: '' != 'Forbidden'
FAIL: s3tests.functional.test_headers.test_object_create_bad_authorization_none
  AssertionError: '' != 'Forbidden'

Is there a simple fastcgi tweak I am missing? I am using the exemplary CentOS 7 configuration, with no tweaks to $cluster.conf except:

rgw_print_continue = true   # otherwise nothing runs

- Anthony
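[For reproduction, the runs above were presumably invoked along these lines; the config path is a placeholder, and -e excludes the hanging test by name:]

    S3TEST_CONF=s3tests.conf ./virtualenv/bin/nosetests -v \
        -e test_object_create_bad_contentlength_mismatch_above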
[ceph-users] Pass custom cluster name to SysVinit script on system startup?
How do you start up the SysVinit script with a cluster name?

/etc/init.d/ceph (--cluster XXX -- how do you pass this for a normal system reboot, i.e. not run from the command line?)

On a reboot, somehow the OSD manages to come up, but the mon fails (no --cluster on the command line).

http://tracker.ceph.com/issues/3747

- Anthony
[ceph-users] OSD systemd unit files makes it look failed
Hi,

The current OSD systemd unit file starts the OSD daemons correctly and ceph is HEALTH_OK. However, there are some process-tracking issues and systemd thinks the service has failed. systemctl stop ceph-osd@0 cannot stop the OSDs.

[Service]
EnvironmentFile=-/etc/sysconfig/ceph
Environment=CLUSTER=cephtest
ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i
ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i

[root@ceph1 ~]# systemctl status ceph-osd@0
ceph-osd@0.service - Ceph object storage daemon
   Loaded: loaded (/etc/systemd/system/ceph-osd@.service; enabled)
   Active: failed (Result: exit-code) since Thu 2014-11-20 17:36:40 SGT; 10min ago
  Process: 4251 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i (code=exited, status=1/FAILURE)
  Process: 4196 ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 4251 (code=exited, status=1/FAILURE)

Nov 20 17:36:40 ceph1.cephtest.com systemd[1]: Starting Ceph object storage daemon...
Nov 20 17:36:40 ceph1.cephtest.com ceph-osd-prestart.sh[4196]: create-or-move updated item name 'osd.0' weight 0.19 at location {...h map
Nov 20 17:36:40 ceph1.cephtest.com systemd[1]: Started Ceph object storage daemon.
Nov 20 17:36:40 ceph1.cephtest.com ceph-osd[4251]: starting osd.0 at :/0 osd_data /var/lib/ceph/osd/cephtest-0 /var/lib/ceph/osd/fl...ournal
Nov 20 17:36:40 ceph1.cephtest.com ceph-osd[4251]: 2014-11-20 17:36:40.857110 7f422e034880 -1 asok(0x4740230) AdminSocketConfigO...exists
Nov 20 17:36:40 ceph1.cephtest.com ceph-osd[4251]: 2014-11-20 17:36:40.857196 7f422e034880 -1 filestore(/var/lib/ceph/osd/cephtest-...failed
Nov 20 17:36:40 ceph1.cephtest.com ceph-osd[4251]: 2014-11-20 17:36:40.857207 7f422e034880 -1 ** ERROR: error converting store ...e busy
Nov 20 17:36:40 ceph1.cephtest.com systemd[1]: ceph-osd@0.service: main process exited, code=exited, status=1/FAILURE
Nov 20 17:36:40 ceph1.cephtest.com systemd[1]: Unit ceph-osd@0.service entered failed state.
Nov 20 17:46:20 ceph1.cephtest.com systemd[1]: Stopped Ceph object storage daemon.
Re: [ceph-users] OSD systemd unit files makes it look failed
Hi Sage, Cephers,

(I'm not on ceph-devel at the moment, will switch in a moment.)

Thanks. I am testing on RHEL7/CentOS 7. As a quick workaround, setting the .service file to

[Service]
Type=forking
ExecStart=ceph-osd -i %i   # without --foreground
ExecStartPre=...

works for the moment. Is there a reason the daemons shouldn't run in forking mode under systemd? Both ceph-mon@.service and ceph-radosgw@.service (a self-created unit based on the ceph-mon@.service unit file) work with

Type=forking
ExecStart=<path to daemon, no foreground flag>

or

#Type=simple
ExecStart=<path to daemon> --foreground

BTW, do you want a pull request for ceph-radosgw@.service?

- Anthony
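[Spelled out, the forking variant would look roughly like this; an untested sketch assembled from the snippets in this thread, not a shipped unit file:]

    [Unit]
    Description=Ceph object storage daemon

    [Service]
    Type=forking
    EnvironmentFile=-/etc/sysconfig/ceph
    Environment=CLUSTER=cephtest
    ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
    ExecStart=/usr/bin/ceph-osd --cluster ${CLUSTER} --id %i

    [Install]
    WantedBy=multi-user.target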
[ceph-users] Unclear about CRUSH map and more than one step emit in rule
The step emit documentation states: "Outputs the current value and empties the stack. Typically used at the end of a rule, but may also be used to pick from different trees in the same rule."

What use case is there for more than one step emit? Where would you put it, since a rule looks like:

rule <rulename> {
    ruleset <ruleset>
    type [ replicated | raid4 ]
    min_size <min-size>
    max_size <max-size>
    step take <bucket-type>
    step [choose|chooseleaf] [firstn|indep] <N> <bucket-type>
    step emit
}

Hazard a guess: after step emit you start with step take ... all over again?

- Anthony
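[For what it's worth, the guess is right: the multi-emit pattern is exactly the ssd/platter case. A sketch in the shape of the docs' example; the bucket names ssd and platter are assumptions about the hierarchy:]

    rule ssd-primary {
        ruleset 5
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 1 type host
        step emit
        step take platter
        step chooseleaf firstn -1 type host
        step emit
    }

[The first take/emit pair picks the primary from the ssd tree; the second picks the remaining replicas from the platter tree.]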
[ceph-users] Reusing old journal block device w/ data causes FAILED assert(0)
When I create a new OSD with a block device as journal that has existing data on it, ceph causes a FAILED assert. The block device is a journal from a previous experiment; it can safely be overwritten. If I zero the block device with

dd if=/dev/zero bs=512 count=1000 of=MyJournalDev

then the assert doesn't happen. Is there a way to tell mkfs to ignore data on the journal device and just go ahead and clobber it?

2014-11-13 21:22:26.463359 7f8383486880 -1 journal Unable to read past sequence 2 but header indicates the journal has committed up through 5202, journal is corrupt
os/FileJournal.cc: In function 'bool FileJournal::read_entry(ceph::bufferlist&, uint64_t&, bool*)' thread 7f8383486880 time 2014-11-13 21:22:26.463363
os/FileJournal.cc: 1693: FAILED assert(0)
ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xb7ac55]
2: (FileJournal::read_entry(ceph::buffer::list&, unsigned long&, bool*)+0xb04) [0xa339a4]
3: (JournalingObjectStore::journal_replay(unsigned long)+0x237) [0x910787]
4: (FileStore::mount()+0x3f8b) [0x8e482b]
5: (OSD::mkfs(CephContext*, ObjectStore*, std::string const&, uuid_d, int)+0xf0) [0x65d940]
6: (main()+0xbf6) [0x620d76]
7: (__libc_start_main()+0xf5) [0x7f8380823af5]
8: ceph-osd() [0x63a969]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Re: [ceph-users] Reusing old journal block device w/ data causes FAILED assert(0)
Ah no.

On 13 Nov 2014 21:49, Dan van der Ster daniel.vanders...@cern.ch wrote:
Hi,
Did you mkjournal the reused journal?
ceph-osd -i $ID --mkjournal
Cheers, Dan

No - however the man page states that --mkjournal is for: "Create a new journal file to match an existing object repository. This is useful if the journal device or file is wiped out due to a disk or file system failure." I thought mkfs would create a new OSD and a new journal in one shot (the journal device is specified in ceph.conf). In other words, I do not have an existing object repository.

My steps:

ceph.conf:
osd journal = /dev/sdb1   # this was used in a previous experiment so has garbage on it
# /dev/sdc1 is mounted on /var/lib/ceph/osd/ceph-0

ceph-osd -i 0 --mkfs --mkkey --osd-uuid 123456

At this point it crashes with the FAILED assert. Do you mean I should run ceph-osd -i $ID --mkjournal before the mkfs?
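[So the working sequence, per the original report, is to wipe the stale journal header first; the device name comes from the example above — double-check it before pointing dd at any disk:]

    dd if=/dev/zero of=/dev/sdb1 bs=512 count=1000   # clobber the stale journal header
    ceph-osd -i 0 --mkfs --mkkey --osd-uuid 123456   # mkfs then succeeds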
[ceph-users] Multiple rules in a ruleset: any examples? Which rule wins?
Hi list,

When there are multiple rules in a ruleset, is it the case that the first one wins? When a rule fails, does it fall through to the next rule? Are min_size and max_size the only determinants? Are there any examples? The only examples I've seen put one rule per ruleset (e.g. the docs have an ssd/platter example, but that shows 1 rule per ruleset).

Regards
-Anthony
Re: [ceph-users] Multiple rules in a ruleset: any examples? Which rule wins?
Thanks!

What happens when the lone rule fails? Is there a fallback rule that will place the blob in a random PG? Say I misconfigure, and my choose/chooseleaf steps don't add up to the pool min size. (This also explains why all examples in the wild use only 1 rule per ruleset.)

On Fri, Nov 14, 2014 at 7:03 AM, Gregory Farnum g...@gregs42.com wrote:
On Thu, Nov 13, 2014 at 2:58 PM, Anthony Alba ascanio.al...@gmail.com wrote:
Hi list,
When there are multiple rules in a ruleset, is it the case that the first one wins? When a rule fails, does it fall through to the next rule? Are min_size and max_size the only determinants? Are there any examples? The only examples I've seen put one rule per ruleset (e.g. the docs have an ssd/platter example, but that shows 1 rule per ruleset).

The intention of rulesets is that they are used only for pools of different sizes, so the behavior when you have multiple rules which match a given size is probably not well-defined. That said, even using multiple rules in a single ruleset is not well tested, and I believe the functionality is being removed in the next release. I would recommend against using them.
-Greg
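[To make the min_size/max_size matching concrete, a hypothetical ruleset — rule names invented, and note Greg's caveat that multiple rules per ruleset are poorly tested and best avoided:]

    rule small_pools {
        ruleset 1
        type replicated
        min_size 1
        max_size 3
        step take default
        step chooseleaf firstn 0 type host
        step emit
    }
    rule large_pools {
        ruleset 1
        type replicated
        min_size 4
        max_size 10
        step take default
        step chooseleaf firstn 0 type rack
        step emit
    }

[A pool with ruleset 1 and size 2 would match small_pools; size 4 would match large_pools. I'm not aware of any fallback rule: a misconfigured ruleset is more likely to leave PGs stuck than to place data randomly.]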
Re: [ceph-users] Use case: one-way RADOS replication between two clusters by time period
Great information, thanks.

I would like to confirm that if I regularly delete older buckets off the LIVE primary system, the extra objects on the ARCHIVE secondaries are ignored during replication. I.e., it does not behave like

rsync -avz --delete LIVE/ ARCHIVE/

Rather, it behaves more like

rsync -avz LIVE/ ARCHIVE/
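[In the federated-gateway setup of that era, the periodic one-way sync would presumably be a radosgw-agent run from LIVE (master zone) to ARCHIVE (secondary zone); the config path is hypothetical, and the exact flags should be checked against the radosgw-agent docs:]

    radosgw-agent -c /etc/ceph/radosgw-agent/live-to-archive.conf --sync-scope=full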
[ceph-users] Use case: one-way RADOS replication between two clusters by time period
Hi list,

Can RADOS fulfil the following use case? I wish to have a radosgw S3 object store that is LIVE; this represents the current objects of users. Separated by an air gap is another radosgw S3 object store that is ARCHIVE. The objects will only be created and manipulated by radosgw.

Periodically (on the order of 3-6 months), I want to connect the two clusters and replicate all objects from LIVE to ARCHIVE created in the time period DDMM1 - DDMM2, or better yet from the last timestamp. This is a one-way replication: the objects are transferred only in the LIVE => ARCHIVE direction.

Can this be done easily?

Thanks
Anthony
Re: [ceph-users] new installation
Firewall? Disable iptables, set SELinux to Permissive.

On 15 Oct, 2014 5:49 pm, Roman intra...@gmail.com wrote:
Pascal,

Here is my latest installation:

    cluster 204986f6-f43c-4199-b093-8f5c7bc641bb
     health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean; recovery 20/40 objects degraded (50.000%)
     monmap e1: 2 mons at {ceph02=192.168.33.142:6789/0,ceph03=192.168.33.143:6789/0}, election epoch 4, quorum 0,1 ceph02,ceph03
     mdsmap e4: 1/1/1 up {0=ceph02=up:active}
     osdmap e8: 2 osds: 2 up, 2 in
      pgmap v14: 192 pgs, 3 pools, 1884 bytes data, 20 objects
            68796 kB used, 6054 MB / 6121 MB avail
            20/40 objects degraded (50.000%)
                 192 active+degraded

host ceph01 - admin
host ceph02 - mon.ceph02 + osd.1 (sdb, 8G) + mds
host ceph03 - mon.ceph03 + osd.0 (sdb, 8G)

$ ceph osd tree
# id    weight  type name       up/down reweight
-1      0       root default
-2      0               host ceph03
0       0                       osd.0   up      1
-3      0               host ceph02
1       0                       osd.1   up      1

$ ceph osd dump
epoch 8
fsid 204986f6-f43c-4199-b093-8f5c7bc641bb
created 2014-10-15 13:39:05.986977
modified 2014-10-15 13:40:45.644870
flags
pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
max_osd 2
osd.0 up in weight 1 up_from 4 up_thru 4 down_at 0 last_clean_interval [0,0) 192.168.33.143:6800/2284 192.168.33.143:6801/2284 192.168.33.143:6802/2284 192.168.33.143:6803/2284 exists,up dccd6b99-1885-4c62-864b-107bd9ba0d84
osd.1 up in weight 1 up_from 8 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.33.142:6800/2399 192.168.33.142:6801/2399 192.168.33.142:6802/2399 192.168.33.142:6803/2399 exists,up 4d4adf4b-ae8e-4e26-8667-c952c7fc4e45

Thanks,
Roman

Hello,

osdmap e10: 4 osds: 2 up, 2 in

What about the following commands:

# ceph osd tree
# ceph osd dump

You have 2 OSDs on 2 hosts, but 4 OSDs seem to be defined in your crush map.

Regards,
Pascal

Le 15 oct. 2014 à 11:11, Roman intra...@gmail.com a écrit :

Hi ALL,

I've created 2 mons and 2 OSDs on CentOS 6.5 (x86_64). I've tried 4 times (clean CentOS installation) but always have health HEALTH_WARN. Never HEALTH_OK, always HEALTH_WARN! :(

# ceph -s
    cluster d073ed20-4c0e-445e-bfb0-7b7658954874
     health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
     monmap e1: 2 mons at {ceph02=192.168.0.142:6789/0,ceph03=192.168.0.143:6789/0}, election epoch 4, quorum 0,1 ceph02,ceph03
     osdmap e10: 4 osds: 2 up, 2 in
      pgmap v15: 192 pgs, 3 pools, 0 bytes data, 0 objects
            68908 kB used, 6054 MB / 6121 MB avail
                 192 active+degraded

What am I doing wrong???

---
host: 192.168.0.141 - admin
host: 192.168.0.142 - mon.ceph02 + osd.0 (/dev/sdb, 8G)
host: 192.168.0.143 - mon.ceph03 + osd.1 (/dev/sdb, 8G)

ceph-deploy version 1.5.18

[global]
osd pool default size = 2
---

Thanks,
Roman.
--
Pascal Morillon
University of Rennes 1
IRISA, Rennes, France
SED Offices : E206 (Grid5000), D050 (SED)
Phone : +33 2 99 84 22 10
pascal.moril...@irisa.fr
Twitter @pmorillon https://twitter.com/pmorillon
xmpp: pmori...@jabber.grid5000.fr
http://www.grid5000.fr
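[One thing that stands out in the ceph osd tree output above: both OSDs have CRUSH weight 0 — tiny 8 GB disks can end up with a weight that rounds to zero — which by itself is enough to leave every PG active+degraded. A hedged first thing to try:]

    ceph osd crush reweight osd.0 0.05
    ceph osd crush reweight osd.1 0.05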
[ceph-users] Misconfigured caps on client.admin key, any way to recover from EACCES denied?
Following the manual starter guide, I set up a Ceph cluster with HEALTH_OK (1 mon, 2 osd). In testing out auth commands I misconfigured the client.admin key by accidentally deleting mon 'allow *'. Now I'm getting EACCES denied for all ceph actions. Is there a way to recover, or to recreate a new client.admin key?

The key was:

client.admin
        key: ABCDEFG...
        caps: [mon] allow *
        caps: [osd] allow *

Misconfigured:

        key: ABCDEFG...
        caps: [osd] allow *

...now all ceph commands fail, so I'm not sure how to start fixing the key on the mons/osds.

- anthony
Re: [ceph-users] Misconfigured caps on client.admin key, anyway to recover from EAESS denied?
> You can disable cephx completely, fix the key and enable cephx again:
> auth_cluster_required, auth_service_required and auth_client_required

That did not work, i.e. disabling cephx in the cluster conf and restarting the cluster. The cluster still complained about failed authentication.

> I *believe* if you grab the monitor key you can use that to make the
> necessary changes, though. Otherwise hacking at the monitor stores is an option.

You mean use the mon. key but as the client.admin user?

- anthony
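[For the record, the mon. route suggested above would look something like this; the keyring path follows the default layout — adjust for the actual mon name:]

    ceph -n mon. -k /var/lib/ceph/mon/ceph-<monname>/keyring \
        auth caps client.admin mon 'allow *' osd 'allow *'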
[ceph-users] Giant: only 1 default pool created rbd, no data or metadata
Hi,

I am following the manual creation method with 0.86 on CentOS 7. When I start mon.node1, I only have one pool created. No data or metadata pools. Any suggestions?

Steps:

ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring \
  --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *'
ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
monmaptool --create --add node1 192.168.122.111 --fsid a7190f44-d739-4c2d-ad20-b7ade32921c9 /tmp/monmap
mkdir /var/lib/ceph/mon/ceph-node1
ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

[root@node1 ~]# ceph osd lspools
0 rbd,

[root@node1 ~]# ceph -s
    cluster a7190f44-d739-4c2d-ad20-b7ade32921c9
     health HEALTH_ERR 64 pgs stuck inactive; 64 pgs stuck unclean; no osds
     monmap e1: 1 mons at {node1=192.168.122.111:6789/0}, election epoch 2, quorum 0 node1
     osdmap e1: 0 osds: 0 up, 0 in
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

[global]
fsid = a7190f44-d739-4c2d-ad20-b7ade32921c9
mon initial members = node1
mon host = 192.168.122.111
public network = 192.168.122.0/24
cluster network = 10.10.122.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
filestore xattr use omap = true
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 333
osd pool default pgp num = 333
osd crush chooseleaf type = 1

[mon.node1]
host = node1
mon address = 192.168.122.111
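[This appears to be expected in Giant, as far as I can tell: the data/metadata pools are no longer pre-created, and only appear once a filesystem is created. A sketch — pool names and pg counts are arbitrary:]

    ceph osd pool create cephfs_data 64
    ceph osd pool create cephfs_metadata 64
    ceph fs new cephfs cephfs_metadata cephfs_data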