[ceph-users] failed to populate the monitor daemon(s) with the monitor map and keyring.
The quick install failed on my PC, so I followed the Installation (Manual) guide, but at the step "populate the monitor daemon(s) with the monitor map and keyring" an error occurred. The output is:

IO error: /var/lib/ceph/mon/ceph-node1/store.db/LOCK: No such file or directory
ceph-mon: error opening mon data directory at '/var/lib/ceph/mon/ceph-node1': (22) Invalid argument

Expecting your help, thanks!
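This failure is consistent with the monitor data directory not existing when ceph-mon --mkfs runs. A minimal sketch of the relevant manual-install steps, assuming the monitor id is node1 and the monmap/keyring were staged under /tmp as in the guide's examples:

    sudo mkdir -p /var/lib/ceph/mon/ceph-node1
    sudo ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

If the directory already exists, check its ownership and that it is a real directory rather than a dangling symlink.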
[ceph-users] rbd: add failed: (34) Numerical result out of range
I was building a small test cluster and noticed a difference in rbd map behavior depending on whether the cluster was built on Fedora or CentOS. With CentOS OSDs, trying to rbd map from Arch Linux or Fedora gave: rbd: add failed: (34) Numerical result out of range. It seemed to happen when the tool was writing to /sys/bus/rbd/add_single_major. If I rebuild the OSDs on Fedora (20 in this case), everything works fine. In both scenarios I used ceph 0.80.1 on all the boxes. Is that expected?
Re: [ceph-users] rbd: add failed: (34) Numerical result out of range
On Mon, Jun 9, 2014 at 11:48 AM, lists+c...@deksai.com wrote:
> When I used CentOS osds, and tried to rbd map from arch linux or fedora, I would get
> rbd: add failed: (34) Numerical result out of range. [...] Is that expected?

No, it's most certainly not expected. If you are willing to help debug this, let's start with the output of 'rbd info'. Return to the failing setup, do 'rbd map image', make sure it fails, and, on the same box, do 'rbd info image'.

Thanks,
Ilya
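A minimal sketch of the debugging pass Ilya is asking for, assuming the image is simply named image in the default pool (substitute your own pool/image names):

    rbd map image          # reproduce the failure
    dmesg | tail           # capture any kernel rbd messages from the attempt
    rbd info image         # size, order, format, and features of the image

The rbd info output and any dmesg lines are what is worth posting back to the list.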
[ceph-users] add new data host
Hi all, I'm adding a new ceph data host, but:

# ceph -s -k /etc/ceph/ceph.client.admin.keyring
2014-06-09 17:39:51.686082 7fade4f14700 0 librados: client.admin authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError

My ceph.conf:

[global]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
keyring = /etc/ceph/ceph.client.admin.keyring

Any suggestions? Thanks all -- TABA
Re: [ceph-users] add new data host
I solved this by exporting the key with ceph auth export... :D In the question above, I was using a key in the old format.

On 06/09/2014 05:44 PM, Ta Ba Tuan wrote:
> I'm adding a new ceph data host, but got: librados: client.admin authentication
> error (1) Operation not permitted [...]
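A sketch of the fix described above, for the client.admin entity (run it on a node that can already authenticate, then copy the file to the new host):

    ceph auth export client.admin -o /etc/ceph/ceph.client.admin.keyring

This regenerates the keyring file in the current format, replacing the old-format key file that caused the PermissionError.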
[ceph-users] rbd snap protect error
Hi all, I installed Ceph Firefly and now I am playing with rbd snapshots. I created a pool (libvirt-pool) with two images: libvirtimage1 (format 1) and image2 (format 2). When I try to protect the first image:

rbd --pool libvirt-pool snap protect --image libvirtimage1 --snap libvirt-snap

it gives an error because the image is in format 1: image must support layering. This is correct, since libvirtimage1 is format 1. But if I try with the second image:

rbd --pool libvirt-pool snap protect --image image2 --snap image2-snap

it gives the following: snap failed (2) No such file or directory. Image2 exists; in fact I can see it:

rbd -p libvirt-pool ls
libvirtimage1
image2

Could someone help me, please? Regards
Re: [ceph-users] rbd snap protect error
On Mon, Jun 9, 2014 at 3:01 PM, Ignazio Cassano ignaziocass...@gmail.com wrote:
> rbd --pool libvirt-pool snap protect --image image2 --snap image2-snap
> it gives the following: snap failed (2) No such file or directory

You have to create the snapshot first:

rbd --pool libvirt-pool snap create --image image2 --snap image2-snap
rbd --pool libvirt-pool snap protect --image image2 --snap image2-snap

Thanks,
Ilya
Re: [ceph-users] rbd snap protect error
Many thanks... Can I create a format 2 image (with support for layering snapshots) using the qemu-img command?

2014-06-09 13:05 GMT+02:00 Ilya Dryomov ilya.dryo...@inktank.com:
> You have to create the snapshot first:
> rbd --pool libvirt-pool snap create --image image2 --snap image2-snap
> rbd --pool libvirt-pool snap protect --image image2 --snap image2-snap
Re: [ceph-users] rbd snap protect error
On 06/09/2014 02:00 PM, Ignazio Cassano wrote:
> Many thanks... Can I create a format 2 image (with support for layering snapshots)
> using the qemu-img command?

Yes:

qemu-img create -f raw rbd:rbd/image1:rbd_default_format=2 10G

'rbd_default_format' is a Ceph setting which is passed down to librbd directly.

Wido

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
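A quick way to verify the result, using the pool/image names from the command above:

    rbd info rbd/image1    # the output should include 'format: 2'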
Re: [ceph-users] rbd snap protect error
Many thanks!

2014-06-09 14:04 GMT+02:00 Wido den Hollander w...@42on.com:
> qemu-img create -f raw rbd:rbd/image1:rbd_default_format=2 10G [...]
Re: [ceph-users] Recommended way to use Ceph as storage for file server
We have an NFS-to-RBD gateway with a large number of smaller RBDs. In our use case, users request their own RBD containers, which are then served up via NFS to a mixed cluster of clients. Our gateway is quite beefy, probably more than it needs to be: 2x 8-core CPUs and 96GB RAM. It was pressed into this service from a pool of homogeneous servers rather than being spec'd out for this role explicitly (it could likely be less beefy), and it has performed well. It connects via 2x 10Gb NICs in a transmit-load-balance config.

An individual RBD behind this NFS gateway won't see the parallel performance advantages that CephFS promises. However, one potential advantage of a multi-RBD backend is that it can simultaneously serve NFS client requests isolated to different RBDs. One RBD may still get a heavy load, but at least the server as a whole can spread requests across different devices. I haven't done load comparisons, so this is just a point of interest; it's probably moot if the kernel doesn't do a good job of spreading NFS load across threads, or if there is some other kernel/RBD constriction point.

~jpr

On 06/02/2014 12:35 PM, Dimitri Maziuk wrote:
>> A more or less obvious alternative for CephFS would be to simply create a huge RBD
>> and have a separate file server (running NFS / Samba / whatever) use that block
>> device as backend. Just put a regular FS on top of the RBD and use it that way.
>> Clients wouldn't really have any of the real performance and resilience benefits
>> that Ceph could offer though, because the (single machine?) file server is now the
>> bottleneck.
>
> Performance: assuming all your nodes are fast storage on a quad-10Gb pipe.
> Resilience: your gateway can be an active-passive HA pair; that shouldn't be any
> different from NFS+DRBD setups.
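A minimal sketch of the per-user container pattern described above, assuming XFS on the mapped device and an NFS client subnet of 10.0.0.0/24 (image name, size, and paths are placeholders):

    rbd create userfoo --size 102400      # 100 GB image for one user
    rbd map userfoo                       # shows up as e.g. /dev/rbd0
    mkfs.xfs /dev/rbd0
    mkdir -p /exports/userfoo
    mount /dev/rbd0 /exports/userfoo
    echo '/exports/userfoo 10.0.0.0/24(rw,no_root_squash)' >> /etc/exports
    exportfs -ra                          # publish the new export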
[ceph-users] ceph-deploy 1.5.4 (addressing packages coming from EPEL)
Hi All, We've experienced a lot of issues since EPEL started shipping a 0.80.1-2 package that yum sees as newer than 0.80.1 and therefore installs in preference to ours. That package has some issues from what we have seen, and in most cases it breaks the installation process. There is a new version of ceph-deploy (1.5.4) that addresses this problem by setting repo priorities so that ceph.repo is considered before the EPEL one. Some improvements were also made to how ceph-deploy parses cephdeploy.conf files so that priorities can be correctly set (and honored) from there as well. The changelog with the details of this release can be found here: http://ceph.com/ceph-deploy/docs/changelog.html#id1 Make sure you update! -Alfredo
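For reference, the fix amounts to pinning the Ceph repo with the yum priorities plugin so it wins over EPEL; a sketch of what the resulting /etc/yum.repos.d/ceph.repo stanza looks like (the baseurl shown is an assumption for a firefly/el6 install):

    [ceph]
    name=Ceph packages
    baseurl=http://ceph.com/rpm-firefly/el6/x86_64
    enabled=1
    gpgcheck=1
    priority=1

priority=1 is what makes yum prefer this repo over EPEL, whose default priority is 99.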
[ceph-users] Teuthology: Need help on Lock server setup running schedule_suite.sh
Hi, I am trying to run schedule_suite.sh against our custom Ceph build to leverage the Inktank suites in our testing. Can someone help me use this shell script with my own targets, instead of having it pick machines from the Ceph lab? Also, has anyone set up a lock server for this script to run against? If so, please share the details of how to set it up. Thanks and Regards, Rajesh Raman
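A sketch of what supplying your own machines can look like, assuming a teuthology version that accepts a targets stanza in the job yaml (users, hosts, and keys below are placeholders):

    targets:
      ubuntu@host1.example.com: ssh-rsa AAAA...
      ubuntu@host2.example.com: ssh-rsa AAAA...

The lock server address itself normally goes in ~/.teuthology.yaml as a lock_server: entry pointing at your own instance.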
Re: [ceph-users] ceph-deploy 1.5.4 (addressing packages coming from EPEL)
Thanks Alfredo, happy to see your email. I was a victim of this problem; hope 1.5.4 will take away my pain :-)

- Karan Sing -

On 09 Jun 2014, at 15:33, Alfredo Deza alfredo.d...@inktank.com wrote:
> http://ceph.com/ceph-deploy/docs/changelog.html#id1
Re: [ceph-users] OSD keyrings shifted and down
More detail on this: I recently upgraded my Ceph cluster from Emperor to Firefly. After the upgrade, I noticed one of the OSDs not coming back to life. While troubleshooting, I rebooted the OSD server and the keyrings shifted.

My environment: 4x OSD servers (each with 12 disks: 1 for the OS and 11 for OSDs), plus 1x mon + mds + admin node for ceph-deploy.

Hopefully someone out there has experienced a similar situation; if you have, please share your fixes.

Thanks,
Jimmy

From: J L j...@yahoo-inc.com
Date: Thursday, June 5, 2014 at 4:20 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] OSD keyrings shifted and down

Hello Ceph gurus,

I rebooted an OSD server to fix "osd.33". When the server came back online, all of its OSDs were down, and while restarting them I got the authentication error below. I also noticed the keyring for each OSD had shifted. For example, osd.33 is mapped to /var/lib/ceph/osd/ceph-33, so its keyring should contain [osd.33]; instead it contains [osd.34]. Can I simply change the osd.# in each keyring to correct the mapping, or is there a proper fix? Please help. Thanks in advance!!

-Jimmy

[root@gfsnode1 ceph-34]# service ceph start osd.34
=== osd.34 ===
2014-06-05 15:08:54.053958 7f08f2b47700 0 librados: osd.34 authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError
failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.34 --keyring=/var/lib/ceph/osd/ceph-34/keyring osd crush create-or-move -- 34 2.73 host=gfsnode1 root=default'

[root@gfsnode1 osd]# ls -l
total 0
lrwxrwxrwx 1 root root 12 May 15 15:21 ceph-33 -> /ceph/osd120
lrwxrwxrwx 1 root root 12 May 15 15:22 ceph-34 -> /ceph/osd121
lrwxrwxrwx 1 root root 12 May 15 15:23 ceph-35 -> /ceph/osd122
lrwxrwxrwx 1 root root 12 May 15 15:24 ceph-36 -> /ceph/osd123
lrwxrwxrwx 1 root root 12 May 15 15:24 ceph-37 -> /ceph/osd124
lrwxrwxrwx 1 root root 12 May 15 15:25 ceph-38 -> /ceph/osd125
lrwxrwxrwx 1 root root 12 May 15 15:25 ceph-39 -> /ceph/osd126
lrwxrwxrwx 1 root root 12 May 15 15:26 ceph-40 -> /ceph/osd127
lrwxrwxrwx 1 root root 12 May 15 15:27 ceph-41 -> /ceph/osd128
lrwxrwxrwx 1 root root 12 May 15 15:27 ceph-42 -> /ceph/osd129
lrwxrwxrwx 1 root root 12 May 15 15:28 ceph-43 -> /ceph/osd130

[root@gfsnode1 osd]# cat ceph-33/keyring
[osd.34]
key = AQAwPnVT6G7fBRAA86D4FuxN0U8uKXk0brPbCQ==
[root@gfsnode1 osd]# cat ceph-34/keyring
[osd.35]
key = AQBbPnVTmG4BLxAA6UV6XHbZepXUEXB6VJQzEA==
[root@gfsnode1 osd]# cat ceph-35/keyring
[osd.36]
key = AQCDPnVTuL97JRAA1soDHToJ1c6WhXX+mnnRPw==
[root@gfsnode1 osd]# cat ceph-36/keyring
[osd.37]
key = AQCwPnVTYAttNhAAomeRalOEHWlyO7C9tF+7SQ==
[root@gfsnode1 osd]# cat ceph-37/keyring
[osd.38]
key = AQDKPnVTQC1DLBAAl0959S0st+UcFw8uOppa7g==
[root@gfsnode1 osd]# cat ceph-38/keyring
[osd.39]
key = AQDjPnVTMFGwNxAABH5M1Y8uXoqecPesS09IGw==
[root@gfsnode1 osd]# cat ceph-39/keyring
[osd.40]
key = AQChQXVT6JHiBxAAohTnBGxb2ZAbgCjt5M0xBw==
[root@gfsnode1 osd]# cat ceph-40/keyring
[osd.41]
key = AQBGP3VTAHI0CRAAZkcUPLOFT1jx9v3DVNX4nQ==
[root@gfsnode1 osd]# cat ceph-41/keyring
[osd.42]
key = AQAEsIdTMBTjChAAfJrsqIEBcCGEXv0jcK2vtQ==
[root@gfsnode1 osd]# cat ceph-42/keyring
[osd.43]
key = AQB6P3VT2KW7ORAAU+1Ix/fUXIBU8jky0BQ9jw==
[root@gfsnode1 osd]# cat ceph-43/keyring
cat: ceph-43/keyring: No such file or directory
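If the on-disk keyrings no longer match what the monitors hold, one way to rewrite each of them from the cluster's authoritative copy is the following sketch (assuming osd ids 33-43 as above; run it from a node with a working admin key, with the OSDs stopped):

    for id in $(seq 33 43); do
        # fetch the key the monitors actually have for this osd
        ceph auth get osd.$id -o /var/lib/ceph/osd/ceph-$id/keyring
    done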
Re: [ceph-users] perplexed by unmapped groups on fresh firefly install
Miki,

osd crush chooseleaf type is set to 1 by default, which means CRUSH tries to place each replica on a different host, not on different OSDs within the same host. You would need to set it to 0 for a 1-node cluster.

John

On Sun, Jun 8, 2014 at 10:40 PM, Miki Habryn dic...@rcpt.to wrote:
> I set up a single-node, dual-osd cluster following the Quick Start on ceph.com with
> Firefly packages, adding osd pool default size = 2. All of the pgs came up in
> active+remapped or active+degraded status. I read up on tunables and set them to
> optimal, to no result, so I added a third osd instead. About 39 pgs moved to active
> status, but the rest stayed in active+remapped or active+degraded. When I raised the
> replication level to 3 with ceph osd pool set ... size 3, all the pgs went back to
> degraded or remapped. Just for kicks, I tried to set the replication level to 1, and
> I still only got 39 pgs active. Is there something obvious I'm doing wrong?

--
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com
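A sketch of the setting for a fresh single-node cluster; it goes in ceph.conf before the cluster is created:

    [global]
    osd crush chooseleaf type = 0

On an already-deployed cluster, the equivalent change is decompiling the CRUSH map and switching the rule's chooseleaf step from type host to type osd:

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt: step chooseleaf firstn 0 type osd
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new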
Re: [ceph-users] failed assertion on AuthMonitor
Barring a newly-introduced bug (doubtful), that assert basically means that your computer lied to the Ceph monitor about the durability or ordering of data going to disk, and the store is now inconsistent. If you don't have data you care about on the cluster, by far your best option is:

1) figure out what part of the system is lying about data durability (probably your filesystem or controller is ignoring barriers), and
2) start the Ceph install over.

It's possible that ceph-monstore-tool will let you edit the store back into a consistent state, but it looks like the system can't find the *initial* commit, which means you'd need to manufacture a new one wholesale with the right keys from the other system components. (I am assuming the system didn't crash right while you were turning on the monitor for the first time; if it did, that makes it slightly more likely to be a bug on our end, but again, it'll be easiest to just start over since you don't have any data in it yet.)

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Sun, Jun 8, 2014 at 10:26 PM, Mohammad Salehe sal...@gmail.com wrote:
> Hi, I'm receiving a failed assertion in AuthMonitor::update_from_paxos(bool*) after
> a system crash. I've saved a complete monitor log with 10/20 for 'mon' and 'paxos'
> here. There is only one monitor and two OSDs in the cluster, as I was just at the
> beginning of deployment. I will be thankful if someone could help.
>
> --
> Mohammad Salehe
> sal...@gmail.com
[ceph-users] How to avoid deep-scrubbing performance hit?
I've correlated a large deep scrubbing operation to cluster stability problems.

My primary cluster does a small number of deep scrubs all the time, spread out over the whole week. It has no stability problems. My secondary cluster doesn't spread them out; it saves them up and tries to do all of the deep scrubs over the weekend. The secondary starts losing OSDs about an hour after these deep scrubs start.

To avoid this, I'm thinking of writing a script that continuously scrubs the oldest outstanding PG. Roughly, in bash:

# Repeatedly pick the PG with the oldest deep-scrub timestamp and scrub it.
while read -r date time pg < <(
    ceph pg dump |
    awk '$1 ~ /^[0-9a-f]+\.[0-9a-f]+$/ {print $20, $21, $1}' |
    sort | head -1)
do
    ceph pg deep-scrub "${pg}"
    # Wait for the deep scrub to finish before picking the next PG.
    while ceph status | grep -q scrubbing+deep
    do
        sleep 5
    done
    sleep 30
done

Does anybody think this will solve my problem?

I'm also considering disabling deep scrubbing until the secondary finishes replicating from the primary. Once it's caught up, the write load should drop enough that opportunistic deep scrubs have a chance to run. It should only take another week or two to catch up.
Re: [ceph-users] How to avoid deep-scrubbing performance hit?
On Mon, Jun 9, 2014 at 3:22 PM, Craig Lewis cle...@centraldesktop.com wrote:
> My secondary cluster doesn't spread them out; it saves them up and tries to do all
> of the deep scrubs over the weekend. The secondary starts losing OSDs about an hour
> after these deep scrubs start. [...]
> Does anybody think this will solve my problem?

If the problem is just that your secondary cluster is under a heavy write load, and so the scrubbing won't run automatically until the PGs hit their time limit, maybe it's appropriate to change the limits so they can run earlier. You can bump up osd scrub load threshold. Or maybe that would be a terrible thing to do, not sure. But it sounds like the cluster is just skipping the voluntary scrubs, and then they all come due at once (probably from some earlier event).

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
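A sketch of raising that threshold, both at runtime and persistently (the value 5.0 is only an example; it is compared against the node's load average):

    ceph tell 'osd.*' injectargs '--osd-scrub-load-threshold 5.0'

and in ceph.conf:

    [osd]
    osd scrub load threshold = 5.0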
Re: [ceph-users] How to avoid deep-scrubbing performance hit?
Craig,

I've struggled with the same issue for quite a while. If your i/o is similar to mine, I believe you are on the right track. For the past month or so, I have been running this cronjob:

* * * * * for strPg in `ceph pg dump | egrep '^[0-9]\.[0-9a-f]{1,4}' | sort -k20 | awk '{ print $1 }' | head -2`; do ceph pg deep-scrub $strPg; done

That roughly handles my 20672 PGs that are set to be deep-scrubbed every 7 days. Your script may be a bit better, but this quick and dirty method has helped my cluster maintain more consistency. The real key for me is to avoid the clumpiness I have observed without that hack, where concurrent deep scrubs sit at zero for a long period of time (despite having PGs that are months overdue for a deep scrub), then suddenly spike up and stay in the teens for hours, killing client writes/second.

The scrubbing behavior table[0] indicates that a periodic tick initiates scrubs on a per-PG basis. Perhaps the timing of ticks isn't sufficiently randomized when you restart lots of OSDs concurrently (for instance via pdsh). On my cluster, I suffer a significant drag on client writes/second when I exceed perhaps four or five concurrent PGs in deep scrub. When concurrent deep scrubs get into the teens, I see a massive drop in client writes/second.

Greg, is there locking involved when a PG enters deep scrub? If so, is the entire PG locked for the duration, or is each individual object inside the PG locked as it is processed? Some of my PGs will be in deep scrub for minutes at a time.

0: http://ceph.com/docs/master/dev/osd_internals/scrub/

Thanks,
Mike Dawson

On 6/9/2014 6:22 PM, Craig Lewis wrote:
> I've correlated a large deep scrubbing operation to cluster stability problems. [...]
[ceph-users] Fail to Block Devices and OpenStack
Hi, I'm failing to get OpenStack and Ceph to cooperate. I configured everything based on this URL: http://ceph.com/docs/next/rbd/rbd-openstack/. I can see the state of the Ceph cluster from OpenStack (the Ceph client), but the failure occurs at cinder create.

Ceph cluster: CentOS release 6.5, Ceph 0.80.1
OpenStack: Ubuntu 12.04.4, OpenStack DevStack Icehouse

# glance image-create --name cirros --disk-format raw --container-format ovf --file /usr/local/src/cirros-0.3.2-x86_64-disk.raw --is-public True
+------------------+--------------------------------------+
| Property         | Value                                |
+------------------+--------------------------------------+
| checksum         | cf2392db1f59d59ed69a8f8491b670e0     |
| container_format | ovf                                  |
| created_at       | 2014-06-09T05:04:48                  |
| deleted          | False                                |
| deleted_at       | None                                 |
| disk_format      | raw                                  |
| id               | f4a0f971-437b-4d3f-a0c4-1c82f31e9f1e |
| is_public        | True                                 |
| min_disk         | 0                                    |
| min_ram          | 0                                    |
| name             | cirros                               |
| owner            | 5a10a1fed82b45a7affaf57f814434bb     |
| protected        | False                                |
| size             | 41126400                             |
| status           | active                               |
| updated_at       | 2014-06-09T05:04:50                  |
| virtual_size     | None                                 |
+------------------+--------------------------------------+

# cinder create --image-id f4a0f971-437b-4d3f-a0c4-1c82f31e9f1e --display-name boot-from-rbd 1
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| created_at                     | 2014-06-09T05:12:51.00               |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | 30d1eee7-54d6-4911-af06-b35d2f8ef0c4 |
| metadata                       | {}                                   |
| name                           | boot-from-rbd                        |
| os-vol-host-attr:host          | None                                 |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 5a10a1fed82b45a7affaf57f814434bb     |
| size                           | 1                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| user_id                        | 90ed966837e44f91a582b73960dd848c     |
| volume_type                    | None                                 |
+--------------------------------+--------------------------------------+

# cinder list
+--------------------------------------+--------+---------------+------+-------------+----------+-------------+
| ID                                   | Status | Name          | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+--------+---------------+------+-------------+----------+-------------+
| 30d1eee7-54d6-4911-af06-b35d2f8ef0c4 | error  | boot-from-rbd | 1    | None        | false    |             |
+--------------------------------------+--------+---------------+------+-------------+----------+-------------+

I've done all the settings from the URL (http://ceph.com/docs/next/rbd/rbd-openstack/). Is there any setup required beyond that page?

Best Regards,
Yamashita
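When a volume goes to error status right after creation, the cinder-volume log (e.g. /var/log/cinder/cinder-volume.log, or the cinder-volume screen session under DevStack) usually shows the cause; common culprits are the rbd settings in cinder.conf and the libvirt secret. A sketch of the cinder settings from that guide (the uuid shown is the guide's example value; substitute your own, and note that on Icehouse these live under [DEFAULT] unless multi-backend is configured):

    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    rbd_pool = volumes
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_flatten_volume_from_snapshot = false
    rbd_max_clone_depth = 5
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
    glance_api_version = 2

Also check that /etc/ceph/ceph.client.cinder.keyring exists on the OpenStack node and is readable by the user running cinder-volume.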