Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
Here is the new link for the sample book: https://www.dropbox.com/s/2zcxawtv4q29fm9/Learning_Ceph_Sample.pdf?dl=0

Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science, Keilaranta 14, P.O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758 | tel. +358 9 4572001 | fax +358 9 4572302
http://www.csc.fi/

On 13 Feb 2015, at 05:25, Frank Yu flyxia...@gmail.com wrote:
Wow, congrats! BTW, I found the link to the sample copy is a 404.

2015-02-06 6:53 GMT+08:00 Karan Singh karan.si...@csc.fi:
Hello Community Members,
I am happy to introduce the first book on Ceph, titled "Learning Ceph". Many folks from the publishing house and I, together with the technical reviewers, spent several months getting this book compiled and published. The book is now up for sale; I hope you will like it and learn a lot from it.
Amazon: http://www.amazon.com/Learning-Ceph-Karan-Singh/dp/1783985623/ref=sr_1_1?s=booksie=UTF8qid=1423174441sr=1-1keywords=ceph
Packtpub: https://www.packtpub.com/application-development/learning-ceph
You can grab the sample copy from here: https://www.dropbox.com/s/ek76r01r9prs6pb/Learning_Ceph_Packt.pdf?dl=0
Finally, I would like to express my sincere thanks to:
Sage Weil - for developing Ceph and everything around it, as well as writing the foreword for "Learning Ceph".
Patrick McGarry - for his usual off-the-track support, as always.
Last but not least, our great community members who are also the reviewers of the book: Don Talton, Julien Recurt, Sebastien Han and Zihong Chen. Thank you for your efforts.
Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science, Keilaranta 14, P.O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758 | tel. +358 9 4572001 | fax +358 9 4572302
http://www.csc.fi/

--
Regards
Frank Yu

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
Karan, will the book be available in Russian? Thanks.

2015-02-13 11:43 GMT+03:00 Karan Singh karan.si...@csc.fi:
Here is the new link for the sample book: https://www.dropbox.com/s/2zcxawtv4q29fm9/Learning_Ceph_Sample.pdf?dl=0
--
Best regards,
Irek Fasikhov
Mob.: +79229045757
Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
Great, thanks! :)

2015-02-13 16:43 GMT+08:00 Karan Singh karan.si...@csc.fi:
Here is the new link for the sample book: https://www.dropbox.com/s/2zcxawtv4q29fm9/Learning_Ceph_Sample.pdf?dl=0
--
Regards
Frank Yu
[ceph-users] Question about ceph exclusive object?
Hello,

What exactly does the parameter 'bool exclusive' mean in int librados::IoCtxImpl::create(const object_t &oid, bool exclusive)? I can't find any documentation describing it :-(

--
Den
Re: [ceph-users] Question about ceph exclusive object?
On 2015-02-13 17:54, Dennis Chen wrote:
Hello,
What exactly does the parameter 'bool exclusive' mean in int librados::IoCtxImpl::create(const object_t &oid, bool exclusive)? I can't find any doc to describe this :-(

From the documentation in librados.h: it should be set to either LIBRADOS_CREATE_EXCLUSIVE (fail if the file already exists) or LIBRADOS_CREATE_IDEMPOTENT (continue if the file already exists).

-kv
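The two flags map onto familiar POSIX open() semantics. Below is a small illustrative sketch in Python — an analogy, not the librados API itself: LIBRADOS_CREATE_EXCLUSIVE behaves like O_CREAT|O_EXCL, while LIBRADOS_CREATE_IDEMPOTENT behaves like plain O_CREAT.

```python
import os
import tempfile

def create(path, exclusive):
    # Analogy for librados create(): 'exclusive' maps to O_EXCL semantics.
    flags = os.O_CREAT | os.O_WRONLY
    if exclusive:
        flags |= os.O_EXCL  # like LIBRADOS_CREATE_EXCLUSIVE: fail if it exists
    try:
        os.close(os.open(path, flags, 0o600))
        return "ok"
    except FileExistsError:  # like librados returning -EEXIST
        return "exists"

d = tempfile.mkdtemp()
obj = os.path.join(d, "obj")
print(create(obj, exclusive=True))   # "ok": object did not exist yet
print(create(obj, exclusive=True))   # "exists": exclusive create refuses
print(create(obj, exclusive=False))  # "ok": idempotent create succeeds anyway
```

In other words, an exclusive create that loses the race fails with an error instead of silently succeeding, which is what you want when two clients may try to create the same object.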
Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
Just bought it. Nice book; I haven't read all of it yet, but it seems to cover all the Ceph features. Good job!

----- Original message -----
From: Karan Singh karan.si...@csc.fi
To: Ceph Community ceph-commun...@lists.ceph.com, ceph-users ceph-users@lists.ceph.com, ceph-maintain...@ceph.com, ceph-users ceph-us...@ceph.com, ceph-devel ceph-de...@vger.kernel.org
Cc: Sage Weil sw...@redhat.com, don don...@thoughtstorm.net
Sent: Thursday, 5 February 2015 23:53:11
Subject: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
Re: [ceph-users] ceph Performance with SSD journal
Hi. What version?

2015-02-13 6:04 GMT+03:00 Sumit Gaur sumitkg...@gmail.com:
Hi Chris,
Please find my answers inline below.

On Thu, Feb 12, 2015 at 12:42 PM, Chris Hoy Poy ch...@gopc.net wrote:
Hi Sumit,
A couple of questions:

What brand/model SSD?
Samsung 480GB SSD (PM853T), rated at 90K random-write IOPS (4K, 368 MBps).

What brand/model HDD?
300GB SAS HDD (Seagate); each node has 64GB memory and a 10Gb NIC.

Also, how are they connected to the controller/motherboard? Are they sharing a bus (i.e. a SATA expander)?
No, they are connected to a local bus, not a SATA expander.

RAM?
64GB

Also look at the output of iostat -x or similar: are the SSDs hitting 100% utilisation?
No, the SSD was hitting only 2000 IOPS.

I suspect that the 5:1 ratio of HDDs to SSDs is not ideal; you now have 5x the write IO trying to fit into a single SSD.
I have not seen any documented reference for calculating the ratio — could you suggest one? I should mention that the results for 1024K writes improved a lot; the problem is with 1024K reads and 4K writes:
SSD journal: 810 IOPS and 810 MBps
HDD journal: 620 IOPS and 620 MBps

I'll take a punt on it being a SATA-connected SSD (most common): 5x ~130 megabytes/second gets very close to most SATA bus limits. If it's a shared bus, you possibly hit that limit even earlier (since all that data is now being written twice over the bus).
cheers
\Chris

From: Sumit Gaur sumitkg...@gmail.com
To: ceph-users@lists.ceph.com
Sent: Thursday, 12 February, 2015 9:23:35 AM
Subject: [ceph-users] ceph Performance with SSD journal

Hi Ceph experts,
I have a small Ceph architecture-related question. As blogs and documents suggest, Ceph performs much better if we put the journal on SSD. I built a Ceph cluster with 30 HDDs + 6 SSDs across 6 OSD nodes: 5 HDDs + 1 SSD per node, with each SSD carrying 5 partitions journaling the 5 OSDs on that node. I then ran the same tests I ran on the all-HDD setup.
Two of the results went in the opposite direction from what I expected:
1) 4K write IOPS are lower on the SSD-journal setup — not a major difference, but lower.
2) 1024K read IOPS are lower on the SSD-journal setup than on the HDD setup.
On the other hand, 4K reads and 1024K writes both show much better numbers on the SSD setup. Let me know if I am missing some obvious concept.
Thanks
Sumit

--
Best regards,
Irek Fasikhov
Mob.: +79229045757
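Chris's 5:1 concern can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes the ~130 MB/s per-HDD streaming figure from the thread and a ~550 MB/s SATA-3 ceiling (an assumed typical value, not measured here):

```python
# Back-of-the-envelope check of the 5 HDD : 1 SSD journal ratio.
# Assumptions (not measurements): each HDD can stream ~130 MB/s,
# and a SATA-3 SSD tops out around ~550 MB/s of sequential writes.
hdd_per_ssd = 5
hdd_stream_mbps = 130          # MB/s per spinning disk
sata3_ceiling_mbps = 550       # MB/s practical SATA-3 limit

# Every client write hits the journal first, so the single SSD must
# absorb the combined write stream of all 5 backing HDDs.
journal_load = hdd_per_ssd * hdd_stream_mbps
print(journal_load)                       # 650 MB/s demanded of the SSD
print(journal_load > sata3_ceiling_mbps)  # True: the journal SSD saturates first
```

Under these assumptions the journal SSD, not the HDDs, becomes the bottleneck, which is consistent with Chris's suggestion that the 5:1 ratio is too aggressive for a SATA SSD.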
Re: [ceph-users] OSD slow requests causing disk aborts in KVM
Can this timeout be increased in some way? I've searched around and found the /sys/block/sdX/device/timeout knob, which in my case is set to 30s.

Yes, sure:
echo 60 > /sys/block/sdX/device/timeout
for 60s, for example.

----- Original message -----
From: Krzysztof Nowicki krzysztof.a.nowi...@gmail.com
To: Andrey Korolyov and...@xdel.ru, aderumier aderum...@odiso.com
Cc: ceph-users ceph-users@lists.ceph.com
Sent: Friday, 13 February 2015 08:18:26
Subject: Re: [ceph-users] OSD slow requests causing disk aborts in KVM

On Thu, Feb 12, 2015 at 16:23:38, Andrey Korolyov and...@xdel.ru wrote:
On Fri, Feb 6, 2015 at 12:16 PM, Krzysztof Nowicki krzysztof.a.nowi...@gmail.com wrote:
Hi all,
I'm running a small Ceph cluster with 4 OSD nodes which serves as a storage backend for a set of KVM virtual machines. The VMs use RBD for disk storage. On the VM side I'm using virtio-scsi instead of virtio-blk in order to gain DISCARD support. Each OSD node runs on a separate machine, using a 3TB WD Black drive plus a Samsung SSD for the journal. The machines used for the OSD nodes are not equal in spec: three of them are small servers, while one is a desktop PC. The last node is the one causing trouble. During high load caused by remapping (due to one of the other nodes going down) I experienced some slow requests. To my surprise, these slow requests caused aborts from the block device on the VM side, which ended up corrupting files. What I wonder is whether such behaviour (aborts) is normal when slow requests pile up. I always thought these requests would be delayed but eventually handled. Are there any tunables that would help me avoid such situations? I would really like to avoid VM outages caused by such corruption issues. I can attach some logs if needed.
Best regards
Chris

Hi, this is an inevitable payoff for using the SCSI backend on storage capable of slow enough operations.
There were some argonaut/bobtail-era discussions on the Ceph mailing list; those may be interesting reading for you. AFAIR the SCSI disk would abort after 70s of not receiving an ack for a pending operation.

Can this timeout be increased in some way? I've searched around and found the /sys/block/sdX/device/timeout knob, which in my case is set to 30s.

As for versions, I'm running all Ceph nodes on Gentoo with Ceph version 0.80.5. The VM guest in question runs Ubuntu 12.04 LTS with kernel 3.13. The guest filesystem is Btrfs; I'm thinking the corruption may be a Btrfs bug.
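Note that the echo into /sys/block/sdX/device/timeout does not survive a reboot or a device re-add. One common way to make it persistent is a udev rule — a sketch under assumptions: the rule file name is arbitrary, and the match may need narrowing to only the affected virtio-scsi disks:

```
# /etc/udev/rules.d/99-scsi-timeout.rules (hypothetical file name)
# Raise the SCSI command timeout to 60 seconds for sd* block devices.
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?", ATTR{device/timeout}="60"
```

After placing the rule, re-plugging the device or rebooting applies the new timeout automatically.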
Re: [ceph-users] certificate of `ceph.com' is not trusted!
I think the root CA (COMODO RSA Certification Authority) is not available on your Linux host? Using Google Chrome, connecting to https://ceph.com/ works fine.

No, it's a wget bug. I have now switched to LWP::UserAgent and it works perfectly.
Re: [ceph-users] OSD slow requests causing disk aborts in KVM
Thanks for the info. I'll try that for now and see if it helps.

On Fri, Feb 13, 2015 at 09:00:59, Alexandre DERUMIER aderum...@odiso.com wrote:
Can this timeout be increased in some way? I've searched around and found the /sys/block/sdX/device/timeout knob, which in my case is set to 30s.

Yes, sure:
echo 60 > /sys/block/sdX/device/timeout
for 60s, for example.
Re: [ceph-users] Status of SAMBA VFS
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 6 Feb 2015, Gregory Farnum wrote:
On Fri, Feb 6, 2015 at 7:11 AM, Dennis Kramer (DT) den...@holmes.nl wrote:
On Fri, 6 Feb 2015, Gregory Farnum wrote:
On Fri, Feb 6, 2015 at 6:39 AM, Dennis Kramer (DT) den...@holmes.nl wrote:
I've used the upstream module for our production CephFS cluster, but I've noticed a bug where timestamps aren't being updated correctly: modified files are being reset to the beginning of Unix time. This bug only seems to manifest with applications like MS Office that add extra metadata to files. If I, for example, modify a text file in Notepad, everything works fine, but when I modify a .docx (or .xls, for that matter), the timestamp gets reset to 1-1-1970. You can imagine this could be a real dealbreaker for production use (think of backups/rsyncs based on mtimes, which become useless).
Furthermore, the values returned for free/total disk space are also incorrect when you mount a share in Windows: my 340TB cluster had 7.3EB of storage available according to Windows ;) This can be worked around with a custom 'dfree command = <script>' in smb.conf, but the VFS module overrides it, so the script has no effect (unless you remove the lines of code for these disk operations from vfs_ceph.c).
My experience with the VFS module is pretty awesome nonetheless. I really noticed an improvement in throughput when using this module instead of a re-export via the kernel client, so I hope the VFS module will be actively maintained again soon.

Can you file bugs for these? The timestamp one isn't anything I've heard of before.

http://tracker.ceph.com/issues/10834

The weird free space on Windows actually does sound familiar; I think it has to do with either Windows or the Samba/Windows interface not handling our odd block sizes properly... This one has been fixed.
http://tracker.ceph.com/issues/10835

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlTd3gsACgkQiJDTKUBxIRseAQCeNFsD4jEQKqsWvGVWPph1m6+S
o+QAoNDy9zrcxdYNYBM+czdMpV9DV6o1
=U/hx
-----END PGP SIGNATURE-----
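For reference, the dfree workaround Dennis mentions would look roughly like this in smb.conf — a sketch: the share name and script path are hypothetical, and as noted it is ignored while vfs_ceph supplies its own disk-free implementation:

```
[cephfs]                        ; hypothetical share name
   path = /mnt/cephfs
   vfs objects = ceph
   ; Delegate free-space reporting to a custom script; by default smbd
   ; expects it to print total and free space in 1024-byte blocks.
   dfree command = /usr/local/bin/ceph-dfree
```

The script itself would typically call the cluster (e.g. via `ceph df`) and print the two numbers on one line.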
Re: [ceph-users] Random OSDs respawning continuously
Hi all,

When I stop the respawning OSD on an OSD node, another OSD starts respawning on the same node. When an OSD starts respawning, it puts the following info in the OSD log:

slow request 31.129671 seconds old, received at 2015-02-13 19:09:32.180496: osd_op(osd.551.95229:11 191 10005c4.0033 [copy-get max 8388608] 13.f4ccd256 RETRY=50 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg

osd.551 is part of the cache tier, and all the respawning OSDs have similar log entries referencing different cache-tier OSDs. If I restart all the OSDs on the cache-tier OSD node, the respawning stops and the cluster returns to the active+clean state. But as soon as I try to write some data to the cluster, random OSDs start respawning again. Can anyone help me solve this issue?

2015-02-13 19:10:02.309848 7f53eef54700 0 log_channel(default) log [WRN] : 11 slow requests, 11 included below; oldest blocked for 30.132629 secs
2015-02-13 19:10:02.309854 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.132629 seconds old, received at 2015-02-13 19:09:32.177075: osd_op(osd.551.95229:63 10002ae.
[copy-from ver 7622] 13.7273b256 RETRY=130 snapc 1=[] ondisk+retry+write+ignore_overlay+enforce_snapc+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:02.309858 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.131608 seconds old, received at 2015-02-13 19:09:32.178096: osd_op(osd.551.95229:41 5 10003a0.0006 [copy-get max 8388608] 13.aefb256 RETRY=118 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:02.309861 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.130994 seconds old, received at 2015-02-13 19:09:32.178710: osd_op(osd.551.95229:26 83 100029d.003b [copy-get max 8388608] 13.a2be1256 RETRY=115 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:02.309864 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.130426 seconds old, received at 2015-02-13 19:09:32.179278: osd_op(osd.551.95229:39 39 10004e9.0032 [copy-get max 8388608] 13.6a25b256 RETRY=105 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:02.309868 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.129697 seconds old, received at 2015-02-13 19:09:32.180007: osd_op(osd.551.95229:97 49 1000553.007e [copy-get max 8388608] 13.c8645256 RETRY=59 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:03.310284 7f53eef54700 0 log_channel(default) log [WRN] : 11 slow requests, 6 included below; oldest blocked for 31.133092 secs 2015-02-13 19:10:03.310305 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.129671 seconds old, received at 2015-02-13 19:09:32.180496: osd_op(osd.551.95229:11 191 10005c4.0033 [copy-get max 8388608] 13.f4ccd256 RETRY=50 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently 
reached_pg 2015-02-13 19:10:03.310308 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.128616 seconds old, received at 2015-02-13 19:09:32.181551: osd_op(osd.551.95229:12 903 10002e4.00d6 [copy-get max 8388608] 13.f56a3256 RETRY=41 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:03.310322 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.127807 seconds old, received at 2015-02-13 19:09:32.182360: osd_op(osd.551.95229:14 165 1000480.0110 [copy-get max 8388608] 13.fd8c1256 RETRY=32 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:03.310327 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.127320 seconds old, received at 2015-02-13 19:09:32.182847: osd_op(osd.551.95229:15 013 100047f.0133 [copy-get max 8388608] 13.b7b05256 RETRY=27 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:03.310331 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.126935 seconds old, received at 2015-02-13 19:09:32.183232: osd_op(osd.551.95229:15 767 100066d.001e [copy-get max 8388608] 13.3b017256 RETRY=25 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg 2015-02-13 19:10:04.310685 7f53eef54700 0 log_channel(default) log [WRN] : 11 slow requests, 1 included below; oldest blocked for 32.133566 secs 2015-02-13 19:10:04.310705 7f53eef54700 0 log_channel(default) log [WRN] : slow request 32.126584 seconds old, received at 2015-02-13 19:09:32.184057: osd_op(osd.551.95229:16 293 1000601.0029 [copy-get max 8388608]
[ceph-users] Issues with device-mapper drive partition names.
When trying to zap and prepare a disk it fails to find the partitions. [ceph@ceph0-mon0 ~]$ ceph-deploy -v disk zap ceph0-node1:/dev/mapper/35000c50031a1c08b [ ceph_deploy.conf ][ DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf [ ceph_deploy.cli ][ INFO ] Invoked (1.5.21): /usr/bin/ceph-deploy -v disk zap ceph0-node1:/dev/mapper/35000c50031a1c08b [ ceph_deploy.osd ][ DEBUG ] zapping /dev/mapper/35000c50031a1c08b on ceph0-node1 [ ceph0-node1 ][ DEBUG ] connection detected need for sudo [ ceph0-node1 ][ DEBUG ] connected to host: ceph0-node1 [ ceph0-node1 ][ DEBUG ] detect platform information from remote host [ ceph0-node1 ][ DEBUG ] detect machine type [ ceph_deploy.osd ][ INFO ] Distro info: CentOS Linux 7.0.1406 Core [ ceph0-node1 ][ DEBUG ] zeroing last few blocks of device [ ceph0-node1 ][ DEBUG ] find the location of an executable [ ceph0-node1 ][ INFO ] Running command: sudo /usr/sbin/ceph-disk zap /dev/mapper/35000c50031a1c08b [ ceph0-node1 ][ DEBUG ] Creating new GPT entries. [ ceph0-node1 ][ DEBUG ] Warning: The kernel is still using the old partition table. [ ceph0-node1 ][ DEBUG ] The new table will be used at the next reboot. [ ceph0-node1 ][ DEBUG ] GPT data structures destroyed! You may now partition the disk using fdisk or [ ceph0-node1 ][ DEBUG ] other utilities. [ ceph0-node1 ][ DEBUG ] Warning: The kernel is still using the old partition table. [ ceph0-node1 ][ DEBUG ] The new table will be used at the next reboot. [ ceph0-node1 ][ DEBUG ] The operation has completed successfully. [ ceph_deploy.osd ][ INFO ] calling partx on zapped device /dev/mapper/35000c50031a1c08b [ ceph_deploy.osd ][ INFO ] re-reading known partitions will display errors [ ceph0-node1 ][ INFO ] Running command: sudo partx -a /dev/mapper/35000c50031a1c08b Now running prepare fails because it can't find the newly created partitions. 
[ceph@ceph0-mon0 ~]$ ceph-deploy -v osd prepare ceph0-node1:/dev/mapper/35000c50031a1c08b [ ceph_deploy.conf ][ DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf [ ceph_deploy.cli ][ INFO ] Invoked (1.5.21): /usr/bin/ceph-deploy -v osd prepare ceph0-node1:/dev/mapper/35000c50031a1c08b [ ceph_deploy.osd ][ DEBUG ] Preparing cluster ceph disks ceph0-node1:/dev/mapper/35000c50031a1c08b: [ ceph0-node1 ][ DEBUG ] connection detected need for sudo [ ceph0-node1 ][ DEBUG ] connected to host: ceph0-node1 [ ceph0-node1 ][ DEBUG ] detect platform information from remote host [ ceph0-node1 ][ DEBUG ] detect machine type [ ceph_deploy.osd ][ INFO ] Distro info: CentOS Linux 7.0.1406 Core [ ceph_deploy.osd ][ DEBUG ] Deploying osd to ceph0-node1 [ ceph0-node1 ][ DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ ceph0-node1 ][ INFO ] Running command: sudo udevadm trigger --subsystem-match=block --action=add [ ceph_deploy.osd ][ DEBUG ] Preparing host ceph0-node1 disk /dev/mapper/35000c50031a1c08b journal None activate False [ ceph0-node1 ][ INFO ] Running command: sudo ceph-disk -v prepare --fs-type xfs --cluster ceph -- /dev/mapper/35000c50031a1c08b [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. 
--lookup osd_fs_mount_options_xfs [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:Will colocate journal with data on /dev/mapper/35000c50031a1c08b [ ceph0-node1 ][ WARNIN ] DEBUG:ceph-disk:Creating journal partition num 2 size 1 on /dev/mapper/35000c50031a1c08b [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:Running command: /sbin/sgdisk --new=2:0:1M --change-name=2:ceph journal --partition-guid=2:b9202d1b-63be-4deb-ad08-0a143a31f4a9 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/mapper/35000c50031a1c08b [ ceph0-node1 ][ DEBUG ] Information: Moved requested sector from 34 to 2048 in [ ceph0-node1 ][ DEBUG ] order to align on 2048-sector boundaries. [ ceph0-node1 ][ DEBUG ] Warning: The kernel is still using the old partition table. [ ceph0-node1 ][ DEBUG ] The new table will be used at the next reboot. [ ceph0-node1 ][ DEBUG ] The operation has completed successfully. [ ceph0-node1 ][ WARNIN ] INFO:ceph-disk:calling partx on prepared device /dev/mapper/35000c50031a1c08b [ ceph0-node1 ][ WARNIN ]
[ceph-users] Any suggestions on the best way to migrate / fix my cluster configuration
Hi,

An engineer who worked for me a couple of years ago set up a Ceph cloud running an old version. We are now seeing serious performance problems that are affecting other systems, so I have tried to research what to do. I have updated to version 0.80.7 and added more hardware, and things are no better. I am now stuck trying to decide how to move forward and wonder if anyone can advise.

The setup was originally 3 machines, each with 2 x 3TB disks, using an ext4 filesystem on the first disk and XFS on the second on each node. One has 8GB RAM and the other two have 4GB. I decided to add a 4th node and set it up with XFS on the primary disk and Btrfs on the second. On this machine the journal, whilst on the same disk as the OSD, is in a raw partition on each disk. The memory is 8GB. It took about 7 days for the data to finally move off the first node onto the fourth node, and already-bad performance became abysmal. I had planned to add a PCI-based SSD as a main drive and put the journals for the two disks on the SSD, but I could not get the SSD to work at all.

I am stuck trying to keep this production cluster operational while finding a way to migrate onto a better configuration, but I am worried that I may make the wrong decision and make things worse. The original 3 machines are running Ubuntu 12.10 and the newest one is 14.10. The node st001 does not have any active OSDs but does have an active NFS server with disk images for virtual machines on it; ideally I want to migrate these onto Ceph.

If anyone can offer any suggestions on the best way to proceed, I would be more than happy to listen, and happy to provide additional information if that would be useful.

Thanks in advance,
Carl Taylor

The machines are SuperMicro blades with Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz CPUs and 4 or 8GB RAM (I intend to upgrade all to 8GB). The disks are 3TB SATA.
# id    weight  type name       up/down reweight
-1      21.38   root default
-2      5.41            host st002
3       2.68                    osd.3   up      1       ext4
1       2.73                    osd.1   up      1       xfs
-3      5.41            host st003
5       2.68                    osd.5   up      1       ext4
2       2.73                    osd.2   up      1       xfs
-4      5.41            host st001
4       2.68                    osd.4   up      0
0       2.73                    osd.0   up      0
-5      5.15            host st004
6       2.64                    osd.6   up      1       btrfs
7       2.51                    osd.7   up      1       xfs

As you can see from the following, there is no rhyme nor reason to the performance.

root@st004:/home/cjtaylor# time ceph tell osd.7 bench
{ "bytes_written": 1073741824, "blocksize": 4194304, "bytes_per_sec": 52021545.00}
real 0m20.885s
user 0m0.041s
sys 0m0.024s

root@st004:/home/cjtaylor# time ceph tell osd.6 bench
{ "bytes_written": 1073741824, "blocksize": 4194304, "bytes_per_sec": 20573023.00}
real 1m8.642s
user 0m0.109s
sys 0m0.031s

root@st003:~# time ceph tell osd.5 bench
{ "bytes_written": 1073741824, "blocksize": 4194304, "bytes_per_sec": 31145749.00}
real 0m36.698s
user 0m0.060s
sys 0m0.032s

root@st003:~# time ceph tell osd.2 bench
{ "bytes_written": 1073741824, "blocksize": 4194304, "bytes_per_sec": 24935604.00}
real 0m44.964s
user 0m0.076s
sys 0m0.020s

root@st002:~# time ceph tell osd.3 bench
{ "bytes_written": 1073741824, "blocksize": 4194304, "bytes_per_sec": 25826386.00}
real 0m44.951s
user 0m0.060s
sys 0m0.024s

root@st002:~# time ceph tell osd.1 bench
{ "bytes_written": 1073741824, "blocksize": 4194304, "bytes_per_sec": 17568847.00}
real 1m4.088s
user 0m0.072s
sys 0m0.024s

cluster cd1fa211-911e-4b18-8392-9adcf0ed0bd5
 health HEALTH_OK
 monmap e3: 3 mons at {st001=172.16.2.109:6789/0,st002=172.16.2.101:6789/0,st003=172.16.2.106:6789/0}, election epoch 15498, quorum 0,1,2 st002,st003,st001
 mdsmap e344: 1/1/1 up {0=st001=up:active}
 osdmap e14769: 8 osds: 8 up, 6 in
 pgmap v23155467: 896 pgs, 19 pools, 3675 GB data, 8543 kobjects
 9913 GB used, 6159 GB / 16356 GB avail
 894 active+clean
 2 active+clean+scrubbing+deep
client io 373 kB/s rd, 809 kB/s wr, 57 op/s

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
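The raw `bytes_per_sec` figures above are easier to compare once converted to MB/s. A small sketch of the conversion (pure arithmetic, using osd.7's and osd.1's numbers from above), plus a loop for rerunning the bench across all OSDs on a live cluster:

```shell
# Convert the bytes_per_sec reported by `ceph tell osd.N bench` into MiB/s.
to_mbs() { echo $(( $1 / 1048576 )); }       # bytes/s -> MiB/s (integer)
osd7_mbs=$(to_mbs 52021545)                  # osd.7 above
osd1_mbs=$(to_mbs 17568847)                  # osd.1 above
echo "osd.7: ${osd7_mbs} MB/s, osd.1: ${osd1_mbs} MB/s"

# With a reachable cluster, the same bench can be run across every OSD:
if command -v ceph >/dev/null 2>&1; then
    for id in 0 1 2 3 4 5 6 7; do
        echo "== osd.$id =="
        ceph tell "osd.$id" bench
    done
fi
```

Note that `ceph tell osd.N bench` writes a gigabyte through the OSD's journal and filestore, so on a loaded production cluster it should be run sparingly.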
Re: [ceph-users] URGENT: add mon failed and ceph monitor refresh log crazily
What version is this? It's hard to tell from the logs below, but it looks like there might be a connectivity problem? Is it able to exchange messages with the other monitors?

Perhaps more importantly, though, if you simply stop the new mon.f, can mon.e join? What is in its log?

sage

On Fri, 13 Feb 2015, minchen wrote:

Hi, all developers and users,

When I add a new mon to the current mon cluster, it failed with 2 mons out of quorum. There are 5 mons in our ceph cluster:

epoch 7
fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
last_changed 2015-02-13 09:11:45.758839
created 0.00
0: 10.117.16.17:6789/0 mon.b
1: 10.118.32.7:6789/0 mon.c
2: 10.119.16.11:6789/0 mon.d
3: 10.122.0.9:6789/0 mon.e
4: 10.122.48.11:6789/0 mon.f

mon.f is newly added to the monitor cluster, but when starting mon.f, it caused both mon.e and mon.f to fall out of quorum:

HEALTH_WARN 2 mons down, quorum 0,1,2 b,c,d
mon.e (rank 3) addr 10.122.0.9:6789/0 is down (out of quorum)
mon.f (rank 4) addr 10.122.48.11:6789/0 is down (out of quorum)

mon.b, mon.c, mon.d logs refresh crazily as follows:

Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.063628 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.063629 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.090647 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.090648 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.090661 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.090662 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
..
and mon.f log:

Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.526676 7f3931dfd7c0 0 ceph version 0.80.4 (7c241cfaa6c8c068bc9da8578ca00b9f4fc7567f), process ceph-mon, pid 30639
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.607412 7f3931dfd7c0 0 mon.f does not exist in monmap, will attempt to join an existing cluster
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.609838 7f3931dfd7c0 0 starting mon.f rank -1 at 10.122.48.11:6789/0 mon_data /osd/ceph/mon fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.610076 7f3931dfd7c0 1 mon.f@-1(probing) e0 preinit fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636499 7f392a504700 0 -- 10.122.48.11:6789/0 10.119.16.11:6789/0 pipe(0x7f3934ebfb80 sd=26 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934ea9ce0).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636797 7f392a201700 0 -- 10.122.48.11:6789/0 10.122.0.9:6789/0 pipe(0x7f3934ec0800 sd=29 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934eaa940).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636968 7f392a403700 0 -- 10.122.48.11:6789/0 10.118.32.7:6789/0 pipe(0x7f3934ec0080 sd=27 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934ea9e40).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.637037 7f392a302700 0 -- 10.122.48.11:6789/0 10.117.16.17:6789/0 pipe(0x7f3934ebfe00 sd=28 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934eaa260).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.638854 7f392c00a700 0 mon.f@-1(probing) e7 my rank is now 4 (was -1)
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.639365 7f392c00a700 1 mon.f@4(synchronizing) e7 sync_obtain_latest_monmap
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.639494 7f392b008700 0 -- 10.122.48.11:6789/0 10.122.0.9:6789/0 pipe(0x7f3934ec0580 sd=17 :6789 s=0 pgs=0 cs=0 l=0
c=0x7f3934eaa680).accept connect_seq 2 vs existing 0 state connecting
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.639513 7f392b008700 0 -- 10.122.48.11:6789/0 10.122.0.9:6789/0 pipe(0x7f3934ec0580 sd=17 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934eaa680).accept we reset (peer sent cseq 2, 0x7f3934ebf400.cseq = 0), sending RESETSESSION
..
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.643159 7f392af07700 0 -- 10.122.48.11:6789/0 10.119.16.11:6789/0 pipe(0x7f3934ec1700 sd=28 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934eab2e0).accept connect_seq 0 vs existing 0 state wait
[ceph-users] CRUSHMAP for chassis balance
Dear cephers,

Currently I am working on a crushmap to make sure that at least one copy goes to a different chassis. Say chassis1 has host1, host2, host3, and chassis2 has host4, host5, host6.

With replication = 2 it's not a problem; I can use the following steps in the rule:

step take chasses1
step chooseleaf firstn 1 type host
step emit
step take chasses2
step chooseleaf firstn 1 type host
step emit

But for replication = 3, I tried:

step take chasses1
step chooseleaf firstn 1 type host
step emit
step take chasses2
step chooseleaf firstn 1 type host
step emit
step take default
step chooseleaf firstn 1 type host
step emit

In the end, the 3rd OSD returned in the rule test is always a duplicate of the first or second. Any idea, or what direction should I move in? Thanks in advance.

BR,
Luke

MYCOM-OSI
This electronic message contains information from Mycom which may be privileged or confidential. The information is intended to be for the use of the individual(s) or entity named above. If you are not the intended recipient, be aware that any disclosure, copying, distribution or any other use of the contents of this information is prohibited. If you have received this electronic message in error, please notify us by post or telephone (to the numbers or correspondence address above) or by email (at the email address above) immediately.
[ceph-users] Re: ceph mds zombie
OK, thank you very much!

On 2015-02-13 12:33 PM, "Yan, Zheng" uker...@gmail.com wrote:

On Tue, Feb 10, 2015 at 9:26 PM, kenmasida 981163...@qq.com wrote:

Hi everybody,

Thank you for reading my question. My ceph cluster is 5 mon, 1 mds, 3 osd. When the ceph cluster has run for one day or a few days, I can't cp some files from ceph. I use mount.ceph for the client. The cp command hangs for a long, long time. When I restart the mds and cp again, it works well, but after some days I again can't cp files from the ceph cluster.

kernel version? when the hang happens again, find the PID of cp and send the content of /proc/PID/stack to us.

Regards
Yan, Zheng

The mon log, mds log, and osd log are good, and ceph -w reports health ok. What can I do? Any advice is important to me! Thank you very much! The ceph version is 0.80.5, CentOS 6.4 x86 64-bit, rpm install.

Best Regards,
kenmasida
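The information Zheng asks for can be gathered with a few commands the next time the hang recurs. A hedged sketch, assuming the kernel cephfs client and that cp is the process that is stuck:

```shell
# Capture the kernel version and the hung cp's kernel stack trace.
uname -r
pid=$(command -v pgrep >/dev/null 2>&1 && pgrep -x cp | head -n 1)
if [ -n "$pid" ] && [ -r "/proc/$pid/stack" ]; then
    cat "/proc/$pid/stack"    # reading another process's stack needs root
else
    echo "no hung cp found right now; rerun this while the copy is stuck"
fi
done_check=1
```

A process stuck in an uninterruptible kernel wait also shows state `D` in `ps aux`, which is a quick way to confirm the hang before grabbing the stack.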
Re: [ceph-users] ceph Performance with SSD journal
Hi Irek,

I am using v0.80.5 Firefly (http://ceph.com/docs/master/release-notes/#v0-80-5-firefly).

-sumit

On Fri, Feb 13, 2015 at 1:30 PM, Irek Fasikhov malm...@gmail.com wrote:

Hi. What version?

2015-02-13 6:04 GMT+03:00 Sumit Gaur sumitkg...@gmail.com:

Hi Chris,

Please find my answers inline below.

On Thu, Feb 12, 2015 at 12:42 PM, Chris Hoy Poy ch...@gopc.net wrote:

Hi Sumit, a couple of questions:

What brand/model SSD?
- Samsung 480 GB SSD (PM853T), rated at 90K random-write IOPS (4K, 368 MB/s).

What brand/model HDD?
- 300 GB SAS HDD (Seagate); the nodes have 64 GB memory and a 10 Gb NIC.

Also, how are they connected to the controller/motherboard? Are they sharing a bus (i.e. a SATA expander)?
- No, they are connected to a local bus, not a SATA expander.

RAM?
- 64 GB.

Also look at the output of iostat -x or similar: are the SSDs hitting 100% utilisation?
- No, the SSD was hitting only 2000 IOPS.

I suspect that the 5:1 ratio of HDDs to SSDs is not ideal; you now have 5x the write IO trying to fit into a single SSD.
- I have not seen any documented reference for calculating the ratio; could you suggest one? Here I want to mention that the results for 1024K writes improved a lot. The problem is with 1024K reads and 4K writes:
SSD journal: 810 IOPS and 810 MB/s
HDD journal: 620 IOPS and 620 MB/s

I'll take a punt on it being a SATA-connected SSD (most common); 5x ~130 megabytes/second gets very close to most SATA bus limits. If it's a shared bus, you possibly hit that limit even earlier (since all that data is now being written twice over the bus).

cheers;
\Chris

----------------------------------------
From: Sumit Gaur sumitkg...@gmail.com
To: ceph-users@lists.ceph.com
Sent: Thursday, 12 February, 2015 9:23:35 AM
Subject: [ceph-users] ceph Performance with SSD journal

Hi Ceph-Experts,

Have a small ceph architecture related question. As blogs and documents suggest, ceph performs much better if we use the journal on SSD. I have made a ceph cluster with 30 HDD + 6 SSD for 6 OSD nodes.
5 HDD + 1 SSD on each node, with each SSD holding 5 journal partitions for the node's 5 OSDs. Now I ran the same test as I ran for the all-HDD setup. The two readings below go in the wrong direction from what I expected:

1) 4K write IOPS are lower for the SSD setup; not a major difference, but lower.
2) 1024K read IOPS are lower for the SSD setup than for the HDD setup.

On the other hand, 4K read and 1024K write both have much better numbers for the SSD setup. Let me know if I am missing some obvious concept.

Thanks
sumit

--
Best regards,
Irek Fasikhov
Mob.: +79229045757
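Chris's 5:1 concern can be sanity-checked with back-of-envelope arithmetic. The figures below are illustrative assumptions (roughly 130 MB/s sustained sequential write per HDD, roughly 550 MB/s for a SATA-3 attached SSD), not measurements from Sumit's cluster:

```shell
# Estimate whether one journal SSD can absorb the write stream of 5 HDD OSDs.
hdd_mbs=130          # assumed sustained sequential write per HDD OSD
hdds_per_ssd=5       # journal partitions per SSD in this setup
ssd_bus_mbs=550      # assumed SATA-3 ceiling for the SSD
needed=$(( hdd_mbs * hdds_per_ssd ))
if [ "$needed" -gt "$ssd_bus_mbs" ]; then
    verdict="journal SSD is the bottleneck: needs ${needed} MB/s > ${ssd_bus_mbs} MB/s"
else
    verdict="journal SSD has headroom: ${needed} MB/s <= ${ssd_bus_mbs} MB/s"
fi
echo "$verdict"
```

Under these assumptions the five HDDs can demand about 650 MB/s of journal writes against a ~550 MB/s bus, which is consistent with Chris's point that the ratio, not the SSD's IOPS rating, can become the limit for large sequential writes.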
Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Awesome! Just bought the paperback copy. The sample looked very good. Thanks!

Grt,

On Fri, 6 Feb 2015, Karan Singh wrote:

Hello Community Members

I am happy to introduce the first book on Ceph, titled "Learning Ceph". Many folks from the publishing house and I, together with the technical reviewers, spent several months getting this book compiled and published. Finally the book is up for sale; I hope you will like it and will surely learn a lot from it.

Amazon: http://www.amazon.com/Learning-Ceph-Karan-Singh/dp/1783985623/ref=sr_1_1?s=booksie=UTF8qid=1423174441sr=1-1keywords=ceph
Packtpub: https://www.packtpub.com/application-development/learning-ceph

You can grab the sample copy from here: https://www.dropbox.com/s/ek76r01r9prs6pb/Learning_Ceph_Packt.pdf?dl=0

Finally, I would like to express my sincere thanks to:
Sage Weil - for developing Ceph and everything around it, as well as writing the foreword for "Learning Ceph".
Patrick McGarry - for his usual off-the-track support, as always.
Last but not least, our great community members who are also reviewers of the book: Don Talton, Julien Recurt, Sebastien Han and Zihong Chen. Thank you guys for your efforts.

Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel.
+358 9 4572001 fax +358 9 4572302
http://www.csc.fi/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlTeIbYACgkQiJDTKUBxIRtAKQCeNFTMsIcoBXOrjyNauKZJGf72
lKsAoN56HgyInQehLK4LbzSJezadNQ5b
=dRfa
-----END PGP SIGNATURE-----
Re: [ceph-users] URGENT: add mon failed and ceph monitor refresh log crazily
It sounds a bit like the extra load on mon.e from the synchronization is preventing it from joining the quorum? If you stop and restart mon.f, it should pick a different mon to pull from, though. Perhaps see if that makes a different mon drop out? Then at least we'd understand what is going on...

sage

On Fri, 13 Feb 2015, minchen wrote:

The ceph version is 0.80.4. When adding mon.f to {b,c,d,e}, mon.e is out of quorum, and mon.b, mon.c, mon.d are electing in a cycle (restarting a new election after the leader wins). So I think the current 4 monitors can exchange messages with each other successfully. In addition, mon.f is stuck in the synchronizing state, getting data from mon.e after probing. When I stop mon.f, mon.e goes back into quorum after a while, and the ceph cluster becomes HEALTH_OK. But the logs of mon.b, mon.c, mon.d and mon.e all keep refreshing paxos active or updating messages many times per second, and the paxos commit seq is increasing fast, while the same situation does not occur in a ceph-0.80.7 cluster. If you are still confused, maybe I should reproduce this in our cluster and get complete mon logs ...

-- Original --
From: sweil sw...@redhat.com
Date: Fri, Feb 13, 2015 10:28 PM
To: minchen minc...@ubuntukylin.com
Cc: ceph-users ceph-users@lists.ceph.com; joao j...@redhat.com
Subject: Re: [ceph-users] URGENT: add mon failed and ceph monitor refresh log crazily

What version is this? It's hard to tell from the logs below, but it looks like there might be a connectivity problem? Is it able to exchange messages with the other monitors? Perhaps more importantly, though, if you simply stop the new mon.f, can mon.e join? What is in its log?

sage

On Fri, 13 Feb 2015, minchen wrote:

Hi, all developers and users,

When I add a new mon to the current mon cluster, it failed with 2 mons out of quorum.
There are 5 mons in our ceph cluster:

epoch 7
fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
last_changed 2015-02-13 09:11:45.758839
created 0.00
0: 10.117.16.17:6789/0 mon.b
1: 10.118.32.7:6789/0 mon.c
2: 10.119.16.11:6789/0 mon.d
3: 10.122.0.9:6789/0 mon.e
4: 10.122.48.11:6789/0 mon.f

mon.f is newly added to the monitor cluster, but when starting mon.f, it caused both mon.e and mon.f to fall out of quorum:

HEALTH_WARN 2 mons down, quorum 0,1,2 b,c,d
mon.e (rank 3) addr 10.122.0.9:6789/0 is down (out of quorum)
mon.f (rank 4) addr 10.122.48.11:6789/0 is down (out of quorum)

mon.b, mon.c, mon.d logs refresh crazily as follows:

Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.063628 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.063629 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.090647 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.090648 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.090661 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.090662 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
..
and mon.f log:

Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.526676 7f3931dfd7c0 0 ceph version 0.80.4 (7c241cfaa6c8c068bc9da8578ca00b9f4fc7567f), process ceph-mon, pid 30639
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.607412 7f3931dfd7c0 0 mon.f does not exist in monmap, will attempt to join an existing cluster
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.609838 7f3931dfd7c0 0 starting mon.f rank -1 at 10.122.48.11:6789/0 mon_data /osd/ceph/mon fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.610076 7f3931dfd7c0 1 mon.f@-1(probing) e0 preinit fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636499 7f392a504700 0 -- 10.122.48.11:6789/0 10.119.16.11:6789/0 pipe(0x7f3934ebfb80 sd=26 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934ea9ce0).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636797 7f392a201700 0 -- 10.122.48.11:6789/0 10.122.0.9:6789/0 pipe(0x7f3934ec0800 sd=29 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934eaa940).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636968 7f392a403700 0 -- 10.122.48.11:6789/0 10.118.32.7:6789/0 pipe(0x7f3934ec0080 sd=27 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934ea9e40).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.637037 7f392a302700 0 -- 10.122.48.11:6789/0 10.117.16.17:6789/0
Re: [ceph-users] URGENT: add mon failed and ceph monitor refresh log crazily
The ceph version is 0.80.4. When adding mon.f to {b,c,d,e}, mon.e is out of quorum, and mon.b, mon.c, mon.d are electing in a cycle (restarting a new election after the leader wins). So I think the current 4 monitors can exchange messages with each other successfully. In addition, mon.f is stuck in the synchronizing state, getting data from mon.e after probing. When I stop mon.f, mon.e goes back into quorum after a while, and the ceph cluster becomes HEALTH_OK. But the logs of mon.b, mon.c, mon.d and mon.e all keep refreshing paxos active or updating messages many times per second, and the paxos commit seq is increasing fast, while the same situation does not occur in a ceph-0.80.7 cluster. If you are still confused, maybe I should reproduce this in our cluster and get complete mon logs ...

-- Original --
From: sweil sw...@redhat.com
Date: Fri, Feb 13, 2015 10:28 PM
To: minchen minc...@ubuntukylin.com
Cc: ceph-users ceph-users@lists.ceph.com; joao j...@redhat.com
Subject: Re: [ceph-users] URGENT: add mon failed and ceph monitor refresh log crazily

What version is this? It's hard to tell from the logs below, but it looks like there might be a connectivity problem? Is it able to exchange messages with the other monitors? Perhaps more importantly, though, if you simply stop the new mon.f, can mon.e join? What is in its log?

sage

On Fri, 13 Feb 2015, minchen wrote:

Hi, all developers and users,

When I add a new mon to the current mon cluster, it failed with 2 mons out of quorum.
There are 5 mons in our ceph cluster:

epoch 7
fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
last_changed 2015-02-13 09:11:45.758839
created 0.00
0: 10.117.16.17:6789/0 mon.b
1: 10.118.32.7:6789/0 mon.c
2: 10.119.16.11:6789/0 mon.d
3: 10.122.0.9:6789/0 mon.e
4: 10.122.48.11:6789/0 mon.f

mon.f is newly added to the monitor cluster, but when starting mon.f, it caused both mon.e and mon.f to fall out of quorum:

HEALTH_WARN 2 mons down, quorum 0,1,2 b,c,d
mon.e (rank 3) addr 10.122.0.9:6789/0 is down (out of quorum)
mon.f (rank 4) addr 10.122.48.11:6789/0 is down (out of quorum)

mon.b, mon.c, mon.d logs refresh crazily as follows:

Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.063628 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.063629 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.090647 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.090648 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
Feb 13 09:37:34 root ceph-mon: 2015-02-13 09:37:34.090661 7f7b64e14700 1 mon.b@0(leader).paxos(paxos active c 11818589..11819234) is_readable now=2015-02-13 09:37:34.090662 lease_expire=2015-02-13 09:37:38.205219 has v0 lc 11819234
..
and mon.f log:

Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.526676 7f3931dfd7c0 0 ceph version 0.80.4 (7c241cfaa6c8c068bc9da8578ca00b9f4fc7567f), process ceph-mon, pid 30639
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.607412 7f3931dfd7c0 0 mon.f does not exist in monmap, will attempt to join an existing cluster
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.609838 7f3931dfd7c0 0 starting mon.f rank -1 at 10.122.48.11:6789/0 mon_data /osd/ceph/mon fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.610076 7f3931dfd7c0 1 mon.f@-1(probing) e0 preinit fsid 0dfd2bd5-1896-4712-916b-ec02dcc7b049
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636499 7f392a504700 0 -- 10.122.48.11:6789/0 10.119.16.11:6789/0 pipe(0x7f3934ebfb80 sd=26 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934ea9ce0).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636797 7f392a201700 0 -- 10.122.48.11:6789/0 10.122.0.9:6789/0 pipe(0x7f3934ec0800 sd=29 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934eaa940).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.636968 7f392a403700 0 -- 10.122.48.11:6789/0 10.118.32.7:6789/0 pipe(0x7f3934ec0080 sd=27 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934ea9e40).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.637037 7f392a302700 0 -- 10.122.48.11:6789/0 10.117.16.17:6789/0 pipe(0x7f3934ebfe00 sd=28 :6789 s=0 pgs=0 cs=0 l=0 c=0x7f3934eaa260).accept connect_seq 0 vs existing 0 state wait
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.638854 7f392c00a700 0 mon.f@-1(probing) e7 my rank is now 4 (was -1)
Feb 13 09:16:26 root ceph-mon: 2015-02-13 09:16:26.639365 7f392c00a700 1 mon.f@4(synchronizing) e7 sync_obtain_latest_monmap
Feb 13 09:16:26 root
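The checks Sage suggests, stopping mon.f and watching whether mon.e rejoins, can be made concrete by polling each monitor's own view of its state over the admin socket. A sketch, assuming the default socket path /var/run/ceph/ceph-mon.&lt;id&gt;.asok and that it is run on each mon's host:

```shell
# Poll each monitor's view of its state (probing/synchronizing/electing/peon/leader).
checked=0
for m in b c d e f; do
    sock="/var/run/ceph/ceph-mon.$m.asok"
    if [ -S "$sock" ] && command -v ceph >/dev/null 2>&1; then
        echo "== mon.$m =="
        ceph --admin-daemon "$sock" mon_status | grep -E '"state"|"rank"'
    else
        echo "mon.$m: no admin socket at $sock on this host"
    fi
    checked=$((checked + 1))
done
```

Watching the `state` field while mon.f is stopped and started again shows directly whether mon.e flips from `electing`/`synchronizing` back to `peon`, which is the behaviour minchen describes.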
Re: [ceph-users] CRUSHMAP for chassis balance
With sufficiently new CRUSH versions (all the latest point releases on the LTS releases?) I think you can simply have the rule return extra IDs, which are dropped if they exceed the number required. So you can choose two chassis, then have both of those choose two leaf OSDs, and return those 4 from the rule.
-Greg

On Fri, Feb 13, 2015 at 6:13 AM Luke Kao luke@mycom-osi.com wrote:

Dear cephers,

Currently I am working on a crushmap to make sure that at least one copy goes to a different chassis. Say chassis1 has host1, host2, host3, and chassis2 has host4, host5, host6.

With replication = 2 it's not a problem; I can use the following steps in the rule:

step take chasses1
step chooseleaf firstn 1 type host
step emit
step take chasses2
step chooseleaf firstn 1 type host
step emit

But for replication = 3, I tried:

step take chasses1
step chooseleaf firstn 1 type host
step emit
step take chasses2
step chooseleaf firstn 1 type host
step emit
step take default
step chooseleaf firstn 1 type host
step emit

In the end, the 3rd OSD returned in the rule test is always a duplicate of the first or second. Any idea, or what direction should I move in? Thanks in advance.

BR,
Luke

MYCOM-OSI
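Greg's suggestion can be written as a single rule instead of three take/emit blocks: pick two chassis under the root, then two hosts under each, which yields four candidate OSDs of which a size-3 pool keeps the first three (2+1 across the two chassis). This is a sketch only; the bucket name, ruleset number, and size bounds are assumptions to adapt, and the result should be verified with `crushtool --test --show-mappings` before use:

```
rule replicated_across_chassis {
    ruleset 1
    type replicated
    min_size 2
    max_size 4
    step take default
    step choose firstn 2 type chassis        # two distinct chassis
    step chooseleaf firstn 2 type host       # two hosts (hence OSDs) per chassis
    step emit
}
```

As Greg notes, this relies on CRUSH being new enough to tolerate a rule that returns more results than the pool's size requests and to drop the surplus.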
Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
On 05-02-15 23:53, Karan Singh wrote:

Hello Community Members

I am happy to introduce the first book on Ceph with the title "Learning Ceph". Me and many folks from the publishing house together with technical reviewers spent several months to get this book compiled and published. Finally the book is up for sale; i hope you would like it and surely will learn a lot from it.

Great! Just ordered myself a copy!

Amazon: http://www.amazon.com/Learning-Ceph-Karan-Singh/dp/1783985623/ref=sr_1_1?s=booksie=UTF8qid=1423174441sr=1-1keywords=ceph
Packtpub: https://www.packtpub.com/application-development/learning-ceph

You can grab the sample copy from here: https://www.dropbox.com/s/ek76r01r9prs6pb/Learning_Ceph_Packt.pdf?dl=0

Finally, I would like to express my sincere thanks to
Sage Weil - For developing Ceph and everything around it as well as writing the foreword for "Learning Ceph".
Patrick McGarry - For his usual off the track support that too always.
Last but not the least, to our great community members, who are also reviewers of the book: Don Talton, Julien Recurt, Sebastien Han and Zihong Chen. Thank you guys for your efforts.

Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001 fax +358 9 4572302
http://www.csc.fi/

--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on
Re: [ceph-users] Random OSDs respawning continuously
It's not entirely clear, but it looks like all the ops are just your caching-pool OSDs trying to promote objects, and your backing-pool OSDs aren't fast enough to satisfy all the IO demanded of them. You may be overloading the system.
-Greg

On Fri, Feb 13, 2015 at 6:06 AM Mohamed Pakkeer mdfakk...@gmail.com wrote:

Hi all,

When I stop the respawning osd on an OSD node, another osd starts respawning on the same node. When an OSD starts respawning, it puts the following info in the osd log:

slow request 31.129671 seconds old, received at 2015-02-13 19:09:32.180496: osd_op(osd.551.95229:11 191 10005c4.0033 [copy-get max 8388608] 13.f4ccd256 RETRY=50 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg

osd.551 is part of the cache tier. All the respawning osds have logs referring to different cache-tier OSDs. If I restart all the osds on the cache-tier osd node, the respawning stops and the cluster returns to the active+clean state. But when I try to write some data to the cluster, a random osd starts respawning again. Can anyone help me solve this issue?

2015-02-13 19:10:02.309848 7f53eef54700 0 log_channel(default) log [WRN] : 11 slow requests, 11 included below; oldest blocked for 30.132629 secs
2015-02-13 19:10:02.309854 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.132629 seconds old, received at 2015-02-13 19:09:32.177075: osd_op(osd.551.95229:63 10002ae.
[copy-from ver 7622] 13.7273b256 RETRY=130 snapc 1=[] ondisk+retry+write+ignore_overlay+enforce_snapc+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:02.309858 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.131608 seconds old, received at 2015-02-13 19:09:32.178096: osd_op(osd.551.95229:41 5 10003a0.0006 [copy-get max 8388608] 13.aefb256 RETRY=118 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:02.309861 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.130994 seconds old, received at 2015-02-13 19:09:32.178710: osd_op(osd.551.95229:26 83 100029d.003b [copy-get max 8388608] 13.a2be1256 RETRY=115 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:02.309864 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.130426 seconds old, received at 2015-02-13 19:09:32.179278: osd_op(osd.551.95229:39 39 10004e9.0032 [copy-get max 8388608] 13.6a25b256 RETRY=105 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:02.309868 7f53eef54700 0 log_channel(default) log [WRN] : slow request 30.129697 seconds old, received at 2015-02-13 19:09:32.180007: osd_op(osd.551.95229:97 49 1000553.007e [copy-get max 8388608] 13.c8645256 RETRY=59 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:03.310284 7f53eef54700 0 log_channel(default) log [WRN] : 11 slow requests, 6 included below; oldest blocked for 31.133092 secs
2015-02-13 19:10:03.310305 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.129671 seconds old, received at 2015-02-13 19:09:32.180496: osd_op(osd.551.95229:11 191 10005c4.0033 [copy-get max 8388608] 13.f4ccd256 RETRY=50 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently
reached_pg
2015-02-13 19:10:03.310308 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.128616 seconds old, received at 2015-02-13 19:09:32.181551: osd_op(osd.551.95229:12 903 10002e4.00d6 [copy-get max 8388608] 13.f56a3256 RETRY=41 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:03.310322 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.127807 seconds old, received at 2015-02-13 19:09:32.182360: osd_op(osd.551.95229:14 165 1000480.0110 [copy-get max 8388608] 13.fd8c1256 RETRY=32 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:03.310327 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.127320 seconds old, received at 2015-02-13 19:09:32.182847: osd_op(osd.551.95229:15 013 100047f.0133 [copy-get max 8388608] 13.b7b05256 RETRY=27 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e95518) currently reached_pg
2015-02-13 19:10:03.310331 7f53eef54700 0 log_channel(default) log [WRN] : slow request 31.126935 seconds old, received at 2015-02-13 19:09:32.183232: osd_op(osd.551.95229:15 767 100066d.001e [copy-get max 8388608] 13.3b017256 RETRY=25 ack+retry+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected
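One way to confirm Greg's reading, that the slow requests are promotions waiting on the backing tier, is to dump the in-flight ops on one of the cache-tier OSDs and count the copy-get/copy-from entries. A sketch (osd.551's id comes from the log above; the socket path assumes the default layout, and the command must run on that OSD's host):

```shell
# Count promotion-related ops currently in flight on a cache-tier OSD.
osd_id=551
sock="/var/run/ceph/ceph-osd.${osd_id}.asok"
if [ -S "$sock" ] && command -v ceph >/dev/null 2>&1; then
    ceph --admin-daemon "$sock" dump_ops_in_flight | grep -Ec 'copy-(get|from)'
else
    echo "admin socket $sock not reachable; run this on osd.${osd_id}'s host"
fi
probe_done=$osd_id
```

If nearly everything in flight is copy-get/copy-from with high RETRY counts, the cache tier is promoting faster than the backing pool can serve reads, which points at throttling promotion (or adding backing-tier capacity) rather than at the respawning OSDs themselves.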
Re: [ceph-users] OSD turned itself off
Hi,

I skimmed the logs again, as we've had more of these kinds of errors, and I saw a lot of lossy-connection errors:

-2567 2014-11-24 11:49:40.028755 7f6d49367700 0 -- 10.168.7.23:6819/10217 10.168.7.54:0/1011446 pipe(0x19321b80 sd=44 :6819 s=0 pgs=0 cs=0 l=1 c=0x110d2b00).accept replacing existing (lossy) channel (new one lossy=1)
-2564 2014-11-24 11:49:42.000463 7f6d51df1700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/1015676 pipe(0x22d6000 sd=204 :6819 s=0 pgs=0 cs=0 l=1 c=0x16e218c0).accept replacing existing (lossy) channel (new one lossy=1)
-2563 2014-11-24 11:49:47.704467 7f6d4d1a5700 0 -- 10.168.7.23:6819/10217 10.168.7.52:0/3029106 pipe(0x231f6780 sd=158 :6819 s=0 pgs=0 cs=0 l=1 c=0x136bd1e0).accept replacing existing (lossy) channel (new one lossy=1)
-2562 2014-11-24 11:49:48.180604 7f6d4cb9f700 0 -- 10.168.7.23:6819/10217 10.168.7.52:0/2027138 pipe(0x1657f180 sd=254 :6819 s=0 pgs=0 cs=0 l=1 c=0x13273340).accept replacing existing (lossy) channel (new one lossy=1)
-2561 2014-11-24 11:49:48.808604 7f6d4c498700 0 -- 10.168.7.23:6819/10217 10.168.7.52:0/2023529 pipe(0x12831900 sd=289 :6819 s=0 pgs=0 cs=0 l=1 c=0x12401600).accept replacing existing (lossy) channel (new one lossy=1)
-2559 2014-11-24 11:49:50.128379 7f6d4b88c700 0 -- 10.168.7.23:6819/10217 10.168.7.53:0/1023180 pipe(0x11cb2280 sd=309 :6819 s=0 pgs=0 cs=0 l=1 c=0x1280a000).accept replacing existing (lossy) channel (new one lossy=1)
-2558 2014-11-24 11:49:52.472867 7f6d425eb700 0 -- 10.168.7.23:6819/10217 10.168.7.52:0/3019692 pipe(0x18eb4a00 sd=311 :6819 s=0 pgs=0 cs=0 l=1 c=0x10df6b00).accept replacing existing (lossy) channel (new one lossy=1)
-2556 2014-11-24 11:49:55.100208 7f6d49e72700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/3021273 pipe(0x1bacf680 sd=353 :6819 s=0 pgs=0 cs=0 l=1 c=0x164ae2c0).accept replacing existing (lossy) channel (new one lossy=1)
-2555 2014-11-24 11:49:55.776568 7f6d49468700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/3024351 pipe(0x1bacea00 sd=20 :6819 s=0 pgs=0 cs=0 l=1 c=0x1887ba20).accept replacing existing (lossy) channel (new one lossy=1)
-2554 2014-11-24 11:49:57.704437 7f6d49165700 0 -- 10.168.7.23:6819/10217 10.168.7.52:0/1023529 pipe(0x1a32ac80 sd=213 :6819 s=0 pgs=0 cs=0 l=1 c=0xfe93b80).accept replacing existing (lossy) channel (new one lossy=1)
-2553 2014-11-24 11:49:58.694246 7f6d47549700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/3017204 pipe(0x102e5b80 sd=370 :6819 s=0 pgs=0 cs=0 l=1 c=0xfb5a000).accept replacing existing (lossy) channel (new one lossy=1)
-2551 2014-11-24 11:50:00.412242 7f6d4673b700 0 -- 10.168.7.23:6819/10217 10.168.7.52:0/3027138 pipe(0x1b83b400 sd=250 :6819 s=0 pgs=0 cs=0 l=1 c=0x12922dc0).accept replacing existing (lossy) channel (new one lossy=1)
-2387 2014-11-24 11:50:22.761490 7f6d44fa4700 0 -- 10.168.7.23:6840/4010217 10.168.7.25:0/27131 pipe(0xfc60c80 sd=300 :6840 s=0 pgs=0 cs=0 l=1 c=0x1241d080).accept replacing existing (lossy) channel (new one lossy=1)
-2300 2014-11-24 11:50:31.366214 7f6d517eb700 0 -- 10.168.7.23:6840/4010217 10.168.7.22:0/15549 pipe(0x193b3180 sd=214 :6840 s=0 pgs=0 cs=0 l=1 c=0x10ebbe40).accept replacing existing (lossy) channel (new one lossy=1)
-2247 2014-11-24 11:50:37.372934 7f6d4a276700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/1013890 pipe(0x25d4780 sd=112 :6819 s=0 pgs=0 cs=0 l=1 c=0x10666580).accept replacing existing (lossy) channel (new one lossy=1)
-2246 2014-11-24 11:50:37.738539 7f6d4f6ca700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/3026502 pipe(0x1338ea00 sd=230 :6819 s=0 pgs=0 cs=0 l=1 c=0x123f11e0).accept replacing existing (lossy) channel (new one lossy=1)
-2245 2014-11-24 11:50:38.390093 7f6d48c60700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/2026502 pipe(0x16ba7400 sd=276 :6819 s=0 pgs=0 cs=0 l=1 c=0x7d4fb80).accept replacing existing (lossy) channel (new one lossy=1)
-2242 2014-11-24 11:50:40.505458 7f6d3e43a700 0 -- 10.168.7.23:6819/10217 10.168.7.53:0/1012682 pipe(0x12a53180 sd=183 :6819 s=0 pgs=0 cs=0 l=1 c=0x10537080).accept replacing existing (lossy) channel (new one lossy=1)
-2198 2014-11-24 11:51:14.273025 7f6d44ea3700 0 -- 10.168.7.23:6865/5010217 10.168.7.25:0/30755 pipe(0x162bb680 sd=327 :6865 s=0 pgs=0 cs=0 l=1 c=0x16e21600).accept replacing existing (lossy) channel (new one lossy=1)
-1881 2014-11-29 00:45:42.247394 7f6d5c155700 0 -- 10.168.7.23:6819/10217 submit_message osd_op_reply(949861 rbd_data.1c56a792eb141f2.6200 [stat,write 2228224~12288] ondisk = 0) v4 remote, 10.168.7.54:0/1025735, failed lossy con, dropping message 0x1bc00400
-976 2015-01-05 07:10:01.763055 7f6d5c155700 0 -- 10.168.7.23:6819/10217 submit_message osd_op_reply(11034565 rbd_data.1cc69562eb141f2.03ce [stat,write 1925120~4096] ondisk = 0) v4 remote, 10.168.7.54:0/2007323, failed lossy con, dropping message 0x12989400
-855 2015-01-10 22:01:36.589036 7f6d5b954700 0 -- 10.168.7.23:6819/10217 submit_message
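When a log is full of these, it can be useful to see whether the channel replacements cluster around a few peers (suggesting a flaky client or NIC) or are spread evenly (suggesting a problem on this OSD's side). A quick sketch; the awk field positions match the 0.80-era messenger format shown above ("-- <local addr> <peer addr> pipe(...)"), which is an assumption worth re-checking against your own logs:

```shell
#!/bin/sh
# Count "replacing existing (lossy) channel" events per peer IP.
lossy_by_peer() {
    grep 'replacing existing (lossy) channel' "$1" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "--") { split($(i + 2), a, ":"); print a[1]; break } }' |
        sort | uniq -c | sort -rn
}

# Synthetic three-line sample standing in for the real OSD log.
cat > /tmp/msgr-sample.log <<'EOF'
-2567 2014-11-24 11:49:40.028755 7f6d49367700 0 -- 10.168.7.23:6819/10217 10.168.7.54:0/1011446 pipe(0x19321b80 sd=44 :6819 s=0 pgs=0 cs=0 l=1 c=0x110d2b00).accept replacing existing (lossy) channel (new one lossy=1)
-2564 2014-11-24 11:49:42.000463 7f6d51df1700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/1015676 pipe(0x22d6000 sd=204 :6819 s=0 pgs=0 cs=0 l=1 c=0x16e218c0).accept replacing existing (lossy) channel (new one lossy=1)
-2563 2014-11-24 11:49:47.704467 7f6d4d1a5700 0 -- 10.168.7.23:6819/10217 10.168.7.51:0/3029106 pipe(0x231f6780 sd=158 :6819 s=0 pgs=0 cs=0 l=1 c=0x136bd1e0).accept replacing existing (lossy) channel (new one lossy=1)
EOF

lossy_by_peer /tmp/msgr-sample.log
```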
Re: [ceph-users] Issus with device-mapper drive partition names.
I ran into something similar when messing with my test cluster - basically, it doesn't like existing GPT tables on devices. I got in the habit of running 'gdisk /dev/sdX' and using the 'x' (expert) and 'z' (zap) commands to get rid of the GPT table prior to doing ceph setup.

On Thu, Feb 12, 2015 at 3:09 PM, Tyler Bishop tyler.bis...@beyondhosting.net wrote:

When trying to zap and prepare a disk it fails to find the partitions.

[ceph@ceph0-mon0 ~]$ ceph-deploy -v disk zap ceph0-node1:/dev/mapper/35000c50031a1c08b
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.21): /usr/bin/ceph-deploy -v disk zap ceph0-node1:/dev/mapper/35000c50031a1c08b
[ceph_deploy.osd][DEBUG ] zapping /dev/mapper/35000c50031a1c08b on ceph0-node1
[ceph0-node1][DEBUG ] connection detected need for sudo
[ceph0-node1][DEBUG ] connected to host: ceph0-node1
[ceph0-node1][DEBUG ] detect platform information from remote host
[ceph0-node1][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.0.1406 Core
[ceph0-node1][DEBUG ] zeroing last few blocks of device
[ceph0-node1][DEBUG ] find the location of an executable
[ceph0-node1][INFO ] Running command: sudo /usr/sbin/ceph-disk zap /dev/mapper/35000c50031a1c08b
[ceph0-node1][DEBUG ] Creating new GPT entries.
[ceph0-node1][DEBUG ] Warning: The kernel is still using the old partition table.
[ceph0-node1][DEBUG ] The new table will be used at the next reboot.
[ceph0-node1][DEBUG ] GPT data structures destroyed! You may now partition the disk using fdisk or
[ceph0-node1][DEBUG ] other utilities.
[ceph0-node1][DEBUG ] Warning: The kernel is still using the old partition table.
[ceph0-node1][DEBUG ] The new table will be used at the next reboot.
[ceph0-node1][DEBUG ] The operation has completed successfully.
[ceph_deploy.osd][INFO ] calling partx on zapped device /dev/mapper/35000c50031a1c08b
[ceph_deploy.osd][INFO ] re-reading known partitions will display errors
[ceph0-node1][INFO ] Running command: sudo partx -a /dev/mapper/35000c50031a1c08b

Now running prepare fails because it can't find the newly created partitions.

[ceph@ceph0-mon0 ~]$ ceph-deploy -v osd prepare ceph0-node1:/dev/mapper/35000c50031a1c08b
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.21): /usr/bin/ceph-deploy -v osd prepare ceph0-node1:/dev/mapper/35000c50031a1c08b
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph0-node1:/dev/mapper/35000c50031a1c08b:
[ceph0-node1][DEBUG ] connection detected need for sudo
[ceph0-node1][DEBUG ] connected to host: ceph0-node1
[ceph0-node1][DEBUG ] detect platform information from remote host
[ceph0-node1][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.0.1406 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph0-node1
[ceph0-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph0-node1][INFO ] Running command: sudo udevadm trigger --subsystem-match=block --action=add
[ceph_deploy.osd][DEBUG ] Preparing host ceph0-node1 disk /dev/mapper/35000c50031a1c08b journal None activate False
[ceph0-node1][INFO ] Running command: sudo ceph-disk -v prepare --fs-type xfs --cluster ceph -- /dev/mapper/35000c50031a1c08b
[ceph0-node1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[ceph0-node1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[ceph0-node1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[ceph0-node1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[ceph0-node1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[ceph0-node1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[ceph0-node1][WARNIN] INFO:ceph-disk:Will colocate journal with data on /dev/mapper/35000c50031a1c08b
[ceph0-node1][WARNIN] DEBUG:ceph-disk:Creating journal partition num 2 size 1 on /dev/mapper/35000c50031a1c08b
[ceph0-node1][WARNIN] INFO:ceph-disk:Running command: /sbin/sgdisk --new=2:0:1M --change-name=2:ceph journal --partition-guid=2:b9202d1b-63be-4deb-ad08-0a143a31f4a9 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/mapper/35000c50031a1c08b
[ceph0-node1][DEBUG ] Information: Moved requested sector from 34 to 2048 in
[ceph0-node1][DEBUG ] order to align on 2048-sector boundaries.
[ceph0-node1][DEBUG ] Warning: The kernel is still using the old partition table.
[ceph0-node1][DEBUG ] The new table will be used at the next reboot.
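For what it's worth, the repeated "kernel is still using the old partition table" warnings on a device-mapper path suggest partx alone isn't refreshing the table; on multipath devices the usual workarounds are kpartx (from multipath-tools) or partprobe. The sketch below only prints the commands rather than running them, so nothing touches a real disk; the availability of kpartx/partprobe and the device path are assumptions:

```shell
#!/bin/sh
# Print the commands typically used to make the kernel re-read the
# partition table on a device-mapper device, for review before running.
reread_dm_partitions() {
    dev="$1"
    echo "sudo kpartx -a -v $dev"   # map partitions as /dev/mapper/<name>1, <name>2, ...
    echo "sudo partprobe $dev"      # ask the kernel to re-read the partition table
    echo "ls /dev/mapper/"          # verify the new partition mappings appeared
}

reread_dm_partitions /dev/mapper/35000c50031a1c08b
```

If the partition mappings never appear under /dev/mapper, ceph-disk will keep failing to find the journal/data partitions it just created.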
Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
Congrats, Karan!

sage

On Fri, 6 Feb 2015, Karan Singh wrote:

Hello Community Members

I am happy to introduce the first book on Ceph, titled “Learning Ceph”. Many folks from the publishing house, the technical reviewers, and I spent several months getting this book compiled and published. Finally, the book is up for sale; I hope you will like it and will surely learn a lot from it.

Amazon : http://www.amazon.com/Learning-Ceph-Karan-Singh/dp/1783985623/ref=sr_1_1?s=books&ie=UTF8&qid=1423174441&sr=1-1&keywords=ceph

Packtpub : https://www.packtpub.com/application-development/learning-ceph

You can grab the sample copy from here : https://www.dropbox.com/s/ek76r01r9prs6pb/Learning_Ceph_Packt.pdf?dl=0

Finally, I would like to express my sincere thanks to

Sage Weil - for developing Ceph and everything around it, as well as writing the foreword for “Learning Ceph”.
Patrick McGarry - for his constant, above-and-beyond support.

Last but not least, thanks to our great community members, who are also reviewers of the book: Don Talton, Julien Recurt, Sebastien Han and Zihong Chen. Thank you guys for your efforts.

Karan Singh Systems Specialist, Storage Platforms CSC - IT Center for Science, Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland mobile: +358 503 812758 tel. +358 9 4572001 fax +358 9 4572302 http://www.csc.fi/

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph Performance with SSD journal
Not sure what exact brands of Samsung you have, but I've got the 840 Pro and it sucks big time. It is slow and unreliable, and it grinds to a standstill over time due to the trimming issue, even after I've left about 50% of the disk unprovisioned. The Intel disks (even consumer models like the 520 and 530) are just way better. I will stay away from any Samsung drives in the future.

Andrei

- Original Message -
From: Sumit Gaur sumitkg...@gmail.com
To: Irek Fasikhov malm...@gmail.com
Cc: ceph-users@lists.ceph.com
Sent: Friday, 13 February, 2015 1:09:38 PM
Subject: Re: [ceph-users] ceph Performance with SSD journal

Hi Irek, I am using v0.80.5 Firefly -sumit

On Fri, Feb 13, 2015 at 1:30 PM, Irek Fasikhov malm...@gmail.com wrote: Hi. What version?

2015-02-13 6:04 GMT+03:00 Sumit Gaur sumitkg...@gmail.com : Hi Chris, please find my answers below in blue.

On Thu, Feb 12, 2015 at 12:42 PM, Chris Hoy Poy ch...@gopc.net wrote: Hi Sumit, a couple of questions:

What brand/model SSD? Samsung 480G SSD (PM853T), rated at 90K random-write IOPS (4K, 368 MBps)

What brand/model HDD? 64GB memory, 300GB SAS HDD (Seagate), 10Gb NIC

Also, how are they connected to the controller/motherboard? Are they sharing a bus (i.e. a SATA expander)? No, they are connected to a local bus, not via a SATA expander.

RAM? 64GB

Also look at the output of iostat -x or similar; are the SSDs hitting 100% utilisation? No, the SSD was hitting only 2000 IOPS.

I suspect that the 5:1 ratio of HDDs to SSDs is not ideal; you now have 5x the write IO trying to fit into a single SSD. I have not seen any documented reference for calculating the ratio - could you suggest one? Here I want to mention that the results for 1024K writes improve a lot; the problem is with 1024K reads and 4K writes. SSD journal: 810 IOPS and 810 MBps; HDD journal: 620 IOPS and 620 MBps.

I'll take a punt on it being a SATA-connected SSD (most common); 5x ~130 megabytes/second gets very close to most SATA bus limits.
If it's a shared bus, you possibly hit that limit even earlier (since all that data is now being written out twice over the bus).

cheers;
\Chris

From: Sumit Gaur sumitkg...@gmail.com
To: ceph-users@lists.ceph.com
Sent: Thursday, 12 February, 2015 9:23:35 AM
Subject: [ceph-users] ceph Performance with SSD journal

Hi Ceph Experts,

I have a small Ceph architecture-related question. Blogs and documents suggest that Ceph performs much better if we put the journal on an SSD, so I built a Ceph cluster with 30 HDDs + 6 SSDs across 6 OSD nodes: 5 HDDs + 1 SSD on each node, with each SSD carrying 5 partitions that journal the node's 5 OSDs. I then ran the same tests as on the all-HDD setup. The two readings below go in the opposite direction from what I expected:

1) 4K write IOPS are lower for the SSD setup; not a major difference, but lower.
2) 1024K read IOPS are lower for the SSD setup than for the HDD setup.

On the other hand, 4K reads and 1024K writes both have much better numbers on the SSD setup. Let me know if I am missing some obvious concept.

Thanks
sumit

--
Regards, Irek Fasikhov
Mob.: +79229045757
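Chris's bus-limit argument is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch, assuming ~130 MB/s per sequential HDD stream and ~550 MB/s of usable SATA-3 bandwidth - both round numbers for illustration, not measurements from this thread:

```shell
#!/bin/sh
# Rough check: can one SATA SSD absorb the journal writes of 5 HDD OSDs?
HDDS_PER_SSD=5
MB_PER_HDD=130     # assumed per-HDD sequential write stream, MB/s
SATA3_MB=550       # assumed usable SATA-3 bandwidth, MB/s

journal_mb=$((HDDS_PER_SSD * MB_PER_HDD))
echo "aggregate journal writes: ${journal_mb} MB/s (SATA-3 ceiling ~${SATA3_MB} MB/s)"
if [ "$journal_mb" -gt "$SATA3_MB" ]; then
    echo "the single SSD's bus is already the bottleneck"
fi
```

And that is before accounting for the double write Chris mentions when the SSD shares a bus with the data disks, which would roughly halve the usable headroom again.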
Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph
Yeah, guys, thanks! I got it a few days ago and have already done a few chapters. Well done!

Andrei

- Original Message -
From: Wido den Hollander w...@42on.com
To: ceph-users@lists.ceph.com
Sent: Friday, 13 February, 2015 5:38:47 PM
Subject: Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph

On 05-02-15 23:53, Karan Singh wrote:

Hello Community Members

I am happy to introduce the first book on Ceph, titled “*Learning Ceph*”. Many folks from the publishing house, the technical reviewers, and I spent several months getting this book compiled and published. Finally, the book is up for sale; I hope you will like it and will surely learn a lot from it.

Great! Just ordered myself a copy!

Amazon : http://www.amazon.com/Learning-Ceph-Karan-Singh/dp/1783985623/ref=sr_1_1?s=books&ie=UTF8&qid=1423174441&sr=1-1&keywords=ceph

Packtpub : https://www.packtpub.com/application-development/learning-ceph

You can grab the sample copy from here : https://www.dropbox.com/s/ek76r01r9prs6pb/Learning_Ceph_Packt.pdf?dl=0

*Finally, I would like to express my sincere thanks to *

*Sage Weil* - for developing Ceph and everything around it, as well as writing the foreword for “Learning Ceph”.
*Patrick McGarry* - for his constant, above-and-beyond support.

Last but not least, thanks to our great community members, who are also reviewers of the book: *Don Talton, Julien Recurt, Sebastien Han* and *Zihong Chen*. Thank you guys for your efforts.

Karan Singh Systems Specialist, Storage Platforms CSC - IT Center for Science, Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland mobile: +358 503 812758 tel. +358 9 4572001 fax +358 9 4572302 http://www.csc.fi/

--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902 Skype: contact42on
Re: [ceph-users] Calamari build in vagrants
0cbcfbaa791baa3ee25c4f1a135f005c1d568512 on the 1.2.3 branch has the change to yo 1.1.0. I've just cherry-picked that to v1.3 and master.

On 02/12/2015 11:21 AM, Steffen Winther wrote: Steffen Winther ceph.user@... writes:

Trying to build calamari rpm+deb packages following this guide: http://karan-mj.blogspot.fi/2014/09/ceph-calamari-survival-guide.html

The server packages work fine, but the client build (dashboard, manage, admin, login) fails because yo 1.1.0 seems to be needed to build the clients, and npm can't find it - what to do about this, anyone? 1.1.0 seems to be the oldest version npm will install; the latest says 1.4.5 :(

build error:
npm ERR! notarget No compatible version found: yo at '>=1.0.0-0 <1.1.0-0'
npm ERR! notarget Valid install targets:
npm ERR! notarget [1.1.0,1.1.1,1.1.2, 1.2.0,1.2.1,1.3.0,1.3.2,1.3.3]

Found a tarball of yo@1.0.6, which can be installed with either npm install -g <tarball> or npm install -g <package directory> :)

--
Dan Mick
Red Hat, Inc.
Ceph docs: http://ceph.com/docs
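If you are stuck building an older branch without Dan's cherry-pick, the equivalent local workaround is to pin the dependency yourself. A hedged sketch of what that pin looks like in the client's package.json; the exact file location and section within the calamari-clients tree are assumptions from memory, so check the actual manifest your build fails on:

```json
{
  "devDependencies": {
    "yo": "1.1.0"
  }
}
```

Changing the range from ">=1.0.0-0 <1.1.0-0" to an exact "1.1.0" matches the oldest version npm still offers, which is what the error message above is complaining about.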
[ceph-users] Having problem to start Radosgw
Hi all,

I'm having a problem starting radosgw; it gives me errors that I can't diagnose:

$ radosgw -c ceph.conf -d
2015-02-14 07:46:58.435802 7f9d739557c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 27609
2015-02-14 07:46:58.437284 7f9d739557c0 -1 asok(0x7f9d74da80a0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.asok': (17) File exists
2015-02-14 07:46:58.499004 7f9d739557c0 0 framework: fastcgi
2015-02-14 07:46:58.499016 7f9d739557c0 0 starting handler: fastcgi
2015-02-14 07:46:58.501160 7f9d477fe700 0 ERROR: FCGX_Accept_r returned -9
2015-02-14 07:46:58.594271 7f9d648ab700 -1 failed to list objects pool_iterate returned r=-2
2015-02-14 07:46:58.594276 7f9d648ab700 0 ERROR: lists_keys_next(): ret=-2
2015-02-14 07:46:58.594278 7f9d648ab700 0 ERROR: sync_all_users() returned ret=-2
^C2015-02-14 07:47:29.119185 7f9d47fff700 1 handle_sigterm
2015-02-14 07:47:29.119214 7f9d47fff700 1 handle_sigterm set alarm for 120
2015-02-14 07:47:29.119222 7f9d739557c0 -1 shutting down
2015-02-14 07:47:29.142726 7f9d739557c0 1 final shutdown

Since it complains that the file /var/run/ceph/ceph-client.admin.asok exists, I removed it, but now I get this error:

$ radosgw -c ceph.conf -d
2015-02-14 07:47:55.140276 7f31cc0637c0 0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 27741
2015-02-14 07:47:55.201561 7f31cc0637c0 0 framework: fastcgi
2015-02-14 07:47:55.201567 7f31cc0637c0 0 starting handler: fastcgi
2015-02-14 07:47:55.203443 7f319effd700 0 ERROR: FCGX_Accept_r returned -9
2015-02-14 07:47:55.304048 7f319700 -1 failed to list objects pool_iterate returned r=-2
2015-02-14 07:47:55.304054 7f319700 0 ERROR: lists_keys_next(): ret=-2
2015-02-14 07:47:55.304060 7f319700 0 ERROR: sync_all_users() returned ret=-2

Can somebody help me figure out where to start fixing this?
Thanks!
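The first error just means another process running as client.admin already owns the default admin socket, so deleting the .asok file only hides the collision. One common way around it, sketched here under the assumption that you run the gateway under its own client name with a matching keyring, is to give radosgw its own section and socket paths in ceph.conf. The -2 (ENOENT) errors that remain usually indicate the RGW pools or user metadata don't exist yet, which is a separate problem from the socket clash:

```ini
[client.radosgw.gateway]
    ; start the gateway as: radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gateway
    admin socket = /var/run/ceph/ceph-client.radosgw.gateway.asok
    rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
    log file = /var/log/ceph/client.radosgw.gateway.log
```

The "FCGX_Accept_r returned -9" line is also expected when radosgw is started by hand with -d rather than being spawned behind the FastCGI-enabled web server, since stdin is not a FastCGI socket in that case.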