[ceph-users] Replace all monitors
Hi, right now I have 5 monitors which share a slow SSD with several OSD journals. As a result, each data migration operation (reweight, recovery, etc.) is very slow and the cluster is nearly down. So I have to change that. I'm looking to replace these 5 monitors with 3 new monitors, which would still share a (very fast) SSD with several OSDs. I suppose it's not a good idea, since monitors should have dedicated storage. What do you think about that? Is it better practice to have dedicated storage, but share CPU with Xen VMs? Second point: I'm not sure how to do that migration without downtime. I was hoping to add the 3 new monitors, then progressively remove the 5 old monitors, but the doc [1] indicates a special procedure for an unhealthy cluster, which seems to be for clusters with damaged monitors, right? In my case I only have a dead PG [2] (#5226), from which I can't recover, but the monitors are fine. Can I use the standard procedure? Thanks, Olivier [1] http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors [2] http://tracker.ceph.com/issues/5226
Re: [ceph-users] error noticed while setting the Storage cluster
Thanks Wido, I have rectified it. I have created the ceph cluster and created the CloudStack OSDs. On the hypervisor (KVM host) side, do I need to install any ceph packages to communicate with the ceph storage cluster that exists on another host? Regards Sadhu

-Original Message- From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Wido den Hollander Sent: 07 August 2013 00:21 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] error noticed while setting the Storage cluster

On 08/06/2013 08:31 PM, Suresh Sadhu wrote: HI, I am getting the following error when I try to execute this command from the admin node. I followed the procedure mentioned in the document http://ceph.com/docs/master/start/quick-ceph-deploy/

sadhu@ubuntu-2:~$ ceph-deploy install --stable cuttlefish ubuntu3
sadhu@ubuntu3's password:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/pushy/client.py", line 383, in __init__
    self.modules = AutoImporter(self)
  File "/usr/lib/python2.7/dist-packages/pushy/client.py", line 236, in __init__
    remote_compile = self.__client.eval(compile)
  File "/usr/lib/python2.7/dist-packages/pushy/client.py", line 478, in eval
    return self.remote.eval(code, globals, locals)
  File "/usr/lib/python2.7/dist-packages/pushy/protocol/connection.py", line 54, in eval
    return self.send_request(MessageType.evaluate, args)
  File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line 311, in send_request
    self.__send_message(message_type, args)
  File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line 560, in __send_message
    self.__ostream.send_message(m)
  File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line 97, in send_message
    self.__file.write(bytes_)
IOError: [Errno 32] Broken pipe
[remote] sudo: /etc/sudoers.d/ceph: syntax error near line 1
[remote] sudo: /etc/sudoers.d/ceph: syntax error near line 2
[remote] sudo: parse error in /etc/sudoers.d/ceph near line 1
[remote] sudo: no valid sudoers sources found, quitting

Have you verified your sudoers file? Might be a copy/paste issue?
Wido

[remote] sudo: unable to initialize policy plugin
Traceback (most recent call last):
  File "/usr/bin/ceph-deploy", line 21, in <module>
    main()
  File "/usr/lib/pymodules/python2.7/ceph_deploy/cli.py", line 112, in main
    return args.func(args)
  File "/usr/lib/pymodules/python2.7/ceph_deploy/install.py", line 364, in install
    sudo = args.pushy(get_transport(hostname))
  File "/usr/lib/python2.7/dist-packages/pushy/client.py", line 583, in connect
    return PushyClient(target, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pushy/client.py", line 383, in __init__
    self.modules = AutoImporter(self)
  File "/usr/lib/python2.7/dist-packages/pushy/client.py", line 236, in __init__
    remote_compile = self.__client.eval(compile)
  File "/usr/lib/python2.7/dist-packages/pushy/client.py", line 478, in eval
    return self.remote.eval(code, globals, locals)
  File "/usr/lib/python2.7/dist-packages/pushy/protocol/connection.py", line 54, in eval
    return self.send_request(MessageType.evaluate, args)
  File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line 311, in send_request
    self.__send_message(message_type, args)
  File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line 560, in __send_message
    self.__ostream.send_message(m)
  File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line 97, in send_message
    self.__file.write(bytes_)
IOError: [Errno 32] Broken pipe

-- Wido den Hollander 42on B.V. Phone: +31 (0)20 700 9902 Skype: contact42on
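[For reference, the sudoers errors above mean the /etc/sudoers.d/ceph file on the remote node is mangled (often a copy/paste casualty, as Wido suggests). A minimal sketch of recreating it by hand, assuming the deploying user is 'sadhu' (substitute your own user):

cat <<'EOF' | sudo tee /etc/sudoers.d/ceph
# let the deploy user run commands as root without a password or a tty
sadhu ALL = (root) NOPASSWD:ALL
Defaults:sadhu !requiretty
EOF
sudo chmod 0440 /etc/sudoers.d/ceph
sudo visudo -c -f /etc/sudoers.d/ceph    # should report the file parses OK

visudo -c validates the file before sudo ever reads it, which catches exactly the "syntax error near line 1" failure seen above.]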
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
Hi Josh, I have a session logged with debug_ms=1:debug_rbd=20:debug_objectcacher=30, as you requested from Mike, even if I think we have another story here, anyway. The host kernel is 3.10.0-rc7, the qemu client is 1.6.0-rc2, the client kernel is 3.2.0-51-amd... Do you want me to open a ticket for that stuff? I have about a 5MB compressed logfile waiting for you ;) Thnx in advance, Oliver.

On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: On 02.08.2013 at 23:47, Mike Dawson mike.daw...@cloudapt.com wrote: We can un-wedge the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see:

If virsh screenshot works then this confirms that QEMU itself is still responding. Its main loop cannot be blocked since it was able to process the screendump command. This supports Josh's theory that a callback is not being invoked. The virtio-blk I/O request would be left in a pending state. Now here is where the behavior varies between configurations: On a Windows guest with 1 vCPU, you may see the symptom that the guest no longer responds to ping. On a Linux guest with multiple vCPUs, you may see the hung task message from the guest kernel because other vCPUs are still making progress. Just the vCPU that issued the I/O request and whose task is in UNINTERRUPTIBLE state would really be stuck. Basically, the symptoms depend not just on how QEMU is behaving but also on the guest kernel and how many vCPUs you have configured. I think this can explain how both problems you are observing, Oliver and Mike, are a result of the same bug. At least I hope they are :). Stefan

-- Oliver Francke filoo GmbH Moltkestraße 25a 0 Gütersloh HRB4355 AG Gütersloh Managing directors: J.Rehpöhler | C.Kunz Follow us on Twitter: http://twitter.com/filoogmbh
[ceph-users] Openstack glance ceph rbd_store_user authentication problem
Hi, I recently had a problem with OpenStack Glance and Ceph. I used the http://ceph.com/docs/master/rbd/rbd-openstack/#configuring-glance documentation and the http://docs.openstack.org/developer/glance/configuring.html documentation. I'm using Ubuntu 12.04 LTS with Grizzly from the Ubuntu Cloud Archive and ceph 0.61.7. glance-api.conf had the following config options:

default_store = rbd
rbd_store_user = images
rbd_store_pool = images
rbd_store_ceph_conf = /etc/ceph/ceph.conf

Every time I do a glance image create I get errors. In the glance api log I only found an error like:

2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images Traceback (most recent call last):
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images   File "/usr/lib/python2.7/dist-packages/glance/api/v1/images.py", line 444, in _upload
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images     image_meta['size'])
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images   File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 241, in add
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images     with rados.Rados(conffile=self.conf_file, rados_id=self.user) as conn:
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images   File "/usr/lib/python2.7/dist-packages/rados.py", line 134, in __enter__
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images     self.connect()
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images   File "/usr/lib/python2.7/dist-packages/rados.py", line 192, in connect
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images     raise make_ex(ret, "error calling connect")
2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images ObjectNotFound: error calling connect

This trace message didn't help me very much :-( My Google search for "glance.api.v1.images ObjectNotFound: error calling connect" only found http://irclogs.ceph.widodh.nl/index.php?date=2012-10-26 This pointed me to a ceph authentication problem. But the ceph tools worked fine for me. Then I tried the debug option in glance-api.conf and I found the following entries:

DEBUG glance.common.config [-] rbd_store_pool = images log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1485
DEBUG glance.common.config [-] rbd_store_user = glance log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1485

The glance-api service did not use my rbd_store_user = images option!! Then I configured a client.glance auth and it worked with the implicit glance user!!! Now my question: am I the only one with this problem?? Regards, Steffen Thorhauer
Re: [ceph-users] Openstack glance ceph rbd_store_user authentication problem
Steffen, It works for me. I have:

user@node:/etc/ceph# cat /etc/glance/glance-api.conf | grep rbd
default_store = rbd # glance.store.rbd.Store,
rbd_store_ceph_conf = /etc/ceph/ceph.conf
rbd_store_user = images
rbd_store_pool = images
rbd_store_chunk_size = 4

Thanks, Mike Dawson

On 8/8/2013 9:01 AM, Steffen Thorhauer wrote: [snip -- original message quoted in full above]
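[Worth double-checking in Steffen's situation: with cephx enabled, rbd_store_user has to name an existing cephx identity whose keyring the glance user can read, and glance-api has to be restarted to pick up config edits. A rough sketch of creating a client.images user, with capabilities along the lines of the rbd-openstack doc of the time (pool name 'images' assumed):

ceph auth get-or-create client.images mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=images' | sudo tee /etc/ceph/ceph.client.images.keyring
sudo chown glance:glance /etc/ceph/ceph.client.images.keyring
rados --id images -p images ls    # quick test that the identity can connect
sudo service glance-api restart

If the rados test fails with the same ObjectNotFound, the problem is on the ceph side; if it works but glance still fails, the config file isn't being read as expected.]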
Re: [ceph-users] how to recover the osd.
Looks like you didn't get osd.0 deployed properly. Can you show: - ls /var/lib/ceph/osd/ceph-0 - cat /etc/ceph/ceph.conf Thanks, Mike Dawson Co-Founder / Director of Cloud Architecture Cloudapt LLC 6330 East 75th Street, Suite 170 Indianapolis, IN 46250

On 8/8/2013 9:13 AM, Suresh Sadhu wrote: HI, My storage cluster health is in warning state; one of the osds is down, and even if I try to start the osd it fails to start.

sadhu@ubuntu3:~$ ceph osd stat
e22: 2 osds: 1 up, 1 in
sadhu@ubuntu3:~$ ls /var/lib/ceph/osd/
ceph-0 ceph-1
sadhu@ubuntu3:~$ ceph osd tree
# id    weight   type name        up/down  reweight
-1      0.14     root default
-2      0.14         host ubuntu3
0       0.06999          osd.0    down     0
1       0.06999          osd.1    up       1
sadhu@ubuntu3:~$ sudo /etc/init.d/ceph -a start 0
/etc/init.d/ceph: 0. not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )
sadhu@ubuntu3:~$ sudo /etc/init.d/ceph -a start osd.0
/etc/init.d/ceph: osd.0 not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )

Ceph health status is in warning mode:

pg 4.10 is active+degraded, acting [1]
pg 3.17 is active+degraded, acting [1]
pg 5.16 is active+degraded, acting [1]
pg 4.17 is active+degraded, acting [1]
pg 3.10 is active+degraded, acting [1]
recovery 62/124 degraded (50.000%)
mds.ceph@ubuntu3 at 10.147.41.3:6803/2148 is laggy/unresponsi

regards sadhu
[ceph-users] How to set Object Size/Stripe Width/Stripe Count?
Hi list, I saw the info about data striping in http://ceph.com/docs/master/architecture/#data-striping, but I couldn't find the way to set these values. Could you please tell me how to do that or give me a link? Thanks!
Re: [ceph-users] minimum object size in ceph
On Wed, 7 Aug 2013, Nulik Nol wrote: Thanks Dan, I meant like a PRIMARY KEY in an RDBMS, or a key for a NoSQL (key-value pair) database, to perform put()/get() operations. Well, if it is a string then it's ok; I can print binary keys in hex or uuencode or something like that. Is there a limit on the maximum string length for an object name?

It is pretty long... I think 4096 characters, although things are not quite as efficient on the backend when names are long.

sage

Regards Nulik

On Tue, Aug 6, 2013 at 4:08 PM, Dan Mick dan.m...@inktank.com wrote: No minimum object size. As for key, not sure what you mean; the closest thing to an object 'key' is its name, but it's obvious from routines like rados_read() and rados_write() that that's a const char *. Did you mean some other key? On 08/06/2013 12:13 PM, Nulik Nol wrote: Hi, when using the C API (RADOS), what is the minimum object size? And what is the key type? (uint64_t, char[], or something like that?) TIA Nulik -- Dan Mick, Filesystem Engineering Inktank Storage, Inc. http://inktank.com Ceph docs: http://ceph.com/docs
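[A quick shell-level way to sanity-check long, encoded object names (the librados C calls take the same const char * name the rados CLI does); this assumes a pool named 'data' exists:

# binary keys need a string encoding; 64 random bytes -> 128 hex characters
name=$(head -c 64 /dev/urandom | xxd -p | tr -d '\n')
echo hello > /tmp/obj
rados -p data put "$name" /tmp/obj
rados -p data stat "$name"    # confirms the object exists under that long name
]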
Re: [ceph-users] How to set Object Size/Stripe Width/Stripe Count?
This can help you. http://www.sebastien-han.fr/blog/2013/02/11/mount-a-specific-pool-with-cephfs/

On Thu, Aug 8, 2013 at 7:48 AM, Da Chun ng...@qq.com wrote: [snip -- question quoted in full above]
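[For the archives, a sketch of what that post walks through, using the legacy 'cephfs' helper on a mounted CephFS directory; flags per that era's tool (-u stripe unit, -c stripe count, -s object size, all in bytes), and the path is a placeholder. New files created under the directory inherit the layout:

cephfs /mnt/ceph/mydir set_layout -u 1048576 -c 8 -s 4194304
cephfs /mnt/ceph/mydir show_layout
# for RBD images the object size is fixed at creation time via --order
# (object size = 2^order bytes, so order 23 = 8 MB objects):
rbd create mypool/myimage --size 10240 --order 23
]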
Re: [ceph-users] kernel BUG at net/ceph/osd_client.c:2103
Hello, I don't know if it's useful, but I can also reproduce this bug with rbd kernel 3.10.4, ceph osd 0.61.4, image format 2. The rbd image is formatted with xfs; after some snapshots and mount/umount tests (no writes on the file system), the xfs mount segfaults and the kernel has the same log. Cheers, Laurent Barbe

On 05/08/2013 07:22, Olivier Bonvalet wrote: Yes of course, thanks!

On Sunday 04 August 2013 at 20:59 -0700, Sage Weil wrote: Hi Olivier, This looks like http://tracker.ceph.com/issues/5760. We should be able to look at this more closely this week. In the meantime, you might want to go back to 3.9.x. If we have a patch that addresses the bug, would you be able to test it? Thanks! sage

On Mon, 5 Aug 2013, Olivier Bonvalet wrote: Sorry, the dev list is probably a better place for that one.

On Monday 05 August 2013 at 03:07 +0200, Olivier Bonvalet wrote: Hi, I've just upgraded a Xen Dom0 (Debian Wheezy with Xen 4.2.2) from Linux 3.9.11 to Linux 3.10.5, and now I have a kernel panic after launching some VMs which use the RBD kernel client. In the kernel logs, I have:

Aug 5 02:51:22 murmillia kernel: [ 289.205652] kernel BUG at net/ceph/osd_client.c:2103!
Aug 5 02:51:22 murmillia kernel: [ 289.205725] invalid opcode: [#1] SMP
Aug 5 02:51:22 murmillia kernel: [ 289.205908] Modules linked in: cbc rbd libceph libcrc32c xen_gntdev ip6table_mangle ip6t_REJECT ip6table_filter ip6_tables xt_DSCP iptable_mangle xt_LOG xt_physdev ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge loop coretemp ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support gpio_ich microcode serio_raw sb_edac edac_core evdev lpc_ich i2c_i801 mfd_core wmi ac ioatdma shpchp button dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crc32c_intel isci megaraid_sas libsas ahci libahci ehci_pci ehci_hcd libata scsi_transport_sas igb scsi_mod i2c_algo_bit ixgbe usbcore i2c_core dca usb_common ptp pps_core mdio
Aug 5 02:51:22 murmillia kernel: [ 289.210499] CPU: 2 PID: 5326 Comm: blkback.3.xvda Not tainted 3.10-dae-dom0 #1
Aug 5 02:51:22 murmillia kernel: [ 289.210617] Hardware name: Supermicro X9DRW-7TPF+/X9DRW-7TPF+, BIOS 2.0a 03/11/2013
Aug 5 02:51:22 murmillia kernel: [ 289.210738] task: 880037d01040 ti: 88003803a000 task.ti: 88003803a000
Aug 5 02:51:22 murmillia kernel: [ 289.210858] RIP: e030:[a02d21d0] [a02d21d0] ceph_osdc_build_request+0x2bb/0x3c6 [libceph]
Aug 5 02:51:22 murmillia kernel: [ 289.211062] RSP: e02b:88003803b9f8 EFLAGS: 00010212
Aug 5 02:51:22 murmillia kernel: [ 289.211154] RAX: 880033a181c0 RBX: 880033a182ec RCX:
Aug 5 02:51:22 murmillia kernel: [ 289.211251] RDX: 880033a182af RSI: 8050 RDI: 880030d34888
Aug 5 02:51:22 murmillia kernel: [ 289.211347] RBP: 2000 R08: 88003803ba58 R09:
Aug 5 02:51:22 murmillia kernel: [ 289.211444] R10: R11: R12: 880033ba3500
Aug 5 02:51:22 murmillia kernel: [ 289.211541] R13: 0001 R14: 88003847aa78 R15: 88003847ab58
Aug 5 02:51:22 murmillia kernel: [ 289.211644] FS: 7f775da8c700() GS:88003f84() knlGS:
Aug 5 02:51:22 murmillia kernel: [ 289.211765] CS: e033 DS: ES: CR0: 80050033
Aug 5 02:51:22 murmillia kernel: [ 289.211858] CR2: 7fa21ee2c000 CR3: 2be14000 CR4: 00042660
Aug 5 02:51:22 murmillia kernel: [ 289.211956] DR0: DR1: DR2:
Aug 5 02:51:22 murmillia kernel: [ 289.212052] DR3: DR6: 0ff0 DR7: 0400
Aug 5 02:51:22 murmillia kernel: [ 289.212148] Stack:
Aug 5 02:51:22 murmillia kernel: [ 289.212232] 2000 00243847aa78 880039949b40
Aug 5 02:51:22 murmillia kernel: [ 289.212577] 2201 880033811d98 88003803ba80 88003847aa78
Aug 5 02:51:22 murmillia kernel: [ 289.212921] 880030f24380 880002a38400 2000 a029584c
Aug 5 02:51:22 murmillia kernel: [ 289.213264] Call Trace:
Aug 5 02:51:22 murmillia kernel: [ 289.213358] [a029584c] ? rbd_osd_req_format_write+0x71/0x7c [rbd]
Aug 5 02:51:22 murmillia kernel: [ 289.213459] [a0296f05] ? rbd_img_request_fill+0x695/0x736 [rbd]
Aug 5 02:51:22 murmillia kernel: [ 289.213562] [810c96a7] ? arch_local_irq_restore+0x7/0x8
Aug 5 02:51:22 murmillia kernel: [ 289.213667] [81357ff8] ? down_read+0x9/0x19
Aug 5 02:51:22 murmillia kernel: [ 289.213763] [a029828a] ? rbd_request_fn+0x191/0x22e [rbd]
Aug 5 02:51:22 murmillia kernel: [ 289.213864] [8117ac9e] ? __blk_run_queue_uncond+0x1e/0x26
Aug 5 02:51:22 murmillia kernel: [ 289.213962] [8117b7aa] ? blk_flush_plug_list+0x1c1/0x1e4
Aug 5 02:51:22 murmillia kernel: [
Re: [ceph-users] how to recover the osd.
Thanks Mike. Please find the output of the two commands:

sadhu@ubuntu3:~$ ls /var/lib/ceph/osd/ceph-0
sadhu@ubuntu3:~$ cat /etc/ceph/ceph.conf
[global]
fsid = 593dac9e-ce55-4803-acb4-2d32b4e0d3be
mon_initial_members = ubuntu3
mon_host = 10.147.41.3
#auth_supported = cephx
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd_journal_size = 1024
filestore_xattr_use_omap = true

-Original Message- From: Mike Dawson [mailto:mike.daw...@cloudapt.com] Sent: 08 August 2013 18:50 To: Suresh Sadhu Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] how to recover the osd. [snip -- Mike's reply quoted in full above]
Re: [ceph-users] Replace all monitors
On Thu, 8 Aug 2013, Olivier Bonvalet wrote: Hi, right now I have 5 monitors which share a slow SSD with several OSD journals. As a result, each data migration operation (reweight, recovery, etc.) is very slow and the cluster is nearly down. So I have to change that. I'm looking to replace these 5 monitors with 3 new monitors, which would still share a (very fast) SSD with several OSDs. I suppose it's not a good idea, since monitors should have dedicated storage. What do you think about that? Is it better practice to have dedicated storage, but share CPU with Xen VMs?

I think it's okay, as long as you aren't worried about the device filling up and the monitors are on different hosts.

Second point: I'm not sure how to do that migration without downtime. I was hoping to add the 3 new monitors, then progressively remove the 5 old monitors, but the doc [1] indicates a special procedure for an unhealthy cluster, which seems to be for clusters with damaged monitors, right? In my case I only have a dead PG [2] (#5226), from which I can't recover, but the monitors are fine. Can I use the standard procedure?

The 'healthy' caveat in this case is about the monitor cluster; the special procedure is only needed if you don't have enough healthy mons to form a quorum. The normal procedure should work just fine.

sage
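[To make the "standard procedure" concrete, a condensed sketch of the steps from [1], one monitor at a time and keeping quorum throughout; hostnames, IDs, and IPs are placeholders, and the service commands vary by distro:

# bring up each new monitor:
ceph auth get mon. -o /tmp/mon.keyring
ceph mon getmap -o /tmp/monmap
ceph-mon -i newmon1 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
ceph mon add newmon1 192.0.2.11:6789
service ceph start mon.newmon1    # sysvinit; 'start ceph-mon id=newmon1' on upstart
ceph quorum_status                # wait until the new mon has joined
# ...repeat for newmon2 and newmon3, then retire the old ones one by one:
service ceph stop mon.oldmon1
ceph mon remove oldmon1

Going 5 -> 8 -> 3 this way keeps an odd-enough majority available at every step, so clients never lose the monitor cluster.]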
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
On 08/08/2013 05:40 AM, Oliver Francke wrote: [snip -- message quoted in full above]

Yes, that'd be great. If you could include the time when you saw the guest hang that'd be ideal. I'm not sure if this is one or two bugs, but it seems likely it's a bug in rbd and not qemu. Thanks! Josh

[snip -- remainder of quoted thread trimmed]
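[For anyone who wants to capture the same kind of evidence, a sketch of enabling the client-side logging mentioned above on the qemu host; the [client] section mirrors the debug levels Oliver used, while the log file path is an assumption (adjust to taste). The guest's qemu process has to be restarted to pick this up:

cat >> /etc/ceph/ceph.conf <<'EOF'
[client]
    debug ms = 1
    debug rbd = 20
    debug objectcacher = 30
    log file = /var/log/ceph/client.$name.$pid.log
EOF
]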
Re: [ceph-users] how to recover the osd.
It was created properly earlier. After rebooting the host the mount points were gone, which is why the ls command showed nothing before; now I have mounted them again and I am able to see the same folder structure:

sadhu@ubuntu3:/var/lib/ceph$ ls /var/lib/ceph/osd/ceph-1
activate.monmap active ceph_fsid current fsid journal keyring magic ready store_version upstart whoami
sadhu@ubuntu3:/var/lib/ceph$ ls /var/lib/ceph/osd/ceph-0
activate.monmap active ceph_fsid current fsid journal keyring magic ready store_version upstart whoami
sadhu@ubuntu3:/var/lib/ceph$ mount
sadhu@ubuntu3:/var/lib/ceph$ ceph osd stat
e31: 2 osds: 2 up, 2 in

Still, ceph health shows a warning:

sadhu@ubuntu3:/var/lib/ceph$ ceph health
HEALTH_WARN 225 pgs degraded; 676 pgs stuck unclean; recovery 21/124 degraded (16.935%); mds ceph@ubuntu3 is laggy

Thanks sadhu

-Original Message- From: Mike Dawson [mailto:mike.daw...@cloudapt.com] Sent: 08 August 2013 22:08 To: Suresh Sadhu Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] how to recover the osd.

On 8/8/2013 12:30 PM, Suresh Sadhu wrote: Thanks Mike. Please find the output of the two commands: sadhu@ubuntu3:~$ ls /var/lib/ceph/osd/ceph-0

^^^ that is a problem. It appears that osd.0 didn't get deployed properly. To see an example of what structure should be there, do: ls /var/lib/ceph/osd/ceph-1 ceph-0 should be similar to the apparently working ceph-1 on your cluster. It should look similar to:

# ls /var/lib/ceph/osd/ceph-0
ceph_fsid current fsid keyring magic ready store_version whoami

- Mike

[snip -- remainder of quoted thread trimmed]
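[With both OSDs up and in, the remaining HEALTH_WARN should clear on its own as the degraded PGs re-replicate; a sketch of keeping an eye on that, plus a guess at persisting the manual mounts (device name and filesystem are placeholders -- adjust to the actual partitions):

ceph health detail    # lists exactly which PGs are still degraded/unclean
ceph -w               # watch recovery until everything is active+clean
# if the OSD data partitions were mounted by hand, fstab entries keep them
# from disappearing again on the next reboot:
echo '/dev/sdb1 /var/lib/ceph/osd/ceph-0 xfs noatime 0 0' | sudo tee -a /etc/fstab
]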
[ceph-users] Chat Logs: Ceph Dev Summit
Hey all - I just posted the IRC chat logs from the Ceph Developer Summit. You can find them on the wiki, one log for sessions 1-16 and another for sessions 17-29: http://wiki.ceph.com/01Planning/CDS/Emperor/Chat_Log%3A_Sessions_1-16 http://wiki.ceph.com/01Planning/CDS/Emperor/Chat_Log%3A_Sessions_17-29 Cheers, Ross
[ceph-users] Backup monmap, osdmap, and crushmap
I've seen a couple posts here about broken clusters that had to be repaired by modifying the monmap, osdmap, or the crush rules. The old-school sysadmin in me says it would be a good idea to make backups of these 3 databases. So far, though, it seems like everybody was able to repair their clusters by dumping the current map and modifying it. I'll probably do it anyway, just to assuage my paranoia, but I was wondering what you guys thought. I'm thinking of cronning this on the MON servers:

#!/usr/bin/env bash

# Number of days to keep backups
cleanup_age=10

# Fetch the current timestamp, to use in the backup filenames
date=$(date +%Y-%m-%dT%H:%M:%S)

# Dump the current maps; bail out rather than dump (and later delete)
# in the wrong directory if the backup directory is missing
cd /var/lib/ceph/backups/ || exit 1
ceph mon getmap -o ./monmap.${date}
ceph osd getmap -o ./osdmap.${date}
ceph osd getcrushmap -o ./crushmap.${date}

# Delete old maps (-r: skip running rm when nothing matched)
find . -type f -regextype posix-extended -regex '\./(mon|osd|crush)map\..*' -mtime +${cleanup_age} -print0 | xargs -0 -r rm
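[Seems reasonable; the dumps are also useful read-only, since each one can be inspected (or, for the crush map, decompiled) offline. A sketch, with a placeholder timestamp in the filenames:

monmaptool --print monmap.2013-08-08T12:00:00
osdmaptool --print osdmap.2013-08-08T12:00:00
crushtool -d crushmap.2013-08-08T12:00:00 -o crushmap.txt    # decompile to text
# and if a bad change ever has to be rolled back:
ceph osd setcrushmap -i crushmap.2013-08-08T12:00:00
]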
Re: [ceph-users] Replace all monitors
On Thursday 08 August 2013 at 09:43 -0700, Sage Weil wrote: On Thu, 8 Aug 2013, Olivier Bonvalet wrote: [...] Is it better practice to have dedicated storage, but share CPU with Xen VMs? I think it's okay, as long as you aren't worried about the device filling up and the monitors are on different hosts.

I'm not sure I understand: by "dedicated storage", I was talking about the monitors. Can I put monitors on Xen hosts, if they have dedicated storage?

[...] Can I use the standard procedure? The 'healthy' caveat in this case is about the monitor cluster; the special procedure is only needed if you don't have enough healthy mons to form a quorum. The normal procedure should work just fine.

Great, thanks!

sage
Re: [ceph-users] journal on ssd
Let me just clarify... the prepare process created all 10 partitions on sdg; the thing is that only 2 (sdg1, sdg2) would be present in /dev. The partx bit is just a hack, as I am not familiar with the entire sequence. Initially I was deploying this test cluster on 5 nodes, each with 10 spinners, 1 OS spinner, and 1 SSD for the journals. *All* nodes would only bring up the first 2 osds. From the start the partitions for the journals are there:

~]# parted /dev/sdg
GNU Parted 2.1
Using /dev/sdg
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: ATA Samsung SSD 840 (scsi)
Disk /dev/sdg: 512GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name          Flags
 1      1049kB  4295MB  4294MB               ceph journal
 2      4296MB  8590MB  4294MB               ceph journal
 3      8591MB  12.9GB  4294MB               ceph journal
 4      12.9GB  17.2GB  4294MB               ceph journal
 5      17.2GB  21.5GB  4294MB               ceph journal
 6      21.5GB  25.8GB  4294MB               ceph journal
 7      25.8GB  30.1GB  4294MB               ceph journal
 8      30.1GB  34.4GB  4294MB               ceph journal
 9      34.4GB  38.7GB  4294MB               ceph journal
10      38.7GB  42.9GB  4294MB               ceph journal

After partx, all the entries show up under /dev and I have been able to install the cluster successfully. The only weirdness happened with one node: not everything was entirely active+clean. That got resolved after I added the 2nd node. At the moment, with 3 nodes:

2013-08-08 17:38:38.328991 mon.0 [INF] pgmap v412: 192 pgs: 192 active+clean; 9518 bytes data, 1153 MB used, 83793 GB / 83794 GB avail

Thanks,

On Thu, Aug 8, 2013 at 8:17 AM, Sage Weil s...@inktank.com wrote: On Wed, 7 Aug 2013, Tren Blackburn wrote: On Tue, Aug 6, 2013 at 11:14 AM, Joao Pedras jpped...@gmail.com wrote: Greetings all. I am installing a test cluster using one ssd (/dev/sdg) to hold the journals. Ceph's version is 0.61.7 and I am using ceph-deploy obtained from ceph's git yesterday. This is on RHEL6.4, fresh install. When preparing the first 2 drives, sda and sdb, all goes well and the journals get created in sdg1 and sdg2:

$ ceph-deploy osd prepare ceph00:sda:sdg ceph00:sdb:sdg
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph00:/dev/sda:/dev/sdg ceph00:/dev/sdb:/dev/sdg
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph00
[ceph_deploy.osd][DEBUG ] Host ceph00 is now ready for osd use.
[ceph_deploy.osd][DEBUG ] Preparing host ceph00 disk /dev/sda journal /dev/sdg activate False
[ceph_deploy.osd][DEBUG ] Preparing host ceph00 disk /dev/sdb journal /dev/sdg activate False

When preparing sdc or any disk after the first 2, I get the following in that osd's log but no errors from ceph-deploy:

# tail -f /var/log/ceph/ceph-osd.2.log
2013-08-06 10:51:36.655053 7f5ba701a780 0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff), process ceph-osd, pid 11596
2013-08-06 10:51:36.658671 7f5ba701a780 1 filestore(/var/lib/ceph/tmp/mnt.i2NK47) mkfs in /var/lib/ceph/tmp/mnt.i2NK47
2013-08-06 10:51:36.658697 7f5ba701a780 1 filestore(/var/lib/ceph/tmp/mnt.i2NK47) mkfs fsid is already set to 5d1beb09-1f80-421d-a88c-57789e2fc33e
2013-08-06 10:51:36.813783 7f5ba701a780 1 filestore(/var/lib/ceph/tmp/mnt.i2NK47) leveldb db exists/created
2013-08-06 10:51:36.813964 7f5ba701a780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2013-08-06 10:51:36.813999 7f5ba701a780 1 journal _open /var/lib/ceph/tmp/mnt.i2NK47/journal fd 10: 0 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-08-06 10:51:36.814035 7f5ba701a780 -1 journal check: ondisk fsid ---- doesn't match expected 5d1beb09-1f80-421d-a88c-57789e2fc33e, invalid (someone else's?) journal
2013-08-06 10:51:36.814093 7f5ba701a780 -1 filestore(/var/lib/ceph/tmp/mnt.i2NK47) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.i2NK47/journal: (22) Invalid argument
2013-08-06 10:51:36.814125 7f5ba701a780 -1 OSD::mkfs: FileStore::mkfs failed with error -22
2013-08-06 10:51:36.814185 7f5ba701a780 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.i2NK47: (22) Invalid argument

I have cleaned the disks with dd, zapped them and so forth, but this always occurs. If doing sdc/sdd first, for example, then sda or whatever follows fails with similar errors. Does anyone have any insight on this issue?

Very strange! What does the partition table look like at this point? Does the journal symlink in the osd data directory point to the right partition/device on the failing osd?

sage

-- Joao Pedras
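[For reference, a sketch of the workaround Joao describes, assuming the journal SSD is /dev/sdg: after ceph-deploy prepares the disks, ask the kernel to re-read the GPT so the missing device nodes appear:

partx -a /dev/sdg    # or: partprobe /dev/sdg
udevadm settle       # wait for udev to finish creating the nodes
ls /dev/sdg*         # all ten journal partitions should now be listed
]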
Re: [ceph-users] journal on ssd
I might be able to give that a shot tomorrow as I will probably reinstall this set.

On Thu, Aug 8, 2013 at 6:19 PM, Sage Weil s...@inktank.com wrote: On Thu, 8 Aug 2013, Joao Pedras wrote: [snip -- clarification quoted in full above]

This really seems like something that udev should be doing. I think the next step would be to reproduce the problem directly, by wiping the partition table (ceph-disk zap /dev/sdg) and running the sgdisk commands to create the partitions directly from the command line, and then verifying that the /dev entries are (not) present. It may be that our ugly ceph-disk-udev helper is throwing a wrench in things, but I'm not sure offhand how that would be. Once you have a sequence that reproduces the problem, though, we can experiment (by e.g. disabling the ceph helper to rule that out).

sage

[snip -- remainder of quoted thread trimmed]
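[A rough sketch of the manual repro Sage is suggesting (this WIPES /dev/sdg, and the exact sgdisk invocation ceph-disk uses may differ -- the point is just to see whether a fresh partition's device node shows up on its own):

ceph-disk zap /dev/sdg
sgdisk --new=1:0:+4G --change-name=1:'ceph journal' /dev/sdg
udevadm settle
ls /dev/sdg*    # if /dev/sdg1 is missing here, the udev path is suspect
]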
[ceph-users] ceph-deploy behind corporate firewalls
hi all, I am not sure if I am the only one having issues with ceph-deploy behind a firewall or not. I haven't seen any other reports of similar issues yet. With HTTP proxies I am able to get apt-get working, but wget is still an issue. I am working to use the newer ceph-deploy mechanism to deploy my next POC setup on four storage nodes. The ceph-deploy install process unfortunately uses wget to retrieve the Ceph release key, failing the install. To get around this I can manually add the Ceph release key on all my nodes and apt-get install all the Ceph packages. The question, though, is whether there is anything else that ceph-deploy does that I would need to do manually to have everything in a state where ceph-deploy would work correctly for the rest of the cluster setup and deployment, i.e. ceph-deploy new -and- ceph-deploy mon create, etc.? thank you, Harvey
Re: [ceph-users] ceph-deploy behind corporate firewalls
On Thu, 8 Aug 2013, Harvey Skinner wrote: hi all, I am not sure if I am the only one having issues with ceph-deploy behind a firewall or not. I haven't seen any other reports of similar issues yet. With HTTP proxies I am able to get apt-get working, but wget is still an issue.

This is indeed a problem for many users. It's on our list of things to add to the tool!

I am working to use the newer ceph-deploy mechanism to deploy my next POC setup on four storage nodes. The ceph-deploy install process unfortunately uses wget to retrieve the Ceph release key, failing the install. To get around this I can manually add the Ceph release key on all my nodes and apt-get install all the Ceph packages. The question, though, is whether there is anything else that ceph-deploy does that I would need to do manually to have everything in a state where ceph-deploy would work correctly for the rest of the cluster setup and deployment, i.e. ceph-deploy new -and- ceph-deploy mon create, etc.?

I'm pretty sure install is the only thing that needs apt or wget; you should be fine.

sage
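[Until then, a sketch of the manual workaround on each node: export proxy variables so wget and apt can get out, fetch the release key, and install the packages by hand. The proxy URL is a placeholder, and the key/repo URLs are the ones the docs of the time pointed wget at:

export http_proxy=http://proxy.example.com:3128
export https_proxy=$http_proxy
wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
echo deb http://ceph.com/debian-cuttlefish/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
sudo apt-get update && sudo apt-get install ceph

After that, 'ceph-deploy install' should find the packages already present, and the rest of ceph-deploy (new, mon create, osd prepare) needs no outbound access.]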