Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
I just pushed a patch to wip-dumpling-log-assert (based on current dumpling head). I had disabled most of the code in PGLog::check() but left an (I thought) innocuous assert. It seems that with (at least) g++ 4.6.3, stl list::size() is linear in the size of the list, so that assert actually traverses the pg log on each operation. The patch in wip-dumpling-log-assert should disable that assert as well by default.

Let me know if it helps. It should be built within an hour of this email.
-Sam

On Mon, Aug 26, 2013 at 10:46 PM, Matthew Anderson manderson8...@gmail.com wrote:
> Hi Guys,
>
> I'm having the same problem as Oliver with 0.67.2. CPU usage is around double that of the 0.61.8 OSDs in the same cluster, which appears to be causing the performance decrease. I did a perf comparison (not sure if I did it right but it seems ok). Both hosts are the same spec running Ubuntu 12.04.1 (3.2 kernel), journal and osd data are on an SSD, the OSDs are in the same pool with the same weight, and the perf tests were run at the same time on a real-world load consisting of RBD traffic only.
>
> Dumpling -
> Events: 332K cycles
>  17.93%  ceph-osd  libc-2.15.so          [.] 0x15d523
>  17.03%  ceph-osd  ceph-osd              [.] 0x5c2897
>   4.66%  ceph-osd  ceph-osd              [.] leveldb::InternalKeyComparator::Compare(leveldb::Slice const&, level
>   3.46%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::Next()
>   2.70%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::_M_mutate(unsigned long, unsigned long, unsigned long)
>   2.60%  ceph-osd  ceph-osd              [.] PGLog::check()
>   2.57%  ceph-osd  [kernel.kallsyms]     [k] __ticket_spin_lock
>   2.49%  ceph-osd  ceph-osd              [.] ceph_crc32c_le_intel
>   1.93%  ceph-osd  libsnappy.so.1.1.2    [.] snappy::RawUncompress(snappy::Source*, char*)
>   1.53%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::append(char const*, unsigned long)
>   1.47%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator new(unsigned long)
>   1.33%  ceph-osd  [kernel.kallsyms]     [k] copy_user_generic_string
>   0.98%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator delete(void*)
>   0.90%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::assign(char const*, unsigned long)
>   0.75%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::_M_replace_safe(unsigned long, unsigned long, char cons
>   0.58%  ceph-osd  [kernel.kallsyms]     [k] wait_sb_inodes
>   0.55%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::Valid() const
>   0.51%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::
>   0.50%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::CentralFreeList::FetchFromSpans()
>   0.47%  ceph-osd  libstdc++.so.6.0.16   [.] 0x9ebc8
>   0.46%  ceph-osd  libc-2.15.so          [.] vfprintf
>   0.45%  ceph-osd  [kernel.kallsyms]     [k] find_busiest_group
>   0.45%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::resize(unsigned long, char)
>   0.43%  ceph-osd  libpthread-2.15.so    [.] pthread_mutex_unlock
>   0.41%  ceph-osd  [kernel.kallsyms]     [k] iput_final
>   0.40%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::Seek(leveldb::Slice const&)
>   0.39%  ceph-osd  libc-2.15.so          [.] _IO_vfscanf
>   0.39%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::key() const
>   0.39%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
>   0.37%  ceph-osd  libstdc++.so.6.0.16   [.] std::basic_ostream<char, std::char_traits<char> >& std::__ostream_in
>
> Cuttlefish -
> Events: 160K cycles
>   7.53%  ceph-osd  [kernel.kallsyms]     [k] __ticket_spin_lock
>   6.26%  ceph-osd  libc-2.15.so          [.] 0x89115
>   3.06%  ceph-osd  ceph-osd              [.] ceph_crc32c_le
>   2.66%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator new(unsigned long)
>   2.46%  ceph-osd  [kernel.kallsyms]     [k] find_busiest_group
>   1.80%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator delete(void*)
>   1.42%  ceph-osd  [kernel.kallsyms]     [k] try_to_wake_up
>   1.27%  ceph-osd  ceph-osd              [.] 0x531fb6
>   1.21%  ceph-osd  libstdc++.so.6.0.16   [.] 0x9ebc8
>   1.14%  ceph-osd  [kernel.kallsyms]     [k] wait_sb_inodes
>   1.02%  ceph-osd  libc-2.15.so          [.] _IO_vfscanf
>   1.01%  ceph-osd  [kernel.kallsyms]     [k] update_shares
>   0.98%  ceph-osd  [kernel.kallsyms]     [k] filemap_fdatawait_range
>   0.90%  ceph-osd  libstdc++.so.6.0.16   [.] std::basic_ostream<char, std::char_traits<char> >& std
>   0.89%  ceph-osd  [kernel.kallsyms]     [k] iput_final
>   0.79%  ceph-osd  libstdc++.so.6.0.16   [.] std::basic_string<char, std::char_traits<char>, std::a
>   0.79%  ceph-osd  [kernel.kallsyms]     [k] copy_user_generic_string
>   0.78%  ceph-osd  libc-2.15.so          [.] vfprintf
>   0.70%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc:
>   0.69%  ceph-osd  [kernel.kallsyms]     [k] __d_lookup_rcu
>   0.69%  ceph-osd  [...]
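(The effect Sam describes is easy to see outside of Ceph: with the libstdc++ shipped alongside g++ 4.6, std::list::size() walks the entire list, so an assert that compares it against an expected value turns every operation into a full traversal. A standalone sketch, not Ceph code — the list here merely stands in for the pg log:)

    #include <cassert>
    #include <cstdio>
    #include <ctime>
    #include <list>

    int main() {
        std::list<int> pg_log;                  // stand-in for the pg log
        for (int i = 0; i < 1000000; ++i)
            pg_log.push_back(i);

        std::clock_t t0 = std::clock();
        for (int op = 0; op < 100; ++op)        // 100 "operations"
            assert(pg_log.size() == 1000000u);  // O(n) per call on pre-C++11 libstdc++
        std::clock_t t1 = std::clock();

        std::printf("100 size() asserts: %.3f s CPU\n",
                    (double)(t1 - t0) / CLOCKS_PER_SEC);
        return 0;
    }

(Built with an old g++ this burns visible CPU time in what looks like a no-op check; building with -DNDEBUG, or caching the size, makes it effectively free — which is what disabling the assert in the OSD amounts to.)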
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Hi Sam,

It looks like that has dropped the CPU usage a fair bit. CPU usage still seems a bit higher than Cuttlefish, but that might just be due to the levelDB changes. Here's the updated perf report -

Events: 80K cycles
 17.25%  ceph-osd  libc-2.15.so          [.] 0x15d534
 14.63%  ceph-osd  ceph-osd              [.] 0x5c801b
  3.87%  ceph-osd  ceph-osd              [.] leveldb::InternalKeyComparator::Compare(leveldb::Slice const&, leveldb::Slice const&) const
  2.91%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::Next()
  2.58%  ceph-osd  [kernel.kallsyms]     [k] __ticket_spin_lock
  2.45%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::_M_mutate(unsigned long, unsigned long, unsigned long)
  2.02%  ceph-osd  ceph-osd              [.] ceph_crc32c_le_intel
  1.80%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator new(unsigned long)
  1.38%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::append(char const*, unsigned long)
  1.15%  ceph-osd  libsnappy.so.1.1.2    [.] snappy::RawUncompress(snappy::Source*, char*)
  1.04%  ceph-osd  libtcmalloc.so.0.1.0  [.] operator delete(void*)
  1.03%  ceph-osd  [kernel.kallsyms]     [k] copy_user_generic_string
  0.77%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long)
  0.72%  ceph-osd  libstdc++.so.6.0.16   [.] 0x9ebc8
  0.68%  ceph-osd  libstdc++.so.6.0.16   [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
  0.67%  ceph-osd  [kernel.kallsyms]     [k] find_busiest_group
  0.61%  ceph-osd  [kernel.kallsyms]     [k] tg_load_down
  0.57%  ceph-osd  libc-2.15.so          [.] vfprintf
  0.54%  ceph-osd  libc-2.15.so          [.] _IO_vfscanf
  0.53%  ceph-osd  libtcmalloc.so.0.1.0  [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)
  0.51%  ceph-osd  [kernel.kallsyms]     [k] wait_sb_inodes
  0.47%  ceph-osd  libpthread-2.15.so    [.] pthread_mutex_unlock
  0.47%  ceph-osd  libstdc++.so.6.0.16   [.] std::string::assign(char const*, unsigned long)
  0.47%  ceph-osd  ceph-osd              [.] leveldb::Block::Iter::Valid() const

On Tue, Aug 27, 2013 at 2:33 PM, Samuel Just sam.j...@inktank.com wrote:
> I just pushed a patch to wip-dumpling-log-assert (based on current dumpling head). I had disabled most of the code in PGLog::check() but left an (I thought) innocuous assert. [...]
>
> On Mon, Aug 26, 2013 at 10:46 PM, Matthew Anderson manderson8...@gmail.com wrote:
>> Hi Guys,
>> I'm having the same problem as Oliver with 0.67.2. CPU usage is around double that of the 0.61.8 OSDs in the same cluster, which appears to be causing the performance decrease. [...]
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Hey Samuel,

The PGLog::check() is now no longer visible in profiling, so it helped for that. Unfortunately, it doesn't seem to have helped to bring down the OSD's CPU-loading much. Leveldb still uses much more than in Cuttlefish. On my test-cluster, I didn't notice any difference in the RBD bench-results, either, so I have to assume that it didn't help performance much. Here's the `perf top' I took just now on my production-cluster with your new version, under regular load. Also note the memcmp and memcpy, which also don't show up when running a Cuttlefish-OSD:

 15.65%  [kernel]              [k] intel_idle
  7.20%  libleveldb.so.1.9     [.] 0x3ceae
  6.28%  libc-2.11.3.so        [.] memcmp
  5.22%  [kernel]              [k] find_busiest_group
  3.92%  kvm                   [.] 0x2cf006
  2.40%  libleveldb.so.1.9     [.] leveldb::InternalKeyComparator::Compar
  1.95%  [kernel]              [k] _raw_spin_lock
  1.69%  [kernel]              [k] default_send_IPI_mask_sequence_phys
  1.46%  libc-2.11.3.so        [.] memcpy
  1.17%  libleveldb.so.1.9     [.] leveldb::Block::Iter::Next()
  1.16%  [kernel]              [k] hrtimer_interrupt
  1.07%  [kernel]              [k] native_write_cr0
  1.01%  [kernel]              [k] __hrtimer_start_range_ns
  1.00%  [kernel]              [k] clockevents_program_event
  0.93%  [kernel]              [k] find_next_bit
  0.93%  libstdc++.so.6.0.13   [.] std::string::_M_mutate(unsigned long,
  0.89%  [kernel]              [k] cpumask_next_and
  0.87%  [kernel]              [k] __schedule
  0.85%  [kernel]              [k] _raw_spin_unlock_irqrestore
  0.85%  [kernel]              [k] do_select
  0.84%  [kernel]              [k] apic_timer_interrupt
  0.80%  [kernel]              [k] fget_light
  0.79%  [kernel]              [k] native_write_msr_safe
  0.76%  [kernel]              [k] _raw_spin_lock_irqsave
  0.66%  libc-2.11.3.so        [.] 0xdc6d8
  0.61%  libpthread-2.11.3.so  [.] pthread_mutex_lock
  0.61%  [kernel]              [k] tg_load_down
  0.59%  [kernel]              [k] reschedule_interrupt
  0.59%  libsnappy.so.1.1.2    [.] snappy::RawUncompress(snappy::Source*,
  0.56%  libstdc++.so.6.0.13   [.] std::string::append(char const*, unsig
  0.54%  [kvm_intel]           [k] vmx_vcpu_run
  0.53%  [kernel]              [k] copy_user_generic_string
  0.53%  [kernel]              [k] load_balance
  0.50%  [kernel]              [k] rcu_needs_cpu
  0.45%  [kernel]              [k] fput

Regards,
Oliver

On ma, 2013-08-26 at 23:33 -0700, Samuel Just wrote:
> I just pushed a patch to wip-dumpling-log-assert (based on current dumpling head). I had disabled most of the code in PGLog::check() but left an (I thought) innocuous assert. [...]
>
> On Mon, Aug 26, 2013 at 10:46 PM, Matthew Anderson manderson8...@gmail.com wrote:
>> Hi Guys,
>> I'm having the same problem as Oliver with 0.67.2. [...]
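(For anyone wanting to reproduce the profiles in this thread: a capture along these lines should work, assuming the perf/linux-tools package matching your kernel is installed; <osd-pid> is a placeholder for the ceph-osd process id.)

    # perf top -p <osd-pid>
    # perf record -g -p <osd-pid> -- sleep 60 ; perf report

(perf top gives the live view shown above; perf record/report captures a fixed window with call graphs, which is handier for comparing two OSD versions side by side.)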
Re: [ceph-users] lvm for a quick ceph lab cluster test
On 26.08.2013 23:07, Samuel Just wrote:
> Seems reasonable to me. I'm not sure I've heard anything about using LVM under ceph. Let us know how it goes!

We are currently using it on a test cluster distributed on our desktops. Loïc Dachary visited us and wrote a small article: http://dachary.org/?p=2269

One thing with LVM volumes is that you have to manually create the filesystem (mkfs.xfs) and mount it somewhere, and then point ceph-deploy to that directory. It then creates a symlink under /var/lib/ceph/osd.

Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin
http://www.heinlein-support.de
Tel: 030 / 405051-43  Fax: 030 / 405051-19
Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin
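(The manual sequence Robert describes looks roughly like the following — the volume group, LV name, mountpoint and hostname are all illustrative:)

    # lvcreate -L 100G -n osd0 vg0
    # mkfs.xfs /dev/vg0/osd0
    # mkdir -p /srv/ceph/osd0
    # mount /dev/vg0/osd0 /srv/ceph/osd0
    # ceph-deploy osd prepare myhost:/srv/ceph/osd0
    # ceph-deploy osd activate myhost:/srv/ceph/osd0

(The host:directory form of ceph-deploy osd skips the disk-partitioning path entirely, which is why the filesystem has to exist and be mounted first.)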
[ceph-users] Ceph + Xen - RBD io hang
Hi,

I am doing some experimentation with Ceph and Xen (on the same host) and I'm experiencing some problems with the rbd device that I'm using as the block device. My environment is:

2 node Ceph 0.67.2 cluster, 4x OSD (btrfs) and 1x mon
Xen 4.3.0
Kernel 3.10.9

The domU I'm trying to build is from the Ubuntu 13.04 desktop release. When I pass through the rbd (format 1 or 2) device as phy:/dev/rbd/rbd/ubuntu-test then the domU has no problems reading data from it; the test I ran was:

for i in $(seq 0 1023) ; do
    dd if=/dev/xvda of=/dev/null bs=4k count=1024 skip=$(($i * 4))
done

However, writing data causes the domU to hang while i is still in single figures, though it doesn't seem consistent about the exact value:

for i in $(seq 0 1023) ; do
    dd if=/dev/zero of=/dev/xvda bs=4k count=1024 seek=$(($i * 4))
done

Eventually the kernel in the domU will print a hung task warning. I have tried the domU as pv and hvm (with xen_platform_pci = 1 and 0) but have the same behaviour in both cases. Once this state is triggered on the rbd device, any interaction with it in dom0 will result in the same hang.

I'm assuming that there is some unfavourable interaction between ceph/rbd and blkback, but I haven't found anything in the dom0 logs, so I would like to know if anyone has some suggestions about where to start trying to hunt this down.

Thanks,
James
Re: [ceph-users] Ceph + Xen - RBD io hang
Hi,

I use Ceph 0.61.8 and Xen 4.2.2 (Debian) in production, and can't use kernel 3.10.* on dom0, which hangs very soon. But it's visible in the kernel logs of the dom0, not the domU. Anyway, you should probably re-try with kernel 3.9.11 for the dom0 (I also use 3.10.9 in domU).

Olivier

On Tue, 2013-08-27 at 11:46 +0100, James Dingwall wrote:
> Hi,
> I am doing some experimentation with Ceph and Xen (on the same host) and I'm experiencing some problems with the rbd device that I'm using as the block device. [...]
Re: [ceph-users] Problems with keyrings during deployment
Hi again,

I continue trying to debug the problem reported before. Now I have been trying to use a couple of VMs for doing this (one with Ubuntu 12.04 64-bit, and the other with Ubuntu 12.10 64-bit, and I use the ceph.com repos for installing the Ceph libraries). And, unfortunately, I am running into the same problem: the keyrings do not appear where they should (i.e. bootstrap-mds and bootstrap-osd in /var/lib/ceph). I have followed the preflight check list (http://ceph.com/docs/next/start/quick-start-preflight/), and the ceph user in the admin box can log in perfectly well on the server box, so I am not sure what's going on here. I have even tried to use a single ceph server for installing everything (adding the 'osd crush chooseleaf type = 0' line to the ceph conf file), but then again the keyrings do not appear.

Is nobody else having the same problems as me (using the latest Ceph Dumpling 0.67.2 release here)? Thanks for any insight!

Francesc

On Mon, Aug 26, 2013 at 1:55 PM, Francesc Alted franc...@continuum.io wrote:
> Hi,
>
> I am a newcomer to Ceph. After having a look at the docs (BTW, it is nice to see its concepts being implemented), I am trying to do some tests, mainly to check the Python APIs to access the RADOS and RBD components. I am following this quick guide:
> http://ceph.com/docs/next/start/quick-ceph-deploy/
>
> But after adding a monitor (ceph-deploy mon create ceph-server), I see that the subdirectories bootstrap-mds and bootstrap-osd (in /var/lib/ceph) do not contain keyrings. I have tried to create the monitor again (as suggested in the docs), but the keyrings continue to not appear there:
>
> $ ceph-deploy gatherkeys ceph-server
> [ceph_deploy.gatherkeys][DEBUG ] Checking ceph-server for /etc/ceph/ceph.client.admin.keyring
> [ceph_deploy.gatherkeys][WARNIN] Unable to find /etc/ceph/ceph.client.admin.keyring on ['ceph-server']
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
> [ceph_deploy.gatherkeys][DEBUG ] Checking ceph-server for /var/lib/ceph/bootstrap-osd/ceph.keyring
> [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-osd/ceph.keyring on ['ceph-server']
> [ceph_deploy.gatherkeys][DEBUG ] Checking ceph-server for /var/lib/ceph/bootstrap-mds/ceph.keyring
> [ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-mds/ceph.keyring on ['ceph-server']
>
> My admin node (the machine from where I issue the ceph commands) is an openSUSE 12.3 where I compiled the ceph-0.67.1 tarball. The server node is a Debian Precise 64-bit (using vagrant w/ VirtualBox), and the Ceph installation seems to have gone well, as per the logs:
>
> [ceph-server][INFO ] Running command: ceph --version
> [ceph-server][INFO ] ceph version 0.67.2 (eb4380dd036a0b644c6283869911d615ed729ac8)
>
> Any hints on what is going on there? Thanks!
>
> --
> Francesc Alted
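(One thing worth checking in this situation: the bootstrap keyrings are only generated once the monitor has formed a quorum, so if gatherkeys can't find them, the mon probably never got that far. The mon's admin socket can confirm this — socket name follows the mon id, "ceph-server" here being illustrative:)

    $ sudo ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-server.asok mon_status

(If the reported "state" is still "probing" or the quorum list is empty, the monitor never reached quorum and no bootstrap keys will have been written.)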
Re: [ceph-users] Problems with keyrings during deployment
Hey Francesc,

I encountered these while playing with ceph-deploy a couple of days earlier. Haven't done any troubleshooting on it yet. I encountered the error with the gatherkeys-option, just like you did.

Regards,
Oliver

On di, 2013-08-27 at 13:18 +0200, Francesc Alted wrote:
> Hi again,
> I continue trying to debug the problem reported before. Now I have been trying to use a couple of VMs for doing this (one with Ubuntu 12.04 64-bit, and the other with Ubuntu 12.10 64-bit, and I use the ceph.com repos for installing the Ceph libraries). And, unfortunately, I am running into the same problem: the keyrings do not appear where they should (i.e. bootstrap-mds and bootstrap-osd in /var/lib/ceph). [...]
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Hi Oliver/Matthew,

Ignoring CPU usage, has speed remained slower as well?

Mark

On 08/27/2013 03:08 AM, Oliver Daudey wrote:
> Hey Samuel,
>
> The PGLog::check() is now no longer visible in profiling, so it helped for that. Unfortunately, it doesn't seem to have helped to bring down the OSD's CPU-loading much. Leveldb still uses much more than in Cuttlefish. On my test-cluster, I didn't notice any difference in the RBD bench-results, either, so I have to assume that it didn't help performance much. Here's the `perf top' I took just now on my production-cluster with your new version, under regular load. Also note the memcmp and memcpy, which also don't show up when running a Cuttlefish-OSD:
> [...]
>
> Regards,
> Oliver
>
> On ma, 2013-08-26 at 23:33 -0700, Samuel Just wrote:
>> I just pushed a patch to wip-dumpling-log-assert (based on current dumpling head). I had disabled most of the code in PGLog::check() but left an (I thought) innocuous assert. [...]
>>
>> On Mon, Aug 26, 2013 at 10:46 PM, Matthew Anderson manderson8...@gmail.com wrote:
>>> Hi Guys,
>>> I'm having the same problem as Oliver with 0.67.2. [...]
[ceph-users] Ceph-OSD on compute nodes?
How does the community feel about running OSDs on the same node as OpenStack compute? What if it's only 3 SATA disks? Isn't ceph-osd a bit too CPU- and RAM-hungry for doing such a thing, and wouldn't it leave little left over for VM instances? Just curious, as I just saw someone in a forum say they were going to do that, and I always thought it was not recommended by the Ceph developers.

- Mark
Re: [ceph-users] 1 particular ceph-mon never joins on 0.67.2
On 26/08/13 19:31, Sage Weil wrote:
>> I'm wondering what kind of delay, or additional start-on logic I can add to the upstart script to work around this.
>
> Hmm, this is beyond my upstart-fu, unfortunately. This has come up before, actually. Previously we would wait for any interface to come up and then start, but that broke with multi-nic machines, and I ended up just making things start in runlevel [2345].
>
> James, do you know what should be done to make the job wait for *all* network interfaces to be up? Is that even the right solution here?

This is actually really tricky; runlevel [2345] should cover most use cases, as it ensures that the interfaces configured in /etc/network/interfaces have been brought up. But it sounds like that might not be the case; Travis - in your example, are all network interfaces configured using /etc/network/interfaces?

--
James Page
Ubuntu and Debian Developer
james.p...@ubuntu.com
jamesp...@debian.org
[ceph-users] ceph-deploy pushy dependency problem
Hey ceph devs,

I think there is a problem with your dependencies in the current version of ceph-deploy for FC19 (and probably other Red Hat variants). The package ceph-deploy explicitly requires python-pushy >= 0.5.3, but the pushy package is simply named pushy (and is the correct version). The spec file looks fine in the ceph-deploy git repo; maybe you just need to rerun the package/repo generation?

Thanks!

--
Kevin Weiler
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/
Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail: kevin.wei...@imc-chicago.com
Re: [ceph-users] ceph-deploy pushy dependency problem
I'm having the same issue on Debian 7.1. When reinstalling ceph-deploy after a purge I get the following:

root@vl0181:~# aptitude install ceph-deploy
The following NEW packages will be installed:
  ceph-deploy{b}
0 packages upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 36.5 kB of archives. After unpacking 360 kB will be used.
The following packages have unmet dependencies:
  ceph-deploy : Depends: python-pushy (>= 0.5.3) but 0.5.1-1 is installed.
The following actions will resolve these dependencies:
  Keep the following packages at their current version:
  1) ceph-deploy [Not Installed]

Aptitude updated and upgraded.

Regards

On 26.08.2013, at 23:57, Kevin Weiler kevin.wei...@imc-chicago.com wrote:
> Hey ceph devs,
> I think there is a problem with your dependencies in the current version of ceph-deploy for FC19 (and probably other Red Hat variants). The package ceph-deploy explicitly requires python-pushy >= 0.5.3, but the pushy package is simply named pushy (and is the correct version). [...]
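(Until the repo dependency is sorted out, one workaround may be to install a new-enough python-pushy by hand before installing ceph-deploy — the package path below is purely illustrative, so check the ceph.com repository for the actual file name and version:)

    # wget http://ceph.com/debian-dumpling/pool/main/p/python-pushy/python-pushy_0.5.3-1_all.deb
    # dpkg -i python-pushy_0.5.3-1_all.deb
    # aptitude install ceph-deploy

(Once the locally installed python-pushy satisfies the >= 0.5.3 constraint, apt/aptitude should no longer hold ceph-deploy back.)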
Re: [ceph-users] Storage, File Systems and Data Scrubbing
in most cases. It seems that most people choose this system because of its journaling feature, and XFS for its additional attribute storage, which has a 64kb limit that should be sufficient for most operations. But when you look at file system benchmarks, btrfs is really, really slow. Then comes XFS, then EXT4, but EXT2 really dwarfs all other throughput results. On journaling systems (like XFS, EXT4 and btrfs), disabling journaling actually helps throughput as well, sometimes more than 2 times for write actions.

The preferred configuration for OSDs is one OSD per disk. Each object is striped among all Object Storage Daemons in a cluster. So if I would take one disk from the cluster and check its data, chances are slim that I will find a complete object there (a non-striped, full object I mean). When a client issues an object write (I assume a full object/file write in this case), it is the client's responsibility to stripe it among the object storage daemons. When a stripe is successfully stored by the daemon, an ACK signal is sent to the client and all participating OSDs. When all participating OSDs for the object have completed, the client assumes all is well and returns control to the application.

If I'm not mistaken, journaling is meant for the rare occasions when a hardware failure occurs and the data is corrupted. Ceph does this too, in another way of course. But ceph should be able to notice when a block/stripe is correct or not. In the rare occasion that a node fails while doing a write, an ACK signal is not sent to the caller, and therefore the client can resend the block/stripe to another OSD. Therefore I fail to see the purpose of this extra journaling feature. Also, ceph schedules a data scrubbing process every day (or however it is configured) that should be able to detect bad sectors or other errors on the file system and accordingly repair them on the same daemon or flag the whole block as bad. Since everything is replicated, the block is still in the storage cluster, so no harm is done.

In a normal/single file system I truly see the value of journaling and the potential for btrfs (although it's still very slow). However, in a system like ceph, journaling seems to me more like a paranoid super fail-safe. Did anyone experiment with file systems that disabled journaling, and how did it perform?

Regards,
Johannes
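(For anyone wanting to measure this themselves: ext4 can be created without a journal, which makes the comparison straightforward. The device name is illustrative, and note that a journal-less filesystem gives up crash-consistency of its own metadata, so this is an experiment rather than a production recommendation:)

    # mkfs.ext4 -O ^has_journal /dev/sdb1
    # tune2fs -l /dev/sdb1 | grep features    # has_journal should be absent from the list

(An existing filesystem can also be converted with tune2fs -O ^has_journal while unmounted.)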
Re: [ceph-users] Some help needed with ceph deployment
Hi,

It seems that all my pgs are stuck somewhat. I'm not sure what to do from here. I waited a day in the hope that ceph would find a way to deal with this... but nothing happened. I'm testing on a single Ubuntu Server 13.04 host with dumpling 0.67.2. Below is my ceph status.

root@cephnode2:/root# ceph -s
  cluster 9087eb7a-abe1-4d38-99dc-cb6b266f0f84
   health HEALTH_WARN 37 pgs degraded; 192 pgs stuck unclean
   monmap e1: 1 mons at {cephnode2=172.16.1.2:6789/0}, election epoch 1, quorum 0 cephnode2
   osdmap e38: 6 osds: 6 up, 6 in
    pgmap v65: 192 pgs: 155 active+remapped, 37 active+degraded; 0 bytes data, 213 MB used, 11172 GB / 11172 GB avail
   mdsmap e1: 0/0/1 up

root@cephnode2:/root# ceph osd tree
# id    weight  type name              up/down  reweight
-1      10.92   root default
-2      10.92           host cephnode2
0       1.82                   osd.0   up       1
1       1.82                   osd.1   up       1
2       1.82                   osd.2   up       1
3       1.82                   osd.3   up       1
4       1.82                   osd.4   up       1
5       1.82                   osd.5   up       1

root@cephnode2:/root# ceph health detail
HEALTH_WARN 37 pgs degraded; 192 pgs stuck unclean
pg 0.3f is stuck unclean since forever, current state active+remapped, last acting [2,0]
pg 1.3e is stuck unclean since forever, current state active+remapped, last acting [2,0]
pg 2.3d is stuck unclean since forever, current state active+remapped, last acting [2,0]
pg 0.3e is stuck unclean since forever, current state active+remapped, last acting [4,0]
pg 1.3f is stuck unclean since forever, current state active+remapped, last acting [1,0]
pg 2.3c is stuck unclean since forever, current state active+remapped, last acting [4,0]
pg 0.3d is stuck unclean since forever, current state active+degraded, last acting [0]
pg 1.3c is stuck unclean since forever, current state active+degraded, last acting [0]
pg 2.3f is stuck unclean since forever, current state active+remapped, last acting [4,1]
pg 0.3c is stuck unclean since forever, current state active+remapped, last acting [3,1]
pg 1.3d is stuck unclean since forever, current state active+remapped, last acting [4,0]
pg 2.3e is stuck unclean since forever, current state active+remapped, last acting [1,0]
pg 0.3b is stuck unclean since forever, current state active+degraded, last acting [0]
pg 1.3a is stuck unclean since forever, current state active+degraded, last acting [0]
pg 2.39 is stuck unclean since forever, current state active+degraded, last acting [0]
pg 0.3a is stuck unclean since forever, current state active+remapped, last acting [1,0]
pg 1.3b is stuck unclean since forever, current state active+remapped, last acting [3,1]
pg 2.38 is stuck unclean since forever, current state active+remapped, last acting [1,0]
pg 0.39 is stuck unclean since forever, current state active+degraded, last acting [0]
pg 1.38 is stuck unclean since forever, current state active+degraded, last acting [0]
pg 2.3b is stuck unclean since forever, current state active+degraded, last acting [0]
pg 0.38 is stuck unclean since forever, current state active+remapped, last acting [1,0]
pg 1.39 is stuck unclean since forever, current state active+remapped, last acting [1,0]
pg 2.3a is stuck unclean since forever, current state active+remapped, last acting [3,1]
pg 0.37 is stuck unclean since forever, current state active+remapped, last acting [3,2]
[...] and many more.
I found one entry on the mailing list from someone who had a similar issue, and he fixed it with the following commands:

# ceph osd getcrushmap -o /tmp/crush
# crushtool -i /tmp/crush --enable-unsafe-tunables --set-choose-local-tries 0 --set-choose-local-fallback-tries 0 --set-choose-total-tries 50 -o /tmp/crush.new
root@ceph-admin:/etc/ceph# ceph osd setcrushmap -i /tmp/crush.new

but I'm not sure what he is trying to do here. Especially --enable-unsafe-tunables seems a little ... unsafe. I also read this link: http://eu.ceph.com/docs/wip-3060/ops/manage/failures/osd/#failures-osd-unfound. But it doesn't detail any actions one can take in order to get back to a HEALTH_OK status.

Regards,
Johannes
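(On a single-host cluster, active+remapped/active+degraded with all OSDs up is usually not a tunables problem at all: the default CRUSH rule places replicas with "chooseleaf ... type host", and with only one host it can never find a second host for the other copy. A more targeted fix than the unsafe-tunables commands above is to retarget the rule at OSDs instead of hosts — file paths are illustrative:)

    # ceph osd getcrushmap -o /tmp/crush
    # crushtool -d /tmp/crush -o /tmp/crush.txt
    (edit /tmp/crush.txt: in the default rule, change
     "step chooseleaf firstn 0 type host" to
     "step chooseleaf firstn 0 type osd")
    # crushtool -c /tmp/crush.txt -o /tmp/crush.new
    # ceph osd setcrushmap -i /tmp/crush.new

(The "osd crush chooseleaf type = 0" setting mentioned earlier in this digest has the same effect, but only if it is in ceph.conf before the initial crushmap is created; on an already-built cluster the map has to be edited as above.)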
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
On Aug 27, 2013, at 2:08, Oliver Daudey oli...@xs4all.nl wrote:
> Hey Samuel,
>
> The PGLog::check() is now no longer visible in profiling, so it helped for that. Unfortunately, it doesn't seem to have helped to bring down the OSD's CPU-loading much. Leveldb still uses much more than in Cuttlefish. On my test-cluster, I didn't notice any difference in the RBD bench-results, either, so I have to assume that it didn't help performance much. Here's the `perf top' I took just now on my production-cluster with your new version, under regular load. Also note the memcmp and memcpy, which also don't show up when running a Cuttlefish-OSD:

memcpy is in fact also present in your Cuttlefish OSD, just a bit further down the list (increased from .7% to 1.4%). memcmp definitely looks suspicious and is something we're looking into.

>  15.65%  [kernel]           [k] intel_idle
>   7.20%  libleveldb.so.1.9  [.] 0x3ceae
>   6.28%  libc-2.11.3.so     [.] memcmp
> [...]
>
> Regards,
> Oliver
>
> On ma, 2013-08-26 at 23:33 -0700, Samuel Just wrote:
>> I just pushed a patch to wip-dumpling-log-assert (based on current dumpling head). I had disabled most of the code in PGLog::check() but left an (I thought) innocuous assert. [...]
>>
>> On Mon, Aug 26, 2013 at 10:46 PM, Matthew Anderson manderson8...@gmail.com wrote:
>>> Hi Guys,
>>> I'm having the same problem as Oliver with 0.67.2. [...]
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Hey Ian, Samuel,

FYI: I still had some attempted optimization-options in place on the production-cluster, which might have skewed my results a bit. OSD version 0.67.2-16-geeb1f86 seems to be a lot less hard on the CPU in the configuration that I did all other tests in. I haven't yet verified sufficiently if this is accompanied by a speed-increase as well. On the test-cluster, I didn't see any difference in speed, but that may not mean much, as the load-pattern on production is totally different. Sorry for that mixup.

Updated `perf top'-output, extra options removed, under current load, which should be higher than in my previous mail:

 18.08%  [kernel]              [k] intel_idle
  5.87%  [kernel]              [k] find_busiest_group
  4.92%  kvm                   [.] 0xcefe2
  3.24%  [kernel]              [k] native_write_msr_safe
  2.92%  [kernel]              [k] default_send_IPI_mask_sequence_phys
  2.66%  [kernel]              [k] _raw_spin_lock
  1.50%  [kernel]              [k] native_write_cr0
  1.36%  libleveldb.so.1.9     [.] 0x3cebc
  1.27%  [kernel]              [k] __hrtimer_start_range_ns
  1.17%  [kernel]              [k] hrtimer_interrupt
  1.10%  libc-2.11.3.so        [.] memcmp
  1.07%  [kernel]              [k] apic_timer_interrupt
  1.00%  [kernel]              [k] find_next_bit
  0.99%  [kernel]              [k] cpumask_next_and
  0.99%  [kernel]              [k] __schedule
  0.97%  [kernel]              [k] clockevents_program_event
  0.97%  [kernel]              [k] _raw_spin_unlock_irqrestore
  0.90%  [kernel]              [k] fget_light
  0.85%  [kernel]              [k] do_select
  0.84%  [kernel]              [k] reschedule_interrupt
  0.83%  [kvm_intel]           [k] vmx_vcpu_run
  0.79%  [kernel]              [k] _raw_spin_lock_irqsave
  0.78%  [kernel]              [k] try_to_wake_up
  0.70%  libc-2.11.3.so        [.] memcpy
  0.66%  [kernel]              [k] copy_user_generic_string
  0.63%  [kernel]              [k] sync_inodes_sb
  0.61%  [kernel]              [k] load_balance
  0.61%  [kernel]              [k] tg_load_down
  0.56%  [kernel]              [k] irq_entries_start
  0.56%  libc-2.11.3.so        [.] 0x73612
  0.54%  libpthread-2.11.3.so  [.] pthread_mutex_lock
  0.51%  [kernel]              [k] rcu_needs_cpu

Regards,
Oliver

On 27-08-13 16:04, Ian Colle wrote:
> On Aug 27, 2013, at 2:08, Oliver Daudey oli...@xs4all.nl wrote:
>> Hey Samuel,
>> The PGLog::check() is now no longer visible in profiling, so it helped for that. [...]
>
> memcpy is in fact also present in your Cuttlefish OSD, just a bit further down the list (increased from .7% to 1.4%). memcmp definitely looks suspicious and is something we're looking into.
> [...]
Re: [ceph-users] 1 particular ceph-mon never joins on 0.67.2
Hi James,

Yes, all configured using the interfaces file. Only two interfaces, eth0 and eth1:

auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1 inet dhcp

I took a single node and rebooted it several times, and it really was about 50/50 whether or not the OSDs showed up under 'localhost' or n0. I tried a few different things last night with no luck. I modified when ceph-all starts by writing different start on values to /etc/init/ceph-all.override. I was grasping at straws a bit, as I just kept adding (and'ing) events, hoping to find something that works. I tried:

start on (local-filesystems and net-device-up IFACE=eth0)
start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1)
start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1 and started network-services)

Oddly, the last one seemed to work at first. When I added "started network-services" to the list, the OSDs came up correctly each time! But the monitor never started. If I started it directly with "start ceph-mon id=n0", it came up fine, but not during boot. I spent a couple hours trying to debug *that* before I gave up and switched to static hostnames. =/ I had even thrown --verbose onto the kernel command line so I could see all the upstart events happening, but didn't see anything obvious.

So now I'm back to the stock upstart scripts, using static hostnames, and I don't have any issues with OSDs moving in the crushmap, or any new problems with the monitors.

Sage, I do think I still saw a weird issue with my third mon not starting (same as the original email -- even now with static hostnames), but it was late, and I lost access to the cluster right about then and haven't regained it. I'll double-check that when I get access again and hopefully will find that problem has gone away too.

- Travis
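(One more trigger that may be worth trying on Ubuntu: ifupdown emits a single "static-network-up" event once every interface marked "auto" in /etc/network/interfaces has come up, which avoids enumerating interfaces by hand. An untested sketch of the override:)

    # /etc/init/ceph-all.override
    start on (local-filesystems and static-network-up)

(This only covers interfaces managed by /etc/network/interfaces, which per James's question is the assumption here; DHCP leases that arrive late would still be a race.)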
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
What options were you using?
-Sam

On Tue, Aug 27, 2013 at 7:35 AM, Oliver Daudey oli...@xs4all.nl wrote:
> Hey Ian, Samuel,
>
> FYI: I still had some attempted optimization-options in place on the production-cluster, which might have skewed my results a bit. OSD version 0.67.2-16-geeb1f86 seems to be a lot less hard on the CPU in the configuration that I did all other tests in. I haven't yet verified sufficiently if this is accompanied by a speed-increase as well. On the test-cluster, I didn't see any difference in speed, but that may not mean much, as the load-pattern on production is totally different. Sorry for that mixup.
>
> Updated `perf top'-output, extra options removed, under current load, which should be higher than in my previous mail:
> [...]
>
> Regards,
> Oliver
>
> On 27-08-13 16:04, Ian Colle wrote:
>> memcpy is in fact also present in your Cuttlefish OSD, just a bit further down the list (increased from .7% to 1.4%). memcmp definitely looks suspicious and is something we're looking into.
>> [...]
Re: [ceph-users] ceph-deploy pushy dependency problem
Hi Kevin -

The latest version, ceph-deploy 1.2.2 in the ceph rpm-dumpling repo, should have the correct dependency (see below). If it doesn't work for you, or if you are using a different repo, please let me know.

Thanks,
Gary

ubuntu@jenkins:~/repos/rpm-dumpling/fc19/noarch$ rpm -q --requires -p ceph-deploy-1.2.2-0.noarch.rpm
warning: ceph-deploy-1.2.2-0.noarch.rpm: Header V4 RSA/SHA1 Signature, key ID 17ed316d: NOKEY
/usr/bin/env
gdisk
pushy >= 0.5.3
python(abi) = 2.7
python-argparse
python-distribute
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1

On Aug 26, 2013, at 2:57 PM, Kevin Weiler kevin.wei...@imc-chicago.com wrote:
> Hey ceph devs,
> I think there is a problem with your dependencies in the current version of ceph-deploy for FC19 (and probably other Red Hat variants). The package ceph-deploy explicitly requires python-pushy >= 0.5.3, but the pushy package is simply named pushy (and is the correct version). [...]
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Hey Samuel, These:

osd op threads = 8
osd disk threads = 2
filestore op threads = 8

They increased performance on my test-cluster, but seemed to have the opposite effect on the much more heavily loaded production environment. Regards, Oliver

On 27-08-13 16:37, Samuel Just wrote: What options were you using? -Sam
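For reference, Oliver's tunables live in the [osd] section of ceph.conf; a minimal sketch with his values (note his caveat above that they helped an idle test rig but hurt the busy production cluster):

[osd]
    ; threads servicing queued OSD operations
    osd op threads = 8
    ; threads for background disk-intensive work (scrubbing, etc.)
    osd disk threads = 2
    ; threads applying transactions in the filestore
    filestore op threads = 8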
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Hey Mark, That will take a day or so for me to know with enough certainty. With the low CPU-usage and preliminary results today, I'm confident enough to upgrade all OSDs in production and test the cluster all-Dumpling tomorrow. For now, I only upgraded a single OSD and measured CPU-usage and whatever performance-effects that had on the cluster, so if I would lose that OSD, I could recover. :-) Will get back to you. Regards, Oliver

On 27-08-13 15:04, Mark Nelson wrote: Hi Oliver/Matthew, Ignoring CPU usage, has speed remained slower as well? Mark

On 08/27/2013 03:08 AM, Oliver Daudey wrote: Hey Samuel, PGLog::check() is no longer visible in profiling, so it helped for that. Unfortunately, it doesn't seem to have helped bring down the OSD's CPU load much. Leveldb still uses much more CPU than in Cuttlefish. On my test-cluster, I didn't notice any difference in the RBD bench-results either, so I have to assume that it didn't help performance much. Here's the `perf top' I took just now on my production-cluster with your new version, under regular load. Also note the memcmp and memcpy, which also don't show up when running a Cuttlefish-OSD:

15.65% [kernel]              [k] intel_idle
 7.20% libleveldb.so.1.9     [.] 0x3ceae
 6.28% libc-2.11.3.so        [.] memcmp
 5.22% [kernel]              [k] find_busiest_group
 3.92% kvm                   [.] 0x2cf006
 2.40% libleveldb.so.1.9     [.] leveldb::InternalKeyComparator::Compar
 1.95% [kernel]              [k] _raw_spin_lock
 1.69% [kernel]              [k] default_send_IPI_mask_sequence_phys
 1.46% libc-2.11.3.so        [.] memcpy
 1.17% libleveldb.so.1.9     [.] leveldb::Block::Iter::Next()
 1.16% [kernel]              [k] hrtimer_interrupt
 1.07% [kernel]              [k] native_write_cr0
 1.01% [kernel]              [k] __hrtimer_start_range_ns
 1.00% [kernel]              [k] clockevents_program_event
 0.93% [kernel]              [k] find_next_bit
 0.93% libstdc++.so.6.0.13   [.] std::string::_M_mutate(unsigned long,
 0.89% [kernel]              [k] cpumask_next_and
 0.87% [kernel]              [k] __schedule
 0.85% [kernel]              [k] _raw_spin_unlock_irqrestore
 0.85% [kernel]              [k] do_select
 0.84% [kernel]              [k] apic_timer_interrupt
 0.80% [kernel]              [k] fget_light
 0.79% [kernel]              [k] native_write_msr_safe
 0.76% [kernel]              [k] _raw_spin_lock_irqsave
 0.66% libc-2.11.3.so        [.] 0xdc6d8
 0.61% libpthread-2.11.3.so  [.] pthread_mutex_lock
 0.61% [kernel]              [k] tg_load_down
 0.59% [kernel]              [k] reschedule_interrupt
 0.59% libsnappy.so.1.1.2    [.] snappy::RawUncompress(snappy::Source*,
 0.56% libstdc++.so.6.0.13   [.] std::string::append(char const*, unsig
 0.54% [kvm_intel]           [k] vmx_vcpu_run
 0.53% [kernel]              [k] copy_user_generic_string
 0.53% [kernel]              [k] load_balance
 0.50% [kernel]              [k] rcu_needs_cpu
 0.45% [kernel]              [k] fput

Regards, Oliver
Re: [ceph-users] ceph-deploy pushy dependency problem
Hi Mike - We released a 1.2.2 version recently that should have the correct dependency. You may have to do an aptitude update to refresh the cache. I did a test install on our debian wheezy system and it correctly updated the python pushy package. Let me know if you continue to have problems. Thanks, Gary

root@gitbuilder-cdep-deb-wheezy-amd64-basic:~# aptitude install ceph-deploy
The following NEW packages will be installed: ceph-deploy
The following packages will be upgraded: python-pushy
1 packages upgraded, 1 newly installed, 0 to remove and 6 not upgraded.
Need to get 0 B/69.1 kB of archives. After unpacking 417 kB will be used.
Do you want to continue? [Y/n/?] y
debconf: delaying package configuration, since apt-utils is not installed
(Reading database ... 42202 files and directories currently installed.)
Preparing to replace python-pushy 0.5.1-1 (using .../python-pushy_0.5.3-1~bpo70+1.ceph_amd64.deb) ...
Unpacking replacement python-pushy ...
Selecting previously unselected package ceph-deploy.
Unpacking ceph-deploy (from .../ceph-deploy_1.2.2~bpo70+1_all.deb) ...
Setting up python-pushy (0.5.3-1~bpo70+1.ceph) ...
Setting up ceph-deploy (1.2.2~bpo70+1) ...
Current status: 6 updates [-1].

On Aug 27, 2013, at 6:13 AM, Nico Massenberg nico.massenb...@kontrast.de wrote: I'm having the same issue on debian 7.1. When reinstalling ceph-deploy after a purge I get the following:

root@vl0181:~# aptitude install ceph-deploy
The following NEW packages will be installed: ceph-deploy{b}
0 packages upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 36.5 kB of archives. After unpacking 360 kB will be used.
The following packages have unmet dependencies:
 ceph-deploy : Depends: python-pushy (>= 0.5.3) but 0.5.1-1 is installed.
The following actions will resolve these dependencies:
 Keep the following packages at their current version:
 1) ceph-deploy [Not Installed]

Aptitude updated and upgraded. Regards
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Ok, definitely let us know how it goes! For what it's worth, I'm testing Sam's wip-dumpling-perf branch with the wbthrottle code disabled now, comparing it both to that same branch with it enabled and to 0.67.1. Don't have any perf data, but quite a bit of other data to look through, both in terms of RADOS bench and RBD. Mark
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Oliver, This patch isn't in dumpling head yet; you may want to wait for a dumpling point release. -Sam
Re: [ceph-users] Ceph + Xen - RBD io hang
Hi James, Can you post the contents of the hung task warning so we can see where it is stuck? Thanks! sage

On Tue, 27 Aug 2013, James Dingwall wrote: Hi, I am doing some experimentation with Ceph and Xen (on the same host) and I'm experiencing some problems with the rbd device that I'm using as the block device. My environment is:

2 node Ceph 0.67.2 cluster, 4x OSD (btrfs) and 1x mon
Xen 4.3.0
Kernel 3.10.9

The domU I'm trying to build is from the Ubuntu 13.04 desktop release. When I pass through the rbd (format 1 or 2) device as phy:/dev/rbd/rbd/ubuntu-test, the domU has no problems reading data from it; the test I ran was:

for i in $(seq 0 1023) ; do
    dd if=/dev/xvda of=/dev/null bs=4k count=1024 skip=$(($i * 4))
done

However, writing data causes the domU to hang while i is still in single figures, though it doesn't seem consistent about the exact value:

for i in $(seq 0 1023) ; do
    dd if=/dev/zero of=/dev/xvda bs=4k count=1024 seek=$(($i * 4))
done

Eventually the kernel in the domU will print a hung task warning. I have tried the domU as pv and hvm (with xen_platform_pci = 1 and 0) but see the same behaviour in both cases. Once this state is triggered on the rbd device, any interaction with it in dom0 will result in the same hang. I'm assuming there is some unfavourable interaction between ceph/rbd and blkback, but I haven't found anything in the dom0 logs, so I would like to know if anyone has suggestions about where to start trying to hunt this down. Thanks, James
Re: [ceph-users] Problems with keyrings during deployment
On Tue, 27 Aug 2013, Francesc Alted wrote: Hi again, I continue trying to debug the problem reported before. I have now been trying a couple of VMs for this (one with Ubuntu 12.04 64-bit, the other with Ubuntu 12.10 64-bit, using the ceph.com repos to install the Ceph libraries). And, unfortunately, I am running into the same problem: the keyrings do not appear where they should (i.e. bootstrap-mds and bootstrap-osd in /var/lib/ceph). I have followed the preflight checklist (http://ceph.com/docs/next/start/quick-start-preflight/), and the ceph user on the admin box can log in perfectly well on the server box, so I am not sure what's going on here. I have even tried using a single ceph server for installing everything (adding the 'osd crush chooseleaf type = 0' line to the ceph conf file), but again the keyrings do not appear. Is nobody else having the same problems (using the latest Ceph Dumpling 0.67.2 release here)? Thanks for any insight!

There are several possible pitfalls here; the missing keys are just the most visible symptom of the monitors not forming an initial quorum. Can you post the contents of your ceph.conf and the output from 'ceph daemon mon.`hostname` mon_status' on each of the mon nodes? thanks! sage

Francesc On Mon, Aug 26, 2013 at 1:55 PM, Francesc Alted franc...@continuum.io wrote: Hi, I am a newcomer to Ceph. After having a look at the docs (BTW, it is nice to see its concepts being implemented), I am trying to do some tests, mainly to check the Python APIs to access the RADOS and RBD components. I am following this quick guide: http://ceph.com/docs/next/start/quick-ceph-deploy/ But after adding a monitor (ceph-deploy mon create ceph-server), I see that the subdirectories bootstrap-mds and bootstrap-osd (in /var/lib/ceph) do not contain keyrings. I have tried to create the monitor again (as suggested in the docs), but the keyrings still do not appear there:

$ ceph-deploy gatherkeys ceph-server
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-server for /etc/ceph/ceph.client.admin.keyring
[ceph_deploy.gatherkeys][WARNIN] Unable to find /etc/ceph/ceph.client.admin.keyring on ['ceph-server']
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-server for /var/lib/ceph/bootstrap-osd/ceph.keyring
[ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-osd/ceph.keyring on ['ceph-server']
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-server for /var/lib/ceph/bootstrap-mds/ceph.keyring
[ceph_deploy.gatherkeys][WARNIN] Unable to find /var/lib/ceph/bootstrap-mds/ceph.keyring on ['ceph-server']

My admin node (the machine from which I issue the ceph commands) is an openSUSE 12.3 box where I compiled the ceph-0.67.1 tarball. The server node is a Debian Precise 64-bit (using vagrant w/ VirtualBox), and the Ceph installation seems to have gone well, per the logs:

[ceph-server][INFO ] Running command: ceph --version
[ceph-server][INFO ] ceph version 0.67.2 (eb4380dd036a0b644c6283869911d615ed729ac8)

Any hints on what is going on there? Thanks! -- Francesc Alted
[ceph-users] problem uploading 1GB and up file
Hello, I got a weird upload issue with Ceph - Dumpling (0.67.2) - and I don't know if someone can help me pin down the problem... Basically, if I try to upload a 1 GB file, as soon as my upload is completed, apache returns a 500 error... no problem if I upload a 900 MB file or less; I only get that specific problem with files bigger than 1 GB!! I also have 2 apache servers in place, one with the modified fastcgi module and the other without - both servers show the same issue/behaviour... One thing I noticed: with a file of 1 GB or more, radosgw throws a bunch of these errors (not getting these with a 900 MB or smaller file):

2013-08-27 11:18:09.850700 7f1e490ec700 20 get_obj_state: s->obj_tag was set empty
2013-08-27 11:18:09.850705 7f1e490ec700 20 prepare_atomic_for_write_impl: state is not atomic. state=0x7f1dc40ad938
2013-08-27 11:18:09.859304 7f1e490ec700 20 get_obj_state: rctx=0x7f1dc40028b0 obj=aaa:_shadow__ivTDZIBHCbNVE4p-DDeXDJHni8SArBZ_435 state=0x7f1dc40adbf8 s->prefetch_data=0
(...)
2013-08-27 11:18:09.878697 7f1e490ec700 0 WARNING: set_req_state_err err_no=27 resorting to 500

Not seeing any other errors, no OSD errors, no MDS errors... Can't find anything on google related to this warning msg: set_req_state_err err_no=27. Any clue?? Thanks M-A

= = =

ceph -v
ceph version 0.67.2 (eb4380dd036a0b644c6283869911d615ed729ac8)

ceph status
  cluster eb16413a----f23fddd6a5f6
  health HEALTH_OK
  monmap e1: 2 mons at {coe-w1-stor-db01=10.150.2.101:6789/0,coe-w1-stor-db02=10.150.2.102:6789/0}, election epoch 30, quorum 0,1 coe-w1-stor-db01,coe-w1-stor-db02
  osdmap e518: 101 osds: 101 up, 101 in
  pgmap v8991: 288 pgs: 288 active+clean; 2633 MB data, 11312 MB used, 250 TB / 250 TB avail
  mdsmap e64: 1/1/1 up {0=coe-w1-stor-db02=up:active}, 1 up:standby

apache:
172.16.11.118 - - [27/Aug/2013:11:15:21 -0400] GET /aaa/?delimiter=%2F&max-keys=1000&prefix HTTP/1.1 200 1596 - Cyberduck/4.3.1 (Mac OS X/10.8.4) (i386)
172.16.11.118 - - [27/Aug/2013:11:15:23 -0400] GET / HTTP/1.1 200 1673 - Cyberduck/4.3.1 (Mac OS X/10.8.4) (i386)
172.16.11.118 - - [27/Aug/2013:11:15:23 -0400] GET /aaa/?delimiter=%2F&max-keys=1000&prefix HTTP/1.1 200 573 - Cyberduck/4.3.1 (Mac OS X/10.8.4) (i386)
172.16.11.118 - - [27/Aug/2013:11:15:23 -0400] PUT /aaa/XenServer-6.2-binpkg.iso HTTP/1.1 500 377 - Cyberduck/4.3.1 (Mac OS X/10.8.4) (i386)
172.16.11.118 - - [27/Aug/2013:11:18:09 -0400] PUT /aaa/XenServer-6.2-binpkg.iso HTTP/1.1 200 315 - Cyberduck/4.3.1 (Mac OS X/10.8.4) (i386)
172.16.11.118 - - [27/Aug/2013:11:18:27 -0400] GET /aaa/?delimiter=%2F&max-keys=1000&prefix HTTP/1.1 200 944 - Cyberduck/4.3.1 (Mac OS X/10.8.4) (i386)
172.16.11.118 - - [27/Aug/2013:11:18:27 -0400] HEAD /aaa/XenServer-6.2-binpkg.iso HTTP/1.1 404 248 - Cyberduck/4.3.1 (Mac OS X/10.8.4) (i386)

radosgw.log:
2013-08-27 11:15:23.634295 7f1e490ec700 2 req 4:0.000457:s3:PUT /aaa/XenServer-6.2-binpkg.iso:put_obj:verifying op params
2013-08-27 11:15:23.634297 7f1e490ec700 2 req 4:0.000459:s3:PUT /aaa/XenServer-6.2-binpkg.iso:put_obj:executing
2013-08-27 11:15:31.797706 7f1e42ffd700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-08-27 11:15:53.797852 7f1e42ffd700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-08-27 11:16:15.798007 7f1e42ffd700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-08-27 11:16:37.798146 7f1e42ffd700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-08-27 11:16:59.798282 7f1e42ffd700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-08-27 11:17:21.798415 7f1e42ffd700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-08-27 11:17:43.798551 7f1e42ffd700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-08-27 11:18:05.495773 7f1e490ec700 10 x x-amz-acl:private
2013-08-27 11:18:05.495932 7f1e490ec700 20 get_obj_state: rctx=0x7f1dc40028b0 obj=aaa:XenServer-6.2-binpkg.iso state=0x7f1dc405f3a8 s->prefetch_data=0
2013-08-27 11:18:05.497367 7f1e490ec700 0 setting object write_tag=default.15804.4
2013-08-27 11:18:05.508505 7f1e490ec700 20 get_obj_state: rctx=0x7f1dc40028b0 obj=aaa:_shadow__ivTDZIBHCbNVE4p-DDeXDJHni8SArBZ_1 state=0x7f1dc4037f88 s->prefetch_data=0
2013-08-27 11:18:05.509832 7f1e490ec700 20 get_obj_state: s->obj_tag was set empty
( bunch of those messages )
2013-08-27 11:18:09.840315 7f1e490ec700 20 prepare_atomic_for_write_impl: state is not atomic. state=0x7f1dc40ad4a8
2013-08-27 11:18:09.849362 7f1e490ec700 20 get_obj_state: rctx=0x7f1dc40028b0 obj=aaa:_shadow__ivTDZIBHCbNVE4p-DDeXDJHni8SArBZ_434 state=0x7f1dc40ad938 s->prefetch_data=0
2013-08-27 11:18:09.850700 7f1e490ec700 20 get_obj_state: s->obj_tag was set empty
2013-08-27 11:18:09.850705 7f1e490ec700 20 prepare_atomic_for_write_impl: state is not atomic. state=0x7f1dc40ad938
2013-08-27 11:18:09.859304 7f1e490ec700 20 get_obj_state: rctx=0x7f1dc40028b0 obj=aaa:_shadow__ivTDZIBHCbNVE4p-DDeXDJHni8SArBZ_435 state=0x7f1dc40adbf8
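Not part of the original report, but a useful data point while debugging: err_no=27 is EFBIG ("File too large"), i.e. radosgw is rejecting the single-request PUT at some size threshold. The S3 multipart-upload API sidesteps any per-request limit; a minimal sketch using the boto library (endpoint, credentials, and part size are made up for illustration):

import math
import os
import boto
import boto.s3.connection

conn = boto.connect_s3(
    aws_access_key_id='ACCESS',
    aws_secret_access_key='SECRET',
    host='rgw.example.com',
    is_secure=False,  # assuming plain-HTTP apache in front of radosgw
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)
bucket = conn.get_bucket('aaa')

path = 'XenServer-6.2-binpkg.iso'
total = os.path.getsize(path)
part_size = 100 * 1024 * 1024  # 100 MB per part

mp = bucket.initiate_multipart_upload(os.path.basename(path))
with open(path, 'rb') as fp:
    for i in range(int(math.ceil(total / float(part_size)))):
        # each call reads the next chunk from fp's current offset
        remaining = total - i * part_size
        mp.upload_part_from_file(fp, part_num=i + 1,
                                 size=min(part_size, remaining))
mp.complete_upload()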
Re: [ceph-users] Storage, File Systems and Data Scrubbing
On Tue, 27 Aug 2013, ker can wrote: This was very helpful - thanks. However, I'm still trying to reconcile this with something Sage mentioned a while back on a similar topic. Apparently you can disable the journal if you're using btrfs. Is that possible because btrfs takes care of things like atomic object writes and updates to the osd metadata?

It's because with btrfs we take snapshots that are consistent checkpoints. You *can* disable the journal, but it means that writes only commit when a new checkpoint is made (i.e., a snapshot), which is an infrequent and relatively expensive operation... so in general the write latency is terrible. This is useful only for workloads where you are doing bulk data injection (for example) and write latency is not important. sage

-----Original Message-----
From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sage Weil
Sent: Thursday, July 11, 2013 8:39 PM
To: Mark Nelson
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Turning off ceph journaling with xfs?

Note that you *can* disable the journal if you use btrfs, but your write latency will tend to be pretty terrible. This is only viable for bulk-storage use cases where throughput trumps all and latency is not an issue at all (it may be seconds). We are planning on eliminating the double-write for at least large writes when using btrfs by cloning data out of the journal and into the target file. This is not a hugely complex task (although it is non-trivial), but it hasn't made it to the top of the priority list yet. sage

On Mon, Aug 26, 2013 at 4:05 PM, Samuel Just sam.j...@inktank.com wrote: ceph-osd builds a transactional interface on top of the usual posix operations so that we can do things like atomically perform an object write and update the osd metadata. The current implementation requires our own journal and some metadata ordering (which is provided by the backing filesystem's own journal) to implement our own atomic operations. It's true that in some cases you might be able to get away with having the client replay the operation (which we do anyway for other reasons), but that wouldn't be enough to ensure consistency of the filesystem's own internal structures. It also wouldn't be enough to ensure that the OSD's internal structures remain consistent in the case of a crash. Also, if the client is unavailable to do the replay, you'd have a problem. In summary, it's actually really hard to detect partial/corrupted writes after a crash without journaling of some form. -Sam
[ceph-users] Limitations of Ceph
Hi, I have been running a small Ceph cluster for experimentation for a while, and now my employer has asked me to do a little talk about my findings. One important part is, of course, going to be the practical limitations of Ceph. Here is my list so far:

- Ceph is not supported by VMWare ESX. That may change in the future, but seeing how VMWare is now owned by EMC, they might make it a political decision not to support Ceph. Apparently, you can import an RBD volume on a linux server and then re-export it to a VMWare host as an iSCSI target, but doing so would introduce a bottleneck and a single point of failure, which kind of defeats the purpose of having a Ceph cluster in the first place.

- Ceph is not supported by Windows clients, or even, as far as I can tell, anything that isn't a very recent version of Linux. (User-space-only clients work in some cases.)

- There is no dynamic tiered storage, and there probably never will be, if I understand the architecture correctly. You can have different pools with different performance characteristics (like one on cheap and large 7200 RPM disks, and another on SSDs), but once you have put a given bunch of data on one pool, it is pretty much stuck there. (I.e. you cannot move it to another pool without very tight and very manual coordination with all clients using it.)

- There is no active data deduplication, and, again, if I understand the architecture correctly, there probably never will be. There is, however, sparse allocation and COW-cloning for RBD volumes, which does something similar. Under certain conditions, it is even possible to use the discard option of modern filesystems to automatically keep unused regions of an RBD volume sparse (see the sketch after this message).

- Bad support for multiple customers accessing the same cluster. This is assuming that, if you have multiple customers, it is imperative that any one given customer must be unable to access or even modify the data of any other customer. You can have authorization on the pool layer, but it has been reported that Ceph reacts badly to defining a large number of pools. Multi-customer support in CephFS is non-existent. RadosGW probably supports multi-customer, but I haven't tried it.

- No dynamic partitioning for CephFS. The original paper talked about dynamic partitioning of the CephFS namespace, so that multiple Metadata Servers could share the workload of a large number of CephFS clients. This isn't implemented yet (or implemented but not working properly?), and the only currently supported multi-MDS configuration is 1 active / n standby. This limits the scalability of CephFS. It looks to me like CephFS is not a major focus of the development team at this time.

Can you give me some comments on that? Am I totally wrong on some of those points? Have I forgotten some important limitation? Regards, Guido
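A minimal sketch of the discard workflow mentioned in the fourth point (the device and mountpoint are hypothetical, and this assumes a kernel/driver path that actually passes discards through to RBD):

# mount with online discard so the filesystem trims blocks as they are freed
mount -o discard /dev/rbd/rbd/myimage /mnt/rbd

# or leave discard off and trim in batches, which is usually cheaper
fstrim -v /mnt/rbd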
Re: [ceph-users] Problems with keyrings during deployment
On Tue, Aug 27, 2013 at 12:30 PM, Francesc Alted franc...@continuum.io wrote: On Tue, Aug 27, 2013 at 6:25 PM, Alfredo Deza alfredo.d...@inktank.com wrote: On Tue, Aug 27, 2013 at 12:04 PM, Francesc Alted franc...@continuum.io wrote: Okay, I tracked down my problem. It turned out that I was setting different names for the ceph servers in /etc/hosts than their own `hostname`. These log lines when creating the monitor gave me the clue:

[ceph-server2][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-vagrant.mon.keyring
[ceph-server2][INFO ] create the monitor keyring file
[ceph-server2][INFO ] Running command: ceph-mon --cluster ceph --mkfs -i vagrant --keyring /var/lib/ceph/tmp/ceph-vagrant.mon.keyring
[ceph-server2][INFO ] ceph-mon: mon.noname-a 192.168.33.11:6789/0 is local, renaming to mon.vagrant
[ceph-server2][INFO ] ceph-mon: set fsid to 253c5a74-699b-44ef-a071-5883716fa620

I was calling this 'vagrant' host 'ceph-server2' in my /etc/hosts, and I realized this was fooling ceph. So I changed my /etc/hosts to follow the original hostnames (changed to 'quantal64'), and boom! everything works as intended:

[quantal64][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-quantal64.mon.keyring
[quantal64][INFO ] create the monitor keyring file
[quantal64][INFO ] Running command: ceph-mon --cluster ceph --mkfs -i quantal64 --keyring /var/lib/ceph/tmp/ceph-quantal64.mon.keyring
[quantal64][INFO ] ceph-mon: mon.noname-a 192.168.33.11:6789/0 is local, renaming to mon.quantal64
[quantal64][INFO ] ceph-mon: set fsid to 96c48ec5-7dd5-4f76-81f9-4fdc711a76f0

Now I can gather the keys normally:

$ ceph-deploy gatherkeys quantal64
[ceph_deploy.gatherkeys][DEBUG ] Checking quantal64 for /etc/ceph/ceph.client.admin.keyring
[ceph_deploy.gatherkeys][DEBUG ] Got ceph.client.admin.keyring key from quantal64.
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
[ceph_deploy.gatherkeys][DEBUG ] Checking quantal64 for /var/lib/ceph/bootstrap-osd/ceph.keyring
[ceph_deploy.gatherkeys][DEBUG ] Got ceph.bootstrap-osd.keyring key from quantal64.
[ceph_deploy.gatherkeys][DEBUG ] Checking quantal64 for /var/lib/ceph/bootstrap-mds/ceph.keyring
[ceph_deploy.gatherkeys][DEBUG ] Got ceph.bootstrap-mds.keyring key from quantal64.

Well, thanks anyway. Now it is time to make some more progress and create some OSDs :)

Francesc, thanks for pasting this log info; it is useful to know what worked for you :) I will update the docs for ceph-deploy on things to watch out for, so that there is *something* users can try when this comes up.

No problem. A possible idea for enhancing the self-detection of problems would be to implement a check in ceph-deploy (or in another place) that warns (or just gives an error) when it detects that the hostname differs between a DNS lookup and the `hostname` output.

I went ahead and created http://tracker.ceph.com/issues/6132 to track this. Thanks again. -- Francesc Alted
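Francesc's suggestion is easy to prototype; a minimal sketch of such a check (hypothetical code, not what ceph-deploy actually ships):

import socket

def check_hostname(addressed_as):
    """Warn when a node's own hostname differs from the name used to reach it.

    Run on the target node; 'addressed_as' is the name taken from the admin
    node's /etc/hosts or DNS. ceph-mon registers itself under the local
    hostname, which is what caused the confusion above.
    """
    local = socket.gethostname().split('.')[0]
    if local != addressed_as.split('.')[0]:
        print("WARNING: node calls itself %r but was addressed as %r; "
              "the monitor will be created as mon.%s" % (local, addressed_as, local))
        return False
    return True

check_hostname('ceph-server2')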
Re: [ceph-users] Limitations of Ceph
Hi Guido!

On Tue, 27 Aug 2013, Guido Winkelmann wrote: Hi, I have been running a small Ceph cluster for experimentation for a while, and now my employer has asked me to do a little talk about my findings. One important part is, of course, going to be the practical limitations of Ceph. Here is my list so far: - Ceph is not supported by VMWare ESX. That may change in the future, but seeing how VMWare is now owned by EMC, they might make it a political decision not to support Ceph. Apparently, you can import an RBD volume on a linux server and then re-export it to a VMWare host as an iSCSI target, but doing so would introduce a bottleneck and a single point of failure, which kind of defeats the purpose of having a Ceph cluster in the first place.

It will be a challenge to make ESX natively support RBD, as RBD is open source (ESX is proprietary), ESX is (I think) based on a *BSD kernel, and VMWare just announced a possibly competitive product. Inktank is doing what it can. Meanwhile, we are pursuing a robust iSCSI solution. Sadly this will require a traditional HA failover setup, but that's how the cookie crumbles when you use legacy protocols.

- Ceph is not supported by Windows clients, or even, as far as I can tell, anything that isn't a very recent version of Linux. (User-space-only clients work in some cases.)

There is ongoing work here; nothing to announce yet.

- There is no dynamic tiered storage, and there probably never will be, if I understand the architecture correctly. You can have different pools with different performance characteristics (like one on cheap and large 7200 RPM disks, and another on SSDs), but once you have put a given bunch of data on one pool, it is pretty much stuck there. (I.e. you cannot move it to another pool without very tight and very manual coordination with all clients using it.)

This is a key item on the roadmap for Emperor (nov) and Firefly (feb). We are building two capabilities: 'cache pools' that let you put fast storage in front of your main data pool, and a tiered 'cold' pool that lets you bleed cold objects off to a cheaper, slower tier (probably using erasure coding... which is also coming in firefly).

- There is no active data deduplication, and, again, if I understand the architecture correctly, there probably never will be. There is, however, sparse allocation and COW-cloning for RBD volumes, which does something similar. Under certain conditions, it is even possible to use the discard option of modern filesystems to automatically keep unused regions of an RBD volume sparse.

You can do two things:

- Do dedup inside an osd. Btrfs is growing this capability, and ZFS already has it. This is not ideal because data is randomly distributed across nodes.

- You can build dedup on top of rados, for example by naming objects after a hash of their content. This will never be a 'magic and transparent dedup for all rados apps' because CAS is based on naming objects from content, and rados fundamentally places data based on name and eschews metadata. That means there isn't normally a way to point to the content unless there is some MDS on top of rados. Someday CephFS will get this, but raw librados users and RBD won't get it for free.

- Bad support for multiple customers accessing the same cluster. This is assuming that, if you have multiple customers, it is imperative that any one given customer must be unable to access or even modify the data of any other customer. You can have authorization on the pool layer, but it has been reported that Ceph reacts badly to defining a large number of pools. Multi-customer support in CephFS is non-existent. RadosGW probably supports multi-customer, but I haven't tried it.

The just-released Dumpling included support for rados namespaces, which are designed to address exactly this issue. Namespaces exist inside pools, and the auth capabilities can restrict access to a specific namespace.

- No dynamic partitioning for CephFS. The original paper talked about dynamic partitioning of the CephFS namespace, so that multiple Metadata Servers could share the workload of a large number of CephFS clients. This isn't implemented yet (or implemented but not working properly?), and the only currently supported multi-MDS configuration is 1 active / n standby. This limits the scalability of CephFS. It looks to me like CephFS is not a major focus of the development team at this time.

This has been implemented since ~2006. We do not recommend it for production because it has not had the QA attention it deserves. That said, Zheng Yan has been doing a lot of great work here recently and things have improved considerably. Please try it! You just need to do 'ceph mds set_max_mds 3' (or whatever) to tell ceph how many active ceph-mds daemons you want.

Hope that helps! sage
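Sage's second option (dedup layered on top of rados via content addressing) can be sketched with the librados Python bindings; this illustrates the idea only and is not an existing Ceph feature (pool name and conffile path are assumptions):

import hashlib
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('data')

def cas_put(data):
    # name the object after its content; identical blobs collapse to one object
    name = hashlib.sha256(data).hexdigest()
    try:
        ioctx.stat(name)              # already present: dedup hit, write nothing
    except rados.ObjectNotFound:
        ioctx.write_full(name, data)
    return name                       # the caller must track this name somewhere,
                                      # which is exactly the 'MDS on top' problem

ref = cas_put(b'hello world')
print(ref)
ioctx.close()
cluster.shutdown()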
Re: [ceph-users] Administering a ceph cluster
This is an error in the docs. Upstart jobs apply to each node. I've updated the docs to reflect this understanding. When deployed as a service with the -a option, ceph would start daemons across nodes. With upstart, you need to start and stop by invoking upstart on each node.

On Tue, Aug 27, 2013 at 10:03 AM, Francesc Alted franc...@continuum.io wrote: Hi, So I have already set up a shiny new Ceph cluster (on one single machine, quantal64, administered from another machine, precise64). Now, for operating the cluster, I am a bit unsure how to interpret the docs at http://ceph.com/docs/next/rados/operations/operating/. My interpretation is that I should start the cluster from the *admin* node, right? But once I have done this on precise64 (via `sudo start ceph-all`), I try to see the status of it with the `ceph` command and I am getting this:

$ ceph
2013-08-27 16:50:35.946904 7f43d44c6700 1 -- :/0 messenger.start
2013-08-27 16:50:35.947392 7f43d44c6700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2013-08-27 16:50:35.947410 7f43d44c6700 0 librados: client.admin initialization error (2) No such file or directory
2013-08-27 16:50:35.947444 7f43d44c6700 1 -- :/1020622 mark_down_all
2013-08-27 16:50:35.947604 7f43d44c6700 1 -- :/1020622 shutdown complete.
Error connecting to cluster: ObjectNotFound

Then I tried to start the cluster right on the 'cluster' machine (quantal64), but I am getting the same error on the admin machine. Here are the contents of my 'my-cluster' directory on the admin machine:

vagrant@precise64:~/my-cluster$ ls
ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph.conf ceph.log ceph.mon.keyring

and my ceph.conf contents:

$ cat ceph.conf
[global]
fsid = 64b3090b-a692-4993-98a0-ba3e0bedd7db
mon initial members = quantal64
mon host = 192.168.33.11
auth supported = cephx
osd journal size = 1024
filestore xattr use omap = true

[osd.1]
host = quantal64

Am I doing something wrong? Thanks, -- Francesc Alted

-- John Wilkins Senior Technical Writer Inktank john.wilk...@inktank.com (415) 425-9599 http://inktank.com
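To make John's point concrete: the upstart jobs are invoked on each node that hosts the daemons, not from the admin node. A sketch (the id values are examples for this thread's single-node cluster):

# on quantal64 itself:
sudo start ceph-all              # everything on this node
sudo stop ceph-all
sudo start ceph-mon id=quantal64 # a single monitor
sudo start ceph-osd id=1         # a single OSD
sudo stop ceph-osd id=1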
Re: [ceph-users] Limitations of Ceph
Hi Sage, Thanks for your comments, much appreciated.

On Tuesday, 27 August 2013 at 10:19:46, Sage Weil wrote: Hi Guido! On Tue, 27 Aug 2013, Guido Winkelmann wrote: [...] - There is no dynamic tiered storage, and there probably never will be, if I understand the architecture correctly. You can have different pools with different performance characteristics (like one on cheap and large 7200 RPM disks, and another on SSDs), but once you have put a given bunch of data on one pool, it is pretty much stuck there. (I.e. you cannot move it to another pool without very tight and very manual coordination with all clients using it.)

This is a key item on the roadmap for Emperor (nov) and Firefly (feb). We are building two capabilities: 'cache pools' that let you put fast storage in front of your main data pool, and a tiered 'cold' pool that lets you bleed cold objects off to a cheaper, slower tier

Sounds interesting. Will that work on entire PGs or on single objects? How do you keep track of which object lies in what pool without resorting to a lookup step before every operation? Will that feature retain backwards compatibility with older Ceph clients?

(probably using erasure coding.. which is also coming in firefly).

... which happens to address another issue I forgot to mention.

- There is no active data deduplication, and, again, if I understand the architecture correctly, there probably never will be. There is, however, sparse allocation and COW-cloning for RBD volumes, which does something similar. Under certain conditions, it is even possible to use the discard option of modern filesystems to automatically keep unused regions of an RBD volume sparse.

You can do two things: - Do dedup inside an osd. Btrfs is growing this capability, and ZFS already has it. This is not ideal because data is randomly distributed across nodes. - You can build dedup on top of rados, for example by naming objects after a hash of their content. This will never be a 'magic and transparent dedup for all rados apps' because CAS is based on naming objects from content, and rados fundamentally places data based on name and eschews metadata. That means there isn't normally a way to point to the content unless there is some MDS on top of rados. Someday CephFS will get this, but raw librados users and RBD won't get it for free.

I read that as TL;DR: no real deduplication.

- Bad support for multiple customers accessing the same cluster. This is assuming that, if you have multiple customers, it is imperative that any one given customer must be unable to access or even modify the data of any other customer. You can have authorization on the pool layer, but it has been reported that Ceph reacts badly to defining a large number of pools. Multi-customer support in CephFS is non-existent. RadosGW probably supports multi-customer, but I haven't tried it.

The just-released Dumpling included support for rados namespaces, which are designed to address exactly this issue. Namespaces exist inside pools, and the auth capabilities can restrict access to a specific namespace.

I'm having some trouble finding this in the documentation. Can you give me a pointer here?

- No dynamic partitioning for CephFS. The original paper talked about dynamic partitioning of the CephFS namespace, so that multiple Metadata Servers could share the workload of a large number of CephFS clients. This isn't implemented yet (or implemented but not working properly?), and the only currently supported multi-MDS configuration is 1 active / n standby. This limits the scalability of CephFS. It looks to me like CephFS is not a major focus of the development team at this time.

This has been implemented since ~2006. We do not recommend it for production because it has not had the QA attention it deserves. That said, Zheng Yan has been doing a lot of great work here recently and things have improved considerably. Please try it! You just need to do 'ceph mds set_max_mds 3' (or whatever) to tell ceph how many active ceph-mds daemons you want.

Okay, I think I will try this. Guido
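Regarding the documentation pointer Guido asks for above: the namespace restriction is expressed in the client's OSD capabilities, roughly like the following sketch (client, pool, and namespace names are invented; verify the exact grammar against the Dumpling docs):

ceph auth get-or-create client.cust1 \
    mon 'allow r' \
    osd 'allow rw pool=data namespace=cust1'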
[ceph-users] Fedora 18 Qemu
Does anyone have the best patches for Fedora 18 qemu that fix the aio issues? I have built my own but am having mixed results. It's qemu 1.2.2. Or would it be better to jump to Fedora 19? I am running Fedora 18 in hopes that RHEL 7 will be based on it. Thanks, Joe -- Joe Ryner Center for the Application of Information Technologies (CAIT) Production Coordinator P: (309) 298-1804 F: (309) 298-2806
Re: [ceph-users] problem uploading 1GB and up file
Since you mention a problem starting at 1GB, check to see if you have a LimitRequestBody directive (http://httpd.apache.org/docs/2.2/mod/core.html#limitrequestbody). The LimitRequestBody directive allows the user to set a limit on the allowed size of an HTTP request message body within the context in which the directive is given.

To me, this doesn't look like an Apache problem. If it were LimitRequestBody, Apache should deny the request, and RadosGW would never see it. But it's quick and easy to verify that this isn't the problem. I'd check the other LimitRequest parameters too.

Craig Lewis
Senior Systems Engineer
cle...@centraldesktop.com
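If a limit does turn out to be set somewhere, the fix is a one-line change in the radosgw virtual host; a sketch (the ServerName and surrounding layout are examples, not taken from the thread):

<VirtualHost *:80>
    ServerName rgw.example.com
    # 0 removes Apache's cap on request body size (0 is also the default)
    LimitRequestBody 0
    # ... FastCGI handoff to radosgw omitted ...
</VirtualHost>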
Re: [ceph-users] Limitations of Ceph
On Tue, Aug 27, 2013 at 10:19 AM, Sage Weil s...@inktank.com wrote:

> Hi Guido!
>
> On Tue, 27 Aug 2013, Guido Winkelmann wrote:
>> Hi,
>>
>> I have been running a small Ceph cluster for experimentation for a while, and now my employer has asked me to give a little talk about my findings. One important part is, of course, going to be the practical limitations of Ceph. Here is my list so far:
>>
>> - Ceph is not supported by VMware ESX. That may change in the future, but seeing how VMware is now owned by EMC, they might make it a political decision not to support Ceph. Apparently, you can import an RBD volume on a Linux server and then re-export it to a VMware host as an iSCSI target, but doing so would introduce a bottleneck and a single point of failure, which rather defeats the purpose of having a Ceph cluster in the first place.
>
> It will be a challenge to make ESX natively support RBD, as RBD is open source (ESX is proprietary), ESX is (I think) based on a *BSD kernel, and VMware just announced a possibly competitive product. Inktank is doing what it can.

To add some context to this, my current understanding is that VMware does provide mechanisms to add plugins to ESX, but a formal partnership is needed for those plugins to be signed and certified. As such, the challenge is more commercial than technical. Inktank is in conversations with VMware, but if you are interested in seeing support, please tell your VMware account rep and let us know, so we can demonstrate the customer demand for this. (A sketch of the iSCSI re-export workaround Guido mentions follows at the end of this message.)

VMware partners with multiple storage companies (as evidenced by the number of storage vendors at VMworld this week), so the fact that they have launched vSAN and are owned by EMC is not a commercial barrier. The ESX business unit wants to sell as many licenses as possible, so a good storage ecosystem is critical to them. On the Windows side, as Sage said, watch this space. :-)

Neil
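For reference, a minimal sketch of that re-export approach: map the RBD image on a Linux gateway with the kernel client and expose the resulting block device over iSCSI with tgt. The pool name, image name, IQN, and device path here are illustrative assumptions, not details from this thread:

# On the Linux gateway: map the RBD image to a local block device
# (assumes the rbd kernel module plus a working ceph.conf and keyring).
rbd map rbd/vmware-datastore        # yields e.g. /dev/rbd0

# Export that device as an iSCSI target with tgt:
tgtadm --lld iscsi --op new --mode target --tid 1 \
       --targetname iqn.2013-08.com.example:rbd-datastore
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 \
       --backing-store /dev/rbd0
tgtadm --lld iscsi --op bind --mode target --tid 1 --initiator-address ALL

# ESX can then discover the target via its software iSCSI initiator. Note
# that the gateway itself remains the bottleneck and single point of
# failure Guido describes above.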
[ceph-users] Geek On Duty: Hours Update
Hey all!

We’ve moved our Friday “Geek on Duty” shift from 1pm PT to 8am PT. You can see the new calendar here: http://ceph.com/help/community/

We’re a bit light on US-friendly shifts at the moment. If anyone wants to volunteer to be a Geek on Duty, tell commun...@ceph.com and we’ll set you up! In exchange for your time:

  your company gets: a nice, cozy spot on ceph.com
  you get: to build your skills by helping people :)

Cheers,
Ross

--
Ross Turk
Community, Inktank
@rossturk @inktank @ceph
Re: [ceph-users] Some help needed with ceph deployment
  ]},
  { "pgid": "2.31", "osds": [3, 1]},
  { "pgid": "2.32", "osds": [2, 1]},
  { "pgid": "2.33", "osds": [3, 0]},
  { "pgid": "2.35", "osds": [3, 2]},
  { "pgid": "2.36", "osds": [1, 0]},
  { "pgid": "2.38", "osds": [1, 0]},
  { "pgid": "2.3a", "osds": [3, 1]},
  { "pgid": "2.3c", "osds": [4, 0]},
  { "pgid": "2.3d", "osds": [2, 0]},
  { "pgid": "2.3e", "osds": [1, 0]},
  { "pgid": "2.3f", "osds": [4, 1]}],
  "blacklist": []}
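The output above looks like the tail of a pg-to-OSD mapping dump. As a side note, a single placement group's mapping can also be queried directly; a small sketch, using a pgid from the listing above:

# Ask the monitors where one placement group maps:
ceph pg map 2.31
# typical output: osdmap e<N> pg 2.31 (2.31) -> up [3,1] acting [3,1]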
[ceph-users] Could you add me to the subscribe list?
[ceph-users] xfsprogs not found in RHEL
I am trying to install Ceph and I get the following error:

---> Package ceph.x86_64 0:0.67.2-0.el6 will be installed
--> Processing Dependency: xfsprogs for package: ceph-0.67.2-0.el6.x86_64
---> Package python-babel.noarch 0:0.9.4-5.1.el6 will be installed
---> Package python-backports-ssl_match_hostname.noarch 0:3.2-0.3.a3.el6 will be installed
---> Package python-docutils.noarch 0:0.6-1.el6 will be installed
--> Processing Dependency: python-imaging for package: python-docutils-0.6-1.el6.noarch
---> Package python-jinja2.x86_64 0:2.2.1-1.el6 will be installed
---> Package python-pygments.noarch 0:1.1.1-1.el6 will be installed
---> Package python-six.noarch 0:1.1.0-2.el6 will be installed
--> Running transaction check
---> Package ceph.x86_64 0:0.67.2-0.el6 will be installed
--> Processing Dependency: xfsprogs for package: ceph-0.67.2-0.el6.x86_64
---> Package python-imaging.x86_64 0:1.1.6-19.el6 will be installed
--> Finished Dependency Resolution
Error: Package: ceph-0.67.2-0.el6.x86_64 (ceph)
       Requires: xfsprogs

Machine Info:
Linux version 2.6.32-131.4.1.el6.x86_64 (mockbu...@x86-003.build.bos.redhat.com) (gcc version 4.4.5 20110214 (Red Hat 4.4.5-6) (GCC) ) #1 SMP Fri Jun 10 10:54:26 EDT 2011
Re: [ceph-users] xfsprogs not found in RHEL
Hi,

xfsprogs should be included in the EL6 base. Perhaps run yum clean all and try again?

Cheers,
Lincoln

On Aug 27, 2013, at 9:16 PM, sriram wrote:

[quoted yum output snipped; see the original message above]
Re: [ceph-users] xfsprogs not found in RHEL
Tried yum clean all followed by yum install ceph, with the same result.

On Tue, Aug 27, 2013 at 7:44 PM, Lincoln Bryant linco...@uchicago.edu wrote:

> xfsprogs should be included in the EL6 base. Perhaps run yum clean all and try again?

[remainder of quoted thread snipped]
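A hedged troubleshooting sketch for this dependency error. On RHEL proper (unlike CentOS or Scientific Linux), xfsprogs has historically lived in an add-on channel rather than the base repositories, so the first step is to see which enabled repo, if any, can provide it; the EPEL fallback below is an assumption about this system's setup, not something confirmed in the thread:

# See whether any enabled repository can provide the missing package:
yum provides xfsprogs

# If nothing is found, the package may live in an optional/add-on RHEL
# channel that isn't enabled on this machine, or it can be pulled from
# EPEL if that repo is configured:
yum --enablerepo=epel install xfsprogs

# Then retry the Ceph install:
yum install ceph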