Re: Ceph version 0.56.1, data loss on power failure

2013-01-17 Thread Yann Dupont
On 16/01/2013 17:56, Jeff Mitchell wrote: FWIW, my ceph data dirs (e.g. for mons) are all on XFS. I've experienced a lot of corruption on these on power loss to the node -- and in some cases even when power wasn't lost, and the box was simply rebooted. This is on Ubuntu 12.04 with the

Single host VM limit when using RBD

2013-01-17 Thread Matthew Anderson
I've run into a limit on the maximum number of RBD-backed VMs that I'm able to run on a single host. I have 20 VMs (21 RBD volumes open) running on a single host, and when booting the 21st machine I get the below error from libvirt/QEMU. I'm able to shut down a VM and start another in its

Re: Single host VM limit when using RBD

2013-01-17 Thread Andrey Korolyov
Hi Matthew, Seems to be a low value of /proc/sys/kernel/threads-max. On Thu, Jan 17, 2013 at 12:37 PM, Matthew Anderson matth...@base3.com.au wrote: I've run into a limit on the maximum number of RBD-backed VMs that I'm able to run on a single host. I have 20 VMs (21 RBD volumes open)

RE: Single host VM limit when using RBD

2013-01-17 Thread Matthew Anderson
Hi Andrey, I did try your suggestion beforehand and it doesn't appear to fix the issue. [root@KVM04 ~]# cat /proc/sys/kernel/threads-max 2549635 [root@KVM04 ~]# echo 5549635 > /proc/sys/kernel/threads-max [root@KVM04 ~]# virsh start EX03 error: Failed to start domain EX03 error: internal error
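
For context, a minimal sketch of the threads-max tuning Andrey suggests, assuming a stock sysctl setup; the value shown is an arbitrary example, not a figure from the thread:

    # Inspect the current limit and roughly how many threads are in use
    cat /proc/sys/kernel/threads-max
    ps -eLf | wc -l

    # Raise the limit on the running kernel (example value)
    sysctl -w kernel.threads-max=1000000

    # Persist the setting across reboots
    echo "kernel.threads-max = 1000000" >> /etc/sysctl.conf
    sysctl -p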

Re: flashcache

2013-01-17 Thread Joseph Glanville
On 17 January 2013 20:46, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/1/16 Mark Nelson mark.nel...@inktank.com: I don't know if I have to use a single two port IB card (switch redundancy and no card redundancy) or I have to use two single port cards. (or a single one

Re: flashcache

2013-01-17 Thread Mark Nelson
On 01/16/2013 11:47 PM, Stefan Priebe - Profihost AG wrote: Hi Mark, On 16.01.2013 at 22:53, Mark wrote: With only 2 SSDs for 12 spinning disks, you'll need to make sure the SSDs are really fast. I use Intel 520s for testing, which are great, but I wouldn't use them in production. Why

Re: flashcache

2013-01-17 Thread Mark Nelson
On 01/17/2013 07:32 AM, Joseph Glanville wrote: On 17 January 2013 20:46, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/1/16 Mark Nelson mark.nel...@inktank.com: I don't know if I have to use a single two port IB card (switch redundancy and no card redundancy) or I have

Hit suicide timeout after adding new osd

2013-01-17 Thread Jens Kristian Søgaard
Hi guys, I had a functioning Ceph system that reported HEALTH_OK. It was running with 3 osds on 3 servers. Then I added an extra osd on 1 of the servers using the commands from the documentation here: http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ Shortly after I did that 2
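
For readers of the archive, the add-osd procedure on the linked page amounts to roughly the following sketch; the OSD id, weight, hostname and keyring path are illustrative, so consult the documentation for the exact syntax in your release:

    # Allocate an id for the new OSD (the command prints it, e.g. 3)
    ceph osd create

    # Initialise the OSD's data directory and authentication key
    ceph-osd -i 3 --mkfs --mkkey

    # Register the key and place the OSD in the CRUSH map
    ceph auth add osd.3 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-3/keyring
    ceph osd crush set osd.3 1.0 root=default host=server1

    # Start the daemon; the cluster then starts backfilling data onto it
    service ceph start osd.3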

Re: flashcache

2013-01-17 Thread Atchley, Scott
On Jan 17, 2013, at 8:37 AM, Mark Nelson mark.nel...@inktank.com wrote: On 01/17/2013 07:32 AM, Joseph Glanville wrote: On 17 January 2013 20:46, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/1/16 Mark Nelson mark.nel...@inktank.com: I don't know if I have to use a

Re: Hit suicide timeout after adding new osd

2013-01-17 Thread Wido den Hollander
Hi, On 01/17/2013 03:35 PM, Jens Kristian Søgaard wrote: Hi guys, I had a functioning Ceph system that reported HEALTH_OK. It was running with 3 osds on 3 servers. Then I added an extra osd on 1 of the servers using the commands from the documentation here:

Re: flashcache

2013-01-17 Thread Andrey Korolyov
On Thu, Jan 17, 2013 at 7:00 PM, Atchley, Scott atchle...@ornl.gov wrote: On Jan 17, 2013, at 9:48 AM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/1/17 Atchley, Scott atchle...@ornl.gov: IB DDR should get you close to 2 GB/s with IPoIB. I have gotten our IB QDR PCI-E

Re: Hit suicide timeout after adding new osd

2013-01-17 Thread Wido den Hollander
Hi, On 01/17/2013 03:50 PM, Stefan Priebe wrote: Hi, On 17.01.2013 15:47, Wido den Hollander wrote: You might want to try building from 'next' yourself or fetch some new packages from the RPM repos: http://eu.ceph.com/docs/master/install/rpm/ Should it be backported to the bobtail branch as
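
For anyone wanting to follow the suggestion, building from the 'next' branch in this era is a plain autotools workflow; a rough sketch, with the GitHub URL assumed rather than taken from the thread:

    # Fetch the source, switch to 'next' and pull in the submodules
    git clone https://github.com/ceph/ceph.git
    cd ceph
    git checkout next
    git submodule update --init

    # Configure and build
    ./autogen.sh
    ./configure
    make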

Re: Hit suicide timeout after adding new osd

2013-01-17 Thread Stefan Priebe
Hi Sage, On 17.01.2013 16:33, Wido den Hollander wrote: Hi, On 01/17/2013 03:50 PM, Stefan Priebe wrote: Hi, On 17.01.2013 15:47, Wido den Hollander wrote: You might want to try building from 'next' yourself or fetch some new packages from the RPM repos:

Re: flashcache

2013-01-17 Thread Atchley, Scott
On Jan 17, 2013, at 10:07 AM, Andrey Korolyov and...@xdel.ru wrote: On Thu, Jan 17, 2013 at 7:00 PM, Atchley, Scott atchle...@ornl.gov wrote: On Jan 17, 2013, at 9:48 AM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/1/17 Atchley, Scott atchle...@ornl.gov: IB DDR

Re: flashcache

2013-01-17 Thread Atchley, Scott
On Jan 17, 2013, at 10:14 AM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/1/17 Atchley, Scott atchle...@ornl.gov: IPoIB appears as a traditional Ethernet device to Linux and can be used as such. Ceph has no idea that it is not Ethernet. Ok. Now it's clear. AFAIK, a

Current OSD weight vs. target weight

2013-01-17 Thread Christopher Kunz
Hi, if run during a user-issued reweight (i.e. ceph osd crush reweight x y), ceph osd tree shows the target weight of an OSD. Is there a way to see the *current* weight of the OSD? We would like to be able to approximate the amount of rollback necessary per OSD if we need to cancel a larger
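
A sketch of the commands involved, for context; whether either output exposes the in-flight weight rather than the target is exactly the question this thread raises:

    # Per-OSD CRUSH weights, as set by 'ceph osd crush reweight'
    ceph osd tree

    # Full osdmap dump, including the per-OSD in/out weight column
    ceph osd dump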

Re: flashcache

2013-01-17 Thread Atchley, Scott
On Jan 17, 2013, at 11:01 AM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/1/17 Atchley, Scott atchle...@ornl.gov: Yes. It should get close to 1 GB/s, whereas 1GbE is limited to about 125 MB/s. Lower latency? Probably, since most Ethernet drivers set interrupt coalescing
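
To sanity-check throughput figures like these, iperf works over an IPoIB interface exactly as it does over Ethernet; a minimal sketch, with the server address as a placeholder:

    # On the server side of the IPoIB (or 10GbE) link
    iperf -s

    # On the client: four parallel streams, results reported in MBytes/s
    iperf -c <server-ip> -P 4 -f M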

Re: flashcache

2013-01-17 Thread Stefan Priebe
Hi, On 17.01.2013 17:12, Atchley, Scott wrote: On Jan 17, 2013, at 11:01 AM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/1/17 Atchley, Scott atchle...@ornl.gov: Yes. It should get close to 1 GB/s, whereas 1GbE is limited to about 125 MB/s. Lower latency? Probably, since

Re: flashcache

2013-01-17 Thread Stefan Priebe
Hi, On 17.01.2013 17:21, Gandalf Corvotempesta wrote: 2013/1/17 Stefan Priebe s.pri...@profihost.ag: We're using bonded active/active 2x10GbE with Intel ixgbe and I'm able to get 2.3 GB/s. Which kind of switch do you use? HP 5920. Stefan
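
For a bonded setup like the one Stefan describes, two quick checks confirm that traffic really spreads across both links; 'bond0' is an assumed interface name:

    # Bonding mode, transmit hash policy and slave state
    cat /proc/net/bonding/bond0

    # Per-interface byte and packet counters for the physical ports
    ip -s link show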

RE: Ceph slow request unstable issue

2013-01-17 Thread Sage Weil
On Thu, 17 Jan 2013, Chen, Xiaoxi wrote: A summary of the cases tested so far (Ceph is v0.56.1): 1. RBD: Ubuntu 13.04 + 3.7 kernel; OSD: Ubuntu 13.04 + 3.7 kernel, XFS. Result: kernel panic on both the RBD and OSD sides. We're very interested in the RBD client-side

Re: HOWTO: teuthology and code coverage

2013-01-17 Thread Gregory Farnum
It's great to see people outside of Inktank starting to get into using teuthology. Thanks for the write-up! -Greg On Wed, Jan 16, 2013 at 6:01 AM, Loic Dachary l...@dachary.org wrote: Hi, I'm happy to report that running teuthology to get a lcov code coverage report worked for me.

master branch issue in ceph.git

2013-01-17 Thread David Zafman
The latest code is hanging trying to start teuthology. I used teuthology-nuke to clear old state and reboot the machines. I was using my branch rebased to latest master and when that started failing I switched to the default config. It still keeps hanging here:

Re: master branch issue in ceph.git

2013-01-17 Thread Sage Weil
On Thu, 17 Jan 2013, David Zafman wrote: The latest code is hanging trying to start teuthology. I used teuthology-nuke to clear old state and reboot the machines. I was using my branch rebased to latest master and when that started failing I switched to the default config. It still