Re: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread Sylvain Munaut
> FWIW, I can confirm via printf's that this error path is never hit in at least some of the crashes I'm seeing.

Ok thanks. Are you using cache btw ?

Cheers, Sylvain

RE: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread James Harper
> > FWIW, I can confirm via printf's that this error path is never hit in at least some of the crashes I'm seeing.
>
> Ok thanks. Are you using cache btw ?

I hope not. How could I tell? It's not something I've explicitly enabled.

Thanks
James

Re: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread Sylvain Munaut
Hi,

> I hope not. How could I tell? It's not something I've explicitly enabled.

It's disabled by default, so you'd have to have enabled it either in ceph.conf or directly in the device path in the xen config. (The option is 'rbd cache'; see http://ceph.com/docs/next/rbd/rbd-config-ref/ .)

Cheers,
Sylvain
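For context, a minimal sketch of what enabling the cache in ceph.conf would look like; this is illustrative only, with the option spelling taken from the docs linked above, not from the thread:

    # ceph.conf (sketch)
    [client]
    rbd cache = true

If neither a line like this nor a cache option in the xen device path is present, the cache should be off.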

Re: teuthology : ulimit: error

2013-08-13 Thread Loic Dachary
Hi Dan,

That was indeed the solution :-) Thanks !

On 13/08/2013 04:53, Dan Mick wrote:
> Ah, there's another we apply universally to our test systems, apparently: '/etc/security/limits.d/ubuntu.conf'
>
>     ubuntu hard nofile 16384
>
> and the tests run as user ubuntu. Line 4 of the script is the

Re: Call for participants : teuthology weekly meeting

2013-08-13 Thread Loic Dachary
Hi Ceph,

Teuthology was successfully installed and run as described here: http://dachary.org/?p=2204 with only two minor glitches:

- teuthology : ulimit: error http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/16421
- do not check the jobid if check-locks is False

Re: poll/sendmsg problem with 3.5.0-37-generic #58~precise1-Ubuntu

2013-08-13 Thread Luis Henriques
Sage Weil s...@inktank.com writes:
> Hi,
>
> A ceph user hit a problem with the 3.5 precise kernel with symptoms exactly like an old poll(2) bug [1]. Basically, one end of a socket is blocked on sendmsg(2), and the other end is blocked on poll(2) waiting for data. 15 minutes later the poll(2)
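To make the stuck state concrete, here is a minimal user-space sketch of the two blocked calls; this is illustrative C, not code from the kernel or from this thread, and the function names are made up:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <poll.h>

    /* Peer A: once the socket send buffer is full, this blocks in sendmsg(2). */
    ssize_t send_all(int fd, void *buf, size_t len)
    {
        struct iovec iov = { .iov_base = buf, .iov_len = len };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
        return sendmsg(fd, &msg, 0);          /* blocked end #1 */
    }

    /* Peer B: blocks in poll(2) waiting for POLLIN that never arrives. */
    int wait_for_data(int fd, int timeout_ms)
    {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        return poll(&pfd, 1, timeout_ms);     /* blocked end #2 */
    }

In the reported hang both ends sit like this until the poll(2) timeout finally fires.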

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-13 Thread Andreas Bluemle
Hi Matthew,

I found a workaround for my (our) problem: in the librdmacm code, rsocket.c, there is a global constant polling_time, which is set to 10 microseconds at the moment. I raise this to 10000 - and all of a sudden things work nicely. I think we are looking at two issues here: 1. the
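A sketch of the kind of one-line change being described; the exact declaration in rsocket.c may differ by version, so treat this as illustrative rather than the actual patch:

    --- rsocket.c (illustrative)
    -static int polling_time = 10;    /* busy-poll the CQ for 10 us before blocking */
    +static int polling_time = 10000; /* workaround: busy-poll much longer */

The constant bounds how long rsockets busy-polls a completion queue before falling back to blocking on an event, so raising it trades CPU time for latency.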

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-13 Thread Atchley, Scott
On Aug 13, 2013, at 10:06 AM, Andreas Bluemle andreas.blue...@itxperts.de wrote:
> Hi Matthew,
>
> I found a workaround for my (our) problem: in the librdmacm code, rsocket.c, there is a global constant polling_time, which is set to 10 microseconds at the moment. I raise this to 10000 - and

Re: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread Frederik Thuysbaert
Hi,

I have been testing this for a while now, and just finished testing your untested patch. The rbd caching problem still persists. The system I am testing on has the following characteristics:

Dom0:
- Linux xen-001 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64
- Most recent git checkout

Re: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread Sylvain Munaut
Hi,

> I have been testing this a while now, and just finished testing your untested patch. The rbd caching problem still persists.

Yes, I wouldn't expect it to change anything for caching. But I still don't understand why caching would change anything at all ... all of it should be handled within

teuthology and code coverage

2013-08-13 Thread Loic Dachary
Hi,

When running teuthology from a laptop with the configuration below and

    ./virtualenv/bin/teuthology --archive=/tmp/teuthology try.yaml

it then fails on

    ./virtualenv/bin/teuthology-coverage -v --html-output /tmp/html -o /tmp/lcov --cov-tools-dir $(pwd)/coverage /tmp/teuthology

Re: still recovery issues with cuttlefish

2013-08-13 Thread Samuel Just
I just backported a couple of patches from next to fix a bug where we weren't respecting the osd_recovery_max_active config in some cases (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e). You can either try the current cuttlefish branch or wait for a 61.8 release.
-Sam

On Mon, Aug 12, 2013 at 10:34
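For anyone checking their own setting, a minimal sketch of where this option lives; the value shown is the documented default of that era and should be verified against your release:

    # ceph.conf (sketch)
    [osd]
    osd recovery max active = 5

Lower values throttle the number of in-flight recovery ops per OSD in favor of client I/O.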

Re: still recovery issues with cuttlefish

2013-08-13 Thread Stefan Priebe - Profihost AG
On 13.08.2013 at 22:43, Samuel Just sam.j...@inktank.com wrote:
> I just backported a couple of patches from next to fix a bug where we weren't respecting the osd_recovery_max_active config in some cases (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e). You can either try the current cuttlefish

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-13 Thread Hefty, Sean
> > I found a workaround for my (our) problem: in the librdmacm code, rsocket.c, there is a global constant polling_time, which is set to 10 microseconds at the moment. I raise this to 10000 - and all of a sudden things work nicely.
>
> I am adding the linux-rdma list to CC so Sean might see

RE: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread James Harper
Just noticed the email subject "qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process" ([Qemu-devel] [Bug 1207686]), where Sage noted that he has seen a completion called twice in the logs the OP posted. If that is

Re: poll/sendmsg problem with 3.5.0-37-generic #58~precise1-Ubuntu

2013-08-13 Thread Sage Weil
On Tue, 13 Aug 2013, Luis Henriques wrote:
> Sage Weil s...@inktank.com writes:
> > Hi,
> >
> > A ceph user hit a problem with the 3.5 precise kernel with symptoms exactly like an old poll(2) bug [1]. Basically, one end of a socket is blocked on sendmsg(2), and the other end is blocked on poll(2)

Re: still recovery issues with cuttlefish

2013-08-13 Thread Samuel Just
I'm not sure, but your logs did show that you had 16 recovery ops in flight, so it's worth a try. If it doesn't help, you should collect the same set of logs and I'll look again. Also, there are a few other patches between 61.7 and current cuttlefish which may help.
-Sam

On Tue, Aug 13, 2013 at

RE: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread James Harper
I think I have a separate problem too - tapdisk will segfault almost immediately upon starting but seemingly only for Linux PV DomU's. Once it has started doing this I have to wait a few hours to a day before it starts working again. My Windows DomU's appear to be able to start normally though.

Re: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread Sylvain Munaut
On Wed, Aug 14, 2013 at 1:39 AM, James Harper james.har...@bendigoit.com.au wrote:
> I think I have a separate problem too - tapdisk will segfault almost immediately upon starting but seemingly only for Linux PV DomU's. Once it has started doing this I have to wait a few hours to a day before

RE: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread James Harper
> On Wed, Aug 14, 2013 at 1:39 AM, James Harper james.har...@bendigoit.com.au wrote:
> > I think I have a separate problem too - tapdisk will segfault almost immediately upon starting but seemingly only for Linux PV DomU's. Once it has started doing this I have to wait a few hours to a day

RE: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p

2013-08-13 Thread James Harper
> On Wed, Aug 14, 2013 at 1:39 AM, James Harper james.har...@bendigoit.com.au wrote:
> > I think I have a separate problem too - tapdisk will segfault almost immediately upon starting but seemingly only for Linux PV DomU's. Once it has started doing this I have to wait a few hours to a

[PATCH v4] Ceph-fuse: Fallocate and punch hole support

2013-08-13 Thread Li Wang
This patch implements fallocate and punch hole support for the Ceph fuse client.

Signed-off-by: Yunchuan Wen yunchuan...@ubuntukylin.com
Signed-off-by: Li Wang liw...@ubuntukylin.com
---
Since the i_size is untrustable without the Fs cap, we'd better let the fallocate go ahead without checking whether it is beyond
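For readers unfamiliar with the semantics being added, a minimal user-space sketch of a punch-hole call; this is standard Linux fallocate(2) usage, not code from this patch:

    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <fcntl.h>
    #include <linux/falloc.h>
    #include <stdio.h>

    /* Punch a hole: deallocate [offset, offset+len) while keeping the
     * file size unchanged. KEEP_SIZE is mandatory with PUNCH_HOLE. */
    int punch_hole(int fd, off_t offset, off_t len)
    {
        int ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                            offset, len);
        if (ret == -1)
            perror("fallocate");
        return ret;
    }

Reads from the punched range then return zeros, and the underlying blocks are freed.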

[PATCH] Ceph-qa: change the fsx.sh to support hole punching test

2013-08-13 Thread Li Wang
This patch changes fsx.sh to pull a better fsx.c from the xfstests site, to support hole punching tests.

Signed-off-by: Yunchuan Wen yunchuan...@ubuntukylin.com
Signed-off-by: Li Wang liw...@ubuntukylin.com
---
 qa/workunits/suites/fsx.sh | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Re: [PATCH v4] Ceph-fuse: Fallocate and punch hole support

2013-08-13 Thread Sage Weil
On Wed, 14 Aug 2013, Li Wang wrote:
> This patch implements fallocate and punch hole support for Ceph fuse client.
>
> Signed-off-by: Yunchuan Wen yunchuan...@ubuntukylin.com
> Signed-off-by: Li Wang liw...@ubuntukylin.com
> ---
> Since the i_size is untrustable without Fs cap, we'd better let the