Re: Re: question about striped_read

2013-08-01 Thread majianpeng
On Thu, Aug 1, 2013 at 9:45 AM, majianpeng majianp...@gmail.com wrote: On Wed, Jul 31, 2013 at 3:32 PM, majianpeng majianp...@gmail.com wrote: [snip] Test case A: touch file; dd if=file of=/dev/null bs=5M count=1 iflag=direct. B: [data(2M)|hole(2M)][data(2M)]; dd if=file of=/dev/null
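
A rough shell sketch of test case B as described above (the 2M sizes and the direct-I/O read come from the preview; the exact write offsets are assumptions based on the default 4M object size):

  # build [data(2M)|hole(2M)][data(2M)]: write 2M, skip 2M, write 2M more
  dd if=/dev/urandom of=file bs=1M count=2 oflag=direct
  dd if=/dev/urandom of=file bs=1M count=2 seek=4 oflag=direct
  # one large direct read spanning the hole, as in test case A
  dd if=file of=/dev/null bs=5M count=1 iflag=direct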

Re: Re: question about striped_read

2013-08-01 Thread Yan, Zheng
On Thu, Aug 1, 2013 at 2:30 PM, majianpeng majianp...@gmail.com wrote: On Thu, Aug 1, 2013 at 9:45 AM, majianpeng majianp...@gmail.com wrote: On Wed, Jul 31, 2013 at 3:32 PM, majianpeng majianp...@gmail.com wrote: [snip] Test case A: touch file; dd if=file of=/dev/null bs=5M count=1

still recovery issues with cuttlefish

2013-08-01 Thread Stefan Priebe - Profihost AG
Hi, I still have recovery issues with cuttlefish. After the OSD comes back it seems to hang for around 2-4 minutes and then recovery seems to start (PGs in recovery_wait start to decrement). This is with ceph 0.61.7. I get a lot of slow request messages and hanging VMs. What I noticed today is
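
Not part of the original report, but a minimal way to watch the stall from another terminal while reproducing it (standard CLI commands of that era, no special log levels assumed):

  ceph -w                     # streams the cluster log, including slow request warnings
  watch -n 5 'ceph pg stat'   # shows recovery_wait / recovering PG counts as they decrement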

Re: LFS Ceph

2013-08-01 Thread Chmouel Boudjnah
Hello, Sorry for the late answer as I was travelling lately. The LFS work has been in a heavy state of work in progress by Peter (in CC) and others; there is some documentation in this review: https://review.openstack.org/#/c/30051/ (summarized in Pete's gist here

[PATCH V5 2/8] fs/ceph: vfs __set_page_dirty_nobuffers interface instead of doing it inside filesystem

2013-08-01 Thread Sha Zhengju
From: Sha Zhengju handai@taobao.com Following this, we will begin to add memcg dirty page accounting around __set_page_dirty_{buffers,nobuffers} in the VFS layer, so we'd better use the VFS interface to avoid exporting those details to filesystems. Signed-off-by: Sha Zhengju handai@taobao.com ---

Re: [PATCH] mds: remove waiting lock before merging with neighbours

2013-08-01 Thread David Disseldorp
Hi, Did anyone get a chance to look at this change? Any comments/feedback/ridicule would be appreciated. Cheers, David

Re: [PATCH] Add missing buildrequires for Fedora

2013-08-01 Thread Danny Al-Gaaf
Hi, I've opened a pull request with some additional fixes for this issue: https://github.com/ceph/ceph/pull/478 Danny On 30.07.2013 09:53, Erik Logtenberg wrote: Hi, This patch adds two BuildRequires to the ceph.spec file that are needed to build the RPMs under Fedora. Danny Al-Gaaf

Re: still recovery issues with cuttlefish

2013-08-01 Thread Andrey Korolyov
Second this. Also, regarding the long-standing snapshot problem and related performance issues, I can say that cuttlefish improved things greatly, but creation/deletion of a large snapshot (hundreds of gigabytes of committed data) can still bring down the cluster for minutes, despite usage of every possible

PG Backend Proposal

2013-08-01 Thread Loic Dachary
Hi Sam, When the acting set changes order, two chunks for the same object may co-exist in the same placement group. The key should therefore also contain the chunk number. That's probably the most sensible comment I have so far. This document is immensely useful (even in its current state)

Re: PG Backend Proposal

2013-08-01 Thread Samuel Just
DELETE can always be rolled forward, but there may be other operations in the log that can't be (like an append), so we need to be able to roll it back (I think). perform_write, read, and try_rollback probably don't matter for backfill and scrubbing. You are correct, we need to include the chunk number

Re: PG Backend Proposal

2013-08-01 Thread Loic Dachary
On 01/08/2013 18:42, Loic Dachary wrote: Hi Sam, When the acting set changes order, two chunks for the same object may co-exist in the same placement group. The key should therefore also contain the chunk number. That's probably the most sensible comment I have so far. This document

Re: [PATCH] mds: remove waiting lock before merging with neighbours

2013-08-01 Thread Sage Weil
On Thu, 1 Aug 2013, David Disseldorp wrote: Hi, Did anyone get a chance to look at this change? Any comments/feedback/ridicule would be appreciated. Sorry, not yet--and Greg just headed out for vacation yesterday. It's on my list to look at when I have some time tonight or tomorrow,

v0.67-rc3 Dumpling release candidate

2013-08-01 Thread Sage Weil
We've tagged and pushed out packages for another release candidate for Dumpling. At this point things are looking very good. There are a few odds and ends with the CLI changes but the core ceph functionality is looking quite stable. Please test! Packages are available in the -testing repos:

Re: [PATCH V5 2/8] fs/ceph: vfs __set_page_dirty_nobuffers interface instead of doing it inside filesystem

2013-08-01 Thread Sage Weil
On Thu, 1 Aug 2013, Yan, Zheng wrote: On Thu, Aug 1, 2013 at 7:51 PM, Sha Zhengju handai@gmail.com wrote: From: Sha Zhengju handai@taobao.com Following this, we will begin to add memcg dirty page accounting around __set_page_dirty_{buffers,nobuffers} in the VFS layer, so we'd better use

Re: still recovery issues with cuttlefish

2013-08-01 Thread Samuel Just
Can you reproduce and attach the ceph.log from before you stop the OSD until after you have started the OSD and it has recovered? -Sam On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Hi, I still have recovery issues with cuttlefish. After the OSD comes

Re: still recovery issues with cuttlefish

2013-08-01 Thread Stefan Priebe
On 01.08.2013 20:34, Samuel Just wrote: Can you reproduce and attach the ceph.log from before you stop the OSD until after you have started the OSD and it has recovered? -Sam Sure, which log levels? On Thu, Aug 1, 2013 at 1:22 AM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote:

Re: still recovery issues with cuttlefish

2013-08-01 Thread Samuel Just
For now, just the main ceph.log. -Sam On Thu, Aug 1, 2013 at 11:34 AM, Stefan Priebe s.pri...@profihost.ag wrote: On 01.08.2013 20:34, Samuel Just wrote: Can you reproduce and attach the ceph.log from before you stop the OSD until after you have started the OSD and it has recovered? -Sam

Re: still recovery issues with cuttlefish

2013-08-01 Thread Samuel Just
It doesn't have log levels; it should be in /var/log/ceph/ceph.log. -Sam On Thu, Aug 1, 2013 at 11:36 AM, Samuel Just sam.j...@inktank.com wrote: For now, just the main ceph.log. -Sam On Thu, Aug 1, 2013 at 11:34 AM, Stefan Priebe s.pri...@profihost.ag wrote: On 01.08.2013 20:34, Samuel
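
A sketch of one way to capture that window (osd.12 and the sysvinit-style service commands are placeholders; the log path is the /var/log/ceph/ceph.log mentioned above):

  service ceph stop osd.12
  # ...wait until the cluster has marked the OSD down...
  service ceph start osd.12
  # after recovery completes, save the cluster log covering the whole window
  cp /var/log/ceph/ceph.log ceph.log.osd12-restart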

Re: still recovery issues with cuttlefish

2013-08-01 Thread Mike Dawson
I am also seeing recovery issues with 0.61.7. Here's the process:
- ceph osd set noout
- Reboot one of the nodes hosting OSDs
- VMs mounted from RBD volumes work properly
- I see the OSDs' boot messages as they re-join the cluster
- Start seeing active+recovery_wait, peering, and
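
The same sequence as a shell sketch (node and OSD names are placeholders, not taken from the mail):

  ceph osd set noout      # keep the down OSDs from being marked out during the reboot
  ssh node-a reboot       # reboot one of the nodes hosting OSDs
  # ...OSDs boot and re-join; PGs move through peering and active+recovery_wait...
  ceph osd unset noout    # once the cluster is healthy again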

Re: still recovery issues with cuttlefish

2013-08-01 Thread Stefan Priebe
Mike, we already have the async patch running. Yes, it helps, but it only helps; it does not solve the problem. It just hides the issue ... On 01.08.2013 20:54, Mike Dawson wrote: I am also seeing recovery issues with 0.61.7. Here's the process: - ceph osd set noout - Reboot one of the nodes hosting OSDs

Rados Protocoll

2013-08-01 Thread Niklas Goerke
Hi, I was wondering why there is no native Java implementation of librados. I'm thinking about creating one and I'm thus looking for documentation of the RADOS protocol. Also, the way I see it, librados implements the CRUSH algorithm. Is there documentation for it? Also an educated guess

Re: still recovery issues with cuttlefish

2013-08-01 Thread Samuel Just
Can you dump your OSD settings? sudo ceph --admin-daemon ceph-osd.osdid.asok config show -Sam On Thu, Aug 1, 2013 at 12:07 PM, Stefan Priebe s.pri...@profihost.ag wrote: Mike, we already have the async patch running. Yes, it helps, but it only helps; it does not solve the problem. It just hides the issue ... On
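
For reference, on a default install the admin socket lives under /var/run/ceph, so the dump would look something like this (OSD id 0 is a placeholder):

  sudo ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show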

Re: Rados Protocoll

2013-08-01 Thread Noah Watkins
Hi Niklas, The RADOS reference implementation in C++ is quite large. Reproducing it all in another language would be interesting, but I'm curious whether wrapping the C interface is an option for you. There are Java bindings that are being worked on here: https://github.com/wido/rados-java. There

Re: PG Backend Proposal

2013-08-01 Thread Loic Dachary
Hi Sam, I'm under the impression that https://github.com/athanatos/ceph/blob/wip-erasure-coding-doc/doc/dev/osd_internals/erasure_coding.rst#distinguished-acting-set-positions assumes acting[0] stores all chunk[0], acting[1] stores all chunk[1] etc. The chunk rank does not need to match the OSD

Re: PG Backend Proposal

2013-08-01 Thread Samuel Just
I think there are some tricky edge cases with the above approach. You might end up with two PG replicas in the same acting set which happen, for reasons of history, to have the same chunk for one or more objects. That would have to be detected and repaired even though the object would be missing

Re: [PATCH] ceph: fix bugs about handling short-read for sync read mode.

2013-08-01 Thread Sage Weil
On Fri, 2 Aug 2013, majianpeng wrote: cephfs . show_layout
layout.data_pool: 0
layout.object_size: 4194304
layout.stripe_unit: 4194304
layout.stripe_count: 1
TestA:
dd if=/dev/urandom of=test bs=1M count=2 oflag=direct
dd if=/dev/urandom of=test bs=1M count=2 seek=4
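
The preview is cut off before the read step; presumably the test then reads back across the resulting hole, for example something like the following (hypothetical, not taken from the mail):

  # assumed read step: one large read spanning [data(2M)|hole(2M)][data(2M)]
  dd if=test of=/dev/null bs=8M count=1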