Re: maintenance on osd host

2013-02-26 Thread Stefan Priebe - Profihost AG
Hi Greg, Hi Sage, On 26.02.2013 21:27, Gregory Farnum wrote: > On Tue, Feb 26, 2013 at 11:44 AM, Stefan Priebe wrote: > "out" and "down" are quite different — are you sure you tried "down" > and not "out"? (You reference out in your first email, rather than > down.) > -Greg sorry that's it i

[GIT PULL] Ceph updates for 3.9-rc1

2013-02-26 Thread Sage Weil
Hi Linus, Please pull the following Ceph updates from git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git for-linus A few groups of patches here. Alex has been hard at work improving the RBD code, laying groundwork for understanding the new formats and doing layering. Most o
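
The pull itself is a plain git fetch-and-merge of the named branch; a minimal sketch, using the repository URL and branch given in the message:

    # merge the for-linus branch from Sage's ceph-client tree
    git pull git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git for-linus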

Re: Crash and strange things on MDS

2013-02-26 Thread Sage Weil
On Wed, 27 Feb 2013, Yan, Zheng wrote: > On Wed, Feb 27, 2013 at 5:58 AM, Gregory Farnum wrote: > > On Tue, Feb 26, 2013 at 1:57 PM, Kevin Decherf wrote: > >> On Tue, Feb 26, 2013 at 12:26:17PM -0800, Gregory Farnum wrote: > >>> On Tue, Feb 26, 2013 at 11:58 AM, Kevin Decherf > >>> wrote: > >>

Re: Crash and strange things on MDS

2013-02-26 Thread Yan, Zheng
On Wed, Feb 27, 2013 at 5:58 AM, Gregory Farnum wrote: > On Tue, Feb 26, 2013 at 1:57 PM, Kevin Decherf wrote: >> On Tue, Feb 26, 2013 at 12:26:17PM -0800, Gregory Farnum wrote: >>> On Tue, Feb 26, 2013 at 11:58 AM, Kevin Decherf wrote: >>> > We have one folder per application (php, java, ruby).

Re: Crash and strange things on MDS

2013-02-26 Thread Gregory Farnum
On Tue, Feb 26, 2013 at 1:57 PM, Kevin Decherf wrote: > On Tue, Feb 26, 2013 at 12:26:17PM -0800, Gregory Farnum wrote: >> On Tue, Feb 26, 2013 at 11:58 AM, Kevin Decherf wrote: >> > We have one folder per application (php, java, ruby). Every application has >> > small (<1M) files. The folder is

Re: Crash and strange things on MDS

2013-02-26 Thread Kevin Decherf
On Tue, Feb 26, 2013 at 12:26:17PM -0800, Gregory Farnum wrote: > On Tue, Feb 26, 2013 at 11:58 AM, Kevin Decherf wrote: > > We have one folder per application (php, java, ruby). Every application has > > small (<1M) files. The folder is mounted by only one client by default. > > > > In case of ov

Geographic DR for RGW

2013-02-26 Thread Mark Kampe
A few weeks ago, Yehuda Sadeh sent out a proposal for adding support for asynchronous remote site replication to the RADOS Gateway. We have done some preliminary planning and are now starting the implementation work. At the 100,000' level (from which height all water looks drinkable and all moun

Re: maintenance on osd host

2013-02-26 Thread Gregory Farnum
On Tue, Feb 26, 2013 at 11:44 AM, Stefan Priebe wrote: > Hi Sage, > > On 26.02.2013 18:24, Sage Weil wrote: > >> On Tue, 26 Feb 2013, Stefan Priebe - Profihost AG wrote: >>> >>> But that results in a 1-3s hiccup for all KVM vms. This is not what I >>> want. >> >> >> You can do >> >> kill $pid

Re: Crash and strange things on MDS

2013-02-26 Thread Gregory Farnum
On Tue, Feb 26, 2013 at 11:58 AM, Kevin Decherf wrote: > We have one folder per application (php, java, ruby). Every application has > small (<1M) files. The folder is mounted by only one client by default. > > In case of overload, another clients spawn to mount the same folder and > access the sa

Re: maintenance on osd host

2013-02-26 Thread Sage Weil
On Tue, 26 Feb 2013, Stefan Priebe wrote: > Hi Sage, > > On 26.02.2013 18:24, Sage Weil wrote: > > On Tue, 26 Feb 2013, Stefan Priebe - Profihost AG wrote: > > > But that results in a 1-3s hiccup for all KVM vms. This is not what I > > > want. > > > > You can do > > > > kill $pid > > ceph

Re: Crash and strange things on MDS

2013-02-26 Thread Kevin Decherf
On Tue, Feb 26, 2013 at 10:10:06AM -0800, Gregory Farnum wrote: > On Tue, Feb 26, 2013 at 9:57 AM, Kevin Decherf wrote: > > On Tue, Feb 19, 2013 at 05:09:30PM -0800, Gregory Farnum wrote: > >> On Tue, Feb 19, 2013 at 5:00 PM, Kevin Decherf wrote: > >> > On Tue, Feb 19, 2013 at 10:15:48AM -0800, G

Re: maintenance on osd host

2013-02-26 Thread Stefan Priebe
Hi Sage, On 26.02.2013 18:24, Sage Weil wrote: On Tue, 26 Feb 2013, Stefan Priebe - Profihost AG wrote: But that results in a 1-3s hiccup for all KVM vms. This is not what I want. You can do kill $pid ceph osd down $osdid (or even reverse the order, if the sequence is quick enough) to

Re: [RFC PATCH 0/6] Understanding delays due to throttling under very heavy write load

2013-02-26 Thread Sage Weil
On Tue, 26 Feb 2013, Jim Schutt wrote: > > I think the right solution is to make an option that will setsockopt on > > SO_RECVBUF to some value (say, 256KB). I pushed a branch that does this, > > wip-tcp. Do you mind checking to see if this addresses the issue (without > > manually adjusting t
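
The fix Sage describes caps the kernel TCP receive buffer on messenger sockets via setsockopt(SO_RCVBUF); a minimal ceph.conf sketch, assuming the wip-tcp branch exposes it as the "ms tcp rcvbuf" option (the option name is an assumption; the 256 KB value is the one suggested in the thread):

    [global]
        # bound per-connection kernel receive buffering on
        # messenger TCP sockets (256 KB)
        ms tcp rcvbuf = 262144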

Re: [RFC PATCH 0/6] Understanding delays due to throttling under very heavy write load

2013-02-26 Thread Jim Schutt
Hi Sage, On 02/20/2013 05:12 PM, Sage Weil wrote: > Hi Jim, > > I'm resurrecting an ancient thread here, but: we've just observed this on > another big cluster and remembered that this hasn't actually been fixed. Sorry for the delayed reply - I missed this in a backlog of unread email... > >

Re: [PATCH 0/4] libceph: abstract setting message data info

2013-02-26 Thread Josh Durgin
On 02/25/2013 03:40 PM, Alex Elder wrote: This series makes the fields related to the data portion of a ceph message not get manipulated by code outside the ceph messenger. It implements some interface functions that can be used to assign data-related fields. Doing this will allow the way messa

Re: [PATCH 0/3] libceph: focus calc_layout() on filling in the osd op

2013-02-26 Thread Josh Durgin
On 02/25/2013 03:09 PM, Alex Elder wrote: This series refactors the code involved with identifying the details of the name, offset, and length of an object involved with an osd request based on a file layout. It makes the focus of calc_layout() be filling in an osd op structure based on the file

Re: [PATCH 3/3] ceph: fix vmtruncate deadlock

2013-02-26 Thread Gregory Farnum
On Mon, Feb 25, 2013 at 4:01 PM, Gregory Farnum wrote: > On Fri, Feb 22, 2013 at 8:31 PM, Yan, Zheng wrote: >> On 02/23/2013 02:54 AM, Gregory Farnum wrote: >>> I haven't spent that much time in the kernel client, but this patch >>> isn't working out for me. In particular, I'm pretty sure we need

Re: [PATCH] libceph: make ceph_msg->bio_seg be unsigned

2013-02-26 Thread Josh Durgin
Reviewed-by: Josh Durgin On 02/25/2013 02:40 PM, Alex Elder wrote: The bio_seg field is used by the ceph messenger in iterating through a bio. It should never have a negative value, so make it an unsigned. Change variables used to hold bio_seg values to all be unsigned as well. Change two va

Re: [PATCH] libceph: fix a osd request memory leak

2013-02-26 Thread Josh Durgin
Reviewed-by: Josh Durgin On 02/25/2013 02:36 PM, Alex Elder wrote: If an invalid layout is provided to ceph_osdc_new_request(), its call to calc_layout() might return an error. At that point in the function we've already allocated an osd request structure, so we need to free it (drop a referen

Re: Crash and strange things on MDS

2013-02-26 Thread Gregory Farnum
On Tue, Feb 26, 2013 at 9:57 AM, Kevin Decherf wrote: > On Tue, Feb 19, 2013 at 05:09:30PM -0800, Gregory Farnum wrote: >> On Tue, Feb 19, 2013 at 5:00 PM, Kevin Decherf wrote: >> > On Tue, Feb 19, 2013 at 10:15:48AM -0800, Gregory Farnum wrote: >> >> Looks like you've got ~424k dentries pinned,

Re: Crash and strange things on MDS

2013-02-26 Thread Kevin Decherf
On Tue, Feb 19, 2013 at 05:09:30PM -0800, Gregory Farnum wrote: > On Tue, Feb 19, 2013 at 5:00 PM, Kevin Decherf wrote: > > On Tue, Feb 19, 2013 at 10:15:48AM -0800, Gregory Farnum wrote: > >> Looks like you've got ~424k dentries pinned, and it's trying to keep > >> 400k inodes in cache. So you're
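
The figures in this exchange (~424k pinned dentries against a ~400k-inode cache target) point at the MDS cache-size tunable of that era; a minimal ceph.conf sketch, assuming the "mds cache size" option (an inode count, historically defaulting to 100000) is what needs raising:

    [mds]
        # let the MDS cache more inodes than the clients currently pin,
        # at the cost of more memory on the MDS host
        mds cache size = 500000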

Re: maintenance on osd host

2013-02-26 Thread Sage Weil
On Tue, 26 Feb 2013, Stefan Priebe - Profihost AG wrote: > But that results in a 1-3s hiccup for all KVM vms. This is not what I want. You can do kill $pid ceph osd down $osdid (or even reverse the order, if the sequence is quick enough) to avoid waiting for the failure detection delay. But
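
Written out as shell steps, the sequence Sage gives for one OSD looks like this; $pid and $osdid are placeholders for the daemon's process id and the OSD's numeric id:

    kill $pid              # stop the ceph-osd daemon
    ceph osd down $osdid   # tell the monitors right away, skipping the failure-detection delay
    # (or reverse the two steps, if done quickly enough)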

Re: maintenance on osd host

2013-02-26 Thread Stefan Priebe - Profihost AG
But that results in a 1-3s hiccup for all KVM vms. This is not what I want. Stefan On 26.02.2013 at 18:06, Sage Weil wrote: > On Tue, 26 Feb 2013, Stefan Priebe - Profihost AG wrote: >> Hi list, >> >> how can i do a short maintenance like a kernel upgrade on an osd host? >> Right now ceph sta

Re: maintenance on osd host

2013-02-26 Thread Sage Weil
On Tue, 26 Feb 2013, Stefan Priebe - Profihost AG wrote: > Hi list, > > how can i do a short maintenance like a kernel upgrade on an osd host? > Right now ceph starts to backfill immediately if i say: > ceph osd out 41 > ... > > Without ceph osd out command all clients hang for the time ceph does

Re: OpenStack summit : Ceph design session

2013-02-26 Thread Neil Levine
It's been an embryonic internal Inktank conversation and Nick Barcet at eNovance mentioned some ideas when we last met. Will try and put together a blueprint soon. Neil On Mon, Feb 25, 2013 at 2:04 AM, Loic Dachary wrote: > Hi Neil, > > I've added "RBD backups secondary clusters within Openstack

Re: maintenance on osd host

2013-02-26 Thread Andrey Korolyov
On Tue, Feb 26, 2013 at 6:56 PM, Stefan Priebe - Profihost AG wrote: > Hi list, > > how can i do a short maintenance like a kernel upgrade on an osd host? > Right now ceph starts to backfill immediately if i say: > ceph osd out 41 > ... > > Without ceph osd out command all clients hang for the time

maintenance on osd host

2013-02-26 Thread Stefan Priebe - Profihost AG
Hi list, how can i do a short maintenance like a kernel upgrade on an osd host? Right now ceph starts to backfill immediately if i say: ceph osd out 41 ... Without ceph osd out command all clients hang for the time ceph does not know that the host was rebooted. I tried ceph osd set nodown and cep
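
A sketch of the noout approach the poster is reaching for here, assuming the standard cluster flag (while set, OSDs on the rebooting host are not marked out, so no backfill starts):

    ceph osd set noout     # suppress the out transition, and with it the backfill
    # ... reboot the host / upgrade the kernel ...
    ceph osd unset noout   # restore normal failure handling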

Re: [PATCH 3/3] ceph: fix vmtruncate deadlock

2013-02-26 Thread Yan, Zheng
On 02/26/2013 01:00 PM, Sage Weil wrote: > On Tue, 26 Feb 2013, Yan, Zheng wrote: >>> It looks to me like truncates can get queued for later, so that's not the >>> case? >>> And how could the client receive a truncate while in the middle of >>> writing? Either it's got the write caps (in which cas