RE: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread James Harper
Can you offer some comments on what the impact is likely to be to the data in an affected cluster? Should all data now be treated with suspicion and restored back to before the firefly upgrade? James

Re: v0.80.4 Firefly released

2014-07-16 Thread Christoph Hellwig
On Tue, Jul 15, 2014 at 04:45:59PM -0700, Sage Weil wrote: This Firefly point release fixes a potential data corruption problem when ceph-osd daemons run on top of XFS and service Firefly librbd clients. A recently added allocation hint that RBD utilizes triggers an XFS bug on some kernels

Re: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread Sylvain Munaut
On Wed, Jul 16, 2014 at 10:50 AM, James Harper ja...@ejbdigital.com.au wrote: Can you offer some comments on what the impact is likely to be to the data in an affected cluster? Should all data now be treated with suspicion and restored back to before the firefly upgrade? Yes, I'd definitely

RE: [RFC][PATCH] osd: Add local_connection to fast_dispatch in func _send_boot.

2014-07-16 Thread Ma, Jianpeng
Ping... -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Ma, Jianpeng Sent: Monday, July 14, 2014 11:17 AM To: g...@inktank.com Cc: ceph-devel@vger.kernel.org Subject: [RFC][PATCH] osd: Add local_connection to fast_dispatch

extremely-high-unreasonable number of threads and memory usage on the client side using librados C++ api

2014-07-16 Thread Amit Tiwary
Ceph Developers, we are using C++ librados (ceph 0.67.9) to interact with two ceph clusters. cluster1(test) has 30 osds whereas cluster2(production) has 936 osds. We are hitting extremely-high-unreasonable number of threads and memory usage on the client side while interacting with cluster2 for

Re: extremely-high-unreasonable number of threads and memory usage on the client side using librados C++ api

2014-07-16 Thread Sage Weil
On Wed, 16 Jul 2014, Amit Tiwary wrote: Ceph Developers, we are using C++ librados (ceph 0.67.9) to interact with two ceph clusters. cluster1(test) has 30 osds whereas cluster2(production) has 936 osds. We are hitting extremely-high-unreasonable number of threads and memory usage on the

Re: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread Sage Weil
On Wed, 16 Jul 2014, Travis Rhoden wrote: Hi Andrija, I'm running a cluster with both CentOS and Ubuntu machines in it. I just did some upgrades to 0.80.4, and I can confirm that doing yum update ceph on the CentOS machine did result in having all OSDs on that machine restarted

some bugs when cluster name is not 'ceph'

2014-07-16 Thread yy
Hi, we found some bugs when cluster name is not 'ceph' in version 0.80.1:

diff --git a/src/ceph-disk b/src/ceph-disk
index f79e341..153e344 100755
--- a/src/ceph-disk
+++ b/src/ceph-disk
@@ -1611,6 +1611,8 @@ def start_daemon(
             [
                 svc,

Re: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread Gregory Farnum
On Wed, Jul 16, 2014 at 1:50 AM, James Harper ja...@ejbdigital.com.au wrote: Can you offer some comments on what the impact is likely to be to the data in an affected cluster? Should all data now be treated with suspicion and restored back to before the firefly upgrade? I am under the

Re: v0.80.4 Firefly released

2014-07-16 Thread Gregory Farnum
On Wed, Jul 16, 2014 at 2:22 AM, Christoph Hellwig h...@infradead.org wrote: On Tue, Jul 15, 2014 at 04:45:59PM -0700, Sage Weil wrote: This Firefly point release fixes a potential data corruption problem when ceph-osd daemons run on top of XFS and service Firefly librbd clients. A recently

Re: Read from clones

2014-07-16 Thread Gregory Farnum
FYI, this sounds like an issue the userspace client (and possibly the actual rbd class?) had as well: it looked at the HEAD parent_overlap field even when reading from snapshots. (I don't remember if parent_overlap is the actual parameter name, but you get the idea.) -Greg Software Engineer #42 @

Re: Read from clones

2014-07-16 Thread Ilya Dryomov
On Wed, Jul 16, 2014 at 10:19 PM, Gregory Farnum g...@inktank.com wrote: FYI, this sounds like an issue the userspace client (and possibly the actual rbd class?) had as well: it looked at the HEAD parent_overlap field even when reading from snapshots. (I don't remember if parent_overlap is the
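The bug Greg and Ilya are discussing comes down to which parent_overlap value is consulted: a read from a snapshot of a clone must use the overlap recorded for that snapshot, not the one on the HEAD image, since the HEAD overlap shrinks as the clone is flattened or overwritten. A minimal sketch of the underlying calculation, with illustrative names rather than the actual librbd code:

```cpp
#include <algorithm>
#include <cstdint>

// Length of the prefix of a read [offset, offset+len) that lies below the
// clone's parent overlap and must therefore be served from the parent image.
// `overlap` must be the overlap recorded for the snapshot being read, not
// the HEAD image's current overlap (illustrative helper, not the librbd API).
uint64_t bytes_from_parent(uint64_t offset, uint64_t len, uint64_t overlap) {
    uint64_t end = std::min(offset + len, overlap);
    return end > offset ? end - offset : 0;
}
```

Reading the HEAD's (smaller) overlap here is exactly the failure mode described: bytes that the snapshot should still fetch from the parent come back as zeros instead.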

Re: [RFC][PATCH] osd: Add local_connection to fast_dispatch in func _send_boot.

2014-07-16 Thread Gregory Farnum
I'm looking at this and getting a little confused. Can you provide a log of the crash occurring? (preferably with debug_ms=20, debug_osd=20) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Sun, Jul 13, 2014 at 8:17 PM, Ma, Jianpeng jianpeng...@intel.com wrote: When do
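The debug levels Greg asks for can be set in the [osd] section of ceph.conf before reproducing the crash (standard Ceph debug options, values matching his request):

```
[osd]
    debug ms = 20
    debug osd = 20
```

They can also be injected into a running daemon without a restart, e.g. with ceph tell osd.* injectargs '--debug-osd 20 --debug-ms 20'.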

Re: [ceph-users] EU mirror now supports rsync

2014-07-16 Thread David Moreau Simard
Hi, Thanks for making this available. I am currently synchronizing off of it and will make it available on our 4 Gbps mirror on the Canadian east coast by the end of this week. Are you able to share how you are synchronizing from the Ceph repositories? It would probably be better for us to

Re: v0.80.4 Firefly released

2014-07-16 Thread Dave Chinner
On Wed, Jul 16, 2014 at 10:26:23AM -0700, Gregory Farnum wrote: On Wed, Jul 16, 2014 at 2:22 AM, Christoph Hellwig h...@infradead.org wrote: On Tue, Jul 15, 2014 at 04:45:59PM -0700, Sage Weil wrote: This Firefly point release fixes a potential data corruption problem when ceph-osd daemons

Re: v0.80.4 Firefly released

2014-07-16 Thread Samuel Just
[Apologies for the repost, attachment was too big] Sorry for the delay. I've been trying to put together a simpler reproducer since no one wants to debug a filesystem based on rbd symptoms :). It doesn't appear to be related to using extsize on a non-empty file. The linked archive below has a

Re: [PATCH] mon: OSDMonitor: add osd pool get pool erasure_code_profile command

2014-07-16 Thread Sage Weil
Applied, thanks! sage On Thu, 17 Jul 2014, Ma, Jianpeng wrote: Enable us to obtain the erasure-code-profile for a given erasure-pool. Signed-off-by: Ma Jianpeng jianpeng...@intel.com --- src/mon/MonCommands.h | 2 +- src/mon/OSDMonitor.cc | 11 +++ 2 files changed, 12

Re: extremely-high-unreasonable number of threads and memory usage on the client side using librados C++ api

2014-07-16 Thread Amit Tiwary
Sage Weil sweil at redhat.com writes: Increasing the open file limit is the way to address this currently. The underlying problem is that we are creating threads to service each connection and there isn't a max open connection limit. The real fix is to reimplement the SimpleMessenger, but
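Sage's interim workaround, raising the open file limit, can also be done from inside the client process before connecting, without touching shell ulimits. A small sketch using plain POSIX setrlimit (generic code, not part of librados):

```cpp
#include <sys/resource.h>

// Raise the soft RLIMIT_NOFILE toward `target`, capped at the hard limit.
// Returns the soft limit in effect afterwards. With a thread (and socket)
// per OSD connection, a 936-OSD cluster needs far more than the usual
// default of 1024 descriptors.
rlim_t raise_nofile_limit(rlim_t target) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;
    rlim_t want = (rl.rlim_max == RLIM_INFINITY)
                      ? target
                      : (target < rl.rlim_max ? target : rl.rlim_max);
    if (want > rl.rlim_cur) {
        rl.rlim_cur = want;
        setrlimit(RLIMIT_NOFILE, &rl);
    }
    getrlimit(RLIMIT_NOFILE, &rl);   // re-read the limit actually granted
    return rl.rlim_cur;
}
```

Raising past the hard limit requires privileges, so the sketch caps at rlim_max; the thread count itself only goes away with the messenger reimplementation Sage mentions.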