Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-30 Thread James Eckersall
. Any further help is greatly appreciated. On 17 May 2017 at 10:58, James Eckersall <james.eckers...@gmail.com> wrote: > An update to this. The cluster has been upgraded to Kraken, but I've > still got the same PG reporting inconsistent and the same error message > about mds m

Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-17 Thread James Eckersall
it? I haven't been able to find any docs that explain. Thanks J On 3 May 2017 at 14:35, James Eckersall <james.eckers...@gmail.com> wrote: > Hi David, > > Thanks for the reply, it's appreciated. > We're going to upgrade the cluster to Kraken and see if that fixes the >

Re: [ceph-users] mds slow requests

2017-05-12 Thread James Eckersall
Hi, no I have not seen any log entries related to scrubs. I see slow requests for various operations including readdir, unlink. Sometimes rdlock, sometimes wrlock. On 12 May 2017 at 16:10, Peter Maloney <peter.malo...@brockmann-consult.de> wrote: > On 05/12/17 16:54, James Eckers

[ceph-users] mds slow requests

2017-05-12 Thread James Eckersall
Hi, We have an 11 node ceph cluster: 8 OSD nodes with 5 disks each and 3 MDS servers. Since upgrading from Jewel to Kraken last week, we are seeing the active MDS constantly reporting a number of slow requests > 30 seconds. The load on the Ceph servers is not excessive. None of the OSD disks
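As a rough sketch only, slow requests like these can usually be inspected on the active MDS through its admin socket; the daemon name mds01 below is a placeholder, and the exact set of admin socket commands varies between releases:

    # List requests currently in flight on the active MDS, with their age and state
    ceph daemon mds.mds01 dump_ops_in_flight

    # Show recently completed slow requests and general MDS counters
    ceph daemon mds.mds01 dump_historic_ops
    ceph daemon mds.mds01 perf dump

    # Check how many client sessions the MDS is serving
    ceph daemon mds.mds01 session ls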

Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-03 Thread James Eckersall
iggered during omap deletion typically > in a large directory which corresponds to an individual unlink in cephfs. > > If you can build a branch in github to get the newer ceph-osdomap-tool you > could try to use it to repair the omaps. > > David > > > On 5/2/17 5:05 AM, Jam
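For orientation only, an offline omap check with ceph-osdomap-tool looks roughly like the sketch below; the OSD id 12 and the store path are placeholders, and the repair command referred to above only exists in newer builds of the tool:

    # Stop the OSD before touching its omap store offline (systemd syntax for CentOS 7)
    systemctl stop ceph-osd@12

    # Check the omap (leveldb) store for consistency
    ceph-osdomap-tool --omap-path /var/lib/ceph/osd/ceph-12/current/omap --command check

    # Newer builds of the tool also offer a repair command
    ceph-osdomap-tool --omap-path /var/lib/ceph/osd/ceph-12/current/omap --command repair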

[ceph-users] cephfs metadata damage and scrub error

2017-05-02 Thread James Eckersall
Hi, I'm having some issues with a ceph cluster. It's an 8 node cluster running Jewel ceph-10.2.7-0.el7.x86_64 on CentOS 7. This cluster provides RBDs and a CephFS filesystem to a number of clients. ceph health detail is showing the following errors: pg 2.9 is active+clean+inconsistent, acting
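For context, a hedged sketch of the usual first steps on a Jewel cluster; pg 2.9 is taken from the health output above, mds01 is a placeholder, and repairing a PG in the metadata pool should only be attempted once the nature of the inconsistency is understood:

    # Show which objects in the PG are inconsistent and why
    rados list-inconsistent-obj 2.9 --format=json-pretty

    # Re-run a deep scrub and, if appropriate, ask Ceph to repair the PG
    ceph pg deep-scrub 2.9
    ceph pg repair 2.9

    # On the active MDS host, list the metadata damage the MDS has recorded
    ceph daemon mds.mds01 damage ls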

Re: [ceph-users] data balancing/crush map issue

2015-11-13 Thread James Eckersall
I've just discovered the hashpspool setting and found that it is set to false on all of my pools. I can't really work out what this setting does though. Can anyone please explain what this setting does and whether it would improve my situation? Thanks J On 11 November 2015 at 14:51, James
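For what it's worth, a brief sketch of inspecting and changing the flag; the pool name rbd is a placeholder, and enabling hashpspool on an existing pool rewrites PG placement, so expect significant data movement:

    # Show per-pool flags; pools using the improved placement hashing list "hashpspool" here
    ceph osd dump | grep '^pool'

    # Enable the flag on one pool (this will trigger a large rebalance)
    ceph osd pool set rbd hashpspool true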

[ceph-users] data balancing/crush map issue

2015-11-11 Thread James Eckersall
Hi, I have a Ceph cluster running on 0.80.10 and I'm having problems with the data balancing on two new nodes that were recently added. The cluster nodes look as follows: 6x OSD servers with 32 4TB SAS drives. The drives are configured with RAID0 in pairs, so 16 8TB OSD's per node. New
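A minimal sketch of how this kind of imbalance is usually investigated on Firefly; osd.42 and the weight 7.27 are placeholders:

    # Confirm the new nodes and their OSDs carry sensible CRUSH weights relative to the old ones
    ceph osd tree

    # Per-OSD usage on Firefly (ceph osd df only arrived in later releases)
    ceph pg dump osds

    # Nudge an over- or under-weighted OSD if the CRUSH weights look wrong
    ceph osd crush reweight osd.42 7.27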

Re: [ceph-users] No auto-mount of OSDs after server reboot

2015-01-30 Thread James Eckersall
I'm running Ubuntu 14.04 servers with Firefly and I don't have a sysvinit file, but I do have an upstart file. touch /var/lib/ceph/osd/ceph-XX/upstart should be all you need to do. That way, the OSD's should be mounted automatically on boot. On 30 January 2015 at 10:25, Alexis KOALLA
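As a sketch, with the OSD id 12 standing in for XX above:

    # Mark the OSD data directory as upstart-managed so it is activated at boot
    sudo touch /var/lib/ceph/osd/ceph-12/upstart

    # Start it now via upstart rather than waiting for a reboot
    sudo start ceph-osd id=12

    # Or start every OSD on the host
    sudo start ceph-osd-all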

Re: [ceph-users] monitor quorum

2014-09-19 Thread James Eckersall
:) J On 18 September 2014 10:24, James Eckersall james.eckers...@gmail.com wrote: Is anyone able to offer any advice on how to fix this? I've tried re-injecting the monmap into mon03 as that was mentioned in the mon troubleshooting docs, but that has not helped at all. mon03 is still stuck
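For reference, the extract/inject procedure from the monitor troubleshooting docs looks roughly like this on an Ubuntu/upstart system; the monitor names mon01/mon03 and the /tmp path are placeholders, and the daemons must be stopped while their stores are touched:

    # Extract the monmap from a healthy, stopped monitor
    stop ceph-mon id=mon01
    ceph-mon -i mon01 --extract-monmap /tmp/monmap
    start ceph-mon id=mon01

    # Copy /tmp/monmap to mon03, then inject it into the out-of-quorum monitor
    stop ceph-mon id=mon03
    ceph-mon -i mon03 --inject-monmap /tmp/monmap
    start ceph-mon id=mon03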

Re: [ceph-users] monitor quorum

2014-09-18 Thread James Eckersall
) election timer expired J On 17 September 2014 17:05, James Eckersall james.eckers...@gmail.com wrote: Hi, Now I feel dumb for jumping to the conclusion that it was a simple networking issue - it isn't. I've just checked connectivity properly and I can ping and telnet 6789 from all mon

[ceph-users] monitor quorum

2014-09-17 Thread James Eckersall
Hi, I have a ceph cluster running 0.80.1 on Ubuntu 14.04. I have 3 monitors and 4 OSD nodes currently. Everything has been running great up until today where I've got an issue with the monitors. I moved mon03 to a different switchport so it would have temporarily lost connectivity. Since then,

Re: [ceph-users] monitor quorum

2014-09-17 Thread James Eckersall
Hi, Thanks for the advice. I feel pretty dumb as it does indeed look like a simple networking issue. You know how you check things 5 times and miss the most obvious one... J On 17 September 2014 16:04, Florian Haas flor...@hastexo.com wrote: On Wed, Sep 17, 2014 at 1:58 PM, James Eckersall

Re: [ceph-users] monitor quorum

2014-09-17 Thread James Eckersall
be done to fix this. With hindsight, I would have stopped the mon service before relocating the nic cable, but I expected the mon to survive a short network outage which it doesn't seem to have done :( On 17 September 2014 16:21, James Eckersall james.eckers...@gmail.com wrote: Hi, Thanks

[ceph-users] ceph cluster expansion

2014-08-13 Thread James Eckersall
Hi, I'm looking for some advice on my ceph cluster. The current setup is as follows: 3 mon servers, 4 storage servers with the following spec: 1x Intel Xeon E5-2640 @ 2.50GHz, 6 cores (12 with hyperthreading), 64GB DDR3 RAM, 2x SSDSC2BB080G4 for OS, LSI MegaRAID 9260-16i with the following drives:

Re: [ceph-users] ceph cluster expansion

2014-08-13 Thread James Eckersall
On 13 August 2014 10:28, Christian Balzer ch...@gol.com wrote: Hello, On Wed, 13 Aug 2014 09:15:34 +0100 James Eckersall wrote: Hi, I'm looking for some advice on my ceph cluster. The current setup is as follows: 3 mon servers 4 storage servers with the following spec: 1x

Re: [ceph-users] ceph cluster expansion

2014-08-13 Thread James Eckersall
, Christian Balzer ch...@gol.com wrote: On Wed, 13 Aug 2014 12:47:22 +0100 James Eckersall wrote: Hi Christian, We're actually using the following chassis: http://rnt.de/en/bf_xxlarge.html Ah yes, one of the Backblaze heritage. But rather more well designed and thought through than most

Re: [ceph-users] GPF kernel panics

2014-08-04 Thread James Eckersall
=hosting_windows_sharedweb, allow rwx pool=infra_systems, allow rwx pool=hosting_linux_sharedweb, allow rwx pool=test Thanks J On 1 August 2014 01:17, Brad Hubbard bhubb...@redhat.com wrote: On 07/31/2014 06:37 PM, James Eckersall wrote: Hi, The stacktraces are very similar. Here is another one
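For illustration, checking and adjusting the caps on the client key used by the mapping host looks roughly like this; client.hosting is a placeholder name, and ceph auth caps replaces the whole cap string, so every required pool must be listed again:

    # Show the caps currently assigned to the client key
    ceph auth get client.hosting

    # Rewrite the caps (include every pool the client still needs)
    ceph auth caps client.hosting \
        mon 'allow r' \
        osd 'allow rwx pool=hosting_windows_sharedweb, allow rwx pool=hosting_linux_sharedweb'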

Re: [ceph-users] GPF kernel panics

2014-07-31 Thread James Eckersall
. Thanks J On 31 July 2014 09:12, Ilya Dryomov ilya.dryo...@inktank.com wrote: On Thu, Jul 31, 2014 at 11:44 AM, James Eckersall james.eckers...@gmail.com wrote: Hi, I've had a fun time with ceph this week. We have a cluster with 4 OSD servers (20 OSD's per server), 3 mons and a server mapping

Re: [ceph-users] GPF kernel panics

2014-07-31 Thread James Eckersall
the maximum amount of kernel mappings, which is somewhat shy of 250 in any kernel below 3.14? If you can easily upgrade to 3.14 see if that fixes it. Christian On Thu, 31 Jul 2014 09:37:05 +0100 James Eckersall wrote: Hi, The stacktraces are very similar. Here is another one

Re: [ceph-users] GPF kernel panics

2014-07-31 Thread James Eckersall
ish and counting). Now to figure out the best way to get a 3.14 kernel in Ubuntu Trusty :) On 31 July 2014 10:23, Christian Balzer ch...@gol.com wrote: On Thu, 31 Jul 2014 10:13:11 +0100 James Eckersall wrote: Hi, I thought the limit was in relation to ceph and that 0.80+ fixed
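One hedged route on Trusty is the HWE (LTS enablement) kernel stack rather than a hand-built 3.14: the lts-utopic stack ships 3.16, which already includes the post-3.14 krbd changes. The package names below are the standard Trusty HWE metapackages:

    # Check the running kernel
    uname -r

    # Install the Utopic HWE kernel (3.16) and reboot into it
    sudo apt-get update
    sudo apt-get install linux-generic-lts-utopic
    sudo reboot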

[ceph-users] health_err on osd full

2014-07-18 Thread James Eckersall
Hi, I have a ceph cluster running on 0.80.1 with 80 OSD's. I've had fairly uneven distribution of the data and have been keeping it ticking along with ceph osd reweight XX 0.x commands on a few OSD's while I try and increase the pg count of the pools to hopefully better balance the data.
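A rough sketch of the commands involved on Firefly; the OSD id 17, pool name volumes, pg counts and ratios are placeholders, pgp_num must follow pg_num before data actually moves, and raising the full ratio only buys temporary headroom:

    # Push some data off the fullest OSD
    ceph osd reweight 17 0.85

    # Raise the placement group count of a pool, then pgp_num so the new PGs are placed
    ceph osd pool set volumes pg_num 1024
    ceph osd pool set volumes pgp_num 1024

    # Emergency measure only: raise the full ratio slightly to clear HEALTH_ERR while rebalancing
    ceph pg set_full_ratio 0.97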

Re: [ceph-users] health_err on osd full

2014-07-18 Thread James Eckersall
- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Gregory Farnum Sent: 18 July 2014 23:25 To: James Eckersall Cc: ceph-users Subject: Re: [ceph-users] health_err on osd full Yes, that's expected behavior. Since the cluster can't move data around on its own, and lots

Re: [ceph-users] health_err on osd full

2014-07-18 Thread James Eckersall
Thanks Greg. I appreciate the advice, and very quick replies too :) On 18 July 2014 23:35, Gregory Farnum g...@inktank.com wrote: On Fri, Jul 18, 2014 at 3:29 PM, James Eckersall james.eckers...@fasthosts.com wrote: Thanks Greg. Can I suggest that the documentation makes this much

Re: [ceph-users] logrotate

2014-07-11 Thread James Eckersall
of my osd's. My mon's all have the done file and logrotate is working fine for those. So my question is, what is the purpose of the done file and should I just create one for each of my osd's? On 10 July 2014 11:10, James Eckersall james.eckers...@gmail.com wrote: Hi, I've just upgraded
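Assuming the missing done marker really is the cause, a minimal sketch of creating it for every OSD on a host and verifying rotation afterwards:

    # Create the marker file the logrotate script checks for
    for osd in /var/lib/ceph/osd/ceph-*; do
        sudo touch "$osd/done"
    done

    # Force a rotation and confirm the OSDs reopen their current log files
    sudo logrotate -f /etc/logrotate.d/ceph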

Re: [ceph-users] logrotate

2014-07-11 Thread James Eckersall
July 2014 15:04, Sage Weil sw...@redhat.com wrote: On Fri, 11 Jul 2014, James Eckersall wrote: Upon further investigation, it looks like this part of the ceph logrotate script is causing me the problem: if [ -e /var/lib/ceph/$daemon/$f/done ] && [ -e /var/lib/ceph/$daemon/$f/upstart

[ceph-users] logrotate

2014-07-10 Thread James Eckersall
Hi, I've just upgraded a ceph cluster from Ubuntu 12.04 with 0.72.1 to Ubuntu 14.04 with 0.80.1. I've noticed that the log rotation doesn't appear to work correctly. The OSD's are just not logging to the current ceph-osd-X.log file. If I restart the OSD's or run service ceph-osd reload id=X,

[ceph-users] logrotate

2014-07-10 Thread James Eckersall
Hi, I've just upgraded a ceph cluster from Ubuntu 12.04 with 0.73.1 to Ubuntu 14.04 with 0.80.1. I've noticed that the log rotation doesn't appear to work correctly. The OSD's are just not logging to the current ceph-osd-X.log file. If I restart the OSD's, they start logging, but then overnight,

Re: [ceph-users] rbd watchers

2014-05-22 Thread James Eckersall
that haven't been deleted yet. You can see the snapshots with rbd snap list image. On Tue, May 20, 2014 at 4:26 AM, James Eckersall james.eckers...@gmail.com wrote: Hi, I'm having some trouble with an rbd image. I want to rename the current rbd and create a new rbd with the same name. I

[ceph-users] rbd watchers

2014-05-20 Thread James Eckersall
Hi, I'm having some trouble with an rbd image. I want to rename the current rbd and create a new rbd with the same name. I renamed the rbd with rbd mv, but it was still mapped on another node, so rbd mv gave me an error that it was unable to remove the source. I then unmapped the original
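For context, the usual sequence on a 0.80-era cluster looks roughly like the sketch below; the pool rbd, image name myimage and device /dev/rbd0 are placeholders, and the header object name shown follows the format 1 convention (<image>.rbd), which differs for format 2 images:

    # On the node that still has the image mapped, unmap the old device
    rbd unmap /dev/rbd0

    # See which clients are still watching the image header
    rados -p rbd listwatchers myimage.rbd

    # Snapshots must be unprotected and purged before the image can be removed
    rbd snap purge rbd/myimage
    rbd rm rbd/myimage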