Re: [ceph-users] Extremely high OSD memory utilization on Kraken 11.2.0 (with XFS -or- bluestore)

2017-05-15 Thread Aaron Ten Clay
er to get backtraces out of the core > you need the matching executables. Can you make sure the ceph-osd-dbg or > ceph-debuginfo package is installed on the machine (depending on if it's > deb or rpm) and then gdb ceph-osd corefile and 'thr app all bt'? > > Thanks! > sag
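A sketch of the sequence Sage describes, with the usual package names and ceph-osd path assumed rather than taken from the thread:

# deb-based systems:
$ sudo apt-get install ceph-osd-dbg
# rpm-based systems:
$ sudo yum install ceph-debuginfo
# load the core against the matching ceph-osd binary and dump every thread's backtrace
$ gdb /usr/bin/ceph-osd /path/to/corefile
(gdb) thread apply all bt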

Re: [ceph-users] Extremely high OSD memory utilization on Kraken 11.2.0 (with XFS -or- bluestore)

2017-05-04 Thread Aaron Ten Clay
Were the backtraces we obtained not useful? Is there anything else we can try to get the OSDs up again? On Wed, Apr 19, 2017 at 4:18 PM, Aaron Ten Clay wrote: > I'm new to doing this all via systemd and systemd-coredump, but I appear to > have gotten cores from two OSD processes. W
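For reference, a minimal way to pull those cores back out of systemd-coredump (the PID shown is a placeholder):

$ coredumpctl list ceph-osd
$ coredumpctl dump 12345 --output=ceph-osd.core
# or open it directly in gdb:
$ coredumpctl gdb 12345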

Re: [ceph-users] Extremely high OSD memory utilization on Kraken 11.2.0 (with XFS -or- bluestore)

2017-04-19 Thread Aaron Ten Clay
dump# ceph -v ceph version 11.2.0 (f223e27eeb35991352ebc1f67423d4ebc252adb7) I am also investigating sysdig as recommended. Thanks! -Aaron On Mon, Apr 17, 2017 at 8:15 AM, Sage Weil wrote: > On Sat, 15 Apr 2017, Aaron Ten Clay wrote: > > Hi all, > > > > Our cluster is exp

Re: [ceph-users] Extremely high OSD memory utilization on Kraken 11.2.0 (with XFS -or- bluestore)

2017-04-16 Thread Aaron Ten Clay
process was about 4.2GiB. https://pastebin.com/nLQ8Jpwt Thanks again for the insight! -Aaron On Sat, Apr 15, 2017 at 10:34 AM, Aaron Ten Clay wrote: > Thanks for the recommendation, Bob! I'll try to get this data later today > and reply with it. > > -Aaron > > On Sat,
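The pastebinned figures were presumably produced with the tcmalloc heap profiler built into the OSDs; a sketch of how it is typically driven (the osd id is a placeholder, and the .heap files usually land in the OSD's log directory):

$ ceph tell osd.14 heap start_profiler
$ ceph tell osd.14 heap dump
$ ceph tell osd.14 heap stats
$ ceph tell osd.14 heap stop_profiler
# the resulting *.heap files can then be inspected with pprof from google-perftools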

Re: [ceph-users] Extremely high OSD memory utilization on Kraken 11.2.0 (with XFS -or- bluestore)

2017-04-15 Thread Aaron Ten Clay
memory-profiling/ > > Bob > > On Sat, Apr 15, 2017 at 5:39 AM, Peter Maloney consult.de> wrote: > >> How many PGs do you have? And did you change any config, like mds cache >> size? Show your ceph.conf. >> >> >> On 04/15/17 07:34, Aaron Ten Clay w

Re: [ceph-users] Extremely high OSD memory utilization on Kraken 11.2.0 (with XFS -or- bluestore)

2017-04-15 Thread Aaron Ten Clay
t.de> wrote: > How many PGs do you have? And did you change any config, like mds cache > size? Show your ceph.conf. > > > On 04/15/17 07:34, Aaron Ten Clay wrote: > > Hi all, > > Our cluster is experiencing a very odd issue and I'm hoping for some > guidance on t

[ceph-users] Extremely high OSD memory utilization on Kraken 11.2.0 (with XFS -or- bluestore)

2017-04-15 Thread Aaron Ten Clay
Hi all, Our cluster is experiencing a very odd issue and I'm hoping for some guidance on troubleshooting steps and/or suggestions to mitigate the issue. tl;dr: Individual ceph-osd processes try to allocate > 90GiB of RAM and are eventually nuked by oom_killer. I'll try to explain the situation in

Re: [ceph-users] Group permission problems with CephFS

2015-11-06 Thread Aaron Ten Clay
I'm seeing similar behavior as well. -rw-rw-r-- 1 testuser testgroup 6 Nov 6 07:41 testfile aaron@testhost$ groups ... testgroup ... aaron@testhost$ cat > testfile -bash: testfile: Permission denied Running version 9.0.2. Were you able to make any progress on this? Thanks, -Aaron On Tue, Aug

Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread Aaron Ten Clay
to fix it would be to take that osd down/out and let recovery > regenerate the chunk. Remove the pg from the osd (ceph-objectstore-tool) > and then you can bring the osd back up/in. > > David > > > On 8/28/15 11:06 AM, Samuel Just wrote: > >> David, does this look
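A sketch of the down/out, remove-the-copy, up/in cycle David outlines; the osd id, paths, and pgid are placeholders, and the OSD must be stopped before ceph-objectstore-tool will touch its store:

$ ceph osd out 12
$ sudo systemctl stop ceph-osd@12        # or whatever init script the distro uses
$ sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --journal-path /var/lib/ceph/osd/ceph-12/journal \
      --pgid 3.2as0 --op remove          # EC shards carry an sN suffix on the OSD
$ sudo systemctl start ceph-osd@12
$ ceph osd in 12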

[ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread Aaron Ten Clay
Hi Cephers, I'm trying to resolve an inconsistent pg on an erasure-coded pool, running Ceph 9.0.2. I can't seem to get Ceph to run a repair or even deep-scrub the pg again. Here's the background, with my attempted resolution steps below. Hopefully someone can steer me in the right direction. Thank
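For anyone landing on this thread later, the commands involved are simple even when, as here, the cluster refuses to act on them (the pgid is a placeholder):

$ ceph health detail            # lists the inconsistent PG(s)
$ ceph pg deep-scrub 3.2a       # request a fresh deep scrub
$ ceph pg repair 3.2a           # ask the primary to repair the PG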

Re: [ceph-users] Recovering from multiple OSD failures

2015-06-05 Thread Aaron Ten Clay
A904 C70E E654 3BB2 FA62 B9F1 > > > On Thu, Jun 4, 2015 at 10:26 PM, Aaron Ten Clay wrote: > > Hi Cephers, > > > > I recently had a power problem and the entire cluster was brought down, > came > > up, went down, and came up again. Afterward, 3 OSDs were most

[ceph-users] Recovering from multiple OSD failures

2015-06-04 Thread Aaron Ten Clay
Hi Cephers, I recently had a power problem and the entire cluster was brought down, came up, went down, and came up again. Afterward, 3 OSDs were mostly dead (HDD failures). Luckily (I think) the drives were alive enough that I could copy the data off and leave the journal alone. Since my pool "d

[ceph-users] Inconsistent PGs because 0 copies of objects...

2015-05-11 Thread Aaron Ten Clay
ect to CephFS file is to read the xattrs on the 0th stripe object and pick out the strings.) Thanks in advance for any suggestions/pointers! -- Aaron Ten Clay http://www.aarontc.com/
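The xattr trick mentioned above looks roughly like this; the object name is made up, and the assumption is that the first stripe object of a file carries a 'parent' backtrace xattr:

$ rados -p data listxattr 10000000abc.00000000
$ rados -p data getxattr 10000000abc.00000000 parent > parent.bin
$ strings parent.bin            # the path components of the owning file appear as plain strings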

Re: [ceph-users] Ceph File System Question

2015-01-27 Thread Aaron Ten Clay
On Tue, Jan 27, 2015 at 6:13 AM, John Spray wrote: > Raj, > > The note is still valid, but the filesystem is getting more stable all the > time. Some people are using it, especially in an active/passive > configuration with a single active MDS. If you do choose to do some > testing, use the mos

Re: [ceph-users] [Ceph-community] ceph replication and striping

2014-08-26 Thread Aaron Ten Clay
On Tue, Aug 26, 2014 at 5:07 AM, wrote: > Hello all, > > > > I have configured a ceph storage cluster. > > > > 1. I created the volume .I would like to know that replication of data > will happen automatically in ceph ? > > 2. how to configure striped volume using ceph ? > > > > > > Regards, >
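The reply itself isn't captured in this excerpt, but the knobs being asked about boil down to the pool replica count and, for RBD images, the striping parameters; names and numbers below are only examples:

$ ceph osd pool set mypool size 3          # keep three copies of every object
$ ceph osd pool set mypool min_size 2
$ rbd create myimage --pool mypool --size 10240 \
      --image-format 2 --stripe-unit 65536 --stripe-count 4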

[ceph-users] Finding CephFS file from object ID

2014-07-22 Thread Aaron Ten Clay
Hi Cephers, I'm trying to recover from an inconsistent object issue. I know which object is inconsistent across its two replicas, but I'm having difficulty determining which of the three copies is correct. Is there an easy way to determine which file in CephFS the object is a part of? (I know how

Re: [ceph-users] Error in documentation

2014-06-19 Thread Aaron Ten Clay

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Aaron Ten Clay
On Mon, Jun 16, 2014 at 11:16 AM, Gregory Farnum wrote: > On Mon, Jun 16, 2014 at 11:11 AM, Aaron Ten Clay > wrote: > > I would also like to see Ceph get smarter about inconsistent PGs. If we > > can't automate the repair, at least the "ceph pg repair" command

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Aaron Ten Clay
to its own local log > >> file. You'll need to identify for yourself which version is correct, > >> which will probably involve going and looking at them inside each > >> OSD's data store. If the primary is correct for all the objects in a > >> PG,

[ceph-users] Fixing inconsistent placement groups

2014-06-12 Thread Aaron Ten Clay
I'm having trouble finding a concise set of steps to repair inconsistent placement groups. I know from other threads that issuing a 'ceph pg repair ...' command could cause loss of data integrity if the primary OSD happens to have the bad copy of the placement group. I know how to find which PG's a
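For reference, a minimal way to locate the inconsistent PGs and see which OSDs hold them before deciding whether a repair is safe (the pgid is a placeholder):

$ ceph health detail | grep inconsistent
$ ceph pg dump | grep inconsistent
$ ceph pg map 3.2a              # shows the acting set and which OSD is primary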

Re: [ceph-users] PG Selection Criteria for Deep-Scrub

2014-05-20 Thread Aaron Ten Clay
For what it's worth, version 0.79 has different headers, and the awk command needs $19 instead of $20. But here is the output I have on a small cluster that I recently rebuilt: $ ceph pg dump all | grep active | awk '{ print $19}' | sort -k1 | uniq -c dumped all in format plain 1 2014-05-15
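Because the timestamp column shifts between releases, it is worth printing the header first and confirming which field to hand to awk; the pipeline itself is the one quoted above:

$ ceph pg dump all | head -2                 # find the deep-scrub timestamp column
$ ceph pg dump all | grep active | awk '{ print $19 }' | sort -k1 | uniq -c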

Re: [ceph-users] Deep-Scrub Scheduling

2014-05-07 Thread Aaron Ten Clay
Mike, You can find the last scrub info for a given PG with "ceph pg x.yy query". -Aaron On Wed, May 7, 2014 at 8:47 PM, Mike Dawson wrote: > Perhaps, but if that were the case, would you expect the max concurrent > number of deep-scrubs to approach the number of OSDs in the cluster? > > I have
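For example (the pgid is a placeholder; the scrub stamps appear in the query output):

$ ceph pg 3.2a query | grep -i scrub_stamp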

Re: [ceph-users] [Ceph-community] How to install CEPH on CentOS 6.3

2014-05-07 Thread Aaron Ten Clay
On Tue, May 6, 2014 at 7:35 PM, Ease Lu wrote: > Hi All, > As following the CEPH online document, I tried to install CEPH on > centos 6.3: > > The step: ADD CEPH > I cannot find centos distro, so I used el6. when I reach the > "INSTALL VIRTUALIZATION FOR BLOCK DEVICE" step, I got:

Re: [ceph-users] CephFS feature set mismatch with v0.79 and recent kernel

2014-04-08 Thread Aaron Ten Clay
On Tue, Apr 8, 2014 at 4:50 PM, Michael Nelson wrote: > I am trying to mount CephFS from a freshly installed v0.79 cluster using a > kernel built from git.kernel.org:kernel/git/sage/ceph-client.git > (for-linus a30be7cb) and running into the following dmesg errors on mount: > > libceph: mon0 198.1

Re: [ceph-users] OSDs crashing frequently

2014-03-31 Thread Aaron Ten Clay
Well that was quick! osd.0 crashed already, here's the log (~20 MiB): http://www.aarontc.com/logs/ceph-osd.0.log.bz2 I updated the bug report as well. Thanks, -Aaron On Mon, Mar 31, 2014 at 2:16 PM, Aaron Ten Clay wrote: > Greg, > > I'm in the process of doing so n

Re: [ceph-users] OSDs crashing frequently

2014-03-31 Thread Aaron Ten Clay
/inktank.com | http://ceph.com > > > On Mon, Mar 31, 2014 at 1:22 PM, Aaron Ten Clay > wrote: > > Hello fellow Cephers! > > > > Recently, before and after the update from 0.77 to 0.78, about half the > OSDs > > in my cluster crash quite frequently with 

[ceph-users] OSDs crashing frequently

2014-03-31 Thread Aaron Ten Clay
Hello fellow Cephers! Recently, before and after the update from 0.77 to 0.78, about half the OSDs in my cluster crash quite frequently with 'osd/PG.cc: 5255: FAILED assert(0 == "we got a bad state machine event")' I'm not sure if this is a bug (there are some similar-sounding reports in Redmine

Re: [ceph-users] Gentoo & ceph 0.67 & pg stuck After fresh Installation

2014-01-30 Thread Aaron Ten Clay
led versions: 0.67{tbz2}(00:54:50 01/08/14)(fuse -debug -gtk >>> -libatomic -radosgw -static-libs -tcmalloc) >>> cluster name is vmsys, servers are dp1 and dp2 >>> config: >>> >>> [global] >>> auth cluster required = none >>> auth s

Re: [ceph-users] rbd client affected with only one node down

2014-01-21 Thread Aaron Ten Clay

Re: [ceph-users] Gentoo & ceph 0.67 & pg stuck After fresh Installation

2014-01-10 Thread Aaron Ten Clay
> Best regards > And thank you in advance > > Philipp Strobl > -- Aaron Ten Clay http://www.aarontc.com/

Re: [ceph-users] Ceph as offline S3 substitute and peer-to-peer fileshare?

2014-01-02 Thread Aaron Ten Clay

Re: [ceph-users] Problem with "monmaptool: invalid ip"

2013-12-18 Thread Aaron Ten Clay
ch. Hope that helps! -Aaron On Wed, Dec 18, 2013 at 10:32 PM, Yuri Weinstein wrote: > Wow!!! > I tried everything but not this! > I will give it a try. > > However it does sound strange that only for this value requirements differ. > > Why? > > Regards > > >

Re: [ceph-users] Problem with "monmaptool: invalid ip"

2013-12-18 Thread Aaron Ten Clay
What does it mean and how to work around this? > > Thx! > -- Aaron Ten Clay http://www.aarontc.com/

Re: [ceph-users] alternative approaches to CEPH-FS

2013-11-14 Thread Aaron Ten Clay
I've been using CephFS for a meager 40TB store of video clips for editing, from Dumpling to Emperor, and (fingers crossed) so far I haven't had any problems. The only disruption I've seen is that the metadata server will crash every couple of days, and one of the standby MDS will pick up. The repla

Re: [ceph-users] Gentoo ceph-deploy

2013-11-11 Thread Aaron Ten Clay
lping as well. I currently maintain ebuilds for the latest Ceph versions at an overlay called Nextoo, if anyone is interested: https://github.com/nextoo/portage-overlay/tree/master/sys-cluster/ceph I'm happy to help with other Gentoo-related Ceph development as well :) -- Aaron Ten Clay http://www.aarontc.com/

Re: [ceph-users] Ceph monitor problems

2013-10-30 Thread Aaron Ten Clay
On Wed, Oct 30, 2013 at 1:43 PM, Joao Eduardo Luis wrote: > > A quorum of 2 monitors is completely fine as long as both monitors are up. > A quorum is always possible regardless of how many monitors you have, as > long as a majority is up and able to form it (1 out of 1, 2 out of 2, 2 out > of 3,

Re: [ceph-users] Ceph monitor problems

2013-10-30 Thread Aaron Ten Clay
It sounds like you tried to go from 1 monitor to 2 monitors, which is an unsupported configuration as far as I am aware. You must have either 1, or 3 or more monitors for a quorum to be possible. More information is available here: http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ On

Re: [ceph-users] Public/Cluster addr how to

2013-10-17 Thread Aaron Ten Clay
On Thu, Oct 17, 2013 at 3:24 AM, NEVEU Stephane < stephane.ne...@thalesgroup.com> wrote: > Thank you Gilles, I actually have other servers running on the same > networks so can’t I just set these particular 3 IPs ? > Your servers need to have the IP addresses assigned already. The daemons will fig
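A minimal ceph.conf sketch of the idea, with example subnets: declare the networks once, and each daemon picks its own address from whichever local interface falls inside them.

[global]
    public network  = 10.42.6.0/24
    cluster network = 10.42.7.0/24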

Re: [ceph-users] Weird behavior of PG distribution

2013-10-01 Thread Aaron Ten Clay
On Tue, Oct 1, 2013 at 10:11 AM, Chen, Ching-Cheng (KFRM 1) < chingcheng.c...@credit-suisse.com> wrote: > Mike: > > Thanks for the reply. > > However, I did the crushtool command but the output doesn't give me any > obvious explanation why osd.4 should be the primary OSD for PGs. > > All the rule

Re: [ceph-users] Can't mount CephFS - where to start troubleshooting?

2013-09-28 Thread Aaron Ten Clay
lity just yet. Changing this flag to false allows the CephFS to be mounted by 3.5.0 and 3.10.7 without a problem. There is probably a good opportunity to add additional error logging to the in-kernel client here as well. Thanks for the help! -Aaron On Fri, Sep 27, 2013 at 2:53 PM, Aaron Ten Cl

Re: [ceph-users] Can't mount CephFS - where to start troubleshooting?

2013-09-27 Thread Aaron Ten Clay
On Fri, Sep 27, 2013 at 2:44 PM, Gregory Farnum wrote: > What is the output of ceph -s? It could be something underneath the > filesystem. > > root@chekov:~# ceph -s cluster 18b7cba7-ccc3-4945-bb39-99450be81c98 health HEALTH_OK monmap e3: 3 mons at {chekov= 10.42.6.29:6789/0,laforge=10.42

[ceph-users] Can't mount CephFS - where to start troubleshooting?

2013-09-27 Thread Aaron Ten Clay
Hi, I probably did something wrong setting up my cluster with 0.67.3. I previously built a cluster with 0.61 and everything went well, even after an upgrade to 0.67.3. Now I built a fresh 0.67.3 cluster and when I try to mount CephFS: aaron@seven ~ $ sudo mount -t ceph 10.42.6.21:/ /mnt/ceph moun

Re: [ceph-users] CephFS Pool Specification?

2013-09-26 Thread Aaron Ten Clay
On Wed, Sep 25, 2013 at 8:44 PM, Sage Weil wrote: > On Wed, 25 Sep 2013, Aaron Ten Clay wrote: > > Hi all, > > > > Does anyone know how to specify which pool the mds and CephFS data will > be > > stored in? > > > > After creating a new cluster, t

[ceph-users] CephFS Pool Specification?

2013-09-25 Thread Aaron Ten Clay
Hi all, Does anyone know how to specify which pool the mds and CephFS data will be stored in? After creating a new cluster, the pools "data", "metadata", and "rbd" all exist but with pg count too small to be useful. The documentation indicates the pg count can be set only at pool creation time, s
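A sketch of the usual approach, with example pool names and pg counts: create the pools with a sensible pg_num up front, then point the filesystem at them (on releases newer than the 0.67 discussed here, that last step is 'ceph fs new'):

$ ceph osd pool create cephfs_data 512
$ ceph osd pool create cephfs_metadata 128
$ ceph fs new cephfs cephfs_metadata cephfs_data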