Re: [ceph-users] Why is this pg incomplete?

2016-01-04 Thread Bryan Wright
Michael Kidd writes:
> If you can read the disk that was osd.102, you may wish to attempt this
> process to recover your data: https://ceph.com/community/incomplete-pgs-oh-my/
> Good luck!
Hi Michael, Thanks for the pointer. After looking at it, I'm wondering if the necessity to
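For anyone following along, the linked post's approach boils down to exporting the pg from the dead osd's disk and importing it elsewhere with ceph-objectstore-tool. A rough sketch, assuming the osd daemons involved are stopped and using example paths and the pg id from this thread (adjust ids and paths to your cluster; this is not a substitute for reading the post):

```shell
# Export pg 0.e8 from the disk that was osd.102 (daemon must be stopped):
ceph-objectstore-tool --op export --pgid 0.e8 \
    --data-path /var/lib/ceph/osd/ceph-102 \
    --journal-path /var/lib/ceph/osd/ceph-102/journal \
    --file /tmp/0.e8.export

# Import it into a surviving osd (osd.406 here as an example), then
# restart that osd and let peering pick the data up:
ceph-objectstore-tool --op import \
    --data-path /var/lib/ceph/osd/ceph-406 \
    --journal-path /var/lib/ceph/osd/ceph-406/journal \
    --file /tmp/0.e8.export
```

Take a backup of the export file before importing; an import into the wrong osd can make things worse rather than better.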

Re: [ceph-users] Why is this pg incomplete?

2016-01-04 Thread Bryan Wright
Gregory Farnum writes:
> I can't parse all of that output, but the most important and
> easiest-to-understand bit is:
> "blocked_by": [
> 102
> ],
>
> And indeed in the past_intervals section there are a bunch where it's
> just 102.
You

[ceph-users] Why is this pg incomplete?

2016-01-01 Thread Bryan Wright
Hi folks, "ceph pg dump_stuck inactive" shows:

0.e8  incomplete  [406,504]  406  [406,504]  406

Each of the osds above is alive and well, and idle. The output of "ceph pg 0.e8 query" is shown below. All of the osds it refers to are alive and well, with the exception of osd
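When digging through output like this, the recovery_state section of the pg query is usually the part that names the culprit. A small sketch of pulling just that section out (jq is an assumption here; any JSON filter works):

```shell
# "blocked_by" and "down_osds_we_would_probe" inside recovery_state are
# the fields that usually explain why a pg stays incomplete:
ceph pg 0.e8 query | jq '.recovery_state'
```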

Re: [ceph-users] Cephfs: large files hang

2016-01-01 Thread Bryan Wright
Gregory Farnum writes:
> Or maybe it's 0.9a, or maybe I just don't remember at all. I'm sure
> somebody recalls...
I'm still struggling with this. When copying some files from the ceph file system, it hangs forever. Here's some more data:
* Attempt to copy file. ceph
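A useful data point when a copy hangs is whether an osd is sitting on the op. A sketch of what to check (osd id is an example; pick the osd named in the blocked-request message):

```shell
# List in-flight ops on the suspect osd and how long they have waited:
ceph daemon osd.406 dump_ops_in_flight

# Cluster-wide, slow/blocked requests also show up here:
ceph health detail
```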

Re: [ceph-users] Cephfs: large files hang

2015-12-18 Thread Bryan Wright
Gregory Farnum writes:
> What's the full output of "ceph -s"?
>
> The only time the MDS issues these "stat" ops on objects is during MDS
> replay, but the bit where it's blocked on "reached_pg" in the OSD
> makes it look like your OSD is just very slow. (Which could
> potentially

Re: [ceph-users] Cephfs: large files hang

2015-12-18 Thread Bryan Wright
Gregory Farnum writes:
> Nonetheless, it's probably your down or incomplete PGs causing the
> issue. You can check that by seeing if seed 0.5d427a9a (out of that
> blocked request you mentioned) belongs to one of the dead ones.
> -Greg
Hi Greg, How would I find out which pg
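One way to answer that: the hex part of the seed in a blocked-request line is the object's hash, and for a power-of-two pg_num the pg id is just pool.(hash & (pg_num - 1)). A sketch assuming pool 0 with pg_num 256 (the real value comes from "ceph osd pool get <pool> pg_num"):

```shell
# Hash seed taken from the blocked request; pg_num=256 is an assumed example.
seed=$(( 16#5d427a9a ))
pg_num=256
# For power-of-two pg_num, ceph's stable-mod placement reduces to a bitmask:
printf "0.%x\n" $(( seed & (pg_num - 1) ))
# prints: 0.9a
```

With pg_num 256 this gives 0.9a, which happens to match Greg's earlier guess; once you have the pg id, "ceph pg <pgid> query" tells you whether it is one of the dead or incomplete ones.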

[ceph-users] Cephfs: large files hang

2015-12-17 Thread Bryan Wright
Hi folks, This is driving me crazy. I have a ceph filesystem that behaves normally when I "ls" files, and behaves normally when I copy smallish files on or off of the filesystem, but large files (~ GB size) hang after copying a few megabytes. This is ceph 0.94.5 under Centos 6.7 under kernel

[ceph-users] MDS stuck replaying

2015-12-15 Thread Bryan Wright
Hi folks, This morning, one of my MDSes dropped into "replaying":

mds cluster is degraded
mds.0 at 192.168.1.31:6800/12550 rank 0 is replaying journal

and the ceph filesystem seems to be unavailable to the clients. Is there any way to see the progress of this replay? I don't see any

Re: [ceph-users] MDS stuck replaying

2015-12-15 Thread Bryan Wright
John Spray writes:
> Anyway -- you'll need to do some local poking of the MDS to work out
> what the hold up is. Turn up MDS debug logging[1] and see what
> it's saying during the replay. Also, you can use performance counters
> "ceph daemon mds. perf dump" and see which are
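For anyone else stuck here, a sketch of the two checks John describes, run on the MDS host itself ("<id>" is a placeholder for your mds name; the admin socket must be reachable):

```shell
# Raise mds debug verbosity while watching the log during replay:
ceph daemon mds.<id> config set debug_mds 20

# Dump the performance counters; the journal/replay-related counters
# changing between successive dumps is a sign replay is making progress:
ceph daemon mds.<id> perf dump
```

Remember to drop debug_mds back down afterwards, since level 20 logging is very verbose.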

Re: [ceph-users] MDS stuck replaying

2015-12-15 Thread Bryan Wright
John Spray writes:
> If you haven't already, also
> check the overall health of the MDS host, e.g. is it low on
> memory/swapping?
For what it's worth, I've taken down some OSDs, and that seems to have allowed the MDS to finish replaying. My guess is that one of the OSDs was

[ceph-users] Replacing a disk: Best practices?

2014-10-15 Thread Bryan Wright
Hi folks, I recently had an OSD disk die, and I'm wondering what are the current best practices for replacing it. I think I've thoroughly removed the old disk, both physically and logically, but I'm having trouble figuring out how to add the new disk into ceph. For one thing,
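A sketch of the usual removal-and-replacement sequence from that era (osd id and device name are examples; ceph-disk was the current tool for the 2014 releases this thread concerns):

```shell
# Fully remove the dead osd from the cluster:
ceph osd out 102
ceph osd crush remove osd.102
ceph auth del osd.102
ceph osd rm 102

# Prepare and activate the replacement disk:
ceph-disk prepare /dev/sdX
ceph-disk activate /dev/sdX1
```

Expect two rounds of rebalancing if you "out" the osd before the crush remove; doing the crush remove first avoids the second round, at the cost of losing the old placement immediately.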

[ceph-users] feature set mismatch, missing 20000000000

2014-06-06 Thread Bryan Wright
Hi folks, Thanks to Sage Weil's advice, I fixed my TMAP2OMAP problem by just restarting the osds, but now I'm running into the following cephfs problem. When I try to mount the filesystem, I get errors like the following:

libceph: mon0 192.168.1.31:6789 feature set mismatch, my
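The missing-feature value in these messages is a bitmask, so decoding which bit is set tells you which feature the kernel client lacks. A small sketch for the value in this thread:

```shell
# Decode which single feature bit is set in the mismatch value 20000000000 (hex):
missing=$(( 16#20000000000 ))
bit=0
while [ "$missing" -gt 1 ]; do
    missing=$(( missing >> 1 ))
    bit=$(( bit + 1 ))
done
echo "missing feature bit: $bit"
# prints: missing feature bit: 41
```

Bit 41 corresponds, as far as I can tell from ceph's include/ceph_features.h, to CEPH_FEATURE_CRUSH_TUNABLES3, i.e. the chooseleaf_vary_r crush tunable, which fits the advice in the reply below.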

Re: [ceph-users] feature set mismatch,

2014-06-06 Thread Bryan Wright
Ilya Dryomov ilya.dryomov@... writes:
> Unless you have been playing with 'osd primary-affinity' command, the
> problem is probably that you have chooseleaf_vary_r tunable set in your
> crushmap. This is a new tunable, it will be supported in 3.15. If you
> disable it with ceph osd getcrushmap
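The full round-trip Ilya is describing looks roughly like this (file names are examples; editing the decompiled map is a manual step):

```shell
# Decompile the current crushmap to text:
ceph osd getcrushmap -o /tmp/crush.bin
crushtool -d /tmp/crush.bin -o /tmp/crush.txt

# Edit /tmp/crush.txt and set: tunable chooseleaf_vary_r 0

# Recompile and re-inject the edited map:
crushtool -c /tmp/crush.txt -o /tmp/crush.new
ceph osd setcrushmap -i /tmp/crush.new
```

Note that disabling the tunable trades kernel-client compatibility for the placement improvement it provides; upgrading the client kernel to 3.15+ is the alternative.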

[ceph-users] Spurious error about TMAP2OMAP from mds?

2014-06-05 Thread Bryan Wright
Hi folks, I just upgraded from 0.72 to 0.80, and everything seems fine with the exception of one mds, which refuses to start because one or more OSDs do not support TMAP2OMAP. Two other mdses are fine. I've checked the osd processes, and they are all version 0.80.1, and they were all
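A quick sanity check in this situation is to ask every osd daemon what code it is actually running, since a daemon that was never restarted after the upgrade would still report the old version and would explain the TMAP2OMAP complaint:

```shell
# Ask each running osd daemon for its version; any that still report
# 0.72.x have not picked up the upgraded binary:
ceph tell osd.* version
```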