On Thu, Oct 8, 2015 at 11:41 AM, Burkhard Linke
<burkhard.li...@computational.bio.uni-giessen.de> wrote:
> Hi John,
>
> On 10/08/2015 12:05 PM, John Spray wrote:
>>
>> On Thu, Oct 8, 2015 at 10:21 AM, Burkhard Linke
>> <burkhard.li...@computational.bio.uni-giessen.de> wrote:
>>>
>>> Hi,
>
> *snipsnap*
>>>
>>>
>>> I've moved all files from a CephFS data pool (EC pool with frontend cache
>>> tier) in order to remove the pool completely.
>>>
>>> Some objects are left in the pools ('ceph df' output of the affected
>>> pools):
>>>
>>>      cephfs_ec_data           19      7565k         0 66288G           13
>>>
>>> Listing the objects and the readable part of their 'parent' attribute:
>>>
>>> # for obj in $(rados -p cephfs_ec_data ls); do echo $obj; rados -p
>>> cephfs_ec_data getxattr $obj parent | strings; done
>>> 10000f6119f.00000000
>>> 10000f6119f
>>> stray9
>>> 10000f63fe5.00000000
>>> 10000f6119f
>>> stray9
>>> 10000f61196.00000000
>>> 10000f6119f
>>> stray9
>>> .......
>
>
> *snipsnap*
>>
>>
>> Well, they're strays :-)
>>
>> You get stray dentries when you unlink files.  They hang around until the
>> inode is ready to be purged, or, if there are hard links, until something
>> prompts ceph to "reintegrate" the stray into a new path.
>
> Thanks for the fast reply. During the transfer of all files from the EC pool
> to a standard replicated pool I copied each file to a new name, removed the
> original one and renamed the copy. There might have been some processes with
> open files at that time, which might explain the stray objects.
>
> I've also been able to locate some processes that might be the reason for
> these leftover files. I've terminated these processes, but the objects are
> still present in the pool. How long does purging an inode usually take?

If nothing is holding a file open, it'll start purging within a couple
of journal-latencies of the unlink (i.e. pretty darn quick), and it'll
take as long to purge as there are objects in the file (again, pretty
darn quick for normal-sized files and a non-overloaded cluster).
Chances are if you're noticing strays, they're stuck for some reason.
You're probably on the right track looking for processes holding files
open.
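
If you want to watch progress, a quick sketch (just an illustration, using
the pool name from your listing above) is to watch the leftover object count:

# watch -n 10 'rados -p cephfs_ec_data ls | wc -l'

If that count doesn't move for a long while, the strays are stuck rather
than just slow.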

>> You don't say what version you're running, so it's possible you're
>> running an older version (pre hammer, I think) where you're
>> experiencing either a bug holding up deletion (we've had a few) or a
>> bug preventing reintegration (we had one of those too).  The bugs
>> holding up deletion can usually be worked around with some client
>> and/or mds restarts.
>
> The cluster is running on hammer. I'm going to restart the mds to try to get
> rid of these objects.

OK, let us know how it goes.  You may find the num_strays,
num_strays_purging, num_strays_delayed performance counters (ceph
daemon mds.<foo> perf dump) useful.
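
For example, to pull out just those counters (a rough sketch; it assumes
python is available to pretty-print the JSON so a plain grep works, and
"<id>" is your MDS name):

# ceph daemon mds.<id> perf dump | python -mjson.tool | grep num_strays

num_strays_purging should be non-zero while purging is actually happening;
a large num_strays with nothing purging suggests something is holding them.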

>> It isn't safe to remove the pool in this state.  The MDS is likely to
>> crash if it eventually gets around to trying to purge these files.
>
> That's bad. Does the mds provide a way to get more information about these
> files, e.g. which client is blocking purging? We have about 3 hosts working
> on CephFS, and checking every process might be difficult.

If a client has caps on an inode, you can find out about it by dumping
(the whole!) cache from a running MDS.  We have tickets for adding a
more surgical version of this[1], but for now it's a bit of a heavyweight
thing.  You can do JSON ("ceph daemon mds.<id> dump cache > foo.json")
or plain text ("ceph daemon mds.<id> dump cache foo.txt").  The latter
version is harder to parse but is less likely to eat all the memory on
your MDS (JSON output builds the whole thing in memory before writing
it)!

In the dump output, search for the inode number you're interested in,
and look for client caps.  Remember, if searching the JSON output, to look
for the decimal form of the inode, vs. the hex form in the plain text output.
Resolve the client session ID in the caps to a meaningful name with
"ceph daemon mds.<id> session ls", assuming the clients are recent
enough to report the hostnames.
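
To make that concrete, here's a sketch of the whole lookup for the first
inode in your listing (the /tmp path is just an example):

# printf '%d\n' 0x10000f6119f            # decimal form, for searching the JSON dump
1099527754143
# ceph daemon mds.<id> dump cache /tmp/cache.txt
# grep 10000f6119f /tmp/cache.txt        # hex form, for the plain text dump
# ceph daemon mds.<id> session ls        # map the client id from the caps to a host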

You can also look at "ceph daemon mds.<id> dump_ops_in_flight" to
check that there are no (stuck) requests touching the inode.
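
For instance (assuming the op descriptions happen to mention the inode
number or its path, which isn't guaranteed):

# ceph daemon mds.<id> dump_ops_in_flight | grep -B 2 -A 10 10000f6119f

If nothing turns up there and no client holds caps in the cache dump, the
hold-up probably isn't a client at all.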

John

1.  http://tracker.ceph.com/issues/11171,
http://tracker.ceph.com/issues/11172,
http://tracker.ceph.com/issues/11173
