Re: [ceph-users] cephfs slow delete

Gregory Farnum Fri, 14 Oct 2016 12:59:14 -0700

On Fri, Oct 14, 2016 at 11:41 AM, Heller, Chris <[email protected]> wrote:
> Unfortunately, it was all in the unlink operation. Looks as if it took nearly
> 20 hours to remove the dir; the round trip is a killer there. What can be done
> to reduce RTT to the MDS? Does the client really have to sequentially delete
> directories or can it have internal batching or parallelization?


It's bound by the same syscall APIs as anything else. You can spin off
multiple deleters; I'd either keep them on one client (if you want to
work within a single directory) or if using multiple clients assign
them to different portions of the hierarchy. That will let you
parallelize across the IO latency until you hit a cap on the MDS'
total throughput (should be 1-10k deletes/s based on latest tests
IIRC).
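
As a rough illustration of the "multiple deleters on one client" idea,
here is a minimal Python sketch. It assumes the tree is reachable
through an ordinary POSIX mount of CephFS (not through cephfs-hadoop);
the function name parallel_rmtree, the worker count, and the example
path are hypothetical, not anything shipped with Ceph. Each top-level
entry of the directory goes to its own worker thread, so unlinks in
different subtrees overlap their round trips to the MDS.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def parallel_rmtree(root, workers=16):
    """Remove `root`, handing each top-level entry to a worker thread.

    Returns the number of top-level entries that were removed.
    """
    # Snapshot the top-level entries; each one becomes a unit of work.
    with os.scandir(root) as it:
        entries = [(e.path, e.is_dir(follow_symlinks=False)) for e in it]

    def remove(entry):
        path, is_dir = entry
        if is_dir:
            shutil.rmtree(path)   # recursive delete of one subtree
        else:
            os.unlink(path)       # plain file or symlink

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the results so a worker's exception is raised here.
        list(pool.map(remove, entries))

    # All children are gone, so the directory itself can be removed.
    os.rmdir(root)
    return len(entries)
```

Usage would be something like parallel_rmtree("/mnt/cephfs/old-data",
workers=32), where the path is a placeholder. Splitting only at the top
level is the simplest partitioning; if one subtree holds most of the
files, the work would need to be split at a deeper level to keep all
workers busy.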
-Greg

>
> -Chris
>
> On 10/13/16, 4:22 PM, "Gregory Farnum" <[email protected]> wrote:
>
>     On Thu, Oct 13, 2016 at 12:44 PM, Heller, Chris <[email protected]> wrote:
>     > I have a directory I’ve been trying to remove from cephfs (via
>     > cephfs-hadoop), the directory is a few hundred gigabytes in size and
>     > contains a few million files, but not in a single sub directory. I started
>     > the delete yesterday at around 6:30 EST, and it’s still progressing. I can
>     > see from (ceph osd df) that the overall data usage on my cluster is
>     > decreasing, but at the rate it’s going it will be a month before the entire
>     > sub directory is gone. Is a recursive delete of a directory known to be a
>     > slow operation in CephFS or have I hit upon some bad configuration? What
>     > steps can I take to better debug this scenario?
>
>     Is it the actual unlink operation taking a long time, or just the
>     reduction in used space? Unlinks require a round trip to the MDS
>     unfortunately, but you should be able to speed things up at least some
>     by issuing them in parallel on different directories.
>
>     If it's the used space, you can let the MDS issue more RADOS delete
>     ops by adjusting the "mds max purge files" and "mds max purge ops"
>     config values.
>     -Greg
>
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
