On Fri, Oct 14, 2016 at 11:41 AM, Heller, Chris <chel...@akamai.com> wrote:
> Unfortunately, it was all in the unlink operation. It looks as if it took nearly
> 20 hours to remove the dir; the round trip is a killer there. What can be done to
> reduce RTT to the MDS? Does the client really have to delete directories
> sequentially, or can it do internal batching or parallelization?
It's bound by the same syscall APIs as anything else. You can spin off
multiple deleters; I'd either keep them on one client (if you want to
work within a single directory) or if using multiple clients assign
them to different portions of the hierarchy. That will let you
parallelize across the IO latency until you hit a cap on the MDS'
total throughput (should be 1-10k deletes/s based on latest tests).
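
As a rough illustration of the "multiple deleters on one client" idea, here is a minimal Python sketch. It assumes the tree is reachable through a normal POSIX mount (kernel client or ceph-fuse) rather than through cephfs-hadoop, and the names parallel_rmtree and workers are made up for this example. It only spreads the per-unlink round trips across threads; it does not batch anything on the MDS side.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def _remove_entry(path):
    # Real directories are removed recursively; files and symlinks are unlinked.
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)
    else:
        os.unlink(path)


def parallel_rmtree(root, workers=8):
    """Delete `root` by removing its top-level entries concurrently.

    Each worker owns one top-level entry, so the deleters operate on
    different portions of the hierarchy. Returns the number of
    top-level entries that were removed.
    """
    entries = [entry.path for entry in os.scandir(root)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all workers and re-raises the first exception.
        list(pool.map(_remove_entry, entries))
    os.rmdir(root)  # root is empty once every worker has finished
    return len(entries)
```

Something like parallel_rmtree("/mnt/cephfs/some/dir", workers=16) would be the call, with the path being a placeholder. Threads should be enough here because the time goes into waiting on the syscalls, not into Python itself; how many workers help before the MDS becomes the limit is something you'd have to measure.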
> On 10/13/16, 4:22 PM, "Gregory Farnum" <gfar...@redhat.com> wrote:
> On Thu, Oct 13, 2016 at 12:44 PM, Heller, Chris <chel...@akamai.com> wrote:
> > I have a directory I’ve been trying to remove from cephfs (via
> > cephfs-hadoop), the directory is a few hundred gigabytes in size and
> > contains a few million files, but not in a single sub directory. I started
> > the delete yesterday at around 6:30 EST, and it’s still progressing. I
> > see from (ceph osd df) that the overall data usage on my cluster is
> > decreasing, but at the rate it's going it will be a month before the
> > sub directory is gone. Is a recursive delete of a directory known to be a
> > slow operation in CephFS, or have I hit upon some bad configuration? What
> > steps can I take to better debug this scenario?
> Is it the actual unlink operation taking a long time, or just the
> reduction in used space? Unlinks require a round trip to the MDS
> unfortunately, but you should be able to speed things up at least some
> by issuing them in parallel on different directories.
> If it's the used space, you can let the MDS issue more RADOS delete
> ops by adjusting the "mds max purge files" and "mds max purge ops"
> config values.
ceph-users mailing list