Thanks everyone, an update: I tried running this on node A:

# vmtouch -ev /storage/
# sync; sync
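(Side note, in case it helps anyone reproducing this: vmtouch with no flags only reports how much of a tree is resident in the page cache, so you can verify the eviction actually happened. A rough sketch, using the paths from my setup:

# vmtouch /storage/        # report how many pages of the tree are resident
# vmtouch -e -v /storage/  # evict the tree from the page cache, verbosely
# vmtouch /storage/        # should now show close to 0% resident
# sync; sync               # flush any remaining dirty pages)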
The problem persisted: 'ls -Ral' on the dir from node B still took about a minute. After that I ran, on node A:

# echo 2 > /proc/sys/vm/drop_caches

and everything suddenly became fast on node B. ls, du, tar: all of them take a fraction of a second to complete on node B after dropping the cache on A.
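(For reference, since the distinction turned out to matter: the drop_caches values, as documented in the kernel's Documentation/sysctl/vm.txt. Presumably the dentry/inode part is what actually makes the client give up its state, since vmtouch -e only evicts page cache:

# sync                               # drop_caches only frees clean objects, so flush first
# echo 1 > /proc/sys/vm/drop_caches  # free page cache only
# echo 2 > /proc/sys/vm/drop_caches  # free reclaimable slab objects (dentries and inodes)
# echo 3 > /proc/sys/vm/drop_caches  # free both)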
On Tue, Jun 16, 2015 at 12:52 PM, Jan Schermer <[email protected]> wrote:

> Have you tried just running “sync;sync” on the originating node? Does that
> achieve the same thing or not? (I guess it could/should).
>
> Jan
>
> On 16 Jun 2015, at 13:37, negillen negillen <[email protected]> wrote:
>
> Thanks again,
>
> even 'du' performance is terrible on node B (testing on a directory taken
> from Phoronix):
>
> # time du -hs /storage/test9/installed-tests/pts/pgbench-1.5.1/
> 73M     /storage/test9/installed-tests/pts/pgbench-1.5.1/
>
> real    0m21.044s
> user    0m0.010s
> sys     0m0.067s
>
> Reading the files from node B doesn't seem to help with subsequent
> accesses in this case:
>
> # time tar c /storage/test9/installed-tests/pts/pgbench-1.5.1/ >/dev/null
>
> real    1m47.650s
> user    0m0.041s
> sys     0m0.212s
>
> # time tar c /storage/test9/installed-tests/pts/pgbench-1.5.1/ >/dev/null
>
> real    1m45.636s
> user    0m0.042s
> sys     0m0.214s
>
> # time ls -laR /storage/test9/installed-tests/pts/pgbench-1.5.1 >/dev/null
>
> real    1m43.180s
> user    0m0.069s
> sys     0m0.236s
>
> Of course, once I unmount the CephFS on node A everything gets as fast as
> it can be.
>
> Am I missing something obvious here?
> Yes, I could drop the Linux cache as a 'fix', but that would drop the
> entire system's cache, which sounds a bit extreme! :P
> Unless there is a way to drop the cache only for that single dir...?
>
> On Tue, Jun 16, 2015 at 12:15 PM, Gregory Farnum <[email protected]> wrote:
>
>> On Tue, Jun 16, 2015 at 12:11 PM, negillen negillen <[email protected]>
>> wrote:
>> > Thank you very much for your reply!
>> >
>> > Is there anything I can do to get around that? E.g. setting access caps
>> > to be released after a short while? Or is there a command to manually
>> > release access caps (so that I could run it in cron)?
>>
>> Well, you can drop the caches. ;)
>>
>> More generally, you're running into a specific hole here. If your
>> clients are actually *accessing* the files then they should go into
>> shared mode and this will be much faster on subsequent accesses.
>>
>> > This is quite a problem because we have several applications that need
>> > to access a large number of files, and when we set them to work on
>> > CephFS latency skyrockets.
>>
>> What kind of shared-file access do they have? If you have a bunch of
>> files being shared for read I'd expect this to be very fast. If
>> different clients are writing small amounts to them in round-robin
>> then that's unfortunately not going to work well. :(
>> -Greg
>>
>> > Thanks again and regards.
>> >
>> > On Tue, Jun 16, 2015 at 10:59 AM, Gregory Farnum <[email protected]> wrote:
>> >>
>> >> On Mon, Jun 15, 2015 at 11:34 AM, negillen negillen <[email protected]> wrote:
>> >> > Hello everyone,
>> >> >
>> >> > something very strange is driving me crazy with CephFS (kernel driver).
>> >> > I copy a large directory onto the CephFS from one node. If I try to
>> >> > perform a 'time ls -alR' on that directory it gets executed in less
>> >> > than one second. If I try to do the same 'time ls -alR' from another
>> >> > node it takes several minutes. No matter how many times I repeat the
>> >> > command, the speed is always abysmal. The ls works fine on the node
>> >> > where the initial copy was executed from. This happens with any
>> >> > directory I have tried, no matter what kind of data is inside.
>> >> >
>> >> > After lots of experimenting I have found that, to get fast ls speed
>> >> > for that dir from every node, I need to flush the Linux cache on the
>> >> > original node:
>> >> > echo 3 > /proc/sys/vm/drop_caches
>> >> > Unmounting and remounting the CephFS on that node does the trick too.
>> >> >
>> >> > Does anyone have a clue about what's happening here? Could this be a
>> >> > problem with the writeback fscache for the CephFS?
>> >> >
>> >> > Any help appreciated! Thanks and regards. :)
>> >>
>> >> This is a consequence of the CephFS "file capabilities" that we use to
>> >> do distributed locking on file states. When you copy the directory on
>> >> client A, it has full capabilities on the entire tree. When client B
>> >> tries to do a stat on each file in that tree, it doesn't have any
>> >> capabilities. So it sends a stat request to the MDS, which sends a cap
>> >> update to client A requiring it to pause updates on the file and share
>> >> its current state. Then the MDS tells client A it can keep going and
>> >> sends the stat to client B.
>> >> So that's:
>> >> B -> MDS
>> >> MDS -> A
>> >> A -> MDS
>> >> MDS -> B | MDS -> A
>> >> for every file you touch.
>> >>
>> >> I think the particular oddity you're encountering here is that CephFS
>> >> generally tries not to make clients drop their "exclusive" access caps
>> >> just to satisfy a stat. If you had client B doing something with the
>> >> files (like reading them) you would probably see different behavior.
>> >> I'm not sure if there's something effective we can do here or not
>> >> (it's just a bunch of heuristics for when we should or should not drop
>> >> caps), but please file a bug on the tracker (tracker.ceph.com) with
>> >> this case. :)
>> >> -Greg
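P.S. If anyone wants to watch the per-file round trips Greg describes, the kernel client exposes some state via debugfs. A sketch, assuming debugfs is available and noting that the exact entry names may differ between kernel versions:

# mount -t debugfs none /sys/kernel/debug   # if not already mounted
# cat /sys/kernel/debug/ceph/*/caps         # capability counts held by this client
# cat /sys/kernel/debug/ceph/*/mdsc         # in-flight MDS requests; the stat storm shows up here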
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
