On 2010-07-29, at 04:47, Daire Byrne wrote:
> I was wondering if it is possible to have the client completely cache
> a recursive listing of a lustre filesystem such that on a second run
> it doesn't have to talk to the MDT again? Taking the simplest case
> where I only have one client that is browsing a million file tree
> (say), I would expect that once the ldlm has cached the locks
> (lru_size) then a second recursive scan (find, ls -R) shouldn't need
> to talk to the MDT/OST again. But this is not the case probably
> because a recursive scan needs to do a open() and getdents() on each
> directory it finds.
The getdents() calls can be answered from the client-side cache; it is only the
open() that needs to go to the MDS. Lustre actually does support a client-side
open cache, but it is currently only used by NFS servers (which, sadly, open
and close the file for every single write operation on a file).
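To get a rough feel for how many open() calls a second recursive scan would
still send to the MDS, one can count the directories in the tree, assuming the
scanning tool (find, ls -R) opens each directory once before reading it. This
is only a sketch under that assumption; count_dir_opens is a hypothetical
helper name, not a Lustre tool:

```shell
#!/bin/sh
# Hypothetical helper: estimate the number of directory open() calls a
# recursive scan of the tree rooted at $1 would issue, assuming one open()
# per directory. Directory names containing newlines would be overcounted.
count_dir_opens() {
    find "$1" -type d -print | wc -l | tr -d ' '
}
```

For example, "count_dir_opens /mnt/lustre" prints the directory count for the
tree from the original question. Note that running it performs a recursive
scan itself.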
I know Oleg has at times discussed enabling the open cache on the client for
regular filesystem access, but I don't know the tweak needed for this offhand.
In the past we didn't do this because there was extra DLM locking overhead for
cancelling the open lock, but with DLM lock cancel batching that may no longer
be as big a performance hit.
It wouldn't be a bad idea to start with a /proc tuneable or "-o openlock" mount
option that selectively allows open cache per client mount, so that performance
testing can be done. After that we can decide whether this is only good for
specific workloads and bad for others, or if it is an improvement for most
workloads and should be enabled by default.
> If I just stat all the files without doing a recursive scan then it gets
> everything from the client cache as expected without the MDS chatter - e.g.
>
> find /mnt/lustre -type f > /tmp/files.txt
> cat /tmp/files.txt | xargs ls -l
>
> Is there any way to improve the browsing speed and cache directory
> contents - especially for the case where I only have a single client
> accessing an entire tree? As an aside I also noticed that a "ls -l"
> does a getxattr - does that get cached by the client too? I can
> imagine it might cause quite a bit of MDS chatter.
So far, Lustre doesn't cache any xattr on the client beyond the file layout
("lustre.lov" xattr), but it is something I've been thinking about. The
security.capability attribute is special-cased in the 1.8.4 client to not
return any data. Beyond that, I'm not aware of any attributes that are widely
used, so I don't think there is a pressing demand for this. If a case can be
made for it, we'll definitely look at it more seriously.
--
Cheers, Andreas