Re: [Lustre-discuss] Client directory entry caching

Andreas Dilger Thu, 29 Jul 2010 12:55:27 -0700

On 2010-07-29, at 04:47, Daire Byrne wrote:
> I was wondering if it is possible to have the client completely cache
> a recursive listing of a lustre filesystem such that on a second run
> it doesn't have to talk to the MDT again? Taking the simplest case
> where I only have one client that is browsing a million file tree
> (say), I would expect that once the ldlm has cached the locks
> (lru_size) then a second recursive scan (find, ls -R) shouldn't need
> to talk to the MDT/OST again. But this is not the case, probably
> because a recursive scan needs to do an open() and getdents() on each
> directory it finds.


The getdents() calls can be served from the client-side cache; it is only the 
open() that needs to go to the MDS.  Lustre actually does support a client-side 
open cache, but it is currently only used by NFS servers (which, sadly, open 
and close the file for every single write operation on a file).
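
As a rough illustration of why the second scan still generates MDS traffic: a
recursive walk opens every directory once, so the number of directories in the
tree is a lower bound on the open() calls it issues. A minimal sketch, assuming
a POSIX shell and find; count_dir_opens is a hypothetical helper name, not a
Lustre tool:

```shell
# Hypothetical helper (not part of Lustre): estimate how many directory
# open() calls a recursive scan of "$1" will issue.
# Each directory is opened once to read its entries, so counting
# directories gives a lower bound on the opens that reach the MDS.
# Assumes no directory name contains a newline.
count_dir_opens() {
    find "$1" -type d | wc -l | tr -d ' '
}
```

Running "count_dir_opens /mnt/lustre" before a scan gives an idea of how many
open RPCs to expect even when every getdents() is answered from cache.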

I know Oleg has at times discussed enabling the open cache on the client for 
regular filesystem access, but I don't know the tweak needed for this offhand.  
In the past we didn't do this because there was extra DLM locking 
overhead for cancelling the open lock, but with DLM lock cancel batching 
that may no longer be as big a performance hit.

It wouldn't be a bad idea to start with a /proc tunable or "-o openlock" mount 
option that selectively enables the open cache per client mount, so that performance 
testing can be done.  After that we can decide whether this is only good for 
specific workloads and bad for others, or if it is an improvement for most 
workloads and should be enabled by default.
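
For that kind of testing, something as simple as timing two back-to-back
recursive scans on a single client would already show whether the second pass
is served from cache. A minimal sketch, assuming a POSIX shell with find and
date; time_scans is a hypothetical helper, and the "cold" label assumes the
client cache was empty before the first pass:

```shell
# Hypothetical benchmark helper: run the same recursive scan twice and
# print the elapsed wall-clock seconds of each pass, one per line.
# The first pass populates the client cache (assumed empty beforehand);
# the second pass shows how much of the scan is served from it.
time_scans() {
    for pass in cold warm; do
        start=$(date +%s)
        find "$1" > /dev/null
        end=$(date +%s)
        echo "$pass $((end - start))"
    done
}
```

Running "time_scans /mnt/lustre" once with the open cache disabled and once
with it enabled (through whichever switch ends up being implemented) would
give a first comparison. One-second resolution is coarse, but it should be
enough for a tree with a million files.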

> If I just stat all the files without doing a recursive scan then it gets 
> everything from the client cache as expected without the MDS chatter - e.g.
> 
>  find /mnt/lustre -type f > /tmp/files.txt
>  cat /tmp/files.txt | xargs ls -l
> 
> Is there any way to improve the browsing speed and cache directory
> contents - especially for the case where I only have a single client
> accessing an entire tree? As an aside I also noticed that a "ls -l"
> does a getxattr - does that get cached by the client too? I can
> imagine it might cause quite a bit of MDS chatter.

So far, Lustre doesn't cache any xattr on the client beyond the file layout 
("lustre.lov" xattr), but it is something I've been thinking about.  The 
security.capability attribute is special-cased in the 1.8.4 client to not 
return any data, and beyond that there aren't any attributes that I'm aware of 
that are widely used.  I don't think there is a pressing demand for this, but 
if a case can be made for it we'll definitely look at it more seriously.

--
Cheers, Andreas