Re:

Daire Byrne Thu, 17 Dec 2009 08:17:45 -0800

J.R,

On Thu, Dec 17, 2009 at 2:58 PM,  <[email protected]> wrote:
>
> Daire Byrne:
>> I can watch the network traffic with tcpdump and see that for every
>> (?) file there are lots of NFS ops (plenty of "readdirplus" for
>> example) to the NFS server. The time taken to list all the files
>> increases (0.4s -> 15s in my simple test). I'm wondering if it is
>> possible to tell aufs never to check the NFS branch if the file exists
>> on the local branch. I understand that in order to make a union the
>> directory contents of the remote filesystem needs to be known but is
>> there any way to minimise the traffic so that this operation is only
>> done once (for each dir?). I can probably turn up the NFS attribute
>> caching but this doesn't help much with the very first read.
>
> Aufs has an option called 'rdcache=<seconds>' which specifies the
> readdir cache lifetime in aufs. The default value is 10 seconds. If you
> set it much longer, then you will be able to reduce the number of
> internal readdir for branch FS.
> But it is equivalent to the NFS cache time you mentioned. So I am afraid
> you won't be satisfied.


I suppose the NFS attribute cache is an option (dunno how much you
can cache at a time?). Maybe the rdcache option is better? My use case
here is a VPN user who is likely to connect, run an application once
and then disconnect, so I'm not sure there will be much repeat access
for a cache to help with.

> I'd like to suggest you to try RDU (readdir in userspace) which runs
> faster generally. To use RDU,
> Although you may not agree, I think the NFS cache options are effective
> for you too.

I don't think it is the speed of the readdir that is the issue per
se. It is the slow network filesystem, and I doubt the userspace
optimisation will make much difference relative to the network
latency.

> Aufs has no option to stop readdir for the lower branch. While you know
> the contents of the upper branch is equivalent to the lower, aufs
> doesn't know it. And aufs thinks there may exist unknown files on the
> lower branch. It won't be clear until aufs completes readdir for the
> lower branch.

I understand that the readdirplus calls are necessary to list the dir
contents and make the union. Looking at the traffic more closely in
wireshark, I see a few readdirplus calls first to get the contents of
a single directory. If I then do an "ls -l" (stat), most of the
traffic is actually NFS "LOOKUP" calls/replies (one for each file in
the dir). Is this traffic required? I mean, if we now know the
contents of the dir and I always want to use the file in the top
branch, then why stat the lower branch files at all if they exist in
the top branch? Could this be disabled?
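
To make the question concrete, here is a rough userspace sketch in
Python of the listing behaviour being asked about. It is not how aufs
works internally; the function name union_listing and the two-branch
layout are made up for illustration, and it ignores aufs whiteouts and
opaque directories entirely. The point is only that the slow branch
gets one readdir, and is stat'ed just for names missing from the top.

```python
import os


def union_listing(top, lower):
    """Return {name: stat_result} for the union of two directories.

    Hypothetical helper, not an aufs API. Names present in `top` are
    stat'ed there only; `lower` is stat'ed only for names that do not
    exist in `top`.
    """
    top_names = set(os.listdir(top))
    lower_names = set(os.listdir(lower))  # single readdir on the slow branch

    result = {}
    for name in sorted(top_names | lower_names):
        # Prefer the top branch; touch the lower branch only when needed.
        branch = top if name in top_names else lower
        result[name] = os.lstat(os.path.join(branch, name))
    return result
```

With a fully pre-populated top branch, the loop above would never
stat anything on the lower branch.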

Also, for blind open() calls on files, is it not possible to try the
open on the top branch first and only do a lookup on the lower (NFS)
branch if that fails? Again looking at the wireshark output, when I
open a file in a dirtree there are LOOKUP and ACCESS calls being made
for each directory in the tree. But the tree exists on the top branch,
so why can't we ignore the lower NFS tree unless the file/dirs don't
exist on the top branch?
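
The lookup order in question can be sketched in userspace Python as
below. Again this is only an illustration under assumptions: the name
open_top_first and the ordered list of branch paths are hypothetical,
and real aufs has to handle whiteouts, permissions and dentry caching
that this ignores.

```python
import os


def open_top_first(relpath, branches, mode="rb"):
    """Open relpath from the first branch that has it, top branch first.

    Hypothetical helper, not an aufs API. `branches` is an ordered list
    of branch root directories, fastest (top) first.
    """
    for branch in branches:
        try:
            return open(os.path.join(branch, relpath), mode)
        except (FileNotFoundError, NotADirectoryError):
            # Only a miss on this branch sends us to the next, slower one.
            continue
    raise FileNotFoundError(relpath)
```

Called as open_top_first("dir/file", [local_branch, nfs_branch]), the
NFS branch is touched only when the local branch has no such path.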

> There is one related branch attribute in aufs. It is '<branch>=rr' which
> stands for real readonly. But it may not be helpful for you.

This doesn't seem to change the access patterns in these cases.

> Obviously it is not a bug. But I think I could understand what you want.
> Currently I'd suggest you to try RDU in aufs, NFS cache options or
> FS-cache. Because aufs depends upon the behaviour of branch fs, and the
> behaviour you pointed out is the natural behaviour of NFS.

I suppose I could use FS-Cache but then I have to build up the cache
over time by downloading data first. Pre-populating with the loopback
squashfs archive is a lot nicer and compressed too. I can always turn
on FS-Cache on the NFS filesystem as well as using the aufs union with
the squashfs archive.

Thanks for your time,

Daire
