On 4/7/2010 9:39 PM, Stephan Zimmer wrote:
> Hi guys,
> this is my very first post to this mailing list and I hope that someone
> following the list could point me to a solution for this bunch of
> questions I am going to ask in this email.
> 
> We are trying to utilize AFS in order to avoid the installation of local
> software on our machine by sourcing precompiled versions from a
> different cell. As the cell we are fetching our data from is located
> 1000 km away from us, we are experiencing serious latency issues,
> although our cache, currently 50 GB, should give us plenty of room to
> cache whatever we need.
> 
> Tests showed in fact that subsequent runs of the same tasks are not as
> fast as we would have expected: after having run a task for the first
> time, I would have thought that the second call (from a different user,
> though) should rely entirely on the cached content the first user had
> created during his call. However, it appears as if the cached content
> expires very soon after having been accessed for the first time.

A few things to keep in mind.
(1) The cache manager is trusted by your machine but cannot make
    access control decisions itself.  Those decisions are made by
    the file servers.  Even if all of the data is cached and callbacks
    are registered for the file, each access by a new user will require
    an RPC to the file server to determine that user's permissions for
    that object.

(2) The data is stored in the cache but the callback registrations
    expire in a few minutes when a read/write volume is being used.
    If a readonly volume is being used, then the callback registrations
    expire in a couple of hours.

(3) The default configuration of the file server is not very good.
    I recommend:

      -L -udpsize 131071 -sendsize 131071 -rxpck 700 -p 128 -b 600
      -nojumbo -cb 1500000

    You can remove -nojumbo if you know it is safe to send large UDP
    packets without them being fragmented between your various sites.
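
As a rough illustration of points (1) and (2), here is a toy Python
model of what the client has to ask the file server.  It is not OpenAFS
code: the class name ToyCacheManager, the RPC labels, and the lifetimes
of 5 minutes and 2 hours are assumptions chosen only to stand in for
"a few minutes" and "a couple of hours".  It also keeps per-user rights
across a callback expiry, which is a simplification.

```python
from dataclasses import dataclass, field

# Illustrative lifetimes only (assumed figures, not OpenAFS constants).
RW_CALLBACK_SECS = 5 * 60        # stands in for "a few minutes"
RO_CALLBACK_SECS = 2 * 60 * 60   # stands in for "a couple of hours"


@dataclass
class CachedFile:
    readonly: bool
    data_cached: bool = False
    callback_expires: float = 0.0               # time at which the callback lapses
    rights: dict = field(default_factory=dict)  # per-user rights learned so far


class ToyCacheManager:
    """Hypothetical model that only records which RPCs would be sent."""

    def __init__(self):
        self.files = {}
        self.rpcs = []  # log of (kind, path, user)

    def read(self, path, user, now, readonly=True):
        f = self.files.setdefault(path, CachedFile(readonly))
        if now >= f.callback_expires:
            # No valid callback: status must be fetched from the file server.
            # In this model the reply also tells us this user's rights.
            self.rpcs.append(("status", path, user))
            lifetime = RO_CALLBACK_SECS if f.readonly else RW_CALLBACK_SECS
            f.callback_expires = now + lifetime
            f.rights[user] = "read"
        elif user not in f.rights:
            # Callback is valid, but only the file server can say what
            # this new user is allowed to do.
            self.rpcs.append(("access", path, user))
            f.rights[user] = "read"
        if not f.data_cached:
            # File contents are transferred only once.
            self.rpcs.append(("data", path, user))
            f.data_cached = True


cm = ToyCacheManager()
path = "/afs/example.org/sw/tool"                  # hypothetical path
cm.read(path, "alice", now=0)                      # cold cache
cm.read(path, "bob", now=10)                       # new user, data already cached
cm.read(path, "alice", now=RO_CALLBACK_SECS + 1)   # callback has lapsed
print([kind for kind, _, _ in cm.rpcs])
# ['status', 'data', 'access', 'status']
```

The data is fetched once; everything after that is a small request,
but each one still costs a round trip to the remote cell.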

> For our particular setup we were thinking of setting up our machine to
> cache entire parts of the cell we were taking our stuff from and do this
> caching via a cron job. This apparent expiration appears to be a major
> drawback but after having read a bunch of references I couldn't figure
> out what to change in our configuration.
> 
> The other issue is whether it is advisable to change the chunksize, as
> this would reduce the chunks that are used for a generic file transfer.
> Is it recommended to change this parameter, and does it improve
> performance much?

Changing the chunksize will not have a dramatic impact if you already
have the data cached.  If the data isn't changing, it's the status info
that is expiring, not the cached data.
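
To see why status round trips can matter more than chunk size over
1000 km, here is a back-of-the-envelope Python sketch.  The file count
of 5000 is a made-up number, the propagation speed is the usual rough
figure for light in fibre, and the model assumes one status RPC per
file issued one after another.  A real client may batch or overlap
requests, and the real round-trip time will be higher than this lower
bound.

```python
SPEED_IN_FIBRE_KM_S = 200_000  # roughly two thirds of c (approximation)


def min_rtt_seconds(distance_km):
    """Lower bound on round-trip time from propagation delay alone."""
    return 2 * distance_km / SPEED_IN_FIBRE_KM_S


def revalidation_seconds(n_files, rtt_s):
    """Waiting time if every file needs one serial status RPC."""
    return n_files * rtt_s


rtt = min_rtt_seconds(1000)               # distance taken from the question
wait = revalidation_seconds(5000, rtt)    # 5000 files is a hypothetical count
print(f"rtt >= {rtt * 1000:.0f} ms, revalidation >= {wait:.1f} s")
# rtt >= 10 ms, revalidation >= 50.0 s
```

None of that time depends on the chunk size, because no file data is
transferred.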

> Lastly, what do you guys think about the general plan - setting up an
> afs-configuration that caches entire afs volumes from a designated cell
> once a day or something... is this really practical or rather not
> recommended due to performance issues?

If you controlled the cell and could set up a file server at the remote
location, I think you would be better off doing so and storing
readonly volume instances on it.  If you don't control the cell or
don't trust the remote location to place a file server there, then
your approach is reasonable.

I'm not convinced that you have determined the actual cause of
the performance delays.  You may want to monitor the traffic flows
with Wireshark and see what is actually being sent and received on
the wire.
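
One way to get a capture file for Wireshark is tcpdump.  The Python
sketch below only builds the command line; it does not run anything.
It assumes the conventional AFS UDP ports (7000 for the file server,
7001 for cache manager callbacks).  The interface name, output file,
and helper name capture_command are placeholders.

```python
import shlex

# Conventional AFS UDP ports; adjust if your site differs.
AFS_FILESERVER_PORT = 7000
AFS_CALLBACK_PORT = 7001


def capture_command(interface, outfile, server_ip=None):
    """Build a tcpdump argument list that records AFS client traffic."""
    bpf = f"udp and (port {AFS_FILESERVER_PORT} or port {AFS_CALLBACK_PORT})"
    if server_ip:
        # Optionally restrict the capture to one file server.
        bpf = f"host {server_ip} and {bpf}"
    # -n: no name resolution, -i: interface, -w: write raw packets to file
    return ["tcpdump", "-n", "-i", interface, "-w", outfile, bpf]


# "eth0" and "afs.pcap" are placeholders for illustration.
print(shlex.join(capture_command("eth0", "afs.pcap")))
```

Running the printed command as root while repeating the slow task, then
opening the file in Wireshark, should show whether the time goes into
data transfers or into many small status requests.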

Jeffrey Altman
