Spenser:

The AFS cache cannot be used the way you are thinking.  What you are
looking for is a Ceph/Gluster-to-AFS gateway, which does not exist.

The AFS cache is part of the security boundary that is enforced by the
Cache Manager in the kernel.  As such, it is stored either in kernel
memory or on local disk accessible only to the one kernel.  It is not
designed for shared access.  Pointing multiple AFS cache managers at the
same cache will most likely result in data corruption.

There are two extensions to AFS under development that will help a
cluster access data stores in faraway locations:

 1. read/write replication, which permits a single copy of the data
generated at the slow site to be replicated to file servers near the
cluster.

 2. peer-to-peer cache sharing, which permits an AFS cache manager to
securely access data from another cache manager on the same subnet and
avoid retransmitting it across a slow link.

The first option is preferred when it is possible to deploy file servers
in the cluster data center because it doesn't involve adding workload to
client nodes and provides for the possibility of parallel reads.
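
As a rough illustration of what either extension saves on the slow link,
here is a back-of-the-envelope sketch.  The function names, file size,
node count and link speed are all made up for the example; they are not
measurements and not part of any AFS interface.  It only assumes that
without sharing each node fetches the file itself, and that with
replication or peer cache sharing the file crosses the slow link once.

```python
def slow_link_bytes(file_bytes, nodes, shared):
    """Bytes crossing the slow link when each of `nodes` clients reads one file once.

    Assumption: with replication or peer cache sharing (shared=True) the file
    crosses the slow link a single time; otherwise once per node.
    """
    fetches = 1 if shared else nodes
    return file_bytes * fetches


def transfer_seconds(total_bytes, link_bytes_per_sec):
    """Idealized transfer time, ignoring latency and protocol overhead."""
    return total_bytes / link_bytes_per_sec


if __name__ == "__main__":
    FILE_BYTES = 100_000_000   # hypothetical 100 MB input file
    NODES = 16                 # hypothetical cluster size
    LINK = 125_000             # hypothetical 1 Mbit/s uplink, in bytes per second

    for shared in (False, True):
        total = slow_link_bytes(FILE_BYTES, NODES, shared)
        label = "shared" if shared else "independent"
        print(label, total, "bytes,", transfer_seconds(total, LINK), "seconds")
```

The real savings depend on how often the data changes and how much of it
each node actually reads, so treat this only as an upper-bound estimate.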

Jeffrey Altman


On 7/7/2011 4:01 AM, Spenser Gilliland wrote:
> Hello,
> 
> Can the AFS cache be placed on a parallel file system (e.g., ceph or gluster)?
> 
> If the cache can be placed on a parallel file system,
> when data is read into or written to the cache, will all of the other
> nodes in the cluster have access to this cached data for both reading
> and writing?  And will every write block until it is written to the
> AFS cell (i.e., is it write-back or write-through)?
> 
> FYI: I'm going to give this a go here in a couple weeks and wanted to
> know if anyone has tried it.
> 
> The idea is to have an AFS Cell at home (very slow especially upload)
> and a cluster at School which accesses this AFS Cell but only
> downloads a file once for all of the servers in the cluster thereby
> saving time and bandwidth.  Additionally, because the file is now on
> the parallel file system all nodes can access the data concurrently.
> When the program is finished the results will be available in the same
> directory as the program.
> 
> I'm thinking this could be immensely valuable for grid computing, if it works.
> 
> Let me know if there is anything I should be looking out for along the way.
> 
> Thanks,
> Spenser
> 
> 
