On 9/3/2014 1:29 PM, Lionel Cons wrote:
> Will AFS3 include support for sparse files, e.g. files which have one
> or multiple holes (see POSIX lseek() documentation about SEEK_HOLE and
> SEEK_DATA) where no data reside?
> 
> CERN has a lot of projects which rely on sparse files and it would be
> a mandatory feature for a migration to AFS3.
> 
> Lionel

Hi Lionel,

You are writing to the AFS3 Standardization list, which is the
appropriate location for discussing AFS3 RPC protocol extensions.  In
case you aren't aware, in the AFS model file pointer operations are
local to the client and are satisfied by a cache manager that obtains
files from the file servers one chunk at a time.  Cache managers do not
in general cache whole files.  Instead the cache managers fetch only
those chunks of the file that are requested by the application.  While
the files themselves might not be sparse, the client's cache of the file
data will be sparse if the application's access pattern is.

In the existing AFS3 protocol the FetchData and StoreData operations do
not support compression and do not support any notion of sparseness (a
range of zero bytes).  Nor do the file servers recognize that a chunk
being written is entirely zero bytes and in turn store the file as a
sparse file in their backend.  This is one of the reasons that using ZFS
with disk compression as the backing store can be a big win.

In summary, at the present time there is no protocol support for a cache
manager to communicate the Fetching or Storing of a sparse range, nor is
there any support for a cache manager to query a file server for the
allocated ranges of a file.  As a result it is not currently possible
for an AFS cache manager to implement VFS layer support for lseek(2)
SEEK_HOLE and SEEK_DATA that reflects the holes actually present in a
file.  (Nor is there the ability to implement Microsoft Windows'
FSCTL_SET_SPARSE, FSCTL_SET_ZERO_DATA, and FSCTL_QUERY_ALLOCATED_RANGES
sparse file operations.)
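
For reference, the application-side pattern that such VFS support would
have to satisfy looks roughly like the sketch below.  It walks a file
with SEEK_DATA and SEEK_HOLE and collects the ranges the file system
reports as data.  The function name data_extents is made up for this
illustration, and the sketch assumes a platform where Python exposes
os.SEEK_DATA and os.SEEK_HOLE (for example Linux).

```python
import errno
import os


def data_extents(fd):
    """Return [(start, end), ...] byte ranges the file system reports as data.

    Illustrative helper only (the name is an assumption, not an existing API).
    Note that it moves the file offset of fd as a side effect.
    """
    extents = []
    size = os.fstat(fd).st_size
    offset = 0
    while offset < size:
        try:
            # Next offset >= `offset` that contains data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # only a hole remains between `offset` and end of file
            raise
        # Next hole at or after `start`; end of file counts as a hole.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        extents.append((start, end))
        offset = end
    return extents
```

On a file system that cannot report holes, a loop like this still works
but sees the whole file as a single data range.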

From an AFS client implementation perspective, for example OpenAFS on
Linux (3.1 or later) or Solaris, it would be possible for the cache
manager to support SEEK_DATA and SEEK_HOLE in the simplest form
described by the Linux man page:  SEEK_HOLE always returns the offset of
the end of the file and SEEK_DATA always returns the offset specified in
the lseek() call.  Discussion of adding that level of support should
take place on the [email protected] mailing list.
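
As a minimal sketch of that simplest behaviour (not OpenAFS code; the
function name fallback_lseek and its parameters are hypothetical), the
result can be computed from the file length alone.  The ENXIO case for
offsets at or beyond end of file follows the Linux lseek(2) man page;
treating negative offsets the same way is an assumption made here for
brevity.

```python
import errno
import os


def fallback_lseek(offset, whence, file_size):
    """Resolve SEEK_DATA/SEEK_HOLE for a file system with no hole information.

    Hypothetical helper: the whole file is treated as data followed by a
    single implicit hole at end of file.
    """
    if whence not in (os.SEEK_DATA, os.SEEK_HOLE):
        raise ValueError("only SEEK_DATA and SEEK_HOLE are handled here")
    if not 0 <= offset < file_size:
        # lseek(2) reports ENXIO when offset is beyond the end of the file.
        raise OSError(errno.ENXIO, os.strerror(errno.ENXIO))
    if whence == os.SEEK_DATA:
        return offset      # every in-range offset is considered data
    return file_size       # the only hole is the implicit one at end of file
```

With this behaviour an application never learns where the real holes
are, but code written against SEEK_DATA and SEEK_HOLE continues to run.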

What are the requirements of the applications in question?

Jeffrey Altman



