On 9/3/2014 1:29 PM, Lionel Cons wrote:
> Will AFS3 include support for sparse files, e.g. files which have one
> or multiple holes (see POSIX lseek() documentation about SEEK_HOLE and
> SEEK_DATA) where no data reside?
>
> CERN has a lot of projects which rely on sparse files and it would be
> a mandatory feature for a migration to AFS3.
>
> Lionel

Hi Lionel,

You are writing to the AFS3 Standardization list, which is the appropriate location for discussing AFS3 RPC protocol extensions.

In case you aren't aware, in the AFS model file pointer operations are local to the client and are satisfied by a cache manager that obtains files from the file servers a chunk at a time. Cache managers do not in general cache whole files. Instead, they fetch only those chunks of the file that are requested by the application. So while the files themselves might not be sparse, the client's cache of the file data will be if the application's access pattern is sparse.

In the existing AFS3 protocol the FetchData and StoreData operations do not support compression and do not support any notion of sparseness (a range of zero bytes). Nor do the file servers recognize that a chunk being written is entirely zero bytes and in turn store the file as a sparse file in their backend. This is one of the reasons that using ZFS with disk compression as the backing store can be a big win.

In summary, at the present time there is no protocol support for a cache manager to communicate the fetching or storing of a sparse range, nor is there any support for a cache manager to query a file server for allocated ranges. As a result it is not currently possible for an AFS cache manager to implement VFS layer support for lseek(2) SEEK_HOLE and SEEK_DATA that reflects the actual holes in a file, even on platforms where those operations are present. (Nor is there the ability to implement Microsoft Windows' FSCTL_SET_SPARSE, FSCTL_SET_ZERO_DATA, and FSCTL_QUERY_ALLOCATED_RANGES sparse file operations.)

From an AFS client implementation perspective, for example OpenAFS on Linux (3.1 or later) or Solaris, it would be possible for the cache manager to support SEEK_DATA and SEEK_HOLE in the simplest form described by the Linux man page: SEEK_HOLE always returns the offset of the end of file and SEEK_DATA always returns the offset specified in the lseek() call.
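
To make that minimal behavior concrete, here is a rough sketch in Python. It is not OpenAFS code. The names fallback_lseek and data_ranges are hypothetical, the file size is passed in explicitly rather than obtained from a cache manager, and the choice to report ENXIO for any offset outside the file is an assumption modeled on how Linux documents these operations.

```python
import errno


def fallback_lseek(offset: int, whence: str, file_size: int) -> int:
    """Hypothetical helper: simplest SEEK_DATA / SEEK_HOLE semantics.

    The whole file is treated as data, with a single implicit hole
    starting at end of file.
    """
    if whence not in ("SEEK_DATA", "SEEK_HOLE"):
        raise ValueError("whence must be 'SEEK_DATA' or 'SEEK_HOLE'")
    # Assumption: any offset outside [0, file_size) is reported as ENXIO.
    if offset < 0 or offset >= file_size:
        raise OSError(errno.ENXIO, "offset is outside the file")
    if whence == "SEEK_DATA":
        # Every in-file offset counts as data, so return it unchanged.
        return offset
    # SEEK_HOLE: the only hole is the implicit one at end of file.
    return file_size


def data_ranges(file_size: int, seek=fallback_lseek):
    """Walk a file the way a sparse-aware copy tool might.

    Returns a list of (start, end) pairs that the seek function
    reports as data.
    """
    ranges = []
    offset = 0
    while offset < file_size:
        try:
            start = seek(offset, "SEEK_DATA", file_size)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # no more data after this offset
            raise
        end = seek(start, "SEEK_HOLE", file_size)
        ranges.append((start, end))
        offset = end
    return ranges
```

With this fallback an application that walks the file with SEEK_DATA and SEEK_HOLE keeps working, but it sees one data range covering the whole file (for example, data_ranges(4096) yields a single range from 0 to 4096), so it gains no knowledge of where the real holes are.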

Discussion of adding that level of support should take place on the [email protected] mailing list.

What are the requirements of the applications in question?

Jeffrey Altman
