On Tue, Dec 17, 2013 at 04:53:16PM +0000, David Howells wrote:
> It has occurred to me and others that something like BTRFS could be
> a good fit to build an AFS fileserver directly on top of. The
> question is what facilities would be needed from BTRFS to make this
> work? So I thought I'd kick off a shopping list;-)

>  (1) 64-bit data version numbers that increase monotonically with
> each write. Yes, this is likely to cause some performance
> degradation as it introduces an ordering over data writes and
> metadata writes to a file. Maybe writes can be batched to improve
> performance?

   Do these have to be per-file? If not, then you might be able to get
away with using the transid, which is a filesystem-global
monotonically-increasing number.

   btrfs batches disk writes already, and uses the transid to
differentiate these -- the writes come at 30 second intervals (by
default, although there's an option to change the period). There may
be multiple distinct changes to a single file within that transaction
(although obviously, only the state of the file after the last one
gets written to disk). I don't know exactly what you need it for, so
this may or may not be appropriate here.
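
   To make the caveat concrete, here is a toy model in Python of how a
filesystem-global transid behaves when used as a per-file version. It
is only a sketch: the class and its fields are invented for
illustration and are not btrfs code or a btrfs API.

```python
class ToyFilesystem:
    """Toy model: one global transid, bumped only when a transaction commits."""

    def __init__(self):
        self.transid = 1          # id of the currently open transaction
        self.file_transid = {}    # path -> transid of the last modification
        self.write_count = {}     # path -> number of individual writes seen

    def write(self, path):
        # Every write inside the open transaction is stamped with the same id.
        self.file_transid[path] = self.transid
        self.write_count[path] = self.write_count.get(path, 0) + 1

    def commit(self):
        # Committing closes the transaction; later writes get the next id.
        self.transid += 1


fs = ToyFilesystem()
fs.write("a")
fs.write("a")      # second write to "a" in the same transaction
fs.write("b")
fs.commit()
fs.write("b")      # "b" is modified again in the next transaction
```

   In this model "a" was written twice but carries a single version
(1), while "b" moves from 1 to 2. The number never goes backwards, but
it does not increase with each write, which is the property the
original request asks for.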

   Ceph uses transids for [something, mumble, wavy-hand] -- I don't
know if the use-case for Ceph is equivalent to the use-case for AFS.

>  (2) Storage for ACLs and AFS UIDs. Having shareable ACLs might also
> be useful. Xattrs would likely do for this.

   This would seem like a reasonable place to put them, given that
that's what POSIX ACLs do, and we have POSIX ACL support already.
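
   As a rough sketch of what that could look like from userspace, the
Python below keeps an ACL in a single xattr. The attribute name
"user.afs.acl" and the one-line-per-principal text encoding are
assumptions made up for this example, not an existing AFS or btrfs
convention. os.setxattr and os.getxattr are the standard Linux-only
calls.

```python
import os

# Hypothetical attribute name, chosen for this example only.
ACL_XATTR = "user.afs.acl"


def encode_acl(acl):
    """Serialise {principal: rights} as sorted 'principal rights' lines."""
    lines = [f"{name} {rights}" for name, rights in sorted(acl.items())]
    return "\n".join(lines).encode("utf-8")


def decode_acl(blob):
    """Inverse of encode_acl; the rights string is the last field on a line."""
    acl = {}
    for line in blob.decode("utf-8").splitlines():
        name, rights = line.rsplit(" ", 1)
        acl[name] = rights
    return acl


def store_acl(path, acl):
    # Needs a filesystem that supports user xattrs (Linux only).
    os.setxattr(path, ACL_XATTR, encode_acl(acl))


def load_acl(path):
    return decode_acl(os.getxattr(path, ACL_XATTR))
```

   For example, store_acl(path, {"system:administrators": "rlidwka",
"alice": "rl"}) followed by load_acl(path) should give back the same
mapping. Shareable ACLs would need an extra level of indirection that
this sketch does not attempt.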

>  (3) The ability to snapshot a filesystem to make backups and for
>      pushing to read-only volume servers.

   We have snapshots of subvolumes, but not the filesystem as a whole.
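
   If each AFS volume were kept in its own subvolume, a read-only
snapshot of it could be taken with the ordinary "btrfs subvolume
snapshot -r" command. A minimal Python wrapper might look like the
following. The helper names are hypothetical, and the
one-volume-per-subvolume layout is an assumption rather than something
settled in this thread.

```python
import subprocess


def snapshot_argv(source, dest, readonly=True):
    """Build the argv for 'btrfs subvolume snapshot [-r] <source> <dest>'."""
    argv = ["btrfs", "subvolume", "snapshot"]
    if readonly:
        argv.append("-r")   # read-only snapshot
    argv += [source, dest]
    return argv


def take_snapshot(source, dest, readonly=True):
    # Needs the btrfs tool and sufficient privileges; raises on failure.
    subprocess.run(snapshot_argv(source, dest, readonly), check=True)
```

   Snapshotting the filesystem as a whole would still mean doing this
once per subvolume.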

>  (4) A 32-bit vnode number and 32-bit vnode uniquifier/generation
> number. These don't necessarily have to be stored by BTRFS directly
> but could instead be in a separate database file that gets
> snapshotted also.
> 
>  (5) The ability to set the vnode number, vnode uniquifier and data
>      version number to specific values. Necessary to clone volumes
>      and restore volume dumps.

   What's a vnode meant to represent? I'm not familiar with the
terminology.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
       --- "Are you the man who rules the Universe?" "Well,  I ---       
                              try not to."                               
