Stephen Tweedie wrote:
> 
> Our proposed kernel API looks something like this:
> 
>       sys_setattr (char *filename, int attrib_family, int op, 
>                    struct attrib *old_attribs, int *old_lenp,
>                    struct attrib *new_attribs, int new_len);

Does this need the ability to not follow symlinks, or does user
space have to do an lstat and a readlink call to decide whether to
follow a link, depending on the intent? Without a no-follow option
you cannot set an attribute on a symlink itself.
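
To make the user-space side of that question concrete, here is a rough
sketch of the lstat/readlink check a caller would need before deciding
what to pass down. The helper name classify_path is made up for
illustration and is not part of the proposal. Note that the check is
racy (the path can change between the lstat and the later attribute
call), and it still gives no way to address the link itself.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: classify a path without following a final symlink.
 * Returns 1 if path is a symlink (target copied into buf, NUL-terminated,
 * possibly truncated), 0 if it is anything else, -1 on error.
 */
int classify_path(const char *path, char *buf, size_t buflen)
{
    struct stat st;
    ssize_t n;

    assert(path != NULL && buf != NULL);
    if (buflen == 0)
        return -1;
    if (lstat(path, &st) != 0)      /* lstat does not follow the link */
        return -1;
    if (!S_ISLNK(st.st_mode)) {
        buf[0] = '\0';
        return 0;
    }
    n = readlink(path, buf, buflen - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';                  /* readlink does not NUL-terminate */
    return 1;
}
```

A caller that gets 1 back knows that a path-based attribute call which
follows links would land on the target, not on the link.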

> 
>       sys_fsetattr(int fd, int attrib_family, int op, 
>                    struct attrib *old_attribs, int *old_lenp,
>                    struct attrib *new_attribs, int new_len);
> 
> where <op> can be
> 
>       ATR_SET         overwrite existing attribute
>       ATR_GET         read existing attribute
>       ATR_GETALL      read entire ordered attribute list (ignores new val)
>       ATR_PREPEND     add new attribute to start of ordered list
>       ATR_APPEND      add new attribute to end of ordered list
>       ATR_REPLACE     replace entire ordered attribute list

Does any filesystem implement an ordered list? Since attributes are
addressed by name, adding an ordering method of addressing them
looks a bit superfluous. This also relates to Hans' question about
what block structured attributes are: they are not a stream, they
are addressed by name instead of by position in a byte stream.
Again, no ordering is implied.

How do you atomically create an attribute, i.e. set the attribute to
a value only if it does not already exist, the equivalent of
O_CREAT|O_EXCL? Maybe ATR_SET needs to be extended with ATTR_CREATE
and ATTR_REPLACE flags: create only if the attribute does not exist,
and set only if it does exist, respectively. With the interface as
proposed this appears to take two calls, which means two threads can
stamp on each other.
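
As a sketch of the semantics I mean, here is a toy in-memory attribute
store where the existence check and the update happen inside one call.
Everything here (demo_set, DEMO_CREATE, DEMO_REPLACE, the fixed-size
store) is invented for illustration. A real implementation would do
the check and the update under whatever lock protects the inode's
attributes, which this single-threaded model does not show.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DEMO_CREATE  0x1   /* hypothetical: fail with EEXIST if name exists */
#define DEMO_REPLACE 0x2   /* hypothetical: fail with ENOENT if name is absent */
#define DEMO_MAX     8

struct demo_attr  { char name[32]; char val[32]; int used; };
struct demo_store { struct demo_attr a[DEMO_MAX]; };

/* Look an attribute up by name; returns NULL if it is not present. */
static struct demo_attr *demo_find(struct demo_store *s, const char *name)
{
    for (int i = 0; i < DEMO_MAX; i++)
        if (s->a[i].used && strcmp(s->a[i].name, name) == 0)
            return &s->a[i];
    return NULL;
}

/* Set name=val, honouring the flags. Returns 0 or a negative errno value. */
int demo_set(struct demo_store *s, const char *name, const char *val, int flags)
{
    struct demo_attr *p;

    assert(s != NULL && name != NULL && val != NULL);
    if (strlen(name) >= sizeof p->name || strlen(val) >= sizeof p->val)
        return -E2BIG;
    p = demo_find(s, name);
    if (p && (flags & DEMO_CREATE))
        return -EEXIST;             /* exclusive create lost the race */
    if (!p && (flags & DEMO_REPLACE))
        return -ENOENT;             /* nothing there to replace */
    if (!p) {
        for (int i = 0; i < DEMO_MAX && !p; i++)
            if (!s->a[i].used)
                p = &s->a[i];
        if (!p)
            return -ENOSPC;
        strcpy(p->name, name);
        p->used = 1;
    }
    strcpy(p->val, val);
    return 0;
}
```

With no flags the call behaves like a plain set: it creates or
overwrites as needed.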

> 
> and where <attribs> is a buffer of length <len> bytes of variable
> length struct attrib records:
> 
> struct attrib {
>       int     rec_len;                /* Length of the whole record:
>                                          should be padded to long
>                                          alignment */
>       int     name_family;            /* Which namespace is the name in? */
>       int     name_len;
>       int     val_len;
>       char    name[variable];         /* byte-aligned */
>       char    val[variable];          /* byte-aligned */
> };

If the typical operation is manipulating a single attribute, then
encoding the attributes into a single byte array like this will
tend to mean the user program has to do a couple of memory copies,
since programmers would probably use a fixed string for the
attribute name and a variable string for the contents. I cannot
speak for other filesystems, but in the case of XFS I think it
means the kernel has to do multiple memory copies too. Passing
pointers down means less work for both sides. The kernel does have
to do more copyin/copyout operations, although in the multiple
attribute case this would happen anyway. If we pass in an unbounded
list of attributes to set, then the kernel would have to copy in the
length components of each attrib structure before copying in the name
and val components; otherwise we may need to allocate a very large
chunk of memory in the kernel.
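
To illustrate the user-space copies, here is a sketch of what packing
a single attribute into one record of the proposed layout might look
like. Since name[variable] is not valid C, I am assuming the record is
a fixed header followed by the name and value bytes. attrib_hdr,
attrib_pack and attrib_ref are names I made up. The attrib_ref struct
at the end shows roughly what a pointer-passing alternative could
carry instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fixed-size part of the quoted struct attrib; name and val bytes follow. */
struct attrib_hdr {
    int rec_len;
    int name_family;
    int name_len;
    int val_len;
};

/* Round n up to a multiple of sizeof(long), as the rec_len comment asks. */
static size_t pad_long(size_t n)
{
    return (n + sizeof(long) - 1) / sizeof(long) * sizeof(long);
}

/*
 * Pack one attribute into buf. Returns the record length in bytes, or 0
 * if the record does not fit. Sketch only: no overflow checks on lengths.
 */
size_t attrib_pack(void *buf, size_t buflen, int family,
                   const char *name, size_t name_len,
                   const void *val, size_t val_len)
{
    struct attrib_hdr h;
    size_t need = pad_long(sizeof h + name_len + val_len);
    unsigned char *p = buf;

    assert(buf != NULL && name != NULL && (val != NULL || val_len == 0));
    if (need > buflen)
        return 0;
    h.rec_len = (int)need;
    h.name_family = family;
    h.name_len = (int)name_len;
    h.val_len = (int)val_len;
    memset(p, 0, need);                           /* zero the padding */
    memcpy(p, &h, sizeof h);
    memcpy(p + sizeof h, name, name_len);         /* copy 1: the name */
    if (val_len)
        memcpy(p + sizeof h + name_len, val, val_len);  /* copy 2: the value */
    return need;
}

/* Hypothetical pointer-passing form: the caller packs nothing. */
struct attrib_ref {
    int         name_family;
    const char *name;
    const void *val;
    size_t      val_len;
};
```

In the packed form both the name and the value are copied once in user
space before the kernel copies the whole buffer in again. In the
pointer form the caller just fills in two pointers and a length.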


> 
> ATR_SET will overwrite an existing attribute, or if the attribute does
> not already exist, will append the new attribute (ie. it does not
> override existing ACL controls, in keeping with the Principle of Least
> Surprise).  If multiple instances of the name already exist, then the
> first one is replaced and subsequent ones deleted.  If supplied with
> an "old" buffer, all old attributes of that name will be returned.

Again, are there filesystems which support multiple instances of
the same attribute?

> 
> For the PREPEND/APPEND/REPLACE operations, the entire old attribute
> set is returned.
> 
> For GET, the <new> specification is read and all attributes which
> match any items in <new> are returned, in the order in which they are
> specified in <new>.  The actual value in <new> is ignored; only the
> name is used.
> 
> For GETALL, <new> is ignored entirely.
> 
> *old_lenp should contain the size of the old attributes buffer on
> entry.  It will contain the number of valid bytes in the old buffer on
> exit.  If the buffer is not sufficiently large to contain all of the
> attributes, E2BIG is returned.
> 
> 
> This is just a first stab at documenting what feels like an
> appropriate API.  It should be extensible enough for the future, but
> is pretty easy to code to already --- existing filesystems don't have
> to deal with any complexity they don't want to. 
> 
> Additionally, the use of well-defined namespaces for attributes means
> that in the future we can implement things like common code for
> generic attribute caching, or process authentication groups for
> non-Unix-ID authentication tokens, without having to duplicate all of
> that work in each individual filesystem.
> 
> 
> The extended attribute patch currently on the acl-devel group simply
> doesn't give us the ability to do extended attributes on any
> filesystem other than ext2, because it has such specific semantics.
> I'd rather avoid that, and I'd rather do so without adding a profusion
> of different ACL and attribute syscalls in the process.
> 
> Cheers,
>  Stephen

For an existing API (which I am not proposing be taken as is) take a look
at the xfs man pages here:

        http://oss.sgi.com/projects/xfs/manpages.html

All the attribute calls are conveniently at the top of the list.

Steve
