Re: Adding linux_link(2) system call, second round

2011-08-02 Thread Emmanuel Dreyfus
On Mon, Aug 01, 2011 at 07:20:30PM +, David Holland wrote:
 Sure. But what does it actually do, such that if you have a symlink it
 doesn't work to copy the symlink instead of hardlink it?

That would probably work for symlinks, since they cannot be updated.
But this would require heavy changes in the way the code is written.
Basically a rename on a symlink would become a readlink/symlink/unlink.
This is not really a portability patch, it is a code rewrite which would
consume more time than I can afford here. I suspect glusterfs developers
will have to do it if they want to support something other than Linux, but
I have no idea when, so it is not wise to hold our breath for it.

llink(2) is a simple change, FreeBSD already went there with linkat(2),
and it makes everything simple. 

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-08-02 Thread Roland C. Dowdeswell
On Tue, Aug 02, 2011 at 08:52:56AM +, Emmanuel Dreyfus wrote:


 On Mon, Aug 01, 2011 at 07:20:30PM +, David Holland wrote:
  Sure. But what does it actually do, such that if you have a symlink it
  doesn't work to copy the symlink instead of hardlink it?
 
 That would probably work for symlinks, since they cannot be updated.
 But this would require heavy changes in the way the code is written.
 Basically a rename on a symlink would become a readlink/symlink/unlink.
 This is not really a portability patch, it is a code rewrite which would
 consume more time than I can afford here. I suspect glusterfs developers
 will have to do it if they want to support something other than Linux, but
 I have no idea when, so it is not wise to hold our breath for it.
 
 llink(2) is a simple change, FreeBSD already went there with linkat(2),
 and it makes everything simple. 

It looks like linkat(2) is POSIX.1-2008 and is implemented by Linux
as well as FreeBSD.  It might be the more portable direction to go.

--
Roland Dowdeswell  http://Imrryr.ORG/~elric/


Re: Adding linux_link(2) system call, second round

2011-08-02 Thread Emmanuel Dreyfus
On Tue, Aug 02, 2011 at 10:02:39AM +0100, Roland C. Dowdeswell wrote:
 It looks like linkat(2) is POSIX.1-2008 and is implemented by Linux
 as well as FreeBSD.  It might be the more portable direction to go.

Right, then everything is simple, this is just the matter of 
implementing a standard system call.

Here is the specification. I will change llink to linkat and commit
shortly.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-08-02 Thread Rhialto
On Tue 02 Aug 2011 at 09:05:27 +, Emmanuel Dreyfus wrote:
 On Tue, Aug 02, 2011 at 10:02:39AM +0100, Roland C. Dowdeswell wrote:
  It looks like linkat(2) is POSIX.1-2008 and is implemented by Linux
  as well as FreeBSD.  It might be the more portable direction to go.
 
 Right, then everything is simple, this is just the matter of 
 implementing a standard system call.

Ok, then we also want openat(2), fchmodat(2) (which seems to be misnamed
and looks more like a chmodat(2)), unlinkat(2), fchownat(2) (same remark
as fchmodat), etc.

openat(2) is similar to, but not the same as, the existing function
fhopen(2). 

These functions can be found here too:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/func.html

FreeBSD seems to have:

faccessat
fchmodat
fchownat
fstatat
futimesat
linkat
mkdirat
mkfifoat
mknodat
openat
readlinkat
renameat
symlinkat
unlinkat

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert  -- There's no point being grown-up if you 
\X/ rhialto/at/xs4all.nl-- can't be childish sometimes. -The 4th Doctor


Re: Adding linux_link(2) system call, second round

2011-08-02 Thread Emmanuel Dreyfus
On Tue, Aug 02, 2011 at 05:45:56PM +0200, Rhialto wrote:
 Ok, then we also want openat(2), fchmodat(2) (which seems to be misnamed
 and looks more like a chmodat(2)), unlinkat(2), fchownat(2) (same remark
 as fchmodat), etc.

And you forgot fexecve(). I agree we want all of them, but I do not think
we want everything at once.

We have linkat(2) which fixes the problem of hard linking symlinks. This
is a small and harmless change.  And we have these *at functions that
allow specifying pathnames relative to a directory specified by a file
descriptor. That means modifying the namei interface, not a challenge, 
but something a bit more intrusive. Therefore I would like to go
incremental, by first supporting this:
linkat(AT_FDCWD, name1, AT_FDCWD, name2, 0)
linkat(AT_FDCWD, name1, AT_FDCWD, name2, AT_SYMLINK_FOLLOW)
and returning ENOSYS if fd1 and fd2 have values other than AT_FDCWD.

Then do the full Extended API set 2. 

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-08-02 Thread David Holland
On Tue, Aug 02, 2011 at 08:52:56AM +, Emmanuel Dreyfus wrote:
   Sure. But what does it actually do, such that if you have a symlink it
   doesn't work to copy the symlink instead of hardlink it?
  
  That would probably work for symlinks, since they cannot be updated.
  But this would require heavy changes in the way the code is written.
  Basically a rename on a symlink would become a readlink/symlink/unlink.

As opposed to link/unlink? I still don't see why this would be more
than a half-dozen lines of code, if that. By your previous
descriptions it already needs to stat the object to check if it's a
directory.

-- 
David A. Holland
dholl...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-08-02 Thread Emmanuel Dreyfus
On Tue, Aug 02, 2011 at 04:30:15PM +, David Holland wrote:
 As opposed to link/unlink? I still don't see why this would be more
 than a half-dozen lines of code, if that. By your previous
 descriptions it already needs to stat the object to check if it's a
 directory.

It is much more code, since it happens on the client, which sends
filesystem operations to lower layers and regains control later using
callbacks. Have a look at the sources (xlator/cluster/dht/dht-rename.c)
and you will see why it is complex.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-08-02 Thread David Holland
On Tue, Aug 02, 2011 at 04:34:12PM +, Emmanuel Dreyfus wrote:
   As opposed to link/unlink? I still don't see why this would be more
   than a half-dozen lines of code, if that. By your previous
   descriptions it already needs to stat the object to check if it's a
   directory.
  
  It is much more code, since it happens on the client, which sends
  filesystem operations to lower layers and regains control later using
  callbacks. Have a look at the sources (xlator/cluster/dht/dht-rename.c)
  and you will see why it is complex.

Where does that path live? glusterfs source?

-- 
David A. Holland
dholl...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-08-02 Thread Emmanuel Dreyfus
David Holland dholland-t...@netbsd.org wrote:

   It is much more code, since it happens on the client, which sends
   filesystem operations to lower layers and regains control later using
   callbacks. Have a look at the sources (xlator/cluster/dht/dht-rename.c)
   and you will see why it is complex.
 
 Where does that path live? glusterfs source?

Yes, get it from here:
http://download.gluster.com/pub/gluster/glusterfs/3.2/3.2.2/
glusterfs-3.2.2.tar.gz

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-08-01 Thread Matthias Drochner

m...@netbsd.org said:
 Both behaviors are standard compliant, since SUSv2 says nothing about
 resolving symlinks or not.

While the DESCRIPTION chapter doesn't say so explicitly,
we have the following in ERRORS:

[ELOOP]
 A loop exists in symbolic links encountered during resolution of the path1
 or path2 argument.

This implies that the intention is that symlinks are followed.

el...@imrryr.org said:
 Or perhaps llink(2) for symmetry with lchmod(2) and lstat(2).

This looks reasonable.

best regards
Matthias









[PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)

2011-08-01 Thread Emmanuel Dreyfus
On Sun, Jul 31, 2011 at 06:36:53PM +, Christos Zoulas wrote:
 I don't have an issue with it as long as:
   - fsck does not get confused
   - filesystems don't need to be modified to support it
   - there is consensus that this is not harmful
   - I am also ambivalent about exposing this in the native abi
 because it will only cause confusion.

Attached is the patch that adds llink(2) and its documentation. The tests I
ran are below (the llink program just calls llink(2)).

fsck has no problem with it, and ffs was not modified. As for it
being harmful, I cannot imagine what could be done with it, but
we could restrict it to root just in case.

On confusion, well, I think the llink name speaks for itself.

# ls -li
total 1
3648 drwxr-xr-x  2 root  wheel  512 Aug  1 11:31 dir
   3 -rw-r--r--  2 root  wheel    0 Aug  1 11:30 file
   3 -rw-r--r--  2 root  wheel    0 Aug  1 11:30 hfile
   5 lrwxr-xr-x  1 root  wheel    3 Aug  1 11:31 sdir -> dir
   4 lrwxr-xr-x  1 root  wheel    4 Aug  1 11:31 sfile -> file
   6 lrwxr-xr-x  1 root  wheel   11 Aug  1 11:32 void -> nonexistent
# /home2/manu/llink sfile hsfile
# /home2/manu/llink sdir hsdir
# /home2/manu/llink void hvoid
# ls -li
total 1
3648 drwxr-xr-x  2 root  wheel  512 Aug  1 11:31 dir
   3 -rw-r--r--  2 root  wheel    0 Aug  1 11:30 file
   3 -rw-r--r--  2 root  wheel    0 Aug  1 11:30 hfile
   5 lrwxr-xr-x  2 root  wheel    3 Aug  1 11:31 hsdir -> dir
   4 lrwxr-xr-x  2 root  wheel    4 Aug  1 11:31 hsfile -> file
   6 lrwxr-xr-x  2 root  wheel   11 Aug  1 11:32 hvoid -> nonexistent
   5 lrwxr-xr-x  2 root  wheel    3 Aug  1 11:31 sdir -> dir
   4 lrwxr-xr-x  2 root  wheel    4 Aug  1 11:31 sfile -> file
   6 lrwxr-xr-x  2 root  wheel   11 Aug  1 11:32 void -> nonexistent

-- 
Emmanuel Dreyfus
m...@netbsd.org
Index: include/unistd.h
===
RCS file: /cvsroot/src/include/unistd.h,v
retrieving revision 1.126
diff -U4 -r1.126 unistd.h
--- include/unistd.h26 Jun 2011 16:42:40 -  1.126
+++ include/unistd.h1 Aug 2011 09:34:20 -
@@ -124,8 +124,9 @@
 pid_t   getppid(void);
 uid_t   getuid(void);
 int isatty(int);
 int link(const char *, const char *);
+int llink(const char *, const char *);
 long    pathconf(const char *, int);
 int pause(void);
 int pipe(int *);
 #if __SSP_FORTIFY_LEVEL == 0
Index: sys/kern/vfs_syscalls.c
===
RCS file: /cvsroot/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.430
diff -U4 -r1.430 vfs_syscalls.c
--- sys/kern/vfs_syscalls.c 17 Jun 2011 14:23:51 -  1.430
+++ sys/kern/vfs_syscalls.c 1 Aug 2011 09:34:20 -
@@ -1769,27 +1769,32 @@
 }
 
 /*
  * Make a hard file link.
+ * The flag argument can be 
+ * - FOLLOW for sys_link, to link to symlink target
+ * - NOFOLLOW for sys_llink, to link to symlink itself
  */
 /* ARGSUSED */
-int
-sys_link(struct lwp *l, const struct sys_link_args *uap, register_t *retval)
+static int
+do_sys_link(struct lwp *l, const char *path, const char *link, 
+   int flags, register_t *retval) 
 {
-   /* {
-   syscallarg(const char *) path;
-   syscallarg(const char *) link;
-   } */
struct vnode *vp;
struct pathbuf *linkpb;
struct nameidata nd;
+   namei_simple_flags_t namei_simple_flags;
int error;
 
-   error = namei_simple_user(SCARG(uap, path),
-   NSM_FOLLOW_TRYEMULROOT, &vp);
+   if (flags & FOLLOW)
+   namei_simple_flags = NSM_FOLLOW_TRYEMULROOT;
+   else
+   namei_simple_flags =  NSM_NOFOLLOW_TRYEMULROOT;
+
+   error = namei_simple_user(path, namei_simple_flags, &vp);
if (error != 0)
return (error);
-   error = pathbuf_copyin(SCARG(uap, link), &linkpb);
+   error = pathbuf_copyin(link, &linkpb);
if (error) {
goto out1;
}
NDINIT(&nd, CREATE, LOCKPARENT | TRYEMULROOT, linkpb);
@@ -1826,8 +1831,35 @@
goto out2;
 }
 
 int
+sys_link(struct lwp *l, const struct sys_link_args *uap, register_t *retval)
+{
+   /* {
+   syscallarg(const char *) path;
+   syscallarg(const char *) link;
+   } */
+   const char *path = SCARG(uap, path);
+   const char *link = SCARG(uap, link);
+
+   return do_sys_link(l, path, link, FOLLOW, retval);
+}
+
+int
+sys_llink(struct lwp *l, const struct sys_llink_args *uap, register_t *retval)
+{
+   /* {
+   syscallarg(const char *) path;
+   syscallarg(const char *) link;
+   } */
+   const char *path = SCARG(uap, path);
+   const char *link = SCARG(uap, link);
+
+   return do_sys_link(l, path, link, NOFOLLOW, retval);
+}
+
+
+int
 do_sys_symlink(const char *patharg, const char *link, enum uio_seg seg)
 {
struct proc *p = curproc;
struct vattr vattr;
@@ -1850,8 +1882,10 @@
   

Re: Adding linux_link(2) system call, second round

2011-08-01 Thread Joerg Sonnenberger
On Mon, Aug 01, 2011 at 04:05:33AM +0200, Emmanuel Dreyfus wrote:
 Joerg Sonnenberger jo...@britannica.bec.de wrote:
 
  Given the very small number of programs that manage to mess up the
  symlink usage, I'm kind of opposed to providing another system call just
  as work around for them.
 
 You did not explain what problems it would introduce, did you?

You are adding a lot of complexity to workaround portability issues of a
single application. Let's start the other way -- has FreeBSD added
llink(2)? What about OSX? Solaris?

Joerg


Re: Adding linux_link(2) system call, second round

2011-08-01 Thread Emmanuel Dreyfus
Joerg Sonnenberger jo...@britannica.bec.de wrote:

  You did not explain what problems it would introduce, did you?
 You are adding a lot of complexity to workaround portability issues of a
 single application. 

It is not that complex. See the patch I posted this morning, the thing
is really simple, and it works quite well.

 Let's start the other way -- has FreeBSD added llink(2)? 
 What about OSX? Solaris?

OSX did not in X.5; I do not know about the others. But we do not know if
developers of these systems are aware of this portability problem. We
may be the first to get there; that does not mean we have to be wrong.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-08-01 Thread Rhialto
On Mon 01 Aug 2011 at 10:50:50 +0200, Matthias Drochner wrote:
 While the DESCRIPTION chapter doesn't say so explicitly,
 we have the following in ERRORS:
 
 [ELOOP]
  A loop exists in symbolic links encountered during resolution of the path1
  or path2 argument.
 
 This implies that the intention is that symlinks are followed.

Non-final symlinks are always followed, even for lchmod(2), readlink(2)
and similar functions, right? For instance, readlink(2)'s man page also
mentions ELOOP.

Another argument that allowing hard links to symlinks is only natural:
The rename(2) operation does it in any case (on ffs). And one *can* rename
symlinks.

From /usr/src/sys/ufs/ufs/ufs_vnops.c:

/*
 * Rename vnode operation
 *  rename(foo, bar);
 * is essentially
 *  unlink(bar);
 *  link(foo, bar);
 *  unlink(foo);
 * but ``atomically''.  Can't do full commit without saving state in the
 * inode on disk which isn't feasible at this time.  Best we can do is
 * always guarantee the target exists.

The code below that doesn't appear to special-case symlinks, only
directories.

FreeBSD also allows hard links to symlinks, with the ln -P option.
(This must have been introduced in version 7 or 8; 6.1 doesn't have it
but 8.1 does)

LN(1)   FreeBSD General Commands Manual  LN(1)

NAME
 ln, link -- link files

SYNOPSIS
 ln [-L | -P | -s [-F]] [-f | -iw] [-hnv] source_file [target_file]
 ln [-L | -P | -s [-F]] [-f | -iw] [-hnv] source_file ... target_dir
 link source_file target_file

DESCRIPTION
...
 -P    When creating a hard link to a symbolic link, create a hard link to
       the symbolic link itself.  This option cancels the -L option.

test$ ln -s foo bar
test$ l
total 8
drwxr-xr-x   2 olafs  vb   3 Aug  1 12:25 ./
drwxr-xr-x  23 olafs  vb  29 Aug  1 12:25 ../
lrwxr-xr-x   1 olafs  vb   3 Aug  1 12:25 bar@ -> foo
test$ ln -P bar baz
test$ l
total 8
drwxr-xr-x   2 olafs  vb   4 Aug  1 12:25 ./
drwxr-xr-x  23 olafs  vb  29 Aug  1 12:25 ../
lrwxr-xr-x   2 olafs  vb   3 Aug  1 12:25 bar@ -> foo
lrwxr-xr-x   2 olafs  vb   3 Aug  1 12:25 baz@ -> foo

I tested both on ffs and zfs. The results are the same.


-Olaf.
-- 
___ Olaf 'Rhialto' Seibert  -- There's no point being grown-up if you 
\X/ rhialto/at/xs4all.nl-- can't be childish sometimes. -The 4th Doctor


Re: Adding linux_link(2) system call, second round

2011-08-01 Thread Emmanuel Dreyfus
Rhialto rhia...@falu.nl wrote:

 LN(1)   FreeBSD General Commands Manual  LN(1)
(...)
  -P    When creating a hard link to a symbolic link, create a hard link to
        the symbolic link itself.  This option cancels the -L option.

I can add this to NetBSD as well if it is considered desirable.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)

2011-08-01 Thread Christos Zoulas
In article 20110801094633.ga17...@homeworld.netbsd.org,
Emmanuel Dreyfus  m...@netbsd.org wrote:
-=-=-=-=-=-

On Sun, Jul 31, 2011 at 06:36:53PM +, Christos Zoulas wrote:
 I don't have an issue with it as long as:
  - fsck does not get confused
  - filesystems don't need to be modified to support it
  - there is consensus that this is not harmful
  - I am also ambivalent about exposing this in the native abi
because it will only cause confusion.

Attached is the patch that adds llink(2) and its documentation. The tests I
ran are below (the llink program just calls llink(2)).

fsck has no problem with it, and ffs was not modified. As for it
being harmful, I cannot imagine what could be done with it, but
we could restrict it to root just in case.

On confusion, well, I think the llink name speaks for itself.

Except for the ktruser() call, looks good to me (my personal opinion).

christos



Re: Adding linux_link(2) system call, second round

2011-08-01 Thread David Holland
On Mon, Aug 01, 2011 at 04:05:32AM +0200, Emmanuel Dreyfus wrote:
   You still haven't explained what glusterfs is doing that's so evil or
   why it can't be fixed by having it copy the symlink when that's the
   case in question.
  
  glusterfs uses the native filesystem as its storage backend. When you
  rename a filesystem object in the distributed and replicated setup, they
  have to make sure it remains accessible by another client during the
  operation. 
  
  Directories are all present on all servers and therefore are just
  treated by a rename(2). Other objects are stored on some server and are
  retrieved using a DHT. When they are renamed, they are treated by a
  distributed link(2)/rename(2)/unlink(2) algorithm. This breaks on NetBSD
  when the object is a symlink to a directory or a symlink to a
  nonexistent target, since you cannot link(2) to such an object. 

Sure. But what does it actually do, such that if you have a symlink it
doesn't work to copy the symlink instead of hardlink it?

  The fix is not straightforward, and requires a change in the way glusterfs
  stores symlinks in its distributed and replicated setup.

Why?

-- 
David A. Holland
dholl...@netbsd.org


Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)

2011-08-01 Thread Emmanuel Dreyfus
Christos Zoulas chris...@astron.com wrote:

 Except for the ktruser() call, looks good to me (my personal opinion).

Um, yes, that one was another pending patch I had for later. For now
ktrace does not show symlink(2) targets, which is annoying: sometimes
you cannot tell what is going on.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)

2011-08-01 Thread David Laight
On Mon, Aug 01, 2011 at 09:46:33AM +, Emmanuel Dreyfus wrote:
 + if (flags & FOLLOW)
 + namei_simple_flags = NSM_FOLLOW_TRYEMULROOT;
 + else
 + namei_simple_flags =  NSM_NOFOLLOW_TRYEMULROOT;
 +
 + error = namei_simple_user(path, namei_simple_flags, &vp);

Notwithstanding dh's comment, why not pass in all the namei flags?

 + error = namei_simple_user(path, flags, &vp);

David

-- 
David Laight: da...@l8s.co.uk


Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)

2011-08-01 Thread David Holland
On Mon, Aug 01, 2011 at 09:31:11PM +0100, David Laight wrote:
   +  if (flags & FOLLOW)
   +  namei_simple_flags = NSM_FOLLOW_TRYEMULROOT;
   +  else
   +  namei_simple_flags =  NSM_NOFOLLOW_TRYEMULROOT;
   +
   +  error = namei_simple_user(path, namei_simple_flags, &vp);
  
  Notwithstanding dh's comment, why not pass in all the namei flags?
  
   +  error = namei_simple_user(path, flags, &vp);

Because I gimmicked up the flags for namei_simple specifically to
disallow that sort of thing :-)

-- 
David A. Holland
dholl...@netbsd.org


Re: [PATCH] llink(2) (was: Re: Adding linux_link(2) system call, second round)

2011-08-01 Thread David Holland
On Mon, Aug 01, 2011 at 09:00:36PM +, David Holland wrote:
Notwithstanding dh's comment, why not pass in all the namei flags?

 +   error = namei_simple_user(path, flags, &vp);
  
  Because I gimmicked up the flags for namei_simple specifically to
  disallow that sort of thing :-)

Er, or rather, you can't use | on them. Just passing them would work,
but again, please don't...
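
One way to get flag values that cannot be combined with | is to make them opaque pointers instead of integers. The sketch below only imitates that idea in userland; every name in it (lookup_mode_t, the LM_* constants, pick_mode) is hypothetical, and it is not claimed to be how namei_simple is actually implemented.

```c
#include <stddef.h>

/* Hypothetical names throughout; this only imitates the idea. */
struct lookup_mode {
	int follow;		/* follow a final symlink? */
	int tryemulroot;	/* try the emulation root first? */
};
typedef const struct lookup_mode *lookup_mode_t;

static const struct lookup_mode modes[] = {
	{ 0, 0 }, { 0, 1 }, { 1, 0 }, { 1, 1 },
};

/* Each mode is a distinct pointer, so "A | B" does not compile. */
const lookup_mode_t LM_NOFOLLOW_NOEMULROOT = &modes[0];
const lookup_mode_t LM_NOFOLLOW_TRYEMULROOT = &modes[1];
const lookup_mode_t LM_FOLLOW_NOEMULROOT = &modes[2];
const lookup_mode_t LM_FOLLOW_TRYEMULROOT = &modes[3];

/* Callers choose one of the predefined modes, as the patch does. */
lookup_mode_t
pick_mode(int follow)
{
	return follow ? LM_FOLLOW_TRYEMULROOT : LM_NOFOLLOW_TRYEMULROOT;
}

/* The consumer decodes the mode by dereferencing it. */
int
mode_follows(lookup_mode_t m)
{
	return m->follow;
}
```

The compiler rejects any attempt to OR two such values together, which restricts callers to the small set of combinations the interface author chose to allow.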

-- 
David A. Holland
dholl...@netbsd.org


Adding linux_link(2) system call, second round

2011-07-31 Thread Emmanuel Dreyfus
Quick summary for the impatient: NetBSD link(2) first resolves symlinks
before doing the actual link to the target. As a result, NetBSD link(2)
fails on symlinks to directories or to nonexistent targets.

On the other hand, Linux link(2) is dumb and just creates a second
symlink with the same inode. Therefore it does not care about the
symlink target, and will succeed even if it is a directory or if it is
nonexistent.
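
The difference can be probed at run time with a dangling symlink. The sketch below is an assumption-laden illustration: link_follows_symlinks and the probe file names are invented here, and only an ENOENT failure is treated as evidence that link(2) resolved the symlink.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Probe link(2) with a dangling symlink inside "dir" (must be writable).
 * Returns 1 if link(2) resolved the symlink and failed with ENOENT,
 * 0 if it linked the symlink itself, -1 if the probe was inconclusive.
 */
int
link_follows_symlinks(const char *dir)
{
	char src[1024], dst[1024];
	int rc, saved;

	snprintf(src, sizeof(src), "%s/probe-src.%ld", dir, (long)getpid());
	snprintf(dst, sizeof(dst), "%s/probe-dst.%ld", dir, (long)getpid());

	if (symlink("probe-nonexistent-target", src) == -1)
		return -1;

	rc = link(src, dst);	/* the behavior under test */
	saved = errno;

	/* Clean up whatever was created. */
	if (rc == 0)
		unlink(dst);
	unlink(src);

	if (rc == 0)
		return 0;
	return saved == ENOENT ? 1 : -1;
}
```

Going by the description above, this would be expected to return 1 on NetBSD and 0 on Linux.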

Both behaviors are standard compliant, since SUSv2 says nothing about
resolving symlinks or not. I found at least one program (glusterfs),
which assumes the Linux behavior, and is a real pain to fix on NetBSD
because of that.

I proposed implementing a linux_link(2), or lazy_link(2), whichever
sounds nicer. It seems it did not reach consensus, but I am not sure I
understood why: what are the problems that would be introduced by adding
such a system call? At least I can tell what benefit it would have: it
would ease porting from Linux.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-07-31 Thread Christos Zoulas
In article 1k5abxi.a8h289rm4jc3m%m...@netbsd.org,
Emmanuel Dreyfus m...@netbsd.org wrote:
Quick summary for the impatient: NetBSD link(2) first resolves symlinks
before doing the actual link to the target. As a result, NetBSD link(2)
fails on symlinks to directories or to nonexistent targets.

On the other hand, Linux link(2) is dumb and just creates a second
symlink with the same inode. Therefore it does not care about the
symlink target, and will succeed even if it is a directory or if it is
nonexistent.

Both behaviors are standard compliant, since SUSv2 says nothing about
resolving symlinks or not. I found at least one program (glusterfs),
which assumes the Linux behavior, and is a real pain to fix on NetBSD
because of that.

I proposed implementing a linux_link(2), or lazy_link(2), whichever
sounds nicer. It seems it did not reach consensus, but I am not sure I
understood why: what are the problems that would be introduced by adding
such a system call? At least I can tell what benefit it would have: it
would ease porting from Linux.

I don't have an issue with it as long as:
- fsck does not get confused
- filesystems don't need to be modified to support it
- there is consensus that this is not harmful
- I am also ambivalent about exposing this in the native abi
  because it will only cause confusion.

Also perhaps just call it link2(from, to, flags) in the long tradition
of adding a number to existing syscalls when extending them ;-)

christos



Re: Adding linux_link(2) system call, second round

2011-07-31 Thread Christos Zoulas
On Jul 31,  9:18pm, el...@imrryr.org (Roland C. Dowdeswell) wrote:
-- Subject: Re: Adding linux_link(2) system call, second round

| On Sun, Jul 31, 2011 at 06:36:53PM +, Christos Zoulas wrote:
| 
| 
|  Also perhaps just call it link2(from, to, flags) in the long tradition
|  of adding a number to existing syscalls when extending them ;-)
| 
| Or perhaps llink(2) for symmetry with lchmod(2) and lstat(2).

I like that even more!

christos


Re: Adding linux_link(2) system call, second round

2011-07-31 Thread Joerg Sonnenberger
On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
 Both behaviors are standard compliant, since SUSv2 says nothing about
 resolving symlinks or not. I found at least one program (glusterfs),
 which assumes the Linux behavior, and is a real pain to fix on NetBSD
 because of that.

The standard is explicitly open on this to allow filesystems that
implement symlinks without using inodes. Essentially, it is valid to
store a symlink inside the directory entry itself. That's one of the
reasons why no change semantic is provided either.

Given the very small number of programs that manage to mess up the
symlink usage, I'm kind of opposed to providing another system call just
as work around for them. Besides, NetBSD isn't the only implementation
following this strategy...

Joerg


Re: Adding linux_link(2) system call, second round

2011-07-31 Thread David Holland
On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
  Quick summary for the impatient: NetBSD link(2) first resolves symlinks
  before doing the actual link to the target. As a result, NetBSD link(2)
  fails on symlinks to directories or to nonexistent targets.
  
  On the other hand, Linux link(2) is dumb and just creates a second
  symlink with the same inode. Therefore it does not care about the
  symlink target, and will succeed even if it is a directory or if it is
  nonexistent.
  
  Both behaviors are standard compliant, since SUSv2 says nothing about
  resolving symlinks or not. I found at least one program (glusterfs),
  which assumes the Linux behavior, and is a real pain to fix on NetBSD
  because of that.

You still haven't explained what glusterfs is doing that's so evil or
why it can't be fixed by having it copy the symlink when that's the
case in question.

I remain not thrilled about adding this, mostly on the grounds that
adding variant functionality with no clear purpose or value tends to
create maintenance hassles in the long run.

-- 
David A. Holland
dholl...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-07-31 Thread Emmanuel Dreyfus
David Holland dholland-t...@netbsd.org wrote:

 You still haven't explained what glusterfs is doing that's so evil or
 why it can't be fixed by having it copy the symlink when that's the
 case in question.

glusterfs uses the native filesystem as its storage backend. When you
rename a filesystem object in the distributed and replicated setup, they
have to make sure it remains accessible by another client during the
operation.

Directories are all present on all servers and therefore are just
treated by a rename(2). Other objects are stored on some server and are
retrieved using a DHT. When they are renamed, they are treated by a
distributed link(2)/rename(2)/unlink(2) algorithm. This breaks on NetBSD
when the object is a symlink to a directory or a symlink to a
nonexistent target, since you cannot link(2) to such an object.

The fix is not straightforward, and requires a change in the way glusterfs
stores symlinks in its distributed and replicated setup. I suspect it
may involve treating such objects like directories, and having them
duplicated on all servers. An alternative would be to sacrifice the
guarantee that symlinks are available during a rename, at least for
NetBSD.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-07-31 Thread Emmanuel Dreyfus
Joerg Sonnenberger jo...@britannica.bec.de wrote:

 Given the very small number of programs that manage to mess up the
 symlink usage, I'm kind of opposed to providing another system call just
 as work around for them.

You did not explain what problems it would introduce, did you?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: Adding linux_link(2) system call, second round

2011-07-31 Thread Christos Zoulas
In article 20110731224944.ga23...@britannica.bec.de,
Joerg Sonnenberger  jo...@britannica.bec.de wrote:
On Sun, Jul 31, 2011 at 07:49:20PM +0200, Emmanuel Dreyfus wrote:
 Both behaviors are standard compliant, since SUSv2 says nothing about
 resolving symlinks or not. I found at least one program (glusterfs),
 which assumes the Linux behavior, and is a real pain to fix on NetBSD
 because of that.

The standard is explicitly open on this to allow filesystems that
implement symlinks without using inodes. Essentially, it is valid to
store a symlink inside the directory entry itself. That's one of the
reasons why no change semantic is provided either.

And approximately this (storing the symlink data inside the source
inode, without using an extra inode, if the link target fit) was
attempted in BSD4.4 and it failed miserably. We had to undo it, and
use separate inodes again.

christos