Re: struct file reference at VFS level

2016-04-29 Thread Emmanuel Dreyfus
On Fri, Apr 29, 2016 at 10:13:13AM +0200, J. Hannken-Illjes wrote:
> Advertised locks are already implemented inside the file systems and
> we have VOP_ADVLOCK() with ?op == F_GETLK?. What else is needed to
> make the decision here?

In advisory locks, the only place where the filesystem checks locks
status is when you try to acquire a lock. VOP_ADVLOCK has the duty 
to tell you if the region is already locked by someone else. But any
other operation such as VOP_WRITE just proceed without regards for 
locking status. The lock is adivsory and should be enforced by the
application.

With mandatory locks, the filesystems will actually enforce the locks.
VOP_WRITE to a region locked by another process must fail.

> I strongly object against an implementation whose specification starts:
(...)
>   The Linux implementation is prey to a number of difficult-to-fix race
>   conditions which in practice make it not dependable:

This warning about the Linux *implementation* status is not actually 
part of the specification. In other words, this specification does not
mandate the implementation to be broken.

Moreover, the only candidate so far to implement mandatory locking
is a filesystem in userspace, which means the broken cases described
do not really apply: you can always recover by killing the userspace
process.


-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-29 Thread J. Hannken-Illjes

> On 28 Apr 2016, at 10:19, Emmanuel Dreyfus  wrote:
> 
>> Wouldn't it be better to put the mandatory locking at the file->vnode
>> level, operations vn_read() and vn_write()?
> 
> Do you mean that instead of providing support so that filesystems 
> can implement mandatory locks, we should just implement it above VFS
> for all filesystems? That would work for local filesystems, but I 
> understand it cannot be done for distributed filesystems, as the 
> local kernel knows nothing about locks set by other machines: you 
> need to let the filesystem handle it.

Advertised locks are already implemented inside the file systems and
we have VOP_ADVLOCK() with “op == F_GETLK”. What else is needed to
make the decision here?

>> Could you explain the proposed semantics in detail?
> 
> We have this:
> https://www.kernel.org/doc/Documentation/filesystems/mandatory-locking.txt

I strongly object against an implementation whose specification starts:

  0. Why you should avoid mandatory locking
  -

  The Linux implementation is prey to a number of difficult-to-fix race
  conditions which in practice make it not dependable:

  - The write system call checks for a mandatory lock only once
at its start.  It is therefore possible for a lock request to
be granted after this check but before the data is modified.
A process may then see file data change even while a mandatory
lock was held.
  - Similarly, an exclusive lock may be granted on a file after
the kernel has decided to proceed with a read, but before the
read has actually completed, and the reading process may see
the file data in a state which should not have been visible
to it.
  - Similar races make the claimed mutual exclusion between lock
and mmap similarly unreliable.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: struct file reference at VFS level

2016-04-28 Thread Emmanuel Dreyfus
On Thu, Apr 28, 2016 at 09:54:05AM +0200, J. Hannken-Illjes wrote:
> In your example process 2 writes without locking, is it part of your
> proposed semantics to add an implicit lock/unlock to read and write
> operations?

The given scenario is a test case for mandatory lock support. Process 2 
does not acquire a lock and tries to write to a locked part. With advisory
locks, the write succeeds, but with mandatory locks, it must fail. We
use O_NONBLOCK for convenience so that it fails immediatly, but this is
not a requirement for the test.

> Wouldn't it be better to put the mandatory locking at the file->vnode
> level, operations vn_read() and vn_write()?

Do you mean that instead of providing support so that filesystems 
can implement mandatory locks, we should just implement it above VFS
for all filesystems? That would work for local filesystems, but I 
understand it cannot be done for distributed filesystems, as the 
local kernel knows nothing about locks set by other machines: you 
need to let the filesystem handle it.

> Could you explain the proposed semantics in detail?

We have this:
https://www.kernel.org/doc/Documentation/filesystems/mandatory-locking.txt

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-28 Thread Emmanuel Dreyfus
On Thu, Apr 28, 2016 at 04:57:28PM +1000, matthew green wrote:
> i don't know, but not in kauth_cred_t.  don't go linking systems
> together like that.  don't abuse kauth to shove some random other
> data you need simply because that's a place you found you could
> change.  the usage of it in VFS is as a consumer, not as a storage
> mechanism.

If it does not go into kauth_cred, then we need to add a field to 
many VFS methods. While there it may be interesting to add something
extensible, like a chaine list of typed values.

-- 
Emmanuel Dreyfus
m...@netbsd.org


re: struct file reference at VFS level

2016-04-28 Thread matthew green
Emmanuel Dreyfus writes:
> matthew green  wrote:
> 
> > you do realise that kauth_cred_t is shared across processes in 
> > some cases but not all, right?
> 
> VFS uses a kauth_cred_t from struct file f_cred field, which is tied to
> the struct file. The value is initialized as curlwp->l_cred in
> fd_allocfile(). 
> 
> My proposed patch replaces this f_cred by an augmented copy in
> open_setfp().

so you want to create an entirely separate copy of kauth_cred_t for
every file opened like this just to provide a single ID?  that is a
huge waste of space:

(gdb) p sizeof(struct kauth_cred)
$2 = 176

currently the vast majority of kauth_cred_t's are shared.

> > i don't think kauth_cred_t is the right place to try to store
> > random crap related to file systems.
> 
> Where do you think that data should go?

i don't know, but not in kauth_cred_t.  don't go linking systems
together like that.  don't abuse kauth to shove some random other
data you need simply because that's a place you found you could
change.  the usage of it in VFS is as a consumer, not as a storage
mechanism.


.mrg.


Re: struct file reference at VFS level

2016-04-28 Thread Emmanuel Dreyfus
matthew green  wrote:

> you do realise that kauth_cred_t is shared across processes in 
> some cases but not all, right?

VFS uses a kauth_cred_t from struct file f_cred field, which is tied to
the struct file. The value is initialized as curlwp->l_cred in
fd_allocfile(). 

My proposed patch replaces this f_cred by an augmented copy in
open_setfp().

> i don't think kauth_cred_t is the right place to try to store
> random crap related to file systems.

Where do you think that data should go?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-27 Thread Emmanuel Dreyfus
Joerg Sonnenberger  wrote:

> Can you please start with a consistent proposal of what the end result
> should look like before adding random pieces? I'm sure, but this feels
> like a very adhoc hack to cover a few corner cases for some strange out
> of tree target without a clean idea on how it affects the long term
> design of the VFS layer. 

Well, since kauth_cred_t can be extended, I foresee no need to change
VFS design. one additionnal credential is required here, struct file *,
we can just optionaly add it up like I did in this patch.

There is a possible exception: VOP_FALLOCATE touches file content, and
it does not pass kauth_cred_t. Obviously this was forgotten when the VOP
was created.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-27 Thread Joerg Sonnenberger
On Wed, Apr 27, 2016 at 03:58:43PM +, Emmanuel Dreyfus wrote:
> On Sun, Apr 24, 2016 at 07:11:37PM +, David Holland wrote:
> > Since you said fuse has a way to do that but it doesn't work for our
> > fuse, I guess the right way forward is to make it work in our fuse.
> > What's required? Just send an arbitrary ID associated with the open
> > through puffs to userland?
> 
> Here is the first part: a MNT_FILECRED mount option that cause
> the struct file to be attached to VOP credentials. It builds but
> I have not yet tested, as I need the second part in PUFFS for that.

Can you please start with a consistent proposal of what the end result
should look like before adding random pieces? I'm sure, but this feels
like a very adhoc hack to cover a few corner cases for some strange out
of tree target without a clean idea on how it affects the long term
design of the VFS layer. I'm not saying the current state is perfect or
that the approach has no merits, but changes of this nature just because
FUSE is built to assume Linux behavior is very very bad.

Joerg


Re: struct file reference at VFS level

2016-04-27 Thread Emmanuel Dreyfus
David Holland  wrote:

> ok, I don't understand: even assuming that stashing this information
> in the credentials is the right place (which I don't think I agree
> with but haven't really formed a coherent opinion about yet)... why on
> earth should it be conditional?

Because if the filesystem does not implement mandatory locks (or some
other feature that requires a reference to struct file), the information
is of no use. Since this is the case for all our filesystems except
PUFFS/FUSE, I made sure the extra code is not executed when it is not
needed.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-27 Thread David Holland
On Wed, Apr 27, 2016 at 03:58:43PM +, Emmanuel Dreyfus wrote:
 > On Sun, Apr 24, 2016 at 07:11:37PM +, David Holland wrote:
 > > Since you said fuse has a way to do that but it doesn't work for our
 > > fuse, I guess the right way forward is to make it work in our fuse.
 > > What's required? Just send an arbitrary ID associated with the open
 > > through puffs to userland?
 > 
 > Here is the first part: a MNT_FILECRED mount option that cause
 > the struct file to be attached to VOP credentials. It builds but
 > I have not yet tested, as I need the second part in PUFFS for that.

ok, I don't understand: even assuming that stashing this information
in the credentials is the right place (which I don't think I agree
with but haven't really formed a coherent opinion about yet)... why on
earth should it be conditional?

-- 
David A. Holland
dholl...@netbsd.org


Re: struct file reference at VFS level

2016-04-27 Thread Emmanuel Dreyfus
On Sun, Apr 24, 2016 at 07:11:37PM +, David Holland wrote:
> Since you said fuse has a way to do that but it doesn't work for our
> fuse, I guess the right way forward is to make it work in our fuse.
> What's required? Just send an arbitrary ID associated with the open
> through puffs to userland?

Here is the first part: a MNT_FILECRED mount option that cause
the struct file to be attached to VOP credentials. It builds but
I have not yet tested, as I need the second part in PUFFS for that.


Index: sys/sys/fstypes.h
===
RCS file: /cvsroot/src/sys/sys/fstypes.h,v
retrieving revision 1.33
diff -U4 -r1.33 fstypes.h
--- sys/sys/fstypes.h   6 May 2015 15:57:08 -   1.33
+++ sys/sys/fstypes.h   27 Apr 2016 15:54:05 -
@@ -79,12 +79,11 @@
  *
  * Unmount uses MNT_FORCE flag.
  *
  * Note that all mount flags are listed here.  if you need to add one, take
- * one of the __MNT_UNUSED flags.
+ * one of the __MNT_UNUSED flags (none available currently, sorry)
  */
 
-#define__MNT_UNUSED1   0x0020
 
 #defineMNT_RDONLY  0x0001  /* read only filesystem */
 #defineMNT_SYNCHRONOUS 0x0002  /* file system written 
synchronously */
 #defineMNT_NOEXEC  0x0004  /* can't exec from filesystem */
@@ -94,8 +93,9 @@
 #defineMNT_ASYNC   0x0040  /* file system written 
asynchronously */
 #defineMNT_NOCOREDUMP  0x8000  /* don't write core dumps to 
this FS */
 #defineMNT_RELATIME0x0002  /* only update access time if 
mod/ch */
 #defineMNT_IGNORE  0x0010  /* don't show entry in df */
+#defineMNT_FILECRED0x0020  /* provide file_t in VFS ops 
creds */
 #defineMNT_DISCARD 0x0080  /* use DISCARD/TRIM if 
supported */
 #defineMNT_EXTATTR 0x0100  /* enable extended attributes */
 #defineMNT_LOG 0x0200  /* Use logging */
 #defineMNT_NOATIME 0x0400  /* Never update access times in 
fs */
Index: sys/sys/kauth.h
===
RCS file: /cvsroot/src/sys/sys/kauth.h,v
retrieving revision 1.73
diff -U4 -r1.73 kauth.h
--- sys/sys/kauth.h 6 Oct 2015 22:13:39 -   1.73
+++ sys/sys/kauth.h 27 Apr 2016 15:54:05 -
@@ -85,8 +85,12 @@
specificdata_reference cr_sd;   /* specific data */
 };
 #endif
 
+#ifdef _KERNEL
+extern kauth_key_t kauth_filecred_key;;
+#endif
+
 /*
  * Possible return values for a listener.
  */
 #defineKAUTH_RESULT_ALLOW  0   /* allow access */
Index: sys/secmodel/secmodel.c
===
RCS file: /cvsroot/src/sys/secmodel/secmodel.c,v
retrieving revision 1.2
diff -U4 -r1.2 secmodel.c
--- sys/secmodel/secmodel.c 4 Nov 2014 16:01:58 -   1.2
+++ sys/secmodel/secmodel.c 27 Apr 2016 15:54:05 -
@@ -37,8 +37,11 @@
 #include 
 #include 
 #include 
 
+/* kauth key for MNT_FILECRED mount option */
+kauth_key_t kauth_filecred_key;
+
 /* List of secmodels, parameters, and lock. */
 static LIST_HEAD(, secmodel_descr) secmodels =
 LIST_HEAD_INITIALIZER(secmodels);
 static unsigned int secmodel_copy_cred_on_fork = false;
@@ -61,8 +64,10 @@
 
rw_init(_lock);
 
secmodel_copy_cred_on_fork = false;
+
+   (void)kauth_register_key(NULL, _filecred_key);
 }
 
 /*
  * Register a new secmodel.
Index: sys/kern/vfs_syscalls.c
===
RCS file: /cvsroot/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.504
diff -U4 -r1.504 vfs_syscalls.c
--- sys/kern/vfs_syscalls.c 28 Nov 2015 15:26:29 -  1.504
+++ sys/kern/vfs_syscalls.c 27 Apr 2016 15:54:05 -
@@ -218,8 +218,18 @@
fp->f_type = DTYPE_VNODE;
fp->f_ops = 
fp->f_vnode = vp;
 
+   if (vp->v_mount->mnt_flag & MNT_FILECRED) {
+   kauth_cred_t cred;
+
+   cred = kauth_cred_dup(fp->f_cred);
+   kauth_cred_free(fp->f_cred);
+   fp->f_cred = cred;
+
+   kauth_cred_setdata(cred, kauth_filecred_key, fp);
+   }
+
if (flags & (O_EXLOCK | O_SHLOCK)) {
struct flock lf;
int type;
 

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-24 Thread Emmanuel Dreyfus
David Holland  wrote:

> Ah, I see -- you were talking about access to the flags of the struct
> file, and that isn't necessary. 

Yes, sorry for that confusing part. In fact, the file flags are quite
useful to spot the problem in a test case and this is why I insisted on
them, but indeed they are not the core of the problem.

> What you need is the locking
> principal, which for some glusterfs reason is supposed to be the open
> (potentially shared by many processes) and not the current process.

Correct.

> Since you said fuse has a way to do that but it doesn't work for our
> fuse, I guess the right way forward is to make it work in our fuse.
> What's required? Just send an arbitrary ID associated with the open
> through puffs to userland?

My understanding is that we need to send an opaque reference for struct
file (which could just be the struct file pointer itself).

> Which vnode operations need this information to make the fuse thing
> work fully? (Whether or not glusterfs needs them all is less relevant;
> something else will eventually anyway.)

Any operation that deals with file content may hit a lock: VOP_READ,
VOP_READLINK, VOP_WRITE, etc... Fortunately all of them send a
kauth_cred_t, and that structure was designed with the ability to cary
extra data: we can attach struct file * without changing the VFS
interface, and this could be done based on a mount option.

I have still  to make sure other operations (which do not access file
content) do not need reference to struct file, but I see no reason why
they would.

PUFFS is another story. Everywhere VFS has a kauth_cred_t, PUFFS has a
struct puffs_kcred in the message, but that was not designed to be
variable-length, hence adding data here means adding new message
versions.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-24 Thread David Holland
On Sun, Apr 24, 2016 at 04:40:43AM +0200, Emmanuel Dreyfus wrote:
 > > If something in fuse is causing these cases to share the same open and
 > > thus the same struct file, fuse is broken. Fix fuse first.
 > > 
 > > If that isn't what's happening, the next possible problem is that
 > > puffs/refuse/whatnot is the losing IO_NDELAY flag of VOP_WRITE. If so,
 > > fuse is broken. Fix fuse first.
 > 
 > If you want to implement mandatory locks, the O_NONBLOCK / IO_NDELAY
 > just determines how a failure should occur: by waiting for the lock to
 > be released, or by failing with EAGAIN.
 > 
 > In order to decide whether an I/O will succeed or fail because of a
 > mandatory lock, the filesystem must know if there is a lock on the file,
 > and in the case there is a lock, whether it is held by the caller or
 > not.
 > 
 > As I understand, telling if a lock is held by caller requires a
 > reference to the struct file on which the operation was issued.

Ah, I see -- you were talking about access to the flags of the struct
file, and that isn't necessary. What you need is the locking
principal, which for some glusterfs reason is supposed to be the open
(potentially shared by many processes) and not the current process.

Since you said fuse has a way to do that but it doesn't work for our
fuse, I guess the right way forward is to make it work in our fuse.
What's required? Just send an arbitrary ID associated with the open
through puffs to userland?

Which vnode operations need this information to make the fuse thing
work fully? (Whether or not glusterfs needs them all is less relevant;
something else will eventually anyway.)

-- 
David A. Holland
dholl...@netbsd.org


Re: struct file reference at VFS level

2016-04-23 Thread Emmanuel Dreyfus
David Holland  wrote:

> If something in fuse is causing these cases to share the same open and
> thus the same struct file, fuse is broken. Fix fuse first.
> 
> If that isn't what's happening, the next possible problem is that
> puffs/refuse/whatnot is the losing IO_NDELAY flag of VOP_WRITE. If so,
> fuse is broken. Fix fuse first.

If you want to implement mandatory locks, the O_NONBLOCK / IO_NDELAY
just determines how a failure should occur: by waiting for the lock to
be released, or by failing with EAGAIN.

In order to decide whether an I/O will succeed or fail because of a
mandatory lock, the filesystem must know if there is a lock on the file,
and in the case there is a lock, whether it is held by the caller or
not.

As I understand, telling if a lock is held by caller requires a
reference to the struct file on which the operation was issued.

If we reconsider the proposed scenario without process 2 using
O_NONBLOCK, then a write by process 2 must wait for the lock to be
released, while a write by process 1 must succeed without delay. But
when the requests come at the filesystem level, there is no way to
distinguish them.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-23 Thread David Holland
On Sat, Apr 23, 2016 at 09:20:28PM +0200, Emmanuel Dreyfus wrote:
 > > If something in fuse is causing these cases to share the same open and
 > > thus the same struct file, fuse is broken. Fix fuse first.
 > 
 > The NetBSD VFS interface does not let underlying filesystem distinguish
 > different struct file for a given vnode. That information is just not
 > sent, and this is what this thread is about.

No, you have jumped to the conclusion that you need this information
to handle something. Based on the other things you've said, I don't
think that's the case. Read the rest of my previous mail before
responding.

-- 
David A. Holland
dholl...@netbsd.org


Re: struct file reference at VFS level

2016-04-23 Thread Emmanuel Dreyfus
David Holland  wrote:

> If something in fuse is causing these cases to share the same open and
> thus the same struct file, fuse is broken. Fix fuse first.

The NetBSD VFS interface does not let underlying filesystem distinguish
different struct file for a given vnode. That information is just not
sent, and this is what this thread is about.

The FUSE interface has the notion of filehandles, which are opaque
references to struct file. But currently NetBSD cannot use that.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-23 Thread David Holland
On Fri, Apr 22, 2016 at 09:10:23AM +, Emmanuel Dreyfus wrote:
 > I talked to the glusterFS developer that hit the problem about the 
 > requirement. This is to iplement mandatory locks, a feature not available 
 > in UFS.

UFS isn't relevant.

 > Quooted below is the scenario chere the problem arise:
 > 
 > > (...) proposed mandatory lock feature checks for locks during
 > > every data modifying fops. And hence we make use of fd flags too. Let
 > > me explain the scenario from my test case:
 > >
 > > Process 1 opens file 'foo' with O_RDWR and acquires a shared byte range
 > > lock. Process 2 opens the same file 'foo' with O_RDWR|O_NONBLOCK so as
 > > to fail the next write in case a conflicting lock is found. Both
 > > processes are acting on the same mount point and hence we have the
 > > issue of reusing the previous open which does not contain O_NONBLOCK
 > > flag. As a result instead of failing the write fop it will continue to
 > > wait until the conflicting byte range lock is released.

If something in fuse is causing these cases to share the same open and
thus the same struct file, fuse is broken. Fix fuse first.

If that isn't what's happening, the next possible problem is that
puffs/refuse/whatnot is the losing IO_NDELAY flag of VOP_WRITE. If so,
fuse is broken. Fix fuse first.

If the problem is merely that you're being passed IO_NDELAY and not
noticing/using it, try fixing that before gunking up the kernel...

-- 
David A. Holland
dholl...@netbsd.org


Re: struct file reference at VFS level

2016-04-23 Thread Greg Troxel

Joerg Sonnenberger  writes:

> On Fri, Apr 22, 2016 at 10:42:10AM -0400, Greg Troxel wrote:
>> I still don't understand why this is about FUSE.  What if a file were
>> opened without O_NONBLOCK and then the same file were opened with?
>
> O_NONBLOCK is pretty much pointless for regular files. It only really
> changes something for sockets, pipes and the like and they behave
> different already.

Sure, but I meant to include especially character special files.


signature.asc
Description: PGP signature


Re: struct file reference at VFS level

2016-04-22 Thread Emmanuel Dreyfus
Emmanuel Dreyfus  wrote:

> Now let us see the locking scenario:
> - process 1 opens file foo O_RDWR
> - process 2 opens file foo O_RDWR|O_NONBLOCK
> - process 1 locks a part of the file
> - process 2 atteemps to write to the locked part
> 
> With advisory locks (what we have in our filesystems),  the code above
> VFS has no reason to block process 2 write operation. The filesystem
> could notice there is a lock on the written area, but it cannot enforce
> it because it does not distinguish the file descriptors. Hence the write
> on locked area succeeds, but this is not a huge problem since the
> locks are adivsory.
> 
> With mandatory locks, the filesystem must block process 2 when it
> writes. 

The word "block" is unfortunate here: I meant the filesystem must deny
process 2 when it attempts to write. Since process 2 used O_NONBLOCK,
this means write() must return with EAGAIN.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-22 Thread Joerg Sonnenberger
On Fri, Apr 22, 2016 at 10:42:10AM -0400, Greg Troxel wrote:
> I still don't understand why this is about FUSE.  What if a file were
> opened without O_NONBLOCK and then the same file were opened with?

O_NONBLOCK is pretty much pointless for regular files. It only really
changes something for sockets, pipes and the like and they behave
different already.

Joerg


Re: struct file reference at VFS level

2016-04-22 Thread Emmanuel Dreyfus
On Fri, Apr 22, 2016 at 10:42:10AM -0400, Greg Troxel wrote:
> I still don't understand why this is about FUSE.  What if a file were
> opened without O_NONBLOCK and then the same file were opened with?  Why
> isn't that the same situation, and the same bug?

The filesystem below NetBSD VFS cannot make the difference between 
operations on the two file descriptors. Generic code above VFS 
implements the blocking / non blocking behavior and the filesystem 
does not have to handle that.

Now let us see the locking scenario:
- process 1 opens file foo O_RDWR
- process 2 opens file foo O_RDWR|O_NONBLOCK
- process 1 locks a part of the file
- process 2 atteemps to write to the locked part

With advisory locks (what we have in our filesystems),  the code above
VFS has no reason to block process 2 write operation. The filesystem
could notice there is a lock on the written area, but it cannot enforce
it because it does not distinguish the file descriptors. Hence the write
on locked area succeeds, but this is not a huge problem since the
locks are adivsory.

With mandatory locks, the filesystem must block process 2 when it
writes. In order to do that it must distinguish the file descriptors, 
but NetBSD VFS does not let it do that.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-22 Thread Taylor R Campbell
   Date: Fri, 22 Apr 2016 14:33:14 +
   From: Emmanuel Dreyfus 

   On Fri, Apr 22, 2016 at 02:07:22PM +, Taylor R Campbell wrote:
   >I talked to the glusterFS developer that hit the problem about the 
   >requirement. This is to iplement mandatory locks, a feature not 
available 
   >in UFS.
   > 
   > Why is this file-system-specific?

   It is not filesystem-specific: the point is that there is no need
   for such a VFS feature in the filesystems currently supported
   by NetBSD? because none of them enforces mandatory locks.

What I mean is: Why does this require VFS operations to operate on the
mandatory locks -- why is it not generic for all file operations?

   > Can you share a pointer to the relevant logic?

   There was an exaple in my previous message.

Can you point to an example of a file system that relies on the struct
file reference?


Re: struct file reference at VFS level

2016-04-22 Thread Greg Troxel

Emmanuel Dreyfus  writes:

>> (...) proposed mandatory lock feature checks for locks during
>> every data modifying fops. And hence we make use of fd flags too. Let
>> me explain the scenario from my test case:
>>
>> Process 1 opens file 'foo' with O_RDWR and acquires a shared byte range
>> lock. Process 2 opens the same file 'foo' with O_RDWR|O_NONBLOCK so as
>> to fail the next write in case a conflicting lock is found. Both
>> processes are acting on the same mount point and hence we have the
>> issue of reusing the previous open which does not contain O_NONBLOCK
>> flag. As a result instead of failing the write fop it will continue to
>> wait until the conflicting byte range lock is released.

I still don't understand why this is about FUSE.  What if a file were
opened without O_NONBLOCK and then the same file were opened with?  Why
isn't that the same situation, and the same bug?   It seems that either

  NONBLOCK has to be passed to VOP_OPEN, and then vnodes cannot be
  reused for other fd access to the same file, or

  all these flags need to be passed in per operation

and a middle ground is unsound.  But maybe I'm missing something.



signature.asc
Description: PGP signature


Re: struct file reference at VFS level

2016-04-22 Thread Emmanuel Dreyfus
On Fri, Apr 22, 2016 at 02:07:22PM +, Taylor R Campbell wrote:
>I talked to the glusterFS developer that hit the problem about the 
>requirement. This is to iplement mandatory locks, a feature not available 
>in UFS.
> 
> Why is this file-system-specific?

It is not filesystem-specific: the point is that there is no need
for such a VFS feature in the filesystems currently supported
by NetBSD? because none of them enforces mandatory locks.

> Can you share a pointer to the relevant logic?

There was an exaple in my previous message.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-22 Thread Taylor R Campbell
   Date: Fri, 22 Apr 2016 09:10:23 +
   From: Emmanuel Dreyfus 

   On Wed, Apr 20, 2016 at 07:09:12AM -0400, Greg Troxel wrote:
   > I am not following this.   My basic issue is that while I see your point
   > about open files in FUSE being different, I don't see how that's
   > different from multiple open files in any other filesystem.
   > 
   > What specific flags are you talking about, and can you explain why a
   > FUSE vfs implementation would need to know but say UFS would not?

   I talked to the glusterFS developer that hit the problem about the 
   requirement. This is to iplement mandatory locks, a feature not available 
   in UFS.

Why is this file-system-specific?

Can you share a pointer to the relevant logic?


Re: struct file reference at VFS level

2016-04-22 Thread Emmanuel Dreyfus
On Wed, Apr 20, 2016 at 07:09:12AM -0400, Greg Troxel wrote:
> I am not following this.   My basic issue is that while I see your point
> about open files in FUSE being different, I don't see how that's
> different from multiple open files in any other filesystem.
> 
> What specific flags are you talking about, and can you explain why a
> FUSE vfs implementation would need to know but say UFS would not?

I talked to the glusterFS developer that hit the problem about the 
requirement. This is to iplement mandatory locks, a feature not available 
in UFS.

Quooted below is the scenario chere the problem arise:

> (...) proposed mandatory lock feature checks for locks during
> every data modifying fops. And hence we make use of fd flags too. Let
> me explain the scenario from my test case:
>
> Process 1 opens file 'foo' with O_RDWR and acquires a shared byte range
> lock. Process 2 opens the same file 'foo' with O_RDWR|O_NONBLOCK so as
> to fail the next write in case a conflicting lock is found. Both
> processes are acting on the same mount point and hence we have the
> issue of reusing the previous open which does not contain O_NONBLOCK
> flag. As a result instead of failing the write fop it will continue to
> wait until the conflicting byte range lock is released.



-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: struct file reference at VFS level

2016-04-20 Thread Greg Troxel

m...@netbsd.org (Emmanuel Dreyfus) writes:

> FUSE methods on open files use a pointer on a file handle. The
> filesystem uses it to track attribute about the open file. For instance,
> if process 1 opens a file with a given flag and process 2 opens it
> without it, the filesystem may need to act differently when it gets an
> I/O for the file. This means it must be able to tell the difference
> between the struct file that caused an operation on a given vnode. 
>
> As I understand, current implementation of NetBSD VFS does not offer a
> reference to the calling struct file when we do VFS operations. It seems
> this could be optionally added using kauth_cred_setdata() /
> kauth_cred_getdata() if we have a mount option for the filesystem.

I am not following this.   My basic issue is that while I see your point
about open files in FUSE being different, I don't see how that's
different from multiple open files in any other filesystem.

What specific flags are you talking about, and can you explain why a
FUSE vfs implementation would need to know but say UFS would not?

(Conveying more information across puffs sounds ok; it's the notion that
more has to cross the VFS interface specially for FUSE that I don't follow.)


signature.asc
Description: PGP signature