Re: recent nfs change causes autofs regression

2007-09-05 Thread Ian Kent

On Wed, 2007-09-05 at 16:50 +0100, Trond Myklebust wrote:
> On Wed, 2007-09-05 at 16:37 +0100, David Howells wrote:
> > Ian Kent <[EMAIL PROTECTED]> wrote:
> > 
> > > But what about mounting with a different protocol, tcp vs udp for example?
> > 
> > I was referring specifically to the R/O / R/W variants of the same mount.  Any
> > flag variation that varies the way the NFS client talks to the NFS server must
> > either result in a new superblock or be ignored.
> > 
> > David
> 
> We currently ignore remount requests that attempt to change the NFS
> mount parameters. This is not new behaviour, BTW: it has always been the
> case, and nobody has ever requested it.

Yes, I only mentioned it because I'm aware of it.

I've not paid much attention to it because there haven't been any
complaints so far and it's been a long time.

Ian

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: recent nfs change causes autofs regression

2007-09-05 Thread Trond Myklebust

On Wed, 2007-09-05 at 20:44 +0800, Ian Kent wrote:
> On Tue, 2007-09-04 at 08:54 +0100, David Howells wrote:
> > Bill Davidsen <[EMAIL PROTECTED]> wrote:
> > 
> > > mount /base on point1 - rw  [ hopefully really r/w ]
> > > mount /base on point2 - ro  [ hopefully r/o ]
> > 
> > I think Al Viro probably has the right idea as to how to fix this: Move the
> > R/O R/W flag into vfsmount and count the number of R/W vfsmounts in the
> > superblock.  I never quite finished implementing the patch to do this, but I
> > can go back and revisit it.
> 
> But what about mounting with a different protocol, tcp vs udp for example?
> 
> Ian

With the patch that Linus merged, we will fork off a new superblock.

Trond


Re: recent nfs change causes autofs regression

2007-09-05 Thread Trond Myklebust

On Wed, 2007-09-05 at 16:37 +0100, David Howells wrote:
> Ian Kent <[EMAIL PROTECTED]> wrote:
> 
> > But what about mounting with a different protocol, tcp vs udp for example?
> 
> I was referring specifically to the R/O / R/W variants of the same mount.  Any
> flag variation that varies the way the NFS client talks to the NFS server must
> either result in a new superblock or be ignored.
> 
> David

We currently ignore remount requests that attempt to change the NFS
mount parameters. This is not new behaviour, BTW: it has always been the
case, and nobody has ever requested it.

The ro flag is different, and I agree that it should be moved to the
vfsmount structure. I'm hoping Dave Hansen's patches will be ready for
merging soon...

Trond


Re: recent nfs change causes autofs regression

2007-09-05 Thread David Howells

Ian Kent <[EMAIL PROTECTED]> wrote:

> But what about mounting with a different protocol, tcp vs udp for example?

I was referring specifically to the R/O / R/W variants of the same mount.  Any
flag variation that varies the way the NFS client talks to the NFS server must
either result in a new superblock or be ignored.

David

Re: recent nfs change causes autofs regression

2007-09-05 Thread David Howells

Bill Davidsen <[EMAIL PROTECTED]> wrote:

> I think Al had a good idea there, that is nice and clean. What about bind
> mounts, will that just fall out?

I don't see that it should be a problem since the vfsmount is copied.

David

Re: recent nfs change causes autofs regression

2007-09-05 Thread Ian Kent

On Tue, 2007-09-04 at 08:54 +0100, David Howells wrote:
> Bill Davidsen <[EMAIL PROTECTED]> wrote:
> 
> > mount /base on point1 - rw  [ hopefully really r/w ]
> > mount /base on point2 - ro  [ hopefully r/o ]
> 
> I think Al Viro probably has the right idea as to how to fix this: Move the
> R/O R/W flag into vfsmount and count the number of R/W vfsmounts in the
> superblock.  I never quite finished implementing the patch to do this, but I
> can go back and revisit it.

But what about mounting with a different protocol, tcp vs udp for example?

Ian



Re: recent nfs change causes autofs regression

2007-09-05 Thread Bill Davidsen


David Howells wrote:
> Bill Davidsen <[EMAIL PROTECTED]> wrote:
>
> > mount /base on point1 - rw  [ hopefully really r/w ]
> > mount /base on point2 - ro  [ hopefully r/o ]
>
> I think Al Viro probably has the right idea as to how to fix this: Move the
> R/O R/W flag into vfsmount and count the number of R/W vfsmounts in the
> superblock.  I never quite finished implementing the patch to do this, but I
> can go back and revisit it.

I think Al had a good idea there, that is nice and clean. What about
bind mounts, will that just fall out?


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979


Re: recent nfs change causes autofs regression

2007-09-04 Thread Linus Torvalds

On Tue, 4 Sep 2007, David Howells wrote:
> 
> That helps one case, yes, but what about a superset?  What about two sets that
> might intersect but for which you don't have the common root to hand?

Sure. In which case bind mounts don't work. Fair enough.

> The case I came up with was this:
> 
>   mount home:/home/fred /home/fred
>   mount home:/home/jim /home/jim

The much more trivial case is

mount -o ro server:/usr/bin /usr/share/bin
mount server:/usr/tmp /usr/share/tmp

and now tell me any reasonable reason why this should fail? (Replace "-o 
ro" with any other attributes).

Quite frankly, if the above two mounts fail - just because /usr/bin and 
/usr/tmp happen to be on the same filesystem on the server - then the 
implementation is more than just buggy - it's a pure piece of shit.

And quite frankly, as far as I can tell, that was exactly what the NFS 
changes that are being discussed did. They failed the equivalent of the 
second mount, because it didn't have the same flags as the first one.

Can you really honestly say that wasn't totally broken?

> The reason I added all this NFS superblock sharing is so that I could implement
> on-disk local caching much more easily.  If, for instance, two netfs inodes
> aren't shared, but their "index keys" say they should use the same piece of
> cache then all sorts of fun ensues from the disjoint cache coherency.
> 
> Even working out that two inodes are using the same piece of cache isn't
> trivial (though it seems like it ought to be).

I'm just saying that the whole "require all mount flags to be identical, 
and error out if they are not" is pure and utter CRAP.

So anything that does that - for *any* reason what-so-ever - is just 
broken. If you require identical mount-time flags, that absolutely has to 
be a special case (like using "--bind", or perhaps using a special option 
like "sharecache").

It really is that simple. I don't know how anybody could possibly ever 
dispute that.

As far as I can tell, the current situation in NFS is "reasonably ok", but 
I already asked Trond about what happens with "remount" with the "same 
mount options imply sharecache" code that he did, and afaik, I never got 
an answer. In other words, let's change the above two commands to the 
following three commands:

mount server:/usr/bin /usr/share/bin
mount server:/usr/tmp /usr/share/tmp
mount -o remount,ro /usr/share/bin

and I'm claiming that if the above fails (or remounts /usr/share/tmp as 
read-only too), then it's also obvious CRAP (replace "ro" with any other 
possible attribute - whether cache timeouts or similar)

See? It really is that simple. The obvious mount usage above absolutely 
*has* to work, and anything that breaks it is crap, crap, crap. And that 
was exactly what apparently happened here, and I really don't see why 
anybody has the *gall* to claim that the "default to sharecache" code 
wasn't totally broken.

Linus
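
The behaviour argued for above (a mount must never fail just because its
options differ from an existing mount of the same server filesystem) can be
sketched as a toy userspace model in Python. This is an illustration under
stated assumptions, not the NFS client's code: MountOptions, SuperBlockTable,
get_sb and the "fsid" strings are hypothetical names, and the option set is a
made-up subset. Identical options share a superblock; anything else forks a
new one instead of returning an error.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MountOptions:
    """Hypothetical subset of options that affect how the client talks to the server."""
    proto: str = "tcp"
    rsize: int = 32768
    wsize: int = 32768


class SuperBlockTable:
    """Toy model: a list of superblocks per (server, fsid) pair."""

    def __init__(self) -> None:
        # Each entry is (options, shareable); the list index stands in for the superblock.
        self._by_fs: dict[tuple[str, str], list[tuple[MountOptions, bool]]] = {}

    def get_sb(self, server: str, fsid: str, opts: MountOptions,
               nosharecache: bool = False) -> tuple[int, bool]:
        """Return (superblock index, shared?) for a new mount request.

        Identical options reuse an existing shareable superblock; any other
        request gets a fresh superblock rather than failing the mount.
        """
        sbs = self._by_fs.setdefault((server, fsid), [])
        if not nosharecache:
            for index, (existing, shareable) in enumerate(sbs):
                if shareable and existing == opts:
                    return index, True
        # Different options, or sharing explicitly declined: fork a new superblock.
        sbs.append((opts, not nosharecache))
        return len(sbs) - 1, False
```

In this model a tcp mount followed by a udp mount of the same filesystem ends
up on two superblocks, and neither request is refused.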

Re: recent nfs change causes autofs regression

2007-09-04 Thread David Howells

Linus Torvalds <[EMAIL PROTECTED]> wrote:

> In other words, let's assume that the user has /some/nfs/mount mounted 
> over NFS, and wants to re-mount it (or even just a subset of it) somewhere 
> else, the sane thing to do is not to mount it again, but to just do

That helps one case, yes, but what about a superset?  What about two sets that
might intersect but for which you don't have the common root to hand?  The
current NFS code deals with all these problems by attempting to share the
dentry sets.  Superblocks can now have multiple roots and we graft trees
together automatically when we discover one is a subset of another.

The case I came up with was this:

mount home:/home/fred /home/fred
mount home:/home/jim /home/jim

To effect these, the NFS mount process looks up "/home/fred" or "/home/jim"
directly rather than looking up "/" and path walking.  However, the NFS client
in the kernel may note that both Fred's and Jim's home directories reside on
the same NFS volume.  You cannot use a bind mount here because there's nothing
to bind from.

Then, should, say, this happen:

mount home:/home /mnt

You'll probably end up with three roots in the NFS superblock.  Following with
an ls of /home, say, would then populate the dentries for /home - including
those for fred and jim, and the code would splice in the dentries now rooted at
/home/fred and /home/jim.

You can't do that with bind mounts as far as I know because I don't believe
that you can go up the tree (rootwards) from the apparent root of a vfsmount.

So bind mounts aren't quite it for this problem, and in any case your
suggestion of:

mount --bind /some/nfs/mount/subdir /new/mount/place

doesn't help with the automounter case particularly well.  The automounter
*could* probe to see if the server stuff is common with an already existing
mount, but there would then be a race, and it doesn't help with the homedir
example I gave above either.

You might think "well, start by mounting '/' somewhere and then bind mounting
subdirs of it", but that doesn't work if you can't mount "/" or "/home", and
might go spectacularly wrong if the server has a symlink in the path that you
can't see.

> This is why I think "nosharecache" should just be the default, because 
> that's the behaviour that simply does not have any subtle issues. The 
> *special* case should be the "sharecache" case, and 99% of the time that 
> one should likely be done with a "--bind" mount.

Yeah, that's probably necessary, if annoying.  However, local caching can
enable sharing or make it a prerequisite option.

> (I don't really see the point of _ever_ doing anything but a bind mount, 
> but maybe there are reasons to try to share at a NFS layer that I don't 
> really see)

The reason I added all this NFS superblock sharing is so that I could implement
on-disk local caching much more easily.  If, for instance, two netfs inodes
aren't shared, but their "index keys" say they should use the same piece of
cache then all sorts of fun ensues from the disjoint cache coherency.

Even working out that two inodes are using the same piece of cache isn't
trivial (though it seems like it ought to be).

David
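
The subset/superset argument above can be illustrated with a small Python
sketch. It is a simplification for illustration only: bind_source and the
"mounted" mapping are hypothetical, the paths are compared purely as strings,
and server-side symlinks (one of the hazards mentioned above) are ignored. It
answers one question: given the exports already mounted, is there a local
path to bind from?

```python
from __future__ import annotations

from pathlib import PurePosixPath


def bind_source(mounted: dict[str, str], wanted: str) -> str | None:
    """Return a local path that could be bind-mounted to provide `wanted`.

    `mounted` maps an export path already mounted from the server to its
    local mount point. Only the subset case works: `wanted` must lie at or
    below an export that is already mounted. Supersets and siblings have
    nothing to bind from, so the result is None.
    """
    want = PurePosixPath(wanted)
    for export, mountpoint in mounted.items():
        exp = PurePosixPath(export)
        if want == exp or exp in want.parents:
            return str(PurePosixPath(mountpoint) / want.relative_to(exp))
    return None
```

With only home:/home/fred mounted, a request for /home/fred/src has a bind
source, while /home/jim (a sibling) and /home (a superset) do not.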

Re: recent nfs change causes autofs regression

2007-09-04 Thread David Howells

Linus Torvalds <[EMAIL PROTECTED]> wrote:

> In other words, let's assume that the user has /some/nfs/mount mounted 
> over NFS, and wants to re-mount it (or even just a subset of it) somewhere 
> else, the sane thing to do is not to mount it again, but to just do

What about a superset?  What about two intersecting sets?  Bind mounts aren't
quite it for this problem, and in any case your suggestion of:



Re: recent nfs change causes autofs regression

2007-09-04 Thread David Howells

Bill Davidsen <[EMAIL PROTECTED]> wrote:

> mount /base on point1 - rw  [ hopefully really r/w ]
> mount /base on point2 - ro  [ hopefully r/o ]

I think Al Viro probably has the right idea as to how to fix this: Move the
R/O R/W flag into vfsmount and count the number of R/W vfsmounts in the
superblock.  I never quite finished implementing the patch to do this, but I
can go back and revisit it.

David
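
As an illustration of the scheme described above, here is a toy userspace
model in Python. It is only a sketch, not kernel code: the class names,
fields and methods (SuperBlock, VfsMount, rw_mounts, remount, bind) are
invented for this example and do not correspond to any real kernel API. The
R/O flag lives on each mount, and the superblock only counts how many of its
mounts are currently writable.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SuperBlock:
    """Toy stand-in for a superblock shared by several mounts."""
    name: str
    rw_mounts: int = 0  # number of attached mounts that are currently read-write

    @property
    def read_only(self) -> bool:
        # The filesystem as a whole is read-only only when no mount can write.
        return self.rw_mounts == 0


@dataclass
class VfsMount:
    """Toy stand-in for a vfsmount: the R/O flag is kept per mount point."""
    sb: SuperBlock
    mountpoint: str
    read_only: bool = False

    def __post_init__(self) -> None:
        if not self.read_only:
            self.sb.rw_mounts += 1

    def remount(self, read_only: bool) -> None:
        """Flip this mount's flag and keep the superblock's writer count in step."""
        if read_only == self.read_only:
            return
        self.sb.rw_mounts += -1 if read_only else 1
        self.read_only = read_only

    def bind(self, mountpoint: str) -> VfsMount:
        """Model a bind mount as a copy: it starts with the same flag but owns it."""
        return VfsMount(self.sb, mountpoint, self.read_only)

    def may_write(self) -> bool:
        return not self.read_only
```

With this layout, two mounts of the same /base get independent flags, and
remounting one of them read-only leaves the other untouched.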

Re: recent nfs change causes autofs regression

2007-09-04 Thread David Howells

Trond Myklebust <[EMAIL PROTECTED]> wrote:

> - the NFSv4 delegation model breaks: the client will be using
> OPEN when it could use cached opens. More importantly, when
> performing an operation that requires it to return the
> delegation on the aliased file, it won't know until the server
> sends it a callback.

Perhaps sharing could be the default on NFSv4 and non-sharing for 2 & 3?
After all, NFSv4 is supposed to be able to handle local caching on disk.

David

Re: recent nfs change causes autofs regression

2007-09-03 Thread Martin Knoblauch


--- Jakob Oestergaard <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 31, 2007 at 09:43:29AM -0700, Linus Torvalds wrote:
> ...
> > This is *not* a security hole. In order to make it a security hole, you
> > need to be root in the first place.
> 
> Non-root users can write to places where root might believe they cannot write
> because he might be under the mistaken assumption that ro means ro.
> 
> I am under the impression that that could have implications in some setups.
>

 That was never in question.
 
> ...
> > 
> >  - it's a misfeature that people are used to, and has been around forever.
> 
> Sure, they're used to it, but I doubt they are aware of it.
>

 So, the right thing to do (tm) is to make them aware without breaking
their setup. 

 Log any detected inconsistencies in the dmesg buffer and to syslog. If
the sysadmin is not competent enough to notice, too bad.
 
Cheers
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de

Re: recent nfs change causes autofs regression

2007-09-03 Thread Jakob Oestergaard

On Fri, Aug 31, 2007 at 09:43:29AM -0700, Linus Torvalds wrote:
...
> This is *not* a security hole. In order to make it a security hole, you 
> need to be root in the first place.

Non-root users can write to places where root might believe they cannot write
because he might be under the mistaken assumption that ro means ro.

I am under the impression that that could have implications in some setups.

...
> 
>  - it's a misfeature that people are used to, and has been around forever.

Sure, they're used to it, but I doubt they are aware of it.

...
> so I really don't see why people excuse the new behaviour.

We can certainly agree that a nicer fix would be nicer :)

-- 

 / jakob


Re: recent nfs change causes autofs regression

2007-09-01 Thread Bill Davidsen


Trond Myklebust wrote:

> On Thu, 2007-08-30 at 20:49 -0700, Linus Torvalds wrote:
> > Please send in a fix. If the fix involves making "nosharecache" the 
> > default, then that is better than making policy decisions like this in the 
> > kernel. The kernel should do what the user asks and not put in unnecessary 
> > roadblocks.
>
> The best I can do given the constraints appears to be to have the kernel
> first look for a superblock that matches both the fsid and the
> user-specified mount options, and then spawn off a new superblock if
> that search fails. The attached patch does just that.

I'm glad I read the whole thread, because when I saw it earlier and 
didn't respond, this was the question I had, why not replace the error 
with forcing "nosharecache" on, which is essentially what you have done.



> Note that this is not the same as specifying nosharecache everywhere
> since nosharecache will never attempt to match an existing superblock.
>
> Finally, for the record: I still feel very uncomfortable about not being
> able to report the state of the client setup back to the sysadmin.
> AFAIK, the only way to do so is to stat the mountpoints, and compare the
> device ids.

Since clients may not know the server setup, and it may change for 
policy or error recovery reasons, I think this patch is needed.


The cases I think are common are:

1 - single export, multiple client mounts

export /base - rw

mount /base/share - ro  [ client enforces r/o or not ]
mount /base/upload - rw

2 - export parts of a filesystem (/base) [ server enforces access ]

export /base/share - ro [ hopefully really r/o on client ]
export /base/upload - rw  [ should work for write ]

3 - mount the same f/s with different permissions on client

export /base - rw

mount /base on point1 - rw  [ hopefully really r/w ]
mount /base on point2 - ro  [ hopefully r/o ]

I consider this *really* bad practice, but I have seen it in enough 
places to know others don't agree. It assumes the client will protect 
the r/o data.


4 - export f/s and part of f/s

export /base/ - ro
export /base/upload - rw

clients may mount one or both, with the upload directory as part of base 
or elsewhere. What will happen here?



> Trond



--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

RE: recent nfs change causes autofs regression

2007-08-31 Thread Hua Zhong

> It's not about default (for which backward compatibility is most
> important and this patch is perfectly fine), but user explicitly asks
> for "sharecache". In this case if for any reason the cache cannot be
> shared, I am not sure if he should get an error back.
> 
> I for one agree with Ian and Linus that changing default to
> nosharecache might be the best thing to do, but since I am now able to
> use the latest kernel, I am very happy already.

Actually, I think just fine-tuning it a bit may be better:

1. make 'nosharecache' the default
2. apply the algorithm in this patch to 'nosharecache': if the fsid and
mount options are the same, then share cache

This way the default behavior does not change, but both algorithms have
pitfalls, and we choose from:
1. if user specifies "sharecache", he may end up with nosharecache if mount
options are different
And
2. if user specifies "nosharecache", he may end up with sharecache if mount
options are the same

I'd think 2 is better (least surprise). I cannot think of a case where 2 is
actually a bad thing.

Comments?

> Thanks a lot for your attention to my problem. :-)
> 
> > Trond



RE: recent nfs change causes autofs regression

2007-08-31 Thread Hua Zhong

> On Fri, 2007-08-31 at 11:47 -0700, Hua Zhong wrote:
> > This patch fixes the problem for me, thanks.
> >
> > Is this patch changing the behavior of "sharecache" to
> > "try-to-share-cache-if-possible", or adding a third behavior? If the
> > user explicitly asks for "-o sharecache", does he get an error back
> > if the mount options mismatch?
> 
> There has never been a 'sharecache' flag as far as the kernel is
> concerned. The default behaviour has always been to share.

It's not about default (for which backward compatibility is most important
and this patch is perfectly fine), but user explicitly asks for
"sharecache". In this case if for any reason the cache cannot be shared, I
am not sure if he should get an error back.

I for one agree with Ian and Linus that changing default to nosharecache
might be the best thing to do, but since I am now able to use the latest
kernel, I am very happy already.

Thanks a lot for your attention to my problem. :-)

> Trond


RE: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Fri, 2007-08-31 at 11:47 -0700, Hua Zhong wrote:
> This patch fixes the problem for me, thanks.
> 
> Is this patch changing the behavior of "sharecache" to
> "try-to-share-cache-if-possible", or adding a third behavior? If the user
> explicitly asks for "-o sharecache", does he get an error back if the mount
> options mismatch?

There has never been a 'sharecache' flag as far as the kernel is
concerned. The default behaviour has always been to share.

Trond



Re: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Fri, 2007-08-31 at 10:01 -0700, Linus Torvalds wrote:
> 
> On Fri, 31 Aug 2007, Trond Myklebust wrote:
> >
> > The best I can do given the constraints appears to be to have the kernel
> > first look for a superblock that matches both the fsid and the
> > user-specified mount options, and then spawn off a new superblock if
> > that search fails.
> 
> I think this is probably acceptable to get roughly the old behaviour, but 
> I still think it's a bit stupid.
> 
> What happens at "mount -o remount,..." time?
> 
> The fact is, the whole "match the fsid and user mount options, and re-use 
> the mount" sounds like it's trying to solve a problem that doesn't need 
> solving. If the user really wants to duplicate the mount, he really should 
> be using a bind-mount instead.
> 
> In other words, let's assume that the user has /some/nfs/mount mounted 
> over NFS, and wants to re-mount it (or even just a subset of it) somewhere 
> else, the sane thing to do is not to mount it again, but to just do
> 
>   mount --bind /some/nfs/mount/subdir /new/mount/place
> 
> instead. That *guarantees* that the low-level filesystem uses the same 
> flags, and it also means that things like re-mounting have sane and 
> well-defined semantics, and will fail or succeed predictably.

I agree for the cases where you can use bind mounts, however you can't
always do that.

Consider the fairly common setup where /foo, /foo/a, /foo/b are all on
the same filesystem on the server, but only /foo/a and /foo/b are
exported.
There can be plenty of files that are hard-linked into both
directories, but because you cannot mount the parent, /foo, you will not
be able to ensure that these common files are cached to the same inode
(which they need to be).

IOW: with this scenario, you can't ensure that local posix semantics
hold (i.e. that if my client is the only user, then the filesystem will
behave as if it were a posix filesystem). That would be a major
regression.
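
The hard-link point can be checked locally with a small sketch. It assumes nothing about NFS itself: same_inode() and hard_link_demo() are hypothetical helper names, and the code only shows that two names for one file report the same (st_dev, st_ino) pair, which is the identity a client loses when the two directories end up under separate superblocks.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* 1 if both names refer to the same inode on the same device,
 * 0 if they do not, -1 if either stat() fails. */
int same_inode(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Create `name`, hard-link it as `alias`, and return same_inode() for
 * the pair. Both names are removed again before returning.
 * Returns -1 if the file or the link could not be created. */
int hard_link_demo(const char *name, const char *alias)
{
    FILE *f = fopen(name, "w");
    int ret = -1;

    if (f == NULL)
        return -1;
    fclose(f);
    if (link(name, alias) == 0) {
        ret = same_inode(name, alias);
        unlink(alias);
    }
    unlink(name);
    return ret;
}
```

On a local filesystem the two names share one inode. If /foo/a and /foo/b were mounted with separate superblocks, the same file would show up under two different device ids on the client and would be cached twice.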

> In contrast, if a user wants to create a new NFS mount, it really should 
> be independent of the old one, because that's (a) what other systems do, 
> and (b) also makes the semantics of re-mounting it with other flags be 
> clear and unambiguous (ie the remount has nothing what-so-ever to do with 
> the independent NFS mount).

(a) I'm not sure that is true: see (b).
(b) You gain remount clarity at the expense of local posix filesystem
correctness.

Trond


RE: recent nfs change causes autofs regression

2007-08-31 Thread Hua Zhong

This patch fixes the problem for me, thanks.

Is this patch changing the behavior of "sharecache" to
"try-to-share-cache-if-possible", or adding a third behavior? If the user
explicitly asks for "-o sharecache", does he get an error back if the mount
options mismatch?
 
> The best I can do given the constraints appears to be to have the
> kernel first look for a superblock that matches both the fsid and the
> user-specified mount options, and then spawn off a new superblock if
> that search fails. The attached patch does just that.
> 
> Note that this is not the same as specifying nosharecache everywhere
> since nosharecache will never attempt to match an existing superblock.
> 
> Finally, for the record: I still feel very uncomfortable about not
> being able to report the state of the client setup back to the sysadmin.
> AFAIK, the only way to do so is to stat the mountpoints, and compare
> the device ids.
> 
> Trond



Re: recent nfs change causes autofs regression

2007-08-31 Thread Linus Torvalds

On Fri, 31 Aug 2007, Trond Myklebust wrote:
>
> The best I can do given the constraints appears to be to have the kernel
> first look for a superblock that matches both the fsid and the
> user-specified mount options, and then spawn off a new superblock if
> that search fails.

I think this is probably acceptable to get roughly the old behaviour, but 
I still think it's a bit stupid.

What happens at "mount -o remount,..." time?

The fact is, the whole "match the fsid and user mount options, and re-use 
the mount" sounds like it's trying to solve a problem that doesn't need 
solving. If the user really wants to duplicate the mount, he really should 
be using a bind-mount instead.

In other words, let's assume that the user has /some/nfs/mount mounted 
over NFS, and wants to re-mount it (or even just a subset of it) somewhere 
else, the sane thing to do is not to mount it again, but to just do

mount --bind /some/nfs/mount/subdir /new/mount/place

instead. That *guarantees* that the low-level filesystem uses the same 
flags, and it also means that things like re-mounting have sane and 
well-defined semantics, and will fail or succeed predictably.

In contrast, if a user wants to create a new NFS mount, it really should 
be independent of the old one, because that's (a) what other systems do, 
and (b) also makes the semantics of re-mounting it with other flags be 
clear and unambiguous (ie the remount has nothing what-so-ever to do with 
the independent NFS mount).

See? 

This is why I think "nosharecache" should just be the default, because 
that's the behaviour that simply does not have any subtle issues. The 
*special* case should be the "sharecache" case, and 99% of the time that 
one should likely be done with a "--bind" mount.

(I don't really see the point of _ever_ doing anything but a bind mount, 
but maybe there are reasons to try to share at a NFS layer that I don't 
really see)

> The attached patch does just that.

Hua, does this fix things for you? If it gets rid of the regression, I can 
certainly live with it, but as per above, I don't really think this makes 
much sense in the "bigger picture" kind of thing.

> Finally, for the record: I still feel very uncomfortable about not being
> able to report the state of the client setup back to the sysadmin.
> AFAIK, the only way to do so is to stat the mountpoints, and compare the
> device ids.

Well, not only don't I see that as being horribly wrong, I actually think 
that the sysadmin should know what his mount setup is, even without having 
to ask. But since he *can* ask, using easy and standard interfaces, I 
don't really see what the problem really is.
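
For reference, the check being discussed (stat the mountpoints and compare the device ids) might look roughly like this in C. This is only a sketch: same_superblock() is a hypothetical helper name, and it relies on each NFS superblock getting its own anonymous device number, which is what the set_anon_super() call in the patch suggests.

```c
#include <sys/stat.h>

/* 1 if both paths report the same st_dev (so they sit on the same
 * superblock), 0 if the device ids differ, -1 if either stat() fails. */
int same_superblock(const char *mnt_a, const char *mnt_b)
{
    struct stat a, b;

    if (stat(mnt_a, &a) != 0 || stat(mnt_b, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev;
}
```

Two NFS mountpoints that share a cache would return 1 here, and two that were forked off into separate superblocks would return 0.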

Linus

Re: recent nfs change causes autofs regression

2007-08-31 Thread Linus Torvalds

On Fri, 31 Aug 2007, Jakob Oestergaard wrote:
> 
> > The fact that he may *also* have broken insane setups is totally 
> > irrelevant. Don't go off on some tangent that has nothing to do with the 
> > regression in question!
> 
> It does not have "nothing" to do with the regression.
> 
> Some setups which worked more by accident than by design earlier on were
> broken by the fix. This could have been avoided, I agree, but the breakage
> was caused by the fix (or the breakage is the fix, however you prefer to
> look at it).

Well, it's not a "fix" if it breaks other setups. 

It's especially not a fix since the whole requirement that all the flags 
be exactly the same is totally brain-dead in the first place. We *have* 
that kind of mount already, and it has nothing to do with NFS: it's called 
a "bind" mount.

So if you want an identical mount, with cache coherency and tying the two 
mount-points together (requiring that they have the same mount flags), 
then that has absolutely *nothing* to do with NFS. The VFS layer does that 
for you.

> *part* of it wasn't a security hole.
> 
> The other half very much was.

No, the fix was simply wrong.  It was done the wrong way, and it broke 
things it shouldn't have broken.

Let's put it this way: if I create a patch that stops the system from 
booting, I sure as hell fix a potential security hole, don't I?

Does that make my patch a "fix"?

No it does not.

> Sure, given that Trond (or whomever) has the time it takes to go and implement
> all of this, there's no need to screw anyone.
> 
> Assuming he's on a schedule and this will have to wait, I agree with him that
> it makes the most sense to play it safe security/consistency-wise rather than
> functionality-wise.

I disagree. Either that thing gets fixed before 2.6.23, or the commit that 
introduced the broken behaviour gets reverted.

We've had this policy of "regressions are fixed" for a long time, and 
we're not suddenly changing it.

This is *not* a security hole. In order to make it a security hole, you 
need to be root in the first place. So what you call a security hole is 
really no different from root installing a bad SUID binary. It's simply 
not the kernel's place to then say "SUID binaries will not work, because 
it's a potential security hole".

See?

So stop calling this a security hole.  It's certainly a misfeature, but:

 - it's a misfeature that people are used to, and has been around forever.

 - there are bound to be ways to fix it that don't break existing users.

 - the requirement that all flags be the same for a mount to the same NFS 
   directory is *particularly* stupid, since there are better ways to do 
   that than go through NFS!

so I really don't see why people excuse the new behaviour.

Linus

Re: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Thu, 2007-08-30 at 20:49 -0700, Linus Torvalds wrote:
> Please send in a fix. If the fix involves making "nosharecache" the 
> default, then that is better than making policy decisions like this in the 
> kernel. The kernel should do what the user asks and not put in unnecessary 
> roadblocks.

The best I can do given the constraints appears to be to have the kernel
first look for a superblock that matches both the fsid and the
user-specified mount options, and then spawn off a new superblock if
that search fails. The attached patch does just that.

Note that this is not the same as specifying nosharecache everywhere
since nosharecache will never attempt to match an existing superblock.
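
As a rough userspace model of that lookup (not the kernel code): every name below (model_sb, model_table, model_get_sb, the MODEL_* constants) is made up for illustration, and the option comparison is reduced to a single flags word, whereas the real nfs_compare_mount_options() checks more than that.

```c
#include <stddef.h>

#define MODEL_MAX_SB   8
#define MODEL_RDONLY   0x1u  /* stand-in for MS_RDONLY */
#define MODEL_UNSHARED 0x1u  /* stand-in for NFS_MOUNT_UNSHARED */

struct model_sb {
    unsigned long fsid;     /* which server filesystem */
    unsigned int mntflags;  /* generic mount flags, e.g. MODEL_RDONLY */
    unsigned int nfsflags;  /* NFS-specific flags, e.g. MODEL_UNSHARED */
    int dev;                /* unique per superblock, like an anonymous st_dev */
};

struct model_table {
    struct model_sb sb[MODEL_MAX_SB];
    int count;
};

/* Return a superblock for the request: an existing one when the fsid and
 * the mount options both match, otherwise a new one. NULL if the table
 * is full. */
struct model_sb *model_get_sb(struct model_table *t, unsigned long fsid,
                              unsigned int mntflags, unsigned int nfsflags)
{
    struct model_sb *s;
    int i;

    /* "nosharecache": never try to match an existing superblock. */
    if (!(nfsflags & MODEL_UNSHARED)) {
        for (i = 0; i < t->count; i++) {
            s = &t->sb[i];
            if (s->nfsflags & MODEL_UNSHARED)
                continue;   /* existing mount opted out of sharing */
            if (s->fsid == fsid && s->mntflags == mntflags)
                return s;   /* same filesystem, same options: share */
        }
    }

    /* No match: spawn a new superblock instead of failing with EBUSY. */
    if (t->count == MODEL_MAX_SB)
        return NULL;
    s = &t->sb[t->count++];
    s->fsid = fsid;
    s->mntflags = mntflags;
    s->nfsflags = nfsflags;
    s->dev = t->count;
    return s;
}
```

In this model, two "ro" mounts of the same fsid share one entry, a later "rw" mount of that fsid gets its own entry, and a request carrying MODEL_UNSHARED always gets a fresh one without searching, which is the difference from nosharecache noted above.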

Finally, for the record: I still feel very uncomfortable about not being
able to report the state of the client setup back to the sysadmin.
AFAIK, the only way to do so is to stat the mountpoints, and compare the
device ids.

Trond

--- Begin Message ---
Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---

 fs/nfs/super.c |  110 +---
 1 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index c28f30d..8ed5937 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -1303,34 +1303,6 @@ static void nfs_clone_super(struct super_block *sb,
nfs_initialise_sb(sb);
 }
 
-static int nfs_set_super(struct super_block *s, void *_server)
-{
-   struct nfs_server *server = _server;
-   int ret;
-
-   s->s_fs_info = server;
-   ret = set_anon_super(s, server);
-   if (ret == 0)
-   server->s_dev = s->s_dev;
-   return ret;
-}
-
-static int nfs_compare_super(struct super_block *sb, void *data)
-{
-   struct nfs_server *server = data, *old = NFS_SB(sb);
-
-   if (memcmp(&old->nfs_client->cl_addr,
-   &server->nfs_client->cl_addr,
-   sizeof(old->nfs_client->cl_addr)) != 0)
-   return 0;
-   /* Note: NFS_MOUNT_UNSHARED == NFS4_MOUNT_UNSHARED */
-   if (old->flags & NFS_MOUNT_UNSHARED)
-   return 0;
-   if (memcmp(&old->fsid, &server->fsid, sizeof(old->fsid)) != 0)
-   return 0;
-   return 1;
-}
-
 #define NFS_MS_MASK (MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_SYNCHRONOUS)
 
 static int nfs_compare_mount_options(const struct super_block *s, const struct nfs_server *b, int flags)
@@ -1359,9 +1331,46 @@ static int nfs_compare_mount_options(const struct super_block *s, const struct n
goto Ebusy;
if (clnt_a->cl_auth->au_flavor != clnt_b->cl_auth->au_flavor)
goto Ebusy;
-   return 0;
+   return 1;
 Ebusy:
-   return -EBUSY;
+   return 0;
+}
+
+struct nfs_sb_mountdata {
+   struct nfs_server *server;
+   int mntflags;
+};
+
+static int nfs_set_super(struct super_block *s, void *data)
+{
+   struct nfs_sb_mountdata *sb_mntdata = data;
+   struct nfs_server *server = sb_mntdata->server;
+   int ret;
+
+   s->s_flags = sb_mntdata->mntflags;
+   s->s_fs_info = server;
+   ret = set_anon_super(s, server);
+   if (ret == 0)
+   server->s_dev = s->s_dev;
+   return ret;
+}
+
+static int nfs_compare_super(struct super_block *sb, void *data)
+{
+   struct nfs_sb_mountdata *sb_mntdata = data;
+   struct nfs_server *server = sb_mntdata->server, *old = NFS_SB(sb);
+   int mntflags = sb_mntdata->mntflags;
+
+   if (memcmp(&old->nfs_client->cl_addr,
+   &server->nfs_client->cl_addr,
+   sizeof(old->nfs_client->cl_addr)) != 0)
+   return 0;
+   /* Note: NFS_MOUNT_UNSHARED == NFS4_MOUNT_UNSHARED */
+   if (old->flags & NFS_MOUNT_UNSHARED)
+   return 0;
+   if (memcmp(&old->fsid, &server->fsid, sizeof(old->fsid)) != 0)
+   return 0;
+   return nfs_compare_mount_options(sb, server, mntflags);
 }
 
 static int nfs_get_sb(struct file_system_type *fs_type,
@@ -1373,6 +1382,9 @@ static int nfs_get_sb(struct file_system_type *fs_type,
struct nfs_mount_data *data = raw_data;
struct dentry *mntroot;
int (*compare_super)(struct super_block *, void *) = nfs_compare_super;
+   struct nfs_sb_mountdata sb_mntdata = {
+   .mntflags = flags,
+   };
int error;
 
/* Validate the mount data */
@@ -1386,28 +1398,25 @@ static int nfs_get_sb(struct file_system_type *fs_type,
error = PTR_ERR(server);
goto out;
}
+   sb_mntdata.server = server;
 
if (server->flags & NFS_MOUNT_UNSHARED)
compare_super = NULL;
 
/* Get a superblock - note that we may end up sharing one that already exists */
-   s = sget(fs_type, compare_super, nfs_set_super, server);
+   s = sget(fs_type, compare_super, nfs_set_super, &sb_mntdata);
if (IS_ERR(s)) {
error = PTR_ERR(s);
goto out_err_nosb;

Re: recent nfs change causes autofs regression

2007-08-31 Thread Frank van Maarseveen

On Fri, Aug 31, 2007 at 09:50:12AM -0400, Trond Myklebust wrote:
> On Fri, 2007-08-31 at 15:12 +0200, Frank van Maarseveen wrote:
> 
> > IMHO I'd only consider returning EBUSY when trying to mount _exactly_
> > the same directory with different flags, not for arbitrary subtrees. The
> > client should preferably not be bothered with server side disk
> > partitioning (at least not beyond the obvious such as df output).
> 
> That is utterly inconsistent and confusing too.
> 
> If you have a filesystem "/foo" exported on the server "remote", then
> why should
> 
> mount -oro remote:/foo
> mount -orw remote:/foo/a
> 
> be allowed, but
> 
> mount -oro remote:/foo
> mount -orw remote:/foo
> 
> be forbidden?

I'm not arguing to forbid the second case but confronting the sysadmin
there with nosharecache is much less likely to harm existing setups than
the first case. Let's consider the most likely intention. The first case
is probably used as:

mount -oro remote:/foo    /foo
mount -orw remote:/foo/a  /foo/a

and I don't see a real issue with that, sharecache or not. Ditto with:

mount -oro remote:/foo/a  /a
mount -orw remote:/foo/b  /b

These are all typical use cases, without multiple views on the same
tree. But

mount -oro remote:/foo  /foo1
mount -orw remote:/foo  /foo2

is strange and much less likely.

-- 
Frank

Re: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Fri, 2007-08-31 at 15:12 +0200, Frank van Maarseveen wrote:

> IMHO I'd only consider returning EBUSY when trying to mount _exactly_
> the same directory with different flags, not for arbitrary subtrees. The
> client should preferably not be bothered with server side disk
> partitioning (at least not beyond the obvious such as df output).

That is utterly inconsistent and confusing too.

If you have a filesystem "/foo" exported on the server "remote", then
why should

mount -oro remote:/foo
mount -orw remote:/foo/a

be allowed, but

mount -oro remote:/foo
mount -orw remote:/foo

be forbidden? The caching problems are the same. Telling the admin that
one is safe and the other is not, is just messing with his mind.

Trond


Re: recent nfs change causes autofs regression

2007-08-31 Thread Frank van Maarseveen

On Fri, Aug 31, 2007 at 08:11:38AM -0400, Trond Myklebust wrote:
> On Fri, 2007-08-31 at 01:07 -0700, Linus Torvalds wrote:
> > 
> 
> > If you want new behaviour, you add a new flag saying you want new 
> > behaviour. You don't just start behaving differently from what you've 
> > always done before (and what *other* UNIXes do, for that matter).
> > 
> > Besides, even *if* it was a matter of somebody doing a mount with "rw", 
> > when the previous mount was "ro", returning EBUSY is still the wrong thing 
> > to do! If the user asks for a new mount that is read-write, he should just 
> > get it - ie we should not re-use the old client handles, and we should do 
> > what Solaris apparently does, namely to just make it a totally different 
> > mount.
> > 
> > In other words, it should (as I already mentioned once) have used 
> > "nosharecache" by default, which makes it all work.
> > 
> > Then, people who want to re-use the caches (which in turn may mean that 
> > everything needs to have the same flags), THOSE PEOPLE, who want the NEW 
> > SEMANTICS (errors and all) should then use a "sharecache" flag.
> 
> That would be a major change in existing semantics. The default has been
> "sharecache" ever since Al Viro introduced the "sget()" function some 6
> or 7 years ago. The problem was that we never advertised the fact that
> the kernel was overriding your mount options, and so sysadmins were
> (rightly IMO) complaining that they should _know_ when the client does
> this.
> 
> The list of known problems with a "nosharecache" default is nasty too:
> 
> - file and directory attribute and data caching breaks.
> Applications will see stale data in cases where they otherwise
> would not expect it.
> 
> - the existing dcache and icache issues when a file is renamed
> or deleted on the server are now extended to also include the
> case where the rename or deletion occurs on an alias in another
> directory on the client itself. In particular, sillyrename will
> break.
> 
> - file locking breaks (the server knows that the client holds
> locks on one file, whereas the client thinks it holds locks on
> several).
> 
> - the NFSv4 delegation model breaks: the client will be using
> OPEN when it could use cached opens. More importantly, when
> performing an operation that requires it to return the
> delegation on the aliased file, it won't know until the server
> sends it a callback.
> 
> ...and of course, the amount of unnecessary traffic to the server
> increases. I'm not aware of any sane way of dealing with those issues,
> and I doubt Solaris has a solution for them either.

All of this won't happen when server foo exports /bar and a client
mounts /bar/x and /bar/y separately: there must be a shared subtree or
hard-links between files within them, right?

An obvious (but disruptive) server side workaround is to export the
subtrees with different fsid= but that would give the same list of
problems as above, right?

IMHO I'd only consider returning EBUSY when trying to mount _exactly_
the same directory with different flags, not for arbitrary subtrees. The
client should preferably not be bothered with server side disk
partitioning (at least not beyond the obvious such as df output).

-- 
Frank

Re: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Fri, 2007-08-31 at 01:07 -0700, Linus Torvalds wrote:
> 

> If you want new behaviour, you add a new flag saying you want new 
> behaviour. You don't just start behaving differently from what you've 
> always done before (and what *other* UNIXes do, for that matter).
> 
> Besides, even *if* it was a matter of somebody doing a mount with "rw", 
> when the previous mount was "ro", returning EBUSY is still the wrong thing 
> to do! If the user asks for a new mount that is read-write, he should just 
> get it - ie we should not re-use the old client handles, and we should do 
> what Solaris apparently does, namely to just make it a totally different 
> mount.
> 
> In other words, it should (as I already mentioned once) have used 
> "nosharecache" by default, which makes it all work.
> 
> Then, people who want to re-use the caches (which in turn may mean that 
> everything needs to have the same flags), THOSE PEOPLE, who want the NEW 
> SEMANTICS (errors and all) should then use a "sharecache" flag.

That would be a major change in existing semantics. The default has been
"sharecache" ever since Al Viro introduced the "sget()" function some 6
or 7 years ago. The problem was that we never advertised the fact that
the kernel was overriding your mount options, and so sysadmins were
(rightly IMO) complaining that they should _know_ when the client does
this.

The list of known problems with a "nosharecache" default is nasty too:

- file and directory attribute and data caching breaks.
Applications will see stale data in cases where they otherwise
would not expect it.

- the existing dcache and icache issues when a file is renamed
or deleted on the server are now extended to also include the
case where the rename or deletion occurs on an alias in another
directory on the client itself. In particular, sillyrename will
break.

- file locking breaks (the server knows that the client holds
locks on one file, whereas the client thinks it holds locks on
several).

- the NFSv4 delegation model breaks: the client will be using
OPEN when it could use cached opens. More importantly, when
performing an operation that requires it to return the
delegation on the aliased file, it won't know until the server
sends it a callback.

...and of course, the amount of unnecessary traffic to the server
increases. I'm not aware of any sane way of dealing with those issues,
and I doubt Solaris has a solution for them either.

Trond


Re: recent nfs change causes autofs regression

2007-08-31 Thread Ian Kent

On Fri, 31 Aug 2007, Frank van Maarseveen wrote:

> On Thu, Aug 30, 2007 at 02:07:43PM -0700, Hua Zhong wrote:
> > I am re-sending this after help from Ian and git-bisect. To me it's a
> > show-stopper: I cannot find an acceptable workaround that I can implement.
> > 
> > > The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
> > > mounts to fail silently - they just do not appear when they should.
> > > 
> > > I believe it's caused by the NFS change that forces multiple mounts from
> > > different directories under the same server side filesystem to have the same
> > > mount options by default, otherwise it returns EBUSY.
> > > 
> > > For example, if the server has a filesystem /a, and it exports /a/x and /a/y
> > > (maybe with rw or ro), then a client must now mount /a/x and /a/y with the
> > > same mount options.
> > > 
> > > Since in my setup they are managed by autofs, and the autofs map is managed
> > > by nis, there is no way I could easily work around it.
> > 
> > If we have to live with this regression, I want to hear some suggestions
> > about how to fix them realistically. Thanks.
> > 
> > By the way, I am not sure if I did the bisect right, but FWIW, git-bisect
> > says:
> > 
> > c98451bdb2f3e6d6cc1e03adad641e9497512b49 is first bad commit
> > commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
> > Author: Frank van Maarseveen <[EMAIL PROTECTED]>
> > Date:   Mon Jul 9 22:25:29 2007 +0200
> > 
> > NLM: fix source address of callback to client
> > 
> > Use the destination address of the original NLM request as the
> > source address in callbacks to the client.
> > 
> > Signed-off-by: Frank van Maarseveen <[EMAIL PROTECTED]>
> > Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
> > 
> > :04 04 675c84bd8b2c50744018becaa0db4aeca19b8f9f
> > 105fbd3cb3fa5e3019836b4b5268125d0181a72d M  fs
> > :04 04 0138796e0806b4ebd1cc3850ed4e8c7ab24d2d41
> > 2fec08debe51c20423a88b1a0d4281c683ba5daf M  include
> 
> This does not have any relation with the mount problem, assuming commit
> and comment do match.

That's right.

The commits we're discussing here are (I believe):

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=75180df2ed467866ada839fe73cf7cc7d75c0a22
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=275a5d24bf56b2d9dd4644c54a56366b89a028f1

The latter being the one returning EBUSY for the option mismatch and the 
former the addition of the "nosharecache" option.
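
To make the combined effect of those two commits concrete, here is a rough
userspace model of the 2.6.23-rc4 behaviour as described in this thread. It
is only a sketch: struct model_sb, its fields and model_rc4_mount() are
made-up names for illustration, not kernel code, and the option comparison
is reduced to a single bitmask.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an NFS superblock; not a kernel type. */
struct model_sb {
	unsigned long fsid;	/* identity of the server-side filesystem */
	unsigned int opts;	/* bitmask standing in for the mount options */
	bool unshared;		/* true when mounted with "nosharecache" */
};

/*
 * Model of the -rc4 behaviour: a new mount of a filesystem that is already
 * mounted must use the same options, otherwise it fails with -EBUSY.
 * Returns 0 when the mount may proceed.
 */
int model_rc4_mount(const struct model_sb *existing, size_t n,
		    const struct model_sb *wanted)
{
	assert(wanted != NULL);
	assert(existing != NULL || n == 0);

	if (wanted->unshared)
		return 0;	/* "nosharecache": never try to share */

	for (size_t i = 0; i < n; i++) {
		if (existing[i].unshared || existing[i].fsid != wanted->fsid)
			continue;
		/* Same server filesystem: the options have to agree. */
		return existing[i].opts == wanted->opts ? 0 : -EBUSY;
	}
	return 0;		/* nothing to share with: new superblock */
}
```

In this model an ro mount of /a/x followed by an rw mount of /a/y (same
fsid, different opts) gets -EBUSY, which is the autofs failure reported
above, unless the second mount sets the unshared flag.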

Ian


Re: recent nfs change causes autofs regression

2007-08-31 Thread Martin Knoblauch


--- Ian Kent <[EMAIL PROTECTED]> wrote:

> On Thu, 30 Aug 2007, Linus Torvalds wrote:
> > 
> > 
> > On Fri, 31 Aug 2007, Trond Myklebust wrote:
> > > 
> > > It did not. The previous behaviour was to always silently override the
> > > user mount options.
> > 
> > ..so it still worked for any sane setup, at least.
> > 
> > You broke that. Hua gave good reasons for why he cannot use the current
> > kernel. It's a regression.
> > 
> > In other words, the new behaviour is *worse* than the behaviour you
> > consider to be the incorrect one.
> > 
> 
> This all came about due to complaints about not being able to mount the
> same server file system with different options, most commonly ro vs. rw,
> which I think was due to the shared super block changes some time ago.
> And, to some extent, I have to plead guilty for not complaining enough
> about this default in the beginning, which is basically unacceptable for
> sure.
> 
> We have seen breakage in Fedora with the introduction of the patches and
> this is typical of it. It also breaks amd and admins have no way of
> altering this that I'm aware of (help us here Ion).
> 
> I understand Trond's concerns but the fact remains that other Unixes allow
> this behaviour but don't assert cache coherency and many sysadmins don't
> realize this. So the broken behavior is expected to work and we can't
> simply stop allowing it unless we want to attend a public hanging with us
> as the participants.
> 
> There is no question that the new behavior is worse and this change is
> unacceptable as a solution to the original problem.
> 
> I really think that reversing the default, as has been suggested,
> documenting the risk in the mount.nfs man page and perhaps issuing a
> warning from the kernel is a better way to handle this. At least we will
> be doing more to raise public awareness of the issue than others.
> 

 I can only second that. Changing the default behavior in this way is
really bad.

 Not that I am disagreeing with the technical reasons, but the change
breaks working setups. And -EBUSY is not very helpful as a message
here. It does not matter that the user tools may handle the breakage
incorrectly. The users (admins) had working setups for years. And they
were obviously working "good enough".

 And one should not forget that there will be a considerable time until
"nosharecache" will trickle down into distributions.

 If the situation stays this way, quite a few people will not be able
to move beyond 2.6.22 for some time. E.g. I am working for a
company that operates some Linux "clusters" at a few German automotive
companies. For certain reasons everything there is based on
automounter maps (both autofs and amd style). We have almost zero
influence on that setup. The maps are a mess - we will run into the
sharecache problem. At the same time I am trying to fight the notorious
"system turns into frozen molasses on moderate I/O load". There may be
some interesting developments coming forth after 2.6.22. Not good :-(

 What I would like to see done for the situation at hand is:

- make "nosharecache" the default for the foreseeable future
- log any attempt to mount option-inconsistent NFS filesystems to dmesg
and syslog (apparently the NFS client is able to detect them :-). Do
this regardless of the "nosharecache" option. This way admins will at
least be made aware of the situation.
- In a year or so we can talk about making the default safe. With
proper advertising.

 Just my 0.02.

Cheers
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de

Re: recent nfs change causes autofs regression

2007-08-31 Thread Jakob Oestergaard

On Fri, Aug 31, 2007 at 01:07:56AM -0700, Linus Torvalds wrote:
...
> When we add NEW BEHAVIOUR, we don't add it to old interfaces when that 
> breaks old user mode! We add a new flag saying "I want the new behaviour".
> 
> This is not rocket science, guys. This is very basic kernel behaviour. The 
> kernel exists only to serve user space, and that means that there is no 
> more important thing to do than to make sure you don't break existing 
> users, unless you have some *damn* strong reasons.

100% agreed.

> The fact that he may *also* have broken insane setups is totally 
> irrelevant. Don't go off on some tangent that has nothing to do with the 
> regression in question!

It does not have "nothing" to do with the regression.

Some setups which worked more by accident than by design earlier on were broken
by the fix. This could have been avoided, I agree, but the breakage was caused
by the fix (or the breakage is the fix, however you prefer to look at it).

> > If ext3 in some rare case (which would still mean it hit a few thousand users)
> > failed to remember that a file had been marked read-only and allowed writes to
> > it, wouldn't we want to fix that too?  It would cause regressions, but we'd fix
> > it, right?
> 
> Stop blathering. Of course we fix security holes. But we don't break 
> things that don't need breaking. This wasn't a security hole.

*part* of it wasn't a security hole.

The other half very much was.

...
> In other words, it should (as I already mentioned once) have used 
> "nosharecache" by default, which makes it all work.
> 
> Then, people who want to re-use the caches (which in turn may mean that 
> everything needs to have the same flags), THOSE PEOPLE, who want the NEW 
> SEMANTICS (errors and all) should then use a "sharecache" flag.
> 
> See? You don't have to screw people over.

Sure, given that Trond (or whoever) has the time it takes to go and implement
all of this, there's no need to screw anyone.

Assuming he's on a schedule and this will have to wait, I agree with him that
it makes the most sense to play it safe security/consistency-wise rather than
functionality-wise.

> > mount passes back the error code on a failed mount. autofs passes that error
> > along too (when people configure syslog correctly). In short; when these
> > serious mistakes are made and caught, the admin sees an error in his logs.
> 
> Bullshit. "Seeing the error in his logs" doesn't help anything.

It makes troubleshooting possible, which addresses *the* major complaint from
*one* of the *two* people who complained about this.


-- 

 / jakob


Re: recent nfs change causes autofs regression

2007-08-31 Thread Frank van Maarseveen

On Thu, Aug 30, 2007 at 02:07:43PM -0700, Hua Zhong wrote:
> I am re-sending this after help from Ian and git-bisect. To me it's a
> show-stopper: I cannot find an acceptable workaround that I can implement.
> 
> The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
> mounts to fail silently - they just do not appear when they should.
> 
> I believe it's caused by the NFS change that forces multiple mounts from
> different directories under the same server side filesystem to have the same
> mount options by default, otherwise it returns EBUSY.
> 
> For example, if the server has a filesystem /a, and it exports /a/x and /a/y
> (maybe with rw or ro), then a client must now mount /a/x and /a/y with the
> same mount options.
> 
> Since in my setup they are managed by autofs, and the autofs map is managed
> by nis, there is no way I could easily work around it.
> 
> If we have to live with this regression, I want to hear some suggestions
> about how to fix them realistically. Thanks.
> 
> By the way, I am not sure if I did the bisect right, but FWIW, git-bisect
> says:
> 
> c98451bdb2f3e6d6cc1e03adad641e9497512b49 is first bad commit
> commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
> Author: Frank van Maarseveen <[EMAIL PROTECTED]>
> Date:   Mon Jul 9 22:25:29 2007 +0200
> 
> NLM: fix source address of callback to client
> 
> Use the destination address of the original NLM request as the
> source address in callbacks to the client.
> 
> Signed-off-by: Frank van Maarseveen <[EMAIL PROTECTED]>
> Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
> 
> :04 04 675c84bd8b2c50744018becaa0db4aeca19b8f9f
> 105fbd3cb3fa5e3019836b4b5268125d0181a72d M  fs
> :04 04 0138796e0806b4ebd1cc3850ed4e8c7ab24d2d41
> 2fec08debe51c20423a88b1a0d4281c683ba5daf M  include

This does not have any relation with the mount problem, assuming commit
and comment do match.

-- 
Frank

Re: recent nfs change causes autofs regression

2007-08-31 Thread Frank van Maarseveen

On Fri, Aug 31, 2007 at 09:40:28AM +0200, Jakob Oestergaard wrote:
> On Thu, Aug 30, 2007 at 10:16:37PM -0700, Linus Torvalds wrote:
> > 
> ...
> > > Why aren't we doing that for any other filesystem than NFS?
> > 
> > How hard is it to acknowledge the following little word:
> > 
> > "regression"
> > 
> > It's simple. You broke things. You may want to fix them, but you need to 
> > fix them in a way that does not break user space.
> 
> Trond has a point Linus.
> 
> What he "broke" is, for example, a ro mount being mounted as rw.
> 
> That *could* be a very serious security (etc.etc.) problem which he just fixed.
> Anything depending on read-only not being enforced will cease to work, of
> course, and that is what a few people complain about(!).
> 
> If ext3 in some rare case (which would still mean it hit a few thousand users)
> failed to remember that a file had been marked read-only and allowed writes to
> it, wouldn't we want to fix that too?  It would cause regressions, but we'd fix
> it, right?
> 
> mount passes back the error code on a failed mount. autofs passes that error
> along too (when people configure syslog correctly). In short; when these
> serious mistakes are made and caught, the admin sees an error in his logs.

Hua explained already that seeing the error is not the same as fixing
the error: he cannot fix it because NFS implies other systems we _must_
co-operate with.

-- 
Frank

Re: recent nfs change causes autofs regression

2007-08-31 Thread Matthias Schniedermeyer

> > It's not very conservative to suddenly change default behavior and break
> > autofs mounts. There is not even one kernel message that "_tells_ user why
> > it thinks it's wrong". It just silently fails.
> 
> No it doesn't. It reports an error code to the caller. If autofs is
> failing silently, then that is a bug in autofs: mount will report the
> error to the user.

Wrong(tm).

autofs AND mounting at the command line just say:
mount.nfs: /mnt is already mounted or busy

Which has an actual information value of about 1%.

In my case I moved an NFS-exported directory inside another NFS-exported 
directory months ago and placed a symlink where the directory was (on the 
server side). It never occurred to me that that was "wrong"(tm).

Now I can only mount one of the two mounts and the other just tells me 
"busy".

After reading this I could fix my case easily. I just erased the 
"deeper" mount and symlinked the directory from the other mount.

But YOU HAVE TO KNOW THAT YOU DID SOMETHING WRONG. Just getting a "Busy" 
leaves you with question marks flying around your head!




Bis denn

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


Re: recent nfs change causes autofs regression

2007-08-31 Thread Linus Torvalds

On Fri, 31 Aug 2007, Jakob Oestergaard wrote:
> 
> Trond has a point Linus.

I don't dispute that the new code does something good.

But it changes existing behaviour.

When we add NEW BEHAVIOUR, we don't add it to old interfaces when that 
breaks old user mode! We add a new flag saying "I want the new behaviour".

This is not rocket science, guys. This is very basic kernel behaviour. The 
kernel exists only to serve user space, and that means that there is no 
more important thing to do than to make sure you don't break existing 
users, unless you have some *damn* strong reasons.

> What he "broke" is, for example, a ro mount being mounted as rw.

No. What he broke was a working and sane setup.

The fact that he may *also* have broken insane setups is totally 
irrelevant. Don't go off on some tangent that has nothing to do with the 
regression in question!

> If ext3 in some rare case (which would still mean it hit a few thousand users)
> failed to remember that a file had been marked read-only and allowed writes to
> it, wouldn't we want to fix that too?  It would cause regressions, but we'd fix
> it, right?

Stop blathering. Of course we fix security holes. But we don't break 
things that don't need breaking. This wasn't a security hole.

You are making up irrelevant arguments that have nothing to do with this 
regression.

If you want new behaviour, you add a new flag saying you want new 
behaviour. You don't just start behaving differently from what you've 
always done before (and what *other* UNIXes do, for that matter).

Besides, even *if* it was a matter of somebody doing a mount with "rw", 
when the previous mount was "ro", returning EBUSY is still the wrong thing 
to do! If the user asks for a new mount that is read-write, he should just 
get it - ie we should not re-use the old client handles, and we should do 
what Solaris apparently does, namely to just make it a totally different 
mount.

In other words, it should (as I already mentioned once) have used 
"nosharecache" by default, which makes it all work.

Then, people who want to re-use the caches (which in turn may mean that 
everything needs to have the same flags), THOSE PEOPLE, who want the NEW 
SEMANTICS (errors and all) should then use a "sharecache" flag.

See? You don't have to screw people over.

> mount passes back the error code on a failed mount. autofs passes that error
> along too (when people configure syslog correctly). In short; when these
> serious mistakes are made and caught, the admin sees an error in his logs.

Bullshit. "Seeing the error in his logs" doesn't help anything. The 
problem wasn't the lack of error, the problem was that it was a new and 
unnecessary error in the first place. Logging it doesn't make it any less 
buggy.

Linus

Re: recent nfs change causes autofs regression

2007-08-31 Thread Jakob Oestergaard

On Thu, Aug 30, 2007 at 10:16:37PM -0700, Linus Torvalds wrote:
> 
...
> > Why aren't we doing that for any other filesystem than NFS?
> 
> How hard is it to acknowledge the following little word:
> 
>   "regression"
> 
> It's simple. You broke things. You may want to fix them, but you need to 
> fix them in a way that does not break user space.

Trond has a point Linus.

What he "broke" is, for example, a ro mount being mounted as rw.

That *could* be a very serious security (etc.etc.) problem which he just fixed.
Anything depending on read-only not being enforced will cease to work, of
course, and that is what a few people complain about(!).

If ext3 in some rare case (which would still mean it hit a few thousand users)
failed to remember that a file had been marked read-only and allowed writes to
it, wouldn't we want to fix that too?  It would cause regressions, but we'd fix
it, right?

mount passes back the error code on a failed mount. autofs passes that error
along too (when people configure syslog correctly). In short; when these
serious mistakes are made and caught, the admin sees an error in his logs.

This is not wrong. This is good.

-- 

 / jakob


Re: recent nfs change causes autofs regression

2007-08-31 Thread Frank van Maarseveen

On Fri, Aug 31, 2007 at 09:50:12AM -0400, Trond Myklebust wrote:
> On Fri, 2007-08-31 at 15:12 +0200, Frank van Maarseveen wrote:
> 
> > IMHO I'd only consider returning EBUSY when trying to mount _exactly_
> > the same directory with different flags, not for arbitrary subtrees. The
> > client should preferably not be bothered with server side disk
> > partitioning (at least not beyond the obvious such as df output).
> 
> That is utterly inconsistent and confusing too.
> 
> If you have a filesystem "/foo" exported on the server "remote", then
> why should
> 
> mount -oro remote:/foo
> mount -orw remote:/foo/a
> 
> be allowed, but
> 
> mount -oro remote:/foo
> mount -orw remote:/foo
> 
> be forbidden?

I'm not arguing to forbid the second case but confronting the sysadmin
there with nosharecache is much less likely to harm existing setups than
the first case. Let's consider the most likely intention. The first case
is probably used as:

mount -oro remote:/foo    path/foo
mount -orw remote:/foo/a  path/foo/a

and I don't see a real issue with that, sharecache or not. Ditto with:

mount -oro remote:/foo/a  path/a
mount -orw remote:/foo/b  path/b

These are all typical use cases, without multiple views on the same
tree. But

mount -oro remote:/foo  /foo1
mount -orw remote:/foo  /foo2

is strange and much less likely.

-- 
Frank

Re: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Thu, 2007-08-30 at 20:49 -0700, Linus Torvalds wrote:
> Please send in a fix. If the fix involves making "nosharecache" the 
> default, then that is better than making policy decisions like this in the 
> kernel. The kernel should do what the user asks and not put in unnecessary 
> roadblocks.

The best I can do given the constraints appears to be to have the kernel
first look for a superblock that matches both the fsid and the
user-specified mount options, and then spawn off a new superblock if
that search fails. The attached patch does just that.

Note that this is not the same as specifying "nosharecache" everywhere
since "nosharecache" will never attempt to match an existing superblock.
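
The difference between the two lookups can be shown with a small userspace
model. This is a sketch only: sketch_sb, sketch_sb_table and sketch_get_sb()
are invented names, the mount options are collapsed into one bitmask, and
the real matching is done through sget() in the patch below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_SB 8

/* Hypothetical stand-ins for superblocks; not kernel types. */
struct sketch_sb {
	unsigned long fsid;	/* identity of the server-side filesystem */
	unsigned int opts;	/* bitmask standing in for the mount options */
	bool unshared;		/* true when mounted with "nosharecache" */
};

struct sketch_sb_table {
	struct sketch_sb sb[SKETCH_MAX_SB];
	size_t count;
};

/*
 * Reuse a superblock only if both the fsid and the options match;
 * otherwise create a new one. An unshared mount skips the search.
 * Returns the index of the superblock used, or -1 if the table is full.
 */
int sketch_get_sb(struct sketch_sb_table *t, unsigned long fsid,
		  unsigned int opts, bool unshared)
{
	assert(t != NULL);

	if (!unshared) {
		for (size_t i = 0; i < t->count; i++) {
			const struct sketch_sb *old = &t->sb[i];

			if (!old->unshared && old->fsid == fsid &&
			    old->opts == opts)
				return (int)i;	/* share the existing one */
		}
	}
	if (t->count == SKETCH_MAX_SB)
		return -1;
	t->sb[t->count] = (struct sketch_sb){ fsid, opts, unshared };
	return (int)t->count++;
}
```

Two mounts with the same fsid and options end up on the same entry, a mount
with different options gets a fresh entry instead of an error, and every
unshared mount gets its own entry even when an identical one exists.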

Finally, for the record: I still feel very uncomfortable about not being
able to report the state of the client setup back to the sysadmin.
AFAIK, the only way to do so is to stat the mountpoints, and compare the
device ids.

Trond

---BeginMessage---
Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---

 fs/nfs/super.c |  110 +---
 1 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index c28f30d..8ed5937 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -1303,34 +1303,6 @@ static void nfs_clone_super(struct super_block *sb,
 	nfs_initialise_sb(sb);
 }
 
-static int nfs_set_super(struct super_block *s, void *_server)
-{
-	struct nfs_server *server = _server;
-	int ret;
-
-	s->s_fs_info = server;
-	ret = set_anon_super(s, server);
-	if (ret == 0)
-		server->s_dev = s->s_dev;
-	return ret;
-}
-
-static int nfs_compare_super(struct super_block *sb, void *data)
-{
-	struct nfs_server *server = data, *old = NFS_SB(sb);
-
-	if (memcmp(&old->nfs_client->cl_addr,
-				&server->nfs_client->cl_addr,
-				sizeof(old->nfs_client->cl_addr)) != 0)
-		return 0;
-	/* Note: NFS_MOUNT_UNSHARED == NFS4_MOUNT_UNSHARED */
-	if (old->flags & NFS_MOUNT_UNSHARED)
-		return 0;
-	if (memcmp(&old->fsid, &server->fsid, sizeof(old->fsid)) != 0)
-		return 0;
-	return 1;
-}
-
 #define NFS_MS_MASK (MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_SYNCHRONOUS)
 
 static int nfs_compare_mount_options(const struct super_block *s, const struct nfs_server *b, int flags)
@@ -1359,9 +1331,46 @@ static int nfs_compare_mount_options(const struct super_block *s, const struct n
 		goto Ebusy;
 	if (clnt_a->cl_auth->au_flavor != clnt_b->cl_auth->au_flavor)
 		goto Ebusy;
-	return 0;
+	return 1;
 Ebusy:
-	return -EBUSY;
+	return 0;
+}
+
+struct nfs_sb_mountdata {
+	struct nfs_server *server;
+	int mntflags;
+};
+
+static int nfs_set_super(struct super_block *s, void *data)
+{
+	struct nfs_sb_mountdata *sb_mntdata = data;
+	struct nfs_server *server = sb_mntdata->server;
+	int ret;
+
+	s->s_flags = sb_mntdata->mntflags;
+	s->s_fs_info = server;
+	ret = set_anon_super(s, server);
+	if (ret == 0)
+		server->s_dev = s->s_dev;
+	return ret;
+}
+
+static int nfs_compare_super(struct super_block *sb, void *data)
+{
+	struct nfs_sb_mountdata *sb_mntdata = data;
+	struct nfs_server *server = sb_mntdata->server, *old = NFS_SB(sb);
+	int mntflags = sb_mntdata->mntflags;
+
+	if (memcmp(&old->nfs_client->cl_addr,
+				&server->nfs_client->cl_addr,
+				sizeof(old->nfs_client->cl_addr)) != 0)
+		return 0;
+	/* Note: NFS_MOUNT_UNSHARED == NFS4_MOUNT_UNSHARED */
+	if (old->flags & NFS_MOUNT_UNSHARED)
+		return 0;
+	if (memcmp(&old->fsid, &server->fsid, sizeof(old->fsid)) != 0)
+		return 0;
+	return nfs_compare_mount_options(sb, server, mntflags);
 }
 
 static int nfs_get_sb(struct file_system_type *fs_type,
@@ -1373,6 +1382,9 @@ static int nfs_get_sb(struct file_system_type *fs_type,
 	struct nfs_mount_data *data = raw_data;
 	struct dentry *mntroot;
 	int (*compare_super)(struct super_block *, void *) = nfs_compare_super;
+	struct nfs_sb_mountdata sb_mntdata = {
+		.mntflags = flags,
+	};
 	int error;
 
 	/* Validate the mount data */
@@ -1386,28 +1398,25 @@ static int nfs_get_sb(struct file_system_type *fs_type,
 		error = PTR_ERR(server);
 		goto out;
 	}
+	sb_mntdata.server = server;
 
 	if (server->flags & NFS_MOUNT_UNSHARED)
 		compare_super = NULL;
 
 	/* Get a superblock - note that we may end up sharing one that already exists */
-	s = sget(fs_type, compare_super, nfs_set_super, server);
+	s = sget(fs_type, compare_super, nfs_set_super, &sb_mntdata);
 	if (IS_ERR(s)) {
 		error = PTR_ERR(s);
 		goto out_err_nosb;
 	}

Re: recent nfs change causes autofs regression

2007-08-31 Thread Linus Torvalds



On Fri, 31 Aug 2007, Jakob Oestergaard wrote:
> 
> > The fact that he may *also* have broken insane setups is totally 
> > irrelevant. Don't go off on some tangent that has nothing to do with the 
> > regression in question!
> 
> It does not have "nothing" to do with the regression.
> 
> Some setups which worked more by accident than by design earlier on were broken
> by the fix. This could have been avoided, I agree, but the breakage was caused
> by the fix (or the breakage is the fix, however you prefer to look at it).

Well, it's not a fix if it breaks other setups. 

It's especially not a fix since the whole requirement that all the flags 
be exactly the same is totally brain-dead in the first place. We *have* 
that kind of mount already, and it has nothing to do with NFS: it's called 
a bind mount.

So if you want an identical mount, with cache coherency and tying the two 
mount-points together (requiring that they have the same mount flags), 
then that has absolutely *nothing* to do with NFS. The VFS layer does that 
for you.

> *part* of it wasn't a security hole.
> 
> The other half very much was.

No, the fix was simply wrong.  It was done the wrong way, and it broke 
things it shouldn't have broken.

Let's put it this way: if I create a patch that stops the system from 
booting, I sure as hell fix a potential security hole, don't I?

Does that make my patch a fix?

No it does not.

> Sure, given that Trond (or whomever) has the time it takes to go and implement
> all of this, there's no need to screw anyone.
> 
> Assuming he's on a schedule and this will have to wait, I agree with him that
> it makes the most sense to play it safe security/consistency-wise rather than
> functionality-wise.

I disagree. Either that thing gets fixed before 2.6.23, or the commit that 
introduced the broken behaviour gets reverted.

We've had this policy of "regressions are fixed" for a long time, and 
we're not suddenly changing it.

This is *not* a security hole. In order to make it a security hole, you 
need to be root in the first place. So what you call a security hole is 
really no different from root installing a bad SUID binary. It's simply 
not the kernel's place to then say "SUID binaries will not work, because 
it's a potential security hole".

See?

So stop calling this a "security hole".  It's certainly a misfeature, but:

 - it's a misfeature that people are used to, and has been around forever.

 - there are bound to be ways to fix it that don't break existing users.

 - the requirement that all flags be the same for a mount to the same NFS 
   directory is *particularly* stupid, since there are better ways to do 
   that than go through NFS!

so I really don't see why people excuse the new behaviour.

Linus

Re: recent nfs change causes autofs regression

2007-08-31 Thread Linus Torvalds



On Fri, 31 Aug 2007, Trond Myklebust wrote:

> The best I can do given the constraints appears to be to have the kernel
> first look for a superblock that matches both the fsid and the
> user-specified mount options, and then spawn off a new superblock if
> that search fails.

I think this is probably acceptable to get roughly the old behaviour, but 
I still think it's a bit stupid.

What happens at "mount -o remount,..." time?

The fact is, the whole "match the fsid and user mount options, and re-use 
the mount" sounds like it's trying to solve a problem that doesn't need 
solving. If the user really wants to duplicate the mount, he really should 
be using a bind-mount instead.

In other words, let's assume that the user has /some/nfs/mount mounted 
over NFS, and wants to re-mount it (or even just a subset of it) somewhere 
else, the sane thing to do is not to mount it again, but to just do

mount --bind /some/nfs/mount/subdir /new/mount/place

instead. That *guarantees* that the low-level filesystem uses the same 
flags, and it also means that things like re-mounting have sane and 
well-defined semantics, and will fail or succeed predictably.

In contrast, if a user wants to create a new NFS mount, it really should 
be independent of the old one, because that's (a) what other systems do, 
and (b) also makes the semantics of re-mounting it with other flags be 
clear and unambiguous (ie the remount has nothing what-so-ever to do with 
the independent NFS mount).

See? 

This is why I think nosharecache should just be the default, because 
that's the behaviour that simply does not have any subtle issues. The 
*special* case should be the sharecache case, and 99% of the time that 
one should likely be done with a --bind mount.

(I don't really see the point of _ever_ doing anything but a bind mount, 
but maybe there are reasons to try to share at a NFS layer that I don't 
really see)

 The attached patch does just that.

Hua, does this fix things for you? If it gets rid of the regression, I can 
certainly live with it, but as per above, I don't really think this makes 
much sense in the bigger picture kind of thing.

 Finally, for the record: I still feel very uncomfortable about not being
 able to report the state of the client setup back to the sysadmin.
 AFAIK, the only way to do so is to stat the mountpoints, and compare the
 device ids.

Well, not only don't I see that as being horribly wrong, I actually think 
that the sysadmin should know what his mount setup is, even without having 
to ask. But since he *can* ask, using easy and standard interfaces, I 
don't really see what the problem really is.
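
As a rough illustration of the "stat the mountpoints, and compare the device ids" check discussed above, a minimal Python sketch might look like the following. It assumes that mounts backed by the same superblock report the same st_dev, and the function names are made up for this example.

```python
import os


def shares_superblock(path_a, path_b):
    """Return True when both paths report the same device id (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def group_by_device(paths):
    """Group paths by st_dev so that mounts sharing a device id end up together."""
    groups = {}
    for path in paths:
        groups.setdefault(os.stat(path).st_dev, []).append(path)
    return groups
```

Calling group_by_device() on a list of NFS mountpoints would then show which of them the client treats as one filesystem, under the assumption stated above.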

Linus

RE: recent nfs change causes autofs regression

2007-08-31 Thread Hua Zhong

This patch fixes the problem for me, thanks.

Is this patch changing the behavior of sharecache to
try-to-share-cache-if-possible, or adding a third behavior? If the user
explicitly asks for -o sharecache, does he get an error back if the mount
options mismatch?
 
 The best I can do given the constraints appears to be to have the
 kernel first look for a superblock that matches both the fsid and the
 user-specified mount options, and then spawn off a new superblock if
 that search fails. The attached patch does just that.
 
 Note that this is not the same as specifying nosharecache everywhere
 since nosharecache will never attempt to match an existing superblock.
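
The three behaviours being compared here can be sketched as a toy model in Python. This is not the kernel code: the MountTable class, the policy names "strict", "match_or_fork" and "nosharecache", and the use of plain tuples for superblocks are all assumptions made for illustration.

```python
import errno


class MountTable:
    """Toy model of how a mount request is matched against existing superblocks."""

    def __init__(self):
        self.superblocks = []  # list of (fsid, frozenset(options))

    def mount(self, fsid, options, policy="match_or_fork"):
        key = (fsid, frozenset(options))
        if policy != "nosharecache":
            # Re-use a superblock only if fsid AND options both match.
            if key in self.superblocks:
                return self.superblocks.index(key)
            # "strict" models the behaviour that caused the regression:
            # same fsid with different options is refused with EBUSY.
            if policy == "strict" and any(f == fsid for f, _ in self.superblocks):
                raise OSError(errno.EBUSY, "options differ from existing mount")
        # "nosharecache" never looks for a match; the others fork on a miss.
        self.superblocks.append(key)
        return len(self.superblocks) - 1
```

In this model, "match_or_fork" shares whenever it can and never fails, while "nosharecache" always creates a new superblock even when an identical one exists.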
 
 Finally, for the record: I still feel very uncomfortable about not
 being able to report the state of the client setup back to the sysadmin.
 AFAIK, the only way to do so is to stat the mountpoints, and compare
 the device ids.
 
 Trond



Re: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Fri, 2007-08-31 at 10:01 -0700, Linus Torvalds wrote:
 
 On Fri, 31 Aug 2007, Trond Myklebust wrote:
 
  The best I can do given the constraints appears to be to have the kernel
  first look for a superblock that matches both the fsid and the
  user-specified mount options, and then spawn off a new superblock if
  that search fails.
 
 I think this is probably acceptable to get roughly the old behaviour, but 
 I still think it's a bit stupid.
 
 What happens at mount -o remount,... time?
 
 The fact is, the whole match the fsid and user mount options, and re-use 
 the mount sounds like it's trying to solve a problem that doesn't need 
 solving. If the user really wants to duplicate the mount, he really should 
 be using a bind-mount instead.
 
 In other words, let's assume that the user has /some/nfs/mount mounted 
 over NFS, and wants to re-mount it (or even just a subset of it) somewhere 
 else, the sane thing to do is not to mount it again, but to just do
 
   mount --bind /some/nfs/mount/subdir /new/mount/place
 
 instead. That *guarantees* that the low-level filesystem uses the same 
 flags, and it also means that things like re-mounting have sane and 
 well-defined semantics, and will fail or succeed predictably.

I agree for the cases where you can use bind mounts, however you can't
always do that.

Consider the fairly common setup where /foo, /foo/a, /foo/b are all on
the same filesystem on the server, but only /foo/a and /foo/b are
exported.
There can be plenty of files that have hard links in both
directories, but because you cannot mount the parent, /foo, you will not
be able to ensure that these common files are cached to the same inode
(which they need to be).

IOW: with this scenario, you can't ensure that local posix semantics
hold (i.e. that if my client is the only user, then the filesystem will
behave as if it were a posix filesystem). That would be a major
regression.
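
The hard link point can be seen on a local filesystem with a small Python sketch. It is only an analogy, not NFS client code: the directories a and b stand in for /foo/a and /foo/b, and the helper names are hypothetical. Two names refer to the same file exactly when they agree on (st_dev, st_ino), which is the identity a client loses if each mount gets its own superblock.

```python
import os
import tempfile


def same_file(path_a, path_b):
    """Two paths name the same file when device id and inode number both match."""
    sa, sb = os.stat(path_a), os.stat(path_b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def demo():
    """Create one file with a hard link in each of two directories."""
    with tempfile.TemporaryDirectory() as foo:
        dir_a = os.path.join(foo, "a")
        dir_b = os.path.join(foo, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)
        first = os.path.join(dir_a, "data")
        second = os.path.join(dir_b, "data")
        with open(first, "w") as fh:
            fh.write("v1")
        os.link(first, second)      # second name for the same inode
        with open(first, "a") as fh:
            fh.write("+v2")         # write through one name...
        with open(second) as fh:
            seen = fh.read()        # ...and read through the other
        return same_file(first, second), seen
```

Locally the write through one name is visible through the other because both names resolve to one inode; that is the behaviour the shared superblock is meant to preserve.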

 In contrast, if a user wants to create a new NFS mount, it really should 
 be independent of the old one, because that's (a) what other systems do, 
 and (b) also makes the semantics of re-mounting it with other flags be 
 clear and unambiguous (ie the remount has nothing what-so-ever to do with 
 the independent NFS mount).

(a) I'm not sure that is true: see (b).
(b) You gain remount clarity at the expense of local posix filesystem
correctness.

Trond


RE: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Fri, 2007-08-31 at 11:47 -0700, Hua Zhong wrote:
 This patch fixes the problem for me, thanks.
 
 Is this patch changing the behavior of sharecache to
 try-to-share-cache-if-possible, or adding a third behavior? If the user
 explicitly asks for -o sharecache, does he get an error back if the mount
 options mismatch?

There has never been a 'sharecache' flag as far as the kernel is
concerned. The default behaviour has always been to share.

Trond



RE: recent nfs change causes autofs regression

2007-08-31 Thread Hua Zhong

 On Fri, 2007-08-31 at 11:47 -0700, Hua Zhong wrote:
  This patch fixes the problem for me, thanks.
 
  Is this patch changing the behavior of sharecache to
  try-to-share-cache-if-possible, or adding a third behavior? If the
  user explicitly asks for -o sharecache, does he get an error back
  if the mount options mismatch?
 
 There has never been a 'sharecache' flag as far as the kernel is
 concerned. The default behaviour has always been to share.

It's not about the default (for which backward compatibility is most important
and this patch is perfectly fine), but about a user who explicitly asks for
sharecache. In this case, if for any reason the cache cannot be shared, I
am not sure whether he should get an error back.

I for one agree with Ian and Linus that changing default to nosharecache
might be the best thing to do, but since I am now able to use the latest
kernel, I am very happy already.

Thanks a lot for your attention to my problem. :-)

 Trond



RE: recent nfs change causes autofs regression

2007-08-31 Thread Hua Zhong

 It's not about default (for which backward compatibility is most
 important and this patch is perfectly fine), but user explicitly asks
 for sharecache. In this case if for any reason the cache cannot be
 shared, I am not sure if he should get an error back.
 
 I for one agree with Ian and Linus that changing default to
 nosharecache might be the best thing to do, but since I am now able to
 use the latest kernel, I am very happy already.

Actually, I think just fine-tuning it a bit may be better:

1. make 'nosharecache' the default
2. apply the algorithm in this patch to 'nosharecache': if the fsid and
mount options are the same, then share cache

This way the default behavior does not change, but both algorithms have
pitfalls, and we choose from:
1. if user specifies sharecache, he may end up with nosharecache if mount
options are different
And
2. if user specifies nosharecache, he may end up with sharecache if mount
options are the same

I'd think 2 is better (least surprise). I cannot think of a case where 2 is
actually a bad thing.

Comments?

 Thanks a lot for your attention to my problem. :-)
 
  Trond



Re: recent nfs change causes autofs regression

2007-08-31 Thread Jakob Oestergaard

On Thu, Aug 30, 2007 at 10:16:37PM -0700, Linus Torvalds wrote:
 
...
  Why aren't we doing that for any other filesystem than NFS?
 
 How hard is it to acknowledge the following little word:
 
   regression
 
 It's simple. You broke things. You may want to fix them, but you need to 
 fix them in a way that does not break user space.

Trond has a point Linus.

What he broke is, for example, a ro mount being mounted as rw.

That *could* be a very serious security (etc.etc.) problem which he just fixed.
Anything depending on read-only not being enforced will cease to work, of
course, and that is what a few people complain about(!).

If ext3 in some rare case (which would still mean it hit a few thousand users)
failed to remember that a file had been marked read-only and allowed writes to
it, wouldn't we want to fix that too?  It would cause regressions, but we'd fix
it, right?

mount passes back the error code on a failed mount. autofs passes that error
along too (when people configure syslog correctly). In short; when these
serious mistakes are made and caught, the admin sees an error in his logs.

This is not wrong. This is good.

-- 

 / jakob


Re: recent nfs change causes autofs regression

2007-08-31 Thread Linus Torvalds



On Fri, 31 Aug 2007, Jakob Oestergaard wrote:
 
 Trond has a point Linus.

I don't dispute that the new code does something good.

But it changes existing behaviour.

When we add NEW BEHAVIOUR, we don't add it to old interfaces when that
breaks old user mode! We add a new flag saying "I want the new behaviour".

This is not rocket science, guys. This is very basic kernel behaviour. The 
kernel exists only to serve user space, and that means that there is no 
more important thing to do than to make sure you don't break existing 
users, unless you have some *damn* strong reasons.

 What he broke is, for example, a ro mount being mounted as rw.

No. What he broke was a working and sane setup.

The fact that he may *also* have broken insane setups is totally 
irrelevant. Don't go off on some tangent that has nothing to do with the 
regression in question!

 If ext3 in some rare case (which would still mean it hit a few thousand users)
 failed to remember that a file had been marked read-only and allowed writes to
 it, wouldn't we want to fix that too?  It would cause regressions, but we'd
 fix it, right?

Stop blathering. Of course we fix security holes. But we don't break 
things that don't need breaking. This wasn't a security hole.

You are making up irrelevant arguments that have nothing to do with this 
regression.

If you want new behaviour, you add a new flag saying you want new 
behaviour. You don't just start behaving differently from what you've 
always done before (and what *other* UNIXes do, for that matter).

Besides, even *if* it was a matter of somebody doing a mount with rw, 
when the previous mount was ro, returning EBUSY is still the wrong thing 
to do! If the user asks for a new mount that is read-write, he should just 
get it - ie we should not re-use the old client handles, and we should do 
what Solaris apparently does, namely to just make it a totally different 
mount.

In other words, it should (as I already mentioned once) have used 
nosharecache by default, which makes it all work.

Then, people who want to re-use the caches (which in turn may mean that 
everything needs to have the same flags), THOSE PEOPLE, who want the NEW 
SEMANTICS (errors and all) should then use a sharecache flag.

See? You don't have to screw people over.

 mount passes back the error code on a failed mount. autofs passes that error
 along too (when people configure syslog correctly). In short; when these
 serious mistakes are made and caught, the admin sees an error in his logs.

Bullshit. Seeing the error in his logs doesn't help anything. The 
problem wasn't the lack of error, the problem was that it was a new and 
unnecessary error in the first place. Logging it doesn't make it any less 
buggy.

Linus

Re: recent nfs change causes autofs regression

2007-08-31 Thread Matthias Schniedermeyer

  It's not very conservative to suddenly change default behavior and break
  autofs mounts. There is not even one kernel message that _tells_ user why
  it thinks it's wrong. It just silently fails.
 
 No it doesn't. It reports an error code to the caller. If autofs is
 failing silently, then that is a bug in autofs: mount will report the
 error to the user.

Wrong(tm).

autofs AND mounting at the command line just say:
mount.nfs: /mnt is already mounted or busy

Which has an actual information value of about 1%.

In my case I moved an NFS-exported directory inside another NFS-exported
directory months ago and placed a symlink where the directory was (on the
server side). It never occurred to me that that was wrong(tm).

Now I can only mount one of the two mounts and the other just reports
"busy".

After reading this I could fix my case easily. I just erased the
deeper mount and symlinked the directory from the other mount.

But YOU HAVE TO KNOW THAT YOU DID SOMETHING WRONG. Just getting a "Busy"
leaves you with question marks flying around your head!




Bis denn

-- 
Real Programmers consider what you see is what you get to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a you asked for it, you got it text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


Re: recent nfs change causes autofs regression

2007-08-31 Thread Frank van Maarseveen

On Fri, Aug 31, 2007 at 09:40:28AM +0200, Jakob Oestergaard wrote:
 On Thu, Aug 30, 2007 at 10:16:37PM -0700, Linus Torvalds wrote:
  
 ...
   Why aren't we doing that for any other filesystem than NFS?
  
  How hard is it to acknowledge the following little word:
  
  regression
  
  It's simple. You broke things. You may want to fix them, but you need to 
  fix them in a way that does not break user space.
 
 Trond has a point Linus.
 
 What he broke is, for example, a ro mount being mounted as rw.
 
 That *could* be a very serious security (etc.etc.) problem which he just fixed.
 Anything depending on read-only not being enforced will cease to work, of
 course, and that is what a few people complain about(!).
 
 If ext3 in some rare case (which would still mean it hit a few thousand users)
 failed to remember that a file had been marked read-only and allowed writes to
 it, wouldn't we want to fix that too?  It would cause regressions, but we'd
 fix it, right?
 
 mount passes back the error code on a failed mount. autofs passes that error
 along too (when people configure syslog correctly). In short; when these
 serious mistakes are made and caught, the admin sees an error in his logs.

Hua explained already that seeing the error is not the same as fixing
the error: he cannot fix it because NFS implies other systems we _must_
co-operate with.

-- 
Frank

Re: recent nfs change causes autofs regression

2007-08-31 Thread Frank van Maarseveen

On Thu, Aug 30, 2007 at 02:07:43PM -0700, Hua Zhong wrote:
 I am re-sending this after help from Ian and git-bisect. To me it's a
 show-stopper: I cannot find an acceptable workaround that I can implement.
 
 The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
 mounts to fail silently - they just not appear when they should.
 
 I believe it's caused by the NFS change that forces multiple mounts from
 different directories under the same server side filesystem to have the same
 mount options by default, otherwise it returns EBUSY.
 
 For example, if server has a filesystem /a, and it exports /a/x and /a/y
 (maybe with rw or ro), and a client must mount /a/x and /a/y with the same
 mount options now.
 
 Since in my setup they are managed by autofs, and the autofs map is managed
 by nis, there is no way I could easily workaround it..
 
 If we have to live with this regression, I want to hear some suggestions
 about how to fix them realistically. Thanks.
 
 By the way, I am not sure if I did the bisect right, but FWIW, git-bisect
 says:
 
 c98451bdb2f3e6d6cc1e03adad641e9497512b49 is first bad commit
 commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
 Author: Frank van Maarseveen [EMAIL PROTECTED]
 Date:   Mon Jul 9 22:25:29 2007 +0200
 
 NLM: fix source address of callback to client
 
 Use the destination address of the original NLM request as the
 source address in callbacks to the client.
 
 Signed-off-by: Frank van Maarseveen [EMAIL PROTECTED]
 Signed-off-by: Trond Myklebust [EMAIL PROTECTED]
 
 :04 04 675c84bd8b2c50744018becaa0db4aeca19b8f9f
 105fbd3cb3fa5e3019836b4b5268125d0181a72d M  fs
 :04 04 0138796e0806b4ebd1cc3850ed4e8c7ab24d2d41
 2fec08debe51c20423a88b1a0d4281c683ba5daf M  include

This does not have any relation with the mount problem, assuming commit
and comment do match.

-- 
Frank

Re: recent nfs change causes autofs regression

2007-08-31 Thread Jakob Oestergaard

On Fri, Aug 31, 2007 at 01:07:56AM -0700, Linus Torvalds wrote:
...
 When we add NEW BEHAVIOUR, we don't add it to old interfaces when that 
 breaks old user mode! We add a new flag saying I want the new behaviour.
 
 This is not rocket science, guys. This is very basic kernel behaviour. The 
 kernel exists only to serve user space, and that means that there is no 
 more important thing to do than to make sure you don't break existing 
 users, unless you have some *damns* strong reasons.

100% agreed.

 The fact that he may *also* have broken insane setups is totally 
 irrelevant. Don't go off on some tangent that has nothing to do with the 
 regression in question!

It does not have nothing to do with the regression.

Some setups which worked more by accident than by design earlier on were broken
by the fix. This could have been avoided, I agree, but the breakage was caused
by the fix (or the breakage is the fix, however you prefer to look at it).

  If ext3 in some rare case (which would still mean it hit a few thousand users)
  failed to remember that a file had been marked read-only and allowed writes to
  it, wouldn't we want to fix that too?  It would cause regressions, but we'd
  fix it, right?
 
 Stop blathering. Of course we fix security holes. But we don't break 
 things that don't need breaking. This wasn't a security hole.

*part* of it wasn't a security hole.

The other half very much was.

...
 In other words, it should (as I already mentioned once) have used 
 nosharecache by default, which makes it all work.
 
 Then, people who want to re-use the caches (which in turn may mean that 
 everything needs to have the same flags), THOSE PEOPLE, who want the NEW 
 SEMANTICS (errors and all) should then use a sharecache flag.
 
 See? You don't have to screw people over.

Sure, given that Trond (or whomever) has the time it takes to go and implement
all of this, there's no need to screw anyone.

Assuming he's on a schedule and this will have to wait, I agree with him that
it makes the most sense to play it safe security/consistency-wise rather than
functionality-wise.

  mount passes back the error code on a failed mount. autofs passes that error
  along too (when people configure syslog correctly). In short; when these
  serious mistakes are made and caught, the admin sees an error in his logs.
 
 Bullshit. Seeing the error in his logs doesn't help anything.

It makes troubleshooting possible, which addresses *the* major complaint from
*one* of the *two* people who complained about this.


-- 

 / jakob


Re: recent nfs change causes autofs regression

2007-08-31 Thread Martin Knoblauch


--- Ian Kent [EMAIL PROTECTED] wrote:

 On Thu, 30 Aug 2007, Linus Torvalds wrote:
  
  On Fri, 31 Aug 2007, Trond Myklebust wrote:
   
   It did not. The previous behaviour was to always silently override
   the user mount options.
  
  ..so it still worked for any sane setup, at least.
  
  You broke that. Hua gave good reasons for why he cannot use the
  current kernel. It's a regression.
  
  In other words, the new behaviour is *worse* than the behaviour you
  consider to be the incorrect one.
 
 This all came about due to complaints about not being able to mount the
 same server file system with different options, most commonly ro vs. rw,
 which I think was due to the shared super block changes some time ago.
 And, to some extent, I have to plead guilty for not complaining enough
 about this default in the beginning, which is basically unacceptable for
 sure.
 
 We have seen breakage in Fedora with the introduction of the patches and
 this is typical of it. It also breaks amd and admins have no way of
 altering this that I'm aware of (help us here Ion).
 
 I understand Trond's concerns but the fact remains that other Unixes allow
 this behaviour but don't assert cache coherency and many sysadmins don't
 realize this. So the broken behavior is expected to work and we can't
 simply stop allowing it unless we want to attend a public hanging with us
 as the participants.
 
 There is no question that the new behavior is worse and this change is
 unacceptable as a solution to the original problem.
 
 I really think that reversing the default, as has been suggested,
 documenting the risk in the mount.nfs man page and perhaps issuing a
 warning from the kernel is a better way to handle this. At least we will
 be doing more to raise public awareness of the issue than others.
 
 

 I can only second that. Changing the default behavior in this way is
really bad.

 Not that I am disagreeing with the technical reasons, but the change
breaks working setups. And -EBUSY is not very helpful as a message
here. It does not matter that the user tools may handle the breakage
incorrectly. The users (admins) have had working setups for years, and
they were obviously working well enough.

 And one should not forget that it will take considerable time until
nosharecache trickles down into distributions.

 If the situation stays this way, quite a few people will not be able
to move beyond 2.6.22 for some time. E.g. I am working for a
company that operates some Linux clusters at a few German automotive
companies. For certain reasons everything there is based on
automounter maps (both autofs and amd style). We have almost zero
influence on that setup. The maps are a mess - we will run into the
sharecache problem. At the same time I am trying to fight the notorious
"system turns into frozen molasses on moderate I/O load" problem. There
may be some interesting developments coming after 2.6.22. Not good :-(

 What I would like to see done for the situation at hand is:

- make nosharecache the default for the foreseeable future
- log any attempt to mount option-inconsistent NFS filesystems to dmesg
and syslog (apparently the NFS client is able to detect them :-). Do
this regardless of the nosharecache option. This way admins will at
least be made aware of the situation.
- In a year or so we can talk about making the default safe. With
proper advertising.
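
A user-space approximation of the "make admins aware" idea can be sketched in Python by scanning mount table text in /proc/mounts format. This is an assumption-laden sketch: the kernel matches on fsid, which is not visible in that file, so the code groups by server name instead, looks only at ro versus rw, and uses invented sample data and function names.

```python
from collections import defaultdict


def access_mode(options):
    """Reduce a comma-separated option string to 'ro' or 'rw'."""
    return "ro" if "ro" in options.split(",") else "rw"


def inconsistent_nfs_mounts(mounts_text):
    """Return {server: [(mountpoint, mode), ...]} for servers mounted both ro and rw."""
    by_server = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        device, mountpoint, _fstype, options = fields[:4]
        server = device.rpartition(":")[0]  # "server:/export" -> "server"
        by_server[server].append((mountpoint, access_mode(options)))
    return {
        server: mounts
        for server, mounts in by_server.items()
        if len({mode for _, mode in mounts}) > 1
    }


# Invented sample in /proc/mounts layout: device, mountpoint, fstype, options.
SAMPLE = """\
srv1:/a/x /mnt/x nfs rw,vers=3 0 0
srv1:/a/y /mnt/y nfs ro,vers=3 0 0
srv2:/b /mnt/b nfs rw 0 0
/dev/sda1 / ext3 rw 0 0
"""
```

Feeding it the contents of /proc/mounts would list candidate option mismatches per server, which an admin could then check against the exported filesystems.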

 Just my  0.02.

Cheers
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de

Re: recent nfs change causes autofs regression

2007-08-31 Thread Ian Kent

On Fri, 31 Aug 2007, Frank van Maarseveen wrote:

 On Thu, Aug 30, 2007 at 02:07:43PM -0700, Hua Zhong wrote:
  I am re-sending this after help from Ian and git-bisect. To me it's a
  show-stopper: I cannot find an acceptable workaround that I can implement.
  
  The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
  mounts to fail silently - they just not appear when they should.
  
  I believe it's caused by the NFS change that forces multiple mounts from
  different directories under the same server side filesystem to have the same
  mount options by default, otherwise it returns EBUSY.
  
  For example, if server has a filesystem /a, and it exports /a/x and /a/y
  (maybe with rw or ro), and a client must mount /a/x and /a/y with the same
  mount options now.
  
  Since in my setup they are managed by autofs, and the autofs map is managed
  by nis, there is no way I could easily workaround it..
  
  If we have to live with this regression, I want to hear some suggestions
  about how to fix them realistically. Thanks.
  
  By the way, I am not sure if I did the bisect right, but FWIW, git-bisect
  says:
  
  c98451bdb2f3e6d6cc1e03adad641e9497512b49 is first bad commit
  commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
  Author: Frank van Maarseveen [EMAIL PROTECTED]
  Date:   Mon Jul 9 22:25:29 2007 +0200
  
  NLM: fix source address of callback to client
  
  Use the destination address of the original NLM request as the
  source address in callbacks to the client.
  
  Signed-off-by: Frank van Maarseveen [EMAIL PROTECTED]
  Signed-off-by: Trond Myklebust [EMAIL PROTECTED]
  
  :04 04 675c84bd8b2c50744018becaa0db4aeca19b8f9f
  105fbd3cb3fa5e3019836b4b5268125d0181a72d M  fs
  :04 04 0138796e0806b4ebd1cc3850ed4e8c7ab24d2d41
  2fec08debe51c20423a88b1a0d4281c683ba5daf M  include
 
 This does not have any relation with the mount problem, assuming commit
 and comment do match.

That's right.

The commits we're discussing here are (I believe):

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=75180df2ed467866ada839fe73cf7cc7d75c0a22
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=275a5d24bf56b2d9dd4644c54a56366b89a028f1

The latter being the one returning EBUSY for the option mismatch and the
former the addition of the nosharecache option.

Ian


Re: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Fri, 2007-08-31 at 01:07 -0700, Linus Torvalds wrote:
 

 If you want new behaviour, you add a new flag saying you want new 
 behaviour. You don't just start behaving differently from what you've 
 always done before (and what *other* UNIXes do, for that matter).
 
 Besides, even *if* it was a matter of somebody doing a mount with rw, 
 when the previous mount was ro, returning EBUSY is still the wrong thing 
 to do! If the user asks for a new mount that is read-write, he should just 
 get it - ie we should not re-use the old client handles, and we should do 
 what Solaris apparently does, namely to just make it a totally different 
 mount.
 
 In other words, it should (as I already mentioned once) have used 
 nosharecache by default, which makes it all work.
 
 Then, people who want to re-use the caches (which in turn may mean that 
 everything needs to have the same flags), THOSE PEOPLE, who want the NEW 
 SEMANTICS (errors and all) should then use a sharecache flag.

That would be a major change in existing semantics. The default has been
sharecache ever since Al Viro introduced the sget() function some 6
or 7 years ago. The problem was that we never advertised the fact that
the kernel was overriding your mount options, and so sysadmins were
(rightly IMO) complaining that they should _know_ when the client does
this.

The list of known problems with a nosharecache default is nasty too:

- file and directory attribute and data caching breaks.
Applications will see stale data in cases where they otherwise
would not expect it.

- the existing dcache and icache issues when a file is renamed
or deleted on the server are now extended to also include the
case where the rename or deletion occurs on an alias in another
directory on the client itself. In particular, sillyrename will
break.

- file locking breaks (the server knows that the client holds
locks on one file, whereas the client thinks it holds locks on
several).

- the NFSv4 delegation model breaks: the client will be using
OPEN when it could use cached opens. More importantly, when
performing an operation that requires it to return the
delegation on the aliased file, it won't know until the server
sends it a callback.

...and of course, the amount of unnecessary traffic to the server
increases. I'm not aware of any sane way of dealing with those issues,
and I doubt Solaris has a solution for them either.
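
The file locking item in the list above can be illustrated with a deliberately simplified Python model. It is not NLM or NFSv4 protocol behaviour: the Server and Client classes, the idea that the server tracks one lock per (client, filehandle), and the per-superblock bookkeeping on the client are all assumptions chosen only to show how the two views can drift apart.

```python
class Server:
    """Toy server: at most one lock record per (client, filehandle)."""

    def __init__(self):
        self.locks = set()

    def lock(self, client, filehandle):
        self.locks.add((client, filehandle))

    def unlock(self, client, filehandle):
        self.locks.discard((client, filehandle))


class Client:
    """Toy client: tracks lock state per superblock it believes it has."""

    def __init__(self, name, server, share_cache):
        self.name = name
        self.server = server
        self.share_cache = share_cache
        self.local = set()  # (superblock, filehandle)

    def _superblock(self, mountpoint):
        # Shared cache: every mount of the filesystem maps to one superblock.
        return "shared" if self.share_cache else mountpoint

    def lock(self, mountpoint, filehandle):
        self.server.lock(self.name, filehandle)
        self.local.add((self._superblock(mountpoint), filehandle))

    def unlock(self, mountpoint, filehandle):
        self.server.unlock(self.name, filehandle)
        self.local.discard((self._superblock(mountpoint), filehandle))

    def believes_locked(self, mountpoint, filehandle):
        return (self._superblock(mountpoint), filehandle) in self.local
```

In this model, an unshared client that locks the same file through two mountpoints and unlocks through one still believes it holds a lock that the server has already dropped.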

Trond


Re: recent nfs change causes autofs regression

2007-08-31 Thread Frank van Maarseveen

On Fri, Aug 31, 2007 at 08:11:38AM -0400, Trond Myklebust wrote:
> On Fri, 2007-08-31 at 01:07 -0700, Linus Torvalds wrote:
> >
>
> > If you want new behaviour, you add a new flag saying you want new
> > behaviour. You don't just start behaving differently from what you've
> > always done before (and what *other* UNIXes do, for that matter).
> >
> > Besides, even *if* it was a matter of somebody doing a mount with rw,
> > when the previous mount was ro, returning EBUSY is still the wrong thing
> > to do! If the user asks for a new mount that is read-write, he should just
> > get it - ie we should not re-use the old client handles, and we should do
> > what Solaris apparently does, namely to just make it a totally different
> > mount.
> >
> > In other words, it should (as I already mentioned once) have used
> > nosharecache by default, which makes it all work.
> >
> > Then, people who want to re-use the caches (which in turn may mean that
> > everything needs to have the same flags), THOSE PEOPLE, who want the NEW
> > SEMANTICS (errors and all) should then use a sharecache flag.
>
> That would be a major change in existing semantics. The default has been
> sharecache ever since Al Viro introduced the sget() function some 6
> or 7 years ago. The problem was that we never advertised the fact that
> the kernel was overriding your mount options, and so sysadmins were
> (rightly IMO) complaining that they should _know_ when the client does
> this.
>
> The list of known problems with a nosharecache default is nasty too:
>
> - file and directory attribute and data caching breaks.
> Applications will see stale data in cases where they otherwise
> would not expect it.
>
> - the existing dcache and icache issues when a file is renamed
> or deleted on the server are now extended to also include the
> case where the rename or deletion occurs on an alias in another
> directory on the client itself. In particular, sillyrename will
> break.
>
> - file locking breaks (the server knows that the client holds
> locks on one file, whereas the client thinks it holds locks on
> several).
>
> - the NFSv4 delegation model breaks: the client will be using
> OPEN when it could use cached opens. More importantly, when
> performing an operation that requires it to return the
> delegation on the aliased file, it won't know until the server
> sends it a callback.
>
> ...and of course, the amount of unnecessary traffic to the server
> increases. I'm not aware of any sane way of dealing with those issues,
> and I doubt Solaris has a solution for them either.

All of this won't happen when server foo exports /bar and a client
mounts /bar/x and /bar/y separately: there must be a shared subtree or
hard-links between files within them, right?

An obvious (but disruptive) server side workaround is to export the
subtrees with different fsid= but that would give the same list of
problems as above, right?

IMHO I'd only consider returning EBUSY when trying to mount _exactly_
the same directory with different flags, not for arbitrary subtrees. The
client should preferably not be bothered with server side disk
partitioning (at least not beyond the obvious such as df output).

-- 
Frank

Re: recent nfs change causes autofs regression

2007-08-31 Thread Trond Myklebust

On Fri, 2007-08-31 at 15:12 +0200, Frank van Maarseveen wrote:

> IMHO I'd only consider returning EBUSY when trying to mount _exactly_
> the same directory with different flags, not for arbitrary subtrees. The
> client should preferably not be bothered with server side disk
> partitioning (at least not beyond the obvious such as df output).

That is utterly inconsistent and confusing too.

If you have a filesystem /foo exported on the server "remote", then
why should

mount -oro remote:/foo
mount -orw remote:/foo/a

be allowed, but

mount -oro remote:/foo
mount -orw remote:/foo

be forbidden? The caching problems are the same. Telling the admin that
one is safe and the other is not, is just messing with his mind.
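
A minimal sketch of the rule as described in this thread, assuming the client keys its decision on the server-side filesystem rather than on the exported path. `MountTable` and the fsid strings are hypothetical, and this is Python for illustration only, not the kernel logic:

```python
import errno


class MountTable:
    """Hypothetical model: one shared superblock per server filesystem."""

    def __init__(self):
        self._options_by_fsid = {}  # fsid -> frozenset of mount options

    def mount(self, fsid, path, options, nosharecache=False):
        """Return 0 on success, errno.EBUSY on an option conflict.

        `path` is deliberately ignored: /foo and /foo/a live on the same
        server filesystem, so they are treated alike.
        """
        wanted = frozenset(options)
        if nosharecache:
            return 0  # caller asked for a separate, unshared superblock
        existing = self._options_by_fsid.get(fsid)
        if existing is None:
            self._options_by_fsid[fsid] = wanted
            return 0
        return 0 if existing == wanted else errno.EBUSY


table = MountTable()
print(table.mount("remote:foo-fs", "/foo", {"ro"}))
print(table.mount("remote:foo-fs", "/foo/a", {"rw"}) == errno.EBUSY)
print(table.mount("remote:foo-fs", "/foo", {"rw"}) == errno.EBUSY)
```

Because the check never looks at the path, both pairs of mounts above get the same answer, which is the consistency being argued for.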

Trond


Re: recent nfs change causes autofs regression

2007-08-30 Thread Ian Kent

On Thu, 30 Aug 2007, Linus Torvalds wrote:
> 
> 
> On Fri, 31 Aug 2007, Trond Myklebust wrote:
> > 
> > It did not. The previous behaviour was to always silently override the
> > user mount options.
> 
> ..so it still worked for any sane setup, at least.
> 
> You broke that. Hua gave good reasons for why he cannot use the current 
> kernel. It's a regression.
> 
> In other words, the new behaviour is *worse* than the behaviour you 
> consider to be the incorrect one.
> 

This all came about due to complaints about not being able to mount the 
same server file system with different options, most commonly ro vs. rw 
which I think was due to the shared super block changes some time ago. 
And, to some extent, I have to plead guilty for not complaining enough 
about this default in the beginning, which is basically unacceptable for 
sure.

We have seen breakage in Fedora with the introduction of the patches and 
this is typical of it. It also breaks amd and admins have no way of 
altering this that I'm aware of (help us here Ion).

I understand Trond's concerns but the fact remains that other Unixes allow 
this behaviour but don't assert cache coherency and many sysadmins don't 
realize this. So the broken behavior is expected to work and we can't 
simply stop allowing it unless we want to attend a public hanging with us 
as the participants.

There is no question that the new behavior is worse and this change is 
unacceptable as a solution to the original problem.

I really think that reversing the default, as has been suggested, 
documenting the risk in the mount.nfs man page and perhaps issuing a 
warning from the kernel is a better way to handle this. At least we will 
be doing more to raise public awareness of the issue than others.

Ian

RE: recent nfs change causes autofs regression

2007-08-30 Thread Hua Zhong

Trond,

> So you are saying that it is acceptable for the kernel to decide
> unilaterally to override mount options? Why aren't we doing that for
> any other filesystem than NFS?

I think there are a few reasons.

First, I would have no problem with the new behavior if it didn't cause a
regression. I am not sure about the history of other filesystems, but NFS
has had the old behavior for ages, and people have gotten used to it.

Second, NFS is actually special as this particular setup is very common and
you'll get into this situation far too easily, as from the server you could
export two directories within a filesystem as if they were two filesystems.
Very few people actually want to mount the same local filesystem multiple
times, but under NFS this is the norm.

Last but not least, NFS is often controlled by central corporate
policies (autofs/nis), and has to work with various clients. For example,
it's not possible to add "nosharecache" to auto.auto, as almost no client
understands the option, unless you upgrade all the clients.

> Trond


Re: recent nfs change causes autofs regression

2007-08-30 Thread Linus Torvalds

On Fri, 31 Aug 2007, Trond Myklebust wrote:
> 
> So you are saying that it is acceptable for the kernel to decide
> unilaterally to override mount options?

IT'S WHAT WE'VE APPARENTLY ALWAYS DONE!

> Why aren't we doing that for any other filesystem than NFS?

How hard is it to acknowledge the following little word:

"regression"

It's simple. You broke things. You may want to fix them, but you need to 
fix them in a way that does not break user space.

Linus

RE: recent nfs change causes autofs regression

2007-08-30 Thread Ian Kent

On Fri, 31 Aug 2007, Trond Myklebust wrote:

> On Thu, 2007-08-30 at 16:44 -0700, Hua Zhong wrote:
> > > How is the NFS client to know that these directories are disjoint, or
> > > that no-one will ever create a hard link from one directory to another?
> > > To my knowledge, the only way to ensure this is to put them on
> > > different disk partitions.
> > > 
> > > I don't know if all Unix systems have this issue, but I have been told
> > > that Solaris at least has it.
> > 
> > > Does Solaris enforce this "mount with same options" as default?
> 
> No. Solaris defaults to breaking cache consistency.
> 
> > > > "working" as in "I can mount the directory and do my work". And there
> > > > has never been any problems as far as I know.
> > > 
> > > That is too narrow a definition: the minimum should be "everyone can
> > > mount their directories and do their work". Your particular setup may
> > > be safe, but that is why we have overrides: the default should be for the
> > > kernel to be conservative, and to _tell_ users what it thinks is wrong.
> > 
> > Every engineer in our organization mounts it too. No problem until now.
> 
> I believe I've already explained why that isn't a sufficient metric.
> What is your point?
> 
> > It's not very conservative to suddenly change default behavior and break
> > autofs mounts. There is not even one kernel message that "_tells_ user why
> > it thinks it's wrong". It just silently fails.
> 
> No it doesn't. It reports an error code to the caller. If autofs is
> failing silently, then that is a bug in autofs: mount will report the
> error to the user.

Actually, yes, it looks like I'm not logging mount errors at the correct 
log level. Oops.

Ian


Re: recent nfs change causes autofs regression

2007-08-30 Thread Trond Myklebust

On Thu, 2007-08-30 at 21:59 -0700, Linus Torvalds wrote:
> 
> On Fri, 31 Aug 2007, Trond Myklebust wrote:
> > 
> > It did not. The previous behaviour was to always silently override the
> > user mount options.
> 
> ..so it still worked for any sane setup, at least.
> 
> You broke that. Hua gave good reasons for why he cannot use the current 
> kernel. It's a regression.
> 
> In other words, the new behaviour is *worse* than the behaviour you 
> consider to be the incorrect one.

So you are saying that it is acceptable for the kernel to decide
unilaterally to override mount options? Why aren't we doing that for any
other filesystem than NFS?

Trond


Re: recent nfs change causes autofs regression

2007-08-30 Thread Linus Torvalds

On Fri, 31 Aug 2007, Trond Myklebust wrote:
> 
> It did not. The previous behaviour was to always silently override the
> user mount options.

..so it still worked for any sane setup, at least.

You broke that. Hua gave good reasons for why he cannot use the current 
kernel. It's a regression.

In other words, the new behaviour is *worse* than the behaviour you 
consider to be the incorrect one.

Linus

RE: recent nfs change causes autofs regression

2007-08-30 Thread Trond Myklebust

On Thu, 2007-08-30 at 21:38 -0700, Linus Torvalds wrote:
> 
> On Fri, 31 Aug 2007, Trond Myklebust wrote:
> > 
> > No. Solaris defaults to breaking cache consistency.
> 
> If so, and since that's obviously what people _expect_ to happen, why not 
> make that the default, with the "consistent" behaviour being the one that 
> needs an explicit option.

The majority of "nfs sucks" complaints result from the general lack of
understanding by sysadmins of the nfs caching model. I'd be very
sceptical of any claim that most sysadmins "expect" broken cache
consistency as a result of mounting the same filesystem with different
mount options.

> Just out of curiosity - Hua, is this NFSv2? Especially there, cache 
> "consistency" is largely a joke anyway, so defaulting to some annoying 
> careful mode is doubly ridiculous.

NFSv2 has a close-to-open caching model which works fine as long as you
don't break the underlying assumptions. See my comment above.

Trond


RE: recent nfs change causes autofs regression

2007-08-30 Thread Hua Zhong

> On Fri, 31 Aug 2007, Trond Myklebust wrote:
> >
> > No. Solaris defaults to breaking cache consistency.
> 
> If so, and since that's obviously what people _expect_ to happen, why
> not make that the default, with the "consistent" behaviour being the 
> one that needs an explicit option.
> 
> Just out of curiosity - Hua, is this NFSv2? Especially there, cache
> "consistency" is largely a joke anyway, so defaulting to some annoying
> careful mode is doubly ridiculous.

It's v3 as can be seen from the autofs maps I posted.

These directories are used mostly as read-only and get pulled in via our
build system. We do not actually write to them often, if at all. I don't
think this setup is uncommon, and I am worried that once people start using
the latest kernel their systems will mysteriously break.

>   Linus


Re: recent nfs change causes autofs regression

2007-08-30 Thread Trond Myklebust

On Thu, 2007-08-30 at 20:49 -0700, Linus Torvalds wrote:
> 
> On Thu, 30 Aug 2007, Trond Myklebust wrote:
> > 
> > Which is better than having it fail silently, or giving you a mount with
> > the wrong mount options.
> 
> No, Trond.
> 
> That commit gets reverted or fixed. It's a regression, and your theories 
> that it's "better" that way are obviously broken.
> 
> It's obviously broken because you seem to say that you know better, even 
> though you also admit that:
> 
>   "How is the NFS client to know that these directories are disjoint, or
>that no-one will ever create a hard link from one directory to another? 
>To my knowledge, the only way to ensure this is to put them on 
>different disk partitions."
> 
> the point being that you just disallowed people from doing things that are 
> sane but _potentially_ dangerous. That's not how we work. The UNIX way is 
> to give people rope - if you cannot *prove* that what they are doing is 
> wrong, then you damn well better not disallow it.
> 
> No regressions, Trond. Especially not for stuff that used to work, was 
> used, and that could be sanely expected to work (which this *definitely*
> sounds like).

It did not. The previous behaviour was to always silently override the
user mount options.

> Please send in a fix. If the fix involves making "nosharecache" the 
> default, then that is better than making policy decisions like this in the 
> kernel. The kernel should do what the user asks and not put in unnecessary 
> roadblocks.

This is _not_ a kernel policy decision. The kernel is simply informing
the user that it cannot fulfil the mount request as specified. Exactly
why do you think that NFS should be any different from other filesystems
when it comes to this?

AFAIK, every other filesystem will give you an EBUSY if you try to mount
a partition with -oro if you are already mounting somewhere else with
-orw. Every filesystem will give you an EBUSY if you try to mount the
partition with -oacl if it is mounted somewhere else with -onoacl. The
reason: exactly the same as NFS, the caches cannot remain consistent
when you try to mount two different super blocks that both refer to the
same underlying filesystem.

Trond


RE: recent nfs change causes autofs regression

2007-08-30 Thread Linus Torvalds

On Fri, 31 Aug 2007, Trond Myklebust wrote:
> 
> No. Solaris defaults to breaking cache consistency.

If so, and since that's obviously what people _expect_ to happen, why not 
make that the default, with the "consistent" behaviour being the one that 
needs an explicit option.

Just out of curiosity - Hua, is this NFSv2? Especially there, cache 
"consistency" is largely a joke anyway, so defaulting to some annoying 
careful mode is doubly ridiculous.

Linus

RE: recent nfs change causes autofs regression

2007-08-30 Thread Trond Myklebust

On Thu, 2007-08-30 at 16:44 -0700, Hua Zhong wrote:
> > How is the NFS client to know that these directories are disjoint, or
> > that no-one will ever create a hard link from one directory to another?
> > To my knowledge, the only way to ensure this is to put them on
> > different disk partitions.
> > 
> > I don't know if all Unix systems have this issue, but I have been told
> > that Solaris at least has it.
> 
> Does Solaris enforce this "mount with same options" as default?

No. Solaris defaults to breaking cache consistency.

> > > "working" as in "I can mount the directory and do my work". And there
> > > has never been any problems as far as I know.
> > 
> > That is too narrow a definition: the minimum should be "everyone can
> > mount their directories and do their work". Your particular setup may
> > be safe, but that is why we have overrides: the default should be for the
> > kernel to be conservative, and to _tell_ users what it thinks is wrong.
> 
> Every engineer in our organization mounts it too. No problem until now.

I believe I've already explained why that isn't a sufficient metric.
What is your point?

> It's not very conservative to suddenly change default behavior and break
> autofs mounts. There is not even one kernel message that "_tells_ user why
> it thinks it's wrong". It just silently fails.

No it doesn't. It reports an error code to the caller. If autofs is
failing silently, then that is a bug in autofs: mount will report the
error to the user.

Trond


Re: recent nfs change causes autofs regression

2007-08-30 Thread Trond Myklebust

On Thu, 2007-08-30 at 18:24 -0700, Andrew Morton wrote:
> On Thu, 30 Aug 2007 18:37:13 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote:
> 
> > On Thu, 2007-08-30 at 14:07 -0700, Hua Zhong wrote:
> > > I am re-sending this after help from Ian and git-bisect. To me it's a
> > > show-stopper: I cannot find an acceptable workaround that I can implement.
> > > 
> > > The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
> > > mounts to fail silently - they just do not appear when they should.
> > > 
> > > I believe it's caused by the NFS change that forces multiple mounts from
> > > different directories under the same server side filesystem to have the same
> > > mount options by default, otherwise it returns EBUSY.
> > > 
> > > For example, if the server has a filesystem /a and exports /a/x and /a/y
> > > (maybe with rw or ro), a client must now mount /a/x and /a/y with the same
> > > mount options.
> > 
> > Which is better than having it fail silently, or giving you a mount with
> > the wrong mount options.
> > 
> > If you need to mount the same filesystem with incompatible mount options
> > on the same client, then there is a new mount option "nosharecache",
> > which enables it.
> > The new option is there in order to make it damned clear to sysadmins
> > that this is a dangerous thing to do: mounts which don't share the same
> > superblock also don't share the same data and attribute caches. Any file
> > or directory which appears in both mounts had better only be used by one
> > application at a time or be using an appropriate locking scheme.
> > 
> 
> If we're going to send a message to sysadmins, we shouldn't force them to go
> through a git bisection search and a lkml discussion to receive it!
> 
> Is there at least some way in which the kernel can detect this situation
> and emit a friendly printk which guides people to a friendly document?

There are already error codes being passed back to the mount syscall.
The problem here is that unlike the mount utility, autofs isn't passing
that information on to the user.
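
As an illustration of passing that information on, here is a small Python sketch that turns the errno from a failed mount into a logged, human-readable message. The function name and the mount point are made up, and autofs itself is written in C, so this only shows the idea:

```python
import errno
import logging
import os


def report_mount_error(logger, mountpoint, err):
    """Log a failed mount at ERROR level and return the message text.

    `err` is the errno value handed back by the mount call, e.g.
    errno.EBUSY for the option conflict discussed in this thread.
    """
    message = "mount of %s failed: %s (%s)" % (
        mountpoint,
        os.strerror(err),                # platform's text for the errno
        errno.errorcode.get(err, "?"),   # symbolic name such as "EBUSY"
    )
    logger.error(message)
    return message


logging.basicConfig(level=logging.ERROR)
report_mount_error(logging.getLogger("automount-sketch"), "/net/tools", errno.EBUSY)
```

Logging at error level rather than debug level is the design choice that matters here: the failure becomes visible without turning on verbose logging.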

Trond


RE: recent nfs change causes autofs regression

2007-08-30 Thread Hua Zhong

Hi Linus,

> Hua - that said, I don't actually see why the commit you bisected to
> has anything to do with the issue being discussed. Can you double-check
> that it's literally that particular commit that breaks for you (you could
> try just reverting that commit).

I will double check that tomorrow. Thanks. :-) I'm happy I'll still be able
to test the latest kernels on my work desktop.

>   Linus


Re: recent nfs change causes autofs regression

2007-08-30 Thread Linus Torvalds

On Thu, 30 Aug 2007, Trond Myklebust wrote:
> 
> Which is better than having it fail silently, or giving you a mount with
> the wrong mount options.

No, Trond.

That commit gets reverted or fixed. It's a regression, and your theories 
that it's "better" that way are obviously broken.

It's obviously broken because you seem to say that you know better, even 
though you also admit that:

  "How is the NFS client to know that these directories are disjoint, or
   that no-one will ever create a hard link from one directory to another? 
   To my knowledge, the only way to ensure this is to put them on 
   different disk partitions."

the point being that you just disallowed people from doing things that are 
sane but _potentially_ dangerous. That's not how we work. The UNIX way is 
to give people rope - if you cannot *prove* that what they are doing is 
wrong, then you damn well better not disallow it.

No regressions, Trond. Especially not for stuff that used to work, was 
used, and that could be sanely expected to work (which this *definitely*
sounds like).

Please send in a fix. If the fix involves making "nosharecache" the 
default, then that is better than making policy decisions like this in the 
kernel. The kernel should do what the user asks and not put in unnecessary 
roadblocks.

Hua - that said, I don't actually see why the commit you bisected to has 
anything to do with the issue being discussed. Can you double-check that 
it's literally that particular commit that breaks for you (you could try 
just reverting that commit).

Linus

Re: recent nfs change causes autofs regression

2007-08-30 Thread Andrew Morton

On Thu, 30 Aug 2007 18:37:13 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote:

> On Thu, 2007-08-30 at 14:07 -0700, Hua Zhong wrote:
> > I am re-sending this after help from Ian and git-bisect. To me it's a
> > show-stopper: I cannot find an acceptable workaround that I can implement.
> > 
> > The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
> > mounts to fail silently - they just do not appear when they should.
> > 
> > I believe it's caused by the NFS change that forces multiple mounts from
> > different directories under the same server side filesystem to have the same
> > mount options by default, otherwise it returns EBUSY.
> > 
> > For example, if the server has a filesystem /a and exports /a/x and /a/y
> > (maybe with rw or ro), a client must now mount /a/x and /a/y with the same
> > mount options.
> 
> Which is better than having it fail silently, or giving you a mount with
> the wrong mount options.
> 
> If you need to mount the same filesystem with incompatible mount options
> on the same client, then there is a new mount option "nosharecache",
> which enables it.
> The new option is there in order to make it damned clear to sysadmins
> that this is a dangerous thing to do: mounts which don't share the same
> superblock also don't share the same data and attribute caches. Any file
> or directory which appears in both mounts had better only be used by one
> application at a time or be using an appropriate locking scheme.
> 

If we're going to send a message to sysadmins, we shouldn't force them to go
through a git bisection search and a lkml discussion to receive it!

Is there at least some way in which the kernel can detect this situation
and emit a friendly printk which guides people to a friendly document?

RE: recent nfs change causes autofs regression

2007-08-30 Thread Hua Zhong

> How is the NFS client to know that these directories are disjoint, or
> that no-one will ever create a hard link from one directory to another?
> To my knowledge, the only way to ensure this is to put them on
> different disk partitions.
> 
> I don't know if all Unix systems have this issue, but I have been told
> that Solaris at least has it.

Does Solaris enforce this "mount with same options" as default?

> > "working" as in "I can mount the directory and do my work". And there
> > has never been any problems as far as I know.
> 
> That is too narrow a definition: the minimum should be "everyone can
> mount their directories and do their work". Your particular setup may
> be safe, but that is why we have overrides: the default should be for the
> kernel to be conservative, and to _tell_ users what it thinks is wrong.

Every engineer in our organization mounts it too. No problem until now.

It's not very conservative to suddenly change default behavior and break
autofs mounts. There is not even one kernel message that "_tells_ user why
it thinks it's wrong". It just silently fails.

> Your choice.

No. I have no other choice as I explained before.

Hua


RE: recent nfs change causes autofs regression

2007-08-30 Thread Trond Myklebust

On Thu, 2007-08-30 at 16:30 -0700, Hua Zhong wrote:

> These are two disjoint directories. I am wondering why there would be cache
> coherency issues in this case? Is this specific to the Linux NFS
> implementation, or do all other Unix systems have the same issue?

How is the NFS client to know that these directories are disjoint, or
that no-one will ever create a hard link from one directory to another?
To my knowledge, the only way to ensure this is to put them on different
disk partitions.

I don't know if all Unix systems have this issue, but I have been told
that Solaris at least has it.
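
The hard link case can be shown on any local filesystem. The sketch below (Python, with made-up directory names) creates two sibling directories, links a file from one into the other, and checks that both names refer to the same device and inode, which is what a server would see behind two separately exported subtrees:

```python
import os
import tempfile


def same_underlying_file(path_a, path_b):
    """True when both paths name the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def hard_link_demo():
    """Build x/ and y/ under a temp dir and hard-link x/data into y/."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "x"))
        os.mkdir(os.path.join(root, "y"))
        original = os.path.join(root, "x", "data")
        alias = os.path.join(root, "y", "alias")
        with open(original, "w") as handle:
            handle.write("hello")
        os.link(original, alias)  # second name for the same inode
        return same_underlying_file(original, alias)


print(hard_link_demo())
```

Nothing in the two path names reveals the link; only the (device, inode) pair does, and a client that mounts x and y separately has no cheap way to rule such links out in advance.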

> > If you know what you are doing, then there is an option which allows
> > you to override the default behaviour.
> > 
> > > More importantly, it is a regression. My understanding is that unless
> > > absolutely necessary we do not introduce a "feature" that breaks
> > > working setups.
> > 
> > Your turn to define what you mean by "working"? In my book that means
> > "a setup that doesn't include unexpected or unintended behaviour".
> 
> "working" as in "I can mount the directory and do my work". And there has
> never been any problems as far as I know.

That is too narrow a definition: the minimum should be "everyone can
mount their directories and do their work". Your particular setup may be
safe, but that is why we have overrides: the default should be for the
kernel to be conservative, and to _tell_ users what it thinks is wrong.

> > Not being able to notice cache coherency failures on a file that is
> > mounted in two different places with two different sets of mount
> > options counts as "unexpected behaviour".
> > 
> > Not being able to notice that your mount options have been overridden
> > by the kernel also counts as "unexpected behaviour".
> 
> Fine. These are all very nice theories, but I just want to report this
> regression and hope it won't cause any big problems for any users out there.
> In the meantime, I am returning to 2.6.22.

Your choice.

Trond


RE: recent nfs change causes autofs regression

2007-08-30 Thread Hua Zhong

> On Thu, 2007-08-30 at 15:47 -0700, Hua Zhong wrote:
> > >
> > > Which is better than having it fail silently, or giving you a mount
> > > with the wrong mount options.
> >
> > Well, it depends on how you define "better".
> 
> "better" as in: "I now have a chance to notice, when my 'read-only
> mount' is actually 'read-write'".
> 
> > In this particular scenario, the maps read as follows:
> >
> > tools -fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,mountvers=3,nfsvers=3,actimeo=600 fs1.domain.com:/a/tools
> > share -fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,mountvers=3,nfsvers=3 fs1.domain.com:/a/share
> >
> > The only difference is in the actimeo (I don't even know what it
> > means). Is this enough to fail a mount?
> 
> Yes. The default values for acregmin, acregmax, acdirmin, acdirmax are
> not 600. If /a/tools and /a/share are on the same filesystem on the
> server, then the NFS client should warn you that you are about to do
> something that may result in cache coherency problems instead of
> silently allowing it, and then leaving you to debug the coherency issue.

These are two disjoint directories. I am wondering why there would be cache
coherency issues in this case? Is this specific to the Linux NFS
implementation, or do all other Unix systems have the same issue?
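
For reference, the two map entries quoted above can be compared mechanically. This is a small Python sketch with hypothetical helper names; the option strings are copied from the maps, rejoined where the mail client wrapped them:

```python
def parse_options(option_string):
    """Split an autofs-style "-opt,key=value" string into a dict."""
    options = {}
    for item in option_string.lstrip("-").split(","):
        key, sep, value = item.partition("=")
        options[key] = value if sep else True  # bare flags map to True
    return options


def option_diff(left, right):
    """Return {option: (left_value, right_value)} for options that differ."""
    a, b = parse_options(left), parse_options(right)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


TOOLS = ("-fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,"
         "mountvers=3,nfsvers=3,actimeo=600")
SHARE = ("-fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,"
         "mountvers=3,nfsvers=3")

print(option_diff(TOOLS, SHARE))
```

The only differing key is actimeo, which per nfs(5) sets acregmin, acregmax, acdirmin and acdirmax to the same value, so the two mounts ask for different attribute cache timeouts on the same server filesystem.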

> If you know what you are doing, then there is an option which allows
> you to override the default behaviour.
> 
> > More importantly, it is a regression. My understanding is that unless
> > absolutely necessary we do not introduce a "feature" that breaks
> > working setups.
> 
> Your turn to define what you mean by "working"? In my book that means
> "a setup that doesn't include unexpected or unintended behaviour".

"working" as in "I can mount the directory and do my work". And there have
never been any problems as far as I know.

> Not being able to notice cache coherency failures on a file that is
> mounted in two different places with two different sets of mount
> options counts as "unexpected behaviour".
> 
> Not being able to notice that your mount options have been overridden
> by the kernel also counts as "unexpected behaviour".

Fine. These are all very nice theories, but I just want to report this
regression and hope it won't cause any big problems for any users out there.
In the meantime, I am returning to 2.6.22.

> Trond


RE: recent nfs change causes autofs regression

2007-08-30 Thread Trond Myklebust

On Thu, 2007-08-30 at 15:47 -0700, Hua Zhong wrote:
> Hi Trond,
> 
> > On Thu, 2007-08-30 at 14:07 -0700, Hua Zhong wrote:
> > > I am re-sending this after help from Ian and git-bisect. To me it's a
> > > show-stopper: I cannot find an acceptable workaround that I can
> > > implement. The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes
> > > several autofs mounts to fail silently - they just do not appear when
> > > they should.
> > > I believe it's caused by the NFS change that forces multiple mounts
> > > from different directories under the same server side filesystem to have
> > > the same mount options by default, otherwise it returns EBUSY.
> > >
> > > For example, if the server has a filesystem /a and exports /a/x and
> > > /a/y (maybe with rw or ro), a client must now mount /a/x and /a/y with
> > > the same mount options.
> > 
> > Which is better than having it fail silently, or giving you a mount
> > with the wrong mount options.
> 
> Well, it depends on how you define "better".

"better" as in: "I now have a chance to notice, when my 'read-only
mount' is actually 'read-write'".

> In this particular scenario, the maps read as follows:
> 
> tools
> -fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,mountvers=3,nfsvers=3,actimeo=600 fs1.domain.com:/a/tools
> share
> -fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,mountvers=3,nfsvers=3 fs1.domain.com:/a/share
> 
> The only difference is in the actimeo (I don't even know what it means). Is
> this enough to fail a mount?

Yes. The default values for acregmin, acregmax, acdirmin, acdirmax are
not 600. If /a/tools and /a/share are on the same filesystem on the
server, then the NFS client should warn you that you are about to do
something that may result in cache coherency problems instead of
silently allowing it, and then leaving you to debug the coherency issue.
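
To make the actimeo point concrete, here is a minimal sketch (not the kernel's
comparison code) of how the two map entries end up with different attribute
cache settings. It assumes the behaviour documented in nfs(5): actimeo sets
acregmin, acregmax, acdirmin and acdirmax to the same value, and the defaults
are 3, 60, 30 and 60 seconds. The helper name effective_ac is hypothetical.

```python
# Attribute-cache defaults as documented in nfs(5), in seconds.
AC_DEFAULTS = {"acregmin": 3, "acregmax": 60, "acdirmin": 30, "acdirmax": 60}


def effective_ac(options: str) -> dict:
    """Return the attribute-cache timeouts implied by a map option string."""
    ac = dict(AC_DEFAULTS)
    for opt in options.lstrip("-").split(","):
        key, _, value = opt.partition("=")
        if key == "actimeo":
            # actimeo sets all four timeouts to the same value.
            ac = {name: int(value) for name in AC_DEFAULTS}
        elif key in AC_DEFAULTS:
            ac[key] = int(value)
    return ac


# The two option strings from the autofs map quoted in this thread.
tools = ("-fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,"
         "mountvers=3,nfsvers=3,actimeo=600")
share = ("-fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,"
         "mountvers=3,nfsvers=3")

print(effective_ac(tools))   # all four timeouts are 600
print(effective_ac(share))   # the nfs(5) defaults
print(effective_ac(tools) == effective_ac(share))  # False
```

Only the attribute-cache options are compared here; the real check covers
other mount parameters as well.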

If you know what you are doing, then there is an option which allows you
to override the default behaviour.

> More importantly, it is a regression. My understanding is that unless
> absolutely necessary we do not introduce a "feature" that breaks working
> setups.

Your turn to define what you mean by "working"? In my book that means "a
setup that doesn't include unexpected or unintended behaviour".

Not being able to notice cache coherency failures on a file that is
mounted in two different places with two different sets of mount options
counts as "unexpected behaviour".

Not being able to notice that your mount options have been overridden by
the kernel also counts as "unexpected behaviour".

Trond


RE: recent nfs change causes autofs regression

2007-08-30 Thread Hua Zhong

Hi Trond,

> On Thu, 2007-08-30 at 14:07 -0700, Hua Zhong wrote:
> > I am re-sending this after help from Ian and git-bisect. To me it's a
> > show-stopper: I cannot find an acceptable workaround that I can
> > implement. The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes
> > several autofs mounts to fail silently - they just do not appear when
> > they should.
> > I believe it's caused by the NFS change that forces multiple mounts
> > from different directories under the same server side filesystem to have
> > the same mount options by default, otherwise it returns EBUSY.
> >
> > For example, if the server has a filesystem /a and exports /a/x and
> > /a/y (maybe with rw or ro), a client must now mount /a/x and /a/y with
> > the same mount options.
> 
> Which is better than having it fail silently, or giving you a mount
> with the wrong mount options.

Well, it depends on how you define "better".

In this particular scenario, the maps read as follows:

tools
-fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,mountvers=3,nfsvers=3,actimeo=600 fs1.domain.com:/a/tools
share
-fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,mountvers=3,nfsvers=3 fs1.domain.com:/a/share

The only difference is in the actimeo (I don't even know what it means). Is
this enough to fail a mount?

More importantly, it is a regression. My understanding is that unless
absolutely necessary we do not introduce a "feature" that breaks working
setups.

> If you need to mount the same filesystem with incompatible mount
> options on the same client, then there is a new mount option
> "nosharecache", which enables it.

Unfortunately, as I said, I don't control the auto.auto nis map.

> The new option is there in order to make it damned clear to sysadmins
> that this is a dangerous thing to do: mounts which don't share the same
> superblock also don't share the same data and attribute caches. Any
> file or directory which appears in both mounts had better only be used by
> one application at a time or be using an appropriate locking scheme.

I guess the question is what should be the default.

I'll convey this to our system admin (fortunately we are not a very big
company), but I am just not 100% sure this is a well-thought-out change,
because I believe many people will be impacted once 2.6.23 is out. Shouldn't
we give users some time to fix their configs before we enforce this, for
example with some kernel warnings?

Thanks.

> Trond


Re: recent nfs change causes autofs regression

2007-08-30 Thread Trond Myklebust

On Thu, 2007-08-30 at 14:07 -0700, Hua Zhong wrote:
> I am re-sending this after help from Ian and git-bisect. To me it's a
> show-stopper: I cannot find an acceptable workaround that I can implement.
> 
> The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
> mounts to fail silently - they just do not appear when they should.
> 
> I believe it's caused by the NFS change that forces multiple mounts from
> different directories under the same server side filesystem to have the same
> mount options by default, otherwise it returns EBUSY.
> 
> For example, if the server has a filesystem /a and exports /a/x and /a/y
> (maybe with rw or ro), a client must now mount /a/x and /a/y with the same
> mount options.

Which is better than having it fail silently, or giving you a mount with
the wrong mount options.

If you need to mount the same filesystem with incompatible mount options
on the same client, then there is a new mount option "nosharecache",
which enables it.
The new option is there in order to make it damned clear to sysadmins
that this is a dangerous thing to do: mounts which don't share the same
superblock also don't share the same data and attribute caches. Any file
or directory which appears in both mounts had better only be used by one
application at a time or be using an appropriate locking scheme.
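
For a map like the one under discussion, the override would go into the
option string of each affected entry. A minimal sketch, assuming the map can
be edited and that the installed mount tools pass the option through to the
kernel; the helper name add_nosharecache and the shortened option string are
illustrative only:

```python
def add_nosharecache(options: str) -> str:
    """Append nosharecache to a comma-separated option string if absent."""
    parts = options.split(",")
    if "nosharecache" not in parts:
        parts.append("nosharecache")
    return ",".join(parts)


# Shortened, illustrative option string in autofs map style.
entry = "-fstype=nfs,udp,rw,intr,actimeo=600"
print(add_nosharecache(entry))
# -fstype=nfs,udp,rw,intr,actimeo=600,nosharecache
```

The caveat above still applies: mounts made this way do not share data or
attribute caches with other mounts of the same server filesystem.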

Trond


recent nfs change causes autofs regression

2007-08-30 Thread Hua Zhong

I am re-sending this after help from Ian and git-bisect. To me it's a
show-stopper: I cannot find an acceptable workaround that I can implement.

The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
mounts to fail silently - they just do not appear when they should.

I believe it's caused by the NFS change that forces multiple mounts from
different directories under the same server side filesystem to have the same
mount options by default, otherwise it returns EBUSY.

For example, if the server has a filesystem /a and exports /a/x and /a/y
(maybe with rw or ro), a client must now mount /a/x and /a/y with the same
mount options.

Since in my setup they are managed by autofs, and the autofs map is managed
by NIS, there is no way I could easily work around it.

If we have to live with this regression, I want to hear some suggestions
about how to fix them realistically. Thanks.

By the way, I am not sure if I did the bisect right, but FWIW, git-bisect
says:

c98451bdb2f3e6d6cc1e03adad641e9497512b49 is first bad commit
commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
Author: Frank van Maarseveen <[EMAIL PROTECTED]>
Date:   Mon Jul 9 22:25:29 2007 +0200

NLM: fix source address of callback to client

Use the destination address of the original NLM request as the
source address in callbacks to the client.

Signed-off-by: Frank van Maarseveen <[EMAIL PROTECTED]>
Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>

:04 04 675c84bd8b2c50744018becaa0db4aeca19b8f9f 105fbd3cb3fa5e3019836b4b5268125d0181a72d M  fs
:04 04 0138796e0806b4ebd1cc3850ed4e8c7ab24d2d41 2fec08debe51c20423a88b1a0d4281c683ba5daf M  include


-Original Message-
From: Hua Zhong [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 29, 2007 1:59 PM
To: 'Linux Kernel Mailing List'
Subject: regression of autofs for current git?

Hi,

I am wondering if this is a known issue, but I just built the current git
and several autofs mounts mysteriously disappeared. Restarting autofs could
fix some, but then lose others. 2.6.22 was fine.

Is there anything I could check other than bisect? (It may take some time
for me to get to it)

Thanks for your help.

Hua


RE: recent nfs change causes autofs regression

2007-08-30 Thread Hua Zhong

> How is the NFS client to know that these directories are disjoint, or
> that no-one will ever create a hard link from one directory to another?
> To my knowledge, the only way to ensure this is to put them on
> different disk partitions.
>
> I don't know if all Unix systems have this issue, but I have been told
> that Solaris at least has it.

Does Solaris enforce mounting with the same options by default?

> > "working" as in "I can mount the directory and do my work". And there
> > has never been any problems as far as I know.
>
> That is too narrow a definition: the minimum should be "everyone can
> mount their directories and do their work". Your particular setup may
> be safe, but that is why we have overrides: the default should be for the
> kernel to be conservative, and to _tell_ users what it thinks is wrong.

Every engineer in our organization mounts it too. No problem until now.

It's not very conservative to suddenly change default behavior and break
autofs mounts. There is not even one kernel message that _tells_ the user
why it thinks it's wrong. It just silently fails.

> Your choice.

No. I have no other choice, as I explained before.

Hua


Re: recent nfs change causes autofs regression

2007-08-30 Thread Andrew Morton

On Thu, 30 Aug 2007 18:37:13 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote:

> On Thu, 2007-08-30 at 14:07 -0700, Hua Zhong wrote:
> > I am re-sending this after help from Ian and git-bisect. To me it's a
> > show-stopper: I cannot find an acceptable workaround that I can implement.
> >
> > The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
> > mounts to fail silently - they just do not appear when they should.
> >
> > I believe it's caused by the NFS change that forces multiple mounts from
> > different directories under the same server side filesystem to have the same
> > mount options by default, otherwise it returns EBUSY.
> >
> > For example, if the server has a filesystem /a and exports /a/x and /a/y
> > (maybe with rw or ro), a client must now mount /a/x and /a/y with the same
> > mount options.
>
> Which is better than having it fail silently, or giving you a mount with
> the wrong mount options.
>
> If you need to mount the same filesystem with incompatible mount options
> on the same client, then there is a new mount option "nosharecache",
> which enables it.
> The new option is there in order to make it damned clear to sysadmins
> that this is a dangerous thing to do: mounts which don't share the same
> superblock also don't share the same data and attribute caches. Any file
> or directory which appears in both mounts had better only be used by one
> application at a time or be using an appropriate locking scheme.
>

If we're going to send a message to sysadmins, we shouldn't force them to go
through a git bisection search and a lkml discussion to receive it!

Is there at least some way in which the kernel can detect this situation
and emit a friendly printk which guides people to a friendly document?

Re: recent nfs change causes autofs regression

2007-08-30 Thread Linus Torvalds



On Thu, 30 Aug 2007, Trond Myklebust wrote:
>
> Which is better than having it fail silently, or giving you a mount with
> the wrong mount options.

No, Trond.

That commit gets reverted or fixed. It's a regression, and your theories
that it's better that way are obviously broken.

It's obviously broken because you seem to say that you know better, even
though you also admit that:

  "How is the NFS client to know that these directories are disjoint, or
   that no-one will ever create a hard link from one directory to another?
   To my knowledge, the only way to ensure this is to put them on
   different disk partitions."

the point being that you just disallowed people from doing things that are
sane but _potentially_ dangerous. That's not how we work. The UNIX way is
to give people rope - if you cannot *prove* that what they are doing is
wrong, then you damn well better not disallow it.

No regressions, Trond. Especially not for stuff that used to work, was
used, and that could be sanely expected to work (which this *definitely*
sounds like).

Please send in a fix. If the fix involves making "nosharecache" the
default, then that is better than making policy decisions like this in the
kernel. The kernel should do what the user asks and not put in unnecessary
roadblocks.

Hua - that said, I don't actually see why the commit you bisected to has 
anything to do with the issue being discussed. Can you double-check that 
it's literally that particular commit that breaks for you (you could try 
just reverting that commit).

Linus
