Re: 2.4.2: opening deleted directories

2001-03-19 Thread Alexander Viro



On 19 Mar 2001, Trond Myklebust wrote:

 For a networked filesystem, you want to force the inode to be
 revalidated in some way every time you open() a file in order to
 ensure that the local cached data are still valid.
 The most obvious way to do so given the current VFS API is to use
 d_revalidate().  One could use inode->i_op->revalidate() instead, but
 if one wants to optimize attribute caching, one needs to add a flag to
 tell the filesystem code for when the VFS knows that we want to force
 a lookup.

What LOOKUP will you pass to server when you are opening "."?

 That said, I'm starting to believe that we need a rethink of the
 'open()' interface at the VFS level for 2.5. The current system is
 becoming more of a problem as the networked filesystems we'd like to
 support become more sophisticated.
 
 In particular, there is a problem with atomicity: currently we rely
 heavily on semaphores in the VFS to ensure atomicity between lookups,
 permission checks, and the various operations on files. This obviously
 is a problem in a networked environment where one in general has no
 such locks. Thus the newer filesystems (NFSv4, codafs, ...) are
 increasingly moving towards attempting to solve these problems by
 pushing more of the responsibility for atomicity onto the server, and
 thus packing more and more operations into a single RPC call. The most
 insane case I know of to date is the monster-OPEN statement that NFSv4
 has to cope with (see RFC3010). There you ideally want to cram a
 sequence like
lookup filehandle
check permissions
create file
lock file
 all into a single RPC OPEN call.

 NFSv4 is not alone in this, however. NFSv3 could for instance optimize
 inode revalidation and permissions checking into a single operation
 for the case where we know we're doing an 'open()' on a file.
 We might also want to drop forcing revalidation on open() altogether
 if we knew that another file already was open.

Could we? I open /home/foo. Then I chdir there. Then I rmdir /home/foo.
Then I open ".". You wanted revalidate on that, didn't you?

 Perhaps one should make 'sys_open()' call the filesystem layer, and
 then provide tools, and a generic_open_file() method in the VFS
 (rather like sys_read and sys_write work today)?

I agree that all current stuff related to revalidation sucks badly, but
I don't understand the suggestion above. sys_write() knows what
object to talk with. sys_open() gets a string. It can't decide which
filesystem it belongs to until it does the lookup. It can't go into
the fs-specific code until it knows which filesystem that would be.

As for the permission checks on LOOKUP - why not? We can pass the
information about the planned kind of access to ->lookup(). Cache the
credentials of the guy who did that lookup in your inode or dentry,
and if a subsequent ->permission() matches - don't ask the server.
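
A rough userspace sketch of that credential cache, in case it helps to see the idea spelled out. Everything here is hypothetical: struct cred_cache, the ACC_* bits and the helper names are invented for illustration and are not kernel interfaces. The cache remembers who did the lookup and which kinds of access were granted then; a later permission check is answered locally only if the same credentials ask for a subset of that access.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical access bits, for illustration only. */
enum { ACC_EXEC = 1, ACC_WRITE = 2, ACC_READ = 4 };

/* Hypothetical per-inode (or per-dentry) record of the last lookup. */
struct cred_cache {
    bool  valid;
    uid_t uid;
    gid_t gid;
    int   mask;   /* kinds of access the server granted at lookup time */
};

/* Remember the credentials and access mask used for a lookup. */
void cred_cache_store(struct cred_cache *c, uid_t uid, gid_t gid, int mask)
{
    c->valid = true;
    c->uid = uid;
    c->gid = gid;
    c->mask = mask;
}

/* Drop the cached answer, e.g. when the attributes are revalidated. */
void cred_cache_invalidate(struct cred_cache *c)
{
    c->valid = false;
}

/*
 * True if the request can be answered from the cache: same uid and gid,
 * and every requested access bit was already granted at lookup time.
 * False means "ask the server".
 */
bool cred_cache_allows(const struct cred_cache *c, uid_t uid, gid_t gid,
                       int mask)
{
    return c->valid && c->uid == uid && c->gid == gid &&
           (mask & ~c->mask) == 0;
}
```

A miss only means the server has to be asked; it says nothing about whether access would be granted.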
Cheers,
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]



Re: 2.4.2: opening deleted directories

2001-03-19 Thread Trond Myklebust

 " " == Alexander Viro [EMAIL PROTECTED] writes:

  On 19 Mar 2001, Trond Myklebust wrote:

 For a networked filesystem, you want to force the inode to be
 revalidated in some way every time you open() a file in order
 to ensure that the local cached data are still valid.  The most
 obvious way to do so given the current VFS API is to use
 d_revalidate().  One could use inode->i_op->revalidate()
 instead, but if one wants to optimize attribute caching, one
 needs to add a flag to tell the filesystem code for when the
 VFS knows that we want to force a lookup.

  What LOOKUP will you pass to server when you are opening "."?

In theory, you could pass LOOKUP '.'. The latter is perfectly legal in
NFS, and has a well-defined (filesystem-independent) meaning. (LOOKUP
'..' is in fact also well-defined. Both are used by other variants of
*NIX in their treatment of 'getcwd()'.)

However locally, the Linux dcache defines '.' in terms of a
dentry. Thus I think it's correct to use d_revalidate(dentry) in this
case.

 We might also want to drop forcing revalidation on open()
 altogether if we knew that another file already was open.

  Could we? I open /home/foo. Then I chdir there. Then I rmdir
  /home/foo.  Then I open ".". You wanted revalidate on that,
  didn't you?

I was thinking more in terms of regular files than directories. In
principle we assume that if we've got a file open, nobody else is
supposed to be changing it (unless we're using file locking to keep
the data caches in sync).

That said, you're right: we do need to revalidate the filename itself
in case somebody renamed the file while we weren't looking.

 Perhaps one should make 'sys_open()' call the filesystem layer,
 and then provide tools, and a generic_open_file() method in the
 VFS (rather like sys_read and sys_write work today)?

  I agree that all current stuff related to revalidation sucks
  badly, but I don't understand the suggestion above. sys_write()
  knows what object to talk with. sys_open() gets a string. It
  can't decide which filesystem it belongs to until it does the
  lookup. It can't go into the fs-specific code until it knows
  which filesystem that would be.

Yes, that bit has to be done at the VFS level, but once the lookup of
the parent directory is over, file lookup, permissions checking, file
creation,... would best be done atomically.

Consider those cases where we currently have to grab i_sem (and
i_zombie) in order to guarantee atomicity. NFSv4 encourages use of a
single COMPOUND call if it is to offer the same guarantees.
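
To make the "several operations in one call" idea concrete, here is a toy single-process model. It is not the NFSv4 wire protocol; struct open_ctx, the op_* functions and run_compound() are hypothetical names made up for this sketch. It only shows the shape: a list of operations evaluated in order, stopping at the first one that fails, so that later steps never run against a state the earlier steps rejected.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical state of the object being opened on the "server". */
struct open_ctx {
    int exists;     /* a file with that name is already there */
    int may_write;  /* caller may create entries in the directory */
    int created;
    int locked;
};

/* Each operation returns 0 on success or an errno-style status. */
typedef int (*compound_op)(struct open_ctx *ctx);

int op_lookup(struct open_ctx *ctx) { (void)ctx; return 0; }
int op_access(struct open_ctx *ctx) { return ctx->may_write ? 0 : EACCES; }
int op_create(struct open_ctx *ctx)
{
    if (ctx->exists)
        return EEXIST;          /* exclusive create refuses an existing name */
    ctx->exists = ctx->created = 1;
    return 0;
}
int op_lock(struct open_ctx *ctx) { ctx->locked = 1; return 0; }

struct compound_result {
    size_t completed;   /* number of operations that succeeded */
    int    status;      /* 0, or the status of the operation that failed */
};

/* Evaluate the operations in order and stop at the first failure. */
struct compound_result run_compound(const compound_op *ops, size_t nops,
                                    struct open_ctx *ctx)
{
    struct compound_result res = { 0, 0 };

    for (size_t i = 0; i < nops; i++) {
        res.status = ops[i](ctx);
        if (res.status != 0)
            break;              /* later operations are never attempted */
        res.completed++;
    }
    return res;
}
```

With the sequence { op_lookup, op_access, op_create, op_lock }, a name that already exists stops the run at op_create and the lock step is never reached.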

One point is, for instance, proper treatment of the 'O_EXCL' flag on
file creation.
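
For reference, this is what the 'O_EXCL' guarantee looks like from userspace; a minimal sketch, with create_exclusive() being a hypothetical helper name. The check for an existing name and the creation must behave as one step, which is exactly what is hard to provide when lookup and create are separate calls to a server.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` only if it does not exist yet.
 * Returns 0 on success, or the errno from open() on failure
 * (EEXIST if some other creator got there first).
 */
int create_exclusive(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

Two callers racing on the same path should see exactly one success and one EEXIST, never two successes.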

Cheers,
  Trond



Re: 2.4.2: opening deleted directories

2001-03-19 Thread Guest section DW


On Tue, Mar 20, 2001 at 10:49:27AM +1100, Neil Brown wrote:

  ls -la    - reports nothing, not even ".".  That's odd.
  ls -la .. - lists the old parent.  This is wrong.
 
 So I certainly agree that things aren't perfect, and that if the
 standard says that lookup of "." must stop working, then I guess it
 must,  but I still have trouble seeing how this relates to NFS.

No, the standard says that the entries for dot and dotdot are removed,
but also that one does not look at these entries during a lookup.
Nothing you describe is wrong.



Re: 2.4.2: opening deleted directories

2001-03-18 Thread Guest section DW

On Sun, Mar 18, 2001 at 05:25:52PM -0500, Chuck Lever wrote:
 on Linux, if my cwd is a deleted directory, i can still open it.  to wit:
 
 shell A:                shell B:
 1.  cd /tmp             cd /tmp
 2.  mkdir test-dir
 3.  cd test-dir
 4.                      rmdir test-dir
 5.  strace /bin/ls
 
...
 notice the open(".") -- it opens the current working directory that
 is in effect for the "ls" command.  but i just deleted that directory
 from another shell.  shouldn't that open(".") return ENOENT?
 
 it does on Solaris and OpenBSD for local file systems.
 
 since open(".") doesn't do either a d_lookup or d_revalidate on
 ".", open() never knows if the object is there or not; it merely
 returns a file descriptor that can't be used for anything.  this is
 an artifact of how path_init and path_walk work -- for ".",
 path_walk simply returns the dentry that was passed in by
 path_init.
 
 Trond and I were discussing this problem off-line as it relates
 to NFS close-to-open semantics. any comments on why it 
 works that way?  does anyone have opinions about whether
 or not this is broken behavior?

No, I do not think this is broken behavior.

You open ".". The name resolution says:
 The special file name dot refers to the directory specified by its predecessor.
where
 If the path name does not begin with a slash, the predecessor of the
 first file name of the path name is taken to be the current working directory
so it seems that to open "." one only needs an inode, and no name lookup.

The current working directory is a directory which is a file.
There is no requirement that a file shall have a name.
The rmdir removes a name, but not the file, in case there are extant refs.
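
The experiment can also be run from a single process; a minimal sketch, assuming /tmp is writable and that the system lets a process rmdir its own working directory. The helper name open_dot_in_removed_cwd() and the scratch directory name are made up for this example. It removes the name of the current working directory and then tries open(".").

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: make a scratch directory, chdir into it, remove it
 * by its absolute path, then try open(".").
 * Returns 0 if open(".") succeeded, the errno from open() if it failed,
 * or -1 if the setup could not be completed.
 */
int open_dot_in_removed_cwd(void)
{
    char path[64];
    int fd, result;

    snprintf(path, sizeof(path), "/tmp/test-dir-%ld", (long)getpid());
    if (mkdir(path, 0700) != 0)
        return -1;
    if (chdir(path) != 0) {
        rmdir(path);
        return -1;
    }
    if (rmdir(path) != 0)       /* the name goes away; the cwd reference stays */
        return -1;

    fd = open(".", O_RDONLY);
    result = (fd >= 0) ? 0 : errno;
    if (fd >= 0)
        close(fd);

    if (chdir("/") != 0)        /* step out of the orphaned directory */
        return -1;
    return result;
}
```

A return of 0 corresponds to the Linux behavior described in this thread; ENOENT corresponds to what was reported for Solaris and OpenBSD.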

Andries




Re: 2.4.2: opening deleted directories

2001-03-18 Thread Chuck Lever

repeating neil's question from below
  "why is the open(".") behavior an issue?"

there are a few reasons:

* NFS close-to-open cache consistency:

in NFS, close-to-open semantics require that attributes be
fetched from the server when a file/directory is opened.  this
behavior is in part to help an application determine whether
the file or directory still exists or has been replaced or
removed, as in my 'ls' example.

in the open(".") case, path_walk doesn't invoke either
d_lookup or d_revalidate, so there is no opportunity in the
present logic to retrieve the directory's attributes.  this
potentially breaks programs that depend on attributes
being correct when opening ".".  'make' and 'ls' are just two
examples.

this is, btw, the original problem Trond and I were discussing.
he pointed out this problem.

* rmdir behavior:

the POSIX 1003.1 definition of rmdir() states that:

 If the directory is the root directory or the current working directory
 of any process, the effect of this function is implementation-defined.

cop-out. it later states that:

 If one or more processes have the directory open when the last
 link is removed, the dot and dot-dot entries, if present, are removed
 before rmdir() returns and no new entries may be created in the
 directory.

this indicates to me that, while the directory may continue to exist
if it's the cwd of some other process, the "." and ".." entries must
be removed, or equivalently, that lookups of "." and ".." will always
fail after a directory is deleted.

* standard pathname resolution behavior:

according to POSIX 1003.1, resolving a relative pathname
means the resolution *begins* at the current working directory.
In our case, if we follow POSIX resolution strategy, after starting
at the cwd, a lookup of "." should be done.  At this point I infer
that since the directory has been deleted, "." doesn't exist, and
open() returns ENOENT.

IOW, according to the text in the standard, the current
working directory is not an open file descriptor, it is simply a
naming convenience used during pathname resolution.

* other broken system calls:

i haven't tried this, but i'd guess stat(".") would behave similarly.
thus stat(".") on such a removed directory would tell an application
that the directory exists when in fact it doesn't.  this borders on
insecure behavior.  fortunately, no other operations are allowed
on the directory.

* consistent behavior across operating systems:

there are other flavors of UNIX that don't appear to work
this way.  once the directory is removed, it cannot be opened,
both on Solaris and OpenBSD.  i don't have access to others
at the moment.  this complicates porting applications among
operating systems, if only slightly.

* open() is a name space operation:

open() converts a pathname into a file descriptor; it's a name
space operation.  i believe that if a file or directory no longer
exists, applications expect they will not be able to open the file
or directory because it is no longer attached to the file system's
name space.  i don't believe there are any other cases in Linux
where you can open a file or directory that has been removed,
are there?

* good design:

i believe in reporting an error as soon as it occurs.  there are no
other operations allowed on a deleted directory, but the open()
call is the first opportunity for the operating system to indicate
to an application that the directory is gone.



i'm not trying to start a rock fight.  but i think this behavior is a
little strange when compared to other systems, and especially the
NFS part is bothersome.  and yes, i know that ext2 doesn't
support lookups on ".".  but other file systems do... and Linux
is operating in a larger universe these days.

and NB: according to the POSIX standard's description of
relative pathname lookup and rmdir, i'd say that, if the cwd is
a deleted directory, open("..") should also fail.  i haven't
checked whether this is true or not.  but we do know that
".." is handled similarly in path_walk -- no d_lookup or
d_revalidate is done.
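
the two untested guesses above, stat(".") and open("..") in a removed cwd, are easy to check with a small probe.  a minimal sketch, assuming /tmp is writable; struct removed_cwd_probe and probe_removed_cwd() are hypothetical names invented for this example.  it records what each call returns rather than assuming an answer.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result record for the probe below. */
struct removed_cwd_probe {
    int  stat_dot_errno;     /* 0 if stat(".") succeeded, else its errno */
    int  open_dotdot_errno;  /* 0 if open("..") succeeded, else its errno */
    long nlink;              /* st_nlink reported for ".", or -1 */
};

/*
 * Create a scratch directory, make it the cwd, remove it, then try
 * stat(".") and open("..").  Returns 0 if the probe ran, -1 if the
 * setup or cleanup failed.
 */
int probe_removed_cwd(struct removed_cwd_probe *out)
{
    char path[64];
    struct stat st;
    int fd;

    snprintf(path, sizeof(path), "/tmp/probe-dir-%ld", (long)getpid());
    if (mkdir(path, 0700) != 0)
        return -1;
    if (chdir(path) != 0) {
        rmdir(path);
        return -1;
    }
    if (rmdir(path) != 0)
        return -1;

    if (stat(".", &st) == 0) {
        out->stat_dot_errno = 0;
        out->nlink = (long)st.st_nlink;
    } else {
        out->stat_dot_errno = errno;
        out->nlink = -1;
    }

    fd = open("..", O_RDONLY);
    out->open_dotdot_errno = (fd >= 0) ? 0 : errno;
    if (fd >= 0)
        close(fd);

    return chdir("/") == 0 ? 0 : -1;
}
```

calling it from a small main() and printing the three fields shows what a given kernel and file system actually do.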


- Original Message -
From: "Neil Brown" [EMAIL PROTECTED]
To: "Chuck Lever" [EMAIL PROTECTED]
Cc: "Linux FS Developers" [EMAIL PROTECTED]; "Trond Myklebust"
[EMAIL PROTECTED]
Sent: Sunday, March 18, 2001 5:40 PM
Subject: Re: 2.4.2: opening deleted directories


 On Sunday March 18, [EMAIL PROTECTED] wrote:
  on Linux, if my cwd is a deleted directory, i can still open it.  to wit:
 
  notice the open(".") -- it opens the current working directory that
  is in effect for the "ls" command.  but i just deleted that directory
  from another shell.  shouldn't that open(".") return ENOENT?

 Note that the error message you expect is "ENOENT" == Error, NO
 ENTry.  The ENTRY th