Re: 2.4.2: opening deleted directories
On 19 Mar 2001, Trond Myklebust wrote:

> For a networked filesystem, you want to force the inode to be revalidated in some way every time you open() a file in order to ensure that the local cached data are still valid. The most obvious way to do so given the current VFS API is to use d_revalidate(). One could use inode->i_op->revalidate() instead, but if one wants to optimize attribute caching, one needs to add a flag to tell the filesystem code for when the VFS knows that we want to force a lookup.

What LOOKUP will you pass to server when you are opening "."?

> That said, I'm starting to believe that we need a rethink of the 'open()' interface at the VFS level for 2.5. The current system is becoming more of a problem as the networked filesystems we'd like to support become more sophisticated.
>
> In particular, there is a problem with atomicity: currently we rely heavily on semaphores in the VFS to ensure atomicity between lookups, permission checks, and the various operations on files. This obviously is a problem in a networked environment where one in general has no such locks. Thus the newer filesystems (NFSv4, codafs, ...) are increasingly moving towards attempting to solve these problems by pushing more of the responsibility for atomicity onto the server, and thus packing more and more operations into a single RPC call.
>
> The most insane case I know of to date is the monster-OPEN statement that NFSv4 has to cope with (see RFC3010). There you ideally want to cram a sequence like
>
>     lookup filehandle
>     check permissions
>     create file
>     lock file
>
> all into a single RPC OPEN call.
>
> NFSv4 is not alone in this, however. NFSv3 could for instance optimize inode revalidation and permissions checking into a single operation for the case where we know we're doing an 'open()' on a file.
>
> We might also want to drop forcing revalidation on open() altogether if we knew that another file already was open.

Could we? I open /home/foo. Then I chdir there. Then I rmdir /home/foo. Then I open ".". You wanted revalidate on that, didn't you?

> Perhaps one should make 'sys_open()' call the filesystem layer, and then provide tools, and a generic_open_file() method in the VFS (rather like sys_read and sys_write work today)?

I agree that all current stuff related to revalidation sucks badly, but I don't understand the suggestion above. sys_write() knows what object to talk with. sys_open() gets a string. It can't decide which filesystem it belongs to until it does the lookup. It can't go into the fs-specific code until it knows what filesystem would it be.

As for the permission checks on LOOKUP - why not? We can pass the information about planned kind of access to ->lookup(). Cache credentials of the guy who did that lookup in your inode or dentry and if subsequent ->permission() matches - don't ask server.

Cheers,
Al
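The credential-caching idea at the end of Al's message can be pictured with a small user-space model. This is only a sketch under stated assumptions: `ToyInode`, `ToyClient`, `ask_server` and the mask constants are hypothetical names invented for illustration, not kernel or NFS interfaces, and the toy ignores cache expiry and invalidation entirely.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Illustrative access bits for this toy only.
MAY_WRITE = 2
MAY_READ = 4


@dataclass
class ToyInode:
    """Hypothetical stand-in for a cached inode/dentry; not a kernel structure."""
    name: str
    # (uid, granted mask) remembered from the lookup, if the server said yes.
    cached_access: Optional[Tuple[int, int]] = None


class ToyClient:
    """Toy network-filesystem client that counts round trips to the server."""

    def __init__(self, ask_server: Callable[[str, int, int], bool]) -> None:
        self.ask_server = ask_server  # hypothetical RPC: (name, uid, mask) -> allowed?
        self.rpc_count = 0

    def _rpc(self, inode: ToyInode, uid: int, mask: int) -> bool:
        self.rpc_count += 1
        return self.ask_server(inode.name, uid, mask)

    def lookup(self, name: str, uid: int, mask: int) -> ToyInode:
        """Lookup that is told the planned access and remembers who asked."""
        inode = ToyInode(name)
        if self._rpc(inode, uid, mask):
            inode.cached_access = (uid, mask)
        return inode

    def permission(self, inode: ToyInode, uid: int, mask: int) -> bool:
        """Answer from the cache when the same credentials ask for covered access."""
        cached = inode.cached_access
        if cached is not None and cached[0] == uid and (mask & ~cached[1]) == 0:
            return True
        return self._rpc(inode, uid, mask)


if __name__ == "__main__":
    client = ToyClient(lambda name, uid, mask: uid == 1000)
    inode = client.lookup("foo", 1000, MAY_READ)
    print(client.permission(inode, 1000, MAY_READ), client.rpc_count)
    print(client.permission(inode, 0, MAY_READ), client.rpc_count)
```

In this model a permission check by the same uid for access already granted at lookup time costs no extra round trip, while a different uid falls through to the server.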
Re: 2.4.2: opening deleted directories
Alexander Viro [EMAIL PROTECTED] writes:

> On 19 Mar 2001, Trond Myklebust wrote:
>
> > For a networked filesystem, you want to force the inode to be revalidated in some way every time you open() a file in order to ensure that the local cached data are still valid. The most obvious way to do so given the current VFS API is to use d_revalidate(). One could use inode->i_op->revalidate() instead, but if one wants to optimize attribute caching, one needs to add a flag to tell the filesystem code for when the VFS knows that we want to force a lookup.
>
> What LOOKUP will you pass to server when you are opening "."?

In theory, you could pass LOOKUP '.'. The latter is perfectly legal in NFS, and has a well-defined (filesystem-independent) meaning. (LOOKUP '..' is in fact also well-defined. Both are used by other variants of *NIX in their treatment of 'getcwd()'.)

However locally, the Linux dcache defines '.' in terms of a dentry. Thus I think it's correct to use d_revalidate(dentry) in this case.

> > We might also want to drop forcing revalidation on open() altogether if we knew that another file already was open.
>
> Could we? I open /home/foo. Then I chdir there. Then I rmdir /home/foo. Then I open ".". You wanted revalidate on that, didn't you?

I was thinking more in terms of regular files than directories. In principle we assume that if we've got a file open, nobody else is supposed to be changing it (unless we're using file locking to keep the data caches in sync). That said, you're right: we do need to revalidate the filename itself in case somebody renamed the file while we weren't looking.

> > Perhaps one should make 'sys_open()' call the filesystem layer, and then provide tools, and a generic_open_file() method in the VFS (rather like sys_read and sys_write work today)?
>
> I agree that all current stuff related to revalidation sucks badly, but I don't understand the suggestion above. sys_write() knows what object to talk with. sys_open() gets a string. It can't decide which filesystem it belongs to until it does the lookup. It can't go into the fs-specific code until it knows what filesystem would it be.

Yes, that bit has to be done at the VFS level, but once the lookup of the parent directory is over, file lookup, permissions checking, file creation,... would best be done atomically. Consider those cases where we currently have to grab i_sem (and i_zombie) in order to guarantee atomicity. NFSv4 encourages use of a single COMPOUND call if it is to offer the same guarantees. One point is, for instance, proper treatment of the 'O_EXCL' flag on file creation.

Cheers,
Trond
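The 'O_EXCL' point can be seen from user space. The sketch below is not taken from the thread: `create_exclusively` and `racy_create` are hypothetical helper names. It contrasts a single open() with O_CREAT|O_EXCL, where the existence check and the creation are one operation, with a separate check followed by a create, which leaves a window for another process.

```python
import os
import tempfile


def create_exclusively(path: str) -> bool:
    """Return True if this call created the file, False if it already existed.

    The existence check and the creation happen inside one open() call.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def racy_create(path: str) -> bool:
    """Check, then create, as two steps.

    Another process can create the file between the two calls, so the
    True result does not prove that this caller was the creator.
    """
    if os.path.exists(path):
        return False
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    os.close(fd)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "lockfile")
        print(create_exclusively(target))
        print(create_exclusively(target))
```

For a network filesystem to give the first form the same meaning, the lookup and the create have to reach the server as one atomic request, which is the case Trond makes for bundling them.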
Re: 2.4.2: opening deleted directories
On Tue, Mar 20, 2001 at 10:49:27AM +1100, Neil Brown wrote:

>   ls -la    - reports nothing, not even ".". That's odd.
>   ls -la .. - lists the old parent. This is wrong.
>
> So I certainly agree that things aren't perfect, and that if the standard says that lookup of "." must stop working, then I guess it must, but I still have trouble seeing how this relates to NFS.

No, the standard says that the entries for dot and dotdot are removed, but also that one does not look at these entries during a lookup. Nothing you describe is wrong.
Re: 2.4.2: opening deleted directories
On Sun, Mar 18, 2001 at 05:25:52PM -0500, Chuck Lever wrote:

> on Linux, if my cwd is a deleted directory, i can still open it. to wit:
>
>         shell A:              shell B:
>     1.  cd /tmp               cd /tmp
>     2.  mkdir test-dir
>     3.  cd test-dir
>     4.                        rmdir test-dir
>     5.  strace /bin/ls
>         ...
>
> notice the open(".") -- it opens the current working directory that is in effect for the "ls" command. but i just deleted that directory from another shell. shouldn't that open(".") return ENOENT? it does on Solaris and OpenBSD for local file systems.
>
> since open(".") doesn't do either a d_lookup or d_revalidate on ".", open() never knows if the object is there or not; it merely returns a file descriptor that can't be used for anything. this is an artifact of how path_init and path_walk work -- for ".", path_walk simply returns the dentry that was passed in by path_init.
>
> Trond and I were discussing this problem off-line as it relates to NFS close-to-open semantics. any comments on why it works that way? does anyone have opinions about whether or not this is broken behavior?

No, I do not think this is broken behavior.

You open ".". The name resolution says:

    The special file name dot refers to the directory specified by its predecessor.

where

    If the path name does not begin with a slash, the predecessor of the first file name of the path name is taken to be the current working directory

so it seems that to open "." one only needs an inode, and no name lookup.

The current working directory is a directory which is a file. There is no requirement that a file shall have a name. The rmdir removes a name, but not the file, in case there are extant refs.

Andries
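Chuck's two-shell recipe can be approximated from a single process. The following is a minimal sketch, assuming the platform lets a process rmdir its own working directory by absolute path; `open_deleted_cwd` is a hypothetical helper name. It reports what open(".") does afterwards rather than assuming either answer, since the thread describes the call succeeding on Linux 2.4.2 and failing with ENOENT on Solaris and OpenBSD.

```python
import errno
import os
import tempfile


def open_deleted_cwd() -> str:
    """Remove the cwd by absolute path, then try open(".").

    Returns "opened" if open(".") succeeds, otherwise the errno name.
    """
    saved = os.getcwd()
    parent = os.path.abspath(tempfile.mkdtemp())
    victim = os.path.join(parent, "test-dir")
    os.mkdir(victim)
    try:
        os.chdir(victim)
        try:
            os.rmdir(victim)  # plays the role of "shell B"
        except OSError as exc:
            # Some systems refuse to remove a directory that is a cwd.
            return "rmdir refused: " + errno.errorcode.get(exc.errno, str(exc.errno))
        try:
            fd = os.open(".", os.O_RDONLY)  # what ls does in "shell A"
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        os.close(fd)
        return "opened"
    finally:
        # Restore the original cwd and clean up whatever is left.
        os.chdir(saved)
        if os.path.isdir(victim):
            os.rmdir(victim)
        os.rmdir(parent)


if __name__ == "__main__":
    print(open_deleted_cwd())
```

An "opened" result matches Andries' reading, in which "." needs only the inode held by the process, while "ENOENT" matches the behavior Chuck reports for the other systems.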
Re: 2.4.2: opening deleted directories
repeating neil's question from below "why is the open(".") behavior an issue?" there are a few reasons:

* NFS close-to-open cache consistency: in NFS, close-to-open semantics require that attributes be fetched from the server when a file/directory is opened. this behavior is in part to help an application determine whether the file or directory still exists or has been replaced or removed, as in my 'ls' example. in the open(".") case, path_walk doesn't invoke either d_lookup or d_revalidate, so there is no opportunity in the present logic to retrieve the directory's attributes. this potentially breaks programs that depend on attributes being correct when opening ".". 'make' and 'ls' are just two examples. this is, btw, the original problem Trond and I were discussing. he pointed out this problem.

* rmdir behavior: the POSIX 1003.1 definition of rmdir() states that:

      If the directory is the root directory or the current working directory of any process, the effect of this function is implementation-defined.

  cop-out. it later states that:

      If one or more processes have the directory open when the last link is removed, the dot and dot-dot entries, if present, are removed before rmdir() returns and no new entries may be created in the directory.

  this indicates to me that, while the directory may continue to exist if it's the cwd of some other process, the "." and ".." entries must be removed, or equivalently, that lookups of "." and ".." will always fail after a directory is deleted.

* standard pathname resolution behavior: according to POSIX 1003.1, resolving a relative pathname means the resolution *begins* at the current working directory. In our case, if we follow POSIX resolution strategy, after starting at the cwd, a lookup of "." should be done. At this point I infer that since the directory has been deleted, "." doesn't exist, and open() returns ENOENT. IOW, according to the text in the standard, the current working directory is not an open file descriptor, it is simply a naming convenience used during pathname resolution.

* other broken system calls: i haven't tried this, but i'd guess stat(".") would behave similarly. thus stat(".") on such a removed directory would tell an application that the directory exists when in fact it doesn't. this borders on insecure behavior. fortunately, no other operations are allowed on the directory.

* consistent behavior across operating systems: there are other flavors of UNIX that don't appear to work this way. once the directory is removed, it cannot be opened, both on Solaris and OpenBSD. i don't have access to others at the moment. this complicates porting applications among operating systems, if only slightly.

* open() is a name space operation: open() converts a pathname into a file descriptor; it's a name space operation. i believe that if a file or directory no longer exists, applications expect they will not be able to open the file or directory because it is no longer attached to the file system's name space. i don't believe there are any other cases in Linux where you can open a file or directory that has been removed, are there?

* good design: i believe in reporting an error as soon as it occurs. there are no other operations allowed on a deleted directory, but the open() call is the first opportunity for the operating system to indicate to an application that the directory is gone.

i'm not trying to start a rock fight. but i think this behavior is a little strange when compared to other systems, and especially the NFS part is bothersome. and yes, i know that ext2 doesn't support lookups on ".". but other file systems do... and Linux is operating in a larger universe these days.

and NB: according to the POSIX standard's description of relative pathname lookup and rmdir, i'd say that, if the cwd is a deleted directory, open("..") should also fail. i haven't checked whether this is true or not. but we do know that ".." is handled similarly in path_walk -- no d_lookup or d_revalidate is done.

----- Original Message -----
From: "Neil Brown" [EMAIL PROTECTED]
To: "Chuck Lever" [EMAIL PROTECTED]
Cc: "Linux FS Developers" [EMAIL PROTECTED]; "Trond Myklebust" [EMAIL PROTECTED]
Sent: Sunday, March 18, 2001 5:40 PM
Subject: Re: 2.4.2: opening deleted directories

On Sunday March 18, [EMAIL PROTECTED] wrote:
> on Linux, if my cwd is a deleted directory, i can still open it. to wit:
>
> notice the open(".") -- it opens the current working directory that is in effect for the "ls" command. but i just deleted that directory from another shell. shouldn't that open(".") return ENOENT?

Note that the error message you expect is "ENOENT" == Error, NO ENTry. The ENTRY th