Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-17 Thread Ulrich Drepper
Ragnar Kjørstad wrote: I think Andreas already wrote that ls --color is the default in most distributions and needs to stat every file. Remove the :ex entry from LS_COLORS and try again. find is already smart enough to not call stat when it's not needed, and make use of d_type when it's

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-17 Thread Matthew Wilcox
On Sun, Dec 17, 2006 at 11:07:27AM -0800, Ulrich Drepper wrote: And how often do the scripts which are in everyday use require such a command? And the same for the other programs. I know that the rsync load is a major factor on kernel.org right now. With all the git trees (particularly the

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-17 Thread Ulrich Drepper
Matthew Wilcox wrote: I know that the rsync load is a major factor on kernel.org right now. That should be quite easy to quantify then. Move the readdir and stat call next to each other in the sources, pass the struct stat around if necessary, and then count the stat calls which do not

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-17 Thread Ragnar Kjørstad
On Sun, Dec 17, 2006 at 01:51:38PM -0800, Ulrich Drepper wrote: Matthew Wilcox wrote: I know that the rsync load is a major factor on kernel.org right now. That should be quite easy to quantify then. Move the readdir and stat call next to each other in the sources, pass the struct stat

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-17 Thread Gary Grider
At 07:57 PM 12/17/2006, Ragnar Kjørstad wrote: On Sun, Dec 17, 2006 at 01:51:38PM -0800, Ulrich Drepper wrote: Matthew Wilcox wrote: I know that the rsync load is a major factor on kernel.org right now. That should be quite easy to quantify then. Move the readdir and stat call next to each

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-16 Thread Andreas Dilger
On Dec 15, 2006 14:37 -0800, Ulrich Drepper wrote: Andreas Dilger wrote: IMHO, once part of the information is optional, why bother making ANY of it required? Consider ls -s on a distributed filesystem that has UID+GID mapping. It doesn't actually NEED to return the UID+GID to ls for each

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-16 Thread Ulrich Drepper
Andreas Dilger wrote: The kernel doesn't necessarily have to clear the fields. The per-field valid flag would determine if that field had valid data or garbage. You cannot leak kernel memory content. Either you clear the field or, in the code which actually copies the data to userlevel, you

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-15 Thread Ulrich Drepper
Andreas Dilger wrote: IMHO, once part of the information is optional, why bother making ANY of it required? Consider ls -s on a distributed filesystem that has UID+GID mapping. It doesn't actually NEED to return the UID+GID to ls for each file, since it won't be shown, but if that is part of

Re: Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread David Chinner
On Tue, Dec 05, 2006 at 05:47:16PM +0100, Latchesar Ionkov wrote: On 12/5/06, Rob Ross wrote: Hi, I agree that it is not feasible to add new system calls every time somebody has a problem, and we don't take adding system calls lightly. However, in this case we're talking

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Andreas Dilger
On Dec 05, 2006 15:55 -0800, Ulrich Drepper wrote: I don't think an accuracy flag is useful at all. Programs don't want to use fuzzy information. If you want a fast 'ls -l' then add a mode which doesn't print the fields which are not provided. Don't provide outdated information.

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Ragnar Kjørstad
On Tue, Dec 05, 2006 at 09:55:16AM -0500, Trond Myklebust wrote: Then again statlite and readdirplus really are the most sane bits of these proposals as they fit nicely into the existing set of APIs. The filehandle idiocy on the other hand is way off into crackpipe land. ...

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Rob Ross
Matthew Wilcox wrote: On Tue, Dec 05, 2006 at 10:07:48AM +0000, Christoph Hellwig wrote: The filehandle idiocy on the other hand is way off into crackpipe land. Right, and it needs to be discarded. Of course, there was a real problem that it addressed, so we need to come up with an acceptable

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Trond Myklebust
On Wed, 2006-12-06 at 13:22 +0100, Ragnar Kjørstad wrote: On Tue, Dec 05, 2006 at 09:55:16AM -0500, Trond Myklebust wrote: Then again statlite and readdirplus really are the most sane bits of these proposals as they fit nicely into the existing set of APIs. The filehandle idiocy on

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Rob Ross
Matthew Wilcox wrote: On Wed, Dec 06, 2006 at 09:04:00AM -0600, Rob Ross wrote: The openg() solution has the following advantages to what you propose. First, it places the burden of the communication of the file handle on the application process, not the file system. That means less work for

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Rob Ross
Ulrich Drepper wrote: Andreas Dilger wrote: Does this mean you are against the statlite() API entirely, or only against the document's use of the flag as a vague accuracy value instead of a hard valid value? I'm against fuzzy values. I've no problems with a bitmap specifying that certain

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Ulrich Drepper
Rob Ross wrote: File size is definitely one of the more difficult of the parameters, either because (a) it isn't stored in one place but is instead derived, or (b) because a lock has to be obtained to guarantee consistency of the returned value. OK, and looking at the man page again, it is

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Ragnar Kjørstad
On Wed, Dec 06, 2006 at 09:42:55AM -0800, Ulrich Drepper wrote: I can't speak for everyone, but ls is the #1 consumer as far as I am concerned. So a syscall for ls alone? I guess the code needs to be checked, but I would think that: * ls * find * rm -r * chown -R * chmod -R * rsync *

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Ulrich Drepper
Ragnar Kjørstad wrote: I guess the code needs to be checked, but I would think that: * ls * find * rm -r * chown -R * chmod -R * rsync * various backup software * imap servers Then somebody do the analysis. And please an analysis which takes into account that some programs might need to be

Re: Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Latchesar Ionkov
On 12/5/06, Rob Ross wrote: I unfortunately don't have data to show exactly where the time was spent, but it's a good guess that it is all the network traffic in the open() case. Is it hard to repeat the test and check what requests (and how much time do they take) PVFS

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-06 Thread Andreas Dilger
On Dec 06, 2006 09:42 -0800, Ulrich Drepper wrote: Rob Ross wrote: File size is definitely one of the more difficult of the parameters, either because (a) it isn't stored in one place but is instead derived, or (b) because a lock has to be obtained to guarantee consistency of the returned

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Christoph Hellwig
On Mon, Dec 04, 2006 at 09:44:08PM -0700, Gary Grider wrote: The one use that some users talk about is just knowing the file is growing is important and useful to them, knowing exactly to the byte how much growth seems less important to them until they close. On these big parallel apps, so

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Matthew Wilcox
On Tue, Dec 05, 2006 at 10:07:48AM +0000, Christoph Hellwig wrote: The filehandle idiocy on the other hand is way off into crackpipe land. Right, and it needs to be discarded. Of course, there was a real problem that it addressed, so we need to come up with an acceptable alternative. The

Re: Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Latchesar Ionkov
On 12/5/06, Rob Ross wrote: Hi, I agree that it is not feasible to add new system calls every time somebody has a problem, and we don't take adding system calls lightly. However, in this case we're talking about an entire *community* of people (high-end computing), not just

Re: Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Latchesar Ionkov
On 12/5/06, Christoph Hellwig wrote: The filehandle idiocy on the other hand is way off into crackpipe land. What is your opinion on giving the file system an option to lookup a file more than one name/directory at a time? I think that all remote file systems can benefit from

Re: Re: Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Latchesar Ionkov
-- Forwarded message -- From: Latchesar Ionkov Date: Dec 5, 2006 6:09 PM Subject: Re: Re: Re: NFSv4/pNFS possible POSIX I/O API standards To: Matthew Wilcox On 12/5/06, Matthew Wilcox wrote: On Tue, Dec 05, 2006 at 05:47

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Peter Staubach
Matthew Wilcox wrote: On Tue, Dec 05, 2006 at 05:47:16PM +0100, Latchesar Ionkov wrote: I think that the main problem is that all these file systems resolve a path name one directory at a time, bringing the server to its knees by the huge amount of requests. I would like to see what the

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Christoph Hellwig
I'd like to Cc Ulrich Drepper in this thread because he's going to decide what APIs will be exposed at the C library level in the end, and he also has quite a lot of experience with the various standardization bodies. Ulrich, this in reply to these API proposals:

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Rob Ross
Matthew Wilcox wrote: On Tue, Dec 05, 2006 at 06:09:03PM +0100, Latchesar Ionkov wrote: It could be wasteful, but it could (most likely) also be useful. Name resolution is not that expensive on either side of the network. The latency introduced by the single-name lookups is :) *is* latency

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Rob Ross
Trond Myklebust wrote: On Tue, 2006-12-05 at 10:07 +0000, Christoph Hellwig wrote: ...and we have pointed out how nicely this ignores the realities of current caching models. There is no need for a readdirplus() system call. There may be a need for a caching barrier, but AFAICS that is all. I

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Sage Weil
On Tue, 5 Dec 2006, Christoph Hellwig wrote: Readdir plus is a little more involved. For one thing the actual kernel implementation will be a variant of getdents() call anyway while a readdirplus would only be a library level interface. At the actual C prototype level I would rename d_stat_err

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Trond Myklebust
On Tue, 2006-12-05 at 16:11 -0600, Rob Ross wrote: Trond Myklebust wrote: b) quite unnatural to impose caching semantics on all the directory _entries_ using a syscall that refers to the directory itself (see the explanations by both myself and Peter Staubach

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-05 Thread Ulrich Drepper
Christoph Hellwig wrote: Ulrich, this in reply to these API proposals: I know the documents. The HECWG was actually supposed to submit an actual draft to the OpenGroup-internal working group but I haven't seen anything yet. I'm not opposed to getting real-world experience first. So

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-04 Thread Trond Myklebust
On Mon, 2006-12-04 at 00:32 -0700, Andreas Dilger wrote: I'm wondering if a corresponding opendirplus() (or similar) would also be appropriate to inform the kernel/filesystem that readdirplus() will follow, and stat information should be gathered/buffered. Or do most implementations

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-04 Thread Peter Staubach
Sage Weil wrote: On Fri, 1 Dec 2006, Trond Myklebust wrote: I'm quite happy with a proposal for a statlite(). I'm objecting to readdirplus() because I can't see that it offers you anything useful. You haven't provided an example of an application which would clearly benefit from a readdirplus()

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-04 Thread Rob Ross
Hi, I agree that it is not feasible to add new system calls every time somebody has a problem, and we don't take adding system calls lightly. However, in this case we're talking about an entire *community* of people (high-end computing), not just one or two people. Of course it may still be

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-04 Thread Rob Ross
Hi all, I don't think that the group intended that there be an opendirplus(); rather readdirplus() would simply be called instead of the usual readdir(). We should clarify that. Regarding Peter Staubach's comments about no one ever using the readdirplus() call; well, if people weren't

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-04 Thread Trond Myklebust
On Mon, 2006-12-04 at 18:59 -0600, Rob Ross wrote: Hi all, I don't think that the group intended that there be an opendirplus(); rather readdirplus() would simply be called instead of the usual readdir(). We should clarify that. Regarding Peter Staubach's comments about no one ever

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-03 Thread Sage Weil
On Sat, 2 Dec 2006, Andreas Dilger wrote: Just to be clear, I have no desire to include any kind of synchronization semantics to readdirplus() that is also being discussed in this thread. Just the ability to bundle select stat info along with the readdir information, and to allow stat to not

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-03 Thread Andreas Dilger
On Dec 03, 2006 08:10 -0800, Sage Weil wrote: My only concern is the at least as recent as the opendir() part, in contrast to statlite(), which has undefined recentness of its result for fields not specified in the mask. Ideally, I'd like to see readdirplus() also take a statlite() style

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-02 Thread Andreas Dilger
On Dec 01, 2006 13:07 -0500, Trond Myklebust wrote: The more interesting case is multiple clients in the same directory. In order to provide strong consistency, both stat() and readdir() have to talk to the server (or more complicated leasing mechanisms are needed). Why would that be

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-01 Thread Trond Myklebust
On Thu, 2006-11-30 at 23:08 -0800, Sage Weil wrote: I mean atomic only in the sense that the stat result returned by readdirplus() would reflect the file state at some point during the time consumed by that system call. In contrast, when you call stat() separately, it's expected that the

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-01 Thread Sage Weil
On Fri, 1 Dec 2006, Trond Myklebust wrote: 'ls --color' and 'find' don't give a toss about most of the arguments from 'stat()'. They just want to know what kind of filesystem object they are dealing with. We already provide that information in the readdir() syscall via the 'd_type' field.
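[Editor's illustration, not part of the original message.] The point about 'd_type' can be sketched in Python, whose os.scandir() exposes the file type gathered while reading the directory, so that classifying entries usually needs no separate stat() call per file. This is only a sketch under assumptions: the helper name classify_entries is hypothetical, and on filesystems that do not report a type in the directory entry the library falls back to a stat-like call internally.

```python
import os


def classify_entries(path):
    """Map each entry name in `path` to 'symlink', 'dir', 'file' or 'other'.

    Hypothetical helper for illustration: it relies on the type
    information os.scandir() caches from the directory read, rather
    than calling os.stat() on every name.
    """
    kinds = {}
    with os.scandir(path) as entries:
        for entry in entries:
            # follow_symlinks=False keeps the check on the entry itself,
            # which is what the directory-entry type describes.
            if entry.is_symlink():
                kinds[entry.name] = "symlink"
            elif entry.is_dir(follow_symlinks=False):
                kinds[entry.name] = "dir"
            elif entry.is_file(follow_symlinks=False):
                kinds[entry.name] = "file"
            else:
                kinds[entry.name] = "other"
    return kinds


if __name__ == "__main__":
    # List the current directory with one type label per entry.
    for name, kind in sorted(classify_entries(".").items()):
        print(kind, name)
```

A tool that only needs the object type, such as a colorizing listing, could stop here; sizes and timestamps would still require a stat-style call per entry.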

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-01 Thread Sage Weil
On Fri, 1 Dec 2006, Trond Myklebust wrote: I'm quite happy with a proposal for a statlite(). I'm objecting to readdirplus() because I can't see that it offers you anything useful. You haven't provided an example of an application which would clearly benefit from a readdirplus() interface instead

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-01 Thread Trond Myklebust
On Fri, 2006-12-01 at 10:42 -0800, Sage Weil wrote: On Fri, 1 Dec 2006, Trond Myklebust wrote: I'm quite happy with a proposal for a statlite(). I'm objecting to readdirplus() because I can't see that it offers you anything useful. You haven't provided an example of an application which

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-01 Thread Sage Weil
On Fri, 1 Dec 2006, Trond Myklebust wrote: Also, it's a tiring and trivial example, but even the 'ls -al' scenario isn't ideally addressed by readdir()+statlite(), since statlite() might return size/mtime from before 'ls -al' was executed by the user. stat() will do the same. It does with

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-01 Thread Rob Ross
Hi all, The use model for openg() and openfh() (renamed sutoc()) is n processes spread across a large cluster simultaneously opening a file. The challenge is to avoid to the greatest extent possible incurring O(n) FS interactions. To do that we need to allow actions of one process to be

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-12-01 Thread Latchesar Ionkov
Hi, One general remark: I don't think it is feasible to add new system calls every time somebody has a problem. Usually there are (may be not that good) solutions that don't require big changes and work well enough. Let's change the interface and make the life of many filesystem developers

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-30 Thread Christoph Hellwig
On Wed, Nov 29, 2006 at 12:26:22AM -0800, Brad Boyer wrote: For a more extreme case, hfs and hfsplus don't even have a separation between directory entries and inode information. The code creates this separation synthetically to match the expectations of the kernel. During a readdir(), the

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-30 Thread Christoph Hellwig
On Wed, Nov 29, 2006 at 10:25:07AM +0000, Steven Whitehouse wrote: I agree that this is a good plan, but I'd been looking at this idea from a different direction recently. The in kernel NFS server calls vfs_getattr from its filldir routine for readdirplus and this means not only are we unable

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-30 Thread Sage Weil
On Thu, 30 Nov 2006, Christoph Hellwig wrote: On Wed, Nov 29, 2006 at 12:26:22AM -0800, Brad Boyer wrote: For a more extreme case, hfs and hfsplus don't even have a separation between directory entries and inode information. The code creates this separation synthetically to match the

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-29 Thread Andreas Dilger
On Nov 29, 2006 09:04 +0000, Christoph Hellwig wrote: - readdirplus This one is completely unneeded as a kernel API. Doing readdir plus calls on the wire makes a lot of sense and we already do that for NFSv3+. Doing this at the syscall layer just means kernel

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-29 Thread Matthew Wilcox
On Wed, Nov 29, 2006 at 09:04:50AM +0000, Christoph Hellwig wrote: - openg/sutoc No way. We already have a very nice file descriptor abstraction. You can pass file descriptors over unix sockets just fine. Yes, but it behaves like dup(). Gary replied to me off-list (which I

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-29 Thread Matthew Wilcox
On Wed, Nov 29, 2006 at 05:23:13AM -0700, Matthew Wilcox wrote: On Wed, Nov 29, 2006 at 09:04:50AM +0000, Christoph Hellwig wrote: - openg/sutoc No way. We already have a very nice file descriptor abstraction. You can pass file descriptors over unix sockets just fine. Yes,

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-29 Thread Gary Grider
At 05:35 AM 11/29/2006, Matthew Wilcox wrote: On Wed, Nov 29, 2006 at 05:23:13AM -0700, Matthew Wilcox wrote: On Wed, Nov 29, 2006 at 09:04:50AM +0000, Christoph Hellwig wrote: - openg/sutoc No way. We already have a very nice file descriptor abstraction. You can pass file

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-29 Thread Christoph Hellwig
Please don't repeat the stupid marketroid speech. If you want this to go anywhere please get someone with an actual clue to talk to us instead of you. Thanks a lot.

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-29 Thread Brad Boyer
On Wed, Nov 29, 2006 at 10:18:42AM +0000, Anton Altaparmakov wrote: To take NTFS as an example I know something about, the directory entry caches the a/c/m time as well as the data file size (needed for ls) and the allocated on disk file size (needed for du) as well as the inode number

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-28 Thread Andreas Dilger
On Nov 28, 2006 05:54 +0000, Christoph Hellwig wrote: What crack do you guys have been smoking? As usual, Christoph is a model of diplomacy :-). On Mon, Nov 27, 2006 at 09:34:05PM -0700, Gary Grider wrote: readx/writex - scattergather readwrite - more appropriate and complete than the

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-28 Thread Anton Altaparmakov
On Tue, 2006-11-28 at 03:54 -0700, Andreas Dilger wrote: On Nov 28, 2006 05:54 +0000, Christoph Hellwig wrote: What crack do you guys have been smoking? As usual, Christoph is a model of diplomacy :-). On Mon, Nov 27, 2006 at 09:34:05PM -0700, Gary Grider wrote: readx/writex -

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-28 Thread Matthew Wilcox
On Mon, Nov 27, 2006 at 09:34:05PM -0700, Gary Grider wrote: Things like openg() - one process opens a file and gets a key that is passed to lots of processes which use the key to get a handle (great for thousands of processes opening a file) I don't understand how this leads to a more

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-28 Thread Russell Cattelan
On Tue, 2006-11-28 at 03:54 -0700, Andreas Dilger wrote: statlite() - asking for stat info without requiring completely accurate info like dates and sizes. This is good for running stat against a file that is open by hundreds of processes which currently forces callbacks and

NFSv4/pNFS possible POSIX I/O API standards

2006-11-27 Thread Gary Grider
NFS developers, a group of people from the High End Computing Interagency Working Group File Systems and I/O (HECIWG FSIO), which is a funding oversight group for file systems and storage government funded research, has formed a project to extend the POSIX I/O API. The extensions have

Re: NFSv4/pNFS possible POSIX I/O API standards

2006-11-27 Thread Christoph Hellwig
What crack do you guys have been smoking? On Mon, Nov 27, 2006 at 09:34:05PM -0700, Gary Grider wrote: NFS developers, a group of people from the High End Computing Interagency Working Group File Systems and I/O (HECIWG FSIO), which is a funding oversight group for file systems and