Re: For review: open_by_name_at(2) man page [v2]
Hi Mike, On 03/19/2014 07:42 AM, Mike Frysinger wrote: > On Tue 18 Mar 2014 13:55:15 Michael Kerrisk wrote: >> The >> .I flags >> argument is a bit mask constructed by ORing together >> zero or more of the following value: >> .TP >> .B AT_EMPTY_PATH >> Allow >> .I pathname >> to be an empty string. >> See above. >> (which may have been obtained using the >> .BR open (2) >> .B O_PATH >> flag). >> .TP >> .B AT_SYMLINK_FOLLOW >> By default, >> .BR name_to_handle_at () >> does not dereference >> .I pathname >> if it is a symbolic link. >> The flag >> .B AT_SYMLINK_FOLLOW >> can be specified in >> .I flags >> to cause >> .I pathname >> to be dereferenced if it is a symbolic link. > > this section is only talking about |flags|, and further this part is only > talking about AT_SYMLINK_FOLLOW. so this last sentence sounds super > redundant. > how about reversing the sentence order so that both are implicit like is done > in the openat() page and the description of O_NOFOLLOW ? I'm not sure that I completely understand you here, but I agree that this could be better. I've rewritten somewhat. >> .B ENOTDIR >> The file descriptor supplied in >> .I dirfd >> does not refer to a directory, >> and it it is not the case that both > > "it" is duplicated Fixed. >> .SS Obtaining a persistent filesystem ID >> The mount IDs in >> .IR /proc/self/mountinfo >> can be reused as filesystems are unmounted and mounted. >> Therefore, the mount ID returned by >> .BR name_to_handle_at (3) > > should be () and not (3) Fixed. > side note: this seems like an easy error to script for ... Yep, I've got some scripts that I run manually now and then to check for this sort of stuff. >> $ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP > > aber, ich spreche kein Deutsch :( > > do we have a standard about sticking to english ? i wonder if people are > more > likely to be confused or to appreciate it ... Fair enough. I'm too influenced by recent work on the locale pages (and family conversations ;-)). I'll switch it to English >> #define errExit(msg)do { perror(msg); exit(EXIT_FAILURE); \\ >> } while (0) > > i wonder if err.h makes sense now that this is a man page for completely > linux-specific syscalls :). and you use _GNU_SOURCE. I'm not really convinced about using these functions, but I'll reflect on it more. >> int >> main(int argc, char *argv[]) >> { >> struct file_handle *fhp; >> int mount_id, fhsize, s; >> >> if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) { > > argc != 2 ? Yes, some cruft crept in. >> /* Allocate file_handle structure */ >> >> fhsize = sizeof(struct file_handle *); > > pretty sure this is wrong > as sizeof() here returns the size of a pointer, not > the size of the struct. it's why i prefer the form: > > fhsize = sizeof(*fhp); > > less typing and harder to screw up by accident. Yep, changed. > granted, the case below won't crash since the kernel only reads/writes > sizeof(unsigned int) and i'm not aware of any system where that is larger > than > sizeof(void *), but it's still wrong :). > >> s = name_to_handle_at(AT_FDCWD, argv[1], fhp, _id, 0); > > another personal style: create dedicated variables for each arg you unpack > out > of argv[1]. it's generally OK when you only take one arg, but when you get > more than one, you end up flipping back and forth between the usage trying to > figure out what index 1 represents instead of focusing on what the code is > doing. > const char *pathname = argv[1]; Yup. >> fhsize = sizeof(struct file_handle) + fhp\->handle_bytes; > > fhsize += fhp->handle_bytes ? > > it's the same, but i think nicer ;) Depends on your perspective. It relies on no one changing the code so that fhsize is modified after the earlier initialization. And also, with this line, I see exactly what is going on, in one place. I'll leave as is. >> /* Write mount ID, file handle size, and file handle to stdout, >>for later reuse by t_open_by_handle_at.c */ >> >> if (write(STDOUT_FILENO, _id, sizeof(int)) != sizeof(int) || >> write(STDOUT_FILENO, , sizeof(int)) != sizeof(int) || >> write(STDOUT_FILENO, fhp, fhsize) != fhsize) { > > seems like a whole lot of code spew for a simple printf() ? you'd have to > adjust the other program to use scanf(), but seems like the end result would > be nicer for users to experiment with ? Yes. I'd already reflected on exactly that and made a change to using text formats. >> static int >> open_mount_path_by_id(int mount_id) >> { >> char *linep; >> size_t lsize; >> char mount_path[PATH_MAX]; >> int fmnt_id, fnd, nread; > > could we buy a few more letters for these vars ? i guess fmnt_id is the > filesystem mount id, and fnd is "find". When I was a kid, you had to pay a dollar for each letter... (I've made a few changes.) > also, getline() returns a ssize_t, not an int. Fixed.
Re: For review: open_by_name_at(2) man page [v2]
[CC =+ Al Viro] Hi Neil, On 03/19/2014 05:13 AM, NeilBrown wrote: > On Tue, 18 Mar 2014 13:55:15 +0100 "Michael Kerrisk (man-pages)" > wrote: > >> Hi Aneesh, (and others) >> >> After integrating review comments from NeilBown and Christoph Hellwig, >> here is draft 2 of a man page I've written for name_to_handle_at(2) and >> open_by_name_at(2). Especially thanks to Neil's comments, several parts >> of the page underwent a substantial rewrite. Would you be willing to >> review it please, and let me know of any corrections/improvements? [various typos you reported fixed now.] >> .TP >> .B AT_EMPTY_PATH >> Allow >> .I pathname >> to be an empty string. >> See above. >> (which may have been obtained using the >> .BR open (2) >> .B O_PATH >> flag). > > What "may have been obtained" ?? Crufty text. gone now. >> The >> .I flags >> argument >> is as for >> .BR open (2). >> .\" FIXME: Confirm that the following is intended behavior. >> .\"(It certainly seems to be the behavior, from experimenting.) >> If >> .I handle >> refers to a symbolic link, the caller must specify the >> .B O_PATH >> flag, and the symbolic link is not dereferenced (the >> .B O_NOFOLLOW >> flag, if specified, is ignored). > > It certainly sounds like reasonable behaviour. I cannot comment on intention > though. > Are you bothered that O_PATH is needed for symlinks? No. > An fd on a symlink is a > sufficiently unusual thing that it seems reasonable for a programmer to > explicitly say they are expecting one. I think the point is this: If you have a file handle for a symlink, then you can't follow the symlink, which is why you must specify O_PATH and O_NOFOLLOW becomes irrelevant. I'm curious about the rationale though. I suspect it's something like: the process receiving the handle doesn't have enough information for the symlink to be interpreted, I think because it can;t reliably determine what directory the link lives in. Possibly Al Viro or Aneesh can confirm. >> In the event of an error, both system calls return \-1 and set >> .I errno >> to indicate the cause of the error. >> .SH ERRORS >> .BR name_to_handle_at () >> and >> .BR open_by_handle_at () >> can fail for the same errors as >> .BR openat (2). >> In addition, they can fail with the errors noted below. > > Should you mention EFAULT if mount_id or handle are not valid pointers? Done. >> Not all filesystem types support the translation of pathnames to >> file handles. >> .\" FIXME NeilBrown noted: >> .\"ESTALE is also returned if the filesystem does not support >> .\"file-handle -> file mappings. >> .\"On filesystems which don't provide export_operations (/sys /proc >> .\"ubifs romfs cramfs nfs coda ... several others) name_to_handle_at >> .\"will produce a generic handle using the 32 bit inode and 32 bit >> .\"i_generation. open_by_name_at given this (or any) filehandle >> .\"will fail with ESTALE. >> .\" However, on /proc and /sys, at least, name_to_handle_at() fails with >> .\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the >> .\" same error as for an invalid file handle) in the above circumstances? > > This is all wrong - discard it :-) Yup. Gone now ;-). Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
Hi Neil, On 03/18/2014 11:24 PM, NeilBrown wrote: > On Tue, 18 Mar 2014 13:37:15 +0100 "Michael Kerrisk (man-pages)" > wrote: > >> On 03/18/2014 10:43 AM, Christoph Hellwig wrote: >>> On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: ESTALE is also returned if the filesystem does not support file-handle -> file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. >>> >>> Do we? Seems like the code is erroring out early if there are no >>> export_ops? >> >> It appears to me that Neil's statement isn't correct, at least for /proc >> and /sys (see my other mail, to Neil). I'm unsure about whether it is true >> for some of those other FSes thought. > > > Indeed, I was wrong. > > I was looking at > > int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid, >int *max_len, struct inode *parent) > { > const struct export_operations *nop = inode->i_sb->s_export_op; > > if (nop && nop->encode_fh) > return nop->encode_fh(inode, fid->raw, max_len, parent); > > return export_encode_fh(inode, fid, max_len, parent); > } > > > which uses a default if there is no 'nop'. > > However do_sys_name_to_handle() contains > > if (!path->dentry->d_sb->s_export_op || > !path->dentry->d_sb->s_export_op->fh_to_dentry) > return -EOPNOTSUPP; > > long before export_encode_inode_fh() gets called. So the default isn't used. Okay. > I would have thought that exportfs_encode_inode_fh would never get called if > there were no s_export_op pointer - certainly name_to_handle_at and nfsd > would never call it in that case. > However it seems that > > This routine will be used to generate a file handle in fdinfo output for > inotify subsystem, where if no s_export_op present the general > export_encode_fh should be used. Thus add a test if s_export_op present > inside exportfs_encode_fh itself. > > according to > > commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6 > Author: Cyrill Gorcunov > Date: Mon Dec 17 16:05:06 2012 -0800 > > > I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD > when $FD is an inotify fd which is watching the particular file. I > wouldn't have expected that, but maybe it is a good idea. Yes, it does--I tested it, and it works! I was unaware of this feature, though I'm not sure that I'll add anything to a man page just yet. > So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP. > So if you get ESTALE from open_by_handle_at(), then it really is a stale > handle. Sorry for the confusion. Yup, I've updated the page now. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page [v2]
On Tue 18 Mar 2014 13:55:15 Michael Kerrisk wrote: > The > .I flags > argument is a bit mask constructed by ORing together > zero or more of the following value: > .TP > .B AT_EMPTY_PATH > Allow > .I pathname > to be an empty string. > See above. > (which may have been obtained using the > .BR open (2) > .B O_PATH > flag). > .TP > .B AT_SYMLINK_FOLLOW > By default, > .BR name_to_handle_at () > does not dereference > .I pathname > if it is a symbolic link. > The flag > .B AT_SYMLINK_FOLLOW > can be specified in > .I flags > to cause > .I pathname > to be dereferenced if it is a symbolic link. this section is only talking about |flags|, and further this part is only talking about AT_SYMLINK_FOLLOW. so this last sentence sounds super redundant. how about reversing the sentence order so that both are implicit like is done in the openat() page and the description of O_NOFOLLOW ? > .B ENOTDIR > The file descriptor supplied in > .I dirfd > does not refer to a directory, > and it it is not the case that both "it" is duplicated > .SS Obtaining a persistent filesystem ID > The mount IDs in > .IR /proc/self/mountinfo > can be reused as filesystems are unmounted and mounted. > Therefore, the mount ID returned by > .BR name_to_handle_at (3) should be () and not (3) side note: this seems like an easy error to script for ... > $ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP aber, ich spreche kein Deutsch :( do we have a standard about sticking to english ? i wonder if people are more likely to be confused or to appreciate it ... > #define errExit(msg)do { perror(msg); exit(EXIT_FAILURE); \\ > } while (0) i wonder if err.h makes sense now that this is a man page for completely linux-specific syscalls :). and you use _GNU_SOURCE. > int > main(int argc, char *argv[]) > { > struct file_handle *fhp; > int mount_id, fhsize, s; > > if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) { argc != 2 ? > /* Allocate file_handle structure */ > > fhsize = sizeof(struct file_handle *); pretty sure this is wrong as sizeof() here returns the size of a pointer, not the size of the struct. it's why i prefer the form: fhsize = sizeof(*fhp); less typing and harder to screw up by accident. granted, the case below won't crash since the kernel only reads/writes sizeof(unsigned int) and i'm not aware of any system where that is larger than sizeof(void *), but it's still wrong :). > s = name_to_handle_at(AT_FDCWD, argv[1], fhp, _id, 0); another personal style: create dedicated variables for each arg you unpack out of argv[1]. it's generally OK when you only take one arg, but when you get more than one, you end up flipping back and forth between the usage trying to figure out what index 1 represents instead of focusing on what the code is doing. const char *pathname = argv[1]; > fhsize = sizeof(struct file_handle) + fhp\->handle_bytes; fhsize += fhp->handle_bytes ? it's the same, but i think nicer ;) > /* Write mount ID, file handle size, and file handle to stdout, >for later reuse by t_open_by_handle_at.c */ > > if (write(STDOUT_FILENO, _id, sizeof(int)) != sizeof(int) || > write(STDOUT_FILENO, , sizeof(int)) != sizeof(int) || > write(STDOUT_FILENO, fhp, fhsize) != fhsize) { seems like a whole lot of code spew for a simple printf() ? you'd have to adjust the other program to use scanf(), but seems like the end result would be nicer for users to experiment with ? > static int > open_mount_path_by_id(int mount_id) > { > char *linep; > size_t lsize; > char mount_path[PATH_MAX]; > int fmnt_id, fnd, nread; could we buy a few more letters for these vars ? i guess fmnt_id is the filesystem mount id, and fnd is "find". also, getline() returns a ssize_t, not an int. > FILE *fp; > > fp = fopen("/proc/self/mountinfo", "r"); only one space before the = i would encourage using the "e" flag whenever possible in the hopes that someone might start using it in their own code base. fp = fopen("/proc/self/mountinfo", "re"); > for (fnd = 0; !fnd ; ) { in my experience, seems like a while() loop makes more sense when you're implementing a while() loop ... fnd = 0; while (!fnd) { > linep = NULL; > nread = getline(, , fp); this works, but it's unusual when using getline() as it kind of defeats the purpose of using the dyn allocation feature. fnd = 0; linep = NULL; while (!fnd) { nread = getline(, , fp); ... } free(linep); i don't think it complicates the code much more ? > if (nread == \-1) > break; > > nread = sscanf(linep, "%d %*d %*s %*s %s", _id, mount_path); indent is off here > return open(mount_path, O_RDONLY | O_DIRECTORY); O_CLOEXEC for funsies ? > int > main(int argc, char *argv[]) > { > struct
Re: For review: open_by_name_at(2) man page [v2]
On Tue 18 Mar 2014 13:55:15 Michael Kerrisk wrote: The .I flags argument is a bit mask constructed by ORing together zero or more of the following value: .TP .B AT_EMPTY_PATH Allow .I pathname to be an empty string. See above. (which may have been obtained using the .BR open (2) .B O_PATH flag). .TP .B AT_SYMLINK_FOLLOW By default, .BR name_to_handle_at () does not dereference .I pathname if it is a symbolic link. The flag .B AT_SYMLINK_FOLLOW can be specified in .I flags to cause .I pathname to be dereferenced if it is a symbolic link. this section is only talking about |flags|, and further this part is only talking about AT_SYMLINK_FOLLOW. so this last sentence sounds super redundant. how about reversing the sentence order so that both are implicit like is done in the openat() page and the description of O_NOFOLLOW ? .B ENOTDIR The file descriptor supplied in .I dirfd does not refer to a directory, and it it is not the case that both it is duplicated .SS Obtaining a persistent filesystem ID The mount IDs in .IR /proc/self/mountinfo can be reused as filesystems are unmounted and mounted. Therefore, the mount ID returned by .BR name_to_handle_at (3) should be () and not (3) side note: this seems like an easy error to script for ... $ \fBecho 'Kannst du bitte überlegen?' cecilia.txt\fP aber, ich spreche kein Deutsch :( do we have a standard about sticking to english ? i wonder if people are more likely to be confused or to appreciate it ... #define errExit(msg)do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) i wonder if err.h makes sense now that this is a man page for completely linux-specific syscalls :). and you use _GNU_SOURCE. int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fhsize, s; if (argc 2 || strcmp(argv[1], \-\-help) == 0) { argc != 2 ? /* Allocate file_handle structure */ fhsize = sizeof(struct file_handle *); pretty sure this is wrong as sizeof() here returns the size of a pointer, not the size of the struct. it's why i prefer the form: fhsize = sizeof(*fhp); less typing and harder to screw up by accident. granted, the case below won't crash since the kernel only reads/writes sizeof(unsigned int) and i'm not aware of any system where that is larger than sizeof(void *), but it's still wrong :). s = name_to_handle_at(AT_FDCWD, argv[1], fhp, mount_id, 0); another personal style: create dedicated variables for each arg you unpack out of argv[1]. it's generally OK when you only take one arg, but when you get more than one, you end up flipping back and forth between the usage trying to figure out what index 1 represents instead of focusing on what the code is doing. const char *pathname = argv[1]; fhsize = sizeof(struct file_handle) + fhp\-handle_bytes; fhsize += fhp-handle_bytes ? it's the same, but i think nicer ;) /* Write mount ID, file handle size, and file handle to stdout, for later reuse by t_open_by_handle_at.c */ if (write(STDOUT_FILENO, mount_id, sizeof(int)) != sizeof(int) || write(STDOUT_FILENO, fhsize, sizeof(int)) != sizeof(int) || write(STDOUT_FILENO, fhp, fhsize) != fhsize) { seems like a whole lot of code spew for a simple printf() ? you'd have to adjust the other program to use scanf(), but seems like the end result would be nicer for users to experiment with ? static int open_mount_path_by_id(int mount_id) { char *linep; size_t lsize; char mount_path[PATH_MAX]; int fmnt_id, fnd, nread; could we buy a few more letters for these vars ? i guess fmnt_id is the filesystem mount id, and fnd is find. also, getline() returns a ssize_t, not an int. FILE *fp; fp = fopen(/proc/self/mountinfo, r); only one space before the = i would encourage using the e flag whenever possible in the hopes that someone might start using it in their own code base. fp = fopen(/proc/self/mountinfo, re); for (fnd = 0; !fnd ; ) { in my experience, seems like a while() loop makes more sense when you're implementing a while() loop ... fnd = 0; while (!fnd) { linep = NULL; nread = getline(linep, lsize, fp); this works, but it's unusual when using getline() as it kind of defeats the purpose of using the dyn allocation feature. fnd = 0; linep = NULL; while (!fnd) { nread = getline(linep, lsize, fp); ... } free(linep); i don't think it complicates the code much more ? if (nread == \-1) break; nread = sscanf(linep, %d %*d %*s %*s %s, fmnt_id, mount_path); indent is off here return open(mount_path, O_RDONLY | O_DIRECTORY); O_CLOEXEC for funsies ? int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fd, mount_fd, fhsize;
Re: For review: open_by_name_at(2) man page
Hi Neil, On 03/18/2014 11:24 PM, NeilBrown wrote: On Tue, 18 Mar 2014 13:37:15 +0100 Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote: On 03/18/2014 10:43 AM, Christoph Hellwig wrote: On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: ESTALE is also returned if the filesystem does not support file-handle - file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. Do we? Seems like the code is erroring out early if there are no export_ops? It appears to me that Neil's statement isn't correct, at least for /proc and /sys (see my other mail, to Neil). I'm unsure about whether it is true for some of those other FSes thought. Indeed, I was wrong. I was looking at int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid, int *max_len, struct inode *parent) { const struct export_operations *nop = inode-i_sb-s_export_op; if (nop nop-encode_fh) return nop-encode_fh(inode, fid-raw, max_len, parent); return export_encode_fh(inode, fid, max_len, parent); } which uses a default if there is no 'nop'. However do_sys_name_to_handle() contains if (!path-dentry-d_sb-s_export_op || !path-dentry-d_sb-s_export_op-fh_to_dentry) return -EOPNOTSUPP; long before export_encode_inode_fh() gets called. So the default isn't used. Okay. I would have thought that exportfs_encode_inode_fh would never get called if there were no s_export_op pointer - certainly name_to_handle_at and nfsd would never call it in that case. However it seems that This routine will be used to generate a file handle in fdinfo output for inotify subsystem, where if no s_export_op present the general export_encode_fh should be used. Thus add a test if s_export_op present inside exportfs_encode_fh itself. according to commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6 Author: Cyrill Gorcunov gorcu...@openvz.org Date: Mon Dec 17 16:05:06 2012 -0800 I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD when $FD is an inotify fd which is watching the particular file. I wouldn't have expected that, but maybe it is a good idea. Yes, it does--I tested it, and it works! I was unaware of this feature, though I'm not sure that I'll add anything to a man page just yet. So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP. So if you get ESTALE from open_by_handle_at(), then it really is a stale handle. Sorry for the confusion. Yup, I've updated the page now. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page [v2]
[CC =+ Al Viro] Hi Neil, On 03/19/2014 05:13 AM, NeilBrown wrote: On Tue, 18 Mar 2014 13:55:15 +0100 Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote: Hi Aneesh, (and others) After integrating review comments from NeilBown and Christoph Hellwig, here is draft 2 of a man page I've written for name_to_handle_at(2) and open_by_name_at(2). Especially thanks to Neil's comments, several parts of the page underwent a substantial rewrite. Would you be willing to review it please, and let me know of any corrections/improvements? [various typos you reported fixed now.] .TP .B AT_EMPTY_PATH Allow .I pathname to be an empty string. See above. (which may have been obtained using the .BR open (2) .B O_PATH flag). What may have been obtained ?? Crufty text. gone now. The .I flags argument is as for .BR open (2). .\ FIXME: Confirm that the following is intended behavior. .\(It certainly seems to be the behavior, from experimenting.) If .I handle refers to a symbolic link, the caller must specify the .B O_PATH flag, and the symbolic link is not dereferenced (the .B O_NOFOLLOW flag, if specified, is ignored). It certainly sounds like reasonable behaviour. I cannot comment on intention though. Are you bothered that O_PATH is needed for symlinks? No. An fd on a symlink is a sufficiently unusual thing that it seems reasonable for a programmer to explicitly say they are expecting one. I think the point is this: If you have a file handle for a symlink, then you can't follow the symlink, which is why you must specify O_PATH and O_NOFOLLOW becomes irrelevant. I'm curious about the rationale though. I suspect it's something like: the process receiving the handle doesn't have enough information for the symlink to be interpreted, I think because it can;t reliably determine what directory the link lives in. Possibly Al Viro or Aneesh can confirm. In the event of an error, both system calls return \-1 and set .I errno to indicate the cause of the error. .SH ERRORS .BR name_to_handle_at () and .BR open_by_handle_at () can fail for the same errors as .BR openat (2). In addition, they can fail with the errors noted below. Should you mention EFAULT if mount_id or handle are not valid pointers? Done. Not all filesystem types support the translation of pathnames to file handles. .\ FIXME NeilBrown noted: .\ESTALE is also returned if the filesystem does not support .\file-handle - file mappings. .\On filesystems which don't provide export_operations (/sys /proc .\ubifs romfs cramfs nfs coda ... several others) name_to_handle_at .\will produce a generic handle using the 32 bit inode and 32 bit .\i_generation. open_by_name_at given this (or any) filehandle .\will fail with ESTALE. .\ However, on /proc and /sys, at least, name_to_handle_at() fails with .\ EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the .\ same error as for an invalid file handle) in the above circumstances? This is all wrong - discard it :-) Yup. Gone now ;-). Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page [v2]
Hi Mike, On 03/19/2014 07:42 AM, Mike Frysinger wrote: On Tue 18 Mar 2014 13:55:15 Michael Kerrisk wrote: The .I flags argument is a bit mask constructed by ORing together zero or more of the following value: .TP .B AT_EMPTY_PATH Allow .I pathname to be an empty string. See above. (which may have been obtained using the .BR open (2) .B O_PATH flag). .TP .B AT_SYMLINK_FOLLOW By default, .BR name_to_handle_at () does not dereference .I pathname if it is a symbolic link. The flag .B AT_SYMLINK_FOLLOW can be specified in .I flags to cause .I pathname to be dereferenced if it is a symbolic link. this section is only talking about |flags|, and further this part is only talking about AT_SYMLINK_FOLLOW. so this last sentence sounds super redundant. how about reversing the sentence order so that both are implicit like is done in the openat() page and the description of O_NOFOLLOW ? I'm not sure that I completely understand you here, but I agree that this could be better. I've rewritten somewhat. .B ENOTDIR The file descriptor supplied in .I dirfd does not refer to a directory, and it it is not the case that both it is duplicated Fixed. .SS Obtaining a persistent filesystem ID The mount IDs in .IR /proc/self/mountinfo can be reused as filesystems are unmounted and mounted. Therefore, the mount ID returned by .BR name_to_handle_at (3) should be () and not (3) Fixed. side note: this seems like an easy error to script for ... Yep, I've got some scripts that I run manually now and then to check for this sort of stuff. $ \fBecho 'Kannst du bitte überlegen?' cecilia.txt\fP aber, ich spreche kein Deutsch :( do we have a standard about sticking to english ? i wonder if people are more likely to be confused or to appreciate it ... Fair enough. I'm too influenced by recent work on the locale pages (and family conversations ;-)). I'll switch it to English #define errExit(msg)do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) i wonder if err.h makes sense now that this is a man page for completely linux-specific syscalls :). and you use _GNU_SOURCE. I'm not really convinced about using these functions, but I'll reflect on it more. int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fhsize, s; if (argc 2 || strcmp(argv[1], \-\-help) == 0) { argc != 2 ? Yes, some cruft crept in. /* Allocate file_handle structure */ fhsize = sizeof(struct file_handle *); pretty sure this is wrong blush as sizeof() here returns the size of a pointer, not the size of the struct. it's why i prefer the form: fhsize = sizeof(*fhp); less typing and harder to screw up by accident. Yep, changed. granted, the case below won't crash since the kernel only reads/writes sizeof(unsigned int) and i'm not aware of any system where that is larger than sizeof(void *), but it's still wrong :). s = name_to_handle_at(AT_FDCWD, argv[1], fhp, mount_id, 0); another personal style: create dedicated variables for each arg you unpack out of argv[1]. it's generally OK when you only take one arg, but when you get more than one, you end up flipping back and forth between the usage trying to figure out what index 1 represents instead of focusing on what the code is doing. const char *pathname = argv[1]; Yup. fhsize = sizeof(struct file_handle) + fhp\-handle_bytes; fhsize += fhp-handle_bytes ? it's the same, but i think nicer ;) Depends on your perspective. It relies on no one changing the code so that fhsize is modified after the earlier initialization. And also, with this line, I see exactly what is going on, in one place. I'll leave as is. /* Write mount ID, file handle size, and file handle to stdout, for later reuse by t_open_by_handle_at.c */ if (write(STDOUT_FILENO, mount_id, sizeof(int)) != sizeof(int) || write(STDOUT_FILENO, fhsize, sizeof(int)) != sizeof(int) || write(STDOUT_FILENO, fhp, fhsize) != fhsize) { seems like a whole lot of code spew for a simple printf() ? you'd have to adjust the other program to use scanf(), but seems like the end result would be nicer for users to experiment with ? Yes. I'd already reflected on exactly that and made a change to using text formats. static int open_mount_path_by_id(int mount_id) { char *linep; size_t lsize; char mount_path[PATH_MAX]; int fmnt_id, fnd, nread; could we buy a few more letters for these vars ? i guess fmnt_id is the filesystem mount id, and fnd is find. When I was a kid, you had to pay a dollar for each letter... (I've made a few changes.) also, getline() returns a ssize_t, not an int. Fixed. FILE *fp; fp = fopen(/proc/self/mountinfo, r); only one space before the = Yup. i would encourage using the e flag whenever possible in the hopes that someone
Re: For review: open_by_name_at(2) man page [v2]
On Tue, 18 Mar 2014 13:55:15 +0100 "Michael Kerrisk (man-pages)" wrote: > Hi Aneesh, (and others) > > After integrating review comments from NeilBown and Christoph Hellwig, > here is draft 2 of a man page I've written for name_to_handle_at(2) and > open_by_name_at(2). Especially thanks to Neil's comments, several parts > of the page underwent a substantial rewrite. Would you be willing to > review it please, and let me know of any corrections/improvements? I didn't notice before but above and in $SUBJ I see "open_by_name_at", which is fictitious :-) > > Together, the > .I pathname > and > .I dirfd > arguments identify the file for which a handle is to obtained. ^be > > The > .I flags > argument is a bit mask constructed by ORing together > zero or more of the following value: ^s > .TP > .B AT_EMPTY_PATH > Allow > .I pathname > to be an empty string. > See above. > (which may have been obtained using the > .BR open (2) > .B O_PATH > flag). What "may have been obtained" ?? > The > .I flags > argument > is as for > .BR open (2). > .\" FIXME: Confirm that the following is intended behavior. > .\"(It certainly seems to be the behavior, from experimenting.) > If > .I handle > refers to a symbolic link, the caller must specify the > .B O_PATH > flag, and the symbolic link is not dereferenced (the > .B O_NOFOLLOW > flag, if specified, is ignored). It certainly sounds like reasonable behaviour. I cannot comment on intention though. Are you bothered that O_PATH is needed for symlinks? An fd on a symlink is a sufficiently unusual thing that it seems reasonable for a programmer to explicitly say they are expecting one. > > In the event of an error, both system calls return \-1 and set > .I errno > to indicate the cause of the error. > .SH ERRORS > .BR name_to_handle_at () > and > .BR open_by_handle_at () > can fail for the same errors as > .BR openat (2). > In addition, they can fail with the errors noted below. Should you mention EFAULT if mount_id or handle are not valid pointers? > > Not all filesystem types support the translation of pathnames to > file handles. > .\" FIXME NeilBrown noted: > .\"ESTALE is also returned if the filesystem does not support > .\"file-handle -> file mappings. > .\"On filesystems which don't provide export_operations (/sys /proc > .\"ubifs romfs cramfs nfs coda ... several others) name_to_handle_at > .\"will produce a generic handle using the 32 bit inode and 32 bit > .\"i_generation. open_by_name_at given this (or any) filehandle > .\"will fail with ESTALE. > .\" However, on /proc and /sys, at least, name_to_handle_at() fails with > .\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the > .\" same error as for an invalid file handle) in the above circumstances? This is all wrong - discard it :-) NeilBrown signature.asc Description: PGP signature
Re: For review: open_by_name_at(2) man page
On Tue, 18 Mar 2014 13:37:15 +0100 "Michael Kerrisk (man-pages)" wrote: > On 03/18/2014 10:43 AM, Christoph Hellwig wrote: > > On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: > >> ESTALE is also returned if the filesystem does not support file-handle -> > >> file mappings. > >> On filesystems which don't provide export_operations (/sys /proc ubifs > >> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a > >> generic handle using the 32 bit inode and 32 bit i_generation. > > > > Do we? Seems like the code is erroring out early if there are no > > export_ops? > > It appears to me that Neil's statement isn't correct, at least for /proc > and /sys (see my other mail, to Neil). I'm unsure about whether it is true > for some of those other FSes thought. Indeed, I was wrong. I was looking at int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid, int *max_len, struct inode *parent) { const struct export_operations *nop = inode->i_sb->s_export_op; if (nop && nop->encode_fh) return nop->encode_fh(inode, fid->raw, max_len, parent); return export_encode_fh(inode, fid, max_len, parent); } which uses a default if there is no 'nop'. However do_sys_name_to_handle() contains if (!path->dentry->d_sb->s_export_op || !path->dentry->d_sb->s_export_op->fh_to_dentry) return -EOPNOTSUPP; long before export_encode_inode_fh() gets called. So the default isn't used. I would have thought that exportfs_encode_inode_fh would never get called if there were no s_export_op pointer - certainly name_to_handle_at and nfsd would never call it in that case. However it seems that This routine will be used to generate a file handle in fdinfo output for inotify subsystem, where if no s_export_op present the general export_encode_fh should be used. Thus add a test if s_export_op present inside exportfs_encode_fh itself. according to commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6 Author: Cyrill Gorcunov Date: Mon Dec 17 16:05:06 2012 -0800 I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD when $FD is an inotify fd which is watching the particular file. I wouldn't have expected that, but maybe it is a good idea. So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP. So if you get ESTALE from open_by_handle_at(), then it really is a stale handle. Sorry for the confusion. NeilBrown signature.asc Description: PGP signature
Re: For review: open_by_name_at(2) man page
On 03/18/2014 02:07 PM, Christoph Hellwig wrote: > That's why the file handles contain a generation counter that gets > incremented in this case. Ahh, yes. Thanks for the reminder/clue. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On 03/18/2014 10:37 AM, Christoph Hellwig wrote: > Hi Michael, > > the man page looks reasonable. If you refer to openat(2) instead of > open(2) in the ERRORS section you could avoid duplicating a few of the > dirfd and flags related errors. Good idea. Done. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On Tue, Mar 18, 2014 at 01:35:06PM +0100, Michael Kerrisk (man-pages) wrote: > Indeed! I don't know quite what I was smoking as I reviewed that piece. > In fact, I started writing this page a long time ago, but then other > events intervened, and it was a long time before I came back to it recently. > Certainly, when I produced that shell session log, things proceeded > (almost) as shown. I'm guessing that what happened is that I by > accident edited out a line > > rm cecilia.txt > > just before > > echo 'Warum?' > cecilia.txt > > Fixed now. (In that case of course, it is of course a matter of chance > whether the pathname is re-created with the same i-node number, but if > you are quick, it often is. I'll add some explanation to the page.) That's why the file handles contain a generation counter that gets incremented in this case. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On 03/18/2014 10:43 AM, Christoph Hellwig wrote: > On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: >> ESTALE is also returned if the filesystem does not support file-handle -> >> file mappings. >> On filesystems which don't provide export_operations (/sys /proc ubifs >> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a >> generic handle using the 32 bit inode and 32 bit i_generation. > > Do we? Seems like the code is erroring out early if there are no > export_ops? It appears to me that Neil's statement isn't correct, at least for /proc and /sys (see my other mail, to Neil). I'm unsure about whether it is true for some of those other FSes thought. >> Does it? My understanding from "man libblkid" (it is a while since I've read >> the code) is that it either uses info in /dev/disks/by-* or reads directly >> from the block devices (maybe using /sys to find them?) and interprets the >> superblock to extract a UUID. > > It normally reads directly from disk, unless it has changed very > recently. Thanks. As noted in my mail, I solved this one by just saying a little less about libblkid. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On 03/17/2014 11:00 PM, NeilBrown wrote: > On Mon, 17 Mar 2014 16:57:29 +0100 "Michael Kerrisk (man-pages)" > wrote: > >> Hi Aneesh, (and others) >> >> Below is a man page I've written for name_to_handle_at(2) and >> open_by_name_at(2). Would you be willing to review it please, >> and let me know of any corrections/improvements? >> >> Thanks, >> >> Michael > > Thanks for writing this Michael. The fact that I can only find very small > points to comment on reflects the high quality... Thanks, Neil. But there was at least one good clanger below :-}. > >> Otherwise, >> .IR dirfd >> must be a file descriptor that refers to a directory, and > ^^^ >> .I pathname >> is interpreted relative to that directory. > > As you clarify later, "must be" is not correct. Maybe this is just an issue > of style, in which case you should obviously keep a consistent style across > man pages, but to me it sounds wrong. I would use "is generally" or similar. Yep, good point. In fact, what I did was rewrite that section completely, to more clearly describe the distinct cases based on dirfd/pathname/AT_EMPTY_PATH. >> The >> .IR mountdirfd >> argument is a file descriptor for a directory under >> the mount point with respect to which >> .IR handle >> should be interpreted. > > mountdirfd does not have to be for a directory. It can be for any object in > the filesystem. And I would say "in", not "under". > If /foo and /foo/bar are both mountpoints, and I want to look up a > filehandle for the filesystem mounted at /foo, then opening "/foo/bar" > wouldn't work even though /foo/bar is "under" /foo. And opening "/foo" would > work even though "/foo" is not under "/foo/" (is it?). Good catch. I got deceived by the name of the argument, which in the kernel source is indeed 'mountdirfd', implying it must be a descriptor for a directory. I'll rename the argument in the man page to 'mount_fd' and fix the description as you suggest here: > The > .IR mountfd > argument is a file descriptor for any object (file, directory, etc.) in the > filesystem with respect to which I did s/filesystem/mounted filesystem/ > .IR handle > should be interpreted. > > ?? >> .B ESTALE >> The specified >> .I handle >> is no longer valid. > > ESTALE is also returned if the filesystem does not support file-handle -> > file mappings. > On filesystems which don't provide export_operations (/sys /proc ubifs > romfs cramfs nfs coda ... several others) name_to_handle_at will produce a > generic handle using the 32 bit inode and 32 bit i_generation. Are you sure about this? When I try name_to_handle_at() on /proc and /sys, it gives an error (EOPNOTSUPP). I haven't tested the other FSes though, so maybe some of them do what you say. > open_by_name_at given this (or any) filehandle will fail with ESTALE. > I don't know how best to include this in the documentation. Maybe a note > earlier noting that some filesystems do not support open_by_name_at(), and > you cannot programatically determine which do except by trying. > At the same time note that a file handle can become in valid if a file is > deleted or for any other reason as determined by the filesystem, and that the > error is the same as for when the filesystem doesn't support open_by_name_at. I've added text about invalid file handles into NOTES, and noted that not all FSes support the production of file handles, but haven't noted ESTALE for the latter, since I don't yet know if your statement above is true for some filesystems. >> For example, one can use the device name in the fifth field of the >> .I mountinfo >> record to search for the corresponding device UUID via the symbolic links in >> .IR /dev/disks/by-uuid . >> (A more comfortable way of obtaining the UUID is to use the >> .\" e.g., >> http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition >> .BR libblkid (3) >> library, which uses the >> .I /sys >> filesystem to obtain the same information.) > > Does it? My understanding from "man libblkid" (it is a while since I've read > the code) is that it either uses info in /dev/disks/by-* or reads directly > from the block devices (maybe using /sys to find them?) and interprets the > superblock to extract a UUID. Thanks (and to Christoph) -- I'll just remove the words "which uses the /sys filesystem to obtain the same information" >> Now delete and re-create the file with the same inode number; >> .BR open_by_handle_at () >> recognizes that the file referred to by the file handle no longer exists. >> >> .in +4n >> .nf >> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number >> 4072121 >> $ \fBecho 'Warum?' > cecilia.txt\fP >> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number >> 4072121 >> $ \fBsudo ./t_open_by_handle_at < fh\fP >> open_by_handle_at: Stale NFS file handle > > Something is very wrong here. >echo foo > somefile > does not "delete and re-create" the file. It opens and
Re: For review: open_by_name_at(2) man page
On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: > ESTALE is also returned if the filesystem does not support file-handle -> > file mappings. > On filesystems which don't provide export_operations (/sys /proc ubifs > romfs cramfs nfs coda ... several others) name_to_handle_at will produce a > generic handle using the 32 bit inode and 32 bit i_generation. Do we? Seems like the code is erroring out early if there are no export_ops? > Does it? My understanding from "man libblkid" (it is a while since I've read > the code) is that it either uses info in /dev/disks/by-* or reads directly > from the block devices (maybe using /sys to find them?) and interprets the > superblock to extract a UUID. It normally reads directly from disk, unless it has changed very recently. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
Hi Michael, the man page looks reasonable. If you refer to openat(2) instead of open(2) in the ERRORS section you could avoid duplicating a few of the dirfd and flags related errors. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
Hi Michael, the man page looks reasonable. If you refer to openat(2) instead of open(2) in the ERRORS section you could avoid duplicating a few of the dirfd and flags related errors. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: ESTALE is also returned if the filesystem does not support file-handle - file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. Do we? Seems like the code is erroring out early if there are no export_ops? Does it? My understanding from man libblkid (it is a while since I've read the code) is that it either uses info in /dev/disks/by-* or reads directly from the block devices (maybe using /sys to find them?) and interprets the superblock to extract a UUID. It normally reads directly from disk, unless it has changed very recently. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On 03/17/2014 11:00 PM, NeilBrown wrote: On Mon, 17 Mar 2014 16:57:29 +0100 Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote: Hi Aneesh, (and others) Below is a man page I've written for name_to_handle_at(2) and open_by_name_at(2). Would you be willing to review it please, and let me know of any corrections/improvements? Thanks, Michael Thanks for writing this Michael. The fact that I can only find very small points to comment on reflects the high quality... Thanks, Neil. But there was at least one good clanger below :-}. Otherwise, .IR dirfd must be a file descriptor that refers to a directory, and ^^^ .I pathname is interpreted relative to that directory. As you clarify later, must be is not correct. Maybe this is just an issue of style, in which case you should obviously keep a consistent style across man pages, but to me it sounds wrong. I would use is generally or similar. Yep, good point. In fact, what I did was rewrite that section completely, to more clearly describe the distinct cases based on dirfd/pathname/AT_EMPTY_PATH. The .IR mountdirfd argument is a file descriptor for a directory under the mount point with respect to which .IR handle should be interpreted. mountdirfd does not have to be for a directory. It can be for any object in the filesystem. And I would say in, not under. If /foo and /foo/bar are both mountpoints, and I want to look up a filehandle for the filesystem mounted at /foo, then opening /foo/bar wouldn't work even though /foo/bar is under /foo. And opening /foo would work even though /foo is not under /foo/ (is it?). Good catch. I got deceived by the name of the argument, which in the kernel source is indeed 'mountdirfd', implying it must be a descriptor for a directory. I'll rename the argument in the man page to 'mount_fd' and fix the description as you suggest here: The .IR mountfd argument is a file descriptor for any object (file, directory, etc.) in the filesystem with respect to which I did s/filesystem/mounted filesystem/ .IR handle should be interpreted. ?? .B ESTALE The specified .I handle is no longer valid. ESTALE is also returned if the filesystem does not support file-handle - file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. Are you sure about this? When I try name_to_handle_at() on /proc and /sys, it gives an error (EOPNOTSUPP). I haven't tested the other FSes though, so maybe some of them do what you say. open_by_name_at given this (or any) filehandle will fail with ESTALE. I don't know how best to include this in the documentation. Maybe a note earlier noting that some filesystems do not support open_by_name_at(), and you cannot programatically determine which do except by trying. At the same time note that a file handle can become in valid if a file is deleted or for any other reason as determined by the filesystem, and that the error is the same as for when the filesystem doesn't support open_by_name_at. I've added text about invalid file handles into NOTES, and noted that not all FSes support the production of file handles, but haven't noted ESTALE for the latter, since I don't yet know if your statement above is true for some filesystems. For example, one can use the device name in the fifth field of the .I mountinfo record to search for the corresponding device UUID via the symbolic links in .IR /dev/disks/by-uuid . (A more comfortable way of obtaining the UUID is to use the .\ e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition .BR libblkid (3) library, which uses the .I /sys filesystem to obtain the same information.) Does it? My understanding from man libblkid (it is a while since I've read the code) is that it either uses info in /dev/disks/by-* or reads directly from the block devices (maybe using /sys to find them?) and interprets the superblock to extract a UUID. Thanks (and to Christoph) -- I'll just remove the words which uses the /sys filesystem to obtain the same information Now delete and re-create the file with the same inode number; .BR open_by_handle_at () recognizes that the file referred to by the file handle no longer exists. .in +4n .nf $ \fBstat \-\-printf=%i\\n cecilia.txt\fP # Display inode number 4072121 $ \fBecho 'Warum?' cecilia.txt\fP $ \fBstat \-\-printf=%i\\n cecilia.txt\fP # Check inode number 4072121 $ \fBsudo ./t_open_by_handle_at fh\fP open_by_handle_at: Stale NFS file handle Something is very wrong here. echo foo somefile does not delete and re-create the file. It opens and truncates. That operation should not invalidate the filehandle on any sane filesystem. Indeed! I don't know quite what I was smoking as I reviewed that piece.
Re: For review: open_by_name_at(2) man page
On 03/18/2014 10:43 AM, Christoph Hellwig wrote: On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: ESTALE is also returned if the filesystem does not support file-handle - file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. Do we? Seems like the code is erroring out early if there are no export_ops? It appears to me that Neil's statement isn't correct, at least for /proc and /sys (see my other mail, to Neil). I'm unsure about whether it is true for some of those other FSes thought. Does it? My understanding from man libblkid (it is a while since I've read the code) is that it either uses info in /dev/disks/by-* or reads directly from the block devices (maybe using /sys to find them?) and interprets the superblock to extract a UUID. It normally reads directly from disk, unless it has changed very recently. Thanks. As noted in my mail, I solved this one by just saying a little less about libblkid. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On Tue, Mar 18, 2014 at 01:35:06PM +0100, Michael Kerrisk (man-pages) wrote: Indeed! I don't know quite what I was smoking as I reviewed that piece. In fact, I started writing this page a long time ago, but then other events intervened, and it was a long time before I came back to it recently. Certainly, when I produced that shell session log, things proceeded (almost) as shown. I'm guessing that what happened is that I by accident edited out a line rm cecilia.txt just before echo 'Warum?' cecilia.txt Fixed now. (In that case of course, it is of course a matter of chance whether the pathname is re-created with the same i-node number, but if you are quick, it often is. I'll add some explanation to the page.) That's why the file handles contain a generation counter that gets incremented in this case. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On 03/18/2014 10:37 AM, Christoph Hellwig wrote: Hi Michael, the man page looks reasonable. If you refer to openat(2) instead of open(2) in the ERRORS section you could avoid duplicating a few of the dirfd and flags related errors. Good idea. Done. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On 03/18/2014 02:07 PM, Christoph Hellwig wrote: That's why the file handles contain a generation counter that gets incremented in this case. Ahh, yes. Thanks for the reminder/clue. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: For review: open_by_name_at(2) man page
On Tue, 18 Mar 2014 13:37:15 +0100 Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote: On 03/18/2014 10:43 AM, Christoph Hellwig wrote: On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: ESTALE is also returned if the filesystem does not support file-handle - file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. Do we? Seems like the code is erroring out early if there are no export_ops? It appears to me that Neil's statement isn't correct, at least for /proc and /sys (see my other mail, to Neil). I'm unsure about whether it is true for some of those other FSes thought. Indeed, I was wrong. I was looking at int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid, int *max_len, struct inode *parent) { const struct export_operations *nop = inode-i_sb-s_export_op; if (nop nop-encode_fh) return nop-encode_fh(inode, fid-raw, max_len, parent); return export_encode_fh(inode, fid, max_len, parent); } which uses a default if there is no 'nop'. However do_sys_name_to_handle() contains if (!path-dentry-d_sb-s_export_op || !path-dentry-d_sb-s_export_op-fh_to_dentry) return -EOPNOTSUPP; long before export_encode_inode_fh() gets called. So the default isn't used. I would have thought that exportfs_encode_inode_fh would never get called if there were no s_export_op pointer - certainly name_to_handle_at and nfsd would never call it in that case. However it seems that This routine will be used to generate a file handle in fdinfo output for inotify subsystem, where if no s_export_op present the general export_encode_fh should be used. Thus add a test if s_export_op present inside exportfs_encode_fh itself. according to commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6 Author: Cyrill Gorcunov gorcu...@openvz.org Date: Mon Dec 17 16:05:06 2012 -0800 I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD when $FD is an inotify fd which is watching the particular file. I wouldn't have expected that, but maybe it is a good idea. So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP. So if you get ESTALE from open_by_handle_at(), then it really is a stale handle. Sorry for the confusion. NeilBrown signature.asc Description: PGP signature
Re: For review: open_by_name_at(2) man page [v2]
On Tue, 18 Mar 2014 13:55:15 +0100 Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote: Hi Aneesh, (and others) After integrating review comments from NeilBown and Christoph Hellwig, here is draft 2 of a man page I've written for name_to_handle_at(2) and open_by_name_at(2). Especially thanks to Neil's comments, several parts of the page underwent a substantial rewrite. Would you be willing to review it please, and let me know of any corrections/improvements? I didn't notice before but above and in $SUBJ I see open_by_name_at, which is fictitious :-) Together, the .I pathname and .I dirfd arguments identify the file for which a handle is to obtained. ^be The .I flags argument is a bit mask constructed by ORing together zero or more of the following value: ^s .TP .B AT_EMPTY_PATH Allow .I pathname to be an empty string. See above. (which may have been obtained using the .BR open (2) .B O_PATH flag). What may have been obtained ?? The .I flags argument is as for .BR open (2). .\ FIXME: Confirm that the following is intended behavior. .\(It certainly seems to be the behavior, from experimenting.) If .I handle refers to a symbolic link, the caller must specify the .B O_PATH flag, and the symbolic link is not dereferenced (the .B O_NOFOLLOW flag, if specified, is ignored). It certainly sounds like reasonable behaviour. I cannot comment on intention though. Are you bothered that O_PATH is needed for symlinks? An fd on a symlink is a sufficiently unusual thing that it seems reasonable for a programmer to explicitly say they are expecting one. In the event of an error, both system calls return \-1 and set .I errno to indicate the cause of the error. .SH ERRORS .BR name_to_handle_at () and .BR open_by_handle_at () can fail for the same errors as .BR openat (2). In addition, they can fail with the errors noted below. Should you mention EFAULT if mount_id or handle are not valid pointers? Not all filesystem types support the translation of pathnames to file handles. .\ FIXME NeilBrown noted: .\ESTALE is also returned if the filesystem does not support .\file-handle - file mappings. .\On filesystems which don't provide export_operations (/sys /proc .\ubifs romfs cramfs nfs coda ... several others) name_to_handle_at .\will produce a generic handle using the 32 bit inode and 32 bit .\i_generation. open_by_name_at given this (or any) filehandle .\will fail with ESTALE. .\ However, on /proc and /sys, at least, name_to_handle_at() fails with .\ EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the .\ same error as for an invalid file handle) in the above circumstances? This is all wrong - discard it :-) NeilBrown signature.asc Description: PGP signature
Re: For review: open_by_name_at(2) man page
On Mon, 17 Mar 2014 16:57:29 +0100 "Michael Kerrisk (man-pages)" wrote: > Hi Aneesh, (and others) > > Below is a man page I've written for name_to_handle_at(2) and > open_by_name_at(2). Would you be willing to review it please, > and let me know of any corrections/improvements? > > Thanks, > > Michael Thanks for writing this Michael. The fact that I can only find very small points to comment on reflects the high quality... > Otherwise, > .IR dirfd > must be a file descriptor that refers to a directory, and ^^^ > .I pathname > is interpreted relative to that directory. As you clarify later, "must be" is not correct. Maybe this is just an issue of style, in which case you should obviously keep a consistent style across man pages, but to me it sounds wrong. I would use "is generally" or similar. > The > .IR mountdirfd > argument is a file descriptor for a directory under > the mount point with respect to which > .IR handle > should be interpreted. mountdirfd does not have to be for a directory. It can be for any object in the filesystem. And I would say "in", not "under". If /foo and /foo/bar are both mountpoints, and I want to look up a filehandle for the filesystem mounted at /foo, then opening "/foo/bar" wouldn't work even though /foo/bar is "under" /foo. And opening "/foo" would work even though "/foo" is not under "/foo/" (is it?). The .IR mountfd argument is a file descriptor for any object (file, directory, etc.) in the filesystem with respect to which .IR handle should be interpreted. ?? > .B ESTALE > The specified > .I handle > is no longer valid. ESTALE is also returned if the filesystem does not support file-handle -> file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. open_by_name_at given this (or any) filehandle will fail with ESTALE. I don't know how best to include this in the documentation. Maybe a note earlier noting that some filesystems do not support open_by_name_at(), and you cannot programatically determine which do except by trying. At the same time note that a file handle can become in valid if a file is deleted or for any other reason as determined by the filesystem, and that the error is the same as for when the filesystem doesn't support open_by_name_at. > For example, one can use the device name in the fifth field of the > .I mountinfo > record to search for the corresponding device UUID via the symbolic links in > .IR /dev/disks/by-uuid . > (A more comfortable way of obtaining the UUID is to use the > .\" e.g., > http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition > .BR libblkid (3) > library, which uses the > .I /sys > filesystem to obtain the same information.) Does it? My understanding from "man libblkid" (it is a while since I've read the code) is that it either uses info in /dev/disks/by-* or reads directly from the block devices (maybe using /sys to find them?) and interprets the superblock to extract a UUID. > Now delete and re-create the file with the same inode number; > .BR open_by_handle_at () > recognizes that the file referred to by the file handle no longer exists. > > .in +4n > .nf > $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number > 4072121 > $ \fBecho 'Warum?' > cecilia.txt\fP > $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number > 4072121 > $ \fBsudo ./t_open_by_handle_at < fh\fP > open_by_handle_at: Stale NFS file handle Something is very wrong here. echo foo > somefile does not "delete and re-create" the file. It opens and truncates. That operation should not invalidate the filehandle on any sane filesystem. > if (argc > 1) > mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY); O_DIRECTORY is not appropriate, as mentioned earlier. Thanks, NeilBrown signature.asc Description: PGP signature
Re: For review: open_by_name_at(2) man page
On Mon, 17 Mar 2014 16:57:29 +0100 Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote: Hi Aneesh, (and others) Below is a man page I've written for name_to_handle_at(2) and open_by_name_at(2). Would you be willing to review it please, and let me know of any corrections/improvements? Thanks, Michael Thanks for writing this Michael. The fact that I can only find very small points to comment on reflects the high quality... Otherwise, .IR dirfd must be a file descriptor that refers to a directory, and ^^^ .I pathname is interpreted relative to that directory. As you clarify later, must be is not correct. Maybe this is just an issue of style, in which case you should obviously keep a consistent style across man pages, but to me it sounds wrong. I would use is generally or similar. The .IR mountdirfd argument is a file descriptor for a directory under the mount point with respect to which .IR handle should be interpreted. mountdirfd does not have to be for a directory. It can be for any object in the filesystem. And I would say in, not under. If /foo and /foo/bar are both mountpoints, and I want to look up a filehandle for the filesystem mounted at /foo, then opening /foo/bar wouldn't work even though /foo/bar is under /foo. And opening /foo would work even though /foo is not under /foo/ (is it?). The .IR mountfd argument is a file descriptor for any object (file, directory, etc.) in the filesystem with respect to which .IR handle should be interpreted. ?? .B ESTALE The specified .I handle is no longer valid. ESTALE is also returned if the filesystem does not support file-handle - file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. open_by_name_at given this (or any) filehandle will fail with ESTALE. I don't know how best to include this in the documentation. Maybe a note earlier noting that some filesystems do not support open_by_name_at(), and you cannot programatically determine which do except by trying. At the same time note that a file handle can become in valid if a file is deleted or for any other reason as determined by the filesystem, and that the error is the same as for when the filesystem doesn't support open_by_name_at. For example, one can use the device name in the fifth field of the .I mountinfo record to search for the corresponding device UUID via the symbolic links in .IR /dev/disks/by-uuid . (A more comfortable way of obtaining the UUID is to use the .\ e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition .BR libblkid (3) library, which uses the .I /sys filesystem to obtain the same information.) Does it? My understanding from man libblkid (it is a while since I've read the code) is that it either uses info in /dev/disks/by-* or reads directly from the block devices (maybe using /sys to find them?) and interprets the superblock to extract a UUID. Now delete and re-create the file with the same inode number; .BR open_by_handle_at () recognizes that the file referred to by the file handle no longer exists. .in +4n .nf $ \fBstat \-\-printf=%i\\n cecilia.txt\fP # Display inode number 4072121 $ \fBecho 'Warum?' cecilia.txt\fP $ \fBstat \-\-printf=%i\\n cecilia.txt\fP # Check inode number 4072121 $ \fBsudo ./t_open_by_handle_at fh\fP open_by_handle_at: Stale NFS file handle Something is very wrong here. echo foo somefile does not delete and re-create the file. It opens and truncates. That operation should not invalidate the filehandle on any sane filesystem. if (argc 1) mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY); O_DIRECTORY is not appropriate, as mentioned earlier. Thanks, NeilBrown signature.asc Description: PGP signature