Re: Optimizing chmod(1)
Adam Thompson wrote: > > On 16-03-13 11:11 PM, Michael McConville wrote: > >It seems that chown(1) will write to a file even if it already has the > >desired ownership. The below patch causes it to skip the write when > >there would be no change. The best I could tell, the fts_read(3) and > >fchownat(3) logic agree on whether to follow symlinks in all cases, so > >there's no need to execute a useless write when certain options are > >specified. > > > >I get a speedup of 5-10% on SSDs, and probably more on a slow storage > >HD. This is when all the files are already owned by the specified user. > >I think this is a common case, as chown is often used to "comb out" > >offending files, ensure that a server can access everything in /var/www, > >etc. > > > >The APIs involved are not simple and there's potential for subtle > >breakage, so this only an initial patch. I'm interested to hear what > >more experienced people think. > > > >If it's worthwhile, a similar approach can probably be applied to > >chmod(1) et al. as well. > > If this becomes the default behaviour, please allow a way to revert to the > previous behaviour, as this change would break several real systems I'm > aware of, including one I currently manage. As I feared, and as Todd pointed out, this would break too much stuff to be worth the modest performance improvement. It won't be committed.
Re: Optimizing chmod(1)
On 16-03-13 11:11 PM, Michael McConville wrote: It seems that chown(1) will write to a file even if it already has the desired ownership. The below patch causes it to skip the write when there would be no change. The best I could tell, the fts_read(3) and fchownat(3) logic agree on whether to follow symlinks in all cases, so there's no need to execute a useless write when certain options are specified. I get a speedup of 5-10% on SSDs, and probably more on a slow storage HD. This is when all the files are already owned by the specified user. I think this is a common case, as chown is often used to "comb out" offending files, ensure that a server can access everything in /var/www, etc. The APIs involved are not simple and there's potential for subtle breakage, so this only an initial patch. I'm interested to hear what more experienced people think. If it's worthwhile, a similar approach can probably be applied to chmod(1) et al. as well. If this becomes the default behaviour, please allow a way to revert to the previous behaviour, as this change would break several real systems I'm aware of, including one I currently manage. There are many NFS implementations where chown(2)/chmod(2) do magic on the back end, ranging from "clearing the UID/GID translation cache" (huh?) to resetting the file to default ACLs (e.g. Sun/Oracle "Fishworks"-series NAS systems). To rephrase: chown(file,$owner,$gid) when the file is already owned by $owner:$gid is _not necessarily_ a no-op on all filesystems. As an extra option, though, sure; I know which filesystems this behaviour matters on and which it doesn't, but there's no realistic way to embed that knowledge into chmod - there's no standard query to ask an NFS server "do you practice voodoo?", for better or for worse... GNU already uses "-c" for a somewhat-related behaviour (modifies the output of -v, but not the internal behaviour IIRC); I can't quickly find anything similar anywhere else. FWIW, full support for NFSv4 and NFS ACLs would eliminate the dependency on chown/chmod/etc. doing voodoo on the remote NFS server, but there are other, non-NFS, remote filesystems that do similar things. I can't remember if OpenBSD still supports any of them, though. -Adam
Re: Optimizing chmod(1)
On Sun, Mar 13, 2016 at 9:11 PM, Michael McConvillewrote: > It seems that chown(1) will write to a file even if it already has the > desired ownership. The below patch causes it to skip the write when > there would be no change. The best I could tell, the fts_read(3) and > fchownat(3) logic agree on whether to follow symlinks in all cases, so > there's no need to execute a useless write when certain options are > specified. Regretfully, I don't think this can be the default behavior. There are filesystems where you can't whether chown() would be a no-op from just checking the uid/gid, as the syscall may affect permission bits or ACLs. For example, an apparent no-op chown() will clear the setuid/setgid bits. (While we don't have ACLs on any native filesystems, they may be accessible via NFS or fuse.) I think that sort of complexity is why POSIX specifies that chown(8) has the effect of calling chown(2) on each target, without leaving room for skipping what might appear to be no-ops. The existing solution for the problem of saving time when you really are sure there's no magic bits to be changed is to use find to hunt out the files that need it, ala find . ! -group www -exec chgrp -h www {} + or even script it... fast_chgrp() { local grp=$1; shift; find "$@" ! -group $grp -exec chgrp -h $grp {} +; } Philip Guenther
Optimizing chmod(1)
It seems that chown(1) will write to a file even if it already has the desired ownership. The below patch causes it to skip the write when there would be no change. The best I could tell, the fts_read(3) and fchownat(3) logic agree on whether to follow symlinks in all cases, so there's no need to execute a useless write when certain options are specified. I get a speedup of 5-10% on SSDs, and probably more on a slow storage HD. This is when all the files are already owned by the specified user. I think this is a common case, as chown is often used to "comb out" offending files, ensure that a server can access everything in /var/www, etc. The APIs involved are not simple and there's potential for subtle breakage, so this only an initial patch. I'm interested to hear what more experienced people think. If it's worthwhile, a similar approach can probably be applied to chmod(1) et al. as well. Thanks for your time, Michael Index: chmod.c === RCS file: /cvs/src/bin/chmod/chmod.c,v retrieving revision 1.39 diff -u -p -r1.39 chmod.c --- chmod.c 31 Dec 2015 23:38:16 - 1.39 +++ chmod.c 14 Mar 2016 04:01:39 - @@ -63,8 +63,8 @@ main(int argc, char *argv[]) int oct; mode_t omode; int Hflag, Lflag, Rflag, ch, fflag, fts_options, hflag, rval, atflags; - uid_t uid; - gid_t gid; + uid_t uid, orig_uid; + gid_t gid, orig_gid; u_int32_t fclear, fset; char *ep, *mode, *cp, *flags; @@ -213,7 +213,12 @@ done: if ((ftsp = fts_open(++argv, fts_options, 0)) == NULL) err(1, NULL); + for (rval = 0; (p = fts_read(ftsp)) != NULL;) { + + orig_uid = p->fts_statp->st_uid; + orig_gid = p->fts_statp->st_gid; + switch (p->fts_info) { case FTS_D: if (!Rflag) @@ -265,6 +270,16 @@ done: || fflag) continue; } else if (!ischflags) { + /* +* Don't modify the file if it already has the +* specified ownership. fts_read(3) and +* fchownat(3) both handle the symlink logic, so +* we know we're deciding based on the right +* file. +*/ + if ((gid == -1 || gid == orig_gid) && + (uid == -1 || uid == orig_uid)) + continue; if (!fchownat(AT_FDCWD, p->fts_accpath, uid, gid, atflags) || fflag) continue;