Re: Optimizing chmod(1)

2016-03-14 Thread Michael McConville
Adam Thompson wrote:
> 
> On 16-03-13 11:11 PM, Michael McConville wrote:
> >It seems that chown(1) will write to a file even if it already has the
> >desired ownership. The below patch causes it to skip the write when
> >there would be no change. The best I could tell, the fts_read(3) and
> >fchownat(3) logic agree on whether to follow symlinks in all cases, so
> >there's no need to execute a useless write when certain options are
> >specified.
> >
> >I get a speedup of 5-10% on SSDs, and probably more on a slow storage
> >HD. This is when all the files are already owned by the specified user.
> >I think this is a common case, as chown is often used to "comb out"
> >offending files, ensure that a server can access everything in /var/www,
> >etc.
> >
> >The APIs involved are not simple and there's potential for subtle
> >breakage, so this only an initial patch. I'm interested to hear what
> >more experienced people think.
> >
> >If it's worthwhile, a similar approach can probably be applied to
> >chmod(1) et al. as well.
> 
> If this becomes the default behaviour, please allow a way to revert to the
> previous behaviour, as this change would break several real systems I'm
> aware of, including one I currently manage.

As I feared, and as Todd pointed out, this would break too much stuff to
be worth the modest performance improvement. It won't be committed.



Re: Optimizing chmod(1)

2016-03-14 Thread Adam Thompson


On 16-03-13 11:11 PM, Michael McConville wrote:

It seems that chown(1) will write to a file even if it already has the
desired ownership. The below patch causes it to skip the write when
there would be no change. The best I could tell, the fts_read(3) and
fchownat(3) logic agree on whether to follow symlinks in all cases, so
there's no need to execute a useless write when certain options are
specified.

I get a speedup of 5-10% on SSDs, and probably more on a slow storage
HD. This is when all the files are already owned by the specified user.
I think this is a common case, as chown is often used to "comb out"
offending files, ensure that a server can access everything in /var/www,
etc.

The APIs involved are not simple and there's potential for subtle
breakage, so this only an initial patch. I'm interested to hear what
more experienced people think.

If it's worthwhile, a similar approach can probably be applied to
chmod(1) et al. as well.


If this becomes the default behaviour, please allow a way to revert to 
the previous behaviour, as this change would break several real systems 
I'm aware of, including one I currently manage.


There are many NFS implementations where chown(2)/chmod(2) do magic on 
the back end, ranging from "clearing the UID/GID translation cache" 
(huh?) to resetting the file to default ACLs (e.g. Sun/Oracle 
"Fishworks"-series NAS systems).


To rephrase: chown(file,$owner,$gid) when the file is already owned by 
$owner:$gid is _not necessarily_ a no-op on all filesystems.


As an extra option, though, sure; I know which filesystems this 
behaviour matters on and which it doesn't, but there's no realistic way 
to embed that knowledge into chmod - there's no standard query to ask an 
NFS server "do you practice voodoo?", for better or for worse...


GNU already uses "-c" for a somewhat-related behaviour (modifies the 
output of -v, but not the internal behaviour IIRC); I can't quickly find 
anything similar anywhere else.


FWIW, full support for NFSv4 and NFS ACLs would eliminate the dependency 
on chown/chmod/etc. doing voodoo on the remote NFS server, but there are 
other, non-NFS, remote filesystems that do similar things.  I can't 
remember if OpenBSD still supports any of them, though.


-Adam



Re: Optimizing chmod(1)

2016-03-13 Thread Philip Guenther
On Sun, Mar 13, 2016 at 9:11 PM, Michael McConville  wrote:
> It seems that chown(1) will write to a file even if it already has the
> desired ownership. The below patch causes it to skip the write when
> there would be no change. The best I could tell, the fts_read(3) and
> fchownat(3) logic agree on whether to follow symlinks in all cases, so
> there's no need to execute a useless write when certain options are
> specified.

Regretfully, I don't think this can be the default behavior.  There
are filesystems where you can't whether chown() would be a no-op from
just checking the uid/gid, as the syscall may affect permission bits
or ACLs.  For example, an apparent no-op chown() will clear the
setuid/setgid bits.  (While we don't have ACLs on any native
filesystems, they may be accessible via NFS or fuse.)

I think that sort of complexity is why POSIX specifies that chown(8)
has the effect of calling chown(2) on each target, without leaving
room for skipping what might appear to be no-ops.


The existing solution for the problem of saving time when you really
are sure there's no magic bits to be changed is to use find to hunt
out the files that need it, ala
  find . ! -group www -exec chgrp -h www {} +

or even script it...
   fast_chgrp() { local grp=$1; shift; find "$@" ! -group $grp -exec
chgrp -h $grp {} +; }


Philip Guenther



Optimizing chmod(1)

2016-03-13 Thread Michael McConville
It seems that chown(1) will write to a file even if it already has the
desired ownership. The below patch causes it to skip the write when
there would be no change. The best I could tell, the fts_read(3) and
fchownat(3) logic agree on whether to follow symlinks in all cases, so
there's no need to execute a useless write when certain options are
specified.

I get a speedup of 5-10% on SSDs, and probably more on a slow storage
HD. This is when all the files are already owned by the specified user.
I think this is a common case, as chown is often used to "comb out"
offending files, ensure that a server can access everything in /var/www,
etc.

The APIs involved are not simple and there's potential for subtle
breakage, so this only an initial patch. I'm interested to hear what
more experienced people think.

If it's worthwhile, a similar approach can probably be applied to
chmod(1) et al. as well.

Thanks for your time,
Michael


Index: chmod.c
===
RCS file: /cvs/src/bin/chmod/chmod.c,v
retrieving revision 1.39
diff -u -p -r1.39 chmod.c
--- chmod.c 31 Dec 2015 23:38:16 -  1.39
+++ chmod.c 14 Mar 2016 04:01:39 -
@@ -63,8 +63,8 @@ main(int argc, char *argv[])
int oct;
mode_t omode;
int Hflag, Lflag, Rflag, ch, fflag, fts_options, hflag, rval, atflags;
-   uid_t uid;
-   gid_t gid;
+   uid_t uid, orig_uid;
+   gid_t gid, orig_gid;
u_int32_t fclear, fset;
char *ep, *mode, *cp, *flags;
 
@@ -213,7 +213,12 @@ done:
 
if ((ftsp = fts_open(++argv, fts_options, 0)) == NULL)
err(1, NULL);
+
for (rval = 0; (p = fts_read(ftsp)) != NULL;) {
+
+   orig_uid = p->fts_statp->st_uid;
+   orig_gid = p->fts_statp->st_gid;
+
switch (p->fts_info) {
case FTS_D:
if (!Rflag)
@@ -265,6 +270,16 @@ done:
|| fflag)
continue;
} else if (!ischflags) {
+   /*
+* Don't modify the file if it already has the
+* specified ownership. fts_read(3) and
+* fchownat(3) both handle the symlink logic, so
+* we know we're deciding based on the right
+* file.
+*/
+   if ((gid == -1 || gid == orig_gid) &&
+   (uid == -1 || uid == orig_uid))
+   continue;
if (!fchownat(AT_FDCWD, p->fts_accpath, uid, gid,
atflags) || fflag)
continue;