Re: summarizing gnulib changes for coreutils-7.7?

2009-09-23 Thread Jim Meyering
Eric Blake wrote:
 Jim Meyering jim at meyering.net writes:
 Once most of your infrastructure improvements are in gnulib,
 would you please write a brief summary to go in NEWS
 for the upcoming coreutils-7.7 release?

 How about the following?  I still want to get my symlink module in first, but
 that should happen within the next 24 hours.

 Subject: [PATCH 1/2] readlink: pick up gnulib changes to readlink -f
...
 Subject: [PATCH 2/2] maint: summarize gnulib changes

Thanks for doing all that.

I've pushed an initial gnulib-submodule-update patch
and then your first patch (to fix the then-failing readlink test),
but without the gnulib-updating part, which I prefer to keep in a
separate commit.

You're welcome to push the 2nd patch at your leisure,
I suppose after the remaining gnulib changes have been sync'd.

P.S. there were two wrapped @@ lines in that first patch.




[PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Pádraig Brady
The attached patch removes the inconsistent use of bitwise
operators on boolean values.

As well as being distracting, bitwise operators are
not short circuiting and brittle with C89 where
bool can be an int, i.e.  1.

cheers,
Pádraig.
From f83dd9eb575685c7e2bbdc62a868fb28a15a82c6 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
Date: Wed, 23 Sep 2009 10:10:51 +0100
Subject: [PATCH] maint: Use logical rather than bitwise operators on bools

This is because bitwise operators are:
- distracting and inconsistent in a boolean context
- non short circuiting
- brittle in C89 where bool can be an int (so  1)
---
 src/cat.c|8 
 src/chcon.c  |2 +-
 src/chgrp.c  |2 +-
 src/chmod.c  |4 ++--
 src/chown-core.c |2 +-
 src/chown.c  |2 +-
 src/copy.c   |4 ++--
 src/cp.c |4 ++--
 src/dd.c |2 +-
 src/df.c |6 +++---
 src/du.c |2 +-
 src/expand.c |2 +-
 src/expr.c   |4 ++--
 src/id.c |2 +-
 src/install.c|2 +-
 src/ls.c |   10 +-
 src/md5sum.c |6 +++---
 src/od.c |4 ++--
 src/pathchk.c|2 +-
 src/pr.c |   26 +-
 src/ptx.c|   12 ++--
 src/rm.c |2 +-
 src/sort.c   |   42 +-
 src/stty.c   |8 
 src/tail.c   |2 +-
 src/test.c   |6 +++---
 src/unexpand.c   |2 +-
 src/wc.c |   10 +-
 28 files changed, 90 insertions(+), 90 deletions(-)

diff --git a/src/cat.c b/src/cat.c
index d939681..9008957 100644
--- a/src/cat.c
+++ b/src/cat.c
@@ -388,7 +388,7 @@ cat (
 
   /* Are line numbers to be written at empty lines (-n)?  */
 
-  if (number  !number_nonblank)
+  if (number  !number_nonblank)
 {
   next_line_num ();
   bpout = stpcpy (bpout, line_num_print);
@@ -657,7 +657,7 @@ main (int argc, char **argv)
 #endif
 }
 
-  if (! (number | show_ends | squeeze_blank))
+  if (! (number || show_ends || squeeze_blank))
 {
   file_open_mode |= O_BINARY;
   if (O_BINARY  ! isatty (STDOUT_FILENO))
@@ -719,8 +719,8 @@ main (int argc, char **argv)
   /* Select which version of `cat' to use.  If any format-oriented
  options were given use `cat'; otherwise use `simple_cat'.  */
 
-  if (! (number | show_ends | show_nonprinting
- | show_tabs | squeeze_blank))
+  if (! (number || show_ends || show_nonprinting
+ || show_tabs || squeeze_blank))
 {
   insize = MAX (insize, outsize);
   inbuf = xmalloc (insize + page_size - 1);
diff --git a/src/chcon.c b/src/chcon.c
index fb68402..fbfdb4d 100644
--- a/src/chcon.c
+++ b/src/chcon.c
@@ -549,7 +549,7 @@ main (int argc, char **argv)
   usage (1);
 }
 
-  if (recurse  preserve_root)
+  if (recurse  preserve_root)
 {
   static struct dev_ino dev_ino_buf;
   root_dev_ino = get_root_dev_ino (dev_ino_buf);
diff --git a/src/chgrp.c b/src/chgrp.c
index 71582b5..b9e3f43 100644
--- a/src/chgrp.c
+++ b/src/chgrp.c
@@ -293,7 +293,7 @@ main (int argc, char **argv)
   gid = parse_group (group_name);
 }
 
-  if (chopt.recurse  preserve_root)
+  if (chopt.recurse  preserve_root)
 {
   static struct dev_ino dev_ino_buf;
   chopt.root_dev_ino = get_root_dev_ino (dev_ino_buf);
diff --git a/src/chmod.c b/src/chmod.c
index e6f2b0b..da35003 100644
--- a/src/chmod.c
+++ b/src/chmod.c
@@ -279,7 +279,7 @@ process_file (FTS *fts, FTSENT *ent)
 }
 }
 
-  if (chmod_succeeded  diagnose_surprises)
+  if (chmod_succeeded  diagnose_surprises)
 {
   mode_t naively_expected_mode =
 mode_adjust (old_mode, S_ISDIR (old_mode) != 0, 0, change, NULL);
@@ -523,7 +523,7 @@ main (int argc, char **argv)
   umask_value = umask (0);
 }
 
-  if (recurse  preserve_root)
+  if (recurse  preserve_root)
 {
   static struct dev_ino dev_ino_buf;
   root_dev_ino = get_root_dev_ino (dev_ino_buf);
diff --git a/src/chown-core.c b/src/chown-core.c
index 705a29b..e7dacf6 100644
--- a/src/chown-core.c
+++ b/src/chown-core.c
@@ -440,7 +440,7 @@ change_file_owner (FTS *fts, FTSENT *ent,
   if (chopt-verbosity != V_off)
 {
   bool changed =
-((do_chown  ok  symlink_changed)
+((do_chown  ok  symlink_changed)
   ! ((uid == (uid_t) -1 || uid == file_stats-st_uid)
 (gid == (gid_t) -1 || gid == file_stats-st_gid)));
 
diff --git a/src/chown.c b/src/chown.c
index 12df061..4b75986 100644
--- a/src/chown.c
+++ b/src/chown.c
@@ -317,7 +317,7 @@ main (int argc, char **argv)
   optind++;
 }
 
-  if (chopt.recurse  preserve_root)
+  if (chopt.recurse  preserve_root)
 {
   static struct dev_ino dev_ino_buf;
   chopt.root_dev_ino = get_root_dev_ino (dev_ino_buf);
diff 

Re: [PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Jim Meyering
Pádraig Brady wrote:
 The attached patch removes the inconsistent use of bitwise
 operators on boolean values.

 As well as being distracting, bitwise operators are
 not short circuiting and brittle with C89 where
 bool can be an int, i.e.  1.
...
 Subject: [PATCH] maint: Use logical rather than bitwise operators on bools

 This is because bitwise operators are:
 - distracting and inconsistent in a boolean context
 - non short circuiting
 - brittle in C89 where bool can be an int (so  1)

Good change.
That looks fine and passes make check.
Thanks!




Re: [PATCH] maint: expr: avoid compiler warnings without GMP

2009-09-23 Thread Eric Blake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

According to Pádraig Brady on 9/23/2009 5:58 AM:
 Another ununused parameter warning patch attached

 +static void mpz_init_set_ui (mpz_t z, unsigned long i) { z[0] = i; (void) i; 
 }

Doesn't the assignment count as a use of i, so that the (void) i; is
redundant?

- --
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkq6EYIACgkQ84KuGfSFAYDy4QCfa8krOj3DyolErc2dBEzFxTMu
8kgAn1qI+vNiY1KDW2nEu5xtQ7sl/pKr
=tixh
-END PGP SIGNATURE-




Re: summarizing gnulib changes for coreutils-7.7?

2009-09-23 Thread Eric Blake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

According to Jim Meyering on 9/23/2009 1:00 AM:
 Subject: [PATCH 2/2] maint: summarize gnulib changes
 
 Thanks for doing all that.
 
 I've pushed an initial gnulib-submodule-update patch
 and then your first patch (to fix the then-failing readlink test),
 but without the gnulib-updating part, which I prefer to keep in a
 separate commit.
 
 You're welcome to push the 2nd patch at your leisure,
 I suppose after the remaining gnulib changes have been sync'd.

OK, I've pushed the NEWS change (including the typo fix
canonicalize=readlink), then a separate commit to resync gnulib.

- --
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkq6G9AACgkQ84KuGfSFAYB9vgCgnWfFKdj/gZ/9GSK4pWOVFQbp
GNAAn07+5czsJQr3yzzzRUXazlxZUSGv
=g/NK
-END PGP SIGNATURE-




Re: [PATCH] maint: expr: avoid compiler warnings without GMP

2009-09-23 Thread Pádraig Brady
Eric Blake wrote:
 According to Pádraig Brady on 9/23/2009 5:58 AM:
 Another ununused parameter warning patch attached
 
 +static void mpz_init_set_ui (mpz_t z, unsigned long i) { z[0] = i; (void) 
 i; }
 
 Doesn't the assignment count as a use of i, so that the (void) i; is
 redundant?

Oops yes.
I'll remove that one.

thanks for the review.
Pádraig.




Re: readlink(1) behavior

2009-09-23 Thread Eric Blake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

According to Eric Blake on 9/18/2009 5:35 AM:
 On one hand:
 
 ln -s dangling link = 0
 stat dangling/ = ENOENT
 stat link/ = ENOENT
 mkdir link/ = 0

Creating a directory through a slashed symlink works on Solaris 10, and is
required by POSIX, but it fails on Linux with EEXIST (in other words,
Linux ignores the trailing slash).  Unlike the rmdir counterpart (where it
is nice that Linux does NOT follow the slashed symlink to remove the dir,
leaving a dangling symlink), this seems like a case where the Solaris
behavior is nicer.  At any rate, now that readlink -f claims that both
missing/ and symlink-to-missing/ can resolve, would it make sense to
change mkdir(1) to guarantee that 'mkdir symlink-to-missing/' obeys POSIX
and creates missing instead of complaining that symlink-to-missing exists?
 Should I do this via a mkdir(2) wrapper in POSIX (wrapping both mkdir and
mkdirat because Linux does not comply), or just in coreutils?

- --
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkq6HoIACgkQ84KuGfSFAYDMugCgjWkMxs0X5ObnMzN3DR+eBZFw
U+oAmwfHwVwu5ITadXCJKDnVyVq/PWLY
=62t3
-END PGP SIGNATURE-




Re: summarizing gnulib changes for coreutils-7.7?

2009-09-23 Thread Jim Meyering
Eric Blake wrote:
 According to Jim Meyering on 9/23/2009 1:00 AM:
 Subject: [PATCH 2/2] maint: summarize gnulib changes

 Thanks for doing all that.

 I've pushed an initial gnulib-submodule-update patch
 and then your first patch (to fix the then-failing readlink test),
 but without the gnulib-updating part, which I prefer to keep in a
 separate commit.

 You're welcome to push the 2nd patch at your leisure,
 I suppose after the remaining gnulib changes have been sync'd.

 OK, I've pushed the NEWS change (including the typo fix
 canonicalize=readlink), then a separate commit to resync gnulib.

Thanks.




coreutils-7.7 beta release soon

2009-09-23 Thread Jim Meyering
I'd like to make a test release soon.  Maybe as soon as Friday.
If you have something that you'd like included, please speak up.

NEWS is already pretty long:

** Bug fixes

  cp --preserve=xattr and --archive now preserve extended attributes even
  when the source file doesn't have write access.
  [bug introduced in coreutils-7.1]

  touch -t [[CC]YY]MMDDhhmm[.ss] now accepts a timestamp string ending in .60,
  to accommodate leap seconds.
  [the bug dates back to the initial implementation]

  ls --color now reverts to the color of a base file type consistently
  when the color of a more specific type is disabled.
  [bug introduced in coreutils-5.90]

** Portability

  On Solaris 9, many commands would mistakenly treat file/ the same as
  file.  Now, even on such a system, path resolution obeys the POSIX
  rules that a trailing slash ensures that the preceeding name is a
  directory or a symlink to a directory.

** Changes in behavior

  id no longer prints SELinux  context=... when the POSIXLY_CORRECT
  environment variable is set.

  readlink -f now ignores a trailing slash when deciding if the
  last component (possibly via a dangling symlink) can be created,
  since mkdir will succeed in that case.

** Improvements

  rm: rewrite to use gnulib's fts
  This makes rm -rf significantly faster (400-500%) in some pathological
  cases, and slightly slower (20%) in at least one pathological case.

  rm -r deletes deep hierarchies more efficiently.  Before, execution time
  was quadratic in the depth of the hierarchy, now it is merely linear.
  However, this improvement is not as pronounced as might be expected for
  very deep trees, because prior to this change, for any relative name
  length longer than 8KiB, rm -r would sacrifice official conformance to
  avoid the disproportionate quadratic performance penalty.  Leading to
  another improvement:

  rm -r is now slightly more standards-conformant when operating on
  write-protected files with relative names longer than 8KiB.




Re: readlink(1) behavior

2009-09-23 Thread Jim Meyering
Eric Blake wrote:

 According to Eric Blake on 9/18/2009 5:35 AM:
 On one hand:

 ln -s dangling link = 0
 stat dangling/ = ENOENT
 stat link/ = ENOENT
 mkdir link/ = 0

 Creating a directory through a slashed symlink works on Solaris 10, and is
 required by POSIX, but it fails on Linux with EEXIST (in other words,
 Linux ignores the trailing slash).  Unlike the rmdir counterpart (where it
 is nice that Linux does NOT follow the slashed symlink to remove the dir,
 leaving a dangling symlink), this seems like a case where the Solaris
 behavior is nicer.  At any rate, now that readlink -f claims that both
 missing/ and symlink-to-missing/ can resolve, would it make sense to
 change mkdir(1) to guarantee that 'mkdir symlink-to-missing/' obeys POSIX
 and creates missing instead of complaining that symlink-to-missing exists?
  Should I do this via a mkdir(2) wrapper in POSIX (wrapping both mkdir and
 mkdirat because Linux does not comply), or just in coreutils?

This does not seem to be important enough to merit penalizing
all Linux-based uses of mkdir, so maybe just mkdir(1), to start with?
Ideally, someone would get the fix into the kernel, and *then*
we'd do the right thing in gnulib, and that would penalize only older
Linux kernels.




Re: coreutils-7.7 beta release soon

2009-09-23 Thread Eric Blake
Jim Meyering jim at meyering.net writes:

 
 I'd like to make a test release soon.  Maybe as soon as Friday.
 If you have something that you'd like included, please speak up.

I'm nearly ready to post support for ln -P/-L, thanks to the pending addition 
of gnulib linkat().  That would be nice to have in the release.

-- 
Eric Blake






Re: coreutils-7.7 beta release soon

2009-09-23 Thread Jim Meyering
Eric Blake wrote:
 Jim Meyering jim at meyering.net writes:
 I'd like to make a test release soon.  Maybe as soon as Friday.
 If you have something that you'd like included, please speak up.

 I'm nearly ready to post support for ln -P/-L, thanks to the pending addition
 of gnulib linkat().  That would be nice to have in the release.

Sure.
I can easily defer the release until next week, if needed.




Re: [PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Paul Eggert
Pádraig Brady p...@draigbrady.com writes:

 As well as being distracting, bitwise operators are
 not short circuiting and brittle with C89 where
 bool can be an int, i.e.  1.

I don't find them distracting; on the contrary, I find them a helpful
reminder that the expressions are boolean (in the traditional
mathematical sense) and so I don't need to worry about the
complexities of short-circuit evaluation.

Also, the conditional branches generated by short-circuit operators
can make code harder to debug because GDB starts jumping all over the
place when I single step.

Furthermore, bitwise operators often generate shorter and faster code,
particularly on modern processors where conditional branches are far
more expensive than logical operations.  For example, consider these
functions:

bool F (bool a, bool b, bool c, bool d) { return a  b  c  d; }
bool G (bool a, bool b, bool c, bool d) { return a  b  c  d; }

For F, gcc -O2 (x86) currently generates 3 conditional branches
(ouch!); for G it generates straightline code (good with pipelining).
F is also quite a bit bigger than G (20 instructions vs 12; and 49
bytes versus 30), and this size increase bloats the code and causes
more instruction cache misses.

Bitwise operators are equivalent in behavior to short-circuit
operations on boolean expressions (i.e., expressions involving just 0
and 1) that do not have side effects or conditionally-undefined
behavior. This is true regardless of whether the boolean type is
implemented via 'int'.  Let's just leave them alone.




Re: ls --help: further tweak

2009-09-23 Thread Benno Schulenberg
On Tue, 22 Sep 2009 22:16 +0200, Jim Meyering j...@meyering.net wrote:
 +Using color to distinguish file types is disabled both by default and\n\
 +with --color=never.

Okay with me.  It forces is disabled to describe a state also for the option.
(In some translations this won't work, though.  But that is something for the
translators to solve.)


One other tiny thing about the HTML pages.  Subsection 23.6.1 does not
have its own subnode (whereas for example 7.6.1 does).  Clicking 23.6.1
at 
http://www.gnu.org/software/coreutils/manual/html_node/index.html#toc_Modified-command-invocation
shows section 23.6 instead.

Benno

-- 
http://www.fastmail.fm - Access all of your messages and folders
  wherever you are





Re: ls --help: further tweak

2009-09-23 Thread Jim Meyering
Benno Schulenberg wrote:
 On Tue, 22 Sep 2009 22:16 +0200, Jim Meyering j...@meyering.net wrote:
 +Using color to distinguish file types is disabled both by default and\n\
 +with --color=never.

 Okay with me.  It forces is disabled to describe a state also for the 
 option.
 (In some translations this won't work, though.  But that is something for the
 translators to solve.)


 One other tiny thing about the HTML pages.  Subsection 23.6.1 does not
 have its own subnode (whereas for example 7.6.1 does).  Clicking 23.6.1
 at 
 http://www.gnu.org/software/coreutils/manual/html_node/index.html#toc_Modified-command-invocation
 shows section 23.6 instead.

If someone wants to investigate and fix that, great.
Otherwise, imho, this isn't worth worrying about,
especially since su is not even installed by default, now.




Re: [PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Jim Meyering
Paul Eggert wrote:

 Pádraig Brady p...@draigbrady.com writes:

 As well as being distracting, bitwise operators are
 not short circuiting and brittle with C89 where
 bool can be an int, i.e.  1.

 I don't find them distracting; on the contrary, I find them a helpful
 reminder that the expressions are boolean (in the traditional
 mathematical sense) and so I don't need to worry about the
 complexities of short-circuit evaluation.

 Also, the conditional branches generated by short-circuit operators
 can make code harder to debug because GDB starts jumping all over the
 place when I single step.

 Furthermore, bitwise operators often generate shorter and faster code,
 particularly on modern processors where conditional branches are far
 more expensive than logical operations.  For example, consider these
 functions:

 bool F (bool a, bool b, bool c, bool d) { return a  b  c  d; }
 bool G (bool a, bool b, bool c, bool d) { return a  b  c  d; }

 For F, gcc -O2 (x86) currently generates 3 conditional branches
 (ouch!); for G it generates straightline code (good with pipelining).
 F is also quite a bit bigger than G (20 instructions vs 12; and 49
 bytes versus 30), and this size increase bloats the code and causes
 more instruction cache misses.

 Bitwise operators are equivalent in behavior to short-circuit
 operations on boolean expressions (i.e., expressions involving just 0
 and 1) that do not have side effects or conditionally-undefined
 behavior. This is true regardless of whether the boolean type is
 implemented via 'int'.  Let's just leave them alone.

Hi Paul,

I was thinking of risk (not code size) when I ok'd his patch,
with this change still fresh in my memory:

commit 3db7c2c03c4c6daf358925d1c585fcbb478fa25a
Author: Jim Meyering meyer...@redhat.com
Date:   Thu Aug 20 10:36:30 2009 +0200

install: avoid a portability bug when compiling with non-gcc

* src/install.c (extra_mode): Be careful to return only a 0 or 1
value, since this is a bool function, and there are still some
compilers for which this is necessary.  Without this change,
Bernhard Voelker reported that the Forte Developer 7 C 5.4 2002/03/09
compiler would produce a malfunctioning install binary.  Details in
http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/17710/focus=17760

Considering your arguments, I wonder if it's worthwhile to see
how many of those changes are in parts of the code where decreased
code size and fewer branches might make a difference.  Or go one
better...

Now that I think more about it, I really dislike pessimizing our code
to accommodate C89 (20 years old!) or any compiler that doesn't have
sufficient support for bool, especially now that we require support
for some C99 features.

Pádraig, what would you think of my backing out your change, if I added
an (overridable) configure-time test to require working bool?




Re: [PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Pádraig Brady
Paul Eggert wrote:
 Pádraig Brady p...@draigbrady.com writes:
 
 As well as being distracting, bitwise operators are
 not short circuiting and brittle with C89 where
 bool can be an int, i.e.  1.
 
 I don't find them distracting; on the contrary, I find them a helpful
 reminder that the expressions are boolean (in the traditional
 mathematical sense) and so I don't need to worry about the
 complexities of short-circuit evaluation.
 
 Also, the conditional branches generated by short-circuit operators
 can make code harder to debug because GDB starts jumping all over the
 place when I single step.
 
 Furthermore, bitwise operators often generate shorter and faster code,
 particularly on modern processors where conditional branches are far
 more expensive than logical operations.  For example, consider these
 functions:
 
 bool F (bool a, bool b, bool c, bool d) { return a  b  c  d; }
 bool G (bool a, bool b, bool c, bool d) { return a  b  c  d; }

Well I did actually compare the assembly for chcon.c which
was the first I noticed, before I decided it might be worth doing:

$ objdump -S chcon.o  chcon.logical #and likewise for bitwise
$ sdiff chcon.bitwise chcon.logical
...
if (recurse  preserve_root)|   if (recurse  
preserve_root)
   6c0:   0f b6 05 11 00 00 00movzbl 0x11,%eax  |  6c0:   
80 3d 11 00 00 00 00cmpb   $0x0,0x11
   6c7:   0f b6 4c 24 34  movzbl 0x34(%esp),%ec |  6c7:   
74 0b   je 6d4 main+0x3c
   6cc:   85 c8   test   %ecx,%eax  |  6c9:   
80 7c 24 34 00  cmpb   $0x0,0x34(%esp
   6ce:   0f 85 8b 07 00 00   jnee5f main+0xb4 |  6ce:   
0f 85 3e 0a 00 00   jne1112 main+0xe
  error (EXIT_FAILURE, errno, _(failed to get attribut   error 
(EXIT_FAILURE, errno, _(failed to get attribut

You're right about the extra branch but for this common case at least
the size was the same. I did also check the speed with a test program
(attached) which showed the logical test was a bit faster on my pentium-m.
Note I compiled without optimization like `gcc -march=pentium-m chcon-test.c`
so that the loop was actually run, and I got these (averaged) results:

$ time ./bitwise
real0m4.550s
$ time ./logical
real0m4.173s

 For F, gcc -O2 (x86) currently generates 3 conditional branches
 (ouch!); for G it generates straightline code (good with pipelining).
 F is also quite a bit bigger than G (20 instructions vs 12; and 49
 bytes versus 30), and this size increase bloats the code and causes
 more instruction cache misses.
 
 Bitwise operators are equivalent in behavior to short-circuit
 operations on boolean expressions (i.e., expressions involving just 0
 and 1) that do not have side effects or conditionally-undefined
 behavior. This is true regardless of whether the boolean type is
 implemented via 'int'.  Let's just leave them alone.

Well lots (most?) of the code already uses logical operators
and the mixing of both was confusing me a bit (I'm easily confused :))
I also noticed in expr.c that the null() function was not
short circuited which I changed in the hunk below.
As well as this probably being a lower performance, I needed
to examine the function to know if it had side effects which
one can discount when logical operators are used.
I'm worried about stuff like that creeping in if bitwise
operators are left in place.

diff --git a/src/expr.c b/src/expr.c
index aab09a1..88e0b47 100644
--- a/src/expr.c
+++ b/src/expr.c
@@ -923,7 +923,7 @@ eval1 (bool evaluate)
 {
   if (nextarg ())
 {
-  r = eval2 (evaluate  ~ null (l));
+  r = eval2 (evaluate  !null (l));
   if (null (l) || null (r))
 {
   freev (l);
@@ -954,7 +954,7 @@ eval (bool evaluate)
 {
   if (nextarg (|))
 {
-  r = eval1 (evaluate  null (l));
+  r = eval1 (evaluate  null (l));
   if (null (l))
 {
   freev (l);

cheers,
Pádraig.
#include stdbool.h
#include stdio.h

int main(int argc, char** argv)
{
bool recurse=false;
bool preserve_root=true;
int i;
for (i=0; i10; i++)
  if (recurse  preserve_root)
  puts(true);
}


Re: [PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Pádraig Brady
Jim Meyering wrote:

 Considering your arguments, I wonder if it's worthwhile to see
 how many of those changes are in parts of the code where decreased
 code size and fewer branches might make a difference.  Or go one
 better...

Here's list of size changes for reference:

$ find -type f -perm +111 -printf %f\n | sort |
 xargs size  logical.size
$ git show a037e838 | patch -p2 -R
$ make
$ find -newer chcon.c -type f -perm +111 -printf %f\n | sort |
 xargs size  bitwise.size
$ join -j6 -o 0,1.1,2.1 bitwise.size logical.size | column -t

filename   text   text
[  29026  29026
cat37530  37434
chcon  48181  48181
chgrp  44920  44984
chmod  41217  41217
chown  46982  47046
cp 97798  97782
dd 45727  45711
df 57633  57697
dir95685  95637
du 86464  86464
expand 20663  20647
expr   25536  25552
ginstall   88646  88646
id 22035  22035
ls 95685  95637
md5sum 27182  27198
mv 91160  91144
od 52127  52143
pathchk18799  18831
pr 50974  51006
ptx51030  51030
rm 46499  46499
sha1sum30786  30802
sha224sum  36626  36642
sha256sum  36626  36642
sha384sum  91930  91962
sha512sum  91930  91962
sort   74379  74523
stty   52619  52283
tail   50009  49993
test   26014  26014
unexpand   21833  21737
vdir   95685  95637
wc 28847  28831

cheers,
Pádraig.




Re: [PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Eric Blake
Paul Eggert eggert at CS.UCLA.EDU writes:

 Furthermore, bitwise operators often generate shorter and faster code,
 particularly on modern processors where conditional branches are far
 more expensive than logical operations.  For example, consider these
 functions:
 
 bool F (bool a, bool b, bool c, bool d) { return a  b  c  d; }
 bool G (bool a, bool b, bool c, bool d) { return a  b  c  d; }

Are we feeding these as bug reports to the gcc folks?  This seems like 
something they should be aware of, in that if there are no side effects in 
evaluating the four boolean variables, the compiler should be generating 
identical code for either choice of operator.  If we can get the compiler 
fixed, then I'd lean towards using  notation.

-- 
Eric Blake







Re: summarizing gnulib changes for coreutils-7.7?

2009-09-23 Thread Eric Blake
Jim Meyering jim at meyering.net writes:

  OK, I've pushed the NEWS change (including the typo fix
  canonicalize=readlink), then a separate commit to resync gnulib.
 
 Thanks.

Here's another patch related to gnulib churn.  I am _NOT_ planning on 
attempting to port coreutils to mingw, so the mingw folks may still have a lot 
of work to make coreutils handle their platform's oddities nicely.  But this 
patch should be okay (it is a no-op if applied before my matching gnulib change 
to SAME_INODE to be tri-state [1]; and if applied afterwards, it should have no 
semantic changes for sane platforms where st_ino is never 0), and at least in 
theory will make life more bearable for the poor guy who does try to port 
coreutils to mingw.  OK to apply?

[1] http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/18854


From: Eric Blake e...@byu.net
Date: Wed, 23 Sep 2009 15:42:05 -0600
Subject: [PATCH] build: reflect same-inode changes from gnulib

On mingw, st_ino is always 0, so SAME_INODE returns -1.  Also,
st_nlink is always 1, so even though the link module can
create hard links, we can't detect them.  Here is an analysis
of all uses of SAME_INODE in coreutils:

root-dev-ino - mingw can't delete root dir anyway
chown-core - can't rename in-use files
copy - can't rename in-use files, no way to detect hard links
  to protect them, and no symlinks; behavior on identical
  files may be non-compliant, but that's life for mingw users
cp-hash - using -1 may lead to more hash collisions, but still
  correct behavior
du - no way to detect hard links, so they will be double-counted
ln - trying ln -f k k will delete k, but that's life for mingw
  and complies with POSIX
ls - no directory cycles, so no problem
pwd - getpwd is always accurate, and no symlinks so no -L needed
sort - will always use temporary file, but still correct

* gl/lib/root-dev-ino.h (ROOT_DEV_INO_CHECK): Update caller.
* src/du.c (entry_compare): Likewise.
* src/ln.c (do_link): Likewise.
* src/ls.c (dev_ino_compare): Likewise.
* src/pwd.c (robust_getcwd, logical_getcwd): Likewise.

Signed-off-by: Eric Blake e...@byu.net
---
 gl/lib/root-dev-ino.h |2 +-
 src/copy.c|2 +-
 src/du.c  |2 +-
 src/ln.c  |2 +-
 src/ls.c  |2 +-
 src/pwd.c |5 +++--
 6 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/gl/lib/root-dev-ino.h b/gl/lib/root-dev-ino.h
index bec27f0..3ee1d88 100644
--- a/gl/lib/root-dev-ino.h
+++ b/gl/lib/root-dev-ino.h
@@ -28,7 +28,7 @@ get_root_dev_ino (struct dev_ino *root_d_i);
--preserve-root and --no-preserve-root options.  */

 # define ROOT_DEV_INO_CHECK(Root_dev_ino, Dir_statbuf) \
-(Root_dev_ino  SAME_INODE (*Dir_statbuf, *Root_dev_ino))
+(Root_dev_ino  SAME_INODE (*Dir_statbuf, *Root_dev_ino) == 1)

 # define ROOT_DEV_INO_WARN(Dirname)\
   do   \
diff --git a/src/copy.c b/src/copy.c
index e3c5c52..7d26b59 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1971,7 +1971,7 @@ copy_internal (char const *src_name, char const *dst_name,
directory.  Other things will fail later.  */
 || stat (., dot_sb) != 0
 || stat (dst_parent, dst_parent_sb) != 0
-|| SAME_INODE (dot_sb, dst_parent_sb));
+|| SAME_INODE (dot_sb, dst_parent_sb) == 1);
   free (dst_parent);

   if (! in_current_dir)
diff --git a/src/du.c b/src/du.c
index 9831a17..7382ff1 100644
--- a/src/du.c
+++ b/src/du.c
@@ -354,7 +354,7 @@ entry_compare (void const *x, void const *y)
 {
   struct entry const *a = x;
   struct entry const *b = y;
-  return SAME_INODE (*a, *b) ? true : false;
+  return SAME_INODE (*a, *b) == 1 ? true : false;
 }

 /* Try to insert the INO/DEV pair into the global table, HTAB.
diff --git a/src/ln.c b/src/ln.c
index 0c35338..6e1ecdc 100644
--- a/src/ln.c
+++ b/src/ln.c
@@ -214,7 +214,7 @@ do_link (const char *source, const char *dest)
  misleading.  */
(backup_type == no_backups || !symbolic_link)
(!symbolic_link || stat (source, source_stats) == 0)
-   SAME_INODE (source_stats, dest_stats)
+   SAME_INODE (source_stats, dest_stats) == 1
   /* The following detects whether removing DEST will also remove
  SOURCE.  If the file has only one link then both are surely
  the same link.  Otherwise check whether they point to the same
diff --git a/src/ls.c b/src/ls.c
index 1bb6873..45543e4 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -1076,7 +1076,7 @@ dev_ino_compare (void const *x, void const *y)
 {
   struct dev_ino const *a = x;
   struct dev_ino const *b = y;
-  return SAME_INODE (*a, *b) ? true : false;
+  return SAME_INODE (*a, *b) == 1 ? true : false;
 }

 static void
diff --git a/src/pwd.c b/src/pwd.c
index cfbf5b7..a2e7a2b 100644
--- a/src/pwd.c
+++ b/src/pwd.c

Re: [PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Paul Eggert
Eric Blake e...@byu.net writes:

 bool F (bool a, bool b, bool c, bool d) { return a  b  c  d; }
 bool G (bool a, bool b, bool c, bool d) { return a  b  c  d; }

 the compiler should be generating identical code for either choice
 of operator.

Not necessarily, because it could be that the programmer wrote F that
way because its first argument is almost always false, in which case F
is faster even though it's bigger and consumes more instructions in
the worst case.




Re: [PATCH] maint: Use logical rather than bitwise operators on bools

2009-09-23 Thread Paul Eggert
Pádraig Brady p...@draigbrady.com writes:

 I did also check the speed with a test program
 (attached) which showed the logical test was a bit faster on my pentium-m.
 Note I compiled without optimization

I would expect optimization would be crucial: I got the 60% increase
in number of instructions for  when compiling with optimization, and
without optimization there was almost no difference between  and .

Although I admit that it bugs me that the change makes the code slower
and fatter, my primary objection is that short-circuit AND has more
complicated semantics than Boolean AND, and we shouldn't necessarily
favor the former over the latter.  I agree that the bitwise part of
the logical AND in traditional C 'int' semantics is a problem, but to
my mind it's not a problem that necessarily overwhelms the
short-circuit disadvantage.

If it's safe to assume proper 'bool' support now, that would address
the problem as well.  Perhaps it's better to make that assumption now
anyway, for reasons other than the one prompting this thread.  After
all, the subtle differences between C99 bool and C89
approximately-bool go beyond the  and || problem.