Re: strcache scaling issue

2011-02-23 Thread Ralf Wildenhues
* Paul Smith wrote on Mon, Feb 21, 2011 at 04:01:49PM CET:
 On Mon, 2011-02-21 at 02:30 -0500, Paul Smith wrote:
  Hi Ralf.  I promoted a rework of strcache.c.  For your simple scaling
  test on my system with debug compiled I get these results:

 Try #2:
 Build Info  10002000 4000
 3.82 -g 2.61s   8.85s   33.52s
 3.82 -O21.90s   7.62s   27.82s
 New -g (with asserts)   1.03s   2.31s5.79s
 New -O2 (no asserts)0.65s   1.50s3.52s

Cool, thanks for addressing this Paul!

I hope to have some Automake patches ready for Xan soon too ...


Bug-make mailing list

Re: separate the notions of always-build and jobserver

2011-01-26 Thread Ralf Wildenhues

* Ralf Wildenhues wrote on Tue, Sep 28, 2010 at 10:55:48PM CEST:
 Currently, 'make -n' only executes recipe commands prefixed with '+' or
 containing one of the strings ${MAKE} or $(MAKE).  Likewise, parallel
 make hands the jobserver file descriptors only to rules annotated in the
 same way.
 It would be nice if these two semantics could be decoupled.

FYI, to follow up to this old message myself, according to the other
message I just sent, I intend to use the following for this:

 1) The GCC LTO (link-time optimization) engine may exploit
 parallelization during whole-program linking by hooking into the job

in Automake:

am__if_make_flag = \
$(if $(strip $(foreach flag,$(filter-out --%,$(MAKEFLAGS)),$(if \
  $(findstring =,$(flag)),,$(if \
$(findstring $(1),$(flag)),:,$(2),$(3))
LTOPAR = $(call am__if_make_flag,n,,+)

and prefix link recipe commands with $(LTOPAR) (if GNU make is used).


Bug-make mailing list

docs: fix MAKEFLAGS testing example

2011-01-26 Thread Ralf Wildenhues
I think it's been reported here before, but I can't find the reference
now.  Anyway, the ranlib example in the manual is wrong because it is
way too simplistic; try, e.g.,
  make --no-print-directory FOO=bart

Below is a patch that might make the section less fitting for what it
was intended, but at least the code has less problems, or so I hope;
also, I think it is useful to have this code snippet in the manual
so people can copy it.

And yes, even the line wrapping cause headaches to get right.


2011-01-26  Ralf Wildenhues

* doc/make.texi (Testing Flags): Fix MAKEFLAGS testing example
and adjusted text to cope with passed variable assignments and
long options.

Index: doc/make.texi
RCS file: /cvsroot/make/make/doc/make.texi,v
retrieving revision 1.69
diff -u -r1.69 make.texi
--- doc/make.texi   30 Nov 2010 14:48:53 -  1.69
+++ doc/make.texi   26 Jan 2011 20:08:49 -
@@ -6449,23 +6449,31 @@
 @section Conditionals that Test Flags
 You can write a conditional that tests @code{make} command flags such as
-@samp{-t} by using the variable @code{MAKEFLAGS} together with the
-@code{findstring} function
-(@pxref{Text Functions, , Functions for String Substitution and Analysis}).
+@samp{-t} by using the variable @code{MAKEFLAGS} together with the a
+couple of @code{make} functions
+(@pxref{Functions, , Functions for Transforming Text}).
 This is useful when @code{touch} is not enough to make a file appear up
 to date.
-The @code{findstring} function determines whether one string appears as a
-substring of another.  If you want to test for the @samp{-t} flag,
-use @samp{t} as the first string and the value of @code{MAKEFLAGS} as
-the other.
+The @code{foreach} function expands text repeatedly for each element of
+a list.  The @code{filter-out} function removes words from text matching
+a pattern.  The @code{findstring} function determines whether one string
+appears as a substring of another.  Finally, the @code{if} function
+expands code conditionally, based on whether the condition is nonempty
+or not.
+Now, if you want to test for the @samp{-t} flag, you can use @samp{t} as
+the string to find in @code{MAKEFLAGS}, after filtering out long options
+and variable assignments that might match wrongly
 For example, here is how to arrange to use @samp{ranlib -t} to finish
 marking an archive file up to date:
 archive.a: @dots{}
-ifneq (,$(findstring t,$(MAKEFLAGS)))
+ifneq (,$(foreach flag,$(filter-out --%,$(MAKEFLAGS)),$(if \
+  $(findstring =,$(flag)),,$(findstring t,$(flag)
 +touch archive.a
 +ranlib -t archive.a

Bug-make mailing list

Re: strcache scaling issue

2011-01-23 Thread Ralf Wildenhues
Hi Paul,

* Paul Smith wrote on Sat, Jan 22, 2011 at 10:27:01PM CET:
 Hi Ralf, sorry for the delay in reply.

No worries.

 On Sun, 2011-01-09 at 08:50 +0100, Ralf Wildenhues wrote:
  Just breaking out of the for loop in the if-true case avoids the problem
  for me.  (The bulk of the patch is cleanup to remove the unneeded 'best'
  variable.)  Of course that only delays quadratic behavior as with filled
  caches, all of them are still walked.
 Thanks for your investigations here; that's very helpful.  I had also
 had some thoughts about improving the strcache in various ways.  For
 example, I was thinking maybe about splitting it into two: one for
 filenames and one for variable names, since these strings very rarely

But a hash doesn't degrade in quality just because you put two different
kinds of things in it.  I wouldn't go for more complexity here.  On the
contrary, if you want an example where one big hash table is used for
several different kinds of objects very successfully, look at git.

 One other thing that would probably be useful is that someone posted
 some patches that changed the hashing in GNU make to use a PATRICIA tree
 instead of a standard hash (the hashing in GNU make is taken from
 idutils); I think this could help a lot in the fairly common case where
 there are lots of strings which share prefixes.  I asked for copyright
 assignment for that code but I'm not sure where it stands.

Yes, this sounds like it might help.  FWIW, I would only go that way if
there are use cases where this demonstrably helps, and the log(N)
behavior doesn't slow others down a lot.

 But as always, it's better to test and fix real issues rather than guess
 as to where your issues might be.  I'll definitely look into making
 changes along the lines you suggest.

Thank you!

  strcache_iscached is more worrisome with 14-fold increase.  Looking at
  its implementation, I wonder why it doesn't do a hash lookup rather
  than looping over all the cache entries.
 I was thinking that doing simple pointer compares across the buffers
 would be not much slower, and even faster in simple cases, than the hash
 lookup, but obviously when the # of buffers gets large that's not true.
 On the other hand, the strcache_iscached() function is really only
 intended for debug/verification.  It's called in an assert() (in file.c)
 I added during development of the string cache to be sure that strings
 expected to be in the cache, really were.  For normal use that
 could/should be disabled.  Perhaps I need a special flag controlling
 stuff like this.  I could add a compiler flag to enable these checks in
 maintenance mode, maybe.

Sounds like a good idea.


Bug-make mailing list

Re: strcache scaling issue

2011-01-22 Thread Ralf Wildenhues
Ping!  :-)

* Ralf Wildenhues wrote on Sun, Jan 09, 2011 at 08:50:34AM CET:
 2011-01-09  Ralf Wildenhues
   * strcache.c (add_string): Use first fitting cache, to
   avoid quadratic behavior searching for an optimal-size cache.
   (strcache_iscached): Use the hash instead of walking the string
   Prompted by a report against Automake by Xan Lopez

Any feedback on this thread?

and by the way, maybe also on this one?

Do you prefer opening a bug tracker/patch tracker entry to messages sent
directly to bug-make, so they don't get lost?


Bug-make mailing list

strcache scaling issue

2011-01-09 Thread Ralf Wildenhues
 implementing a priority queue for the
set of free bytes, or use a proper allocator scheme.  Generally, this is
all still wrong though.  make shouldn't spend 18s in its string handling
code for a 2MB Makefile specifying 16K files (8000 in above shell
script) on a modern system.  I'd expect 1-2s at most, and the rest in
stat (this is with the patch below applied):

  %   cumulative   self  self total   
 time   seconds   secondscalls   s/call   s/call  name
 32.14  8.46 8.46  3840369 0.00 0.00  add_string
 16.34 12.76 4.30 34547800 0.00 0.00  hash_find_slot
 11.15 15.70 2.9432002 0.00 0.00  pattern_search
  6.78 17.48 1.79 10281361 0.00 0.00  str_hash_1
  5.13 18.83 1.35  8155410 0.00 0.00  dirfile_hash_1
  4.94 20.13 1.3048000 0.00 0.00  record_files
  4.92 21.43 1.30 25410516 0.00 0.00  dirfile_hash_cmp
  3.53 22.36 0.93  6277494 0.00 0.00  str_hash_2
  3.21 23.20 0.85  5102123 0.00 0.00  file_hash_1
  2.26 23.80 0.60  5403326 0.00 0.00  dirfile_hash_2
  1.82 24.28 0.48  5238695 0.00 0.00  file_hash_cmp
  1.44 24.66 0.38  2517085 0.00 0.00  file_hash_2
  1.25 24.99 0.33 11008694 0.00 0.00  directory_hash_1
  0.46 25.11 0.12  5472342 0.00 0.00  file_impossible_p
  0.42 25.22 0.11  4928311 0.00 0.00  file_exists_p
  0.40 25.32 0.11 34208095 0.00 0.00  str_hash_cmp
  0.38 25.42 0.10   448046 0.00 0.00  find_char_unquote
  0.38 25.52 0.10   28 0.00 0.07  hash_rehash
  0.30 25.60 0.08 11666204 0.00 0.00  hash_find_item
  0.27 25.67 0.07  4484685 0.00 0.00  hash_insert_at
  0.23 25.73 0.06  4961544 0.00 0.00  lookup_file
  0.21 25.79 0.06 11008691 0.00 0.00  directory_hash_cmp
  0.19 25.84 0.05  6212628 0.00 0.00  add_hash

If the expected average string length is bounded away from
CACHE_BUFFER_SIZE (which I'm sure it is), then it would even make sense
to not ever look at any but the first strcache in the list at all; the
loss would still be bounded away from 50%.  Memory is cheap these days,
and I don't know of projects that parse GB worth of non-repeating
makefile text.  (I have not done this step in the patch below, feel free
to add it though.)


2011-01-09  Ralf Wildenhues

* strcache.c (add_string): Use first fitting cache, to
avoid quadratic behavior searching for an optimal-size cache.
(strcache_iscached): Use the hash instead of walking the string
Prompted by a report against Automake by Xan Lopez

Index: strcache.c
RCS file: /cvsroot/make/make/strcache.c,v
retrieving revision 2.9
diff -u -p -r2.9 strcache.c
--- strcache.c  13 Jul 2010 01:20:43 -  2.9
+++ strcache.c  9 Jan 2011 07:41:44 -
@@ -63,7 +63,6 @@ new_cache()
 static const char *
 add_string(const char *str, int len)
-  struct strcache *best = NULL;
   struct strcache *sp;
   const char *res;
@@ -73,26 +72,24 @@ add_string(const char *str, int len)
   if (len  bufsize)
 bufsize = len * 2;
-  /* First, find a cache with enough free space.  We always look through all
- the blocks and choose the one with the best fit (the one that leaves the
- least amount of space free).  */
+  /* First, find a cache with enough free space.  */
   for (sp = strcache; sp != NULL; sp = sp-next)
-if (sp-bytesfree  len  (!best || best-bytesfree  sp-bytesfree))
-  best = sp;
+if (sp-bytesfree  len)
+  break;
   /* If nothing is big enough, make a new cache.  */
-  if (!best)
-best = new_cache();
+  if (!sp)
+sp = new_cache();
-  assert (best-bytesfree  len);
+  assert (sp-bytesfree  len);
   /* Add the string to the best cache.  */
-  res = best-end;
-  memcpy (best-end, str, len);
-  best-end += len;
-  *(best-end++) = '\0';
-  best-bytesfree -= len + 1;
-  ++best-count;
+  res = sp-end;
+  memcpy (sp-end, str, len);
+  sp-end += len;
+  *(sp-end++) = '\0';
+  sp-bytesfree -= len + 1;
+  ++sp-count;
   return res;
@@ -144,13 +141,7 @@ add_hash (const char *str, int len)
 strcache_iscached (const char *str)
-  struct strcache *sp;
-  for (sp = strcache; sp != 0; sp = sp-next)
-if (str = sp-buffer  str  sp-end)
-  return 1;
-  return 0;
+  return hash_find_item (strings, str) != NULL;
 /* If the string is already in the cache, return a pointer to the cached

Bug-make mailing list

prerequisite scaling (was: strcache scaling issue)

2011-01-09 Thread Ralf Wildenhues
Continuing with the previous example makefile, but this time with 'make
-r', I get a bit further, but only to hit the next bumper: this is with

  %   cumulative   self  self total   
 time   seconds   secondscalls   s/call   s/call  name
 96.75 25.8825.88   12 0.00 0.00  record_files
  0.79 26.09 0.21  112 0.00 0.00  find_char_unquote
  0.64 26.26 0.17  1212990 0.00 0.00  hash_find_slot
  0.26 26.33 0.07   799239 0.00 0.00  str_hash_1

-: 2065:  /* Add the dependencies to this file entry.  */
   168000: 2066:  if (this != 0)
-: 2067:{
-: 2068:  /* Add the file's old deps and the new ones in THIS 
together.  */
   112000: 2069:  if (f-deps == 0)
56002: 2070:f-deps = this;
55998: 2071:  else if (cmds != 0)
-: 2072:{
#: 2073:  struct dep *d = this;
-: 2081:}
-: 2082:  else
-: 2083:{
55998: 2084:  struct dep *d = f-deps;
-: 2085:
-: 2086:  /* A rule without commands: put its prereqs at 
the end.  */
928027998: 2087:  while (d-next != 0)
927916002: 2088:d = d-next;
-: 2089:
55998: 2090:  d-next = this;
-: 2091:}
-: 2092:}


A doubly-linked list would be overkill (and memory-intensive), but I
think storing an end pointer to the dep chain in 'struct file' might
be prudent.  That requires some changes throughout the code though,
and warrants some data structure change to avoid the obvious error
(of updating only -deps but not -deps_end).  Maybe a

  struct dep_list {
struct dep *deps;   /* all dependencies, including duplicates */
struct dep *last_dep;   /* last dependency in the deps list */

(the second one could be a double pointer if removal is ever needed)
and 'struct file' containing such a struct?

If you agree, I might work on a patch; but I certainly won't mind being
beaten to it.


Bug-make mailing list

[bug #28134] inconsistent reporting of linux kernel builds with -j2

2009-12-17 Thread Ralf Wildenhues

Follow-up Comment #1, bug #28134 (project make):

Try /tmp/C_out 2/tmp/C_err so that append mode is used so that writes to
the files from different processes are atomic and don't overwrite each other.


Reply to this item at:

  Message sent via/by Savannah

Bug-make mailing list

small configure correctness patch

2009-07-29 Thread Ralf Wildenhues

BTW, I noticed in my make tree this patch that has not made it into
upstream yet:
Maybe you could take a peek at it?

Other than that, here is a small patch to fix a configure glitch:
with AC_CACHE_VAL, the third argument should contain code that sets the
cache variable, but not have any other side-effects.  The reason for
this is that, with a './config.status --recheck' (that may be triggered
by make), the cached values from config.cache (if enabled) may cause
these commands to be by-passed.  The fix is to move side-effects after
the check, and key them off the cache variable, as below.
(Newer autoconfs warn about this issue.)

The move to put the AC_SUBSTs outside of any conditional code is not
needed for correctness, merely done for clarity: the substitution is
done in any case anyway.


2009-07-29  Ralf Wildenhues

* Move side-effects outside AC_CACHE_VAL arguments
that set make_cv_sys_gnu_glob, so they are also correctly set
when the cache has been populated before.

RCS file: /cvsroot/make/make/,v
retrieving revision 1.150
diff -u -r1.150
--- configure.in4 Jun 2009 06:30:27 -   1.150
+++ configure.in29 Jul 2009 18:01:13 -
@@ -364,9 +365,13 @@
  ], [AC_MSG_RESULT(yes)
 make_cv_sys_gnu_glob=yes], [AC_MSG_RESULT([no; using local copy])
-AC_SUBST(GLOBINC) GLOBINC='-I$(srcdir)/glob'
+if test $make_cv_sys_gnu_glob = no; then
+  GLOBINC='-I$(srcdir)/glob'
+  GLOBLIB=glob/libglob.a
 # Tell automake about this, so it can build the right .c files.
 AM_CONDITIONAL(USE_LOCAL_GLOB, test $make_cv_sys_gnu_glob = no)

Bug-make mailing list

ulimit and unset are unixy sh internals

2009-07-29 Thread Ralf Wildenhues
I just happened to debug a makefile by adding a 'ulimit -a' command to a
recipe, and was surprised to see it fail.  :-)

So I went through the list of builtins from other shells in job.c and
came up with this patch that I think you need for a bit more safety on
unixy shells.

Thanks for maintaining GNU make,

2009-07-29  Ralf Wildenhues

* job.c (construct_command_argv_internal): Add ulimit and
unset to the sh_cmds for Unixy shells.

Index: job.c
RCS file: /cvsroot/make/make/job.c,v
retrieving revision 1.193
diff -u -r1.193 job.c
--- job.c   9 Jun 2009 15:35:38 -   1.193
+++ job.c   29 Jul 2009 18:06:38 -
@@ -2327,7 +2327,7 @@
  eval, exec, exit, export, for, if,
  login, logout, read, readonly, set,
  shift, switch, test, times, trap,
- umask, wait, while, 0 };
+ ulimit, umask, unset, wait, while, 0 };
   /* This is required if the MSYS/Cygwin ports (which do not define
  WINDOWS32) are compiled with HAVE_DOS_PATHS defined, which uses

Bug-make mailing list

interaction of make -t, -q, phony targets and $(MAKE)

2009-02-28 Thread Ralf Wildenhues
Hello bug makers,  ;-)

in your infinite wisdom, I am sure you can help my confused head wrap
itself around this behavior of (CVS) make:

cat Makefile \EOF
a b:
echo $@
echo $@, $(MAKE)

c d:
echo $@

e f:
echo $@, $(MAKE)

g h:
echo $@, $(MAKE)
echo $@

.PHONY: a c e g
rm -f ?
make -t a b c d e f g h # leads to:
| echo a, make
| a, make
| echo b, make
| b, make
| touch b
| make: Nothing to be done for `c'.
| touch d
| echo e, make
| e, make
| echo f, make
| f, make
| echo g, make
| g, make
| echo h, make
| h, make
| touch h
| touch h

This means phony targets which are not marked as recursive (c) are not
touched nor executed.  Is this a bug?  I would guess not.  If you agree
that it is intentional, then I suggest that this be documented in the
manual, proposed patch below.

BTW, I think this behavior is useful.

The other funny thing is the duplicate touch lines for 'h'.  :-)

More funny though is this:

rm -f ?
make -q a b c d e f g h # leads to:
| echo e, make
| e, make
| echo f, make
| f, make
| echo g, make
| g, make
| echo h, make
| h, make

You only check the first command line for $(MAKE)?  Is this an
intended optimization or an unintended bug?


2009-02-28  Ralf Wildenhues

* doc/make.texi (Instead of Execution): Document interaction of
-t with phony targets.

Index: doc/make.texi
RCS file: /cvsroot/make/make/doc/make.texi,v
retrieving revision 1.52
diff -u -r1.52 make.texi
--- doc/make.texi   18 May 2008 15:11:40 -  1.52
+++ doc/make.texi   28 Feb 2009 12:46:16 -
@@ -7741,6 +7741,11 @@
 not run unless they too begin with @samp{+} or contain @samp{$(MAKE)} or
 @sam...@{make@}} (@xref{MAKE Variable, ,How the @code{MAKE} Variable Works}.)
+...@cindex phony targets and recipe execution
+With @samp{-t}, phony targets (@pxref{Phony Targets}) are not updated
+nor executed, unless there are recipe lines beginning with @samp{+} or
+containing @samp{$(MAKE)} or @sam...@{make@}}.
 The @samp{-W} flag provides two features:
 @itemize @bullet

Bug-make mailing list

[bug #17873] .NOTPARALLEL enhancements

2008-06-23 Thread Ralf Wildenhues

Follow-up Comment #31, bug #17873 (project make):

I don't see any functionality asked for in this bug, that cannot
be emulated with order-only prerequisites.  What am I missing?


Reply to this item at:

  Message sent via/by Savannah

Bug-make mailing list


2006-08-06 Thread Ralf Wildenhues
Hello bug-make readers,

I noticed a couple of typos in the make manual.


* doc/make.texi: Fix some typos.

Index: doc/make.texi
RCS file: /cvsroot/make/make/doc/make.texi,v
retrieving revision 1.45
diff -u -r1.45 make.texi
--- doc/make.texi   1 Apr 2006 06:36:40 -   1.45
+++ doc/make.texi   6 Aug 2006 18:40:52 -
@@ -1937,7 +1937,7 @@
 prerequisite you must write two of them, @samp{$$} (@pxref{Using
 Variables, ,How to Use Variables}).  If you have enabled secondary
 expansion (@pxref{Secondary Expansion}) and you want a literal dollar
-sign in the prerequisites lise, you must actually write @emph{four}
+sign in the prerequisites list, you must actually write @emph{four}
 dollar signs (@samp{}).
 You may split a long line by inserting a backslash followed by a
@@ -3093,7 +3093,7 @@
 the last set given and prints an error message.  (As a special case,
 if the file's name begins with a dot, no error message is printed.
 This odd behavior is only for compatibility with other implementations
-of @code{make}... you should avoid using it).  Occasionally it is
+of @[EMAIL PROTECTED] you should avoid using it).  Occasionally it is
 useful to have the same target invoke multiple commands which are
 defined in different parts of your makefile; you can use
 @dfn{double-colon rules} (@pxref{Double-Colon}) for this.
@@ -3653,8 +3653,8 @@
 Notice how the backslash/newline pair was removed inside the string quoted
-with double quotes (@code{...}), but not from the string quoted with single
-quotes (@code{'...'}).  This is the way the default shell (@file{/bin/sh})
+with double quotes (@code{@dots{}}), but not from the string quoted with 
+quotes (@code{'@dots{}'}).  This is the way the default shell (@file{/bin/sh})
 handles backslash/newline pairs.  If you specify a different shell in your
 makefiles it may treat them differently.

Bug-make mailing list