This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU M4 source repository".
http://git.sv.gnu.org/gitweb/?p=m4.git;a=commitdiff;h=58ffaf13f5206f8bac8ec4ebad8147e99157263d The branch, branch-1_4 has been updated via 58ffaf13f5206f8bac8ec4ebad8147e99157263d (commit) from 10801988a863c045aa0dc06b56575f1cc52bc586 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 58ffaf13f5206f8bac8ec4ebad8147e99157263d Author: Eric Blake <[EMAIL PROTECTED]> Date: Mon Nov 19 19:40:49 2007 -0700 Stage 18: try harder to reuse argv in recursion. * src/macro.c (make_argv_ref_token): Avoid wrapping $@ when possible. (push_args): Let make_argv_ref_token take care of pending data. * doc/m4.texinfo (Improved foreach): Tweak wording to match new performance capability. Add regression tests. * NEWS: Document the speedup. * src/m4.c (AUTHORS): Claim credit for my rewrite. (cherry picked from commit 58d580eeca1f75ddd2ca68d8b93fef6eead14350) Signed-off-by: Eric Blake <[EMAIL PROTECTED]> ----------------------------------------------------------------------- Summary of changes: ChangeLog | 16 +++++++++ NEWS | 4 ++- doc/m4.texinfo | 96 +++++++++++++++++++++++++++++++++++++++++++++++++------ src/m4.c | 2 +- src/macro.c | 80 ++++++++++++++++++++++++++-------------------- 5 files changed, 150 insertions(+), 48 deletions(-) diff --git a/ChangeLog b/ChangeLog index 62c4ad1..1269290 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,19 @@ +2008-02-23 Eric Blake <[EMAIL PROTECTED]> + + Stage 18: try harder to reuse argv in recursion. + When pushing arguments that contain an existing $@ ref, reuse the + ref rather than creating another layer of wrappers. + Memory impact: noticeable improvement, due to better $@ reuse. + Speed impact: noticeable improvement, due to O(n^2) to O(n) + reduction in unboxed recursion. + * src/macro.c (make_argv_ref_token): Avoid wrapping $@ when + possible. + (push_args): Let make_argv_ref_token take care of pending data. + * doc/m4.texinfo (Improved foreach): Tweak wording to match new + performance capability. Add regression tests. + * NEWS: Document the speedup. + * src/m4.c (AUTHORS): Claim credit for my rewrite. + 2008-02-22 Eric Blake <[EMAIL PROTECTED]> Stage 17: pass argv through quoted strings. diff --git a/NEWS b/NEWS index 31a41c2..8faafcf 100644 --- a/NEWS +++ b/NEWS @@ -27,7 +27,9 @@ Version 1.4.11 - ?? ??? 2008, by ???? (git version 1.4.10a-*) regular expressions, which speeds up typical Autoconf usage. * Enhance the `format' builtin to warn for more suspicious usages, such as missing arguments or problems parsing according to the format string. -* Memory usage is greatly reduced in recursive macros. +* Enhance the `ifelse' and `shift' builtins so that tail-recursive + algorithms based on `$@' operate in linear, rather than quadratic, time + and memory. * A number of portability improvements inherited from gnulib. Version 1.4.10 - 09 Jul 2007, by Eric Blake (CVS version 1.4.9c) diff --git a/doc/m4.texinfo b/doc/m4.texinfo index 2a549dd..92a8eff 100644 --- a/doc/m4.texinfo +++ b/doc/m4.texinfo @@ -7195,17 +7195,17 @@ helper method immediately, the @samp{defn([EMAIL PROTECTED]')} no longer contains unexpanded macros. The astute m4 programmer might notice that the solution above still uses -more memory, and thus more time, than strictly necessary. Note that [EMAIL PROTECTED], which contains an arbitrarily long quoted list, is expanded -and rescanned three times per iteration of @code{_foreachq}. -Furthermore, every iteration of the algorithm effectively unboxes then -reboxes the list, which costs a couple of macro invocations. It is -possible to rewrite the algorithm for a bit more speed by swapping the -order of the arguments to @code{_foreachq} in order to operate on an -unboxed list in the first place, and by using the fixed-length @samp{$#} -instead of an arbitrary length list as the key to end recursion. This -alternative approach is available as [EMAIL PROTECTED]@value{VERSION}/@/examples/@/foreach3.m4}: +more macro invocations than strictly necessary. Note that @samp{$2}, +which contains an arbitrarily long quoted list, is expanded and +rescanned three times per iteration of @code{_foreachq}. Furthermore, +every iteration of the algorithm effectively unboxes then reboxes the +list, which costs a couple of macro invocations. It is possible to +rewrite the algorithm by swapping the order of the arguments to [EMAIL PROTECTED] in order to operate on an unboxed list in the first +place, and by using the fixed-length @samp{$#} instead of an arbitrary +length list as the key to end recursion. The result is eight macro +invocations per loop, instead of nine. This alternative approach is +available as @[EMAIL PROTECTED]/@/examples/@/foreach3.m4}: @comment examples @example @@ -7248,6 +7248,20 @@ foreachq(`x', ``1', `2', `3', `4'', `x @result{}4 @end example +Prior to M4 1.4.11, every instance of @samp{$@@} was rescanned as it was +encountered. Thus, the @file{foreachq3.m4} alternative used much less +memory than @file{foreachq2.m4}, and executed as much as 10% faster, +since each iteration encountered fewer @samp{$@@}. However, the +implementation of rescanning every byte in @samp{$@@} was quadratic in +the number of bytes scanned (for example, making the broken version in [EMAIL PROTECTED] cubic, rather than quadratic, in behavior). Once the +underlying M4 implementation was improved in 1.4.11 to reuse results of +previous scans, both styles of @code{foreachq} become linear in the +number of bytes scanned, and the difference in timing is no longer +noticeable; in fact, after the change, the @file{foreachq2.m4} version +uses slightly less memory since it tracks fewer arguments per macro +invocation. + For yet another approach, the improved version of @code{foreach}, available in @[EMAIL PROTECTED]/@/examples/@/foreach2.m4}, simply overquotes the arguments to @[EMAIL PROTECTED] to begin with, using @@ -7376,6 +7390,66 @@ foreachq(`x', ```active'', ``active''', `<x> @result{}<active> @end example [EMAIL PROTECTED] [EMAIL PROTECTED] Not worth putting in the manual, but make sure that performance [EMAIL PROTECTED] on recursive algorithms is not quadratic. + [EMAIL PROTECTED] boxed recursion + [EMAIL PROTECTED] examples [EMAIL PROTECTED] options: -Dlimit=10 -Dverbose [EMAIL PROTECTED] +$ @kbd {m4 -I examples -Dlimit=10 -Dverbose} +include(`loop.m4')dnl [EMAIL PROTECTED] 1 2 3 4 5 6 7 8 9 10 [EMAIL PROTECTED] example + [EMAIL PROTECTED] examples [EMAIL PROTECTED] options: -Dlimit=2500 [EMAIL PROTECTED] +$ @kbd {m4 -I examples -Dlimit=2500} +include(`loop.m4')dnl [EMAIL PROTECTED] example + [EMAIL PROTECTED] examples [EMAIL PROTECTED] options: -Dlimit=10000 [EMAIL PROTECTED] +$ @kbd {m4 -I examples -Dlimit=10000} +define(`debug', `define(`popdef',`divert`'i')') [EMAIL PROTECTED] +include(`loop.m4')dnl [EMAIL PROTECTED] [EMAIL PROTECTED] example + [EMAIL PROTECTED] unboxed recursion + [EMAIL PROTECTED] examples [EMAIL PROTECTED] options: -Dlimit=10 -Dverbose -Dalt [EMAIL PROTECTED] +$ @kbd {m4 -I examples -Dlimit=10 -Dverbose -Dalt} +include(`loop.m4')dnl [EMAIL PROTECTED] 1 2 3 4 5 6 7 8 9 10 [EMAIL PROTECTED] example + [EMAIL PROTECTED] examples [EMAIL PROTECTED] options: -Dlimit=2500 -Dalt [EMAIL PROTECTED] +$ @kbd {m4 -I examples -Dlimit=2500 -Dalt} +include(`loop.m4')dnl [EMAIL PROTECTED] example + [EMAIL PROTECTED] examples [EMAIL PROTECTED] options: -Dlimit=10000 -Dalt [EMAIL PROTECTED] +$ @kbd {m4 -I examples -Dlimit=10000 -Dalt} +define(`debug', `define(`popdef',`divert`'i')') [EMAIL PROTECTED] +include(`loop.m4')dnl [EMAIL PROTECTED] [EMAIL PROTECTED] example + [EMAIL PROTECTED] ignore + @node Improved cleardivert @section Solution for @code{cleardivert} diff --git a/src/m4.c b/src/m4.c index af4991f..0ace6dc 100644 --- a/src/m4.c +++ b/src/m4.c @@ -27,7 +27,7 @@ #include "version-etc.h" -#define AUTHORS "Rene' Seindal" +#define AUTHORS "Rene' Seindal", "Eric Blake" static void usage (int); diff --git a/src/macro.c b/src/macro.c index c0a24c4..32ff62d 100644 --- a/src/macro.c +++ b/src/macro.c @@ -1285,24 +1285,55 @@ make_argv_ref_token (token_data *token, struct obstack *obs, int level, { token_chain *chain; - assert (obstack_object_size (obs) == 0); - /* TODO support $@ refs when argv array is larger than 1. */ - if (argv->wrapper && argv->arraylen == 1) - { - assert (TOKEN_DATA_TYPE (argv->array[0]) == TOKEN_COMP - && argv->array[0]->u.u_c.wrapper); - chain = argv->array[0]->u.u_c.chain; - assert (!chain->next && chain->type == CHAIN_ARGV - && !chain->u.u_a.skip_last); - argv = chain->u.u_a.argv; - index += chain->u.u_a.index - 1; - } if (index >= argv->argc) return NULL; + TOKEN_DATA_TYPE (token) = TOKEN_COMP; + token->u.u_c.chain = token->u.u_c.end = NULL; + /* Cater to the common idiom of $0(`$1',shift(shift($@))), by + inlining the first few arguments and reusing the original $@ ref, + rather than creating another layer of wrappers. */ + while (argv->wrapper) + { + unsigned int i; + for (i = 0; i < argv->arraylen; i++) + { + if (TOKEN_DATA_TYPE (argv->array[i]) == TOKEN_COMP + && argv->array[i]->u.u_c.wrapper) + break; + if (index == 1) + { + push_arg_quote (obs, argv, i + 1, quotes); + obstack_1grow (obs, ','); + } + else + index--; + } + assert (i < argv->arraylen); + if (i + 1 == argv->arraylen) + { + assert (TOKEN_DATA_TYPE (argv->array[i]) == TOKEN_COMP + && argv->array[i]->u.u_c.wrapper); + chain = argv->array[i]->u.u_c.chain; + assert (!chain->next && chain->type == CHAIN_ARGV + && !chain->u.u_a.skip_last); + argv = chain->u.u_a.argv; + index += chain->u.u_a.index - 1; + } + else + { + index += i; + break; + } + } + + make_text_link (obs, &token->u.u_c.chain, &token->u.u_c.end); chain = (token_chain *) obstack_alloc (obs, sizeof *chain); - TOKEN_DATA_TYPE (token) = TOKEN_COMP; - token->u.u_c.chain = token->u.u_c.end = chain; + if (token->u.u_c.end) + token->u.u_c.end->next = chain; + else + token->u.u_c.chain = chain; + token->u.u_c.end = chain; token->u.u_c.wrapper = true; token->u.u_c.has_func = argv->has_func; chain->next = NULL; @@ -1416,8 +1447,6 @@ push_args (struct obstack *obs, macro_arguments *argv, bool skip, bool quote) unsigned int i = skip ? 2 : 1; token_data td; token_data *token; - char *str = NULL; - size_t len = obstack_object_size (obs); if (i >= argv->argc) return; @@ -1428,29 +1457,10 @@ push_args (struct obstack *obs, macro_arguments *argv, bool skip, bool quote) return; } - /* Since make_argv_ref_token puts data on obs, we must first close - any pending data. The resulting token contents live entirely on - obs, so we call push_token with a level of -1. */ - if (len) - { - obstack_1grow (obs, '\0'); - str = (char *) obstack_finish (obs); - } /* TODO allow shift, $@, to push builtins without flatten. */ token = make_argv_ref_token (&td, obs, -1, argv, i, true, quote ? &curr_quote : NULL); assert (token); - if (len) - { - token_chain *chain = (token_chain *) obstack_alloc (obs, sizeof *chain); - chain->next = token->u.u_c.chain; - token->u.u_c.chain = chain; - chain->type = CHAIN_STR; - chain->quote_age = 0; - chain->u.u_s.str = str; - chain->u.u_s.len = len; - chain->u.u_s.level = -1; - } if (push_token (token, -1, argv->inuse)) arg_mark (argv); } hooks/post-receive -- GNU M4 source repository
