This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU M4 source repository".
http://git.sv.gnu.org/gitweb/?p=m4.git;a=commitdiff;h=59d3cfafa8d73e43a974bc066722cd6220cb479f

The branch, branch-1.6 has been updated
       via  59d3cfafa8d73e43a974bc066722cd6220cb479f (commit)
       via  e9e4abba45f7e9f368cf497e14bc2ce64b867a02 (commit)
      from  9b6e3c06e836c35480030a13150e9ed2b6a6ee4f (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 59d3cfafa8d73e43a974bc066722cd6220cb479f
Author: Eric Blake <[email protected]>
Date:   Fri Dec 26 00:45:24 2008 -0700

    Enhance substr to support replacement text.

    * doc/m4.texinfo (Substr): Document new semantics.
    * src/builtin.c (m4_substr): Support optional fourth argument.
    * NEWS: Document this.

    Signed-off-by: Eric Blake <[email protected]>

commit e9e4abba45f7e9f368cf497e14bc2ce64b867a02
Author: Eric Blake <[email protected]>
Date:   Fri Dec 26 00:33:18 2008 -0700

    Enhance substr to support negative values.

    * doc/m4.texinfo (Substr): Document new semantics, and how to
    simulate old.
    * src/builtin.c (m4_substr): Support negative values.
    * NEWS: Document this.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog      |   13 ++++
 NEWS           |   10 +++-
 doc/m4.texinfo |  187 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 src/builtin.c  |   73 ++++++++++++++++------
 4 files changed, 252 insertions(+), 31 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 41719e2..c590435 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,16 @@
+2009-01-06  Eric Blake  <[email protected]>
+
+	Enhance substr to support replacement text.
+	* doc/m4.texinfo (Substr): Document new semantics.
+	* src/builtin.c (m4_substr): Support optional fourth argument.
+	* NEWS: Document this.
+
+	Enhance substr to support negative values.
+	* doc/m4.texinfo (Substr): Document new semantics, and how to
+	simulate old.
+	* src/builtin.c (m4_substr): Support negative values.
+	* NEWS: Document this.
+
 2009-01-05  Eric Blake  <[email protected]>
 
 	Use nicer email address in web manual.
diff --git a/NEWS b/NEWS
index 2e1a286..cbea814 100644
--- a/NEWS
+++ b/NEWS
@@ -1,6 +1,6 @@
 GNU M4 NEWS - User visible changes.
-Copyright (C) 1992, 1993, 1994, 2004, 2005, 2006, 2007, 2008 Free Software
-Foundation, Inc.
+Copyright (C) 1992, 1993, 1994, 2004, 2005, 2006, 2007, 2008, 2009 Free
+Software Foundation, Inc.
 
 * Noteworthy changes in Version 1.6 (????-??-??) [stable]
   Released by ????, based on git versions 1.4.10b.x-* and 1.5.*
@@ -53,6 +53,12 @@ Foundation, Inc.
    the current expansion is nested within argument collection of another
    macro.  It has also been optimized for faster performance.
 
+** The `substr' builtin now treats negative arguments as indices relative
+   to the end of the string, and accepts an optional fourth argument of
+   text to supply in place of the selected substring.  The manual gives an
+   example of how to recover M4 1.4.x behavior, as well as an example of
+   simulating the new negative argument semantics with older M4.
+
 ** The `-d'/`--debug' command-line option now understands `-' and `+'
    modifiers, the way the builtin `debugmode' has always done; this allows
    `-d-V' to disable prior debug settings from the command line, similar to
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 93adb64..37600c4 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -44,7 +44,7 @@ This manual (@value{UPDATED}) is for @acronym{GNU} M4 (version
 language.
 
 Copyright @copyright{} 1989, 1990, 1991, 1992, 1993, 1994, 2004, 2005,
-2006, 2007, 2008 Free Software Foundation, Inc.
+2006, 2007, 2008, 2009 Free Software Foundation, Inc.
 
 @quotation
 Permission is granted to copy, distribute and/or modify this document
@@ -6232,13 +6232,33 @@ regexp(`abc', `', `\\def')
 @cindex substrings, extracting
 Substrings are extracted with @code{substr}:
 
-@deffn Builtin substr (@var{string}, @var{from}, @ovar{length})
-Expands to the substring of @var{string}, which starts at index
-@var{from}, and extends for @var{length} characters, or to the end of
-@var{string}, if @var{length} is omitted.  The starting index of a string
-is always 0.  The expansion is empty if there is an error parsing
-@var{from} or @var{length}, if @var{from} is beyond the end of
-@var{string}, or if @var{length} is negative.
+@deffn Builtin substr (@var{string}, @var{from}, @ovar{length}, @
+  @ovar{replace})
+Performs a substring operation on @var{string}.  If @var{from} is
+positive, it represents the 0-based index where the substring begins.
+If @var{length} is omitted, the substring ends at the end of
+@var{string}; if it is positive, @var{length} is added to the starting
+index to determine the ending index.
+
+@cindex @acronym{GNU} extensions
+As a @acronym{GNU} extension, if @var{from} is negative, it is added to
+the length of @var{string} to determine the starting index; if it is
+empty, the start of the string is used.  Likewise, if @var{length} is
+negative, it is added to the length of @var{string} to determine the
+ending index, and an empty @var{length} behaves like an omitted
+@var{length}.  It is not an error if either of the resulting indices lie
+outside the string, but the selected substring only contains the bytes
+of @var{string} that overlap the selected indices.  If the end point
+lies before the beginning point, the substring chosen is the empty
+string located at the starting index.
+
+If @var{replace} is omitted, then the expansion is only the selected
+substring, which may be empty.  As a @acronym{GNU} extension, if
+@var{replace} is provided, then the expansion is the original
+@var{string} with the selected substring replaced by @var{replace}.  The
+expansion is empty and a warning issued if @var{from} or @var{length}
+cannot be parsed, or if @var{replace} is provided but the selected
+indices do not overlap with @var{string}.
 
 The macro @code{substr} is recognized only with parameters.
 @end deffn
@@ -6250,15 +6270,160 @@ substr(`gnus, gnats, and armadillos', `6', `5')
 @result{}gnats
 @end example
 
-Omitting @var{from} evokes a warning, but still produces output.
+Omitting @var{from} evokes a warning, but still produces output.  On the
+other hand, selecting a @var{from} or @var{length} that lies beyond
+@var{string} is not a problem.
 
 @example
 substr(`abc')
 @error{}m4:stdin:1: Warning: substr: too few arguments: 1 < 2
 @result{}abc
-substr(`abc',)
-@error{}m4:stdin:2: Warning: substr: empty string treated as 0
+substr(`abc', `')
 @result{}abc
+substr(`abc', `4')
+@result{}
+substr(`abc', `1', `4')
+@result{}bc
+@end example
+
+Using negative values for @var{from} or @var{length} are @acronym{GNU}
+extensions, useful for accessing a fixed size tail of an
+arbitrary-length string.  Prior to M4 1.6, using these values would
+silently result in the empty string.  Some other implementations crash
+on negative values, and many treat an explicitly empty @var{length} as
+0, which is different from the omitted @var{length} implying the rest of
+the original @var{string}.
+
+@example
+substr(`abcde', `2', `')
+@result{}cde
+substr(`abcde', `-3')
+@result{}cde
+substr(`abcde', `', `-3')
+@result{}ab
+substr(`abcde', `-6')
+@result{}abcde
+substr(`abcde', `-6', `5')
+@result{}abcd
+substr(`abcde', `-7', `1')
+@result{}
+substr(`abcde', `1', `-2')
+@result{}bc
+substr(`abcde', `-4', `-1')
+@result{}bcd
+substr(`abcde', `4', `-3')
+@result{}
+substr(`abcdefghij', `-09', `08')
+@result{}bcdefghi
+@end example
+
+Another useful @acronym{GNU} extension, also added in M4 1.6, is the
+ability to replace a substring within the original @var{string}.  An
+empty length substring at the beginning or end of @var{string} is valid,
+but selecting a substring that does not overlap @var{string} causes a
+warning.
+
+@example
+substr(`abcde', `1', `3', `t')
+@result{}ate
+substr(`abcde', `5', `', `f')
+@result{}abcdef
+substr(`abcde', `-3', `-4', `f')
+@result{}abfcde
+substr(`abcde', `-6', `1', `f')
+@result{}fabcde
+substr(`abcde', `-7', `1', `f')
+@error{}m4:stdin:5: Warning: substr: substring out of range
+@result{}
+substr(`abcde', `6', `', `f')
+@error{}m4:stdin:6: Warning: substr: substring out of range
+@result{}
+@end example
+
+If backwards compatibility to M4 1.4.x behavior is necessary, the
+following macro is sufficient to do the job (mimicking warnings about
+empty @var{from} or @var{length} or an ignored fourth argument is left
+as an exercise to the reader).
+ +...@example +define(`substr', `ifelse(`$#', `0', ``$0'', + eval(`2 < $#')`$3', `1', `', + index(`$2$3', `-'), `-1', `builtin(`$0', `$1', `$2', `$3')')') +...@result{} +substr(`abcde', `3') +...@result{}de +substr(`abcde', `3', `') +...@result{} +substr(`abcde', `-1') +...@result{} +substr(`abcde', `1', `-1') +...@result{} +substr(`abcde', `2', `1', `C') +...@result{}c +...@end example + +On the other hand, it is possible to portably emulate the @acronym{GNU} +extension of negative @var{from} and @var{length} arguments across all +...@code{m4} implementations, albeit with a lot more overhead. This +example uses @code{incr} and @code{decr} to normalize @samp{-08} to +something that a later @code{eval} will treat as a decimal value, rather +than looking like an invalid octal number, while avoiding using these +macros on an empty string. The helper macro @code{_substr_normalize} is +recursive, since it is easier to fix @var{length} after @var{from} has +been normalized, with the final iteration supplying two non-negative +arguments to the original builtin, now named @code{_substr}. 
+
+@comment options: -daq -t_substr
+@example
+$ @kbd{m4 -daq -t _substr}
+define(`_substr', defn(`substr'))dnl
+define(`substr', `ifelse(`$#', `0', ``$0'',
+  `_$0(`$1', _$0_normalize(len(`$1'),
+    ifelse(`$2', `', `0', `incr(decr(`$2'))'),
+    ifelse(`$3', `', `', `incr(decr(`$3'))')))')')dnl
+define(`_substr_normalize', `ifelse(
+  eval(`$2 < 0 && $1 + $2 >= 0'), `1',
+  `$0(`$1', eval(`$1 + $2'), `$3')',
+  eval(`$2 < 0')`$3', `1', ``0', `$1'',
+  eval(`$2 < 0 && $3 - 0 >= 0 && $1 + $2 + $3 - 0 >= 0'), `1',
+  `$0(`$1', `0', eval(`$1 + $2 + $3 - 0'))',
+  eval(`$2 < 0 && $3 - 0 >= 0'), `1', ``0', `0'',
+  eval(`$2 < 0'), `1', `$0(`$1', `0', `$3')',
+  `$3', `', ``$2', `$1'',
+  eval(`$3 - 0 < 0 && $1 - $2 + $3 - 0 >= 0'), `1',
+  ``$2', eval(`$1 - $2 + $3')',
+  eval(`$3 - 0 < 0'), `1', ``$2', `0'',
+  ``$2', `$3'')')dnl
+substr(`abcde', `2', `')
+@error{}m4trace: -1- _substr(`abcde', `2', `5')
+@result{}cde
+substr(`abcde', `-3')
+@error{}m4trace: -1- _substr(`abcde', `2', `5')
+@result{}cde
+substr(`abcde', `', `-3')
+@error{}m4trace: -1- _substr(`abcde', `0', `2')
+@result{}ab
+substr(`abcde', `-6')
+@error{}m4trace: -1- _substr(`abcde', `0', `5')
+@result{}abcde
+substr(`abcde', `-6', `5')
+@error{}m4trace: -1- _substr(`abcde', `0', `4')
+@result{}abcd
+substr(`abcde', `-7', `1')
+@error{}m4trace: -1- _substr(`abcde', `0', `0')
+@result{}
+substr(`abcde', `1', `-2')
+@error{}m4trace: -1- _substr(`abcde', `1', `2')
+@result{}bc
+substr(`abcde', `-4', `-1')
+@error{}m4trace: -1- _substr(`abcde', `1', `3')
+@result{}bcd
+substr(`abcde', `4', `-3')
+@error{}m4trace: -1- _substr(`abcde', `4', `0')
+@result{}
+substr(`abcdefghij', `-09', `08')
+@error{}m4trace: -1- _substr(`abcdefghij', `1', `8')
+@result{}bcdefghi
 @end example
 
 @node Translit
diff --git a/src/builtin.c b/src/builtin.c
index 33ef9e5..6594cb9 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -1,7 +1,7 @@
 /* GNU m4 -- A simple macro processor
 
-   Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 2000, 2004, 2006, 2007,
-   2008 Free Software Foundation, Inc.
+   Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 2000, 2004, 2006,
+   2007, 2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GNU M4.
 
@@ -1861,22 +1861,26 @@ m4_index (struct obstack *obs, int argc, macro_arguments *argv)
   shipout_int (obs, retval);
 }
 
-/*-------------------------------------------------------------------------.
-| The macro "substr" extracts substrings from the first argument, starting |
-| from the index given by the second argument, extending for a length      |
-| given by the third argument.  If the third argument is missing, the      |
-| substring extends to the end of the first argument.                      |
-`-------------------------------------------------------------------------*/
+/*-------------------------------------------------------------------.
+| The macro "substr" extracts substrings from the first argument,    |
+| starting from the index given by the second argument, extending    |
+| for a length given by the third argument.  If the third argument   |
+| is missing or empty, the substring extends to the end of the first |
+| argument.  As an extension, negative arguments are treated as      |
+| indices relative to the string length.  Also, if a fourth argument |
+| is supplied, the original string is output with the selected       |
+| substring replaced by the argument.                                |
+`-------------------------------------------------------------------*/
 
 static void
 m4_substr (struct obstack *obs, int argc, macro_arguments *argv)
 {
   const call_info *me = arg_info (argv);
   int start = 0;
+  int end;
   int length;
-  int avail;
 
-  if (bad_argc (me, argc, 2, 3))
+  if (bad_argc (me, argc, 2, 4))
     {
       /* builtin(`substr') is blank, but substr(`abc') is abc.  */
       if (argc == 2)
@@ -1884,19 +1888,52 @@ m4_substr (struct obstack *obs, int argc, macro_arguments *argv)
       return;
     }
 
-  length = avail = ARG_LEN (1);
-  if (!numeric_arg (me, ARG (2), &start))
+  length = ARG_LEN (1);
+  if (!arg_empty (argv, 2) && !numeric_arg (me, ARG (2), &start))
     return;
+  if (start < 0)
+    start += length;
 
-  if (argc >= 4 && !numeric_arg (me, ARG (3), &length))
-    return;
+  if (arg_empty (argv, 3))
+    end = length;
+  else
+    {
+      if (!numeric_arg (me, ARG (3), &end))
+        return;
+      if (end < 0)
+        end += length;
+      else
+        end += start;
+    }
+
+  if (argc >= 5)
+    {
+      /* Replacement text provided.  */
+      if (end < start)
+        end = start;
+      if (end < 0 || length < start)
+        {
+          m4_warn (0, me, _("substring out of range"));
+          return;
+        }
+      if (start < 0)
+        start = 0;
+      if (length < end)
+        end = length;
+      obstack_grow (obs, ARG (1), start);
+      push_arg (obs, argv, 4);
+      obstack_grow (obs, ARG (1) + end, length - end);
+      return;
+    }
 
-  if (start < 0 || length <= 0 || start >= avail)
+  if (start < 0)
+    start = 0;
+  if (length < end)
+    end = length;
+  if (end <= start)
     return;
 
-  if (start + length > avail)
-    length = avail - start;
-  obstack_grow (obs, ARG (1) + start, length);
+  obstack_grow (obs, ARG (1) + start, end - start);
 }
 
 /*------------------------------------------------------------------.


hooks/post-receive
--
GNU M4 source repository
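
For readers who want to try the new index arithmetic outside of M4, the logic of the patched m4_substr can be approximated in standalone C. This is only a sketch and is not code from the commit. The function name substr_sketch is hypothetical, NULL pointers stand in for empty or omitted arguments, and a NULL return stands in for the case where M4 warns "substring out of range" and expands to nothing. It also assumes index values small enough that the additions cannot overflow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of GNU M4.  Mirrors the start/end
   arithmetic of the patched m4_substr shown in the diff.
   FROM == NULL models an empty second argument.
   LEN == NULL models an empty or omitted third argument.
   REPLACE == NULL models a call without a fourth argument.
   Returns a malloc'd string that the caller must free, or NULL when
   the replacement range does not overlap S (where M4 would warn) or
   when allocation fails.  */
char *
substr_sketch (const char *s, const int *from, const int *len,
               const char *replace)
{
  assert (s != NULL);

  int length = (int) strlen (s);
  int start = from ? *from : 0;
  int end;
  char *out;

  /* A negative start counts back from the end of the string.  */
  if (start < 0)
    start += length;

  /* A negative length is an end index relative to the string end;
     a non-negative length is added to the (unclamped) start.  */
  if (!len)
    end = length;
  else if (*len < 0)
    end = *len + length;
  else
    end = *len + start;

  if (replace)
    {
      size_t rlen = strlen (replace);

      if (end < start)
        end = start;            /* empty selection at the start index */
      if (end < 0 || length < start)
        return NULL;            /* M4 warns: substring out of range */
      if (start < 0)
        start = 0;
      if (length < end)
        end = length;

      out = malloc ((size_t) start + rlen + (size_t) (length - end) + 1);
      if (!out)
        return NULL;
      memcpy (out, s, (size_t) start);
      memcpy (out + start, replace, rlen);
      memcpy (out + start + rlen, s + end, (size_t) (length - end));
      out[(size_t) start + rlen + (size_t) (length - end)] = '\0';
      return out;
    }

  /* Plain extraction: clamp both ends to the string.  */
  if (start < 0)
    start = 0;
  if (length < end)
    end = length;
  if (end < start)
    end = start;                /* nothing selected, result is empty */

  out = malloc ((size_t) (end - start) + 1);
  if (!out)
    return NULL;
  memcpy (out, s + start, (size_t) (end - start));
  out[end - start] = '\0';
  return out;
}
```

Passing pointers to -3 and NULL for the length corresponds to substr(`abcde', `-3') in the manual text of the patch. Keeping start unclamped until after end has been computed is what makes the manual's substr(`abcde', `-6', `5') case select four bytes rather than five.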
