[PATCH] 4.0..devel: fix a problem that unset 'a[`echo 0`]' causes "bad array subscript" error
[ I initially intended to submit this report after the previous report at https://lists.gnu.org/archive/html/bug-bash/2021-10/msg00051.html has been settled. I decided to submit this patch now because a problem related to this report is mentioned in the above thread. Note that this patch bases on the code after applying the patches provided at the previous report. ] Bash Version: All the versions from 4.0 to devel are affected. The problem should be reproduced independent of the machine type. Description: « unset -v 'a[`echo 0`]' » fails because the first character « ` » of the subscript « `echo 0` » is skipped when checking the ending of the array subscript. This is caused by the mismatching of the assumptions on the arguments of `unbind_array_reference (var,sub,flags)' and `skipsubscript (string,start,flags)'. The former assumes `sub' starts from the next character after `[` while the latter assumes that `string[start]' starts from `[` itself. However, `unbind_array_reference' directly passes `sub' to `skipsubscript'. Repeat-By: $ bash -c "a=(); unset 'a[\`echo [0]\`]'" bash: line 0: unset: a[`echo [0]`]: bad array subscript This is caused by the mismatching of the assumption on the argument of `unbind_array_reference' (arrayfunc.c:1062) and the argument of `skipsubscript' (subst.c:1845). The function `unbind_array_reference' assumes that the second argument SUB points to just after the beginning `[' as described in the comment of the function (arrayfunc.c:1054): > /* This function is called with SUB pointing to just after the beginning >`[' of an array subscript and removes the array element to which SUB >expands from array VAR. A subscript of `*' or `@' unsets the array. */ > [...] > int > unbind_array_element (var, sub, flags) In this function, another function `skipsubscript' is called as > len = skipsubscript (sub, 0, (flags&VA_NOEXPAND) | 2); /* XXX */ The function `skipsubscript' calls another function `skip_matched_pair' (subst.c:1747). > int > skipsubscript (string, start, flags) > const char *string; > int start, flags; > { > return (skip_matched_pair (string, start, '[', ']', flags)); The function `skip_matched_pair' assumes that `string[start]' points to the beginning `[' itself: > /* This function assumes s[i] == open; returns with s[ret] == close; used to >parse array subscripts. FLAGS & 1 means to not attempt to skip over >matched pairs of quotes or backquotes, or skip word expansions; it is >intended to be used after expansion has been performed and during final >assignment parsing (see arrayfunc.c:assign_compound_array_list()) or >during execution by a builtin which has already undergone word expansion. */ > static int > skip_matched_pair (string, start, open, close, flags) which is inconsistent with the assumption of `unbind_array_reference' that `sub' points to the next character of the beginning `['. Fix: In the attached patch, I added a new flag 2 in the `flags` argument of `skipsubscript', which indicates that `string[start]' points to the next character after the beginning delimiter `open' (`[' in this case). Now, the bit `2' of the argument `flags' of `skipsubscript (var,sub,flags)' will change the behavior. Also, `array_variable_name (s,flags,subp,lenp)' and `array_variable_part (s,flags,subp,lenp)' indirectly calls `skipsubscript' by directly passing its argument `flags'. I have checked all the existing calls of these functions and confirmed that currently the arguments are always `0' or `1' and never contains the bit `2', so the existing behavior will not change. Then, I added the bit `2' in the call of `skipsubscript' from `unbind_array_reference'. I also changed the argument `flags' of the call of `skipsubscript' in `array_variable_name' with `flags & 1' because `array_variable_name' explicitly searches for the beginning `[' and pass its position to `skipsubscript'. I also updated the comments of the functions. If you think we should define constants for these flags such as `#define SKIPSUBSCRIPT_NONESTING 1' and `#define SKIPSUBSCRIPT_STARTAFTEROPEN 2', I can adjust the patch. -- Koichi 0001-arrayfunc.c-unset_array_element-fix-a-bug-that-the-f.patch Description: Binary data
Re: [PATCH] devel: fix segfault by unset 'assoc[${x[0]}]'
> The difference is that valid_array_reference can be called before > any of the subscript is expanded, in which case you need to parse > things that can be expanded, where unbind_array_element is called > after all the expansions are performed (but see below). > > So let's see if we can talk through this. > > [...] > > You're right, there should be no `nesting' considered at all. By the time > unbind_array_element is called, since it's only called from unset_builtin, > the word and subscript should have already undergone all the expansion > they're going to. There should be no need to interpret ${} or $() in the > subscript: since associative arrays can have arbitrary subscripts, you > should not attempt to parse the subscript contents. Yeah, I think the above paragraph describes the expected behavior when `assoc_expand_once' is turned on. But in this patch, I actually aim to fix the behavior of the backward-compatible mode (with `assoc_expand_once' turned off). In the patch, I suggested to remove `(var && assoc_p(var))' from the skipsubscript flag for the nesting consideration as > -len = skipsubscript (sub, 0, (flags&VA_NOEXPAND) || (var && > assoc_p(var))); /* XXX */ > +len = skipsubscript (sub, 0, flags&VA_NOEXPAND); /* XXX */ Here, `(flags & VA_NOEXPAND)' is always `1' when `assoc_expand_once' is turned on, so the above change does not affect the behavior of `assoc_expand_once' mode but affect the behavior of the backward-compatible mode. > However, there is backwards compatibility to consider, which is why > assoc_expand_once isn't set by default and the code attempts to run the > subscript through word expansion. Yeah, that's the issue. > In this example, the quoting prevents the shell from recognizing the > word as an array reference before the quote removal occurs and the > word gets passed to the unset builtin, so it can't set any hints for > unset. unset sees `a[${b[0]}]', which is ambiguous. I thought the extra expansions are always performed in the backward-compatible behavior and never performed in the `assoc_expand_once' mode, so there is no ambiguity once the current mode is given. In the backward-compatible mode (i.e., in older Bash and in newer Bash with `assoc_expand_once' turned off), the subscripts of « unset -v 'a[...]' » have been always subject to the extra expansions. If we accept this extra expansions as a design, it is actually well-defined and unambiguous, and it always works as expected if one always quotes the arguments of `unset' (and other builtins such as `printf -v', `read', `mapfile', etc.) as « unset -v 'a[$key]' ». Actually, this is the only way to make it work in all the Bash versions (with `assoc_expand_once' turned off for newer versions). > It can shortcut and say "ok, if it passes valid_array_reference, we should > just consider the entire argument as one word as long as the final > character is `]'." This is again where backwards compatibility and > assoc_expand_once matter. > > We can apply your change, but it is still incomplete What is exactly the incompleteness that you focus on in this context? I understand that you are not satisfied with the behavior of the backward-compatible mode, but once we define the design of the extra expansions in the backward-compatible mode, I think this patch will make it consistent and there is no incompleteness ``within'' the backward-compatible mode. > (plus it breaks things that currently work, like > > declare -A a > key='$(echo foo)' > > a['$key']=2 > declare -p a > > unset -v "a['\$key']" > declare -p a > > ). This is related to another bug (which is rather a clear one) that has existed since bash-4.0, for which I have a pending report. The above problem is not caused by this patch, but just another bug that has been concealed by the current behavior has been revealed. I was planning to submit the report after this patch is processed because the codes to be changed in two patches overlap with each other. Now I'll submit the report though there are conflicting changes between the two sets of patches. > The real issue is that it's still going to expand the subscript for > backwards compatibility. I think that's going to have to be solved > at a higher level. Yes, but I feel like this is another design issue which is irrelevant for the fix of the present small problem. -- Koichi
Incorrect LINENO with exported nested functions with loops
Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu' -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE -DRECYCLES_PIDS -DDEFAULT_PATH_VALUE='/usr/local/bin:/usr/bin' -DSYSLOG_HISTORY -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wno-parentheses -Wno-format-security uname output: Linux gadi-login-06.gadi.nci.org.au 4.18.0-305.19.1.el8.nci.x86_64 #1 SMP Mon Sep 27 03:06:23 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux Machine Type: x86_64-redhat-linux-gnu Bash Version: 4.4 Patch Level: 20 Release Status: release Description: If you have a function, which exports another function, and that exported function contains a generic for loop, the LINENO variable is incorrect for loops in future functions which contain another loop which uses ((..)) notation. Whilst the above system information says bash 4.4.20 20, I have seen it in 3.2.51 and 4.2.46 too. Repeat-By: Below is a sample script to replicate the bug in LINENO. The LINENO variable for the 'i' loop which is printed out by the PS4 is incorrect when that DO_BACKUP function is exported, and the second 'j' loop is uncommented. If you instead uncomment the first 'j' loop, the LINENO variables are correct. If DO_BACKUP does not have a loop inside it, the LINENO variables are all correct. #!/bin/bash export PS4='+L$LINENO + ' setup() { DO_BACKUP() { for d in 1 2 3; do break done } export -f DO_BACKUP } run() { for i in 1; do #for j in 1; do # Uncomment this 'j' loop instead of the next and LINENO is correct #for ((j=1; j<=2; j++)); do # Uncomment this 'j' loop and LINENO is incorrect true done done } set -x setup run Example output with first 'j' loop uncommented: $ bash run.sh +L20 + setup +L9 + export -f DO_BACKUP +L21 + run +L12 + for i in 1 +L13 + for j in 1 +L15 + true Example output with the second 'j' loop uncommented: $ bash run.sh +L20 + setup +L9 + export -f DO_BACKUP +L21 + run +L5 + for i in 1 +L14 + (( j=1 )) +L14 + (( j<=1 )) +L15 + true +L14 + (( j++ )) +L14 + (( j<=1 )) Regards, Tom Coleman
Re: Using systemd-249's libnss_systemd.so.2 triggers a crash in bash-5.1's malloc.c
On 10/5/21 1:50 PM, Dominique Martinet wrote: > If I change malloc_usable_size to return p->mh_nbytes instead of > maxbytes, then the crash disappears.[2] > > I did not read the full bash malloc code but I suspect the buffer really > could be grown, but we would need to fix p->mh_nbytes to maxbytes and > also adjust the end block to pass sanity checks on free -- e.g. it > should be considered as a lightweight inplace realloc. > > I'm not sure we care enough to be honest and returning what is really > usable feels like the simplest solution, what do you think? > Thanks for your work tracking this down. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: Using systemd-249's libnss_systemd.so.2 triggers a crash in bash-5.1's malloc.c
On 10/5/21 1:50 PM, Dominique Martinet wrote: > Turns out I'm lucky enough on address consistency.. > > So, > - since we have a nice before/after with systemd, I took a moment to > bisect it. > It comes down to this commit[1] which is basically using > malloc_usable_size() to use buffers beyond what it initially requested > > [1] > https://github.com/systemd/systemd/commit/319a4f4bc46b230fc660321e99aaac1bc449deea > > - through gdb/printf I see that the failing pointer has been allocated > with (what comes down to) malloc(64), but malloc_usable_size returns > 108, so systemd happily writes beyond the 64 bytes it originally > requested -- which bash allocator treats as an overflow. Sure. The bash malloc_usable_size basically returns the size of the chunk: what you could realloc it to without having to go back for more memory (that's how I interpreted `usable bytes', but that's clearly not how it's intended). I suppose without the bounds checking that would have been fine. > If I change malloc_usable_size to return p->mh_nbytes instead of > maxbytes, then the crash disappears.[2] That's the right fix. > > I did not read the full bash malloc code but I suspect the buffer really > could be grown, but we would need to fix p->mh_nbytes to maxbytes and > also adjust the end block to pass sanity checks on free -- e.g. it > should be considered as a lightweight inplace realloc. If you want realloc, use realloc. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [PATCH] devel: fix segfault by unset 'assoc[${x[0]}]'
On 10/5/21 7:05 AM, Koichi Murase wrote: > The segmentation fault is fixed by the above patch, but there > still remains the same error as bash 4.4. > > bash-patch1: ${x[0: bad substitution > > This is caused by an inconsistency between `valid_array_reference > (name,flags)' (arrayfunc.c:1187) and `unbind_array_element > (var,sub,flags)' (arrayfunc.c:1033) in the extraction of > associative-array subscripts. Note that `valid_array_reference' is > called from `unset_builtin' (builtins/set.def:834) to check if the > unset name has the form of an array element. Also, > `unbind_array_element' is called from `unset_builtin' to perform the > actual unset. In `valid_array_reference', the length of the > associative-array subscripts are determined as > > else if (isassoc) > len = skipsubscript (t, 0, flags&VA_NOEXPAND); /* VA_NOEXPAND > must be 1 */ The difference is that valid_array_reference can be called before any of the subscript is expanded, in which case you need to parse things that can be expanded, where unbind_array_element is called after all the expansions are performed (but see below). So let's see if we can talk through this. > > whereas in `unbind_array_element', the length is determined as > > if (var && assoc_p (var) && (flags&VA_ONEWORD)) > len = strlen (sub) - 1; > else > len = skipsubscript (sub, 0, (flags&VA_NOEXPAND) || (var && > assoc_p(var))); /* XXX */ > > `skipsubscript' does not consider the nesting of ${} and $() when > bit 1 is set to the third argument. In the former code, nesting is > not considered only when VA_NOEXPAND is specified. However, in the > latter code, nesting is never considered for associative arrays > (even when VA_NOEXPAND is not specified). I believe the former code > should be the expected one. You're right, there should be no `nesting' considered at all. By the time unbind_array_element is called, since it's only called from unset_builtin, the word and subscript should have already undergone all the expansion they're going to. There should be no need to interpret ${} or $() in the subscript: since associative arrays can have arbitrary subscripts, you should not attempt to parse the subscript contents. That was obviously one of the problems with the code through bash-5.1, one of the original reasons I introduced `assoc_expand_once', and an issue that's still looking for a final resolution. However, there is backwards compatibility to consider, which is why assoc_expand_once isn't set by default and the code attempts to run the subscript through word expansion. In this example, the quoting prevents the shell from recognizing the word as an array reference before the quote removal occurs and the word gets passed to the unset builtin, so it can't set any hints for unset. unset sees `a[${b[0]}]', which is ambiguous. It can shortcut and say "ok, if it passes valid_array_reference, we should just consider the entire argument as one word as long as the final character is `]'." This is again where backwards compatibility and assoc_expand_once matter. We can apply your change, but it is still incomplete (plus it breaks things that currently work, like declare -A a key='$(echo foo)' a['$key']=2 declare -p a unset -v "a['\$key']" declare -p a ). The real issue is that it's still going to expand the subscript for backwards compatibility. I think that's going to have to be solved at a higher level. Maybe the unset builtin code should just punt as described above -- depending on the shell compatibility level -- and turn on VA_ONEWORD|VA_NOEXPAND if there is an unquoted `[', the last character is a `]', and valid_array_reference() returns true. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: Using systemd-249's libnss_systemd.so.2 triggers a crash in bash-5.1's malloc.c
Chet Ramey wrote on Mon, Oct 04, 2021 at 10:11:10PM -0400: > > I'm running busybox sh in a unit (which starts properly), then > > interactively test things from there. > > > > Running in gdb does fail the same way as running normally, so I've also > > been looking at that a bit, but nothing obvious poped up. > > I'd like to trace back which allocation corresponds to the failing one, > > and break from there next time. > > Without the allocation tracing, which you don't get from allocations that > aren't directly from bash, it's tedious, but doable. It only works if the > memory layout is consistent enough that you get the same address failing > every time (that is, the address being passed to the failing free() is the > same). > > If you do, you write it down, put a breakpoint in internal_malloc at the > point where it returns the memory to its caller, and just run, breaking > at that point, until malloc returns that address. Then run a backtrace and > see where you are. Turns out I'm lucky enough on address consistency.. So, - since we have a nice before/after with systemd, I took a moment to bisect it. It comes down to this commit[1] which is basically using malloc_usable_size() to use buffers beyond what it initially requested [1] https://github.com/systemd/systemd/commit/319a4f4bc46b230fc660321e99aaac1bc449deea - through gdb/printf I see that the failing pointer has been allocated with (what comes down to) malloc(64), but malloc_usable_size returns 108, so systemd happily writes beyond the 64 bytes it originally requested -- which bash allocator treats as an overflow. copying bash's malloc_usable_size here for convenience: malloc_usable_size (mem) void *mem; { register union mhead *p; register char *ap; register int maxbytes; if ((ap = (char *)mem) == 0) return 0; /* Find the true start of the memory block to discover which bin */ p = (union mhead *) ap - 1; if (p->mh_alloc == ISMEMALIGN) { ap -= p->mh_nbytes; p = (union mhead *) ap - 1; } /* XXX - should we return 0 if ISFREE? */ maxbytes = binsize(p->mh_index); /* So the usable size is the maximum number of bytes in the bin less the malloc overhead */ maxbytes -= MOVERHEAD + MSLOP; return (maxbytes); } If I change malloc_usable_size to return p->mh_nbytes instead of maxbytes, then the crash disappears.[2] I did not read the full bash malloc code but I suspect the buffer really could be grown, but we would need to fix p->mh_nbytes to maxbytes and also adjust the end block to pass sanity checks on free -- e.g. it should be considered as a lightweight inplace realloc. I'm not sure we care enough to be honest and returning what is really usable feels like the simplest solution, what do you think? [2] return p->mh_nbytes -- diff --git a/lib/malloc/malloc.c b/lib/malloc/malloc.c index 439f8ef11af2..ad27d793b7f4 100644 --- a/lib/malloc/malloc.c +++ b/lib/malloc/malloc.c @@ -1272,8 +1272,6 @@ malloc_usable_size (mem) { register union mhead *p; register char *ap; - register int maxbytes; - if ((ap = (char *)mem) == 0) return 0; @@ -1286,13 +1284,7 @@ malloc_usable_size (mem) p = (union mhead *) ap - 1; } - /* XXX - should we return 0 if ISFREE? */ - maxbytes = binsize(p->mh_index); - - /* So the usable size is the maximum number of bytes in the bin less the - malloc overhead */ - maxbytes -= MOVERHEAD + MSLOP; - return (maxbytes); + return p->mh_nbytes; } #if !defined (NO_VALLOC) -- Thanks, -- Dominique
Re: [PATCH] devel: fix segfault by unset 'assoc[${x[0]}]'
> In Bash 4.0--5.3, the unset command causes the following error: > > bash-4.0: [${x[0]}]: bad array subscript Sorry, please ignore these two lines. This part is unrelated to the fix of the patches. The above error message was produced for the empty associative array key, which is the expected behavior. I just forgot to remove these lines from the email draft before sending the mail. -- Koichi
[PATCH] devel: fix segfault by unset 'assoc[${x[0]}]'
Bash Version: devel branch (441078402919f6f0dd677cad18d55c7a89d294fc), 5.1.8(2)-maint (x86_64-pc-linux-gnu) Description: In the devel branch, « unset 'assoc[${x[0]}]' » causes a segmentation fault, where `assoc' is the name of an associative array. This does not happen with Bash 5.1. In Bash 4.4--5.1, the same unset command causes the following bad-substitution error: bash-4.4: ${x[0: bad substitution In Bash 4.0--5.3, the unset command causes the following error: bash-4.0: [${x[0]}]: bad array subscript Repeat-by: The following command causes a segmentation fault. $ bash-dev -c "declare -A a; unset 'a[\${b[0]}]'" The stack trace reads #0 0x77de4e35 in raise () from /lib64/libc.so.6 #1 0x77dcf895 in abort () from /lib64/libc.so.6 #2 0x00452fbf in programming_error () #3 0x004fd4d7 in internal_free.isra () #4 0x00474c10 in expand_word_internal.isra () #5 0x004774e0 in expand_subscript_string () #6 0x0048890a in unbind_array_element () #7 0x004b0ed5 in unset_builtin () #8 0x0043ad4b in execute_builtin.isra () #9 0x0043f69f in execute_command_internal () #10 0x00442ac3 in execute_connection () #11 0x0043dfd9 in execute_command_internal () #12 0x004a7191 in parse_and_execute () #13 0x00423f8b in run_one_command () #14 0x00422c0a in main () Fix: The segmentation fault is caused in `expand_word_internal (WORD_DESC *word, ...)' (subst.c:10325) which releases the memory block `word->word' when it fails to expand the word. The problem is that `expand_subscript_string (char *string, ...)' (subst.c:10184) tries to directly pass the pointer `string' to `word->word' for the call of `expand_word_internal (WORD_DESC *word, ...)'. `word->word' needs to be a pointer which may be released on the expansion error. * In the first patch `0001-patch', the argument string is copied using `savestring ()', and the copy is passed to `expand_word_internal' through `td.word'. Finally, the copy is deleted by `free (td.word)'. When an expansion error occurs, `NULL' is assigned to `fd.word' by `expand_word_internal', so `free (td.word)'---i.e., `free (NULL)'---does nothing. The segmentation fault is fixed by the above patch, but there still remains the same error as bash 4.4. bash-patch1: ${x[0: bad substitution This is caused by an inconsistency between `valid_array_reference (name,flags)' (arrayfunc.c:1187) and `unbind_array_element (var,sub,flags)' (arrayfunc.c:1033) in the extraction of associative-array subscripts. Note that `valid_array_reference' is called from `unset_builtin' (builtins/set.def:834) to check if the unset name has the form of an array element. Also, `unbind_array_element' is called from `unset_builtin' to perform the actual unset. In `valid_array_reference', the length of the associative-array subscripts are determined as else if (isassoc) len = skipsubscript (t, 0, flags&VA_NOEXPAND); /* VA_NOEXPAND must be 1 */ whereas in `unbind_array_element', the length is determined as if (var && assoc_p (var) && (flags&VA_ONEWORD)) len = strlen (sub) - 1; else len = skipsubscript (sub, 0, (flags&VA_NOEXPAND) || (var && assoc_p(var))); /* XXX */ `skipsubscript' does not consider the nesting of ${} and $() when bit 1 is set to the third argument. In the former code, nesting is not considered only when VA_NOEXPAND is specified. However, in the latter code, nesting is never considered for associative arrays (even when VA_NOEXPAND is not specified). I believe the former code should be the expected one. * In the second patch `0002-patch', the subscript extraction in `unbind_array_element' is adjusted to match with that of `valid_array_element'. -- Koichi 0001-fix-unset-segfault-on-assoc-subscripts-with-paramexp.patch Description: Binary data 0002-allow-nesting-and-quoting-in-assoc-subscripts-when-a.patch Description: Binary data
bug-bash@gnu.org
I have questions on the new feature ${var/pat/&} in the devel branch. > commit f188aa6a013e89d421e39354086eed513652b492 (upstream/devel) > Author: Chet Ramey > Date: Mon Oct 4 15:30:21 2021 -0400 > > enable support for using `&' in the pattern substitution replacement > string > > Any unquoted instances of & in STRING are replaced with the matching > portion of PATTERN. Backslash is used to quote & in STRING; the > backslash is removed in order to permit a literal & in the > replacement string. Users should take care if STRING is > double-quoted to avoid unwanted interactions between the backslash > and double-quoting. Pattern substitution performs the check for & > after expanding STRING; shell programmers should quote backslashes > intended to escape the & and inhibit replacement so they survive any > quote removal performed by the expansion of STRING. I would very much like this change introduced in the latest commit f188aa6a in devel as it would enable many more string manipulations with a simple construct, but I feel the current treatment of quoting has problems: 1. There is no way to specify an arbitrary string in replacement in a way that is compatible with both bash 5.1 and 5.2. 2. There is no way to insert a backslash before the matched part (which I'd think would be one of the typical usages of &). I below describe the details of each, followed by my suggestion or discussion on an alternative design. -- 1. How to specify an arbitrary string in replacement copatibly with both bash 5.1 and 5.2? Currently any & in the replacement is replaced by the matched part regardless of whether & is quoted in the parameter-expansion context or not. Even the result of the parameter expansions and other substitutions are subject to the special treatment of &, which makes it non-trivial to specify an arbitrary string to the replacement ${var/pat/rep}. $ str='X&Y&Z' pat='Y' rep='A&B' $ echo ${str/$pat/} X&A&B&Z where is some string that represents the literal "$rep" (i.e., 'A&B'). A naive quoting of "$rep" does not work: $ echo "1:${str/$pat/"$rep"}" 1:X&AYB&Z I would have expected it to work because $pat will lose special meaning and be treated literally when it is quoted as "$pat". For example, the glob patterns *?[ etc. and anchors # and % in $pat will lose its special meaning when it is quoted: $ v='A' p='?'; echo "${v/$p/B}"; echo "${v/"$p"/B}" B A $ v='A' p='#'; echo "${v/$p/B}"; echo "${v/"$p"/B}" BA A $ v='A' p='%'; echo "${v/$p/B}"; echo "${v/"$p"/B}" AB A Of course, if $rep is not quoted, & in $rep is replaced by the matched part. $ echo "2:${str/$pat/$rep}" 2:X&AYB&Z * To properly specify an arbitrary string in the replacement, one needs to replace all the characters. $ echo "${str/$pat/${rep//&/&}}" * When the replacement is not stored in a variable, one needs to create a variable for the replacement, i.e., $ echo "${str/$pat/$(something)}" in Bash 5.1 needs to be converted to $ tmp=$(something) $ echo "${str/$pat/${tmp//&/&}}" in Bash 5.2. * Also, there is no way of writing it so that it works in both Bash 5.1 and 5.2. To make it work, one needs to switch the code depending on the bash version as: if ((BASH_VERSINFO[0]*1+BASH_VERSINFO[1]*100>=50200)); then echo "${str/$pat/${rep//&/&}}" else echo "${str/$pat/$rep}" fi [ Note: this does not work for the devel branch because the devel branch still has the version 5.1. ] -- 2. How to insert a literal backslash before the matched part? Another problem is that one cannot put a literal backslash just before & without affecting the meaning of &. Currently if there is any backslash before &, & will lose the special meaning and the two characters '\&' become '&' after the replacement. One of typical usages of & in the replacement would be string escaping, i.e., quoting special characters in a string so that they do not have special meaning and are treated literally. For example, let us consider escaping a string as a glob pattern as in the following case: $ value='a*b*c' globchars='\*?[(' $ escaped=${value//["$globchars"]/} $ echo "$escaped" a\*b\*c where "" is some string that represents a literal '\' plus &. I naively expect « = '\'& » would work, which doesn't work actually: $ echo "${value//["$globchars"]/'\'&}" 1:a&b&c All the other attempts fail: $ echo "2a:${value//["$globchars"]/&}" 2a:a*b*c $ echo "2b:${value//["$globchars"]/\&}" 2b:a*b*c $ echo "2c:${value//["$globchars"]/\\&}" 2c:a&b&c $ echo "2d:${value//["$globchars"]/\\\&}" 2d:a&b&c $ echo "2e:${value//["$globchars"]/&}" 2e:a\&b\&c $ echo "2f:${value//["$globchars"]/\&}" 2f:a\&b\&c $ echo "2g:${value//["$globchars"]/\\&}" 2g:a\\&b\\&c $ backslash='\'