A year ago, dash had this fix:

commit 1f1e555aba99808a82cb5090b5ef980714dea09c
Author: Herbert Xu <herb...@gondor.apana.org.au>
Date:   Wed May 1 17:12:27 2024 +0800

    expand: Fix naked backslah leakage

    Naked backslashes in patterns may incorrectly unquote subsequent
    wild characters that are themselves quoted.  Fix this by adding
    an extra backslash when necessary.

    Test case:

            a="\\*bc"; b="\\"; c="*"; echo "<${a##$b"$c"}>"

    Old result:

            <>

    New result:

            <bc>

I started creating a testcase for it, covering a few more possibilities.

... and discovered that bash itself is probably buggy.
A naked (not double-quoted) $b in the pattern part misbehaves:
$b*   matches a literal '*' character
$b"*" matches ... I could not work out what it matches:
it matches neither '*' nor '\*', so I cannot imagine what glob() pattern
is produced internally when ${var#$b"*"} is evaluated...

If it's not a bug, can you specify the rules for how it works?
(so that I can have meaningful comments in dash and busybox ash/hush)
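
For contrast, here is a minimal sketch of the forms whose meaning I do not
think is in dispute: the backslash is either inside double quotes or written
out explicitly, so nothing "naked" reaches the pattern. The variable names
r1..r3 are only illustrative and are not part of the testcase below.

```shell
a='\*bc'
b='\'
c='*'

# Quoted "$b" is a literal backslash; the unquoted * is a wildcard,
# so the longest match removes the whole string.
r1=${a##"$b"*}

# Both expansions are quoted, so both are literal: only \* is removed.
r2=${a##"$b""$c"}

# Explicit escaping: \\ is a literal backslash, \* is a literal star.
r3=${a##\\\*}

printf '|%s|\n' "$r1" "$r2" "$r3"
```

Any rule for the unquoted $b case should presumably leave these three
results unchanged.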

The full testcase script is below, with parts where bash is not working
as I expect commented accordingly:

a='\*bc'
b='\'
c='*'
echo "a is '$a'"
echo "b is '$b'"
echo "c is '$c'"
echo '${a##?*} removes everything:    '"|${a##?*}|"
echo '${a##?"*"} removes \*:          '"|${a##?"*"}|"' - matches one char, then *'
echo '${a##\*} removes nothing:       '"|${a##\*}|"' - first char is not *'
echo '${a##\\*} removes everything:   '"|${a##\\*}|"' - matches \, then all'
echo '${a##\\\*} removes \*:          '"|${a##\\\*}|"' - matches \, then *'
echo '${a##?$c} removes everything:   '"|${a##?$c}|"' - matches one char, then all'
echo '${a##?"$c"} removes \*:         '"|${a##?"$c"}|"' - matches one char, then *'
echo '${a##\\$c} removes everything:  '"|${a##\\$c}|"' - matches \, then all'
echo '${a##\\"$c"} removes \*:        '"|${a##\\"$c"}|"' - matches \, then *'
echo '${a##$b} removes \:             '"|${a##$b}|"' - matches \'
echo '${a##"$b"} removes \:           '"|${a##"$b"}|"' - matches \'
echo
# This isn't working in bash as expected
echo '${a##$b?} removes \*:           '"|${a##$b?}|"' - matches \, then one char'   # bash prints |\*bc|
echo '${a##$b*} removes everything:   '"|${a##$b*}|"' - matches \, then all'        # bash prints |\*bc|
echo '${a##$b$c} removes everything:  '"|${a##$b$c}|"' - matches \, then all'       # bash prints |\*bc|
echo '${a##$b"$c"} removes \*:        '"|${a##$b"$c"}|"' - matches \, then *'       # bash prints |\*bc|
# the cause seems to be that $b emits backslash that "glues" onto next character if there is one:
# a='\*bc'; b='\'; c='*'; echo "|${a##?$b*}|"  # bash prints |bc| - the $b* works as \* (matches literal *)
# a='\*bc'; b='\'; c='*'; echo "|${a##\\$b*}|" # bash prints |bc|
# a='*bc'; b='\'; c='*'; echo "|${a##$b*}|"    # bash prints |bc|
echo
echo '${a##"$b"?} removes \*:         '"|${a##"$b"?}|"' - matches \, then one char'
echo '${a##"$b"*} removes everything: '"|${a##"$b"*}|"' - matches \, then all'
echo '${a##"$b""?"} removes nothing:  '"|${a##"$b""?"}|"' - second char is not ?'  # bash prints |bc|
echo '${a##"$b""*"} removes \*:       '"|${a##"$b""*"}|"' - matches \, then *'
echo '${a##"$b"\*} removes \*:        '"|${a##"$b"\*}|"' - matches \, then *'
echo '${a##"$b"$c} removes everything:'"|${a##"$b"$c}|"' - matches \, then all'
echo '${a##"$b""$c"} removes \*:      '"|${a##"$b""$c"}|"' - matches \, then *'
echo '${a##"$b?"} removes nothing:    '"|${a##"$b?"}|"' - second char is not ?'  # bash prints |bc|
echo '${a##"$b*"} removes \*:         '"|${a##"$b*"}|"' - matches \, then *'     # bash prints ||
echo '${a##"$b$c"} removes \*:        '"|${a##"$b$c"}|"' - matches \, then *'

