Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt 
-fexceptions         -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security   
      -fstack-clash-protection -fcf-protection         -fno-omit-frame-pointer 
-mno-omit-leaf-frame-pointer -flto=auto 
-DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin' 
-DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc' 
-DSYS_BASH_LOGOUT='/etc/bash.bash_logout' -DNON_INTERACTIVE_LOGIN_SHELLS 
-std=gnu17
uname output: Linux vbox-virtualbox 6.12.28-1-MANJARO #1 SMP PREEMPT_DYNAMIC 
Fri, 09 May 2025 10:53:27 +0000 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.2
Patch Level: 37
Release Status: release

Bash options:   autocd                  off
                assoc_expand_once       off
                cdable_vars             off
                cdspell                 off
                checkhash               off
                checkjobs               off
                checkwinsize            on
                cmdhist                 on
                compat31                off
                compat32                off
                compat40                off
                compat41                off
                compat42                off
                compat43                off
                compat44                off
                complete_fullquote      on
                direxpand               off
                dirspell                off
                dotglob                 off
                execfail                off
                expand_aliases          on
                extdebug                off
                extglob                 on
                extquote                on
                failglob                off
                force_fignore           on
                globasciiranges         on
                globskipdots            on
                globstar                off
                gnu_errfmt              off
                histappend              on
                histreedit              off
                histverify              off
                hostcomplete            off
                huponexit               off
                inherit_errexit         off
                interactive_comments    on
                lastpipe                off
                lithist                 off
                localvar_inherit        off
                localvar_unset          off
                login_shell             off
                mailwarn                off
                no_empty_cmd_completion   off
                nocaseglob              off
                nocasematch             off
                noexpand_translation    off
                nullglob                off
                patsub_replacement      on
                progcomp                on
                progcomp_alias          off
                promptvars              on
                restricted_shell        off
                shift_verbose           off
                sourcepath              on
                varredir_close          off
                xpg_echo                off

                (I tried turning off extglob and extquote to see whether anything 
changed, but it made no difference.)
Description:
      Bash's '=~' operator (POSIX extended regular expressions) seems to behave 
very differently from the way grep's -E flag deals with regular expressions.
        I repeatedly failed to get results similar to what I get from grep, 
using just the patterns [a-z] and [a-z]+. I expected multiple results in 
$BASH_REMATCH, but it only picks up a single match (1 character for [a-z]), 
while grep -E picks up all the characters (which is odd, since the pattern 
[a-z]+$ gives identical results in both).
So I was wondering whether this is a bug, or whether it is intended and I'm just 
misinterpreting how bash handles regular expressions. I tried reading the bash 
manual on the '=~' operator,
-> https://www.gnu.org/software/bash/manual/bash.html#index-_005b_005b,
but as far as I can tell (and to the extent of my knowledge of how regular 
expressions work), this seems like unintended behavior.
Repeat-By:
      grep:
            `$ echo test-test | POSIXLY_CORRECT=1 grep -E [a-z]`
            `^test^-^test^`

            `$ echo test-tesst | POSIXLY_CORRECT=1 grep -E [a-z]+`
            `^test^-^tesst^`
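
For reference, a minimal sketch that prints each grep match on its own line
instead of relying on highlighting. It assumes a grep that supports the -o
(only-matching) option, as GNU grep does; the variable name `out` is only
illustrative.

```bash
#!/usr/bin/env bash
# Sketch: list every non-overlapping match of [a-z]+ separately.
# Assumption: grep supports -o (print only the matched parts of a line).
# The pattern is quoted so the shell cannot expand it as a glob.
out=$(printf '%s\n' 'test-tesst' | grep -oE '[a-z]+')
printf '%s\n' "$out"
```

Each match is printed on its own line, which makes it easier to compare
against the contents of BASH_REMATCH.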

        bash's '=~' and $BASH_REMATCH:
            ```
            $ if [[ test-test =~ [a-z] ]]; then
                for i in "${!BASH_REMATCH[@]}"; do
                    echo "$i: ${BASH_REMATCH[$i]}";
                done
            fi
            ```
            `0: t`
            ```
            $ if [[ test-tesst =~ [a-z]+ ]]; then
                for i in "${!BASH_REMATCH[@]}"; do
                    echo "$i: ${BASH_REMATCH[$i]}";
                done;
            fi
            ```
            `0: test`

            (The results are the same when test-test/test-tesst is single- or 
double-quoted, or when the regex is stored in a single-quoted variable.)
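
For comparison, a minimal sketch of how every match could be collected in
bash itself, assuming that a single '=~' test reports only the leftmost match
in ${BASH_REMATCH[0]}. The names `str`, `re`, `rest` and `matches` are
hypothetical, and the loop assumes an unanchored pattern that never matches
the empty string.

```bash
#!/usr/bin/env bash
# Sketch: repeat the match on the remainder of the string to find all matches.
# Assumption: each [[ =~ ]] test reports only the leftmost match.
str='test-tesst'   # input taken from the report
re='[a-z]+'        # regex kept in a variable so it stays unquoted in [[ ]]
matches=()         # collected matches (hypothetical name)

rest=$str
while [[ $rest =~ $re ]]; do
    # Guard against an empty match, which would never shorten $rest.
    [[ -n ${BASH_REMATCH[0]} ]] || break
    matches+=("${BASH_REMATCH[0]}")
    # Drop everything up to and including this match, then search again.
    rest=${rest#*"${BASH_REMATCH[0]}"}
done

for i in "${!matches[@]}"; do
    echo "$i: ${matches[$i]}"
done
```

With the input above this prints `0: test` and `1: tesst`, i.e. the same
pieces that grep -E highlights.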

Fix:
        In both cases, ${BASH_REMATCH[1]} should also have results stored.
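
As background for the expectation above: the Bash manual describes index 0 of
BASH_REMATCH as the portion of the string matching the whole regular
expression, and the remaining indices as the substrings matched by
parenthesized subexpressions. A minimal sketch of that, using a hypothetical
two-group pattern that is not part of the original report:

```bash
#!/usr/bin/env bash
# Sketch: indices above 0 are filled by parenthesized subexpressions.
str='test-tesst'
re='([a-z]+)-([a-z]+)'   # hypothetical pattern with two groups
groups=()                # copy of BASH_REMATCH (hypothetical name)

if [[ $str =~ $re ]]; then
    groups=("${BASH_REMATCH[@]}")
    for i in "${!groups[@]}"; do
        echo "$i: ${groups[$i]}"
    done
fi
```

This prints `0: test-tesst`, `1: test` and `2: tesst`. Without parentheses in
the pattern, only index 0 is set.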
