Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Emanuele Torre
On Wed, Oct 18, 2023 at 11:24:11AM -0400, Chet Ramey wrote:
> On 10/18/23 10:50 AM, Chet Ramey wrote:
> 
> > > Is it really ok to break that behaviour?
> > 
> > That's why, if you want the first problem repaired, you have to specify
> > which word expansions are ok to parse and which are not. Should it be
> > only command and process substitution, which can independently specify
> > brace expansions? Should it be other parameter expansions, which can
> > contain quoted strings and command substitutions? Neither? Both?
> 
> It looks like the answer is probably "neither."

I have examined the code, and I now understand why "${foo:-"{1..2}"}"
etc. were allowed even though they were not supposed to be.

It is hard to specify a behaviour that preserves brace expansions in
double quoted parts within a parameter substitution.

I have no real reason to oppose the removal of "${arr["{foo,bar}"]}"
and similar; I have only ever used that when golfing... I just wanted to
mention that it now does not work in case someone cared more about this
feature.

It could be allowed only in the double quoted parts of the parameter
expansions. But it is hard to tell when a double quote is literal or not
in a PE with the existing code, so maybe not worth doing if no one
cares. Removing this behaviour while fixing "${x#'`'}" is fine for me.




While we are at it, when examining this issue now, I have also noticed
that, regardless of whether brace expansion is enabled or not, the bash
parser always reads single quotes in double quoted PEs as if they are
not literal, even though they are literal if the PE is -, =, or +.

For example:

  $ bash -c "echo x\"\${foo-'}"
  bash: -c: line 1: unexpected EOF while looking for matching `''
  bash: -c: line 2: syntax error: unexpected end of file
  $ bash -c "echo x\"\${foo-'$(echo hi >&2)'}"
  hi
  x''
  $ bash -c "echo \"\${foo#'$(echo hi >&2)'}"
  x

  $ # }" in single quotes does not terminate the PE even though the
  $ # single quote is supposed to be literal, and it will be interpreted
  $ # as literal
  $ printf %s\\n x"${foo:-'}" hello hi'}"
  x'} hello hi'

Could/Should this also be fixed so that  "${foo:-'}"  (and +, and =) is
read correctly? Other shells including ksh93, mksh, dash, NetBSD ash,
and busybox can read it correctly.

The only shell I found that does not handle that correctly other than
bash is pdksh, but only because it treats single quotes as not literal
even if the double quoted PE is +, - or =.
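
For reference, the asymmetry described in this message can be checked with a short sketch. The variable names `default_word` and `stripped` are only illustrative and do not come from the thread. It shows that, inside a double-quoted expansion, single quotes in the word of `-` are kept literally, while in the pattern of `#` they act as quotes and are removed.

```shell
#!/usr/bin/env bash
# Illustrative names only: default_word and stripped are not from the thread.
unset foo

# With -, = and +, single quotes inside a double-quoted PE are literal.
default_word="${foo-'a'}"
printf '%s\n' "$default_word"    # prints 'a' (quotes included)

foo=abc
# With # and %, single quotes quote the pattern and are then removed.
stripped="${foo#'a'}"
printf '%s\n' "$stripped"        # prints bc
```

Both results are the same whether or not brace expansion is enabled; only the unbalanced-quote cases shown earlier trip the parser.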

Thank you!
 emanuele6



Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Chet Ramey

On 10/18/23 10:50 AM, Chet Ramey wrote:


> > Is it really ok to break that behaviour?
>
> That's why, if you want the first problem repaired, you have to specify
> which word expansions are ok to parse and which are not. Should it be
> only command and process substitution, which can independently specify
> brace expansions? Should it be other parameter expansions, which can
> contain quoted strings and command substitutions? Neither? Both?


It looks like the answer is probably "neither."

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Chet Ramey

On 10/18/23 10:24 AM, Emanuele Torre wrote:
> On Wed, Oct 18, 2023 at 09:50:00AM -0400, Chet Ramey wrote:
> > On 10/17/23 5:55 PM, Emanuele Torre wrote:
> > > > braces.c
> > > > - brace_gobbler: use extract_dollar_brace_string if we see ${ with
> > > >   the appropriate value of QUOTING, so we don't have to teach brace
> > > >   expansion more shell syntax.
> > > >   Report from Emanuele Torre
> > > > - brace_gobbler: call the word extraction functions with SX_NOALLOC
> > > >   so we don't have to allocate memory we're just going to free
> > >
> > > That patch fixed the bug with "${foo#'$('}", but it also broke the
> > > "${arr["{start..end}"]}" / "${arr["{foo,bar}"]}" patterns.
> >
> > How much shell syntax do you want? You want brace expansion to detect
> > some issues with parameter expansion but ignore others, and detect
> > some expansions but not others, without supplying requirements.
>
> I didn't even think that the "${foo#'$('}" had something to do with
> brace expansion.

OK, now you know, and now you know where the fix needs to be applied.

> Now I am reporting that the patch that was supposed to fix that problem

It does fix the reported problem. You can't fully consume shell word
expansions without parsing them as word expansions, and you can't avoid
the reported problem without parsing the word expansion.

> made "${foo["{2,1}"]}" no longer expand to "${foo["2"]}" "${foo["1"]}".

Hence my point: to fix the reported problem, you have to parse the word
expansion, since simple quoted string skipping doesn't do it. To allow
this syntax, you cannot parse the word expansion.

> Brace expansion between "${foo[" and "]}" is something that has always
> worked: at least since bash 2.05b (I have not checked with earlier
> versions after the introduction of arrays).
>
> Is it really ok to break that behaviour?

That's why, if you want the first problem repaired, you have to specify
which word expansions are ok to parse and which are not. Should it be
only command and process substitution, which can independently specify
brace expansions? Should it be other parameter expansions, which can
contain quoted strings and command substitutions? Neither? Both?

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Emanuele Torre
On Wed, Oct 18, 2023 at 09:50:00AM -0400, Chet Ramey wrote:
> On 10/17/23 5:55 PM, Emanuele Torre wrote:
> > > braces.c
> > >   - brace_gobbler: use extract_dollar_brace_string if we see ${ with
> > > the appropriate value of QUOTING, so we don't have to teach brace
> > > expansion more shell syntax.
> > > Report from Emanuele Torre 
> > >   - brace_gobbler: call the word extraction functions with SX_NOALLOC
> > > so we don't have to allocate memory we're just going to free
> > 
> > That patch fixed the bug with "${foo#'$('}", but it also broke the
> > "${arr["{start..end}"]}" / "${arr["{foo,bar}"]}" patterns.
> 
> How much shell syntax do you want? You want brace expansion to detect
> some issues with parameter expansion but ignore others, and detect
> some expansions but not others, without supplying requirements.

I didn't even think that the "${foo#'$('}" had something to do with
brace expansion. I only noticed that "${foo#'$('}" started being a
syntax error on evaluation in bash 4.3 (apparently only when brace
expansion is enabled), and I thought that was not correct, so I reported
it.

Now I am reporting that the patch that was supposed to fix that problem
made "${foo["{2,1}"]}" no longer expand to "${foo["2"]}" "${foo["1"]}".

Brace expansion between "${foo[" and "]}" is something that has always
worked: at least since bash 2.05b (I have not checked with earlier
versions after the introduction of arrays).

Is it really ok to break that behaviour?

o/
 emanuele6



Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Greg Wooledge
On Wed, Oct 18, 2023 at 09:39:36AM -0400, Zachary Santer wrote:
> I guess I still want to hear about "${#@}" and $[  ].

$[ ] is officially deprecated, and users are advised to stop using it.
It was originally going to be the syntax for arithmetic expansion, and
made it as far as some POSIX rough draft, I think.  But then the
decision was made to go with $(( )) instead.

Meanwhile, bash had already shipped releases with support for $[ ] and
people had already started using it.  So, it's still *there*, and still
works, despite being deprecated for literally decades.  I don't know
whether it'll ever actually be removed -- maybe it'll just linger until
POSIX decides to use $[ ] for something else and forces a change.
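
A minimal sketch of the equivalence, assuming a bash release that still accepts the deprecated form (the variable names `old_style` and `new_style` are only illustrative):

```shell
#!/usr/bin/env bash
# Deprecated bash-only spelling, still accepted by current releases.
old_style=$[2+3*4]
# Standard POSIX arithmetic expansion.
new_style=$((2+3*4))
printf '%s %s\n' "$old_style" "$new_style"   # prints 14 14
```

New scripts should use only the $(( )) form; the $[ ] line is here purely to show that the two spellings evaluate the same expression.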

${#@} is simply the use of @ as a pseudo array name, which works in many
places, the most common being "$@".  For example,

unicorn:~$ set -- one two three four
unicorn:~$ echo "${@//o/x}"
xne twx three fxur

In this context, @ can be thought of as a shortcut for argv[@] where
argv contains the positional parameters.  ${#@} is therefore analogous
to ${#argv[@]} which counts the number of elements in an array named
argv.
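
The analogy can be sketched as follows. Here `count_args` is a hypothetical helper, and `argv` is an ordinary array that the function fills itself; bash does not provide a variable of that name.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper; argv is a local copy of "$@".
count_args() {
  local argv=("$@")   # copy the positional parameters into a real array
  # ${#@}, $# and ${#argv[@]} all report the same count in bash.
  printf '%s %s %s\n' "${#@}" "$#" "${#argv[@]}"
}

count_args one two three four   # prints 4 4 4
```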

It's not a perfect analogy, though.  For example, this doesn't work:

unicorn:~$ echo "${!@}"
bash: one two three four: invalid variable name

That's due to the syntax collision between ${!ref} and ${!array[@]}.
The parser tries to apply the former rather than the latter.
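
For comparison, here are the two ${!...} forms that collide, as a small sketch with made-up variable names (`target`, `ref`, `array`, `indirect`, `keys`):

```shell
#!/usr/bin/env bash
# All variable names here are illustrative.
target=value
ref=target
indirect="${!ref}"      # indirect expansion: value of the variable named by $ref

array=(a b c)
keys="${!array[*]}"     # list of indices, joined with the first character of IFS

printf '%s\n' "$indirect" "$keys"   # prints "value", then "0 1 2"
```

With "${!@}" the parser takes the first reading, so it tries to use the joined positional parameters as a variable name, which is the error shown above.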

In practical terms, there is no reason to use ${#@}.  It's the same
as $# except that the latter is shorter and more portable.  But at the
same time, there's no reason to say anything about ${#@}, since nobody
ought to be using it anyway.  And if they do use it, what's the harm?



Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Chet Ramey

On 10/18/23 9:39 AM, Zachary Santer wrote:

> I guess I still want to hear about "${#@}" and $[  ]. Sorry about bringing
> them up in relation to something that *is* documented.


The former is unspecified, but seems reasonable: $@ expands the positional
parameters to a set of words, and the # counts them. Behavior varies across
shells, but no shell flags it as an error.

$[...] still works as an equivalent to $((...)), since it was the POSIX
way to perform arithmetic expansion before $((...)).

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Chet Ramey

On 10/17/23 5:55 PM, Emanuele Torre wrote:

> > braces.c
> > - brace_gobbler: use extract_dollar_brace_string if we see ${ with
> >   the appropriate value of QUOTING, so we don't have to teach brace
> >   expansion more shell syntax.
> >   Report from Emanuele Torre
> > - brace_gobbler: call the word extraction functions with SX_NOALLOC
> >   so we don't have to allocate memory we're just going to free
>
> That patch fixed the bug with "${foo#'$('}", but it also broke the
> "${arr["{start..end}"]}" / "${arr["{foo,bar}"]}" patterns.


How much shell syntax do you want? You want brace expansion to detect
some issues with parameter expansion but ignore others, and detect
some expansions but not others, without supplying requirements.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Zachary Santer
On Wed, Oct 18, 2023 at 8:47 AM alex xmb sw ratchev 
wrote:

> by chet stating many times that every bash item undergoes expansion ..
>
As in, Bash expands
$ printf '%s\n' "${array["{2..6..2}"]}"
to
$ printf '%s\n' "${array[2]}" "${array[4]}" "${array[6]}"
on its way to giving you
two
four
six
That took me a while to comprehend. Alright, I take back that it isn't
documented anywhere.

> u miss at least once the quotes of ['{blabla}']
>
Yeah, I was trying different, seemingly-related things.

> its both the 1+ args
> $@ expands to different args
> "$@" to preserve spacesafe
>
> $* expands the same ( all args ) to one-arg
> "$*" spacesafe
>
I know that. I don't really get what you're getting at, here.

This doesn't work either, obviously:
$ printf '%s\n' "${array["${indices[*]}"]}"
-bash: 2 4 6: syntax error in expression (error token is "4 6")
$ printf '%s\n' "${array["${indices[@]}"]}"
-bash: 2 4 6: syntax error in expression (error token is "4 6")

I guess I still want to hear about "${#@}" and $[  ]. Sorry about bringing
them up in relation to something that *is* documented.

Zack


Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Greg Wooledge
On Wed, Oct 18, 2023 at 08:19:35AM -0400, Zachary Santer wrote:
> On Tue, Oct 17, 2023 at 5:56 PM Emanuele Torre 
> wrote:
> 
> > bash-5.1$ letters=( {a..z} ); echo "${letters["{10..15}"]}"
> > k l m n o p
> >
> 
> Then there's the question "Was that even supposed to work like that?"

This is another one of those cases where some end user did something
wacky and it produced a result that they liked, so they kept doing it.
At some point, they convinced themselves that this was an intended
feature, and they'd probably be upset if it stopped "working".

It's a pretty straightforward application of brace expansion, but not
one I would have thought to use.

> If
> so, you'd think it would generalize to being able to pass a series of
> whitespace-delimited indices to an array expansion.

That's not what's happening, at all.

unicorn:~$ set -x
unicorn:~$ echo "a b"{c,d}"e f"
+ echo 'a bce f' 'a bde f'
a bce f a bde f

Brace expansion is extremely low-level text macro expansion.  In my
example above, the thing after the echo consists of a single parser-word
which contains two quoted sections, and a brace expansion.  The brace
expansion fires first, and causes two words to be generated: "a b"c"e f"
is the first, and "a b"d"e f" is the second.

After those words are generated, quote removal occurs, and the result
is what you see in the -x output.
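
The same ordering can be observed without set -x by capturing the generated words in an array. This is only a sketch; the array names `words` and `literal` and the variable `n` are illustrative.

```shell
#!/usr/bin/env bash
# One parser-word with two quoted sections becomes two words.
words=( "a b"{c,d}"e f" )
printf '%s\n' "${#words[@]}" "${words[@]}"   # prints 2, "a bce f", "a bde f"

# Brace expansion runs before parameter expansion, so a variable
# cannot supply a range endpoint: the braces are left as text.
n=3
literal=( {1..$n} )
printf '%s\n' "${literal[@]}"                # prints {1..3}
```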

> In Bash 5.2:
> $ array=( zero one two three four five six )
> $ printf '%s\n' "${array["{2..6..2}"]}"
> two
> four
> six

This one "works" because the final parser-word is a brace expansion with
quoted sections before and after it.  The brace expansion causes three
words to be generated: "${array["2"]}" is the first, and so on.

You're basically typing

printf '%s\n' "${array["2"]}" "${array["4"]}" "${array["6"]}"

or rather, you're letting the brace expansion type it for you.

It just so happens that "${array["2"]}" is accepted as an indexed array
element expansion, despite the extra quotes inside the square brackets.

> $ printf '%s\n' "${array[{2..6..2}]}"
> -bash: {2..6..2}: syntax error: operand expected (error token is
> "{2..6..2}")

Here, what would have been a valid brace expansion is inside quotes, so
it's not expanded.  "{2..6..2}" is not a valid indexed array index value.

> $ printf '%s\n' "${array["2 4 6"]}"
> -bash: 2 4 6: syntax error in expression (error token is "4 6")

"2 4 6" is not a valid indexed array index value either.

> $ printf '%s\n' "${array[2 4 6]}"
> -bash: 2 4 6: syntax error in expression (error token is "4 6")

Same here, just without the extra quotes.

> $ printf '%s\n' "${array[2,4,6]}"
> six

In this case, "2,4,6" *is* a valid indexed array index.  It's an
arithmetic expression, whose value is that of the thing after the
last comma.

A comma series in an arithmetic expression is intended to be used in
cases like:

unicorn:~$ let 'a=1,b=a+1'; declare -p a b
declare -- a="1"
declare -- b="2"

The evaluation of the comma series as the last element is mostly an
afterthought.  "What else would it evaluate to?"  We're more interested
in the side effects, rather than the final value.

This comes from C, where the most common use is something like:

for (i=1,j=2; i<10; i++,j+=2) {...}

The comma series lets you perform two assignments/alterations in a
place where the syntax only asks for one.
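
A short sketch of both uses in bash, reusing the array from the quoted example (the variable names `last_of_series` and `element` are only illustrative):

```shell
#!/usr/bin/env bash
array=( zero one two three four five six )

# The subscript is an arithmetic expression; 2,4,6 evaluates to 6.
last_of_series=$(( 2, 4, 6 ))
element="${array[2,4,6]}"

# Side effects are the usual reason to use the comma operator.
(( i = 1, j = i + 1 ))

printf '%s %s %s %s\n' "$last_of_series" "$element" "$i" "$j"   # prints 6 six 1 2
```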

> $ indices=(2 4 6)
> $ printf '%s\n' "${array[${indices[@]}]}"
> -bash: 2 4 6: syntax error in expression (error token is "4 6")
> $ printf '%s\n' "${array[${indices[*]}]}"
> -bash: 2 4 6: syntax error in expression (error token is "4 6")

In these cases, the inner expression is evaluated, yielding an
invalid indexed array index value.  "2 4 6" isn't allowed as an index,
no matter where you pull it from.
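
If the goal is to pull several elements out by index, one approach that does not depend on brace expansion inside the subscript is an explicit loop. This is a sketch, not something proposed in the thread, and the name `selected` is illustrative.

```shell
#!/usr/bin/env bash
array=( zero one two three four five six )
indices=( 2 4 6 )

# Collect one element per index; each subscript is a single valid expression.
selected=()
for i in "${indices[@]}"; do
  selected+=( "${array[i]}" )
done

printf '%s\n' "${selected[@]}"   # prints two, four, six on separate lines
```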

> Considering I don't think this is documented anywhere, and what's in
> between the square brackets gets an arithmetic expansion applied to it, I'm
> going to guess "no."
> 
> So how important is it to maintain undocumented behavior?

There is one analogous case where the behavior *did* change.  Consider
a command like this:

unicorn:~$ echo <(printf '%s\n' {a..c})
/dev/fd/63

In bash 5.2, we get the result shown above.  The brace expansion happens
inside the process substitution.  It's as if we had typed

echo <(printf '%s\n' a b c)

In bash 3.2, however:

unicorn:~$ bash-3.2
unicorn:~$ echo <(printf '%s\n' {a..c})
/dev/fd/63 /dev/fd/62 /dev/fd/61

Here, the brace expansion happened *first*, and we got three separate
process substitution words out of it.  It's as if we had typed

echo <(printf '%s\n' a) <(printf '%s\n' b) <(printf '%s\n' c)

Changing this was significant, and caused some scripts to break, and
some people to become confused.  But it caused *more* people to be less
confused, because the current behavior looks a lot more reasonable to
more people vs. the previous behavior.
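
One way to tell which behavior a given bash has is to count the words the process substitution produces. This is a sketch that assumes a bash with the current behavior described above; `count_words` is a hypothetical helper and `substitutions` an illustrative variable name.

```shell
#!/usr/bin/env bash
# count_words is a hypothetical helper that reports its argument count.
count_words() { printf '%s\n' "$#"; }

# With the current behavior the braces expand inside the substitution,
# so count_words receives a single /dev/fd/N style argument.
substitutions=$(count_words <(printf '%s\n' {a..c}))
printf '%s\n' "$substitutions"
```

Going by the bash 3.2 transcript above, the old behavior would report 3 here instead of 1.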

So, there's precedent for changing something that seems wrong, when it
comes to brace expansions.  The first question is whether the 

Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread alex xmb sw ratchev
On Wed, Oct 18, 2023, 14:20 Zachary Santer  wrote:

> On Tue, Oct 17, 2023 at 5:56 PM Emanuele Torre 
> wrote:
>
> > bash-5.1$ letters=( {a..z} ); echo "${letters["{10..15}"]}"
> > k l m n o p
> >
>
> Then there's the question "Was that even supposed to work like that?" If
> so, you'd think it would generalize to being able to pass a series of
> whitespace-delimited indices to an array expansion.
>

by chet stating many times that every bash item undergoes expansion ..
like why [[ -v var[\$k] ]] .. cause $k once and if not \$ again in parsing
the bash cmd / keyword / whatever

> In Bash 5.2:
> $ array=( zero one two three four five six )
> $ printf '%s\n' "${array["{2..6..2}"]}"
> two
> four
> six
> $ printf '%s\n' "${array[{2..6..2}]}"
> -bash: {2..6..2}: syntax error: operand expected (error token is
> "{2..6..2}")
> $ printf '%s\n' "${array["2 4 6"]}"
> -bash: 2 4 6: syntax error in expression (error token is "4 6")
> $ printf '%s\n' "${array[2 4 6]}"
> -bash: 2 4 6: syntax error in expression (error token is "4 6")
> $ printf '%s\n' "${array[2,4,6]}"
> six
> $ indices=(2 4 6)
> $ printf '%s\n' "${array[${indices[@]}]}"
> -bash: 2 4 6: syntax error in expression (error token is "4 6")
> $ printf '%s\n' "${array[${indices[*]}]}"
> -bash: 2 4 6: syntax error in expression (error token is "4 6")
>

u miss at least once the quotes of ['{blabla}']

> Considering I don't think this is documented anywhere, and what's in
> between the square brackets gets an arithmetic expansion applied to it, I'm
> going to guess "no."
>
> So how important is it to maintain undocumented behavior?
>
> Why does "${#@}" expand to the same thing as "${#}"? Why is $[  ]
> equivalent to $((  ))? Does that stuff need to continue to work forever?
>

its both the 1+ args
$@ expands to different args
"$@" to preserve spacesafe

$* expands the same ( all args ) to one-arg
"$*" spacesafe

> Zack

x


Re: bash tries to parse comsub in quoted PE pattern

2023-10-18 Thread Zachary Santer
On Tue, Oct 17, 2023 at 5:56 PM Emanuele Torre 
wrote:

> bash-5.1$ letters=( {a..z} ); echo "${letters["{10..15}"]}"
> k l m n o p
>

Then there's the question "Was that even supposed to work like that?" If
so, you'd think it would generalize to being able to pass a series of
whitespace-delimited indices to an array expansion.

In Bash 5.2:
$ array=( zero one two three four five six )
$ printf '%s\n' "${array["{2..6..2}"]}"
two
four
six
$ printf '%s\n' "${array[{2..6..2}]}"
-bash: {2..6..2}: syntax error: operand expected (error token is
"{2..6..2}")
$ printf '%s\n' "${array["2 4 6"]}"
-bash: 2 4 6: syntax error in expression (error token is "4 6")
$ printf '%s\n' "${array[2 4 6]}"
-bash: 2 4 6: syntax error in expression (error token is "4 6")
$ printf '%s\n' "${array[2,4,6]}"
six
$ indices=(2 4 6)
$ printf '%s\n' "${array[${indices[@]}]}"
-bash: 2 4 6: syntax error in expression (error token is "4 6")
$ printf '%s\n' "${array[${indices[*]}]}"
-bash: 2 4 6: syntax error in expression (error token is "4 6")

Considering I don't think this is documented anywhere, and what's in
between the square brackets gets an arithmetic expansion applied to it, I'm
going to guess "no."

So how important is it to maintain undocumented behavior?

Why does "${#@}" expand to the same thing as "${#}"? Why is $[  ]
equivalent to $((  ))? Does that stuff need to continue to work forever?

Zack