Re: Having an alias and a function with the same name leads to some sort of recursion

2023-02-07 Thread Robert Elz
Date: Tue, 7 Feb 2023 14:35:54 -0500
From: Chet Ramey
Message-ID:  

  | On 2/7/23 12:33 PM, Dale R. Worley wrote:  (That was 7 Feb, not 2 July...)
  | > That makes it clear why the second case behaves as it does.  But my
  | > reading of the definition of "simple commands" implies that function
  | > definitions are not simple commands,

You're right, they're not.

  | > and alias substitution should not be
  | > done on them (that is, the initial part) in any case.

That's what the standard currently says, but it was never how shells
behaved (the standard was wrong).  Alias processing is a lexical issue and,
as Chet says:

  | When you parse a command and perform alias expansion, you don't yet know if
  | you're reading a simple command or a function definition.

which is why in the forthcoming edition, POSIX has been changed from:

After a token has been delimited, but before applying the
grammatical rules in Section 2.10, a resulting word that is
identified to be the command name word of a simple command shall
be examined to determine whether it is an unquoted, valid alias name.

into:

After a token has been categorized as type TOKEN (see Section 2.10.1),
including (recursively) any token resulting from an alias substitution,
the TOKEN shall be subject to alias substitution if:

* the TOKEN does not contain any quoting characters,
[...]
* the TOKEN could be parsed as the command name word of a simple
  command (see Section 2.10), based on this TOKEN and the tokens
  (if any) that preceded it, but ignoring whether any subsequent
  characters would allow that,

(There are more rules in both cases, but they're not currently relevant).

In the example case

cmd() { echo "$@" ; }

where "cmd" has been defined as an alias, when the lexical analysis phase
has read 'c' 'm' 'd' (and combined those into the token "cmd", a "word",
which POSIX calls TOKEN in 2.10.1), the next char is '(' which delimits the
current token (ends it), but that char is not processed yet; it will be part
of the following token.   The just completed token, categorised as a TOKEN
(section 2.10.1), contains no quoting chars (and, by rules not stated above,
is a valid and defined alias name, not currently being expanded), so it is
replaced by the value of the alias.
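
A minimal sketch of that sequence, assuming bash with expand_aliases
turned on (scripts do not expand aliases otherwise).  The function body
uses printf rather than echo so that the result is not recursive, and the
variable names (out, has_echo, has_cmd) are only there for illustration:

```shell
#!/usr/bin/env bash
# Assumption: bash; alias expansion has to be enabled explicitly in scripts.
shopt -s expand_aliases

alias cmd=echo

# The token "cmd" is replaced by the alias value before the parser ever
# sees "()", so this line defines a function named "echo", not "cmd".
cmd() { printf 'function body ran: %s\n' "$*"; }

# "cmd a b c" is alias-expanded to "echo a b c", which finds the function.
out=$(cmd a b c)
printf '%s\n' "$out"

# declare -F NAME succeeds only if a function with that name exists.
declare -F echo >/dev/null && has_echo=yes || has_echo=no
declare -F cmd  >/dev/null && has_cmd=yes  || has_cmd=no
printf 'function echo: %s, function cmd: %s\n' "$has_echo" "$has_cmd"

# Clean up so that the builtin echo is reachable again.
unset -f echo
unalias cmd
```

The alias and the function definition are on separate lines on purpose:
the alias only affects lines that are read after the alias command has
been executed.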

It needs to be this way, as the shell allows (always has) things like

alias thing='echo('

and then with that in effect, one can write

thing) { printf 'hello\n'; }

(where I purposely didn't create a recursive function, so you can test it).

That expands, after alias processing to:

echo() { printf 'hello\n'; }

but if we only substituted aliases in what are actually command words of
simple commands, there would have been no alias substitution there, as

thing) { 

isn't a simple command (without any alias being expanded yet), it isn't
a function definition either, it is simply a syntax error - there is no
opening '(' to match the closing ')'.

With the new rules, that is not an issue - just as it never was for shells.
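
A sketch that checks both halves of that claim, assuming bash with
expand_aliases enabled.  The text is passed through eval only so that the
script itself stays parseable before the alias exists; the names snippet,
before and out are illustrative:

```shell
#!/usr/bin/env bash
# Assumption: bash; alias expansion has to be enabled explicitly in scripts.
shopt -s expand_aliases

# The text from the example above, kept in a variable so it can be parsed
# twice: once without the alias and once with it.
snippet="thing) { printf 'hello\n'; }"

# Without the alias the text is expected to be rejected by the parser.
# The subshell keeps a parse failure from affecting this script.
if ( eval "$snippet" ) 2>/dev/null; then
  before=accepted
else
  before=rejected
fi

# With the alias in effect, "thing" becomes "echo(" while the text is
# being tokenized, so the same text now defines a function named "echo".
alias thing='echo('
eval "$snippet"

out=$(echo these arguments are ignored)
printf 'without alias: %s\nwith alias, echo prints: %s\n' "$before" "$out"

# Clean up so that the builtin echo is reachable again.
unset -f echo
unalias thing
```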

Of course all of this is absurd, impossible to explain to anyone who doesn't
already understand the difference between lexical analysis and parsing,
or what either of those have to do with running shell commands, and just
provides another reason that aliases should be abandoned completely.

It also justifies the current bash manual page remaining as it is, even though
it is not technically correct - it is close enough for people who want (for
unexplainable reasons) to use aliases to work out how to use them in normal
cases.

kre

ps: even as rewritten, the standard is not perfect, as if the input were

$x thing) ...
or
$x cmd() ...

then if $x != '' there's certainly no alias processing of thing or
cmd, as the first word of the value of $x would be the command word, and
what follows would be that command's args (including "thing" or "cmd").
In either case the parentheses will cause a syntax error, unless some
non-standard shell syntax permits them.

But if $x == '' then, since the expansion is unquoted, it simply vanishes,
in which case "thing" (or "cmd") would be in the command word position,
so strictly, given the above input, "thing" or "cmd" *could* be a TOKEN in
the command word position (we don't know, as the lexer doesn't evaluate $x),
so according to the new wording, should be alias expanded (assuming the
word is defined as an alias).   But that's not how shells work either.
The lexical analysis phase just sees $x as one word, delimited by white
space (which the lexer deletes) followed by "thing" (or "cmd") as another
word - the 2nd word in a sequence of words isn't in the command word position,
so is never considered for alias processing.
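
A small sketch of that behaviour, assuming bash with expand_aliases
enabled and leaving the parentheses out so that the input is valid.  A
function and an alias share the name "thing"; which one runs shows whether
alias substitution happened.  The names first and second are illustrative:

```shell
#!/usr/bin/env bash
# Assumption: bash; alias expansion has to be enabled explicitly in scripts.
shopt -s expand_aliases

# Define the function first, while no alias named "thing" exists yet,
# so the definition itself is not rewritten.
thing() { printf 'function\n'; }
alias thing="printf 'alias\n'"

x=''

# Lexically the first word: the alias is substituted.
first=$(thing)

# Lexically the second word: no alias substitution.  $x later expands to
# nothing, "thing" ends up as the command name, and the function runs.
second=$($x thing)

printf 'thing    -> %s\n' "$first"
printf '$x thing -> %s\n' "$second"

unalias thing
unset -f thing
```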

Note that even if it wanted to, the lexer cannot expand $x to find out what
it will be, as we might be in a loop like


Re: Having an alias and a function with the same name leads to some sort of recursion

2023-02-07 Thread Chet Ramey

On 2/7/23 12:33 PM, Dale R. Worley wrote:


> ALIASES
>    Aliases  allow a string to be substituted for a word when it is used as
>    the first word of a simple command.
>
> That makes it clear why the second case behaves as it does.  But my
> reading of the definition of "simple commands" implies that function
> definitions are not simple commands, and alias substitution should not be
> done on them (that is, the initial part) in any case.


When you parse a command and perform alias expansion, you don't yet know if
you're reading a simple command or a function definition.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: Having an alias and a function with the same name leads to some sort of recursion

2023-02-07 Thread Dale R. Worley
Robert Elz  writes:
>   | Aliases are not used in bash scripts, unless bash is invoked in POSIX
>   | compatibility mode, or the "expand_aliases" shopt is turned on.
>
> I think that's what must have happened ... the infinite loop of
> echo commands suggests that the function definition
>
>   cmd() { echo "$@" ; }
>
> was converted by the alias into
>
>   echo() { echo "$@" ; }
>
> and when you see that, it is obvious why cmd a b c (which becomes echo a b c)
> just runs echo which runs echo which runs echo which ...

Heh -- but OTOH, if you use

function cmd() { echo "$@" ; }

you *don't* get that behavior.
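
A sketch of the difference, assuming bash (the "function" keyword is not
in POSIX) with expand_aliases enabled.  The variable names are only
illustrative:

```shell
#!/usr/bin/env bash
# Assumption: bash; alias expansion has to be enabled explicitly in scripts.
shopt -s expand_aliases

alias cmd=echo

# "function" is the first word on this line, so "cmd" is not in the
# command word position and is left alone: the function is named "cmd".
function cmd() { printf 'function cmd: %s\n' "$*"; }

declare -F cmd  >/dev/null && has_cmd=yes  || has_cmd=no
declare -F echo >/dev/null && has_echo=yes || has_echo=no

# A plain "cmd" is still alias-expanded, so this runs the builtin echo.
via_alias=$(cmd a b)

# Quoting any part of the word suppresses alias expansion, so this
# reaches the function.
via_quote=$(\cmd a b)

printf 'functions: cmd=%s echo=%s\n' "$has_cmd" "$has_echo"
printf 'cmd a b  -> %s\n' "$via_alias"
printf '\\cmd a b -> %s\n' "$via_quote"

unalias cmd
unset -f cmd
```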

Looking at the manual page, it says

ALIASES
   Aliases  allow a string to be substituted for a word when it is used as
   the first word of a simple command.

That makes it clear why the second case behaves as it does.  But my
reading of the definition of "simple commands" implies that function
definitions are not simple commands, and alias substitution should not be
done on them (that is, the initial part) in any case.

Dale



[PATCH] use bind_lastarg to restore $_ when executing variable

2023-02-07 Thread Emanuele Torre
Before this patch, if allexport was set, $_ gained the "x" attribute
after PROMPT_COMMAND finished running, and the attribute was only removed
after the next simple command was executed.

  $ PROMPT_COMMAND=:
  $ : foo
  $ declare -p _
  declare -- _="foo"
  $ set -a
  $ : bar
  $ declare -p _
  declare -x _="bar"
  $ : zoo; declare -p _; : bee
  declare -- _="zoo"
  $ declare -p _
  declare -x _="bee"
---
 execute_cmd.c | 3 +--
 execute_cmd.h | 2 ++
 parse.y       | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/execute_cmd.c b/execute_cmd.c
index d4c082d0..18540409 100644
--- a/execute_cmd.c
+++ b/execute_cmd.c
@@ -121,7 +121,6 @@ extern int close (int);
 /* Static functions defined and used in this file. */
 static void close_pipes (int, int);
 static void do_piping (int, int);
-static void bind_lastarg (char *);
 static int shell_control_structure (enum command_type);
 static void cleanup_redirects (REDIRECT *);
 
@@ -4000,7 +3999,7 @@ execute_cond_command (COND_COM *cond_command)
 }
 #endif /* COND_COMMAND */
 
-static void
+void
 bind_lastarg (char *arg)
 {
   SHELL_VAR *var;
diff --git a/execute_cmd.h b/execute_cmd.h
index 37e386ab..944b6a98 100644
--- a/execute_cmd.h
+++ b/execute_cmd.h
@@ -85,6 +85,8 @@ extern void dispose_exec_redirects (void);
 
 extern int execute_shell_function (SHELL_VAR *, WORD_LIST *);
 
+extern void bind_lastarg (char *);
+
 extern struct coproc *getcoprocbypid (pid_t);
 extern struct coproc *getcoprocbyname (const char *);
 
diff --git a/parse.y b/parse.y
index f5199f15..8db9bee6 100644
--- a/parse.y
+++ b/parse.y
@@ -2848,7 +2848,7 @@ execute_variable_command (const char *command, const char *vname)
   parse_and_execute (savestring (command), vname, SEVAL_NONINT|SEVAL_NOHIST|SEVAL_NOOPTIMIZE);
 
   restore_parser_state ();
-  bind_variable ("_", last_lastarg, 0);
+  bind_lastarg (last_lastarg);
   FREE (last_lastarg);
 
   if (token_to_read == '\n')   /* reset_parser was called */
-- 
2.39.1