Re: posix command search and execution

2023-11-07 Thread Mike Jonkmans
On Tue, Nov 07, 2023 at 11:49:25AM -0500, Chet Ramey wrote:
> On 11/7/23 8:54 AM, Mike Jonkmans wrote:

...

> > > Look at https://www.austingroupbugs.net/view.php?id=854 for a discussion
> > > of this issue.
> > Thanks for the link, I find that very hard to read though.
> It's also incomplete; there was a lot of discussion on the mailing list.
> I don't have a link to a usable public mailing list archive.

So the discussion is hidden. Hmm.
What is there did not read like much of a debate anyway; I saw little opposition.

...

> > Then again, is there a requirement for the standard utilities to be
> > found in the current PATH? Or do they just need to be present somewhere.
> They have to be findable using the value returned by `getconf PATH'. If
> the user modifies PATH to, say, prepend directories before that standard
> PATH, then all bets are off.

I see. Weirdly on Ubuntu 22.04, with /bin symlinked to /usr/bin,
`getconf PATH' produces `/bin:/usr/bin'.
That looks like a recipe for redundant `stats'.
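
A minimal sketch of how such redundancy could be detected, assuming a POSIX sh where `cd` and `pwd -P` are available. The helper name dedupe_path is made up for illustration; it keeps the first entry for each physical directory and drops later entries that resolve to the same place.

```shell
# dedupe_path is a hypothetical helper, not part of any shell or of POSIX.
# It prints a PATH-like string, dropping entries that resolve to a
# directory already covered by an earlier entry.
dedupe_path() (
  # The ( ) body runs in a subshell, so IFS and set -f do not leak out.
  IFS=:
  set -f
  seen=
  result=
  for dir in $1; do
    # Skip empty entries.
    [ -n "$dir" ] || continue
    # Resolve symlinks; skip entries that cannot be entered.
    real=$(cd "$dir" 2>/dev/null && pwd -P) || continue
    case ":$seen:" in
      *":$real:"*) ;;                      # same physical directory, drop it
      *) seen="$seen:$real"
         result="${result:+$result:}$dir" ;;
    esac
  done
  printf '%s\n' "$result"
)
```

It could be tried with `dedupe_path "$(getconf PATH)"`; on a system where /bin is a symlink to /usr/bin, only the first of the two entries should remain.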

> > > > - The 'newgrp' utility (mentioned in 1d) is not a builtin in bash.
> It's gone in the latest draft of the next version of the standard anyway.

Good riddance.

> > > > - Utilities:
> > > > https://pubs.opengroup.org/onlinepubs/9699919799/idx/utilities.html
> > > > Q: Where is `standard utilities' defined - as used in 1d.
> > > These are the standard utilities.
> > Some of these utilities are marked with optional `codes'.
> > Are these also considered standard utilities - even when the option is
> > not true?
> Not really, no. If the implementation claims to support, for instance, XSI,
> the XSI-shaded utilities have to be present and they have to behave as
> specified. If the implementation doesn't, they don't.
> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_01_04
> 
> So in 1d, if the system doesn't claim XSI conformance, the shell doesn't
> have to include type or ulimit in this required invocation order.


> (But wait! The list of intrinsics in the latest draft includes type and
> ulimit, and isn't XSI-shaded. So that will change.)

Isn't that described in `Note 0004803' in 
https://www.austingroupbugs.net/view.php?id=854
(not possible to add shade to type and ulimit in column order)

-- 
Regards, Mike Jonkmans



Re: posix command search and execution

2023-11-07 Thread Mike Jonkmans
On Tue, Nov 07, 2023 at 09:39:33AM +0700, Robert Elz wrote:
> Date:        Mon, 6 Nov 2023 14:28:24 -0500
> From:        Chet Ramey
> Message-ID:  <0ab6075e-22bf-43cd-992c-b2476f626...@case.edu>
> 
>   | On 11/6/23 10:48 AM, Mike Jonkmans wrote:
>   | > According to these docs (what I make of it), resolving is done
>   | > in steps, the first applicable step is used:
> This is one of the most debated, and stupidest, parts of posix.

Unneeded complexity, I would say.

>   | > 1b) List several names that have unspecified results.
>   | This is an ad-hoc list of builtins that shells implement,
>   | not necessarily common across all shells.
> If it were just builtins it would not be important, the issue
> is more that some shells implement some of that list as reserved
> words, or aliases, and if that's done what applications can do
> alters dramatically.   So avoiding using those words as command
> names, except when using the known features of a specific shell,
> is the best way to remain portable.

Since we are dealing with 'Simple Commands', I hadn't yet even
considered reserved words and aliases. Makes sense though.

>   | > 1c) Use a function, for functions not matching standard utilities.
> No, that's not what it says, it is except for standard utilities
> implemented as functions.   More on that below.

I see the nuance. Thanks for pointing that out.
So a function is called, unless it is provided by the implementation and
matches a standard utility.
In particular, a user function with the name of a standard utility,
will be called at this point.
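
A small sketch of that last point, assuming any POSIX-like shell; the choice of `basename' as the standard utility to shadow is arbitrary and only for illustration.

```shell
# A user-defined function that shares its name with a standard utility.
# basename is only an illustrative choice.
basename() {
  printf 'function basename called with: %s\n' "$*"
}

# Plain invocation: the function is found before any PATH search.
via_function=$(basename /usr/bin/test)

# `command' suppresses function lookup, so the real utility runs instead.
via_utility=$(command basename /usr/bin/test)

printf '%s\n%s\n' "$via_function" "$via_utility"

# Remove the function again so later commands see the real utility.
unset -f basename
```

The first line printed comes from the function, the second from the utility found through PATH.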

>   | > 1d) Lists 20 fixed utility names (like alias, cd etc.) that are
>   | >  to be invoked at this point. No PATH search yet.
>   | > These are the `regular builtins'.
> In the next standard the ones listed are the intrinsic builtins,
> and includes only those that must be builtin to work.   But
> implementations can add more to the list.

Chet mentioned that. But I find the Austin-discussion hard to read.
It makes sense to partition the builtins in three categories with
a separate name for each.

>   | > 1eI) Search is successful.
>   | > 1eIa) Check for `regular builtins' and functions
>   | >   and invoke that regular builtin/function.
>   | >   Q: Shouldn't this specify an ordering for builtins/functions?
>   | The text seems to imply that you can't have both, doesn't it?
> While I suppose you could have both, it would be very unusual.

A sketch of such an unusual case:
- the shell provides a builtin for a standard utility
- the distributor provides a function for the same utility in /etc/profile
  (maybe to mitigate some security issue)

Are scripts in /etc/profile considered part of the implementation?

...

>   | My feeling, without testing anything, is that most shells would allow
>   | functions to override builtins here.
> Since I have never seen any shell implement any standard utility
> as a function, it would be very hard to test.   Further, if they
> did, also implementing the same thing as a builtin would be
> even harder to imagine - why do it twice when one of the two
> would never be used?   So not just hard to test - probably
> impossible.
> 
> It is also unclear to me why anyone would ever implement a standard
> builtin as a function - implementing builtins is simpler for the
> implementation than functions (in my experience anyway) and in
> any case, if the rules in the standard are followed, there is
> no way (except possibly by using "command", and even that is not
> clear to me) to tell if the implementation used a function or
> a builtin (maybe the output from type might make it clear, but
> not necessarily).

It is all kind of theoretical. What surprises me is that the POSIX
specifications and definitions are sometimes imprecise or lacking.

>   | This has been an area of significant disagreement.
> It has indeed.

Agreed to disagree.
 
>   | > 1eIb) Run the utility.
>   | >   (This is where ordinary builtins should run).
>   | >   (It seems logical that a builtin takes precedence over PATH).
>   | You'd be surprised.
> Yes.   But almost all shells implement it that way, so the
> seemingly logical assumption is mostly backed by experience.

Again this is not too precise for a standard.

>   | Note that this seems to require that you can only run
>   | a builtin if it exists (or something with that name exists) in $PATH.
> A builtin for a standard utility, yes.  Unless the implementation has
> defined it as intrinsic (which the forthcoming standard allows, but
> discourages).  Applications (which includes users) who invoke non
> standard utilities are stepping outside the standard, so get
> unspecified results (so implementations can add new non-standard
> builtins without also adding a matching command in PATH without
> issues).

It doesn't sound like the easiest way out.

>   | So if you have a builtin that doesn't exist in $PATH and isn't listed as
>   | one of the regular builtins, what do you do? Even 

Re: posix command search and execution

2023-11-07 Thread Chet Ramey

On 11/7/23 8:54 AM, Mike Jonkmans wrote:
> Thanks for the answers, Chet.
>
> On Mon, Nov 06, 2023 at 02:28:24PM -0500, Chet Ramey wrote:
> > On 11/6/23 10:48 AM, Mike Jonkmans wrote:
> >
> > >   Q: Why check for regular builtins? That was already done in 1d.
> > Implementations can provide other builtins. The check in 1d is only for
> > those specific ones.
>
> It was my earlier understanding that POSIX partitions the builtins into:
> - special builtins
> - regular builtins (listed in 1d).
> - ordinary builtins (i.e. not the two others)
> Because the first two are specific lists.

I think the list in 1d defines a particular subset of regular builtins,
and that regular builtins are just non-special ones built into the shell,
as you conclude below:

> Upon rereading those, I think it is more like:
> - special builtins
> - regular builtins listed in 1d
> - regular builtins

Which still exists in the new draft standard. They're just called
`intrinsics' now, and the benefit is that shells can define anything
they want as an intrinsic utility.

> > Look at https://www.austingroupbugs.net/view.php?id=854 for a discussion
> > of this issue.
>
> Thanks for the link, I find that very hard to read though.

It's also incomplete; there was a lot of discussion on the mailing list.
I don't have a link to a usable public mailing list archive.

...

> > > Thus in posix mode, bash does not follow this part of the standard.
> > Exactly which part of the standard are you saying bash is not following?
>
> That would be PATH search failing and executing a builtin.

Yes, that's true, but no shell implements things that way, so it's a
deficiency in the standard.

> Though I think that standard utilities *must* be in PATH,
> otherwise there is no conformance a priori.

That section doesn't restrict itself to the standard utilities.

> They should be named 'forever builtins', like 'forever chemicals'.
> It's a real shame. There certainly are use cases for overriding the
> special builtins (e.g. logging around `.' a.k.a. source).

People override `exit' as well.

> > > - Regarding 1eIb.
> > >   The shells posh, dash, ksh and zsh
> > >   also run builtins, even when not found in PATH.
> > >   Checked with the `test' builtin (mv /usr/bin/test{,.sav})
> > >   on the versions found on Ubuntu 22.04.
> > Yes, this is part of the discussion of interp 854. The business of running
> > builtins other than the ones listed only after a PATH search was always
> > ahistorical.
>
> Hmm, my check may be incorrect.
> Removing the standard utility `test' is not conformant.

You don't have to remove it, just verify that any builtin is run even if
there's no corresponding utility in $PATH. But this is all cleaned up if
the shell defines the builtin as an intrinsic utility.

> Then again, is there a requirement for the standard utilities to be
> found in the current PATH? Or do they just need to be present somewhere.

They have to be findable using the value returned by `getconf PATH'. If
the user modifies PATH to, say, prepend directories before that standard
PATH, then all bets are off.

> > > - The 'newgrp' utility (mentioned in 1d) is not a builtin in bash.

It's gone in the latest draft of the next version of the standard anyway.

> > > - Utilities:
> > >   https://pubs.opengroup.org/onlinepubs/9699919799/idx/utilities.html
> > >   Q: Where is `standard utilities' defined - as used in 1d.
> > These are the standard utilities.
>
> Some of these utilities are marked with optional `codes'.
> Are these also considered standard utilities - even when the option is
> not true?

Not really, no. If the implementation claims to support, for instance, XSI,
the XSI-shaded utilities have to be present and they have to behave as
specified. If the implementation doesn't, they don't.

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_01_04

So in 1d, if the system doesn't claim XSI conformance, the shell doesn't
have to include type or ulimit in this required invocation order.

(But wait! The list of intrinsics in the latest draft includes type and
ulimit, and isn't XSI-shaded. So that will change.)

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: Defaults when cross-compiling

2023-11-07 Thread Chet Ramey

On 11/7/23 10:03 AM, Oğuz wrote:
> On Tuesday, November 7, 2023, Chet Ramey wrote:
> >
> > It's interesting that musl supports brk but not sbrk
>
> It doesn't support locales either. I always assumed it's someone's toy
> project but looks like there are Linux distros shipping it instead of
> glibc. Huh

They probably want some minimal system for containers. dash doesn't handle
multibyte characters or different locales either, but distros still use it.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: posix command search and execution

2023-11-07 Thread Chet Ramey

On 11/6/23 9:39 PM, Robert Elz wrote:

>   | > 1eI) Search is successful.
>   | > 1eIa) Check for `regular builtins' and functions
>   | >   and invoke that regular builtin/function.
>   | >   Q: Shouldn't this specify an ordering for builtins/functions?
>   |
>   | The text seems to imply that you can't have both, doesn't it?
>
> While I suppose you could have both, it would be very unusual.

Granted.

> Again, the functions that can get invoked here are only the
> standard utilities implemented as functions, all others would
> have been invoked earlier, and we would never be here.

It does imply that the implementation can't use a shell function to
override one of the utilities in 1d, since those are standard.

There goes your implementation-provided `cd' replacement function. :-)

Luckily, interp 854 resulted in changes here.

>   | My feeling, without testing anything, is that most shells would allow
>   | functions to override builtins here.
>
> Since I have never seen any shell implement any standard utility
> as a function, it would be very hard to test.

I think ksh93 has some, or used to.

> Further, if they
> did, also implementing the same thing as a builtin would be
> even harder to imagine - why do it twice when one of the two
> would never be used?   So not just hard to test - probably
> impossible.

Presumably the shell function has features beyond the builtin, and uses
the builtin for the basics. There's no good -- or useful -- way to do the
opposite.

>   | This has been an area of significant disagreement.
>
> It has indeed.
>
>   | > 1eIb) Run the utility.
>   | >   (This is where ordinary builtins should run).
>   | >   (It seems logical that a builtin takes precedence over PATH).
>   |
>   | You'd be surprised.
>
> Yes.   But almost all shells implement it that way, so the
> seemingly logical assumption is mostly backed by experience.

See below about the business of not invoking builtins that aren't in 1d
unless they're found via a PATH search.

>   | Note that this seems to require that you can only run
>   | a builtin if it exists (or something with that name exists) in $PATH.
>
> A builtin for a standard utility, yes.

If the standard wants to say that, it can. It doesn't. The language in 1e
doesn't restrict itself to "standard utilities." It's any simple command
that the shell may have implemented as a regular builtin. That's one of
the problems here.

> Unless the implementation has
> defined it as intrinsic (which the forthcoming standard allows, but
> discourages).

It shouldn't discourage that practice. It's a way for a shell to provide
users with certainty about lookup order.

> Applications (which includes users) who invoke non
> standard utilities are stepping outside the standard, so get
> unspecified results (so implementations can add new non-standard
> builtins without also adding a matching command in PATH without
> issues).

Not necessarily. Invoking a command that isn't defined in the standard
results in command behavior and effects that are outside the standard (of
course), but the way that command is invoked and the order in which
builtins/intrinsics/functions/executables are found is in the standard.

>   | So if you have a builtin that doesn't exist in $PATH and isn't listed as
>   | one of the regular builtins, what do you do? Even the unspecified list
>   | doesn't give much help.
>
> If it is a standard utility it is required to exist in PATH.

The standard's language doesn't restrict itself to standard utilities.

> If PATH has been changed so that is no longer true, then that
> is a non-conforming environment, and anything is OK.

That's not necessary.

> Similarly
> if the builtin is not a standard utility (like declare or
> enable for example).

The language in 1e doesn't restrict itself to standard utilities. If
that was the intent, the standard should have made it explicit. You can
always say "all bets are off if the command name isn't the name of one
of the utilities defined in this standard" but that isn't practical, and
the standard itself doesn't say that.

>   | This is a quality of implementation feature.
>   | Why confuse users by allowing them to define a function that
>   | will never be executed?
>
> Indeed - but you could also write that as "Why confuse users by
> allowing them to define a function that can never be invoked?"
> and by so doing, encourage more portable scripts.

What's the difference? Either way, you can't define a function with the
same name as a special builtin.

> In practice this distinction (unlike some of the other properties
> of special builtins) rarely matters, as users typically have no
> reason to define functions that override the special builtins.

Ha, you'd be surprised. It's rare, but it happens. `exit' is the one I've
seen most often (yes, even in the presence of the EXIT trap).

>   | I think the resolution to interpretation 854 addresses this. Shells
>   | who want this ordering just declare all the builtins they implement as
>   | `intrinsic' so

Re: Defaults when cross-compiling

2023-11-07 Thread Oğuz
On Tuesday, November 7, 2023, Chet Ramey wrote:
>
> It's interesting that musl supports brk but not sbrk


It doesn't support locales either. I always assumed it's someone's toy
project but looks like there are Linux distros shipping it instead of
glibc. Huh


-- 
Oğuz


Re: Defaults when cross-compiling

2023-11-07 Thread Chet Ramey

On 11/6/23 10:39 PM, Michael T. Kloos wrote:
> I was trying to cross-compile bash for musl libc.  The configure script reports:
>
> checking for working sbrk... configure: WARNING: cannot check working sbrk if cross-compiling
> yes

In this case, the bash configure assumes that sbrk is present and working,
since that's true 90+% of the time.

> However, I don't believe musl libc supports sbrk.  However, autoconf seems to default
> to assuming yes and sets the HAVE_SBRK definition.  Bash then crashes on xmalloc failure.


If sbrk doesn't work on the target platform, configure --without-bash-malloc
to avoid using it.
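
As a rough sketch of that advice: the wrapper below is hypothetical, and the target triplet in the usage note is an assumption that has to match the cross toolchain actually in use. Only the --without-bash-malloc option comes from the message above. The function runs configure when called from a bash source tree and otherwise just prints the command it would run.

```shell
# run_configure is a made-up helper for illustration.
# $1 is the target triplet passed to autoconf's standard --host option.
run_configure() {
  host=$1
  if [ -x ./configure ]; then
    # --without-bash-malloc makes bash use the C library's malloc, so the
    # sbrk check that cannot be verified when cross-compiling stops mattering.
    ./configure --host="$host" --without-bash-malloc
  else
    echo "would run: ./configure --host=$host --without-bash-malloc"
  fi
}
```

For example, `run_configure x86_64-linux-musl` from the top of the bash source tree.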

It's interesting that musl supports brk but not sbrk, since you can always
implement sbrk using brk if you know the current break.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: posix command search and execution

2023-11-07 Thread Mike Jonkmans
Thanks for the answers, Chet.

On Mon, Nov 06, 2023 at 02:28:24PM -0500, Chet Ramey wrote:
> On 11/6/23 10:48 AM, Mike Jonkmans wrote:
> 
> >   Q: Why check for regular builtins? That was already done in 1d.
> Implementations can provide other builtins. The check in 1d is only for
> those specific ones.

It was my earlier understanding that POSIX partitions the builtins into:
- special builtins
- regular builtins (listed in 1d).
- ordinary builtins (i.e. not the two others)
Because the first two are specific lists.
XBD definition of builtins:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_83
(Note that this definition says that 'regular builtins are defined
in detail in XCU Command Search and Execution'
which makes one think that the regulars are the list of 20 in 1d).
And the table of regular builtins:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tagtcjh_18
(which is mostly the same as the list in 1d - except for the XSI stuff).

Upon rereading those, I think it is more like:
- special builtins
- regular builtins listed in 1d
- regular builtins
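
One way to see how a particular shell classifies a name is `command -V', which POSIX specifies but whose output format is left unspecified. The sketch below is illustrative only; the helper name describe is made up, and the wording printed will differ between shells.

```shell
# describe is a hypothetical helper: for each name, print how the running
# shell would interpret it, or a fixed "not found" line if it cannot.
describe() {
  for name in "$@"; do
    if out=$(command -V "$name" 2>/dev/null); then
      printf '%s\n' "$out"
    else
      printf '%s: not found\n' "$name"
    fi
  done
}

# cd is one of the names listed in 1d, test is commonly a regular builtin,
# and ls is normally an external utility found through PATH.
describe cd test ls
```

Running it under several shells gives a quick picture of which utilities each one implements as builtins, though it does not distinguish the 1d list from other regular builtins.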


> > 1eIb) Run the utility.
> >   (This is where ordinary builtins should run).
> >   (It seems logical that a builtin takes precedence over PATH).
> You'd be surprised. Note that this seems to require that you can only run
> a builtin if it exists (or something with that name exists) in $PATH.

Yes, that is weird.

> Look at https://www.austingroupbugs.net/view.php?id=854 for a discussion
> of this issue.

Thanks for the link, I find that very hard to read though.

> > 1eII) Nothing in PATH, exit with 127
> So if you have a builtin that doesn't exist in $PATH and isn't listed as
> one of the regular builtins, what do you do? Even the unspecified list
> doesn't give much help.

Interpretation depends on the definition of regular builtins.
Which IMHO is not well specified.

> > I hope I have understood POSIX correctly on these points.
False hope ;)

> > In posix mode, it seems that bash:
... 
> >   A function defined before `set -o posix' will mask a
> >   special builtin. (This seems to be ok).
> It will not, at least not while posix mode is enabled. If you mean that a
> function will be executed before a special builtin when not in posix mode,
> you are correct, because there are no special builtins when not in posix
> mode.

I was mistaken. Guess I toggled the posix option too much.
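
A short sketch of the behaviour Chet describes, assuming a `bash' binary is on PATH. The function body and the exit statuses 1 and 7 are arbitrary choices for illustration.

```shell
# Run a child bash so the experiment cannot affect the calling shell.
out=$(bash -c '
  # Defined while posix mode is off.
  exit() { echo "function exit: $*"; }

  # Not in posix mode: the function is found before the builtin.
  exit 1

  set -o posix

  # In posix mode the special builtin is found first, so the shell
  # really exits here and the last line is never reached.
  exit 7
  echo "not reached"
')
status=$?

printf 'output: %s\nstatus: %s\n' "$out" "$status"
```

If the function still masked the builtin after `set -o posix', the child shell would print "not reached" and finish with status 0 instead.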

...
> > Thus in posix mode, bash does not follow this part of the standard.
> Exactly which part of the standard are you saying bash is not following?

That would be PATH search failing and executing a builtin.
Though I think that standard utilities *must* be in PATH,
otherwise there is no conformance a priori.

> The requirements concerning PATH search and builtins are different in the
> next version of POSIX, the result of interp 854. The standard already
> says this about functions with the same name as a special builtin:
> 
> "The function is named fname; the application shall ensure that it is a name
> (see XBD Name) and that it is not the name of a special built-in utility."

Nice.

> > But should it?
> > I would rather have POSIX modified to *also* accept the, more logical,
> > bash way (i.e. first matching functions, then builtins, then PATH).
> > Would that be a feasible modification to suggest to the Austingroup?
> I think the resolution to interpretation 854 addresses this. Shells
> who want this ordering just declare all the builtins they implement as
> `intrinsic' so they're not subject to a PATH search. That way there's no
> difference between the regular builtins and the ones an implementation
> chooses to provide. It still leaves posix special builtins, but I think
> those are with us forever.

They should be named 'forever builtins', like 'forever chemicals'.
It's a real shame. There certainly are use cases for overriding the
special builtins (e.g. logging around `.' a.k.a. source).

> > - Regarding 1eIb.
> >   The shells posh, dash, ksh and zsh
> >   also run builtins, even when not found in PATH.
> >   Checked with the `test' builtin (mv /usr/bin/test{,.sav})
> >   on the versions found on Ubuntu 22.04.
> Yes, this is part of the discussion of interp 854. The business of running
> builtins other than the ones listed only after a PATH search was always
> ahistorical.

Hmm, my check may be incorrect.
Removing the standard utility `test' is not conformant.
Then again, is there a requirement for the standard utilities to be
found in the current PATH? Or do they just need to be present somewhere.
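
A less invasive variant of that check, sketched under the assumption that `mktemp -d' is available (it is common but not required by POSIX): instead of moving /usr/bin/test, point PATH at an empty directory inside a subshell and see whether `test' still runs.

```shell
# An empty directory stands in for "a PATH that contains no utilities".
empty_dir=$(mktemp -d)

# The command substitution is a subshell, so the caller's PATH is untouched.
result=$(
  PATH=$empty_dir
  if test 1 -eq 1; then
    echo "test ran without a PATH entry"
  else
    echo "test was not run"
  fi
)

# Back in the parent shell, PATH is intact and rmdir can be found.
rmdir "$empty_dir"
printf '%s\n' "$result"
```

In a shell that only ran `test' after a successful PATH search, the second message would be printed, since the lookup would fail and the condition would be false.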

> > - The 'newgrp' utility (mentioned in 1d) is not a builtin in bash.
> >   This is ok. The regular builtins from 1d need not be provided. See:
> >   https://lists.gnu.org/archive/html/bug-bash/2005-02/msg00129.html
> >   Builtins are defined in:
> >   https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_83
> >   Q: Isn't that incorrect in stating where