Re: sh 'continue' shenanigans: negating

2024-02-15 Thread Chet Ramey via austin-group-l at The Open Group

On 2/14/24 6:40 PM, Christoph Anton Mitterer wrote:

> On Wed, 2024-02-14 at 09:18 -0500, Chet Ramey via austin-group-l at The
> Open Group wrote:
>
>> POSIX requires this, since it says that return sets $? to 1 here.
>
> I assume you mean the description of the exit status from `return`?


No, I mean POSIX specifies that `return' sets the value of $? directly.
It's not set from the (possibly inverted) return status from `return'.


>     The value of the special parameter '?' shall be set to n, an
>     unsigned decimal integer, or to the exit status of the last command
>     executed if n is not specified.
>
> If so, then IMO strictly speaking, it doesn't say whose $? shall be set
> that way.


That doesn't make sense as written unless you're using $? as a shorthand
for a command's return status.

> But is there anything that prevents one from interpreting it as the $?
> of the `return` itself? Similar to what you said above, the `continue`
> has one as it's a built-in?

The difference in the descriptions is that the `return' description
talks about setting $?:

"The value of the special parameter '?' shall be set to n"

where the `continue' description has a generic description of its
return status.

There was a discussion about this -- at least the language -- in
https://www.austingroupbugs.net/view.php?id=1309 resulting in
changes to the description for the next edition.
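
A small probe, offered as a sketch rather than a conformance test, makes the distinction observable in whatever shell runs it. The function name `f` is borrowed from the one-liner quoted elsewhere in this thread; the variable names are illustrative only. The probe deliberately asserts nothing about what the values ought to be, since that is the point under discussion.

```shell
set +e   # the probes below run failing commands on purpose

# What is left in $? after a negated `return 1' inside a function?
f() { ! return 1; }
f
return_status=$?

# What is left in $? after a negated `continue' ends a loop iteration?
for i in 1; do
    ! continue
done
continue_status=$?

echo "after '! return 1': $return_status"
echo "after '! continue': $continue_status"
```

Running the same file under several shells (bash, dash, mksh, yash) is the quickest way to see where implementations agree and where they do not.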

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: sh 'continue' shenanigans: negating

2024-02-14 Thread Chet Ramey via austin-group-l at The Open Group

On 2/14/24 12:06 AM, Oğuz wrote:
> On Tuesday, February 13, 2024, Chet Ramey via austin-group-l at The Open
> Group <austin-group-l@opengroup.org> wrote:
>
>> `continue' is a builtin; continue has a return status; `!' says to
>> negate it. It seems easy to come to the conclusion that the script
>> should return 1.
>
> The same can be said about `return'. But bash disagrees:
>
>      $ bash -c 'f(){ ! return 1;}; f; echo $?'
>      1
>      $
>
> Does POSIX allow this or is it another case where bash diverges from POSIX?


POSIX requires this, since it says that return sets $? to 1 here. If
you find cases where you believe bash differs from POSIX, and it's not
documented in POSIX mode, please report them.




Re: sh 'continue' shenanigans: negating

2024-02-13 Thread Chet Ramey via austin-group-l at The Open Group

On 2/13/24 2:48 PM, Thorsten Glaser via austin-group-l at The Open Group wrote:

> Hi,
>
> I’ve got the following issue, and… yes I can see how the
> reporter could come to the conclusion that it should “return”
> 1, but…
>
> … at what point does “continue” “return”? Where do I stop
> operating?


`continue' is a builtin; continue has a return status; `!' says to
negate it. It seems easy to come to the conclusion that the script
should return 1.




Re: sh: set -o pipefail by default

2024-01-11 Thread Chet Ramey via austin-group-l at The Open Group
On 1/11/24 3:53 AM, Andrew Pennebaker via austin-group-l at The Open Group 
wrote:
> With sh gaining set -o pipefail, I am curious about having sh require (or
> encourage) enabling this option by default. It would help to catch a lot of
> false negatives in deceptively simple scripts.


No. This would break a large body of existing scripts, and it's not the
purpose of the standard. Any implementation can choose to default it to
enabled, of course.
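
For readers unfamiliar with the option, the following sketch shows the masking behaviour the request is about and how a script can opt in for itself. The variable names are illustrative, and the guard is an assumption about older shells that do not recognise the option.

```shell
set +e   # failing commands are expected here

# Default behaviour: a pipeline's status is that of its last command,
# so the failure of `false' is masked.
false | true
default_status=$?

# Opt in inside a subshell so the setting does not leak into the caller.
# The guard covers shells that do not know the option.
if (set -o pipefail) 2>/dev/null; then
    pipefail_status=$(set -o pipefail; false | true; echo $?)
else
    pipefail_status=unsupported
fi

echo "default: $default_status, pipefail: $pipefail_status"
```

A script that wants the stricter behaviour can enable it explicitly, which leaves existing scripts untouched.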



Re: Request: Standard hashmaps in sh

2023-12-27 Thread Chet Ramey via austin-group-l at The Open Group
On 12/27/23 11:26 AM, Andrew Pennebaker via austin-group-l at The Open 
Group wrote:

> Many programs depend on hashmaps in order to work.
>
> awk is not an answer.
>
> The lack of hashmaps forces people to use less efficient algorithms, such
> as linear search.
>
> The bash family implements it. Simply acknowledging bash associative array
> syntax would instantly improve the scalability of sh scripts.


That's not the intent of the standard. The standard is supposed to give
users an idea about what they can rely on for portable scripts (and, to a
lesser extent, interactive use). While bash and ksh93 implement associative
arrays, that's not enough for a standard.

You could write something up and request that it be included -- that's how
the $'...' quoting form eventually got in -- but I'm kind of skeptical that
it would make it. It's a big change to the shell syntax.
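
For context, scripts that must stay portable today usually emulate a map with computed variable names. The sketch below is one such workaround, not something proposed in this thread; `map_set` and `map_get` are hypothetical helper names, and it assumes map names and keys contain only letters, digits and underscores.

```shell
# map_set MAP KEY VALUE: store VALUE under KEY in the map named MAP.
# Assumption: MAP and KEY are valid identifier fragments, because they
# become part of a variable name.
map_set() {
    eval "map_${1}_${2}=\$3"
}

# map_get MAP KEY: print the stored value (an empty line if KEY is unset).
map_get() {
    eval "printf '%s\n' \"\${map_${1}_${2}-}\""
}

map_set colour apple red
map_set colour sky blue
map_get colour sky
```

The restriction on keys is the main limitation of this approach, and it is part of why native associative arrays are attractive to script authors.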

Chet



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-08 Thread Chet Ramey via austin-group-l at The Open Group
On 9/3/23 2:36 AM, Stephane Chazelas via austin-group-l at The Open Group 
wrote:



> And except with yash's printf (among the few printf's I've
> tested):
>
> $ LC_ALL=zh_TW luit
> $ locale title charmap
> Chinese locale for Taiwan R.O.C.
> BIG5
> $ echo() { printf '%b ' "$@"\\n\\c; }
> $ echo 'α'
> αn%
>
> (the trailing %'s above indicating the absence of newline
> character).


Presumably because yash uses wide character strings internally and doesn't
rely on the printf(3) engine to output characters. On the other hand, I
suspect it will refuse to print byte strings that do not form valid wide
characters (I haven't tested this in particular, but yash fails on invalid
wide character strings in other expansions).


> α (Greek lowercase alpha, U+03B1) being one of the several
> characters whose encoding ends in byte 0x5c (the encoding of
> backslash) in BIG5 (there are even more in GB18030, but I like
> BIG5's α as an example as that's the alphabetic character par
> excellence; BIG5HKSCS (Hong Kong variant) also has characters
> from the Latin and Cyrillic alphabets in that situation).


It depends on how much of the output you want to leave to printf(3)
and the other stdio functions. If printf(1) assembles format strings and
arguments into something it passes to printf(3), and that processes them
as bytes, the game is over.
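
The byte-level effect can be reproduced without a BIG5 locale. The sketch below builds the two bytes of α by hand (0xA3 0x5C) and feeds the same argument shape as the echo() function to %b; the variable names are illustrative. A byte-oriented printf would be expected to report a35c6e, that is, α followed by a literal n and no newline, which corresponds to the αn output quoted above. A character-aware implementation may report something else or reject the bytes.

```shell
# Work on raw bytes; no BIG5 locale needs to be installed for this.
LC_ALL=C
export LC_ALL

# The two bytes that encode α in BIG5: 0xA3, then 0x5C (the backslash byte).
alpha_big5=$(printf '\243\134')

# Same argument shape as in the echo() function: α, then \n, then \c.
# Dump whatever %b emits as hex so that no terminal encoding is involved.
bytes=$(printf '%b' "${alpha_big5}\\n\\c" | od -An -tx1 | tr -d '[:space:]')
echo "$bytes"
```

The trailing 0x5C of α pairs with the backslash that was meant to start the \n escape, so the n is left over as an ordinary character.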




Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-05 Thread Chet Ramey via austin-group-l at The Open Group

On 9/3/23 4:22 PM, Robert Elz via austin-group-l at The Open Group wrote:

 Date:Sun, 3 Sep 2023 07:36:59 +0100
 From:Stephane Chazelas 
 Message-ID:  <20230903063659.mzyfen4evyrnz...@chazelas.org>

   | though has the same limitation as my bash echo -e "$*\n\c"

Yes, I know, though as nothing anywhere says what echo is supposed
to do with a lone trailing \ (or in fact, a \ that is not followed
by one of the defined escape sequences), I treat that as unspecified,


It's not specified, rather than being explicitly unspecified, so
anything goes. Bash just outputs the `\' in this case.



   | $ LC_ALL=zh_TW luit
   | $ locale title charmap
   | Chinese locale for Taiwan R.O.C.
   | BIG5
   | $ echo() { printf '%b ' "$@"\\n\\c; }
   | $ echo 'α'
   | αn%

That one is a different issue, and seems to me to be a simple
implementation bug (and no, I am not claiming that NetBSD wouldn't
act just like that) - characters ought to be fully formed before
testing their values.


I suspect this is the result of printf's history as a byte-oriented
utility; everyone still treats the format string as a sequence of bytes.

It's probably rare enough for an encoded character to contain a backslash
that no one has changed it yet.




Re: [Issue 8 drafts 0001771]: support or reserve %q as printf-utility format specifier

2023-09-05 Thread Chet Ramey via austin-group-l at The Open Group

On 9/4/23 3:58 AM, Geoff Clare via austin-group-l at The Open Group wrote:


   | Issue 9 will have an inconsistency between the printf() function and the
   | printf utility.

Yes.   And exactly why is that a problem?


I think everyone in the teleconference just assumed that the inconsistency
is best avoided.  I don't recall living with it being discussed as an
option.


I don't agree that consistency is the primary requirement here.



However, from the feedback it seems that enough people think "the cure
is worse than the disease" on this, and we should indeed consider
living with the inconsistency as another option.


It just seems like a lot of backwards compatibility and POSIX guidance to
throw away for little gain. POSIX has included %b -- and recommended its
use -- for over 30 years. My guess is that thousands of scripts use it.




Re: RFC: changing printf(1) behavior on %b

2023-08-31 Thread Chet Ramey via austin-group-l at The Open Group

On 8/31/23 11:35 AM, Eric Blake wrote:

In today's Austin Group call, we discussed the fact that printf(1) has
mandated behavior for %b (escape sequence processing similar to XSI
echo) that will eventually conflict with C2x's desire to introduce %b
to printf(3) (to produce 0b000... binary literals).

For POSIX Issue 8, we plan to mark the current semantics of %b in
printf(1) as obsolescent (it would continue to work, because Issue 8
targets C17 where there is no conflict with C2x), but with a Future
Directions note that for Issue 9, we could remove %b entirely, or
(more likely) make %b output binary literals just like C.


I doubt I'd ever remove %b, even in posix mode -- it's already been there
for 25 years.
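
As a reminder of what is at stake, the sketch below contrasts %s with the current printf(1) meaning of %b. The variable names are illustrative.

```shell
# %s copies its argument verbatim; %b additionally interprets backslash
# escapes inside the argument, in the manner of XSI echo.
plain=$(printf '%s' 'a\tb')
expanded=$(printf '%b' 'a\tb')

printf '%s\n' "$plain"      # the four characters a \ t b
printf '%s\n' "$expanded"   # a, a tab character, b
```

Scripts that rely on the second form are the ones that would change meaning if %b were repurposed for binary output.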


But that
raises the question of whether the escape-sequence processing
semantics of %b should still remain available under the standard,
under some other spelling, since relying on XSI echo is still not
portable.

One of the observations made in the meeting was that currently, both
the POSIX spec for printf(1) as seen at [1], and the POSIX and C
standard (including the upcoming C2x standard) for printf(3) as seen
at [3] state that both the ' and # flag modifiers are currently
undefined when applied to %s.


Neither one is a very good choice, but `#' is the better one. It at least
has a passing resemblance to the desired functionality.

Why not standardize another character, like %B? I suppose I'll have to look
at the etherpad for the discussion. I think that came up on the mailing
list, but I can't remember the details.


Is there
any interest in a patch to coreutils or bash that would add such a
synonym, to make it easier to leave that functionality in place for
POSIX Issue 9 even when %b is repurposed to align with C2x?


It's maybe a two or three line change at most.




Re: $? behaviour after comsub in same command

2023-04-11 Thread Chet Ramey via austin-group-l at The Open Group

On 4/10/23 7:02 PM, Robert Elz wrote:

 Date:Mon, 10 Apr 2023 10:30:08 -0400
 From:Chet Ramey 
 Message-ID:  <78038281-f431-775e-6d60-a44126d1d...@case.edu>

   | The different semantics are that the standard specifies the status of the
   | simple command in terms of the command substitution that's part of the
   | assignment statement, so you have to hang onto it for a while.

I suspect that's because you are treating the assignments (more or less)
as statements of their own, and expanding and then assigning each, one by one,
left to right as you encounter them.


No, the sentence means exactly what it says. The difference between his
example and mine is that in my example the shell has to remember the return
status of the last command substitution until the command completes.



then there is no issue, and no real need to "hang onto it for a while".


Of course you do; the standard says it's the return status of the command.




Re: $? behaviour after comsub in same command

2023-04-10 Thread Chet Ramey via austin-group-l at The Open Group

On 4/6/23 5:59 PM, Robert Elz wrote:

 Date:Wed, 5 Apr 2023 10:35:58 -0400
 From:"Chet Ramey via austin-group-l at The Open Group" 

 Message-ID:  

   | A variant with slightly different semantics:
   |
   | (exit 8)
   | a=4 b=$(exit 42) c=$?
   | echo status:$? c=$c
   |
   | The standard is clear about what $? should be for the echo, but should it
   | be set from the command substitution for the assignment to c?

It isn't really different semantics, it is the same thing.


The different semantics are that the standard specifies the status of the
simple command in terms of the command substitution that's part of the
assignment statement, so you have to hang onto it for a while.

We have this identical discussion every couple of years. At least the last
time produced interp 1150, which -- in true standards fashion -- attempted
to clarify the issue with additional obscure language.
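
The variant can be run as a probe. The sketch below is the quoted example with the final status captured in a variable (the name `status` is illustrative). The status of the assignment-only command is the part the standard pins down; the value of c is the part that differs between shells.

```shell
set +e   # failing commands are expected here

(exit 8)
a=4 b=$(exit 42) c=$?
status=$?

# status is that of the last command substitution performed; c shows
# whether this shell had already updated $? when the third assignment
# was expanded.
echo "status:$status c=$c"
```

A shell that updates $? as each command substitution finishes reports c=42, while one that keeps the previous command's status until the whole command completes reports c=8.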




Re: $? behaviour after comsub in same command

2023-04-10 Thread Chet Ramey via austin-group-l at The Open Group

On 4/6/23 5:43 PM, Robert Elz wrote:


> Hence we got that absurd PATH search rule for builtins, that no shell of
> the time did anything like, "because a user might want to override a
> builtin with a version in their own bin directory, earlier in PATH than
> where the standard version of the command exists",


Yes, it's ludicrous and ahistorical, but you still had people arguing in
favor of it as recently as a few years ago. There were proposals to extend
(abuse?) env, exec, and command to accomplish the task of temporarily
overriding a builtin, but those place the burden on the script author
rather than enable the user to do it.

The bash `enable' builtin is always used as the canonical example, since
there's printer (I think) software that uses it as a command name.




Re: $? behaviour after comsub in same command

2023-04-06 Thread Chet Ramey via austin-group-l at The Open Group

On 4/6/23 1:55 PM, Harald van Dijk wrote:

> One additional data point: in schily-2021-09-18, Jörg's last release,
> obosh, the legacy non-POSIX shell that is just there for existing scripts
> and for portability testing, prints 0 (using `` rather than $()), whereas
> pbosh and sh, the minimal and extended POSIX versions of the shell, print
> 1. This does provide extra support for the view that this was a change that
> POSIX demanded, that the deviation from historical practice was
> intentional, but does not answer what the reasoning might have been.


I doubt it was `demanded'; the bosh change immediately followed an austin-
group discussion (we both participated) about this exact issue. Maybe he
thought it was the right thing based on that discussion.

As part of the discussion, he wrote:

> The important thing to know here is that the Bourne Shell has some
> checkpoints that update the intermediate value of $?. Since that changed in
> ksh88 and since POSIX requires a different behavior compared to the Bourne
> Shell, I modified one checkpoint in bosh to let it match POSIX.

so he had already been modifying that behavior before 2021, maybe after
interp 1150.




Re: $? behaviour after comsub in same command

2023-04-06 Thread Chet Ramey via austin-group-l at The Open Group

On 4/5/23 12:36 PM, Harald van Dijk wrote:

On 05/04/2023 15:35, Chet Ramey via austin-group-l at The Open Group wrote:
On 4/5/23 9:06 AM, Martijn Dekker via austin-group-l at The Open Group 
wrote:

Consider:

 false || echo $(true) $?

dash, mksh and yash print 1.
bash, ksh93 and zsh print 0.
Which is right?


I believe dash, mksh, yash are already right based on the current wording 
of the standard. As Martijn wrote, the rule is that $? "Expands to the 
decimal exit status of the most recent pipeline", the most recent pipeline 
in the shell environment in which $? is evaluated is "false", and changes 
in the subshell environment shall not affect the parent shell environment, 
including changes in the subshell environment to $?.


That's certainly one interpretation, and may indeed be what the 1992
authors intended. My question is why they would choose something other than
what the so-called reference implementations (SVR4 sh, ksh88) did.
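
The one-liner under discussion can be wrapped so that its output is captured, which makes it easy to compare shells. This is only a probe; the variable name `result` is illustrative, and the sketch takes no position on which answer is right.

```shell
set +e   # `false' is expected to fail

# $(true) expands to nothing, so echo prints only the value of $?.
result=$(false || echo $(true) $?)
echo "this shell prints: $result"
```

Per the report quoted above, dash, mksh and yash give 1 here, while bash, ksh93 and zsh give 0.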




Re: $? behaviour after comsub in same command

2023-04-05 Thread Chet Ramey via austin-group-l at The Open Group

On 4/5/23 11:25 AM, Oğuz wrote:
> On Wednesday, April 5, 2023, Chet Ramey via austin-group-l at The Open
> Group <austin-group-l@opengroup.org> wrote:
>
>> but should it
>> be set from the command substitution for the assignment to c?
>
> I think it'd be practical, is there a reason why it shouldn't?


https://www.austingroupbugs.net/view.php?id=1150

It's unspecified. The assignments are performed `beginning to end', but
it's not specified when $? is set. Interestingly, the SVR4 sh and ksh88
both set $? as each command substitution finishes, but POSIX didn't specify
that behavior explicitly. Maybe the 1992 authors didn't feel they had to.

> And while
> we're at it, is there a reason why assignments in a simple command
> shouldn't be applied sequentially, from left to right?


The Bourne shell performed the assignments right to left. This came up on
the austin-group list a couple of years ago, in almost exactly the same
way, but I don't think the discussion made its way to an interpretation
request.




Re: $? behaviour after comsub in same command

2023-04-05 Thread Chet Ramey via austin-group-l at The Open Group

On 4/5/23 9:06 AM, Martijn Dekker via austin-group-l at The Open Group wrote:

Consider:

     false || echo $(true) $?

dash, mksh and yash print 1.
bash, ksh93 and zsh print 0.
Which is right?


A variant with slightly different semantics:

(exit 8)
a=4 b=$(exit 42) c=$?
echo status:$? c=$c

The standard is clear about what $? should be for the echo, but should it
be set from the command substitution for the assignment to c?




Re: Syntax error with "command . file" (was: [1003.1(2016/18)/Issue7+TC2 0001629]: Shell vs. read(2) errors on the script)

2023-03-14 Thread Chet Ramey via austin-group-l at The Open Group

On 3/14/23 4:58 PM, Harald van Dijk wrote:

On 14/03/2023 20:41, Chet Ramey wrote:
On 3/12/23 10:19 PM, Harald van Dijk via austin-group-l at The Open Group 
wrote:



bash appears to disable the reading of .profile in POSIX mode entirely.


This isn't quite correct. By default, a login shell named `sh' or `-sh'
reads /etc/profile and ~/.profile. You can compile bash for `strict posix'
conformance, or invoke it with POSIXLY_CORRECT or POSIX_PEDANTIC in the
environment, and it won't.


Isn't it? The mode bash gets into when invoked as sh is described in the 
manpage (looking at the 5.2.15 manpage) as:


   If bash is invoked with the name sh, it tries to mimic the startup
   behavior of historical versions of sh as closely as possible, while
   conforming to the POSIX standard as well. [...] When invoked as sh,
   bash enters posix mode after the startup files are read.

The mode bash gets into when POSIXLY_CORRECT is set, the mode that can also 
be obtained with --posix, is described in the manpage as:


   When bash is started in posix mode, as with the --posix command line
   option, it follows the POSIX standard for startup files.


Right. When you force posix mode immediately, as I said above, bash won't
read the startup files. A login shell named sh or -sh reads the historical
startup files, then enters posix mode.




Re: Syntax error with "command . file" (was: [1003.1(2016/18)/Issue7+TC2 0001629]: Shell vs. read(2) errors on the script)

2023-03-14 Thread Chet Ramey via austin-group-l at The Open Group
On 3/12/23 10:19 PM, Harald van Dijk via austin-group-l at The Open Group 
wrote:



bash appears to disable the reading of .profile in POSIX mode entirely.


This isn't quite correct. By default, a login shell named `sh' or `-sh'
reads /etc/profile and ~/.profile. You can compile bash for `strict posix'
conformance, or invoke it with POSIXLY_CORRECT or POSIX_PEDANTIC in the
environment, and it won't.




Re: Writing the "[job] pid" for async and-or lists

2023-02-24 Thread Chet Ramey via austin-group-l at The Open Group

On 2/24/23 10:59 AM, Robert Elz via austin-group-l at The Open Group wrote:



> Can we agree (and then after Draft 3 is available, submit a bug report)
> that this text should only apply to the top level of an interactive shell,
> and not to any subshell environments.


I'd be good with specifying that it doesn't apply to subshell environments.




Re: Writing the "[job] pid" for async and-or lists

2023-02-24 Thread Chet Ramey via austin-group-l at The Open Group
On 2/24/23 11:35 AM, Harald van Dijk via austin-group-l at The Open Group 
wrote:


> I agree, but a subshell of an interactive shell is effectively
> non-interactive anyway, and many of the special rules for interactive
> shells should not apply to subshells of interactive shells and already
> don't in various existing shells (but the extent to which they don't varies
> from shell to shell). Rather than make a special exception for process IDs,
> could this be made a general rule?


If POSIX made it a general rule, someone would probably have to go through
all the changed behavior and check which shells conform and which do not.
I can't see that happening in time for the next draft.




Re: Minutes of the 6th February 2023 Teleconference

2023-02-09 Thread Chet Ramey via austin-group-l at The Open Group

On 2/9/23 4:19 AM, Geoff Clare via austin-group-l at The Open Group wrote:


> When this was discussed on the call, there was general agreement that
> executing the partial line after getting a read error is really not
> a good thing for shells to be doing.


OK, that's a reasonable position to take. But is it the role of a standards
body to require that behavior, when shells don't do it today?




Re: Minutes of the 6th February 2023 Teleconference

2023-02-07 Thread Chet Ramey via austin-group-l at The Open Group

On 2/7/23 2:22 AM, Andrew Josey via austin-group-l at The Open Group wrote:


Bug 1629: Shell vs. read(2) errors on the script OPEN
https://austingroupbugs.net/view.php?id=1629

This item was discussed at length on the call
including the feedback from Chet Ramey. Notes were updated in
the etherpad -- https://posix.rhansen.org/p/2023-02-06 .


I looked at the etherpad. Nick, thanks for the detail about the script.
There are some missing details about what happens on read errors. One
thing is that bash assumes fatal read(2) errors are *not* transient: when
a read returns -1/EWHATEVER, if errno is not EINTR or EAGAIN the next read
will also return an error. Same with EOF. This is, of course, not how it
goes when you inject read errors. With that in mind:

> /tmp/shell.script: line 1227: ent: command not found
> $ echo $?
> 42
>
> (i.e. it treated the end of "# this is a comment" as the command "ent"
> and continued execution)

is reasonably easy to explain. Bash, like all the shells, converts the read
error into EOF and executes the partial line, which happens to be a
comment. Then it goes back to read, assuming that a real error will result
in another -1/EIO (as it will in virtually all situations). Since the read
succeeds, the error must have been transient, and it goes on. Other shells
do things differently.

The key is that everyone `executes' the partial line after getting EOF,
even yash.

Nick's example shows this, if you convert it to a non-interactive shell
instance.

    printf "echo foo" | bash

will output `foo' in bash and every other shell. Same with interactive
shells.

Chet



Re: Minutes of the 2nd February 2023 Teleconference

2023-02-04 Thread Chet Ramey via austin-group-l at The Open Group

On 2/4/23 1:33 AM, Andrew Josey via austin-group-l at The Open Group wrote:


Bug 1629: Shell vs. read(2) errors on the script OPEN
https://austingroupbugs.net/view.php?id=1629

This item was discussed at length on the call.
We will continue with this item next time.


I looked at the etherpad. I'm not sure who tested bash-5.2.2 (Nick?), but
you ended up running something that a vendor -- probably Ubuntu via Debian --
modified to add an error message.

I can't tell what `/tmp/script' is, but running bash-5.2.15 on RHEL7

$ ./bash --version
GNU bash, version 5.2.15(5)-release (x86_64-pc-linux-gnu)
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

against the following script:

$ cat x10
echo a; echo b;
echo after: $?

exits with status 0 after a read error:

$ strace -e trace=read -e inject=read:error=ESTALE:when=7 ./bash ./x10
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\316\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\16\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`&\2\0\0\0\0\0"..., 832) = 832

read(3, "MemTotal:1862792 kB\nMemF"..., 1024) = 1024
read(3, "echo a; echo b;\necho after: $?\n", 80) = 31
read(255, "echo a; echo b;\necho after: $?\n", 31) = 31
a
b
after: 0
read(255, 0x26d0290, 31) = -1 ESTALE (Stale file handle) (INJECTED)

+++ exited with 0 +++

You have to make sure you inject the error when bash is reading input
for the parser (the fifth read is checking whether or not the script is a
binary file). You'd get the same results with when=6.

You didn't check dash, but it returns 0 as well. I assume the other ash-
derived BSD shells behave similarly. yash is still the only shell that
returns an error in this case.




Re: Security risk in uudecode specification?

2023-01-20 Thread Chet Ramey via austin-group-l at The Open Group
On 1/20/23 7:11 PM, Christoph Anton Mitterer via austin-group-l at The Open 
Group wrote:



> It's a pity, that the different parties often don't seem to try to
> agree on something standardised first, but rather add new base
> utilities or functionalities like the shell's "local"... and only
> afterwards standardisation is tried (but often fails).


This is a great example of how hindsight is perfect and seductive. There
was a `local' in draft 9 of POSIX.2, back in the late 80s. It got taken
out -- even in its benign, non-specific form -- because nobody agreed how
it should be implemented even back then, and it stood in the way of
consensus.

So we all went our own ways. Brian and I implemented local with dynamic
scoping, like ksh88. Korn went off to do static scoping in ksh93. Ken
Almquist implemented dynamic scoping in ash. That kind of set the
boundaries of the debate. And here we are.
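
The difference between the two scoping models shows up as soon as a function with a local variable calls another function. The sketch below uses illustrative names (`show_x`, `caller`) and relies on `local`, which is an extension rather than part of POSIX sh, so it only runs in shells that provide it.

```shell
x=global

show_x() { echo "$x"; }

caller() {
    local x=from_caller   # `local' is an extension, not part of POSIX sh
    show_x                # does the callee see the caller's local x?
}

seen=$(caller)
echo "show_x saw: $seen"
```

With dynamic scoping, as in bash and the ash family, the callee sees the caller's local and this reports from_caller. Under static scoping the callee would see the global value instead.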




Re: Shell vs. read(2) errors on the script

2023-01-09 Thread Chet Ramey via austin-group-l at The Open Group

On 1/8/23 9:39 AM, Thorsten Glaser via austin-group-l at The Open Group wrote:


> There are two questions here:
>
> ① Should shell script read errors be treated as EOF, as is practice?
>
> ② If not, what should the shell do upon encountering one?


It's pretty clear that a non-interactive shell can't continue after
it gets a read error on the script, so treating it as EOF in the sense
that it stops execution is the right thing to do.

The only question is whether the shell is still bound by the requirement
to exit with the status of the last command executed.




Re: [Issue 8 drafts 0001564]: clariy on what (character/byte) strings pattern matching notation should work

2022-05-23 Thread Chet Ramey via austin-group-l at The Open Group
On 5/18/22 9:46 PM, Christoph Anton Mitterer via austin-group-l at The Open 
Group wrote:




The above, I'm not quite sure what these tell/prove...

I assume the ones with '?': that for all except bash/fnmatch   '?'
matches both, valid characters and a single byte that is no character.

And the ones with bracket expression, that these also work when the BE
has either a valid character or a byte (that is not a character) and
vice-versa?

If Chet is reading along, is the above intended in bash, or considered
a bug?


The bash matcher falls back to C-locale-like behavior only if the pattern
and the string both do not contain any valid multibyte characters. So if,
for example, the string contains a valid multibyte character, but the
pattern does not, the matcher will attempt multibyte (wide character,
really) matches.

This is why the string \243] (a valid multibyte character in Big5) does not
match [\243!]]: nothing in the bracket expression will match that
character, and that string will never match a pattern ending in `]'.



> IMO it would have been interesting to see whether ? would also match
> multiple bytes that are each for themselves and together no valid
> character...


No, it wouldn't. You can make a case for `?' matching a single byte that is
not part of a valid multibyte character (there is no such thing as a single
byte that is "no valid character" when you are matching), but you cannot
make one for `?' matching more than one byte that does not compose a valid
multibyte character.



>> The tests involving \243 are run in a Big5 environment. In Big5,
>> \243\135 is the representation of β, a single valid character, even
>> though \135 on its own is still the single character ].
>
> Seem also a bit strange to me,... all shells match \243 against ? ...
> i.e. ? matches a single byte that is not a character... but later on it
> doesn't work again with \243] and ?]


Because, as Harald says, \243] is a valid multibyte character in Big5
locales.
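
A minimal sketch of the byte-versus-character distinction discussed in this message, assuming a POSIX sh. The helper name `match` and the lookup of a Big5 locale through `locale -a` are illustrative assumptions, not something taken from the thread. In the C locale every byte is a character, so `?]` matches the two bytes \243 and ]. In a Big5 locale, a matcher that treats \243\135 as one character has nothing left over for the trailing `]`, so the result depends on the shell and is only printed here.

```sh
# Hypothetical helper: report whether STRING ($2) matches PATTERN ($1).
match() {
    case $2 in
        $1) echo match ;;
        *)  echo "no match" ;;
    esac
}

# The two bytes \243 \135 (\135 is "]").
s=$(printf '\243]')

# C locale: each byte is one character, so ? takes \243 and ] takes ].
c_result=$(LC_ALL=C; export LC_ALL; match '?]' "$s")
echo "C locale: $c_result"

# Big5 locale, if one is installed (the exact name varies between systems).
big5=$(locale -a 2>/dev/null | grep -i 'big5' | head -n 1)
if [ -n "$big5" ]; then
    b_result=$(LC_ALL=$big5; export LC_ALL; match '?]' "$s")
    echo "$big5: $b_result"
fi
```

The Big5 branch is skipped when no such locale is installed.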

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-16 Thread Chet Ramey via austin-group-l at The Open Group

On 5/13/22 5:37 PM, Robert Elz wrote:

>  Date: Sat, 14 May 2022 03:56:32 +0700
>  From: "Robert Elz via austin-group-l at The Open Group"
>  Message-ID:  <2459.1652475...@jinx.noi.kre.to>
>
>    |   | Show your work.
>
>    | I no longer remember the exact command I used (cannot even locate the
>    | message you're quoting from),
>
> I finally did ...
>
> This is what I see:


I don't see that.

$ echo $BASH_VERSION
5.1.16(2)-release
$ sleep 20 | sleep 20 & sleep 30 | sleep 30 & jobs -l ; pstree $$ ; ps jT
[1] 22954
[2] 22956
[1]- 22953 Running sleep 20
 22954   | sleep 20 &
[2]+ 22955 Running sleep 30
 22956   | sleep 30 &
-+= 22938 chet ./bash
 |--- 22953 chet sleep 20
 |--- 22954 chet sleep 20
 |--- 22955 chet sleep 30
 |--- 22956 chet sleep 30
 \-+- 22957 chet pstree 22938
   \--- 22958 root ps -axwwo user,pid,ppid,pgid,command
USER   PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
root   811   544   811      0    0 Ss   s019    0:00.05 login -pfl chet /bin/ba
chet   814   811   814      0    1 S    s019    0:00.09 -bash
chet 22938   814 22938      0    1 S+   s019    0:00.04 ./bash
chet 22953 22938 22938      0    1 S+   s019    0:00.00 sleep 20
chet 22954 22938 22938      0    1 S+   s019    0:00.00 sleep 20
chet 22955 22938 22938      0    1 S+   s019    0:00.00 sleep 30
chet 22956 22938 22938      0    1 S+   s019    0:00.00 sleep 30
root 22959 22938 22938      0    1 R+   s019    0:00.00 ps jT
$ kill %1
$ ps jT
USER   PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
root   811   544   811      0    0 Ss   s019    0:00.05 login -pfl chet /bin/ba
chet   814   811   814      0    1 S    s019    0:00.09 -bash
chet 22938   814 22938      0    1 S+   s019    0:00.04 ./bash
chet 22955 22938 22938      0    1 S+   s019    0:00.00 sleep 30
chet 22956 22938 22938      0    1 S+   s019    0:00.00 sleep 30
root 22960 22938 22938      0    1 R+   s019    0:00.00 ps jT
$


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-16 Thread Chet Ramey via austin-group-l at The Open Group

On 5/13/22 4:56 PM, Robert Elz wrote:

>  Date: Fri, 13 May 2022 11:22:20 -0400
>  From: Chet Ramey
>  Message-ID:
>
>    | Show your work.
>    |
>    | I tested this on macOS 12 and RHEL 7, using interactive shells with job
>    | control enabled,
>
> That is likely the difference.   The question was about what happens when
> job control is not enabled.


The same thing. This example uses bash-5.2-beta on macOS 10.15, but the
same thing happens with bash-5.1.16.

$ ./bash
$ set +m
$ sleep 20 | sleep 20 &
[1] 22755
jenna.local(2)$ pstree $$
-+= 22753 chet ./bash
 |--- 22754 chet sleep 20
 |--- 22755 chet sleep 20
 \-+- 22756 chet pstree 22753
   \--- 22757 root ps -axwwo user,pid,ppid,pgid,command
$ kill %1
$ ps ax | grep sleep
22759 s018  S+ 0:00.00 grep sleep
$ sleep 20 | sleep 20 & pstree $$
[1] 22787
-+= 22753 chet ./bash
 |--- 22786 chet sleep 20
 |--- 22787 chet sleep 20
 \-+- 22788 chet pstree 22753
   \--- 22789 root ps -axwwo user,pid,ppid,pgid,command
$ kill %1
$ ps axuw | grep sleep
chet 22791   0.0  0.0  4408552    764 s018  S+   10:25AM   0:00.00 grep sleep


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-13 Thread Chet Ramey via austin-group-l at The Open Group

On 5/5/22 7:46 AM, Geoff Clare via austin-group-l at The Open Group wrote:

> [Robert intended to send the mail I'm replying to to the list, but it
> was only sent to me. I've quoted it in full.]
>
> Robert Elz wrote, on 05 May 2022:
>
>> This leaves just bash of the shells I have to test.   bash is odd, at
>> first glance it seems to act like the ksh's, zsh & fbsh do.   But it
>> doesn't.   This seems to be because in a pipeline like
>>
>> sleep 20 | sleep 20 &
>>
>> creates a subshell for the '&' first, and then creates a new subshell
>> environment for each side of the pipe.   None of the other shells do that,
>> the processes in the pipeline are in subshell environments (in most anyway)
>> but the same one as the one created for the async process execution - that
>> is, the sleep processes are direct children of the parent shell, not
>> grandchildren as they are in bash.
>>
>> When given "kill %1" it then seems to work just like those other shells, but
>> all that is actually killed is the forked copy of itself, leaving the sleep
>> processes running, orphaned.


Show your work.

I tested this on macOS 12 and RHEL 7, using interactive shells with job
control enabled, running the latest bash devel version, and could not
reproduce it.

The Linux version of pstree shows the process group; the macOS version
doesn't have that option. Both show the sleep processes are direct
descendants of the parent shell, but even if they aren't, bash clearly does
not leave the sleep processes orphaned.

macOS 12:

$ sleep 20 | sleep 20 &
[1] 16711
$ pstree $$
-+= 16694 chet ./bash
 |--= 16710 chet sleep 20
 |--- 16711 chet sleep 20
 \-+= 16712 chet pstree 16694
   \--- 16713 root ps -axwwo user,pid,ppid,pgid,command
$ kill %1
$ ps axuw | grep sleep
chet 16717   0.0  0.0 34142704    632 s027  U+   11:04AM   0:00.00 grep sleep

[1]+  Terminated: 15  sleep 20 | sleep 20

RHEL 7:

$ sleep 20 | sleep 20 &
[1] 106739
$ pstree -g $$
bash(106427)─┬─pstree(106743)
             ├─sleep(106738)
             └─sleep(106738)
$ kill %1
$ ps axuw | grep sleep
chet 106753  0.0  0.0 112812   960 pts/1    R+   10:59   0:00 grep sleep
[1]+  Terminated  sleep 20 | sleep 20

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-13 Thread Chet Ramey via austin-group-l at The Open Group

On 5/13/22 10:27 AM, Geoff Clare via austin-group-l at The Open Group wrote:

> Chet Ramey wrote, on 13 May 2022:
>
>> On 5/13/22 5:20 AM, Geoff Clare via austin-group-l at The Open Group wrote:
>>
>>> The definition of "Job" is:
>>>
>>>   A set of processes, comprising a shell pipeline, and any processes
>>>   descended from it, that are all in the same process group.
>>>
>>> Notice it says "that are all in the same process group".  In the
>>> case of a background command started with job control disabled, the
>>> processes all have the same process group as the parent shell.
>>> By a strict reading, this counts as a job, but I don't think that
>>> was intended.
>>
>> Why not? This is what allows jobs/kill/wait to use job control notation
>> in operands even when job control is not currently enabled. I'd argue
>> that that was intended.
>
> My reading is that all the standard requires here is that if one or
> more jobs are created with job control enabled, and job control is
> subsequently disabled, you can still use "jobs" to list those jobs,
> and %n etc. with "kill" to refer to those jobs.


Of course; it relies on your assertion that the standard requires job
control to be enabled to create a job and put it in the jobs list. I've
already said what I think about that, and most, if not all, shells behave
differently.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-13 Thread Chet Ramey via austin-group-l at The Open Group

On 5/13/22 5:20 AM, Geoff Clare via austin-group-l at The Open Group wrote:


>> You are over reaching in the way you are reading that text.
>
> I strongly disagree.


If you have to work that hard to make your case, it's a good indication
that the existing language is wrong -- or at least insufficient -- and
needs to be changed.


> There is no such thing as a known process ID that is not a job.


Bash allows process substitutions to set $!, so users can wait for them,
but they are not jobs. Process substitution is, of course, an extension.



> The definition of "Job" is:
>
>  A set of processes, comprising a shell pipeline, and any processes
>  descended from it, that are all in the same process group.
>
> Notice it says "that are all in the same process group".  In the
> case of a background command started with job control disabled, the
> processes all have the same process group as the parent shell.
> By a strict reading, this counts as a job, but I don't think that
> was intended.


Why not? This is what allows jobs/kill/wait to use job control notation
in operands even when job control is not currently enabled. I'd argue
that that was intended.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-13 Thread Chet Ramey via austin-group-l at The Open Group

On 5/12/22 10:03 AM, Geoff Clare via austin-group-l at The Open Group wrote:


>>> The normative text relating to creation of job numbers/IDs is all
>>> conditional on job control being enabled.
>>
>> Where is that? It's not in the definition of Job ID, it's not in 2.9.3
>> Asynchronous Lists, it's not in the `jobs' description, it's not part of
>> the definition of Background Job or Foreground Job, it's not in any
>> of fg/bg/kill/wait. I feel like I'm missing something obvious here.
>
> You're looking in (some of) the right places, but missing the
> significance of what's written there.


If we're going to make basic concepts dependent on obscure language in
the standard that requires the reader to make the proper set of inferences,
the standard has failed. It's worse that it fails to capture what the
majority of shells do in practice.

This set of examples you give, which you might assert are definitive, is
not all that compelling. If the standard wants to specify something, why
can't it just say so in plain language? Why make it a puzzle to be solved?

If you have to work this hard to make your case, it's probably not that
obvious.



>> So for the known IDs list, it's pretty much `wait' and `jobs', right?
>
> The phrase kre used was "when their termination status has been
> reported to the user - however that happens".  That includes information
> written by an interactive shell before it writes a prompt.  Although
> the standard says this information is about the exit status of
> "the background job", it is also, by association, information about
> the exit status of a process in the known process IDs list.


Another reason that the language relating the two things, and describing
how they interact, needs to be clear and unambiguous, and handle all four
scenarios.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-13 Thread Chet Ramey via austin-group-l at The Open Group

On 5/11/22 6:31 PM, Robert Elz wrote:


>    | For neither the first nor the last time.
>
> Including now.


People can disagree.



>    | > I think they should remain independent.
>    | Sure, I agree.
>
> I don't.  I cannot think of a single reason why the shell should be
> forced to maintain two separate lists of its child processes.  The jobs
> table needs to have them, so processes in the job can be identified as
> they finish.  Duplicating that in another table, for no particular reason
> I can imagine makes no sense to me.   Still, if others want to implement
> it that way, I don't object - but the standard has never required that,
> and should not, absent some very good reason, be changed to require it now.


It's going to take more work on the standard to make it be that way, then.
There will have to be more specific language about when and how the jobs
list is created, when jobs are added and removed, when and how jobs
correspond to known process IDs, and whether or not removing IDs from that
list just means removing the job from the table. If we're going to require
job control to be enabled to maintain a jobs list, at least a visible one,
then we have to have something else to use. It may be the jobs list
internally, if we end up fixing all the places in the standard that are
underspecified, and that would probably work.

It's my impression that the known IDs list is a remnant from the time when
job control was optional, and you didn't need to implement job control
unless you implemented the UPE. You still needed a way to keep track of
background processes, and the known IDs list was it.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: wait and stopped processes (was: When can shells remove "known" process IDs from the list?)

2022-05-13 Thread Chet Ramey via austin-group-l at The Open Group

On 5/11/22 6:56 PM, Robert Elz wrote:


>    | Maybe. And yet I can't recall ever receiving a bug about this.


[...]


> The circumstances to provoke a problem need to be contrived.


Exactly. It's a largely hypothetical scenario.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-11 Thread Chet Ramey via austin-group-l at The Open Group
On 5/10/22 12:03 PM, Geoff Clare via austin-group-l at The Open Group wrote:

>> If jobs and kill work, you should probably add wait to this description, or
>> add a separate paragraph to the wait rationale.
> 
> If it works with "wait" in all shells (that we care about), then I
> agree it would make sense to add it.

Just decide whether or not it makes sense. If it makes sense, add it.
Shell behavior is only selectively relevant.


>> I'd be interested in your reasoning. The standard simply says that jobs
>> and kill (and wait should be added) work with job %X notation whether
>> or not job control is enabled.
> 
> The normative text relating to creation of job numbers/IDs is all
> conditional on job control being enabled.

Where is that? It's not in the definition of Job ID, it's not in 2.9.3
Asynchronous Lists, it's not in the `jobs' description, it's not part of
the definition of Background Job or Foreground Job, it's not in any
of fg/bg/kill/wait. I feel like I'm missing something obvious here.


>> OK. I'm pretty sure everyone already does this for the jobs list. Not sure
>> whether you want it to include the known IDs list.
> 
> I think kre intended it apply to the known IDs list as well, and I
> was agreeing with that.

So for the known IDs list, it's pretty much `wait' and `jobs', right?

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: wait and stopped processes (was: When can shells remove "known" process IDs from the list?)

2022-05-11 Thread Chet Ramey via austin-group-l at The Open Group
On 5/10/22 11:50 AM, Geoff Clare via austin-group-l at The Open Group wrote:
> Chet Ramey wrote, on 06 May 2022:
>>
>>>> And last, also in this area, is the question of stopped jobs and the wait
>>>> command, and how those two are intended to interact.
>>>
>>> The wording in my current draft makes clear that wait waits for
>>> processes to terminate.  I could, if desired, add some rationale saying
>>> that some implementations have, as an extension, an option that allows
>>> wait to return when a process stops.
>>
>> That's not the current behavior. At best, it should be unspecified.
> 
> It is already what the standard requires, and with good reason.

Sure. It simply isn't what many (most) shells do.

> I have never, ever, seen a shell script use "wait" in a way that would
> work correctly if the wait returned when the process stopped.  The code
> invariably assumes that wait will not return until the process
> terminates. If it checks $? after the wait, it is always just to
> distinguish between different exit status values.

Maybe. And yet I can't recall ever receiving a bug about this.


> In shells where wait (with no options specified) returns when a
> process stops, that is a horrible misfeature.  Kre has already stated
> he will change NetBSD sh so that it doesn't do that. Hopefully the
> other ash-based shells will follow suit.

There are more shells than ash-based ones that do this. At least four
independent code bases have made the same choice.

> If you only change
> bash in POSIX mode, you will be doing your users a disservice.

I doubt that. There's no evidence that this is a problem for bash users.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-11 Thread Chet Ramey via austin-group-l at The Open Group
On 5/10/22 11:17 AM, Geoff Clare via austin-group-l at The Open Group wrote:

>> Anyway, I agree with disallowing remove-before-prompting.
> 
> Unfortunately that puts you in opposition to kre.

For neither the first nor the last time.

>> Or make it clear everywhere that removing a job from the jobs list
>> means removing its pid from the list of terminated asynchronous lists.
> 
> I think they should remain independent.

Sure, I agree. It just means more work specifying when the shell can
remove entries from either. I'll wait for your proposal.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-06 Thread Chet Ramey via austin-group-l at The Open Group

On 5/5/22 7:46 AM, Geoff Clare via austin-group-l at The Open Group wrote:



The fact that the jobs command works with job control disabled is
mentioned in the rationale on the jobs page:

 The jobs utility is not dependent on the job control option, as
 are the seemingly related bg and fg utilities because jobs is
 useful for examining background jobs, regardless of the condition
 of job control. When the user has invoked a set +m command and job
 control has been turned off, jobs can still be used to examine the
 background jobs associated with that current session. Similarly,
 kill can then be used to kill background jobs with kill
 %.

so that's not an "issue".


If jobs and kill work, you should probably add wait to this description, or
add a separate paragraph to the wait rationale.






XBD 2.175 defines a job as

A set of processes, comprising a shell pipeline, and any
processes descended from it, that are all in the same process group.

Which says nothing very useful, and I am not sure is even correct.


Yes, I made the same point in a previous message.


The reason I think #2 should say "if job control is disabled" is
because the standard talks separately about the list of "process IDs
known in the shell environment" and the job list / job IDs. 


I think it needs to talk a little bit more clearly about the jobs list and
what constitutes a job, not to mention how and when one gets created.

Anyway, this also implies the existence of two separate lists.



Your testing above seems to be conflating the "known IDs" and the jobs
list. My reading of the standard is that entries in the jobs list only
need to be created when job control is enabled,


I'd be interested in your reasoning. The standard simply says that jobs
and kill (and wait should be added) work with job %X notation whether
or not job control is enabled. And in any event, that's not how shells
work.

I do agree that the current text implies two separate lists, and there's
insufficient explanation of how they interact. It certainly doesn't imply
that the `known IDs' stuff is only in effect when job control is not
enabled.


Independently of this, when
job control is disabled all of the requirements relating to "known IDs"
still apply and have nothing to do with %... job ID notation.


If you make that change. The known IDs description doesn't depend on job
control being enabled or disabled.




   | I think the description of the wait utility should be updated to require
   | removal from the list.

I would agree with that.


I wouldn't object.



If someone wants to implement it that way, I have no objection, but it
should not be required.   shells should at least be permitted to remove
jobs from the list of remembered stuff when their termination status has
been reported to the user - however that happens.


I agree.


OK. I'm pretty sure everyone already does this for the jobs list. Not sure
whether you want it to include the known IDs list.



That could be another valid choice, but I would prefer that all shells
wait for termination by default.


You might, but that's not the current state of the world.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-06 Thread Chet Ramey via austin-group-l at The Open Group

On 5/3/22 6:52 AM, Geoff Clare via austin-group-l at The Open Group wrote:

> Robert Elz wrote, on 30 Apr 2022:
>
>>    | However, today it threw a last curve ball when I was working on an
>>    | update to the description of set -b ...
>>
>> How many shells actually implement that?
>
> They all accept it as an option, but for some it seems to be a no-op.
> That's one of the changes I was working on when I spotted this problem.


Bash implements it. I doubt very many people use it.




>>    | This conflicts with 2.9.3.1 Asynchronous Lists which says that IDs
>>    | remain known until:
>>    |
>>    |  1. The command terminates and the application waits for the process ID.
>>    |
>>    |  2. Another asynchronous list is invoked before "$!" (corresponding to
>>    | the previous asynchronous list) is expanded in the current execution
>>    | environment.
>>
>> Does anyone implement that bit (#2) at all?  In a non-interactive shell it
>> might almost be possible, but in an interactive shell, if the job isn't in
>> the list (whether $! has been referenced or not - usually it will not have
>> been) because it has been removed, what is the shell supposed to do if the
>> job stops?   Further users (even in scripts) are allowed to use % %- %1
>> etc to refer to jobs, $! isn't the only way to reference one ("wait %2 should
>> work).   I'd suggest that #2 should simply be removed.
>
> I think #2 should say "If job control is disabled, ...".


Why? You can use job control notation with jobs/kill/wait even if job
control isn't enabled, which implies the presence of a job list separate
from the list of known IDs.
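
A small sketch of that point, assuming a POSIX sh run non-interactively: with job control switched off, an asynchronous list can still be addressed by job ID. The exit value 7 and the one-second sleep are arbitrary choices for illustration, and `%1` assumes a fresh shell with no earlier background jobs.

```sh
set +m                       # job control off, as with `set +m' in the thread

( sleep 1; exit 7 ) &        # an asynchronous list; job 1 in a fresh shell
jobs                         # most shells still list it

status=0
wait %1 || status=$?         # job ID notation instead of a process ID
echo "wait %1 returned $status"
```

The thread also uses `kill %1` in the same situation. It is left out here because whether that signals every process in the pipeline without job control is exactly the kind of detail that varies between shells.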



> I think the description of the wait utility should be updated to require
> removal from the list.


I agree, both the jobs list and the list of known IDs.

[...]



>> And last, also in this area, is the question of stopped jobs and the wait
>> command, and how those two are intended to interact.
>
> The wording in my current draft makes clear that wait waits for
> processes to terminate.  I could, if desired, add some rationale saying
> that some implementations have, as an extension, an option that allows
> wait to return when a process stops.


That's not the current behavior. At best, it should be unspecified.

Bash, yash, mksh, dash, the NetBSD sh, and gwsh allow the `wait' builtin to
wait for any process status change (e.g., SIGSTOP). ksh93, FreeBSD sh, and
zsh force the shell to wait until the process terminates. Bash provides an
option (`wait -f') to force a wait for process termination. I didn't check
whether other shells do.
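
A hedged sketch for observing which behavior a given shell has, assuming a script run non-interactively. The `watchdog` subshell and the 2-second and 30-second timings are illustrative assumptions, added so that a shell whose `wait` only returns on termination cannot block for long. No particular status is claimed here: a shell that returns on the stop typically reports 128 plus the number of SIGSTOP, while one that waits for termination reports the status of the SIGTERM sent by the watchdog.

```sh
set -m 2>/dev/null || :       # ask for job control; may be refused without a tty

sleep 30 &
pid=$!

# Hypothetical watchdog: after 2 seconds, resume and terminate the sleep so
# that a shell whose wait ignores stops cannot block for the full 30 seconds.
( sleep 2; kill -CONT "$pid"; kill -TERM "$pid" ) >/dev/null 2>&1 &
watchdog=$!

kill -STOP "$pid"
status=0
wait "$pid" || status=$?      # returns on the stop, or only on termination

# Clean up whichever processes are still around.
kill -CONT "$pid" 2>/dev/null || :
kill -TERM "$pid" 2>/dev/null || :
kill "$watchdog" 2>/dev/null || :
set +m

echo "wait returned $status"
```

In bash, replacing `wait "$pid"` with `wait -f "$pid"` is the option mentioned above for forcing a wait until termination.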


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-06 Thread Chet Ramey via austin-group-l at The Open Group

On 4/29/22 4:23 PM, Robert Elz via austin-group-l at The Open Group wrote:


>    | You can test this by doing
>    |
>    |    true &
>    |
>    |    wait $!; echo $?
>    |
>    | This should print 0. Then do the same, except with the first command
>    | changed to false &. That should print 1.
>
> Yes, in the shells you mention it does, indicating that something different
> is happening.   It is interesting that in bash you can do that wait over and
> over again, and it keeps returning the 0 status (until one does a plain "wait"
> command, even the "jobs" command doesn't remove it, though the standard
> requires that it do so).   bash is the only shell that acts like that, whether
> it is intentional or not I have no idea.


It's intentional, and has been in bash for a very long time.

As I said in another message, the jobs builtin not removing the pid from
the `remembered' list is probably an oversight. I'll fix it in posix mode
after bash-5.2 is released.
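
The quoted test can be written as a short script. This is a sketch assuming a POSIX sh; the helper name `bg_status` is made up for illustration. The first two calls show the behavior every shell agrees on. The repeated `wait` at the end is where shells differ, so its result is printed without any claim about the value.

```sh
# Hypothetical helper: run a command asynchronously and print the status
# that wait reports for its process ID.
bg_status() {
    "$@" &
    rc=0
    wait "$!" || rc=$?
    echo "$rc"
}

bg_status true     # prints 0
bg_status false    # prints 1

# Waiting twice for the same pid: per the thread, bash keeps returning the
# saved status; other shells may report that the pid is no longer known.
true &
pid=$!
wait "$pid"
first=$?
second=0
wait "$pid" 2>/dev/null || second=$?
echo "first=$first second=$second"
```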



> But try a different test
>
> true & X=$!
>
> (the assignment to X is just in case there is a shell which implements that
> "no need to retain" stuff when $! is not referenced).
>
> Then repeat that line over and over. (Consecutive lines).




> zsh does something different, once a job has been reported as finished
> at a prompt, it is removed from the jobs table, and you can no longer do
> "wait %3" for it, but the pid and status seem to be remembered somewhere
> else, and wait  gets the status from the job.   That seems odd to me,
> it should be possible to use either form to wait on a job.


They're not jobs! A pid is a pid. It doesn't matter whether it's the pid of
the job's controlling process (or whatever we want to call it). The
Asynchronous Lists text says you have to be able to wait for it. This is
how bash works, too.

This is what happens when you have a jobs list and a list of terminated
asynchronous lists that are `known in the current shell environment'.


> bash is different again, it counts up the job numbers, like bosh and
> yash, but as it reports each earlier one finished, removes it from the
> jobs table, so the "jobs" command only ever shows (and then removes) the
> last one started.   It still allows wait N to return the status, as many
> times as you want to do that command, but not wait %n for any but the
> most recently created one.


Right. The ascending job number depends on your policy for assigning new
job numbers, and you can only use job control notation to refer to entries
in the job list. But bash will let you wait for pid N as long as pid N is
in the list of terminated asynchronous processes.


> The bigger issue is what do you do about users who can be connected to
> their shell for weeks, running lots of background commands, and never
> issuing a wait or jobs command?   Do you just keep remembering exit
> status/pid pairs forever?   That doesn't sound sustainable to me.


Bash bounds the number remembered. It's at least CHILD_MAX, as POSIX
specifies, with an upper bound (right now, 32K -- very few sessions
start that many asynchronous jobs/processes). It checks for pid reuse: the
entry for pid N will always be the status of the most recent asynchronous
process with that pid. That might not be perfect, but it works fine in
practice.
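
A minimal sketch of what "remembered" means in practice, assuming a POSIX sh: several asynchronous lists are started, each `$!` is expanded and saved, and the statuses are collected afterwards by pid. The exit codes 0 to 3 are arbitrary, and four children is far below any CHILD_MAX.

```sh
# Minimum number of statuses the shell has to be able to remember.
getconf CHILD_MAX 2>/dev/null || :

pids=
for code in 0 1 2 3; do
    ( exit "$code" ) &
    pids="$pids $!"          # expanding $! keeps the pid "known"
done

statuses=
for pid in $pids; do
    rc=0
    wait "$pid" || rc=$?     # status is available even if the child is long gone
    statuses="$statuses $rc"
done
echo "statuses:$statuses"    # statuses: 0 1 2 3
```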

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
     ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-06 Thread Chet Ramey via austin-group-l at The Open Group

On 4/29/22 2:38 PM, Robert Elz via austin-group-l at The Open Group wrote:


>    | However, today it threw a last curve ball when I was working on an
>    | update to the description of set -b ...
>
> How many shells actually implement that?


Bash does. I doubt anyone uses it.



>    | This conflicts with 2.9.3.1 Asynchronous Lists which says that IDs
>    | remain known until:
>    |
>    |  1. The command terminates and the application waits for the process ID.
>    |
>    |  2. Another asynchronous list is invoked before "$!" (corresponding to
>    | the previous asynchronous list) is expanded in the current execution
>    | environment.
>
> Does anyone implement that bit (#2) at all?


I think the FreeBSD shell does.


> In a non-interactive shell it
> might almost be possible, but in an interactive shell, if the job isn't in
> the list (whether $! has been referenced or not - usually it will not have
> been) because it has been removed, what is the shell supposed to do if the
> job stops?   Further users (even in scripts) are allowed to use % %- %1
> etc to refer to jobs, $! isn't the only way to reference one ("wait %2 should
> work).   I'd suggest that #2 should simply be removed.


I think the standard implies that the jobs list and the list of terminated
process IDs `known in the current environment' are different things. It's
not clear.



> But do note that the definition of the jobs command says:
>
> When jobs reports the termination status of a job, the shell shall
> remove its process ID from the list of those ``known in the current
> shell execution environment''; see Section 2.9.3.1 (on page 2338).


This is one place where the two things overlap.



>    | It also appears that dash still implements remove-before-prompting.
>
> Does anyone not?


Lots of shells don't.



>    | B. Allow remove-before-prompting. This would mean changing 2.9.3.1 to
>    | add a third list item (for interactive shells only) and deleting the
>    | above quoted text from the wait page.
>
> This is necessary, we would be making use of the shell too difficult for
> interactive users otherwise.


What does "too difficult" mean? The shells that don't do remove-before-
prompting seem to be doing just fine.



> While you're considering all of this, you might want to also consider what
> is intended to happen if a script does
>
> trap '' CHLD
>
> and how that is supposed to interact with maintenance of the jobs command,
> the wait command, and all else related.


It should be explicitly stated to be unspecified behavior, since SIGCHLD is
necessary to make process handling work.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: When can shells remove "known" process IDs from the list?

2022-05-06 Thread Chet Ramey via austin-group-l at The Open Group
r sections.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: 答复: How do I get the buffered bytes in a FILE *?

2022-04-18 Thread Chet Ramey via austin-group-l at The Open Group

On 4/18/22 12:53 AM, Rob Landley wrote:

On 4/17/22 18:10, Chet Ramey wrote:

On 4/16/22 2:58 PM, Rob Landley via austin-group-l at The Open Group wrote:

Q) "How do I switch from FILE * to fd via fileno() without losing data."

A) "Don't use FILE *"

That's not the question I asked?


The answer is correct, but incomplete. The missing piece is that if you
want to use FILE *, the operation you want, and the information you need to
implement it, are not part of the public API.


Which is a fixable problem.


Sure, everything's fixable. It's not what you asked, though.




Other than using a strategy like Geoff suggested early on, or trying
something like setvbuf to turn off buffering on the FILE * completely, the
buffer associated with a FILE * and the indexes into it that say how much
data you've consumed from the underlying source are opaque.


https://github.com/coreutils/gnulib/blob/master/lib/freadahead.c


So the gnulib folks looked at a bunch of different stdio implementations
and used non-public (or at least non-standard) portions of the
implementation to augment the stdio API.

If that's what you want to do, propose adding freadahead to the standard.

Or reimplement the gnulib work and accept that the stdio implementation
can potentially change out from under you. Current POSIX provides no help
here.



If you want to
manipulate that information, or expose it to a caller, you can't use FILE *
(or, if you want a direct answer, "you can't").


The if/else staircase in m4 and gnulib and so on says I can.


Not in a way that protects you against changes to one of the underlying
stdio implementations. And isn't that the point? You can always offer that
functionality if you have stable access to stdio internals, but it's not in
the standard.


I was just wondering if there was a _clean_ way to do it. 


OK. Do you think you've gotten an answer to that?



The C99 guys point out they haven't got file descriptors and thus this would
logically belong in POSIX, for the same reason fileno() does. "But FILE *
doesn't have a way to fetch the file descriptor" was answered by adding
fileno(). That is ALSO grabbing an integer out of the guts of FILE *.


Sure. And adding that to the standard would require the usual things, for
which there's a process.


This exists. It would be nice if it got standardized.


Maybe it would. But that's a different question.





Re: 答复: How do I get the buffered bytes in a FILE *?

2022-04-17 Thread Chet Ramey via austin-group-l at The Open Group

On 4/16/22 2:58 PM, Rob Landley via austin-group-l at The Open Group wrote:

Q) "How do I switch from FILE * to fd via fileno() without losing data."

A) "Don't use FILE *"

That's not the question I asked?


The answer is correct, but incomplete. The missing piece is that if you
want to use FILE *, the operation you want, and the information you need to
implement it, are not part of the public API.

Other than using a strategy like Geoff suggested early on, or trying
something like setvbuf to turn off buffering on the FILE * completely, the
buffer associated with a FILE * and the indexes into it that say how much
data you've consumed from the underlying source are opaque. If you want to
manipulate that information, or expose it to a caller, you can't use FILE *
(or, if you want a direct answer, "you can't").

I found it easier to write my own buffered input package to satisfy the
POSIX read ahead requirements than try to coerce stdio into doing it.
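
One place the read-ahead constraint is visible from a script, as a minimal
sketch (the function name is illustrative, and this assumes the constraint
meant above is the usual one): when standard input is a pipe, the shell's
read builtin must not consume input beyond the line it returns, because the
next command shares the same descriptor.

```sh
# read takes the first line from the pipe; cat then reads the same
# descriptor and should still see everything after that line.
rest_after_read() {
    printf 'one\ntwo\nthree\n' | {
        IFS= read -r first    # consumes exactly the line "one"
        cat                   # sees whatever read left in the pipe
    }
}

rest_after_read
```

A shell that buffered its input the way stdio does would swallow all three
lines in the first read and leave nothing for cat; a shell that reads
carefully leaves "two" and "three".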




Re: Another "is it quoted" issue for here doc redir op end words

2022-02-06 Thread Chet Ramey via austin-group-l at The Open Group

On 2/6/22 1:23 PM, Robert Elz via austin-group-l at The Open Group wrote:


They are definitely allowed, what saves all of us is that no-one
ever writes this, and if anyone ever attempted it, they'd probably be
hoping that the end-word on the here-doc redirect would be expanded
the same way it is for all other redirects (which definitely does
not happen).


That's the aforementioned rabbit hole.
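
For the simplest form of the "not expanded" point, a minimal sketch
(function and variable names are made up, and this is far tamer than the
cases being argued about): the word after << only undergoes quote removal,
so a delimiter written as $end is the literal text "$end", not the value of
the variable.

```sh
show_heredoc_delimiter() {
    end=EOF
    # The delimiter is the four characters "$end"; it is never expanded
    # to EOF, so the here-document ends at the literal line "$end".
    cat <<$end
hello
$end
}

show_heredoc_delimiter
```

In shells that handle this form, the function prints hello. Some older
shells are known to mishandle even this, which is part of why nobody
writes it.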




Re: Another "is it quoted" issue for here doc redir op end words

2022-02-06 Thread Chet Ramey via austin-group-l at The Open Group

On 2/6/22 6:14 AM, Thorsten Glaser via austin-group-l at The Open Group wrote:

Robert Elz via austin-group-l at The Open Group dixit:


But there is a somewhat weird case that the shells (those for which
this works at all, which is a minority) differ about, that I don't



Which is correct, and why?


Should that even work?! (mksh is one of the shells in which it doesn’t,
and I’m hard-pressed to see why scripts should even be allowed to write
constructs like that.)


Yeah, we're way down the rabbit hole of hypothetical cases here.




Re: how do to cmd subst with trailing newlines portable (was: does POSIX mandate whether the output…)

2022-01-27 Thread Chet Ramey via austin-group-l at The Open Group

On 1/27/22 12:25 PM, Chet Ramey wrote:
On 1/27/22 12:07 PM, Harald van Dijk via austin-group-l at The Open Group 
wrote:

On 27/01/2022 16:06, Chet Ramey via austin-group-l at The Open Group wrote:

Wow, that seems like a bug. Environment variables can contain sequences of
arbitrary non-NUL bytes, and, as long as the portion before the `=' is a
valid NAME, the shell is required to create a variable with the remainder
of the string as its value and pass it to child processes in the
environment.


That is not what POSIX says. It says "The value of an environment 
variable is a string of characters" (8.1 Environment Variable 
Definition), and "character" is defined as "a sequence of one or more 
bytes representing a single graphic symbol or control code" (3 
Definitions), with a note that says it corresponds to what C calls a 
multi-byte character. Environment variables are not specified to allow 
arbitrary bytes.


I wonder why they chose that. It's a departure from existing practice.


I suppose it's just a quality of implementation issue, since applications
can obviously put whatever they want into the value of an environment
variable in envp and call execve.
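
A minimal sketch of that round trip, assuming a shell that stores values as
byte strings (the function and variable names are illustrative only). Two
bytes that do not form a valid UTF-8 character are placed in the
environment of a child sh, which prints what it received:

```sh
# Pass bytes 0x80 0x81 through the environment and dump, in hex,
# what the child process sees in the variable.
env_roundtrip() {
    value=$(printf '\200\201')
    VALUE=$value sh -c 'printf %s "$VALUE"' | od -An -tx1 | tr -d ' \n'
    echo
}

env_roundtrip
```

In shells that pass the bytes through untouched this prints 8081. A shell
that converts values to wide characters, as described for yash earlier in
the thread, may print something else.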




Re: how do to cmd subst with trailing newlines portable (was: does POSIX mandate whether the output…)

2022-01-27 Thread Chet Ramey via austin-group-l at The Open Group
On 1/27/22 12:07 PM, Harald van Dijk via austin-group-l at The Open Group 
wrote:

On 27/01/2022 16:06, Chet Ramey via austin-group-l at The Open Group wrote:

Wow, that seems like a bug. Environment variables can contain sequences of
arbitrary non-NUL bytes, and, as long as the portion before the `=' is a
valid NAME, the shell is required to create a variable with the remainder
of the string as its value and pass it to child processes in the
environment.


That is not what POSIX says. It says "The value of an environment variable 
is a string of characters" (8.1 Environment Variable Definition), and 
"character" is defined as "a sequence of one or more bytes representing a 
single graphic symbol or control code" (3 Definitions), with a note that 
says it corresponds to what C calls a multi-byte character. Environment 
variables are not specified to allow arbitrary bytes.


I wonder why they chose that. It's a departure from existing practice.





Re: how do to cmd subst with trailing newlines portable (was: does POSIX mandate whether the output…)

2022-01-27 Thread Chet Ramey via austin-group-l at The Open Group
On 1/27/22 10:18 AM, Harald van Dijk via austin-group-l at The Open Group 
wrote:

On 27/01/2022 12:44, Geoff Clare via austin-group-l at The Open Group wrote:

Christoph Anton Mitterer wrote, on 26 Jan 2022:

3) Does POSIX define anywhere which values a shell variable is required
    to be able to store?
    I only found that NUL is excluded, but that alone doesn't mean that
    any other byte value is required to work.


Kind of circular, but POSIX clearly requires that a variable can be
assigned any value obtained from a command substitution that does not
include a NUL byte, and specifies utilities that can be used to
generate arbitrary byte values, therefore a variable can contain any
sequence of bytes that does not include a NUL byte.


Is it really clear that POSIX requires that? The fact that it refers to 
"characters" of the output implies the bytes need to be interpreted as 
characters according to the current locale, which is a process that can 
fail. In at least one shell (yash), bytes that do not form a valid 
character are discarded, which makes sense since yash internally stores 
variables etc. as wide strings. 


Wow, that seems like a bug. Environment variables can contain sequences of
arbitrary non-NUL bytes, and, as long as the portion before the `=' is a
valid NAME, the shell is required to create a variable with the remainder
of the string as its value and pass it to child processes in the
environment. If yash modifies that value because there are sequences that
don't form valid wide characters, that sounds like a problem.




Re: [Issue 8 drafts 0001505]: Make doesn't seem to specify unset macro expansion behaviour

2021-12-17 Thread Chet Ramey via austin-group-l at The Open Group

On 12/17/21 5:11 AM, Geoff Clare via austin-group-l at The Open Group wrote:


The more I think about it, the more I am convinced that an error
is the right thing for make to do, 


The world is an imperfect place. It seems that few, if any, make
implementations agree. We can't start standardizing behavior that no one
implements because of a desire for possible future improvement. That's
what gives standards bodies a bad name.




Re: [Issue 8 drafts 0001505]: Make doesn't seem to specify unset macro expansion behaviour

2021-12-17 Thread Chet Ramey via austin-group-l at The Open Group

On 12/17/21 5:11 AM, Geoff Clare via austin-group-l at The Open Group wrote:


Currently POSIX does not require unset macros to expand to an empty
string. The standard is silent on the matter, so the behaviour is
implicitly unspecified.


It seems like this is an opportunity to standardize behavior that is
common across multiple (all?) make implementations.


The proposed change *reduces* the allowed behaviours from many to
just two.


If all make implementations have the same behavior, why not standardize
that?

Is there evidence that the "typo in the makefile" problem is widespread
enough to devise and require a hypothetical fix in make?




Re: [Issue 8 drafts 0001505]: Make doesn't seem to specify unset macro expansion behaviour

2021-12-16 Thread Chet Ramey via austin-group-l at The Open Group
On 12/16/21 5:27 AM, Geoff Clare via austin-group-l at The Open Group wrote:
> Chet Ramey wrote, on 14 Dec 2021:
>>
>> On 12/14/21 5:15 AM, Geoff Clare via austin-group-l at The Open Group wrote:
>>> Paul Smith wrote, on 13 Dec 2021:
>>>> Why shouldn't we just state that
>>>> make implementations must expand unset variables to the empty string,
>>>> which is what all implementations (that I'm aware of) do anyway?
>>>
>>> The point is that any makefile that relies on an unset macro being
>>> expanded to an empty string is not portable.  The only reason it ever
>>> works is purely by luck.  
>>
>> These two paragraphs are clearly in conflict. They can't both be true.
> 
> They are both true, but I could perhaps have phrased it differently
> to make it clearer why the second one is true.
> 
> When a makefile relies on an unset macro being expanded to an empty
> string, the reason it is not portable has nothing to do with the way
> current make implementations expand unset macros, it is because the
> makefile is relying on the macro being unset.

Then it doesn't fully address the point.

Whether you write a makefile one way or another, when standardizing
make behavior we should give more weight to the behavior of current make
implementations. If all existing make implementations expand unset macros
to the empty string -- and no one has identified an implementation that
does not -- then that is the behavior that should be standardized.

> The only way to be sure that a given macro will expand to an empty
> string is to explicitly set it to an empty string.

This is just saying that you can never be sure that a macro is unset.
However, if it is unset, make implementations should behave consistently,
and it appears they do.
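
A quick way to check what a given make does, as a sketch only: the function
name and the macro name UNSET_MACRO are made up, and the macro is assumed
to be set neither in the makefile nor in the environment. The makefile is
fed to make on standard input with -f -.

```sh
# One-rule makefile on stdin; the recipe echoes the expansion of a
# macro that is not set anywhere, wrapped in brackets.
expand_unset_macro() {
    printf 'all:\n\t@echo "[$(UNSET_MACRO)]"\n' | make -f -
}
```

Running expand_unset_macro with an implementation that expands unset
macros to the empty string prints []. An implementation that treated the
expansion as an error would print a diagnostic instead.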





Re: [Issue 8 drafts 0001505]: Make doesn't seem to specify unset macro expansion behaviour

2021-12-14 Thread Chet Ramey via austin-group-l at The Open Group
On 12/14/21 5:15 AM, Geoff Clare via austin-group-l at The Open Group wrote:
> Paul Smith wrote, on 13 Dec 2021:
>> Why shouldn't we just state that
>> make implementations must expand unset variables to the empty string,
>> which is what all implementations (that I'm aware of) do anyway?
> 
> The point is that any makefile that relies on an unset macro being
> expanded to an empty string is not portable.  The only reason it ever
> works is purely by luck.  

These two paragraphs are clearly in conflict. They can't both be true.

Can anyone point to a make implementation that throws an error when
expanding an unset variable?




Re: $? in a simple command with no command name

2021-09-01 Thread Chet Ramey via austin-group-l at The Open Group

On 9/1/21 4:59 PM, (Joerg Schilling) wrote:

"Chet Ramey via austin-group-l at The Open Group" 
 wrote:


Given the following:

(exit 42)
a=$? b=`false` b=$?

echo $? $a $b

Bash prints 1 42 1.

The original (v7) Bourne shell and the rest of the research line through v9
print 1 1 (b is set to the empty string). That implies that it executes
the assignment statements in reverse order, in addition to carrying $?
through the sequence of assignments.


You are right, the original Bourne Shell for unknown reasons did evaluate a
series of shell variable assignments in reverse order. That was changed in
ksh88 and in bosh.
  

The SVR4.2 shell prints 1 42 1. I imagine the rest of the SVR4 line sh is
the same.


Something called SVR4.2 does not really exist. It was a minor change compared
to SvR4 announced by Novell shortly before they sold the Copyright to SCO.
I know of no customers for SVR4.2... even SCO seems to have only used it internally
in their project Monterey that was abandoned by IBM.


And yet it existed as a product. Univel probably had Unixware customers
they didn't tell you about.

You can find it for download if you look for it, I suspect.


There have been major changes in the Bourne Shell for SvR4, but the $? was not
touched. So you are mistaken.


Sure, I didn't have SVR4 to test against when I wrote that.


The important thing to know here is that the Bourne Shell has some
checkpoints that update the intermediate value of $?. Since that changed in
ksh88 and since POSIX requires a different behavior compared to the Bourne
Shell, I modified one checkpoint in bosh to let it match POSIX.


Interesting, since ksh88 (Solaris 10 11/16/88i) and ksh93 (93u+ 2012-08-01)
both print

1 42 1

Odd that POSIX would specify something different, isn't it?



(exit 42); a=$? b=`false` b=$?; echo $? $a $b

prints

1 42 42

in bosh and

1 1

in the SvR4 Bourne Shell.


It echoes 255 255 with the Solaris 10 /bin/sh (b is again null). It looks
like /bin/false exits with status 255, the Solaris 10 sh still performs
the assignments in reverse order, and the Solaris 10 version of the SVR4 sh
sets $? from the result of each command substitution.

In any case, kre's point stands: the original Bourne shell (and, for that
matter, the POSIX base implementation) set $? as each command substitution
finishes.
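
To repeat the comparison on whatever shells are at hand, a minimal sketch
(the function name and the list of shell names are only examples) that runs
the one-line form of the test in each shell found on PATH:

```sh
# Run the $? test from this thread in every named shell that is installed.
probe_status() {
    for shell in "$@"; do
        command -v "$shell" >/dev/null 2>&1 || continue
        printf '%s: ' "$shell"
        "$shell" -c '(exit 42); a=$? b=`false` b=$?; echo $? $a $b'
    done
}

probe_status sh bash dash ksh
```

Each line of output is the shell name followed by the three values; the
results reported above suggest what to expect, but the point of the script
is to see what each shell actually does.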




Re: $? in a simple command with no command name

2021-09-01 Thread Chet Ramey via austin-group-l at The Open Group
On 9/1/21 2:23 PM, Robert Elz via austin-group-l at The Open Group wrote:
> Date:        Wed, 1 Sep 2021 19:04:12 +0100
> From:        Harald van Dijk
> Message-ID:  <837d3b5b-ac61-98eb-2741-d667a78e2...@gigawatt.nl>
> 
>   | Is there any statement that overrides the general definition to 
>   | explicitly make this unspecified? If not, the general definition applies 
>   | and $? must expand to 0 both times it appears on line 2.
> 
> Perhaps as currently written that's correct, but if so, the standard
> probably needs to be updated, as it is fairly clear that shells which
> set $? as each command substitution finishes have always existed (in
> fact, that might have been what the original Bourne shell did, I haven't
> checked) and the standard should allow for that.

Given the following:

(exit 42)
a=$? b=`false` b=$?

echo $? $a $b

Bash prints 1 42 1.

The original (v7) Bourne shell and the rest of the research line through v9
print 1 1 (b is set to the empty string). That implies that it executes
the assignment statements in reverse order, in addition to carrying $?
through the sequence of assignments.

The SVR4.2 shell prints 1 42 1. I imagine the rest of the SVR4 line sh is
the same.




Re: utilities and write errors

2021-06-30 Thread Chet Ramey via austin-group-l at The Open Group
On 6/30/21 11:49 AM, Joerg Schilling via austin-group-l at The Open Group 
wrote:



Erm, yes. For some reason, I assumed the OP wrote &> instead of >&
which have the same meaning in GNU bash (but &> is the parse-trouble
one even if the bash manpage actively recommends it). I guess their
'~>&' confused me. My point of _please_ using '>file 2>&1' instead
is still valid, ofc.


BTW: I would not call it a hard parse error but a semantic problem, since the
standard only mentions numbers after >&


It does not. The redirection is specified as `[n]>&word'. The standard
says:

"If word evaluates to one or more digits, the file descriptor denoted by n,
or standard output if n is not specified, shall be made to be a copy of the
file descriptor denoted by word; if the digits in word do not represent a
file descriptor already open for output, a redirection error shall result;
see Consequences of Shell Errors. If word evaluates to '-', file descriptor
n, or standard output if n is not specified, is closed. Attempts to close a
file descriptor that is not open shall not constitute an error. If word
evaluates to something else, the behavior is unspecified."

Everyone is conformant here. There is unspecified behavior.
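
The two specified cases in that text, as a minimal sketch with made-up
function names: word as digits duplicates a descriptor, and word as '-'
closes one. Anything else after >& (a file name, for instance) is the
unspecified part.

```sh
dup_demo() {
    # 3>&1 makes descriptor 3 a copy of standard output for the group,
    # so the write to descriptor 3 inside it lands on standard output.
    { echo "via fd 3" >&3; } 3>&1
}

close_demo() {
    # 7>&- closes descriptor 7; closing a descriptor that is not open
    # is not an error, so the command still succeeds.
    true 7>&-
    echo "status $?"
}

dup_demo
close_demo
```

This prints "via fd 3" and then "status 0".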




Re: utilities and write errors

2021-06-30 Thread Chet Ramey via austin-group-l at The Open Group
On 6/29/21 5:09 PM, tg...@mirbsd.org via austin-group-l at The Open Group 
wrote:



I know the GNU bash extension >& (which incidentally
violates POSIX on the parse level) but not ~>&…


It doesn't. It's been unspecified for over 30 years.





Re: behavior of printf '\x61'

2021-04-16 Thread Chet Ramey via austin-group-l at The Open Group

On 4/16/21 3:41 PM, Philip Guenther via austin-group-l at The Open Group wrote:

-
7. An additional conversion specifier character, b, shall be supported as
follows.

...
The interpretation of a <backslash> followed by any other sequence of
characters is unspecified.
-

That exception is about the %b conversion and the handling of its argument, 
so while that says that

      printf %b '\x61'
is unspecified, it doesn't apply to
      printf '\x61'


A strict reading of the standard says that it's not converted, as per your
original message, since `File Format Notation' says:

"Characters that are not "escape sequences" or "conversion specifications",
as described below, shall be copied to the output."

and \x is not a described "escape sequence." The printf description
explicitly allows octal, but not hex.

Output varies widely ('a', '\x61', 'x61'). I consider hex output a valid
extension, but others probably will not. I believe it's a defect in the
standard, though.
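
For scripts that need the same output everywhere, a minimal sketch (the
function name is illustrative): the octal escape is the form the printf
description does allow, so it can stand in for the hex one.

```sh
portable_a() {
    printf '\141\n'     # octal 141 is the letter a
}

portable_a

# The hex form, for comparison; what this prints depends on the printf
# in use, as noted above.
printf '\x61\n'
```

The first call prints a on any conforming printf; the second line is only
there to show what the local implementation does with \x.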




Re: execve(2) optimisation for last command

2021-04-15 Thread Chet Ramey via austin-group-l at The Open Group

On 4/15/21 4:36 PM, Martijn Dekker via austin-group-l at The Open Group wrote:

Most shells 'exec' the last command in -c scripts, e.g.:




However, no shell seems to do this for scripts loaded from a file:





My question: why is this? I would have thought that a script is a script 
and that it should make no essential difference whether a script is taken 
from a -c argument or loaded from a file. What makes the optimisation 
appropriate for one but not the other?


My guess is that all these shells read scripts a line (or command) at a
time, and don't realize they're at EOF until after they've executed the
last command. (In a nutshell, that's what bash does.) Commands executed
with -c don't have this limitation.
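
One way to observe the -c case, as a sketch with made-up function names:
the tested shell prints its own process ID and then runs a second sh as its
last command, which prints that process's ID. If the two numbers match, the
last command was exec'd in place rather than forked.

```sh
# $1 is the shell to test (assumed to be on PATH). Prints two PIDs:
# the tested shell's, then that of the sh it runs as its last command.
last_command_pids() {
    "$1" -c 'echo $$; sh -c "echo \$\$"'
}

# Succeeds when both PIDs are equal, i.e. the last command replaced
# the shell process instead of running in a child.
exec_optimised() {
    set -- $(last_command_pids "$1")
    [ "$1" = "$2" ]
}

if exec_optimised sh; then
    echo "sh exec'd the last command"
else
    echo "sh forked for the last command"
fi
```

Which branch is taken depends on the shell and its version, so no
particular result is claimed here.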




Re: [Shell Command Language][shortcomings of command utlity][improving robustness of POSIX shells]

2021-04-13 Thread Chet Ramey via austin-group-l at The Open Group

On 4/13/21 5:16 AM, Harald van Dijk via austin-group-l at The Open Group wrote:

Please note again that POSIX's Command Search and Execution doesn't say 
"continue until execve() doesn't fail". It says "Otherwise, the command 
shall be searched for using the PATH environment variable as described in 
XBD Environment Variables", and then what happens to the result of that 
search. It very clearly separates the search from the attempt to execute. 


The complicating factor is POSIX's definition of "executable file."

You search "until an executable file with the specified name and
appropriate execution permissions is found."

An executable file is a "regular file acceptable as a new process image
file by the equivalent of the exec family of functions."

And the only way to determine that is by trying to execute it using one
of "the exec family of functions."

That said, this is the most marginal of corner cases, notwithstanding that
bash has a distinct option to handle it (disabled by default).




Re: [Shell Command Language][shortcomings of command utlity][improving robustness of POSIX shells]

2021-04-12 Thread Chet Ramey via austin-group-l at The Open Group

On 4/12/21 12:05 PM, Robert Elz via austin-group-l at The Open Group wrote:


Anything that the system can run, no matter how it does that, is acceptable.

If a system noticed a VAX format a.out, it could load a vax simulator, and
run the binary that way, without the user even noticing.  If it wanted.


You just described basically how macOS runs Intel binaries on M1 hardware,
and how Intel hardware ran PowerPC binaries before that. No mystery here.




Re: [Shell Command Language][shortcomings of command utlity][improving robustness of POSIX shells]

2021-04-12 Thread Chet Ramey via austin-group-l at The Open Group

On 4/11/21 4:17 PM, shwaresyst via austin-group-l at The Open Group wrote:

conforming applications can not rely on unspecified behaviors, so having a 
use beyond that specified makes the shell nonconforming. Calling it out 
like that simply acknowledges a lot of shell implementations choose to make 
themselves nonconforming, I do not see it as an endorsement or allowance. 


This is just wrong. By this definition, every shell is non-conforming.





Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar

2021-02-19 Thread Chet Ramey via austin-group-l at The Open Group

On 2/19/21 3:32 PM, Robert Elz wrote:

 Date:        Fri, 19 Feb 2021 14:30:25 -0500
 From:        Chet Ramey
 Message-ID:  <2b32112c-de72-c713-3f87-6840828c3...@case.edu>


   | Nope, it's consistent with the standard.

I can understand that argument.

   | that's not a fair reading of rule 4.

Whenever we need to rely upon "fair" readings (which generally means
that it isn't unambiguous, but it "must" mean ...) we have a problem.


And here we are.


It clearly needs to be fixed.  bash is alone amongst shells in interpreting
it that way. 


Nope, yash in extra-pedantic-posix mode interprets in the same way.

This is what happens when you implement the grammar to the standard and
it's such a nonsense case that nobody ever reports it as an error.


Everyone else allows that "esac"
to be Esac only when it causes a match, and not when it causes a syntax
error, which is what 2.10.1 says.


As I said earlier, it's not hard to fix, just another special case in the
grammar caught in lexical analysis.

Chet





Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar

2021-02-19 Thread Chet Ramey via austin-group-l at The Open Group

On 2/19/21 12:56 PM, Robert Elz via austin-group-l at The Open Group wrote:


bash's behaviour is a little weird:


Nope, it's consistent with the standard.


bash5 $ case esac in
(esac) echo match
-bash: syntax error near unexpected token `esac'
bash5 $ esac
-bash: syntax error near unexpected token `esac'

It is obviously converting the "esac" to Esac, which is correct according
to POSIX, but then apparently expecting it to be a pattern, which is
not correct, it should be terminating the case statement (as zsh
does) making it be that the following ')' is incorrect.


OK. You return the `(' as a token, which, since you're looking for a
pattern list, takes you to a state where you apply rule 4. We agree that a
fair reading of rule 4 results in Esac, as above, which is a syntax error.

There is nothing in the standard that allows you to treat the Esac token in
that state as terminating the case statement, nor is there anything that
allows you to discard the `('. It's just an error.

This is the crux of Geoff's argument: that's not a fair reading of rule 4.





Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar

2021-02-19 Thread Chet Ramey via austin-group-l at The Open Group

On 2/19/21 11:21 AM, Geoff Clare via austin-group-l at The Open Group wrote:


There is no way to apply rule 4 to produce "a token identifier acceptable at
that point in the grammar". The only token identifier acceptable at that
point in the grammar is WORD, and rule 4 does not produce WORD. Rule 4
reads:

   When the TOKEN is exactly the reserved word esac, the token identifier
   for esac shall result. Otherwise, the token WORD shall be returned.

Here, the TOKEN is exactly the reserved word esac, and you agree that this
rule is applied. This therefore produces the token identifier for esac.
There is nothing else that turns it into WORD, which is needed to parse it
as a pattern.


I see your point.  The wording of rule 4 itself does not yield WORD in
this case; it's only when read in combination with the introductory text
from 2.10.1 that it becomes apparent that this is the intention.


So "acceptable at that point in the grammar" is indeed carrying a heavy
load here. You might want to add the qualifying language you suggested.



Incidentally, bash 3 on macOS gets the '|' case wrong, e.g.:

case esac in foo|esac) echo match;; esac

whereas bash5 accepts that.  So it would appear that Chet fixed the
preceded-by-'|' case at some point but not the preceded-by-'(' case.


It's just another special case in the grammar that lexical analysis
has to handle.
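
To see how a particular shell treats each spelling, a minimal sketch (the
helper name is made up): each script is handed to the shell with -c, and a
syntax error or failure is reported as "rejected" instead of aborting the
caller.

```sh
# $1: shell to test, $2: script text. Prints the script's output, or
# "rejected" when the shell fails to parse or run it.
accepts() {
    "$1" -c "$2" 2>/dev/null || echo "rejected"
}

# A quoted pattern is never the reserved word, so this form is safe.
accepts sh 'case esac in "esac") echo match;; esac'

# The two forms discussed in this thread; results depend on the shell.
accepts sh 'case esac in foo|esac) echo match;; esac'
accepts sh 'case esac in (esac) echo match;; esac'
```

The first call prints match. The other two print either match or rejected,
depending on how the shell's lexer applies rule 4.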




Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar

2021-02-19 Thread Chet Ramey via austin-group-l at The Open Group

On 2/19/21 11:22 AM, Geoff Clare via austin-group-l at The Open Group wrote:


Yes, rule 4 is applied there, but your mistake is in assuming that
the *result* of rule 4 is that the token is converted to an Esac.


How is it not? "the [sic] TOKEN is exactly the reserved word esac" at this
point. Why would it not return the token for `esac'? Or are you saying that
is not converted to an Esac?


Harald made essentially the same point in his last mail - see my reply
to that.


Yeah, I hadn't gotten there yet.





Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar

2021-02-19 Thread Chet Ramey via austin-group-l at The Open Group

On 2/19/21 10:52 AM, Donn Terry via austin-group-l at The Open Group wrote:
It was oh so many years ago that I originally wrote that hideously awful 
grammar to try to reflect what the ksh did, which was very much ad-hoc 
parsing. I won't apologise for the ksh language the grammar tries to 
reflect, or for the grammar itself since ksh is definitely not context-free 
and thus requires such awfulness. But I feel awful that it's inflicted on 
POSIX users.


I think you deserve a lot of credit for that work. It's a much more
daunting task than people appreciate.




Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar

2021-02-19 Thread Chet Ramey via austin-group-l at The Open Group

On 2/19/21 10:33 AM, Geoff Clare via austin-group-l at The Open Group wrote:


Observe that rule 4 is applied for the first word in a pattern even if that
pattern follows an opening parenthesis. Because of that, in my example, the
esac in parentheses is interpreted as the esac keyword token, not a regular
WORD token that makes for a valid pattern.


Yes, rule 4 is applied there, but your mistake is in assuming that
the *result* of rule 4 is that the token is converted to an Esac.


How is it not? "the [sic] TOKEN is exactly the reserved word esac" at this
point. Why would it not return the token for `esac'? Or are you saying that
it is not converted to an Esac?




Re: Bug 1393 ("command" declaration utility) possible solution

2021-01-10 Thread Chet Ramey via austin-group-l at The Open Group

On 1/9/21 4:34 PM, Martijn Dekker via austin-group-l at The Open Group wrote:

I would question that the currently published standard allows any regular 
builtin to override the regular shell grammar with special syntactic 
properties. What exactly do you base this on?


I imagine it's because these builtins also accept the syntax the standard
deems legal. As long as they do that, they can do whatever else they want
with syntax that would be an error by the rule of the standard.


This is also not a majority shell behaviour. For instance, bash does not 
currently do this. I wonder if Chet has any plans of changing that now.


I currently do not. I have other priorities.




Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()

2020-11-20 Thread Chet Ramey via austin-group-l at The Open Group

On 11/19/20 5:05 AM, Geoff Clare via austin-group-l at The Open Group wrote:


2. If the last resource-specifying option has no option-argument,
treat the operand as if it was an option-argument for that option;
otherwise report a usage error (or ignore the operand).


This option sounds like it's the most reasonable. I will look to add it,
or something along these lines, in bash-5.2. It's too late for bash-5.1.

Chet



Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()

2020-11-17 Thread Chet Ramey via austin-group-l at The Open Group

On 11/17/20 10:56 AM, Geoff Clare via austin-group-l at The Open Group wrote:

Chet Ramey wrote, on 17 Nov 2020:


On 11/17/20 10:14 AM, Geoff Clare via austin-group-l at The Open Group wrote:


Maybe you could handle those by seeing that the option argument is
alphabetic (and not "unlimited") and treating it as a string of
option letters instead of reporting that it is an invalid number.


From `getopt's perspective, there is no difference between -fH and
-f H. They both return `H' in optarg.


One increments optind by 1 and the other by 2, which means it's possible
to distinguish the two cases.


The bash builtin getopt doesn't quite do things the same way, since it uses
the word lists bash passes around. I could recognize this case, but it 
seems fragile.
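
The optind difference mentioned above can be observed from the shell with
the standard getopts utility, which keeps the same index in OPTIND. A small
sketch follows; the function name probe is only illustrative, and this shows
getopts as seen by a script, not bash's internal option parser.

```shell
# probe is an illustrative helper: parse a single "-f" option that takes an
# option-argument, then report the argument and the index of the next word.
probe() {
    OPTIND=1                          # restart parsing on every call
    getopts f: opt "$@" || return 1
    printf '%s %s\n' "$OPTARG" "$OPTIND"
}

probe -fH     # attached argument: prints "H 2"
probe -f H    # separate argument: prints "H 3"
```

Both calls report the same option-argument; only the index of the next word
to be processed differs.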



Or I could just go with my original suggestion of adding:

  Conforming applications shall specify each option separately; that is,
  grouping option letters (for example, −fH) need not be recognized by
  all implementations.

to my proposal.


Sure, that would work.


Okay, looks like that will be the end result, unless you like my optind
trick.  It would improve the portability of ksh scripts (or perhaps
more likely, commands typed by a user) to bash.


The new language is sufficient.





Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()

2020-11-17 Thread Chet Ramey via austin-group-l at The Open Group

On 11/17/20 10:14 AM, Geoff Clare via austin-group-l at The Open Group wrote:


Maybe you could handle those by seeing that the option argument is
alphabetic (and not "unlimited") and treating it as a string of
option letters instead of reporting that it is an invalid number.


From `getopt's perspective, there is no difference between -fH and
-f H. They both return `H' in optarg. There's no good reason to try
and treat the latter case as a series of option letters.



Or I could just go with my original suggestion of adding:

 Conforming applications shall specify each option separately; that is,
 grouping option letters (for example, −fH) need not be recognized by
 all implementations.

to my proposal.


Sure, that would work.




Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()

2020-11-17 Thread Chet Ramey via austin-group-l at The Open Group

On 11/17/20 4:53 AM, Geoff Clare via austin-group-l at The Open Group wrote:

Chet Ramey wrote, on 16 Nov 2020:


On 11/16/20 11:05 AM, Geoff Clare via austin-group-l at The Open Group wrote:

Chet Ramey wrote, on 16 Nov 2020:



Thanks.  Looks like bash is parsing the ulimit options in an unusual
way instead of using getopt() or similar.


Quite the opposite. The bash ulimit builtin uses the same internal_getopt
code as the rest of the builtins, with the addition that option-arguments
are allowed to be missing and that `-f' is the default in the absence of
any other resource options.

What's happening is that the bash getopt treats `-xARG' identically to
`-x ARG'. When it sees `-fH' it assumes, since -f takes an option argument,
that `H' is that argument.


This is the same thing that POSIX getopt does; of course, it doesn't handle
optional arguments at all.


Huh? This contradicts what you said above: "with the addition that
option-arguments are allowed to be missing". 


That is not a missing option-argument. A missing option argument is
something like `ulimit -c'. The POSIX getopt would not consider `-fH'
a missing option argument either, assuming `f:' were specified in the
option string; that's the point.

It also doesn't match the behaviour I see in bash 5:

$ bash -c 'ulimit -fc'
bash: line 0: ulimit: c: invalid number


There's nothing mysterious here. The `c' constitutes the option-argument.
Numbered item 2 in the POSIX getopt description says the same thing.
The only way it would not be considered an option argument is if it looked
like an option itself: a separate argument preceded by a `-'. Which leads
me to the next example:


$ bash -c 'ulimit -f -c'
file size   (blocks, -f) unlimited
core file size  (blocks, -c) 10

If it did not handle optional option-arguments, then this last command
would fail with "bash: line 0: ulimit: -c: invalid number".


Indeed, that's one aspect of the optional argument implementation.

But all of this discussion about getopt and optional arguments is a red
herring anyway. POSIX finesses all of this by not having option arguments
in the `ulimit' description at all, and the `newlimit' in the standard and 
the current proposal is a separate operand. That limits you to modifying
one limit per call.


To match your description it would need to include:

  ... [-c[limit]] [-f[limit]] ...


Nah. I'd have to do that for every option, and I'd like to keep the manual
under 200 pages. The descriptive text explains things.


In that case you should give a synopsis that does not specify syntax,
instead of giving one that contradicts the description.


The readers seem to pick it up pretty well. I haven't received reports that
the description `contradicts' the syntax summary.

I understand that the POSIX syntax summary is a language, and if you're
fluent in its syntax and semantics, it's easy to read things in its terms.
Not a lot of people do that.


   2.   Otherwise, optarg shall point to the string following the option
character in that element of argv, and optind shall be incremented by 1."


This requires that if the option string includes "f:" and the argument
list has -f followed by -c then the -c is returned in optarg.  That's not
what bash's ulimit does.
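
For comparison, the mandatory option-argument behaviour is easy to check
with the shell's getopts utility, which applies the same two rules. This is
a sketch only: the option string "f:c" and the helper name show_f are chosen
for illustration and are not what bash's ulimit uses.

```shell
# show_f is an illustrative helper: with "f:" the option-argument is
# mandatory, so whatever follows -f is taken as its argument.
show_f() {
    OPTIND=1
    getopts f:c opt "$@" || return 1
    printf 'option=%s optarg=%s\n' "$opt" "$OPTARG"
}

show_f -fc      # rest of the same word:  option=f optarg=c
show_f -f -c    # next argument word:     option=f optarg=-c
```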


Let's try this again. The example in question here is `ulimit -fc'. If the
`f' were specified as returning an option-argument, the POSIX getopt would
certainly return `c' in optarg. The bash option string includes `;', which
specifies that the option-argument may not be present, but it is. If you
consider `-f -c', the `-c' looks like an option, so the optional argument
code considers the option-argument to be missing.
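
The rule described in the paragraph above can be sketched in a few lines of
portable shell. This is a hypothetical illustration, not bash's
internal_getopt code: the function name parse_optional is invented, and it
treats every option letter as accepting an optional option-argument.

```shell
# parse_optional is a hypothetical sketch, not bash's internal_getopt: every
# option letter may carry an option-argument, and a following word that looks
# like an option (or is absent) means the option-argument is missing.
parse_optional() {
    while [ $# -gt 0 ]; do
        case $1 in
            -?)                              # a lone option letter
                case ${2-} in
                    '' | -*) echo "${1#-}: no option-argument"; shift ;;
                    *)       echo "${1#-}: option-argument '$2'"; shift 2 ;;
                esac ;;
            -??*)                            # option letter with attached text
                rest=${1#-?}                 # text after the option letter
                letter=${1%"$rest"}          # e.g. "-f" from "-fc"
                echo "${letter#-}: option-argument '$rest'"; shift ;;
            *)  echo "operand: $1"; shift ;;
        esac
    done
}

parse_optional -fc      # f: option-argument 'c'
parse_optional -f -c    # f: no option-argument, then c: no option-argument
```

Under this rule `-fc' yields an argument that is later rejected as an
invalid number, while `-f -c' reports two separate limits.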

The POSIX `ulimit' description specifies that `newlimit' is an operand, not
an option-argument, so this discussion is academic.

One consequence of the POSIX description is, as I said above, that it
restricts each invocation to modifying one limit. That's how it can finesse
the `newlimit is an operand'. I'm not going to reduce functionality and
throw away backwards compatibility without a better reason.




Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()

2020-11-16 Thread Chet Ramey via austin-group-l at The Open Group

On 11/16/20 11:05 AM, Geoff Clare via austin-group-l at The Open Group wrote:

Chet Ramey wrote, on 16 Nov 2020:



Thanks.  Looks like bash is parsing the ulimit options in an unusual
way instead of using getopt() or similar.


Quite the opposite. The bash ulimit builtin uses the same internal_getopt
code as the rest of the builtins, with the addition that option-arguments
are allowed to be missing and that `-f' is the default in the absence of
any other resource options.

What's happening is that the bash getopt treats `-xARG' identically to
`-x ARG'. When it sees `-fH' it assumes, since -f takes an option argument,
that `H' is that argument.


This is the same thing that POSIX getopt does; of course, it doesn't handle
optional arguments at all.



This is completely different to the syntax documented in the bash manual
at https://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html
which is:

 ulimit [-HSabcdefiklmnpqrstuvxPT] [limit]

To match your description it would need to include:

 ... [-c[limit]] [-f[limit]] ...


Nah. I'd have to do that for every option, and I'd like to keep the manual
under 200 pages. The descriptive text explains things.




The utility syntax guidelines give a passing nod to this situation:

"One or more options without option-arguments, followed by at most one
option that takes an option-argument, should be accepted when grouped
behind one '-' delimiter."


That's needed for all options that take an option-argument, regardless
of whether mandatory or optional.


Sure. It fits this situation exactly, then.

Optional arguments take this out of the realm of getopt() anyway.

But doesn't the description of getopt() in

https://pubs.opengroup.org/onlinepubs/9699919799/functions/getopt.html#tag_16_206

require the bash behavior for options that do take an argument?

"If the option takes an argument, getopt() shall set the variable optarg to
point to the option-argument as follows:

 1.   If the option was the last character in the string pointed to by an
element of argv, then optarg shall contain the next element of argv, and
optind shall be incremented by 2. If the resulting value of optind is
greater than argc, this indicates a missing option-argument, and getopt()
shall return an error indication.

  2.   Otherwise, optarg shall point to the string following the option
character in that element of argv, and optind shall be incremented by 1."




Optional option-arguments are explained in XBD 12.1 item 2:


Yeah, getopt doesn't follow that one.




This is standard GNU getopt behavior.


As an extension to POSIX (because you have to somehow tell it via the
option string that the option takes an optional option-argument, which
is beyond what POSIX specifies for getopt).  However, since it doesn't
implement them the way the standard requires, that does mean that GNU
getopt can't be used for option handling in any of the utilities in
POSIX that are specified as having options with an optional
option-argument.


Maybe not. Neither can POSIX getopt, so we're back to the "or similar" part
of your original message. That doesn't seem to help portability.

This doesn't rise to the level of anything that would inspire me to break
that much backwards compatibility.





Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()

2020-11-16 Thread Chet Ramey via austin-group-l at The Open Group

On 11/16/20 4:45 AM, Geoff Clare via austin-group-l at The Open Group wrote:

Jilles Tjoelker wrote, on 13 Nov 2020:


On Mon, Nov 09, 2020 at 03:07:43PM +, Geoff Clare via austin-group-l
at The Open Group wrote:

The ksh and bash behaviour of reporting multiple values seems more
useful to me, but I wouldn't object if others want to make this
unspecified.


With bash, reporting multiple values does not work if the options are
grouped into a single argument:

% bash -c 'ulimit -fn'
bash: line 0: ulimit: n: invalid number
% bash -c 'ulimit -f -n'
file size   (blocks, -f) unlimited
open files  (-n) 231138

With ksh93, both these commands work as expected.

Similarly, commands like  ulimit -fH  do not work in bash. It must be
-Hf, -H -f or -f -H.


Thanks.  Looks like bash is parsing the ulimit options in an unusual
way instead of using getopt() or similar.


Quite the opposite. The bash ulimit builtin uses the same internal_getopt
code as the rest of the builtins, with the addition that option-arguments
are allowed to be missing and that `-f' is the default in the absence of
any other resource options.

What's happening is that the bash getopt treats `-xARG' identically to
`-x ARG'. When it sees `-fH' it assumes, since -f takes an option argument,
that `H' is that argument. This is standard GNU getopt behavior.

The utility syntax guidelines give a passing nod to this situation:

"One or more options without option-arguments, followed by at most one
option that takes an option-argument, should be accepted when grouped
behind one '-' delimiter."





Re: Status of $'...' addition (was: ksh93 job control behaviour)

2020-07-30 Thread Chet Ramey
On 7/30/20 7:29 PM, Robert Elz wrote:

>   | And for that it would be tremendous if $'' would be defined so
>   | that it can be used as the sole quoting mechanism,
> 
> No thanks.   Partly because $'' is already implemented (widely)
> and used (perhaps slightly less yet) - so that ship has sailed.
> 
> I believe I've seen $" ... " used that way somewhere though (don't
> recall where) and I believe it is a mistake.

None of the existing implementations of $"..." use it in that way.




Re: sh: aliases in command substitutions

2020-04-23 Thread Chet Ramey
On 4/23/20 1:08 PM, Robert Elz wrote:
> Date:        Thu, 23 Apr 2020 11:48:55 -0400
> From:        Chet Ramey
> 
>   | Keep in mind that those tests are mutually incompatible
> 
> I didn't see anything I would call that.

When run as a single file, as presented.
> 
> But:
>   | and will produce
>   | misleading (or at least confusing) results, probably the consequence of
>   | combining a number of individual files into one.
> 
> If run as a single script file, yes, the D,2 test treats the next
> several tests (to the end of F I think it was) as data...  They really
> should be separated and run one at a time.

Indeed. That's why I said what I said.




Re: sh: aliases in command substitutions

2020-04-23 Thread Chet Ramey
On 4/23/20 5:21 AM, Joerg Schilling wrote:
> shwaresyst  wrote:
> 
>> I never said this was expected to be clean, or even easy to do, just that it 
>> is plausible for the feature set desired. What mucks it up is things that 
>> change how lexical elements are expected to be recognized; case conditions 
>> should use something like , with left angles 
>> being optional, to indicate end of pattern, not ')', but these don't become 
>> part of the base PCS until Issue 9.
> 
> If you believe it is possible, you could write such a beast and run it
> against the tests from Sven Mascheck

Keep in mind that those tests are mutually incompatible and will produce
misleading (or at least confusing) results, probably the consequence of
combining a number of individual files into one. They're useful; just don't
treat the results as gospel.




Re: aliases in command substitutions

2020-04-20 Thread Chet Ramey
On 4/20/20 9:34 AM, Robert Elz wrote:

>   | but I've always understood the
>   | case xxx in
>   | (pattern) ...;;
>   | esac
>   |
>   | (fully parenthesized pattern) syntax to have been invented precisely
>   | to allow case statements in $() subshell notation,
> 
> First, $() is command substitution, not a subshell (not really important)
> and if that was someone's intent, they did a particularly bad job of
> implementing it, as what the standard says is (XCU 2.6.3)

He's right, and it happened 30 years ago:

"An optional open-parenthesis before pattern was added to allow numerous
historical KornShell scripts to conform. At one time, using the leading
parenthesis was required if the case statement were to be embedded within a
$( ) command substitution; this is no longer the case with the POSIX shell.
Nevertheless, many existing scripts use the open-parenthesis, if only
because it makes matching-parenthesis searching easier in vi and other
editors. This is a relatively simple implementation change that is fully
upward compatible for all scripts."

This is from 1991, and I'm certain, though I don't have it with me right
now, that the same text appeared in the 1992 version of the standard.
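
As a small illustration of the form the rationale describes, here is a case
statement with fully parenthesized patterns embedded in a command
substitution. The function name classify and the sample words are
hypothetical.

```shell
# The leading "(" keeps the parentheses balanced inside $( ), which some
# older shells needed in order to find the end of the command substitution.
classify() {
    printf '%s\n' "$(case $1 in
        (a*) echo starts-with-a ;;
        (*)  echo other ;;
    esac)"
}

classify apple     # starts-with-a
classify pear      # other
```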





Re: pwd(1) pwd -L and multple adjacent slashes in $PWD,

2020-04-14 Thread Chet Ramey
On 4/14/20 10:05 AM, casper@oracle.com wrote:
> 
>> On 4/14/20 9:44 AM, casper@oracle.com wrote:
>>> pwd has the -L option:
>>>
>>> The following options shall be supported by the implementation:
>>>
>>> -L
>>> If the PWD environment variable contains an absolute pathname
>>> of the current directory and the pathname does not contain any
>>> components that are dot or dot-dot, pwd shall write this
>>> pathname to standard output, except that if the PWD environment
>>> variable is longer than {PATH_MAX} bytes including the
>>> terminating null, it is unspecified whether pwd writes this
>>> pathname to standard output or behaves as if the -P option had
>>> been specified. Otherwise, the -L option shall behave as the -P
>>> option.
>>>
>>>
>>> It mentions "dot-dot" and "dot".
>>>
>>> It does seem to allow:
>>>
>>> (cd /; PWD=// pwd -L)
>>> //
>>> and
>>> (cd /home/casper; PWD=/home///casper  pwd -L)
>>> /home///casper
>>>
>>>
>>> Is this a correct implementation?
>>
>> Does the standard cover this at all? It only mentions PWD being set by `cd'
>> and initialized by `sh'. If you assign it directly, at least `cd' is
>> explicitly unspecified, and since `pwd' is only required to "remove
>> unnecessary slash characters" if -P is supplied, I'd say you've left the
>> realm of the standard and the implementation can do what it likes.
> 
> 
> So you are saying that it would be fine to squish out the additional 
> slashes in the output?  (Not doing anything would be fine, too)

Yes. It's unspecified.





Re: pwd(1) pwd -L and multple adjacent slashes in $PWD,

2020-04-14 Thread Chet Ramey
On 4/14/20 9:44 AM, casper@oracle.com wrote:
> pwd has the -L option:
> 
> The following options shall be supported by the implementation:
> 
> -L
>   If the PWD environment variable contains an absolute pathname
>   of the current directory and the pathname does not contain any
>   components that are dot or dot-dot, pwd shall write this
>   pathname to standard output, except that if the PWD environment
>   variable is longer than {PATH_MAX} bytes including the
>   terminating null, it is unspecified whether pwd writes this
>   pathname to standard output or behaves as if the -P option had
>   been specified. Otherwise, the -L option shall behave as the -P
>   option.
> 
> 
> It mentions "dot-dot" and "dot".
> 
> It does seem to allow:
> 
>   (cd /; PWD=// pwd -L)
>   //
> and
>   (cd /home/casper; PWD=/home///casper  pwd -L)
>   /home///casper
> 
> 
> Is this a correct implementation?

Does the standard cover this at all? It only mentions PWD being set by `cd'
and initialized by `sh'. If you assign it directly, at least `cd' is
explicitly unspecified, and since `pwd' is only required to "remove
unnecessary slash characters" if -P is supplied, I'd say you've left the
realm of the standard and the implementation can do what it likes.
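
A quick probe of what a given shell does here might look like the sketch
below. It deliberately assumes nothing about the -L output, since that is
the part outside the standard; the variable names are illustrative.

```shell
# What -L prints for a hand-assigned PWD is up to the implementation, so no
# particular output is assumed; -P is the form with specified slash cleanup.
logical=$(cd / && PWD=// pwd -L)
physical=$(cd / && pwd -P)

printf 'pwd -L with PWD=// printed: %s\n' "$logical"
printf 'pwd -P printed:             %s\n' "$physical"
```

Comparing the two lines shows whether the implementation passes the extra
slash through or squeezes it out.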





Re: XCU: 'return' from subshell

2020-03-13 Thread Chet Ramey
On 3/13/20 10:14 AM, Harald van Dijk wrote:

>>> Can this instead say "in the same shell execution environment as the
>>> compound-list of the compound-command of the function definition", so that
>>>
>>>    f() (return 1)
>>>
>>> which is fairly sensible and works in all shells[*] remains well-defined,
>>> but only something along the lines of f() { (return 1) } or
>>> f() ( (return 1) ) becomes unspecified?
>>
>> We should be able to do better than that. I don't see why "if not executing
>> in the same shell execution environment as the compound-list ..." can't
>> cover the f() { (return 1) } case as well, and seems to work in all shells.
> 
> I don't see how you can allow that without also allowing
> 
>   f() { (return 7; echo no); echo $?; }; f
> 
> If that also works in all shells (meaning it doesn't print no, and does
> print 7), then by all means standardise it.

I can't find one that doesn't in my quick initial testing, but I don't
have binaries for every shell under the sun.
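
For anyone who wants to repeat the check in another shell, a slightly more
defensive variant of the test is sketched below. Capturing the status with
`||' is an addition made here so the script also survives `set -e'; the
thread's original one-liner reads $? directly.

```shell
# return inside the ( ... ) subshell ends only that subshell; its status
# becomes the subshell's exit status and the function body carries on.
f() {
    status=0
    (return 7; echo no) || status=$?
    echo "$status"
}

# Here the function body itself is the subshell.
g() (return 1)

f                            # prints 7, and never "no", in the shells tried
g || echo "g returned $?"    # prints "g returned 1"
```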




Re: XCU: 'return' from subshell

2020-03-13 Thread Chet Ramey
On 3/12/20 4:21 PM, Harald van Dijk wrote:
> On 11/03/2020 17:44, Don Cragun wrote:
>> Would this issue be resolved if we change the last sentence of the
>> description section of the return Special Built-In Utility from:
>>  If the shell is not currently executing a function
>>  or dot script, the results are unspecified.
>> to:
>>  If the shell is not currently executing a function
>>  or dot script running in the same shell execution
>>  environment as the command that invoked the function
>>  or dot script, the results are unspecified.
>> ?
> 
> Can this instead say "in the same shell execution environment as the
> compound-list of the compound-command of the function definition", so that
> 
>   f() (return 1)
> 
> which is fairly sensible and works in all shells[*] remains well-defined,
> but only something along the lines of f() { (return 1) } or
> f() ( (return 1) ) becomes unspecified?

We should be able to do better than that. I don't see why "if not executing
in the same shell execution environment as the compound-list ..." can't
cover the f() { (return 1) } case as well, and seems to work in all shells.




Re: XCU: 'return' from subshell

2020-03-12 Thread Chet Ramey
On 3/11/20 5:24 PM, Dirk Fieldhouse wrote:

> Even with this wording, it isn't clear that there is "the function or
> dot script, if any" (ie just one, or none) without first applying the
> restriction to the same execution environment, depending on whether you
> think that asynchronous commands in a function definition are counted in
> the "current function", so this perhaps would be better:
> 
> The return utility shall cause the shell to stop executing the
> current function or dot script, if any, that is being executed
>  using the same shell execution environment (see 2.12) as the
> command that invoked the function or dot script. Otherwise the
> results are unspecified.

I think "executed in the same shell execution environment" is better
wording, since it parallels usage elsewhere in the standard, such as

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01

steps 1 and 2.




Re: XCU: 'return' from subshell

2020-03-12 Thread Chet Ramey
On 3/11/20 1:32 PM, Don Cragun wrote:
> Would this issue be resolved if we change the last sentence of the 
> description section of the return Special Built-In Utility from:
>   If the shell is not currently executing a function
>   or dot script, the results are unspecified.
> to:
>   If the shell is not currently executing a function
>   or dot script running in the same shell execution
>   environment as the command that invoked the function
>   or dot script, the results are unspecified.

I think this is heading in the right direction.




Re: XCU: 'return' from subshell

2020-03-12 Thread Chet Ramey
On 3/11/20 1:07 PM, Dirk Fieldhouse wrote:
> On 11/03/20 15:23, Chet Ramey wrote:
>> ...>
>> What does a `return from the execution environment' mean, exactly? ...
> 
> To clarify, what I wrote was shorthand for "return from the function if
> the 'return' is executed in the same execution environment as" the
> function's defining command, or otherwise (ii) exit or (iii) unspecified
> behaviour.

I don't think `defining command' is right. It's where the function is
executed that is the issue. So different execution environments is the
way to proceed, but using something like caller instead of defining
command.





Re: XCU: 'return' from subshell

2020-03-12 Thread Chet Ramey
On 3/11/20 12:15 PM, Dirk Fieldhouse wrote:
>
>> All shells I am aware of print foo and bar
> 
> The discussion seems to have confirmed that this is the general existing
> practice, and not just in the few cases I tested, but IMO only a shell
> implementer could see the suggested behaviour of these examples as
> baffling, based on the wording of the standard (not to mention "man sh",
> etc, so I won't). 

If it's the wording that implies possible behavior that no shell
implements, let's fix the wording.
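
The existing practice in question is easy to reproduce. This sketch reuses
the f1 example from earlier in the thread; the comments describe what the
shells mentioned in the discussion do, not what the current wording of the
standard requires.

```shell
f1() {
    ( echo foo; return )    # return ends only the subshell
    echo bar                # the parent shell still runs this
}

f1    # prints foo, then bar
```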

> The question is what, if any, rewording of the standard should be made.
> There are plenty of choices for better designed scripting languages, so
> arguably making the specification agree with existing practice would be
> an acceptable resolution. The example of DR 842 for 'break' and
> 'continue' shows that this should not be seen as an unnecessary change.

We can use 842 as a model for the changes. Someone needs to propose new
wording that's comprehensive enough to cover the different cases.





Re: XCU: 'return' from subshell

2020-03-12 Thread Chet Ramey
On 3/12/20 6:02 AM, Joerg Schilling wrote:
> Chet Ramey  wrote:
> 
>> I use Mac OS X. I test on Linux.
>>
>> By far, the biggest difference between older Mac OS X/Linux and current Mac
>> OS X is using lldb instead of gdb for debugging.
> 
> OK, I develop on Solaris and prefer dbx to gdb et al., but it seems 
> to be a pity that the Solaris compilers do not get updates for OpenSolaris 
> anymore.
> 
> I test on various platforms and as a result, I recently discovered that 
> waitid() on Mac OS is still not usable even though there is a POSIX 
> certification. The still remaining problem is that it always returns a signal 
> number of 0 if the child has been killed by a signal. So for a portable 
> program like bash, it seems that OSX and Linux are not sufficient.

It might be, if bash used waitid().




Re: XCU: 'return' from subshell

2020-03-11 Thread Chet Ramey
On 3/11/20 11:52 AM, Joerg Schilling wrote:
> Chet Ramey  wrote:
> 
>> On 3/11/20 11:46 AM, Joerg Schilling wrote:
>>
>>> Since you most likely develop on Linux
>>
>> I don't; don't make assumptions.
> 
> Interesting, where do you develop?

I use Mac OS X. I test on Linux.

By far, the biggest difference between older Mac OS X/Linux and current Mac
OS X is using lldb instead of gdb for debugging.




Re: XCU: 'return' from subshell

2020-03-11 Thread Chet Ramey
On 3/11/20 11:46 AM, Joerg Schilling wrote:

> Since you most likely develop on Linux

I don't; don't make assumptions.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: XCU: 'return' from subshell

2020-03-11 Thread Chet Ramey
On 3/11/20 11:30 AM, Joerg Schilling wrote:
> Chet Ramey  wrote:
> 
>>> and that "foo" and not
>>> "bar" should be printed in each case:
>>>
>>> f1() {
>>>   ( echo foo; return )
>>>   echo bar
>>> }
>>
>> This implies some interprocess communication between the parent and child
>> that simply doesn't exist, and nothing in the standard indicates that it
>> should.
> 
> I don't see that we should do this, but if you would like to be able to
> reliably get a
> 
>   ``NOEXEC'' or ``NOTFOUND''
> 
> from expanding "$/", there is a need for interprocess communication unless
> you use vfork() for that specific command.

What is "$/"? Nobody, with perhaps the exception of bosh, implements that.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: XCU: 'return' from subshell

2020-03-11 Thread Chet Ramey
On 3/11/20 11:13 AM, Stephane Chazelas wrote:
> 2020-03-11 09:55:57 -0400, Chet Ramey:
>> On 3/11/20 5:43 AM, Stephane Chazelas wrote:
>>
>>> AFAIK, bash and bosh are the only shells that complain when you
>>> use return outside of functions/sourced scripts (bash also
>>> doesn't exit upon that failing "return" special builtin in that
>>> case which could be seen as a conformance bug). 
>>
>> You really should try posix mode.
> [...]
> 
> As it happens, I did in that case, and I found it behaved the
> same in or outside of POSIX mode:
> 
> $ bash -c 'return; echo "$?"'
> bash: line 0: return: can only `return' from a function or sourced script
> 1
> $ bash -o posix -c 'return; echo "$?"'
> bash: line 0: return: can only `return' from a function or sourced script

This wasn't covered in my previous message; it was changed in January 2019.

$ ./bash -o posix -c 'return ; echo after'
./bash: line 0: return: can only `return' from a function or sourced script

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: XCU: 'return' from subshell

2020-03-11 Thread Chet Ramey
On 3/11/20 11:13 AM, Stephane Chazelas wrote:
> 2020-03-11 09:55:57 -0400, Chet Ramey:
>> On 3/11/20 5:43 AM, Stephane Chazelas wrote:
>>
>>> AFAIK, bash and bosh are the only shells that complain when you
>>> use return outside of functions/sourced scripts (bash also
>>> doesn't exit upon that failing "return" special builtin in that
>>> case which could be seen as a conformance bug). 
>>
>> You really should try posix mode.
> [...]
> 
> As it happens, I did in that case, and I found it behaved the
> same in or outside of POSIX mode:

$ ./bash ./x5
./x5: line 1: return: can only `return' from a function or sourced script
after
$ ./bash -o posix ./x5
./x5: line 1: return: can only `return' from a function or sourced script
$ cat x5
return 7
echo after
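
A rough way to repeat this comparison against other shells is sketched below. `probe_return` is a hypothetical helper name, not something any shell provides. It only reports whether `echo after` still runs following a top-level `return`, and the answer is expected to vary with the shell, its version, and its mode.

```shell
# probe_return is a hypothetical helper, not a standard utility.
# Usage: probe_return SHELL [SHELL-OPTIONS...]
probe_return() {
    # Run a top-level `return` followed by an echo; discard any diagnostic.
    out=$("$@" -c 'return 7; echo after' 2>/dev/null) || :
    case $out in
        after) echo "continues after return" ;;
        *)     echo "stops at return" ;;
    esac
}

probe_return sh
```

Passing extra words selects a mode, for example `probe_return bash -o posix`, assuming that shell is installed.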



-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: XCU: 'return' from subshell

2020-03-11 Thread Chet Ramey
On 3/11/20 10:49 AM, Dirk Fieldhouse wrote:
> On 11/03/20 14:03, Chet Ramey wrote:
>>...>
>> So what's the goal here? That the function continue execution in the
>> subshell so `return' has consistent, if baffling, semantics? That we
>> tighten up the language to make the unspecified specific? What is this
>> discussion intended to accomplish?
> 
> I refer you to this excerpt:
> 
>> On 3/11/20 9:12 AM, Dirk Fieldhouse wrote:>...>
>>> a)    the wording of the standard about 'return' doesn't say this (or
>>> as you said,
>>>> what [I] believe it appears to say, and what it
>>>> actually means, are probably not the same thing.
>>> which is not a good look for a standard);
>>...>
> 
> If the standard can easily be misinterpreted, it ought to be reworded.

So the latter, then. We can move on to proposing language.

> The interesting discussion prompted by my original post indicates that
> even experts don't agree on the interpretation of the text that
> specifies 'return'.
> 
> Did the wise authors of that text mean 'return' to cause:
> 
> i)    a return from the function's lexical scope, subject to some missing
> definition of that scope, or

Yes.

> 
> ii)    a return from the execution environment of the function's defining
> command, or otherwise like 'exit', or

In the case of a subshell or other separate execution environment, the
`exit' seems the most reasonable action. It would be far worse if the
function continued execution in a subshell.

> 
> iii)    a return from the execution environment of the function's defining
> command, or otherwise unspecified?
> 
> Perhaps the answer is (i) but owing to existing practice the standard
> should say (ii) or (iii).

What does a `return from the execution environment' mean, exactly? Does
it mean that the calling shell should exit somehow? Since functions are
executed in the same execution environment as the caller, and subshells
are created as necessary as part of the function body execution, does
the `defining command' mean the caller, or something else?

> 
> As to "baffling semantics", I suggest that these are two examples where
> 'return' is meaningful (and far from baffling) 

I assert that having the function (and the rest of any script) continue
to execute in a subshell because a `return' appeared in a subshell would
be baffling and difficult to explain to users.


> and that "foo" and not
> "bar" should be printed in each case:
> 
> f1() {
>   ( echo foo; return )
>   echo bar
> }

This implies some interprocess communication between the parent and child
that simply doesn't exist, and nothing in the standard indicates that it
should.

> 
> and
> 
> f2() {
>   echo foo |
>     if read -r xx && [ "$xx" = foo ]; then
>    echo "$xx"; return
>     else
>    echo "$xx"
>     fi
>   echo bar
> }

This is unspecified, and has been ever since ksh decided to run the last
pipeline element in the current shell process (or execution environment,
if you prefer). You can't rely on either behavior. The number of shells
that print `bar' exceeds the number that don't, for what that's worth.
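
For anyone who wants to see what a particular shell does, the two quoted functions can be wrapped in a small script. This is only a sketch: the variable names `f1_out` and `f2_out` are made up here, and the result for `f2` is exactly the unspecified part, so no particular output should be relied on for it. For `f1`, the `return` only ends the subshell, so `bar` is expected to follow `foo`.

```shell
f1() {
    ( echo foo; return )   # `return` inside an explicit subshell
    echo bar
}

f2() {
    echo foo |
        if read -r xx && [ "$xx" = foo ]; then
            echo "$xx"; return   # may or may not run in a subshell
        else
            echo "$xx"
        fi
    echo bar
}

# Join each function's output onto one line for easy comparison.
f1_out=$(f1 | tr '\n' ' ')
f2_out=$(f2 | tr '\n' ' ')
printf 'f1: %s\nf2: %s\n' "$f1_out" "$f2_out"
```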


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: XCU: 'return' from subshell

2020-03-11 Thread Chet Ramey
On 3/11/20 9:12 AM, Dirk Fieldhouse wrote:
> On 11/03/20 06:25, Robert Elz wrote:
>>
>> ... The standard by [referring to 'subshell environment'] is trying ...
>> to avoid constraining the implementation, things that an implementation
>> can work out how to do without forking it can do (which will make it
>> faster, and less expensive to run) - but it must preserve the fiction
>> that it has forked, as other shells do, and scripts are allowed to
>> rely upon that, so no side effects (such as a return in a subshell
>> causing a function in the parent to return) are permitted.
> 
> Absolutely, but
> 
> a)    the wording of the standard about 'return' doesn't say this (or as
> you said,
>> what [I] believe it appears to say, and what it
>> actually means, are probably not the same thing.
> which is not a good look for a standard);
> 
> b)    in particular, returning from a subshell is not one of the forbidden
> side-effects that is mentioned in or can be inferred from the text of 2.12.
> 
> Isn't returning on use of 'return' an effect (ie, the actual behaviour
> expected by the script author) rather than a side-effect?

So what's the goal here? That the function continue execution in the
subshell so `return' has consistent, if baffling, semantics? That we
tighten up the language to make the unspecified specific? What is this
discussion intended to accomplish?

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: XCU: 'return' from subshell

2020-03-11 Thread Chet Ramey
On 3/11/20 5:43 AM, Stephane Chazelas wrote:

> AFAIK, bash and bosh are the only shells that complain when you
> use return outside of functions/sourced scripts (bash also
> doesn't exit upon that failing "return" special builtin in that
> case which could be seen as a conformance bug). 

You really should try posix mode.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: [1003.1(2004)/Issue 6 0000267]: time (keyword)

2020-02-18 Thread Chet Ramey
On 2/18/20 12:00 PM, shwaresyst wrote:
> 
> Don't see why, lparen, like "=", is not a char that stops collection of a 
> token body

I'm not exactly sure what a `token body' is supposed to be, but the left
paren does delimit a token. A `=' does not.
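
The difference can be checked with a short sketch. `count_words` is a hypothetical helper introduced only for this illustration; the second half assumes a shell such as bash or dash, where `a(b` in argument position is expected to be a syntax error because the unquoted `(` ends the word `a`.

```shell
# count_words is a hypothetical helper: it prints how many arguments it got.
count_words() { echo "$#"; }

# `=` is an ordinary character inside a word, so a=b stays one token.
count=$(count_words a=b)
echo "words from a=b: $count"

# An unquoted `(` can start an operator, so it delimits the current token.
# eval runs in a subshell so a syntax error cannot end this script.
if (eval 'count_words a(b') >/dev/null 2>&1; then
    paren="accepted"
else
    paren="rejected"
fi
echo "a(b as one word: $paren"
```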

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: [1003.1(2004)/Issue 6 0000267]: time (keyword)

2020-02-18 Thread Chet Ramey
On 2/18/20 10:36 AM, Robert Elz wrote:
> Date:        Tue, 18 Feb 2020 09:58:32 -0500
> From:    Chet Ramey 
> Message-ID:  <22f16ef4-41cf-60b0-5968-f608dc988...@case.edu>
> 
>   | The 1992 version of the standard knew about time, standardized it as part
>   | of the UPE, and acknowledged that it worded the definition to allow the
>   | ksh88 reserved word implementation.
> 
> Well, kind of - they made it unspecified what
> 
>   time a | b 
> 
> timed ("a" or "a|b")

Yes. "The times reported are unspecified."

> but ksh (all versions I believe), bash and bosh
> all allow
>   time(sleep 1)
> which nothing in the standard explains (or allows).  

"If the current character is not quoted and can be used as the first
character of a new operator, the current token (if any) shall be
delimited."

> Similarly
>   time { sleep 1; }
> and
>   time if true; then sleep 1; else sleep 2; fi

They're both pipelines according to the shell grammar.

> And in a subsequent message, chet.ra...@case.edu said:
>   | There are people on the bash mailing lists who would like a word. 
> 
> There are some very strange people on those lists!

A couple of representative examples.

https://lists.gnu.org/archive/html/help-bash/2018-12/msg00110.html
https://lists.gnu.org/archive/html/help-bash/2018-12/msg00092.html


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: [1003.1(2004)/Issue 6 0000267]: time (keyword)

2020-02-18 Thread Chet Ramey
On 2/18/20 9:59 AM, Robert Elz wrote:

> Why someone would want to time a builtin I'm not sure
> (with the possible exception of elapsed time of wait)

There are people on the bash mailing lists who would like a word.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: [1003.1(2004)/Issue 6 0000267]: time (keyword)

2020-02-18 Thread Chet Ramey
On 2/18/20 5:31 AM, Joerg Schilling wrote:

> All shells I am aware of detect the -p option in the parser already and
> switch to the time utility instead of the time keyword.

Bash doesn't do that; it's not useful and difficult to do using a bison-
based parser.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/


