Re: sh 'continue' shenanigans: negating
On 2/14/24 6:40 PM, Christoph Anton Mitterer wrote:
> On Wed, 2024-02-14 at 09:18 -0500, Chet Ramey via austin-group-l at The
> Open Group wrote:
>> POSIX requires this, since it says that return sets $? to 1 here.
>
> I assume you mean the description of the exit status from `return`?

No, I mean POSIX specifies that `return' sets the value of $? directly.
It's not set from the (possibly inverted) return status from `return':

    "The value of the special parameter '?' shall be set to n, an
    unsigned decimal integer, or to the exit status of the last command
    executed if n is not specified."

> If so, then IMO strictly speaking, it doesn't say whose $? shall be set
> that way.

That doesn't make sense as written unless you're using $? as a shorthand
for a command's return status.

> But is there anything that prevents one from interpreting it as the $?
> of the `return` itself? Similar as you said above the `continue` has
> one as its a built-in?

The difference in the descriptions is that the `return' description talks
about setting $?: "The value of the special parameter '?' shall be set to
n," where the `continue' description has a generic description of its
return status.

There was a discussion about this -- at least the language -- in
https://www.austingroupbugs.net/view.php?id=1309 resulting in changes to
the description for the next edition.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: sh 'continue' shenanigans: negating
On 2/14/24 12:06 AM, Oğuz wrote:
> On Tuesday, February 13, 2024, Chet Ramey via austin-group-l at The Open
> Group <austin-group-l@opengroup.org> wrote:
>> `continue' is a builtin; continue has a return status; `!' says to
>> negate it. It seems easy to come to the conclusion that the script
>> should return 1.
>
> The same can be said about `return'. But bash disagrees:
>
>     $ bash -c 'f(){ ! return 1;}; f; echo $?'
>     1
>     $
>
> Does POSIX allow this or is it another case where bash diverges from
> POSIX?

POSIX requires this, since it says that return sets $? to 1 here.

If you find cases where you believe bash differs from POSIX, and it's not
documented in POSIX mode, please report them.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
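[Editorial note] The divergence under discussion is easy to reproduce. A minimal sketch, assuming a bash binary is on PATH (other shells may print 0 here, which is exactly the point of the thread):

```shell
# In bash, `return n' sets $? to n directly; the `!' in front of it
# does not get to invert the function's status.
bash -c 'f() { ! return 1; }; f; echo "negated return: $?"'
# bash prints: negated return: 1

# Compare a plain command, where `!' does invert the status and the
# function returns the status of its last command:
bash -c 'f() { ! false; }; f; echo "negated false: $?"'
# bash prints: negated false: 0
```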
Re: sh 'continue' shenanigans: negating
On 2/13/24 2:48 PM, Thorsten Glaser via austin-group-l at The Open Group wrote:
> Hi,
>
> I’ve got the following issue, and… yes I can see how the reporter could
> come to the conclusion that it should “return” 1, but…
>
> … at what point does “continue” “return”? Where do I stop operating?

`continue' is a builtin; continue has a return status; `!' says to negate
it. It seems easy to come to the conclusion that the script should
return 1.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: sh: set -o pipefail by default
On 1/11/24 3:53 AM, Andrew Pennebaker via austin-group-l at The Open Group wrote:
> With sh gaining set -o pipefail, I am curious about having sh require
> (or encourage) enabling this option by default. It would help to catch
> a lot of false negatives in deceptively simple scripts.

No. This would break a large body of existing scripts, and it's not the
purpose of the standard. Any implementation can choose to default it to
enabled, of course.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
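[Editorial note] For readers unfamiliar with the option, a short sketch of what pipefail changes (shown with bash, since availability of the option in a given sh is exactly what the new standard text addresses):

```shell
# By default, a pipeline's exit status is that of its last command,
# so a failure early in the pipeline is silently discarded.
false | true
echo "default:  $?"    # prints 0

# With pipefail, the pipeline fails if any element fails.
set -o pipefail
false | true
echo "pipefail: $?"    # prints 1
set +o pipefail
```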
Re: Request: Standard hashmaps in sh
On 12/27/23 11:26 AM, Andrew Pennebaker via austin-group-l at The Open Group wrote:
> Many programs depend on hashmaps in order to work. awk is not an answer.
> The lack of hashmaps forces people to use less efficient algorithms,
> such as linear search.
>
> The bash family implements it. Simply acknowledging bash associative
> array syntax would instantly improve the scalability of sh scripts.

That's not the intent of the standard. The standard is supposed to give
users an idea about what they can rely on for portable scripts (and, to a
lesser extent, interactive use). While bash and ksh93 implement
associative arrays, that's not enough for a standard.

You could write something up and request that it be included -- that's how
the $'...' quoting form eventually got in -- but I'm kind of skeptical
that it would make it. It's a big change to the shell syntax.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
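[Editorial note] For comparison, this is the bash syntax the poster refers to; it is a bash/ksh93 extension (bash 4.0+), not POSIX sh. The `case`-based lookup below is one hypothetical portable fallback:

```shell
# bash associative array: keyed lookup (not POSIX)
declare -A capital
capital[France]=Paris
capital[Japan]=Tokyo
echo "${capital[Japan]}"        # Tokyo

# The closest portable idiom is a case statement (linear, but POSIX):
lookup() {
    case $1 in
        France) echo Paris ;;
        Japan)  echo Tokyo ;;
    esac
}
lookup Japan                    # Tokyo
```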
Re: bug#65659: RFC: changing printf(1) behavior on %b
On 9/3/23 2:36 AM, Stephane Chazelas via austin-group-l at The Open Group wrote:
> And except with yash's printf (among the few printf's I've tested):
>
>     $ LC_ALL=zh_TW luit
>     $ locale title charmap
>     Chinese locale for Taiwan R.O.C.
>     BIG5
>     $ echo() { printf '%b ' "$@"\\n\\c; }
>     $ echo 'α'
>     αn%
>
> (the trailing %'s above indicating the absence of newline character).

Presumably because yash uses wide character strings internally and doesn't
rely on the printf(3) engine to output characters. On the other hand, I
suspect it will refuse to print byte strings that do not form valid wide
characters (I haven't tested this in particular, but yash fails on invalid
wide character strings in other expansions).

> α (Greek lowercase alpha, U+03B1) being one of the several characters
> whose encoding ends in byte 0x5c (the encoding of backslash) in BIG5
> (there are even more in GB18030, but I like BIG5's α as an example as
> that's the alphabetic character par excellence; BIG5HKSCS (Hong Kong
> variant) also has characters from the Latin and Cyrillic alphabets in
> that situation).

It depends on how much of the output you want to leave to printf(3) and
the other stdio functions. If printf(1) assembles format strings and
arguments into something it passes to printf(3), and that processes them
as bytes, the game is over.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: bug#65659: RFC: changing printf(1) behavior on %b
On 9/3/23 4:22 PM, Robert Elz via austin-group-l at The Open Group wrote:
> Date: Sun, 3 Sep 2023 07:36:59 +0100
> From: Stephane Chazelas
> Message-ID: <20230903063659.mzyfen4evyrnz...@chazelas.org>
>
> | though has the same limitation as my bash echo -e "$*\n\c"
>
> Yes, I know, though as nothing anywhere says what echo is supposed to
> do with a lone trailing \ (or in fact, a \ that is not followed by one
> of the defined escape sequences), I treat that as unspecified,

It's not specified, rather than being explicitly unspecified, so anything
goes. Bash just outputs the `\' in this case.

> | $ LC_ALL=zh_TW luit
> | $ locale title charmap
> | Chinese locale for Taiwan R.O.C.
> | BIG5
> | $ echo() { printf '%b ' "$@"\\n\\c; }
> | $ echo 'α'
> | αn%
>
> That one is a different issue, and seems to me to be a simple
> implementation bug (and no, I am not claiming that NetBSD wouldn't act
> just like that) - characters ought to be fully formed before testing
> their values.

I suspect this is the result of printf's history as a byte-oriented
utility; everyone still treats the format string as a sequence of bytes.
It's probably rare enough for an encoded character to contain a backslash
that no one has changed it yet.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: [Issue 8 drafts 0001771]: support or reserve %q as printf-utility format specifier
On 9/4/23 3:58 AM, Geoff Clare via austin-group-l at The Open Group wrote:
>>> | Issue 9 will have an inconsistency between the printf() function
>>> | and the printf utility.
>>
>> Yes. And exactly why is that a problem?
>
> I think everyone in the teleconference just assumed that the
> inconsistency is best avoided. I don't recall living with it being
> discussed as an option.
>
> I don't agree that consistency is the primary requirement here.
> However, from the feedback it seems that enough people think "the cure
> is worse than the disease" on this, and we should indeed consider
> living with the inconsistency as another option.

It just seems like a lot of backwards compatibility and POSIX guidance to
throw away for little gain. POSIX has included %b -- and recommended its
use -- for over 30 years. My guess is that thousands of scripts use it.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: RFC: changing printf(1) behavior on %b
On 8/31/23 11:35 AM, Eric Blake wrote:
> In today's Austin Group call, we discussed the fact that printf(1) has
> mandated behavior for %b (escape sequence processing similar to XSI
> echo) that will eventually conflict with C2x's desire to introduce %b
> to printf(3) (to produce 0b000... binary literals).
>
> For POSIX Issue 8, we plan to mark the current semantics of %b in
> printf(1) as obsolescent (it would continue to work, because Issue 8
> targets C17 where there is no conflict with C2x), but with a Future
> Directions note that for Issue 9, we could remove %b entirely, or (more
> likely) make %b output binary literals just like C.

I doubt I'd ever remove %b, even in posix mode -- it's already been there
for 25 years.

> But that raises the question of whether the escape-sequence processing
> semantics of %b should still remain available under the standard, under
> some other spelling, since relying on XSI echo is still not portable.
>
> One of the observations made in the meeting was that currently, both
> the POSIX spec for printf(1) as seen at [1], and the POSIX and C
> standard (including the upcoming C2x standard) for printf(3) as seen at
> [3] state that both the ' and # flag modifiers are currently undefined
> when applied to %s.

Neither one is a very good choice, but `#' is the better one. It at least
has a passing resemblance to the desired functionality.

Why not standardize another character, like %B? I suppose I'll have to
look at the etherpad for the discussion. I think that came up on the
mailing list, but I can't remember the details.

> Is there any interest in a patch to coreutils or bash that would add
> such a synonym, to make it easier to leave that functionality in place
> for POSIX Issue 9 even when %b is repurposed to align with C2x?

It's maybe a two or three line change at most.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
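[Editorial note] The %b at issue is the printf(1) conversion that expands echo-style escapes found in its *argument* (as opposed to the C-style escapes in the format string itself); a quick illustration:

```shell
# %s passes backslash sequences in the argument through literally;
# %b processes them like XSI echo.
printf '%s\n' 'a\tb'    # prints: a\tb
printf '%b\n' 'a\tb'    # prints: a<TAB>b

# This is why portable scripts are told to use printf '%b' rather
# than rely on the non-portable behavior of echo -e.
```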
Re: $? behaviour after comsub in same command
On 4/10/23 7:02 PM, Robert Elz wrote:
> Date: Mon, 10 Apr 2023 10:30:08 -0400
> From: Chet Ramey
> Message-ID: <78038281-f431-775e-6d60-a44126d1d...@case.edu>
>
> | The different semantics are that the standard specifies the status of
> | the simple command in terms of the command substitution that's part
> | of the assignment statement, so you have to hang onto it for a while.
>
> I suspect that's because you are treating the assignments (more or
> less) as statements of their own, and expanding and then assigning
> each, one by one, left to right as you encounter them.

No, the sentence means exactly what it says. The difference between his
example and mine is that in my example the shell has to remember the
return status of the last command substitution until the command
completes.

> then there is no issue, and no real need to "hang onto it for a while".

Of course you do; the standard says it's the return status of the command.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: $? behaviour after comsub in same command
On 4/6/23 5:59 PM, Robert Elz wrote:
> Date: Wed, 5 Apr 2023 10:35:58 -0400
> From: "Chet Ramey via austin-group-l at The Open Group"
> Message-ID:
>
> | A variant with slightly different semantics:
> |
> |     (exit 8)
> |     a=4 b=$(exit 42) c=$?
> |     echo status:$? c=$c
> |
> | The standard is clear about what $? should be for the echo, but
> | should it be set from the command substitution for the assignment
> | to c?
>
> It isn't really different semantics, it is the same thing.

The different semantics are that the standard specifies the status of the
simple command in terms of the command substitution that's part of the
assignment statement, so you have to hang onto it for a while.

We have this identical discussion every couple of years. At least the
last time produced interp 1150, which -- in true standards fashion --
attempted to clarify the issue with additional obscure language.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: $? behaviour after comsub in same command
On 4/6/23 5:43 PM, Robert Elz wrote:
> Hence we got that absurd PATH search rule for builtins, that no shell
> of the time did anything like, "because a user might want to override a
> builtin with a version in their own bin directory, earlier in PATH than
> where the standard version of the command exists",

Yes, it's ludicrous and ahistorical, but you still had people arguing in
favor of it as recently as a few years ago. There were proposals to
extend (abuse?) env, exec, and command to accomplish the task of
temporarily overriding a builtin, but those place the burden on the
script author rather than enable the user to do it.

The bash `enable' builtin is always used as the canonical example, since
there's printer (I think) software that uses it as a command name.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: $? behaviour after comsub in same command
On 4/6/23 1:55 PM, Harald van Dijk wrote:
> One additional data point: in schily-2021-09-18, Jörg's last release,
> obosh, the legacy non-POSIX shell that is just there for existing
> scripts and for portability testing, prints 0 (using `` rather than
> $()), whereas pbosh and sh, the minimal and extended POSIX versions of
> the shell, print 1.
>
> This does provide extra support for the view that this was a change
> that POSIX demanded, that the deviation from historical practice was
> intentional, but does not answer what the reasoning might have been.

I doubt it was `demanded'; the bosh change immediately followed an
austin-group discussion (we both participated) about this exact issue.
Maybe he thought it was the right thing based on that discussion. As
part of the discussion, he wrote:

> The important thing to know here is that the Bourne Shell has some
> checkpoints that update the intermediate value of $?. Since that
> changed in ksh88 and since POSIX requires a different behavior compared
> to the Bourne Shell, I modified one checkpoint in bosh to let it match
> POSIX.

so he had already been modifying that behavior before 2021, maybe after
interp 1150.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: $? behaviour after comsub in same command
On 4/5/23 12:36 PM, Harald van Dijk wrote:
> On 05/04/2023 15:35, Chet Ramey via austin-group-l at The Open Group wrote:
>> On 4/5/23 9:06 AM, Martijn Dekker via austin-group-l at The Open Group wrote:
>>> Consider:
>>>
>>>     false || echo $(true) $?
>>>
>>> dash, mksh and yash print 1. bash, ksh93 and zsh print 0. Which is
>>> right?
>
> I believe dash, mksh, yash are already right based on the current
> wording of the standard. As Martijn wrote, the rule is that $? "Expands
> to the decimal exit status of the most recent pipeline", the most
> recent pipeline in the shell environment in which $? is evaluated is
> "false", and changes in the subshell environment shall not affect the
> parent shell environment, including changes in the subshell environment
> to $?.

That's certainly one interpretation, and may indeed be what the 1992
authors intended. My question is why they would choose something other
than what the so-called reference implementations (SVR4 sh, ksh88) did.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: $? behaviour after comsub in same command
On 4/5/23 11:25 AM, Oğuz wrote:
> On Wednesday, April 5, 2023, Chet Ramey via austin-group-l at The Open
> Group <austin-group-l@opengroup.org> wrote:
>> but should it be set from the command substitution for the assignment
>> to c?
>
> I think it'd be practical, is there a reason why it shouldn't?

https://www.austingroupbugs.net/view.php?id=1150

It's unspecified. The assignments are performed `beginning to end', but
it's not specified when $? is set. Interestingly, the SVR4 sh and ksh88
both set $? as each command substitution finishes, but POSIX didn't
specify that behavior explicitly. Maybe the 1992 authors didn't feel
they had to.

> And while we're at it, is there a reason why assignments in a simple
> command shouldn't be applied sequentially, from left to right?

The Bourne shell performed the assignments right to left. This came up on
the austin-group list a couple of years ago, in almost exactly the same
way, but I don't think the discussion made its way to an interpretation
request.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: $? behaviour after comsub in same command
On 4/5/23 9:06 AM, Martijn Dekker via austin-group-l at The Open Group wrote:
> Consider:
>
>     false || echo $(true) $?
>
> dash, mksh and yash print 1. bash, ksh93 and zsh print 0. Which is
> right?

A variant with slightly different semantics:

    (exit 8)
    a=4 b=$(exit 42) c=$?
    echo status:$? c=$c

The standard is clear about what $? should be for the echo, but should it
be set from the command substitution for the assignment to c?
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
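[Editorial note] The divergence in Martijn's example is easy to check from a script. This sketch asserts only the bash side; per the thread, dash, mksh, and yash would print 1 for the same command:

```shell
# bash updates $? from the command substitution before the $? word in
# the same simple command is expanded, so the echo sees 0 (from `true'),
# not 1 (from the `false' that started the || list):
bash -c 'false || echo $(true) $?'    # prints: 0
```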
Re: Syntax error with "command . file" (was: [1003.1(2016/18)/Issue7+TC2 0001629]: Shell vs. read(2) errors on the script)
On 3/14/23 4:58 PM, Harald van Dijk wrote:
> On 14/03/2023 20:41, Chet Ramey wrote:
>> On 3/12/23 10:19 PM, Harald van Dijk via austin-group-l at The Open
>> Group wrote:
>>> bash appears to disable the reading of .profile in POSIX mode
>>> entirely.
>>
>> This isn't quite correct. By default, a login shell named `sh' or
>> `-sh' reads /etc/profile and ~/.profile. You can compile bash for
>> `strict posix' conformance, or invoke it with POSIXLY_CORRECT or
>> POSIX_PEDANTIC in the environment, and it won't.
>
> Isn't it? The mode bash gets into when invoked as sh is described in
> the manpage (looking at the 5.2.15 manpage) as:
>
>     If bash is invoked with the name sh, it tries to mimic the startup
>     behavior of historical versions of sh as closely as possible, while
>     conforming to the POSIX standard as well. [...] When invoked as sh,
>     bash enters posix mode after the startup files are read.
>
> The mode bash gets into when POSIXLY_CORRECT is set, the mode that can
> also be obtained with --posix, is described in the manpage as:
>
>     When bash is started in posix mode, as with the --posix command
>     line option, it follows the POSIX standard for startup files.

Right. When you force posix mode immediately, as I said above, bash won't
read the startup files. A login shell named sh or -sh reads the
historical startup files, then enters posix mode.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: Syntax error with "command . file" (was: [1003.1(2016/18)/Issue7+TC2 0001629]: Shell vs. read(2) errors on the script)
On 3/12/23 10:19 PM, Harald van Dijk via austin-group-l at The Open Group wrote:
> bash appears to disable the reading of .profile in POSIX mode entirely.

This isn't quite correct. By default, a login shell named `sh' or `-sh'
reads /etc/profile and ~/.profile. You can compile bash for `strict
posix' conformance, or invoke it with POSIXLY_CORRECT or POSIX_PEDANTIC
in the environment, and it won't.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: Writing the "[job] pid" for async and-or lists
On 2/24/23 10:59 AM, Robert Elz via austin-group-l at The Open Group wrote:
> Can we agree (and then after Draft 3 is available, submit a bug report)
> that this text should only apply to the top level of an interactive
> shell, and not to any subshell environments.

I'd be good with specifying that it doesn't apply to subshell
environments.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: Writing the "[job] pid" for async and-or lists
On 2/24/23 11:35 AM, Harald van Dijk via austin-group-l at The Open Group wrote:
> I agree, but a subshell of an interactive shell is effectively
> non-interactive anyway, and many of the special rules for interactive
> shells should not apply to subshells of interactive shells and already
> don't in various existing shells (but the extent to which they don't
> varies from shell to shell). Rather than make a special exception for
> process IDs, could this be made a general rule?

If POSIX made it a general rule, someone would probably have to go
through all the changed behavior and check which shells conform and
which do not. I can't see that happening in time for the next draft.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: Minutes of the 6th February 2023 Teleconference
On 2/9/23 4:19 AM, Geoff Clare via austin-group-l at The Open Group wrote:
> When this was discussed on the call, there was general agreement that
> executing the partial line after getting a read error is really not a
> good thing for shells to be doing.

OK, that's a reasonable position to take. But is it the role of a
standards body to require that behavior, when shells don't do it today?
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: Minutes of the 6th February 2023 Teleconference
On 2/7/23 2:22 AM, Andrew Josey via austin-group-l at The Open Group wrote:
> Bug 1629: Shell vs. read(2) errors on the script OPEN
> https://austingroupbugs.net/view.php?id=1629
>
> This item was discussed at length on the call including the feedback
> from Chet Ramey. Notes were updated in the etherpad --
> https://posix.rhansen.org/p/2023-02-06

I looked at the etherpad. Nick, thanks for the detail about the script.

There are some missing details about what happens on read errors. One
thing is that bash assumes fatal read(2) errors are *not* transient:
when a read returns -1/EWHATEVER, if errno is not EINTR or EAGAIN, the
next read will also return an error. Same with EOF. This is, of course,
not how it goes when you inject read errors.

With that in mind:

> /tmp/shell.script: line 1227: ent: command not found
> $ echo $?
> 42
>
> (i.e. it treated the end of "# this is a comment" as the command "ent"
> and continued execution)

is reasonably easy to explain. Bash, like all the shells, converts the
read error into EOF and executes the partial line, which happens to be a
comment. Then it goes back to read, assuming that a real error will
result in another -1/EIO (as it will in virtually all situations). Since
the read succeeds, the error must have been transient, and it goes on.
Other shells do things differently.

The key is that everyone `executes' the partial line after getting EOF,
even yash. Nick's example shows this, if you convert it to a
non-interactive shell instance:

    printf "echo foo" | bash

will output `foo' in bash and every other shell. Same with interactive
shells.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: Minutes of the 2nd February 2023 Teleconference
On 2/4/23 1:33 AM, Andrew Josey via austin-group-l at The Open Group wrote:
> Bug 1629: Shell vs. read(2) errors on the script OPEN
> https://austingroupbugs.net/view.php?id=1629
>
> This item was discussed at length on the call. We will continue with
> this item next time.

I looked at the etherpad. I'm not sure who tested bash-5.2.2 (Nick?),
but you ended up running something that a vendor -- probably Ubuntu via
Debian -- modified to add an error message. I can't tell what
`/tmp/script' is, but running bash-5.2.15 on RHEL7

    $ ./bash --version
    GNU bash, version 5.2.15(5)-release (x86_64-pc-linux-gnu)
    Copyright (C) 2022 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

    This is free software; you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

against the following script:

    $ cat x10
    echo a; echo b;
    echo after: $?

exits with status 0 after a read error:

    $ strace -e trace=read -e inject=read:error=ESTALE:when=7 ./bash ./x10
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\316\0\0\0\0\0\0"..., 832) = 832
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\16\0\0\0\0\0\0"..., 832) = 832
    read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`&\2\0\0\0\0\0"..., 832) = 832
    read(3, "MemTotal:        1862792 kB\nMemF"..., 1024) = 1024
    read(3, "echo a; echo b;\necho after: $?\n", 80) = 31
    read(255, "echo a; echo b;\necho after: $?\n", 31) = 31
    a
    b
    after: 0
    read(255, 0x26d0290, 31) = -1 ESTALE (Stale file handle) (INJECTED)
    +++ exited with 0 +++

You have to make sure you inject the error when bash is reading input
for the parser (the fifth read is checking whether or not the script is
a binary file). You'd get the same results with when=6.

You didn't check dash, but it returns 0 as well. I assume the other
ash-derived BSD shells behave similarly. yash is still the only shell
that returns an error in this case.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: Security risk in uudecode specification?
On 1/20/23 7:11 PM, Christoph Anton Mitterer via austin-group-l at The Open Group wrote:
> It's a pity that the different parties often don't seem to try to agree
> on something standardised first, but rather add new base utilities or
> functionalities like the shell's "local"... and only afterwards
> standardisation is tried (but often fails).

This is a great example of how hindsight is perfect and seductive. There
was a `local' in draft 9 of POSIX.2, back in the late 80s. It got taken
out -- even in its benign, non-specific form -- because nobody agreed
how it should be implemented even back then, and it stood in the way of
consensus. So we all went our own ways. Brian and I implemented local
with dynamic scoping, like ksh88. Korn went off to do static scoping in
ksh93. Ken Almquist implemented dynamic scoping in ash. That kind of set
the boundaries of the debate. And here we are.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
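[Editorial note] The dynamic-vs-static split Chet describes is observable from a script. A minimal sketch of dynamic scoping as bash and the ash-derived shells implement it (under static scoping, as in ksh93's function syntax, f would print "outer" instead):

```shell
# Dynamic scoping: a `local' variable is visible to, and modifiable by,
# the functions its declaring function calls.
g() { x=inner; }        # no `local': this writes to the caller's local x
f() {
    local x=outer
    g
    echo "$x"           # prints: inner
}
x=global
f
echo "$x"               # prints: global  (f's local shadowed it)
```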
Re: Shell vs. read(2) errors on the script
On 1/8/23 9:39 AM, Thorsten Glaser via austin-group-l at The Open Group wrote:
> There are two questions here:
>
> ① Should shell script read errors be treated as EOF, as is practice?
> ② If not, what should the shell do upon encountering one?

It's pretty clear that a non-interactive shell can't continue after it
gets a read error on the script, so treating it as EOF in the sense that
it stops execution is the right thing to do. The only question is
whether the shell is still bound by the requirement to exit with the
status of the last command executed.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: [Issue 8 drafts 0001564]: clariy on what (character/byte) strings pattern matching notation should work
On 5/18/22 9:46 PM, Christoph Anton Mitterer via austin-group-l at The Open Group wrote:
> The above, I'm not quite sure what these tell/prove... I assume the
> ones with '?': that for all except bash/fnmatch '?' matches both valid
> characters and a single byte that is no character. And the ones with
> bracket expressions, that these also work when the BE has either a
> valid character or a byte (that is not a character) and vice versa?
>
> If Chet is reading along, is the above intended in bash, or considered
> a bug?

The bash matcher falls back to C-locale-like behavior only if the
pattern and the string both do not contain any valid multibyte
characters. So if, for example, the string contains a valid multibyte
character, but the pattern does not, the matcher will attempt multibyte
(wide character, really) matches.

This is why the string \243] (a valid multibyte character in Big5) does
not match [\243!]]: nothing in the bracket expression will match that
character, and that string will never match a pattern ending in `]'.

> IMO it would have been interesting to see whether ? would also match
> multiple bytes that are each for themselves and together no valid
> character...

No, it wouldn't. You can make a case for `?' matching a single byte that
is not part of a valid multibyte character (there is no such thing as a
single byte that is "no valid character" when you are matching), but you
cannot make one for `?' matching more than one byte that does not
compose a valid multibyte character.

The tests involving \243 are run in a Big5 environment. In Big5,
\243\135 is the representation of β, a single valid character, even
though \135 on its own is still the single character ].

> Seems also a bit strange to me... all shells match \243 against ? ...
> i.e. ? matches a single byte that is not a character... but later on it
> doesn't work again with \243] and ?]

Because, as Harald says, \243] is a valid multibyte character in Big5
locales.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: When can shells remove "known" process IDs from the list?
On 5/13/22 5:37 PM, Robert Elz wrote:
> Date: Sat, 14 May 2022 03:56:32 +0700
> From: "Robert Elz via austin-group-l at The Open Group"
> Message-ID: <2459.1652475...@jinx.noi.kre.to>
>
> | | Show your work.
> |
> | I no longer remember the exact command I used (cannot even locate the
> | message you're quoting from),
>
> I finally did ... This is what I see:

I don't see that.

    $ echo $BASH_VERSION
    5.1.16(2)-release
    $ sleep 20 | sleep 20 & sleep 30 | sleep 30 & jobs -l ; pstree $$ ; ps jT
    [1] 22954
    [2] 22956
    [1]-  22953 Running    sleep 20
          22954            | sleep 20 &
    [2]+  22955 Running    sleep 30
          22956            | sleep 30 &
    -+= 22938 chet ./bash
     |--- 22953 chet sleep 20
     |--- 22954 chet sleep 20
     |--- 22955 chet sleep 30
     |--- 22956 chet sleep 30
     \-+- 22957 chet pstree 22938
       \--- 22958 root ps -axwwo user,pid,ppid,pgid,command
    USER   PID  PPID  PGID SESS JOBC STAT TT   TIME    COMMAND
    root   811   544   811    0    0 Ss   s019 0:00.05 login -pfl chet /bin/ba
    chet   814   811   814    0    1 Ss   s019 0:00.09 -bash
    chet 22938   814 22938    0    1 S+   s019 0:00.04 ./bash
    chet 22953 22938 22938    0    1 S+   s019 0:00.00 sleep 20
    chet 22954 22938 22938    0    1 S+   s019 0:00.00 sleep 20
    chet 22955 22938 22938    0    1 S+   s019 0:00.00 sleep 30
    chet 22956 22938 22938    0    1 S+   s019 0:00.00 sleep 30
    root 22959 22938 22938    0    1 R+   s019 0:00.00 ps jT
    $ kill %1
    $ ps jT
    USER   PID  PPID  PGID SESS JOBC STAT TT   TIME    COMMAND
    root   811   544   811    0    0 Ss   s019 0:00.05 login -pfl chet /bin/ba
    chet   814   811   814    0    1 Ss   s019 0:00.09 -bash
    chet 22938   814 22938    0    1 S+   s019 0:00.04 ./bash
    chet 22955 22938 22938    0    1 S+   s019 0:00.00 sleep 30
    chet 22956 22938 22938    0    1 S+   s019 0:00.00 sleep 30
    root 22960 22938 22938    0    1 R+   s019 0:00.00 ps jT
    $
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: When can shells remove "known" process IDs from the list?
On 5/13/22 4:56 PM, Robert Elz wrote:

>   Date:     Fri, 13 May 2022 11:22:20 -0400
>   From:     Chet Ramey
>   Message-ID:
>
>   | Show your work.
>   |
>   | I tested this on macOS 12 and RHEL 7, using interactive shells with
>   | job control enabled,
>
>   That is likely the difference. The question was about what happens
>   when job control is not enabled.

The same thing. This example uses bash-5.2-beta on macOS 10.15, but the same thing happens with bash-5.1.16.

$ ./bash
$ set +m
$ sleep 20 | sleep 20 &
[1] 22755
jenna.local(2)$ pstree $$
-+= 22753 chet ./bash
 |--- 22754 chet sleep 20
 |--- 22755 chet sleep 20
 \-+- 22756 chet pstree 22753
   \--- 22757 root ps -axwwo user,pid,ppid,pgid,command
$ kill %1
$ ps ax | grep sleep
22759 s018  S+     0:00.00 grep sleep
$ sleep 20 | sleep 20 & pstree $$
[1] 22787
-+= 22753 chet ./bash
 |--- 22786 chet sleep 20
 |--- 22787 chet sleep 20
 \-+- 22788 chet pstree 22753
   \--- 22789 root ps -axwwo user,pid,ppid,pgid,command
$ kill %1
$ ps axuw | grep sleep
chet  22791   0.0  0.0  4408552    764 s018  S+   10:25AM   0:00.00 grep sleep
Re: When can shells remove "known" process IDs from the list?
On 5/5/22 7:46 AM, Geoff Clare via austin-group-l at The Open Group wrote: [Robert intended to send the mail I'm replying to to the list, but it was only sent to me. I've quoted it in full.] Robert Elz wrote, on 05 May 2022: This leaves just bash of the shells I have to test. bash is odd, at first glance it seems to act like the ksh's, zsh & fbsh do. But it doesn't. This seems to be because in a pipeline like sleep 20 | sleep 20 & creates a subshell for the '&' first, and then creates a new subshell environment for each side of the pipe. None of the other shells do that, the processes in the pipeline are in subshell environments (in most anyway) but the same one as the one created for the async process execution - that is, the sleep processes are direct children of the parent shell, not grandchildren as they are in bash. When given "kill %1" it then seems to work just like those other shells, but all that is actually killed is the forked copy of itself, leaving the sleep processes running, orphaned. Show your work. I tested this on macOS 12 and RHEL 7, using interactive shells with job control enabled, running the latest bash devel version, and could not reproduce it. The Linux version of pstree shows the process group; the macOS version doesn't have that option. Both show the sleep processes are direct descendents of the parent shell, but even if they aren't, bash clearly does not leave the sleep processes orphaned. 
macOS 12:

$ sleep 20 | sleep 20 &
[1] 16711
$ pstree $$
-+= 16694 chet ./bash
 |--= 16710 chet sleep 20
 |--- 16711 chet sleep 20
 \-+= 16712 chet pstree 16694
   \--- 16713 root ps -axwwo user,pid,ppid,pgid,command
$ kill %1
$ ps axuw | grep sleep
chet  16717   0.0  0.0  34142704    632 s027  U+   11:04AM   0:00.00 grep sleep
[1]+  Terminated: 15          sleep 20 | sleep 20

RHEL 7:

$ sleep 20 | sleep 20 &
[1] 106739
$ pstree -g $$
bash(106427)─┬─pstree(106743)
             ├─sleep(106738)
             └─sleep(106738)
$ kill %1
$ ps axuw | grep sleep
chet  106753  0.0  0.0 112812   960 pts/1  R+   10:59   0:00 grep sleep
[1]+  Terminated              sleep 20 | sleep 20
Re: When can shells remove "known" process IDs from the list?
On 5/13/22 10:27 AM, Geoff Clare via austin-group-l at The Open Group wrote: Chet Ramey wrote, on 13 May 2022: On 5/13/22 5:20 AM, Geoff Clare via austin-group-l at The Open Group wrote: The definition of "Job" is: A set of processes, comprising a shell pipeline, and any processes descended from it, that are all in the same process group. Notice it says "that are all in the same process group". In the case of a background command started with job control disabled, the processes all have the same process group as the parent shell. By a strict reading, this counts as a job, but I don't think that was intended. Why not? This is what allows jobs/kill/wait to use job control notation in operands even when job control is not currently enabled. I'd argue that that was intended. My reading is that all the standard requires here is that if one or more jobs are created with job control enabled, and job control is subsequently disabled, you can still use "jobs" to list those jobs, and %n etc. with "kill" to refer to those jobs. Of course; it relies on your assertion that the standard requires job control to be enabled to create a job and put it in the jobs list. I've already said what I think about that, and most, if not all, shells behave differently. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: When can shells remove "known" process IDs from the list?
On 5/13/22 5:20 AM, Geoff Clare via austin-group-l at The Open Group wrote: You are over reaching in the way you are reading that text. I strongly disagree. If you have to work that hard to make your case, it's a good indication that the existing language is wrong -- or at least insufficient -- and needs to be changed. There is no such thing as a known process ID that is not a job. Bash allows process substitutions to set $!, so users can wait for them, but they are not jobs. Process substitution is, of course, an extension. The definition of "Job" is: A set of processes, comprising a shell pipeline, and any processes descended from it, that are all in the same process group. Notice it says "that are all in the same process group". In the case of a background command started with job control disabled, the processes all have the same process group as the parent shell. > By a strict reading, this counts as a job, but I don't think that was intended. Why not? This is what allows jobs/kill/wait to use job control notation in operands even when job control is not currently enabled. I'd argue that that was intended. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
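The definitional point above is easy to observe directly: with job control disabled, an asynchronous command is not given its own process group. A hedged sketch, not from the thread, assuming a ps(1) that accepts the POSIX `-o pgid=` format:

```shell
# Sketch: with job control off (set +m), the async child shares the
# shell's process group, so the strict reading of "Job" still covers it.
set +m
sleep 2 &
echo "shell pgid: $(ps -o pgid= -p $$)"
echo "child pgid: $(ps -o pgid= -p $!)"
wait
```

Both lines print the same process group ID, which is what makes "a set of processes ... all in the same process group" cover the no-job-control case by the letter of the definition.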
Re: When can shells remove "known" process IDs from the list?
On 5/12/22 10:03 AM, Geoff Clare via austin-group-l at The Open Group wrote: The normative text relating to creation of job numbers/IDs is all conditional on job control being enabled. Where is that? It's not in the definition of Job ID, it's not in 2.9.3 Asynchronous Lists, it's not in the `jobs' description, it's not part of the definition of Background Job or Foreground Job, it's not in any of fg/bg/kill/wait. I feel like I'm missing something obvious here. You're looking in (some of) the right places, but missing the significance of what's written there. If we're going to make basic concepts dependent on obscure language in the standard that requires the reader to make the proper set of inferences, the standard has failed. It's worse that it fails to capture what the majority of shells do in practice. This set of examples you give, which you might assert are definitive, is not all that compelling. If the standard wants to specify something, why can't it just say so in plain language? Why make it a puzzle to be solved? If you have to work this hard to make your case, it's probably not that obvious. So for the known IDs list, it's pretty much `wait' and `jobs', right? The phrase kre used was "when their termination status has been reported to the user - however that happens". That includes information written by an interactive shell before it writes a prompt. Although the standard says this information is about the exit status of "the background job", it is also, by association, information about the exit status of a process in the known process IDs list. Another reason that the language relating the two things, and describing how they interact, needs to be clear and unambiguous, and handle all four scenarios. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: When can shells remove "known" process IDs from the list?
On 5/11/22 6:31 PM, Robert Elz wrote: | For neither the first nor the last time. Including now. People can disagree. | > I think they should remain independent. | Sure, I agree. I don't. I cannot think of a single reason why the shell should be forced to maintain two separate lists of its child processes. The jobs table needs to have them, so processes in the job can be identified as they finish. Duplicating that in another table, for no particular reason I can imagine makes no sense to me. Still, if others want to implement it that way, I don't object - but the standard has never required that, and should not, absent some very good reason, be changed to require it now. It's going to take more work on the standard to make it be that way, then. There will have to be more specific language about when and how the jobs list is created, when jobs are added and removed, when and how jobs correspond to known process IDs, and whether or not removing IDs from that list just means removing the job from the table. If we're going to require job control to be enabled to maintain a jobs list, at least a visible one, then we have to have something else to use. It may be the jobs list internally, if we end up fixing all the places in the standard that are underspecified, and that would probably work. It's my impression that the known IDs list is a remnant from the time when job control was optional, and you didn't need to implement job control unless you implemented the UPE. You still needed a way to keep track of background processes, and the known IDs list was it. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: wait and stopped processes (was: When can shells remove "known" process IDs from the list?)
On 5/11/22 6:56 PM, Robert Elz wrote:

>   | Maybe. And yet I can't recall ever receiving a bug about this.
>
>   [...]
>
>   The circumstances to provoke a problem need to be contrived.

Exactly. It's a largely hypothetical scenario.
Re: When can shells remove "known" process IDs from the list?
On 5/10/22 12:03 PM, Geoff Clare via austin-group-l at The Open Group wrote: >> If jobs and kill work, you should probably add wait to this description, or >> add a separate paragraph to the wait rationale. > > If it works with "wait" in all shells (that we care about), then I > agree it would make sense to add it. Just decide whether or not it makes sense. If it makes sense, add it. Shell behavior is only selectively relevant. >> I'd be interested in your reasoning. The standard simply says that jobs >> and kill (and wait should be added) work with job %X notation whether >> or not job control is enabled. > > The normative text relating to creation of job numbers/IDs is all > conditional on job control being enabled. Where is that? It's not in the definition of Job ID, it's not in 2.9.3 Asynchronous Lists, it's not in the `jobs' description, it's not part of the definition of Background Job or Foreground Job, it's not in any of fg/bg/kill/wait. I feel like I'm missing something obvious here. >> OK. I'm pretty sure everyone already does this for the jobs list. Not sure >> whether you want it to include the known IDs list. > > I think kre intended it apply to the known IDs list as well, and I > was agreeing with that. So for the known IDs list, it's pretty much `wait' and `jobs', right? -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: wait and stopped processes (was: When can shells remove "known" process IDs from the list?)
On 5/10/22 11:50 AM, Geoff Clare via austin-group-l at The Open Group wrote:

> Chet Ramey wrote, on 06 May 2022:
>>
>>>> And last, also in this area, is the question of stopped jobs and the
>>>> wait command, and how those two are intended to interact.
>>>
>>> The wording in my current draft makes clear that wait waits for
>>> processes to terminate. I could, if desired, add some rationale saying
>>> that some implementations have, as an extension, an option that allows
>>> wait to return when a process stops.
>>
>> That's not the current behavior. At best, it should be unspecified.
>
> It is already what the standard requires, and with good reason.

Sure. It simply isn't what many (most) shells do.

> I have never, ever, seen a shell script use "wait" in a way that would
> work correctly if the wait returned when the process stopped. The code
> invariably assumes that wait will not return until the process
> terminates. If it checks $? after the wait, it is always just to
> distinguish between different exit status values.

Maybe. And yet I can't recall ever receiving a bug about this.

> In shells where wait (with no options specified) returns when a
> process stops, that is a horrible misfeature. Kre has already stated
> he will change NetBSD sh so that it doesn't do that. Hopefully the
> other ash-based shells will follow suit.

There are more shells than ash-based ones that do this. At least four independent code bases have made the same choice.

> If you only change bash in POSIX mode, you will be doing your users a
> disservice.

I doubt that. There's no evidence that this is a problem for bash users.
Re: When can shells remove "known" process IDs from the list?
On 5/10/22 11:17 AM, Geoff Clare via austin-group-l at The Open Group wrote:

>> Anyway, I agree with disallowing remove-before-prompting.
>
> Unfortunately that puts you in opposition to kre.

For neither the first nor the last time.

>> Or make it clear everywhere that removing a job from the jobs list
>> means removing its pid from the list of terminated asynchronous lists.
>
> I think they should remain independent.

Sure, I agree. It just means more work specifying when the shell can remove entries from either. I'll wait for your proposal.
Re: When can shells remove "known" process IDs from the list?
On 5/5/22 7:46 AM, Geoff Clare via austin-group-l at The Open Group wrote: The fact that the jobs command works with job control disabled is mentioned in the rationale on the jobs page: The jobs utility is not dependent on the job control option, as are the seemingly related bg and fg utilities because jobs is useful for examining background jobs, regardless of the condition of job control. When the user has invoked a set +m command and job control has been turned off, jobs can still be used to examine the background jobs associated with that current session. Similarly, kill can then be used to kill background jobs with kill %. so that's not an "issue". If jobs and kill work, you should probably add wait to this description, or add a separate paragraph to the wait rationale. XBD 2.175 defines a job as A set of processes, comprising a shell pipeline, and any processes descended from it, that are all in the same process group. Which says nothing very useful, and I am not sure is even correct. Yes, I made the same point in a previous message. The reason I think #2 should say "if job control is disabled" is because the standard talks separately about the list of "process IDs known in the shell environment" and the job list / job IDs. I think it needs to talk a little bit more clearly about the jobs list and what constitutes a job, not to mention how and when one gets created. Anyway, this also implies the existence of two separate lists. Your testing above seems to be conflating the "known IDs" and the jobs list. My reading of the standard is that entries in the jobs list only need to be created when job control is enabled, I'd be interested in your reasoning. The standard simply says that jobs and kill (and wait should be added) work with job %X notation whether or not job control is enabled. And in any event, that's not how shells work. I do agree that the current text implies two separate lists, and there's insufficient explanation of how they interact. 
It certainly doesn't imply that the `known IDs' stuff is only in effect when job control is not enabled. Independently of this, when job control is disabled all of the requirements relating to "known IDs" still apply and have nothing to do with %... job ID notation. If you make that change. The known IDs description doesn't depend on job control being enabled or disabled. | I think the description of the wait utility should be updated to require | removal from the list. I would agree with that. I wouldn't object. If someone wants to implement it that way, I have no objection, but it should not be required. shells should at least be permitted to remove jobs from the list of remembered stuff when their termination status has been reported to the user - however that happens. I agree. OK. I'm pretty sure everyone already does this for the jobs list. Not sure whether you want it to include the known IDs list. That could be another valid choice, but I would prefer that all shells wait for termination by default. You might, but that's not the current state of the world. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
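The claim that jobs/kill/wait accept job %X notation regardless of job control can be checked mechanically. A sketch, not from the thread; the expected status 143 assumes the common 128+SIGTERM encoding, which most shells use for non-interactive scripts:

```shell
# Sketch: kill accepts a job ID even after `set +m'; wait then reports
# the terminated status (143 = 128 + SIGTERM in most shells).
set +m
sleep 30 &
pid=$!
kill %1              # job control notation, job control disabled
wait "$pid"
echo "wait status: $?"
```

bash, dash, and ksh all resolve %1 here; shells that required job control to be enabled for this would fail the `kill %1` step.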
Re: When can shells remove "known" process IDs from the list?
On 5/3/22 6:52 AM, Geoff Clare via austin-group-l at The Open Group wrote: Robert Elz wrote, on 30 Apr 2022: | However, today it threw a last curve ball when I was working on an | update to the description of set -b ... How many shells actually implement that? They all accept it as an option, but for some it seems to be a no-op. That's one of the changes I was working on when I spotted this problem. Bash implements it. I doubt very many people use it. | This conflicts with 2.9.3.1 Asynchronous Lists which says that IDs | remain known until: | | 1. The command terminates and the application waits for the process ID. | | 2. Another asynchronous list is invoked before "$!" (corresponding to | the previous asynchronous list) is expanded in the current execution | environment. Does anyone implement that bit (#2) at all? In a non-interactive shell it might almost be possible, but in an interactive shell, if the job isn't in the list (whether $! has been referenced or not - usually it will not have been) because it has been removed, what is the shell supposed to do if the job stops? Further users (even in scripts) are allowed to use % %- %1 etc to refer to jobs, $! isn't the only way to reference one ("wait %2 should work). I'd suggest that #2 should simply be removed. I think #2 should say "If job control is disabled, ...". Why? You can use job control notation with jobs/kill/wait even if job control isn't enabled, which implies the presence of a job list separate from the list of known IDs. I think the description of the wait utility should be updated to require removal from the list. I agree, both the jobs list and the list of known IDs. [...] And last, also in this area, is the question of stopped jobs and the wait command, and how those two are intended to interact. The wording in my current draft makes clear that wait waits for processes to terminate. 
I could, if desired, add some rationale saying that some implementations have, as an extension, an option that allows wait to return when a process stops. That's not the current behavior. At best, it should be unspecified. Bash, yash, mksh, dash, the NetBSD sh, and gwsh allow the `wait' builtin to wait for any process status change (e.g., SIGSTOP). ksh93, FreeBSD sh, and zsh force the shell to wait until the process terminates. Bash provides an option (`wait -f') to force a wait for process termination. I didn't check whether other shells do. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: When can shells remove "known" process IDs from the list?
On 4/29/22 4:23 PM, Robert Elz via austin-group-l at The Open Group wrote: | You can test this by doing | |true & | |wait $!; echo $? | | This should print 0. Then do the same, except with the first command | changed to false &. That should print 1. Yes, in the shells you mention it does, indicating that something different is happening. It is interesting that in bash you can do that wait over and over again, and it keeps returning the 0 status (until one does a plain "wait" command, even the "jobs" command doesn't remove it, though the standard requires that it do so). bash is the only shell that acts like that, whether it is intentional or not I have no idea. It's intentional, and has been in bash for a very long time. As I said in another message, the jobs builtin not removing the pid from the `remembered' list is probably an oversight. I'll fix it in posix mode after bash-5.2 is released. But try a different test true & X=$! (the assignment to X is just in case there is a shell which implements that "no need to retain" stuff when $! is not referenced). Then repeat that line over and over. (Consecutive lines). zsh does something different, once a job has been reported as finished at a prompt, it is removed from the jobs table, and you can no longer do "wait %3" for it, but the pid and status seem to be remembered somewhere else, and wait gets the status from the job. That seems odd to me, it should be possible to use either form to wait on a job. They're not jobs! A pid is a pid. It doesn't matter whether it's the pid of the job's controlling process (or whatever we want to call it). The Asynchronous Lists text says you have to be able to wait for it. This is how bash works, too. This is what happens when you have a jobs list and a list of terminated asynchronous lists that are `known in the current shell environment'. 
bash is different again, it counts up the job numbers, like bosh and yash, but as it reports each earlier one finished, removes it from the jobs table, so the "jobs" command only ever shows (and then removes) the last one started. It still allows wait N to return the status, as many times as you want to do that command, but not wait %n for any but the most recently created one. Right. The ascending job number depends on your policy for assigning new job numbers, and you can only use job control notation to refer to entries in the job list. But bash will let you wait for pid N as long as pid N is in the list of terminated asynchronous processes. The bigger issue is what do you do about users who can be connected to their shell for weeks, running lots of background commands, and never issuing a wait or jobs command? Do you just keep remembering exit status/pid pairs forever? That doesn't sound sustainable to me. Bash bounds the number remembered. It's at least CHILD_MAX, as POSIX specifies, with an upper bound (right now, 32K -- very few sessions start that many asynchronous jobs/processes). It checks for pid reuse: the entry for pid N will always be the status of the most recent asynchronous process with that pid. That might not be perfect, but it works fine in practice. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
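The "remembered status" behavior described above is visible with plain wait-by-pid. A minimal sketch, not from the thread:

```shell
# Sketch: each asynchronous list's status stays known by pid until the
# application waits for it, so wait-by-pid works after termination.
true &  ok_pid=$!
false & fail_pid=$!
wait "$ok_pid";   echo "true:  $?"
wait "$fail_pid"; echo "false: $?"
```

This prints "true:  0" and "false: 1": the shell kept both exit statuses in its table of known process IDs even though both children had long since terminated.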
Re: When can shells remove "known" process IDs from the list?
On 4/29/22 2:38 PM, Robert Elz via austin-group-l at The Open Group wrote: | However, today it threw a last curve ball when I was working on an | update to the description of set -b ... How many shells actually implement that? Bash does. I doubt anyone uses it. | This conflicts with 2.9.3.1 Asynchronous Lists which says that IDs | remain known until: | | 1. The command terminates and the application waits for the process ID. | | 2. Another asynchronous list is invoked before "$!" (corresponding to | the previous asynchronous list) is expanded in the current execution | environment. Does anyone implement that bit (#2) at all? I think the FreeBSD shell does. In a non-interactive shell it might almost be possible, but in an interactive shell, if the job isn't in the list (whether $! has been referenced or not - usually it will not have been) because it has been removed, what is the shell supposed to do if the job stops? Further users (even in scripts) are allowed to use % %- %1 etc to refer to jobs, $! isn't the only way to reference one ("wait %2 should work). I'd suggest that #2 should simply be removed. I think the standard implies that the jobs list and the list of terminated process IDs `known in the current environment' are different things. It's not clear. But do note that the definition of the jobs command says: When jobs reports the termination status of a job, the shell shall remove its process ID from the list of those ``known in the current shell execution environment''; see Section 2.9.3.1 (on page 2338). This is one place where the two things overlap. | It also appears that dash still implements remove-before-prompting. Does anyone not? Lots of shells don't. | B. Allow remove-before-prompting. This would mean changing 2.9.3.1 to | add a third list item (for interactive shells only) and deleting the | above quoted text from the wait page. This is necessary, we would be making use of the shell too difficult for interactive users otherwise. 
What does "too difficult" mean? The shells that don't do remove-before- prompting seem to be doing just fine. While you're considering all of this, you might want to also consider what is intended to happen if a script does trap '' CHLD and how that is supposed to interact with maintenance of the jobs command, the wait command, and all else related. It should be explicitly stated to be unspecified behavior, since SIGCHLD is necessary to make process handling work. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: 答复: How do I get the buffered bytes in a FILE *?
On 4/18/22 12:53 AM, Rob Landley wrote:

> On 4/17/22 18:10, Chet Ramey wrote:
>> On 4/16/22 2:58 PM, Rob Landley via austin-group-l at The Open Group wrote:
>>> Q) "How do I switch from FILE * to fd via fileno() without losing data."
>>> A) "Don't use FILE *"
>>>
>>> That's not the question I asked?
>>
>> The answer is correct, but incomplete. The missing piece is that if you
>> want to use FILE *, the operation you want, and the information you
>> need to implement it, are not part of the public API.
>
> Which is a fixable problem.

Sure, everything's fixable. It's not what you asked, though.

>> Other than using a strategy like Geoff suggested early on, or trying
>> something like setvbuf to turn off buffering on the FILE * completely,
>> the buffer associated with a FILE * and the indexes into it that say
>> how much data you've consumed from the underlying source are opaque.
>
> https://github.com/coreutils/gnulib/blob/master/lib/freadahead.c

So the gnulib folks looked at a bunch of different stdio implementations and used non-public (or at least non-standard) portions of the implementation to augment the stdio API. If that's what you want to do, propose adding freadahead to the standard. Or reimplement the gnulib work and accept that the stdio implementation can potentially change out from under you. Current POSIX provides no help here.

>> If you want to manipulate that information, or expose it to a caller,
>> you can't use FILE * (or, if you want a direct answer, "you can't").
>
> The if/else staircase in m4 and gnulib and so on says I can.

Not in a way that protects you against changes to one of the underlying stdio implementations. And isn't that the point? You can always offer that functionality if you have stable access to stdio internals, but it's not in the standard.

> I was just wondering if there was a _clean_ way to do it.

OK. Do you think you've gotten an answer to that?

> The C99 guys point out they haven't got file descriptors and thus this
> would logically belong in posix, for the same reason fileno() does.
> "But FILE * doesn't have a way to fetch the file descriptor" was
> answered by adding fileno(). That is ALSO grabbing an integer out of
> the guts of FILE *.

Sure. And adding that to the standard would require the usual things, for which there's a process.

> This exists. It would be nice if it got standardized.

Maybe it would. But that's a different question.
Re: 答复: How do I get the buffered bytes in a FILE *?
On 4/16/22 2:58 PM, Rob Landley via austin-group-l at The Open Group wrote: Q) "How do I switch from FILE * to fd via fileno() without losing data." A) "Don't use FILE *" That's not the question I asked? The answer is correct, but incomplete. The missing piece is that if you want to use FILE *, the operation you want, and the information you need to implement it, are not part of the public API. Other than using a strategy like Geoff suggested early on, or trying something like setvbuf to turn off buffering on the FILE * completely, the buffer associated with a FILE * and the indexes into it that say how much data you've consumed from the underlying source are opaque. If you want to manipulate that information, or expose it to a caller, you can't use FILE * (or, if you want a direct answer, "you can't"). I found it easier to write my own buffered input package to satisfy the POSIX read ahead requirements than try to coerce stdio into doing it. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
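The POSIX read-ahead requirement mentioned above is observable from the shell itself: `read` may not consume bytes past the line it returns when the input is seekable, which is exactly what stock stdio buffering would swallow. A sketch, not from the thread, assuming a writable ${TMPDIR:-/tmp}:

```shell
# Sketch: `read' leaves the file offset just past the first line, so the
# next consumer of the same open file sees the rest -- the behavior a
# naive fully-buffered stdio implementation would break.
tmp=${TMPDIR:-/tmp}/readahead.$$
printf 'line1\nline2\n' > "$tmp"
{ read first; cat; } < "$tmp"    # cat prints only: line2
rm -f "$tmp"
```

To honor this, a shell either reads a byte at a time on non-seekable input or reads a block and seeks back, which is why a custom buffered-input layer is easier than coercing stdio.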
Re: Another "is it quoted" issue for here doc redir op end words
On 2/6/22 1:23 PM, Robert Elz via austin-group-l at The Open Group wrote:

> They are definitely allowed; what saves all of us is that no-one ever
> writes this, and if anyone ever attempted it, they'd probably be hoping
> that the end-word on the here-doc redirect would be expanded the same
> way it is for all other redirects (which definitely does not happen).

That's the aforementioned rabbit hole.
Re: Another "is it quoted" issue for here doc redir op end words
On 2/6/22 6:14 AM, Thorsten Glaser via austin-group-l at The Open Group wrote:

> Robert Elz via austin-group-l at The Open Group dixit:
>
>> But there is a somewhat weird case that the shells (those for which
>> this works at all, which is a minority) differ about, that I don't
>
> Which is correct, and why? Should that even work?! (mksh is one of the
> shells in which it doesn’t, and I’m hard-pressed to see why scripts
> should even be allowed to write constructs like that.)

Yeah, we're way down the rabbit hole of hypothetical cases here.
Re: how do to cmd subst with trailing newlines portable (was: does POSIX mandate whether the output…)
On 1/27/22 12:25 PM, Chet Ramey wrote: On 1/27/22 12:07 PM, Harald van Dijk via austin-group-l at The Open Group wrote: On 27/01/2022 16:06, Chet Ramey via austin-group-l at The Open Group wrote: Wow, that seems like a bug. Environment variables can contain sequences of arbitrary non-NULL bytes, and, as long as the portion before the `=' is a valid NAME, the shell is required to create a variable with the remainder of the string as its value and pass it to child processes in the environment. That is not what POSIX says. It says "The value of an environment variable is a string of characters" (8.1 Environment Variable Definition), and "character" is defined as "a sequence of one or more bytes representing a single graphic symbol or control code" (3 Definitions), with a note that says it corresponds to what C calls a multi-byte character. Environment variables are not specified to allow arbitrary bytes. I wonder why they chose that. It's a departure from existing practice. I suppose it's just a quality of implementation issue, since applications can obviously put whatever they want into the value of an environment variable in envp and call execve. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: how do to cmd subst with trailing newlines portable (was: does POSIX mandate whether the output…)
On 1/27/22 12:07 PM, Harald van Dijk via austin-group-l at The Open Group wrote: On 27/01/2022 16:06, Chet Ramey via austin-group-l at The Open Group wrote: Wow, that seems like a bug. Environment variables can contain sequences of arbitrary non-NULL bytes, and, as long as the portion before the `=' is a valid NAME, the shell is required to create a variable with the remainder of the string as its value and pass it to child processes in the environment. That is not what POSIX says. It says "The value of an environment variable is a string of characters" (8.1 Environment Variable Definition), and "character" is defined as "a sequence of one or more bytes representing a single graphic symbol or control code" (3 Definitions), with a note that says it corresponds to what C calls a multi-byte character. Environment variables are not specified to allow arbitrary bytes. I wonder why they chose that. It's a departure from existing practice. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: how do to cmd subst with trailing newlines portable (was: does POSIX mandate whether the output…)
On 1/27/22 10:18 AM, Harald van Dijk via austin-group-l at The Open Group wrote: On 27/01/2022 12:44, Geoff Clare via austin-group-l at The Open Group wrote: Christoph Anton Mitterer wrote, on 26 Jan 2022: 3) Does POSIX define anywhere which values a shell variable is required to be able to store? I only found that NUL is excluded, but that alone doesn't mean that any other byte value is required to work. Kind of circular, but POSIX clearly requires that a variable can be assigned any value obtained from a command substitution that does not include a NUL byte, and specifies utilities that can be used to generate arbitrary byte values, therefore a variable can contain any sequence of bytes that does not include a NUL byte. Is it really clear that POSIX requires that? The fact that it refers to "characters" of the output implies the bytes need to be interpreted as characters according to the current locale, which is a process that can fail. In at least one shell (yash), bytes that do not form a valid character are discarded, which makes sense since yash internally stores variables etc. as wide strings. Wow, that seems like a bug. Environment variables can contain sequences of arbitrary non-NULL bytes, and, as long as the portion before the `=' is a valid NAME, the shell is required to create a variable with the remainder of the string as its value and pass it to child processes in the environment. If yash modifies that value because there are sequences that don't form valid wide characters, that sounds like a problem. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [Issue 8 drafts 0001505]: Make doesn't seem to specify unset macro expansion behaviour
On 12/17/21 5:11 AM, Geoff Clare via austin-group-l at The Open Group wrote: The more I think about it, the more I am convinced that an error is the right thing for make to do, The world is an imperfect place. It seems that few, if any, make implementations agree. We can't start standardizing behavior that no one implements because of a desire for possible future improvement. That's what gives standards bodies a bad name. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [Issue 8 drafts 0001505]: Make doesn't seem to specify unset macro expansion behaviour
On 12/17/21 5:11 AM, Geoff Clare via austin-group-l at The Open Group wrote: Currently POSIX does not require unset macros to expand to an empty string. The standard is silent on the matter, so the behaviour is implicitly unspecified. It seems like this is an opportunity to standardize behavior that is common across multiple (all?) make implementations. The proposed change *reduces* the allowed behaviours from many to just two. If all make implementations have the same behavior, why not standardize that? Is there evidence that the "typo in the makefile" problem is widespread enough to devise and require a hypothetical fix in make? -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [Issue 8 drafts 0001505]: Make doesn't seem to specify unset macro expansion behaviour
On 12/16/21 5:27 AM, Geoff Clare via austin-group-l at The Open Group wrote:
> Chet Ramey wrote, on 14 Dec 2021:
>> On 12/14/21 5:15 AM, Geoff Clare via austin-group-l at The Open Group wrote:
>>> Paul Smith wrote, on 13 Dec 2021:
>>>> Why shouldn't we just state that make implementations must expand unset variables to the empty string, which is what all implementations (that I'm aware of) do anyway?
>>>
>>> The point is that any makefile that relies on an unset macro being expanded to an empty string is not portable. The only reason it ever works is purely by luck.
>>
>> These two paragraphs are clearly in conflict. They can't both be true.
>
> They are both true, but I could perhaps have phrased it differently to make it clearer why the second one is true.
>
> When a makefile relies on an unset macro being expanded to an empty string, the reason it is not portable has nothing to do with the way current make implementations expand unset macros, it is because the makefile is relying on the macro being unset.

Then it doesn't fully address the point. Whether you write a makefile one way or another, when standardizing make behavior we should give more weight to the behavior of current make implementations. If all existing make implementations expand unset macros to the empty string -- and no one has identified an implementation that does not -- then that is the behavior that should be standardized.

> The only way to be sure that a given macro will expand to an empty string is to explicitly set it to an empty string.

This is just saying that you can never be sure that a macro is unset. However, if it is unset, make implementations should behave consistently, and it appears they do.

-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: [Issue 8 drafts 0001505]: Make doesn't seem to specify unset macro expansion behaviour
On 12/14/21 5:15 AM, Geoff Clare via austin-group-l at The Open Group wrote: > Paul Smith wrote, on 13 Dec 2021: >> Why shouldn't we just state that >> make implementations must expand unset variables to the empty string, >> which is what all implementations (that I'm aware of) do anyway? > > The point is that any makefile that relies on an unset macro being > expanded to an empty string is not portable. The only reason it ever > works is purely by luck. These two paragraphs are clearly in conflict. They can't both be true. Can anyone point to a make implementation that throws an error when expanding an unset variable? -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: $? in a simple command with no command name
On 9/1/21 4:59 PM, (Joerg Schilling) wrote:

"Chet Ramey via austin-group-l at The Open Group" wrote:

Given the following:

(exit 42)
a=$? b=`false` b=$?
echo $? $a $b

Bash prints 1 42 1. The original (v7) Bourne shell and the rest of the research line through v9 prints 1 1 (b is set to the empty string). That implies that it executes the assignment statements in reverse order, in addition to carrying $? through the sequence of assignments.

You are right, the original Bourne Shell for unknown reasons did evaluate a series of shell variable assignments in reverse order. That was changed in ksh88 and in bosh.

The SVR4.2 shell prints 1 42 1. I imagine the rest of the SVR4 line sh is the same.

Something called SVR4.2 does not really exist. It was a minor change compared to SvR4 announced by Novell shortly before they sold the copyright to SCO. I know of no customers for SVR4.2... even SCO seems to have only used it internally in their project Monterey, which was abandoned by IBM.

And yet it existed as a product. Univel probably had Unixware customers they didn't tell you about. You can find it for download if you look for it, I suspect.

There have been major changes in the Bourne Shell for SvR4, but the $? was not touched. So you are mistaken.

Sure, I didn't have SVR4 to test against when I wrote that.

The important thing to know here is that the Bourne Shell has some checkpoints that update the intermediate value of $?. Since that changed in ksh88 and since POSIX requires a different behavior compared to the Bourne Shell, I modified one checkpoint in bosh to let it match POSIX.

Interesting, since ksh88 (Solaris 10 11/16/88i) and ksh93 (93u+ 2012-08-01) both print 1 42 1. Odd that POSIX would specify something different, isn't it?

(exit 42); a=$? b=`false` b=$?; echo $? $a $b

prints 1 42 42 in bosh and 1 1 in the SvR4 Bourne Shell. It echoes 255 255 with the Solaris 10 /bin/sh (b is again null).

It looks like /bin/false exits with status 255, the Solaris 10 sh still performs the assignments in reverse order, and the Solaris 10 version of the SVR4 sh sets $? from the result of each command substitution. In any case, kre's point stands: the original Bourne shell (and, for that matter, the POSIX base implementation) set $? as each command substitution finishes.

-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
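The example from the thread can be run directly; the output differs by shell. Under bash (as the thread states) it prints "1 42 1": `a` captures the 42 from `(exit 42)`, the second `b=$?` captures the exit status of `false` because bash updates $? as each command substitution finishes, and the final $? is the status of the last command substitution.

```shell
# Run under bash explicitly, since the whole point is that shells differ.
bash -c '(exit 42)
a=$? b=`false` b=$?
echo $? $a $b'    # bash prints: 1 42 1
```

Other historical shells, per the thread, print "1 1" (v7 through v9, assignments evaluated in reverse order) or other combinations depending on where $? checkpoints occur.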
Re: $? in a simple command with no command name
On 9/1/21 2:23 PM, Robert Elz via austin-group-l at The Open Group wrote: > Date:Wed, 1 Sep 2021 19:04:12 +0100 > From:Harald van Dijk > Message-ID: <837d3b5b-ac61-98eb-2741-d667a78e2...@gigawatt.nl> > > | Is there any statement that overrides the general definition to > | explicitly make this unspecified? If not, the general definition applies > | and $? must expand to 0 both times it appears on line 2. > > Perhaps as currently written that's correct, but if so, the standard > probably needs to be updated, as it is fairly clear that shells which > set $? as each command substitution finishes have always existed (in > fact, that might have been what the original Bourne shell did, I haven't > checked) and the standard should allow for that. Given the following: (exit 42) a=$? b=`false` b=$? echo $? $a $b Bash prints 1 42 1. The original (v7) bourne shell and the rest of the research line through v9 prints 1 1 (b is set to the empty string). That implies that it executes the assignment statements in reverse order, in addition to carrying $? through the sequence of assignments. The SVR4.2 shell prints 1 42 1. I imagine the rest of the SVR4 line sh is the same. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: utilities and write errors
On 6/30/21 11:49 AM, Joerg Schilling via austin-group-l at The Open Group wrote:

Erm, yes. For some reason, I assumed the OP wrote &> instead of >&, which have the same meaning in GNU bash (but &> is the parse-trouble one, even if the bash manpage actively recommends it). I guess their '>&' confused me. My point of _please_ using '>file 2>&1' instead is still valid, ofc.

BTW: I would not call it a hard parse error but a semantic problem, since the standard only mentions numbers after >&

It does not. The redirection is specified as `[n]>&word'. The standard says:

"If word evaluates to one or more digits, the file descriptor denoted by n, or standard output if n is not specified, shall be made to be a copy of the file descriptor denoted by word; if the digits in word do not represent a file descriptor already open for output, a redirection error shall result; see Consequences of Shell Errors. If word evaluates to '-', file descriptor n, or standard output if n is not specified, is closed. Attempts to close a file descriptor that is not open shall not constitute an error. If word evaluates to something else, the behavior is unspecified."

Everyone is conformant here. There is unspecified behavior.

-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
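The two specified cases of `[n]>&word` quoted above are easy to demonstrate in any POSIX shell: digits duplicate a descriptor, '-' closes one.

```shell
# word is digits: 2>&1 makes stderr a copy of stdout, so the
# command substitution captures what was written to stderr.
msg=$({ echo to-stderr >&2; } 2>&1)
echo "$msg"               # prints: to-stderr

# word is '-': open fd 3 as a copy of stdout, write through it,
# then close it again.
exec 3>&1
echo via-3 >&3
exec 3>&-
```

The unspecified case is exactly the one in the thread: `>&word` where word is neither digits nor '-' (e.g. `>&file`), which bash treats as its historical csh-style "redirect both" extension.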
Re: utilities and write errors
On 6/29/21 5:09 PM, tg...@mirbsd.org via austin-group-l at The Open Group wrote: I know the GNU bash extension >& (which incidentally violates POSIX on the parse level) but not ~>&… It doesn't. It's been unspecified for over 30 years. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: behavior of printf '\x61'
On 4/16/21 3:41 PM, Philip Guenther via austin-group-l at The Open Group wrote: - 7. An additional conversion specifier character, b, shall be supported as follows. ... The interpretation of a followed by any other sequence of characters is unspecified. - That exception is about the %b conversion and the handling of its argument, so while that says that printf %b '\x61' is unspecified, it doesn't apply to printf '\x61' A strict reading of the standard says that it's not converted, as per your original message, since `File Format Notation' says: "Characters that are not "escape sequences" or "conversion specifications", as described below, shall be copied to the output." and \x is not a described "escape sequence." The printf description explicitly allows octal, but not hex. Output varies widely ('a', '\x61', 'x61'). I consider hex output a valid extension, but others probably will not. I believe it's a defect in the standard, though. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: execve(2) optimisation for last command
On 4/15/21 4:36 PM, Martijn Dekker via austin-group-l at The Open Group wrote: Most shells 'exec' the last command in -c scripts, e.g.: However, no shell seems to do this for scripts loaded from a file: My question: why is this? I would have thought that a script is a script and that it should make no essential difference whether a script is taken from a -c argument or loaded from a file. What makes the optimisation appropriate for one but not the other? My guess is that all these shells read scripts a line (or command) at a time, and don't realize they're at EOF until after they've executed the last command. (In a nutshell, that's what bash does.) Commands executed with -c don't have this limitation. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [Shell Command Language][shortcomings of command utlity][improving robustness of POSIX shells]
On 4/13/21 5:16 AM, Harald van Dijk via austin-group-l at The Open Group wrote: Please note again that POSIX's Command Search and Execution doesn't say "continue until execve() doesn't fail". It says "Otherwise, the command shall be searched for using the PATH environment variable as described in XBD Environment Variables", and then what happens to the result of that search. It very clearly separates the search from the attempt to execute. The complicating factor is POSIX's definition of "executable file." You search "until an executable file with the specified name and appropriate execution permissions is found." An executable file is a "regular file acceptable as a new process image file by the equivalent of the exec family of functions." And the only way to determine that is by trying to execute it using one of "the exec family of functions." That said, this is the most marginal of corner cases, notwithstanding that bash has a distinct option to handle it (disabled by default). -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [Shell Command Language][shortcomings of command utlity][improving robustness of POSIX shells]
On 4/12/21 12:05 PM, Robert Elz via austin-group-l at The Open Group wrote: Anything that the system can run, no matter how it does that, is acceptable. If a system noticed a VAX format a.out, it could load a vax simulator, and run the binary that way, without the user even noticing. If it wanted. You just described basically how macOS runs Intel binaries on M1 hardware, and how Intel hardware ran PowerPC binaries before that. No mystery here. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [Shell Command Language][shortcomings of command utlity][improving robustness of POSIX shells]
On 4/11/21 4:17 PM, shwaresyst via austin-group-l at The Open Group wrote: conforming applications can not rely on unspecified behaviors, so having a use beyond that specified makes the shell nonconforming. Calling it out like that simply acknowledges a lot of shell implementations choose to make themselves nonconforming, I do not see it as an endorsement or allowance. This is just wrong. By this definition, every shell is non-conforming. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar
On 2/19/21 3:32 PM, Robert Elz wrote:

Date: Fri, 19 Feb 2021 14:30:25 -0500
From: Chet Ramey
Message-ID: <2b32112c-de72-c713-3f87-6840828c3...@case.edu>

| Nope, it's consistent with the standard.

I can understand that argument.

| that's not a fair reading of rule 4.

Whenever we need to rely upon "fair" readings (which generally means that it isn't unambiguous, but it "must" mean ...) we have a problem.

And here we are.

It clearly needs to be fixed. bash is alone amongst shells in interpreting it that way.

Nope, yash in extra-pedantic-posix mode interprets it the same way. This is what happens when you implement the grammar to the standard and it's such a nonsense case that nobody ever reports it as an error.

Everyone else allows that "esac" to be Esac only when it causes a match, and not when it causes a syntax error, which is what 2.10.1 says.

As I said earlier, it's not hard to fix, just another special case in the grammar caught in lexical analysis.

Chet

-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar
On 2/19/21 12:56 PM, Robert Elz via austin-group-l at The Open Group wrote:

bash's behaviour is a little weird:

Nope, it's consistent with the standard.

bash5 $ case esac in (esac) echo match
-bash: syntax error near unexpected token `esac'
bash5 $ esac
-bash: syntax error near unexpected token `esac'

It is obviously converting the "esac" to Esac, which is correct according to POSIX, but then apparently expecting it to be a pattern, which is not correct; it should be terminating the case statement (as zsh does), making it be that the following ')' is incorrect.

OK. You return the `(' as a token, which, since you're looking for a pattern list, takes you to a state where you apply rule 4. We agree that a fair reading of rule 4 results in Esac, as above, which is a syntax error. There is nothing in the standard that allows you to treat the Esac token in that state as terminating the case statement, nor is there anything that allows you to discard the `('. It's just an error. This is the crux of Geoff's argument: that's not a fair reading of rule 4.

-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
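For a script author, the whole ambiguity disappears with quoting: rule 4 applies only when the TOKEN is *exactly* the reserved word esac, and a quoted "esac" is not, so it is returned as a WORD and parses as a pattern everywhere. A small demonstration (run under bash, since the thread is about shell-to-shell differences):

```shell
# The quoted pattern is an ordinary WORD, not the Esac token,
# so this is unambiguous and portable across the shells discussed.
case esac in
  "esac") echo match ;;
esac
# prints: match
```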
Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar
On 2/19/21 11:21 AM, Geoff Clare via austin-group-l at The Open Group wrote:

There is no way to apply rule 4 to produce "a token identifier acceptable at that point in the grammar". The only token identifier acceptable at that point in the grammar is WORD, and rule 4 does not produce WORD.

Rule 4 reads: When the TOKEN is exactly the reserved word esac, the token identifier for esac shall result. Otherwise, the token WORD shall be returned.

Here, the TOKEN is exactly the reserved word esac, and you agree that this rule is applied. This therefore produces the token identifier for esac. There is nothing else that turns it into WORD, which is needed to parse it as a pattern.

I see your point. The wording of rule 4 itself does not yield WORD in this case; it's only when read in combination with the introductory text from 2.10.1 that it becomes apparent that this is the intention. So "acceptable at that point in the grammar" is indeed carrying a heavy load here. You might want to add the qualifying language you suggested.

Incidentally, bash 3 on macOS gets the '|' case wrong, e.g.:

case esac in foo|esac) echo match;; esac

whereas bash5 accepts that. So it would appear that Chet fixed the preceded-by-'|' case at some point but not the preceded-by-'(' case.

It's just another special case in the grammar that lexical analysis has to handle.

-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar
On 2/19/21 11:22 AM, Geoff Clare via austin-group-l at The Open Group wrote: Yes, rule 4 is applied there, but your mistake is in assuming that the *result* of rule 4 is that the token is converted to an Esac. How is it not? "the [sic] TOKEN is exactly the reserved word esac" at this point. Why would it not return the token for `esac'? Or are you saying that is not converted to an Esac? Harald made essentially the same point in his last mail - see my reply to that. Yeah, I hadn't gotten there yet. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar
On 2/19/21 10:52 AM, Donn Terry via austin-group-l at The Open Group wrote: It was oh so many years ago that I originally wrote that hideously awful grammar to try to reflect what the ksh did, which was very much ad-hoc parsing. I won't apologise for the ksh language the grammar tries to reflect, or for the grammar itself since ksh is definitely not context-free and thus requires such awfulness. But I feel awful that it's inflicted on POSIX users. I think you deserve a lot of credit for that work. It's a much more daunting task than people appreciate. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001454]: Conflict between "case" description and grammar
On 2/19/21 10:33 AM, Geoff Clare via austin-group-l at The Open Group wrote: Observe that rule 4 is applied for the first word in a pattern even if that pattern follows an opening parenthesis. Because of that, in my example, the esac in parentheses is interpreted as the esac keyword token, not a regular WORD token that makes for a valid pattern. Yes, rule 4 is applied there, but your mistake is in assuming that the *result* of rule 4 is that the token is converted to an Esac. How is it not? "the [sic] TOKEN is exactly the reserved word esac" at this point. Why would it not return the token for `esac'? Or are you saying that is not converted to an Esac? -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: Bug 1393 ("command" declaration utility) possible solution
On 1/9/21 4:34 PM, Martijn Dekker via austin-group-l at The Open Group wrote: I would question that the currently published standard allows any regular builtin to override the regular shell grammar with special syntactic properties. What exactly do you base this on? I imagine it's because these builtins also accept the syntax the standard deems legal. As long as they do that, they can do whatever else they want with syntax that would be an error by the rule of the standard. This is also not a majority shell behaviour. For instance, bash does not currently do this. I wonder if Chet has any plans of changing that now. I currently do not. I have other priorities. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()
On 11/19/20 5:05 AM, Geoff Clare via austin-group-l at The Open Group wrote: 2. If the last resource-specifying option has no option-argument, treat the operand as if it was an option-argument for that option; otherwise report a usage error (or ignore the operand). This option sounds like it's the most reasonable. I will look to add it, or something along these lines, in bash-5.2. It's too late for bash-5.1. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()
On 11/17/20 10:56 AM, Geoff Clare via austin-group-l at The Open Group wrote: Chet Ramey wrote, on 17 Nov 2020: On 11/17/20 10:14 AM, Geoff Clare via austin-group-l at The Open Group wrote: Maybe you could handle those by seeing that the option argument is alphabetic (and not "unlimited") and treating it as a string of option letters instead of reporting that it is an invalid number. From `getopt's perspective, there is no difference between -fH and -f H. They both return `H' in optarg. One increments optind by 1 and the other by 2, which means it's possible to distinguish the two cases. The bash builtin getopt doesn't quite do things the same way, since it uses the word lists bash passes around. I could recognize this case, but it seems fragile. Or I could just go with my original suggestion of adding: Conforming applications shall specify each option separately; that is, grouping option letters (for example, −fH) need not be recognized by all implementations. to my proposal. Sure, that would work. Okay, looks like that will be the end result, unless you like my optind trick. It would improve the portability of ksh scripts (or perhaps more likely, commands typed by a user) to bash. The new language is sufficient. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()
On 11/17/20 10:14 AM, Geoff Clare via austin-group-l at The Open Group wrote: Maybe you could handle those by seeing that the option argument is alphabetic (and not "unlimited") and treating it as a string of option letters instead of reporting that it is an invalid number. From `getopt's perspective, there is no difference between -fH and -f H. They both return `H' in optarg. There's no good reason to try and treat the latter case as a series of option letters. Or I could just go with my original suggestion of adding: Conforming applications shall specify each option separately; that is, grouping option letters (for example, −fH) need not be recognized by all implementations. to my proposal. Sure, that would work. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()
On 11/17/20 4:53 AM, Geoff Clare via austin-group-l at The Open Group wrote: Chet Ramey wrote, on 16 Nov 2020: On 11/16/20 11:05 AM, Geoff Clare via austin-group-l at The Open Group wrote: Chet Ramey wrote, on 16 Nov 2020: Thanks. Looks like bash is parsing the ulimit options in an unusual way instead of using getopt() or similar. Quite the opposite. The bash ulimit builtin uses the same internal_getopt code as the rest of the builtins, with the addition that option-arguments are allowed to be missing and that `-f' is the default in the absence of any other resource options. What's happening is that the bash getopt treats `-xARG' identically to `-x ARG'. When it sees `-fH' it assumes, since -f takes an option argument, that `H' is that argument. This is the same thing that POSIX getopt does; of course, it doesn't handle optional arguments at all. Huh? This contradicts what you said above: "with the addition that option-arguments are allowed to be missing". That is not a missing option-argument. A missing option argument is something like `ulimit -c'. The POSIX getopt would not consider `-fH' a missing option argument either, assuming `f:' were specified in the option string; that's the point. It also doesn't match the behaviour I see in bash 5: $ bash -c 'ulimit -fc' bash: line 0: ulimit: c: invalid number There's nothing mysterious here. The `c' constitutes the option-argument. Numbered item 2 in the POSIX getopt description says the same thing. The only way it would not be considered an option argument is if it looked like an option itself: a separate argument preceded by a `-'. Which leads me to the next example: $ bash -c 'ulimit -f -c' file size (blocks, -f) unlimited core file size (blocks, -c) 10 If it did not handle optional option-arguments, then this last command would fail with "bash: line 0: ulimit: -c: invalid number". Indeed, that's one aspect of the optional argument implementation. 
But all of this discussion about getopt and optional arguments is a red herring anyway. POSIX finesses all of this by not having option arguments in the `ulimit' description at all, and the `newlimit' in the standard and the current proposal is a separate operand. That limits you to modifying one limit per call. To match your description it would need to include: ... [-c[limit]] [-f[limit]] ... Nah. I'd have to do that for every option, and I'd like to keep the manual under 200 pages. The descriptive text explains things. In that case you should give a synopsis that does not specify syntax, instead of giving one that contradicts the description. The readers seem to pick it up pretty well. I haven't received reports that the description `contradicts' the syntax summary. I understand that the POSIX syntax summary is a language, and if you're fluent in its syntax and semantics, it's easy to read things in its terms. Not a lot of people do that. 2. Otherwise, optarg shall point to the string following the option character in that element of argv, and optind shall be incremented by 1." This requires that if the option string includes "f:" and the argument list has -f followed by -c then the -c is returned in optarg. That's not what bash's ulimit does. Let's try this again. The example in question here is `ulimit -fc'. If the `f' were specified as returning an option-argument, the POSIX getopt would certainly return `c' in optarg. The bash option string includes `;', which specifies that the option-argument may not be present, but it is. If you consider `-f -c', the `-c' looks like an option, so the optional argument code considers the option-argument to be missing. The POSIX `ulimit' description specifies that `newlimit' is an operand, not an option-argument, so this discussion is academic. One consequence of the POSIX description is, as I said above, that it restricts each invocation to modifying one limit. That's how it can finesse the `newlimit is an operand'. 
I'm not going to reduce functionality and throw away backwards compatibility without a better reason. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
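The grouping behavior at issue in this thread is easy to reproduce. A minimal sketch, assuming a bash whose ulimit builtin treats option-arguments as optional:

```shell
# Given separately, -f's option-argument is considered missing
# (the following word looks like an option), so both limits are
# reported:
bash -c 'ulimit -f -c'

# Grouped, the 'c' in '-fc' is consumed as -f's option-argument,
# and since it is not a number the builtin reports an error:
bash -c 'ulimit -fc'
```

On a typical system the first command prints two limit lines and succeeds, while the second fails with a diagnostic along the lines of "ulimit: c: invalid number".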
Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()
On 11/16/20 11:05 AM, Geoff Clare via austin-group-l at The Open Group wrote: Chet Ramey wrote, on 16 Nov 2020: Thanks. Looks like bash is parsing the ulimit options in an unusual way instead of using getopt() or similar. Quite the opposite. The bash ulimit builtin uses the same internal_getopt code as the rest of the builtins, with the addition that option-arguments are allowed to be missing and that `-f' is the default in the absence of any other resource options. What's happening is that the bash getopt treats `-xARG' identically to `-x ARG'. When it sees `-fH' it assumes, since -f takes an option argument, that `H' is that argument. This is the same thing that POSIX getopt does; of course, it doesn't handle optional arguments at all. This is completely different to the syntax documented in the bash manual at https://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html which is: ulimit [-HSabcdefiklmnpqrstuvxPT] [limit] To match your description it would need to include: ... [-c[limit]] [-f[limit]] ... Nah. I'd have to do that for every option, and I'd like to keep the manual under 200 pages. The descriptive text explains things. The utility syntax guidelines give a passing nod to this situation: "One or more options without option-arguments, followed by at most one option that takes an option-argument, should be accepted when grouped behind one '-' delimiter." That's needed for all options that take an option-argument, regardless of whether mandatory or optional. Sure. It fits this situation exactly, then. Optional arguments take this out of the realm of getopt() anyway. But doesn't the description of getopt() in https://pubs.opengroup.org/onlinepubs/9699919799/functions/getopt.html#tag_16_206 require the bash behavior for options that do take an argument? "If the option takes an argument, getopt() shall set the variable optarg to point to the option-argument as follows: 1. 
If the option was the last character in the string pointed to by an element of argv, then optarg shall contain the next element of argv, and optind shall be incremented by 2. If the resulting value of optind is greater than argc, this indicates a missing option-argument, and getopt() shall return an error indication. 2. Otherwise, optarg shall point to the string following the option character in that element of argv, and optind shall be incremented by 1." Optional option-arguments are explained in XBD 12.1 item 2: Yeah, getopt doesn't follow that one. This is standard GNU getopt behavior. As an extension to POSIX (because you have to somehow tell it via the option string that the option takes an optional option-argument, which is beyond what POSIX specifies for getopt). However, since it doesn't implement them the way the standard requires, that does mean that GNU getopt can't be used for option handling in any of the utilities in POSIX that are specified as having options with an optional option-argument. Maybe not. Neither can POSIX getopt, so we're back to the "or similar" part of your original message. That doesn't seem to help portability. This doesn't rise to the level of anything that would inspire me to break that much backwards compatibility. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: [1003.1(2016/18)/Issue7+TC2 0001418]: Add options to ulimit to match get/setrlimit()
On 11/16/20 4:45 AM, Geoff Clare via austin-group-l at The Open Group wrote: Jilles Tjoelker wrote, on 13 Nov 2020: On Mon, Nov 09, 2020 at 03:07:43PM +, Geoff Clare via austin-group-l at The Open Group wrote: The ksh and bash behaviour of reporting multiple values seems more useful to me, but I wouldn't object if others want to make this unspecified. With bash, reporting multiple values does not work if the options are grouped into a single argument: % bash -c 'ulimit -fn' bash: line 0: ulimit: n: invalid number % bash -c 'ulimit -f -n' file size (blocks, -f) unlimited open files (-n) 231138 With ksh93, both these commands work as expected. Similarly, commands like ulimit -fH do not work in bash. It must be -Hf, -H -f or -f -H. Thanks. Looks like bash is parsing the ulimit options in an unusual way instead of using getopt() or similar. Quite the opposite. The bash ulimit builtin uses the same internal_getopt code as the rest of the builtins, with the addition that option-arguments are allowed to be missing and that `-f' is the default in the absence of any other resource options. What's happening is that the bash getopt treats `-xARG' identically to `-x ARG'. When it sees `-fH' it assumes, since -f takes an option argument, that `H' is that argument. This is standard GNU getopt behavior. The utility syntax guidelines give a passing nod to this situation: "One or more options without option-arguments, followed by at most one option that takes an option-argument, should be accepted when grouped behind one '-' delimiter." -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
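The ordering constraint Jilles describes can be sketched directly; bash is assumed here, and the behavior shown is what the message reports:

```shell
# -H takes no option-argument, so it can lead a group:
bash -c 'ulimit -Hf'

# Reversed, 'H' is consumed as -f's option-argument and rejected
# because it is not a number:
bash -c 'ulimit -fH'
```

The first form reports the hard file-size limit; the second fails, which is why the thread notes it must be written -Hf, -H -f, or -f -H.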
Re: Status of $'...' addition (was: ksh93 job control behaviour)
On 7/30/20 7:29 PM, Robert Elz wrote: > | And for that it would be tremendous if $'' would be defined so > | that it can be used as the sole quoting mechanism, > > No thanks. Partly because $'' is already implemented (widely) > and used (perhaps slightly less yet) - so that ship has sailed. > > I believe I've seen $" ... " used that way somewhere though (don't > recall where) and I believe it is a mistake. None of the existing implementations of $"..." use it in that way. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
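For readers unfamiliar with the two forms being contrasted: $'...' expands C-style backslash escapes at parse time, while $"..." requests locale translation of the string and otherwise acts like plain double quotes. A small sketch, assuming bash:

```shell
# $'...' turns \t into a literal tab during tokenization:
bash -c 'printf "%s\n" $'\''a\tb'\'''

# $"..." with no message catalog installed behaves like "...":
bash -c 'printf "%s\n" $"hello"'
```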
Re: sh: aliases in command substitutions
On 4/23/20 1:08 PM, Robert Elz wrote: > Date: Thu, 23 Apr 2020 11:48:55 -0400 > From: Chet Ramey > Message-ID: > > | Keep in mind that those tests are mutually incompatible > > I didn't see anything I would call that. When run as a single file, as presented. > > But: > | and will produce > | misleading (or at least confusing) results, probably the consequence of > | combining a number of individual files into one. > > If run as a single script file, yes, the D,2 test treats the next > several tests (to the end of F I think it was) as data... They really > should be separated and run one at a time. Indeed. That's why I said what I said. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: sh: aliases in command substitutions
On 4/23/20 5:21 AM, Joerg Schilling wrote: > shwaresyst wrote: > >> I never said this was expected to be clean, or even easy to do, just that it >> is plausible for the feature set desired. What mucks it up is things that >> change how lexical elements are expected to be recognized; case conditions >> should use someting like , with left angles >> being optional, to indicate end of pattern, not ')', but these don't become >> part of the base PCS until Issue 9. > > If you believe it is possible, you could write such a beast and run it > against > the tests from Sven Mascheck Keep in mind that those tests are mutually incompatible and will produce misleading (or at least confusing) results, probably the consequence of combining a number of individual files into one. They're useful; just don't treat the results as gospel. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: aliases in command substitutions
On 4/20/20 9:34 AM, Robert Elz wrote: > | but I've always understood the > | case xxx in > | (pattern) ...;; > | esac > | > | (fully parenthesized pattern) syntax to have been invented precisely > | to allow case statements in $() subshell notation, > > First, $() is command substitution, not a subshell (not really important) > and if that was someone's intent, they did a particularly bad job of > implementing it, as what the standard says is (XCU 2.6.3) He's right, and it happened 30 years ago: "An optional open-parenthesis before pattern was added to allow numerous historical KornShell scripts to conform. At one time, using the leading parenthesis was required if the case statement were to be embedded within a $( ) command substitution; this is no longer the case with the POSIX shell. Nevertheless, many existing scripts use the open-parenthesis, if only because it makes matching-parenthesis searching easier in vi and other editors. This is a relatively simple implementation change that is fully upward compatible for all scripts." This is from 1991, and I'm certain, though I don't have it with me right now, that the same text appeared in the 1992 version of the standard. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
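The rationale text quoted above is straightforward to exercise; this sketch should work in any POSIX shell:

```shell
# The optional leading parenthesis keeps the parentheses balanced
# inside $( ), which is also what makes matching-parenthesis
# searching easier in editors:
x=$(case foo in (f*) echo matched ;; (*) echo no ;; esac)
echo "$x"
```

This prints "matched" in a conforming shell; the leading parenthesis is optional in the POSIX grammar, so omitting it inside $( ) must work too.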
Re: pwd(1) pwd -L and multple adjacent slashes in $PWD,
On 4/14/20 10:05 AM, casper@oracle.com wrote: > >> On 4/14/20 9:44 AM, casper@oracle.com wrote: >>> pwd has the -L option: >>> >>> The following options shall be supported by the implementation: >>> >>> -L >>> If the PWD environment variable contains an absolute pathname >>> of the current directory and the pathname does not contain any >>> components that are dot or dot-dot, pwd shall write this >>> pathname to standard output, except that if the PWD environment >>> variable is longer than {PATH_MAX} bytes including the >>> terminating null, it is unspecified whether pwd writes this >>> pathname to standard output or behaves as if the -P option had >>> been specified. Otherwise, the -L option shall behave as the -P >>> option. >>> >>> >>> It mentions "dot-dot" and "dot". >>> >>> It does seems to allow: >>> >>> (cd /; PWD=// pwd -L) >>> // >>> and >>> (cd /home/casper; PWD=/home///casper pwd -L) >>> /home///casper >>> >>> >>> Is this a correct implmentation? >> >> Does the standard cover this at all? It only mentions PWD being set by `cd' >> and initialized by `sh'. If you assign it directly, at least `cd' is >> explicitly unspecified, and since `pwd' is only required to "remove >> unnecessary slash characters" if -P is supplied, I'd say you've left the >> realm of the standard and the implementation can do what it likes. > > > So you are saying that it would be fine to squish out the additional > slashed in the output? (Not doing anything would be fine, too) Yes. It's unspecified. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
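Since assigning PWD directly leaves the realm of the standard, an implementation may echo the extra slashes back or squash them. A sketch of the case under discussion, where the exact output is the unspecified part:

```shell
# Seed PWD with a doubled slash and ask for the logical pathname;
# pwd -L may print '//' verbatim or a normalized '/', since only
# pwd -P is required to remove unnecessary slash characters:
(cd / && PWD=// pwd -L)
```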
Re: pwd(1) pwd -L and multple adjacent slashes in $PWD,
On 4/14/20 9:44 AM, casper@oracle.com wrote: > pwd has the -L option: > > The following options shall be supported by the implementation: > > -L > If the PWD environment variable contains an absolute pathname > of the current directory and the pathname does not contain any > components that are dot or dot-dot, pwd shall write this > pathname to standard output, except that if the PWD environment > variable is longer than {PATH_MAX} bytes including the > terminating null, it is unspecified whether pwd writes this > pathname to standard output or behaves as if the -P option had > been specified. Otherwise, the -L option shall behave as the -P > option. > > > It mentions "dot-dot" and "dot". > > It does seems to allow: > > (cd /; PWD=// pwd -L) > // > and > (cd /home/casper; PWD=/home///casper pwd -L) > /home///casper > > > Is this a correct implmentation? Does the standard cover this at all? It only mentions PWD being set by `cd' and initialized by `sh'. If you assign it directly, at least `cd' is explicitly unspecified, and since `pwd' is only required to "remove unnecessary slash characters" if -P is supplied, I'd say you've left the realm of the standard and the implementation can do what it likes. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/13/20 10:14 AM, Harald van Dijk wrote: >>> Can this instead say "in the same shell execution environment as the >>> compound-list of the compound-command of the function definition", so that >>> >>> f() (return 1) >>> >>> which is fairly sensible and works in all shells[*] remains well-defined, >>> but only something along the lines of f() { (return 1) } or >>> f() ( (return 1) ) becomes unspecified? >> >> We should be able to do better than that. I don't see why "if not executing >> in the same shell execution environment as the compound-list ..." can't >> cover the f() { (return 1) } case as well, and seems to work in all shells. > > I don't see how you can allow that without also allowing > > f() { (return 7; echo no); echo $?; }; f > > If that also works in all shells (meaning it doesn't print no, and does > print 7), then by all means standardise it. I can't find one that doesn't in my quick initial testing, but I don't have binaries for every shell under the sun. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
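The f() (return 7; echo no) case from the exchange above can be run directly; bash is assumed here, but the thread reports the same result in every shell tested:

```shell
# A subshell-body function: return terminates the subshell, its
# status becomes the function's return status, and 'echo no' is
# never reached:
f() (return 7; echo no)
f
echo "$?"
```

This prints 7 and nothing else, which is the behavior the proposed wording would keep well-defined.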
Re: XCU: 'return' from subshell
On 3/12/20 4:21 PM, Harald van Dijk wrote: > On 11/03/2020 17:44, Don Cragun wrote: >> Would this issue be resolved if we change the last sentence of the >> description section of the return Special Built-In Utility from: >> If the shell is not currently executing a function >> or dot script, the results are unspecified. >> to: >> If the shell is not currently executing a function >> or dot script running in the same shell execution >> environment as the command that invoked the function >> or dot script, the results are unspecified. >> ? > > Can this instead say "in the same shell execution environment as the > compound-list of the compound-command of the function definition", so that > > f() (return 1) > > which is fairly sensible and works in all shells[*] remains well-defined, > but only something along the lines of f() { (return 1) } or > f() ( (return 1) ) becomes unspecified? We should be able to do better than that. I don't see why "if not executing in the same shell execution environment as the compound-list ..." can't cover the f() { (return 1) } case as well, and seems to work in all shells. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 5:24 PM, Dirk Fieldhouse wrote: > Even with this wording, it isn't clear that there is "the function or > dot script, if any" (ie just one, or none) without first applying the > restriction to the same execution environment, depending on whether you > think that asynchronous commands in a function definition are counted in > the "current function", so this perhaps would be better: > > The return utility shall cause the shell to stop executing the > current function or dot script, if any, that is being executed > using the same shell execution environment (see 2.12) as the > command that invoked the function or dot script. Otherwise the > results are unspecified. I think "executed in the same shell execution environment" is better wording, since it parallels usage elsewhere in the standard, such as https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01 steps 1 and 2. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 1:32 PM, Don Cragun wrote: > Would this issue be resolved if we change the last sentence of the > description section of the return Special Built-In Utility from: > If the shell is not currently executing a function > or dot script, the results are unspecified. > to: > If the shell is not currently executing a function > or dot script running in the same shell execution > environment as the command that invoked the function > or dot script, the results are unspecified. I think this is heading in the right direction. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 1:07 PM, Dirk Fieldhouse wrote: > On 11/03/20 15:23, Chet Ramey wrote: >> ...> >> What does a `return from the execution environment' mean, exactly? ... > > To clarify, what I wrote was shorthand for "return from the function if > the 'return' is executed in the same execution environment as" the > function's defining command, or otherwise (ii) exit or (iii) unspecified > behaviour. I don't think `defining command' is right. It's where the function is executed that is the issue. So different execution environments is the way to proceed, but using something like caller instead of defining command. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 12:15 PM, Dirk Fieldhouse wrote: > >> All shells I am aware of print foo and bar > > The discussion seems to have confirmed that this is the general existing > practice, and not just in the few cases I tested, but IMO only a shell > implementer could see the suggested behaviour of these examples as > baffling, based on the wording of the standard (not to mention "man sh", > etc, so I won't). If it's the wording that implies possible behavior that no shell implements, let's fix the wording. > The question is what, if any, rewording of the standard should be made. > There are plenty of choices for better designed scripting languages, so > arguably making the specification agree with existing practice would be > an acceptable resolution. The example of DR 842 for 'break' and > 'continue' shows that this should not be seen as an unnecessary change. We can use 842 as a model for the changes. Someone needs to propose new wording that's comprehensive enough to cover the different cases. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/12/20 6:02 AM, Joerg Schilling wrote: > Chet Ramey wrote: > >> I use Mac OS X. I test on Linux. >> >> By far, the biggest difference between older Mac OS X/Linux and current Mac >> OS X is using lldb instead of gdb for debugging. > > OK, I develop on Solaris and like dbx in favor of gdb et al., but it seems > to be a pity that the Solaris compilers do not get updates for OpenSolaris > anymore. > > I test on various platforms and as a result, I recently discovered that > waitid() on Mac OS is still not usable even though there is a POSIX > certification. The still remaining problem is that it always returns a signal > number of 0 if the child has been killed by a signal. So for a portable > program like bash, it seems that OSX and Linux are not sufficient. It might be, if bash used waitid(). -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 11:52 AM, Joerg Schilling wrote: > Chet Ramey wrote: > >> On 3/11/20 11:46 AM, Joerg Schilling wrote: >> >>> Since you most likely develop on Linux >> >> I don't; don't make assumptions. > > Interesting, where do you develop? I use Mac OS X. I test on Linux. By far, the biggest difference between older Mac OS X/Linux and current Mac OS X is using lldb instead of gdb for debugging. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 11:46 AM, Joerg Schilling wrote: > Since you most likely develop on Linux I don't; don't make assumptions. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 11:30 AM, Joerg Schilling wrote: > Chet Ramey wrote: > >>> and that "foo" and not >>> "bar" should be printed in each case: >>> >>> f1() { >>> ( echo foo; return ) >>> echo bar >>> } >> >> This implies some interprocess communication between the parent and child >> that simply doesn't exist, and nothing in the standard indicates that it >> should. > > I don't see that we should do this, but if you like to be able to reliably get a > > ``NOEXEC'' or ``NOTFOUND'' > > from expanding "$/", there is a need for interprocess communication unless > you > use vfork() for that specific command. What is "$/"? Nobody, with perhaps the exception of bosh, implements that. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 11:13 AM, Stephane Chazelas wrote: > 2020-03-11 09:55:57 -0400, Chet Ramey: >> On 3/11/20 5:43 AM, Stephane Chazelas wrote: >> >>> AFAIK, bash and bosh are the only shells that complain when you >>> use return outside of functions/sourced scripts (bash also >>> doesn't exit upon that failing "return" special builtin in that >>> case which could be seen as a conformance bug). >> >> You really should try posix mode. > [...] > > As it happens, I did in that case, and I found it behaved the > same in or outside of POSIX mode: > > $ bash -c 'return; echo "$?"' > bash: line 0: return: can only `return' from a function or sourced script > 1 > $ bash -o posix -c 'return; echo "$?"' > bash: line 0: return: can only `return' from a function or sourced script This wasn't covered in my previous message; it was changed in January 2019. $ ./bash -o posix -c 'return ; echo after' ./bash: line 0: return: can only `return' from a function or sourced script -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 11:13 AM, Stephane Chazelas wrote: > 2020-03-11 09:55:57 -0400, Chet Ramey: >> On 3/11/20 5:43 AM, Stephane Chazelas wrote: >> >>> AFAIK, bash and bosh are the only shells that complain when you >>> use return outside of functions/sourced scripts (bash also >>> doesn't exit upon that failing "return" special builtin in that >>> case which could be seen as a conformance bug). >> >> You really should try posix mode. > [...] > > As it happens, I did in that case, and I found it behaved the > same in or outside of POSIX mode: $ ./bash ./x5 ./x5: line 1: return: can only `return' from a function or sourced script after $ ./bash -o posix ./x5 ./x5: line 1: return: can only `return' from a function or sourced script $ cat x5 return 7 echo after -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 10:49 AM, Dirk Fieldhouse wrote: > On 11/03/20 14:03, Chet Ramey wrote: >>...> >> So what's the goal here? That the function continue execution in the >> subshell so `return' has consistent, if baffling, semantics? That we >> tighten up the language to make the unspecified specific? What is this >> discussion intended to accomplish? > > I refer you to this excerpt: > >> On 3/11/20 9:12 AM, Dirk Fieldhouse wrote:>...> >>> a) the wording of the standard about 'return' doesn't say this (or >>> as you said, >>>> what [I] believe it appears to say, and what it >>>> actually means, are probably not the same thing. >>> which is not a good look for a standard); >>...> > > If the standard can easily be misinterpreted, it ought to be reworded. So the latter, then. We can move on to proposing language. > The interesting discussion prompted by my original post indicates that > even experts don't agree on the interpretation of the text that > specifies 'return'. > > Did the wise authors of that text mean 'return' to cause: > > i) a return from the function's lexical scope, subject to some missing > definition of that scope, or Yes. > > ii) a return from the execution environment of the function's defining > command, or otherwise like 'exit', or In the case of a subshell or other separate execution environment, the `exit' seems the most reasonable action. It would be far worse if the function continued execution in a subshell. > > iii) a return from the execution environment of the function's defining > command, or otherwise unspecified? > > Perhaps the answer is (i) but owing to existing practice the standard > should say (ii) or (iii). What does a `return from the execution environment' mean, exactly? Does it mean that the calling shell should exit somehow? 
Since functions are executed in the same execution environment as the caller, and subshells are created as necessary as part of the function body execution, does the `defining command' mean the caller, or something else? > > As to "baffling semantics", I suggest that these are two examples where > 'return' is meaningful (and far from baffling) I assert that having the function (and the rest of any script) continue to execute in a subshell because a `return' appeared in a subshell would be baffling and difficult to explain to users. > and that "foo" and not > "bar" should be printed in each case: > > f1() { > ( echo foo; return ) > echo bar > } This implies some interprocess communication between the parent and child that simply doesn't exist, and nothing in the standard indicates that it should. > > and > > f2() { > echo foo | > if read -r xx && [ "$xx" = foo ]; then > echo "$xx"; return > else > echo "$xx" > fi > echo bar > } This is unspecified, and has been ever since ksh decided to run the last pipeline element in the current shell process (or execution environment, if you prefer). You can't rely on either behavior. The number of shells that print `bar' exceeds the number that don't, for what that's worth. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
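The f1 example from the message can be run as written; bash is assumed here, and the result matches the existing practice the thread describes (both lines print, and the function does not return early):

```shell
f1() {
  # return here only exits the ( ... ) subshell; the parent shell
  # executing the function body carries on to the next command:
  (echo foo; return)
  echo bar
}
f1
```

This prints foo and then bar, which is why having the return terminate anything beyond the subshell would require interprocess communication that doesn't exist.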
Re: XCU: 'return' from subshell
On 3/11/20 9:12 AM, Dirk Fieldhouse wrote: > On 11/03/20 06:25, Robert Elz wrote: >> >> ... The standard by [referring to 'subshell environment'] is trying ... >> to avoid constraining the implementation, things that an implementation >> can work out how to do without forking it can do (which will make it >> faster, and less expensive to run) - but it must preserve the fiction >> that it has forked, as other shells do, and scripts are allowed to >> rely upon that, so no side effects (such as a return in a subshell >> causing a function in the parent to return) are permitted. > > Absolutely, but > > a) the wording of the standard about 'return' doesn't say this (or as > you said, >> what [I] believe it appears to say, and what it >> actually means, are probably not the same thing. > which is not a good look for a standard); > > b) in particular, returning from a subshell is not one of the forbidden > side-effects that is mentioned in or can be inferred from the text of 2.12. > > Isn't returning on use of 'return' an effect (ie, the actual behaviour > expected by the script author) rather than a side-effect? So what's the goal here? That the function continue execution in the subshell so `return' has consistent, if baffling, semantics? That we tighten up the language to make the unspecified specific? What is this discussion intended to accomplish? -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
Re: XCU: 'return' from subshell
On 3/11/20 5:43 AM, Stephane Chazelas wrote: > AFAIK, bash and bosh are the only shells that complain when you > use return outside of functions/sourced scripts (bash also > doesn't exit upon that failing "return" special builtin in that > case which could be seen as a conformance bug). You really should try posix mode. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: [1003.1(2004)/Issue 6 0000267]: time (keyword)
On 2/18/20 12:00 PM, shwaresyst wrote: > > Don't see why, lparen, like "=", is not a char that stops collection of a > token body I'm not exactly sure what a `token body' is supposed to be, but the left paren does delimit a token. A `=' does not. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: [1003.1(2004)/Issue 6 0000267]: time (keyword)
On 2/18/20 10:36 AM, Robert Elz wrote: > Date:Tue, 18 Feb 2020 09:58:32 -0500 > From: Chet Ramey > Message-ID: <22f16ef4-41cf-60b0-5968-f608dc988...@case.edu> > > | The 1992 version of the standard knew about time, standardized it as part > | of the UPE, and acknowledged that it worded the definition to allow the > | ksh88 reserved word implementation. > > Well, kind of - they made it unspecified what > > time a | b > > timed ("a" or "a|b") Yes. "The times reported are unspecified." but ksh (all versions I believe) bash and bosh > all allow > time(sleep 1) > which nothing in the standard explains (or allows). "If the current character is not quoted and can be used as the first character of a new operator, the current token (if any) shall be delimited." Similarly > time { sleep 1; } > and > time if true; then sleep 1; else sleep 2; fi They're both pipelines according to the shell grammar. > And in a subsequent message, chet.ra...@case.edu said: > | There are people on the bash mailing lists who would like a word. > > There are some very strange people on those lists! A couple of representative examples. https://lists.gnu.org/archive/html/help-bash/2018-12/msg00110.html https://lists.gnu.org/archive/html/help-bash/2018-12/msg00092.html -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/
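The compound-command forms discussed above are all pipelines as far as the shell grammar is concerned, so the time reserved word accepts them; a sketch assuming bash (the timing report goes to stderr):

```shell
# time before a subshell, a brace group, and an if compound:
bash -c 'time ( sleep 0 )'
bash -c 'time { sleep 0; }'
bash -c 'time if true; then sleep 0; fi'
```

All three succeed in bash; the quoted "first character of a new operator" rule is what lets even `time(sleep 1)` with no space tokenize cleanly.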
Re: [1003.1(2004)/Issue 6 0000267]: time (keyword)
On 2/18/20 9:59 AM, Robert Elz wrote: > Why someone would want to time a builtin I'm not sure > (with the possible exception of elapsed time of wait) There are people on the bash mailing lists who would like a word. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/
Re: [1003.1(2004)/Issue 6 0000267]: time (keyword)
On 2/18/20 5:31 AM, Joerg Schilling wrote: > All shells I am aware detect the -p option in the parser already and switch > to > the time utility instead of the time keyword. Bash doesn't do that; it's not useful and difficult to do using a bison-based parser. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU  c...@case.edu  http://tiswww.cwru.edu/~chet/