[1003.1(2016/18)/Issue7+TC2 0001585]: kill - add -j option to avoid PID reuse race
A NOTE has been added to this issue.
==
https://austingroupbugs.net/view.php?id=1585
==
Reported By:          steffen
Assigned To:
==
Project:              1003.1(2016/18)/Issue7+TC2
Issue ID:             1585
Category:             Shell and Utilities
Type:                 Enhancement Request
Severity:             Editorial
Priority:             normal
Status:               New
Name:                 steffen
Organization:
User Reference:
Section:              Vol. 3: Shell and Utilities
Page Number:          2879
Line Number:          94942
Interp Status:        ---
Final Accepted Text:
==
Date Submitted:       2022-05-14 22:05 UTC
Last Modified:        2022-05-16 23:41 UTC
==
Summary:              kill - add -j option to avoid PID reuse race
==
--
(0005839) steffen (reporter) - 2022-05-16 23:41
https://austingroupbugs.net/view.php?id=1585#c5839
--
Maybe i do misunderstand. Then i would retract the issue.

What this issue wants to achieve is to close the gap between wait(1) and
waitpid(2)/x(2).

- A saved-away process identifier will be known to the shell until wait(1)
  has been called on it.
- The process itself is known to the operating system, aka kept in the
  process table, until it has been waitpid(2)ed for.

If i kill(1) a process that is still known to the sh(1)ell because wait(1)
has not yet been called, but the shell itself has already waitpid(2)ed on
the child, after having received SIGCHLD or for whatever reason, then the
operating system may already have reused the process identifier.

The -j option to kill(1) should overcome this gap in that the sh(1)ell is
forced to check, for each given identifier, whether the process has already
been waitpid(2)ed for or not. In the first case "kill -j -SIG PID" shall
fail.

I do come here for a reason. Years ago i "saved away process identifiers",
and was under the impression that the sh(1)ell applies special care to
those process identifiers until wait(1) has been called on them. But it
turned out i could kill(1) a process that was no longer mine, even though
i had not yet wait(1)ed on the process identifier. The shell simply called
kill(2) on the process identifier, which in the meantime had been reused by
the operating system.

So maybe this was a sh(1)ell bug, and shells are not allowed to call
waitpid(2) on a process whose saved-away process identifier has not been
wait(1)ed for (that is, they may only call it when the latter is called).
If this was so i would retract this issue.

If not, then the default behaviour of sh(1)ell wait(1) cannot be changed,
since this could break things in the wild. There should be an option to
explicitly request that saved-away process identifiers are still alive
when kill(1)ing them.

Issue History
Date Modified     Username    Field           Change
==
2022-05-14 22:05  steffen     New Issue
2022-05-14 22:05  steffen     Name            => steffen
2022-05-14 22:05  steffen     Section         => Vol. 3: Shell and Utilities
2022-05-14 22:05  steffen     Page Number     => 2879
2022-05-14 22:05  steffen     Line Number     => 94942
2022-05-16 08:21  geoffclare  Note Added: 0005835
2022-05-16 08:33  geoffclare  Note Edited: 0005835
2022-05-16 10:08  kre         Note Added: 0005836
2022-05-16 13:37  steffen     Note Added: 0005837
2022-05-16 13:54  steffen     Note Added: 0005838
2022-05-16 23:41  steffen     Note Added: 0005839
==
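The gap described in this note can be illustrated with a minimal sketch. It is written in Python rather than shell, and nothing in it comes from the issue text: the class `ChildTable` and its methods `spawn_sleeper`, `reap`, and `kill_checked` are hypothetical names made up for illustration. The sketch assumes a single-threaded parent with no SIGCHLD handler that reaps children on its own. It keeps a set of child PIDs that have not yet been waited for and refuses to signal anything outside that set, which is roughly the check the proposed -j option asks the shell to perform.

```python
import os
import signal
import time


class ChildTable:
    """Hypothetical bookkeeping: signal a child only while it is unreaped."""

    def __init__(self):
        self._unreaped = set()  # PIDs forked here and not yet waited for

    def spawn_sleeper(self, seconds):
        """Fork a child that just sleeps; return its PID to the parent."""
        pid = os.fork()
        if pid == 0:
            # Child: sleep, then exit without running the parent's cleanup.
            try:
                time.sleep(seconds)
            finally:
                os._exit(0)
        self._unreaped.add(pid)
        return pid

    def reap(self, pid):
        """Wait for the child; afterwards the kernel may reuse the PID."""
        _, status = os.waitpid(pid, 0)
        self._unreaped.discard(pid)
        return status

    def kill_checked(self, pid, sig=signal.SIGTERM):
        """Send sig only if pid is still an unreaped child of this process."""
        if pid not in self._unreaped:
            return False  # already reaped (or never ours): PID may be reused
        os.kill(pid, sig)  # an unreaped child keeps its PID reserved
        return True
```

Once `reap()` has returned, `kill_checked()` reports failure instead of calling kill(2) on a PID that may by then belong to an unrelated process.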
Re: POSIX gettext(): lifetime of returned values
Bruno Haible wrote, on 12 May 2022:
>
> https://posix.rhansen.org/p/gettext_draft
> Line 357
>
> "The returned string may be invalidated by ... a subsequent call to
> uselocale() in the same thread, except for calls that only query values."
>
> As explained in my mail from 2021-05-04 [1]
>
>    uselocale() is a helper function to implement *_l functions where
>    the POSIX standard does not specify them or the system does not have
>    them.
>    For example, when a program wants to have a function to parse
>    a number, recognizing only the ASCII digits and only '.' as decimal
>    separator, a reliable way to implement such a function is by calling
>    uselocale of the "C" locale, strtod(), and then uselocale() again
>    to switch the thread back to the previous locale.
>
>    If POSIX did not have uselocale(), it would need to provide many
>    more *_l functions.
>
> If temporarily switching a thread's locale through uselocale()
> invalidates the gettext functions' results (even if only those from
> the same thread), it effectively disallows uselocale() as a helper
> function.

This was discussed in today's call, but we did not reach a conclusion.

Can you explain how glibc manages not to invalidate strings returned
by gettext() when uselocale() is used to change the locale (without
leaking memory - or does it leak memory?), in particular if codeset
translation was needed?

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
Re: POSIX gettext(): Use of LANGUAGE in the POSIX locale
Bruno Haible wrote, on 12 May 2022:
>
> https://posix.rhansen.org/p/gettext_draft
> Line 65
> "The locale names in LANGUAGE shall take precedence over <...>"
>
> Issue: If this is true in all cases, then
>
> 1) programs such as 'diff'
>    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/diff.html
>    - which are forced to produce a specific output in the POSIX locale -
>    will have to explicitly test for the POSIX locale, for example by
>    doing
>      const char *fmt =
>        (in_posix_locale () ? "Only in %s: %s\n" : gettext ("Only in %s: %s\n"));
>
> 2) for many languages, which use non-ASCII characters, the output
>    will contain many question marks, due to transliteration, because the
>    POSIX locale, on many systems, comes with the ASCII encoding.
>
> Suggestion: Change
>   "over <...>"
> to
>   "over <...>, if the latter is not the POSIX locale"

In today's call we made changes along the lines you suggest.
Please check the updated etherpad to see if they achieve what
you wanted.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
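The rule in Bruno's suggestion can be sketched in a few lines of Python. This is only an illustration of the suggested wording, not the text that ended up in the draft: the function name `language_search_list` is hypothetical, and treating exactly the names "C" and "POSIX" as the POSIX locale is an assumption of the sketch.

```python
def language_search_list(language_env, messages_locale):
    """Return the locale names to search, in priority order (illustrative)."""
    # Assumption: "C" and "POSIX" are the names of the POSIX locale.
    if messages_locale in ("C", "POSIX"):
        return []  # LANGUAGE is ignored; callers use the untranslated msgid
    # LANGUAGE is a colon-separated list; skip empty entries.
    names = [name for name in (language_env or "").split(":") if name]
    # LANGUAGE takes precedence; otherwise fall back to the locale itself.
    return names if names else [messages_locale]
```

With this rule a program such as diff needs no explicit POSIX-locale test: in the POSIX locale the search list is empty, so the original format string is used unchanged.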
Re: POSIX gettext(): messages catalog lookup when LANGUAGE is set
Bruno Haible wrote, on 12 May 2022:
>
> https://posix.rhansen.org/p/gettext_draft
> Lines 308..309
>   "o attempt to locate a suitable messages object..."
>    o attempt to retrieve the string identified by msgid from the messages
>      object"
> and line 342, 344
>   "the pathname used to locate the messages object shall be
>    dirname/localename/categoryname/textdomainname.mo, where:
>    ...
>    additional searches of locale names without .codeset (if present),
>    without _territory (if present), and without @modifier (if present)
>    may be performed"
>
> This text is suggesting that once the first suitable messages object
> has been found, the string identified by msgid will be looked up in
> this ONE AND ONLY ONE messages object.
>
> Lines 339, 340 on the other hand suggest that when the msgid is not
> found in the messages object, the search may continue.
[...]
> Suggestion: Reword it so that
>   "o attempt to locate a suitable messages object..."
>    o attempt to retrieve the string identified by msgid from the messages
>      object"
> are no longer separate steps, but such that backtracking occurs when the
> messages object does not contain a translation for the given msgid.
> This would resolve the apparent contradiction with lines 339, 340.

In today's call we made changes along the lines you suggest.
Please check the updated etherpad to see if they achieve what
you wanted.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
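The difference between "locate one messages object, then look up" and the backtracking behaviour Bruno asks for can be sketched as follows. Messages objects are modelled as plain Python dictionaries, which is an assumption made purely for illustration, and `lookup_with_backtracking` is a hypothetical name, not a function from any gettext implementation.

```python
def lookup_with_backtracking(msgid, catalogs):
    """Return the first translation of msgid found in catalogs, else msgid."""
    for catalog in catalogs:  # catalogs are given in search-priority order
        translation = catalog.get(msgid)
        if translation is not None:
            return translation  # found: stop searching
        # Not in this messages object: backtrack and try the next one.
    return msgid  # gettext() convention: fall back to the msgid itself
```

For example, with a small de_AT catalog searched before a fuller de catalog, a msgid missing from the former is still translated by the latter instead of falling straight through to the untranslated string.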
Re: POSIX gettext(): messages catalog lookup when LANGUAGE is not set
Bruno Haible wrote, on 12 May 2022:
>
> https://posix.rhansen.org/p/gettext_draft
> Lines 335, 344
>
>   "For portable applications, only the LANGUAGE search supports searches
>    across multiple locale names."
>   "For the LANGUAGE search, ... if a locale name has the format
>    language[_territory][.codeset][@modifier], additional searches of locale
>    names without .codeset (if present), without _territory (if present),
>    and without @modifier (if present) may be performed; if .codeset is not
>    present, additional searches of locale names with an added .codeset may
>    be performed. For the single-locale search, the localename part is the
>    name of the current locale, or the locale specified in an *_l() function
>    call, for the category named by categoryname."
>
> As explained in my mails from 2021-05-04 and 2022-01-16, it is important to
> support people who live in communities which often (but not always) have
> translations of their own but can read translations for other locales.
> While, at the same time, it is important to allow a translator for, say,
> German, to produce a translation that is useful for users in Germany,
> Austria, and Switzerland, if no other (more specific) translation is
> available.
>
> So, while the user may be working in either of the locales
>   de_DE.UTF-8
>   de_AT.UTF-8
>   de_CH.UTF-8
> they SHOULD see the translations that have been installed at
>   dirname/de/LC_MESSAGES/textdomainname.mo
>
> This is true also if the LANGUAGE environment variable has not been set.
> Most operating systems set the LANG or LC_ALL environment variable for the
> user, but do not set LANGUAGE.
>
> In this situation, the current text mandates(!) that for a user in the
> de_DE.UTF-8 locale
>   - dirname/de/LC_MESSAGES/textdomainname.mo gets always ignored, and
>   - dirname/de_DE.UTF-8/LC_MESSAGES/textdomainname.mo gets used - but this
>     messages object file almost never exists.
>
> This is NOT how GNU gettext behaves.
[...]
> Suggestion:
> In line 344, make the
>   "if a locale name has the format language[_territory][.codeset][@modifier],
>    additional searches of locale names without .codeset (if present), without
>    _territory (if present), and without @modifier (if present) may be
>    performed; if .codeset is not present, additional searches of locale
>    names with an added .codeset may be performed."
> text apply also to the single-locale case.
> In line 335, remove the sentence "only the LANGUAGE search supports searches
> across multiple locale names."

In today's call we made changes along the lines you suggest.
Please check the updated etherpad to see if they achieve what
you wanted.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
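The additional searches quoted from line 344 can be made concrete with a small Python sketch. The quoted text only says that such searches "may be performed" and does not fix an order, so the order used here (most specific name first) is an assumption, as are the function names `candidate_locale_names` and `candidate_paths`. The sketch does not cover the "added .codeset" case.

```python
import re

# language[_territory][.codeset][@modifier]
_LOCALE_RE = re.compile(
    r"^(?P<language>[^_.@]+)"
    r"(?:_(?P<territory>[^.@]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def candidate_locale_names(name):
    """List name plus its variants without codeset, territory, modifier."""
    match = _LOCALE_RE.match(name)
    if match is None:
        return [name]  # not in the expected format: search it as-is
    lang, terr, codeset, mod = match.group(
        "language", "territory", "codeset", "modifier")
    candidates = []
    for use_terr in (True, False):
        for use_codeset in (True, False):
            for use_mod in (True, False):
                variant = lang
                if use_terr and terr:
                    variant += "_" + terr
                if use_codeset and codeset:
                    variant += "." + codeset
                if use_mod and mod:
                    variant += "@" + mod
                if variant not in candidates:  # drop duplicates, keep order
                    candidates.append(variant)
    return candidates


def candidate_paths(dirname, localename, categoryname, textdomainname):
    """Build dirname/localename/categoryname/textdomainname.mo per variant."""
    return [
        f"{dirname}/{variant}/{categoryname}/{textdomainname}.mo"
        for variant in candidate_locale_names(localename)
    ]
```

Under these assumptions a user in de_DE.UTF-8, de_AT.UTF-8, or de_CH.UTF-8 ends up trying dirname/de/LC_MESSAGES/textdomainname.mo as the last candidate, which is the behaviour Bruno describes as desirable.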
Re: POSIX msgfmt and duplicate msgids
Bruno Haible wrote, on 12 May 2022:
>
> https://posix.rhansen.org/p/gettext_draft
> Lines 925..926, 1140
>
>   "-n Do not allow duplicate msgid directives. Treat duplicate msgid
>       directives for the same message_identifier as errors instead of
>       ignoring the duplicates."
[...]
>
> Suggestion:
> Remove these lines.

In today's call we removed those lines.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
Re: When can shells remove "known" process IDs from the list?
Chet and I can continue this conversation off list; what is being
discussed now has nothing at all to do with anything related to POSIX.

kre
Re: When can shells remove "known" process IDs from the list?
On 5/13/22 5:37 PM, Robert Elz wrote:
>     Date:        Sat, 14 May 2022 03:56:32 +0700
>     From:        "Robert Elz via austin-group-l at The Open Group"
>     Message-ID:  <2459.1652475...@jinx.noi.kre.to>
>
>   |   | Show your work.
>   | I no longer remember the exact command I used (cannot even locate the
>   | message you're quoting from),
>
> I finally did ... This is what I see:

I don't see that.

$ echo $BASH_VERSION
5.1.16(2)-release
$ sleep 20 | sleep 20 & sleep 30 | sleep 30 & jobs -l ; pstree $$ ; ps jT
[1] 22954
[2] 22956
[1]- 22953 Running                 sleep 20
     22954                       | sleep 20 &
[2]+ 22955 Running                 sleep 30
     22956                       | sleep 30 &
-+= 22938 chet ./bash
 |--- 22953 chet sleep 20
 |--- 22954 chet sleep 20
 |--- 22955 chet sleep 30
 |--- 22956 chet sleep 30
 \-+- 22957 chet pstree 22938
   \--- 22958 root ps -axwwo user,pid,ppid,pgid,command
USER    PID  PPID  PGID SESS JOBC STAT  TT      TIME COMMAND
root    811   544   811    0    0 Ss    s019 0:00.05 login -pfl chet /bin/ba
chet    814   811   814    0    1 S     s019 0:00.09 -bash
chet  22938   814 22938    0    1 S+    s019 0:00.04 ./bash
chet  22953 22938 22938    0    1 S+    s019 0:00.00 sleep 20
chet  22954 22938 22938    0    1 S+    s019 0:00.00 sleep 20
chet  22955 22938 22938    0    1 S+    s019 0:00.00 sleep 30
chet  22956 22938 22938    0    1 S+    s019 0:00.00 sleep 30
root  22959 22938 22938    0    1 R+    s019 0:00.00 ps jT
$ kill %1
$ ps jT
USER    PID  PPID  PGID SESS JOBC STAT  TT      TIME COMMAND
root    811   544   811    0    0 Ss    s019 0:00.05 login -pfl chet /bin/ba
chet    814   811   814    0    1 S     s019 0:00.09 -bash
chet  22938   814 22938    0    1 S+    s019 0:00.04 ./bash
chet  22955 22938 22938    0    1 S+    s019 0:00.00 sleep 30
chet  22956 22938 22938    0    1 S+    s019 0:00.00 sleep 30
root  22960 22938 22938    0    1 R+    s019 0:00.00 ps jT
$

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: When can shells remove "known" process IDs from the list?
On 5/13/22 4:56 PM, Robert Elz wrote:
>     Date:        Fri, 13 May 2022 11:22:20 -0400
>     From:        Chet Ramey
>     Message-ID:
>
>   | Show your work.
>   |
>   | I tested this on macOS 12 and RHEL 7, using interactive shells with job
>   | control enabled,
>
> That is likely the difference. The question was about what happens
> when job control is not enabled.

The same thing. This example uses bash-5.2-beta on macOS 10.15, but the same
thing happens with bash-5.1.16.

$ ./bash
$ set +m
$ sleep 20 | sleep 20 &
[1] 22755
jenna.local(2)$ pstree $$
-+= 22753 chet ./bash
 |--- 22754 chet sleep 20
 |--- 22755 chet sleep 20
 \-+- 22756 chet pstree 22753
   \--- 22757 root ps -axwwo user,pid,ppid,pgid,command
$ kill %1
$ ps ax | grep sleep
22759 s018  S+     0:00.00 grep sleep
$ sleep 20 | sleep 20 & pstree $$
[1] 22787
-+= 22753 chet ./bash
 |--- 22786 chet sleep 20
 |--- 22787 chet sleep 20
 \-+- 22788 chet pstree 22753
   \--- 22789 root ps -axwwo user,pid,ppid,pgid,command
$ kill %1
$ ps axuw | grep sleep
chet  22791   0.0  0.0  4408552    764 s018  S+   10:25AM   0:00.00 grep sleep

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
[1003.1(2016/18)/Issue7+TC2 0001585]: kill - add -j option to avoid PID reuse race
A NOTE has been added to this issue.
==
https://austingroupbugs.net/view.php?id=1585
==
--
(0005838) steffen (reporter) - 2022-05-16 13:54
https://austingroupbugs.net/view.php?id=1585#c5838
--
re 5835 and 5836:

It is clear the idea is that the shell's children will remain in the
operating system's table of active processes until they have been
wait(2)ed for; therefore only the sh(1)ell, as the parent process, can
kill(2) the child safely. -j is thus meant to give the sh(1)ell script
writer access to the race-free capability that the sh(1)ell as such has
in its internals (anyway).

Regarding process groups. Yes, this is true, of course, but i think it is
weak reasoning to not offer the hand because somewhere down the process
chain such things may happen. Quite the opposite: if i have the
possibility everywhere, i can write race-free sh(1)ell scripts on all
(subshell) levels. And programs with direct access to wait(2) that start
child processes are hopefully doing it right anyway. They at least could,
using POSIX interfaces. But sh(1)ell scripts cannot, even though the
sh(1)ell as such can, or even has to, do it right. This is what this issue
wants to change.

Regarding operating system support. Oh, that is true!! They do start
implementing this, but unfortunately non-portably; POSIX is late and should
possibly have tried to set a marker here in the past. Linux has the
prctl(2)s PR_SET_CHILD_SUBREAPER and PR_GET_CHILD_SUBREAPER

    A subreaper fulfills the role of init(1) for its descendant processes.
    When a process becomes orphaned (i.e., its immediate parent
    terminates), then that process will be reparented to the nearest still
    living ancestor subreaper. Subsequently, calls to getppid(2) in the
    orphaned process will now return the PID of the subreaper process, and
    when the orphan terminates, it is the subreaper process that will
    receive a SIGCHLD signal and will be able to wait(2) on the process to
    discover its termination status.

and FreeBSD has an even more sophisticated approach that allows iterating
over the "descendants of the reaper"; in particular, PROC_REAP_KILL makes
it possible to kill only a subset of these.

It could be that, in order to implement timeout(1) properly, portions of
this functionality need to be implemented kernel-wise. Or already have
been.
[1003.1(2016/18)/Issue7+TC2 0001585]: kill - add -j option to avoid PID reuse race
A NOTE has been added to this issue.
==
https://austingroupbugs.net/view.php?id=1585
==
--
(0005837) steffen (reporter) - 2022-05-16 13:37
https://austingroupbugs.net/view.php?id=1585#c5837
--
First of all, a correction: what i really meant was

  -j  Process the kill request only when the given jobs or saved-away
      process identifiers are known to the shell, and have not yet
      terminated, otherwise exit with status 66 (EX_NOINPUT from
      sysexits.h).
[1003.1(2016/18)/Issue7+TC2 0001585]: kill - add -j option to avoid PID reuse race
A NOTE has been added to this issue.
==
https://austingroupbugs.net/view.php?id=1585
==
--
(0005836) kre (reporter) - 2022-05-16 10:08
https://austingroupbugs.net/view.php?id=1585#c5836
--
I agree that this is invention, and should be rejected, but even if
some shell did try implementing it, I cannot see how they would do so
in a way that would meet the objectives of the issue raised.

As best I can tell, doing anything as suggested requires kernel
assistance, as no matter how carefully the shell checks, there's no way
that it can avoid race conditions: it must check first, and then do the
kill sys call, and in the intervening period, things might have altered.

If the kernel had a kill_my_child() sys call, then it could be made to work
I think, as the shell could check the pids it knows belong to its children,
and avoid creating any new ones between that check and doing the
kill_my_child() sys call, but since I don't know of any system that
implements a sys call like that (or an option on kill - it could be done by
setting a high order bit in the signal number) then I cannot see how a
shell could possibly make this work well enough to make adding such an
option sensible.

kre

ps: such a new sys call would work on pgrps just the same as kill() does.
[1003.1(2016/18)/Issue7+TC2 0001585]: kill - add -j option to avoid PID reuse race
A NOTE has been added to this issue.
==
https://austingroupbugs.net/view.php?id=1585
==
--
(0005835) geoffclare (manager) - 2022-05-16 08:21
https://austingroupbugs.net/view.php?id=1585#c5835
--
As far as I can see, the only time this -j option would be useful is if an
application wants to send a signal just to a process group leader without
sending it to the other processes in the group. This seems like a very
rare thing for an application to need to do.

It would not be useful in the other two possible cases, which are:

1. The signal is to be sent to the whole process group. In this case, the
   application can just use "kill JOB".

2. The signal is to be sent to one or more of the processes comprising a
   process group, but excluding the process group leader. In this case the
   use of -j does not solve the problem, as any of those process IDs could
   have been reused even though the leader is still running (and thus the
   job still exists).

Anyway, a discussion of the technical merits of the proposal is pointless
unless there is a shell which already implements this kill -j option. None
of the shells I have available do. Does anybody know of one that does? If
not, this request should be rejected as invention.