Re: Coordination on standardizing gettext() in future POSIX
Sorry, hit Send early by accident.

It is not a matter of what I like or not, as that would mean adding something way more flexible than gettext to the standard; the point is that if one implementation choice can, for technical reasons, be seen as intrinsically more portable than another, that choice has priority for standardization. Backwards compatibility with non-portable behavior was only a priority for Issue 6, as was explained to me before V6TC1 came out, to simplify the merge effort of 1003.1 with SUSV5. So, for Issue 8, if this means the Solaris version loses out, so be it.

On Wednesday, January 22, 2020 Joerg Schilling wrote:

> Shware Systems wrote:
>
> > This is not invention, as even Solaris allows you to turn it off with -s, as you point out. It may work fine for the charsets/charmap files Solaris historically provides to have escapes active as the default, but this does not equate to it being valid for all conforming charsets, if an application makes use of localedef, that I see. As such, from a portability standpoint, I view not processing escapes as the safer alternative.
>
> What should be the reason for making the standard incompatible to the existing practice since more than 30 years?
>
> Gettext is a SunOS invention and other implementations are expected to follow the definition from the reference implementation.
>
> Do you really like to require SunOS to lose backwards compatibility?
>
> Jörg
>
> --
> EMail:jo...@schily.net (home) Jörg Schilling D-13353 Berlin
>       joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
> URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/
Re: Coordination on standardizing gettext() in future POSIX
On 1/22/20 9:08 AM, Bruno Haible wrote:
> Ulrich Drepper wrote:
> > > Do you really like to require SunOS to lose backwards compatibility?
> >
> > Overly dramatic. You just need one mode that is POSIX compatible. Many GNU tools use POSIXLY_CORRECT.
>
> The Solaris practice for keeping backward compatibility despite new evolutions of the standards is to use /usr/xpg[457] directories. For example:
>
>   /usr/xpg4/bin/sh      != /usr/bin/sh
>   /usr/xpg6/bin/ls      != /usr/bin/ls
>   /usr/xpg7/bin/getconf != /usr/bin/getconf
>
> There could be a /usr/xpg8/bin/gettext if POSIX gettext(1) ends up specifying a different behaviour than the current Solaris implementation has.

This is absolutely correct, but only relevant if Solaris ever ends up fully implementing XPG8, which is unlikely at this point. (Realistically, we'd probably just make it a link to the /usr/gnu/bin/gettext we already ship and widely use to build FOSS packages that expect the GNU behaviors.)

--
-Alan Coopersmith- alan.coopersm...@oracle.com
Oracle Solaris Engineering - https://blogs.oracle.com/alanc
[The preceding is my personal opinion, and not an official statement of Oracle.]
Re: Coordination on standardizing gettext() in future POSIX
On Wed, Jan 22, 2020 at 10:47 AM Joerg Schilling <joerg.schill...@fokus.fraunhofer.de> wrote:

> Gettext is a SunOS invention and other implementations are expected to follow the definition from the reference implementation.

That implementation was the starting point but I didn't just copy it. We (mostly François, Peter, and I) fixed many shortcomings to make the API actually usable. Without that additional functionality the already-standardized message catalog mechanism would certainly have won.

> Do you really like to require SunOS to lose backwards compatibility?

Overly dramatic. You just need one mode that is POSIX compatible. Many GNU tools use POSIXLY_CORRECT.
Re: Coordination on standardizing gettext() in future POSIX
Jörg Schilling wrote:

> > It is well-known that the escape sequence expansion in 'echo' was different in System V and BSD systems. You can assume that when Ulrich Drepper started out writing GNU gettext in 1995, he did NOT want to copy the System V behaviour of 'echo' into the 'gettext' program.
>
> So in other words, this is a result of not following the POSIX standard from the beginning? What you call "System V behaviour" is the official required POSIX behavior for implementations that like to use the UNIX brand name.

You can view it like this. I view it as a failure to provide a useful standard in this place (the 'echo' command). Even POSIX acknowledges this [1]:

  "It is not possible to use echo portably across all POSIX systems unless both -n (as the first argument) and escape sequences are omitted."

  "The two different historical versions of echo vary in fatally incompatible ways."

With gettext(1), we are now in the same situation: Solaris gettext(1) behaves like System V 'echo', and GNU gettext(1) behaves like BSD 'echo' (on purpose, not by mistake, otherwise it would not have a '-e' option, borrowed from BSD 'echo'). It varies "in fatally incompatible ways" here too.

Would it be useful to copy the POSIX echo(1) tragedy and produce the same thing once again, as a POSIX gettext(1) tragedy? I don't think so.

Even the POSIXLY_CORRECT subterfuge variable would not be of real help to solve this dilemma: People avoid this variable because it has side effects on several programs, some of them negative.

> It seems that the text in LI18NUX-2000-amd4.pdf is a compromise negotiated between Sun and some GNU people that unfortunately is ignored by the GNU implementation in the default case of using gettext(1).

The appendices in the LI18NUX were written down in a hurry at the end of the specification process. The LI18NUX group spent a lot of time discussing what is Unicode support Level 1, Level 2, etc., and at the end delegated one person to do a copy of existing documentation for the appendices. As far as I recall, there was no (or hardly any) critical review and no discussion any more at this point. This explains why the gettext(1) documentation in there is ambiguous.

Bruno

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html
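The echo(1) portability problem quoted above is usually sidestepped with printf(1), which makes the choice between literal output and escape expansion explicit in the format string. The following is only a sketch to illustrate the two behaviours under discussion; the function names print_literal and print_expanded are made up for this example and are not part of any gettext implementation.

```shell
#!/bin/sh
# Hypothetical helper: print the argument verbatim, as BSD echo and
# GNU gettext(1) do by default. Backslashes are never interpreted.
print_literal() {
    printf '%s\n' "$1"
}

# Hypothetical helper: print the argument with backslash escapes
# expanded, as System V echo and Solaris gettext(1) do by default.
print_expanded() {
    printf '%b\n' "$1"
}

literal=$(print_literal 'abc\ndef')
expanded=$(print_expanded 'abc\ndef')

print_literal "$literal"     # one line, backslash kept
print_literal "$expanded"    # two lines
```

Because the caller selects %s or %b, the script does not depend on which default a given echo or gettext implementation happens to use.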
Re: Coordination on standardizing gettext() in future POSIX
Bruno Haible wrote:

> It is well-known that the escape sequence expansion in 'echo' was different in System V and BSD systems. You can assume that when Ulrich Drepper started out writing GNU gettext in 1995, he did NOT want to copy the System V behaviour of 'echo' into the 'gettext' program.

So in other words, this is a result of not following the POSIX standard from the beginning? What you call "System V behaviour" is the official required POSIX behavior for implementations that like to use the UNIX brand name.

Even bash implements a compile variant that makes bash compliant with regard to the POSIX echo requirements. This compile variant is used on Solaris and on Mac OS where bash has been used as the shell to run the test suite.

BTW: My text contained a question that you did not answer. It seems that the text in LI18NUX-2000-amd4.pdf is a compromise negotiated between Sun and some GNU people that unfortunately is ignored by the GNU implementation in the default case of using gettext(1).

I tried to build you a bridge and I am still in hope that you are interested in a result that is useful for standardisation.

Jörg
Re: Coordination on standardizing gettext() in future POSIX
Joerg Schilling wrote:

> It is obvious that gettext(1) must expand escape sequences by default since this is the documented default behavior for both Solaris gettext(1) and GNU gettext(1) but in the default case, GNU gettext does not behave the way it is documented.

What you call the "GNU gettext documentation" is [1], the ambiguous LI18NUX specification (which, by the way, was a common effort of Sun Microsystems people and GNU people).

The actual GNU gettext documentation is here: [2], and the part about the escape sequences has not changed since 2004.

GNU gettext behaves the way it is documented: It does NOT expand escapes by default.

> You forgot to mention that it also mentions:
>
>     -n Suppress trailing newline.
>
> which makes it obvious that someone made a mistake while writing the GNU documentation that describes the options.

Correct, and I fixed this documentation mistake four months ago. [3]

> given that GNU gettext even copied this text from SunOS man pages from the early 1990s, it is obvious that the intention of the GNU gettext implementation was to be compatible with the reference implementation and there is only a bug in the GNU implementation.

It is well-known that the escape sequence expansion in 'echo' was different in System V and BSD systems. You can assume that when Ulrich Drepper started out writing GNU gettext in 1995, he did NOT want to copy the System V behaviour of 'echo' into the 'gettext' program.

> > 2) GNU gettext(1) and Solaris gettext(1) differ in this respect:
> >
> >    GNU:
> >      $ gettext 'abc\ndef'; echo
> >      abc\ndef
> >
> >    Solaris:
> >      $ gettext 'abc\ndef'; echo
> >      abc
> >      def
> >
> >    This makes it hard to standardize, since the behaviours differ, and both implementations will want to claim need for backward-compatibility.
>
> Well people who expect the current GNU behavior obviously rely on a bug in the implementation.

This argument fails because you were looking at LI18NUX, not at the documentation of GNU gettext.

> > 3) Additionally, there's the problem that gettext(1) does not and can not (as a program) deal with strings that contain placeholders. As soon as
>
> It seems that you misunderstand the way gettext(1) is intended to be used.

This is quite unlikely, because I have been the GNU gettext maintainer for 12 years.

> I see two useful ways to do what you like:
>
> 1)
>
>     gettext -s "Hello World" $$

No, this is not a reasonable way to use the 'gettext' program. It violates the principle "Entire sentences" [4]. In different languages, the number may need to be embedded into a sentence, rather than at the end of the sentence.

> 2)
>
>     text=$(gettext 'Hello World $$')
>     eval echo $text
>
> or
>
>     eval echo $(gettext 'Hello World $$')

No, this is not a reasonable way to use the 'gettext' program either. It fails miserably when the translation of 'Hello World $$' contains a semicolon. Try

    text='Coucou; le monde $$'
    eval echo $text

In general, there is agreement among people writing shell scripts that the use of 'eval' should be minimized, i.e. that 'eval' should only be used when the lexical structure of the string being eval'ed can be predicted.

Bruno

[1] http://web.archive.org/web/20030428195733/http://www.li18nux.org/docs/html/LI18NUX-2000-amd4.htm
[2] https://www.gnu.org/software/gettext/manual/html_node/gettext-Invocation.html
[3] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=b0d302e404c4b7c2c59e7609aacff35476a494d8
[4] https://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html
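To illustrate the 'eval' objection above, one possible alternative is to treat the translated string purely as a printf(1) format, so that it is never re-parsed as shell code. This is a sketch under assumptions, not the method recommended by either implementation: the variable fmt stands in for the output of a call such as gettext 'Hello World %s', the French text is a hypothetical translation, no message catalog is consulted, and the translator is trusted to keep the %s placeholder intact.

```shell
#!/bin/sh
# Hypothetical stand-in for: fmt=$(gettext 'Hello World %s')
# The semicolon is the character that breaks the "eval echo" approach.
fmt='Coucou; le monde %s'

# Hypothetical helper: the translation is used only as a printf format.
# It is never handed to eval, so ';' has no special meaning here, and
# the placeholder may appear anywhere in the sentence.
greet() {
    printf "$fmt\n" "$1"
}

greeting=$(greet 1234)
printf '%s\n' "$greeting"
```

The placeholder position is chosen by the translator, which keeps the "Entire sentences" principle intact. GNU gettext also ships a gettext.sh helper with an eval_gettext function for named shell variables; it is not used here so that the sketch stays independent of any installed gettext.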
Re: [1003.1(2016)/Issue7+TC2 0001309]: Clarity needed for initial value of $? at start of compound-list compound statements
Robert Elz wrote, on 22 Jan 2020:
>
>     From: Geoff Clare
>
>   | If we do add something, then I think that some non-normative words along
>   | the lines of your explanation at the bottom ("to clarify that ...")
>   | would be more helpful than the type of normative addition you are
>   | requesting.
>
> Something just to make it clear would be better than nothing.

Looking at the "Exit Status" sections of some of these commands, there are more "last command" problems. Also, some of them refer to the exit status of a compound-list, but I don't think that's defined anywhere.

So I think we need to fix those problems, and at the same time could add a non-normative note to if, while, and until about the availability of the exit status of the first compound-list.

How about:

On page 2371 line 75731 section 2.9.4 Compound Commands, add a new paragraph:

    In the descriptions below, the exit status of some compound commands is stated in terms of the exit status of a compound-list. The exit status of a compound-list shall be the value that the special parameter '?' (see [xref to 2.5.2]) would have immediately after execution of the compound-list.

On page 2372 line 75766 section 2.9.4.2 The for Loop, change:

    The exit status of a for command shall be the exit status of the last command that executes.

to:

    If there is at least one item in the list of items, the exit status of a for command shall be the exit status of the last compound-list executed.

On page 2373 line 75793 section 2.9.4.3 Case Conditional Construct, change:

    ... the exit status shall be the exit status of the last command executed in the compound-list.

to:

    ... the exit status shall be the exit status of the executed compound-list.

On page 2373 line 75814 section 2.9.4.4 The if Conditional Construct, add:

    Note: Although the exit status of the if or elif compound-list is ignored when determining the exit status of the if command, it is available through the special parameter '?' (see [xref to 2.5.2]) during execution of the next then, elif, or else compound-list (if any is executed) in the normal way.

On page 2374 line 75827 section 2.9.4.5 The while Loop, add:

    Note: Since the exit status of compound-list-1 is ignored when determining the exit status of the while command, it is not possible to obtain the status of the command that caused the loop to exit, other than via the special parameter '?' (see [xref to 2.5.2]) during execution of compound-list-1, for example:

        while some_command; st=$?; false; do

    The exit status of compound-list-1 is available through the special parameter '?' during execution of compound-list-2, but is known to be zero at that point anyway.

On page 2374 line 75840 section 2.9.4.6 The until Loop, add:

    Note: Although the exit status of compound-list-1 is ignored when determining the exit status of the until command, it is available through the special parameter '?' (see [xref to 2.5.2]) during execution of compound-list-2 in the normal way.

>   | This phrase is in the existing text (after bug 1150 was applied).
>   | It's in a small-font note, which means it is non-normative,
>
> but the new proposed text is not

Yes it is. There is nothing in the new proposed changes that specifies this text should change from a small-font note to something else.

>   | so I don't see a problem with using this informal phrase to refer
>   | to the var=... command.
>
> In general terms nor would I (it isn't a command, in the strict sense,
> but we might ignore that)
>
>   | It's just being used as shorthand for "the command containing the
>   | assignment to var".
>
> Not really, any other command containing an assignment to a var, which
> contained a command substitution wouldn't be relevant, only a simple
> command with no command word.
>
> The bigger problem is that the wording suggests there is something
> special about command substitutions in assignment statements, which
> isn't correct, any command substitution in any command without a
> command word works

It reads to me as an example, rather than indicating the statement only applies to that case, but if you are reading it differently then I'm happy to make it clearer that there are other similar cases. I'd suggest adding to the end of the small-font note (borrowing some words from part of your email I haven't quoted):

    Likewise for any pipeline consisting entirely of a simple command that has no command word, but contains one or more command substitutions. (See [xref to 2.9.1].)

> Next issue:
>
>   | This behaviour of ksh was the reason I proposed the unspecified behaviour.
>
> Yes, I assumed that.
>
>   | The bug, as I see it, is that the value of $? and the behaviour of exit
>   | differ.
>
> Yes. Or well kind of, the bug is that exit picks the wrong default
> for n when executed in a
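The proposed while-loop note in this message can be tried out with a small script. This is only a sketch: some_command is a hypothetical function invented here that succeeds twice and then fails with status 3, and the final test in the condition list checks the saved status instead of the literal false used in the proposed note, so that the body actually runs.

```shell
#!/bin/sh
# Hypothetical command: succeeds on the first two calls, then returns 3.
count=0
some_command() {
    count=$((count + 1))
    [ "$count" -le 2 ] || return 3
}

st=0
# compound-list-1 is "some_command; st=$?; [ ... ]". The status of
# some_command is saved through '?' before the final test decides
# whether the loop continues.
while some_command; st=$?; [ "$st" -eq 0 ]; do
    :   # compound-list-2; '?' is known to be zero on entry here
done
loop_status=$?

printf 'condition failed with %d, while returned %d\n' "$st" "$loop_status"
```

The while command itself reports the status of the last compound-list-2 executed, so the failing status 3 is only recoverable through the saved variable st.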
Re: Coordination on standardizing gettext() in future POSIX
Bruno Haible wrote:

> If that is your approach to standardization, then it is better to not standardize anything.

If your approach is to standardize obvious implementation bugs, I am a bit bewildered.

I was in hope that you are interested in a fruitful discussion and open to useful arguments.

Jörg
Re: Coordination on standardizing gettext() in future POSIX
Hi Bruno!

Bruno Haible wrote:

> Regarding the gettext(1) program and whether it expands escape sequences by default:
>
> 1) [1] is ambiguous / self-contradictory.
>    On one hand it says:
>
>      This utility interprets C escape sequences such as \t for tab. Use \\ to print a backslash...

It is obvious that gettext(1) must expand escape sequences by default since this is the documented default behavior for both Solaris gettext(1) and GNU gettext(1) but in the default case, GNU gettext does not behave the way it is documented.

>    Which sounds like they are expanded by default.
>
>    On the other hand it says:
>
>      OPTIONS
>      -e
>          Enable expansion of some escape sequences.
>
>    Which sounds like they are NOT expanded by default.

You forgot to mention that it also mentions:

    -n Suppress trailing newline.

which makes it obvious that someone made a mistake while writing the GNU documentation that describes the options. Someone forgot to mention that the options -e/-n are both valid only together with the -s option that switches the behavior.

But this wrong wording is only in the GNU documentation, while the official reference documentation from the inventor of the utility says:

    -s    Behaves like echo(1) (see DESCRIPTION above). If the -s option is specified, no expansion of C escape sequences is performed and a newline character is appended to the output, by default.

With your interpretation of the GNU documentation, GNU gettext would need to output a newline at the end by default, but it does not. This is another hint for an implementation bug in GNU gettext...

> So, you can't resolve this question by referencing an ambiguous specification.

Given that the main explanation requires to expand escape sequences without giving any exception, this is doubtlessly the default behavior.

We may discuss things beyond that description, but given that GNU gettext even copied this text from SunOS man pages from the early 1990s, it is obvious that the intention of the GNU gettext implementation was to be compatible with the reference implementation and there is only a bug in the GNU implementation.

The documentation from the reference implementation (Solaris) is definitely not ambiguous since it correctly documents -s as an exception.

The GNU documentation is obvious for the default case that is documented in the DESCRIPTION section, but GNU gettext does not follow that GNU documentation. The only ambiguity I see in the GNU documentation is in effect for the non-default case, but in the non-default case, GNU gettext follows the behavior of the reference implementation.

> 2) GNU gettext(1) and Solaris gettext(1) differ in this respect:
>
>    GNU:
>      $ gettext 'abc\ndef'; echo
>      abc\ndef
>
>    Solaris:
>      $ gettext 'abc\ndef'; echo
>      abc
>      def
>
>    This makes it hard to standardize, since the behaviours differ, and both implementations will want to claim need for backward-compatibility.

Well people who expect the current GNU behavior obviously rely on a bug in the implementation.

So the main question to me is whether GNU gettext will have a chance to be fixed. If you like to protect GNU users that rely on that implementation bug, GNU gettext could be enhanced to follow the documented behavior in case that POSIXLY_CORRECT is set, as used for other standard deviations on Linux already.

The Solaris gettext behaves as documented and I see no reason to introduce a different description in POSIX since that would cause backwards compatibility problems. The Solaris behavior is obviously not a bug and did not change during the past 30+ years - much longer than GNU gettext exists.

The Solaris implementation is even able to emulate the GNU behavior if it is called as:

    gettext -sn "some text"

as long as you do not like to supply "textdomain" as first argument but rather as -d option argument.

> 3) Additionally, there's the problem that gettext(1) does not and can not (as a program) deal with strings that contain placeholders. As soon as

It seems that you misunderstand the way gettext(1) is intended to be used.

I see two useful ways to do what you like:

1)

    gettext -s "Hello World" $$

2)

    text=$(gettext 'Hello World $$')
    eval echo $text

or

    eval echo $(gettext 'Hello World $$')

Method 2 is equivalent to the way C programs use gettext(3).

Jörg
Re: Coordination on standardizing gettext() in future POSIX
Shware Systems wrote:

> This is not invention, as even Solaris allows you to turn it off with -s, as you point out. It may work fine for the charsets/charmap files Solaris historically provides to have escapes active as the default, but this does not equate to it being valid for all conforming charsets, if an application makes use of localedef, that I see. As such, from a portability standpoint, I view not processing escapes as the safer alternative.

What should be the reason for making the standard incompatible to the existing practice since more than 30 years?

Gettext is a SunOS invention and other implementations are expected to follow the definition from the reference implementation.

Do you really like to require SunOS to lose backwards compatibility?

Jörg