Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Shware Systems

Sorry, hit Send early by accident.

It is not a matter of what I like or not, as that would mean adding something 
way more flexible than gettext to the standard. The point is that if one 
implementation choice can, for technical reasons, be seen as intrinsically more 
portable than another, that choice has priority for standardization. Backwards 
compatibility with non-portable behavior was only a priority for Issue 6, as was 
explained to me before V6TC1 came out, to simplify the merge effort of 1003.1 
with SUSV5. So, for Issue 8, if this means the Solaris version loses out, so be it.

On Wednesday, January 22, 2020 Joerg Schilling wrote:
Shware Systems  wrote:

> This is not invention, as even Solaris allows you to turn it off with -s, as 
> you point out. It may work fine for the charsets/charmap files Solaris 
> historically provides to have escapes active as the default, but this does 
> not equate to it being valid for all conforming charsets, if an application 
> makes use of localedef, that I see. As such, from a portability standpoint, I 
> view not processing escapes as the safer alternative.

What would be the reason for making the standard incompatible with a practice 
that has existed for more than 30 years?

Gettext is a SunOS invention and other implementations are expected to follow 
the definition from the reference implementation.

Do you really want to require SunOS to lose backwards compatibility?

Jörg

-- 
 EMail:jo...@schily.net                    (home) Jörg Schilling D-13353 Berlin
    joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'


Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Alan Coopersmith

On 1/22/20 9:08 AM, Bruno Haible wrote:
> Ulrich Drepper wrote:
> > > Do you really want to require SunOS to lose backwards compatibility?
> >
> > Overly dramatic.  You just need one mode that is POSIX compatible. Many GNU
> > tools use POSIXLY_CORRECT_
>
> The Solaris practice for keeping backward compatibility despite new evolutions
> of the standards is to use /usr/xpg[457] directories. For example:
>    /usr/xpg4/bin/sh  != /usr/bin/sh
>    /usr/xpg6/bin/ls  != /usr/bin/ls
>    /usr/xpg7/bin/getconf != /usr/bin/getconf
>
> There could be a /usr/xpg8/bin/gettext if POSIX gettext(1) ends up specifying
> a different behaviour than the current Solaris implementation has.


This is absolutely correct, but only relevant if Solaris ever ends up fully
implementing XPG8, which is unlikely at this point.  (Realistically, we'd
probably just make it a link to the /usr/gnu/bin/gettext we already ship
and widely use to build FOSS packages that expect the GNU behaviors.)
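
For scripts that have to run on Solaris as well as elsewhere, one way to pick
up the standards-conforming variants without hard-coding an xpg directory is to
ask the system for its conforming PATH. A minimal sketch, assuming getconf may
or may not be installed (the existing PATH is kept as a fallback); whether a
future /usr/xpg8/bin would show up in that value is an assumption, not
something stated in this thread:

```shell
# Ask the system where the standard utilities live; fall back to the old PATH.
std_path=$(getconf PATH 2>/dev/null) || std_path=
if [ -n "$std_path" ]; then
    PATH=$std_path:$PATH
fi
export PATH

# From here on, plain utility names resolve to the conforming variants first.
command -v sh
```

Putting the conforming directories in front, instead of replacing PATH
outright, keeps locally installed tools reachable.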

--
-Alan Coopersmith-   alan.coopersm...@oracle.com
 Oracle Solaris Engineering - https://blogs.oracle.com/alanc

[The preceding is my personal opinion, and not an official statement of Oracle.]



Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Ulrich Drepper
On Wed, Jan 22, 2020 at 10:47 AM Joerg Schilling <
joerg.schill...@fokus.fraunhofer.de> wrote:

> Gettext is a SunOS invention and other implementations are expected to
> follow the definition from the reference implementation.
>

That implementation was the starting point but I didn't just copy it. We
(mostly François, Peter, and I) fixed many shortcomings to make the API
actually usable. Without that additional functionality the
already-standardized message catalog mechanism would certainly have won.


> Do you really want to require SunOS to lose backwards compatibility?
>

Overly dramatic.  You just need one mode that is POSIX compatible. Many GNU
tools use POSIXLY_CORRECT_


Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Bruno Haible
Jörg Schilling wrote:
> > It is well-known that the escape sequence expansion in 'echo' was different
> > in System V and BSD systems. You can assume that when Ulrich Drepper started
> > out writing GNU gettext in 1995, he did NOT want to copy the System V
> > behaviour of 'echo' into the 'gettext' program.
> 
> So in other words, this is a result of not following the POSIX standard from 
> the beginning? What you call "System V behaviour" is the officially required 
> POSIX behavior for implementations that want to use the UNIX brand name.

You can view it like this. I view it as a failure to provide a useful standard
in this place (the 'echo' command). Even POSIX acknowledges this [1]:

  "It is not possible to use echo portably across all POSIX systems unless
   both -n (as the first argument) and escape sequences are omitted."

  "The two different historical versions of echo vary in fatally incompatible
   ways."

With gettext(1), we are now in the same situation: Solaris gettext(1) behaves
like System V 'echo', and GNU gettext(1) behaves like BSD 'echo' (on purpose,
not by mistake, otherwise it would not have a '-e' option, borrowed from BSD
'echo').

It varies "in fatally incompatible ways" here too.
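
For comparison, the usual way out of the echo situation in portable scripts is
printf, where the caller states explicitly whether backslash escapes are to be
expanded. A small sketch of that idea (the variable names are only
illustrative, and nothing here is a proposal for gettext(1) itself):

```shell
msg='abc\ndef'

# %s prints the argument as-is: backslash sequences stay literal.
literal=$(printf '%s' "$msg")

# %b expands backslash escapes in the argument, as System V echo does.
expanded=$(printf '%b' "$msg")

printf 'literal:  %s\n' "$literal"
printf 'expanded: %s\n' "$expanded"
```

The point is that the choice is visible at the call site instead of depending
on which implementation happens to be installed.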

Would it be useful to copy the POSIX echo(1) tragedy and produce the same
thing once again, as a POSIX gettext(1) tragedy? I don't think so. Even the
POSIXLY_CORRECT subterfuge variable would not be of real help to solve this
dilemma: People avoid this variable because it has side effects on several
programs, some of them negative.

> It seems that the text in LI18NUX-2000-amd4.pdf is a compromise negotiated 
> between Sun and some GNU people that unfortunately is ignored by the GNU 
> implementation in the default case of using gettext(1).

The appendices in the LI18NUX specification were written down in a hurry at the
end of the specification process. The LI18NUX group spent a lot of time
discussing what is Unicode support Level 1, Level 2, etc., and at the end
delegated one person to copy existing documentation into the appendices. As far
as I recall, there was no (or hardly any) critical review and no further
discussion at this point. This explains why the gettext(1) documentation in
there is ambiguous.

Bruno

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html




Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Joerg Schilling
Bruno Haible  wrote:

> It is well-known that the escape sequence expansion in 'echo' was different
> in System V and BSD systems. You can assume that when Ulrich Drepper started
> out writing GNU gettext in 1995, he did NOT want to copy the System V
> behaviour of 'echo' into the 'gettext' program.

So in other words, this is a result of not following the POSIX standard from 
the beginning? What you call "System V behaviour" is the officially required 
POSIX behavior for implementations that want to use the UNIX brand name.

Even bash offers a compile-time variant that makes it compliant with the 
POSIX echo requirements. This variant is used on Solaris and on Mac OS, where 
bash has been used as the shell to run the test suite.

BTW: My text contained a question that you did not answer.

It seems that the text in LI18NUX-2000-amd4.pdf is a compromise negotiated 
between Sun and some GNU people that unfortunately is ignored by the GNU 
implementation in the default case of using gettext(1).

I tried to build you a bridge and I still hope that you are interested in 
a result that is useful for standardisation.

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Bruno Haible
Joerg Schilling wrote:
> It is obvious that gettext(1) must expand escape sequences by default since 
> this is the documented default behavior for both Solaris gettext(1) and GNU 
> gettext(1) but in the default case, GNU gettext does not behave the way it is
> documented. 

What you call the "GNU gettext documentation" is [1], the ambiguous LI18NUX
specification (which, by the way, was a common effort of Sun Microsystems
people and GNU people).

The actual GNU gettext documentation is here: [2], and the part about the
escape sequences has not changed since 2004. GNU gettext behaves the way
it is documented: It does NOT expand escapes by default.

> You forgot to mention that it also mentions:
> 
>    -n  Suppress trailing newline.
> 
> which makes it obvious that someone made a mistake while writing the GNU 
> documentation that describes the options.

Correct, and I fixed this documentation mistake four months ago. [3]

> given that GNU gettext even copied 
> this text from SunOS man pages from the early 1990s, it is obvious that the 
> intention of the GNU gettext implementation was to be compatible with the 
> reference implementation and there is only a bug in the GNU implementation. 

It is well-known that the escape sequence expansion in 'echo' was different
in System V and BSD systems. You can assume that when Ulrich Drepper started
out writing GNU gettext in 1995, he did NOT want to copy the System V behaviour
of 'echo' into the 'gettext' program.

> > 2) GNU gettext(1) and Solaris gettext(1) differ in this respect:
> >
> > GNU:
> > $ gettext 'abc\ndef'; echo
> > abc\ndef
> >
> > Solaris:
> > $ gettext 'abc\ndef'; echo
> > abc
> > def
> >
> > This makes it hard to standardize, since the behaviours differ, and
> > both implementations will want to claim need for backward-compatibility.
> 
> Well people who expect the current GNU behavior obviously rely on a bug in 
> the implementation.

This argument fails because you were looking at LI18NUX, not at the
documentation of GNU gettext.
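
A script that has to cope with both behaviours today can probe for the
difference at run time instead of assuming one of them. A rough sketch,
assuming a gettext utility may or may not be installed and that no message
catalog translates the probe string; the variable name gettext_escapes is made
up for this example:

```shell
# Classify the installed gettext(1) by what it does with a backslash sequence.
if command -v gettext >/dev/null 2>&1; then
    probe=$(gettext 'x\ty')
    case $probe in
        'x\ty') gettext_escapes=literal ;;   # output unchanged: no expansion by default
        *)      gettext_escapes=expanded ;;  # anything else is treated as expanded
    esac
else
    gettext_escapes=unavailable
fi
echo "gettext escape handling: $gettext_escapes"
```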

> > 3) Additionally, there's the problem that gettext(1) does not and can not
> > (as a program) deal with strings that contain placeholders. As soon as
> 
> It seems that you misunderstand the way gettext(1) is intended to be used.

This is quite unlikely, because I have been the GNU gettext maintainer for
12 years.

> I see two useful ways to do what you like:
> 
> 1)
> 
>   gettext -s "Hello World" $$

No, this is not a reasonable way to use the 'gettext' program. It violates
the principle "Entire sentences" [4]. In different languages, the number may
need to be embedded into a sentence, rather than at the end of the sentence.

> 2)
> 
>   text=$(gettext 'Hello World $$')
>   eval echo $text
> 
>   or
> 
>   eval echo $(gettext 'Hello World $$')

No, this is not a reasonable way to use the 'gettext' program either. It fails
miserably when the translation of 'Hello World $$' contains a semicolon. Try

  text='Coucou; le monde $$'
  eval echo $text

In general, there is agreement among people writing shell scripts that the use
of 'eval' should be minimized, i.e. that 'eval' should only be used when the
lexical structure of the string being eval'ed can be predicted.
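
One eval-free pattern is to keep the placeholder in the translated string as a
printf conversion and let printf do the substitution, so that the translation
is never re-parsed as shell code. A minimal sketch: translate is a hypothetical
stand-in for the catalog lookup (a real script would call gettext there), and
greet is likewise made up for this example. GNU gettext's own tools for this
problem (envsubst and eval_gettext) are not shown here.

```shell
# Hypothetical stand-in for a catalog lookup; returns a French "translation".
translate() {
    case $1 in
        'Hello World %s') printf '%s' 'Coucou; le monde %s' ;;
        *)                printf '%s' "$1" ;;
    esac
}

greet() {
    fmt=$(translate 'Hello World %s')
    # The translation is only ever a printf format, so its semicolon is plain text.
    printf "$fmt\n" "$1"
}

greet "$$"
```

This lets the translator move the placeholder anywhere in the sentence. A
translation that contains a stray '%' would still be interpreted by printf, so
the format strings need the same review as in C programs.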

Bruno

[1] http://web.archive.org/web/20030428195733/http://www.li18nux.org/docs/html/LI18NUX-2000-amd4.htm
[2] https://www.gnu.org/software/gettext/manual/html_node/gettext-Invocation.html
[3] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=b0d302e404c4b7c2c59e7609aacff35476a494d8
[4] https://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html



Re: [1003.1(2016)/Issue7+TC2 0001309]: Clarity needed for initial value of $? at start of compound-list compound statements

2020-01-22 Thread Geoff Clare
Robert Elz  wrote, on 22 Jan 2020:
>
> From:Geoff Clare 
> 
>   | If we do add something, then I think that some non-normative words along
>   | the lines of your explanation at the bottom ("to clarify that ...")
>   | would be more helpful than the type of normative addition you are
>   | requesting.
> 
> Something just to make it clear would be better than nothing.

Looking at the "Exit Status" sections of some of these commands, there
are more "last command" problems.  Also, some of them refer to the exit
status of a compound-list, but I don't think that's defined anywhere.
So I think we need to fix those problems, and at the same time could
add a non-normative note to if, while, and until about the availability
of the exit status of the first compound-list.

How about:
 
On page 2371 line 75731 section 2.9.4 Compound Commands, add a new paragraph:

In the descriptions below, the exit status of some compound commands
is stated in terms of the exit status of a compound-list.
The exit status of a compound-list shall be the value that
the special parameter '?' (see [xref to 2.5.2]) would have 
immediately after execution of the compound-list.

On page 2372 line 75766 section 2.9.4.2 The for Loop, change:

The exit status of a for command shall be the exit status
of the last command that executes.

to:

If there is at least one item in the list of items, the exit status
of a for command shall be the exit status of the last
compound-list executed.

On page 2373 line 75793 section 2.9.4.3 Case Conditional Construct, change:

... the exit status shall be the exit status of the last command
executed in the compound-list.

to:

... the exit status shall be the exit status of the executed
compound-list.

On page 2373 line 75814 section 2.9.4.4 The if Conditional Construct, add:

Note: Although the exit status of the if or elif
compound-list is ignored when determining the exit status
of the if command, it is available through the special
parameter '?' (see [xref to 2.5.2]) during execution of the next
then, elif, or else compound-list (if any
is executed) in the normal way.

On page 2374 line 75827 section 2.9.4.5 The while Loop, add:

Note: Since the exit status of compound-list-1 is ignored
when determining the exit status of the while command, it
is not possible to obtain the status of the command that caused the
loop to exit, other than via the special parameter '?' (see [xref
to 2.5.2]) during execution of compound-list-1, for example:
    while some_command; st=$?; false; do
The exit status of compound-list-1 is available through the
special parameter '?' during execution of compound-list-2,
but is known to be zero at that point anyway.

On page 2374 line 75840 section 2.9.4.6 The until Loop, add:

Note: Although the exit status of compound-list-1 is ignored
when determining the exit status of the until command, it is
available through the special parameter '?' (see [xref to 2.5.2])
during execution of compound-list-2 in the normal way.
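
The if and while notes above can be tried out with a short script. This is only
an illustrative sketch; fail7 and step are hypothetical helpers invented for
the example and are not part of the proposed text:

```shell
# Hypothetical helpers used only for this illustration.
fail7() { return 7; }

count=0
step() {
    # Succeeds on the first two calls, then fails with status 3.
    count=$((count + 1))
    [ "$count" -le 2 ] || return 3
}

# if: the status of the condition list is visible in the else branch.
if fail7; then
    st_if=0
else
    st_if=$?
fi

# while: save the status inside compound-list-1, then test the saved copy.
while step; st_while=$?; [ "$st_while" -eq 0 ]; do
    :
done
st_loop=$?

echo "if condition: $st_if, loop-ending command: $st_while, while itself: $st_loop"
```

Here the while command itself reports the status of the last compound-list-2
that ran, while st_while keeps the status of the command that ended the loop.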

>   | This phrase is in the existing text (after bug 1150 was applied).
>   | It's in a small-font note, which means it is non-normative,
> 
> but the new proposed text is not

Yes it is.  There is nothing in the new proposed changes that specifies
this text should change from a small-font note to something else.

>   | so I don't see a problem with using this informal phrase to refer
>   | to the var=...  command.
> 
> In general terms nor would I (it isn't a command, in the strict sense,
> but we might ignore that)
> 
>   | It's just being used as shorthand for "the command containing the
>   | assignment to var".
> 
> Not really, any other command containing an assignment to a var, which
> contained a command substitution wouldn't be relevant, only a simple
> command with no command word.
> 
> The bigger problem is that the wording suggests there is something
> special about command substitutions in assignment statements, which
> isn't correct, any command substitution in any command without a
> command word works

It reads to me as an example, rather than indicating the statement
only applies to that case, but if you are reading it differently then
I'm happy to make it clearer that there are other similar cases.  I'd
suggest adding to the end of the small-font note (borrowing some words
from part of your email I haven't quoted):

Likewise for any pipeline consisting entirely of a simple command
that has no command word, but contains one or more command
substitutions.  (See [xref to 2.9.1].)
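
Both cases can be checked with a few lines of shell. A small sketch, in which
the if wrappers exist only to capture the status in the else branch and the
variable names are made up for the example:

```shell
# Assignment-only command: its status is that of the command substitution.
if var=$(exit 5); then st_assign=0; else st_assign=$?; fi

# No assignment at all: the substitution expands to nothing, leaving no command word.
if $(exit 6); then st_bare=0; else st_bare=$?; fi

echo "assignment: $st_assign, bare substitution: $st_bare"
```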

> Next issue:
> 
>   | This behaviour of ksh was the reason I proposed the unspecified behaviour.
> 
> Yes, I assumed that.
> 
>   | The bug, as I see it, is that the value of $? and the behaviour of exit
>   | differ.
> 
> Yes.   Or well kind of, the bug is that exit picks the wrong default
> for n when executed in a 

Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Joerg Schilling
Bruno Haible  wrote:

> If that is your approach to standardization, then it is better to not 
> standardize anything.

If your approach is to standardize obvious implementation bugs, I am a bit 
bewildered.

I had hoped that you were interested in a fruitful discussion and open to 
useful arguments.

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Joerg Schilling
Hi Bruno!

Bruno Haible  wrote:

> Regarding the gettext(1) program and whether it expands escape sequences
> by default:
>
>
> 1) [1] is ambiguous / self-contradictory.
> On one hand it says:
>
>   This utility interprets C escape sequences such as \t for tab. Use \\ to
>   print a backslash...

It is obvious that gettext(1) must expand escape sequences by default since 
this is the documented default behavior for both Solaris gettext(1) and GNU 
gettext(1) but in the default case, GNU gettext does not behave the way it is
documented. 

> Which sounds like they are expanded by default.
>
> On the other hand it says:
>
>   OPTIONS
>  -e
>  Enable expansion of some escape sequences.
>
> Which sounds like they are NOT expanded by default.

You forgot to mention that it also mentions:

   -n  Suppress trailing newline.

which makes it obvious that someone made a mistake while writing the GNU 
documentation that describes the options. Someone forgot to mention that the 
options -e/-n are both valid only together with the -s option that switches
the behavior. But this wrong wording is only in the GNU documentation, while 
the official reference documentation from the inventor of the utility says:

     -s    Behaves like echo(1) (see DESCRIPTION above). If the -s
           option is specified, no expansion of C escape sequences
           is performed and a newline character is appended to the
           output, by default.

With your interpretation of the GNU documentation, GNU gettext would need to 
output a newline at the end by default, but it does not. This is another hint
for an implementation bug in GNU gettext...

> So, you can't resolve this question by referencing an ambiguous specification.

Given that the main explanation requires escape sequences to be expanded, 
without stating any exception, this is doubtlessly the default behavior. We may 
discuss things beyond that description, but given that GNU gettext even copied 
this text from SunOS man pages from the early 1990s, it is obvious that the 
intention of the GNU gettext implementation was to be compatible with the 
reference implementation and there is only a bug in the GNU implementation. 

The documentation from the reference implementation (Solaris) is definitely 
not ambiguous since it correctly documents -s as an exception.

The GNU documentation is clear about the default case that is documented in the 
DESCRIPTION section, but GNU gettext does not follow that GNU documentation.

The only ambiguity I see in the GNU documentation is in effect for the 
non-default case, but in the non-default case, GNU gettext follows the behavior 
of the reference implementation.


> 2) GNU gettext(1) and Solaris gettext(1) differ in this respect:
>
> GNU:
> $ gettext 'abc\ndef'; echo
> abc\ndef
>
> Solaris:
> $ gettext 'abc\ndef'; echo
> abc
> def
>
> This makes it hard to standardize, since the behaviours differ, and
> both implementations will want to claim need for backward-compatibility.

Well, people who expect the current GNU behavior obviously rely on a bug in the 
implementation. So the main question to me is whether there is a chance that 
GNU gettext will be fixed. If you want to protect GNU users that rely on that 
implementation bug, GNU gettext could be enhanced to follow the documented 
behavior when POSIXLY_CORRECT is set, as is already done for other deviations 
from the standard on Linux.

The Solaris gettext behaves as documented and I see no reason to introduce a 
different description in POSIX since that would cause backwards compatibility 
problems. The Solaris behavior is obviously not a bug and did not change 
during the past 30+ years - much longer than GNU gettext exists.

The Solaris implementation is even able to emulate the GNU behavior if it is
called as:

gettext -sn "some text"

as long as you do not want to supply "textdomain" as the first argument but 
rather as the -d option argument.


> 3) Additionally, there's the problem that gettext(1) does not and can not
> (as a program) deal with strings that contain placeholders. As soon as

It seems that you misunderstand the way gettext(1) is intended to be used.

I see two useful ways to do what you like:

1)

gettext -s "Hello World" $$

2)

text=$(gettext 'Hello World $$')
eval echo $text

or

eval echo $(gettext 'Hello World $$')

Method 2 is equivalent to the way C programs use gettext(3).

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: Coordination on standardizing gettext() in future POSIX

2020-01-22 Thread Joerg Schilling
Shware Systems  wrote:

> This is not invention, as even Solaris allows you to turn it off with -s, as 
> you point out. It may work fine for the charsets/charmap files Solaris 
> historically provides to have escapes active as the default, but this does 
> not equate to it being valid for all conforming charsets, if an application 
> makes use of localedef, that I see. As such, from a portability standpoint, I 
> view not processing escapes as the safer alternative.

What would be the reason for making the standard incompatible with a practice 
that has existed for more than 30 years?

Gettext is a SunOS invention and other implementations are expected to follow 
the definition from the reference implementation.

Do you really want to require SunOS to lose backwards compatibility?

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'