Re: Status of $'...' addition (was: ksh93 job control behaviour)
It is not "some sensible \u sequences" alone. First off, there's little agreement on what constitutes 'sensible'. Just the headache of the U300 diacritics adds to XBD6 significantly, if they're to be supported, as one example. The 'sensible' present solution is to not support them at all; others will argue the 'sensible' thing is to support them because Unicode does include these code points. The headache stems from the fact that it is not simply a matter of arbitrarily saying let's have the utility support these in $'', it's ensuring there are interfaces for the utilities to be written in that understand left-associative combining sequences, and that these interfaces are portable because requirements in XBD add that support. On Thursday, July 30, 2020 Steffen Nurpmeso wrote: shwaresyst wrote in <1127836834.9524758.1596121054...@mail.yahoo.com>: |Yes, the additions necessary still for even limited Unicode support \ |above the broken bandaids C11+ provide are one of those issues. Where \ |Unicode is incompatible with POSIX, and is therefore (by design) broken \ |too needs addressing also. The white papers detailing most of these \ |changes have yet to be written, or published if some have been. Hmm, the ISO C reference is of course true. But then this is about Unix/POSIX shells, and then adding some sensible \u sequences and defining their conversion to locale charset can only be an improvement, i think. --steffen | |Der Kragenbaer, The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)
Re: Status of $'...' addition (was: ksh93 job control behaviour)
On 7/30/20 7:29 PM, Robert Elz wrote: > | And for that it would be tremendous if $'' would be defined so > | that it can be used as the sole quoting mechanism, > > No thanks. Partly because $'' is already implemented (widely) > and used (perhaps slightly less yet) - so that ship has sailed. > > I believe I've seen $" ... " used that way somewhere though (don't > recall where) and I believe it is a mistake. None of the existing implementations of $"..." use it in that way. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU c...@case.edu http://tiswww.cwru.edu/~chet/
Re: Status of $'...' addition (was: ksh93 job control behaviour)
Date:Thu, 30 Jul 2020 15:53:53 +0200 From:Steffen Nurpmeso Message-ID: <20200730135353.qwslp%stef...@sdaoden.eu> | The problem being that what is in the wild does not work out for | many languages. I admit to not knowing a lot of the internationalisation issues, or of unicode, but I don't understand this at all. The quoting mechanisms in the shell provide a means to create specific bit patterns to assign to variables, pass as parameters to programs, etc. I don't see that the mechanism by which they're encoded in the sh language should matter all that much, the same thing could be read from a file instead ( var=$(cat file) ) in which case the shell spec has no control over the bit patterns at all. Of course the quoting mechanisms make a difference to the ease of use for the sh programmer, but that's an entirely different issue. | The in-use shell quote pattern consisting of small, isolated parts | which depend on which kind of escaping and expanding is necessary | just does not work out for many languages. Can you give an example of something which cannot be done (assuming $'' as currently intended to be specified)? Note: not an example of someone using the mechanisms to do the wrong thing - there are zillions of ways to write bad code, but an example of something which cannot be done correctly as specified. Then we'll see if that really matters. | ? echo Don"'"t you worry$'\x21' The sun shines on us. $'\u263A' | | The latter is what i mean. There are many languages on this world | where these \u expansions do not work out that way, but where the | "entire sentence must be interpreted as a unity" in order to get | the iconv(3) conversation to nl_langinfo(CODESET) correctly, aka | the way it is _desired_. 
Surely this depends upon how the shell works - if the shell is attempting to convert just the \u escape into some other codeset, I can see your point, but it doesn't need to work like that - it can work internally in 10646 code points (whether encoded in 16 or 32 bit values, or as UTF-8), and only convert to the desired charset when actually used (that is, when about to run "echo", at which point the entire string is available). In any case, if the user has specified a specific Unicode code point, shouldn't that always be what is generated, regardless of whether it makes sense or not? | And for that it would be tremendous if $'' would be defined so | that it can be used as the sole quoting mechanism, No thanks. Partly because $'' is already implemented (widely) and used (perhaps slightly less yet) - so that ship has sailed. I believe I've seen $" ... " used that way somewhere though (don't recall where) and I believe it is a mistake. As soon as you have multiple different types of expansions that can occur, there are problems with which one gets priority, which is performed first. So, assuming there is a $"..." which works as you desire, what happens with $"${VAR+foo\x7Dbar}" Do we get foo}bar or foobar} ? (assuming VAR was set of course). Whichever way you pick, there will be arguments for doing it the other way, in some other case. This stuff simply becomes a mess. Please, don't go there. If we wanted to add C type encodings along with the others, we'd need to do it in a way that is consistent with the other expansions, perhaps using something like $[x7D] or $[u263A] or $[n] (but no, this is not a serious suggestion). And I cannot fathom how this in any way overcomes your earlier objection, quoted strings in sh are not units, they're simply pieces of some longer word (or can be) - your Don"'"t example above (and the worry$'\x21') are both examples of that. kre
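The argument that quoting only decides how bytes are spelled in the script, and not what the resulting word contains, can be illustrated in portable sh. This is a minimal sketch under stated assumptions: POSIX printf octal escapes stand in for $'\x21' and $'\u263A' so that it also runs in shells without $'...', UTF-8 is assumed as the encoding of U+263A, and the scratch file name is a hypothetical choice.

```shell
#!/bin/sh
# Sketch only: octal printf escapes stand in for $'\x21' and $'\u263A'.
# The scratch file name below is a hypothetical choice for this demo.

# One word assembled from several differently quoted pieces.
bang=$(printf '\041')              # "!" (what $'\x21' would spell)
smiley=$(printf '\342\230\272')    # UTF-8 bytes of U+263A
pieces=Don"'"t' you worry'"$bang $smiley"

# The same bytes written as a single double-quoted string.
whole="Don't you worry! $smiley"

# The same bytes again, read from a file the shell grammar never sees.
scratch="${TMPDIR:-/tmp}/quote_demo.$$"
printf '%s\n' "$whole" > "$scratch"
from_file=$(cat "$scratch")
rm -f "$scratch"

if [ "$pieces" = "$whole" ] && [ "$whole" = "$from_file" ]; then
    echo "identical words"
else
    echo "words differ"
fi
```

Whatever conversion to the locale's codeset is wanted would then operate on the finished word, which is the same in all three cases.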
Re: More issues with pattern matching
On 26/09/2019 10:20, Geoff Clare wrote: Geoff Clare wrote, on 26 Sep 2019: Are shells required to support this, and are shells therefore implicitly required to translate patterns to regular expressions, or should it be okay to implement this with single character support only? Shells are required to support it. They don't need to translate entire patterns to regular expressions - they can use either regcomp()+regexec() or fnmatch() to see if the bracket expression matches the next character. Sorry, I should have written "matches *at* the next character" here; I didn't mean to imply checking against a single character. For example, if using regcomp()+regexec() the shell could try to match the bracket expression against the remainder of the string and see how much of it regexec() reported as matching. To use fnmatch() I suppose you would have to use it in a loop, passing it first one character, then two, etc. (stopping at the number of characters between the '.'s). As I had replied at the time, it is fundamentally impossible in the general case as POSIX does not provide any mechanism to escape characters and there is nothing in POSIX that rules out the possibility of a collating element containing "=]" or ".]". However, ignoring that aspect of it, looking at implementing this once again, implementing it the way you specified is incorrect, fixing it to make it correct cannot possibly be done efficiently with standard library support, and shells in general don't bother to implement what POSIX specifies here. Take the previous example glibc's cy_GB.UTF-8 locale, but with a different collating element: in this locale, "dd" is a single collating element too. Therefore, this must be matchable by bracket expressions. However, "d" individually must *also* be matched by pattern expressions. "dd" can be matched by both [!x] and [!x][!x]. 
A shell cannot use regcomp()+regexec() to find the longest match for [!x] and assume that that is matched: a shell where case dd in [!x]d) echo match ;; esac does not print "match" does not implement what POSIX requires. A shell where case dd in [!x]) echo match ;; esac does not print "match" does not implement what POSIX requires either. Using regcomp()+regexec() to bind [!x] to either "d" or "dd" without taking the rest of the pattern into account will fail to match in one of these cases. And it needn't be the same way for all bracket expressions in a single pattern: case ddd in [!x][!x]) echo match ;; esac Shells are required by POSIX to consider both the possibility that [!x] picks up "d" and that it picks up "dd" for each bracket expression individually. This means that in the worst case, if every bracket expression in a pattern has X ways to match, and a pattern has Y bracket expressions, the shell is required to consider X^Y possibilities. This is completely unreasonable and it's obvious why no shell actually does this. The complexity can be reduced in theory, but POSIX does not expose enough information to allow that to be implemented in a shell. The only way around this mess is by translating the whole pattern to a regular expression, as only the C library has enough detailed knowledge about the locale that it can implement it efficiently.[*] Doing that has its own new set of problems though: translating the whole pattern to a regular expression means the shell no longer has the option to decide how to handle invalid byte sequences (byte sequences that lead to EILSEQ) that shells in general try to tolerate, and the shell no longer has the option to decide how to handle invalid patterns (patterns containing non-existent character classes or collating elements) which shells in general also aim to tolerate. Cheers, Harald van Dijk [*] I have not investigated whether implementations actually do do this efficiently.
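The three case statements in this message can be tried against any shell and locale with a small harness. This is a sketch only: `matches` and `report` are hypothetical helper names, and the run below pins the C locale, where "dd" is simply two collating elements, as a baseline. Under a locale such as cy_GB.UTF-8 the message argues that POSIX requires all three to match; the harness makes no claim about what a given shell actually does there.

```shell
#!/bin/sh
# Sketch only: "matches" and "report" are hypothetical helpers.
# An unquoted expansion used as a case pattern is itself treated as a pattern.
matches() {
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

report() {
    if matches "$1" "$2"; then
        echo "$1 vs $2: match"
    else
        echo "$1 vs $2: no match"
    fi
}

# Baseline in the C locale, where "dd" is two collating elements.
LC_ALL=C
export LC_ALL
report dd  '[!x]d'       # [!x] has to bind to "d"
report dd  '[!x]'        # would need [!x] to bind to "dd"
report ddd '[!x][!x]'    # would need one "d" and one "dd"
```

To probe the multi-character case, the same three `report` calls can be repeated after setting LC_ALL to a locale that defines "dd" as a collating element, if one is installed.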
Re: Status of $'...' addition (was: ksh93 job control behaviour)
shwaresyst wrote in <1127836834.9524758.1596121054...@mail.yahoo.com>: |Yes, the additions necessary still for even limited Unicode support \ |above the broken bandaids C11+ provide are one of those issues. Where \ |Unicode is incompatible with POSIX, and is therefore (by design) broken \ |too needs addressing also. The white papers detailing most of these \ |changes have yet to be written, or published if some have been. Hmm, the ISO C reference is of course true. But then this is about Unix/POSIX shells, and then adding some sensible \u sequences and defining their conversion to locale charset can only be an improvement, i think. --steffen | |Der Kragenbaer,The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)
Re: Status of $'...' addition (was: ksh93 job control behaviour)
David A. Wheeler wrote in : |Steffen Nurpmeso wrote: |>> And for that it would be tremendous if $'' would be defined so |>> that it can be used as the sole quoting mechanism, and that would |>> then also include expansion of $VAR (i use \$VAR or \${VAR} in my |>> mailer). But to know exactly how problematic splitting of quotes |>> is for many languages of the world, including right-to-left |>> direction and shift state changes etc., and changing of meaning as |>> such if the sentence cannot be interpreted as a unity, a real |>> expert had to be asked. Anyhow, the Unicode effort mandates |>> processing of entire strings and denotes isolated treatment as |>> a complete error. | |I think eliminating old quoting mechanisms would be a mistake. That is an unfortunate misunderstanding, sorry. I do not want to obsolete them from the standard side, regarding that all i would like to see is that $'' gets the few tweaks it needs to include the possibilities of the other quoting mechanisms, and in effect this is only "" ($VAR and `` thereof). And this is solely, no, this is because (a) like that the entire string expansion can be fed into iconv(3), and (b) because i think for users, and for program/script source hm audit it is much easier to grasp than having the need to sequence it, for example to embed $VAR expansion into a string. |On Thu, 30 Jul 2020 16:09:56 +0200, Joerg Schilling wrote: |> Even if it would become part of the standad today, you stilll would need |> to wait some years until all implementations take it up. | |That's true for almost all standards changes. |However, many shells *already* implement $'...'. |It's also relatively trivial to implement, and it provides |very useful capabilities (such as the ability to easily assign terminating \ |newlines). | |I'd still like to see the addition of $'...'. Me too, i am all in favour of $'', and i hope it is not because of me that issue 249 is still open. It is anyway implemented the way it is as of today! 
--steffen | |Der Kragenbaer,The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)
Re: Status of $'...' addition (was: ksh93 job control behaviour)
Joerg Schilling wrote in <5f22d4b4.8vf9+w1hbegjrn1d%joerg.schill...@fokus.fraunhofer.de>: |Steffen Nurpmeso wrote: | |> And for that it would be tremendous if $'' would be defined so |> that it can be used as the sole quoting mechanism, and that would |> then also include expansion of $VAR (i use \$VAR or \${VAR} in my |> mailer). But to know exactly how problematic splitting of quotes |> is for many languages of the world, including right-to-left |> direction and shift state changes etc., and changing of meaning as |> such if the sentence cannot be interpreted as a unity, a real |> expert had to be asked. Anyhow, the Unicode effort mandates |> processing of entire strings and denotes isolated treatment as |> a complete error. | |Even if it would become part of the standad today, you stilll would need |to wait some years until all implementations take it up. I must admit the last time i looked in an iconv(3) implementation (GNU) it was not like that either, it was plain "1:1" conversion. (I hope i am not lying now, ..it is what i remember.) But even if it is for the future, if you write u$'\u0308' nothing can happen, if you would write $'u\u0308' then an iconv(3) which does its job really well could recognize the COMBINING DIAERESIS and create the ü you want in your LATIN1 environment. (This is a simple example, but \u is meant to embed Unicode, and then graphemes come into play; i mean, even ncurses is capable to properly deal with this stuff since many years, and this is something yet to be standardized.) --steffen | |Der Kragenbaer,The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)
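The difference between "u" followed by a combining diaeresis and the precomposed ü can be made visible at the byte level. This is a sketch under assumptions: POSIX printf octal escapes stand in for the \u escapes, UTF-8 is assumed as the source encoding, `hex_of` is a hypothetical helper, and the iconv(1) utility is used in place of iconv(3). Whether the converter composes the pair into a single LATIN1 character is implementation-specific, so the script reports the outcome without assuming it.

```shell
#!/bin/sh
# Sketch only: octal printf escapes stand in for \u0308 and \u00FC,
# assuming UTF-8 as the encoding; hex_of is a hypothetical helper.
hex_of() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

decomposed=$(printf 'u\314\210')   # "u" followed by U+0308 COMBINING DIAERESIS
precomposed=$(printf '\303\274')   # U+00FC, the single code point for u-umlaut

echo "decomposed:  $(hex_of "$decomposed")"
echo "precomposed: $(hex_of "$precomposed")"

# Whether the pair is composed into one LATIN1 byte is up to the converter,
# so a refusal is reported instead of being assumed away.
if latin1=$(printf '%s' "$decomposed" | iconv -f UTF-8 -t ISO-8859-1 2>/dev/null); then
    echo "converted:   $(hex_of "$latin1")"
else
    echo "converted:   (iconv refused or is unavailable)"
fi
```

The converter only has a chance to compose the two code points if it receives them together, which is the case for the word as a whole regardless of how its pieces were quoted.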
Re: A question about interpretation
Date:Thu, 30 Jul 2020 09:53:00 -0700 From:Nick Stoughton Message-ID: | Cross-references are informative. That is what I would have expected. | However, even without a cross-reference if the normative text in | two places disagrees or allows for different things, Not quite that, the fake example I gave was somewhat more extreme. It is more a question of what should be done, specifics below. | If a specific case refers to a | generalized one, then there is nothing wrong. Sure. | In your example, I would file an interpretation request, since the standard | does not appear to define the concept of Orange as a fruit except in this | one narrow case. That would be the same as the actual case. Further, there are real similarities, in both the general expectation would be that the general case doesn't really need defining, everyone "simply knows" what it is (ie: the concept of an orange is common knowledge) but a definition is required for the specific case to make it clear what it applies to (just which oranges are Valencia oranges, must they actually be grown in Spain?). Now the actual case: XCU 2.6.1 (this is from the Issue 8 draft, but the relevant parts are mostly unchanged from Issue 7 I believe) In an assignment (see XBD Section 4.23), ["assignment" is the orange] multiple tilde-prefixes can be used: one at the beginning of the word (that is, following the <equals-sign> of the assignment), or one following any unquoted <colon>, or both. A tilde-prefix in an assignment is terminated by the first unquoted <colon> or <slash>, or the end of the assignment word. (earlier text says that in the default case, ie: when not an assignment, the tilde-prefix is everything up to an unquoted '/' or the whole word if there is none.) XBD 4.23 is 4.23 Variable Assignment In the shell command language, a word consisting of the following parts: [...] There is no other definition of "assignment" (in XBD 3 or XBD 4, or anywhere I can see in XCU), just this one, which is the definition of one specific form of assignment. 
Where this becomes an issue is in relation to XCU 2.6.2 In addition, a parameter expansion can be modified by using one of the following formats. In each case that a value of word is needed (based on the state of parameter, as described below), word shall be subjected to tilde expansion, parameter expansion, [...] and still from 2.6.2 ${parameter:=[word]} Assign Default Values. If parameter is unset or null, quote removal shall be performed on the expansion of word and the result (or an empty string if word is omitted) shall be assigned to parameter. [...] Now the question is in an expression like ${unset_var=~:~user} what should happen? That last quote from 2.6.2 says an assignment takes place (given unset_var is in fact unset, if not, the "word" is irrelevant), the earlier quote from 2.6.2 says that tilde expansion happens on the word, and the quote from 2.6.1 says that in an assignment, ':' terminates the tilde prefix, and the ~ after the ':' starts a new tilde prefix, so provided that "user" is a known user name, this should set unset_var to ${HOME}:$(homedir_of user) (assuming there was a function/command "homedir_of" which does the obvious thing). That's how the NetBSD shell works. But it is the only one that I'm aware of. The various ksh's (and bash) seem to treat the ':' as a terminator for the tilde prefix, but don't treat it as being the starting point of a new one (ie: kind of half of an assignment). As best I can tell that's behaviour that makes no sense (with one caveat below). Other shells treat the entire word (there being no '/') as the tilde-prefix. This must be being justified by the xref to XBD 4.23 as the definition of an "assignment" even though what it defines is a "variable assignment" which is not the phrase that 2.6.1 uses. (ie: orange vs valencia orange). 
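The question can be probed directly. This is a minimal sketch under assumptions: a bare ~ is used in place of ~user so the result does not depend on the user database, and /home/demo is a hypothetical HOME chosen only to make the output readable. The first line shows an ordinary variable assignment, where the rule quoted from XCU 2.6.1 applies without dispute; the second shows the same word inside ${parameter=word}, where the shells described above disagree, so the result is printed and nothing is assumed about it.

```shell
#!/bin/sh
# Sketch only: /home/demo is a hypothetical HOME chosen to make output readable.
HOME=/home/demo

# An ordinary variable assignment (XBD 4.23): a tilde-prefix is recognised
# after the "=" and after each unquoted ":".
assigned=~:~
printf 'variable assignment:  %s\n' "$assigned"

# The same word inside ${parameter=word}. Shells disagree here, so the
# result is printed rather than assumed.
unset unset_var
: ${unset_var=~:~}
printf 'parameter expansion:  %s\n' "$unset_var"
```

The second line is where the behaviours described above would show up: both tildes expanded, only the first expanded because ':' merely terminates the prefix, or the whole word taken as one tilde-prefix.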
The word in question does not meet the definition of a variable assignment (while the var= part exists in the parameter expansion, only the word part of it is being processed here - and even if we step backwards, the parameter expansion, while superficially similar to a variable assignment doesn't meet the definition (it contains "${" at the start for example, and even that might be embedded in some longer word)). Since they do not treat this expansion as an assignment, the ':' does not terminate the tilde-prefix, and they fail to find a user with the resulting name (":~user") which isn't a portable user name in any case. They then simply say that tilde expansion does nothing, and leave the word unaltered (assign '~:~user' to unset_var). The analysis gets complicated here, (again from XCU 2.6.1): If these characters do not form a portable login name (see the
Re: Status of $'...' addition (was: ksh93 job control behaviour)
Steffen Nurpmeso wrote: > > And for that it would be tremendous if $'' would be defined so > > that it can be used as the sole quoting mechanism, and that would > > then also include expansion of $VAR (i use \$VAR or \${VAR} in my > > mailer). But to know exactly how problematic splitting of quotes > > is for many languages of the world, including right-to-left > > direction and shift state changes etc., and changing of meaning as > > such if the sentence cannot be interpreted as a unity, a real > > expert had to be asked. Anyhow, the Unicode effort mandates > > processing of entire strings and denotes isolated treatment as > > a complete error. I think eliminating old quoting mechanisms would be a mistake. On Thu, 30 Jul 2020 16:09:56 +0200, Joerg Schilling wrote: > Even if it would become part of the standard today, you still would need > to wait some years until all implementations take it up. That's true for almost all standards changes. However, many shells *already* implement $'...'. It's also relatively trivial to implement, and it provides very useful capabilities (such as the ability to easily assign terminating newlines). I'd still like to see the addition of $'...'. --- David A. Wheeler
Re: A question about interpretation
Cross-references are informative. However, even without a cross-reference if the normative text in two places disagrees or allows for different things, then an interpretation is required. If a specific case refers to a generalized one, then there is nothing wrong. In your example, I would file an interpretation request, since the standard does not appear to define the concept of Orange as a fruit except in this one narrow case. If, on the other hand, the definition was for an Orange, and the squeezing requirement was only for Valencia Oranges, then the standard is consistent, and the standard is silent about Mandarin Oranges (permitting but not requiring squeezing). Hope that helps! -- Nick On Thu, Jul 30, 2020 at 7:54 AM Robert Elz wrote: > In the standard, if the words say something, and a > followed by a cross reference (xref) to a definition > which defines something subtly different, is the > correct reading to limit the specification to what > is defined in the xref, or is the xref to be treated > as more informative - as additional information which > might help explain a term used ? > > To give an example (purposely, and obviously, not > related to posix for right now), suppose the standard > said > > if the fruit is an orange [xref definitions 17) > then squeeze it, otherwise take care not to squeeze it. > > and definitions 17 is: > > 17. Valencia Orange: a roundish orange coloured citrus ... > (the rest of what it might say is irrelevant). > > In this case, if we pick up a piece of fruit, and it is an orange, > but not a valencia orange, are we to squeeze it or not? > > That is, does the definition that was xref'd limit the interpretation > of the preceding word (or phrase) to only apply to what is defined, > or is it to be taken as simply providing information, should the > reader happen not to know what an orange might be? > > kre > >
Re: Status of $'...' addition (was: ksh93 job control behaviour)
Yes, the additions necessary still for even limited Unicode support above the broken bandaids C11+ provide are one of those issues. Where Unicode is incompatible with POSIX, and is therefore (by design) broken too needs addressing also. The white papers detailing most of these changes have yet to be written, or published if some have been. On Thursday, July 30, 2020 Steffen Nurpmeso wrote: shwaresyst wrote in <311169368.9432836.1596108598...@mail.yahoo.com>: |On Thursday, July 30, 2020 Geoff Clare wrote: |Robert Elz wrote, on 29 Jul 2020: |> |> Speaking of which, what is the current holdup with resolving |> whichever bug it is (I hate searching in mantis, so I won't |> try here) which specifies $'...' ? Perhaps whatever the |> problem was (before my time) with the specification of that |> is no longer a problem? | |It's bug 249. It was reopened in Oct 2015 and several notes were |added to the bug after that, starting with | |https://austingroupbugs.net/view.php?id=249#c2893 | |My guess is the conference calls postponed returning to it because |there was ongoing discussion, but by the time the discussion ended |it had "gone off the radar". ... |Also, as something new, its inclusion is part of a later draft of Issue \ |8. Additional issues it depends on need to be addressed first, specified \ |fully, and incorporated. This is more why it went on the back burner, \ |that I recall. Various other bugs are in similar state; the prerequisites \ |to finish speciifying them so they can be considered portable aren't \ |done yet either. The problem being that what is in the wild does not work out for many languages. The in-use shell quote pattern consisting of small, isolated parts which depend on which kind of escaping and expanding is necessary just does not work out for many languages. Period. I (the mailer i maintain, using POSIX-incompatible sh(1)ell-style command line input) for example claim ? echo 'Quotes '${HOME}' and 'tokens" differ!"# no comment ? 
echo Quotes ${HOME} and tokens differ! # comment ? echo Don"'"t you worry$'\x21' The sun shines on us. $'\u263A' The latter is what i mean. There are many languages on this world where these \u expansions do not work out that way, but where the "entire sentence must be interpreted as a unity" in order to get the iconv(3) conversation to nl_langinfo(CODESET) correctly, aka the way it is _desired_. Of course you can move it all to the twilight zone of "undefined behaviour", but if you do not, then quoting must extend to the largest possible extend, and interpreted as a unity. And for that it would be tremendous if $'' would be defined so that it can be used as the sole quoting mechanism, and that would then also include expansion of $VAR (i use \$VAR or \${VAR} in my mailer). But to know exactly how problematic splitting of quotes is for many languages of the world, including right-to-left direction and shift state changes etc., and changing of meaning as such if the sentence cannot be interpreted as a unity, a real expert had to be asked. Anyhow, the Unicode effort mandates processing of entire strings and denotes isolated treatment as a complete error. --steffen | |Der Kragenbaer, The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)
A question about interpretation
In the standard, if the words say something, and are followed by a cross reference (xref) to a definition which defines something subtly different, is the correct reading to limit the specification to what is defined in the xref, or is the xref to be treated as more informative - as additional information which might help explain a term used? To give an example (purposely, and obviously, not related to posix for right now), suppose the standard said if the fruit is an orange [xref definitions 17] then squeeze it, otherwise take care not to squeeze it. and definitions 17 is: 17. Valencia Orange: a roundish orange coloured citrus ... (the rest of what it might say is irrelevant). In this case, if we pick up a piece of fruit, and it is an orange, but not a valencia orange, are we to squeeze it or not? That is, does the definition that was xref'd limit the interpretation of the preceding word (or phrase) to only apply to what is defined, or is it to be taken as simply providing information, should the reader happen not to know what an orange might be? kre
Re: Status of $'...' addition (was: ksh93 job control behaviour)
Steffen Nurpmeso wrote: > And for that it would be tremendous if $'' would be defined so > that it can be used as the sole quoting mechanism, and that would > then also include expansion of $VAR (i use \$VAR or \${VAR} in my > mailer). But to know exactly how problematic splitting of quotes > is for many languages of the world, including right-to-left > direction and shift state changes etc., and changing of meaning as > such if the sentence cannot be interpreted as a unity, a real > expert had to be asked. Anyhow, the Unicode effort mandates > processing of entire strings and denotes isolated treatment as > a complete error. Even if it would become part of the standard today, you still would need to wait some years until all implementations take it up. Jörg -- EMail: jo...@schily.net (home) Jörg Schilling D-13353 Berlin joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'
Re: Status of $'...' addition (was: ksh93 job control behaviour)
shwaresyst wrote in <311169368.9432836.1596108598...@mail.yahoo.com>: |On Thursday, July 30, 2020 Geoff Clare wrote: |Robert Elz wrote, on 29 Jul 2020: |> |> Speaking of which, what is the current holdup with resolving |> whichever bug it is (I hate searching in mantis, so I won't |> try here) which specifies $'...' ? Perhaps whatever the |> problem was (before my time) with the specification of that |> is no longer a problem? | |It's bug 249. It was reopened in Oct 2015 and several notes were |added to the bug after that, starting with | |https://austingroupbugs.net/view.php?id=249#c2893 | |My guess is the conference calls postponed returning to it because |there was ongoing discussion, but by the time the discussion ended |it had "gone off the radar". ... |Also, as something new, its inclusion is part of a later draft of Issue \ |8. Additional issues it depends on need to be addressed first, specified \ |fully, and incorporated. This is more why it went on the back burner, \ |that I recall. Various other bugs are in similar state; the prerequisites \ |to finish speciifying them so they can be considered portable aren't \ |done yet either. The problem being that what is in the wild does not work out for many languages. The in-use shell quote pattern consisting of small, isolated parts which depend on which kind of escaping and expanding is necessary just does not work out for many languages. Period. I (the mailer i maintain, using POSIX-incompatible sh(1)ell-style command line input) for example claim ? echo 'Quotes '${HOME}' and 'tokens" differ!"# no comment ? echo Quotes ${HOME} and tokens differ! # comment ? echo Don"'"t you worry$'\x21' The sun shines on us. $'\u263A' The latter is what i mean. There are many languages on this world where these \u expansions do not work out that way, but where the "entire sentence must be interpreted as a unity" in order to get the iconv(3) conversation to nl_langinfo(CODESET) correctly, aka the way it is _desired_. 
Of course you can move it all to the twilight zone of "undefined behaviour", but if you do not, then quoting must extend to the largest possible extend, and interpreted as a unity. And for that it would be tremendous if $'' would be defined so that it can be used as the sole quoting mechanism, and that would then also include expansion of $VAR (i use \$VAR or \${VAR} in my mailer). But to know exactly how problematic splitting of quotes is for many languages of the world, including right-to-left direction and shift state changes etc., and changing of meaning as such if the sentence cannot be interpreted as a unity, a real expert had to be asked. Anyhow, the Unicode effort mandates processing of entire strings and denotes isolated treatment as a complete error. --steffen | |Der Kragenbaer,The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)
Re: ksh93 job control behaviour [was: Draft suggestion: Job control and subshells]
Geoff Clare wrote: > It's only easy because (most/all?) shells take the easy option and do > a lexical analysis of the command to be substituted. Applications can't > expect the following to work, but if the feature was implemented > "properly", it would: > > showalltraps() { trap -p; } > alltraps=$(showalltraps) Could you explain what you understand by "most shells take the easy option and do a lexical analysis of the command to be substituted"? What I do remember, however, is that nobody checked the output of the command when used in a command substitution while we discussed the new feature. Jörg -- EMail: jo...@schily.net (home) Jörg Schilling D-13353 Berlin joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'
RE: Status of $'...' addition (was: ksh93 job control behaviour)
Also, as something new, its inclusion is part of a later draft of Issue 8. Additional issues it depends on need to be addressed first, specified fully, and incorporated. This is more why it went on the back burner, that I recall. Various other bugs are in a similar state; the prerequisites to finish specifying them so they can be considered portable aren't done yet either. On Thursday, July 30, 2020 Geoff Clare wrote: Robert Elz wrote, on 29 Jul 2020: > > Speaking of which, what is the current holdup with resolving > whichever bug it is (I hate searching in mantis, so I won't > try here) which specifies $'...' ? Perhaps whatever the > problem was (before my time) with the specification of that > is no longer a problem? It's bug 249. It was reopened in Oct 2015 and several notes were added to the bug after that, starting with https://austingroupbugs.net/view.php?id=249#c2893 My guess is the conference calls postponed returning to it because there was ongoing discussion, but by the time the discussion ended it had "gone off the radar". -- Geoff Clare The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
Status of $'...' addition (was: ksh93 job control behaviour)
Robert Elz wrote, on 29 Jul 2020: > > Speaking of which, what is the current holdup with resolving > whichever bug it is (I hate searching in mantis, so I won't > try here) which specifies $'...' ? Perhaps whatever the > problem was (before my time) with the specification of that > is no longer a problem? It's bug 249. It was reopened in Oct 2015 and several notes were added to the bug after that, starting with https://austingroupbugs.net/view.php?id=249#c2893 My guess is the conference calls postponed returning to it because there was ongoing discussion, but by the time the discussion ended it had "gone off the radar". -- Geoff Clare The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England