Re: Could U+E0001 LANGUAGE TAG become undeprecated please? There is a good reason why I ask

2020-02-10 Thread Steffen Nurpmeso via Unicode
wjgo_10...@btinternet.com via Unicode wrote in <141cecf1.23e.1702ea529c1.webtop@btinternet.com>: |Could U+E0001 LANGUAGE TAG become undeprecated please? There is a good |reason why I ask | |There is a German song, Lorelei, and I searched to find an English |translation. Regarding Rhine

Re: Access to the Unicode technical site (was: Re: Unicode's got a new logo?)

2019-07-19 Thread Steffen Nurpmeso via Unicode
Hello Mr. Ken Whistler. Ken Whistler wrote in <3d1676bb-f3c1-8a3e-fdc5-1c0bdd74a...@sonic.net>: |On 7/18/2019 11:50 AM, Steffen Nurpmeso via Unicode wrote: |> I also decided to enter /L2 directly from now on. | |For folks wishing to access the UTC document register, Unicode |C

Re: Unicode's got a new logo?

2019-07-18 Thread Steffen Nurpmeso via Unicode
Yifán Wáng via Unicode wrote: |I cannot help but notice the new home.unicode.org site embraces a new |logo, blue base color with a humanist type, rather than the |traditional one, red and geometric. Does anybody know if it means that |Unicode wants to renew its logo or that they serve for

Re: Base64 encoding applied to different unicode texts always yields different base64 texts ... true or false?

2018-10-15 Thread Steffen Nurpmeso via Unicode
Philippe Verdy via Unicode wrote: |Padding itself does not clearly indicate the length. | |It's an artefact that **may** be inferred only in some other layers \ |of protocols which specify when and how padding is needed (and how \ |many padding bytes |are required or accepted), it works

Re: Base64 encoding applied to different unicode texts always yields different base64 texts ... true or false?

2018-10-15 Thread Steffen Nurpmeso via Unicode
Doug Ewell via Unicode wrote in <2A67B4F082F74F8AADF34BA11D885554@DougEwell>: |Steffen Nurpmeso wrote: |> Base64 is defined in RFC 2045 (Multipurpose Internet Mail Extensions |> (MIME) Part One: Format of Internet Message Bodies). | |Base64 is defined in RFC 4648, "T

Re: Base64 encoding applied to different unicode texts always yields different base64 texts ... true or false?

2018-10-13 Thread Steffen Nurpmeso via Unicode
Philippe Verdy via Unicode wrote: |You forget that Base64 (as used in MIME) does not follow these rules \ |as it allows multiple different encodings for the same source binary. \ |MIME actually |splits a binary object into multiple fragments at random positions, \ |and then encodes these
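
The point quoted above, that one binary payload can have more than one Base64 text, can be illustrated with a minimal Python sketch. It covers only the line-break variation of MIME-style output, not arbitrary fragmenting; the payload and the variable names are illustrative assumptions, not taken from the thread.

```python
import base64

# Illustrative payload (an assumption): 100 arbitrary bytes.
payload = bytes(range(100))

# Single-line Base64 text, no line breaks.
one_line = base64.b64encode(payload)

# MIME-style Base64 text: a newline after at most 76 output characters.
mime_style = base64.encodebytes(payload)

# The two encoded texts differ ...
texts_differ = one_line != mime_style

# ... yet both decode to the same payload, because b64decode (in its
# default, non-validating mode) discards characters outside the alphabet,
# such as newlines.
same_payload = (
    base64.b64decode(one_line) == payload
    and base64.b64decode(mime_style) == payload
)
```

Under these assumptions both `texts_differ` and `same_payload` are true: the encoded texts are different while the decoded bytes are identical.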

Re: Tales from the Archives

2018-08-20 Thread Steffen Nurpmeso via Unicode
Terrible! Ken Whistler wrote in <12e6ad91-89e4-ec87-85ad-8fc4ab3f6...@att.net>: |Steffen, | |Are you looking for the Unicode list email archives? | |https://www.unicode.org/mail-arch/ | |Those contain list content going back all the way to 1994. Dear Ken Whistler, no, and yes, having an

Re: Tales from the Archives

2018-08-20 Thread Steffen Nurpmeso via Unicode
James Kass via Unicode wrote: ... |Eighteen years pass, display issues have mostly gone away, nearly |everything works "out-of-the-box", and list traffic has dropped |dramatically. Today's questions are usually either highly technical |or emoji-related. | |Recent threads related to

Re: Split a UTF-8 multi-octet sequence such that it cannot be unambiguously restored?

2017-07-24 Thread Steffen Nurpmeso via Unicode
"Costello, Roger L. via Unicode" wrote: |Suppose an application splits a UTF-8 multi-octet sequence. The application \ |then sends the split sequence to a client. The client must restore \ |the original sequence. | |Question: is it possible to split a UTF-8 multi-octet
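
As a hedged illustration of the scenario in the question, the following Python sketch splits a UTF-8 byte string at every possible offset and restores the text with an incremental decoder. It assumes the fragments arrive complete, unmodified and in order; the helper name `decode_chunks` and the sample text are hypothetical, and the sketch does not claim to settle the question raised in the thread.

```python
import codecs


def decode_chunks(chunks):
    """Decode UTF-8 byte fragments that arrive in order (hypothetical helper)."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    # The incremental decoder buffers an incomplete multi-octet sequence
    # until the remaining octets arrive in a later fragment.
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True raises UnicodeDecodeError if a sequence is still incomplete.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


# Sample text (an assumption) with 1-, 2-, 3- and 4-octet sequences.
text = "a\u00e9\u20ac\U0001F600"
data = text.encode("utf-8")

# Split at every byte offset, including inside multi-octet sequences.
restored_everywhere = all(
    decode_chunks([data[:i], data[i:]]) == text
    for i in range(len(data) + 1)
)
```

With in-order, unmodified fragments, `restored_everywhere` is true for this sample, whichever offset the split falls on.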

Re: "A Programmer's Introduction to Unicode"

2017-03-15 Thread Steffen Nurpmeso
"Doug Ewell" wrote: |Philippe Verdy wrote: |>>> Well, you do have eleven bits for flags per codepoint, for example. |>> |>> That's not UCS-4; that's a custom encoding. |>> |>> (any UCS-4 code unit) & 0xFFE0 == 0 | |(changing to "UTF-32" per Ken's observation) | |>
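
The "eleven bits" remark above can be checked with a small Python sketch. It assumes a 32-bit code unit and a mask covering its upper eleven bits, written here as `0xFFE00000`; that constant and the names below are assumptions made for illustration (the preview above shows a shorter constant), not a quotation from the thread.

```python
# Highest Unicode code point; it fits in 21 bits.
MAX_CODE_POINT = 0x10FFFF

# Assumed mask: the upper eleven bits (bits 21..31) of a 32-bit code unit.
UPPER_BITS_MASK = 0xFFE00000


def upper_bits_clear(code_unit):
    """Return True if the upper eleven bits of a 32-bit unit are zero."""
    # Parentheses make the intended grouping explicit: mask first, then compare.
    return (code_unit & UPPER_BITS_MASK) == 0


bits_needed = MAX_CODE_POINT.bit_length()  # 21
spare_bits = 32 - bits_needed              # 11
```

Any value in the code point range 0 to 0x10FFFF passes `upper_bits_clear`, whereas a unit that stores flags in those upper bits does not, which is the distinction drawn in the quoted exchange.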

Re: "A Programmer's Introduction to Unicode"

2017-03-14 Thread Steffen Nurpmeso
Alastair Houghton wrote: |On 13 Mar 2017, at 21:10, Khaled Hosny wrote: |> On Mon, Mar 13, 2017 at 07:18:00PM +, Alastair Houghton wrote: |>> On 13 Mar 2017, at 17:55, J Decker wrote: |>>> |>>> I liked the Go

Re: I'm excited about the proposal to add a brontosaurus emoji codepoint

2016-08-29 Thread Steffen Nurpmeso
Leonardo Boiko wrote: |We obviously need an emoji for every species name listed within The \ |Official Registry of Zoological Nomenclature. Ride it out. Ride it out. Oh, it shouldn't take that much longer if we all go for it. --steffen

Re: Encoding the Mayan Script:

2016-06-04 Thread Steffen Nurpmeso
|http://blog.unicode.org/2016/06/encoding-mayan-script-your-adopt.html | |This is great news. Congratulations to both UTC and the sponsors for |helping to fund this worthwhile encoding effort. I concur with all my heart! Are uschê ocher zîch warâl K'itschê' ub'î'. Good luck! Are

Re: Surrogates and noncharacters

2015-05-12 Thread Steffen Nurpmeso
Hans Aberg haber...@telia.com wrote: | On 12 May 2015, at 16:50, Philippe Verdy verd...@wanadoo.fr wrote: | Indeed, that is why UTF-8 was invented for use in Unix-like environments. | | Not the main reason: communication protocols, and data storage \ | is also based on 8-bit code units (even

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-10 Thread Steffen Nurpmeso
Philippe Verdy verd...@wanadoo.fr wrote: |glibc is not more borken than any other C library implementing toupper and |tolower from the legacy ctype standard library. These are old APIs that |are just widely used and still have valid contexts where they are simple and |safe to use. But they are

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-10 Thread Steffen Nurpmeso
Philippe Verdy verd...@wanadoo.fr wrote: |Successors to convert strings instead of just isolated characters (sorry, |they are NOT what we need to handle texts, they are not even equivalent |to Unicode characters, they are just code units, most often 8-bit with |char or 16-bit only with wchar_t

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-10 Thread Steffen Nurpmeso
Philippe Verdy verd...@wanadoo.fr wrote: |The standard C++ string package could have then used this standard |internally in the methods exposed in its API. I cannot understand this |simple effort was never done on such basic functionality needed and used in |almost all softwares and OSes.

Re: Off-topic: Tate Britain After Dark

2014-08-13 Thread Steffen Nurpmeso
William_J_G Overington wjgo_10...@btinternet.com wrote: |http://afterdark.tate.org.uk Let's just hope those won't cause any damages! Heaven only knows who wrote their control software... And how beautiful that old stuff looks when enlightened by headlights! --steffen

Re: Apparent discrepanccy between FAQ and Age.txt

2014-06-10 Thread Steffen Nurpmeso
Hello, Karl Williamson pub...@khwilliamson.com wrote: |The FAQ http://www.unicode.org/faq/private_use.html#sentinels |says that the last 2 code points on the planes except BMP were made |noncharacters in TUS 3.1. DerivedAge.txt gives 2.0 for these. The (nothing but informational except for

Re: Corner cases (was: Re: UTF-16 Encoding Scheme and U+FFFE)

2014-06-06 Thread Steffen Nurpmeso
Doug Ewell d...@ewellic.org wrote: |Philippe Verdy verdy underscore p at wanadoo dot fr wrote: | Not necessarily true. | | [602 words] | |This has nothing to do with the scenario I described, which involved |removing a BOM from the start of an arbitrary fragment of data, |thereby

Re: Guillements in Email

2014-05-02 Thread Steffen Nurpmeso
Sorry for not replying in the thread, and jumping in general, i'm currently «jolly well fed up» of dealing with mail, but... Philippe Verdy wrote: |I do not criticize the fact of using quoted-printable; but the |fact that of NOT using it to preserve characters; based on an |arbitrary

Re: ID_Start, ID_Continue, and stability extensions

2014-04-28 Thread Steffen Nurpmeso
Markus Scherer markus@gmail.com wrote: |On Fri, Apr 25, 2014 at 6:05 AM, Steffen Nurpmeso sdao...@yandex.com wrote: |So imho it's a bit like «Kraut und Rüben» («higgledy-piggledy» | sayy http://www.dict.cc/?s=Kraut+und+R%C3%BCben). | |Ich weiß was das bedeutet :-) hmmm, possibly a bit

Re: ID_Start, ID_Continue, and stability extensions

2014-04-25 Thread Steffen Nurpmeso
Hello, Markus Scherer markus@gmail.com wrote: |On Thu, Apr 24, 2014 at 12:56 PM, Steffen Nurpmeso sdaode\ |n...@yandex.com wrote: | Markus Scherer markus@gmail.com wrote: ||I strongly recommend you parse the derived properties rather than trying | to ||follow the derivation formula

Re: ID_Start, ID_Continue, and stability extensions

2014-04-24 Thread Steffen Nurpmeso
Markus Scherer markus@gmail.com wrote: |I strongly recommend you parse the derived properties rather than trying to |follow the derivation formula, because that can change over time. ..this file includes only those core properties that have themselves a derivation-may-change property? (I

Re: Names for control characters

2014-03-13 Thread Steffen Nurpmeso
|So then the wizard Unicode and the warlock 10646 started casting |their spells together. Fantastic reading. |Shazaamaazama! Pockety spoketi! Keeeraack! History is made by winners. --steffen