RE: Pupil's question about Burmese

2010-11-10 Thread Shawn Steele
FWIW: The OS really likes Unicode, so lots of the text input, etc, are really Unicode. ANSI apps (including non-Unicode web pages), get the data back from those controls in ANSI, so you can lose data that it looked like you entered. As mentioned, the solution is to fix the app to use
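
A minimal sketch of the data loss described above, assuming Python and using the Windows-1252 codec as a stand-in for an "ANSI" code page. The helper name `ansi_round_trip` and the sample string are illustrative and do not come from the thread.

```python
# Hypothetical helper: simulate a Unicode control handing text to an
# "ANSI" application that only understands one legacy code page.
def ansi_round_trip(text: str, codepage: str = "cp1252") -> str:
    # Characters the code page cannot represent are replaced with "?",
    # which is roughly what a lossy conversion does.
    return text.encode(codepage, errors="replace").decode(codepage)


# Sample Burmese text, written with escapes so the source stays ASCII.
burmese = "\u1019\u103c\u1014\u103a\u1019\u102c"

lossy = ansi_round_trip(burmese)
print(lossy == burmese)  # the original text does not survive the round trip
```

Text that fits the code page, such as plain ASCII, passes through unchanged; the Burmese sample comes back as one question mark per code point.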

Re: Pupil's question about Burmese

2010-11-10 Thread Keith Stribley
On 11/10/2010 02:17 PM, Shawn Steele wrote: As mentioned, the solution is to fix the app to use Unicode. Especially for a language like this. In these cases, machines will be fairly inconsistent even if they did support some code page, but Unicode works most everywhere. Afaik there never

Are Latin and Cyrillic essentially the same script?

2010-11-10 Thread Karl Pentzlin
As shown in N3916: http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3916.pdf = L2/10-356, there exists a Latin letter which resembles the Cyrillic soft sign Ь/ь (U+042C/U+044C). This letter is part of the Jaꞑalif variant of the alphabet, which was used for several languages in the former Soviet Union (e.g.

Re: Are Latin and Cyrillic essentially the same script?

2010-11-10 Thread Karl Pentzlin
2010-11-10 10:08, I wrote: KP As shown in N3916 ... Please read "vowel" instead of "vocal" throughout the mail. Sorry.

Combining Triple Diacritics (N3915) not accepted by UTC #125

2010-11-10 Thread Karl Pentzlin
From the Pre-Preliminary minutes of UTC #125 (L2/10-416): C.4 Preliminary Proposal to enable the use of Combining Triple Diacritics in Plain Text (WG2 N3915) [Pentzlin, L2/10-353] - see http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3915.pdf [125-A13] ... UTC does not believe that either solution A

Re: Combining Triple Diacritics (N3915) not accepted by UTC #125

2010-11-10 Thread Khaled Hosny
On Wed, Nov 10, 2010 at 06:11:08PM +0100, Karl Pentzlin wrote: From the Pre-Preliminary minutes of UTC #125 (L2/10-416): C.4 Preliminary Proposal to enable the use of Combining Triple Diacritics in Plain Text (WG2 N3915) [Pentzlin, L2/10-353] - see

RE: Combining Triple Diacritics (N3915) not accepted by UTC #125

2010-11-10 Thread Murray Sargent
You can put diacritics over an arbitrarily large base by using an accent object in a math zone. For example, in my email editor (Outlook), I type alt+= to insert a math zone and then (a+b)\tilde<space><space> to get [image] (wide tilde over a+b). Evidently

Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Jim Monty
Here's a peculiar question. Is there a standard term to describe text that is in some subset CCS of another CCS but, strictly speaking, is only really in the subset CCS because it doesn't have any characters in it other than those represented in the smaller CCS? (The fact that I struggled to

Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Asmus Freytag
If you want to get that point across to a general audience, you could use a more colloquial term, albeit one that itself derives from mathematics. Text that can be completely expressed in ASCII fits into something (ASCII) that works as a lowest common denominator of a large number of

Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Mark Davis ☕
On Wed, Nov 10, 2010 at 12:38, Asmus Freytag asm...@ix.netcom.com wrote: If you want to get that point across to a general audience, you could use a more colloquial term, albeit one that itself derives from mathematics. Text that can be completely

RE: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Shawn Steele
Or did you mean "this is UTF-8 even though it only has characters that also look like ASCII"? I was a bit confused :) If you are communicating this information, then that's probably a good time to also communicate "Use Unicode, like UTF-8, and you won't have this kind of problem!" -Shawn

Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Markus Scherer
Specifically for ASCII, a common term is seven-bit ASCII. markus
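
As a rough illustration of what "seven-bit ASCII" means in practice, the following Python sketch checks that every byte is below 0x80. The function name `is_seven_bit_ascii` is made up for this example.

```python
# Hypothetical check: data is "seven-bit ASCII" if no byte has the high bit set.
def is_seven_bit_ascii(data: bytes) -> bool:
    return all(byte < 0x80 for byte in data)


sample = "plain text"

# Text limited to the ASCII repertoire yields identical bytes in UTF-8,
# ISO 8859-1 and Windows-1252, which is why it can be labelled as any of them.
encodings = ("ascii", "utf-8", "iso8859_1", "cp1252")
same_bytes = len({sample.encode(enc) for enc in encodings}) == 1

print(is_seven_bit_ascii(sample.encode("utf-8")), same_bytes)
```

Once a single non-ASCII character is present, the check fails and the encodings no longer agree on the bytes.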

Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Kenneth Whistler
Mark Davis wrote: What are also tricky are the 'almost' supersets, where there are only a few different characters. Those definitely cause problems because the difference in data is almost undetectable. For example, Mark is referring to cases such as ISO 8859-1 and 8859-15. Those share all
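
To make the "almost superset" problem concrete, here is a small Python sketch that lists the byte values which decode to different characters under two single-byte encodings. The helper `decode_differences` is hypothetical; the codec names are the ones Python ships.

```python
# Hypothetical helper: map each byte that decodes differently under the two
# encodings to the pair of characters it produces.
def decode_differences(enc_a: str, enc_b: str) -> dict:
    diffs = {}
    for value in range(256):
        raw = bytes([value])
        char_a = raw.decode(enc_a)
        char_b = raw.decode(enc_b)
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


diffs = decode_differences("iso8859_1", "iso8859_15")
for value, (latin1_char, latin9_char) in sorted(diffs.items()):
    print(f"0x{value:02X}: {latin1_char!r} vs {latin9_char!r}")
```

Only a handful of byte values differ, for example 0xA4 (currency sign versus euro sign), so mislabelled data usually looks correct until one of those bytes appears.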

Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Tim Greenwood
Even more interesting is Windows 1252 and ISO8859-15 where the former is a repertoire superset of the latter for the graphic characters, but not an encoding superset. On Wed, Nov 10, 2010 at 5:53 PM, Kenneth Whistler k...@sybase.com wrote: Mark Davis wrote: What are also tricky are the
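
The distinction between a repertoire superset and an encoding superset can be checked mechanically. The sketch below, in Python with made-up helper names, treats bytes 0x20-0x7E and 0xA0-0xFF as the graphic range of ISO 8859-15 and compares it against Windows-1252.

```python
# Byte values treated as graphic characters in ISO 8859-15 (an assumption of
# this sketch: C0/C1 controls and DEL are left out).
GRAPHIC_BYTES = list(range(0x20, 0x7F)) + list(range(0xA0, 0x100))


def repertoire_covered(subset: str, superset: str) -> bool:
    """True if every graphic character of `subset` can be encoded in `superset`."""
    for value in GRAPHIC_BYTES:
        char = bytes([value]).decode(subset)
        try:
            char.encode(superset)
        except UnicodeEncodeError:
            return False
    return True


def moved_bytes(subset: str, superset: str) -> list:
    """Byte values whose character gets a different byte (or none) in `superset`."""
    moved = []
    for value in GRAPHIC_BYTES:
        char = bytes([value]).decode(subset)
        if char.encode(superset, errors="replace") != bytes([value]):
            moved.append(value)
    return moved


print(repertoire_covered("iso8859_15", "cp1252"))
print([hex(v) for v in moved_bytes("iso8859_15", "cp1252")])
```

Every ISO 8859-15 graphic character exists in Windows-1252, but several of them sit at different byte values there, so the bytes cannot simply be relabelled.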

Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Jim Monty
I like "lowest common denominator" as a helpful term. It's familiar and means just the right thing, euphemistically. Thank you, Asmus. You grokked what I struggled to express. Jim Monty - Original Message From: Asmus Freytag asm...@ix.netcom.com To: Jim Monty jim.mo...@yahoo.com Cc:

Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Martin J. Dürst
On 2010/11/11 6:28, Mark Davis ☕ wrote: That is actually not the case. There are superset relations among some of the CJK character sets, and also -- practically speaking -- between some of the windows and ISO-8859 sets. I say practically speaking because in general environments, the C1

Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

2010-11-10 Thread Bjoern Hoehrmann
* Jim Monty wrote: Is there a standard term to describe text that is in some subset CCS of another CCS but, strictly speaking, is only really in the subset CCS because it doesn't have any characters in it other than those represented in the smaller CCS? (The fact that I struggled to phrase

Re: Combining Triple Diacritics (N3915) not accepted by UTC #125

2010-11-10 Thread Khaled Hosny
Or the other way around... On Thu, Nov 11, 2010 at 08:53:49AM +0200, Klaas Ruppel wrote: Typographic solutions (as established they ever may be) do not solve encoding matters. Best regards, __ Klaas Ruppel www.kotus.fi/?l=ens=1 Kotus