FWIW: The OS really likes Unicode, so a lot of the text input controls, etc., are
really Unicode. ANSI apps (including non-Unicode web pages) get the data back from
those controls in ANSI, so you can lose data that it looked like you entered.
As mentioned, the solution is to fix the app to use
On 11/10/2010 02:17 PM, Shawn Steele wrote:
As mentioned, the solution is to fix the app to use Unicode. Especially for
a language like this. In these cases, machines will be fairly inconsistent even if they
did support some code page, but Unicode works most everywhere.
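The data loss described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "ANSI" code page is taken to be Windows-1252 (cp1252), and the sample string is a hypothetical input, not something from the original report.

```python
# Hypothetical sample text; U+A791 (ꞑ) has no slot in Windows-1252.
original = "Jaꞑalif"

# Assumption: the application's "ANSI" code page is Windows-1252.
# errors="replace" mimics a lossy conversion that substitutes "?".
ansi_bytes = original.encode("cp1252", errors="replace")
round_tripped = ansi_bytes.decode("cp1252")

# The same text survives a round trip through a Unicode encoding.
utf8_round_tripped = original.encode("utf-8").decode("utf-8")

print(round_tripped)                   # the unsupported letter is gone
print(utf8_round_tripped == original)  # UTF-8 keeps every character
```

The text looks fine while it sits in the Unicode control; the damage only happens at the point where the application asks for it in the code page.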
Afaik there never
As shown in N3916: http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3916.pdf
= L2/10-356, there exists a Latin letter which resembles the Cyrillic
soft sign Ь/ь (U+042C/U+044C). This letter is part of the Jaꞑalif
variant of the alphabet, which was used for several languages in the
former Soviet Union (e.g.
2010-11-10 10:08, I wrote:
KP As shown in N3916 ...
Please read "vowel" instead of "vocal" throughout the mail. Sorry.
From the Pre-Preliminary minutes of UTC #125 (L2/10-416):
C.4 Preliminary Proposal to enable the use of Combining Triple
Diacritics in Plain Text (WG2 N3915) [Pentzlin, L2/10-353]
- see http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3915.pdf
[125-A13] ... UTC does not believe that either solution A
On Wed, Nov 10, 2010 at 06:11:08PM +0100, Karl Pentzlin wrote:
From the Pre-Preliminary minutes of UTC #125 (L2/10-416):
C.4 Preliminary Proposal to enable the use of Combining Triple
Diacritics in Plain Text (WG2 N3915) [Pentzlin, L2/10-353]
- see
You can put diacritics over an arbitrarily large base by using an accent object
in a math zone. For example, in my email editor (Outlook), I type alt+= to
insert a math zone and then (a+b)\tildespacespace to get
[cid:image001.png@01CB80BE.389DD340]
(wide tilde over a+b). Evidently
Here's a peculiar question.
Is there a standard term to describe text that is in some subset CCS of another
CCS but, strictly speaking, is only really in the subset CCS because it doesn't
have any characters in it other than those represented in the smaller CCS?
(The fact that I struggled to
If you want to get that point across to a general audience, you could
use a more colloquial term, albeit one that itself derives from mathematics.
Text that can be completely expressed in ASCII fits into something
(ASCII) that works as a lowest common denominator of a large number of
Mark
*— Il meglio è l’inimico del bene —*
On Wed, Nov 10, 2010 at 12:38, Asmus Freytag asm...@ix.netcom.com wrote:
If you want to get that point across to a general audience, you could use a
more colloquial term, albeit one that itself derives from mathematics.
Text that can be completely
Or did you mean "this is UTF-8 even though it only has characters that also
look like ASCII"? I was a bit confused :)
If you are communicating this information, then that's probably also a good
time to communicate "Use Unicode, like UTF-8, and you won't have this kind
of problem!"
-Shawn
Specifically for ASCII, a common term is seven-bit ASCII.
markus
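As a rough illustration of what "seven-bit ASCII" text means in practice, the sketch below checks that every code point is below 0x80 and shows that such text has identical bytes in several ASCII-compatible encodings. The function name and the sample string are made up for this example.

```python
def is_seven_bit_ascii(text: str) -> bool:
    """Return True if every character fits in seven bits (U+0000..U+007F)."""
    return all(ord(ch) < 0x80 for ch in text)


# Hypothetical sample; any pure-ASCII string behaves the same way.
sample = "plain text"

# For seven-bit text, these encodings all produce the same bytes, so the
# data is "in" each of them at once.
byte_forms = {
    sample.encode(enc) for enc in ("ascii", "utf-8", "iso8859_1", "cp1252")
}

print(is_seven_bit_ascii(sample))   # all code points are below 0x80
print(len(byte_forms))              # number of distinct byte sequences
print(is_seven_bit_ascii("naïve"))  # "ï" is outside seven-bit ASCII
```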
Mark Davis wrote:
What are also tricky are the 'almost' supersets, where there are only a few
different characters. Those definitely cause problems because the difference
in data is almost undetectable.
For example, Mark is referring to cases such as ISO 8859-1 and 8859-15.
Those share all
Even more interesting are Windows-1252 and ISO 8859-15, where the former is a
repertoire superset of the latter for the graphic characters, but not an
encoding superset.
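The distinction can be checked with Python's built-in codecs. This is a sketch, assuming that "graphic characters" means bytes 0xA0 through 0xFF of ISO 8859-15 and that Python's `cp1252` and `iso8859_15` codecs match the tables being discussed; the helper name is made up.

```python
def can_encode(ch: str, encoding: str) -> bool:
    """Return True if the character has a byte in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Graphic characters of ISO 8859-15 in the upper half (0xA0..0xFF).
graphic = [bytes([b]).decode("iso8859_15") for b in range(0xA0, 0x100)]

# Repertoire superset: every one of them exists somewhere in Windows-1252.
repertoire_superset = all(can_encode(ch, "cp1252") for ch in graphic)

# Not an encoding superset: some characters sit at different byte values.
moved = {
    ch: (ch.encode("iso8859_15"), ch.encode("cp1252"))
    for ch in graphic
    if ch.encode("iso8859_15") != ch.encode("cp1252")
}

print(repertoire_superset)
for ch, (iso_byte, win_byte) in moved.items():
    print(ch, iso_byte.hex(), win_byte.hex())
```

The characters that move are the ones ISO 8859-15 introduced over ISO 8859-1, such as the euro sign, which Windows-1252 places in the 0x80 to 0x9F range instead.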
On Wed, Nov 10, 2010 at 5:53 PM, Kenneth Whistler k...@sybase.com wrote:
Mark Davis wrote:
What are also tricky are the
I like lowest common denominator as a helpful term. It's familiar and means
just the right thing, euphemistically.
Thank you, Asmus. You grokked what I struggled to express.
Jim Monty
----- Original Message -----
From: Asmus Freytag asm...@ix.netcom.com
To: Jim Monty jim.mo...@yahoo.com
Cc:
On 2010/11/11 6:28, Mark Davis ☕ wrote:
That is actually not the case. There are superset relations among some of
the CJK character sets, and also -- practically speaking -- between some of
the windows and ISO-8859 sets. I say practically speaking because in general
environments, the C1
* Jim Monty wrote:
Is there a standard term to describe text that is in some subset CCS of another
CCS but, strictly speaking, is only really in the subset CCS because it doesn't
have any characters in it other than those represented in the smaller CCS?
(The fact that I struggled to phrase
Or the other way around...
On Thu, Nov 11, 2010 at 08:53:49AM +0200, Klaas Ruppel wrote:
Typographic solutions (however established they may be) do not solve encoding
matters.
Best regards,
__
Klaas Ruppel www.kotus.fi/?l=ens=1
Kotus