Mark *— Il meglio è l’inimico del bene —*
On Wed, Nov 10, 2010 at 12:38, Asmus Freytag <[email protected]> wrote:

> If you want to get that point across to a general audience, you could use a
> more colloquial term, albeit one that itself derives from mathematics.
>
> Text that can be completely expressed in ASCII fits into something
> (ASCII) that works as a "lowest common denominator" of a large number of
> character sets.
>
> You could call it "lowest common denominator" text.
>
> Since ASCII is the only set that exhibits such a lowest common denominator
> relationship with enough other sets to make it interesting, and since that
> relation is so well known, it's usually enough to just refer to it by name
> (ASCII) without needing a general term - except perhaps for general
> audiences that aren't very familiar with it.

That is actually not the case. There are superset relations among some of
the CJK character sets, and also -- practically speaking -- between some of
the Windows and ISO-8859 sets. I say "practically speaking" because in
general environments the C1 controls are essentially unused, so where a
non-ISO-8859 set is the same as an ISO-8859 set except for 80..9F, you can
treat it pragmatically as a superset.

Also tricky are the "almost" supersets, where only a few characters differ.
Those definitely cause problems, because the difference in the data is
almost undetectable.

> In these kinds of discussions I find it invariably useful to mention that
> the copyright sign is not part of ASCII. (I suspect that it's the most
> common character that makes a text lose its "lowest common denominator"
> status).
>
> A./
>
> On 11/10/2010 11:41 AM, Jim Monty wrote:
>
>> Here's a peculiar question.
>>
>> Is there a standard term to describe text that is in some subset CCS of
>> another CCS but, strictly speaking, is only really in the subset CCS
>> because it doesn't have any characters in it other than those represented
>> in the smaller CCS?
>>
>> (The fact that I struggled to phrase this question in a way that made my
>> meaning clear -- and failed -- is precisely my dilemma.)
>>
>> Text that has in it only characters that are in the ASCII character
>> encoding is also in the ISO 8859-1 character encoding and the UTF-8
>> character encoding form of the Unicode coded character set, right? I
>> often need to talk and write about text that has such multiple
>> personalities, but I invariably struggle to make my point clearly and
>> succinctly. I wind up describing the notion of it in awkwardly verbose
>> detail.
>>
>> So I'm left wondering if the character encoding cognoscenti have a special
>> utilitarian word for this, maybe one borrowed from mathematics (set
>> theory).
>>
>> Jim Monty
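
For anyone who wants to check the relationships discussed in this thread, here is a minimal Python sketch. It is an illustration only, not something any of the posters wrote. The function names `encodes_identically` and `differing_bytes` are made up for the example, and it assumes Python's built-in `cp1252` and `iso-8859-1` codecs are acceptable stand-ins for the Windows and ISO-8859 sets mentioned above. The first function tests whether a string yields the same bytes in ASCII, ISO 8859-1, and UTF-8. The second lists the byte values on which two 8-bit encodings disagree.

```python
def encodes_identically(text, encodings=("ascii", "iso-8859-1", "utf-8")):
    """Return True if text yields the same bytes in every listed encoding."""
    results = set()
    for name in encodings:
        try:
            results.add(text.encode(name))
        except UnicodeEncodeError:
            # The text contains a character this encoding cannot represent.
            return False
    return len(results) == 1


def differing_bytes(enc_a, enc_b):
    """List the single-byte values that two 8-bit encodings decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" turns undefined bytes into U+FFFD instead of raising.
        if raw.decode(enc_a, errors="replace") != raw.decode(enc_b, errors="replace"):
            diffs.append(value)
    return diffs


print(encodes_identically("Hello, world"))   # True: pure ASCII text
print(encodes_identically("\u00a9 2010"))    # False: the copyright sign is not in ASCII

diffs = differing_bytes("cp1252", "iso-8859-1")
print(all(0x80 <= b <= 0x9F for b in diffs))  # True: they differ only in 80..9F
```

With these particular codecs, the two 8-bit sets disagree only in the 80..9F range, which is the "pragmatic superset" situation described above. The same `differing_bytes` check could be pointed at other pairs of single-byte codecs to look for "almost" supersets.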

