On Sun, 12 Mar 2000 10:49:25 +0100 (CET), [EMAIL PROTECTED] (Richard Menedetter) wrote:
> Hi
> "Samuel W. Heywood" <[EMAIL PROTECTED]> wrote:
> SH> http://utopia.knoware.nl/users/schluter/doc/tags/characters.html
> Look what is in the first half of the page:
> 'Table of printable Latin-1 Character codes'
The ALT + NUM characters of course.
> So your 'HTML Entity' is latin-1 (aka iso-8859-1)
I have discovered by testing that the editor within Arachne uses Latin-1,
ISO-8859-1 ALT + NUMs that correspond with those in the referenced table.
The editor will recognize the same ALT + NUMs regardless of how I set up
Arachne. These ALT + NUMs are not the same as those used in popular DOS
text editors such as PEDIT and QEDIT and EDIT.COM.
If I set up my code page for ISO-8859-1, would the popular text editors
perform differently? I don't know. I have not yet conducted the
experiment.
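A minimal sketch of what the ALT + NUM difference amounts to, assuming Python's built-in cp437, cp850 and latin-1 codecs match the tables discussed here. The helper name glyph is made up for illustration. It shows that one byte value names different characters under different code pages:

```python
import unicodedata

def glyph(value: int, codec: str) -> str:
    """Return the character that a single byte value stands for under a codec."""
    return bytes([value]).decode(codec)

# ALT + 65 is plain ASCII and agrees everywhere; ALT + 225 does not.
for value in (65, 225):
    for codec in ("cp437", "cp850", "latin-1"):
        ch = glyph(value, codec)
        # Print the Unicode name instead of the glyph so the output is pure ASCII.
        print(f"ALT+{value:03d} under {codec:8}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Printing the character names rather than the characters themselves keeps the output readable on a console that is itself bound to one code page.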
> Step by step we get closer to the solution :)
> SH> By the American ISO I mean the DOS system default.
> OK this is codepage 437.
> ISO and DOS have _nothing_ in common.
You said that if I set up my code page within DOS for a different ISO
then I would see the characters differently from the DOS console. In
other words, if I were to enter the command "type myfile.txt", the
appearance of the non-English characters would vary according to the code
page used.
> SH> It doesn't matter what I call it as long as you know what I'm talking
> SH> about.
> Yes :) but I really wasn't sure ... I was just guessing.
> (but it seems that I guessed right)
OK
>>> First of all there is no american ISO ...
>>> ISO is the _INTERNATIONAL_ Organization for Standardization
>>> HTML Entities are such things like &auml; -> parsed as ä
> SH> OK, OK. Now I understand what you are talking about. I am talking
> SH> about the ALT + NUMs. I don't know the technical term for the ALT +
> SH> NUMs.
> There is no ... it depends on the codepage you use ...
> eg alt 065 is always A, but alt num 123 depends on the codepage.
> So you simply tell us what codepage you use.
> If you haven't changed that then it is the default codepage 437.
I haven't changed that yet. I have the idea that changing my code page
will eat up a lot of memory.
>>> If you write the mail in a correctly configured insight, then it
>>> should be readable.
> SH> I have found that as long as I use the ALT + NUMs as provided in the
> SH> referenced table, the characters will be viewed correctly within
> SH> Insight regardless of what character set I specify in my Arachne
> SH> setup.
> IMHO the default fonts are latin-1 fonts.
> The character set specification is simply makeup .. IMHO.
> But I don't use insight, so I'm not sure.
> SH> If, however, I should use the ALT + NUMs provided in the "standard"
> SH> ascii table, to include the "standard extended ascii table", then the
> SH> characters will not be readable within Insight.
> you speak about cp437.
> Sure ... because cp437 is not equal to iso latin-1.
> On the internet the iso latin family is widely used, so it is a very good
> idea to use this !!
>>> If you install the stuff that I zipped up, you will end up with a
>>> iso latin1 in DOS. So if you can read it in DOS, and send it away
>>> with the correct charset header it should be readable to anybody
>>> with an average mail reader (or a good one :)
> SH> But this would seem to work out only if my correspondent had his code
> SH> page set up the same as mine.
> No ... there are 2 possibilities.
> On other OSes you simply use a font, that displays the characters
> correctly. (you look into the header, and use the font specified there.)
> SH> I don't want to have to change my DOS setup and my code pages and my
> SH> keyboard maps every time I alternate between reading or writing in one
> SH> language and another.
> Use another OS ...
> Sorry this is the only advice I can give you, when you frequently use
> different languages, which are not fully covered by iso latin1. (see the
> url that you have sent in the first part of the message, for which
> characters are covered)
> If you only need latin1, you can simply use the default cp437, switch to
> latin1 in a batchfile, load your mailreader, switch to cp437 - ready
> SH> I don't know how you and the other folks on this list who do your
> SH> emails in various languages can deal with the problem.
> simply use a mailreader that can either
> 1) use different fonts in different languages
> 2) use translation tables.
> to 2 eg there is � in cp437, but on another position.
> So my mailreader has translation tables, and corrects this.
> SH> If I use the ALT + NUMs provided in the table referenced by the URL,
> SH> then the message will be readable within Insight,
> and all other sane and good mailreaders
> SH> but it would not be readable with Net-Tamer, which is also considered
> SH> a very good mail reader.
> If it can't display iso latin-1 (de facto standard) then I consider it not
> to be a good mail reader.
Maybe it can display ISO Latin 1. I don't know. It does display extended
ascii characters. I don't know how it would work for a different code page.
I haven't tested this.
> It seems that it simply doesn't care about codepages.
> and uses the codepage provided by DOS.
> SH> In order to read the message correctly with Net-Tamer I would
> SH> have to use the extended ascii characters found in the "standard"
> SH> table rather than the table provided by the URL.
> Even that is not true.
> If I would use Nettamer they would not be displayed correctly, as I don't
> use cp437, but cp850. (dos latin1)
> SH> The "standard" table is the one provided by popular text editors such
> SH> as PEDIT and QEDIT and found in US textbooks.
> the editors present you with a table that shows YOUR _current_ codepage.
That is indeed very interesting. These text editors must have required
some highly ingenious programming.
> SH> Whether that table should be referred to as the "standard" one is
> SH> another question altogether.
> It is codepage 437
> SH> Conclusion: The table provided by the URL works for Insight, but not
> SH> for Net-Tamer. The "standard" table works for Net-Tamer, but not for
> SH> Insight.
> Because nettamer simply ignores and does not care about that topic.
> Nettamer seems to be written by an american for americans ...
Net-Tamer may be used for sending and receiving messages in most European
languages provided that both the receiver and sender use the same code
page.
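When the two sides do not share a code page, a reader can still convert. Here is a rough sketch, assuming Python's cp437 and latin-1 codecs stand in for the translation tables a mail reader would carry. The function name cp437_to_latin1 is hypothetical:

```python
def cp437_to_latin1(data: bytes) -> bytes:
    """Re-encode cp437 bytes as ISO-8859-1 by going through Unicode."""
    text = data.decode("cp437")
    # cp437-only characters (box drawing, for example) have no Latin-1 slot,
    # so errors="replace" substitutes "?" for them instead of failing.
    return text.encode("latin-1", errors="replace")

# a-umlaut, o-umlaut, u-umlaut as typed under cp437 (ALT + 132, 148, 129)
sent = bytes([0x84, 0x94, 0x81])
converted = cp437_to_latin1(sent)
print(list(converted))
```

The letters survive the trip with new byte values; only characters that Latin-1 lacks are lost.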
> If you write your mail in iso latin-1 then it is displayed correctly in 95%
> of the cases.
> (all windows mailreaders, all linux mailreaders, all mac mailreaders)
> Only very old DOS programs have problems with it.
OK, this is very good to know. How much memory do you think I would lose
by changing my code page to Latin-1?
> SH> It would be nice to be able to use a single table to be universally
> SH> compatible with all good email readers.
> There _IS_ one ...
>>> You can't use ascii values for 'special' characters, because there
>>> ARE NO ascii values for these characters !!!
> SH> I meant of course "extended ascii values" as given in the "standard"
> SH> table.
> There is no standard table.
> There is a default characterset for DOS which is codepage 437.
>>> SH> I am merely suggesting that everybody should adopt one universal
>>> SH> numbering scheme for all the characters used in all of the
>>> SH> world's languages.
>>> There IS such a scheme ... as I and many others have already written
>>> here .... goto http://www.unicode.org
> SH> Yes, I've looked into that. It sure would be good if they could
> SH> develop Unicode for DOS. Some list members think it is possible.
> I don't belong to this group.
> IMHO DOS does not have the capabilities to do so.
> Especially DOS programs can use the display directly, and so they wouldn't
> care about unicode, and mess up everything. (as Michael P. already said)
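For what the universal numbering looks like in practice, a small sketch assuming Python's standard codecs: the character has one Unicode code point, and each code page or encoding merely stores it as different bytes. The variable names are illustrative only:

```python
ch = "\u00df"          # LATIN SMALL LETTER SHARP S
code_point = ord(ch)   # the single, code-page-independent number

# The same character written out under several encodings.
encodings = {name: ch.encode(name) for name in ("cp437", "cp850", "latin-1", "utf-8")}

print(f"U+{code_point:04X}")
for name, raw in encodings.items():
    print(f"{name:8} {list(raw)}")
```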
>>> SH> Please examine the characters below from within Insight and also
>>> SH> from within the DOS console. I think we all can agree that this
>>> SH> is a big problem.
>>> Not really ... this is like I put my tape into the cd-player and I
>>> couldn't hear anything.
> SH> Bad analogy. Tape players and cd players are different types of
> SH> machines, data is recorded on different types of media, the tape
> SH> player uses analog technology and the CD player digital technology.
> This about covers the difference between a modern OS and DOS :)))))
> (sorry ... I like DOS, but ... :)
>>> SH> BTW, I would be curious to know how this message looks to
>>> SH> someone who is reading it with Outlook or Eudora, or any other
>>> SH> popular Winblows program.
>>> OK here the facts:
>>> You state in your header that you used iso latin1.
>>> So every sane Email program (including windows, nextstep, amiga ...)
>>> will display it using an iso latin1 font.
> SH> This is not the case. Some email readers such as Net-Tamer, Barebones
> SH> DOS, and NetMail DOS don't care what the header says. The message
> SH> would be displayed as though it were seen from the DOS console.
> But this is the problem of THESE PROGRAMS.
> If there is a sign that says don't smoke (and there are many explosives
> near), and you do smoke, and BOOOOM, that's your problem.
> Same applies to codepages.
> If you know what codepage you should use, but don't use it ... HMMM
> SH> You are saying that good email readers actually read the header and
> SH> display the message in accordance with what the header says.
> YES ... exactly
> SH> Arachne doesn't care what the header says.
> I never said that Insight is considered good by me ... :)
> (Arachne browser is GREAT !!)
> SH> If I should send the same message with the default ISO header instead
> SH> of with the Latin-1 header,
> ISO is an organization that 'produces' standards.
> One of them is the iso-8859-x family. Its most prominent member is latin1.
> (iso-8859-1)
> SH> the message would still be viewed the same from within Insight.
> Nevertheless Insight is much better than NT, because it displays iso-latin1
> and US-ascii correctly. (And this covers most communications on the net)
>>> So the 1. part is displayed correctly .... (in every sane mail
>>> program)
> SH> Part 1 is displayed correctly by Insight, but incorrectly by email
> SH> programs that display text the same as the DOS console.
> Sure ... because these programs ignore any charset line.
> This behaviour is NOT considered sane by me.
>>> and the 2. part (written with the standard DOS codepage 437, but
>>> wrongly assumed to be latin1) is displayed incorrectly.
> SH> The DOS console doesn't assume anything. It doesn't convert
> SH> anything. It only shows you what can be seen from its own vantage
> SH> point.
> Exactly
> SH> DOS applications can be written to read headers and convert
> SH> characters, but the DOS console doesn't see anything beyond its own
> SH> vantage point.
> yes
> SH> If you have a code page loaded, then you are probably running some
> SH> kind of application to change the way DOS displays text.
> SH> Am I right?
> hmm ... I don't understand, if you load a different CP, you don't need an
> additional application.
Then a code page is not a DOS application? The code page is not some kind
of TSR running in the background?
> My mailreader works so that you tell him what CP it should use for export,
> which codepage is the local one, and you provide some rules how to convert
> them.
> SH> In a "normal" machine, the keys you press, or the key combinations
> SH> you press will create their corresponding characters in the message.
> NO !!!
> The corresponding characters are NOT necessarily CODED equally.
> EG I press a, this generates code 065 in cp437, and nearly all other CPs.
> BUT it might be possible that in cp ricsi-1 (imaginary)
> it displays as � and a is 001.
> So you have to translate code 001 in ricsi-1 to code 065 if viewed with
> cp437 !!!
Then the code page is an application running in the background and it
translates and decodes? It must take a lot of memory.
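As far as the translation step itself goes, it can be pictured as a plain 256-entry lookup table rather than a running program. A sketch under that assumption, using the imaginary "ricsi-1" page from the quoted example, where code 001 means what cp437 stores as 065. The names RICSI1_TO_CP437 and ricsi1_to_cp437 are made up, and every other code is assumed to be identical in both pages:

```python
# bytes.maketrans builds a 256-byte table: position = incoming code,
# value = outgoing code. Only code 1 is remapped here (hypothetical page).
RICSI1_TO_CP437 = bytes.maketrans(bytes([1]), bytes([65]))

def ricsi1_to_cp437(data: bytes) -> bytes:
    """Translate bytes from the imaginary ricsi-1 page into cp437."""
    return data.translate(RICSI1_TO_CP437)

print(len(RICSI1_TO_CP437))                 # size of the whole table
print(ricsi1_to_cp437(bytes([1, 66, 67])))  # code 1 becomes 65
```

A table of this size costs 256 bytes per direction, which is separate from whatever memory DOS needs to load display fonts for a code page.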
> SH> These characters should correspond to the way the keys are labeled or
> SH> to a universal standard set of numeric designators to which they are
> SH> assigned.
> There aren't even keys for all characters ...
> where is the ß key ??
For my code page, cp 437, the ß is created by entering ALT + NUM 225. There
is no special key for it. If the ß is a normal alphabet character in some
language group, then the people who speak the language could easily hardwire
a special keyboard that would have a ß key. It is entirely up to them if
they really want to have a � key. Everybody has a right to speak and write
and type in his own language.
> SH> Characters sent should be the same as characters received
> SH> and displayed.
> You don't send charsets.
> You send codes.
> And these codes are interpreted by the codepage of the receiving party.
I understand that I am sending strings of binary numbers. As long as they
occur in the same pattern that is seen in a language system, then they
are not "codes" in the strictest sense. You could even understand a Martian
who sends binary numbers provided he were actually attempting to communicate
with you. In such a case the strings of binary numbers would fit a
statistical pattern that could be easily analyzed, especially if the Martian
really wanted to be understood. You could easily display his binary numbers
on a computer screen without the need for a code page. In a matter of a few
hours the folks at NSA would have analyzed what the Martian has to say and
they could even send him a reply in his own language. No problem.
So why can't Earthlings learn to communicate as well as their hypothetical
extra-terrestrial counterparts?
> So it is possible that codes are misinterpreted.
> So you tell the receiver how you meant to interpret it.
> (eg charset iso-latin1)
> So you have provided enough information to display the characters
> correctly. If they don't do so, it's THEIR problem, not yours.
It is everybody's problem as long as we don't know what the standards are
and we don't know how to set up our machines so as to be compliant with
the standards.
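For reference, honoring the charset line could look roughly like the following. This is only a sketch, assuming a single-part message and Python's standard email package; decode_body and the sample message are invented for illustration and are not how Insight or Net-Tamer work:

```python
from email import message_from_bytes

def decode_body(raw_message: bytes, fallback: str = "us-ascii") -> str:
    """Decode a single-part message body using the charset named in its header."""
    msg = message_from_bytes(raw_message)
    charset = msg.get_content_charset() or fallback
    payload = msg.get_payload(decode=True)  # raw body bytes for a non-multipart message
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name: fall back instead of refusing to display.
        return payload.decode(fallback, errors="replace")

raw = (b"Content-Type: text/plain; charset=iso-8859-1\r\n"
       b"Content-Transfer-Encoding: 8bit\r\n"
       b"\r\n"
       b"Gr\xfc\xdfe\r\n")
body = decode_body(raw)
print(ascii(body.strip()))
```

A reader that skips this step and shows the bytes under the local code page is doing what the DOS console does.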
> SH> Why do we need code pages when no encryption or conversion is being
> SH> made?
> See above.
> Encryption has nothing to do with codepages !!
You shouldn't need codepages, even if you are trying to communicate with
an extra-terrestrial, especially if the extra-terrestrial is really trying
to communicate with you.
> SH> The exchanging of words ought to be made as easy as the exchanging of
> SH> music.
> Yes ... but you can't listen to a modern CD in a 20 year old tape deck !
But you can record on a tape everything that is on the CD and then play
the tape and listen to it. The quality will be diminished, but not to
the extent that you cannot appreciate it.
> And DOS can be compared to a tapedeck ... it offers very few services
> of its own.
If your tapedeck goes bad it is easy to discover what is wrong with it and
then you can fix it. Or you can buy a new tapedeck cheap. The same applies
to a DOS machine.
Later,
Sam Heywood
-- This mail was written by user of Arachne, the Ultimate Internet Client