On Sat, 11 Mar 2000 21:05:54 +0100 (CET), Richard Menedetter wrote:

> Hi

> "Samuel W. Heywood" <[EMAIL PROTECTED]> wrote:

> SH> With the American ISO that I am using, HTML entity values are not the
> SH> same as ascii values.

> Sorry ... but I'm not able to understand this sentence ... :)

By HTML entity values I am speaking of the ALT + NUMs as referenced in the
table found at the URL:

http://utopia.knoware.nl/users/schluter/doc/tags/characters.html

By the American ISO I mean the DOS system default.  It doesn't matter what
I call it as long as you know what I'm talking about.

I interpret the meaning of the term "entity values" from context only.
Perhaps my interpretation is wrong, but at least you now know what I am
talking about because I have defined it as the ALT + NUMs in the table.
It doesn't matter if you disagree with the definition I have inferred.
As long as you understand what I have inferred, then you should be able to
understand me, even if my definitions are not agreed upon by the experts
in this matter.  It is not my purpose to argue over the definitions of the
terms.  I am only trying to explain what I'm talking about.

> First of all there is no american ISO ...
> ISO stands for _INTERNATIONAL_ Organization for Standardization
> HTML Entities are such things like &auml -> parsed as ä

OK, OK.  Now I understand what you are talking about.  I am talking about
the ALT + NUMs.  I don't know the technical term for the ALT + NUMs.

> SH> This means that if I were to send an email
> SH> message written in the Spanish language to a person who uses Arachne
> SH> Insight to read his mail, then the message would be perfectly
> SH> intelligible to him provided I use the HTML entity values.
> ???

Meaning the ALT + NUMs as indicated in the table referenced in the URL.

> If you write the mail in a correctly configured insight, then it should be
> readable.

I have found that as long as I use the ALT + NUMs as provided in the
referenced table, the characters will be viewed correctly within Insight
regardless of what character set I specify in my Arachne setup.

If, however, I should use the ALT + NUMs provided in the "standard" ascii
table, to include the "standard extended ascii table", then the characters
will not be readable within Insight.

> If you write it in plain DOS, with the standard codepage 437, it will only
> be readable by people who also use plain DOS with cp 437 ...
> This is NOT what you want.

You are right about that.

> If you install the stuff that I zipped up, you will end up with an iso
> latin1 in DOS.
> So if you can read it in DOS, and send it away with the correct charset
> header it should be readable to anybody with an average mail reader (or a
> good one :)

But this would seem to work out only if my correspondent had his code page
set up the same as mine.  I don't want to have to change my DOS setup and
my code pages and my keyboard maps every time I alternate between reading
or writing in one language and another.  I don't know how you and the other
folks on this list who do your emails in various languages can deal with the
problem.

If I use the ALT + NUMs provided in the table referenced by the URL, then the
message will be readable within Insight, but it would not be readable with
Net-Tamer, which is also considered a very good mail reader.  In order to read
the message correctly with Net-Tamer I would have to use the extended ascii
characters found in the "standard" table rather than the table provided by
the URL.  The "standard" table is the one provided by popular text editors
such as PEDIT and QEDIT and found in US textbooks.  Whether that table should
be referred to as the "standard" one is another question altogether.

Conclusion:  The table provided by the URL works for Insight, but not for
Net-Tamer.  The "standard" table works for Net-Tamer, but not for Insight.

It would be nice to be able to use a single table to be universally
compatible with all good email readers.
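
Short of a single table, one workaround is to convert between the two tables
rather than change the DOS setup.  The following is only a minimal sketch in
Python (a language not part of the DOS setup discussed here); the function
name cp437_to_latin1 is hypothetical, and it assumes the text was typed under
code page 437 and that the recipient expects ISO Latin-1:

```python
def cp437_to_latin1(raw: bytes) -> bytes:
    """Re-encode bytes typed under DOS code page 437 as ISO Latin-1 bytes."""
    # Interpret each byte using the DOS table to recover the characters.
    text = raw.decode("cp437")
    # Characters that Latin-1 lacks (box-drawing symbols, for example)
    # are replaced with "?" instead of raising an error.
    return text.encode("latin-1", errors="replace")


if __name__ == "__main__":
    dos_bytes = bytes([109, 97, 164, 97, 110, 97])  # "mañana" typed with ALT+164
    print(list(cp437_to_latin1(dos_bytes)))
```

The 164 in the input becomes 241 in the output; every other byte is plain
ascii and passes through unchanged.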

<snip>

> You can't use ascii values for 'special' characters, because there ARE NO
> ascii values for these characters !!!

I meant of course "extended ascii values" as given in the "standard" table.

> SH> If I were to set up my ISO within DOS to use Latin-1, would the ALT +
> SH> NUM characters as viewed from the DOS console be seen the same as
> SH> viewed from within Arachne Insight Mail?
> it should IMHO ...

> SH> In order to achieve universal compatibility we will have to
> SH> deconstruct this Tower of Babel.
> And if you are dead then you feel no pain .... sorry :)))
> but this is impossible ...

> SH> I am merely suggesting that everybody should adopt one universal
> SH> numbering scheme for all the characters used in all of the world's
> SH> languages.
> There IS such a scheme ... as I and many others have already written here
> .... goto http://www.unicode.org

Yes, I've looked into that.  It sure would be good if they could develop
Unicode for DOS.  Some list members think it is possible.
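
To illustrate what one universal numbering scheme looks like, here is a small
sketch in Python, which is not a DOS tool and is used here only for
illustration.  The helper name describe is hypothetical.  It prints the
Unicode number of each character along with its UTF-8 byte form, and it
includes one character (the Czech "č") that neither Latin-1 nor code page 437
contains:

```python
def describe(ch: str) -> tuple:
    """Return (Unicode label, decimal number, UTF-8 bytes as hex) for one character."""
    number = ord(ch)  # the character's Unicode code point
    return (f"U+{number:04X}", number, ch.encode("utf-8").hex())


if __name__ == "__main__":
    for ch in "áéíóúñ¿¡č":
        label, number, utf8_hex = describe(ch)
        print(f"{ch}  {label}  decimal {number:>3}  utf-8 bytes {utf8_hex}")
```

The first 256 Unicode numbers are the same as the ISO Latin-1 numbers, which
is why 225, 233, 241 and so on show up again here.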

> SH> Please examine the characters below from within Insight and also from
> SH> within the DOS console.  I think we all can agree that this is a big
> SH> problem.
> Not really ... this is like I put my tape into the cd-player and I couldn't
> hear anything.

Bad analogy.  Tape players and cd players are different types of machines,
data is recorded on different types of media, the tape player uses analog
technology and the CD player digital technology.

Your computer and my computer are similar machines.  I can send a MIDI
from my machine to your machine.  If we assume that we both have normal
hearing, you can play it on your machine and you will hear it the same way
I hear it, even if you identify the tune by a different name from the one I
know it by.  Assuming we both have normal vision, if I should send a string of
characters from my machine to your machine, you should be able to load those
characters on your display and see them the same as I do.  The physical
perceptions should be the same for me as for you, whether we are exchanging
literary works or works of music.  Only the meanings derived from our
perceptions could be expected to be different.  Our interpretations will
vary largely according to how we are schooled in the culture.

The problem with email programs is that the physical perception of the
text is not the same for the receiver as for the sender.

> SH> ALT + NUM, Spanish language characters
> SH> from table
> SH> http://utopia.knoware.nl/users/schluter/doc/tags/characters.html These
> SH> are rendered correctly in Arachne's Insight Mail, but are
> SH> incorrect when viewed at the DOS console:

> SH> 225  á     Corresponds to ascii 160 in DOS
> SH> 233  é     Corresponds to ascii 130 in DOS
> SH> 237  í     Corresponds to ascii 161 in DOS
> SH> 243  ó     Corresponds to ascii 162 in DOS
> SH> 250  ú     Corresponds to ascii 163 in DOS
> SH> 241  ñ     Corresponds to ascii 164 in DOS
> SH> 170  ª     Corresponds to ascii 166 in DOS
> SH> 186  º     Corresponds to ascii 167 in DOS
> SH> 191  ¿     Corresponds to ascii 168 in DOS
> SH> 161  ¡     Corresponds to ascii 173 in DOS
> OK ... again there is NO ASCII 160 !!!! (ASCII is 0-127)
> But these seem to be correct ISO Latin-1 codes.
> At least they display correctly here.
> 'a,'e, ....

> SH> ALT + NUM, Spanish language characters
> SH> from ascii chart, the version normally used in the US.
> SH> These are viewed correctly from the DOS console, but are incorrect
> SH> when viewed in Arachne's Insight Mail.

> SH> 160  �     Corresponds to HTML entity 225
> SH> 130  �     Corresponds to HTML entity 233
> SH> 161  �     Corresponds to HTML entity 237
> SH> 162  �     Corresponds to HTML entity 243
> SH> 163  �     Corresponds to HTML entity 250
> SH> 164       Corresponds to HTML entity 241
> SH> 166  |     Corresponds to HTML entity 170
> SH> 167       Corresponds to HTML entity 186
> SH> 168  "     Corresponds to HTML entity 191
> SH> 173  -     Corresponds to HTML entity 161
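
The correspondence between the two tables above can be regenerated rather
than copied by hand.  This is a minimal sketch in Python; the names SPANISH
and code_table are hypothetical, and it assumes that the table at the URL is
ISO Latin-1 and that the "standard" table is DOS code page 437:

```python
SPANISH = "áéíóúñªº¿¡"


def code_table(chars: str = SPANISH) -> list:
    """Return (character, Latin-1 number, cp437 number) for each character."""
    rows = []
    for ch in chars:
        latin1 = ch.encode("latin-1")[0]  # number in the ISO Latin-1 table
        cp437 = ch.encode("cp437")[0]     # number in the DOS code page 437 table
        rows.append((ch, latin1, cp437))
    return rows


if __name__ == "__main__":
    for ch, latin1, cp437 in code_table():
        print(f"{ch}  Latin-1 {latin1:>3}  cp437 {cp437:>3}")
```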

> Does HTML entity mean codepage 437 to you ???
> (The standard codepage for DOS)

No, I meant the ALT + NUM characters provided in the URL

> PS: These are displayed as garbage.
> Sure ... because your message header states that you used iso latin 1 ...
> but these lines were written with cp437 !!!!!!
> It can't be displayed correctly this way!

I had changed my Arachne setup for ISO Latin 1, but I did not change my DOS
setup.
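
The mismatch described here, text typed under one table and displayed under
another, can be reproduced in a few lines.  This is only a sketch in Python;
the function name as_seen_by and the sample sentence are made up for
illustration:

```python
def as_seen_by(reader_charset: str, typed: str, typing_charset: str) -> str:
    """Show how text typed under one table looks to a reader using another."""
    raw = typed.encode(typing_charset)  # the bytes that actually get sent
    return raw.decode(reader_charset)   # the reader's view of those same bytes


if __name__ == "__main__":
    sentence = "¿Dónde está el niño?"
    # Typed at a cp437 DOS console, displayed by a reader assuming Latin-1.
    print(as_seen_by("latin-1", sentence, "cp437"))
    # Typed as Latin-1, displayed at a cp437 DOS console.
    print(as_seen_by("cp437", sentence, "latin-1"))
```

The bytes never change in transit; only the table used to draw them differs.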

> SH> BTW, I would be curious to know how this message looks to someone who
> SH> is reading it with Outlook or Eudora, or any other popular Winblows
> SH> program.
> OK here the facts:
> You state in your header that you used iso latin1.
> So every sane Email program (including windows, nextstep, amiga ...)
> will display it using an iso latin1 font.

This is not the case.  Some email readers such as Net-Tamer, Barebones DOS,
and NetMail DOS don't care what the header says.  The message would be
displayed as though it were seen from the DOS console.

You are saying that good email readers actually read the header and display
the message in accordance with what the header says.  Arachne doesn't care
what the header says.  If I should send the same message with the default
ISO header instead of with the Latin-1 header, the message would still be
viewed the same from within Insight.  I know.  I've conducted the experiment.
Perhaps if I changed my DOS setup and my code pages and did all that stuff,
then Insight might behave differently.
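
The difference between a reader that obeys the header and one that ignores it
can be sketched as follows.  This uses Python's standard email module purely
as an illustration; it is not how Arachne, Net-Tamer, or any other program
mentioned here is actually implemented, and the message bytes and function
names are invented for the example:

```python
from email import message_from_bytes

# A made-up message whose header declares ISO Latin-1 for the body.
RAW = (
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"ma\xf1ana\r\n"
)


def body_honoring_header(raw: bytes, fallback: str = "us-ascii") -> str:
    """Decode the body with the charset named in the Content-Type header."""
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset() or fallback
    return msg.get_payload(decode=True).decode(charset, errors="replace")


def body_ignoring_header(raw: bytes, console_charset: str = "cp437") -> str:
    """Decode the body with the local console table, whatever the header says."""
    msg = message_from_bytes(raw)
    return msg.get_payload(decode=True).decode(console_charset, errors="replace")


if __name__ == "__main__":
    print(body_honoring_header(RAW).strip())
    print(body_ignoring_header(RAW).strip())
```

The first function behaves like the readers Richard describes; the second
behaves like a reader that shows the text as the DOS console would.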

> So the 1. part is displayed correctly .... (in every sane mail program)

Part 1 is displayed correctly by Insight, but incorrectly by email programs
that display text the same as the DOS console.

> and the 2. part (written with the standard DOS codepage 437, but wrongly
> assumed to be latin1) is displayed incorrectly.

The DOS console doesn't assume anything.  It doesn't convert anything.
It only shows you what can be seen from its own vantage point.  DOS
applications can be written to read headers and convert characters, but the
DOS console doesn't see anything beyond its own vantage point.  If you have
a code page loaded, then you are probably running some kind of application
to change the way DOS displays text.  Am I right?

> This is not the fault of any mail reader, but a misconfiguration made by
> you. (stating you use iso latin1 but you really used cp437 in that part of
> the mail)

I did not know at the time that I needed to change my code page.  I don't
even understand why code pages are needed to deal with any situation that
does not involve encryption.  The only type of machine in which you might
want to send a character that is different from the key you press would be
an encryption machine such as the Enigma.  In a "normal"
machine, the keys you press, or the key combinations you press will create
their corresponding characters in the message.  These characters should
correspond to the way the keys are labeled or to a universal standard set of
numeric designators to which they are assigned.  Characters sent should be
the same as characters received and displayed.  Why do we need code pages
when no encryption or conversion is being made?  The exchanging of words
ought to be made as easy as the exchanging of music.

Sam Heywood

-- This mail was written by user of Arachne, the Ultimate Internet Client
