Hi

"Samuel W. Heywood" <[EMAIL PROTECTED]> wrote:

 >> SH> With the American ISO that I am using, HTML entity values are
 >> SH> not the same as ascii values.
 >> Sorry ... but I'm not able to understand this sentence ... :)

 SH> By HTML entity values I am speaking of the ALT + NUMs as referenced in
 SH> the table found at the URL:
 SH> http://utopia.knoware.nl/users/schluter/doc/tags/characters.html

Look what is in the first half of the page:
'Table of printable Latin-1 Character codes'

So your 'HTML Entity' is latin-1 (aka iso-8859-1)

Step by step we get closer to the solution :)

 SH> By the American ISO I mean the DOS system default.
OK this is codepage 437.
ISO and DOS have _nothing_ in common.

 SH> It doesn't matter what I call it as long as you know what I'm talking
 SH> about.
Yes :) but I really wasn't sure ... I was just guessing.
(but it seems that I guessed right)

 >> First of all there is no american ISO ...
 >> ISO stands for _INTERNATIONAL_ Standards Organization
 >> HTML Entities are such things like &auml; -> parsed as ä

 SH> OK, OK.  Now I understand what you are talking about.  I am talking
 SH> about the ALT + NUMs.  I don't know the technical term for the ALT +
 SH> NUMs.
There is none ... it depends on the codepage you use ...
e.g. alt 065 is always A, but alt num 123 depends on the codepage.
So simply tell us which codepage you use.
If you haven't changed it, then it is the default codepage 437.
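
To make that concrete, here is a small sketch in Python (picked only for
illustration). It assumes that Python's built-in "cp437", "cp850" and
"latin-1" codecs match the DOS codepages and ISO 8859-1 discussed here; the
helper name show is made up.

```python
# Decode the same numeric code under several character sets.
# Assumption: Python's codecs "cp437", "cp850" and "latin-1" stand in for
# the DOS codepages and ISO 8859-1 discussed in this thread.

def show(code: int) -> dict:
    """Return the character that one byte value maps to in each charset."""
    return {cs: bytes([code]).decode(cs) for cs in ("cp437", "cp850", "latin-1")}

ascii_a = show(65)     # 65 is plain ASCII, so every charset agrees
high_132 = show(132)   # above 127: an umlaut in DOS, a control code in Latin-1
high_228 = show(228)   # above 127: an umlaut in Latin-1, something else in DOS

print(ascii_a)
print(high_132)
print(high_228)
```

The codes up to 127 come out the same everywhere; the upper half is where the
codepage decides what you see.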

 >> If you write the mail in a correctly configured Insight, then it
 >> should be readable.

 SH> I have found that as long as I use the ALT + NUMs as provided in the
 SH> referenced table, the characters will be viewed correctly within
 SH> Insight regardless of what character set I specify in my Arachne
 SH> setup.
IMHO the default fonts are latin-1 fonts.

The character set setting is purely cosmetic ... IMHO.
But I don't use Insight, so I'm not sure.

 SH> If, however, I should use the ALT + NUMs provided in the "standard"
 SH> ascii table, to include the "standard extended ascii table", then the
 SH> characters will not be readable within Insight.
You are talking about cp437.
Sure ... because cp437 is not the same as iso latin-1.
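
A rough Python sketch of what goes wrong, assuming the built-in "cp437" and
"latin-1" codecs are faithful to the real tables (the sample word is just an
example of mine):

```python
# Text typed with ALT codes under cp437, then read as if it were ISO Latin-1.
typed = "Grüße"                       # what the sender sees on a cp437 console
codes = typed.encode("cp437")         # the byte codes that actually get sent
misread = codes.decode("latin-1")     # a Latin-1 reader interprets the same codes

print(list(codes))      # the numeric codes on the wire
print(repr(misread))    # the same codes, shown with the wrong table
```

The codes never change on the way; only the table used to display them does.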

On the internet the iso latin family is widely used, so it is a very good
idea to use this !!

 >> If you install the stuff that I zipped up, you will end up with a
 >> iso latin1 in DOS. So if you can read it in DOS, and send it away
 >> with the correct charset header it should be readable to anybody
 >> with an average mail reader (or a good one :)

 SH> But this would seem to work out only if my correspondent had his code
 SH> page set up the same as mine.
No ... there are 2 possibilities.

On other OSes you simply use a font, that displays the characters
correctly. (you look into the header, and use the font specified there.)

 SH> I don't want to have to change my DOS setup and my code pages and my
 SH> keyboard maps every time I alternate between reading or writing in one
 SH> language and another.
Use another OS ...

Sorry this is the only advice I can give you, when you frequently use
different languages, which are not fully covered by iso latin1. (see the
url that you have sent in the first part of the message, for which
characters are covered)

If you only need latin1, you can simply keep the default cp437, switch to
latin1 in a batch file, load your mailreader, and switch back to cp437
afterwards. Done.

 SH> I don't know how you and the other folks on this list who do your
 SH> emails in various languages can deal with the problem.
Simply use a mailreader that can either
1) use different fonts for different languages, or
2) use translation tables.

Regarding 2): e.g. there is an ä in cp437, but at another position.
So my mailreader has translation tables, and corrects this.
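
Such a translation table could look roughly like this in Python. This is only
a sketch and not how my mailreader is actually implemented; build_table and
CP437_TO_LATIN1 are names invented for the example, and replacing unmappable
characters with "?" is an arbitrary choice.

```python
# Build a 256-entry table that maps each cp437 code to its Latin-1 code.
# Characters with no Latin-1 equivalent (box drawing etc.) become "?",
# which is an arbitrary choice made for this sketch.

def build_table(src: str, dst: str) -> bytes:
    table = bytearray(256)
    for code in range(256):
        char = bytes([code]).decode(src)          # what this code means in src
        table[code] = char.encode(dst, errors="replace")[0]
    return bytes(table)

CP437_TO_LATIN1 = build_table("cp437", "latin-1")

incoming = "Grüße".encode("cp437")                # codes as typed under cp437
converted = incoming.translate(CP437_TO_LATIN1)   # re-coded for Latin-1
print(converted.decode("latin-1"))
```

The same function with the arguments swapped gives the table for the other
direction.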

 SH> If I use the ALT + NUMs provided in the table referenced by the URL,
 SH> then the message will be readable within Insight,
and in all other sane and good mailreaders

 SH> but it would not be readable with Net-Tamer, which is also considered
 SH> a very good mail reader.
If it can't display iso latin-1 (the de facto standard), then I don't consider it
a good mail reader.
It seems that it simply doesn't care about codepages
and uses the codepage provided by DOS.

 SH> In order to read the message correctly with Net-Tamer I would
 SH> have to use the extended ascii characters found in the "standard"
 SH> table rather than the table provided by the URL.
Even that is not true.

If I used Nettamer, they would not be displayed correctly, as I don't
use cp437, but cp850. (dos latin1)
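
A quick way to see how far the two DOS codepages drift apart, sketched in
Python under the assumption that its "cp437" and "cp850" codecs match the DOS
tables:

```python
# List the codes that cp437 and cp850 display differently.
differing = [
    code for code in range(128, 256)
    if bytes([code]).decode("cp437") != bytes([code]).decode("cp850")
]

# Code 0x84 is one of the positions that both codepages share.
same_in_both = bytes([0x84]).decode("cp437") == bytes([0x84]).decode("cp850")

print(len(differing), "of the 128 upper codes differ")
print("0x84 identical in both:", same_in_both)
```

So a message typed under cp437 only survives on a cp850 machine as long as it
sticks to the codes that both tables happen to share.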

 SH> The "standard" table is the one provided by popular text editors such
 SH> as PEDIT and QEDIT and found in US textbooks.
The editors present you with a table that shows YOUR _current_ codepage.

 SH> Whether that table should be referred to as the "standard" one is
 SH> another question altogether.
It is codepage 437

 SH> Conclusion:  The table provided by the URL works for Insight, but not
 SH> for Net-Tamer.  The "standard" table works for Net-Tamer, but not for
 SH> Insight.
Because nettamer simply ignores and does not care about that topic.
Nettamer seems to be written by an american for americans ...

If you write your mail in iso latin-1, then it is displayed correctly in 95%
of the cases.
(all windows mailreaders, all linux mailreaders, all mac mailreaders)
Only very old DOS programs have problems with it.

 SH> It would be nice to be able to use a single table to be universally
 SH> compatible with all good email readers.
There _IS_ one ...

 >> You can't use ascii values for 'special' characters, because there
 >> ARE NO ascii values for these characters !!!
 SH> I meant of course "extended ascii values" as given in the "standard"
 SH> table.
There is no standard table.
There is a default character set for DOS, which is codepage 437.

 >> SH> I am merely suggesting that everybody should adopt one universal
 >> SH> numbering scheme for all the characters used in all of the
 >> SH> world's languages.
 >> There IS such a scheme ... as I and many others have already written
 >> here .... goto http://www.unicode.org

 SH> Yes, I've looked into that.  It sure would be good if they could
 SH> develop Unicode for DOS.  Some list members think it is possible.
I don't belong to this group.

IMHO DOS does not have the capabilities to do so.
In particular, DOS programs can write to the display directly, so they wouldn't
care about unicode and would mess up everything. (as Michael P. already said)
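
For what it's worth, this is how the universal numbering from unicode.org
relates to the codepages. A tiny Python sketch, nothing DOS-specific, assuming
Python's codecs match the real tables:

```python
# One Unicode code point, several codepage-specific byte codes.
char = "ä"
code_point = ord(char)   # the single universal number for this character

# The code the same character gets in each of the older character sets.
per_charset = {cs: char.encode(cs)[0] for cs in ("cp437", "cp850", "latin-1")}

print(f"U+{code_point:04X}", per_charset)
```

The character has exactly one Unicode number, while its code in the older
tables depends on which table you ask.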

 >> SH> Please examine the characters below from within Insight and also
 >> SH> from within the DOS console.  I think we all can agree that this
 >> SH> is a big problem.
 >> Not really ... this is like I put my tape into the cd-player and I
 >> couldn't hear anything.
 SH> Bad analogy.  Tape players and cd players are different types of
 SH> machines, data is recorded on different types of media, the tape
 SH> player uses analog technology and the CD player digital technology.
This about covers the difference between a modern OS and DOS :)))))
(sorry ... I like DOS, but ... :)

 >> SH> BTW, I would be curious to know how this message looks to
 >> SH> someone who is reading it with Outlook or Eudora, or any other
 >> SH> popular Winblows program.
 >> OK here the facts:
 >> You state in your header that you used iso latin1.
 >> So every sane Email program (including windows, nextstep, amiga ...)
 >> will display it using an iso latin1 font.

 SH> This is not the case.  Some email readers such as Net-Tamer, Barebones
 SH> DOS, and NetMail DOS don't care what the header says.  The message
 SH> would be displayed as though it were seen from the DOS console.
But this is the problem of THESE PROGRAMS.

If there is a sign that says don't smoke (and there are many explosives
near), and you do smoke, and BOOOOM, that's your problem.
Same applies to codepages.
If you know what codepage you should use, but don't use it ... HMMM

 SH> You are saying that good email readers actually read the header and
 SH> display the message in accordance with what the header says.
YES ... exactly

 SH> Arachne doesn't care what the header says.
I never said that I consider Insight good ... :)
(Arachne browser is GREAT !!)

 SH> If I should send the same message with the default ISO header instead
 SH> of with the Latin-1 header,
ISO is an organization that 'produces' standards.
One of them is the iso-8859-x family. Its most prominent member is latin1
(iso-8859-1).

 SH> the message would still be viewed the same from within Insight.
Nevertheless Insight is much better than NT, because it displays iso-latin1
and US-ascii correctly. (And this covers most communications on the net)

 >> So the 1. part is displayed correctly .... (in every sane mail
 >> program)
 SH> Part 1 is displayed correctly by Insight, but incorrectly by email
 SH> programs that display text the same as the DOS console.
Sure ... because these programs ignore any charset line.
This behaviour is NOT considered sane by me.

 >> and the 2. part (written with the standard DOS codepage 437, but
 >> wrongly assumed to be latin1) is displayed incorrectly.
 SH> The DOS console doesn't assume anything.  It doesn't convert
 SH> anything. It only shows you what can be seen from its own vantage
 SH> point.
Exactly

 SH> DOS applications can be written to read headers and convert
 SH> characters, but the DOS console doesn't see anything beyond its own
 SH> vantage point.
yes

 SH> If you have a code page loaded, then you are probably running some
 SH> kind of application to change the way DOS displays text.
 SH> Am I right?
hmm ... I don't understand, if you load a different CP, you don't need an
additional application.

My mailreader works like this: you tell it which CP it should use for export
and which codepage is the local one, and you provide some rules for how to
convert between them.

 SH> In a "normal" machine, the keys you press, or the key combinations
 SH> you press will create their corresponding characters in the message.
NO !!!
The corresponding characters are NOT necessarily CODED equally.

E.g. I press A, and this generates code 065 in cp437 and in nearly all other CPs.
BUT it might be possible that in cp ricsi-1 (imaginary)
code 065 displays as a different character and A is 001.

So you have to translate code 001 in ricsi-1 to code 065 if viewed with
cp437 !!!
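
Sketched in Python, with the imaginary codepage written out as a tiny lookup
table. Everything about ricsi-1 here is invented, including which character
sits on code 65:

```python
# "ricsi-1" is imaginary; this two-entry mapping is invented for the sketch.
RICSI_1 = {1: "A", 65: "ä"}   # code -> character (the entry for 65 is arbitrary)

def ricsi1_to_cp437(codes: bytes) -> bytes:
    """Re-code ricsi-1 bytes so that a cp437 display shows the same characters."""
    return "".join(RICSI_1[c] for c in codes).encode("cp437")

sent = bytes([1])                     # "A" as coded in ricsi-1
translated = ricsi1_to_cp437(sent)    # the code cp437 uses for "A"
print(list(translated))
```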

 SH> These characters should correspond to the way the keys are labeled or
 SH> to a universal standard set of numeric designators to which they are
 SH> assigned.
There aren't even keys for all characters ...
where is the � key ??

 SH> Characters sent should be the same as characters received
 SH> and displayed.
You don't send charsets.
You send codes.
And these codes are interpreted by the codepage of the receiving party.

So it is possible that codes are misinterpreted.
So you tell the receiver how you meant to interpret it.
(eg charset iso-latin1)
So you have provided enough information to display the characters
correctly. If they don't do so, it's THEIR problem, not yours.
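
Here is roughly what a well-behaved reader does with that information,
sketched with Python's standard email module. The message and the address are
made up for the example, and the last step assumes a reader that ignores the
header sits on a cp437 machine:

```python
from email import message_from_bytes

# A made-up message whose header declares iso-8859-1 for the body.
raw = (
    b"From: sender@example.org\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"Gr\xfc\xdfe\r\n"
)

msg = message_from_bytes(raw)
charset = msg.get_content_charset() or "us-ascii"   # what the sender declared
body = msg.get_payload(decode=True)                 # the raw codes of the body

text = body.decode(charset)             # reader that honours the header
ignoring_header = body.decode("cp437")  # reader that just uses the local codepage

print(charset)
print(text.strip())
print(ignoring_header.strip())
```

Both readers receive exactly the same codes; only the one that looks at the
charset line shows what the sender typed.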

 SH> Why do we need code pages when no encryption or conversion is being
 SH> made?
See above.
Encryption has nothing to do with codepages !!

 SH> The exchanging of words ought to be made as easy as the exchanging of
 SH> music.
Yes ... but you can't listen to a modern CD in a 20 year old tape deck !

And DOS can be compared to a tape deck ... by itself it offers very few
services.

 SH> Sam Heywood

CU, Ricsi

-- 
Richard Menedetter <[EMAIL PROTECTED]> [ICQ: 7659421] {RSA-PGP Key avail.}
-=> Please return stewardess to original upright position <=-
