>This is the problem, HTML does not support mixed character sets.
>Also, the charset affects the entire HTML document.  Therefore, your
>resource settings would have to conform with the charset, and this
>can be a big problem if messages existing in the archive have different
>specified charsets.  It would be hard to guarantee that all messages
>will use the same charset.

I think I understand ... is this right?

If a single email contains two different character sets,
you're screwed, I understand that.

If two emails are received, each with a different character set
1) you are screwed on index pages, which will have a bunch
   of subject lines from different character sets
2) you are screwed on message pages, because navigational aids
   like the word "follow-ups" will be in a different character set
   from the messages.

Ok, so I see how Unicode would magically fix everything. But imagine that
wasn't available, and I get a message in an unknown character set.

The result is a message page with no meta tag (which will default to either
iso-8859-1 or some browser heuristic). Assuming iso-8859-1, we get good
navigational aids and an unreadable message. Had we used a meta tag, the
message would be readable and we'd lose the navigational aids. Yuck, yuck,
yuck, it's a choice between two evils. Given just those options, I think a
message page meta tag (generated from the corresponding email's character
set) would be better, though.
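
To make the meta tag idea concrete, here is a rough sketch in Python (not
how the archiver does it, just an illustration). The function name
charset_meta_tag and the DEFAULT_CHARSET fallback are made up for this
example; the only real API used is the standard library's email parser.

```python
from email import message_from_string
from html import escape

# Assumption for this sketch: fall back to iso-8859-1 when the message
# declares no charset, matching the browser default discussed above.
DEFAULT_CHARSET = "iso-8859-1"


def charset_meta_tag(raw_email: str) -> str:
    """Build a meta tag from the charset declared in the email's Content-Type."""
    msg = message_from_string(raw_email)
    # get_content_charset() returns the charset parameter lowercased,
    # or None if the header or parameter is missing.
    charset = msg.get_content_charset() or DEFAULT_CHARSET
    return (
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset=%s">' % escape(charset, quote=True)
    )


sample = (
    "Content-Type: text/plain; charset=ISO-2022-JP\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(charset_meta_tag(sample))
```

This only looks at the top-level Content-Type header, so a multipart
message whose parts use different charsets is still the "screwed" case.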

Converting to Unicode won't be graceful either. If one converts everything
unknown to Unicode, I bet in practice a lot of iso-8859-1 messages will go to
Unicode and be unrenderable by legacy browsers. I guess legacy browsers
will have to be replaced.
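
Here is a small Python sketch of what I mean by the conversion not being
graceful. The to_utf8 helper and the FALLBACK_CHARSET choice are my own
assumptions for illustration, not anything the archiver actually does:
decode with the declared charset, and treat anything unknown as iso-8859-1.

```python
from typing import Optional

# Assumption for this sketch: unknown or missing charsets are read as
# iso-8859-1, which can decode any byte sequence.
FALLBACK_CHARSET = "iso-8859-1"


def to_utf8(raw: bytes, declared: Optional[str] = None) -> bytes:
    """Re-encode message bytes as UTF-8, falling back on unknown charsets."""
    try:
        text = raw.decode(declared or FALLBACK_CHARSET)
    except (LookupError, UnicodeDecodeError):
        # LookupError: Python has no codec by that name.
        # UnicodeDecodeError: the bytes are not valid in the declared charset.
        text = raw.decode(FALLBACK_CHARSET)
    return text.encode("utf-8")


latin1_bytes = b"caf\xe9"  # "café" in iso-8859-1, one byte per character
utf8_bytes = to_utf8(latin1_bytes, "iso-8859-1")

# What a browser that only knows iso-8859-1 would display for the result.
legacy_view = utf8_bytes.decode("iso-8859-1")
print(utf8_bytes, legacy_view)
```

The single byte for the accented letter becomes two bytes in UTF-8, so a
browser that still assumes iso-8859-1 shows two junk characters in its
place. Pure ASCII text is unaffected, since its bytes are the same in both.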

Jeff
