Re: SeaMonkey Mail and text encoding

Alexander Yudenitsch Sun, 09 Oct 2016 21:20:41 -0700

Hi, Paul B. Gallagher! Thanks for your quick reply, on 09 Oct 16 21:39!

A happy coincidence that you decided to do so, since I already had the intention of writing you, to ask for details about your migration of SM from XP to W7 -- but let's leave that for a little later, OK?

I send and receive e-mails in several languages (but mostly English), ...


So do I. In my case, it's Russian (русский язык) and Korean (한국어).
Both have local encoding specifications such as KOI8-R and EUC-KR, but
are also amenable to Unicode.

That's part of the problem, and why I took the time to list "what I've managed to find involving text encoding in SeaMonkey Mail": There are no explanations or specifications for most of these, so one has to guess... I do speak/write Russian, but so far have had no need to use it in e-mails, and have no correspondents who use anything except Unicode and what I THOUGHT could be "Western" characters -- but what's that? Is that a euphemism for "English"? Portuguese, Spanish, French, all use "Western characters", IMHO; but I suspect the former (i.e., "Western") actually means "English"...

Assuming that's true, then in my case there would be 3 types of text encoding: Western/English, Unicode (and I don't know if 'all Unicodes are created equal'), and "other Western codes", like Portuguese, Spanish, French, etc.

First, when a message is received, the "auto-detect" is mostly worse
than useless, so I usually start it as "Unicode" or "Western", depending
on history ...


For incoming messages, this is usually the fault of the sender.

Paraphrasing your standard 'footer', "War doesn't determine who's right, just who's left", I don't care whose 'fault' it is (something like those who hold SM as 'right' because it follows standards better than MS IE): Since most sites still code 'for IE', I prefer not to be "dead right", but instead to be able to read the texts I want with a minimum of fuss -- and, in such cases, ignoring what the majority does is self-defeating.

If the message is well-formed (specifies the encoding actually used,
and that specification is in the standard format), SM will recognize
and understand the specification and apply it. However, some senders
live in countries where everyone assumes that all messages are in the
same local encoding, so their email programs don't even bother to
state the encoding. If your receiving program doesn't make the same
assumption, it has to guess. And sometimes the sending program "lies"
-- it says it uses one encoding but actually uses another.
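
A rough sketch of the three cases described above (encoding declared, encoding missing, encoding declared wrongly), using Python's standard email package. This only illustrates the general idea and is not how SeaMonkey does it; the function name decode_body and the fallback order are assumptions made up for the example.

```python
import email


def decode_body(raw, fallbacks=("utf-8", "cp1252")):
    """Return (text, encoding_used) for a single-part message given as bytes.

    Hypothetical helper: the fallback order is an assumption, not SeaMonkey's.
    """
    msg = email.message_from_bytes(raw)
    # Undo the transfer encoding (base64, quoted-printable) to get raw bytes.
    payload = msg.get_payload(decode=True) or b""
    # The charset the sender declared in Content-Type, or None if absent.
    declared = msg.get_content_charset()
    candidates = ([declared] if declared else []) + list(fallbacks)
    for name in candidates:
        try:
            return payload.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # Bytes invalid for this charset, or an unknown charset name.
            continue
    # latin-1 maps every byte to a character, so this last resort never fails.
    return payload.decode("latin-1"), "latin-1"
```

Note that a wrong guess is only caught when the bytes are invalid in the encoding being tried. Single-byte encodings such as KOI8-R or Windows-1252 accept almost any byte sequence, so a wrong choice there silently produces garbage instead of an error.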

I have noticed that sometimes SM will guess wrong when I select a
message in a mail folder, but if I navigate away and then return
it'll guess right. I don't know why that is -- it seems to be
sticking to the encoding it used for the previous message that I just
deleted.

If SM guessed right 98% of the time, I'd think it's worth it; but, since it's frequently wrong (in over 30% of the cases, in my experience, I think), I'd prefer if there were a 'preference' to turn the 'guessing' off!

There are also things you can do at your end to sabotage the display of
well-formed messages.

The most common error on the receiving side is to specify a font that
doesn't work for the incoming encoding. For example, if you have
SeaMonkey set to display Thai text in a font that does not contain Thai
glyphs, you'll get garbage when you receive a well-formed Thai message.
Well, by garbage I mean you'll probably get boxes, or stuff like this: .

To see if this is the case, go to Edit | Preferences | Appearance |
Fonts, where you will see a pull-down list:
     Fonts: [Western]
Below that, you'll see a list of six fonts for various conditions:
proportional, serif, sans-serif, etc.

Instead of "Western" at the top, pull down the encoding in question.
Let's suppose you choose "Cyrillic." In the list of six fonts that
follows, do all six fonts support Cyrillic? In other words, do those
fonts contain Cyrillic glyphs? If not, change to fonts that do.

Do this for all the encodings you typically receive. You can specify
different "sans-serif" fonts for different encodings -- e.g., Arial for
Cyrillic but DotumChe for Korean.

On the other hand, if SeaMonkey is using the wrong encoding to decipher
a message, instead of boxes you'll typically get "mojibake," which the
Russians call "крякозябры" and the French call "hiéroglyphes." Is that
what you're seeing?
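
To make the difference from the font problem concrete, here is a small Python sketch of how mojibake arises. It assumes that "Western" corresponds to Windows-1252 (cp1252), which is a guess about the menu label, and uses KOI8-R as the sender's encoding purely as an example.

```python
text = "русский язык"

# The sender stores the text as KOI8-R bytes...
raw = text.encode("koi8-r")

# ...but the reader interprets those same bytes as "Western" (Windows-1252).
garbled = raw.decode("cp1252")
print(garbled)

# The bytes themselves are undamaged: choosing the right encoding restores the text.
restored = garbled.encode("cp1252").decode("koi8-r")
print(restored == text)
```

With mojibake every character is still displayed, just as the wrong one, and switching the encoding repairs it. With a missing-glyph problem the encoding is right and only the font needs changing.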

Your comments about fonts make me question if that might have something to do with the problem: Notice that I didn't list the "Fonts" options in "Preferences / Appearance" so, even though I did mark "Allow documents to use other fonts", and I have hundreds of installed fonts, that might still be a problem... But, since I only send 'text-only' messages (even in reply to HTML), and use the 'standard' "Courier New" font for that, AND when I write messages all the characters are displayed in the fonts I use, I don't see how that might be the cause -- but, once more, maybe that's just my ignorance talking...

A common reason that SeaMonkey could choose the wrong encoding (other
than a sender whose computer lies about the encoding it used) is if the
user gives it improper guidance. For example, if you tell it "no matter
what you see, display it in Western encoding," messages with foreign
characters will be corrupted.

One place you should look is under Edit | Preferences | Appearance |
Fonts -- have you checked the box, "Allow documents to use other fonts"?
That's fine if you have lots of fonts installed, but if an incoming
message specifies a font you don't have, your system will have to
substitute something it does have, and that doesn't always work. I have
it checked, and it hasn't caused problems. Things may be different for
your correspondents.

As I said, guessing the correct coding to display an incoming e-mail isn't a big problem, for me: If I guess wrong, I just have to re-guess and start over (or am I missing something here, also?). Since I'm using text-only for display and composing, and only have 3 types of text encoding (Western/English, Unicode, and "other Western codes" like Portuguese), I thought that the first 2 options (Western and Unicode) would cover 100% of my received e-mails.

For mixed-content messages, I recommend specifying Unicode when you
begin drafting (in the composition window, Options | Text Encoding |
Unicode). My implementation of SM does that on its own for all outgoing
messages (both plain text and HTML), which is very convenient. After
all, the definition of Unicode is that it supports all languages. You're
right that the encoding of an incoming message must be selected
correctly before you reply.


Sorry, but that doesn't cover my main problem which, as I said, is that:

when I compose a large message with parts in English and parts in
another language, and make several drafts: Each time the draft is
re-opened for further editing, parts of it have been changed by SM,
usually changing some non-English characters into gibberish (Not
Unicode!); what's particularly puzzling is that the same character
may be changed in some parts of the message and not in others, and
where/when this happens seems almost random.

I assume there must be some 'automatic processing' going on (like the
"auto-detect" above), but see no way to turn it off; I have tried
changing all the above options to "unicode", or to "English", but
nothing seems to help...

This seems to be a relatively recent problem: In the past, I rarely noticed this; but, since these messages have gotten longer, and are very frequently re-drafted, parts of them (and only PARTS, which I find even more mysterious) turn into gibberish (and not boxes, or stuff like , or even "крякозябры" -- a mouthful!). An example:

"ç" might become "Р“В§", or even (with repeated drafts) "Р“С“Р’С’Р“С“Р’СћР“СћРІР‚С™Р’В¬Р“вЂ¦РІР‚СљР“С“Р’С’Р“СћРІвЂљВ¬РІвЂћСћР“С“РІР‚С™Р“вЂљР’В§" (I'm not kidding!!)

I mean: If I can write "ç" when composing (like I did right now), and just save the draft and re-open it, the reason for the gibberish shouldn't be the font, right? I normally write and send messages with these characters without any problem (as far as I know: Usually, no-one complains); it seems to happen only with very long messages which mix English and another language, which is why I suspect that SM 'guesses' each time a draft is saved -- and, over dozens of 'saves', it gets it wrong part of the time.
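
The "ç" example above can be reproduced with a few lines of Python, on the assumption that each save/re-open cycle writes the draft as UTF-8 and reads it back as Windows-1251 (Cyrillic). That pairing is only inferred from the characters that show up; whether SM really does this, and why, is not established. The helper name misread is made up for the illustration.

```python
def misread(text, saved_as="utf-8", reopened_as="cp1251"):
    """One save/re-open cycle in which the draft is written in one encoding
    and read back in another (a hypothetical model, not SeaMonkey's code)."""
    return text.encode(saved_as).decode(reopened_as)


once = misread("ç")    # first cycle: one character becomes two
twice = misread(once)  # second cycle: two characters become four
print(once, twice)

# Undo the damage by running the same steps backwards, once per cycle.
repaired = twice.encode("cp1251").decode("utf-8").encode("cp1251").decode("utf-8")
print(repaired)
```

Under this assumption two cycles give exactly the four-character string quoted above, and every further cycle makes it longer, which would fit the way the gibberish gets worse with each save. Any character that is never re-read in the wrong encoding stays intact, which may be related to why only parts of a message are affected.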

For a relatively long time, I was able to stop this behaviour by always choosing the "View | Text Encoding" before opening these drafts (and, many times, when I forgot to do that, gibberish appeared) but, lately, even when doing that, the gibberish still appears, and gets worse with each 'save': Perhaps there have been some 'tweaks' to SM's "encoding guessing" algorithms?



--
Thanks beforehand for your attention, and I hope to hear from you soon.

s) Alexander Yudenitsch   <[email protected]>



_______________________________________________
support-seamonkey mailing list
[email protected]
https://lists.mozilla.org/listinfo/support-seamonkey
