On Sat, 6 Aug 2011, Groups munged Tony Mechelynck's mail into:

:set list lcs=eol:ś,tab:\|_,nbsp:~,conceal:*

And he followed up:

...and for some reason that f???ing bl??dy st??id googlegroups interface changed my Pilcrow mark to an s-acute. Well, the exact character used there is irrelevant in this case but still, I don't like it. The copy in my "Sent" folder is in 8bit ISO-8859-1 with the correct Pilcrow mark; after the [me (SMTP) relay.skynet.be (ESMTP) googlegroups.com (SMTP) gmail.com (POP3) me] round-trip it comes back in quoted-printable UTF-8 as =C5=9B (equal Charlie Pantafayf equal Noveniner Bravo) which means U+015B SMALL LATIN LETTER S WITH ACUTE instead of the 0xB6 (U+00B6 PILCROW MARK) which I had sent. Ah, why couldn't Google simply understand that Latin1 0xB6 means UTF-8 U+00B6? You don't need iconv to know that. Ah, Google pisses me off. >:-(

In both this thread and the last time I discussed this¹, it appears that the only charset that survives roundtripping to Groups when using codepoints outside of ASCII is UTF-8.

Also as before, though, it's recipient-dependent. ZyX's response² to the initial, munged mail seems to have it correctly quoted as:

:set list lcs=eol:¶,tab:\|_,nbsp:~,conceal:*


In the Groups web interface, all of the broken characters are replaced (for me, using a default charset of UTF-8 everywhere) by the three characters:

ï¿½

That means that { å, æ, ø, «, » } in the old thread and { ¶ } in the new thread were all replaced by that same sequence.
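A three-character replacement is what you would see if each broken character had first been turned into U+FFFD REPLACEMENT CHARACTER and its UTF-8 bytes were then displayed as Latin-1. That is only my guess at what Groups does internally, not something confirmed in either thread, but the arithmetic is easy to check with a minimal Python sketch:

```python
# Assumption (not confirmed by Groups): each broken character became U+FFFD,
# and the UTF-8 bytes of U+FFFD were then shown as if they were Latin-1.
replacement = "\ufffd"

utf8_bytes = replacement.encode("utf-8")   # U+FFFD takes three bytes in UTF-8
as_latin1 = utf8_bytes.decode("latin-1")   # Latin-1 maps each byte to one character

print(utf8_bytes.hex())   # efbfbd
print(len(as_latin1))     # 3
print(as_latin1)
```

If that guess is right, the original character is already gone by the time the three characters appear, so no client-side charset setting can bring it back.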

ZyX appears to have received the old thread correctly, too. His response there³ has them correctly quoted, but Ben Fritz's response⁴ indicates that the erroneously converted characters were simply absent.

All that said, it's unclear how the single byte 0xB6 ended up as the two bytes 0xC5,0x9B... But, alas. Unless you have good reason to stick to explicit Latin-1, you're probably better off using UTF-8. In the current HTML specs⁵, for example, even content that explicitly declares itself ISO-8859-1 is now *intentionally* decoded as CP1252 (Microsoft's version of Latin-1). So, the number of places in which using ISO-8859-1 instead of UTF-8 will bite you is only going to increase.
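On the 0xB6 to 0xC5,0x9B question, one candidate explanation can at least be checked mechanically: in ISO-8859-2 (Latin-2), byte 0xB6 is ś. Whether Groups actually decoded the mail as Latin-2 is purely an assumption on my part, but the bytes line up. The same Python sketch also shows where ISO-8859-1 and CP1252 differ (the 0x80 to 0x9F range), using 0x93 as an arbitrary example byte:

```python
sent = b"\xb6"  # the byte Tony sent, labelled ISO-8859-1

intended = sent.decode("iso-8859-1")   # what the sender meant
guessed = sent.decode("iso-8859-2")    # hypothetical: decoded as Latin-2 instead

reencoded = guessed.encode("utf-8")
# Quoted-printable writes each non-ASCII byte as "=" plus two hex digits.
qp = "".join(f"={b:02X}" for b in reencoded)

print(f"U+{ord(intended):04X}")   # U+00B6
print(f"U+{ord(guessed):04X}")    # U+015B
print(qp)                         # =C5=9B

# ISO-8859-1 vs CP1252: identical for 0xB6, different for 0x80..0x9F.
same = sent.decode("cp1252") == intended
latin1_93 = b"\x93".decode("iso-8859-1")   # a C1 control character
cp1252_93 = b"\x93".decode("cp1252")       # a left double quotation mark
print(same, f"U+{ord(latin1_93):04X}", f"U+{ord(cp1252_93):04X}")
```

None of this proves what happened on Google's side; it only shows that a Latin-1/Latin-2 mix-up would produce exactly the =C5=9B that came back.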

--
Best,
Ben

¹: https://groups.google.com/d/msg/vim_use/UY8vGwc3kvo/QPMZXlptOioJ
²: https://groups.google.com/d/msg/vim_dev/A0Q_z0OksxQ/H-zuwNjtOM4J
³: https://groups.google.com/d/msg/vim_use/UY8vGwc3kvo/P3yr3kNpMBMJ
⁴: https://groups.google.com/d/msg/vim_use/UY8vGwc3kvo/7Vs-BlvtHsQJ
⁵: http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#character-encodings-0

--
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
