Re: Incorrect character encoding in received messages

2019-12-27 Thread Lars Ingebrigtsen
Garjola Dindi writes:

> I have recently been having trouble with Gnus decoding some e-mails as
> ASCII when actually they should be decoded as unicode.

Sounds like there's no Content-Type header in the message that says what
the charset is.
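
To check whether a given message declares a charset at all, a small
sketch like the following may help. It uses Python's standard `email`
module rather than Gnus itself, and the two sample messages and the
helper name `declared_charset` are made up for illustration:

```python
from email import message_from_string


def declared_charset(raw_message):
    """Return the charset named in Content-Type, or None if absent."""
    msg = message_from_string(raw_message)
    return msg.get_content_charset()


# Hypothetical sample messages, not taken from the thread.
WITH_CHARSET = (
    "Subject: test\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "body\n"
)
WITHOUT_HEADER = "Subject: test\n\nbody\n"

print(declared_charset(WITH_CHARSET))    # charset name, lowercased
print(declared_charset(WITHOUT_HEADER))  # None: nothing declared
```

When the second case applies, the reader has to guess the charset, which
would be consistent with the symptoms described.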

> For instance, in French, the “à” char gets displayed as “\340”.

Then the message isn't encoded as utf-8, but is probably latin-1.

`C-u W M c' in the summary buffer should allow you to decode the message
in the proper charset.
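
As a rough illustration of why "\340" points at latin-1, here is a small
Python sketch (not Gnus code; the helper name `try_decode` is made up).
Octal 340 is the single byte 0xE0, which is "à" in latin-1 but is not a
complete UTF-8 sequence on its own:

```python
def try_decode(data, charset):
    """Decode bytes with the given charset, or return None on failure."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


raw = bytes([0o340])  # the byte displayed as "\340", i.e. 0xE0

# In latin-1 the byte 0xE0 is U+00E0, LATIN SMALL LETTER A WITH GRAVE.
print(try_decode(raw, "latin-1") == "\u00e0")

# A lone 0xE0 is an incomplete multi-byte sequence in UTF-8.
print(try_decode(raw, "utf-8") is None)

# A UTF-8 encoded message would carry two bytes for the same character.
print("\u00e0".encode("utf-8") == b"\xc3\xa0")
```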

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

___
info-gnus-english mailing list
info-gnus-english@gnu.org
https://lists.gnu.org/mailman/listinfo/info-gnus-english


Incorrect character encoding in received messages

2019-12-25 Thread Garjola Dindi
Hi all,

I have recently been having trouble with Gnus decoding some e-mails as
ASCII when actually they should be decoded as unicode.

For instance, in French, the “à” char gets displayed as “\340”.

If I go to «edit mode» with 'gnus-summary-edit-article' and just do C-c
C-c (with no real edit), the message gets displayed correctly.

Another example is the "é" char, which appears as 'i' in an HTML
message. Describe char gives me this:

            character: i (displayed as i) (codepoint 105, #o151, #x69)
              charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x69
               script: latin
               syntax: w  which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET 69" or "C-x 8 RET LATIN SMALL LETTER I"
          buffer code: #x69
            file code: #x69 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    xfthb:-PfEd-DejaVu Sans-normal-normal-normal-*-16-*-*-*-*-0-iso10646-1 (#x4C)

Character code properties: customize what to show
  name: LATIN SMALL LETTER I
  general-category: Ll (Letter, Lowercase)
  decomposition: (105) ('i')

And after 'gnus-summary-edit-article' followed by C-c C-c:


            character: é (displayed as é) (codepoint 233, #o351, #xe9)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0xE9
               script: latin
               syntax: w  which means: word
             category: .:Base, L:Left-to-right (strong), c:Chinese, j:Japanese, l:Latin, v:Viet
             to input: type "C-x 8 RET e9" or "C-x 8 RET LATIN SMALL LETTER E WITH ACUTE"
          buffer code: #xC3 #xA9
            file code: #xC3 #xA9 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    xfthb:-PfEd-DejaVu Sans-normal-normal-normal-*-16-*-*-*-*-0-iso10646-1 (#xAB)

Character code properties: customize what to show
  name: LATIN SMALL LETTER E WITH ACUTE
  old-name: LATIN SMALL LETTER E ACUTE
  general-category: Ll (Letter, Lowercase)
  decomposition: (101 769) ('e' '́')
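
The two code points reported above, #x69 and #xe9, differ only in the
top bit. The following Python sketch checks that arithmetic. Whether
Gnus actually drops the eighth bit when it treats the message as ASCII
is only a guess on the basis of these numbers and is not confirmed
anywhere in this thread:

```python
E_ACUTE = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# The single latin-1 byte for the character: 0xE9.
latin1_byte = E_ACUTE.encode("latin-1")[0]

# Clearing the high bit (assumed 7-bit handling) leaves 0x69.
stripped = latin1_byte & 0x7F

print(hex(latin1_byte))  # 0xe9
print(hex(stripped))     # 0x69
print(chr(stripped))     # i

# After correct decoding, Emacs reports buffer code #xC3 #xA9,
# which matches the UTF-8 encoding of the same character.
print(E_ACUTE.encode("utf-8") == b"\xc3\xa9")
```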