On Sat, Jun 26, 2010 at 12:27:28PM +0300, Alexander Gattin
wrote:
> 
> On Fri, Jun 25, 2010 at 11:31:23PM -0700, George
> Davidovich wrote:
> >  32 Content-Type: text/plain; charset=iso-8859-1
> >  33 Content-Transfer-Encoding: quoted-printable
> >  34 
> >  35 George=A0=A0-=A0=A0 ...
>  
> > As I understand it, "A0" represents the
> > non-breaking space character.
> 
> In iso-8859-1? Maybe. If I create this file:
>   $ echo -e 'a\xa0b' > /tmp/nbsp
> and then
>   $ vim /tmp/nbsp
> vim shows:
>   a b
> and
>   "/tmp/nbsp" [converted] 1L, 5C
> in its status line at the bottom. 
> This means that vim converted text to utf8 (my
> locale) and "nbsp" character now takes 2 bytes (in
> utf8). And the 5th byte is "\n" BTW.

Correction noted.  My terminal (for better or worse)
doesn't support UTF-8.  That's becoming more and more of
a problem.

> > Mutt displays the message correctly, but in vim,
> > the character appears as a pipe symbol.
> 
> Please verify that vim can or cannot correctly
> handle \xa0 character using the abovementioned
> method.

Yes, Vim has no problems.

> >  And, as you can tell, there's a whole lot of them.
> 
> If vim itself works OK but e-mails from mutt still
> show a lot of pipes, then, well, mutt really feeds
> vim with pipes.

Yes, mutt is feeding vim with pipes, and the entire email
appears in vim as a single line.  

Can't complain, really, as Yahoo's email software is
presenting messages from me with a paperclip icon!  Still,
I'd like to know whether this is a case of "yet another
badly formatted email" or a shortcoming in mutt.  I suspect
it's the former, as running the message through
MIME::QuotedPrint doesn't strip the nbsp characters.
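
For anyone who wants to check this without Perl, here is a
minimal sketch using Python's standard quopri module.  It
is only a stand-in for the Perl module (I'm assuming the
two behave the same for this input), and the sample line is
the one from the message body quoted above.  It shows that
decoding quoted-printable only undoes the transfer
encoding: each =A0 becomes the byte 0xA0, which is a
no-break space in iso-8859-1, so nothing gets stripped.

```python
import quopri

# Body line as it appears in the raw message
# (quoted-printable, charset iso-8859-1).
raw = b"George=A0=A0-=A0=A0 ..."

# Undo the transfer encoding only; the result is still bytes.
decoded = quopri.decodestring(raw)

# Interpret the bytes in the charset declared by the message.
text = decoded.decode("iso-8859-1")

print(decoded)                        # the 0xA0 bytes are still there
print(text.count("\u00a0"))           # how many no-break spaces survived
print(len("\u00a0".encode("utf-8")))  # bytes per nbsp after UTF-8 conversion
```

The last line is the same effect Alexander described: once
the text is converted to UTF-8, each nbsp takes two bytes
instead of one.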

> >   2. As a workaround, how do I search/replace
> >      non-printable characters
> >      in vim?
> 
> If you want to perform a substitution
> automatically when any file of type "mail" is
> opened by vim, then the following snippet in your
> ~/.vimrc will help:
> 
> if has("autocmd")
>   " Replace all iso8859-1 nbsp chars with "_"
>   autocmd BufReadPost *
>   \ if &filetype == "mail" && &fileencoding == "latin1" |
>   \   %s/\%xA0/_/g |
>   \ endif
> endif

Interesting.  I've always relied on single 'autocmd
Bufenter FileType mail' lines, but using BufReadPost with
if statements makes much more sense.

Using 's/\%xA0/' works for known characters, but a better
solution for interactive use that I've since discovered
(one that doesn't require external utilities,
memorisation, or reference documentation lookups) is
either of the following:

  # yank unknown character into the unnamed register,
  # then insert that register on the command line
  :
  <C-r> "

  # or position cursor at unknown character, then insert
  # the WORD under the cursor on the command line
  :
  <C-r> <C-a>

Maybe the above will help other mutt users in similar
situations.

Thanks for the help, Alexander.

-- 
George
