Gene Kwiecinski wrote:
Hi, I'm trying to replace all occurrences of characters like é, è, ê
etc. by their corresponding HTML entities. To do that, I use the
following command:
%s/é/\&eacute;/g

The problem with that command is that I have to repeat it for every
character. I was wondering if there's a way to do it with only one
command.

Honestly, the way I'd do it is to just throw it at 'sed' (simple but constant 
multiple substitutions), or write a little 'lex' script (more involved, with 
reformatting, rearranging, whatever else needs to be done to the doc), 
depending on how involved it is and how often I'd need to use it.  I've got DOS 
versions of both, which is what I (still) tend to use on Lose systems.

As for 'vim', it's probably best to write a function which you could map to a 
single key like one of the F-keys.  I'm still new at that, so couldn't tell you 
how to do that, but certainly someone here can tell you how to build the 
skeleton of the function;  you'd add salt and pepper to taste.

Then you'd just open up a doc, hit <F9> or something, and it's magically done 
in one pass.
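
A minimal sketch of such a skeleton, assuming Vim 7 or later with 'encoding' set to UTF-8. The names g:entity_table, EntitiesInText and EntitiesInBuffer, the tiny three-entry table, and the choice of <F9> are all illustrative, not anything from this thread:

```vim
scriptencoding utf-8

" Hypothetical lookup table: extend it with whatever characters you need.
let g:entity_table = {'é': '&eacute;', 'è': '&egrave;', 'ê': '&ecirc;'}

" Return a copy of a:text with every non-ASCII character that appears in
" the table replaced by its entity; characters not in the table are kept.
function! EntitiesInText(text)
        return substitute(a:text, '[^\x00-\x7f]',
                \ '\=get(g:entity_table, submatch(0), submatch(0))', 'g')
endfunction

" Run the replacement over every line of the current buffer.
function! EntitiesInBuffer()
        call setline(1, map(getline(1, '$'), 'EntitiesInText(v:val)'))
endfunction

" One keypress in Normal mode converts the whole buffer.
nnoremap <F9> :call EntitiesInBuffer()<CR>
```

With that sourced from a vimrc, opening a doc and hitting <F9> does the whole table in one pass; adding a character means adding one entry to the table rather than one more :s command.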

Don't have it here, but did write a few cheapo 'lex' scripts to turn 0xA0 to 
"&nbsp;", convert and match ldquo/rdquo/lsquo/rsquo chars, ellipses, most of 
the common things I encounter.  Gimme a yell and I can dig the script up 
tomorrow and post it here (if anyone else is interested) or send it via email 
directly.


Assuming 'encoding' is set to UTF-8,

let g:htmlEntities = {
        \ "\<Char-160>" : "&nbsp;",
        \ '«' : "&laquo;",
        \ '»' : "&raquo;",
        \ "á" : "&aacute;",
        \ "Á" : "&Aacute;",
        \ "â" : "&acirc;",
        \ "Â" : "&Acirc;",
        \ "ä" : "&auml;",
        \ "Ä" : "&Auml;",
        \ "æ" : "&aelig;",
        \ "Æ" : "&AElig;",
        \ "ã" : "&atilde;",
        \ "Ã" : "&Atilde;",
        \ "ç" : "&ccedil;",
        \ "Ç" : "&Ccedil;",
        \ '©' : "&copy;"
...etc...
        \ }

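" Return the HTML entity for a:char, or a:char itself if it is plain ASCII.
function! Entity(char)
        if has_key(g:htmlEntities, a:char)
                " Named entity from the table above
                return g:htmlEntities[a:char]
        elseif char2nr(a:char) > 127
                " No named entity in the table: use a numeric reference
                return '&#' . char2nr(a:char) . ';'
        else
                return a:char
        endif
endfunction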

Then the substitute becomes

        :%s/./\=Entity(submatch(0))/g

which should benefit from the Dictionary's hashed key lookup, since the buffer is scanned once instead of once per character. I haven't benchmarked it though. Whether to replace & by &amp;, < by &lt;, > by &gt; is a more ticklish decision, since you don't want to mess with preexisting entities and/or HTML tags.
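
For the ampersand part of that decision, one possible approach is to escape only those ampersands that do not already begin an entity. The following is a rough sketch under that assumption; the function name EscapeBareAmpersands is hypothetical, and it deliberately leaves < and > alone, because telling a real tag from a stray angle bracket needs more than a regular expression:

```vim
" Sketch: turn '&' into '&amp;' unless it already starts something that
" looks like an entity (&name; or &#123; or &#x7B;).
function! EscapeBareAmpersands(text)
        " What may follow '&' in an existing entity, up to and including ';'
        let l:entity = '\%(#\d\+\|#[xX]\x\+\|\a\w*\);'
        " \@! is a negative lookahead: match '&' only if no entity follows.
        " In the replacement, '\&' is a literal ampersand.
        return substitute(a:text, '&\%(' . l:entity . '\)\@!', '\&amp;', 'g')
endfunction
```

The same pattern can be used directly in a :%s command. It treats anything shaped like an entity as one, so a bare "&foo;" that is not a real entity name is left untouched.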


Note that I think I uploaded a script, htmlmap.vim, to vim-online quite some time ago; it provides a set of buffer-local mappings to "convert as you type" when editing HTML files. Unlike the approach above, which needs the Dictionary type introduced in Vim 7, that script also works with Vim version 6.
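
As a rough illustration of the convert-as-you-type idea (this is not the contents of htmlmap.vim, just a sketch of the general technique; the augroup name is hypothetical and only two characters are shown):

```vim
scriptencoding utf-8

" Sketch: whenever a buffer's filetype is set to html, define insert-mode
" mappings local to that buffer, so that typing the accented character
" inserts the entity instead.
augroup HtmlEntityMapsSketch
        autocmd!
        autocmd FileType html inoremap <buffer> é &eacute;
        autocmd FileType html inoremap <buffer> è &egrave;
augroup END
```

Because the mappings are created with <buffer>, they affect only HTML buffers and leave typing in other files unchanged.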


Best regards,
Tony.
