Gene Kwiecinski wrote:
Hi, I'm trying to replace all occurrences of characters like é, è, ê
etc ... by their corresponding htmlentities. To do that, I use the
following command:
%s/é/\&eacute;/g
The problem with that command is that I have to do that for all
characters. I was wondering if there's a way to do it with only one
command.
Honestly, the way I'd do it is to just throw it at 'sed' (simple but constant
multiple substitutions), or write a little 'lex' script (more involved, with
reformatting, rearranging, whatever else needs to be done to the doc),
depending on how involved it is and how often I'd need to use it. I've got DOS
versions of both, which is what I (still) tend to use on Lose systems.
As for 'vim', it's probably best to write a function which you could map to a
single key like one of the F-keys. I'm still new at that, so couldn't tell you
how to do that, but certainly someone here can tell you how to build the
skeleton of the function; you'd add salt and pepper to taste.
Then you'd just open up a doc, hit <F9> or something, and it's magically done
in one pass.
Don't have it here, but did write a few cheapo 'lex' scripts to turn 0xA0 to
"&nbsp;", convert and match ldquo/rdquo/lsquo/rsquo chars, ellipses, most of
the common things I encounter. Gimme a yell today and tomorrow I can dig up and
post the script here (if anyone else is interested) or via email directly.
Assuming 'encoding' is set to UTF-8,
let htmlEntities = {
\ "\<Char-160>" : "&nbsp;",
\ '«' : "&laquo;",
\ '»' : "&raquo;",
\ "á" : "&aacute;",
\ "Á" : "&Aacute;",
\ "â" : "&acirc;",
\ "Â" : "&Acirc;",
\ "ä" : "&auml;",
\ "Ä" : "&Auml;",
\ "æ" : "&aelig;",
\ "Æ" : "&AElig;",
\ "ã" : "&atilde;",
\ "Ã" : "&Atilde;",
\ "ç" : "&ccedil;",
\ "Ç" : "&Ccedil;",
\ '©' : "&copy;"
...etc...
\ }
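function Entity(char)
  " Named entity if the table has one, numeric entity for any other
  " non-ASCII character, and the character itself otherwise.
  if has_key(g:htmlEntities, a:char)
    return g:htmlEntities[a:char]
  elseif char2nr(a:char) > 127
    return '&#' . char2nr(a:char) . ';'
  else
    return a:char
  endif
endfunction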
Then the substitute becomes
:%s/./\=Entity(submatch(0))/g
which should benefit from the dictionary-lookup algorithm. I haven't
benchmarked it though. Whether to replace & by &amp;, < by &lt;, > by &gt; is
a more ticklish decision since you don't want to mess with preexisting
entities and/or HTML tags.
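
One way to sidestep that question, as a rough and untested-at-scale sketch: match only non-ASCII characters, so that existing &, < and > are never touched, and bind the whole thing to a key. The table below is a deliberately abbreviated stand-in for the full one, the choice of <F9> is arbitrary, and the pattern [^\x00-\x7F] is my assumption about what "needs converting" means here.

```vim
scriptencoding utf-8

" Abbreviated, illustrative table; extend it to taste.
let g:htmlEntities = {
      \ "\u00a0" : "&nbsp;",
      \ "é" : "&eacute;",
      \ "è" : "&egrave;",
      \ "ê" : "&ecirc;"
      \ }

" Named entity if known, numeric entity for other non-ASCII
" characters, and the character unchanged otherwise.
function! Entity(char)
  if has_key(g:htmlEntities, a:char)
    return g:htmlEntities[a:char]
  elseif char2nr(a:char) > 127
    return '&#' . char2nr(a:char) . ';'
  else
    return a:char
  endif
endfunction

" Convert only non-ASCII characters in the whole buffer.
" The 'e' flag keeps :s quiet when there is nothing to convert.
nnoremap <F9> :%s/[^\x00-\x7F]/\=Entity(submatch(0))/ge<CR>
```

With that sourced, pressing <F9> in Normal mode converts the buffer in one pass, and an already-converted file is left alone because entities consist of ASCII only.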
Note that I think I uploaded quite some time ago to vim-online a script
htmlmap.vim with a set of buffer-local mappings to "convert as you type" when
editing HTML files. Unlike the above approach, that script works also with Vim
version 6.
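
To give an idea of what "convert as you type" can look like, here is a minimal sketch using buffer-local Insert-mode mappings. It is not the actual contents of htmlmap.vim, only an illustration of the technique; the three characters and the suggested location (an HTML filetype plugin file) are assumptions.

```vim
" Hypothetical fragment for an HTML filetype plugin file.
scriptencoding utf-8

" Typing the accented character inserts the entity instead.
" <buffer> limits each mapping to the current buffer.
inoremap <buffer> é &eacute;
inoremap <buffer> è &egrave;
inoremap <buffer> ê &ecirc;
```

Mappings like these only affect newly typed text, so text that is already in the file still needs a substitute pass.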
Best regards,
Tony.