I often need to convert from HTML to t2t.

There is already https://txt2tags.googlecode.com/svn/trunk/extras/unhtml.vim 
but the result is not always clean and it's not scriptable.

Pandoc (http://johnmacfarlane.net/pandoc/) can also convert from HTML to
several other formats, but I didn't manage to adapt it to txt2tags (it seems
complicated to compile, so I didn't go further than a simple installation).

But I've just discovered this handy piece of software:


It converts from HTML to several wiki formats, such as dokuwiki or mediawiki.
The good news is that it's very easy to adapt to a new syntax. For example,
part of the definition file looks like this:

    b => { start => '**', end => '**' },
    strong => { alias => 'b' },
    i => { start => '//', end => '//' },
    em => { alias => 'i' },
    u => { start => '__', end => '__' },
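
To show why such a table is easy to extend, here is a minimal sketch in Python of how a rule table with aliases can be resolved. It only mirrors the excerpt above for illustration; it is not how html2wiki itself is implemented, and the names RULES, resolve and wrap are hypothetical.

```python
# Illustrative rule table copied from the excerpt above (not the real module).
RULES = {
    "b": {"start": "**", "end": "**"},
    "strong": {"alias": "b"},
    "i": {"start": "//", "end": "//"},
    "em": {"alias": "i"},
    "u": {"start": "__", "end": "__"},
}


def resolve(tag, rules=RULES):
    """Follow 'alias' entries until a rule with real markers is found."""
    seen = set()
    rule = rules[tag]
    while "alias" in rule:
        if tag in seen:
            # Guard against a table where aliases point at each other.
            raise ValueError(f"alias loop at {tag!r}")
        seen.add(tag)
        tag = rule["alias"]
        rule = rules[tag]
    return rule


def wrap(tag, text, rules=RULES):
    """Surround text with the start/end markers defined for the tag."""
    rule = resolve(tag, rules)
    return rule["start"] + text + rule["end"]


print(wrap("strong", "bold text"))
print(wrap("em", "emphasis"))
```

Adding a new tag then only means adding one entry to the table, which is the property that makes the definition file easy to adapt.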

I've created a txt2tags export: 
Until it's clean enough to put on the txt2tags svn, I've put the (work in 
progress) project there:


The archive is in: https://code.google.com/p/textallion/downloads/list

Once html2wiki and this module are both installed, you can invoke it this way:

html2wiki --dialect Txt2tags file.html

You can even get remote files and convert them like this:

curl --silent http://theody.net/elements.html | html2wiki --dialect Txt2tags > elements.t2t
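
Since the command is scriptable, converting a whole directory is straightforward. The following Python sketch is one possible way to do it. It assumes html2wiki is on the PATH and writes its result to standard output (as the redirection above suggests); the function names build_command, convert_file and convert_directory are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(html_path, dialect="Txt2tags"):
    """Build the same command line as shown above, as an argument list."""
    return ["html2wiki", "--dialect", dialect, str(html_path)]


def convert_file(html_path):
    """Convert one HTML file and write the result next to it as .t2t."""
    html_path = Path(html_path)
    out_path = html_path.with_suffix(".t2t")
    # Assumption: html2wiki prints the converted text on standard output.
    result = subprocess.run(
        build_command(html_path), check=True, stdout=subprocess.PIPE
    )
    out_path.write_bytes(result.stdout)
    return out_path


def convert_directory(directory):
    """Convert every *.html file found directly inside the directory."""
    return [convert_file(p) for p in sorted(Path(directory).glob("*.html"))]


# Example (not run here): convert_directory("pages")
```

Output is handled as raw bytes so that the script does not have to guess the encoding of each page.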
