You could try html2markdown, which uses iconv, tidy, and pandoc. It should have no trouble with these characters. It's included in the pandoc distribution: http://sophos.berkeley.edu/macfarlane/pandoc/
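
As a rough illustration of the same idea (normalize the encoding first, then hand the HTML to pandoc), here is a minimal Python sketch. It is not how html2markdown itself is implemented: the function names `to_utf8_text` and `html_to_markdown` are made up for this example, the cp1252 fallback is only a guess at what encoding the documents use, and the tidy cleanup step is skipped entirely.

```python
import shutil
import subprocess


def to_utf8_text(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode bytes as UTF-8, falling back to a legacy single-byte encoding.

    The fallback encoding is an assumption; bytes such as \\xa0 and \\xa9
    suggest Latin-1 or Windows-1252, but the real source encoding may differ.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes that are undefined in the fallback become U+FFFD.
        return raw.decode(fallback, errors="replace")


def html_to_markdown(raw: bytes) -> str:
    """Re-encode the HTML as UTF-8 and pipe it through pandoc."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    html = to_utf8_text(raw)
    result = subprocess.run(
        ["pandoc", "--from=html", "--to=markdown"],
        input=html.encode("utf-8"),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")
```

For a batch of 500+ files, one could loop over the directory, read each file in binary mode, and write the result of `html_to_markdown` to a matching `.md` file.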
JM

+++ Jeremy C. Reed [Mar 22 07 15:52 ]:
> The html document various characters like
> \xa0
> © \xa9 (Copyright symbol)
> (and others).
>
> I tried using html2text.py but it didn't like these characters.
>
> Any ideas on how I can use iconv or another tool to convert documents like
> this so I can then convert to Markdown?
>
> I don't want to do manually as I have around 500+ documents.
>
>
> Jeremy C. Reed
> _______________________________________________
> Markdown-Discuss mailing list
> [email protected]
> http://six.pairlist.net/mailman/listinfo/markdown-discuss
