You could try html2markdown, which uses iconv, tidy, and pandoc.
It should have no trouble with these characters. It's included in the
pandoc distribution: http://sophos.berkeley.edu/macfarlane/pandoc/
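
If you would rather do the re-encoding step yourself before handing the
files to pandoc, a rough Python sketch could look like the one below. It
assumes the documents are Latin-1 (ISO-8859-1), which is only a guess based
on the \xa0 and \xa9 bytes you mention; adjust SOURCE_ENCODING if they turn
out to be something else. The function names and the ".text" output suffix
are made up for illustration, and this is not what html2markdown itself does
internally.

```python
import pathlib
import subprocess

# Assumption: the source files are Latin-1. The original message does not
# say which encoding they use.
SOURCE_ENCODING = "latin-1"


def to_utf8_text(raw: bytes, encoding: str = SOURCE_ENCODING) -> str:
    """Decode raw bytes and turn non-breaking spaces into plain spaces."""
    text = raw.decode(encoding)
    return text.replace("\u00a0", " ")


def convert_dir(src_dir: str, dst_dir: str) -> None:
    """Convert every .html file directly inside src_dir to Markdown.

    Hypothetical helper: requires the pandoc executable on PATH.
    """
    src = pathlib.Path(src_dir)
    dst = pathlib.Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for html_path in sorted(src.glob("*.html")):
        utf8_html = to_utf8_text(html_path.read_bytes()).encode("utf-8")
        # Feed UTF-8 HTML to pandoc on stdin, collect Markdown from stdout.
        result = subprocess.run(
            ["pandoc", "-f", "html", "-t", "markdown"],
            input=utf8_html,
            stdout=subprocess.PIPE,
            check=True,
        )
        (dst / (html_path.stem + ".text")).write_bytes(result.stdout)
```

Calling convert_dir("html", "markdown") would then process a whole
directory in one go, which should help with 500+ documents. Try it on a
couple of files first and check that the copyright symbols survive.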

JM

+++ Jeremy C. Reed [Mar 22 07 15:52 ]:
> The HTML document contains various characters like
>       \xa0  (non-breaking space)
> ©     \xa9  (Copyright symbol)
> (and others).
> 
> I tried using html2text.py but it didn't like these characters.
> 
> Any ideas on how I can use iconv or another tool to convert documents like 
> this so I can then convert to Markdown?
> 
> I don't want to do this manually as I have around 500+ documents.
> 
> 
>   Jeremy C. Reed
> _______________________________________________
> Markdown-Discuss mailing list
> [email protected]
> http://six.pairlist.net/mailman/listinfo/markdown-discuss
