Has anyone found a fluid cost-free process to achieve clean output from
pasting via Word docs? HTML Tidy is nice but doesn't really cut it as part
of a client process.
I'm currently investigating using OpenOffice.org to convert Word to
OpenDocument, then XSLT to convert the content.xml from the zip file to
XHTML. OpenOffice can batch convert whole directory trees of Word docs in
one go :)
It works nicely if you want relatively plain text (perhaps keeping bold
and italic words) and nothing else: my converter even strips out tables,
leaving each cell as a paragraph. I haven't yet got it to work on my server
though, so I don't know whether it can be fully automated.
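
For anyone who wants to try the idea without writing a stylesheet first, here
is a rough sketch of the "content.xml to plain XHTML paragraphs" step. It is
not my XSLT converter: it swaps XSLT for Python's standard library, and the
function names content_xml_to_xhtml and odt_to_xhtml are made up for
illustration. It assumes the Word doc has already been converted to an
OpenDocument text file, and it drops bold and italic as well, since those
would need the document's style definitions resolved.

```python
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# OpenDocument "text" namespace, used by paragraph and heading elements.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
BLOCK_TAGS = ("{%s}p" % TEXT_NS, "{%s}h" % TEXT_NS)


def content_xml_to_xhtml(content_xml):
    """Turn every text:p / text:h into a plain <p>, in document order.

    Table cells contain text:p elements too, so each cell simply ends up
    as its own paragraph. Inline formatting is discarded, and ODF's
    text:s / text:tab whitespace elements are ignored (a simplification).
    """
    root = ET.fromstring(content_xml)
    paragraphs = []
    for elem in root.iter():
        if elem.tag in BLOCK_TAGS:
            text = "".join(elem.itertext()).strip()
            if text:  # skip empty spacer paragraphs
                paragraphs.append("<p>%s</p>" % escape(text))
    return "\n".join(paragraphs)


def odt_to_xhtml(path):
    """Read content.xml out of the .odt zip and convert it."""
    with zipfile.ZipFile(path) as odt:
        return content_xml_to_xhtml(odt.read("content.xml"))
```

Calling odt_to_xhtml("example.odt") (a hypothetical file name) returns a string
of <p> elements that could be dropped into a page template. Paragraphs nested
inside other paragraphs, such as text boxes, would appear twice with this
approach, so treat it as a starting point only.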
Anthony
--
www.fonant.com - hand-crafted web sites
*********************************************************
The CMS discussion list for http://webstandardsgroup.org/
*********************************************************