On Fri, 01 May 2009 12:22:32 +0100, Adrian Sutton <[email protected]> wrote:

The biggest challenge in this is actually removing the huge amount of inline formatting and proprietary tags/attributes that Microsoft Word adds. In the latest versions it's also a challenge to put lists back together as actual HTML lists since Word has started exporting them as paragraphs with a bullet from the symbol font and lots of nbsps.
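
To make the list problem concrete, here is a rough Python sketch of regrouping such paragraphs into a real HTML list. It assumes the markup has already been stripped, so each paragraph is a plain string, and that a list item starts with a bullet glyph followed by non-breaking spaces. The set of glyphs in BULLET_RE and the name rebuild_lists are illustrative guesses, not a description of what Word actually emits.

```python
import html
import re

# Assumed bullet glyphs: middle dot, bullet, and U+F0B7 (a private-use code
# point that symbol-font bullets are sometimes mapped to). Adjust for real input.
BULLET_RE = re.compile(r"^[\u00b7\u2022\uf0b7][\u00a0\s]+")


def rebuild_lists(paragraphs):
    """Render paragraph strings as HTML, grouping consecutive bullet
    paragraphs into a single <ul>."""
    out = []
    in_list = False
    for text in paragraphs:
        match = BULLET_RE.match(text)
        if match:
            if not in_list:
                out.append("<ul>")
                in_list = True
            # Drop the bullet glyph and the padding that follows it.
            item = text[match.end():].strip()
            out.append("<li>" + html.escape(item) + "</li>")
        else:
            if in_list:
                out.append("</ul>")
                in_list = False
            out.append("<p>" + html.escape(text.strip()) + "</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)
```

Nested and numbered lists would need more work (indent levels, "1." style prefixes); this only covers the flat bulleted case.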

Off topic, I know, but couldn't a VBA macro hook into Word and add an "export as semantic HTML" option that exports the heading levels as h1..h6, honours bold, italics and links, exports bullets and numbers as ul and ol, and just ignores all colours, font changes etc., so there is nothing to clean up?
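
As a sketch of the mapping I have in mind (in Python rather than VBA, and working on a made-up in-memory structure rather than Word's object model): each paragraph is assumed to be a dict with a style name and a list of runs, and the style names "Heading 1".."Heading 6", "List Bullet" and "List Number" are assumed to identify headings and lists. Anything else on a run (colour, font) is simply never looked at.

```python
import html
import re

# Assumed style names for bulleted and numbered list paragraphs.
LIST_TAGS = {"List Bullet": "ul", "List Number": "ol"}


def render_run(run):
    """Render one text run, keeping only bold, italic and link information."""
    text = html.escape(run["text"])
    if run.get("bold"):
        text = "<strong>" + text + "</strong>"
    if run.get("italic"):
        text = "<em>" + text + "</em>"
    if run.get("href"):
        href = html.escape(run["href"], quote=True)
        text = '<a href="' + href + '">' + text + "</a>"
    return text


def export_semantic(paragraphs):
    """Emit h1..h6, p, ul and ol from styled paragraphs; ignore everything else."""
    out = []
    open_list = None  # "ul", "ol", or None when no list is open
    for para in paragraphs:
        style = para["style"]
        body = "".join(render_run(r) for r in para["runs"])
        list_tag = LIST_TAGS.get(style)
        # Close the current list when the list type changes or the list ends.
        if open_list and open_list != list_tag:
            out.append("</" + open_list + ">")
            open_list = None
        if list_tag:
            if open_list is None:
                out.append("<" + list_tag + ">")
                open_list = list_tag
            out.append("<li>" + body + "</li>")
            continue
        heading = re.fullmatch(r"Heading ([1-6])", style)
        tag = "h" + heading.group(1) if heading else "p"
        out.append("<" + tag + ">" + body + "</" + tag + ">")
    if open_list:
        out.append("</" + open_list + ">")
    return "\n".join(out)
```

Because the exporter starts from the document's structure instead of its rendered appearance, there is no inline formatting to strip afterwards. A real macro would have to get the same information out of Word itself.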

bruce
