I did some work on this using XSL a few years ago, and luckily I seem to have saved it. The transform was designed to replace the XSL at the heart of SharePoint's Word-to-HTML converter. That converter handled extracting the XML from the docx (which is really just a zip file), so you'd have to build a bit of infrastructure around the transform yourself, but it could get you started:
http://www.nwlink.com/~woodruff/DocX2Html.xsl

Note that this was designed for a particular use model: letting business users update website content by uploading Word docs. We wanted the HTML to be semantic (e.g. a bulleted list in Word became a <UL> in HTML and a paragraph became a <P>, whereas the original DocX2HTML just used DIVs for everything). It doesn't try to preserve all inline formatting; instead it relies on mapping Word style names to CSS classes (the CSS stylesheets themselves were created by hand).

--Ken

On Monday, November 17, 2014 9:44:47 PM UTC-8, Arvind T wrote:
> Hi
> Is there any node module to converting docx to html without losing the
> format?
> I have tried mammoth but the format is getting lost.
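
The approach described in this post (treat the docx as a zip, pull out the XML, emit semantic tags, and map Word style names to CSS classes) can be sketched roughly as follows. This is not the XSL linked above, just a minimal illustration of the same idea. It is written in Python only because its standard library can read zip archives and XML; the same steps would apply in Node with a zip and an XML library. The `STYLE_TO_CLASS` table and the function names are made up for the example, and every list is treated as a `<ul>` because telling bullets from numbers would require reading `word/numbering.xml` as well.

```python
import html
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical, hand-maintained mapping from Word style IDs to CSS classes.
STYLE_TO_CLASS = {"Heading1": "page-title", "Quote": "pull-quote"}


def paragraphs(docx_path):
    """Yield (style_id, is_list_item, text) for each paragraph in the document."""
    # A .docx is a zip archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    for p in root.iter(W + "p"):
        ppr = p.find(W + "pPr")
        style = ppr.find(W + "pStyle") if ppr is not None else None
        style_id = style.get(W + "val") if style is not None else None
        # Simplification: only direct list formatting (w:numPr) is detected,
        # not list formatting inherited from a style.
        is_list = ppr is not None and ppr.find(W + "numPr") is not None
        # Inline formatting is deliberately dropped; only the text is kept.
        text = "".join(t.text or "" for t in p.iter(W + "t"))
        yield style_id, is_list, text


def to_html(docx_path):
    """Emit <p>/<ul>/<li> markup with CSS classes taken from Word style names."""
    out, in_list = [], False
    for style_id, is_list, text in paragraphs(docx_path):
        if is_list != in_list:
            # Open or close the list when we cross a list boundary.
            out.append("<ul>" if is_list else "</ul>")
            in_list = is_list
        css = STYLE_TO_CLASS.get(style_id)
        attr = f' class="{css}"' if css else ""
        tag = "li" if is_list else "p"
        out.append(f"<{tag}{attr}>{html.escape(text)}</{tag}>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)
```

Calling `to_html("page.docx")` on an uploaded file returns an HTML fragment whose look is controlled entirely by the stylesheet, which is the trade-off described above: formatting survives only where a Word style has a matching CSS class.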
