I did some work on this using XSL a few years ago, and luckily I seem to 
have saved it.  The transform was designed to replace the XSL at the heart 
of SharePoint's Word-to-HTML converter. That converter handled extracting 
the XML from the docx (which is really just a zip file), so you'd have to 
build a bit of infrastructure around the transform, but it could get you 
started:

http://www.nwlink.com/~woodruff/DocX2Html.xsl

Note that this was designed for a particular use case: allowing business 
users to update website content by uploading Word docs.  We wanted the HTML 
to be semantic (e.g. a bulleted list in Word became a <UL> in HTML and a 
paragraph became a <P>, whereas the original DocX2HTML just used DIVs for 
everything).  It doesn't try to preserve all inline formatting; instead it 
relies on mapping Word style names to CSS classes (the CSS stylesheets 
themselves were written by hand).
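
To make the style-name-to-class idea concrete, here is a minimal sketch in 
TypeScript. It is not the XSL linked above, just an illustration of the same 
mapping under some assumptions: the DocParagraph shape, the function names, 
and the "lowercase with hyphens" class naming rule are all hypothetical, and 
it assumes something else has already pulled the paragraphs out of the docx.

```typescript
// Hypothetical, simplified model of one paragraph taken from the docx XML.
interface DocParagraph {
  styleName: string;   // Word style name, e.g. "Heading 1" or "Pull Quote"
  text: string;        // plain text of the paragraph
  isListItem: boolean; // true when the paragraph belongs to a bulleted list
}

// Assumed naming rule: "Pull Quote" -> "pull-quote".
function styleToCssClass(styleName: string): string {
  return styleName
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters
    .replace(/^-+|-+$/g, "");    // drop leading/trailing hyphens
}

// Escape the characters that are unsafe in HTML text content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Emit <p> for ordinary paragraphs and wrap consecutive list items in a
// single <ul>, instead of using a <div> for everything.
function toSemanticHtml(paragraphs: DocParagraph[]): string {
  const out: string[] = [];
  let inList = false;
  for (const p of paragraphs) {
    if (p.isListItem && !inList) {
      out.push("<ul>");
      inList = true;
    } else if (!p.isListItem && inList) {
      out.push("</ul>");
      inList = false;
    }
    const cls = styleToCssClass(p.styleName);
    const attr = cls ? ` class="${cls}"` : "";
    const tag = p.isListItem ? "li" : "p";
    out.push(`<${tag}${attr}>${escapeHtml(p.text)}</${tag}>`);
  }
  if (inList) {
    out.push("</ul>"); // close a list that runs to the end of the document
  }
  return out.join("\n");
}
```

The hand-written stylesheet then only needs one rule per Word style (for 
example a .pull-quote rule), which is why in-line formatting that isn't 
backed by a named style gets dropped in this approach.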

--Ken

On Monday, November 17, 2014 9:44:47 PM UTC-8, Arvind T wrote:
>
> Hi
> Is there any node module for converting docx to html without losing the 
> formatting?
> I have tried mammoth, but the formatting is getting lost.
>
