Hi,

The stylesheet structure was rationalised for the 1.75.2 release so that Word, 
Pages and OpenOffice formats could all be supported. There is a stylesheet for 
each of those formats that normalises the document to a common format, and then 
the other stylesheets take the document through to structured DocBook.

Office 2007 uses what is basically WordML under the hood, and a .docx "file" 
is really just a Zip archive containing the XML documents. The one with the 
document content is word/document.xml. It wouldn't be too much work to upgrade 
the roundtrip stylesheets to handle this document; for the most part it is 
just the XML Namespace URIs that have changed.
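
As a rough illustration of the point above (this is not part of the roundtrip 
stylesheets), here is a minimal Python sketch, standard library only, that 
pulls word/document.xml out of a .docx and lists the text of each paragraph. 
The helper names read_document_xml and paragraph_texts are made up for this 
example. The two constants are the WordML (Word 2003) namespace URI and the 
Office 2007 one, which is the kind of difference the stylesheets would need to 
account for.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main namespace URI of Word 2003 WordML documents.
WORDML_2003_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
# Main namespace URI used by word/document.xml in an Office 2007 .docx.
DOCX_2007_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx):
    """Return the raw bytes of word/document.xml.

    `docx` may be a file path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(docx) as archive:
        return archive.read("word/document.xml")


def paragraph_texts(document_xml, ns_uri=DOCX_2007_NS):
    """Return the concatenated run text of every w:p element, in document order."""
    ns = {"w": ns_uri}
    root = ET.fromstring(document_xml)
    return [
        "".join(t.text or "" for t in p.iterfind(".//w:t", ns))
        for p in root.iterfind(".//w:p", ns)
    ]
```

Something like paragraph_texts(read_document_xml("report.docx")) would then 
give one string per paragraph ("report.docx" being a placeholder file name). 
Passing WORDML_2003_NS as ns_uri is only a guess at how far the same element 
names carry over to a Word 2003 XML file; I have not checked that.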

I'm working on libxslt at the moment (implementing XSLT 2.0), so I haven't 
really got time to look at the roundtripping stuff. However, email me directly 
if you have any further questions.

Cheers,
Steve Ball

On 07/05/2010, at 6:29 AM, [email protected] wrote:

> Howdy DocBook Community: 
> 
> I am new to DocBook, and also new to this forum. I have been going through 
> the archives, and found some very interesting discussions. Primarily I am 
> interested in converting some documents from Word, in which they were 
> authored, to DocBook.
> I have been looking at several tools to help in this process, and found some 
> very good information here in the archives.
> 
> One method which seems very promising is docbook-xsl/roundtrip. 
> The discussion of it was from a few years ago, so I am thinking that 
> some of the style sheets may have changed in the docbook-xsl-1.75.2 distro 
> that I have. The suggested conversions were:
>  
>  wordml-normalise.xsl, wordml-sections.xsl, wordml-blocks.xsl, 
> wordml-final.xsl
>  
> none of which I found in 1.75.2.
> Instead I have XSL files such as:
> normalise-common.xsl, normalise2sections.xsl, sections2blocks.xsl, and 
> blocks2dbk.xsl
>  
> It seems to me that this is just the logical evolution of the same xsl style 
> sheets referenced in the archives from years ago. Does anyone know if this is 
> indeed the case?
>  
> Further, there has been little to no discussion of converting Microsoft Word 
> to DocBook, or apparently even any new tools for it, for quite a while, 
> corresponding roughly to the time when Microsoft Word started implementing 
> XML, or w:xml as I like to call it. It is still very ugly XML, and even though 
> the new docx format is apparently valid XML, it is still cumbersome to work 
> with, at least in my opinion. 
> Are there any newer tools designed primarily to work with the latest 
> incarnation of w:xml or any techniques that could help the effort to get 
> these docs into DocBook?
> I greatly appreciate any response!
>  
> Thanks,
> /GregP    
