Hi Ricardo,

On Thu, 2011-01-20 at 12:21 +0000, Ricardo Quintas wrote:
> I'm trying to read the table of contents from doc and docx files. I can
> extract all text from the documents,
> but can't find a way to read the table of contents of the document, or at
> least find the paragraphs with 'headings' style.
> Is there any way to achieve this?

You will need to get the SPRMs of the text runs using [0] and then check
for the SPRM indicating the outline level. The SPRM to look at is
sprmPOutLvl (0x2640). Note that the "P" marks it as a paragraph-level
property, so it is likely stored with the paragraph properties (PAPX)
rather than with the character runs (CHPX). This property may also be
present in the style definition (not sure). You may need to have a look
at the doc specs [1].

[0] http://poi.apache.org/apidocs/org/apache/poi/hwpf/model/CHPX.html#getSprmBuf()
[1] http://msdn.microsoft.com/en-us/library/cc313153%28v=office.12%29.aspx

Regards,

-- 
Cédric Bosdonnat
LibreOffice hacker
http://documentfoundation.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr
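To make the SPRM check above more concrete, here is a minimal sketch in plain Java (no POI dependency) of how a raw SPRM byte buffer could be scanned for sprmPOutLvl. The class and method names (OutlineLevelSketch, findOutlineLevel, and so on) are made up for illustration and are not POI API. The sketch assumes the byte array starts directly at the first SPRM; a buffer taken from a real file may carry a prefix (such as a style index) that would have to be skipped first. It also assumes the usual opcode layout from the doc specs: a 16-bit little-endian opcode whose top three bits give the operand size, and it ignores the few variable-length SPRMs that use a non-standard length encoding.

```java
import java.util.OptionalInt;

/** Sketch only: scans a raw SPRM buffer for sprmPOutLvl (0x2640). Not POI API. */
class OutlineLevelSketch {
    static final int SPRM_P_OUT_LVL = 0x2640;

    /** Operand size in bytes, from the top 3 bits of the opcode; -1 means variable length. */
    static int operandSize(int opcode) {
        switch ((opcode >> 13) & 0x7) {
            case 0: case 1: return 1;
            case 2: case 4: case 5: return 2;
            case 3: return 4;
            case 7: return 3;
            default: return -1; // value 6: the first operand byte holds the length
        }
    }

    /** Bits 10..12 of the opcode give the property group; 1 means paragraph. */
    static boolean isParagraphSprm(int opcode) {
        return ((opcode >> 10) & 0x7) == 1;
    }

    /** Returns the one-byte operand of sprmPOutLvl if it occurs in the buffer. */
    static OptionalInt findOutlineLevel(byte[] sprms) {
        int pos = 0;
        while (pos + 2 <= sprms.length) {
            // Opcodes are stored little-endian.
            int opcode = (sprms[pos] & 0xFF) | ((sprms[pos + 1] & 0xFF) << 8);
            pos += 2;
            int size = operandSize(opcode);
            if (size < 0) {
                if (pos >= sprms.length) break;
                size = 1 + (sprms[pos] & 0xFF); // length byte plus payload
            }
            if (pos + size > sprms.length) break; // truncated buffer, stop scanning
            if (opcode == SPRM_P_OUT_LVL) {
                return OptionalInt.of(sprms[pos] & 0xFF);
            }
            pos += size;
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Hand-built sample: a one-byte paragraph SPRM (0x2403, operand 1)
        // followed by sprmPOutLvl (0x2640, operand 2).
        byte[] sample = {0x03, 0x24, 0x01, 0x40, 0x26, 0x02};
        OptionalInt level = findOutlineLevel(sample);
        System.out.println(level.isPresent()
                ? "outline level operand: " + level.getAsInt()
                : "no sprmPOutLvl found");
    }
}
```

According to the doc specs, operand values 0 to 8 correspond to outline levels 1 to 9 and 9 means body text, so a paragraph whose operand is below 9 would be a heading candidate. If the SPRM is absent from the paragraph itself, the value may still come from the style, as mentioned above.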
