Hi Ricardo,

On Thu, 2011-01-20 at 12:21 +0000, Ricardo Quintas wrote:
> I'm trying to read the table of contents from doc and docx files. I can
> extract all text from the documents,
> but can't find a way to read the table of contents of the document, or at
> least find the paragraphs with 'headings' style.
> Is there any way to achieve this?

You will need to get the SPRMs of the text runs using [0] and then
check for the SPRM indicating the outline level. The SPRM to look at is
sprmPOutLvl (0x2640). Its "P" prefix marks it as a paragraph property,
so it may sit with the paragraph's SPRMs rather than the runs'. It may
also be present in the style definition (not sure). You may need to
have a look at the doc specs [1].
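
To make the SPRM check a bit more concrete, here is a minimal sketch in
plain Java that scans a raw SPRM byte array for sprmPOutLvl. It does not
call POI itself: the class and method names are made up for illustration,
and it assumes you already have the bytes (for instance from the buffer
returned by getSprmBuf()) and that the array starts at the first SPRM. A
paragraph's property bytes may be preceded by a 2-byte style index that
would have to be skipped first. The operand sizes follow the general SPRM
encoding in the doc specs; a few special-cased SPRMs are not handled.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of POI: walks a list of SPRMs looking
// for the outline level.
final class OutlineLevelScan {

    // Opcode of sprmPOutLvl, as named in the reply above.
    static final int SPRM_P_OUT_LVL = 0x2640;

    // Operand size in bytes, derived from the top three bits of the opcode.
    // Simplification: the few SPRMs with irregular variable-length operands
    // (some table and tab SPRMs) are treated like ordinary ones.
    static int operandSize(int opcode, byte[] grpprl, int operandStart) {
        int spra = (opcode >> 13) & 0x7;
        switch (spra) {
            case 0:
            case 1:
                return 1;
            case 2:
            case 4:
            case 5:
                return 2;
            case 3:
                return 4;
            case 7:
                return 3;
            default:
                // spra == 6: the first operand byte holds the remaining length.
                return 1 + (grpprl[operandStart] & 0xFF);
        }
    }

    // Returns the operand of the first sprmPOutLvl found, or empty if the
    // array contains none.
    static OptionalInt findOutlineLevel(byte[] grpprl) {
        int pos = 0;
        while (pos + 2 <= grpprl.length) {
            // Opcodes are stored as 16-bit little-endian values.
            int opcode = (grpprl[pos] & 0xFF) | ((grpprl[pos + 1] & 0xFF) << 8);
            int operandStart = pos + 2;
            if (operandStart >= grpprl.length) {
                break; // truncated data: opcode without an operand
            }
            if (opcode == SPRM_P_OUT_LVL) {
                return OptionalInt.of(grpprl[operandStart] & 0xFF);
            }
            pos = operandStart + operandSize(opcode, grpprl, operandStart);
        }
        return OptionalInt.empty();
    }
}
```

Calling OutlineLevelScan.findOutlineLevel(bytes) on each paragraph's SPRM
bytes gives you a candidate heading level. As far as I recall from the
specs, values 0 to 8 correspond to heading levels 1 to 9 and 9 means body
text, but please verify that against [1]. An empty result means the
paragraph carries no explicit outline level, in which case the style
definition would be the next place to look.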

[0]http://poi.apache.org/apidocs/org/apache/poi/hwpf/model/CHPX.html#getSprmBuf()
[1]http://msdn.microsoft.com/en-us/library/cc313153%28v=office.12%29.aspx

Regards,
-- 
Cédric Bosdonnat
LibreOffice hacker
http://documentfoundation.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr

