On Thu, 2 Sep 2010, Viliam Ďurina wrote:
I'm trying to parse bookmarks and cross-references from Word documents. I have parsed the sttbBkmk and plcbkf tables, from which I get the starting character position that the bookmark points at.

You might want to double-check the Word specs and make sure you're handling each offset in the correct unit, bytes or characters. It's quite messy which one is used where, and you'll want to be firmly in the character domain before you go looking for text.
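
As a rough illustration of the bytes-versus-characters point, here is a minimal sketch. It assumes you already know whether the text piece you are in is stored as 16-bit Unicode (two bytes per character) or as 8-bit compressed text (one byte per character). The class and method names are made up for this example and are not part of POI.

```java
public final class OffsetUnits {
    private OffsetUnits() {}

    /**
     * Converts a byte offset measured from the start of one text piece
     * into a character offset within that same piece.
     *
     * Assumption: pieceIsUnicode is true for 16-bit text (2 bytes per
     * character) and false for 8-bit compressed text (1 byte per character).
     */
    public static int bytesToChars(int byteOffsetInPiece, boolean pieceIsUnicode) {
        if (byteOffsetInPiece < 0) {
            throw new IllegalArgumentException("offset must not be negative");
        }
        return pieceIsUnicode ? byteOffsetInPiece / 2 : byteOffsetInPiece;
    }
}
```

If a position that looks right lands in the middle of the wrong text, an offset that was halved (or not halved) once too often is a likely suspect.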

But I fail to find the text at that position. Is there a way for a particular CharacterRun to tell which character position it starts at?

Yup, if you look at:
http://poi.apache.org/apidocs/org/apache/poi/hwpf/usermodel/CharacterRun.html
then follow it back to the parent Range object:
http://poi.apache.org/apidocs/org/apache/poi/hwpf/usermodel/Range.html
you have things like getStartOffset() and getEndOffset(), plus the details on where the sections and paragraphs start.
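
To show how those offsets could be used, here is a small self-contained sketch that finds which run contains a given bookmark position. The Span interface is a hypothetical stand-in that only mirrors the getStartOffset() and getEndOffset() accessors. It also assumes the end offset is exclusive and that the bookmark position is in the same unit as the run offsets, so check both against your POI version.

```java
import java.util.List;

public final class RunLocator {
    private RunLocator() {}

    /** Hypothetical stand-in for the two offset accessors on a Range. */
    public interface Span {
        int getStartOffset();
        int getEndOffset();
    }

    /** Convenience factory for a fixed span, used for illustration only. */
    public static Span span(final int start, final int end) {
        return new Span() {
            @Override public int getStartOffset() { return start; }
            @Override public int getEndOffset() { return end; }
        };
    }

    /**
     * Returns the index of the first span whose [start, end) interval
     * contains the given position, or -1 if no span contains it.
     */
    public static int indexOfSpanContaining(List<? extends Span> spans, int position) {
        for (int i = 0; i < spans.size(); i++) {
            Span s = spans.get(i);
            if (position >= s.getStartOffset() && position < s.getEndOffset()) {
                return i;
            }
        }
        return -1;
    }
}
```

With HWPF you would apply the same comparison to each range.getCharacterRun(i) for i below range.numCharacterRuns(), then read the text of the matching run.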

Nick