On Thu, 2 Sep 2010, Viliam Ďurina wrote:
> I'm trying to parse bookmarks and cross-references from word documents.
> I have parsed the sttbBkmk and plcbkf tables, from which I get the
> starting character position, where the bookmark points at.
You might want to double-check the Word specs and make sure you know
whether each offset is given in bytes or in characters. It's quite messy
which one is used where, and you'll want to be sure you're in the
character domain before you go looking for the text.
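To illustrate the bytes-versus-characters point, here is a rough sketch of the arithmetic only. It assumes a single contiguous piece of text whose characters are all stored in either 1 or 2 bytes; a real document may be split into several pieces with different encodings, so treat this as an illustration, not as how POI does it. The class and parameter names (OffsetDomain, pieceStartByte, bytesPerChar) are made up for the example.

```java
// Hypothetical helper: not part of POI, names are illustrative only.
final class OffsetDomain {
    // Converts a byte offset into a character offset, assuming one
    // contiguous piece of text that starts at pieceStartByte and stores
    // every character in bytesPerChar bytes (1 or 2).
    static int toCharOffset(int byteOffset, int pieceStartByte, int bytesPerChar) {
        if (bytesPerChar != 1 && bytesPerChar != 2) {
            throw new IllegalArgumentException("bytesPerChar must be 1 or 2");
        }
        int delta = byteOffset - pieceStartByte;
        if (delta < 0 || delta % bytesPerChar != 0) {
            throw new IllegalArgumentException("offset is not on a character boundary");
        }
        return delta / bytesPerChar;
    }
}
```

So the same byte offset can mean two different character positions depending on how the text is stored, which is why it matters to know which domain a table's offsets are in.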
> But I fail to find the text at that position. Is there a way for a
> particular CharacterRun to tell, which character position it starts at?
Yup, if you look at:
http://poi.apache.org/apidocs/org/apache/poi/hwpf/usermodel/CharacterRun.html
then follow it back to the parent Range object:
http://poi.apache.org/apidocs/org/apache/poi/hwpf/usermodel/Range.html
you have things like getStartOffset() and getEndOffset(), plus the details
on where the sections and paragraphs start.
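Once you have the start and end offset of each run, finding the run that covers a bookmark position is a simple scan. Here is a minimal sketch, using a made-up Span class as a stand-in for the runs so it runs without POI on the classpath. With POI you would fill in the two numbers from getStartOffset() and getEndOffset() of each CharacterRun. I'm assuming here that the end offset is exclusive and that both values are character offsets; check that against your document.

```java
import java.util.List;

// Hypothetical stand-in for a run: only the two offsets matter here.
final class Span {
    final int start; // assumed inclusive, e.g. from getStartOffset()
    final int end;   // assumed exclusive, e.g. from getEndOffset()

    Span(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

final class RunLocator {
    // Returns the index of the first span containing charPos,
    // or -1 if no span covers that position.
    static int indexOfSpanAt(List<Span> spans, int charPos) {
        for (int i = 0; i < spans.size(); i++) {
            Span s = spans.get(i);
            if (charPos >= s.start && charPos < s.end) {
                return i;
            }
        }
        return -1;
    }
}
```

If the bookmark position you get from the tables never lands inside any run, that's a good hint that the offset is in the wrong domain.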
Nick