Hi Zachary,

On Fri, 2010-09-24 at 15:39 +0930, Zachary Mitchell wrote:
> It looks like I was looking at the wrong HWPFDocument byte [] after all.
> 
> I have a demo HWPFDocument file,
> which is read from a Word file that has two gif
> images inserted and embedded.
> 
> I have been told that the bytes for the images
> inside the doc file, the HWPFDocument file
> I am programming with, starts at 
> Character point 0x01
> byte 01.

It seems you misunderstood me about that 0x01 character. In the Document
stream of a Word file you can find all the text (starting from
fib.fcMin). In this text data, the pictures are marked by a 0x01
character.

The trick is that not every 0x01 is necessarily a picture: the character
also needs the fcPicLoc SPRM (see the Table stream and the CHP plex). The
fcPicLoc then provides the offset to the beginning of the PICF structure
in the data stream.
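
To make that rule concrete, here is a minimal sketch in plain Java. It is
a simplified model and not the HWPF API: the Run class and the
findPictureOffsets method are hypothetical names, and it assumes the
character runs and their fcPicLoc values have already been read from the
Table stream. It only shows the decision "0x01 plus fcPicLoc means
picture".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of the 0x01 / fcPicLoc rule; not the POI HWPF API. */
final class PictureMarkerScan {

    /** Hypothetical stand-in for a character run and its CHP properties. */
    static final class Run {
        final int start;        // index of the first character covered by the run
        final int end;          // one past the last character covered by the run
        final Integer fcPicLoc; // offset of the PICF in the data stream, null if the SPRM is absent

        Run(int start, int end, Integer fcPicLoc) {
            this.start = start;
            this.end = end;
            this.fcPicLoc = fcPicLoc;
        }
    }

    /** Returns the data stream offset for every run that holds a 0x01 and carries fcPicLoc. */
    static List<Integer> findPictureOffsets(String text, List<Run> runs) {
        List<Integer> offsets = new ArrayList<>();
        for (Run run : runs) {
            if (run.fcPicLoc == null) {
                continue; // a 0x01 without fcPicLoc is not a picture
            }
            int from = Math.max(run.start, 0);
            int to = Math.min(run.end, text.length());
            for (int i = from; i < to; i++) {
                if (text.charAt(i) == 0x01) {
                    offsets.add(run.fcPicLoc);
                    break; // assumption of this model: at most one picture per run
                }
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Two 0x01 characters, but only the first one has an fcPicLoc.
        String text = "A\u0001B\u0001C";
        List<Run> runs = Arrays.asList(
                new Run(0, 1, null),
                new Run(1, 2, 0x200),
                new Run(2, 3, null),
                new Run(3, 4, null),
                new Run(4, 5, null));
        System.out.println(findPictureOffsets(text, runs));
    }
}
```

Each returned offset is where a PICF structure would start in the data
stream; reading the PICF and the image bytes behind it is left out here.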

I am too lazy to write a complete extraction snippet, but it shouldn't be
too hard to get all the pictures from a doc file using HWPFDocument...

Regards,

-- 
Cédric Bosdonnat
Go-oo hacker
http://go-oo.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr

