Hi Zachary,

On Fri, 2010-09-24 at 15:39 +0930, Zachary Mitchell wrote:
> It looks like I was looking at the wrong HWPFDocument byte [] after all.
>
> I have a demo HWPFDocument file,
> which is read from a Word file that has two gif
> images inserted and embedded.
>
> I have been told that the bytes for the images
> inside the doc file, the HWPFDocument file
> I am programming with, starts at
> Character point 0x01
> byte 01.

It seems you misunderstood what I meant by that 0x01 character. In the Document stream of a Word file you can find all the text (starting at fib.fcMin). In this text data, pictures are marked by a 0x01 character. The trick is that not every 0x01 is necessarily a picture: the character also needs to carry the fcPicLoc SPRM (see the Table stream and the CHP plex). The fcPicLoc then gives the offset of the beginning of the PICF structure in the Data stream.

I am too lazy to write a code snippet, but it shouldn't be too hard to get all the pictures from a doc file using HWPFDocument...

Regards,

--
Cédric Bosdonnat
Go-oo hacker
http://go-oo.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr
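
The rule described above (a 0x01 character only counts as a picture when it also carries an fcPicLoc) can be sketched roughly as follows. This is only an illustration and does not use the POI API: the class PictureMarkerScan, the method pictureOffsets and the map fcPicLocByCharPos are hypothetical names, and the map is assumed to have been filled beforehand from the CHP plex, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Apache POI.
final class PictureMarkerScan {

    // Character used in the document text to mark a possible picture.
    static final char PICTURE_MARKER = '\u0001';

    /**
     * Returns the fcPicLoc values (offsets of PICF structures in the Data
     * stream) for every 0x01 character that really is a picture.
     *
     * text              the document text, as read from fib.fcMin onward
     * fcPicLocByCharPos assumed input: character position -> fcPicLoc,
     *                   for the positions whose character properties carry
     *                   the fcPicLoc SPRM
     */
    static List<Integer> pictureOffsets(CharSequence text,
                                        Map<Integer, Integer> fcPicLocByCharPos) {
        List<Integer> offsets = new ArrayList<>();
        for (int cp = 0; cp < text.length(); cp++) {
            if (text.charAt(cp) != PICTURE_MARKER) {
                continue; // ordinary text, an fcPicLoc here means nothing
            }
            Integer fcPicLoc = fcPicLocByCharPos.get(cp);
            if (fcPicLoc != null) {
                offsets.add(fcPicLoc); // 0x01 plus fcPicLoc: a picture
            }
            // a 0x01 without fcPicLoc is not a picture and is skipped
        }
        return offsets;
    }
}
```

Each returned offset would then be the place to start reading a PICF structure in the Data stream. Calling pictureOffsets with a text containing three 0x01 characters, of which only two have an entry in the map, yields just those two offsets.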
