----- Original Message -----
From: "Cédric Bosdonnat" <[email protected]>
To: "Zachary Mitchell" <[email protected]>
Sent: Wednesday, September 22, 2010 10:37 PM
Subject: Re: Unicode Byte terminator for PicturesTable and HWPFDocument.


Hi Zachary,

On Wed, 2010-09-22 at 21:40 +0930, Zachary Mitchell wrote:
I have learned the following -

"Word picture structures are concatenated one after the other in the data stream if the document contains pictures. The data stream is easily reachable via the HWPFDocument._dataStream property. A picture is represented in the document text stream as a special character, a Unicode \u0001 character whose CharacterRun.isSpecial() returns true. The file location of the picture in the Word binary file is accessed via CharacterRun.getPicOffset(), which is a byte offset into the data stream. A header data structure is stored beginning at the position recorded in picOffset."


If I wish to insert one or more pictures [org.apache.poi.hwpf.usermodel.Picture], taken from a java.io.File object for an available OS file,

(using

byte [] content = Picture.getRawContent();

)
-What is the Unicode character or byte sequence I am looking for in my HWPFDocument data stream, at which I place the picture data?

IIRC you just need to append the picture and its description structure
to the data stream. You'll end up with a succession of PICF structures.
(p181 in the PDF specs). Have a look at PICF.cbHeader as it gives the
length of the structure (from what I remember of it).
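To make the "succession of PICF structures" concrete, here is a minimal sketch in plain Java (no POI calls) that walks a byte array and reports where each record starts and ends. It assumes that each record begins with a 4-byte little-endian lcb field holding the total length of header plus picture data, followed by the 2-byte cbHeader field at offset 4; that layout is my reading of the spec and should be checked against it. The class and method names are hypothetical. In a real document the picOffset of each special character remains the authoritative start of a picture, since the data stream may hold other data as well.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Apache POI.
final class PicfWalker {

    /**
     * Returns {start, end} byte offsets (end exclusive) for each record,
     * assuming records are packed back to back from offset 0.
     */
    static List<int[]> recordBounds(byte[] dataStream) {
        ByteBuffer buf = ByteBuffer.wrap(dataStream).order(ByteOrder.LITTLE_ENDIAN);
        List<int[]> bounds = new ArrayList<>();
        int offset = 0;
        // Need at least lcb (4 bytes) + cbHeader (2 bytes) to read a record prefix.
        while (offset + 6 <= dataStream.length) {
            int lcb = buf.getInt(offset);                      // assumed: header + data length
            int cbHeader = buf.getShort(offset + 4) & 0xFFFF;  // assumed: header length
            // Stop on anything that cannot be a well-formed record.
            if (cbHeader < 6 || lcb < cbHeader || lcb > dataStream.length - offset) {
                break;
            }
            bounds.add(new int[] {offset, offset + lcb});
            offset += lcb; // the next record starts where this one ends
        }
        return bounds;
    }
}
```

Under these assumptions no terminator byte is needed: the end of one picture is simply its start offset plus lcb.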

-How do I terminate the new place in the Word HWPFDocument where Picture data in the stream ends?

-May I perform the same process, ad infinitum, for as many Picture(s) desired, in a HWPFDocument?

-How do I specify in the datastream what the image type is,

by sending the bytes of the java.io.File image object to the appropriate fields in the Picture Object
created from this stream

[Picture.BMP/Picture.GIF/Picture.JPG/Picture.WMF1/Picture.WMF2]

before calling getRawContent();

From what I saw in the PicturesTable source code, there is currently
nothing to write images to an HWPFDocument, but they can be read. You
could improve that and submit a patch if you wish.

If you want to directly write to the Data stream, then don't forget that
images are referred to by a 0x01 character in the text content and you need
to define some character properties like picOffset.
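As a rough illustration of writing directly to the data stream, the sketch below appends one record to an existing byte array using only the Java standard library. It is not an existing POI API; the class name, method name, and the 0x44 header size are assumptions to verify against the spec. All header fields other than lcb and cbHeader are left as zeros here, which a real Word file would need filled in, and pictureData stands for whatever payload follows the header, whose internal layout is outside this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Apache POI.
final class PicfAppender {

    // Assumed PICF header size; check against the spec.
    static final int CB_HEADER = 0x44;

    /**
     * Returns a new stream with one record appended. The record's offset
     * (the value to store as picOffset) is the old stream's length.
     */
    static byte[] append(byte[] dataStream, byte[] pictureData) {
        int lcb = CB_HEADER + pictureData.length; // header + payload
        ByteBuffer out = ByteBuffer.allocate(dataStream.length + lcb)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        out.put(dataStream);             // existing content, unchanged
        out.putInt(lcb);                 // total record length
        out.putShort((short) CB_HEADER); // header length
        // Remaining header fields are left zeroed in this sketch.
        out.position(dataStream.length + CB_HEADER);
        out.put(pictureData);
        return out.array();
    }
}
```

A caller would record `dataStream.length` before the call as the picOffset for the new 0x01 character's properties, then repeat the call once per additional picture.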

Regards,

--
Cédric Bosdonnat
Go-oo hacker
http://go-oo.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr



What I wish to know is this: in a written and completed HWPFDocument with embedded images, how do the bytes for each image start and end, for however many images there are
in HWPFDocument.getDataStream()?
I want to write (insert) image data into

HWPFDocument.getDataStream()

where the data has been taken from a java.io.File, placed into an org.apache.poi.hwpf.usermodel.Picture object (in the appropriate byte field for the image type), and finally retrieved from Picture.getRawContent().

How do I correctly alter HWPFDocument._dataStream
by inserting bytes to add one picture,
or as many pictures as one may want?
