I am not certain that you can do this without some 'extra' processing.

If I am correct, then the location of the image will be determined by its
anchor. To get at the anchor, you need to work with the information in the
Escher layer. Looking at the javadoc for the PicturesTable class, it seems
that the EscherRecordHolder is passed to the PicturesTable when it is created
and it ought to be possible to use this class instance to get at the
information you require. However, the PicturesTable class does not expose
this object, so I think you will have to do some more work. Here is the
introduction that Dimitry wrote for the PicturesTable class:

"Holds information about all pictures embedded in Word Document either via
"Insert -> Picture -> From File" or via clipboard. Responsible for images
extraction and determining whether some document's piece contains embedded
image. Analyzes raw data bytestream 'Data' (where Word stores all embedded
objects) provided by HWPFDocument. Word stores images as is within so called
"Data stream" - the stream within a Word docfile containing various data
that hang off of characters in the main stream. For example, binary data
describing in-line pictures and/or formfields an also embedded
objects-native data. Word picture structures are concatenated one after the
other in the data stream if the document contains pictures. Data stream is
easily reachable via HWPFDocument._dataStream property. A picture is
represented in the document text stream as a special character, an Unicode
\u0001 whose CharacterRun.isSpecial() returns true. The file location of the
picture in the Word binary file is accessed via CharacterRun.getPicOffset().
The CharacterRun.getPicOffset() is a byte offset into the data stream.
Beginning at the position recorded in picOffset, a header data structure,
will be stored."

It seems that you may have to go back and process the data stream further to
get at this information. Alternatively, you could try patching the API and,
if it works, offering the code as a patch so that others could benefit.
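
To give an idea of what processing the data stream further might look like,
here is a minimal sketch that reads the first two fields of the header found
at a picture offset. It uses only the JDK and not POI itself. The byte array
stands in for the data stream and the int for the value returned by
CharacterRun.getPicOffset(). The field names (lcb, cbHeader), their sizes and
the little-endian byte order are my assumptions, taken from Microsoft's
published description of the Word binary format rather than from the POI
source, and the class and method names are made up for illustration, so treat
it as a starting point and not as working extraction code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class PicHeaderSketch {

    /** The first two fields of the picture header (hypothetical holder type). */
    static final class PicHeader {
        final int lcb;      // assumed: 4-byte size of the header plus picture data
        final int cbHeader; // assumed: 2-byte size of the header on its own

        PicHeader(int lcb, int cbHeader) {
            this.lcb = lcb;
            this.cbHeader = cbHeader;
        }
    }

    /**
     * Reads lcb and cbHeader starting at picOffset within the data stream.
     * Both values are assumed to be stored little-endian.
     */
    static PicHeader readHeader(byte[] dataStream, int picOffset) {
        // The two fields occupy 6 bytes; refuse offsets that would run off the end.
        if (picOffset < 0 || picOffset > dataStream.length - 6) {
            throw new IllegalArgumentException("picOffset out of range: " + picOffset);
        }
        ByteBuffer buf = ByteBuffer.wrap(dataStream).order(ByteOrder.LITTLE_ENDIAN);
        int lcb = buf.getInt(picOffset);
        int cbHeader = buf.getShort(picOffset + 4) & 0xFFFF; // treat as unsigned
        return new PicHeader(lcb, cbHeader);
    }

    public static void main(String[] args) {
        // Build a fake data stream with one header starting at byte 4.
        byte[] data = new byte[16];
        ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(4, 100)
                .putShort(8, (short) 0x44);

        PicHeader header = readHeader(data, 4);
        System.out.println("lcb=" + header.lcb + " cbHeader=" + header.cbHeader);
    }
}
```

Whatever positioning information exists would have to be dug out of the
remaining bytes of that header or out of the Escher records, and I have not
checked which of the two holds it.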

Another thought did occur to me as well. Is each picture also represented
within the document as a CharacterRun object? If so, then there are methods
available to get at the vertical and horizontal offsets of the CharacterRun.
That may be a quick way to find out how far the picture is set in from the
left and top, though the next question is: the left and top of what?
Still, it might be worth a quick test.

Yours

Mark B


shyamala wrote:
> 
> I am trying to convert an MS Word document to HTML.
> 
> I have processed the paragraph + characterrun and pictureTable to extract
> and save the images.
> 
> I am unable to find a way to get the alignment of the picture. What is the
> best way to get the alignment of the image in the document?
> 
> regards,
> Shyamala
> 
>          
> 
> 

-- 
View this message in context: 
http://www.nabble.com/image-alignment-from-HWPFDocument-tp24385476p24386624.html
Sent from the POI - User mailing list archive at Nabble.com.

