On 15/12/11 15:26, alee amin wrote:
I was able to extract images from the .doc format using HWPF, but I am unable to
do the same for .docx. I couldn't find the class supporting .docx in the latest scratch*.jar

Correct. The code for the OOXML formats is in a different jar, and it has a few extra dependencies. You'll want the poi-ooxml jar, plus the dependencies listed for it on http://poi.apache.org/overview.html

At the moment, no-one has been inspired enough to take the many hours needed to work up the documentation for XWPF, nor to update the HWPF docs. If a volunteer were able to spend some time on it, though, that'd be wonderful! :)

In the meantime, your best bet is to look at the examples we ship, as well as the HWPF and XWPF parsers included in Apache Tika. The Tika parsers show how to extract images at the position where they occur in the text, so they should be useful to crib from if the examples don't help.
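
If only the raw image files are needed, and not their position in the text, here is a rough fallback sketch that does not use POI at all. It relies on the fact that a .docx is a ZIP package in which embedded images are conventionally stored as entries under word/media/. This is plain JDK code, not the XWPF API; the class name DocxMediaDump and the method extractMedia are made up for illustration, and the word/media/ prefix is an assumption about how the file was produced.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of POI: dumps the media entries of a .docx.
public class DocxMediaDump {

    /** Copies every file entry under word/media/ into outDir and returns the paths written. */
    public static List<Path> extractMedia(Path docx, Path outDir) throws IOException {
        List<Path> written = new ArrayList<>();
        Files.createDirectories(outDir);
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(docx))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumption: images live under word/media/ in the package.
                if (entry.isDirectory() || !name.startsWith("word/media/")) {
                    continue;
                }
                // Use only the last path segment instead of trusting the full entry path.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                if (fileName.isEmpty() || fileName.equals(".") || fileName.equals("..")) {
                    continue;
                }
                Path target = outDir.resolve(fileName);
                // Reads up to the end of the current entry; does not close the zip stream.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: DocxMediaDump <file.docx> <output-dir>");
            return;
        }
        for (Path p : extractMedia(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(p);
        }
    }
}
```

This only recovers the image bytes and their stored file names. To know where each image sits in the document, the XWPF classes in poi-ooxml or the Tika parsers mentioned above are still the way to go.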


Nick
