Both: it could be .doc or .docx. I was able to extract images from the .doc format using HWPF, but I am unable to do the same for .docx. I couldn't find a class supporting .docx in the latest scratch*.jar.
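For reference, a .docx file is a ZIP package, and Word normally stores embedded images under the word/media/ folder inside it. The sketch below relies only on that layout and on java.util.zip, so it needs no extra jar. The class name DocxImageExtractor and the method extractImages are made up for this illustration and are not part of POI. POI's own .docx support is, as far as I know, in the XWPF classes (XWPFDocument) shipped in the poi-ooxml jar rather than the scratchpad jar, which would be the more complete route.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not a POI class: reads images straight out of the .docx ZIP package.
public class DocxImageExtractor {

    // Assumption: embedded images live under this folder, as Word normally writes them.
    private static final String MEDIA_PREFIX = "word/media/";

    /** Returns image file name -> raw bytes, in the order the entries appear in the package. */
    public static Map<String, byte[]> extractImages(InputStream docx) throws IOException {
        Map<String, byte[]> images = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Skip everything that is not a file inside word/media/.
                if (entry.isDirectory() || !name.startsWith(MEDIA_PREFIX)) {
                    continue;
                }
                // Copy the current entry's bytes into memory.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                images.put(name.substring(MEDIA_PREFIX.length()), out.toByteArray());
            }
        }
        return images;
    }
}
```

Calling DocxImageExtractor.extractImages(new FileInputStream(path)) on a .docx file should give one map entry per embedded image. This only recovers the image bytes; it says nothing about which paragraph each image belongs to, so it does not by itself give the Image/Text pairing.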
Could you explain a little more? I didn't find the documentation helpful, so phrases like "text runs in them" are kind of confusing me.

Thanks,

On Wed, Dec 14, 2011 at 6:46 PM, Nick Burch <[email protected]> wrote:

> On Wed, 14 Dec 2011, alee amin wrote:
>
>> I have a document which is consist of images and text.
>
> Word Document? .doc or .docx?
>
>> the "expected" format of the document would be something
>>
>> Image
>> Text
>
> If this is word, then they'll usually be a character run with the image
> attached to it (though not in all cases...). You can just get the
> paragraphs around the image, and the text runs in them
>
> Nick

--
..alee
http://techboard.wordpress.com
