On Thu, 1 Aug 2013, amit bhandarkar wrote:
I'm currently trying to parse a .doc file using Apache POI. While parsing I'm able to fetch all the paragraphs, images, shapes and other members of the file, but I'm unable to find their absolute position in the document (position from the top of the current page).
Correct. Unlike formats such as PDF, the Word .doc format isn't a page-based layout format. You'd need to apply all the formatting rules to work out how big each object is, including images, tables etc. Given how much documents can re-flow when you open them on a different copy of Microsoft Office, you'll struggle...
That said, if people wanted to collaborate on some code to get close-ish, much as we already have for column widths in Excel (HSSF/XSSF), we'd be happy to host!
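As a very rough illustration of the "close-ish" idea, here is a minimal sketch in plain Java. It assumes you have already estimated a height in points for each block (paragraph, image, table) from its formatting, which is the hard part and is not shown. The class and method names (PageOffsetEstimator, estimate, Position) are hypothetical and are not part of Apache POI. The sketch only stacks the blocks down a page of fixed content height and starts a new page when a block doesn't fit; Word's real pagination rules (widow/orphan control, keep-with-next, floating shapes and so on) are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a POI class: stacks blocks of known height down a page.
class PageOffsetEstimator {

    // Estimated placement of one block: 0-based page index and offset
    // from the top of that page's content area, in points.
    static final class Position {
        final int page;
        final double offsetFromTop;

        Position(int page, double offsetFromTop) {
            this.page = page;
            this.offsetFromTop = offsetFromTop;
        }
    }

    // blockHeights: estimated height of each block in points (assumed input).
    // pageContentHeight: page height minus top and bottom margins, in points.
    static List<Position> estimate(double[] blockHeights, double pageContentHeight) {
        List<Position> positions = new ArrayList<>();
        int page = 0;
        double y = 0.0;
        for (double height : blockHeights) {
            // A block that doesn't fit in the remaining space moves to the
            // next page, unless it is already at the top of a page.
            if (y > 0 && y + height > pageContentHeight) {
                page++;
                y = 0.0;
            }
            positions.add(new Position(page, y));
            y += height;
            // A block taller than one page spills onto the following pages.
            while (y > pageContentHeight) {
                y -= pageContentHeight;
                page++;
            }
        }
        return positions;
    }
}
```

For example, with a content height of 648 points (US Letter with one-inch margins) and three blocks of 300 points each, the first two land on the first page and the third starts the second page. The quality of the result depends entirely on how good the per-block height estimates are.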
Nick
