iText does not have any lexical analysis tools, so it does not know what a "word" is. It only sees the drawing instructions.
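To illustrate the kind of analysis that ends up in your own code, here is a minimal sketch in plain Java. It is not iText API: `Glyph`, `Word`, `groupWords`, `toXml`, `maxGap` and `lineTolerance` are all hypothetical names made up for this example. It assumes you have already collected one bounding box per character, in reading order (in iText 5 a `RenderListener` receiving `TextRenderInfo` objects is one possible source), and it uses a naive rule that only suits space-separated scripts: a word ends at whitespace, at a line change, or at a large horizontal gap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class WordGrouper {

    /** Hypothetical input: one character and its bounding box in page units. */
    static final class Glyph {
        final String text;
        final double x, y, width, height;
        Glyph(String text, double x, double y, double width, double height) {
            this.text = text; this.x = x; this.y = y;
            this.width = width; this.height = height;
        }
    }

    /** A run of glyphs treated as one word; the box is the union of the glyph boxes. */
    static final class Word {
        final StringBuilder text = new StringBuilder();
        double x, y, right, top;

        void add(Glyph g) {
            if (text.length() == 0) {
                x = g.x; y = g.y; right = g.x + g.width; top = g.y + g.height;
            } else {
                x = Math.min(x, g.x);
                y = Math.min(y, g.y);
                right = Math.max(right, g.x + g.width);
                top = Math.max(top, g.y + g.height);
            }
            text.append(g.text);
        }
        double width()  { return right - x; }
        double height() { return top - y; }
    }

    /**
     * Assumes glyphs arrive in reading order. A word is closed by a whitespace
     * glyph, by a vertical jump larger than lineTolerance, or by a horizontal
     * gap larger than maxGap (both thresholds are assumptions to tune per document).
     */
    static List<Word> groupWords(List<Glyph> glyphs, double maxGap, double lineTolerance) {
        List<Word> words = new ArrayList<>();
        Word current = null;
        Glyph prev = null;
        for (Glyph g : glyphs) {
            boolean isSpace = g.text.trim().isEmpty();
            boolean newLine = prev != null && Math.abs(g.y - prev.y) > lineTolerance;
            boolean bigGap = prev != null && g.x - (prev.x + prev.width) > maxGap;
            if (isSpace || newLine || bigGap) {
                current = null; // close the open word, if any
            }
            if (!isSpace) {
                if (current == null) {
                    current = new Word();
                    words.add(current);
                }
                current.add(g);
            }
            prev = g;
        }
        return words;
    }

    /** Escapes the characters that are unsafe inside XML element content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Emits one <word> element per word; coordinates are rounded to whole units. */
    static String toXml(List<Word> words) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.size(); i++) {
            Word w = words.get(i);
            sb.append(String.format(Locale.ROOT,
                    "<word id=\"%d\" x=\"%.0f\" y=\"%.0f\" width=\"%.0f\" height=\"%.0f\">%s</word>\n",
                    i, w.x, w.y, w.width(), w.height(), escape(w.text.toString())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up glyph boxes for "The fox", not taken from a real PDF.
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("T", 0, 0, 3, 4));
        glyphs.add(new Glyph("h", 3, 0, 3, 4));
        glyphs.add(new Glyph("e", 6, 0, 2, 4));
        glyphs.add(new Glyph(" ", 8, 0, 4, 4));
        glyphs.add(new Glyph("f", 12, 0, 2, 4));
        glyphs.add(new Glyph("o", 14, 0, 3, 4));
        glyphs.add(new Glyph("x", 17, 0, 2, 4));
        System.out.print(toXml(groupWords(glyphs, 1.0, 1.0)));
    }
}
```

The gap test matters because many PDFs never draw a space character and simply move the text position instead. For scripts that do not separate words with spaces, this rule would have to be replaced by a language-aware segmenter.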
So you will need to obtain all of the text and coordinates for the page, then perform your own analysis to determine "words". Don't forget that the definition of a "word" differs across languages.

Leonard

On 7/15/12 8:46 PM, "Kalani Bright" <kapaa...@manastudios.com> wrote:

>Hi guys;
>
>Here's what I am trying to do; I would appreciate knowing if this is
>possible in iText.
>I'm not interested in constructing PDFs, only deconstructing existing
>PDFs for analysis of content and positions of words on the page.
>
>Rather than the boundary of all text on the page I want the boundary info
>for each word in order to generate some XML for another program I wrote.
>
>Something like this...
><word id="0" x="0" y="0" width="8" height="4">The</word>
><word id="1" x="12" y="0" width="7" height="4">fox</word>
><word id="2" x="22" y="0" width="7" height="4">was</word>
>
>I know I can do it for a region of text, as shown in the iText in Action
>book in Chapter 15; but I really do want it for each individual word so
>I can generate invisible yet clickable hotspots over what will end up
>being just a plain image.
>
>Is this possible to do with iText; how would I accomplish something like
>this?
>
>Thanks guys,
>
>kb
>
>------------------------------------------------------------------------------
>Live Security Virtual Conference
>Exclusive live event will cover all the ways today's security and
>threat landscape has changed and how IT managers can respond. Discussions
>will include endpoint security, mobile security and the latest in malware
>threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>_______________________________________________
>iText-questions mailing list
>iText-questions@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/itext-questions
>
>iText(R) is a registered trademark of 1T3XT BVBA.
>Many questions posted to this list can (and will) be answered with a
>reference to the iText book: http://www.itextpdf.com/book/
>Please check the keywords list before you ask for examples:
>http://itextpdf.com/themes/keywords.php