On Thu, Mar 09, 2006 at 06:50:09AM -0500, Leonard Rosenthol wrote:
>
> >So - is iText a good way to extract just the text of a page so that we
> >can use it to calculate the offsets?
>
> No.
>
> Look at PdfBox or Multivalent.

Thanks for the pointer. It seems the char offset method isn't too reliable: something that's 150 chars into the text file from PDFBox is 200 chars in according to the highlighter in Reader. But with word-based offsets (and a lot of guesswork as to what Acrobat Reader thinks is a word boundary), this looks like it might actually fly :)

--
Chris Searle [EMAIL PROTECTED]

_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions
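
For illustration, here is a minimal sketch of why a word-based offset can survive where a char offset drifts. It assumes a "word" is simply a run of non-whitespace characters, which is a guess and not necessarily what Acrobat Reader uses as a word boundary. The class and method names (WordOffsets, wordIndexAt) are hypothetical, and no PDFBox or iText calls are involved; the input is just text that has already been extracted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordOffsets {
    // Assumption: a word is a maximal run of non-whitespace characters.
    // The viewer's real word-boundary rule may differ.
    private static final Pattern WORD = Pattern.compile("\\S+");

    /**
     * Returns the zero-based index of the word that contains charOffset.
     * If the offset falls in whitespace, the index of the next word is
     * returned. Returns -1 for a negative offset or when no word is found.
     */
    public static int wordIndexAt(String text, int charOffset) {
        if (charOffset < 0) {
            return -1;
        }
        Matcher m = WORD.matcher(text);
        int index = 0;
        while (m.find()) {
            if (charOffset < m.end()) {
                return index;
            }
            index++;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same words, different spacing: the char offsets of "iText" differ,
        // but the word index is the same in both strings.
        String extracted = "So  is   iText a good way";
        String normalized = "So is iText a good way";
        System.out.println(wordIndexAt(extracted, extracted.indexOf("iText")));
        System.out.println(wordIndexAt(normalized, normalized.indexOf("iText")));
    }
}
```

Two extractions of the same page that differ only in whitespace give different char offsets for a hit but the same word index, which is roughly the effect described above. Differences in how words themselves are split (hyphens, punctuation) would still need the guesswork mentioned.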