--- karl wettin <[EMAIL PROTECTED]> wrote:
> I just stumbled over iText as I'm looking for a way to
> extract the text elements of a PDF for storage in a
> text index (Apache Lucene).
> 

iText is many things, but text extraction is not its
focus.

> My goal is a subclassed PdfReader with a convenience
> method called "enumerateTextElements",
> "enumerateElements" or so.
> 

This is somewhat possible.  The list contains several
instances of Paulo explaining in general how to do it,
and also several of Leonard explaining why you'll
never be sure you got ALL the content.  ;)  Ultimately
the answer you're going to get is "use PdfBox, JPEDAL
or Multivalent."

-Matt



_______________________________________________
iText-questions mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/itext-questions