Bill Tschumy writes:
> I've looked at pdfBox, but the jar file is so big that I
> hate to burden my users by incorporating it.

Bill,

My system (see http://www.parc.com/janssen/pubs/TR-03-16.pdf) uses pdftotext underneath, and I've been very satisfied with it. Another Java solution would be to use Multivalent (multivalent.sourceforge.net). Multivalent, by the way, advertises the following:

"Extract text from all formats. Full-text search with Lucene."

Bill
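
For anyone wondering what "uses pdftotext underneath" could look like from Java, here is a minimal sketch. It is not the code from the system described above. It assumes the pdftotext command-line tool is installed and on the PATH, and that passing "-" as the output file makes it write the text to standard output. The class and method names (PdfToTextSketch, buildCommand, extractText) are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

public class PdfToTextSketch {

    // Builds the command line: pdftotext -enc UTF-8 <pdf> -
    // The trailing "-" asks pdftotext to write the text to stdout.
    static List<String> buildCommand(Path pdf) {
        return List.of("pdftotext", "-enc", "UTF-8", pdf.toString(), "-");
    }

    // Runs pdftotext as an external process and returns the extracted text.
    // Assumption: the pdftotext binary is available on the PATH.
    static String extractText(Path pdf) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pdf))
                // Discard stderr so a chatty process cannot block on a full pipe.
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("pdftotext exited with status " + exitCode);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java PdfToTextSketch <file.pdf>");
            return;
        }
        System.out.print(extractText(Path.of(args[0])));
    }
}
```

The returned string could then be handed to whatever indexes the text, for example as the contents field of a Lucene document. The trade-off against a pure-Java library is that nothing large is added to the jar, but users need the external binary installed.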