PDF Indexing Issue

Don Vaillancourt Mon, 28 Jun 2004 13:59:42 -0700

I used the following code example from an article that I linked off of jakarta's site to index PDF files:

doc.add(Field.Text("content", new FileReader(f)));

But I realized today that this method only indexes the PDF as is. For those wondering if the the PDF were actually indexed or if maybe they only contained images, well I verified this with LUKE and those PDFs are in there, but the only keywords that were indexed were the PDF defintion statements and encoded stuff.

So what is the proper way to index a PDF?

Thank You


Don Vaillancourt
Director of Software Development

WEB IMPACT INC.
416-815-2000 ext. 245
email: [EMAIL PROTECTED]
web: http://www.web-impact.com


This email message is intended only for the addressee(s)
and contains information that may be confidential and/or
copyright.  If you are not the intended recipient please
notify the sender by reply email and immediately delete
this email. Use, disclosure or reproduction of this email
by anyone other than the intended recipient(s) is strictly
prohibited. No representation is made that this email or
any attachments are free of viruses. Virus scanning is
recommended and is the responsibility of the recipient.

PDF Indexing Issue

Reply via email to