I thought Verity could index the text in PDFs automatically! It did with
previous versions of Acrobat. You can create a Verity index that covers
your database AND any other files (PDFs) you want. Try it and see.
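
Something along these lines is what I have in mind. Treat it as a rough sketch, not tested code: the collection name "docs", the datasource "myDSN", the "records" table and its columns, the d:\uploads\pdfs folder and the example.com URL are all made up for illustration, and it assumes the collection already exists (created in the CF Administrator or with CFCOLLECTION) and that your Verity install has a working PDF filter.

```cfml
<!--- Sketch only. Every name below is hypothetical: collection "docs",
      datasource "myDSN", table "records", folder d:\uploads\pdfs. --->

<!--- 1. Pull the database rows you want searchable. --->
<cfquery name="getRecords" datasource="myDSN">
    SELECT record_id, title, description
    FROM records
</cfquery>

<!--- 2. Index the query results (TYPE="custom" indexes query columns). --->
<cfindex action="update"
         collection="docs"
         type="custom"
         query="getRecords"
         key="record_id"
         title="title"
         body="title,description">

<!--- 3. Index the PDF files themselves (TYPE="path" walks a directory;
      KEY is the directory to index). --->
<cfindex action="update"
         collection="docs"
         type="path"
         key="d:\uploads\pdfs"
         extensions=".pdf"
         recurse="no"
         urlpath="http://www.example.com/pdfs/">

<!--- 4. One search now covers both the rows and the PDF text. --->
<cfsearch name="results" collection="docs" criteria="invoice">

<cfoutput query="results">
    #results.key# - #results.title# (score: #results.score#)<br>
</cfoutput>
```

If mixing rows and files in one collection gets awkward, you could keep two collections and pass both names to CFSEARCH as a comma-separated list in the COLLECTION attribute.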
-----Original Message-----
From: Dennis Powers [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 13, 2001 9:00 AM
To: CF-Talk
Subject: Extracting Text from PDF Documents with CF
Hi,
I am wondering if anyone has a method of extracting the raw (unformatted)
text from a PDF file using CF? I have a project where we need to index PDF
files AND associated information in a database. We are currently using
Verity for searching the database with great success but now we need to
index the PDF files that are associated with the data records in the
database.
When a user uploads a new PDF I would like to extract the text from it and
add it to the database with the other information. Then I can use Verity to
search all the data fields AND the PDF text data field.
A CFX or a COM object would be nice so that I can call it from CF. I would
be very appreciative if anyone can steer me to a tag or object that can
accomplish this task.
Best Regards,
Dennis Powers
UXB Internet
(203) 879-2844
http://www.uxbinfo.com/