PDFBox certainly does what I need, in a JAR.  Thanks Matt.

I'm stuck, though; if anyone can help, it would be much appreciated!

We're moving to a shared ColdFusion MX 6.1 server.  The host can't/won't
put a JAR in WEB-INF for me (because it's shared).

Is there a way to load a bunch of JARs from a directory somewhere under
my docroot, and call a method in them from CFM?
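
A minimal sketch of the kind of thing being asked about, assuming the host
allows Java objects to be created at all: a java.net.URLClassLoader pointed
at every .jar in a directory. The class name JarDirLoader and its method
names are hypothetical, and the code is written in current Java rather than
the older JVM that CFMX 6.1 ships with. From CFM the same JDK classes would
presumably be reached through createObject("java", ...), but that is an
assumption and has not been tried on a shared host.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds a class loader over all JARs in one directory.
public class JarDirLoader {

    // Collect a URL for every .jar file directly inside dir (no recursion).
    public static URL[] jarUrls(File dir) throws MalformedURLException {
        List<URL> urls = new ArrayList<>();
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.isFile() && f.getName().toLowerCase().endsWith(".jar")) {
                    urls.add(f.toURI().toURL());
                }
            }
        }
        return urls.toArray(new URL[0]);
    }

    // Create a loader that checks its parent first, then the JARs in dir.
    public static URLClassLoader loaderFor(File dir) throws MalformedURLException {
        return new URLClassLoader(jarUrls(dir), JarDirLoader.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: JarDirLoader <jar-directory> <class-name>");
            return;
        }
        URLClassLoader loader = loaderFor(new File(args[0]));
        // Load the named class; methods can then be called via reflection.
        Class<?> cls = loader.loadClass(args[1]);
        System.out.println("Loaded " + cls.getName());
    }
}
```

A class loaded this way is not visible to createObject("java", "some.Class")
by name, so calls would have to go through the loader (loadClass, then
newInstance or reflection).
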
  -----Original Message-----
  From: Matt Liotta [mailto:[EMAIL PROTECTED]
  Sent: Friday, December 19, 2003 8:30 PM
  To: CF-Talk
  Subject: Re: Extract text from PDF?

  PDF Box (http://www.pdfbox.org/) provides a Java API for extracting
  text from a PDF.

  Matt Liotta
  Montara Software, Inc.
  http://www.MontaraSoftware.com

  On Dec 19, 2003, at 8:13 PM, Michael R. Levy wrote:

  > Does anyone know of a CFX to extract the text from a PDF file so that
  > it can be used in a variable?
  >
  > I'll tell you why I want to do this: I have an application that allows
  > a user to enter several kinds of data about a PDF file while uploading
  > the file to the server.
  >
  > The search function opens a page with various data about the PDF file
  > and also opens the PDF file itself.
  >
  > Currently the application uses <cfexecute> and pdftotext.exe, and adds
  > the text that is thus extracted from the PDF file into the Verity index
  > along with all the user-added data.
  >
  > This way, the extracted PDF string data is associated directly with the
  > user-added data and can be used the way I want it.
  >
  > However, we're moving the whole application to a hosted, shared
  > server, and cannot run <cfexecute> there.
  >
  > I know Verity can index PDF files directly, but then that would be
  > separate from the other user-entered data about the PDF file.
  >
  > Any ideas?
  >
  > Thanks,
  > Michael
  >