On Tue, Mar 20, 2012 at 6:40 AM, Lee <[email protected]> wrote:

> CF9 has CFPDF which can extract text, does OpenBD have anything similar?
> the cfdocument write up in the manual says 'no documentation available'.
>

OpenBD ships with PDFBox and iText, both of which can extract text from
PDFs. We just haven't implemented a CFPDF tag yet.

Bit of info on both solutions:
http://www.danielspangler.com/2009/01/pdf-text-extraction-in-java.html

If you want to use CFEXECUTE and write the text from the PDF out to a file
you can do this:
http://pdfbox.apache.org/commandlineutilities/ExtractText.html
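
A rough sketch of the CFEXECUTE route, not a tested recipe. It assumes
"java" is on the server's PATH and that you've downloaded the standalone
PDFBox app jar; the jar location and the PDF/text file names below are
made up for illustration. Paths containing spaces would need quoting.

```cfml
<!--- Hypothetical paths; adjust for your server. --->
<cfset pdfboxJar = "/opt/pdfbox/pdfbox-app.jar">
<cfset pdfPath = ExpandPath("./sample.pdf")>
<cfset txtPath = ExpandPath("./sample.txt")>

<!--- Run PDFBox's ExtractText utility: input PDF first, output text file second.
      The timeout makes the request wait for the process to finish. --->
<cfexecute name="java"
           arguments="-jar #pdfboxJar# ExtractText #pdfPath# #txtPath#"
           timeout="60"></cfexecute>

<!--- Read the extracted text back in if you need it in a variable. --->
<cffile action="read" file="#txtPath#" variable="pdfText">
```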

If you want to read the text from a PDF into a variable you can use
PDFTextStripper:
http://pdfbox.apache.org/userguide/text_extraction.html
http://pdfbox.apache.org/apidocs/org/apache/pdfbox/util/PDFTextStripper.html

With either PDFBox or iText, since both ship with OpenBD, you'd just use
CreateObject("java", ...) to create instances of the necessary PDFBox or
iText classes and go from there.
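
Here's a minimal sketch of the PDFBox version, assuming the PDFBox 1.x
package layout from the apidocs link above (PDFTextStripper lives in
org.apache.pdfbox.util there; later PDFBox releases moved it). The file
name is hypothetical, and I haven't run this against a particular OpenBD
build.

```cfml
<!--- Hypothetical file; point this at a real PDF. --->
<cfset pdfFile = CreateObject("java", "java.io.File").init(ExpandPath("./sample.pdf"))>

<!--- PDDocument.load() is static, so no init() call is needed on the class. --->
<cfset pdDocument = CreateObject("java", "org.apache.pdfbox.pdmodel.PDDocument").load(pdfFile)>

<cftry>
    <cfset stripper = CreateObject("java", "org.apache.pdfbox.util.PDFTextStripper").init()>
    <!--- getText() returns the text of the whole document as one string. --->
    <cfset pdfText = stripper.getText(pdDocument)>
    <cfcatch type="any">
        <!--- Release the file handle before passing the error along. --->
        <cfset pdDocument.close()>
        <cfrethrow>
    </cfcatch>
</cftry>
<cfset pdDocument.close()>

<cfoutput><pre>#HTMLEditFormat(pdfText)#</pre></cfoutput>
```

Going through java.io.File rather than handing load() a plain string
sidesteps any ambiguity between load()'s overloads.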

jPedal (which is what CF uses under the hood for a lot of its PDF
functionality) also does this. jPedal doesn't ship with OpenBD, but it's
available under the LGPL, so depending on what you're doing with it you
could use it for free:
http://www.jpedal.org/support_Extraction.php

Hope that helps. If you need a specific example of how to do this in OpenBD
I can put one together later today.

-- 
Matthew Woodward
[email protected]
http://blog.mattwoodward.com
identi.ca / Twitter: @mpwoodward

Please do not send me proprietary file formats such as Word, PowerPoint,
etc. as attachments.
http://www.gnu.org/philosophy/no-word-attachments.html

-- 
online documentation: http://openbd.org/manual/
   google+ hints/tips: https://plus.google.com/115990347459711259462
     http://groups.google.com/group/openbd?hl=en
