---------- Forwarded message ----------
From: Brian Thornton <[email protected]>
Date: Wed, Jun 5, 2013 at 11:56 PM
Subject: How Would you do this
To: [email protected]


I have a government client that is unable to natively export data out
of a secured system. Instead, the ONLY option is to print, scan, and
then email the result as a PDF.

The end result is a flat PDF of, say, 40 pages containing scanned
images of Courier text.

CFPDF extracttext does not get the content, and a handoff to a free
OCR tool has a very poor recognition ratio.

So I turn to Google and ask myself: how does Google turn 300 million
PDFs of mostly images into text for searching and indexing?

Turns out, the answer is a project called ocropus, found here:
https://code.google.com/p/ocropus/

I am wondering what the deployment model would be to include this
library. Would this be a cfexecute job, or would it be possible to
connect at the Java level?

Any ideas?
BT


-- 
Brian Thornton
(260) 267-6520

Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:355873