---------- Forwarded message ----------
From: Brian Thornton <[email protected]>
Date: Wed, Jun 5, 2013 at 11:56 PM
Subject: How Would you do this
To: [email protected]
I have a government client that is unable to natively export data out of a secured system. Instead, the ONLY option is to print, scan, and then email the result as a PDF. The end result is a flat PDF of, say, 40 pages, each a scanned image of Courier text.

CFPDF extracttext does not get the content, and a handoff to a free OCR tool has a very poor recognition rate.

So I turned to Google and asked myself: how does Google turn 300 million PDFs of mostly images into text for searching and indexing? It turns out the answer is a project called ocropus, found here:

https://code.google.com/p/ocropus/

I am wondering what the deployment model would be to include this library. Would this be a cfexecute job, or would it be possible to connect at the Java level?

Any ideas?

BT

--
Brian Thornton
(260) 267-6520

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:355873

