Re: [ocropus] OCR .pdf docs into a meaningful database to process further

Tom Morris Tue, 01 Mar 2011 10:21:36 -0800

On Sun, Feb 20, 2011 at 4:01 AM, Sean Brown <[email protected]> wrote:


> I am currently having trouble converting .pdf documents, all with
> exactly the same characteristics, that I subscribe to and receive
> daily into a meaningful database format (Excel/SQL), so that I can
> apply weighted rules to this database for further analysis, ratings,
> etc.
>
> Any help in this regard will be appreciated.

There are two options that you should consider before resorting to OCR:

1. Getting your publisher/correspondent to send you the data in a
friendlier format such as CSV or Excel.

2. Using a PDF-to-text extractor to pull the text out of the PDF document.
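
For option 2, here is a rough sketch of what that could look like in
Python. It assumes the PDFs contain a real text layer (not scanned
images) and that poppler's pdftotext command-line tool is installed.
The function names and the "two or more spaces separate columns" rule
are my own assumptions for illustration; the actual layout of your
documents will decide how the lines need to be split.

```python
import csv
import io
import re
import subprocess


def extract_text(pdf_path):
    """Return the text of a PDF using poppler's pdftotext (assumed installed)."""
    # "-layout" tries to preserve column alignment; "-" writes to stdout.
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def text_to_rows(text):
    """Split each non-blank line into fields on runs of two or more spaces.

    This splitting rule is an assumption; adjust it to the real layout.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s{2,}", line))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, ready for Excel or a SQL import."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

Usage would be along the lines of
rows_to_csv(text_to_rows(extract_text("daily.pdf"))), where "daily.pdf"
is a placeholder file name. Since the documents all share the same
characteristics, a splitting rule that works for one should keep
working for the rest.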

Tom

