OCR Form tools

2011-12-08 Thread Adam Vande More
I have thousands of forms equivalent to invoices that I'd like to put into a database. I'm thinking I would like to have some OCR app/tool scan these forms, and then generate a CSV with each field. Does anyone have recommendations on software for this? -- Adam Vande More
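
The "OCR text in, one CSV row per form out" step described above could look roughly like the sketch below. It assumes the OCR tool has already produced plain text for each form. The field labels (Invoice No, Date, Total) and the regular expressions are hypothetical stand-ins, since the thread does not say what the forms contain; real forms would need their own patterns.

```python
import csv
import io
import re

# Hypothetical field labels and patterns; adjust them to the real forms.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*(?:Number|No)\.?\s*[:#]?\s*(\S+)", re.IGNORECASE),
    "date": re.compile(r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    "total": re.compile(r"Total\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(ocr_text):
    """Return one dict per form; fields that are not found become empty strings."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        row[name] = match.group(1) if match else ""
    return row


def rows_to_csv(rows):
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATTERNS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = "ACME Corp\nInvoice No: 10042\nDate: 2011-12-08\nTotal: $1,234.50\n"
    print(rows_to_csv([extract_fields(sample)]), end="")
```

Empty fields are kept rather than dropped so that forms the OCR misread are easy to find in the resulting CSV.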

Re: OCR Form tools

2011-12-08 Thread Ryan Coleman
equivalent to invoices that I'd like to put into a database. I'm thinking I would like to have some OCR app/tool scan these forms, and then generate a CSV with each field. Does anyone have recommendations on software for this? -- Adam Vande More

Re: OCR...

2009-01-29 Thread Andrew Gould
if Finereader runs under emulator though. If the file is already a PDF and 72 DPI with text as graphics most of the damage has already been done, and it will be extremely hard to OCR. well, damage is probably done. how can i check the resolution? i

Re: OCR...

2009-01-29 Thread Reko Turja
-- From: Gary Kline kl...@thought.org Sent: Thursday, January 29, 2009 4:23 AM To: Andrew Gould andrewlylego...@gmail.com Cc: Reko Turja reko.tu...@liukuma.net; FreeBSD Mailing List freebsd-questions@freebsd.org Subject: Re: OCR... On Wed, Jan

Re: OCR...

2009-01-29 Thread Andrew Gould
Mailing List freebsd-questions@freebsd.org Subject: Re: OCR... On Wed, Jan 28, 2009 at 07:33:41PM -0600, Andrew Gould wrote: On Wed, Jan 28, 2009 at 5:09 PM, Gary Kline kl...@thought.org wrote: On Wed, Jan 28, 2009 at 01:32:57PM -0600, Andrew Gould wrote: On Wed, Jan 28, 2009 at 1:22

Re: OCR...

2009-01-28 Thread Michel Talon
Gary Kline wrote: well, i'm ashamed to admit that i've put at least a dozen hours in trying, then re-re-retrying to OCR an imaged pdf file with as many open source ocr packages as i can find. I have seen good results with tesseract, which is in the ports and free. Otherwise with OmniPage

Re: OCR...

2009-01-28 Thread Reko Turja
to OCR. -Reko ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org

Re: OCR...

2009-01-28 Thread Gary Kline
of the damage has already been done, and it will be extremely hard to OCR. well, damage is probably done. how can i check the resolution? i tried to increase it by creating huge ppm and tif files, but then that's really absurd since there can only be just so much

Re: OCR...

2009-01-28 Thread Andrew Gould
though. If the file is already a PDF and 72 DPI with text as graphics most of the damage has already been done, and it will be extremely hard to OCR. well, damage is probably done. how can i check the resolution? i tried to increase it by creating huge ppm and tif files

Re: OCR...

2009-01-28 Thread Gary Kline
either feature or qualitywise. No idea if Finereader runs under emulator though. If the file is already a PDF and 72 DPI with text as graphics most of the damage has already been done, and it will be extremely hard to OCR. well, damage is probably done. how can i check

Re: OCR...

2009-01-28 Thread Andrew Gould
, and it will be extremely hard to OCR. well, damage is probably done. how can i check the resolution? i tried to increase it by creating huge ppm and tif files, but then that's really absurd since there can only be just so much data per image. i _could_ try xv and jpeg
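
On the resolution question: a scanned page's effective DPI is just its pixel width divided by its physical width in inches, and PDF page sizes are measured in points (1/72 inch). The sketch below works through that arithmetic. It assumes the pixel width of the embedded image and the page width in points are already known; the 300 DPI threshold is a common rule of thumb for OCR, not something stated in the thread.

```python
POINTS_PER_INCH = 72  # PDF page dimensions are expressed in points (1/72 inch)


def effective_dpi(pixel_width, page_width_points):
    """Pixels per inch of an image stretched across a page of the given width."""
    page_width_inches = page_width_points / POINTS_PER_INCH
    return pixel_width / page_width_inches


def is_ocr_friendly(dpi, minimum=300):
    """Rule-of-thumb check; the 300 DPI default is an assumption."""
    return dpi >= minimum


if __name__ == "__main__":
    # US Letter is 612 points (8.5 inches) wide.
    for pixels in (612, 2550):
        dpi = effective_dpi(pixels, 612)
        print(pixels, "px wide:", round(dpi), "DPI, OCR friendly:", is_ocr_friendly(dpi))
```

This also shows why converting to huge ppm or tif files does not help: resampling raises the pixel count, but the detail captured at the original 72 DPI is all there is.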

Re: OCR...

2009-01-28 Thread Gary Kline
as graphics most of the damage has already been done, and it will be extremely hard to OCR. well, damage is probably done. how can i check the resolution? i tried to increase it by creating huge ppm and tif files, but then that's really absurd since

OCR...

2009-01-27 Thread Gary Kline
guys, well, i'm ashamed to admit that i've put at least a dozen hours in trying, then re-re-retrying to OCR an imaged pdf file with as many open source ocr packages as i can find. before i quit for supper tonight, i finally threw in the towel. realized that i would have been THROUGH with all

Re: any way to turn a pdf file into something OCR-able?

2008-12-02 Thread Roland Smith
of .jpg/.gif/.whatever. Read the manual carefully before attempting; also note this can be a slow process. Which still doesn't give plain text. But in this case one would need an OCR app. There is a new one available in ports called cuneiform. It is supposed to be quite good, but I haven't had

Re: any way to turn a pdf file into something OCR-able?

2008-12-02 Thread Gary Kline
On Tue, Dec 02, 2008 at 02:07:30AM +0100, Roland Smith wrote: On Mon, Dec 01, 2008 at 03:14:43PM -0800, Gary Kline wrote: pdftotext fails on the large [32MB] file I've got. Is there any other way I can translate this huge textfile to ascii or html or text? Please define "fail"

Re: any way to turn a pdf file into something OCR-able?

2008-12-02 Thread Gary Kline
'pypdf' to split a multipage PDF scan into individual pages, then used the tesseract OCR to convert to text. Not 100% of course, and it really got confused by pages that were not right-side-up, but not a bad start for pages that are really scans -- images -- rather than PDF representation
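
The split-then-OCR workflow described above can be sketched as follows. Note one swap: instead of pypdf, this sketch uses pdftoppm (from poppler) to turn each page into a PNG, because tesseract reads images rather than PDF pages. The working directory, the "page" file prefix, and the 300 DPI default are illustrative assumptions. As the message says, upside-down pages will still confuse the OCR step.

```python
import shutil
import subprocess
from pathlib import Path


def rasterize_command(pdf_path, prefix, dpi=300):
    """pdftoppm invocation that writes one PNG per page as <prefix>-<n>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def tesseract_command(image_path):
    """tesseract invocation; the recognized text goes to <image name>.txt."""
    image = Path(image_path)
    return ["tesseract", str(image), str(image.with_suffix(""))]


def ocr_pdf(pdf_path, workdir):
    """Rasterize every page, OCR each one, and return the concatenated text."""
    for tool in ("pdftoppm", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(rasterize_command(pdf_path, workdir / "page"), check=True)
    texts = []
    # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
    for image in sorted(workdir.glob("page-*.png")):
        subprocess.run(tesseract_command(image), check=True)
        text_file = image.with_suffix(".txt")
        texts.append(text_file.read_text(encoding="utf-8", errors="replace"))
    return "\n".join(texts)
```

Calling `ocr_pdf("scan.pdf", "ocr-work")` would leave the per-page images and text files in `ocr-work`, which makes it easy to inspect the pages that came out badly.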

any way to turn a pdf file into something OCR-able?

2008-12-01 Thread Gary Kline
Guys, pdftotext fails on the large [32MB] file I've got. Is there any other way I can translate this huge textfile to ascii or html or text? thanks, gary -- Gary Kline [EMAIL PROTECTED] http://www.thought.org Public Service Unix

Re: any way to turn a pdf file into something OCR-able?

2008-12-01 Thread Roland Smith
On Mon, Dec 01, 2008 at 03:14:43PM -0800, Gary Kline wrote: pdftotext fails on the large [32MB] file I've got. Is there any other way I can translate this huge textfile to ascii or html or text? Please define "fail" in this context. I've used pdftotext on documents exceeding

Re: any way to turn a pdf file into something OCR-able?

2008-12-01 Thread Robert Huff
Roland Smith writes: pdftotext fails on the large [32MB] file I've got. Is there any other way I can translate this huge textfile to ascii or html or text? Please define "fail" in this context. I've used pdftotext on documents exceeding 40MB. However, there are of course

Re: any way to turn a pdf file into something OCR-able?

2008-12-01 Thread Olivier Nicole
1) Some PDFs are just wrappers around JPEG images. In this case there is no text for pdftotext to convert, hence the epic fail. In this case, convert from the ImageMagick port will get you a series of .jpg/.gif/.whatever files. Read the manual carefully before attempting; also note this can be a
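
One way to tell whether a PDF is just a wrapper around images is to look at what pdftotext produces: for image-only files the output is typically little more than page separators (form feeds). The sketch below applies that heuristic to already extracted text. The threshold of 20 non-whitespace characters per page is an arbitrary assumption and should be tuned against real documents.

```python
def pdftotext_command(pdf_path):
    """pdftotext invocation that writes the extracted text to stdout ("-")."""
    return ["pdftotext", str(pdf_path), "-"]


def looks_image_only(extracted_text, min_chars_per_page=20):
    """Guess whether pdftotext output came from a PDF that contains only images.

    Pages are assumed to be separated by form feeds; the threshold is a guess.
    """
    pages = extracted_text.split("\f")
    # A trailing form feed leaves an empty final chunk that is not a real page.
    if pages and pages[-1].strip() == "":
        pages = pages[:-1]
    if not pages:
        return True
    # Count only non-whitespace characters so blank lines do not look like text.
    total_chars = sum(len("".join(page.split())) for page in pages)
    return total_chars / len(pages) < min_chars_per_page


if __name__ == "__main__":
    print(looks_image_only("\f\f\f"))
    print(looks_image_only("Chapter one. It was a dark and stormy night.\f"))
```

A file flagged this way is a candidate for the image conversion and OCR route rather than another pdftotext attempt.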

Re: best OCR scanner??

2005-09-02 Thread Gary Kline
On Thu, Sep 01, 2005 at 08:07:26PM -0700, Gary Kline wrote: People, I want to scan ~400 pp of an out-of-print and out-of-copyright book (from 1913) and need to know what the best scanner is and if there has been substantial improvement in OCR

Re: best OCR scanner??

2005-09-02 Thread Nikolas Britton
On 9/1/05, Gary Kline [EMAIL PROTECTED] wrote: People, I want to scan ~400 pp of an out-of-print and out-of-copyright book (from 1913) and need to know what the best scanner is and if there has been substantial improvement in OCR software in recent

Re: best OCR scanner??

2005-09-02 Thread Roger Merritt
At 08:07 PM 9/1/2005 -0700, Gary Kline wrote: People, I want to scan ~400 pp of an out-of-print and out-of-copyright book (from 1913) and need to know what the best scanner is and if there has been substantial improvement in OCR software in recent years

Re: best OCR scanner??

2005-09-02 Thread Gary Kline
been substantial improvement in OCR software in recent years. This book has few footnotes or different typefaces, so it should make things easier. Oh, and if there is something that plugs into DOS/DOZE and just works, super. I'll use my W2K box. (Hopefully

Re: best OCR scanner??

2005-09-02 Thread Gary Kline
is and if there has been substantial improvement in OCR software in recent years. This book has few footnotes or different typefaces, so it should make things easier. Oh, and if there is something that plugs into DOS/DOZE and just works, super. I'll use my W2K box

Re: best OCR scanner??

2005-09-02 Thread Roland Smith
substantial improvement in OCR software in recent years. This book has few footnotes or different typefaces, so it should make things easier. There are several free OCR programs. I've used gocr (http://jocr.sourceforge.net/ and no, that's not a typo) and ocrad (http

Re: best OCR scanner??

2005-09-02 Thread Bill Campbell
it's been photographed, with something to keep the opposite page out of the camera's way. I have to admit that I do all my scanning and OCR on an OS X system, only marginally related to FreeBSD. I use an older HP Scanjet with automatic document feeder (ADF), and the HP software will scan straight

Re: best OCR scanner??

2005-09-02 Thread Nikolas Britton
to know what the best scanner is and if there has been substantial improvement in OCR software in recent years. This book has few footnotes or different typefaces, so it should make things easier. Oh, and if there is something that plugs into DOS/DOZE

best OCR scanner??

2005-09-01 Thread Gary Kline
People, I want to scan ~400 pp of an out-of-print and out-of-copyright book (from 1913) and need to know what the best scanner is and if there has been substantial improvement in OCR software in recent years. This book has few footnotes