Re: [CODE4LIB] best OCR package? [SEC=UNCLASSIFIED]

2009-02-04 Thread Dyer, Renata
would highly recommend both of them.
Renata
-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Emmanuel Di Pretoro
Sent: Tuesday, 3 February 2009 7:54 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] best OCR package?
Hi, It wasn't

Re: [CODE4LIB] best OCR package?

2009-02-03 Thread Randy Stern
Abbyy Finereader and Nuance Omnipage are the two leading commercial OCR products. Both can achieve 98% + character accuracy on most book-like material scanned at 300 dpi.
- Randy Stern (who formerly worked in the OCR industry)
At 07:37 AM 2/3/2009 -0500, Nicole Engard wrote:
I'm with

Re: [CODE4LIB] best OCR package?

2009-02-03 Thread MJ Ray
Alberto Accomazzi aaccoma...@cfa.harvard.edu wrote:
[...] I know about OCRopus but I have a feeling that commercial products still have a significant edge over public domain packages. [...]
OCRopus is released under the Apache License 2.0, which allows commercial development. It is not a

Re: [CODE4LIB] best OCR package?

2009-02-03 Thread Walter Lewis
Randy Stern wrote:
Abbyy Finereader and Nuance Omnipage are the two leading commercial OCR products. Both can achieve 98% + character accuracy on most book-like material scanned at 300 dpi.
At 07:37 AM 2/3/2009 -0500, Nicole Engard wrote:
I'm with Christian - I loved Abbyy FineReader when I

Re: [CODE4LIB] best OCR package?

2009-02-03 Thread Karen Coyle
Randy Stern wrote:
Abbyy Finereader and Nuance Omnipage are the two leading commercial OCR products. Both can achieve 98% + character accuracy on most book-like material scanned at 300 dpi.
I know that 98% is impressive, but I always like to remember that with an average of 2000 characters

Re: [CODE4LIB] best OCR package?

2009-02-03 Thread Gabriel Farrell
On Tue, Feb 03, 2009 at 10:09:54AM -0500, Walter Lewis wrote:
If we had to correct it all: a) it would never get done and b) it would be better than some of the originals which are rife with typographic errors.
Hence the genius of Distributed Proofreaders [1] and reCAPTCHA [2].
[1]

Re: [CODE4LIB] best OCR package?

2009-02-03 Thread Nicole Engard
I'm with Christian - I loved Abbyy FineReader when I used it at both my previous libraries. It's very accurate and it's affordable if you're not using it for mass digitization :) but we never got the server contract because like Christian said - it is quite expensive.
--- Nicole C. Engard Open

Re: [CODE4LIB] best OCR package?

2009-02-03 Thread Walter Lewis
Gabriel Farrell wrote:
On Tue, Feb 03, 2009 at 10:09:54AM -0500, Walter Lewis wrote:
If we had to correct it all: a) it would never get done and b) it would be better than some of the originals which are rife with typographic errors.
Hence the genius of Distributed Proofreaders [1]

Re: [CODE4LIB] best OCR package?

2009-02-03 Thread Walter Lewis
Karen Coyle wrote:
I know that 98% is impressive, but I always like to remember that with an average of 2000 characters per page that means 40 potential errors per book page. Just to give us some perspective on the level of cleanup that will be needed for books being digitized today.
The good
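
For anyone who wants to rerun Karen Coyle's per-page estimate with other figures, here is a minimal Python sketch. It uses only the two numbers quoted in the thread (98% character accuracy, 2000 characters per page); the function name `expected_errors` is hypothetical, and the calculation assumes errors are spread evenly across characters, which real OCR output may not honor.

```python
def expected_errors(chars: int, accuracy: float) -> float:
    """Expected count of misrecognized characters out of `chars`.

    `accuracy` is the per-character accuracy as a fraction (0.98 for 98%).
    Assumes every character is equally likely to be wrong.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return chars * (1.0 - accuracy)


CHARS_PER_PAGE = 2000  # average quoted in the thread
ACCURACY = 0.98        # character accuracy quoted in the thread

per_page = expected_errors(CHARS_PER_PAGE, ACCURACY)
print(round(per_page))  # 40 potential errors per page
```

Swapping in a different accuracy or page density shows how quickly the cleanup burden changes.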