Abbyy FineReader and Nuance OmniPage are the two leading commercial OCR
products. Both can achieve 98%+ character accuracy on most book-like
material scanned at 300 dpi.
- Randy Stern (who formerly worked in the OCR industry)
At 07:37 AM 2/3/2009 -0500, Nicole Engard wrote:
I'm with
Alberto Accomazzi aaccoma...@cfa.harvard.edu wrote:
[...] I know about OCRopus but I have a feeling that
commercial products still have a significant edge over public domain
packages. [...]
OCRopus is released under the Apache License 2.0, which allows
commercial development. It is not a
Randy Stern wrote:
Abbyy FineReader and Nuance OmniPage are the two leading commercial
OCR products. Both can achieve 98%+ character accuracy on most
book-like material scanned at 300 dpi.
At 07:37 AM 2/3/2009 -0500, Nicole Engard wrote:
I'm with Christian - I loved Abbyy FineReader when I
Randy Stern wrote:
Abbyy FineReader and Nuance OmniPage are the two leading commercial
OCR products. Both can achieve 98%+ character accuracy on most
book-like material scanned at 300 dpi.
I know that 98% is impressive, but I always like to remember that with
an average of 2000 characters
Just wanted to send a reminder of this useful wiki page for roommates and
rides:
http://wiki.code4lib.org/index.php/RoommatesRidesEtc
I have an ulterior motive for the reminder--someone I was tentatively going
to share a room with turns out not to be able to make it. So if you're
interested in
On Tue, Feb 03, 2009 at 10:09:54AM -0500, Walter Lewis wrote:
If we had to correct it all: a) it would never get done and b) it would
be better than some of the originals which are rife with typographic
errors.
Hence the genius of Distributed Proofreaders [1] and reCAPTCHA [2].
[1]
I'm with Christian - I loved Abbyy FineReader when I used it at both
my previous libraries. It's very accurate and it's affordable if
you're not using it for mass digitization :) but we never got the
server contract because like Christian said - it is quite expensive.
---
Nicole C. Engard
Open
Gabriel Farrell wrote:
On Tue, Feb 03, 2009 at 10:09:54AM -0500, Walter Lewis wrote:
If we had to correct it all: a) it would never get done and b) it would
be better than some of the originals which are rife with typographic
errors.
Hence the genius of Distributed Proofreaders [1]
Karen Coyle wrote:
I know that 98% is impressive, but I always like to remember that with
an average of 2000 characters per page that means 40 potential errors
per book page. Just to give us some perspective on the level of
cleanup that will be needed for books being digitized today.
The good
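The arithmetic behind Karen Coyle's estimate can be sketched in a few lines. The 2000 characters per page and 98% accuracy come from the messages above; the function name is only illustrative, and the calculation assumes errors are counted per character, independent of layout or language.

```python
def expected_errors_per_page(chars_per_page: int, accuracy: float) -> float:
    """Expected number of misrecognized characters on one page.

    accuracy is the character accuracy as a fraction (0.98 for 98%).
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return chars_per_page * (1.0 - accuracy)


if __name__ == "__main__":
    # Figures quoted in the thread: 2000 characters per page at 98% accuracy.
    print(round(expected_errors_per_page(2000, 0.98)))  # prints 40
```

Because the count scales linearly with the error rate, each additional point of accuracy removes roughly 20 errors from a 2000-character page.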
Folks,
Just a reminder that the deadline for Code4Lib 2010 hosting proposals
is next Thursday, February 12th. See below for more information.
-Mike
The Code4Lib Conference Planning Group is putting out a call for
proposals to host the 2010 Code4Lib Conference. Information on the
The 11th edition of the Dewey Decimal system, which he wrote in his
'reformed spelling.' Amazingly, the Google text (at least the part I've
scanned) catches it perfectly:
In the clast card catalog the clasification is mapt out abuv the cards
by projecting gyds, making reference almost
Hello,
Do you know a tool running under Linux to make PDFs from images? I use
Adobe Acrobat Professional in Windows to create PDFs from image files.
However, Acrobat does not handle image files with East Asian characters.
Yan
Hello,
Do you know of an OCR engine for Persian/Dari? If so, what is its accuracy
rate?
Thanks,
Yan
Yan, not sure how it handles East Asian characters, but ImageMagick will create
PDFs, e.g.,
convert FILE.jpg FILE.pdf
See http://www.imagemagick.org/script/convert.php for more info.
Mark
- Yan Han h...@u.library.arizona.edu wrote:
Hello,
Do you know a tool running under Linux
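Mark's convert suggestion can be wrapped in a small script when several page images need to go into one PDF. This is only a sketch: it assumes ImageMagick's convert is installed and on the PATH, the helper names are made up for illustration, and the result is an image-only PDF with no OCR text layer.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_convert_command(images: Sequence[str], output_pdf: str) -> List[str]:
    """Build the argument list for: convert IMG1 IMG2 ... OUTPUT.pdf"""
    if not images:
        raise ValueError("at least one input image is required")
    return ["convert", *images, output_pdf]


def images_to_pdf(images: Sequence[str], output_pdf: str) -> None:
    """Run ImageMagick's convert; each input image becomes one PDF page."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' was not found on PATH")
    subprocess.run(build_convert_command(images, output_pdf), check=True)


if __name__ == "__main__":
    # Same invocation as in the message above, shown without running it.
    print(build_convert_command(["FILE.jpg"], "FILE.pdf"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces or non-ASCII characters.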
Hi again Yan,
There's this one:
http://www.worldlanguage.com/Products/Readiris-Pro-11-Middle-East-Edition-ArabicReadiris-Farsi-Persian-Arabic-Farsi-110226.htm
We have a copy of the Traditional Chinese version of Readiris and find its
accuracy to be fairly poor (and its performance on Latin