Mark Carey wrote:

>My question for the list; Is there an app that takes input from a .png .eps etc and 
>gives me the text back in ascii form?  Or even in a tex formatted file (crosses 
>fingers and hopes real hard)?
>
It depends to a huge extent on how good the text image is and how many 
fonts the OCR engine has to recognise. This is one of the places where 
Linux is somewhat lacking, because making good OCR software is _much_ 
harder than it at first appears. The Internet is littered with the 
results of good intentions. A Google search will reveal literally 
dozens of projects in various stages of creation and abandonment.


Two projects which seem to work reasonably well are:-

http://jocr.sourceforge.net/

and

http://www.claraocr.org/

Neither is anything like as good as the commercial offerings. They both 
fail miserably when trying to recognise ligatures, whether deliberate or 
otherwise. Clara OCR is the better of the two for single-font documents.
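
If you want to script the first of these, here is a minimal sketch in 
Python that wraps gocr (the program from the jocr project above). It 
assumes gocr is installed and on your PATH, that the image has already 
been converted to PNM (for example with pngtopnm from netpbm), and that 
your version accepts -i for the input file and -o for the output file; 
check the man page of whatever version you have. The function names 
are my own, purely for illustration.

```python
import shutil
import subprocess


def gocr_command(image_path, text_path):
    """Build the gocr command line (hypothetical helper).

    Assumes gocr takes -i <input image> and -o <output text file>.
    """
    return ["gocr", "-i", image_path, "-o", text_path]


def run_ocr(image_path, text_path):
    """Run gocr on image_path and return the recognised text."""
    if shutil.which("gocr") is None:
        raise FileNotFoundError("gocr was not found on PATH")
    # check=True raises CalledProcessError if gocr exits with an error.
    subprocess.run(gocr_command(image_path, text_path), check=True)
    with open(text_path, encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

Calling run_ocr("page.pnm", "page.txt") should leave the recognised 
text in page.txt and return it as a string. How usable that text is 
depends on the scan quality and fonts, as described above.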

There is at least one commercial product for Linux:

http://www.vividata.com/

I tried this out some time ago, but failed to get it to work. The e-mail 
support was somewhat indifferent to my difficulties. However, there have 
been big changes to the web site, and new products have been released, 
since I made my attempts to use it.

Google searches will find more.

