You can also look at another open source OCR project, OpenOCR (Cuneiform): https://launchpad.net/cuneiform-linux
On Fri, Apr 2, 2010 at 3:49 PM, Pierre <[email protected]> wrote:
> Hello everyone,
>
> I'm currently working on a small project which aims to recognize very
> small chunks of text (typically blocks of 20 chars, each with a known
> location in the document).
>
> I have read a lot about Tesseract, and some points confuse me a bit.
>
> First, I had a look at the .net wrapper for Tesseract (which doesn't
> really interest me in itself; I only wanted to get a good idea of
> "how to"), and following trails of clues over the net, I've run into
> a lot of discussions saying that Tesseract is full of memleaks, is
> kind of unstable, etc. I'd like to have a clear overview of the
> reliability of Tesseract and, if possible, a confirmation or denial
> that it leaks memory. I don't feel comfortable starting an external
> binary from my application; I'd rather use the Tesseract API
> directly. Most of all, my application aims to be cross-platform, so
> the ideal deal would be to include the Tesseract code in my project,
> so that deploying to another platform would be just a compilation
> away.
>
> Secondly, following the same trails the .net author left behind,
> I've read that Tesseract's code is really not "thought" or "modeled"
> to be used from other code. It looks like it includes a lot of exit
> messages, which leads to the conclusion that it's modeled to be run
> from its standalone binary. Is that true? Would it be a lot of work
> to change that if I decide to?
>
> Last thing: I've been looking around for documentation, and the best
> I've found so far is maintained by this group of fellow hackers:
> http://tesseract-ocr.repairfaq.org . Though it looks very nice, I
> was unable to find any good example of using a very simple Tesseract
> recognition routine directly from C++ code. Again, I'm mostly looking
> for C++ that instantiates objects / calls functions from code included
> in my project, not a library or an external binary.
>
> If I'm not at the right place for asking such questions, please pardon
> me, and feel free to point me in the right direction... I have to
> admit that I feel a bit confused about who is maintaining what, how
> "alive" the projects are, etc.
>
> Thanks a lot for taking the time to read this,
> Pierre.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.

