OK, I checked in a fixed version. We're probably going to try to remove the Tesseract dependency before the official 0.4 release; you'll still be able to use Tesseract, but it will be a separate command and a (small) interface library.
The reason is that Tesseract right now cannot be used with dynamic link libraries, so we can't provide working Python bindings if we link. Also, the different versions of Tesseract have incompatible APIs, so it's hard to make OCRopus work with all of them.

In terms of performance, the Tesseract character recognizer provides reasonable results on a wide range of documents, while OCRopus pre-0.4 has less consistent performance but performs much better in our benchmarks than Tesseract for document classes that it has been trained on.

Mostly, what OCRopus needs now for more consistent performance is a lot more training on different document types. We're aiming for a distributed training model, where many different people can train OCRopus on their documents and on their machines and submit the trained models. We can then build a "supermodel" out of the components that works well for a lot of models. Again, the infrastructure for that is in place, and we hope that that will be part of 0.5.

Tom

On Fri, May 15, 2009 at 10:20, Thomas Breuel <[email protected]> wrote:

> Oops, that change got checked in accidentally. Just replace the "if 0"
> with "if 1"; I'll try to fix the repository later.
>
> Sorry about that.
>
> Tom
>
>
> On Fri, May 15, 2009 at 00:43, Taxman <[email protected]> wrote:
>
>>
>> Thanks 0mat, but tesseract-ocr-dev and tesseract-ocr is definitely
>> already installed (it was already in the ocropus/ubuntu script, Tom)
>> and the tesseract section of my SConstruct file looks like:
>>
>> ### tesseract
>>
>> if 0:
>>     env.Append(CPPPATH=["${tesseract}/include/tesseract"])
>>     env.Append(LIBPATH=["${tesseract}/lib"])
>>     env.Append(LIBS=["libtesseract_full.a","pthread"])
>>     env.Append(CPPDEFINES=["HAVE_TESSERACT"])
>>     assert conf.CheckLibWithHeader('tesseract_full', 'tesseract/baseapi.h', 'C++')
>>     assert conf.CheckLib('pthread')
>>
>> It's definitely not commented out. I'm guessing there must be some
>> other missing dependency that most people have on a running system
>> that I don't yet have on this clean one. Any more ideas?
>>
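
For anyone applying the workaround quoted in this thread (changing the "if 0" guard of the tesseract section to "if 1") on a checkout that still has the accidental change, here is a minimal sketch of doing it with a script rather than by hand. It assumes the section starts with the "### tesseract" comment shown in the quoted SConstruct; the helper name enable_tesseract_block is made up for illustration and is not part of OCRopus or SCons.

```python
import re

def enable_tesseract_block(sconstruct_text):
    """Flip the first `if 0:` guard after the `### tesseract` marker to `if 1:`.

    Hypothetical helper: it only assumes the marker comment and the guard
    look like the ones quoted in this thread.
    """
    marker = sconstruct_text.find("### tesseract")
    if marker == -1:
        # No tesseract section found; return the text unchanged.
        return sconstruct_text
    head, tail = sconstruct_text[:marker], sconstruct_text[marker:]
    # count=1 so that only the guard directly following the marker is changed;
    # any other `if 0:` blocks in the file are left alone.
    tail = re.sub(r"^([ \t]*)if 0:", r"\1if 1:", tail, count=1, flags=re.MULTILINE)
    return head + tail
```

To use it, read the SConstruct file into a string, pass it through enable_tesseract_block, and write the result back. Editing the one line by hand does the same thing.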
