I grabbed your latest version, Remi, and it's noticeably slower than the previous one. I've written a simple console app in .NET C# that performs the OCR operation, so the leaked memory is reclaimed when each process exits. I then spawn those processes from two threads to perform OCR using the console app. With version 2.03, it takes between 0.48 and 0.6 s to process an image. With 2.04, the time went up to ~0.6-0.8 s. With my dual-threaded approach, I'm able to process as many as 3.5 images/sec in 2.03, and it dropped to 2.6 images/sec in 2.04.
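For anyone wanting to try the same workaround, here is a minimal sketch of the pattern described above: two worker threads, each spawning a short-lived process per image so that any memory leaked by the OCR engine goes away when the process exits. It is written in Python purely for illustration (the app described here is C#), and the names `ocr_in_subprocess`, `ocr_batch`, and the stand-in command are assumptions, not part of tessnet2 or Tesseract.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def ocr_in_subprocess(command, image_path):
    """Run one OCR job in a fresh process; return (output text, seconds taken)."""
    start = time.perf_counter()
    done = subprocess.run(
        list(command) + [image_path],  # image path is passed as the last argument
        capture_output=True,
        text=True,
        check=True,  # raise if the child process fails
    )
    return done.stdout.strip(), time.perf_counter() - start


def ocr_batch(command, image_paths, workers=2):
    """OCR all images using `workers` threads; return (results, images per second)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as image_paths
        results = list(pool.map(lambda p: ocr_in_subprocess(command, p), image_paths))
    elapsed = time.perf_counter() - start
    return results, len(image_paths) / elapsed


if __name__ == "__main__":
    # Hypothetical stand-in for the real OCR console app:
    # a child Python process that just echoes the file name it was given.
    fake_ocr = [sys.executable, "-c", "import sys; print('text from ' + sys.argv[1])"]
    results, rate = ocr_batch(fake_ocr, ["a.tif", "b.tif", "c.tif", "d.tif"])
    for text, seconds in results:
        print(f"{text}: {seconds:.2f}s")
    print(f"{rate:.1f} images/sec")
```

With two workers, throughput is bounded by roughly 2 divided by the per-image time. At 0.48 to 0.6 s per image that is about 3.3 to 4.2 images/sec, which is consistent with the 3.5 images/sec measured on 2.03.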
I'd love to see a memory-leak-free version: when I don't have to take on the overhead of spawning a process, I get a pretty consistent 0.25 s processing time. With a dual-threaded approach, that would let me process between 7 and 8 images a second... it'd almost double my throughput! I'm very happy with the project, though... it's so much faster than Microsoft Office Document Imaging, and I can distribute my app to others and they don't need Office 2007.

On Jun 5, 10:20 am, Remi Thomas wrote:
> Hi,
>
> You can take the .NET wrapper based on version 2.04:
> http://www.pixel-technology.com/freeware/tessnet2/bin.zip
> Two modifications.
>
> SetRootPath has been removed and merged with Init:
> Init(string tessdataPath, string lang, bool numericMode)
> If tessdataPath == null, then Init works like the previous version.
>
> The tessnet2 assembly is now renamed tessnet2_32.dll and tessnet2_64.dll
> to avoid confusion between the 32- and 64-bit versions.
>
> Ray, for me everything works.
>
> Have fun,
> Remi
>
> On Jun 3, 7:51 pm, Ray wrote:
>
> > The current (v250) svn code is a 2.04 release candidate.
> > If you are able to download from svn, and have reported an issue on
> > the list below, then please take a look and give it a try.
> > This version will be uploaded to the download page soon unless I hear
> > of any further problems.
> > NOTE that VC++ express 2005 is deprecated and no longer supported. Get
> > vc++ express 2008 instead.
>
> > After 2.04, there will be no going back:
> > This is the *last* version to build with VC++6! V3.00 has some new
> > template code that VC++6 just can't cope with.
> > V3.00 will have big changes to TessBaseAPI, moving towards (but not
> > complete) thread safety.
> > V3.00 will have page layout analysis that will not work well without
> > leptonica. As a consequence, 3.00 by default will require leptonica to
> > build on windows. It may be possible to disable it, but the resulting
> > code will have reduced functionality.
> > Completion of 2.04 will open the door to an upload of a preliminary
> > version of 3.00 to svn...
>
> > Here are the 2.04 release notes:
> > Tesseract release notes June 2 2009 - V2.04
> > Integrated patches for portability and to remove some of the
> > "access" macros.
> > Removed dependence on lua from the viewer making it a *lot*
> > faster. Also the viewer now compiles and works (on Linux.) Also works
> > on windows via a pre-built ScrollView.jar.
> > Fixed the following issues:
> > 1, 63, 67, 71, 76, 79, 81, 82, 84, 106, 108, 111, 112, 128, 129, 130,
> > 133, 135,
> > 142, 143, 145, 146, 147, 153, 154, 160, 165, 169, 170, 175, 177, 187,
> > 192,
> > 195, 199, 201, 205, 209.
> > This is the last version to support VC++6!
> > This may also be the last version to compile without leptonica!

