Appreciate this. Just wondering, is there a C# variant of Tesseract? Thanks in advance.
On Saturday, November 3, 2012 10:50:34 PM UTC+8, zdenop wrote:
>
> Hello all,
>
> Tesseract OCR 3.02 was released (as 3.02.02) and you can find it in the
> download section [1] or on the Project page in the "Featured" section.
>
> *Tesseract release notes - V3.02*
>
> - Moved ResultIterator/PageIterator to ccmain.
> - Added Right-to-left/Bidi capability in the output iterators for
>   Hebrew/Arabic.
> - Added paragraph detection in layout analysis/post OCR.
> - Fixed inconsistent xheight during training and over-chopping.
> - Added simultaneous multi-language capability.
> - Refactored top-level word recognition module.
> - Added experimental equation detector.
> - Improved handling of resolution from input images.
> - Blamer module added for error analysis.
> - Cleaned up externally used namespace by removing includes from
>   baseapi.h.
> - Removed dead memory management code.
> - Tidied up constraints on control parameters.
> - Added support for ShapeTable in classifier and training.
> - Refactored class pruner.
> - Fixed training leaks and randomness.
> - Major improvements to layout analysis for better image detection,
>   diacritic detection, better textline finding, better tabstop finding.
> - Improved line detection and removal.
> - Added fixed pitch chopper for CJK.
> - Added UNICHARSET to WERD_CHOICE to make multi-language handling
>   easier.
> - Fixed problems with internally scaled images.
> - Added page and bbox to string in tr files to identify source of
>   training data better.
> - Fixes to Hindi Shiroreka splitter.
> - Added word bigram correction.
> - Reduced stack memory consumption and eliminated some ugly typedefs.
> - Added new uniform classifier API.
> - Added new training error counter.
> - Fixed endian bug in dawg reader.
> - C API (thanks to Tobias Müller)
> - New solution for VS 2008 (thanks to Tom Powers)
> - Many other fixes, including the way in which the chopper finds chops
>   and messes with the outline while it does so.
>
> The Windows installer was built on Windows XP SP3 with the NSIS tool.
> Tesseract.exe (and the training tools) is a 32-bit static build made with
> VC++ 2008 Express, so you may need the Microsoft Visual C++ 2008 SP1
> Redistributable Package (x86) [2].
>
> All Google-generated language data were updated (community language data
> files were not updated yet).
> New languages available from Google: afr, aze, bel, ben, chr, enm, epo,
> est, eus, frm, glg, ita_old, kan, mal, mkd, mlt, msa, spa_old, sqi, swa,
> tam, tel.
> Cube data files are available for ita, fra, rus, spa too.
> Added experimental equation detector (equ).
> There is also a new community language, Ancient Greek (grc) - thanks to
> Nick White.
>
> Language data files created for 3.00 and 3.01 can be used in 3.02.
> Language data files created with Tesseract OCR 3.02 will not work in
> previous versions.
>
> Thank you all who shared your know-how and tested tesseract 3.02 in svn.
> Thanks Google for supporting this project!
>
> [1] http://code.google.com/p/tesseract-ocr/downloads/list
> [2] http://www.microsoft.com/en-us/download/details.aspx?id=5582&WT.mc_id=MSCOM_EN_US_DLC_DETAILS_121LSUS007998
>
> --
> Zdenko Podobný
> Community project contributor

