Hi Phild,

Your question:

> Does anyone have a fragment/snippet of c/c++ code which has the
> call(s) to (I guess) rectangle which will tell Tesseract the area to
> look at? What are the parameters in, pixels? Where is the origin
> (0,0), top left, bottom left?

As written in the Tesseract training documentation (http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract2): "coordinate system used in the box file has (0,0) at the bottom-left." So Tesseract's box files use an origin at the bottom-left. I am still studying the Tesseract source code, and I will tell you how Tesseract selects a rectangle area once I have worked it out.

Another option: if you want .NET source code that extracts rectangular areas from an image, see this application: http://www.codeproject.com/KB/GDI-plus/Image_Processing_Lab.aspx
Run the application, open any image, and use the menu Filters -> Other -> Blob extractor. You can also look at the source code there.
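Since the box-file origin is at the bottom-left, coordinates taken from a box file need their y axis flipped before they can be used with anything that counts from the top-left. Here is a minimal sketch of that conversion. It assumes the recognition call wants a rectangle as (left, top, width, height) in pixels measured from the top-left corner, which is my understanding of TessBaseAPI::SetRectangle in the 3.x API, so please check baseapi.h for your version. The struct and function names below are made up for illustration and are not part of Tesseract.

```cpp
#include <cassert>

// Rectangle in box-file convention: pixel coordinates with (0,0) at the
// bottom-left of the image, stored as left, bottom, right, top edges.
struct BoxFileRect {
    int left;
    int bottom;
    int right;
    int top;
};

// Rectangle in top-left convention: (0,0) at the top-left of the image,
// stored as left, top, width, height. (Hypothetical type, not a Tesseract one.)
struct TopLeftRect {
    int left;
    int top;
    int width;
    int height;
};

// Flip the y axis: an edge that is y pixels above the bottom of the image
// is (image_height - y) pixels below the top of the image.
TopLeftRect BoxToTopLeft(const BoxFileRect& box, int image_height) {
    // Sanity checks: the box must be well formed and fit inside the image.
    assert(box.right >= box.left && box.top >= box.bottom);
    assert(box.bottom >= 0 && box.top <= image_height);

    TopLeftRect r;
    r.left = box.left;                  // x axis is the same in both systems
    r.top = image_height - box.top;     // top edge measured from the top
    r.width = box.right - box.left;
    r.height = box.top - box.bottom;
    return r;
}
```

For example, in an image 480 pixels high, the box (left 100, bottom 20, right 300, top 60) becomes left 100, top 420, width 200, height 40.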
How are you using Tesseract: calling tesseract.exe from the command line, or using the DLL?

My problem: I am using the tessnet2 .NET assembly in C# (http://www.pixel-technology.com/freeware/tessnet2/). It uses Tesseract 2.04 as the OCR engine. I have to recognize only the numbers on electric fuses, printed in the OCR-A or OCR-A Extended font, but the OCR results are as follows.

OCR-A Extended font results (digit = OCR output):

0 = 11
. = .
1 = 1
2 = 2
3 = 3
4 = .1, 9
5 = 5
6 = 5
7 = 7
8 = 5
9 = .1

Training Tesseract 2.04 is a little different from training Tesseract 3.01, and I am getting worse output after training. I want to train Tesseract 2.04 for the new font, OCR-A or OCR-A Extended. Please send me the training files (TIFF image, .box, .tr, and trained data) at [email protected]. It is urgent. Thanks in advance for your help.

On Dec 7, 4:01 pm, Lahiru Himash Madusanka <[email protected]> wrote:
> Thanks. :-)
>
> On 12/6/11, Sven Pedersen <[email protected]> wrote:
> > Hi Lahiru,
> > Please read the wiki on the website for training info.
> > --Sven
>
> > On Tue, Dec 6, 2011 at 2:17 AM, Lahiru Himash Madusanka
> > <[email protected]> wrote:
> >> can you tell me how to train tess in to a new language
>
> >> On Dec 6, 12:14 am, Phild <[email protected]> wrote:
> >>> Hi,
>
> >>> I have successfully trained Tesseract to read numeric OCR-A font. So
> >>> far so good.
>
> >>> I am also using the c/c++ code testApi contained in 'api' and this
> >>> works fine as long as I have preprocessed the image file (ie cropped
> >>> the area I'm interested in).
>
> >>> The documents I am using, have a small area of some 20 chars or so,
> >>> which contain the OCR-A text. In my code I need to be able to
> >>> specify only that area in which Tesseract will do the recognition.
>
> >>> Does anyone have a fragment/snippet of c/c++ code which has the
> >>> call(s) to (I guess) rectangle which will tell Tesseract the area to
> >>> look at? What are the parameters in, pixels? Where is the origin
> >>> (0,0), top left, bottom left?
> >>> Thanks for all help
> >>> Phil
>
> >> --
> >> You received this message because you are subscribed to the Google
> >> Groups "tesseract-ocr" group.
> >> To post to this group, send email to [email protected]
> >> To unsubscribe from this group, send email to
> >> [email protected]
> >> For more options, visit this group at
> >> http://groups.google.com/group/tesseract-ocr?hl=en
>
> > --
> > ``All that is gold does not glitter,
> > not all those who wander are lost;
> > the old that is strong does not wither,
> > deep roots are not reached by the frost.
> > From the ashes a fire shall be woken,
> > a light from the shadows shall spring;
> > renewed shall be blade that was broken,
> > the crownless again shall be king.”
>
> --
> Lahiru Himash Madusanka
> http://119sinhala.blogspot.com

