Hi Nguyen,

Thanks for the suggestion. I've tried with the ROI and also isolating the 
digits as independent images, but with no improvement in the results. In some 
images I got better results by resizing the image by a scale factor of 2.5; 
some other images required DILATE/ERODE operations to close 1-pixel 
holes.
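
For reference, here is a rough, dependency-free sketch of the two 
preprocessing steps mentioned above: upscaling by a factor, and a 
dilate-then-erode pass (a morphological "closing") with a 3x3 neighbourhood 
to fill 1-pixel holes. It assumes a binarized image stored as a list of rows 
with 1 = ink and 0 = background. The function names are made up for 
illustration and are not part of tesseract or any imaging library; in a real 
project the equivalent resize and morphology routines of the image library 
would normally be used instead.

```python
# Illustrative sketch only: helper names are hypothetical.
# Images are lists of rows; 1 = ink (foreground), 0 = background.

def upscale(img, factor):
    """Nearest-neighbour resize by a (possibly fractional) scale factor."""
    h, w = len(img), len(img[0])
    new_h, new_w = int(round(h * factor)), int(round(w * factor))
    return [[img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
             for x in range(new_w)]
            for y in range(new_h)]

def _morph(img, combine, border):
    """Apply `combine` over each 3x3 neighbourhood.

    Pixels outside the image are treated as `border`.
    """
    h, w = len(img), len(img[0])

    def px(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else border

    return [[combine(px(y + dy, x + dx)
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)]
            for y in range(h)]

def dilate(img):
    """A pixel becomes ink if any 3x3 neighbour is ink."""
    return _morph(img, max, 0)

def erode(img):
    """A pixel stays ink only if all 3x3 neighbours are ink."""
    return _morph(img, min, 1)

def close_holes(img):
    """Morphological closing: dilate, then erode."""
    return erode(dilate(img))

# A solid 5x5 stroke with a single missing pixel in the middle.
stroke = [[1] * 5 for _ in range(5)]
stroke[2][2] = 0
repaired = close_holes(stroke)
enlarged = upscale(repaired, 2.5)
```

The closing fills the hole because dilation first grows the ink over the 
missing pixel, and the following erosion shrinks the stroke back to its 
original outline without reopening it.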

V.Lorz

On Thursday, March 27, 2014 1:53:27 AM UTC+1, Quan Nguyen wrote:
>
> I defined a ROI around each number and it seemed to produce better results.
>
> On Wednesday, March 26, 2014 1:10:56 PM UTC-5, V.Lorz wrote:
>>
>> Hi All,
>>
>> I started integrating tesseract (version 3.2, EMGV) in a project for 
>> recognizing short texts in scanned images. Using some very simple image 
>> processing, I extract the area of interest to speed up the process. 
>>
>> The errors I get are in the recognition results: tesseract sometimes 
>> confuses the digits '6' and '5'. The image below is recognized as 
>> "443669*5*" instead of "443669*6*". I'm using the default 
>> *eng.traineddata* file bundled with the library. Using some other trained 
>> data files found on the Internet, I got the same results with the same two 
>> digits (5 and 6). Before processing the image I configure tesseract to 
>> recognize only digits.
>>
>>
>>
>>
>> Does anyone know what could be causing this error? How could I solve it?
>>
>> I started reading the guide for training the engine 
>> (http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3), as 
>> suggested in some other threads, but it is of next to no help for me. Is 
>> there any other guide around for 'dummies' like [presumably :(] me? In 
>> this case I want to train it using one image that I created from 40 sampled 
>> documents (attached here). Using jTessBoxEditor-1.0 I was able to generate 
>> and correct the box file. What should I do next?
>>
>>
>> Thanks a lot in advance, V.Lorz
>>
>>

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
