Sven: What exactly do you mean by 200-300 dpi? Are the dpi attributes in the JPG files actually evaluated, or are you referring to some scaling of the images?
Andriy: If you continue to have problems with this, and if the camera is in a fixed position relative to your display and the font is always the same, it should be quite easy to avoid Tesseract altogether and recognize the characters by evaluating a few pixels after thresholding. (I would threshold only the pixels being evaluated.)

Regards,
Andres

2011/8/18 Sven Pedersen <[email protected]>:
> You should not need to retrain. You need to change the images to
> grayscale or B&W at 200-300 dpi and get the background (which seems to be
> gray) closer to white. You can do that kind of cleanup
> transformation with ImageMagick.
> --Sven
>
> On Wed, Aug 17, 2011 at 3:09 PM, Andriy Malovanyy <[email protected]> wrote:
>> Hi,
>>
>> I am trying to write a simple program that takes pictures captured from a
>> web-cam every 10 sec. by another program, recognises the text with OCR and
>> logs the data into a text file. Everything seems to be working fine except
>> that Tesseract does not recognize the pictures that are
>> taken. If I "feed" Tesseract pictures created with Photoshop, it works
>> better, but sometimes it still cannot recognize very simple and obvious text
>> (numbers).
>>
>> I attach 3 files taken by the web-cam and 1 created with Photoshop. None
>> of them is recognized well. The first two web-cam pictures return garbage text;
>> the third one (the best quality, I think) returns "Empty page message".
>> The Photoshop picture returns "1234.018" instead of "1234.0.18".
>>
>> I use Tesseract-OCR 3.0 with the language files that came with the package
>> (English only). Do I need to train Tesseract to recognise the pictures? What
>> is the best way to do that? Take several pictures with the web-cam and
>> make a training file from them with the digits 0 to 9 and points? I have
>> started to read how to do that, and it seems so complicated.
>>
>> Any advice appreciated.
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
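
For what it is worth, the idea from Andres's reply (fixed camera, fixed font, so threshold and look at only a few known pixels per character) could be sketched roughly as below. This is only an illustration under assumptions: the image is taken to be a plain 2D list of grey values (0-255), and the probe coordinates, the threshold of 128, the tiny 5x3 glyph shapes and all function names are hypothetical, not something from the thread or from Tesseract. In a real setup the probes and cell positions would have to be measured from the actual web-cam frames.

```python
# Sketch only: read characters by thresholding a few fixed pixels per cell.
# Coordinates, threshold and glyph shapes are made-up assumptions.

THRESHOLD = 128  # assumed cut-off between dark ink and light background

# Hypothetical probe points (row, col) inside one 5x3 character cell.
PROBES = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 0), (3, 2), (4, 1)]


def is_dark(image, row, col, threshold=THRESHOLD):
    """Threshold a single pixel; only probed pixels are ever examined."""
    return image[row][col] < threshold


def signature(image, top, left):
    """One boolean per probe for the cell whose corner is (top, left)."""
    return tuple(is_dark(image, top + r, left + c) for r, c in PROBES)


def learn_templates(samples):
    """Build a signature -> character table from reference glyph images."""
    return {signature(img, 0, 0): ch for ch, img in samples.items()}


def read_cells(image, cell_origins, templates):
    """Look up each cell's signature; unknown patterns become '?'."""
    return "".join(
        templates.get(signature(image, top, left), "?")
        for top, left in cell_origins
    )


def glyph(rows, ink=40, paper=200):
    """Turn ASCII art into grey values ('#' = dark ink, anything else = paper)."""
    return [[ink if ch == "#" else paper for ch in row] for row in rows]


def side_by_side(glyphs, gap=1, paper=200):
    """Paste glyphs next to each other to fake a captured frame."""
    rows = []
    for r in range(len(glyphs[0])):
        row = []
        for i, g in enumerate(glyphs):
            if i:
                row.extend([paper] * gap)
            row.extend(g[r])
        rows.append(row)
    return rows


# Made-up reference glyphs standing in for the display's fixed font.
SAMPLES = {
    "1": glyph(["..#", "..#", "..#", "..#", "..#"]),
    "7": glyph(["###", "..#", "..#", "..#", "..#"]),
    "0": glyph(["###", "#.#", "#.#", "#.#", "###"]),
}

templates = learn_templates(SAMPLES)
frame = side_by_side([SAMPLES["7"], SAMPLES["0"], SAMPLES["1"]])
text = read_cells(frame, [(0, 0), (0, 4), (0, 8)], templates)
print(text)  # prints "701" for this synthetic frame
```

The point of the design is that nothing is learned statistically: with a fixed geometry, each character reduces to a short on/off pattern, so a lookup table is enough. It would break as soon as the camera moves or the lighting pushes ink and background to the same side of the threshold.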

