Lo siento (sorry), Andres and Andriy: I mixed up your names.
--Sven
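
A rough sketch of the two cleanup steps discussed in the quoted messages below: scaling images tagged as 72 dpi up toward the 200-300 dpi range, and pushing a gray background toward white. It uses plain Python on a grayscale image held as a list of rows. The function names and the sample pixel values (40 for text, 150 for background) are made up for illustration and are not taken from the attached files. ImageMagick's -resize and -level options do comparable work on real image files.

```python
def scale_factor(current_dpi, target_dpi=300):
    """Return the enlargement factor that takes current_dpi to target_dpi."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return target_dpi / current_dpi


def stretch_contrast(pixels, black_point, white_point):
    """Linearly map black_point..white_point onto 0..255, clipping the rest.

    pixels is a list of rows of grayscale values (0 = black, 255 = white).
    Anything at or above white_point (the gray background) becomes white,
    anything at or below black_point (the text) becomes black.
    """
    span = white_point - black_point
    if span <= 0:
        raise ValueError("white_point must be greater than black_point")
    stretched = []
    for row in pixels:
        stretched.append(
            [max(0, min(255, round((p - black_point) * 255 / span))) for p in row]
        )
    return stretched


# Hypothetical values: dark text (40) on a gray background (150).
factor = scale_factor(72, 300)  # about 4.17
cleaned = stretch_contrast([[150, 40], [150, 150]], black_point=40, white_point=150)
```

The black and white points would have to be picked per camera setup, for example from a histogram of a typical frame.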

On Thu, Aug 18, 2011 at 2:11 PM, Sven Pedersen <[email protected]> wrote:
> Hi Andre,
> You need a minimum resolution for OCR to be effective, though overly
> high resolution will also throw the results off. Internally, the JPG is
> converted to a bitmap format. Your images are currently only 72 dpi, so
> you'll need to at least resize (scale) them. The low contrast (black on
> gray) will also be a problem, so ImageMagick or a similar tool could help.
> --Sven
>
>
> On Thu, Aug 18, 2011 at 9:39 AM, Andres <[email protected]> wrote:
>> Sven:
>> What exactly do you mean by 200-300 dpi? Are the dpi attributes in the
>> JPG files being evaluated, or are you referring to some scaling
>> of the images?
>>
>> Andriy:
>> If you continue to have problems with this, and if the camera is in a
>> fixed position relative to your display and the font is always the
>> same, it should be very easy to avoid Tesseract altogether and just
>> recognize the characters by evaluating a few pixels after
>> thresholding. (I would threshold only the evaluated pixels.)
>>
>> Regards,
>>
>> Andres
>>
>>
>>
>> 2011/8/18 Sven Pedersen <[email protected]>:
>>> You should not need to retrain. You need to convert the images to
>>> grayscale or B&W at 200-300 dpi and get the background (which seems to
>>> be gray) closer to white. You can do that kind of cleanup
>>> transformation with ImageMagick.
>>> --Sven
>>>
>>>
>>> On Wed, Aug 17, 2011 at 3:09 PM, Andriy Malovanyy <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I am trying to write a simple program that takes pictures captured from a
>>>> web-cam every 10 sec. by another program, recognises the text with OCR,
>>>> and logs the data to a text file. Everything seems to work fine except
>>>> that tesseract does not recognize the pictures that are taken. If I
>>>> "feed" tesseract pictures created with Photoshop, it works better, but
>>>> sometimes it still cannot recognize very simple and obvious text
>>>> (numbers).
>>>>
>>>> I attach 3 files taken by the web-cam and 1 created with Photoshop. None
>>>> of them is recognized well. The first two web-cam pictures return garbage
>>>> text; the third one (the best quality, I think) returns "Empty page message".
>>>> The Photoshop picture returns "1234.018" instead of "1234.0.18".
>>>>
>>>> I use Tesseract-OCR 3.0 with the language files that came with the package
>>>> (English only). Do I need to train Tesseract to recognise the pictures? If
>>>> so, what is the best way to do it? Take several pictures with the web-cam
>>>> and make a training file from them with the digits 0 to 9 and points? I
>>>> have started to read how to do that, and it seems so complicated.
>>>>
>>>> Any advice appreciated.

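Andres's suggestion above (fixed camera, fixed font, so sample a few pixels per character and threshold only those) might look roughly like the sketch below. It assumes a seven-segment style digit, which the thread does not confirm, and the probe offsets, cell size, and threshold are hypothetical values that would have to be measured on real frames. The image is a grayscale list of rows with dark digits on a lighter background.

```python
# Probe offsets (row, col) inside one digit cell, one per segment, in the
# order: top, top-left, top-right, middle, bottom-left, bottom-right, bottom.
# These offsets are hypothetical and fit a tiny 3x5 cell.
PROBES = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 0), (3, 2), (4, 1)]

# Which segments are lit for each digit, in the same order as PROBES.
SEGMENTS_TO_DIGIT = {
    (1, 1, 1, 0, 1, 1, 1): "0",
    (0, 0, 1, 0, 0, 1, 0): "1",
    (1, 0, 1, 1, 1, 0, 1): "2",
    (1, 0, 1, 1, 0, 1, 1): "3",
    (0, 1, 1, 1, 0, 1, 0): "4",
    (1, 1, 0, 1, 0, 1, 1): "5",
    (1, 1, 0, 1, 1, 1, 1): "6",
    (1, 0, 1, 0, 0, 1, 0): "7",
    (1, 1, 1, 1, 1, 1, 1): "8",
    (1, 1, 1, 1, 0, 1, 1): "9",
}


def read_digit(image, top, left, threshold=128):
    """Threshold only the probed pixels of one cell and look up the digit.

    A probed pixel darker than threshold counts as a lit segment.
    Returns "?" when the pattern matches no known digit.
    """
    pattern = tuple(
        1 if image[top + r][left + c] < threshold else 0 for r, c in PROBES
    )
    return SEGMENTS_TO_DIGIT.get(pattern, "?")


def read_number(image, digit_origins, threshold=128):
    """Read one digit per (top, left) origin and join them into a string."""
    return "".join(read_digit(image, t, l, threshold) for t, l in digit_origins)
```

In practice each probe would probably average a small patch of pixels rather than read a single one, to tolerate noise and slight camera movement, and a decimal point would need its own probe.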