I suggested trying VietOCR because its Java 4.0 beta uses a newer SVN
version of Tesseract, and I find it easy to test with under Windows because
the GUI lets me change the image filter settings.

You will have to choose the tools based on your platform and other
requirements. You could use ImageMagick for preprocessing. You may still
have problems because of the shape of the 'A'.

I am attaching the results that I got using the latest version of Tesseract
from git (I run it under msys2/mingw-w64 on Windows 8). I tried with the PNG
and then with a modified TIF. To make the TIF I used IrfanView: negative
(invert image), blur, then resize/resample, saved as TIF with LZW compression.

Both image files and results are attached.
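
In case it helps with doing this without a GUI, here is a rough sketch of the
same three steps (negative, blur, resize/resample) in plain Python. It is only
an illustration: the function names are made up for this example, the image is
assumed to be already loaded as a list of rows of 0-255 grey values, and the
blur radius and scale factor are placeholders, not the settings I used in
IrfanView. Loading the PNG and saving the LZW-compressed TIF are not shown. In
practice an imaging tool such as ImageMagick would do the same operations.

```python
# Hypothetical helpers for illustration only.
# A grayscale image is represented as a list of rows of ints in 0..255.

def invert(img):
    """Negative: swap light and dark pixels."""
    return [[255 - p for p in row] for row in img]

def box_blur(img, radius=1):
    """Average each pixel with its neighbours so nearby dots smear together."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Clamp the window at the image borders.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out

def resize_nearest(img, factor):
    """Upscale by an integer factor using nearest-neighbour sampling."""
    return [[p for p in row for _ in range(factor)]
            for row in img for _ in range(factor)]

def preprocess(img, radius=1, factor=2):
    """Apply the steps in the same order: negative, blur, resize."""
    return resize_nearest(box_blur(invert(img), radius), factor)
```

The blur is the step that matters for dot-matrix text, since it fills the gaps
between neighbouring dots; the radius would need tuning against the dot
spacing in the actual images.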

BTW, I am using the English traineddata and other related files from
https://code.google.com/p/tesseract-ocr/source/browse/?repo=tessdata
The file is 20.9 MB.

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Wed, Nov 5, 2014 at 2:54 PM, <[email protected]> wrote:

> I tried it with version 3.03 and found no improvement. As you suggested,
> I inverted the image and tried blurring, but could not improve recognition.
> VietOCR is not an option, as I have to integrate the recognition into an
> application and have to do this without a GUI.
> Could you tell me the steps (and, if available, the parameters) you used to
> convert the image to get better results?
>
>
> On Thursday, 23 October 2014 08:55:36 UTC+2, shree wrote:
>>
>> Try the .NET wrapper with a newer version of Tesseract.
>>
>> Invert the image, smooth/blur it, and convert it to greyscale. I tried
>> this with VietOCR;
>> the output is 'QBCDEFGHIJKL'.
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Thu, Oct 23, 2014 at 12:07 PM, <[email protected]> wrote:
>>
>>> Hello.
>>>
>>> I have images that contain characters that are made from individual
>>> dots, like from a dot matrix printer. I tried various operations on
>>> the images (binarization, edge detection, dilation, ...) and was able to
>>> make the dots bigger so that they are connected 90% of the time. However,
>>> recognition is still very bad.
>>>
>>> This image contains characters from A to L
>>>
>>>
>>> <https://lh3.googleusercontent.com/-WxgjmUF846M/VEig6eA1FNI/AAAAAAAAAAM/BdQPQPVTUrs/s1600/AL.png>
>>> my modified version is
>>>
>>>
>>> <https://lh5.googleusercontent.com/-TUZSXsiBHJY/VEihDy5RCUI/AAAAAAAAAAU/HmwIkEemSAY/s1600/AL2.png>
>>> After recognition with the standard English language data, Tesseract
>>> (3.02, using the .NET wrapper) gives me the characters "FJBEDEFEHIJKL".
>>> Only the last 5 characters are right; the rest are wrong. Do you know of a
>>> way to improve recognition besides training a new font for this special
>>> case? Tesseract works quite well for other projects I have, and I would
>>> love a solution that does not rely on a special font, if possible.
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/e6b8d4bb-ecc3-463c-9cc7-96f46a63be27%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/e6b8d4bb-ecc3-463c-9cc7-96f46a63be27%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>

FiBCDEFEiHIJKL

QBCDEFGHIJKL
