I'm really impressed. I didn't expect 100% recognition without
training Tesseract. I take my hat off to you, sir.
Did you clean the image manually in GIMP/Photoshop, or did you use an image
processing library to get such a clean result?
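
For anyone following the thread, here is a rough sketch of what the "library" route could look like. It is not necessarily how the image in this thread was cleaned. It assumes the photo has already been loaded as rows of 0-255 grayscale values; the function names and the threshold of 128 are made up for illustration, and a real script would more likely use Pillow, OpenCV or ImageMagick for the same two steps (binarize, then remove specks).

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed input format)


def binarize(img: Gray, threshold: int = 128) -> Gray:
    """Map every pixel to pure black (0) or pure white (255).

    The default threshold is an arbitrary assumption; tune it per photo.
    """
    return [[255 if p >= threshold else 0 for p in row] for row in img]


def median3x3(img: Gray) -> Gray:
    """Replace each pixel with the median of its 3x3 neighbourhood.

    Coordinates outside the image are clamped to the nearest edge pixel.
    Isolated specks disappear, while strokes at least 3 pixels wide survive.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            row.append(window[4])  # middle of the 9 sorted values
        out.append(row)
    return out


def clean(img: Gray, threshold: int = 128) -> Gray:
    """Binarize first, then drop single-pixel noise."""
    return median3x3(binarize(img, threshold))
```

Calling `clean(pixels)` on a light background with a single dark speck gives back an all-white image, while a dark stroke three pixels wide is kept. The cleaned image would then be saved and passed to tesseract as usual.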



On Mon, Nov 4, 2013 at 8:34 AM, zdenko podobny <[email protected]> wrote:
> Tesseract is "noise sensitive" - you need to pre-process the image before OCR.
> See FAQ[1]
>
> [1]
> https://code.google.com/p/tesseract-ocr/wiki/FAQ#Output_without_result_or_bad_output
>
>
> Zdenko
>
>
> On Mon, Nov 4, 2013 at 1:24 AM, Valent <[email protected]> wrote:
>>
>> Hi,
>> I got back to this project. The plan is to OCR my gas and power meters so
>> I can get some insight into how much of each we use over time.
>> I got past the first block, and instead of the "Empty page" error I now get
>> some OCR results, finally!
>>
>> After reading blog posts and forum posts about how Tesseract works I still
>> can't fully figure it out, so your help is much appreciated.
>>
>> I have this image to work with: http://imgur.com/CiruqHF
>>
>> Tesseract returns "3 2 9 3  3... 3 -", which is far from
>> perfect because the meter reads "32830,8"
>>
>> From what I have read so far I guess that my next step to get better
>> results is to take 10-20 pictures and train tesseract, right?
>>
>> The wiki on training Tesseract is not clear for a newbie like me. I now have
>> Qt-Box-Editor installed on my Linux Mint laptop and have my first box text
>> file done:
>> https://www.dropbox.com/s/8hjgh21wu41d2l9/struja.box
>>
>> I have followed these instructions:
>> https://github.com/this-is-ari/python-tesseract-3.02-training
>>
>> But how do I now train on more samples?
>>
>> What do I do after I take more pictures of my power meter and gas meter?
>> Should I train Tesseract for each meter separately or together?
>>
>>
>> --
>> --
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/groups/opt_out.
>
>



-- 
follow me - www.twitter.com/valentt & http://kernelreloaded.blog385.com
linux, anime, spirituality, wireless, scuba, linuxmce smart home, zwave
ICQ: 2125241, Skype: valent.turkovic, MSN: [email protected]
