Hi, I'm trying to train tesseract, but text2image creates a single box for
'fi' or 'fl'. Why does it treat 'fi' and 'fl' as a single character
instead of two? How can I fix this?
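
One thing that may be worth ruling out first: the training text itself might contain precomposed ligature code points (U+FB01 'fi', U+FB02 'fl') rather than two separate letters. Below is a minimal Python sketch that expands them before the text is handed to text2image. The function name expand_ligatures is made up for illustration. This is only an assumption about one possible cause; if the font substitutes a ligature glyph while rendering, cleaning the text will not change the boxes.

```python
import unicodedata

# Latin ligature code points U+FB00..U+FB04: ff, fi, fl, ffi, ffl.
# (Assumption: these are the only ligatures of interest here.)
LATIN_LIGATURES = range(0xFB00, 0xFB05)


def expand_ligatures(text):
    """Replace precomposed Latin ligatures with their separate letters.

    Only characters in LATIN_LIGATURES are decomposed, so accented
    letters and other non-ASCII text are left untouched.
    """
    return "".join(
        unicodedata.normalize("NFKD", ch) if ord(ch) in LATIN_LIGATURES else ch
        for ch in text
    )


if __name__ == "__main__":
    sample = "\ufb01le and \ufb02ow"
    print(expand_ligatures(sample))
```

Text that has no ligature code points comes back unchanged, so it should be safe to run this over a whole training corpus.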
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe
en
>> chasing other issues and haven't verified a solution.
>>
>>
>> On Saturday, September 3, 2016 at 5:23:55 AM UTC-4, Brais Gabín Moreira
>> wrote:
>>>
>>> Hi, I'm trying to train tesseract. But text2image creates a single box
I'm running tesseract to read screenshots. I noticed that when I run the
"easy screenshots" first and then the more difficult ones I get better
recognition. This is not a feeling. I can reproduce this behaviour.
Is it possible to export this new "knowledge" that tesseract learned to a
new .train
I'm using tesseract to recognize some screenshots. I'm building this in an
Android app so ~20MB of traineddata is a lot of weight. I know the font in
those screenshots.
How can I reproduce the steps to generate the eng.traineddata? I want to
use the same data: text, dictionary, patterns, etc. O
You can try something like this:
http://www.imagemagick.org/Usage/color_mods/#level
It makes the light colors completely white and the dark colors completely
black. I use something similar with my images and it works great (I can't
use imagemagick itself).
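
For anyone who also can't use imagemagick, here is a rough Python sketch of the kind of level adjustment described above. It works on a grayscale image given as a list of rows of 0-255 values. The function name apply_level and the default black_point/white_point values are assumptions chosen for illustration, not what imagemagick uses internally, so tune them for your own screenshots.

```python
def apply_level(pixels, black_point=64, white_point=192):
    """Push dark pixels to 0 and light pixels to 255.

    pixels: list of rows, each row a list of grayscale values (0-255).
    Values at or below black_point become 0, values at or above
    white_point become 255, and values in between are stretched
    linearly across the full range.
    """
    if not 0 <= black_point < white_point <= 255:
        raise ValueError("need 0 <= black_point < white_point <= 255")

    span = white_point - black_point
    result = []
    for row in pixels:
        new_row = []
        for value in row:
            if value <= black_point:
                new_row.append(0)
            elif value >= white_point:
                new_row.append(255)
            else:
                # Integer linear stretch of the mid-tones.
                new_row.append((value - black_point) * 255 // span)
        result.append(new_row)
    return result


if __name__ == "__main__":
    image = [[10, 100, 128, 200, 250]]
    print(apply_level(image))
```

Moving the two points closer together makes the result closer to a hard black/white threshold, which is often what helps on low-contrast screenshots.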
On Sunday, September 11, 2016, 21:19:23 (UT
ile, one of
> which is only 3MB.
>
> https://sourceforge.net/projects/tesseract-ocr-alt/files/
>
> On Sunday, September 11, 2016 at 7:02:54 AM UTC-5, Brais Gabín Moreira
> wrote:
>>
>> I'm using tesseract to recognize some screenshots. I'm building this in
I couldn't find a Docker image with the training tools on it so I built one!
Code:
https://github.com/BraisGabin/docker-tesseract
Image:
https://hub.docker.com/r/braisgabin/tesseract/
I hope you have better luck training your OCR than me :P