I am using the latest code from the master branch.

I would expect the same result with the same image and the same traineddata files.
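
Since each image is supposed to contain two numbers between 0 and 127, the quoted output can be sanity-checked automatically. The following is only a minimal sketch under that assumption. The names `parse_line` and `suspicious_lines` are made up for illustration and are not part of tesseract. It flags lines that are not two in-range integers. It cannot catch in-range misreads such as 56 versus 96.

```python
import re

# Assumption: every OCR line should be exactly two integers in 0..127,
# as described in the original question.
LINE_RE = re.compile(r"^(\d{1,3})\s+(\d{1,3})$")


def parse_line(line, lo=0, hi=127):
    """Return (a, b) if the line is two in-range integers, else None."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    a, b = int(m.group(1)), int(m.group(2))
    if lo <= a <= hi and lo <= b <= hi:
        return (a, b)
    return None


def suspicious_lines(text):
    """Return (line_number, line) pairs that fail the check; blank lines are skipped."""
    bad = []
    for n, line in enumerate(text.splitlines(), start=1):
        if line.strip() and parse_line(line) is None:
            bad.append((n, line.strip()))
    return bad


# The tessdata_best output quoted in this thread.
best_output = """3 70
2 127
4 15
7 96
7 98
9 B58
9 65
19 695
29 91
33 75
"""

for n, line in suspicious_lines(best_output):
    print(f"line {n}: {line!r}")
```

On the tessdata_best output this flags "9 B58" (a non-digit character) and "19 695" (out of range), which narrows down the images worth inspecting by hand.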

On Sun, 1 Sep 2019, 08:04 Jack, <[email protected]> wrote:

> Thank you for replying, that was very helpful.
> I've now tried the tessdata_best and tessdata_fast trained data from the
> tesseract GitHub, which drastically improved my results, but they are
> still not as accurate as yours.
> Here are my outputs:
>
> tesseract listpng output2 --psm 6 --tessdata-dir ~/tessdata/tessdata_best --oem 1
> 3 70
> 2 127
> 4 15
> 7 96
> 7 98
> 9 B58
> 9 65
> 19 695
> 29 91
> 33 75
>
> tesseract listpng output_fast --psm 6 --tessdata-dir ~/tessdata/tessdata_fast --oem 1
> 3 70
> 2 127
> 4 15
> 7 56
> 7 58
> 9 #58
> 9 #65
> 19 ~=665
> 24 #691
> 33 #675
>
> On Saturday, August 31, 2019 at 11:24:23 AM UTC-5, Jack wrote:
>>
>> I have a weird niche project here, essentially I have about 4,000 images,
>> each with 2 numbers between 0 and 127.
>> I've tweaked the images in a million different ways and I can't get
>> tesseract to recognize single-digit numbers: with the exception of 2, no
>> 1-digit numbers are recognized.
>>
>> Also, for some reason if I use tesseract directly I get way worse
>> results, whereas if I convert to pdf first and use ocrmypdf, which
>> apparently uses tesseract, I get WAY better results, which I don't
>> understand.
>>
>> The font is very straight-forward I think, so I'm not sure if training
>> would be helpful, but I'm open to the idea if needed.
>>
>> Here are the sample images I'm using for testing, before and after I
>> modified them:
>> Before: https://imgur.com/a/PhjWXXK
>> After: https://imgur.com/a/sCRE67S
>> Okay some of them failed to upload but that's the gist.
>>
>> Thanks,
>> Jack
>>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/934d89f8-a455-4787-8d8d-8986cc615059%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/934d89f8-a455-4787-8d8d-8986cc615059%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
