I am using the latest code from the master branch. I would expect the same result given the same image and the same traineddata files.
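Since each image is supposed to hold two numbers between 0 and 127, one way to spot misreads in output like the runs quoted below is to check every line against that range. This is only a minimal Python sketch, assuming one pair per output line; `check_line` and `PAIR` are hypothetical names for illustration and are not part of tesseract.

```python
import re

# Two whitespace-separated groups of 1 to 3 digits, nothing else on the line.
PAIR = re.compile(r"^(\d{1,3})\s+(\d{1,3})$")

def check_line(line, lo=0, hi=127):
    """Return (a, b) if the line holds two integers in [lo, hi], else None."""
    m = PAIR.match(line.strip())
    if not m:
        return None  # stray characters such as 'B' or '#' land here
    a, b = int(m.group(1)), int(m.group(2))
    if lo <= a <= hi and lo <= b <= hi:
        return (a, b)
    return None  # digits only, but outside the expected range

if __name__ == "__main__":
    # A few lines taken from the quoted outputs.
    for line in ["3 70", "2 127", "9 B58", "19 695", "33 #675"]:
        print(repr(line), "->", check_line(line))
```

Lines that come back as `None` are the ones worth re-checking by eye or re-running with different preprocessing.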
On Sun, 1 Sep 2019, 08:04 Jack, <[email protected]> wrote:

> Thank you for replying, that was very helpful.
> I've now tried tessdata_best and tessdata_fast trained data found on the
> tesseract github, which has drastically improved my results, but still not
> as accurate as yours.
> Here are my outputs:
>
> tesseract listpng output2 --psm 6 --tessdata-dir ~/tessdata/tessdata_best
> --oem 1
> 3 70
> 2 127
> 4 15
> 7 96
> 7 98
> 9 B58
> 9 65
> 19 695
> 29 91
> 33 75
>
> tesseract listpng output_fast --psm 6 --tessdata-dir
> ~/tessdata/tessdata_fast --oem 1
> 3 70
> 2 127
> 4 15
> 7 56
> 7 58
> 9 #58
> 9 #65
> 19 ~=665
> 24 #691
> 33 #675
>
> On Saturday, August 31, 2019 at 11:24:23 AM UTC-5, Jack wrote:
>>
>> I have a weird niche project here, essentially I have about 4,000 images,
>> each with 2 numbers between 0 and 127.
>> I've tweaked the images in a million different ways and I can't get
>> tesseract to recognize individual numbers, with the exception of 2, all
>> other 1 digit numbers are not recognized.
>>
>> Also, for some reason if I use tesseract directly I get way worse
>> results, whereas if I convert to pdf first and use ocrmypdf, which
>> apparently uses tesseract, I get WAY better results, which I don't
>> understand.
>>
>> The font is very straight-forward I think, so I'm not sure if training
>> would be helpful, but I'm open to the idea if needed.
>>
>> Here are the sample images I'm using for testing, before and after I
>> modified them:
>> Before: https://imgur.com/a/PhjWXXK
>> After: https://imgur.com/a/sCRE67S
>> Okay some of them failed to upload but that's the gist.
>>
>> Thanks,
>> Jack
>>

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduXYoiFikb5A_CKiiiUqa6LLFOB8b8%2BT_EL%3D6r6kFx29Pw%40mail.gmail.com.

