ubuntu@tesseract-ocr:~/TEST$ tesseract twonumbers.png - --psm 6 --tessdata-dir ~/tessdata --oem 1
2 127
a 15
7 56
7 58
9 58
19 65
24 91
3375

ubuntu@tesseract-ocr:~/TEST$ tesseract twonumbers.png - --psm 6 --tessdata-dir ~/tessdata_best --oem 1
2 127
a 15
7 56
7 58
9 58
19 65
24 91
3375

ubuntu@tesseract-ocr:~/TEST$ tesseract twonumbers.png - --psm 6 --tessdata-dir ~/tessdata_fast --oem 1
2 127
4 15
7 56
7 58
9 58
19 65
24 «(91
33 «75

On Sat, Aug 31, 2019 at 9:54 PM Jack <[email protected]> wrote:

> I have a weird niche project here, essentially I have about 4,000 images,
> each with 2 numbers between 0 and 127.
> I've tweaked the images in a million different ways and I can't get
> tesseract to recognize individual numbers, with the exception of 2, all
> other 1 digit numbers are not recognized.
>
> Also, for some reason if I use tesseract directly I get way worse results,
> whereas if I convert to pdf first and use ocrmypdf, which apparently uses
> tesseract, I get WAY better results, which I don't understand.
>
> The font is very straightforward I think, so I'm not sure if training
> would be helpful, but I'm open to the idea if needed.
>
> Here are the sample images I'm using for testing, before and after I
> modified them:
> Before: https://imgur.com/a/PhjWXXK
> After: https://imgur.com/a/sCRE67S
> Okay some of them failed to upload but that's the gist.
>
> Thanks,
> Jack
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/7be5ed42-df44-4530-b7a2-0d0fa340918e%40googlegroups.com
> .
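Since every image is supposed to hold two numbers between 0 and 127, the three runs at the top of this message can be compared mechanically instead of by eye. The following is only a rough sketch under that assumption: check_pairs is a hypothetical helper name (it is not part of tesseract), and it assumes each pair of numbers comes out on its own line.

```shell
#!/bin/sh
# check_pairs: read OCR output on stdin and label each non-empty line.
# "ok" means exactly two whole numbers, each between 0 and 127.
# Anything else (letters, stray symbols, merged digits) is labelled "BAD".
# Hypothetical helper for illustration; not part of tesseract.
check_pairs() {
    awk '
        NF == 0 { next }    # skip blank lines
        {
            ok = (NF == 2)
            for (i = 1; i <= NF && ok; i++)
                if ($i !~ /^[0-9]+$/ || $i + 0 > 127) ok = 0
            label = ok ? "ok" : "BAD"
            print label "\t" $0
        }
    '
}
```

It could be used by piping any of the commands above into it, for example: tesseract twonumbers.png - --psm 6 --tessdata-dir ~/tessdata --oem 1 | check_pairs. Lines such as "a 15" or "3375" would then be flagged, which makes it easier to see which model directory misreads which line across all 4,000 images.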
--
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

