Thank you very much for your reply; that was very helpful. I think that 
should do the trick.
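
In case it helps anyone reading the thread later: since every image holds exactly two numbers between 0 and 127, the OCR text can be sanity-checked before it is trusted. Below is a minimal Python sketch of that idea. The name `parse_pairs` and its parameters are hypothetical (my own, not part of tesseract), and it assumes one pair per line with blank lines in between, as in the quoted runs.

```python
import re

def parse_pairs(ocr_text, lo=0, hi=127):
    """Split OCR text into valid (int, int) pairs and rejected lines.

    Hypothetical helper: assumes one pair of numbers per output line.
    """
    valid, rejected = [], []
    for line in ocr_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the blank lines between rows
        tokens = line.split()
        # Accept only two plain ASCII integers, both inside [lo, hi].
        is_pair = len(tokens) == 2 and all(
            re.fullmatch(r"[0-9]+", t) and lo <= int(t) <= hi for t in tokens
        )
        if is_pair:
            valid.append((int(tokens[0]), int(tokens[1])))
        else:
            rejected.append(line)  # keep the raw line for manual review
    return valid, rejected

# A few lines taken from the quoted output below.
sample = "2 127\n\na 15\n\n7 56\n24 91\n3375\n"
valid, rejected = parse_pairs(sample)
print(valid)
print(rejected)
```

Lines such as `a 15` or `3375` end up in `rejected`, so only those images need a second look or a re-run with different settings.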

On Saturday, August 31, 2019 at 12:02:20 PM UTC-5, shree wrote:
>
> ubuntu@tesseract-ocr:~/TEST$ tesseract twonumbers.png - --psm 6 
> --tessdata-dir ~/tessdata --oem 1
> 2 127
>
> a 15
>
> 7 56
>
> 7 58
>
> 9 58
>
> 19 65
> 24 91
> 3375
> ubuntu@tesseract-ocr:~/TEST$ tesseract twonumbers.png - --psm 6 
> --tessdata-dir ~/tessdata_best --oem 1
> 2 127
>
> a 15
>
> 7 56
>
> 7 58
>
> 9 58
>
> 19 65
> 24 91
> 3375
> ubuntu@tesseract-ocr:~/TEST$ tesseract twonumbers.png - --psm 6 
> --tessdata-dir ~/tessdata_fast --oem 1
> 2 127
>
> 4 15
>
> 7 56
>
> 7 58
>
> 9 58
>
> 19 65
> 24 «(91
> 33 «75
>
> On Sat, Aug 31, 2019 at 9:54 PM Jack <[email protected]> 
> wrote:
>
>> I have a weird niche project here: essentially, I have about 4,000 images, 
>> each with 2 numbers between 0 and 127.
>> I've tweaked the images in a million different ways and I can't get 
>> tesseract to recognize individual numbers. With the exception of 2, all 
>> other 1-digit numbers are not recognized.
>>
>> Also, for some reason, if I use tesseract directly I get way worse 
>> results, whereas if I convert to PDF first and use ocrmypdf, which 
>> apparently uses tesseract, I get WAY better results, which I don't 
>> understand. 
>>
>> The font is very straightforward, I think, so I'm not sure if training 
>> would be helpful, but I'm open to the idea if needed.
>>
>> Here are the sample images I'm using for testing, before and after I 
>> modified them:
>> Before: https://imgur.com/a/PhjWXXK
>> After: https://imgur.com/a/sCRE67S
>> Okay some of them failed to upload but that's the gist.
>>
>> Thanks,
>> Jack
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/7be5ed42-df44-4530-b7a2-0d0fa340918e%40googlegroups.com.
>>
>
>
> -- 
>
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
