Did you have any luck without training, using only the bazaar config file and the user-patterns text file?
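
While comparing runs, it may help to check each OCR result against the pattern from the quoted message automatically. Below is a minimal sketch, not part of tesseract itself: `pattern_to_regex` and `matches` are hypothetical helper names, and the mapping assumes `\d` means one digit and `\c` means one letter, which is how the quoted user-patterns file appears to use them.

```python
import re

# Assumed meaning of the two escapes used in the quoted user-patterns file.
# Other escapes are not handled by this sketch.
_ESCAPES = {
    "d": r"[0-9]",     # \d: a single digit
    "c": r"[^\W\d_]",  # \c: a single letter
}


def pattern_to_regex(pattern):
    """Translate a pattern such as \\d\\d\\d\\d\\c into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern) and pattern[i + 1] in _ESCAPES:
            parts.append(_ESCAPES[pattern[i + 1]])
            i += 2
        else:
            # Any other character must match literally.
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def matches(pattern, text):
    """Return True if the whole OCR string (whitespace stripped) fits the pattern."""
    return pattern_to_regex(pattern).fullmatch(text.strip()) is not None


if __name__ == "__main__":
    user_pattern = r"\d\d\d\d\c"
    for candidate in ("1880A", "ISON"):
        print(candidate, matches(user_pattern, candidate))
```

Run on the two strings from the thread, this accepts the expected "1880A" and rejects the observed "ISON", so it can flag whether a change to the config or to the preprocessing actually produced a result of the right shape.
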
On Friday, April 26, 2019 at 10:19:12 PM UTC+8, JB Data31 wrote:
>
> Image processing can improve the result, but this typeface is very
> particular, i.e. the digits are "unconnected".
> I tried morphological transformations to reconnect the digits. The result is
> better (*tesseract ocr_inv-8.png ocr-8 --psm 6*), but still far from correct.
> In my opinion, training *tesseract* on this typeface is one way forward.
>
> @*JB*Δ <http://jbigdata.fr/jbweb2/dev/miscs/image/index.html>
>
> On Fri, Apr 26, 2019 at 09:53, <[email protected]> wrote:
>
>> Hi,
>>
>> I created a bazaar file as attached with load_system_dawg, load_freq_dawg
>> set to F. I also want to use user-patterns so I set it as well.
>> load_system_dawg F
>> load_freq_dawg F
>> user_patterns_suffix user-patterns
>>
>> In the same directory, I also have the user-pattern file:
>> \d\d\d\d\c
>>
>> So the structure looks like:
>>
>> ./bazaar
>> ./eng.user-patterns
>> ./ocr_inv.png
>>
>> But these settings still fail to recognise the image correctly as
>> "1880A". If I just run tesseract without any bells and whistles, the
>> outputs are still the same.
>>
>> Commands used:
>> tesseract ocr_inv.png stdout
>> tesseract ocr_inv.png stdout bazaar
>> tesseract ocr_inv.png stdout --user-patterns eng.user-patterns bazaar
>>
>> Output:
>> Warning. Invalid resolution 0 dpi. Using 70 instead.
>> Estimating resolution as 1128
>> ISON
>>
>> Can anyone tell me if this is expected behaviour?
>> Tesseract version:
>> tesseract 4.0.0-beta.1
>> leptonica-1.75.3
>> libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 :
>> libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
>>
>> Found AVX2
>> Found AVX
>> Found SSE
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/10d1ed21-d30e-4eaf-8f2a-6fdf74a6a7d1%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

