BLAZIKEN-M RAPIDASH-M VICTREEBEL-M SHRRPEDO-M PORYGON-I-M RAZELF-M with
tesseract -v
tesseract 4.0.0-beta.1-133-g5435c
leptonica-1.76.0
libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8 : libopenjp2 2.3.0
Found AVX
Found SSE

tesseract names.png - --tessdata-dir ./tessdata_best
Warning. Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 547
BLAZIKEN-M RAPIDASH-M VICTREEBEL-M SHRRPEDO-M PORYGON-I-M RAZELF-M

Which version of tesseract are you using?

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Sat, Apr 21, 2018 at 6:32 AM, 'DR' via tesseract-ocr <tesseract-ocr@googlegroups.com> wrote:

> I have this image I want to turn into text:
>
> <https://lh3.googleusercontent.com/-CQevnMSjYeM/WtqJNMUuI1I/AAAAAAAAAGY/_0vwKc52EMoAKeDcuyGrgWIPqb22raMfACLcBGAs/s1600/names.png>
>
> To clean it up, I've used Fred's textcleaner script
> (http://www.fmwconcepts.com/imagemagick/textcleaner/index.php) and ran
>
>     ./textcleaner -i 2 names.png result.png
>
> on the image, the result is now:
>
> <https://lh3.googleusercontent.com/-et8RIpYuVb8/WtqJxA3eEsI/AAAAAAAAAGg/I4TXRy4AzaIB2QVntxU28XUV3ZFBbGiEQCLcBGAs/s1600/result.png>
>
> It looks a lot cleaner, so now I use tesseract to turn it into text:
>
>     tesseract result.png stdout -psm 7 -l eng --user-words /path/to/eng.user-words --user-patterns /path/to/eng.user-patterns
>
> with the following files, eng.user-words:
>
>     BLAZIKEN
>     RAPIDASH
>     VICTREEBEL
>     SHARPEDO
>     PORYGON-Z
>     AZELF
>
> eng.user-pattern:
>
>     -M
>
> & /path/to/configs/bazaar:
>
>     load_system_dawg F
>     load_freq_dawg F
>     user_words_suffix user-words
>     user_patterns_suffix user-patterns
>
> Yet my output is:
>
>     Bl*H*ZIKEN-M R*H*PID*H*SH-M V*lE*TREEBEl-M SH*H*RPE*IIIJ*-M P*U*RY*Efl*N-Z-M *H*ZELF-M
>
> Since case isn't an issue for me, the only problems are "A" showing up as
> "H", "C" showing up as "LE", "DO" showing up as "IIIJ", and "GO" showing up
> as "Efl" (with "fl" being one character).
>
> I'm not sure how to make the image any clearer if possible or if I'm doing
> something wrong with tesseract. Any help is appreciated.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/cc3d86fb-4d9f-4e77-a5dd-23a41df213e3%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
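
Since the set of expected names is known in advance (the eng.user-words list in the quoted message), one possible workaround for the remaining misreads (SHRRPEDO, PORYGON-I, RAZELF) is to snap each recognized token to the nearest known name after OCR. The following is only a sketch in Python using the standard library's difflib. The function names correct_token and correct_line and the 0.6 similarity cutoff are assumptions made for illustration; they are not part of tesseract and were not proposed in this thread.

```python
import difflib

# Known names, copied from the eng.user-words list in the quoted message.
KNOWN_NAMES = ["BLAZIKEN", "RAPIDASH", "VICTREEBEL", "SHARPEDO", "PORYGON-Z", "AZELF"]
# Suffix copied from the eng.user-patterns file in the quoted message.
SUFFIX = "-M"


def correct_token(token, names=KNOWN_NAMES, cutoff=0.6):
    """Snap one OCR token to the closest known name (hypothetical helper).

    The trailing "-M" is split off before matching and re-attached afterwards.
    Tokens with no sufficiently similar name are returned unchanged.
    """
    upper = token.upper()  # case does not matter for this use case
    has_suffix = upper.endswith(SUFFIX)
    stem = upper[: -len(SUFFIX)] if has_suffix else upper
    matches = difflib.get_close_matches(stem, names, n=1, cutoff=cutoff)
    if not matches:
        return token
    return matches[0] + (SUFFIX if has_suffix else "")


def correct_line(line):
    """Apply correct_token to every whitespace-separated token in a line."""
    return " ".join(correct_token(tok) for tok in line.split())


if __name__ == "__main__":
    ocr_output = "BLAZIKEN-M RAPIDASH-M VICTREEBEL-M SHRRPEDO-M PORYGON-I-M RAZELF-M"
    print(correct_line(ocr_output))
    # prints: BLAZIKEN-M RAPIDASH-M VICTREEBEL-M SHARPEDO-M PORYGON-Z-M AZELF-M
```

This only helps when every line is expected to consist of names from the list. It does not address the recognition errors themselves, and a cutoff that is too low could silently map unrelated text onto a known name.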