I tried version 4.0, built from source. However, I get the following messages and, not surprisingly, the output is completely wrong.

Failed to load any lstm-specific dictionaries for lang eng-numCAPS!!
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
Warning. Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 233

Output: *4JTX9T*

I understand that the DPI warning has been there since older versions (I got it in 3.05 as well). Is the 'lstm-specific' message probably coming from the training data file? Is the only other option to train/fine-tune on my own set?

On Tuesday, March 27, 2018 at 1:37:36 PM UTC-7, shree wrote:
>
> Version mismatch. That traineddata is for 4.0.
>
> Wiki has pages for training. Look for one appropriate for your version of
> tesseract.
>
> On Wed 28 Mar, 2018, 1:23 AM , <[email protected]> wrote:
>
>> Hi Shree,
>>
>> I just tried using the training data file you provided but it seems that
>> there is some problem with Tesseract recognizing this file. I should have
>> mentioned before that I am using version '3.05.01'.
>>
>> Below is the sequence of commands I ran:
>>
>> Bhargavs-MacBook-Pro-2:LPR bhargav$ tesseract topcrop1.jpg out -l end-numCAPS
>> Error opening data file /usr/local/Cellar/tesseract/3.05.01/share/tessdata/end-numCAPS.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to the
>> parent directory of your "tessdata" directory.
>> Failed loading language 'end-numCAPS'
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>> Bhargavs-MacBook-Pro-2:LPR bhargav$ ls /usr/local/Cellar/tesseract/3.05.01/share/tessdata/
>> configs                  eng.traineddata  pdf.ttf
>> eng-numCAPS.traineddata  osd.traineddata  tessconfigs
>>
>> Bhargavs-MacBook-Pro-2:LPR bhargav$ echo $TESSDATA_PREFIX
>> /usr/local/share/tessdata
>>
>> Please let me know if I have done something wrong, or if the train data
>> file has a version mismatch or is corrupted.
>>
>> Thanks,
>> Bhargav
>>
>> On Tuesday, March 27, 2018 at 11:24:36 AM UTC-7, [email protected] wrote:
>>>
>>> Thank you Shree. I will give it a shot with the attached train data!
>>>
>>> About fine-tuning, are there any example tutorials on the Tesseract
>>> wiki? I am not sure. I will try to find them, but if you know and post
>>> the link, I would really appreciate that!
>>>
>>> Thanks.
>>>
>>> On Tuesday, March 27, 2018 at 3:00:06 AM UTC-7, shree wrote:
>>>>
>>>> You can try finetune training.
>>>>
>>>> Test with attached traineddata file.
>>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/c346ec8b-32ef-4b29-b9e6-e5d9225a31df%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
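
A side note on the quoted 3.05.01 run: the command asked for `-l end-numCAPS`, while the `ls` output lists `eng-numCAPS.traineddata`, so part of that "Error opening data file" failure may simply be the language name. Below is a minimal sketch of a name check, assuming Tesseract looks for `<lang>.traineddata` in the tessdata directory; `check_language` is a hypothetical helper, not part of Tesseract. It only compares file names and cannot detect the 3.05 vs 4.0 format mismatch shree mentions.

```python
import difflib
from pathlib import Path


def check_language(tessdata_dir, lang):
    """Return (found, suggestions) for a name passed to `tesseract -l`.

    Hypothetical helper: compares file names only, it does not open or
    validate the traineddata file itself.
    """
    # Collect installed language names, e.g. "eng-numCAPS" from
    # "eng-numCAPS.traineddata"; other files (pdf.ttf, configs) are ignored.
    available = sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))
    if lang in available:
        return True, []
    # Not installed under that name: suggest similarly spelled names.
    return False, difflib.get_close_matches(lang, available, n=3, cutoff=0.6)


if __name__ == "__main__":
    # Directory taken from the quoted error message; adjust for your install.
    tessdata = "/usr/local/Cellar/tesseract/3.05.01/share/tessdata"
    found, suggestions = check_language(tessdata, "end-numCAPS")
    if found:
        print("traineddata file found")
    else:
        print("not found; similar installed names:", suggestions)
```

The same quoted run also shows `TESSDATA_PREFIX` set to `/usr/local/share/tessdata` while the error reports a path under `/usr/local/Cellar/...`, so it is worth confirming which directory your build actually searches before passing it to a check like this.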

