Thanks for your advice. However, I am using Ubuntu on WSL (Windows Subsystem for Linux), and I have already tried setting TESSDATA_PREFIX by running $ export TESSDATA_PREFIX=/usr/share/tesseract-ocr/4.00/tessdata/ . Even so, I cannot use tesseract. Tesseract works if I use the traineddata installed by sudo apt install tesseract-ocr-chi-sim, but not with the data downloaded from the Data Files page. Is it not possible to use tesseract this way on WSL (Ubuntu)?
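For reference, here is a rough sketch of a check I could run on the file itself. It is only a guess on my part that the downloaded file may be unreadable, empty, or a saved web page rather than the binary traineddata; nothing in the error message confirms this. The function name check_traineddata is hypothetical and not part of tesseract.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesseract): classify a traineddata file.
# Prints one of: missing, unreadable, empty, html, ok
check_traineddata() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "missing"        # no file at that path
    elif [ ! -r "$file" ]; then
        echo "unreadable"     # exists, but the current user cannot read it
    elif [ ! -s "$file" ]; then
        echo "empty"          # zero-byte file
    elif head -c 512 "$file" | LC_ALL=C grep -qiE '<html|<!doctype'; then
        echo "html"           # looks like a saved web page, not binary data
    else
        echo "ok"             # none of the obvious problems above
    fi
}

# Uses TESSDATA_PREFIX if it is set, otherwise the directory from my message.
check_traineddata "${TESSDATA_PREFIX:-/usr/share/tesseract-ocr/4.00/tessdata}/chi_sim.traineddata"
```

A result of "ok" only rules out these simple cases; it does not prove that tesseract can load the file.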

On Sunday, December 8, 2019 at 11:06:58 UTC+9, NY C wrote:
>
> Try to set the TESSDATA_PREFIX environment variable.
>
> 1. Go to Control Panel -> System -> Advanced System Settings ->
>    Advanced tab -> *Environment Variables...* button
> 2. In the System variables window, scroll down to *TESSDATA_PREFIX*. If
>    it's not right, select it and click *Edit...*
>
> 坂本聖 wrote on Sunday, December 8, 2019 at 12:34:26 AM UTC+8:
>>
>> Hi,
>> I want to use tesseract for Chinese text. So, first I executed the
>> command
>> sudo apt install tesseract-ocr-chi-sim
>> After that, I can find chi_sim.traineddata in
>> /usr/share/tesseract-ocr/4.00/tessdata and can check it like this (I
>> also downloaded chi_tra and jpn).
>>
>> $ tesseract --list-langs
>> List of available languages (5):
>> chi_sim
>> chi_tra
>> eng
>> jpn
>> osd
>>
>> Actually, I can use tesseract, but I want more accurate OCR, so I want
>> to use the chi_sim.traineddata downloaded from here:
>> https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
>> After I executed the command
>> sudo apt remove tesseract-ocr-chi-sim
>> I put the new chi_sim.traineddata in
>> /usr/share/tesseract-ocr/4.00/tessdata and tried to use tesseract.
>> However, it fails like this:
>>
>> $ tesseract 0.jpeg output -l chi_sim
>> Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
>> Failed loading language 'chi_sim'
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>> Then I tried passing the directory explicitly, but it still fails:
>>
>> $ tesseract 0.jpeg output -l chi_sim --tessdata-dir /usr/share/tesseract-ocr/4.00/tessdata
>> Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
>> Failed loading language 'chi_sim'
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>> Then I set TESSDATA_PREFIX to /usr/share/tesseract-ocr/4.00/tessdata
>> and tried again, but it still fails:
>>
>> $ export TESSDATA_PREFIX=/usr/share/tesseract-ocr/4.00/tessdata/
>> $ tesseract 0.jpeg output -l chi_sim
>> Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
>> Failed loading language 'chi_sim'
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>> If I list the languages, chi_sim still appears:
>>
>> $ tesseract --list-langs
>> List of available languages (5):
>> chi_sim
>> chi_tra
>> eng
>> jpn
>> osd
>>
>> Please tell me why I cannot use the traineddata downloaded from
>> https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
>> Did I make a mistake?
>>

