What is the output of:

tesseract --version

Zdenko

On Sun, 8 Dec 2019 at 15:55, 坂本聖 <[email protected]> wrote:

> Thanks for your advice.
> I downloaded the file by clicking the "Download" button at
> https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata.
> I moved the chi_sim.traineddata file to
> /usr/share/tesseract-ocr/4.00/tessdata/ and confirmed that the file
> (42.3 MB in size) is there.
> But I still cannot use tesseract.
> As I said, tesseract works with the file installed by running
> sudo apt install tesseract-ocr-chi-sim, but the data downloaded from
> Data files does not work.
> I cannot understand why it does not work.
>
> On Sunday, 8 December 2019 at 23:15:31 UTC+9, zdenop wrote:
>>
>> How did you download the files from the repository?
>> Please check whether the files in /usr/share/tesseract-ocr/4.00/tessdata/
>> have the same size as in the repository.
>>
>> Zdenko
>>
>>
>> On Sat, 7 Dec 2019 at 17:34, 坂本聖 <[email protected]> wrote:
>>
>>> Hi,
>>> I want to use tesseract for Chinese text. So, first I ran the command
>>> sudo apt install tesseract-ocr-chi-sim
>>> I can find chi_sim.traineddata in
>>> /usr/share/tesseract-ocr/4.00/tessdata and can verify it like this
>>> (I also installed chi_tra and jpn):
>>>
>>> $ tesseract --list-langs
>>> List of available languages (5):
>>> chi_sim
>>> chi_tra
>>> eng
>>> jpn
>>> osd
>>>
>>> Tesseract works, but I want more accurate OCR, so I want to use the
>>> chi_sim.traineddata downloaded from here:
>>> https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
>>> After I ran the command
>>> sudo apt remove tesseract-ocr-chi-sim
>>> I put the new chi_sim.traineddata in
>>> /usr/share/tesseract-ocr/4.00/tessdata and tried to use tesseract.
>>> However, it fails like this:
>>>
>>> $ tesseract 0.jpeg output -l chi_sim
>>> Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata
>>> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
>>> Failed loading language 'chi_sim'
>>> Tesseract couldn't load any languages!
>>> Could not initialize tesseract.
>>>
>>> Then I tried passing the directory explicitly, but it also fails:
>>>
>>> $ tesseract 0.jpeg output -l chi_sim --tessdata-dir /usr/share/tesseract-ocr/4.00/tessdata
>>> Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata
>>> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
>>> Failed loading language 'chi_sim'
>>> Tesseract couldn't load any languages!
>>> Could not initialize tesseract.
>>>
>>> Then I set TESSDATA_PREFIX to /usr/share/tesseract-ocr/4.00/tessdata
>>> and tried again, but it still fails:
>>>
>>> $ export TESSDATA_PREFIX=/usr/share/tesseract-ocr/4.00/tessdata/
>>> $ tesseract 0.jpeg output -l chi_sim
>>> Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata
>>> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
>>> Failed loading language 'chi_sim'
>>> Tesseract couldn't load any languages!
>>> Could not initialize tesseract.
>>>
>>> If I list the languages, chi_sim still appears:
>>>
>>> $ tesseract --list-langs
>>> List of available languages (5):
>>> chi_sim
>>> chi_tra
>>> eng
>>> jpn
>>> osd
>>>
>>> Please tell me why I cannot use the traineddata downloaded from
>>> https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
>>> Did I make a mistake?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/e93f49e3-978e-458d-8f97-1e0266a318c8%40googlegroups.com
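
A few file-level checks may help narrow down an "Error opening data file" message like the one quoted above. The sketch below is only a guess at possible causes (a missing, unreadable, empty, or HTML-instead-of-binary file); the thread does not establish which of these, if any, applies here. check_traineddata is a hypothetical helper name used for illustration and is not part of tesseract.

```shell
#!/bin/sh
# check_traineddata FILE
# Hypothetical helper: prints one diagnostic line and returns non-zero
# if FILE is missing, not a regular file, unreadable, empty, or looks
# like a saved HTML page rather than binary model data.
check_traineddata() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    if [ ! -f "$file" ]; then
        echo "not a regular file: $file"
        return 1
    fi
    if [ ! -r "$file" ]; then
        echo "not readable by $(id -un): $file"
        return 1
    fi
    # Byte count; tr strips the padding some wc implementations add.
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -eq 0 ]; then
        echo "empty file: $file"
        return 1
    fi
    # Assumption: a web page saved by mistake starts with HTML markup.
    if head -c 512 "$file" | LC_ALL=C grep -qiE '<html|<!doctype'; then
        echo "looks like HTML, not model data: $file"
        return 1
    fi
    echo "ok: $file ($size bytes)"
}
```

For the case in this thread, the call would be check_traineddata /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata, run as the same user that runs tesseract. Comparing ls -l output for chi_sim.traineddata against a file that does load, such as eng.traineddata in the same directory, would additionally show any difference in owner or permissions.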

