[tesseract-ocr] Re: Tesseract couldn't load any languages!
It is possible that you have not downloaded eng.traineddata, or that it is in a different location. Try running tesseract on the command line and check the output of --list-langs.

On Friday, May 18, 2018 at 9:27:59 AM UTC+5:30, Dattatraya Tembare wrote:
> *[SOLVED] Changed the language from 'hin+eng' to 'hin'. In this case the selection of language also matters.*
>
> I was processing the image with lang=hin+eng, but it was giving the same error (mentioned in this post).
>
> As there was little English text in the image, I changed to lang=hin and got the expected result.
>
> public static void main(String[] args) {
>     Tesseract in = new ReadImageText().getTesseractInstance(
>             "C:/Program Files (x86)/Tesseract-OCR/tessdata/", "hin");
>     try {
>         String resultText = in.doOCR(new File("C:/EA/app-result/im/01-001/34/0.png"));
>         log.info("resultText {}", resultText);
>     } catch (TesseractException e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     }
> }
>
> On Friday, May 4, 2018 at 2:38:16 PM UTC-4, Dattatraya Tembare wrote:
>> Exception in thread "main" java.lang.Error: Invalid memory access
>>     at com.sun.jna.Native.invokePointer(Native Method)
>>     at com.sun.jna.Function.invokePointer(Function.java:490)
>>     at com.sun.jna.Function.invoke(Function.java:434)
>>     at com.sun.jna.Function.invoke(Function.java:354)
>>     at com.sun.jna.Library$Handler.invoke(Library.java:244)
>>     at com.sun.proxy.$Proxy0.TessBaseAPIGetUTF8Text(Unknown Source)
>>     at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:433)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:288)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:209)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:193)
>>     at com.ea.ocr.tesseract.ReadImageText.readText(ReadImageText.java:59)
>>     at com.ea.ocr.tesseract.ReadImageText.main(ReadImageText.java:32)
>> Error opening data file ./eng.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
>> Failed loading language 'eng'
>> Tesseract couldn't load any languages!

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/5e7a362a-8e38-4ce0-b096-12ca2dfcfc70%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
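As a rough programmatic counterpart to --list-langs, the sketch below lists the language codes for which a *.traineddata file sits directly in a given tessdata directory. The names TessdataLister, languagesFrom and listLanguages are made up for illustration and are not part of Tesseract or Tess4J. The default path is the one quoted in this thread, and the sketch assumes the files are in the top level of that directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Stream;

// Hypothetical helper, not part of Tesseract or Tess4J.
final class TessdataLister {
    static final String SUFFIX = ".traineddata";

    // Maps file names to language codes, keeping only names ending in ".traineddata".
    // The result is sorted and free of duplicates.
    static List<String> languagesFrom(List<String> fileNames) {
        TreeSet<String> langs = new TreeSet<>();
        for (String name : fileNames) {
            if (name.endsWith(SUFFIX) && name.length() > SUFFIX.length()) {
                langs.add(name.substring(0, name.length() - SUFFIX.length()));
            }
        }
        return new ArrayList<>(langs);
    }

    // Reads the top level of the directory and returns the language codes found there.
    static List<String> listLanguages(Path tessdataDir) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> entries = Files.list(tessdataDir)) {
            entries.filter(Files::isRegularFile)
                   .forEach(p -> names.add(p.getFileName().toString()));
        }
        return languagesFrom(names);
    }

    public static void main(String[] args) throws IOException {
        // Path taken from the thread; pass your own directory as the first argument.
        Path dir = Paths.get(args.length > 0
                ? args[0]
                : "C:/Program Files (x86)/Tesseract-OCR/tessdata/");
        System.out.println(listLanguages(dir));
    }
}
```

If eng does not appear in the printed list, the "Error opening data file ./eng.traineddata" message is consistent with the file being absent from the directory that was checked.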
[tesseract-ocr] Re: Tesseract couldn't load any languages!
*[SOLVED] Changed the language from 'hin+eng' to 'hin'. In this case the selection of language also matters.*

I was processing the image with lang=hin+eng, but it was giving the same error (mentioned in this post).

As there was little English text in the image, I changed to lang=hin and got the expected result.

public static void main(String[] args) {
    Tesseract in = new ReadImageText().getTesseractInstance(
            "C:/Program Files (x86)/Tesseract-OCR/tessdata/", "hin");
    try {
        String resultText = in.doOCR(new File("C:/EA/app-result/im/01-001/34/0.png"));
        log.info("resultText {}", resultText);
    } catch (TesseractException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}

On Friday, May 4, 2018 at 2:38:16 PM UTC-4, Dattatraya Tembare wrote:
> Exception in thread "main" java.lang.Error: Invalid memory access
>     at com.sun.jna.Native.invokePointer(Native Method)
>     at com.sun.jna.Function.invokePointer(Function.java:490)
>     at com.sun.jna.Function.invoke(Function.java:434)
>     at com.sun.jna.Function.invoke(Function.java:354)
>     at com.sun.jna.Library$Handler.invoke(Library.java:244)
>     at com.sun.proxy.$Proxy0.TessBaseAPIGetUTF8Text(Unknown Source)
>     at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:433)
>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:288)
>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:209)
>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:193)
>     at com.ea.ocr.tesseract.ReadImageText.readText(ReadImageText.java:59)
>     at com.ea.ocr.tesseract.ReadImageText.main(ReadImageText.java:32)
> Error opening data file ./eng.traineddata
> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
> Failed loading language 'eng'
> Tesseract couldn't load any languages!
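Since 'hin+eng' asks for two languages, one way to see in advance which part of such a setting might fail is to check each code for a matching traineddata file. The following is a minimal sketch, assuming the codes are joined with '+' and each language is stored as <code>.traineddata in the tessdata directory. LangSpecCheck and missing are hypothetical names, not Tess4J API, and the path is the one used in the post above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of Tesseract or Tess4J.
final class LangSpecCheck {
    // Splits a setting such as "hin+eng" on '+' and returns, in order,
    // the codes for which hasData reports that no traineddata is available.
    static List<String> missing(String langSpec, Predicate<String> hasData) {
        List<String> missing = new ArrayList<>();
        for (String code : langSpec.split("\\+")) {
            String lang = code.trim();
            if (!lang.isEmpty() && !hasData.test(lang)) {
                missing.add(lang);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed location; replace with your own tessdata directory.
        Path tessdata = Paths.get("C:/Program Files (x86)/Tesseract-OCR/tessdata/");
        List<String> absent = missing("hin+eng",
                lang -> Files.isRegularFile(tessdata.resolve(lang + ".traineddata")));
        System.out.println(absent.isEmpty()
                ? "all languages present"
                : "missing: " + absent);
    }
}
```

Passing the lookup as a predicate keeps the splitting logic separate from the file system, so the same check can be run against any directory or a plain list of known languages.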
[tesseract-ocr] Re: Tesseract couldn't load any languages!
Thanks! Your solution worked.

Now I am facing a different problem: with the same pattern, 33 files were processed successfully, but processing failed for the 34th file.

java.lang.Error: Invalid memory access
    at com.sun.jna.Native.invokePointer(Native Method) ~[jna-4.5.1.jar:4.5.1 (b0)]
    at com.sun.jna.Function.invokePointer(Function.java:490) ~[jna-4.5.1.jar:4.5.1 (b0)]
    at com.sun.jna.Function.invoke(Function.java:434) ~[jna-4.5.1.jar:4.5.1 (b0)]
    at com.sun.jna.Function.invoke(Function.java:354) ~[jna-4.5.1.jar:4.5.1 (b0)]
    at com.sun.jna.Library$Handler.invoke(Library.java:244) ~[jna-4.5.1.jar:4.5.1 (b0)]
    at com.sun.proxy.$Proxy77.TessBaseAPIGetUTF8Text(Unknown Source) ~[na:na]
    at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:433) ~[tess4j-4.0.1.jar:4.0.1]
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:288) ~[tess4j-4.0.1.jar:4.0.1]
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:209) ~[tess4j-4.0.1.jar:4.0.1]
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:193) ~[tess4j-4.0.1.jar:4.0.1]

When I looked into the Tesseract code, I found the line below:

    Pointer utf8Text = renderedFormat == RenderedFormat.HOCR
            ? api.TessBaseAPIGetHOCRText(handle, pageNum - 1)
            : api.TessBaseAPIGetUTF8Text(handle);

Please guide.

Regards,
Datta

On Friday, May 4, 2018 at 6:04:41 PM UTC-4, Quan Nguyen wrote:
> You'll need to setDatapath to your tessdata directory so Tesseract can find the *.traineddata files.
>
> On Friday, May 4, 2018 at 1:38:16 PM UTC-5, Dattatraya Tembare wrote:
>> Exception in thread "main" java.lang.Error: Invalid memory access
>>     at com.sun.jna.Native.invokePointer(Native Method)
>>     at com.sun.jna.Function.invokePointer(Function.java:490)
>>     at com.sun.jna.Function.invoke(Function.java:434)
>>     at com.sun.jna.Function.invoke(Function.java:354)
>>     at com.sun.jna.Library$Handler.invoke(Library.java:244)
>>     at com.sun.proxy.$Proxy0.TessBaseAPIGetUTF8Text(Unknown Source)
>>     at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:433)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:288)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:209)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:193)
>>     at com.ea.ocr.tesseract.ReadImageText.readText(ReadImageText.java:59)
>>     at com.ea.ocr.tesseract.ReadImageText.main(ReadImageText.java:32)
>> Error opening data file ./eng.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
>> Failed loading language 'eng'
>> Tesseract couldn't load any languages!
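The trace in this message is a java.lang.Error, which a catch (TesseractException e) block does not catch, so a single bad image can end the whole run. The sketch below is a diagnostic aid rather than a fix for the native fault: it runs an OCR step over a list of inputs, records which ones failed, and keeps going. BatchOcr, Result and runAll are hypothetical names, and the OCR step is passed in as a plain function so that the example does not depend on Tess4J. It also assumes that continuing after such an error is acceptable, which may not hold if the native library is left in a bad state.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Tess4J.
final class BatchOcr {
    // Texts for inputs that succeeded, and the Throwable for each input that failed.
    static final class Result {
        final Map<String, String> texts = new LinkedHashMap<>();
        final Map<String, Throwable> failures = new LinkedHashMap<>();
    }

    // Applies the OCR step to every input. A failure on one input is recorded
    // and does not stop the remaining inputs from being processed.
    static Result runAll(List<String> inputs, Function<String, String> ocr) {
        Result result = new Result();
        for (String input : inputs) {
            try {
                result.texts.put(input, ocr.apply(input));
            } catch (RuntimeException | Error e) {
                // java.lang.Error is caught here on purpose: the JNA failure
                // in the thread is an Error, not an Exception.
                result.failures.put(input, e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in OCR step that fails on one input, the way the 34th file did.
        Function<String, String> fakeOcr = name -> {
            if (name.equals("34/0.png")) {
                throw new Error("Invalid memory access");
            }
            return "text of " + name;
        };
        Result r = runAll(Arrays.asList("33/0.png", "34/0.png", "35/0.png"), fakeOcr);
        System.out.println("ok: " + r.texts.keySet());
        System.out.println("failed: " + r.failures.keySet());
    }
}
```

Knowing exactly which inputs fail makes it easier to compare the failing image with the 33 that worked, for example in size, format or content.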
[tesseract-ocr] Re: Tesseract couldn't load any languages!
You'll need to setDatapath to your tessdata directory so Tesseract can find the *.traineddata files.

On Friday, May 4, 2018 at 1:38:16 PM UTC-5, Dattatraya Tembare wrote:
> Exception in thread "main" java.lang.Error: Invalid memory access
>     at com.sun.jna.Native.invokePointer(Native Method)
>     at com.sun.jna.Function.invokePointer(Function.java:490)
>     at com.sun.jna.Function.invoke(Function.java:434)
>     at com.sun.jna.Function.invoke(Function.java:354)
>     at com.sun.jna.Library$Handler.invoke(Library.java:244)
>     at com.sun.proxy.$Proxy0.TessBaseAPIGetUTF8Text(Unknown Source)
>     at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:433)
>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:288)
>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:209)
>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:193)
>     at com.ea.ocr.tesseract.ReadImageText.readText(ReadImageText.java:59)
>     at com.ea.ocr.tesseract.ReadImageText.main(ReadImageText.java:32)
> Error opening data file ./eng.traineddata
> Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
> Failed loading language 'eng'
> Tesseract couldn't load any languages!