[tesseract-ocr] Re: Tesseract couldn't load any languages!

2018-05-17 Thread shree
It is possible that you have not downloaded eng.traineddata, or that it is in a
different location.

Try running tesseract on the command line and check the output of --list-langs.
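
A similar check can be done from Java before starting OCR. The following is only a minimal sketch: it splits a language string such as hin+eng on '+' and reports which codes have no matching .traineddata file in a given directory. The class name TessdataCheck, the method missingLanguages, and the default directory "tessdata" are made up for illustration and are not part of Tesseract or Tess4J.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class TessdataCheck {

    /**
     * Returns the codes from a language spec such as "hin+eng" for which
     * no <code>.traineddata file exists directly inside tessdataDir.
     */
    public static List<String> missingLanguages(Path tessdataDir, String languageSpec) {
        List<String> missing = new ArrayList<>();
        for (String code : languageSpec.split("\\+")) {
            if (code.isEmpty()) {
                continue; // ignore stray '+' separators
            }
            Path model = tessdataDir.resolve(code + ".traineddata");
            if (!Files.isRegularFile(model)) {
                missing.add(code);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass your own directory and language spec as arguments.
        Path dir = Paths.get(args.length > 0 ? args[0] : "tessdata");
        String spec = args.length > 1 ? args[1] : "hin+eng";
        System.out.println("Missing in " + dir.toAbsolutePath() + ": " + missingLanguages(dir, spec));
    }
}
```

If eng shows up as missing for hin+eng but nothing is missing for hin alone, that would be consistent with the "Failed loading language 'eng'" message quoted below.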

On Friday, May 18, 2018 at 9:27:59 AM UTC+5:30, Dattatraya Tembare wrote:
>
>
> [SOLVED] Changed the language from 'hin+eng' to 'hin'.
>
> In this case the choice of language also matters. I was processing the image
> with lang=hin+eng, but it gave the same error (mentioned in this post).
>
> As the image contained little English text, I changed to lang=hin and got the
> expected result.
>
> public static void main(String[] args) {
>     // getTesseractInstance is a helper on ReadImageText (not shown),
>     // called here with the tessdata path and the language.
>     Tesseract in = new ReadImageText().getTesseractInstance(
>             "C:/Program Files (x86)/Tesseract-OCR/tessdata/", "hin");
>     try {
>         String resultText = in.doOCR(new File("C:/EA/app-result/im/01-001/34/0.png"));
>         log.info("resultText {}", resultText);
>     } catch (TesseractException e) {
>         e.printStackTrace();
>     }
> }
>
>
> On Friday, May 4, 2018 at 2:38:16 PM UTC-4, Dattatraya Tembare wrote:
>>
>> Exception in thread "main" java.lang.Error: Invalid memory access
>> at com.sun.jna.Native.invokePointer(Native Method)
>> at com.sun.jna.Function.invokePointer(Function.java:490)
>> at com.sun.jna.Function.invoke(Function.java:434)
>> at com.sun.jna.Function.invoke(Function.java:354)
>> at com.sun.jna.Library$Handler.invoke(Library.java:244)
>> at com.sun.proxy.$Proxy0.TessBaseAPIGetUTF8Text(Unknown Source)
>> at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:433)
>> at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:288)
>> at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:209)
>> at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:193)
>> at com.ea.ocr.tesseract.ReadImageText.readText(ReadImageText.java:59)
>> at com.ea.ocr.tesseract.ReadImageText.main(ReadImageText.java:32)
>> Error opening data file ./eng.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your 
>> "tessdata" directory.
>> Failed loading language 'eng'
>> Tesseract couldn't load any languages!
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/5e7a362a-8e38-4ce0-b096-12ca2dfcfc70%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[tesseract-ocr] Re: Tesseract couldn't load any languages!

2018-05-17 Thread Dattatraya Tembare



[SOLVED] Changed the language from 'hin+eng' to 'hin'.

In this case the choice of language also matters. I was processing the image
with lang=hin+eng, but it gave the same error (mentioned in this post).

As the image contained little English text, I changed to lang=hin and got the
expected result.

public static void main(String[] args) {
    // getTesseractInstance is a helper on ReadImageText (not shown),
    // called here with the tessdata path and the language.
    Tesseract in = new ReadImageText().getTesseractInstance(
            "C:/Program Files (x86)/Tesseract-OCR/tessdata/", "hin");
    try {
        String resultText = in.doOCR(new File("C:/EA/app-result/im/01-001/34/0.png"));
        log.info("resultText {}", resultText);
    } catch (TesseractException e) {
        e.printStackTrace();
    }
}





[tesseract-ocr] Re: Tesseract couldn't load any languages!

2018-05-17 Thread Dattatraya Tembare
Thanks!
Your solution worked.
Now I am facing something different. With the same pattern, 33 files were
processed successfully, but the 34th file failed.

java.lang.Error: Invalid memory access
 at com.sun.jna.Native.invokePointer(Native Method) ~[jna-4.5.1.jar:4.5.1 (b0)]
 at com.sun.jna.Function.invokePointer(Function.java:490) ~[jna-4.5.1.jar:4.5.1 (b0)]
 at com.sun.jna.Function.invoke(Function.java:434) ~[jna-4.5.1.jar:4.5.1 (b0)]
 at com.sun.jna.Function.invoke(Function.java:354) ~[jna-4.5.1.jar:4.5.1 (b0)]
 at com.sun.jna.Library$Handler.invoke(Library.java:244) ~[jna-4.5.1.jar:4.5.1 (b0)]
 at com.sun.proxy.$Proxy77.TessBaseAPIGetUTF8Text(Unknown Source) ~[na:na]
 at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:433) ~[tess4j-4.0.1.jar:4.0.1]
 at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:288) ~[tess4j-4.0.1.jar:4.0.1]
 at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:209) ~[tess4j-4.0.1.jar:4.0.1]
 at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:193) ~[tess4j-4.0.1.jar:4.0.1]

When I checked the Tess4J code (Tesseract.java), I found the line below:

Pointer utf8Text = renderedFormat == RenderedFormat.HOCR
        ? api.TessBaseAPIGetHOCRText(handle, pageNum - 1)
        : api.TessBaseAPIGetUTF8Text(handle);

Please guide.

Regards,
Datta
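
One way to narrow down a failure like this is to keep the batch running and record exactly which file fails and with what error. The following is only a sketch under stated assumptions: BatchOcr and findFailures are hypothetical names, and the OCR step is passed in as a plain function, standing in for a call such as doOCR (which throws a checked TesseractException in Tess4J and would need to be wrapped in the lambda). "Invalid memory access" arrives as a java.lang.Error, so the sketch catches Error as well as RuntimeException. After a native memory fault the process state may not be reliable, so this is meant for diagnosis rather than as a recovery strategy.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class BatchOcr {

    /**
     * Applies ocr to every file and returns one "path -> error" entry per
     * file that failed, instead of stopping at the first failure.
     */
    public static List<String> findFailures(List<File> files, Function<File, String> ocr) {
        List<String> failures = new ArrayList<>();
        for (File file : files) {
            try {
                ocr.apply(file);
            } catch (RuntimeException | Error e) {
                // java.lang.Error is caught on purpose: that is the type shown in the stack trace.
                failures.add(file.getPath() + " -> " + e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(new File("0.png"), new File("1.png"), new File("2.png"));
        // Stand-in for the real OCR call: pretends that one particular file triggers the error.
        Function<File, String> fakeOcr = f -> {
            if (f.getName().equals("1.png")) {
                throw new Error("Invalid memory access");
            }
            return "recognized text";
        };
        findFailures(files, fakeOcr).forEach(System.out::println);
    }
}
```

Once the failing file is known, it can be examined on its own (size, format, whether it opens in an image viewer) and compared with the files that succeeded.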

On Friday, May 4, 2018 at 6:04:41 PM UTC-4, Quan Nguyen wrote:
>
> You'll need to setDatapath to your tessdata directory so Tesseract can 
> find the *.traineddata files



[tesseract-ocr] Re: Tesseract couldn't load any languages!

2018-05-04 Thread Quan Nguyen
You'll need to call setDatapath with your tessdata directory so Tesseract can
find the *.traineddata files.
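
As a rough illustration of picking that directory, the sketch below tries a list of candidate locations (an explicit argument first, then the TESSDATA_PREFIX environment variable named in the error message) and returns the first one that actually contains *.traineddata files. DatapathResolver, looksLikeTessdata and resolve are hypothetical names, not Tess4J API. The Tess4J calls are shown only as comments so that the block compiles without the library on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class DatapathResolver {

    /** True if dir exists and directly contains at least one *.traineddata file. */
    public static boolean looksLikeTessdata(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> models = Files.newDirectoryStream(dir, "*.traineddata")) {
            return models.iterator().hasNext();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the first non-empty candidate that looks like a tessdata directory. */
    public static Optional<Path> resolve(String... candidates) {
        for (String candidate : candidates) {
            if (candidate == null || candidate.isEmpty()) {
                continue;
            }
            Path dir = Paths.get(candidate);
            if (looksLikeTessdata(dir)) {
                return Optional.of(dir);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String explicit = args.length > 0 ? args[0] : null;
        Optional<Path> datapath = resolve(explicit, System.getenv("TESSDATA_PREFIX"));
        if (!datapath.isPresent()) {
            System.err.println("No directory with *.traineddata files found");
            return;
        }
        System.out.println("Using tessdata directory: " + datapath.get());
        // With Tess4J on the classpath, the directory would then be handed over, for example:
        //   Tesseract tesseract = new Tesseract();
        //   tesseract.setDatapath(datapath.get().toString());
        //   tesseract.setLanguage("hin");
    }
}
```

Failing early with a clear message when no usable directory is found avoids reaching the native call with no language loaded.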

