[tesseract-ocr] Re: Tess4j failing near load of shared library tesseract-ocr-5.2 in Java 11 and 17, succeeds in Java 8

2022-09-28 Thread Quan Nguyen
PDF files are read by the PDFBox library. You may want to look into that 
area as well.
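
For reference, the "Caused by" line in the quoted report, IllegalStateException: zip file closed, is the message java.util.zip.ZipFile produces when an entry is requested after the archive has been closed. The sketch below reproduces only that JDK condition, using a throwaway zip file. The class name ZipClosedDemo and its method are hypothetical, and it does not show which archive was closed in the reported program, since the quoted trace stops after two frames.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of tess4j, JNA, or PDFBox.
final class ZipClosedDemo {

    // Writes a one-entry zip, opens it, closes it, then asks for the entry.
    // Returns the message of the IllegalStateException, or null if none is thrown.
    static String messageAfterClose() {
        try {
            Path tmp = Files.createTempFile("demo", ".zip");
            try {
                try (OutputStream out = Files.newOutputStream(tmp);
                     ZipOutputStream zos = new ZipOutputStream(out)) {
                    zos.putNextEntry(new ZipEntry("hello.txt"));
                    zos.write("hi".getBytes(StandardCharsets.UTF_8));
                    zos.closeEntry();
                }
                ZipFile zip = new ZipFile(tmp.toFile());
                zip.close();
                try {
                    zip.getEntry("hello.txt"); // lookup on a closed archive
                    return null;
                } catch (IllegalStateException e) {
                    return e.getMessage();
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(messageAfterClose()); // prints "zip file closed"
    }
}
```

If the same message appears in a real program, a reasonable next step is to find out which jar or zip was closed while something else still held a reference to it.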

On Wednesday, September 28, 2022 at 10:52:15 PM UTC-5 Quan Nguyen wrote:

> The source of tess4j is available; you can trace through the code to see 
> what threw the exception.
>
> Nevertheless, "throwable while reading PDF" seems to point to the part of 
> the code that reads in the PDF file. Was that something you wrote, or is 
> it from tess4j itself?
>
> On Sunday, September 25, 2022 at 11:02:35 AM UTC-5 rcja...@gmail.com 
> wrote:
>
>> I'm using Tess4j in a Java program to access Tesseract and read PDFs 
>> loaded with PDFBox. I've been using Java 8, and things are running. The 
>> program is not commercial; I provide it to non-profits doing pro bono legal 
>> work in my state. In Java 8, from both the command line and Eclipse, the 
>> program runs fine; running from the command line in either Java 11 or 
>> Java 17 causes an error at the point where the program calls 
>> Tesseract.doOCR().
>>
>> I've dumped class loading information and see that the last class loaded 
>> before the fatal exception is com.sun.jna.Platform; it would be used, for 
>> instance, to determine the platform on which the program is running. I 
>> haven't been able to find the source for the 5.2 version I downloaded from 
>> UB Mannheim; that would be useful since the stack trace has line numbers.
>>
>> The following is a snippet showing log messages, System.out.println 
>> messages, stack traces, and class loading messages near the point of failure:
>>
>> pdfRenderer created buffered Image
>> set a couple of tesseract vars
>> [14.960s][info][class,load] net.sourceforge.tess4j.util.ImageIOHelper source: rsrc:tess4j-5.4.0.jar
>> [14.961s][info][class,load] javax.imageio.IIOParam source: jrt:/java.desktop
>> [14.961s][info][class,load] javax.imageio.ImageWriteParam source: jrt:/java.desktop
>> [14.962s][info][class,load] com.github.jaiimageio.plugins.tiff.TIFFImageWriteParam source: rsrc:jai-imageio-core-1.4.0.jar
>> [14.963s][info][class,load] javax.imageio.IIOImage source: jrt:/java.desktop
>> [14.964s][info][class,load] com.sun.jna.Library source: rsrc:jna-5.12.1.jar
>> [14.965s][info][class,load] net.sourceforge.tess4j.ITessAPI source: rsrc:tess4j-5.4.0.jar
>> [14.965s][info][class,load] net.sourceforge.tess4j.TessAPI source: rsrc:tess4j-5.4.0.jar
>> [14.966s][info][class,load] net.sourceforge.tess4j.util.LoadLibs source: rsrc:tess4j-5.4.0.jar
>> [14.969s][info][class,load] com.sun.jna.Platform source: rsrc:jna-5.12.1.jar
>> [14.973s][info][class,load] java.lang.ExceptionInInitializerError source: jrt:/java.base
>> throwable while reading PDF
>> [14.973s][info][class,load] java.lang.Throwable$PrintStreamOrWriter source: jrt:/java.base
>> [14.974s][info][class,load] java.lang.Throwable$WrappedPrintStream source: jrt:/java.base
>> java.lang.ExceptionInInitializerError
>>     at net.sourceforge.tess4j.Tesseract.init(Tesseract.java:442)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:326)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:309)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:290)
>>     at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:274)
>>     at drivingrecordtool.file.DrivingRecordPDFTextReader.getOCRText(DrivingRecordPDFTextReader.java:152)
>>     at drivingrecordtool.file.DrivingRecordPDFTextReader.getText(DrivingRecordPDFTextReader.java:46)
>>     at drivingrecordtool.file.DrivingRecordFileReader.doInBackground(DrivingRecordFileReader.java:78)
>>     at drivingrecordtool.file.DrivingRecordFileReader.doInBackground(DrivingRecordFileReader.java:1)
>>     at java.desktop/javax.swing.SwingWorker$1.call(SwingWorker.java:304)
>>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>     at java.desktop/javax.swing.SwingWorker.run(SwingWorker.java:343)
>>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>     at java.base/java.lang.Thread.run(Thread.java:834)
>> Caused by: java.lang.IllegalStateException: zip file closed
>>     at java.base/java.util.zip.ZipFile.ensureOpen(ZipFile.java:913)
>>     at java.base/java.util.zip.ZipFile.getEntry(ZipFile.java:348)
>> If I uninstall Java and install Java 8, the program works fine.
>>
>> If I uninstall Java and install Java 11 or Java 17, it fails in this 
>> fashion.
>>
>> Can anyone help me understand what the difference might be between the 
>> versions of Java so I can fix this?
>>
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.

[tesseract-ocr] Re: Tess4j failing near load of shared library tesseract-ocr-5.2 in Java 11 and 17, succeeds in Java 8

2022-09-28 Thread Quan Nguyen
The source of tess4j is available; you can trace through the code to see 
what threw the exception.

Nevertheless, "throwable while reading PDF" seems to point to the part of 
the code that reads in the PDF file. Was that something you wrote, or is 
it from tess4j itself?
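
As a rough aid for that tracing, the catch block that prints "throwable while reading PDF" could log every link in the cause chain rather than only the outermost error. The sketch below is one way to do that with plain JDK calls. The class name CauseChain and the method describe are hypothetical, and the sample throwable in main is built by hand to mirror the two exception types in the original report; it is not output from tess4j.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of tess4j.
final class CauseChain {

    // Returns "ClassName: message" for the throwable and each of its causes,
    // outermost first. The depth cap guards against cyclic cause chains.
    static List<String> describe(Throwable t) {
        List<String> out = new ArrayList<>();
        for (int depth = 0; t != null && depth < 32; depth++) {
            out.add(t.getClass().getName() + ": " + t.getMessage());
            t = t.getCause();
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built sample with the same exception types as the report.
        Throwable sample = new ExceptionInInitializerError(
                new IllegalStateException("zip file closed"));
        describe(sample).forEach(System.out::println);
    }
}
```

In the application, this would be called from a catch (Throwable t) block around the Tesseract.doOCR() call, so the log shows the innermost cause together with the wrapper.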


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/f7309e7d-79bf-4594-a581-a1ce6a556b62n%40googlegroups.com.