I did a Google search on your issue using the terms "InflaterInputStream read Unexpected end of ZLIB". It returned about 854 results, and there are a couple of possible solutions.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4040920

Work Around

    The workaround is to never attempt to read more bytes than the entry contains. Call ZipEntry.getSize() to get the actual size of the entry, then use this value to keep track of the number of bytes remaining in the entry while reading from it. To take the previous example:

This code change may solve the issue for PDFBox at this frame of your stack trace:

    at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:97)

Add the Math.min() to reduce the number of bytes you are trying to read.

    int mayRead = compressedData.available();
    while ((amountRead = decompressor.read(buffer, 0, Math.min(mayRead, BUFFER_SIZE))) != -1)

I found another potential issue like this, with a solution, on the Sun site. It was described on Windows, but the same could happen on UNIX. It suggests that the issue could happen if you are running several processes against the same directory. Please look this over to see if this is the problem. Are you running multiple processes to accomplish the job faster?

http://forums.sun.com/thread.jspa?threadID=5316308

    paul.miner, Jul 22, 2008 6:54 AM
    Re: Unexpected end of ZLIB input stream error while compiling

    koko191 wrote:
        Main batch :
        start /B %SWIFT_LOCAL_HOME%\scripts\rmicAll.bat
        start /B %SWIFT_LOCAL_HOME%\scripts\create_jar.bat

    The "start" command does not wait for the command to finish, so both those batch files would be running in parallel. If they both work on the same jar, this could be a problem. If you want to run the batch files in sequence, use "call".

-----Original Message-----
From: Balasubramaniam, Balaji [mailto:[email protected]]
Sent: Tuesday, January 13, 2009 7:05 PM
To: [email protected]
Subject: java.io.EOFException: Unexpected end of ZLIB input stream error message on UNIX box

Hello,

I'm trying to use PdfBox to identify whether a PDF file is corrupted or not.
We are trying to automate a process that loops through a given folder and counts how many of the PDF files are corrupted. This program works fine in a Windows XP environment (OS Version: x86 Windows XP 5.1, Java version: Java HotSpot(tm) Client VM 1.5.0-15-b04). When we ran this application on a UNIX box (OS Version: PA_RISC2.0 HP-UX B.11.23, Java Version: Java HotSpot(tm) Client VM 1.5.0.11 jinteg:11.07.07-09:52 PA2.0(aCC_AP)), it throws the following error.

NOTE: This error does not happen all the time. It is thrown only for some of the PDF files. Those PDF files are not corrupted; I can open them manually and they open fine.

java.io.EOFException: Unexpected end of ZLIB input stream
    at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:216)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:134)
    at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:97)
    at org.pdfbox.cos.COSStream.doDecode(COSStream.java:290)
    at org.pdfbox.cos.COSStream.doDecode(COSStream.java:235)
    at org.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:170)
    at org.pdfbox.pdmodel.common.COSStreamArray.getUnfilteredStream(COSStreamArray.java:200)
    at org.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:101)
    at ProcessDefinitions.RunAuditProcess.RunAuditProcessGenerateAuditLogMessage.invoke(RunAuditProcessGenerateAuditLogMessage.java:212)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at com.tibco.plugin.java.JavaActivity.eval(JavaActivity.java:383)
    at com.tibco.pe.plugin.Activity.eval(Activity.java:209)
    at com.tibco.pe.core.TaskImpl.eval(TaskImpl.java:540)
    at com.tibco.pe.core.Job.a(Job.java:712)
    at com.tibco.pe.core.Job.k(Job.java:501)
    at com.tibco.pe.core.JobDispatcher$JobCourier.a(JobDispatcher.java:249)
    at com.tibco.pe.core.JobDispatcher$JobCourier.run(JobDispatcher.java:200)

Sample code snippet I use to do the task:

    PDDocument document = PDDocument.load(<input stream>);
    List pages = document.getDocumentCatalog().getAllPages();
    if (pages != null && pages.size() > 0) {
        // Check every page; the original snippet used an index i without showing the loop.
        for (int i = 0; i < pages.size(); i++) {
            PDPage page = (PDPage) pages.get(i);
            PDStream contents = page.getContents();
            PDFStreamParser parser = null;
            try {
                // Constructing the parser decodes the content stream and throws if it cannot be read.
                parser = new PDFStreamParser(contents.getStream());
            } catch (Exception e) {
                System.err.println("This PDF cannot be read. Most possibly it could be corrupted. " + pdfFileName);
            }
        }
    }

Could somebody shed some light on this one?

Thank you.
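
For reference, the bug-database workaround quoted at the top of this reply (never read more bytes than ZipEntry.getSize() reports, and count down the bytes remaining) might be sketched roughly as below. This is only an illustration against a plain zip file, not PDFBox's FlateFilter code: the class name BoundedEntryRead, the helper writeSampleZip, the BUFFER_SIZE value, and the use of ZipFile with Java 7 try-with-resources are all assumptions made for the example. It also counts down from the entry size instead of capping each read at available(); if available() returned 0, a zero-length read would return 0 rather than -1, and the loop in the suggestion above would not terminate.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, written only to illustrate the workaround.
final class BoundedEntryRead {
    private static final int BUFFER_SIZE = 2048; // assumed buffer size

    /** Reads one entry, never requesting more bytes than ZipEntry.getSize() reports. */
    static byte[] readEntry(Path zipPath, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IOException("No such entry: " + entryName);
            }
            long remaining = entry.getSize(); // -1 when the size is not known
            if (remaining < 0) {
                throw new IOException("Entry size unknown: " + entryName);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[BUFFER_SIZE];
            try (InputStream in = zip.getInputStream(entry)) {
                while (remaining > 0) {
                    // Ask for at most the bytes still left in the entry.
                    int toRead = (int) Math.min(remaining, BUFFER_SIZE);
                    int amountRead = in.read(buffer, 0, toRead);
                    if (amountRead == -1) {
                        throw new EOFException("Entry ended early: " + entryName);
                    }
                    out.write(buffer, 0, amountRead);
                    remaining -= amountRead;
                }
            }
            return out.toByteArray();
        }
    }

    /** Writes a one-entry zip to a temporary file so the example has something to read. */
    static Path writeSampleZip(String entryName, byte[] payload) throws IOException {
        Path zipPath = Files.createTempFile("bounded-read", ".zip");
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipPath))) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(payload);
            zos.closeEntry();
        }
        return zipPath;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "sample entry contents".getBytes(StandardCharsets.UTF_8);
        Path zipPath = writeSampleZip("sample.txt", payload);
        try {
            byte[] roundTrip = readEntry(zipPath, "sample.txt");
            System.out.println("Round trip matches: " + Arrays.equals(payload, roundTrip));
        } finally {
            Files.deleteIfExists(zipPath);
        }
    }
}
```

The loop stops once the counted-down size reaches zero, so the inflater is never asked for data past the end of the entry. Whether the same idea applies inside FlateFilter.decode depends on whether the true decoded length is available there, which this thread does not establish.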
