We worked around the problem by loading the file ourselves and passing the byte array to the library instead of the filename, as shown below. This ensures that the input file is read from disk only once: any reOpen() call within the library now re-reads the in-memory byte array rather than the physical file.

File someFile = new File("pathToFile");
InputStream s = new FileInputStream(someFile);
byte arrayIn[] = RandomAccessFileOrArray.InputStreamToArray(s);
reader = new PdfReader(arrayIn);
// do rest of process
s.close();

Paulo Soares wrote:
>
> You are right, it was the re-opening. I committed some changes in the SVN and
> now in my machine what took 2 minutes takes just 5 seconds to run.
>
> Paulo
>
> ----- Original Message -----
> From: "BorisTheCat" <[EMAIL PROTECTED]>
> To: <itext-questions@lists.sourceforge.net>
> Sent: Monday, August 27, 2007 3:31 PM
> Subject: Re: [iText-questions] One page PDF takes 7-8 minutes to merge ??
>
>>
>> While the input PDF file is less than pristine, it would appear
>> that the input PDF is NOT the root of this performance issue. We
>> added some debugging information into the iText source code and
>> found that over 50% of the processing time is being spent
>> in the RandomAccessFileOrArray::reOpen() method. When the objects
>> are processed, you eventually see PdfReader.getStreamBytesRaw() get
>> called which has a finally block which forces the input file to be
>> closed. When the next object is processed, the library re-reads
>> the entire input file again, which in our case is a 26Mb file.
>>
>> So in the end, we see the input PDF file (26Mb) being re-read over
>> 5000 times ! So while the input PDF could be better, re-reading
>> the input file with every object processed is what is resulting
>> in the poor performance.
>>
>> Is there any way for us to use the library in such a way that the
>> file will not be re-read with each processing loop ?
>>
>>
>> Paulo Soares wrote:
>>>
>>> In my machine it takes 1 minute and that's already way too long. The reason
>>> is that the application that created the PDF is really lousy, it can't even
>>> get the syntax right in the creation data. Your PDF is composed of 2503
>>> images each one 2 pixels high.
>>> It looks like it was built from a striped TIFF without any effort to adapt
>>> the image to the new media.
>>>
>>> Paulo
>>>
>>> ----- Original Message -----
>>> From: "BorisTheCat" <[EMAIL PROTECTED]>
>>> To: <itext-questions@lists.sourceforge.net>
>>> Sent: Thursday, August 23, 2007 4:10 PM
>>> Subject: [iText-questions] One page PDF takes 7-8 minutes to merge ??
>>>
>>>>
>>>> We used the examples to create a pdf merge program. We have a one-page PDF,
>>>> 26Mb in size that contains an image. It is taking 7-8 minutes to merge this
>>>> PDF. We have written a tiny java app that demonstrates the problem. We
>>>> stripped the test app down so that it is only processing one input file so
>>>> that it is now really only "merging" one PDF into a new PDF. So... the app
>>>> is only processing a one-page PDF (albeit big pdf) and it is taking 7-8
>>>> minutes to complete. Any help would be GREATLY appreciated to determine
>>>> why it is taking this long and whether there is any way to fix this.
>>>>
>>>> The input PDF can be found at:
>>>> http://home.fuse.net/mikebrungs/test.pdf
>>>> and the source code is shown below:
>>>>
>>>>
>>>> import java.io.File;
>>>> import java.io.FileOutputStream;
>>>> import com.lowagie.text.Document;
>>>> import com.lowagie.text.pdf.PdfCopy;
>>>> import com.lowagie.text.pdf.PdfImportedPage;
>>>> import com.lowagie.text.pdf.PdfReader;
>>>> import java.util.Date;
>>>>
>>>> public class PdfTest
>>>> {
>>>>     public static void main(String[] args)
>>>>     {
>>>>         try
>>>>         {
>>>>             // input file
>>>>             File pdfFile = new File(args[0]);
>>>>
>>>>             // copied file being created
>>>>             File outFile = new File(args[0] + ".Copied.pdf");
>>>>
>>>>             System.out.println("Input file is: " + args[0]);
>>>>             System.out.println("Output file is: " + args[0] + ".Copied.pdf");
>>>>             System.out.println("starting at: " + new Date().toString());
>>>>
>>>>             PdfReader reader = new PdfReader(pdfFile.getPath());
>>>>
>>>>             reader.consolidateNamedDestinations();
>>>>             int numPages = reader.getNumberOfPages();
>>>>
>>>>             Document document = new Document(reader.getPageSizeWithRotation(1));
>>>>             PdfCopy writer = new PdfCopy(document, new FileOutputStream(outFile.getPath()));
>>>>
>>>>             document.open();
>>>>
>>>>             PdfImportedPage page;
>>>>
>>>>             for (int j = 1; j <= numPages; j++)
>>>>             {
>>>>                 page = writer.getImportedPage(reader, j);
>>>>                 writer.addPage(page);
>>>>             }
>>>>
>>>>             reader.close();
>>>>             writer.close();
>>>>             writer.freeReader(reader);
>>>>
>>>>             System.out.println("finishing at: " + new Date().toString());
>>>>         }
>>>>         catch (Exception t)
>>>>         {
>>>>             t.printStackTrace();
>>>>         }
>>>>     }
>>>> }
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems? Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
> Buy the iText book: http://itext.ugent.be/itext-in-action/
>
>

--
View this message in context: http://www.nabble.com/One-page-PDF-takes-7-8-minutes-to-merge----tf4318146.html#a12371497
Sent from the iText - General mailing list archive at Nabble.com.
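
P.S. For anyone who wants to see the effect of the workaround in isolation, here is a minimal sketch in plain Java that does not use iText at all. It only simulates the pattern discussed in this thread: a consumer that re-opens its input for every object it processes. Every name in it (ReadOnceSketch, Source, fromFile, fromBytes, process, diskReads) is made up for illustration and is not part of the iText API, and the file-backed source simply re-reads the whole file on each reOpen(), which is an assumption chosen to mirror the behaviour we observed rather than a description of how the library is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: none of these names exist in iText.
class ReadOnceSketch {
    // Counts how often the physical file is read from disk.
    static int diskReads = 0;

    // Hypothetical stand-in for an input source that a library re-opens.
    interface Source {
        InputStream reOpen() throws IOException;
    }

    // File-backed source: every reOpen() goes back to the physical file.
    static Source fromFile(Path file) {
        return () -> new ByteArrayInputStream(readFromDisk(file));
    }

    // Array-backed source: reOpen() only wraps the bytes already in memory.
    static Source fromBytes(byte[] data) {
        return () -> new ByteArrayInputStream(data);
    }

    // The single place where the disk is touched, so reads can be counted.
    static byte[] readFromDisk(Path file) throws IOException {
        diskReads++;
        return Files.readAllBytes(file);
    }

    // Simulates processing objectCount objects, re-opening the input each time.
    // Returns the sum of the first byte seen on each open.
    static long process(Source source, int objectCount) throws IOException {
        long checksum = 0;
        for (int i = 0; i < objectCount; i++) {
            try (InputStream in = source.reOpen()) {
                checksum += in.read();
            }
        }
        return checksum;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("read-once", ".bin");
        try {
            Files.write(file, new byte[] {1, 2, 3});

            // Filename-style usage: one disk read per processed object.
            diskReads = 0;
            process(fromFile(file), 5000);
            System.out.println("file-backed disk reads: " + diskReads);

            // Byte-array usage: one disk read up front, none afterwards.
            diskReads = 0;
            process(fromBytes(readFromDisk(file)), 5000);
            System.out.println("array-backed disk reads: " + diskReads);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

With 5000 simulated objects, the file-backed run reports 5000 disk reads and the array-backed run reports 1. The trade-off is memory: the whole input has to fit in the heap, which for a 26Mb file like ours is not a concern.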