Thanks Tilman, that makes sense.
-----Original Message-----
From: Tilman Hausherr [mailto:[email protected]]
Sent: Tuesday, March 22, 2016 4:38 PM
To: [email protected]
Subject: Re: Strange performance problem with certain PDF files
public void save(File file) throws IOException
{
    save(new BufferedOutputStream(new FileOutputStream(file)));
}
so it is more efficient than
    save(OutputStream output)
which just takes what it gets, i.e. it writes to the stream you pass in
without adding any buffering. See also
https://issues.apache.org/jira/browse/PDFBOX-3121
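To illustrate why the wrapper matters, here is a minimal sketch that does not use PDFBox at all. It assumes the writer issues many small writes (which would be consistent with the timings reported below, but is an assumption here), and counts how many write calls actually reach the underlying stream with and without a BufferedOutputStream. The names BufferedSaveSketch, CountingSink and writeManySmallChunks are hypothetical and exist only for this example.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferedSaveSketch {

    /** Hypothetical sink standing in for a file: counts the write calls that reach it. */
    public static final class CountingSink extends OutputStream {
        long calls;
        long bytes;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    /** Stand-in for a writer that emits its output one byte at a time (an assumption). */
    public static void writeManySmallChunks(OutputStream out, int n) throws IOException {
        for (int i = 0; i < n; i++) {
            out.write('x');
        }
        out.flush();
    }

    /** Write calls seen by the sink when the stream is used as passed in. */
    public static long callsUnbuffered(int n) throws IOException {
        CountingSink sink = new CountingSink();
        writeManySmallChunks(sink, n);
        return sink.calls;
    }

    /** Write calls seen by the sink when the stream is wrapped in a buffer first. */
    public static long callsBuffered(int n) throws IOException {
        CountingSink sink = new CountingSink();
        try (OutputStream out = new BufferedOutputStream(sink)) {
            writeManySmallChunks(out, n);
        }
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered calls: " + callsUnbuffered(100_000));
        System.out.println("buffered calls:   " + callsBuffered(100_000));
    }
}
```

With a real FileOutputStream each of those calls is a trip to the operating system, which is likely to hurt most on slow or networked storage. If you have to pass a stream rather than a File, wrapping it yourself, e.g. save(new BufferedOutputStream(new FileOutputStream(file))), should give the same buffering as the save(File) method shown above.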
Tilman
On 21.03.2016 at 20:58, Stahle, Patrick wrote:
> Hi John / Tilman,
>
> I have reduced it down to a difference in how PDDocument.save() is called.
> With a FileOutputStream the problem occurs; if I pass in a Java File
> instead, it does not. Also, we have only been able to reproduce it on some
> larger PDF files. It also seems to happen only in certain environments: on
> my Linux virtual machine I have not been able to reproduce it at all, but it
> does occur on Windows and on a Solaris server (3PAR drive cluster). I have
> some simple sample code that reproduces the problem, but I don't think I can
> send you the 2 PDF files I have at hand. One is a 3D PDF of ours (TE
> Classified) and the other, ironically, is the iText v1 manual in PDF form.
> The difference in times is pretty drastic: on Windows the 3D PDF takes about
> 3 seconds with the Java File class vs. 29 seconds with the FileOutputStream.
> The iText manual is not as bad at 2 vs. 20 seconds.
>
> Anyway, we have a workaround: we converted our code to pass a Java File to
> PDFBox. If I can find a suitable PDF that reproduces the problem I will send
> it your way.
>
> Thanks,
> Patrick
>
> -----Original Message-----
> From: John Hewson [mailto:[email protected]]
> Sent: Friday, March 18, 2016 4:45 PM
> To: [email protected]
> Subject: Re: Strange performance problem with certain PDF files
>
>
>> On 18 Mar 2016, at 12:01, Stahle, Patrick <[email protected]> wrote:
>>
>> Hi all,
>>
>> I am running into a lot of strange performance issues with certain PDF files.
>>
>> Background info:
>> The strange thing is that I can't reproduce this consistently, although for
>> a given PDF in a given environment the behavior does seem consistent. I do
>> most of my development inside a VirtualBox virtual machine running Fedora.
>> The PDF files I am having problems with never show performance issues when
>> run on my virtual machine's local drive, but if I use a VirtualBox shared
>> drive as the source / destination for the PDF, I see the problem. A
>> co-worker working in a pure Windows environment also experiences the
>> performance problem, and we are seeing the same issue on our dev Solaris
>> servers. The range can be quite drastic. One of our 3D PDFs (12 MB) can be
>> opened, stamped with some text, encrypted, and saved in around 8 seconds in
>> my local environment. The same job pointed at a VirtualBox shared drive, or
>> run on our Solaris server, takes minutes. In my co-worker's Windows
>> environment it takes around 30 seconds. We have really only reproduced this
>> consistently on the 12 MB 3D PDF. I have a much smaller PDF (non-3D,
>> converted from MS Office) that shows a similar performance issue, but the
>> times range from 200 ms locally to 8 seconds.
> You need to isolate the problem; you’ve got too many variables to make any
> sense of it all. Get a reproducible problem on one, non-virtualised JVM first.
>
> — John
>
>> The one thing the 2 files have in common is that I see a lot of the
>> following messages on the console.
>> Output from the 12 MB 3D PDF file:
>> :
>> :
>> 1787 [main] DEBUG org.apache.pdfbox.pdfparser.PDFObjectStreamParser -
>> parsed=COSObject{13166, 0}
>>
>> These messages seem to appear during the PDDocument.open and, from what I
>> can tell, I get 13,166 of them for this example PDF.
>> The slowness does not start until the following line:
>> document.save(outputPDFStream);
>>
>> With other PDFs, including some quite large ones, I see neither this
>> performance issue nor those log messages.
>>
>> I know this is not much to go on. I am working on isolating this down to
>> something more concrete and reproducible, but I thought I would send this
>> out to see if anyone has any ideas or has seen similar issues.
>> Suggestions?
>>
>> Thanks,
>> Patrick
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]