I have a Ryzen 1700 CPU, and for tests I run it with the power settings at maximum. I don't know whether a Mac has a similar setting. http://www.macos.utah.edu/documentation/administration/pmset.html shows a setting for "better performance", but I don't know whether it has the same effect as on Windows, where the maximum setting doubles performance for me. Try PDFDebugger: it has a built-in benchmark feature that shows the rendering speed in the status line.

I also keep one-time initializations out of the benchmark results with this code, which is also in PDFDebugger:

        // trigger premature initializations for more accurate rendering benchmarks
        // See discussion in PDFBOX-3988
        if (PDType1Font.COURIER.isStandard14())
        {
            // Yes this is always true
            PDDeviceCMYK.INSTANCE.toRGB(new float[] { 0, 0, 0, 0 });
            PDDeviceRGB.INSTANCE.toRGB(new float[] { 0, 0, 0 });
        }
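The same idea works for any timing test you write yourself: run the work once untimed so that class loading and other one-time setup is already done, then time a second run. Below is a minimal sketch using only the JDK. The class name WarmupBenchmark, the helper timeAfterWarmup and the dummy workload are made up for illustration and are not part of PDFBox; in a real test the Runnable would wrap your rendering call.

```java
public class WarmupBenchmark {

    /**
     * Runs the task once untimed (warm-up), then runs it a second time
     * and returns the elapsed time of that second run in milliseconds.
     */
    static long timeAfterWarmup(Runnable task) {
        task.run();                          // warm-up: one-time initializations happen here
        long start = System.nanoTime();
        task.run();                          // only this run is measured
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for rendering a page
        Runnable work = () -> {
            double sum = 0;
            for (int i = 1; i < 5_000_000; i++) {
                sum += Math.sqrt(i);
            }
            if (sum < 0) {
                System.out.println(sum);     // keeps the loop from being optimized away
            }
        };
        System.out.println("Time elapsed is " + timeAfterWarmup(work) + " milliseconds");
    }
}
```

A single warm-up run only removes the one-time costs; it does not account for JIT compilation, so treat the numbers as rough.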

I see you're using the PDFToImage utility. It doesn't support subsampling yet; that has been on my "todo" list for a few days, and I'll try to do it tonight... But PDFToImage is really just a command line utility.

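Args 7, 8 and 11 don't work that way. Re args 7 and 8: -D is an option for the JVM, not an argument for PDFToImage, so from Java code you need to call System.setProperty() instead. Re arg 11: that is a method, not a command line option, so you need to have a PDFRenderer object to call it on.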
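To illustrate the point about args 7 and 8, here is a small sketch using only the JDK. The property name is the one from this thread; the class name CmykPropertyDemo and the helper enablePureJavaCmyk are made up for illustration. It shows that a "-D..." string placed in an argument array is just text, while System.setProperty() actually sets the property. The call has to happen before any rendering starts.

```java
public class CmykPropertyDemo {

    // Property name as given in this thread
    static final String KEY = "org.apache.pdfbox.rendering.UsePureJavaCMYKConversion";

    /** Sets the property programmatically and returns the value that is now visible. */
    static String enablePureJavaCmyk() {
        System.setProperty(KEY, "true");
        return System.getProperty(KEY);
    }

    public static void main(String[] args) {
        // A "-D..." entry in an array passed to some main() method is an
        // ordinary string; the JVM never sees it as an option.
        String[] toolArgs = { "-D" + KEY, "true" };
        System.out.println("args built (" + toolArgs.length + " entries), property is: "
                + System.getProperty(KEY));

        // This is what takes effect inside a running program.
        System.out.println("after setProperty, property is: " + enablePureJavaCmyk());
    }
}
```

The alternative is to pass -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true on the java command line itself, before the class or jar name.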

Another way to convert to images is explained here:

There, call pdfRenderer.setSubsamplingAllowed(true) to activate subsampling. PDFDebugger also supports it in the menu.


On 17.04.2018 at 01:20, Arthur Wang wrote:

Thanks for the quick response and for testing my case. Below are my Java
code and my test results after adding the subsampling option. The first page
of the Ashley file took 3362 milliseconds.

For the Gill file, the elapsed time was 2456 milliseconds.

My tests were run on my Mac with a 2.2 GHz Core i7 processor. How come
your PC is so fast? 1.4 seconds is fast enough for web access. Maybe there
is something wrong with my code? I would appreciate it if you took a look at
my code:



import org.apache.pdfbox.tools.PDFToImage;
//import java.awt.image.BufferedImage;
import java.io.File;
//import java.io.IOException;
//import java.io.OutputStream;
import org.apache.commons.lang3.time.StopWatch;

public class PdfToImage2 {

     private static final String OUTPUT_DIR = "/Users/someone/Desktop/";

     public static void main(String[] args) throws Exception{

         String pdfPath = "/Users/someone/Desktop/Ashley NJ_HHL101125_FV.pdf";
         //config option 2:convert page 1 in pdf to image
         String [] args_1 =  new String[13];
         args_1[0] = "-startPage";
         args_1[1] = "1";
         args_1[2] = "-endPage";
         args_1[3] = "1";
         args_1[4] = "-outputPrefix";
         args_1[5] = OUTPUT_DIR+"Ashley NJ_HHL101125_FV1";
         args_1[6] = pdfPath;
         args_1[7] = "-Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion";
         args_1[8] = "true";
         args_1[9] = "-dpi";
         args_1[10] = "72";//@48-->3283 milliseconds, @96-->3545 milliseconds, @72-->3362 milliseconds
         args_1[11] = "-PDFRenderer.setSubsamplingAllowed";
         args_1[12] = "true";

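         File f = new File(args_1[5]+"1.jpg");
         if(f.exists() && !f.isDirectory()) {
             System.out.println("file exist already");
         }

         StopWatch stopwatch = new StopWatch();
         stopwatch.start(); // line missing in the archived copy; presumably this

         try {
             // call missing in the archived copy; presumably this, given the import
             PDFToImage.main(args_1);
         } catch (Exception e) {
             System.err.println("Exception while trying to create pdf document - " + e);
         }

         stopwatch.stop(); // optional
         // the end of this line was cut off in the archived copy; unit assumed
         System.out.println("Time elapsed is "+ stopwatch.getTime() + " milliseconds");

         //first try without setting property: 3779 milliseconds
         //second try with the property set: 3852 milliseconds
         //third try with subsamplingAllowed: 3362 milliseconds
     }
}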



From: Tilman Hausherr <thaush...@t-online.de>
Sent: Monday, April 16, 2018 10:55 AM
To: users@pdfbox.apache.org
Subject: Re: Performance issue with PDFBox 2.0.8

The java code didn't get through, most attachments get deleted. Call
PDFRenderer.setSubsamplingAllowed(true) to activate subsampling.

I had a look at your files... These are not extremely slow renderings. 4
seconds for such a page is pretty good.

On my PC, the first page of the Ashley file is rendered in PDFDebugger
in 1.4 seconds at 72dpi. The Gill file is done in less than a second.


On 16.04.2018 at 19:05, Arthur Wang wrote:
Arthur Wang has shared OneDrive files with you. To view them, click
the links below.


Ashley NJ_HHL101125_FV.pdf<https://1drv.ms/b/s%21AhA_REgBppCpgQluAoJe28B935ru>

I just tried 2.0.9; it performs almost the same. Processing all 4 pages
takes 32 seconds; processing only the first page takes about 4 seconds.

My server is an HP DL380 with dual Xeon processors and 32 GB of RAM; the
hard drive is an Intel Optane NVMe SSD.

Once the JPG image is produced, access to the image is almost instant
regardless of the size of the image file, so the time to open and close
the image file is insignificant and can be ignored.

By enabling subsampling, do you mean setting the dpi option? Do you
have sample code for PDFRenderer? The attached file PdfToImage2.java
is my test code. Ashley...pdf is a file of about 45 MB, and Gill...pdf
is a file of about 5 MB. With one tenth of the size, the processing
time drops to 2657 milliseconds, compared to 3779 milliseconds. It
seems that size does matter.



From: Tilman Hausherr <thaush...@t-online.de>
Sent: Monday, April 16, 2018 8:57 AM
To: users@pdfbox.apache.org
Subject: Re: Performance issue with PDFBox 2.0.8

- retry with the current version 2.0.9
- share your file for a profiler analysis
- as said by Itai (who implemented it) try enabling subsampling in
PDFRenderer (read the javadoc first). Compare the results and decide
whether the quality is OK for you.
- set the energy settings of your computer to maximum or at least to
"balanced", not to "energy save"
- don't know if adding GPU will help;
- try also the
-Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true option

The speed is not related to the size but to the complexity. 32 seconds
may sound disappointing but it's not the worst I've ever seen. "Nice
illustrations" with nested patterns or large shadings may be slow.


On 16.04.2018 at 09:21, Arthur Wang wrote:
Hi, everyone,

I am using PDFBox 2.0.8 and Java 8 running in Tomcat 8 in production
to convert PDFs into images for display. It works very well for PDF
files smaller than 5 MB, taking about 3800 milliseconds. However, it
slows down considerably when the file size increases to 50 MB: it takes
about 70,000 milliseconds. After setting the system property
"sun.java2d.cmm" to "sun.java2d.cmm.kcms.KcmsServiceProvider", the time
drops to 32550 milliseconds, which is about twice as fast. But 32
seconds to load a web page is still too slow. Is there any other way to
speed this up? Would adding a GPU to the server help? Or is there any
other software or hardware solution that could improve the processing
speed? My current server has 32 GB of RAM and has never used more than
half of it.



To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org
