Re: Re[8]: PDFRenderer, PDDocument memory issue

2015-07-01 Thread Andreas Lehmkühler


> Alex Sviridov wrote on 1 July 2015 at 13:59:
> 
> 
> OK, thank you very much for the explanation. Could you tell me where this
> scratch file is located on Linux and Windows?
java.io.File.createTempFile is used to create that file, so it ends up in the
default temp directory. That is "/tmp" on Linux. I'm not sure about Windows, as
different environment variables (such as TMP, TEMP and USERPROFILE) are
searched to find such a directory.

You can define your own temp directory by passing the following JVM parameter
when starting your application:

-Djava.io.tmpdir=PATH-TO-YOUR-TEMP
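
If you want to check which directory your JVM actually picks, a small probe
like the following should tell you. It is only a sketch: the class name
ScratchDirCheck and the "probe" file prefix are made up for illustration, and
the file PDFBox creates will have a different name, but it goes through the
same java.io.File.createTempFile call and so should land in the same directory.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of PDFBox: reports the JVM's temp directory.
final class ScratchDirCheck {

    // The directory File.createTempFile uses when no directory is given.
    static String tempDir() {
        return System.getProperty("java.io.tmpdir");
    }

    // Creates a throwaway file the same way a scratch file would be created.
    static File createProbe() throws IOException {
        File probe = File.createTempFile("probe", ".tmp");
        probe.deleteOnExit(); // clean up when the JVM exits
        return probe;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + tempDir());
        System.out.println("probe file     = " + createProbe().getAbsolutePath());
    }
}
```

Run it once without and once with -Djava.io.tmpdir=PATH-TO-YOUR-TEMP to confirm
that the parameter is picked up.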


> 
> 
> Wednesday, 1 July 2015, 13:54 +02:00, from Andreas Lehmkühler:
> >> Alex Sviridov < ooo_satu...@mail.ru > wrote on 1 July 2015 at 13:38:
> >> 
> >> 
> >>  The file is here  https://yadi.sk/i/Y0fTuvHmhbZiE
> >Ah, that explains a lot. The PDF is a scanned document; every page holds a
> >color image, which consumes a lot of memory when processed.
> >
> >> I tried load(fileName, true). As a result, I no longer have memory
> >> problems. However, I now have two other problems:
> >>
> >> 1) All the thumbnail images are loaded, but it is VERY SLOW: one
> >> thumbnail image takes about 4 seconds to load!
> >With huge PDFs you have to pick your poison: either you provide enough
> >memory to do everything in memory (fast), or you use a scratch file to save
> >memory (slow).
> >
> >And yes, there is room for improvement in PDFBox's memory handling (read on
> >demand, remove after usage), but that is a future feature. Patches are
> >welcome.
> >
> >> 2) Besides, as you can see, the thumbnail images are loaded in a separate
> >> thread. While that thread is running, I try to get a big image for the
> >> main content using
> >> BufferedImage bi=pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
> >> and get the following exception:
> >> 
> >> java.io.IOException: java.util.zip.DataFormatException: unknown compression
> >> method
> >>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
> >>     at org.apache.pdfbox.cos.COSStream.attemptDecode(COSStream.java:422)
> >>     at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:398)
> >>     at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:335)
> >>     at
> >> org.apache.pdfbox.cos.COSStream.checkUnfilteredBuffer(COSStream.java:265)
> >>     at
> >> org.apache.pdfbox.cos.COSStream.getUnfilteredRandomAccess(COSStream.java:239)
> >>     at org.apache.pdfbox.pdfparser.BaseParser.<init>(BaseParser.java:146)
> >>     at
> >> org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:78)
> >>     at
> >> org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:451)
> >>     at
> >> org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:438)
> >>     at
> >> org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
> >>     at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:180)
> >>     at
> >> org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:205)
> >>     at
> >> org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:136)
> >>     at
> >> org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:95)
> >>   
> >>     at javafx.concurrent.Task$TaskCallable.call(Task.java:1423)
> >>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>     at java.lang.Thread.run(Thread.java:745)
> >> Caused by: java.util.zip.DataFormatException: unknown compression method
> >>     at java.util.zip.Inflater.inflateBytes(Native Method)
> >>     at java.util.zip.Inflater.inflate(Inflater.java:259)
> >>     at java.util.zip.Inflater.inflate(Inflater.java:280)
> >>     at
> >> org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:101)
> >>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:74)
> >>     ... 20 more
> >> 
> >> How to solve these problems?
> >PDFBox isn't supposed to be thread safe.
> >
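
Since PDFBox isn't thread safe, one way to avoid the exception above might be
to send every rendering call for the same document through a single worker
thread, so the thumbnail loop and the main view never touch it at the same
time. The following is only a rough sketch of that idea: RenderQueue is a
made-up helper, not a PDFBox class, and it uses nothing but
java.util.concurrent.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper (not part of PDFBox): runs every submitted job on one
// worker thread, so two rendering calls never overlap.
final class RenderQueue implements AutoCloseable {

    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "pdf-render");
        t.setDaemon(true); // don't keep the JVM alive just for rendering
        return t;
    });

    // Queues a job; jobs run one after another in submission order.
    <T> Future<T> submit(Callable<T> job) {
        return worker.submit(job);
    }

    @Override
    public void close() {
        worker.shutdown(); // already queued jobs still finish
    }
}
```

Both the thumbnail task and the main view would then call something like
renderQueue.submit(() -> pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB))
and wait for the returned Future away from the JavaFX application thread. This
gives up parallel rendering, but rendering of a single document should no
longer run concurrently.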
> >> 
> >> 
> >> Wednesday, 1 July 2015, 13:17 +02:00, from Andreas Lehmkühler < andr...@lehmi.de >:
> >> >
> >> >
> >> >> Alex Sviridov <  ooo_satu...@mail.ru > wrote on 1 July 2015 at 13:09:
> >> >> 
> >> >> 
> >> >> I decided to show all the code. I am also sending the PDF file, a file
> >> >> from the internet that I use for testing.
> >> >The attachment didn't make it through due to mailing list restrictions.
> >> >Please post a link to the original source or to another place where we
> >> >can download the PDF in question.
> >> >
> >> >> 
> >> >> Task task = new Task() {
> >> >>     @Override protected Integer call() throws Exception {
> >> >>     for (int i=0;i [...]
> >> >>     System.out.println("Point a:"+i);
> >> >>     WritableImage writableImage=model.getPageThumbImage(i);
> >> >>     System.out.println("Point b:"+i);
> >> >>     ImageView imageView=new ImageView(writableImage);
> >> >>     Sys

Re[8]: PDFRenderer, PDDocument memory issue

2015-07-01 Thread Alex Sviridov
OK, thank you very much for the explanation. Could you tell me where this
scratch file is located on Linux and Windows?


Wednesday, 1 July 2015, 13:54 +02:00, from Andreas Lehmkühler:
>> Alex Sviridov < ooo_satu...@mail.ru > wrote on 1 July 2015 at 13:38:
>> 
>> 
>>  The file is here  https://yadi.sk/i/Y0fTuvHmhbZiE
>Ah, that explains a lot. The PDF is a scanned document; every page holds a
>color image, which consumes a lot of memory when processed.
>
>> I tried load(fileName, true). As a result, I no longer have memory
>> problems. However, I now have two other problems:
>>
>> 1) All the thumbnail images are loaded, but it is VERY SLOW: one
>> thumbnail image takes about 4 seconds to load!
>With huge PDFs you have to pick your poison: either you provide enough
>memory to do everything in memory (fast), or you use a scratch file to save
>memory (slow).
>
>And yes, there is room for improvement in PDFBox's memory handling (read on
>demand, remove after usage), but that is a future feature. Patches are
>welcome.
>
>> 2) Besides, as you can see, the thumbnail images are loaded in a separate
>> thread. While that thread is running, I try to get a big image for the
>> main content using
>> BufferedImage bi=pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
>> and get the following exception:
>> 
>> java.io.IOException: java.util.zip.DataFormatException: unknown compression
>> method
>>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
>>     at org.apache.pdfbox.cos.COSStream.attemptDecode(COSStream.java:422)
>>     at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:398)
>>     at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:335)
>>     at
>> org.apache.pdfbox.cos.COSStream.checkUnfilteredBuffer(COSStream.java:265)
>>     at
>> org.apache.pdfbox.cos.COSStream.getUnfilteredRandomAccess(COSStream.java:239)
>>     at org.apache.pdfbox.pdfparser.BaseParser.<init>(BaseParser.java:146)
>>     at
>> org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:78)
>>     at
>> org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:451)
>>     at
>> org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:438)
>>     at
>> org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
>>     at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:180)
>>     at
>> org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:205)
>>     at
>> org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:136)
>>     at
>> org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:95)
>>   
>>     at javafx.concurrent.Task$TaskCallable.call(Task.java:1423)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.util.zip.DataFormatException: unknown compression method
>>     at java.util.zip.Inflater.inflateBytes(Native Method)
>>     at java.util.zip.Inflater.inflate(Inflater.java:259)
>>     at java.util.zip.Inflater.inflate(Inflater.java:280)
>>     at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:101)
>>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:74)
>>     ... 20 more
>> 
>> How to solve these problems?
>PDFBox isn't supposed to be thread safe.
>
>> 
>> 
>> Wednesday, 1 July 2015, 13:17 +02:00, from Andreas Lehmkühler < andr...@lehmi.de >:
>> >
>> >
>> >> Alex Sviridov <  ooo_satu...@mail.ru > wrote on 1 July 2015 at 13:09:
>> >> 
>> >> 
>> >> I decided to show all the code. I am also sending the PDF file, a file
>> >> from the internet that I use for testing.
>> >The attachment didn't make it through due to mailing list restrictions.
>> >Please post a link to the original source or to another place where we
>> >can download the PDF in question.
>> >
>> >> 
>> >> Task task = new Task() {
>> >>     @Override protected Integer call() throws Exception {
>> >>     for (int i=0;i [...]
>> >>     System.out.println("Point a:"+i);
>> >>     WritableImage writableImage=model.getPageThumbImage(i);
>> >>     System.out.println("Point b:"+i);
>> >>     ImageView imageView=new ImageView(writableImage);
>> >>     System.out.println("Point c:"+i);
>> >>     Label label=new Label(Integer.toString(i+1));
>> >>     System.out.println("Point d:"+i);
>> >>     VBox vBox=new VBox(imageView,label);
>> >>     System.out.println("Point e:"+i);
>> >>     vBox.setAlignment(Pos.CENTER);
>> >>     vBox.setStyle("-fx-padding:5px 5px 5px 5px;-fx-background-color:red");
>> >>     System.out.println("Point f:"+i);
>> >>     Platform.runLater(new Runnable() {
>> >>     @Override
>> >>     public void run() {
>> >>  thumbFlowPane.getChildren().add(vBox);
>> >>     }
>> >>     });