> Alex Sviridov <ooo_satu...@mail.ru> wrote on 1 July 2015 at 13:59:
>
>
> Ok. Thank you very much for the explanation. Could you say where this scratch
> file is located on Linux/Windows?

java.io.File.createTempFile is used to create that file. It uses the default
temp directory, which is "/tmp" on Linux. I'm not sure about Windows, as
different environment variables (TMP, TEMP, USERPROFILE, ....) are used to
search for such a directory.
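
If you want to check this on a given machine, a minimal sketch like the
following prints the directory the JVM treats as its default temp directory
and creates a file there with java.io.File.createTempFile. The class name
TempDirCheck and the "scratch" prefix are made up for illustration; this is
not PDFBox's own code, only a way to see where such a file would end up.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper class for illustration; not part of PDFBox.
public class TempDirCheck {

    // Returns the JVM's default temp directory (the java.io.tmpdir system property).
    static String defaultTempDir() {
        return System.getProperty("java.io.tmpdir");
    }

    // Creates an empty file in the default temp directory.
    // The prefix and suffix are arbitrary choices for this sketch.
    static File createScratchLikeFile() throws IOException {
        File f = File.createTempFile("scratch", ".tmp");
        f.deleteOnExit(); // clean up when the JVM exits normally
        return f;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + defaultTempDir());
        File f = createScratchLikeFile();
        System.out.println("temp file      = " + f.getAbsolutePath());
    }
}
```
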
You may define your own temp directory using the following parameter when
starting your application:

-Djava.io.tmpdir=PATH-TO-YOUR-TEMP

>
>
> Wednesday, 1 July 2015, 13:54 +02:00 from Andreas Lehmkühler <andr...@lehmi.de>:
> >> Alex Sviridov < ooo_satu...@mail.ru > wrote on 1 July 2015 at 13:38:
> >>
> >>
> >> The file is here https://yadi.sk/i/Y0fTuvHmhbZiE
> >Ah, that explains a lot. The pdf is a scanned document; every page holds a
> >color image, consuming a lot of memory when processed.
> >
> >> I tried with load (fileName,true). The result - now I don't have memory
> >> problems. However now I have 2 problems:
> >>
> >> 1) All the thumbnail images are loaded. However, the speed is VERY SLOW.
> >> One thumbnail image takes about 4 seconds to load!
> >When it comes to huge pdfs, you have to pick your poison. Either you provide
> >enough memory to do all the stuff in memory (fast) or you use a scratch file
> >to save memory (slow).
> >
> >And yes, there is room for an improvement of the memory handling (read on
> >demand, remove after usage) in PDFBox, but that is some future feature.
> >Patches are welcome.
> >
> >> 2) Besides, as you see, thumbnail images are loaded in a separate thread.
> >> While this thread is running and I try to get a big image for the main
> >> content using BufferedImage
> >> bi=pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB); I get the
> >> following exception:
> >>
> >> java.io.IOException: java.util.zip.DataFormatException: unknown compression method
> >>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
> >>     at org.apache.pdfbox.cos.COSStream.attemptDecode(COSStream.java:422)
> >>     at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:398)
> >>     at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:335)
> >>     at org.apache.pdfbox.cos.COSStream.checkUnfilteredBuffer(COSStream.java:265)
> >>     at org.apache.pdfbox.cos.COSStream.getUnfilteredRandomAccess(COSStream.java:239)
> >>     at org.apache.pdfbox.pdfparser.BaseParser.<init>(BaseParser.java:146)
> >>     at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:78)
> >>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:451)
> >>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:438)
> >>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
> >>     at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:180)
> >>     at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:205)
> >>     at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:136)
> >>     at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:95)
> >>     ....
> >>     at javafx.concurrent.Task$TaskCallable.call(Task.java:1423)
> >>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>     at java.lang.Thread.run(Thread.java:745)
> >> Caused by: java.util.zip.DataFormatException: unknown compression method
> >>     at java.util.zip.Inflater.inflateBytes(Native Method)
> >>     at java.util.zip.Inflater.inflate(Inflater.java:259)
> >>     at java.util.zip.Inflater.inflate(Inflater.java:280)
> >>     at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:101)
> >>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:74)
> >>     ... 20 more
> >>
> >> How to solve these problems?
> >PDFBox isn't supposed to be thread safe.
> >
> >>
> >>
> >> Wednesday, 1 July 2015, 13:17 +02:00 from Andreas Lehmkühler < andr...@lehmi.de >:
> >> >
> >> >
> >> >> Alex Sviridov < ooo_satu...@mail.ru > wrote on 1 July 2015 at 13:09:
> >> >>
> >> >>
> >> >> I decided to show all the code. I am also sending the pdf file - some
> >> >> file from the internet that I use for testing.
> >> >The attachment didn't make it due to some restrictions of the mailing
> >> >list. Please post a link to the original source or another place where
> >> >we can download the pdf in question.
> >> >
> >> >>
> >> >> Task task = new Task() {
> >> >>     @Override protected Integer call() throws Exception {
> >> >>         for (int i=0;i<model.getTotalPages();i++){
> >> >>             System.out.println("Point a:"+i);
> >> >>             WritableImage writableImage=model.getPageThumbImage(i);
> >> >>             System.out.println("Point b:"+i);
> >> >>             ImageView imageView=new ImageView(writableImage);
> >> >>             System.out.println("Point c:"+i);
> >> >>             Label label=new Label(Integer.toString(i+1));
> >> >>             System.out.println("Point d:"+i);
> >> >>             VBox vBox=new VBox(imageView,label);
> >> >>             System.out.println("Point e:"+i);
> >> >>             vBox.setAlignment(Pos.CENTER);
> >> >>             vBox.setStyle("-fx-padding:5px 5px 5px 5px;-fx-background-color:red");
> >> >>             System.out.println("Point f:"+i);
> >> >>             Platform.runLater(new Runnable() {
> >> >>                 @Override
> >> >>                 public void run() {
> >> >>                     thumbFlowPane.getChildren().add(vBox);
> >> >>                 }
> >> >>             });
> >> >>         }
> >> >>         return null;
> >> >>     }
> >> >> };
> >> >> new Thread(task).start();
> >> >>
> >> >> And here is the tail of the output
> >> >> ....
> >> >> Point a:30
> >> >> Point b:30
> >> >> Point c:30
> >> >> Point d:30
> >> >> Point e:30
> >> >> Point f:30
> >> >> Point a:31
> >> >>
> >> >> What is a scratch file? Sorry, I don't understand you.
> >> >
> >> >PDFBox holds a lot of temporary data in memory. To reduce the memory
> >> >footprint one can choose to use a scratch file instead, so that some/most
> >> >of that data will be held in a file.
> >> >
> >> >To do so, simply use another load method, e.g.
> >> >
> >> >load(File file, boolean useScratchFiles)
> >> >>
> >> >>
> >> >> Wednesday, 1 July 2015, 13:04 +02:00 from Andreas Lehmkühler < andr...@lehmi.de >:
> >> >> >
> >> >> >
> >> >> >> Alex Sviridov < ooo_satu...@mail.ru > wrote on 1 July 2015 at 12:58:
> >> >> >>
> >> >> >>
> >> >> >> Thank you for the answer.
> >> >> >> I tried pdfbox-app-2.0.0-20150630.220424-1464.jar; the result is
> >> >> >> the same.
> >> >> >>
> >> >> >> When I create images I add them to a javafx FlowPane. However, the
> >> >> >> problem is not in the images because, I repeat, I get 400mb when I
> >> >> >> do pdfDocument=null,pdfRenderer=null.
> >> >> >>
> >> >> >> Besides, when I do pdfDocument = PDDocument.load(new File(fileName))
> >> >> >> I don't have any problems with memory.
> >> >> >>
> >> >> >> I get the memory problem when I run getPageThumbImage in a for loop.
> >> >> >>
> >> >> >> I am sure that the problem is in PdfBox. Please, help me.
> >> >> >Maybe, but I'm not sure at all.
> >> >> >
> >> >> >Try to use the scratch file.
> >> >> >
> >> >> >> Wednesday, 1 July 2015, 12:48 +02:00 from Andreas Lehmkühler < andr...@lehmi.de >:
> >> >> >> >
> >> >> >> >
> >> >> >> >> Alex Sviridov < ooo_satu...@mail.ru > wrote on 1 July 2015 at 10:16:
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> I want to display all page thumbnails. However, I came across a
> >> >> >> >> memory size problem with PDFRenderer or PDDocument - I don't know
> >> >> >> >> which one.
> >> >> >> >>
> >> >> >> >> I have the following code:
> >> >> >> >> ....
> >> >> >> >> private PDDocument pdfDocument;
> >> >> >> >>
> >> >> >> >> private PDFRenderer pdfRenderer;
> >> >> >> >>
> >> >> >> >> public WritableImage getPageThumbImage(int page){
> >> >> >> >>     WritableImage result=null;
> >> >> >> >>     try {
> >> >> >> >>         BufferedImage bi=pdfRenderer.renderImageWithDPI(page, 12, ImageType.RGB);
> >> >> >> >>         result=SwingFXUtils.toFXImage(bi, null);
> >> >> >> >>     } catch (IOException ex) {
> >> >> >> >>         ....
> >> >> >> >>     }
> >> >> >> >>     return result;
> >> >> >> >> }
> >> >> >> >> .....
> >> >> >> >> I run the method getPageThumbImage in a for loop for every page. I
> >> >> >> >> set the java memory heap to 500mb.
> >> >> >> >> And I can get about 30 images using getPageThumbImage (if I set
> >> >> >> >> more memory I get more).
> >> >> >> >> In my application I have real-time memory graphs and they show
> >> >> >> >> that memory fills up very fast.
> >> >> >> >> When there is no more free memory getPageThumbImage hangs - no
> >> >> >> >> exception, nothing. But the code stops.
> >> >> >> >> When I do pdfDocument=null,pdfRenderer=null I get about 400mb of
> >> >> >> >> free memory. How to solve this problem?
> >> >> >> >There are 2 possible issues and maybe both are relevant.
> >> >> >> >
> >> >> >> >1. PDFBox consumes more or less memory to load a pdf depending on
> >> >> >> >the size and the content of the pdf.
> >> >> >> >
> >> >> >> >- Are you using the latest 2.0.0-SNAPSHOT? There were some
> >> >> >> >improvements concerning the memory footprint lately
> >> >> >> >- Try to use a scratch file (there are load methods including a
> >> >> >> >boolean switch to activate that)
> >> >> >> >
> >> >> >> >2. Your own implementation consumes more or less memory to process
> >> >> >> >those thumbnails
> >> >> >> >
> >> >> >> >- check if you are releasing all resources (especially those images
> >> >> >> >you're creating) you are using during your process
> >> >> >> >
> >> >> >> >HTH,
> >> >> >> >Andreas
> >> >> >> >
> >> >> >> >---------------------------------------------------------------------
> >> >> >> >To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> >> >> >> >For additional commands, e-mail: users-h...@pdfbox.apache.org
> >> >> >> >
> >> >> >>
> >> >> >> --
> >> >> >> Alex Sviridov
> >> >> >
> >> >> >BR
> >> >> >Andreas
> >> >> >
> >> >>
> >> >> --
> >> >> Alex Sviridov
> >> >
> >> >BR
> >> >Andreas
> >> >
> >>
> >> --
> >> Alex Sviridov
> >
>
> --
> Alex Sviridov
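
Regarding the DataFormatException quoted above: since PDFBox isn't supposed
to be thread safe, one possible way to avoid rendering from two threads at
once is to route every render call through a single-threaded executor. The
sketch below shows only that pattern with plain JDK classes. SerialRenderQueue
is a hypothetical helper, and the string-returning lambdas are placeholders;
in the application each job would presumably wrap a call such as
pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: funnels all work that touches a non-thread-safe
// object through one worker thread, so jobs never run concurrently.
public class SerialRenderQueue implements AutoCloseable {

    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "pdf-render");
        t.setDaemon(true); // do not keep the JVM alive just for this worker
        return t;
    });

    // Submits a job; jobs are executed one at a time, in submission order.
    public <T> Future<T> submit(Callable<T> job) {
        return worker.submit(job);
    }

    // Stops accepting new jobs; already submitted jobs still finish.
    @Override
    public void close() {
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try (SerialRenderQueue queue = new SerialRenderQueue()) {
            // Placeholder jobs standing in for thumbnail and full-size rendering.
            Future<String> thumb = queue.submit(() -> "thumbnail of page 0");
            Future<String> big = queue.submit(() -> "full-size page 0");
            System.out.println(thumb.get());
            System.out.println(big.get());
        }
    }
}
```

With this arrangement the thumbnail task and the full-size render would both
submit to the same queue, so the second job waits for the first instead of
decoding the same stream at the same time. The price is that a large render
can be delayed behind pending thumbnails.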