Hi Tilman,

I used a decompiler to have a look at the sources.

Perhaps it would be a good idea to mark the no-argument Splitter() constructor as deprecated

    @Deprecated
    public Splitter() {}

    public Splitter(MemoryUsageSetting memoryUsageSetting)
    {
        this.memoryUsageSetting = memoryUsageSetting;
    }

to point people to the improvement before they fall into the out-of-memory hole themselves.

Please add a program argument to PDFSplit.split() like so:

    if (args[i].equals("-memory"))
    {
        if (++i >= args.length)
        {
            PDFSplit.usage();
        }
        if (args[i].equals("tempFile"))
        {
            memoryUsageSetting = .........
        }
        else if (args[i].equals("mainMemory"))
        {
            memoryUsageSetting = .........
        }
        else if (args[i].equals("mixed"))
        {
            memoryUsageSetting = .........
        }
        else
        {
            PDFSplit.usage();
        }
        continue;
    }

Perhaps it would even be a good idea to make "maxMainMemoryBytes" and "maxStorageBytes" configurable, too.

Thanks a lot - I really appreciate your great work and support!

Cheers,
Daniel

-----Original Message-----
From: Tilman Hausherr [mailto:thaush...@t-online.de]
Sent: Thursday, 13 July 2017 21:21
To: users@pdfbox.apache.org
Subject: Re: Splitter.createNewDocument() always uses main memory only - this leads to out of memory when splitting large documents

See https://issues.apache.org/jira/browse/PDFBOX-3869 and try a snapshot from
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.7-SNAPSHOT/
(at the bottom).

Please give feedback on whether this is what you wanted. Please do it quickly, because a new version will be built on Monday, so either I'd have to revert before then or we'll be stuck with this API.

Re: a global configuration - maybe at a later time. I'm not THAT convinced that it is needed.

Tilman

On 13.07.2017 at 09:20, d.ham...@aurenz.de wrote:
> Hi dear contributors to pdfbox,
>
> I would just like to report that Splitter.createNewDocument() should be able
> to take different MemoryUsageSetting configurations into account.
>
> In version 2.0.6 this method is implemented as
>
>     protected PDDocument createNewDocument() throws IOException
>     {
>         PDDocument document = new PDDocument();
>         document.getDocument().setVersion(getSourceDocument().getVersion());
>         document.setDocumentInformation(getSourceDocument().getDocumentInformation());
>         document.getDocumentCatalog().setViewerPreferences(
>                 getSourceDocument().getDocumentCatalog().getViewerPreferences());
>         return document;
>     }
>
> I would suggest introducing a member variable "MemoryUsageSetting
> memSetting" that can be set for each instance of "Splitter".
>
> This way createNewDocument() could be implemented as
>
>     protected PDDocument createNewDocument() throws IOException
>     {
>         PDDocument document = new PDDocument(this.memSetting);
>         document.getDocument().setVersion(getSourceDocument().getVersion());
>         document.setDocumentInformation(getSourceDocument().getDocumentInformation());
>         document.getDocumentCatalog().setViewerPreferences(
>                 getSourceDocument().getDocumentCatalog().getViewerPreferences());
>         return document;
>     }
>
> Thankfully createNewDocument() is not private, so I could override
> this method in my child class, as I did for "protected void
> processPage()", too (just FYI - to create process messages).
>
> Please have a look at "PDFMergerUtility.mergeDocuments()", which has been
> deprecated since MemoryUsageSetting was introduced. Now the use of
> "PDFMergerUtility.mergeDocuments(MemoryUsageSetting memUsageSetting)" is
> encouraged.
>
> By the way: the utility "PDFSplit" would have to be updated to pass a
> configured MemoryUsageSetting to "Splitter" - otherwise this tool relies on
> main memory only.
>
> Perhaps it would be a good thing to be able to define a "pdfbox-wide"
> basic MemoryUsageSetting which could be used everywhere as a fallback.
> This way the default constructor of PDDocument could be changed from
> its implementation in version 2.0.6
>
>     public PDDocument()
>     {
>         this(MemoryUsageSetting.setupMainMemoryOnly());
>     }
>
> to something like
>
>     public PDDocument()
>     {
>         this(MemoryUsageSetting.asConfigured());
>     }
>
> Regards,
>
> Daniel
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: users-h...@pdfbox.apache.org
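
For illustration only, the "-memory" handling proposed for PDFSplit in this thread could look roughly like the sketch below. It is deliberately written without PDFBox on the classpath: the class MemoryArgSketch, the enum MemoryMode and the method parseMemoryMode() are hypothetical names invented for this example and are not part of PDFBox. The comments note which MemoryUsageSetting factory method each mode would presumably map to; the actual wiring in PDFSplit is an assumption and is not shown.

```java
final class MemoryArgSketch
{
    /** Hypothetical stand-in for the three kinds of MemoryUsageSetting. */
    enum MemoryMode
    {
        MAIN_MEMORY, // would presumably map to MemoryUsageSetting.setupMainMemoryOnly()
        TEMP_FILE,   // would presumably map to MemoryUsageSetting.setupTempFileOnly()
        MIXED        // would presumably map to MemoryUsageSetting.setupMixed(maxMainMemoryBytes)
    }

    /**
     * Scans the arguments for "-memory <mode>" and returns the selected mode.
     * Falls back to MAIN_MEMORY when the option is absent, mirroring the
     * main-memory-only default of PDDocument() in 2.0.6.
     */
    static MemoryMode parseMemoryMode(String[] args)
    {
        MemoryMode mode = MemoryMode.MAIN_MEMORY;
        for (int i = 0; i < args.length; i++)
        {
            if (!args[i].equals("-memory"))
            {
                continue; // all other options are ignored in this sketch
            }
            if (++i >= args.length)
            {
                // PDFSplit would print its usage text here instead
                throw new IllegalArgumentException("-memory requires a value");
            }
            switch (args[i])
            {
                case "tempFile":
                    mode = MemoryMode.TEMP_FILE;
                    break;
                case "mainMemory":
                    mode = MemoryMode.MAIN_MEMORY;
                    break;
                case "mixed":
                    mode = MemoryMode.MIXED;
                    break;
                default:
                    throw new IllegalArgumentException("unknown -memory value: " + args[i]);
            }
        }
        return mode;
    }

    public static void main(String[] args)
    {
        // Prints the mode selected by the command line, e.g. for "-memory mixed"
        System.out.println(parseMemoryMode(args));
    }
}
```

Throwing an exception instead of calling PDFSplit.usage() is a simplification made here so that the sketch stays self-contained; a real tool would more likely print its usage text and exit.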