Hi PDFBox Team, I have identified a potential bug in Apache PDFBox and would like to report it. Below are the details:
- **PDFBox Version**: 2.0.32, 3.0.0
- **Java Version**: 11

When there are a large number of sources (e.g., thousands), every loaded source `PDDocument` is added to the `tobeclosed` list and is closed only after the whole merge has finished. All source documents therefore stay in memory at the same time, which poses a risk of an OutOfMemoryError (OOM) during the merge process.

The following adjustment can be made in `org.apache.pdfbox.multipdf.PDFMergerUtility#legacyMergeDocuments`: close each source document as soon as it has been appended, instead of collecting it in `tobeclosed`.

```java
for (Object sourceObject : sources) {
    PDDocument sourceDoc = null;
    if (sourceObject instanceof File) {
        sourceDoc = PDDocument.load((File) sourceObject, partitionedMemSetting);
    } else {
        sourceDoc = PDDocument.load((InputStream) sourceObject, partitionedMemSetting);
    }
    try {
        appendDocument(destination, sourceDoc);
    } finally {
        // Close the source right after appending so it does not stay in memory
        IOUtils.closeAndLogException(sourceDoc, LOG, "PDDocument", null);
    }
}
```

One test case: comparison of memory usage before and after the modification (merging a 16.8 MB file 200 times, with the JVM heap size limited to 2 GB).

- **Before Modification**: An OutOfMemoryError (OOM) occurred after just over 1 minute of operation. Because heap memory was insufficient, Full GC (full garbage collection) was triggered frequently, which can be observed from the CPU usage curve on the left.
- **After Modification**: The heap memory is collected normally and no OOM occurs.

Thank you for your attention. Please let me know if you need any further information.

Best regards