[jira] [Commented] (PDFBOX-5490) Add reconstruction information to the PDDocument
[ https://issues.apache.org/jira/browse/PDFBOX-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578537#comment-17578537 ]

Maruan Sahyoun commented on PDFBOX-5490:

OK - let's wait to hear what [~lehmi] has to say about that, as he's the one doing the parser (apart from other areas). Looks like we need a somewhat extensible Event data model in order to deal with different needs ...

> Add reconstruction information to the PDDocument
>
> Key: PDFBOX-5490
> URL: https://issues.apache.org/jira/browse/PDFBOX-5490
> Project: PDFBox
> Issue Type: Wish
> Components: Parsing
> Reporter: Tim Allison
> Priority: Minor
>
> When the xref has to be rebuilt or there are other anomalies in the parsing
> of the PDDocument, the results are currently logged. In a multithreaded
> environment it is not easy to reconstruct which documents had which problems.
> It would be helpful if a PDF was able to be successfully loaded to include
> information about what had to be fixed in order to load it successfully.
> Certainly, rebuilding the xref table comes to mind, but any other info would
> also be useful.
> This is a wish for 3.x. I don't think I'll have time to contribute. :(

--
This message was sent by Atlassian Jira (v8.20.10#820010)
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-5490) Add reconstruction information to the PDDocument
[ https://issues.apache.org/jira/browse/PDFBOX-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578510#comment-17578510 ]

Tim Allison commented on PDFBOX-5490:

My initial request would be for whether or not the xref table had to be rebuilt...largely because I'm somewhat interested in that at the moment. Any info at the pre-DOM stage for what had to be guessed or assumed -- alleged obj stream length != actual object stream. Other places where PDFBox currently logs warnings (missing font, missing unicode mappings etc) after the DOM has been built would also be useful.
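
To make the request above more concrete, here is a minimal sketch of what a per-document event record could look like. Everything in it is hypothetical: `ParseEvents`, `Kind`, `Event` and `EventLog` are made-up names for illustration and are not part of PDFBox, and the event kinds are simply the anomalies mentioned in this thread. The idea is that each loaded document carries its own list of events instead of relying on a shared log, which is what makes the information usable in a multithreaded environment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: none of these types exist in PDFBox.
final class ParseEvents {

    /** Anomaly kinds named in this thread; the constant names are invented. */
    enum Kind { XREF_REBUILT, OBJ_STREAM_LENGTH_MISMATCH, MISSING_FONT, MISSING_UNICODE_MAPPING }

    /** One recorded anomaly; the free-form detail keeps the model open for new needs. */
    static final class Event {
        final Kind kind;
        final String detail;

        Event(Kind kind, String detail) {
            this.kind = kind;
            this.detail = detail;
        }
    }

    /** Holds the events of a single document, so no state is shared between threads. */
    static final class EventLog {
        private final List<Event> events = new ArrayList<>();

        void record(Kind kind, String detail) {
            events.add(new Event(kind, detail));
        }

        boolean has(Kind kind) {
            return events.stream().anyMatch(e -> e.kind == kind);
        }

        List<Event> all() {
            return Collections.unmodifiableList(events);
        }
    }

    public static void main(String[] args) {
        EventLog log = new EventLog();
        log.record(Kind.XREF_REBUILT, "xref table had to be rebuilt");
        log.record(Kind.MISSING_FONT, "fallback font used");
        System.out.println("xref rebuilt: " + log.has(Kind.XREF_REBUILT));
        System.out.println("events recorded: " + log.all().size());
    }
}
```

A caller would ask the log of the document it just loaded, for example `log.has(Kind.XREF_REBUILT)`, rather than grepping log output. How such a log would be attached to a PDDocument is left open here.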
[jira] [Comment Edited] (PDFBOX-5490) Add reconstruction information to the PDDocument
[ https://issues.apache.org/jira/browse/PDFBOX-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578480#comment-17578480 ]

Maruan Sahyoun edited comment on PDFBOX-5490 at 8/11/22 1:45 PM:

[~lehmi] thoughts? I could do a small patch for an initial PoC - maybe initially using the FOP events package, but I haven't looked into it.

[~tallison] what information would you like to capture? Just the fact that there was some repair, or is there more information you are looking for?

Maybe it would be wise to postpone that until after 3.0.
[jira] [Commented] (PDFBOX-5485) Stackoverflow writing out a subset of PDF pages - COSWriterObjectStream
[ https://issues.apache.org/jira/browse/PDFBOX-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578399#comment-17578399 ]

Andreas Lehmkühler commented on PDFBOX-5485:

[~omcgovern] thanks for the report and the input especially the test. I've fixed the Stackoverflow. You might check the fix using the next upcoming snapshot

> Stackoverflow writing out a subset of PDF pages - COSWriterObjectStream
>
> Key: PDFBOX-5485
> URL: https://issues.apache.org/jira/browse/PDFBOX-5485
> Project: PDFBox
> Issue Type: Bug
> Components: Writing
> Affects Versions: 3.0.0 PDFBox
> Environment: MacOS, but likely not OS specific.
> Reporter: Owen McGovern
> Assignee: Andreas Lehmkühler
> Priority: Major
> Fix For: 3.0.0 PDFBox
>
> Version: org.apache.pdfbox:pdfbox:3.0.0-alpha3
>
> In a subset of PDFs I process, I cannot extract a range of PDF pages and
> write them out to a new PDF. ( As part of test code )
> Here's the Kotlin code I use
> {code:java}
> fun extractPages(documentName: String, fromPage: Int, toPage: Int) : Path {
>     val pdfFile = Paths.get("data", "input", "PDFS", "${documentName}.pdf")
>     val pdfPagesFile = Paths.get("data", "input", "PDFS", "${documentName}_Page_$fromPage-$toPage.pdf")
>     val pdfDoc = org.apache.pdfbox.Loader.loadPDF(pdfFile.toFile())
>     val pageExtractor = PageExtractor(pdfDoc, fromPage, toPage)
>     val pdfPages = pageExtractor.extract()
>     pdfPages.save(pdfPagesFile.toFile())
>     return pdfPagesFile
> }{code}
> It doesn't occur in all PDFS... maybe 10-20% of the PDFs I use.
>
> The a slice of the stack trace is
> {code:java}
> java.lang.StackOverflowError
>     at java.base/java.util.HashMap.tableSizeFor(HashMap.java:380)
>     at java.base/java.util.HashMap.<init>(HashMap.java:453)
>     at java.base/java.util.LinkedHashMap.<init>(LinkedHashMap.java:347)
>     at java.base/java.util.HashSet.<init>(HashSet.java:162)
>     at java.base/java.util.LinkedHashSet.<init>(LinkedHashSet.java:154)
>     at org.apache.pdfbox.util.SmallMap.entrySet(SmallMap.java:380)
>     at org.apache.pdfbox.cos.COSDictionary.entrySet(COSDictionary.java:1225)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:336)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:230)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSArray(COSWriterObjectStream.java:319)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:226)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:341)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:230)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:341)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:230)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSArray(COSWriterObjectStream.java:319)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:226)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:341)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:230)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:341)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:230)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSArray(COSWriterObjectStream.java:319)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:226)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:341)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:230)
>     at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:341)
> {code}
> As I mentioned, hits some PDFs, not all.
> I legally cannot share the original source PDFs but it looks like a recursive
> loop in writeCOSDictionary and writeObject in COSWriterObjectStream.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
To unsubscribe, e-mail:
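
The recursion the reporter describes (dictionary, array, dictionary, and so on, without end) is the usual symptom of walking a cyclic object graph without remembering what is already being written. The sketch below illustrates one generic guard against that, using plain `Map` and `List` values in place of COS objects. It is an assumption-laden illustration, not the change made in r1903348: `CycleSafeWriter` and its `<ref>` placeholder are invented for this example, and a real PDF writer would emit an indirect object reference at that point.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: plain Map/List stand in for COSDictionary/COSArray.
final class CycleSafeWriter {

    /** Serializes nested maps and lists; a container already on the current path becomes "<ref>". */
    static String write(Object root) {
        StringBuilder out = new StringBuilder();
        // Identity-based set: avoids calling hashCode() on a cyclic map, which would itself recurse.
        Set<Object> onPath = Collections.newSetFromMap(new IdentityHashMap<>());
        write(root, out, onPath);
        return out.toString();
    }

    private static void write(Object node, StringBuilder out, Set<Object> onPath) {
        if (!(node instanceof Map) && !(node instanceof List)) {
            out.append(node); // leaf value
            return;
        }
        if (!onPath.add(node)) {
            out.append("<ref>"); // already being written further up the call stack
            return;
        }
        if (node instanceof Map) {
            out.append("<<");
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
                out.append('/').append(entry.getKey()).append(' ');
                write(entry.getValue(), out, onPath);
                out.append(' ');
            }
            out.append(">>");
        } else {
            out.append('[');
            for (Object item : (List<?>) node) {
                write(item, out, onPath);
                out.append(' ');
            }
            out.append(']');
        }
        onPath.remove(node); // leaving this container; it may legitimately appear again elsewhere
    }

    public static void main(String[] args) {
        Map<String, Object> parent = new LinkedHashMap<>();
        Map<String, Object> page = new LinkedHashMap<>();
        List<Object> kids = new ArrayList<>();
        parent.put("Kids", kids);
        kids.add(page);
        page.put("Parent", parent); // back-reference closes the cycle
        System.out.println(write(parent)); // terminates instead of overflowing the stack
    }
}
```

Tracking only the containers on the current path, rather than every container ever seen, means a dictionary shared by two branches is still written in both places, while a genuine back-reference such as Parent is cut off.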
[jira] [Commented] (PDFBOX-5485) Stackoverflow writing out a subset of PDF pages - COSWriterObjectStream
[ https://issues.apache.org/jira/browse/PDFBOX-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578398#comment-17578398 ]

ASF subversion and git services commented on PDFBOX-5485:

Commit 1903349 from le...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1903349 ]

PDFBOX-5485: add test as proposed by Owen McGovern
[jira] [Commented] (PDFBOX-5485) Stackoverflow writing out a subset of PDF pages - COSWriterObjectStream
[ https://issues.apache.org/jira/browse/PDFBOX-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578396#comment-17578396 ]

ASF subversion and git services commented on PDFBOX-5485:

Commit 1903348 from le...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1903348 ]

PDFBOX-5485: avaoid StackOverflowException
[jira] [Commented] (PDFBOX-5485) Stackoverflow writing out a subset of PDF pages - COSWriterObjectStream
[ https://issues.apache.org/jira/browse/PDFBOX-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578387#comment-17578387 ]

ASF subversion and git services commented on PDFBOX-5485:

Commit 1903345 from le...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1903345 ]

PDFBOX-5485: preserve origin key