[ 
https://issues.apache.org/jira/browse/PDFBOX-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17763754#comment-17763754
 ] 

Tim Allison commented on PDFBOX-5681:
-------------------------------------

I initially thought this was a threading issue, but it isn't.  The exception 
can be thrown whenever the underlying collection is structurally modified 
while an iterator is traversing it, even from the same thread.
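
To illustrate the single-thread case, here is a minimal standalone sketch using only java.util. It is not PDFBox code; the class and method names are made up for the example. It adds a key through computeIfAbsent while iterating the key set of the same HashMap:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of PDFBox.
class SameThreadCme {

    // Returns true if adding a key while iterating the key set raises
    // ConcurrentModificationException, with everything on one thread.
    static boolean modifyWhileIterating() {
        Map<Integer, String> map = new HashMap<>();
        map.put(1, "a");
        map.put(2, "b");
        map.put(3, "c");
        try {
            for (Integer key : map.keySet()) {
                // Structural modification: inserts a key that is not yet present.
                map.computeIfAbsent(key + 100, k -> "new");
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + modifyWhileIterating());
    }
}
```

The iterator only notices the change on its next call to next(), which matches the HashMap$HashIterator.nextNode frame at the top of the reported stack trace.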

My guess, still unconfirmed, is that the {{computeIfAbsent}} call in 
{{getObjectFromPool}} is somehow modifying the xRefTable key set while it is 
being iterated over.
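
If that guess is right, one general way to avoid this class of failure is to iterate over a snapshot of the keys rather than the live key set. The sketch below only shows the pattern; it is not the PDFBox fix, and "table" and the way entries get added are hypothetical stand-ins for whatever actually happens during the parse:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of PDFBox.
class SnapshotIteration {

    // Iterates over a copy of the key set, so insertions into the map
    // during the loop cannot invalidate the iterator. Returns the final size.
    static int iterateOverSnapshot() {
        // Stand-in for an xref-like table that can grow while objects are loaded.
        Map<Integer, Long> table = new HashMap<>();
        table.put(1, 10L);
        table.put(2, 20L);
        table.put(3, 30L);

        // Snapshot taken before the loop; later insertions are not visited.
        List<Integer> keys = new ArrayList<>(table.keySet());
        for (Integer key : keys) {
            // Simulates a lazy load that registers a newly discovered entry.
            table.putIfAbsent(key + 100, 0L);
        }
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println("final size: " + iterateOverSnapshot());
    }
}
```

The trade-off is that entries added during the loop are not visited, which may or may not be acceptable for getObjectsByType().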

There may be another iteration + modification on a different collection during 
the parse.  The triggering object {{5 0 R}} requires parsing numerous objects 
from an xrefstream.




> ConcurrentModificationException in getObjectsByType() in 3.x
> ------------------------------------------------------------
>
>                 Key: PDFBOX-5681
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5681
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 3.0.0 PDFBox
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: PDFBOX-3714-2.pdf
>
>
> [~tilman]'s regression testing turned up this exception when we integrated 
> PDFBox 3.0.0 into Tika:
> {noformat}
> java.util.ConcurrentModificationException
>       at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1597)
>       at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1620)
>       at org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:254)
>       at org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:240)
> {noformat}
> I can replicate this exception consistently on the attached file with this code:
> {noformat}
>         Path path = Paths.get("/.../PDFBOX-3714-2.pdf");
>         PDDocument document = Loader.loadPDF(path.toFile());
>         List<COSObject> objs = document.getDocument().getObjectsByType(COSName.FILESPEC);
> {noformat}



