[ https://issues.apache.org/jira/browse/PDFBOX-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764000#comment-17764000 ]
Axel Howind commented on PDFBOX-5681:
-------------------------------------
You are right. It would fix the crash, but the result might be incorrect.
I just came up with the following. I ran mvn verify and it looks good, but I
did not check whether or how it affects performance. What do you think?
{code:java}
public List<COSObject> getObjectsByType(COSName type1, COSName type2)
{
    List<COSObject> retval = new ArrayList<>();
    Set<COSObjectKey> processedKeys = new HashSet<>();
    // iterate over a snapshot so that changes to xrefTable during the scan
    // cannot invalidate the iterator
    Set<COSObjectKey> remainingKeys = Set.copyOf(xrefTable.keySet());
    do {
        for (COSObjectKey objectKey : remainingKeys)
        {
            COSObject objectFromPool = getObjectFromPool(objectKey);
            COSBase realObject = objectFromPool.getObject();
            if (realObject instanceof COSDictionary)
            {
                COSName dictType = ((COSDictionary) realObject).getCOSName(COSName.TYPE);
                if (type1.equals(dictType) || (type2 != null && type2.equals(dictType)))
                {
                    retval.add(objectFromPool);
                }
            }
        }
        // pick up any keys that were added to xrefTable during this pass
        processedKeys.addAll(remainingKeys);
        remainingKeys = new HashSet<>(xrefTable.keySet());
        remainingKeys.removeAll(processedKeys);
    } while (!remainingKeys.isEmpty());
    return retval;
}
{code}
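For illustration only, here is a minimal self-contained sketch of the same idea outside PDFBox. It assumes the exception is caused by keys being added to the table while its key set is being iterated. A plain HashMap stands in for xrefTable, and the class SnapshotScanSketch and its helpers sampleTable(), visit(), naiveScanThrows() and findByType() are hypothetical names made up for this example; visit() simulates an object lookup that registers a new key as a side effect.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SnapshotScanSketch
{
    // Hypothetical stand-in for the xref table: object number -> type name.
    static Map<Integer, String> sampleTable()
    {
        Map<Integer, String> table = new HashMap<>();
        table.put(1, "Catalog");
        table.put(2, "Filespec");
        return table;
    }

    // Simulated lookup: visiting key 1 adds key 3 to the table as a side effect.
    static String visit(Map<Integer, String> table, Integer key)
    {
        if (key == 1)
        {
            table.putIfAbsent(3, "Filespec");
        }
        return table.get(key);
    }

    // Iterates the live key set; returns true if the iterator fails fast.
    static boolean naiveScanThrows(Map<Integer, String> table)
    {
        try
        {
            for (Integer key : table.keySet())
            {
                visit(table, key);
            }
            return false;
        }
        catch (ConcurrentModificationException e)
        {
            return true;
        }
    }

    // Iterates over snapshots and repeats until no unprocessed keys remain.
    static List<Integer> findByType(Map<Integer, String> table, String type)
    {
        List<Integer> hits = new ArrayList<>();
        Set<Integer> processed = new HashSet<>();
        Set<Integer> remaining = new HashSet<>(table.keySet());
        while (!remaining.isEmpty())
        {
            for (Integer key : remaining)
            {
                if (type.equals(visit(table, key)))
                {
                    hits.add(key);
                }
            }
            // Collect keys that appeared in the table during this pass.
            processed.addAll(remaining);
            remaining = new HashSet<>(table.keySet());
            remaining.removeAll(processed);
        }
        return hits;
    }

    public static void main(String[] args)
    {
        System.out.println("naive scan throws: " + naiveScanThrows(sampleTable()));
        System.out.println("matching keys: " + findByType(sampleTable(), "Filespec"));
    }
}
```

In this sketch the snapshot loop also finds key 3, which only appears during the first pass; a single pass over a one-time copy would avoid the exception but miss it. That is the "result might be incorrect" concern mentioned above. The cost is one extra copy of the key set per pass, which is why the performance impact would still need to be measured on real documents.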
> ConcurrentModificationException in getObjectsByType() in 3.x
> ------------------------------------------------------------
>
> Key: PDFBOX-5681
> URL: https://issues.apache.org/jira/browse/PDFBOX-5681
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 3.0.0 PDFBox
> Reporter: Tim Allison
> Priority: Minor
> Attachments: PDFBOX-3714-2.pdf
>
>
> [~tilman]'s regression testing turned up this exception when we integrated
> PDFBox 3.0.0 into Tika:
> {noformat}
> java.util.ConcurrentModificationException
> at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1597)
> at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1620)
> at org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:254)
> at org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:240)
> {noformat}
> I can replicate this exception consistently on the attached file
> with this code:
> {noformat}
> Path path = Paths.get("/.../PDFBOX-3714-2.pdf");
> PDDocument document = Loader.loadPDF(path.toFile());
> List<COSObject> objs = document.getDocument().getObjectsByType(COSName.FILESPEC);
> {noformat}