[ https://issues.apache.org/jira/browse/PDFBOX-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16678851#comment-16678851 ]
Tim Allison commented on PDFBOX-4370:
-------------------------------------

There are two other files in our new corpus that trigger this. I'm ok with a "won't fix"...unless the solution is fairly easy...and there's no rush if this is fixable. :) Thank you!

> Jempbox's ResourceEvent crazily slow to initialize
> --------------------------------------------------
>
>                 Key: PDFBOX-4370
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4370
>             Project: PDFBox
>          Issue Type: Task
>          Components: JempBox
>    Affects Versions: 1.8.16
>            Reporter: Tim Allison
>            Priority: Trivial
>         Attachments: slow.zip
>
>
> In our new batch of regression files on Tika, one of the new PDFs caused a
> timeout. This is not an infinite loop, but it does take several minutes.
> This may not be fixable.
> Admittedly, the XMP is large, and there are quite a few events.
> This is the code that triggers the problem.
> {noformat}
> XMPMetadata xmp = XMPMetadata.load(is);
> XMPSchemaMediaManagement mmSchema = xmp.getMediaManagementSchema();
> mmSchema.getHistory();
> {noformat}
> The slow part _seems_ to be setting the attribute namespace when creating a
> new ResourceEvent. When I comment out the following in ResourceEvent's
> initializer, the processing time is quite fast (1 second).
> {noformat}
> parent.setAttributeNS(
>     XMPSchema.NS_NAMESPACE,
>     "xmlns:stEvt",
>     NAMESPACE );
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
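For anyone who wants to poke at the setAttributeNS suspicion quoted above without pulling in Jempbox, here is a rough, JDK-only timing sketch. It is not Jempbox code and may well not reproduce the slowdown, since that likely depends on how the document was parsed and on the DOM implementation in use. The class name, element names, event count, and both namespace constants are assumptions made for illustration (in particular, that XMPSchema.NS_NAMESPACE is the standard xmlns namespace URI).

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NamespaceDeclTiming {
    // Assumed values, for illustration only (not copied from Jempbox).
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";
    static final String ST_EVT_NS = "http://ns.adobe.com/xap/1.0/sType/ResourceEvent#";

    // Builds a stand-in "history" element with the given number of event children.
    // declarePerEvent = true: declare xmlns:stEvt on every event element.
    // declarePerEvent = false: declare it once on the enclosing element.
    static Element buildHistory(int events, boolean declarePerEvent) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element seq = doc.createElement("Seq");
        doc.appendChild(seq);
        if (!declarePerEvent) {
            seq.setAttributeNS(XMLNS_NS, "xmlns:stEvt", ST_EVT_NS);
        }
        for (int i = 0; i < events; i++) {
            Element li = doc.createElement("li");
            seq.appendChild(li);
            if (declarePerEvent) {
                li.setAttributeNS(XMLNS_NS, "xmlns:stEvt", ST_EVT_NS);
            }
        }
        return seq;
    }

    public static void main(String[] args) throws Exception {
        int events = 20_000; // arbitrary; the real file's event count is not stated
        for (boolean perEvent : new boolean[] {true, false}) {
            long start = System.nanoTime();
            buildHistory(events, perEvent);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("declarePerEvent=" + perEvent + ": " + elapsedMs + " ms");
        }
    }
}
```

If the two timings come out similar, the cost is probably tied to the parsed document from slow.zip rather than to setAttributeNS on its own, and profiling the original three-line reproduction would be the next thing to try.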