[ https://issues.apache.org/jira/browse/PDFBOX-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr updated PDFBOX-5110:
------------------------------------
    Labels: optimization  (was: )

> improve performance in signature validation
> -------------------------------------------
>
>                 Key: PDFBOX-5110
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5110
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Signing
>    Affects Versions: 2.0.22, 3.0.0 PDFBox
>         Environment: java8
>            Reporter: Jordi Boixadera
>            Priority: Major
>              Labels: optimization
>         Attachments: COSFilterInputStream.java, TESTPdfbox.zip
>
>
> We are developing software that validates the integrity of PDF signatures
> using PDFBox.
> We ran into a performance problem in PDF validation, found the class that
> causes it, and wrote an improved version of that class.
> We do the validation the same way as it is done here
> [https://github.com/mkl-public/testarea-pdfbox2/blob/master/src/test/java/mkl/testarea/pdfbox2/sign/ValidateSignature.java],
> method "validateSignaturesImproved". This method uses
> PDSignature.getContents() and PDSignature.getSignedContent().
> When validating big PDF files of more than 3MB, we noticed that validation
> was very slow.
> In the end we found that
> org.apache.pdfbox.pdmodel.interactive.digitalsignature.COSFilterInputStream
> was reading the document byte by byte, checking the byte ranges for every
> single byte.
> We have rewritten COSFilterInputStream to work with byte blocks, and the
> validation time has dropped a lot.
> We have tested this in PDFBox 2.0.22 and 3.0.0-SNAPSHOT. We have attached the
> test project (TestPDFBox.zip).
> Here is the code that reproduces the problem:
> {code:java}
> // PDFBox 3.0.0 API; on 2.0.x use PDDocument.load(File) instead of Loader.loadPDF(File)
> try (PDDocument doc = Loader.loadPDF(new File(args[0]));
>      InputStream in = new FileInputStream(args[0])) {
>     PDSignature signature = doc.getLastSignatureDictionary();
>     // the signed bytes are read from the file through COSFilterInputStream
>     byte[] signedContent = signature.getSignedContent(in);
>     byte[] signatureBytes = signature.getContents();
> }
> {code}
> Without our modification, this takes 10 seconds for a 3MB signed PDF.
> With our modification, it takes 0.2 seconds.
> We would like to have this improved code reviewed in pdfbox.
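
For readers without the attachment at hand, the block-based idea described in the issue can be sketched roughly as follows. This is not the attached COSFilterInputStream.java and not PDFBox code: the class name RangeFilterInputStream is hypothetical, and the sketch assumes the ranges arrive as an int[] of ascending, non-overlapping (offset, length) pairs, the layout of a signature's /ByteRange entry. It decides once per block how many bytes belong to the current range, instead of checking the ranges for every byte.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch: passes through only the bytes that lie inside the given ranges. */
class RangeFilterInputStream extends InputStream {
    private final InputStream in;
    private final long[][] ranges; // {start, endExclusive}; assumed ascending and non-overlapping
    private int rangeIndex = 0;
    private long position = 0;     // absolute offset in the underlying stream

    /** byteRange holds (offset, length) pairs, e.g. {0, 3, 7, 3}. */
    RangeFilterInputStream(InputStream in, int[] byteRange) {
        this.in = in;
        this.ranges = new long[byteRange.length / 2][];
        for (int i = 0; i < ranges.length; i++) {
            long start = byteRange[2 * i];
            ranges[i] = new long[] {start, start + byteRange[2 * i + 1]};
        }
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (rangeIndex < ranges.length) {
            long start = ranges[rangeIndex][0];
            long end = ranges[rangeIndex][1];
            if (position >= end) {
                rangeIndex++;                 // current range is exhausted
            } else if (position < start) {
                if (!skipFully(start - position)) {
                    return -1;                // stream ended inside a gap
                }
            } else {
                // one range check per block, then a bulk read
                int toRead = (int) Math.min(len, end - position);
                int n = in.read(b, off, toRead);
                if (n > 0) {
                    position += n;
                }
                return n;
            }
        }
        return -1;                            // no ranges left
    }

    /** Skips exactly n bytes of the underlying stream; returns false on early end of stream. */
    private boolean skipFully(long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() == -1) {
                    return false;
                }
                skipped = 1;
            }
            position += skipped;
            n -= skipped;
        }
        return true;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

A caller would wrap the file stream in this class and drain it with a buffer of a few kilobytes. How the attached class handles buffering and gaps between ranges may well differ from this sketch.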