Jordi Boixadera created PDFBOX-5110:
---------------------------------------

             Summary: improve performance in signature validation
                 Key: PDFBOX-5110
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5110
             Project: PDFBox
          Issue Type: Improvement
          Components: Signing
    Affects Versions: 2.0.22, 3.0.0 PDFBox
         Environment: java8
            Reporter: Jordi Boixadera
         Attachments: COSFilterInputStream.java, TESTPdfbox.zip

We are developing software that validates the integrity of PDF signatures 
using PDFBox.

We ran into a performance problem during PDF validation, tracked it down to a 
single class, and wrote an improved version of that class.

We validate in the same way as the "validateSignaturesImproved" method in 
[https://github.com/mkl-public/testarea-pdfbox2/blob/master/src/test/java/mkl/testarea/pdfbox2/sign/ValidateSignature.java].
This method uses PDSignature.getContents() and 
PDSignature.getSignedContent().

When validating large PDF files (more than 3 MB), we noticed that validation 
took a very long time.

We eventually found that 
org.apache.pdfbox.pdmodel.interactive.digitalsignature.COSFilterInputStream 
reads the document byte by byte and checks the byte ranges for every byte.

We have rewritten COSFilterInputStream to work with byte blocks and the 
validation time has dropped a lot.
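
To illustrate the idea for reviewers, here is a minimal sketch of a block-wise byte-range filter. It is not the attached COSFilterInputStream.java. The class name ByteRangeInputStream and the readAll helper are hypothetical, and the sketch assumes the ranges are given as ascending, non-overlapping (offset, length) pairs, as in a signature's /ByteRange array. The range check runs once per block rather than once per byte.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch: passes through only the bytes inside the given ranges. */
class ByteRangeInputStream extends FilterInputStream {
    private final long[] ranges; // assumed layout: offset0, length0, offset1, length1, ...
    private int rangeIndex = 0;  // index of the current range's offset in 'ranges'
    private long position = 0;   // absolute position in the underlying stream

    ByteRangeInputStream(InputStream in, long[] ranges) {
        super(in);
        this.ranges = ranges.clone();
    }

    @Override
    public int read() throws IOException {
        // Single-byte reads delegate to the block method.
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        return n == -1 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (rangeIndex < ranges.length) {
            long start = ranges[rangeIndex];
            long end = start + ranges[rangeIndex + 1];
            if (position >= end) {
                rangeIndex += 2; // current range is exhausted, move to the next one
                continue;
            }
            if (position < start) {
                // Skip the gap before the range in one call instead of byte by byte.
                long skipped = in.skip(start - position);
                if (skipped <= 0) {
                    // skip() may return 0 without being at EOF; probe with a single read.
                    if (in.read() == -1) {
                        return -1;
                    }
                    skipped = 1;
                }
                position += skipped;
                continue;
            }
            // Inside a range: read as much as fits in both the buffer and the range.
            int toRead = (int) Math.min(len, end - position);
            int n = in.read(b, off, toRead);
            if (n > 0) {
                position += n;
            }
            return n;
        }
        return -1; // all ranges consumed
    }

    /** Hypothetical helper: drains a stream into a byte array (Java 8 compatible). */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```

For example, wrapping the 16 ASCII bytes "0123456789ABCDEF" with the ranges {0, 4, 10, 3} yields "0123ABC". The sketch does not override skip(), available() or mark/reset, so callers should only use the read methods.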

We have tested this in PDFBox 2.0.22 and 3.0.0-SNAPSHOT. We have attached the 
test project (TESTPdfbox.zip). Here is the code that reproduces the problem:
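{code:java}
// Both the document and the raw file stream are closed by try-with-resources.
try (PDDocument doc = Loader.loadPDF(new File(args[0]));
     FileInputStream in = new FileInputStream(args[0])) {
    PDSignature signature = doc.getLastSignatureDictionary();
    // getSignedContent() reads the signed byte ranges via COSFilterInputStream.
    byte[] signedContent = signature.getSignedContent(in);
    byte[] signatureBytes = signature.getContents();
}
{code}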
Without our modification, with a 3MB signed PDF, it takes 10 seconds to do 
this. With our modification, it takes 0.2 seconds.

We would like to have this improved code reviewed for inclusion in PDFBox.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
