[
https://issues.apache.org/jira/browse/PDFBOX-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257682#comment-17257682
]
Ralf Hauser commented on PDFBOX-4297:
-------------------------------------
Re 3b) "whether it is signed" [correctly]
Looking at ShowSignature.java, it is unfortunately not yet memory-efficient.
For example, in ShowSignature.checkContentValueWithFile(File file, int[]
byteRange, byte[] contents), memory usage grows linearly with the file size
because of the contents byte array.
But there is hope, since
a) in showSignature(), when
switch (subFilter)
is executed, the "adbe.*" cases convert the "byte[] contents" back into a stream
(although I do not see where, in these cases, it is verified whether the
document has been altered)
b) verifyPKCS7() could probably work with a stream instead of "byte[] contents",
because the Bouncy Castle classes also offer stream-based approaches
(CMSSignedData has constructors that take streams instead of byte[]).
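
To illustrate the memory argument, here is a minimal sketch of digesting only the signed byte ranges with a fixed-size buffer, so that memory use does not depend on the file size. ByteRangeDigest and its methods are hypothetical names and not part of PDFBox or Bouncy Castle; the sketch assumes the byte range is an int[] of (offset, length) pairs in ascending order. It only computes a digest; checking that digest against the signature itself is not shown.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not PDFBox API): digests byte ranges with an 8 KB buffer. */
final class ByteRangeDigest {

    private ByteRangeDigest() {
    }

    /**
     * @param in        stream positioned at the first byte of the file
     * @param byteRange (offset, length) pairs, offsets ascending and non-overlapping
     * @param algorithm a JCA digest name such as "SHA-256"
     */
    static byte[] digest(InputStream in, int[] byteRange, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192]; // the only buffer, independent of file size
        long position = 0;              // absolute position in the underlying stream
        for (int i = 0; i + 1 < byteRange.length; i += 2) {
            long start = byteRange[i];
            long remaining = byteRange[i + 1];
            if (start < position) {
                throw new IllegalArgumentException("byte ranges must be ascending");
            }
            // Discard the gap before this range (for a signature, the contents hex string).
            skipFully(in, start - position);
            position = start;
            while (remaining > 0) {
                int n = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
                if (n < 0) {
                    throw new EOFException("stream ended inside a byte range");
                }
                md.update(buffer, 0, n);
                remaining -= n;
                position += n;
            }
        }
        return md.digest();
    }

    /** Skips exactly count bytes; InputStream.skip alone may skip fewer. */
    private static void skipFully(InputStream in, long count) throws IOException {
        long left = count;
        while (left > 0) {
            long skipped = in.skip(left);
            if (skipped <= 0) {
                // skip made no progress: fall back to reading a single byte
                if (in.read() < 0) {
                    throw new EOFException("stream ended before the byte range");
                }
                skipped = 1;
            }
            left -= skipped;
        }
    }
}
```

A caller would pass a FileInputStream (ideally buffered) for the PDF together with the signature's byte range.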
So to begin,
i) PDSignature.getContents(InputStream pdfFile) should be complemented by a
sibling method:
public InputStream getSignedContentStream(InputStream pdfFile) throws IOException
{
    // Not wrapped in try-with-resources: that would close the stream before
    // the caller can read from it. The caller is responsible for closing it.
    return new COSFilterInputStream(pdfFile, getByteRange());
}
ii) verifyETSIdotRFC3161() should be refactored to work with streams and not
the content byte[]
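
As a rough sketch of what such a signed-content stream could look like to a caller, here is a stand-alone filter stream that exposes only the bytes inside the given byte ranges. ByteRangeInputStream and signedContentStream are hypothetical names for illustration; this is not the PDFBox COSFilterInputStream, and it assumes (offset, length) pairs in ascending order. A method that returns a stream has to hand it over open, so the closing happens on the caller's side.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a byte-range filter stream (not PDFBox API). */
final class ByteRangeInputStream extends FilterInputStream {

    private final int[] byteRange; // (offset, length) pairs, offsets ascending
    private int rangeIndex = 0;    // index of the current pair's offset
    private long position = 0;     // absolute position in the underlying stream

    ByteRangeInputStream(InputStream in, int[] byteRange) {
        super(in);
        this.byteRange = byteRange.clone();
    }

    /** Returns an open stream; the caller reads from it and closes it. */
    static InputStream signedContentStream(InputStream pdfFile, int[] byteRange) {
        return new ByteRangeInputStream(pdfFile, byteRange);
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) < 0 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (rangeIndex + 1 < byteRange.length) {
            long start = byteRange[rangeIndex];
            long end = start + byteRange[rangeIndex + 1];
            if (position >= end) {
                rangeIndex += 2; // current range exhausted, move to the next pair
                continue;
            }
            while (position < start) {
                // Discard bytes that lie outside the ranges.
                long skipped = in.skip(start - position);
                if (skipped <= 0) {
                    if (in.read() < 0) {
                        return -1;
                    }
                    skipped = 1;
                }
                position += skipped;
            }
            int n = in.read(b, off, (int) Math.min(len, end - position));
            if (n < 0) {
                return -1;
            }
            position += n;
            return n;
        }
        return -1; // all ranges consumed
    }

    @Override
    public long skip(long n) throws IOException {
        // Skip by reading, so that range bookkeeping stays consistent.
        byte[] scratch = new byte[(int) Math.min(8192, Math.max(n, 0))];
        long done = 0;
        while (done < n) {
            int r = read(scratch, 0, (int) Math.min(scratch.length, n - done));
            if (r < 0) {
                break;
            }
            done += r;
        }
        return done;
    }

    @Override
    public int available() {
        return 0; // the underlying count would include bytes outside the ranges
    }

    @Override
    public boolean markSupported() {
        return false;
    }
}
```

The caller would then use try-with-resources around the returned stream, and a stream-based verification routine could consume it chunk by chunk.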
> Allow to space efficiently analyse large PDFs
> ---------------------------------------------
>
> Key: PDFBOX-4297
> URL: https://issues.apache.org/jira/browse/PDFBOX-4297
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing
> Reporter: Ralf Hauser
> Priority: Major
>
> Assume you get a 300+MB large pdf and need to know
> 1) the file names of embedded files if any
> 2) whether it is encrypted (symmetric or asymmetric)
> 3) certification level (and whether it is signed)
> This should not use more than 5 MB (extra) memory
>
> P.S.: seems to be an example of https://pdfbox.apache.org/ideas.html "Handle
> large PDF files"
>