[ 
https://issues.apache.org/jira/browse/TIKA-4756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18087997#comment-18087997
 ] 

Tilman Hausherr edited comment on TIKA-4756 at 6/10/26 2:04 PM:
----------------------------------------------------------------

Why not use PDFBox directly if it is only about this? Here's some code by 
ChatGPT:
{code:java}
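    // Requires PDFBox 3.x, where Loader.loadPDF replaces PDDocument.load.
    // Returns true if the PDF contains at least one signature dictionary.
    public static boolean hasSignature(File pdfFile) throws IOException {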
        try (PDDocument document = Loader.loadPDF(pdfFile)) {
            List<PDSignature> signatures = document.getSignatureDictionaries();
            return !signatures.isEmpty();
        }
    }
{code}



was (Author: tilman):
Why not use PDFBox directly if it is only about this? Here's some code by 
ChatGPT and me:
{code:java}
    public static boolean hasSignature(File pdfFile) throws IOException {
        try (PDDocument document = Loader.loadPDF(pdfFile)) {
            List<PDSignature> signatures = document.getSignatureDictionaries();
            return signatures != null && !signatures.isEmpty();
        }
    }
{code}


> Detecting Signatures in PDFs with AcroForm
> ------------------------------------------
>
>                 Key: TIKA-4756
>                 URL: https://issues.apache.org/jira/browse/TIKA-4756
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: Willy T. Koch
>            Priority: Minor
>              Labels: Signature
>         Attachments: sigflags_sample.pdf
>
>
> We see that PDFs that have an AcroForm containing signature (/Sig) fields 
> aren't detected by the /meta analysis. It detects the AcroForm with 
> "pdf:hasAcroFormFields": "true", but reports nothing about the /Sig part. 
> These PDFs are created directly in Adobe Acrobat, which is also possible in 
> the free version.
> It would be very useful to also return "hasSignature": "true" (or some 
> other signature property) for these kinds of files, so we can handle it on 
> our end. We use this to exclude PDFs with digital signatures from being 
> reconverted to PDF/A.
>  
> When I run the file through OCRmyPDF, it flags it as digitally signed and 
> exits, which is how I first noticed the issue.
> _ocrmypdf sigflags_sample.pdf sigflags_sample_ocrmypdf.pdf_
> _DigitalSignatureError: Input PDF has a digital signature. OCR would alter 
> the document,_
> _invalidating the signature._
>  
> I've attached a small sample PDF with AcroForm and Signature to reproduce the 
> issue.
>  
> Willy T. Koch
> Technical Product manager,
> Public 360°
> Norway
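
For the workflow described in the report (skipping signed PDFs before PDF/A reconversion), a consumer-side check could look roughly like the sketch below. It assumes the metadata has already been parsed into a Map<String, String>. The "hasSignature" key is only the name proposed in the report, not a property Tika is known to emit, and SignedPdfFilter and shouldSkipPdfAConversion are hypothetical names used for illustration.

```java
import java.util.Map;

final class SignedPdfFilter {
    // ASSUMPTION: "hasSignature" is the property name proposed in the report;
    // it is not an existing Tika metadata key.
    static final String HAS_SIGNATURE_KEY = "hasSignature";

    /** Returns true when the metadata marks the PDF as signed, so reconversion should be skipped. */
    public static boolean shouldSkipPdfAConversion(Map<String, String> metadata) {
        // Boolean.parseBoolean(null) is false, so a missing key means "not signed".
        return Boolean.parseBoolean(metadata.get(HAS_SIGNATURE_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> signed =
                Map.of("pdf:hasAcroFormFields", "true", HAS_SIGNATURE_KEY, "true");
        Map<String, String> formOnly = Map.of("pdf:hasAcroFormFields", "true");

        System.out.println(shouldSkipPdfAConversion(signed));
        System.out.println(shouldSkipPdfAConversion(formOnly));
    }
}
```

A missing key is treated as unsigned here, so files whose metadata lacks the property would still be reconverted.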



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
