If this is the wrong forum to report this, let me know. I'm trying to create a couple of rules to identify questionable PDFs (phishing, etc.). While evaluating the debug output from spamassassin for the pdfinfo plugin, I noticed that some of the test file attributes aren't being populated correctly when compared against exiftool, Adobe Reader, Firefox, etc. Specifically, the producer and creator fields appear to be left as unknown.
Compared against other emails and PDFs, I get similar results, so I suspect the issue is in the plugin or in how it parses the PDF. I do have this example available; however, it is malicious (it links to a phishing site), so I wouldn't want to link to it directly in this thread. For example:

$ less Invoice0098539.pdf
%PDF-1.4
1 0 obj
<<
/Title (<FE><FF>)
/Creator (<FE><FF>^@w^@k^@h^@t^@m^@l^@t^@o^@p^@d^@f^@ ^@0^@.^@1^@2^@.^@5)
/Producer (<FE><FF>^@Q^@t^@ ^@4^@.^@8^@.^@7)
/CreationDate (D:20220302192255Z)
>>
...

$ exiftool Invoice0098539.pdf
ExifTool Version Number         : 12.30
File Name                       : Invoice0098539.pdf
Directory                       : .
File Size                       : 21 KiB
File Modification Date/Time     : 2022:03:02 16:34:04-05:00
File Access Date/Time           : 2022:03:02 16:37:43-05:00
File Inode Change Date/Time     : 2022:03:02 16:34:04-05:00
File Permissions                : -rw-r--r--
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.4
Linearized                      : No
Title                           :
Creator                         : wkhtmltopdf 0.12.5
Producer                        : Qt 4.8.7
Create Date                     : 2022:03:02 19:22:55Z
Page Count                      : 1

$ sa-debug
...
config: fixed relative path: /var/lib/spamassassin/3.004004/updates_spamassassin_org/20_pdfinfo.cf
config: using "/var/lib/spamassassin/3.004004/updates_spamassassin_org/20_pdfinfo.cf" for included file
config: read file /var/lib/spamassassin/3.004004/updates_spamassassin_org/20_pdfinfo.cf
pdfinfo: Identified 1 possible mime parts that need checked for PDF content
pdfinfo: found part, type=1 file=Invoice0098539.pdf cte=base64
pdfinfo: set_tag called for PDFVERSION 1.4
pdfinfo: set_tag called for PDFNAME Invoice0098539.pdf
...
pdfinfo: Filename=Invoice0098539.pdf Total HxW: 560 x 824 (55232 area)
pdfinfo: Filename=Invoice0098539.pdf Title=untitled Author=unknown Producer=unknown Created=20220302192255 Modified=0
pdfinfo: MD5 results for Invoice0098539.pdf - md5=3F6F5C7CB71BDB101BADEF3CFFA9FE63 fuzzy1=32531F1D9420EE5721866DF28A3C6A17 fuzzy2=549DC099D6DFEF65AEA67FA0DF151C14
pdfinfo: set_tag called for PDFPRODUCER unknown
pdfinfo: set_tag called for PDFTITLE untitled
pdfinfo: set_tag called for PDFCREATOR unknown
pdfinfo: set_tag called for PDFAUTHOR unknown
pdfinfo: set_tag called for PDFMD5 32531F1D9420EE5721866DF28A3C6A17
pdfinfo: set_tag called for PDFMD5FUZZY1 32531F1D9420EE5721866DF28A3C6A17
pdfinfo: set_tag called for PDFMD5FUZZY2 549DC099D6DFEF65AEA67FA0DF151C14
pdfinfo: set_tag called for PDFCOUNT 1
pdfinfo: set_tag called for PDFIMGCOUNT 8
pdfinfo: image ratio=0.00103201042873696, min=0.000 max=0.005
pdfinfo: is_empty_body = 23 bytes
pdfinfo: pdf_name_regex hit on Invoice0098539.pdf
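
One possible explanation, which is an assumption and not confirmed against the plugin's code: the /Creator and /Producer values in this file begin with the bytes FE FF, which is how a PDF text string marks itself as UTF-16BE, and a parser that expects plain ASCII between the parentheses would not recognize them. The Python sketch below is not the plugin's method; it only illustrates how such strings could be decoded to get the values exiftool reports. The helper names (`decode_pdf_text_string`, `extract_info_field`) and the sample bytes are hypothetical, the regex handles only simple literal strings without escaped or nested parentheses, and Latin-1 is used as a rough stand-in for PDFDocEncoding.

```python
import re

UTF16_BE_BOM = b"\xfe\xff"


def decode_pdf_text_string(raw: bytes) -> str:
    """Decode the raw bytes of a PDF text string (hypothetical helper).

    Strings starting with the FE FF byte order mark are UTF-16BE.
    Anything else is treated as Latin-1, which only approximates
    PDFDocEncoding (an assumption made to keep the sketch short).
    """
    if raw.startswith(UTF16_BE_BOM):
        return raw[len(UTF16_BE_BOM):].decode("utf-16-be", errors="replace")
    return raw.decode("latin-1")


def extract_info_field(pdf_bytes: bytes, key: str):
    """Return the decoded value of /key (...) or None if it is absent.

    Simplification: only matches literal strings that contain no
    escaped or nested parentheses.
    """
    pattern = rb"/" + re.escape(key.encode("ascii")) + rb"\s*\(([^)]*)\)"
    match = re.search(pattern, pdf_bytes)
    if match is None:
        return None
    return decode_pdf_text_string(match.group(1))


# Sample bytes rebuilt from the `less` output above (not the real file).
sample = (
    b"%PDF-1.4\n1 0 obj\n<<\n"
    b"/Title (\xfe\xff)\n"
    b"/Creator (\xfe\xff" + "wkhtmltopdf 0.12.5".encode("utf-16-be") + b")\n"
    b"/Producer (\xfe\xff" + "Qt 4.8.7".encode("utf-16-be") + b")\n"
    b"/CreationDate (D:20220302192255Z)\n>>\n"
)

for field in ("Title", "Author", "Creator", "Producer", "CreationDate"):
    print(field, "=", repr(extract_info_field(sample, field)))
```

With this decoding, Creator and Producer come out as the same strings exiftool shows, Title is empty, and Author is absent, which would be consistent with the plugin falling back to "unknown" only when it cannot read a value.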