Hi,

There is this piece of code in PdfObjectStreamParserObject.cpp:

    if( m_pEncrypt && (m_pEncrypt->GetEncryptAlgorithm() == PdfEncrypt::ePdfEncryptAlgorithm_AESV2
    #ifndef PODOFO_HAVE_OPENSSL_NO_RC4
        || m_pEncrypt->GetEncryptAlgorithm() == PdfEncrypt::ePdfEncryptAlgorithm_RC4V2
    #endif // PODOFO_HAVE_OPENSSL_NO_RC4
        ) )
        variantTokenizer.GetNextVariant( var, 0 ); // Stream is already decrypted
    else
        variantTokenizer.GetNextVariant( var, m_pEncrypt );

But the PDF specification (https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf) clearly states in section 7.6.1 that strings inside object streams are not encrypted individually; instead, the object stream as a whole is encrypted, regardless of the encryption algorithm used. So why is this handled correctly only for ePdfEncryptAlgorithm_RC4V2 and ePdfEncryptAlgorithm_AESV2?

This PDF, https://web.archive.org/web/20200109090442/https://www.ok.gov/tax/documents/bm26.pdf, uses ePdfEncryptAlgorithm_RC4V1, and the following code prints rubbish:

    PdfMemDocument pdf("bm26.pdf");
    printf("%s\n", pdf.GetPage(0)->GetAnnotation(3)->GetTitle().GetString());

Possible output: "?v??=??>?>"

Attached is a patch that completely removes m_pEncrypt from PdfObjectStreamParserObject, as there is nothing left for it to decrypt. The only encrypted data is the source stream from which the objects are loaded, and that is decrypted elsewhere, during "m_pParser->GetStream()" (this m_pParser has its own m_pEncrypt, of course).

Correct output after the patch: "Check Type"
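To illustrate why decrypting a second time produces rubbish, here is a minimal sketch that does not use PoDoFo at all. ToyCipher is a hypothetical XOR stand-in for a symmetric stream cipher (it is not real RC4), and ReadStringFromObjectStream is a made-up helper that only mimics the choice between passing 0 and passing m_pEncrypt to the tokenizer. It assumes the object stream has already been decrypted once as a whole, which is what section 7.6.1 implies.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a symmetric stream cipher (NOT real RC4):
// XOR every byte with a repeating key. Applying it twice with the same
// key restores the input, so encryption and decryption are the same call.
std::string ToyCipher(const std::string& data, const std::string& key)
{
    std::string out = data;
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<char>(out[i] ^ key[i % key.size()]);
    return out;
}

// Hypothetical parser step: return a string read from an object stream
// that has ALREADY been decrypted as a whole.
//   decryptAgain == false  mimics passing 0 as the encryption object
//   decryptAgain == true   mimics passing m_pEncrypt, i.e. the string is
//                          run through the cipher a second time
std::string ReadStringFromObjectStream(const std::string& decryptedStream,
                                       const std::string& key,
                                       bool decryptAgain)
{
    return decryptAgain ? ToyCipher(decryptedStream, key) : decryptedStream;
}
```

With a title such as "Check Type" encrypted once on disk and decrypted once when the stream is loaded, calling the helper with decryptAgain set to false returns the title unchanged, while setting it to true scrambles the plaintext again. That is the same kind of failure as the rubbish output above, only with a toy cipher.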
decrypt_object_streams_once.patch
_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users