[Podofo-users] Strings in object streams are decrypted twice for some encryption algorithms

Michal Sudolsky Thu, 12 Mar 2020 11:50:13 -0700

Hi,

There is this part of code in PdfObjectStreamParserObject.cpp:


    if( m_pEncrypt && (m_pEncrypt->GetEncryptAlgorithm() == PdfEncrypt::ePdfEncryptAlgorithm_AESV2
#ifndef PODOFO_HAVE_OPENSSL_NO_RC4
        || m_pEncrypt->GetEncryptAlgorithm() == PdfEncrypt::ePdfEncryptAlgorithm_RC4V2
#endif // PODOFO_HAVE_OPENSSL_NO_RC4
        ) )
        variantTokenizer.GetNextVariant( var, 0 ); // Stream is already decrypted
    else
        variantTokenizer.GetNextVariant( var, m_pEncrypt );

But the PDF specification
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
clearly states in section 7.6.1 that strings inside object streams are not
encrypted individually; instead the whole stream is encrypted (regardless of
the encryption algorithm used). So why is this handled correctly only for
ePdfEncryptAlgorithm_RC4V2 and ePdfEncryptAlgorithm_AESV2?

This PDF
https://web.archive.org/web/20200109090442/https://www.ok.gov/tax/documents/bm26.pdf
uses ePdfEncryptAlgorithm_RC4V1, and the following code prints rubbish:

PdfMemDocument pdf("bm26.pdf");
printf("%s\n", pdf.GetPage(0)->GetAnnotation(3)->GetTitle().GetString());

Possible output: "?v??=??>?>"
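
To illustrate why a second decryption pass produces rubbish like this, here
is a minimal standalone sketch. It is not PoDoFo code: rc4() is a plain
textbook RC4, and secondPassCorrupts() and the key "Key" are hypothetical
names and values chosen for the example. It also reuses one key for both
passes, whereas a real PDF derives a key per object, so the actual bytes will
differ from what PoDoFo prints. The point is only that running RC4 over text
that is already plain scrambles it again.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Textbook RC4. Encryption and decryption are the same operation:
// the data is XORed with a keystream derived from the key.
std::string rc4(const std::string& key, const std::string& data)
{
    assert(!key.empty());
    unsigned char s[256];
    for (int i = 0; i < 256; ++i)
        s[i] = static_cast<unsigned char>(i);

    // Key-scheduling algorithm (KSA).
    for (int i = 0, j = 0; i < 256; ++i) {
        j = (j + s[i] + static_cast<unsigned char>(key[i % key.size()])) & 0xFF;
        std::swap(s[i], s[j]);
    }

    // Pseudo-random generation (PRGA): XOR every byte with the keystream.
    std::string out(data);
    int i = 0, j = 0;
    for (char& c : out) {
        i = (i + 1) & 0xFF;
        j = (j + s[i]) & 0xFF;
        std::swap(s[i], s[j]);
        c = static_cast<char>(static_cast<unsigned char>(c) ^ s[(s[i] + s[j]) & 0xFF]);
    }
    return out;
}

// Hypothetical helper: returns true when a second "decryption" pass
// garbles a string that the first pass had already recovered.
bool secondPassCorrupts(const std::string& key, const std::string& plain)
{
    const std::string encryptedStream = rc4(key, plain);            // as stored in the file
    const std::string decryptedOnce   = rc4(key, encryptedStream);  // whole stream decrypted: correct
    const std::string decryptedTwice  = rc4(key, decryptedOnce);    // string decrypted again: the bug
    assert(decryptedOnce == plain);
    return decryptedTwice != plain;
}
```

Calling secondPassCorrupts("Key", "Check Type") returns true: the first pass
recovers "Check Type", and the second pass turns it back into unreadable
bytes.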

Attached is a patch which completely removes m_pEncrypt from
PdfObjectStreamParserObject, as there is nothing there that should be
decrypted. The only exception is the source stream from which the objects are
loaded, and that is decrypted elsewhere, during "m_pParser->GetStream()"
(this m_pParser has its own m_pEncrypt, of course).

Correct output after patch: "Check Type"

decrypt_object_streams_once.patch
Description: Binary data

