Hello dear podofo users,

I had some time to come up with a patch for this issue. I don't think it's the best solution available (a proper redesign might be better), but the patch should at least have low impact and be easy to follow.
The current situation:

PdfTokenizer.cpp:615

    // 'Contents' key of a /Type/Sig dictionary is an unencrypted Hex string
    bool bIsSigContents = key == PdfName( "Contents" )
        && dict.HasKey( "Type" )
        && dict.GetKey( "Type" )->GetDataType() == ePdfDataType_Name
        && dict.GetKey( "Type" )->GetName() == PdfName( "Sig" );

This check is invalid when the 'Type' key is located after the 'Contents' key in the pdf file: 'Type' is not in the dictionary yet when 'Contents' is read, so the boolean is wrongly set to 'false'.

PdfTokenizer.cpp:621

    // Get the next variant. If there isn't one, it'll throw UnexpectedEOF.
    this->GetNextVariant( val, bIsSigContents ? NULL : pEncrypt );

Due to the invalid check, the bool is 'false' and the unencrypted signature content is going to be decrypted, resulting in a decryption error ('Length not a multiple of 16').

The patch idea:

We cannot check the dictionary type during parsing, so the check has to be moved after all dictionary keys have been processed. Since we don't know beforehand whether the 'Contents' content has to be decrypted or not, and we have to read the pdf sequentially (that's the current design), we have to read the 'Contents' content first and process it later. This requires some function interface changes, so we are able to get the raw, unprocessed content.

The patch itself:

Some of the PdfTokenizer functions were extended with an optional boolean parameter ('getRawHex') and/or were given a return value (EPdfDataType instead of void). The optional parameter only has an effect if the content currently being read is a hex string; when set to true, it prevents the conversion of the content via SetHexData. The return value is required for checking whether the content that was just read was actually a hex string or something else.

    // 'Contents' key of a signature dictionary is an unencrypted hex string.
    // Since '/Type/Sig' may be located after 'Contents' key, we can't check for it here and have to postpone the check.
    bool bIsContents = ( key == PdfName( "Contents" ) );

    // Get the next variant. If there isn't one, it'll throw UnexpectedEOF.
    // If we have the 'Contents' key and it is a hex string, we will get the raw, unprocessed string.
    // We will process the raw string at the end, after all keys have been read, so the check for '/Type/Sig' is valid.
    EPdfDataType dataType = this->GetNextVariant( val, pEncrypt, bIsContents );
    processRawHexContents |= ( bIsContents && dataType == ePdfDataType_HexString );

=> When the current key is 'Contents', the optional parameter is set to indicate that we want the raw hex content. If the function then returns the hex string data type, we know 'val' holds a raw hex string. The string is stored in the dictionary and a flag ('processRawHexContents') is set, indicating that we have to process the raw hex string later.

    // Process raw hex contents, decrypting it if we don't have a signature dictionary.
    if ( processRawHexContents )
    {
        // Check if signature dictionary
        bool isSigDict = dict.HasKey( "Type" )
            && dict.GetKey( "Type" )->GetDataType() == ePdfDataType_Name
            && dict.GetKey( "Type" )->GetName() == PdfName( "Sig" );

        // Read raw hex contents from dictionary.
        PdfString rawContents = dict.GetKey( "Contents" )->GetString(), processedContents;

        // Process raw hex contents (decrypting if NOT sig dict)
        processedContents.SetHexData( rawContents.GetString(), rawContents.GetLength(), isSigDict ? NULL : pEncrypt );

        // Overwrite raw hex content in dictionary with processed one.
        dict.AddKey( "Contents", processedContents );
    }

=> When all keys have been processed, the flag is evaluated; if it is set, the raw hex string is extracted from the dictionary, processed according to the dictionary type and stored in the dictionary again.

I tested the patch with the following file:
http://www.all-ip-store.de/_uploads/user/Telekom%20All-IP/IP-Voice-Data_Auftragsformular.pdf

Without the patch, loading fails; with the patch, the file is loaded without error. I will check tomorrow whether the signature can be verified; I haven't done any other tests yet.
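The deferred check can also be illustrated in isolation. The following is a minimal standalone sketch, not PoDoFo code: `Value`, `Dict`, `ToyDecrypt` and `ParseDict` are hypothetical stand-ins invented for this example, and the "decryption" only tags the string so its effect is visible. It assumes the entries arrive in file order, as they would from a sequential tokenizer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not PoDoFo types.
enum class DataType { Name, HexString };

struct Value {
    DataType type;
    std::string data;  // for HexString: the raw hex digits until processed
};

using Dict = std::map<std::string, Value>;

// Toy "decryption": tags the payload so we can see whether it was applied.
std::string ToyDecrypt(const std::string& raw) { return "decrypted(" + raw + ")"; }

// Reads the entries in file order and defers the decision for 'Contents'
// until every key of the dictionary is known.
Dict ParseDict(const std::vector<std::pair<std::string, Value>>& entries, bool encrypted)
{
    Dict dict;
    bool processRawHexContents = false;

    for (const auto& entry : entries) {
        const bool isContents = (entry.first == "Contents");
        Value val = entry.second;

        // Ordinary hex strings are handled right away; 'Contents' stays raw.
        if (val.type == DataType::HexString && !isContents && encrypted) {
            val.data = ToyDecrypt(val.data);
        }
        processRawHexContents |= (isContents && val.type == DataType::HexString);
        dict[entry.first] = val;
    }

    if (processRawHexContents) {
        // All keys are present now, so the /Type/Sig check no longer
        // depends on the order of the keys in the file.
        const auto type = dict.find("Type");
        const bool isSigDict = type != dict.end()
            && type->second.type == DataType::Name
            && type->second.data == "Sig";

        if (!isSigDict && encrypted) {
            Value& contents = dict["Contents"];
            contents.data = ToyDecrypt(contents.data);
        }
    }
    return dict;
}
```

Feeding this a dictionary whose 'Contents' entry comes before 'Type' = 'Sig' leaves the contents untouched, while the same entries with a different 'Type' get the toy decryption applied. That is the key-order independence the patch is aiming for.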
Best regards,
F.E.

2018-02-27 11:01 GMT+01:00 F. E. <exler7...@gmail.com>:

> Hi,
>
>> do you really expect everyone on the list to answer to your message
>> that he/she doesn't have currently time and/or resources to look on
>> your issue? I hope not.
>
> No, I don't expect everyone to answer. But I think it's not too much to
> ask for that *at least one* acknowledges it's a problem,
> or tells me it's not important enough. Just *any reaction from anyone*!
> Especially when there was quite some activity on the mailing list.
>
>> Yes, I agree the problem is there.
>
> Great!
>
>> No, I do not know how to deal with
>> it at the moment. No, I cannot promise whether I'll look on it myself.
>
> No problem.
>
>> Yes, I would like to see this fixed in the next release of PoDoFo too.
>
> That would be great.
>
>> Yes, a simple test .pdf file would help with the reproducer.
>
> I can provide that, luckily one of the pdf files is openly available:
> http://www.all-ip-store.de/_uploads/user/Telekom%20All-IP/IP-Voice-Data_Auftragsformular.pdf
>
>> P.S.: if anyone replied to you during the weekend, then the SourceForge
>> site had some outage of the list(s), thus it was not received. I resent
>> my weekend messages only today.
>
> I had some troubles, too. My initial mail didn't make it to the mailing
> list, it seemed, so I had to resend it again.
>
> Best regards,
> F.E.
Attachment: podofo_sig_check_fix.patch