> I think "\n\r" combination is unnecessary. On principal I agree. But without the patch, Podofo loads PDFs with '\n\r' just fine and I don't want to change that without need. Furthermore, the standard states 'if the marker is 2 characters (both a carriage return and a line feed)'. Imo there's no restriction to the order of CR+LF.
> This would be better: (e1 == '\r' && e2 == '\n') || (e1 == ' ' && (e2 == '\r' || e2 == '\n')) I added two patches with your simplification, one with check for '\n\r'and one without it. Greetings, F.E. Am So., 10. März 2019 um 10:02 Uhr schrieb Michal Sudolsky < sudols...@gmail.com>: > I think "\n\r" combination is unnecessary. > > Also CheckEOL is too complicated there is unnecessary allocation and > four string comparisons while it just had to compare two char values. This > would be better: > > (e1 == '\r' && e2 == '\n') || (e1 == ' ' && (e2 == '\r' || e2 == '\n')) > > On Fri, Mar 8, 2019 at 8:16 PM F. E. <exler7...@gmail.com> wrote: > >> Hello again, >> >> I created another patch for checking the xref size. With this iteration, >> '\r\n', '\n\r', ' \r' and ' \n' are considered valid as EOL, anythign else >> will fail the parsing of the current subsection. >> I checked the patch with some of the researchers pdf files, it works >> fine, but the error description from podofo could be a bit more clear, >> though. >> >> I also checked the patch against the unit tests. The tests from the SVN, >> although with fixes for MSCV compaired to the 0.9.6 tests, somehow produced >> a "dereference out of range deque iterator" error. So I used the tests form >> v0.9.6, had to fix some stuff, but tests were running through. I had one >> failing test ('testDifferencesEncoding', last assert fails, somehow the >> differences of the encoding are not applied), but thats totally unrelated >> to my chances (I think). >> >> It would be nice if someone could check the patch again with a proper >> test environment, mine feels very thrown together and quick-n-dirty (had to >> dl and build cppunit as well and I'm not sure if I set everything up >> correctly .. at least it built). >> >> Greetings >> F.E. >> >> Am Do., 7. März 2019 um 13:13 Uhr schrieb F. E. <exler7...@gmail.com>: >> >>> I disagree. To much leeway only creates more attack surface for >>>>> malicious pdfs, since parsing is alwas a delicate procedure. >>>>> >>>> I do not think that allowing xref entries shorter than 20 chars is >>>> vulnerability. On the other side it could extend supported PDFs (if such >>>> PDFs really exists). >>>> >>> My remark is more of a general advise. Parsers in general are often the >>> weakest link and primary attack vector for attackers. Easy to exploit as >>> well, just send a malicious input and a vulnerable parser will ultimately >>> grant you RCE or what else. So the premise for writing a parser should >>> always be, keep it as simple as possible and stick to the requirements. >>> So i would say, if the pdf standard defines that xref entries have to be >>> exactly 20 chacracters long, we should strictly stick to that. And if there >>> are some PDFs with different-sized xref entries, they are to be rejected. >>> Nevertheless, even if we would like to allow xref entries shorter than >>> 20 chars, Podofos current parsing code still handles this wrong, so it has >>> to be fixed regardless. And I think the easier solution right now is to >>> just check for the correct size and fail consistantly otherwise. Changing >>> the code to allow shorter xref entries requires a bigger change. >>> >>> Or there is hole in standard: https://pdfsig-collision.florz.de >>>> In contracts with first two attacks seems SWA does not violate >>>> standard, at least not with ByteRange key. From what I see in pdf reference >>>> there is nothing what states that byte range must not include only >>>> contents. And there is also not stated that there must be always exactly >>>> two ranges. But seems for example acrobat reader does not accept more than >>>> 2 ranges and checks whether byte range covers what it should. >>>> >>> I've got to read this link, looks interesting. Yeah, the SWA attack is >>> quite a nifty trick and looks totally fine from the signature validation >>> perspective. The ByteRange to the signature clearly states which parts to >>> use for the check, which all adds up just fine. It may be an oversight of >>> the standard, yeah, and what the acrobat reader is imposing sounds very >>> reasonable, I think I will add something like that to my signature check as >>> well. >>> >>> rom what I read about SWA I really do not think that too short xref >>>> entry had role in that attack. I rather think that this is simply mistake >>>> or bug in software which manipulated these pdf files. Only one entry is >>>> shorter and there is sequence CR+LF+LF at the end of xref right before >>>> "trailer". I am sending fixed version. Can you try it? >>>> >>> Yes, this particular file can be repaired, because it has only one >>> shorter xref entry. I fixed it myself by "moving" a CR from the last line >>> up. >>> But the exploits from the researchers contain 5 genuinely different SWA >>> pdf files (there are 27 swa pdfs in total, but many duplicates in different >>> folders). Only the first I linked can be easily fixed, with the other four >>> pdfs ALL xref entries of the forged xraf table are to short! See: >>> >>> https://www.pdf-insecurity.org/download/exploits/3_eXpert_PDF_12_Ultimate/siwa.pdf >>> https://www.pdf-insecurity.org/download/exploits/5_Foxit_Reader/siwa.pdf >>> >>> https://www.pdf-insecurity.org/download/exploits/10_Nuance_Power_PDF_Standard/siwa-2.pdf >>> >>> https://www.pdf-insecurity.org/download/exploits/18_Perfect_PDF_10_Premium/siwa.pdf >>> >>> https://www.pdf-insecurity.org/download/exploits/20_Soda_PDF_Desktop/siwa.pdf >>> <- Thats the one that can be repaired! >>> >>> Podofo is unable to load the first three, all because the xref entry >>> parsing unalignes bc of the to-short entries: >>> ePdfError_NoNumber: Unable to load objects from file.: Error while >>> loading object 0 0 Offset = 380 Index = 2: Object and generation number >>> cannot be read.: >>> ePdfError_NoNumber: Unable to load objects from file.: Error while >>> loading object 0 0 Offset = 71 Index = 2: Object and generation number >>> cannot be read.: >>> ePdfError_NoNumber: Unable to load objects from file.: Error while >>> loading object 0 0 Offset = 189 Index = 2: Object and generation number >>> cannot be read.: >>> The forth can be loaded, but the existing signature object does not get >>> parsed and misses when trying to resolve the reference to it. >>> >>> Here i wanted to challenge you to fix the files, but actually, thats not >>> really difficult either, it turned out. You can just add whitespaces before >>> the LF and adjust the forged /ByteRange to point to the real /ByteRange >>> again. And optionally fix some offsets in the xref entries itself (Foxit >>> Reader one was totally broken, the offsets of the forged table totally out >>> of place). With this, Podofo can load the files and my code finds the >>> signature and checks it to valid. So I've got to agree with you, the xref >>> entry size is not needed for the attack. If the researchers had paid >>> attention to it, the pdf files would have been loadable from the start. >>> >>> > These attack pdf files are supposed to load correctly, signature >>> verification part must handle and resist them. >>> Yes they should be loadable, but mostly are not due to the incorrect >>> xref size, at least for podofo. I think I should mail the researchers my >>> fixed versions. I already contacted them regarding this issue, but they >>> were unimpressed :-/. >>> >>> I'm currently also working on a new patch for the size check, but that >>> will come in a later mail, gotta test that first. >>> >>> Greetings, >>> F.E. >>> >>> Am Do., 7. März 2019 um 10:14 Uhr schrieb Michal Sudolsky < >>> sudols...@gmail.com>: >>> >>>> >>>>> On the contrary I think podofo is too strict. Maybe would be better >>>>>> that it actually can open these files with xref entries shorter or longer >>>>>> than 20 chars as readers from that page do. I think some readers are >>>>>> implemented to read xref table by lines (by using for example c++ >>>>>> function >>>>>> getline). >>>>>> >>>>> I disagree. To much leeway only creates more attack surface for >>>>> malicious pdfs, since parsing is alwas a delicate procedure. >>>>> >>>> >>>> I do not think that allowing xref entries shorter than 20 chars is >>>> vulnerability. On the other side it could extend supported PDFs (if such >>>> PDFs really exists). >>>> >>>> >>>>> A lax handling of the standard is the reason why the researchers had >>>>> so much success with tinkering pdf files for circumventing the signature >>>>> check. >>>>> >>>> >>>> Or there is hole in standard: https://pdfsig-collision.florz.de >>>> In contracts with first two attacks seems SWA does not violate >>>> standard, at least not with ByteRange key. From what I see in pdf reference >>>> there is nothing what states that byte range must not include only >>>> contents. And there is also not stated that there must be always exactly >>>> two ranges. But seems for example acrobat reader does not accept more than >>>> 2 ranges and checks whether byte range covers what it should. >>>> >>>> To be more specific, all the shown SIWA attacks use to-short xref >>>>> entries. And it looks that this is actually a requirement for this >>>>> attacks, >>>>> otherwise the restrictions regarding offsets (secured by the signature) >>>>> are >>>>> much harder to meet. >>>>> >>>> >>>>> >>>>>> I checked this file and all 3 xref tables has CR+LF >>>>> >>>>> Nope. Check it again, there is one entry with only LF: >>>>> In the second xref table, right before the '8 5' subsection, the entry >>>>> '0000005131 00000 n' is missing it. >>>>> >>>>> This one tiny error totally breaks the loading of the whole file, bc >>>>> Podofo does not read the next subsection properly. >>>>> Instead of '8 5', he reads '5 958' as next sub section! >>>>> >>>>> >>>> From what I read about SWA I really do not think that too short xref >>>> entry had role in that attack. I rather think that this is simply mistake >>>> or bug in software which manipulated these pdf files. Only one entry is >>>> shorter and there is sequence CR+LF+LF at the end of xref right before >>>> "trailer". I am sending fixed version. Can you try it? >>>> >>>> Trust me, I checked every pdf file provided by the researchers, more >>>>> than half of it had this error. Mostly the Load failed, but two of them >>>>> could be loaded (e.g. the one I linked earlier). For this two pdf files, I >>>>> could not find any siganture fields at all, bc. Podofo did not parse >>>>> properly. The reference to the signature object was there, but it pointed >>>>> to a not existing object. And thats the REAL issue here, when Podofo does >>>>> NOT fail at loading, but the parsed document is insufficient. >>>>> >>>>> >>>> These attack pdf files are supposed to load correctly, signature >>>> verification part must handle and resist them. >>>> >>>> Greetings >>>>> F.E. >>>>> _______________________________________________ >>>>> Podofo-users mailing list >>>>> Podofo-users@lists.sourceforge.net >>>>> https://lists.sourceforge.net/lists/listinfo/podofo-users >>>>> >>>> _______________________________________________ >> Podofo-users mailing list >> Podofo-users@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/podofo-users >> >
pdfparser_xref_entry_size_check2.patch
Description: Binary data
pdfparser_xref_entry_size_check2_nolfcr.patch
Description: Binary data
_______________________________________________ Podofo-users mailing list Podofo-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/podofo-users