> I think "\n\r" combination is unnecessary.
On principal I agree. But without the patch, Podofo loads PDFs with '\n\r'
just fine and I don't want to change that without need.
Furthermore, the standard states 'if the marker is 2 characters (both a
carriage return and a line feed)'. Imo there's no restriction to the order
of CR+LF.

> This would be better: (e1 == '\r' && e2 == '\n') || (e1 == ' ' && (e2 ==
'\r' || e2 == '\n'))
I added two patches with your simplification, one with check for '\n\r'and
one without it.

Greetings,
F.E.

Am So., 10. März 2019 um 10:02 Uhr schrieb Michal Sudolsky <
sudols...@gmail.com>:

> I think "\n\r" combination is unnecessary.
>
> Also CheckEOL is too complicated there is unnecessary allocation and
> four string comparisons while it just had to compare two char values. This
> would be better:
>
> (e1 == '\r' && e2 == '\n') || (e1 == ' ' && (e2 == '\r' || e2 == '\n'))
>
> On Fri, Mar 8, 2019 at 8:16 PM F. E. <exler7...@gmail.com> wrote:
>
>> Hello again,
>>
>> I created another patch for checking the xref size. With this iteration,
>> '\r\n', '\n\r', ' \r' and ' \n' are considered valid as EOL, anythign else
>> will fail the parsing of the current subsection.
>> I checked the patch with some of the researchers pdf files, it works
>> fine, but the error description from podofo could be a bit more clear,
>> though.
>>
>> I also checked the patch against the unit tests. The tests from the SVN,
>> although with fixes for MSCV compaired to the 0.9.6 tests, somehow produced
>> a "dereference out of range deque iterator" error. So I used the tests form
>> v0.9.6, had to fix some stuff, but tests were running through. I had one
>> failing test ('testDifferencesEncoding', last assert fails, somehow the
>> differences of the encoding are not applied), but thats totally unrelated
>> to my chances (I think).
>>
>> It would be nice if someone could check the patch again with a proper
>> test environment, mine feels very thrown together and quick-n-dirty (had to
>> dl and build cppunit as well and I'm not sure if I set everything up
>> correctly .. at least it built).
>>
>> Greetings
>> F.E.
>>
>> Am Do., 7. März 2019 um 13:13 Uhr schrieb F. E. <exler7...@gmail.com>:
>>
>>> I disagree. To much leeway only creates more attack surface for
>>>>> malicious pdfs, since parsing is alwas a delicate procedure.
>>>>>
>>>> I do not think that allowing xref entries shorter than 20 chars is
>>>> vulnerability. On the other side it could extend supported PDFs (if such
>>>> PDFs really exists).
>>>>
>>> My remark is more of a general advise. Parsers in general are often the
>>> weakest link and primary attack vector for attackers. Easy to exploit as
>>> well, just send a malicious input and a vulnerable parser will ultimately
>>> grant you RCE or what else. So the premise for writing a parser should
>>> always be, keep it as simple as possible and stick to the requirements.
>>> So i would say, if the pdf standard defines that xref entries have to be
>>> exactly 20 chacracters long, we should strictly stick to that. And if there
>>> are some PDFs with different-sized xref entries, they are to be rejected.
>>> Nevertheless, even if we would like to allow xref entries shorter than
>>> 20 chars, Podofos current parsing code still handles this wrong, so it has
>>> to be fixed regardless. And I think the easier solution right now is to
>>> just check for the correct size and fail consistantly otherwise. Changing
>>> the code to allow shorter xref entries requires a bigger change.
>>>
>>> Or there is hole in standard: https://pdfsig-collision.florz.de
>>>> In contracts with first two attacks seems SWA does not violate
>>>> standard, at least not with ByteRange key. From what I see in pdf reference
>>>> there is nothing what states that byte range must not include only
>>>> contents. And there is also not stated that there must be always exactly
>>>> two ranges. But seems for example acrobat reader does not accept more than
>>>> 2 ranges and checks whether byte range covers what it should.
>>>>
>>> I've got to read this link, looks interesting. Yeah, the SWA attack is
>>> quite a nifty trick and looks totally fine from the signature validation
>>> perspective. The ByteRange to the signature clearly states which parts to
>>> use for the check, which all adds up just fine. It may be an oversight of
>>> the standard, yeah, and what the acrobat reader is imposing sounds very
>>> reasonable, I think I will add something like that to my signature check as
>>> well.
>>>
>>> rom what I read about SWA I really do not think that too short xref
>>>> entry had role in that attack. I rather think that this is simply mistake
>>>> or bug in software which manipulated these pdf files. Only one entry is
>>>> shorter and there is sequence CR+LF+LF at the end of xref right before
>>>> "trailer". I am sending fixed version. Can you try it?
>>>>
>>> Yes, this particular file can be repaired, because it has only one
>>> shorter xref entry. I fixed it myself by "moving" a CR from the last line
>>> up.
>>> But the exploits from the researchers contain 5 genuinely different SWA
>>> pdf files (there are 27 swa pdfs in total, but many duplicates in different
>>> folders). Only the first I linked can be easily fixed, with the other four
>>> pdfs ALL xref entries of the forged xraf table are to short! See:
>>>
>>> https://www.pdf-insecurity.org/download/exploits/3_eXpert_PDF_12_Ultimate/siwa.pdf
>>> https://www.pdf-insecurity.org/download/exploits/5_Foxit_Reader/siwa.pdf
>>>
>>> https://www.pdf-insecurity.org/download/exploits/10_Nuance_Power_PDF_Standard/siwa-2.pdf
>>>
>>> https://www.pdf-insecurity.org/download/exploits/18_Perfect_PDF_10_Premium/siwa.pdf
>>>
>>> https://www.pdf-insecurity.org/download/exploits/20_Soda_PDF_Desktop/siwa.pdf
>>> <- Thats the one that can be repaired!
>>>
>>> Podofo is unable to load the first three, all because the xref entry
>>> parsing unalignes bc of the to-short entries:
>>> ePdfError_NoNumber: Unable to load objects from file.: Error while
>>> loading object 0 0 Offset = 380 Index = 2: Object and generation number
>>> cannot be read.:
>>> ePdfError_NoNumber: Unable to load objects from file.: Error while
>>> loading object 0 0 Offset = 71 Index = 2: Object and generation number
>>> cannot be read.:
>>> ePdfError_NoNumber: Unable to load objects from file.: Error while
>>> loading object 0 0 Offset = 189 Index = 2: Object and generation number
>>> cannot be read.:
>>> The forth can be loaded, but the existing signature object does not get
>>> parsed and misses when trying to resolve the reference to it.
>>>
>>> Here i wanted to challenge you to fix the files, but actually, thats not
>>> really difficult either, it turned out. You can just add whitespaces before
>>> the LF and adjust the forged /ByteRange to point to the real /ByteRange
>>> again. And optionally fix some offsets in the xref entries itself (Foxit
>>> Reader one was totally broken, the offsets of the forged table totally out
>>> of place). With this, Podofo can load the files and my code finds the
>>> signature and checks it to valid. So I've got to agree with you, the xref
>>> entry size is not needed for the attack. If the researchers had paid
>>> attention to it, the pdf files would have been loadable from the start.
>>>
>>> > These attack pdf files are supposed to load correctly, signature
>>> verification part must handle and resist them.
>>> Yes they should be loadable, but mostly are not due to the incorrect
>>> xref size, at least for podofo. I think I should mail the researchers my
>>> fixed versions. I already contacted them regarding this issue, but they
>>> were unimpressed :-/.
>>>
>>> I'm currently also working on a new patch for the size check, but that
>>> will come in a later mail, gotta test that first.
>>>
>>> Greetings,
>>> F.E.
>>>
>>> Am Do., 7. März 2019 um 10:14 Uhr schrieb Michal Sudolsky <
>>> sudols...@gmail.com>:
>>>
>>>>
>>>>> On the contrary I think podofo is too strict. Maybe would be better
>>>>>> that it actually can open these files with xref entries shorter or longer
>>>>>> than 20 chars as readers from that page do. I think some readers are
>>>>>> implemented to read xref table by lines (by using for example c++ 
>>>>>> function
>>>>>> getline).
>>>>>>
>>>>> I disagree. To much leeway only creates more attack surface for
>>>>> malicious pdfs, since parsing is alwas a delicate procedure.
>>>>>
>>>>
>>>> I do not think that allowing xref entries shorter than 20 chars is
>>>> vulnerability. On the other side it could extend supported PDFs (if such
>>>> PDFs really exists).
>>>>
>>>>
>>>>> A lax handling of the standard is the reason why the researchers had
>>>>> so much success with tinkering pdf files for circumventing the signature
>>>>> check.
>>>>>
>>>>
>>>> Or there is hole in standard: https://pdfsig-collision.florz.de
>>>> In contracts with first two attacks seems SWA does not violate
>>>> standard, at least not with ByteRange key. From what I see in pdf reference
>>>> there is nothing what states that byte range must not include only
>>>> contents. And there is also not stated that there must be always exactly
>>>> two ranges. But seems for example acrobat reader does not accept more than
>>>> 2 ranges and checks whether byte range covers what it should.
>>>>
>>>> To be more specific, all the shown SIWA attacks use to-short xref
>>>>> entries. And it looks that this is actually a requirement for this 
>>>>> attacks,
>>>>> otherwise the restrictions regarding offsets (secured by the signature) 
>>>>> are
>>>>> much harder to meet.
>>>>>
>>>>
>>>>>
>>>>>> I checked this file and all 3 xref tables has CR+LF
>>>>>
>>>>> Nope. Check it again, there is one entry with only LF:
>>>>> In the second xref table, right before the '8 5' subsection, the entry
>>>>> '0000005131 00000 n' is missing it.
>>>>>
>>>>> This one tiny error totally breaks the loading of the whole file, bc
>>>>> Podofo does not read the next subsection properly.
>>>>> Instead of '8 5', he reads '5 958' as next sub section!
>>>>>
>>>>>
>>>> From what I read about SWA I really do not think that too short xref
>>>> entry had role in that attack. I rather think that this is simply mistake
>>>> or bug in software which manipulated these pdf files. Only one entry is
>>>> shorter and there is sequence CR+LF+LF at the end of xref right before
>>>> "trailer". I am sending fixed version. Can you try it?
>>>>
>>>> Trust me, I checked every pdf file provided by the researchers, more
>>>>> than half of it had this error. Mostly the Load failed, but two of them
>>>>> could be loaded (e.g. the one I linked earlier). For this two pdf files, I
>>>>> could not find any siganture fields at all, bc. Podofo did not parse
>>>>> properly. The reference to the signature object was there, but it pointed
>>>>> to a not existing object. And thats the REAL issue here, when Podofo does
>>>>> NOT fail at loading, but the parsed document is insufficient.
>>>>>
>>>>>
>>>> These attack pdf files are supposed to load correctly, signature
>>>> verification part must handle and resist them.
>>>>
>>>> Greetings
>>>>> F.E.
>>>>> _______________________________________________
>>>>> Podofo-users mailing list
>>>>> Podofo-users@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>>>>
>>>> _______________________________________________
>> Podofo-users mailing list
>> Podofo-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>
>

Attachment: pdfparser_xref_entry_size_check2.patch
Description: Binary data

Attachment: pdfparser_xref_entry_size_check2_nolfcr.patch
Description: Binary data

_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users

Reply via email to