On Tue, Apr 26, 2022 at 5:42 PM Dennis Voss <dennis.v...@dots.de> wrote:

> Hey,
>
> attached is a pdf-file that is failing to be parsed by PoDoFo and a patch
> for that bug.
>
> Description:
>
> The pdf-file has comments between the EOF and the 'trailer' token. These
> comments are 'longer' than the lRange (lookup range) provided to findToken,
> so when we try to find the 'trailer' token we will end up somewhere in the
> comments and fail to find the token.
>

It seems that if the word "trailer" appeared in these comments, PoDoFo would
mistake it for the actual trailer.
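
To illustrate the concern, here is a minimal sketch (not PoDoFo code) of a backward search that skips matches lying inside a comment. The names IsInsideComment and FindLastTokenOutsideComments are hypothetical, and the comment test is a deliberate simplification: it treats any '%' earlier on the same line as the start of a comment and ignores '%' inside strings or streams.

```cpp
#include <cstring>

// Hypothetical helper: true if a '%' appears between the start of the
// line containing `pos` and `pos` itself (simplified comment detection).
bool IsInsideComment(const char* buf, long pos)
{
    for (long i = pos - 1; i >= 0; i--)
    {
        if (buf[i] == '\n' || buf[i] == '\r')
            return false; // reached the start of the line, no '%' seen
        if (buf[i] == '%')
            return true;  // a comment started earlier on this line
    }
    return false;
}

// Hypothetical helper: offset of the last occurrence of `token` in
// buf[0..len) that is not inside a comment, or -1 if there is none.
long FindLastTokenOutsideComments(const char* buf, long len, const char* token)
{
    const long tokenLen = static_cast<long>(std::strlen(token));
    for (long i = len - tokenLen; i >= 0; i--)
    {
        if (std::memcmp(buf + i, token, tokenLen) == 0 && !IsInsideComment(buf, i))
            return i;
    }
    return -1;
}
```

With a buffer such as "trailer\n<< >>\n% old trailer here\nstartxref\n0\n%%EOF\n", a plain backward search would stop at the "trailer" inside the comment, while this sketch returns the offset of the real keyword.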


> Therefore, if the token we are looking for is 'trailer', we resize the
> buffer accordingly (nFileSize - m_nXRefOffset); this should always find
> the 'trailer' token.
>
>

What if the xref section itself lies between the trailer and startxref?



> I don't know about the findToken2 function. The same code could be copied
> over to it, but I didn't patch findToken2. That function seems to be a
> band-aid for some other issue already, so I don't want to mess with it
> (feel free to patch it too, though; I think it has the same problem).
>
>
>
Also, not really related to the attached patch, but this looks really weird:

    int i; // Do not make this unsigned, this will cause infinite loops in
           // files without trailer

    for( i = lXRefBuf - nTokenLen; i >= 0; i-- )
    {
        if( strncmp( m_buffer.GetBuffer()+i, pszToken, nTokenLen ) == 0 )
        {
            break;
        }
    }

    if( !i )
    {
        PODOFO_RAISE_ERROR( ePdfError_InternalLogic );
    }

If the token is found at the very start of the buffer, i.e. exactly 512 bytes
before EOF (assuming the PDF is larger than 512 bytes), this throws
ePdfError_InternalLogic, because the "break" happens when i == 0. The check
should probably have been if( i < 0 ): when the token is not found, i ends up
negative (-1), not zero, which is also what the comment at the declaration of
i implies. In that case the error should probably be something like "invalid
PDF" rather than "internal logic error". I would open a new issue on GitHub,
but has this already been resolved in pdfmm?
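
For reference, a minimal sketch of the same backward search with the not-found case tested as i < 0. This is not the PoDoFo implementation: FindTokenBackward and RequireToken are hypothetical names, std::runtime_error stands in for whatever "invalid PDF" error code would be appropriate, and std::memcmp is used instead of strncmp on the assumption that the buffer may contain NUL bytes.

```cpp
#include <cstring>
#include <stdexcept>

// Hypothetical helper: search backwards for `token` in buf[0..len).
// Returns the offset of the last occurrence, or -1 if it is absent.
long FindTokenBackward(const char* buf, long len, const char* token)
{
    const long tokenLen = static_cast<long>(std::strlen(token));
    // The index must stay signed so the loop can end by reaching -1.
    for (long i = len - tokenLen; i >= 0; i--)
    {
        if (std::memcmp(buf + i, token, tokenLen) == 0)
            return i; // offset 0 is a valid hit, not a failure
    }
    return -1;
}

// Hypothetical wrapper: a missing token is reported as a malformed
// file rather than as an internal logic error.
long RequireToken(const char* buf, long len, const char* token)
{
    const long pos = FindTokenBackward(buf, len, token);
    if (pos < 0)
        throw std::runtime_error("token not found: invalid PDF");
    return pos;
}
```

A buffer that begins with the token, such as "trailer<<>>", yields offset 0 and no exception, which is exactly the case the if( !i ) check rejects.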

> Best Regards,
>
> Dennis Voss
> _______________________________________________
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>