On Tue, Apr 26, 2022 at 5:42 PM Dennis Voss <dennis.v...@dots.de> wrote:
> Hey,
>
> attached is a pdf-file that is failing to be parsed by PoDoFo and a patch
> for that bug.
>
> Description:
>
> The pdf-file has comments between the EOF and the 'trailer' token. These
> comments are 'longer' than the lRange (lookup range) provided to findToken,
> so when we try to find the 'trailer' token we will end up somewhere in the
> comments and fail to find the token.

It seems that if the word "trailer" appeared in these comments, PoDoFo
would mistake it for the actual trailer.

> Therefore, if the token we are looking for is equal to 'trailer' we
> resize the buffer accordingly (nFileSize - m_nXRefOffset), this should
> always find the 'trailer' token.

What if the xref itself is between trailer and startxref?

> I don't know about the findToken2 function. The same code can be copied
> over to there, but *I didn't patch* findToken2. That function seems to be
> a bandaid for some other issue already, so I don't want to mess with it...
> (feel free to patch it too though, I think it has the same problem).

Also, not really related to the attached patch, but this looks really weird:

    int i; // Do not make this unsigned, this will cause infinite loops in files without trailer
    for( i = lXRefBuf - nTokenLen; i >= 0; i-- )
    {
        if( strncmp( m_buffer.GetBuffer()+i, pszToken, nTokenLen ) == 0 )
        {
            break;
        }
    }

    if( !i )
    {
        PODOFO_RAISE_ERROR( ePdfError_InternalLogic );
    }

If the token keyword is found exactly 512 bytes before EOF (suppose the pdf
is larger than 512 bytes), then this will throw ePdfError_InternalLogic,
because the "break" happens when i == 0 and the check "!i" is then true.

The check should probably have been if( i < 0 ): when the token is not
found, the loop ends with i negative (-1), not zero. The comment next to
the declaration of i points at the same thing. In that case the error
should probably also be something like "invalid pdf" rather than
"internal logic error".

I would open a new issue on GitHub, but is this already resolved in pdfmm?
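To make the i == 0 versus i < 0 distinction concrete, here is a minimal
stand-alone sketch of the backward search. It is not PoDoFo code:
find_last_token and its parameters are hypothetical names, and it uses
memcmp instead of strncmp. It only illustrates returning -1 for "not found"
so that a match at offset 0 stays valid.

```cpp
#include <cstring>

// Hypothetical helper (not part of PoDoFo): scan buf backwards and return
// the offset of the last occurrence of token, or -1 if it does not occur.
long find_last_token(const char* buf, long bufLen, const char* token)
{
    const long tokenLen = static_cast<long>(std::strlen(token));

    // The index is signed on purpose: it has to be able to drop below zero
    // for the loop to terminate when the token is missing.
    for (long i = bufLen - tokenLen; i >= 0; i--)
    {
        if (std::memcmp(buf + i, token, static_cast<std::size_t>(tokenLen)) == 0)
        {
            return i; // offset 0 is a legitimate hit
        }
    }

    return -1; // not found; the caller should test for < 0, not for == 0
}
```

A caller would then treat a negative result as a malformed file and any
result >= 0, including 0, as a valid token position.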
> Best Regards,
>
> Dennis Voss
> --
>
> Dennis Voss
> Lead Programmer
>
> dots Gesellschaft für Softwareentwicklung mbH
> Schlesische Str. 27, 10997 Berlin, Germany
>
> Tel: +49 (0)30 695 799-30
>
> dennis.v...@dots.de <max.musterm...@dots.de>
> https://www.dots.de
>
> District court | Amtsgericht: Berlin Charlottenburg HRB 65201
> Managing Director | Geschäftsführer: Katsuji Kondo
>
> _______________________________________________
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users