Hey,

Attached are a PDF file that PoDoFo fails to parse and a patch for that bug.

Description:

The PDF file has comments between the 'trailer' token and the EOF marker. These comments are longer than the lRange (lookup range) passed to findToken, so the search for the 'trailer' token starts somewhere inside the comments and never finds it.

Therefore, if the token we are looking for is 'trailer', we resize the buffer to nFileSize - m_nXRefOffset. The search window then covers everything from the xref offset to the end of the file, so this should always find the 'trailer' token.
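To illustrate the idea outside of PoDoFo, here is a minimal standalone sketch. It is not the PoDoFo implementation: findTokenInTail and makeSample are hypothetical names, and the file is assumed to be held in memory as a std::string. It only shows why a fixed look-back window can miss 'trailer' while a window of size (file size - xref offset) contains it.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not PoDoFo code): look for `token` within the last
// `range` bytes of `file`. Returns the token's offset in `file`, or
// std::string::npos when the token does not lie inside that window.
std::size_t findTokenInTail(const std::string& file, const std::string& token,
                            std::size_t range)
{
    const std::size_t window = std::min(file.size(), range);
    const std::size_t start = file.size() - window;
    const std::size_t pos = file.rfind(token);  // last occurrence in the file
    if (pos == std::string::npos || pos < start)
        return std::string::npos;               // token is outside the window
    return pos;
}

// Hypothetical sample: a tiny PDF-like file with `commentLines` comment
// lines between the trailer dictionary and 'startxref'.
std::string makeSample(std::size_t commentLines)
{
    std::string file = "%PDF-1.4\n"
                       "xref\n0 1\n0000000000 65535 f \n"
                       "trailer\n<< /Size 1 >>\n";
    for (std::size_t i = 0; i < commentLines; ++i)
        file += "% padding comment\n";
    file += "startxref\n9\n%%EOF\n";            // the xref table starts at offset 9
    return file;
}
```

With enough comment lines, a call such as findTokenInTail(file, "trailer", 512) returns npos, while passing file.size() - xrefOffset as the range finds the token.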


I don't know about the findToken2 function. The same code could be copied over there, but I didn't patch findToken2. That function already seems to be a band-aid for some other issue, so I don't want to mess with it. (Feel free to patch it too, though; I think it has the same problem.)


Best Regards,

Dennis Voss

--

dots Software <http://www.dots.de/en/>

Dennis Voss
Lead Programmer

dots Gesellschaft für Softwareentwicklung mbH
Schlesische Str. 27, 10997 Berlin, Germany

Tel: +49 (0)30 695 799-30

dennis.v...@dots.de <mailto:max.musterm...@dots.de>
https://www.dots.de

District court | Amtsgericht: Berlin Charlottenburg HRB 65201
Managing Director | Geschäftsführer: Katsuji Kondo
Index: src/podofo/base/PdfParser.cpp
===================================================================
--- src/podofo/base/PdfParser.cpp       (revision 2056)
+++ src/podofo/base/PdfParser.cpp       (working copy)
@@ -1313,7 +1313,16 @@
                 "Failed to seek to EOF when looking for xref");
     }
 
-    pdf_long lXRefBuf  = PDF_MIN( static_cast<pdf_long>(nFileSize), static_cast<pdf_long>(lRange) );
+    pdf_long lXRefBuf;
+       // Dennis Voss 26.04.22, in case there are comments between 'trailer' and EOF we need to resize the buffer
+       // otherwise we will not find the trailer in lRange
+       if (strncmp(pszToken, "trailer", 7) == 0) {
+               lXRefBuf = PDF_MIN(static_cast<pdf_long>(nFileSize), static_cast<pdf_long>(nFileSize - m_nXRefOffset));
+               m_buffer.Resize(lXRefBuf);
+       }
+       else {
+               lXRefBuf = PDF_MIN(static_cast<pdf_long>(nFileSize), static_cast<pdf_long>(lRange));
+       }
     size_t   nTokenLen = strlen( pszToken );
 
     m_device.Device()->Seek( -lXRefBuf, std::ios_base::cur );

Attachment: EQUIOS .pdf
Description: Adobe PDF document

_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users
