On Fri, 2020-11-06 at 13:14 +0100, F. E. wrote:
> one of our customers uses a self-built pdf file, which is processed
> with podofo and causes a stack overflow crash inside of podofo when
> trying to load the file.

Hi,

I gave it a quick test and it doesn't crash here (trunk at r2016). It may
be that my stack size is larger than yours. I get an exception thrown,
with this content:

PoDoFo encountered an error. Error: 21 ePdfError_InvalidXRef
Callstack:
#0 Error Source: src/podofo/doc/PdfMemDocument.cpp:263
   Information: Handler fixes issue #49
#1 Error Source: src/podofo/base/PdfParser.cpp:272
   Information: Unable to load objects from file.
#2 Error Source: src/podofo/base/PdfParser.cpp:375
   Information: Unable to load xref entries.
#3 Error Source: src/podofo/base/PdfParser.cpp:974
#4 Error Source: src/podofo/base/PdfParser.cpp:974
#5 Error Source: src/podofo/base/PdfParser.cpp:974
#6 Error Source: src/podofo/base/PdfParser.cpp:974
#7 Error Source: src/podofo/base/PdfParser.cpp:974
#8 Error Source: src/podofo/base/PdfParser.cpp:974
...
#251 Error Source: src/podofo/base/PdfParser.cpp:974
#252 Error Source: src/podofo/base/PdfParser.cpp:974
#253 Error Source: src/podofo/base/PdfParser.cpp:104

Using gdb I get this backtrace (cut for brevity):

Breakpoint 1, PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, lOffset=566598, bReadOnlyTrailer=false)
    at src/podofo/base/PdfParser.cpp:974
974                 e.AddToCallstack( __FILE__, __LINE__ );
(gdb) l
969         } catch(PdfError &e) {
970             /* Be forgiving, the error happens when an entry in XRef stream points
971                to a wrong place (offset) in the PDF file. */
972             if( e != ePdfError_NoNumber )
973             {
974                 e.AddToCallstack( __FILE__, __LINE__ );
975                 throw e;
976             }
977         }
978     }
(gdb) p e
$1 = (PoDoFo::PdfError &) @0x6779f0: {_vptr.PdfError = 0x5d83f8 <vtable for PoDoFo::PdfError+16>,
  m_error = PoDoFo::ePdfError_InvalidXRef,
  m_callStack = std::deque with 1 element = {{m_nLine = 104, m_sFile = "src/podofo/base/PdfParser.cpp", m_sInfo = "", m_swInfo = L""}},
  static s_DgbEnabled = true, static s_LogEnabled = true, static m_fLogMessageCallback = 0x0}
(gdb) b 972
Breakpoint 2 at 0x58c2f6: file src/podofo/base/PdfParser.cpp, line 972.
(gdb) bt
#0  PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, lOffset=566598, bReadOnlyTrailer=false)
    at src/podofo/base/PdfParser.cpp:974
#1  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, lOffset=566598, bPositionAtEnd=false)
    at src/podofo/base/PdfParser.cpp:727
#2  0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, lOffset=574009, bReadOnlyTrailer=false)
    at src/podofo/base/PdfParser.cpp:968
#3  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, lOffset=574009, bPositionAtEnd=false)
    at src/podofo/base/PdfParser.cpp:727
#4  0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, lOffset=581388, bReadOnlyTrailer=false)
    at src/podofo/base/PdfParser.cpp:968
#5  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, lOffset=581388, bPositionAtEnd=false)
    at src/podofo/base/PdfParser.cpp:727
#6  0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, lOffset=588148, bReadOnlyTrailer=false)
    at src/podofo/base/PdfParser.cpp:968
#7  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, lOffset=588148, bPositionAtEnd=false)
    at src/podofo/base/PdfParser.cpp:727
#8  0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, lOffset=594866, bReadOnlyTrailer=false)
    at src/podofo/base/PdfParser.cpp:968
#9  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, lOffset=594866, bPositionAtEnd=false)
    at src/podofo/base/PdfParser.cpp:727
#10 0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, lOffset=602251, bReadOnlyTrailer=false)
    at src/podofo/base/PdfParser.cpp:968

> I do not know the parsing code well enough to understand what goes
> wrong with the pdf file

Neither do I. The gdb backtrace suggests the parser is making progress,
since lOffset differs in each nested call.

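For illustration only: the frames above show ReadXRefContents and
ReadXRefStreamContents calling each other once per xref section (offsets
602251 down to 566598 in the part shown), so the stack depth grows with
the number of sections. Below is a minimal sketch of walking such a chain
with a loop instead, and of detecting a real loop by remembering the
offsets already visited. This is not PoDoFo code: XRefSection,
XRefSections, kNoPrev and WalkXRefChain are made-up names, and the map
only stands in for parsing the section found at a given offset.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical sentinel: this section has no previous section.
const long long kNoPrev = -1;

// Hypothetical stand-in for a parsed xref section. Only the link to the
// previous section matters for this sketch.
struct XRefSection {
    long long prevOffset; // file offset of the previous section, or kNoPrev
};

// Hypothetical lookup: file offset -> section found at that offset.
typedef std::map<long long, XRefSection> XRefSections;

// Walks the chain from startOffset back to the oldest section with a
// plain loop, so stack usage does not depend on the chain length.
// Returns the offsets in visit order. Throws std::runtime_error when the
// chain points back to an already visited offset (a real loop) or to an
// offset where no section exists.
std::vector<long long> WalkXRefChain(const XRefSections& sections,
                                     long long startOffset)
{
    std::vector<long long> order;
    std::set<long long> visited;

    long long offset = startOffset;
    while (offset != kNoPrev) {
        // insert().second is false when the offset was seen before
        if (!visited.insert(offset).second)
            throw std::runtime_error("xref chain loops back to a visited offset");

        XRefSections::const_iterator it = sections.find(offset);
        if (it == sections.end())
            throw std::runtime_error("no xref section at the given offset");

        order.push_back(offset);
        offset = it->second.prevOffset;
    }

    // Every visited offset was recorded exactly once.
    assert(order.size() == visited.size());
    return order;
}
```

With three sections chained 602251 -> 594866 -> 588148, the function
returns the offsets in that order. A chain that points back to an
already visited offset throws instead of descending further, and a long
but loop-free chain is walked without any growth in stack depth.
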
It looks like the file contains 325 '%%EOF' markers and 325 'startxref'
directives, which is quite an inefficient way to create PDF files, from
my point of view. I'm not saying a file cannot be created this way, only
that it's inefficient. On the other hand, it shows that PoDoFo's catcher
for recursion in the XRef table misbehaves (because there is no real
recursion in this file), and that reading the XRef table this way can
cause a stack overflow in some cases.

	Bye,
	zyx

_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users