On Fri, 2020-11-06 at 13:14 +0100, F. E. wrote:
> one of our customers uses a self-built pdf file, which is processed
> with podofo and causes a stack overflow crash inside of podofo when
> trying to load the file.

        Hi,
I gave it a quick test and it doesn't crash here (trunk at r2016). It
may be that my stack size is larger than yours. I get an exception
thrown with this content:

PoDoFo encountered an error. Error: 21 ePdfError_InvalidXRef
        Callstack:
        #0 Error Source: src/podofo/doc/PdfMemDocument.cpp:263
                Information: Handler fixes issue #49
        #1 Error Source: src/podofo/base/PdfParser.cpp:272
                Information: Unable to load objects from file.
        #2 Error Source: src/podofo/base/PdfParser.cpp:375
                Information: Unable to load xref entries.
        #3 Error Source: src/podofo/base/PdfParser.cpp:974
        #4 Error Source: src/podofo/base/PdfParser.cpp:974
        #5 Error Source: src/podofo/base/PdfParser.cpp:974
        #6 Error Source: src/podofo/base/PdfParser.cpp:974
        #7 Error Source: src/podofo/base/PdfParser.cpp:974
        #8 Error Source: src/podofo/base/PdfParser.cpp:974
        ...
        #251 Error Source: src/podofo/base/PdfParser.cpp:974
        #252 Error Source: src/podofo/base/PdfParser.cpp:974
        #253 Error Source: src/podofo/base/PdfParser.cpp:104


Using gdb I get this backtrace (cut for brevity):

Breakpoint 1, PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, 
lOffset=566598, bReadOnlyTrailer=false) at src/podofo/base/PdfParser.cpp:974
974                     e.AddToCallstack( __FILE__, __LINE__ );
(gdb) l
969             } catch(PdfError &e) {
970                 /* Be forgiving, the error happens when an entry in XRef stream points
971                    to a wrong place (offset) in the PDF file. */
972                 if( e != ePdfError_NoNumber )
973                 {
974                     e.AddToCallstack( __FILE__, __LINE__ );
975                     throw e;
976                 }
977             }
978         }
(gdb) p e
$1 = (PoDoFo::PdfError &) @0x6779f0: {_vptr.PdfError = 0x5d83f8 <vtable for 
PoDoFo::PdfError+16>, m_error = PoDoFo::ePdfError_InvalidXRef, m_callStack = 
std::deque with 1 element = {{m_nLine = 104,
      m_sFile = "src/podofo/base/PdfParser.cpp", m_sInfo = "", m_swInfo = 
L""}}, static s_DgbEnabled = true, static s_LogEnabled = true, static 
m_fLogMessageCallback = 0x0}
(gdb) b 972
Breakpoint 2 at 0x58c2f6: file src/podofo/base/PdfParser.cpp, line 972.
(gdb) bt
#0  PoDoFo::PdfParser::ReadXRefStreamContents (this=0x674820, lOffset=566598, 
bReadOnlyTrailer=false) at src/podofo/base/PdfParser.cpp:974
#1  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, 
lOffset=566598, bPositionAtEnd=false) at src/podofo/base/PdfParser.cpp:727
#2  0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents 
(this=0x674820, lOffset=574009, bReadOnlyTrailer=false) at 
src/podofo/base/PdfParser.cpp:968
#3  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, 
lOffset=574009, bPositionAtEnd=false) at src/podofo/base/PdfParser.cpp:727
#4  0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents 
(this=0x674820, lOffset=581388, bReadOnlyTrailer=false) at 
src/podofo/base/PdfParser.cpp:968
#5  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, 
lOffset=581388, bPositionAtEnd=false) at src/podofo/base/PdfParser.cpp:727
#6  0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents 
(this=0x674820, lOffset=588148, bReadOnlyTrailer=false) at 
src/podofo/base/PdfParser.cpp:968
#7  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, 
lOffset=588148, bPositionAtEnd=false) at src/podofo/base/PdfParser.cpp:727
#8  0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents 
(this=0x674820, lOffset=594866, bReadOnlyTrailer=false) at 
src/podofo/base/PdfParser.cpp:968
#9  0x000000000058b61b in PoDoFo::PdfParser::ReadXRefContents (this=0x674820, 
lOffset=594866, bPositionAtEnd=false) at src/podofo/base/PdfParser.cpp:727
#10 0x000000000058c290 in PoDoFo::PdfParser::ReadXRefStreamContents 
(this=0x674820, lOffset=602251, bReadOnlyTrailer=false) at 
src/podofo/base/PdfParser.cpp:968

> I do not know the parsing code well enough to understand what goes
> wrong with the pdf file

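Neither do I. The gdb backtrace suggests it is making progress: lOffset
differs in every nested call (602251, 594866, 588148, ... down to
566598), so the parser is moving through distinct XRef sections rather
than re-reading the same one.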

It looks like the file contains 325 '%%EOF' markers and 325
'startxref' directives, which is quite an inefficient way to create PDF
files, from my point of view. I am not saying it is impossible to create
it this way, only that it is inefficient.
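
For anyone who wants to check such a file themselves, here is a minimal
sketch of how the markers can be counted. It is a standalone helper, not
part of PoDoFo; the names CountMarker and ReadFile are made up for this
example, and it does a plain byte search, so a marker that happens to
appear inside a stream would be counted too.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: count non-overlapping occurrences of `marker`
// in `data` with a plain byte search.
std::size_t CountMarker(const std::string& data, const std::string& marker)
{
    if (marker.empty())
        return 0;

    std::size_t count = 0;
    for (std::size_t pos = data.find(marker);
         pos != std::string::npos;
         pos = data.find(marker, pos + marker.size()))
    {
        ++count;
    }
    return count;
}

// Hypothetical helper: read a whole file in binary mode.
// Returns an empty string when the file cannot be opened.
std::string ReadFile(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling CountMarker(ReadFile(path), "%%EOF") and the same with
"startxref" gives the two counts.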

On the other hand, it shows that PoDoFo's catcher for recursion
in the XRef table misbehaves (because there is no real recursion here)
and that reading the XRef table this way can cause a stack overflow
in some cases.
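
To illustrate the difference between a long chain and a real loop, here
is a rough sketch of walking the chain of previous XRef sections with a
loop instead of nested calls. This is not PoDoFo's code and not a
proposed patch: PrevTable and WalkXRefChain are names invented for the
example, and the table is an in-memory stand-in for what a parser would
read from the file. The assumption is that each XRef section names at
most one previous section.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a parsed file: maps the offset of each XRef
// section to the offset of the previous section (-1 when there is none).
using PrevTable = std::map<long, long>;

// Walk the chain with a loop, so the stack depth stays constant no matter
// how many incremental updates the file contains. An error is raised only
// when an offset is visited twice, which is a genuine cycle.
std::vector<long> WalkXRefChain(const PrevTable& prevOf, long startOffset)
{
    std::vector<long> order;   // offsets in the order they were visited
    std::set<long> visited;    // offsets already seen, for cycle detection
    long offset = startOffset;

    while (offset >= 0)
    {
        if (!visited.insert(offset).second)
            throw std::runtime_error("cycle in XRef chain");

        order.push_back(offset);

        PrevTable::const_iterator it = prevOf.find(offset);
        if (it == prevOf.end())
            throw std::runtime_error("offset points at no known XRef section");

        offset = it->second;
    }
    return order;
}
```

With this shape a file with 325 chained sections is just 325 loop
iterations, while a chain that points back at an earlier offset is
still rejected.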

        Bye,
        zyx



_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users
