Hello PoDoFo developers,

I believe I have fixed a bug in PdfXRefStreamParserObject that occurs when 
the first entry of the "W" array is zero:

The current SVN version (r1665) treats this case as "type 0" (free object), 
but according to the PDF spec the type should default to "type 1" here.
Please see the spec "PDF 32000-1:2008", Section "7.5.8.2 Cross-Reference Stream 
Dictionary", "Table 17 – Additional entries specific to a cross-reference 
stream dictionary". The description of the key "W" states:
> A value of zero for an element in the W array indicates that the 
> corresponding field shall not be present in the stream, and the default value 
> shall be used, if there is one. If the first element is zero, the type field 
> shall not be present, and shall default to type 1.
> 
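
To illustrate the rule quoted above, here is a minimal standalone sketch of 
decoding one cross-reference stream entry so that a zero-width type field 
defaults to type 1. This is not PoDoFo code: XRefEntry and DecodeEntry are 
hypothetical names made up for this example, and the sketch assumes the 
buffer holds at least w[0]+w[1]+w[2] bytes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one decoded cross-reference stream entry.
struct XRefEntry {
    std::uint64_t type;    // 0 = free, 1 = in use, 2 = stored in an object stream
    std::uint64_t field2;  // meaning depends on type (byte offset for type 1)
    std::uint64_t field3;  // meaning depends on type (generation for type 1)
};

// Decode one entry from `data`, where w[i] is the byte width of field i.
// Fields are big-endian; a width of zero means the field is absent.
XRefEntry DecodeEntry(const unsigned char* data,
                      const std::array<std::size_t, 3>& w)
{
    std::array<std::uint64_t, 3> fields = {0, 0, 0};
    for (std::size_t i = 0; i < 3; ++i) {
        for (std::size_t b = 0; b < w[i]; ++b) {
            fields[i] = (fields[i] << 8) | *data++;
        }
    }
    // PDF 32000-1:2008, Table 17: an absent type field defaults to type 1.
    if (w[0] == 0) {
        fields[0] = 1;
    }
    return XRefEntry{fields[0], fields[1], fields[2]};
}
```

With W = [0 3 0] and the entry bytes 00 01 2C, this sketch yields type 1 and 
a second field of 300 (0x012C), which for a type 1 entry is the byte offset 
of the object. Without the default, the same entry would come out as type 0.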

I fixed this issue with the following patch to 
PdfXRefStreamParserObject.cpp; the if statement added just before the 
switch statement is the actual fix:

> Index: podofo-src-r1665/src/base/PdfXRefStreamParserObject.cpp
> ===================================================================
> --- podofo-src-r1665/src/base/PdfXRefStreamParserObject.cpp   (revision 7630)
> +++ podofo-src-r1665/src/base/PdfXRefStreamParserObject.cpp   (working copy)
> @@ -228,6 +228,7 @@
>  
>      //printf("OBJ=%i nData = [ %i %i %i ]\n", nObjNo, static_cast<int>(nData[0]), static_cast<int>(nData[1]), static_cast<int>(nData[2]) );
>      (*m_pOffsets)[nObjNo].bParsed = true;
> +    if (lW[0]==0) nData[0]=1; // If the first element is zero, the type field shall not be present, and shall default to type 1.
>      switch( nData[0] ) // nData[0] contains the type information of this entry
>      {
>          case 0:

Without this patch, I could not create a PdfMemDocument for a sample PDF 
file whose XRef stream has a W entry of the form [0 3 0]. The symptom 
looked like this:
> PoDoFo encounter an error. Error: 15 ePdfError_NoObject
>       Error Description: A object was expected but not found.
>       Callstack:
>       #0 Error Source: /Users/amin/podofo/podofo-svn/podofo-src/src/doc/PdfMemDocument.cpp:182
>               Information: Catalog object not found!
The error occurred in the function 
PdfMemDocument::InitFromParser( PdfParser* pParser ) at the call
PdfObject* pCatalog = pTrailer->GetIndirectKey( "Root" );
which failed because the indirect object referenced by "Root" could not be 
dereferenced: this->GetObjects().GetSize() returned zero, since all 
PdfObjects had been parsed as type "free".

With the patch applied, the problematic PDF file parses without problems.

Could you please check this patch and add it to the SVN version? Thank you very 
much!

Best regards,
Amin

_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users