[ 
https://issues.apache.org/jira/browse/PDFBOX-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882539#comment-15882539
 ] 

Viraf Bankwalla commented on PDFBOX-3700:
-----------------------------------------

Your earlier suggestion of providing my own ResourceCache would probably work 
best for me, as it gives me control over which images are cached.  However, as 
I anticipate that others will run into this issue, it would be good to find a 
solution or add your proposal to the FAQ.  Thanks for your help.  - viraf
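
To illustrate the idea of a cache that decides what it keeps, here is a minimal 
stand-alone sketch.  It is not PDFBox code: the class SelectiveCache and its 
predicate are hypothetical names used only for illustration.  With PDFBox 2.0 
the analogous step would presumably be to subclass DefaultResourceCache, 
override put(COSObject, PDXObject) so that images are skipped, and install the 
result with PDDocument.setResourceCache.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Stand-alone model of a selective resource cache; NOT PDFBox code.
// SelectiveCache and the predicate passed to it are hypothetical names.
final class SelectiveCache<K, V> {
    // Soft references allow the JVM to reclaim cached values under memory pressure.
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Predicate<V> cacheable;

    public SelectiveCache(Predicate<V> cacheable) {
        this.cacheable = cacheable;
    }

    // Stores the value only if the predicate accepts it; otherwise does nothing.
    public void put(K key, V value) {
        if (cacheable.test(value)) {
            entries.put(key, new SoftReference<>(value));
        }
    }

    // Returns the cached value, or null if it was never stored or was reclaimed.
    public V get(K key) {
        SoftReference<V> ref = entries.get(key);
        return ref == null ? null : ref.get();
    }

    // Number of entries that were accepted into the cache.
    public int size() {
        return entries.size();
    }
}
```

A caller would pass a predicate that rejects large objects (images, in this 
issue) and would then re-create any rejected object on demand, trading some 
repeated decoding work for a smaller heap.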

> OutOfMemoryException converting PDF to TIFF Images
> --------------------------------------------------
>
>                 Key: PDFBOX-3700
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3700
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.4
>            Reporter: Viraf Bankwalla
>         Attachments: jira-pdfbox-3700.zip
>
>
> I am using PDFBox to convert PDF documents to a series of TIFF images (one 
> for each page).  The implementation uses PDFRenderer to render each page.  
> Things work fine when I am processing a single document in a single thread; 
> however, when I try to process multiple documents (each in its own thread), 
> I get an OutOfMemoryError.
> In analyzing the heap dump, I see that this is caused by the images cached in 
> DefaultResourceCache.  Objects are added to the cache in PDResources, which 
> includes a method, private boolean isAllowedCache(PDXObject xobject), that is 
> used to determine whether a PDXObject can be cached.  I have extended this 
> method to reject objects whose subtype is COSName.IMAGE, and am now able to 
> process multiple documents in parallel.
> A proposed fix would be to add images to the set of objects that are not 
> cached.  For example, the following could be added to 
> PDResources.isAllowedCache:
> {code:title=Bar.java|borderStyle=solid}
> COSBase image =  xobject.getCOSObject().getDictionaryObject(COSName.SUBTYPE);
> if (image instanceof COSName && ((COSName) image).equals(COSName.IMAGE))
> {
>     return false;
> }
> {code}
> A possible patch is enclosed below.  I would like to get a fix in for the 
> next release.
> diff --git a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/PDResources.java b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/PDResources.java
> index 6e1e464..aa94122 100644
> --- a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/PDResources.java
> +++ b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/PDResources.java
> @@ -31,15 +31,15 @@
>  import org.apache.pdfbox.pdmodel.documentinterchange.markedcontent.PDPropertyList;
>  import org.apache.pdfbox.pdmodel.font.PDFont;
>  import org.apache.pdfbox.pdmodel.font.PDFontFactory;
> +import org.apache.pdfbox.pdmodel.graphics.PDXObject;
> +import org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace;
>  import org.apache.pdfbox.pdmodel.graphics.color.PDPattern;
>  import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
> +import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
>  import org.apache.pdfbox.pdmodel.graphics.optionalcontent.PDOptionalContentGroup;
> -import org.apache.pdfbox.pdmodel.graphics.state.PDExtendedGraphicsState;
> -import org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace;
>  import org.apache.pdfbox.pdmodel.graphics.pattern.PDAbstractPattern;
>  import org.apache.pdfbox.pdmodel.graphics.shading.PDShading;
> -import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
> -import org.apache.pdfbox.pdmodel.graphics.PDXObject;
> +import org.apache.pdfbox.pdmodel.graphics.state.PDExtendedGraphicsState;
>  
>  /**
>   * A set of resources available at the page/pages/stream level.
> @@ -445,6 +445,12 @@
>                      return false;
>                  }
>              }
> +            
> +            COSBase image = xobject.getCOSObject().getDictionaryObject(COSName.SUBTYPE);
> +            if (image instanceof COSName && ((COSName) image).equals(COSName.IMAGE))
> +            {
> +                return false;
> +            }
>          }
>          return true;
>      }


