[ 
https://issues.apache.org/jira/browse/PDFBOX-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16386366#comment-16386366
 ] 

Tilman Hausherr edited comment on PDFBOX-4137 at 3/5/18 5:07 PM:
-----------------------------------------------------------------

Thanks; I ran my tests.

PDFBOX_1841.pdf and several other files throw an ArrayIndexOutOfBoundsException 
at 50% and 100% zoom in PDFDebugger, but not at 25%, 200% or 400%.

I suspect that you should check for {{(x >= startx + scanWidth)}} below this 
code and then break if needed:
{code:java}
if (x < startx || (x % subsampling) > 0)
{
    x++;
{code}
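
To illustrate the suggested bounds check, here is a minimal standalone sketch. It is not the code from the patch: the class {{SubsampledRowScan}}, the method {{scanRow}} and the plain {{int[]}} row are assumptions made for the example. Only the {{startx}}, {{scanWidth}} and {{subsampling}} names and the two conditions come from the comment above.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from PDFBox or from the patch.
final class SubsampledRowScan
{
    // Copies every sample of src whose index is a multiple of subsampling
    // and lies inside the window [startx, startx + scanWidth).
    static int[] scanRow(int[] src, int startx, int scanWidth, int subsampling)
    {
        if (subsampling < 1 || startx < 0 || scanWidth < 0)
        {
            throw new IllegalArgumentException("invalid scan parameters");
        }
        // The output is sized for the window only, so writing samples from
        // beyond the window would overflow it.
        int[] out = new int[(scanWidth + subsampling - 1) / subsampling];
        int outIndex = 0;
        int x = 0;
        while (x < src.length)
        {
            if (x < startx || (x % subsampling) > 0)
            {
                x++;
                continue;
            }
            // The suggested check: stop once x has left the window.
            if (x >= startx + scanWidth)
            {
                break;
            }
            out[outIndex++] = src[x];
            x++;
        }
        // Trim in case the window contained fewer multiples than the upper bound.
        return Arrays.copyOf(out, outIndex);
    }
}
```

In this sketch, removing the {{break}} lets the loop keep writing past the end of {{out}} whenever the row is wider than the window, which gives the same kind of exception as reported above. Whether that is the actual cause in the patch is not confirmed here.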
PDFJS-4575-noimagedim.pdf throws a NegativeArraySizeException. There is a check 
in getRGBImage() which would throw an IOException, but that check has been 
disabled by the change.

The subsampling feature should be optional. I suggest an option in PDFRenderer, 
e.g. {{setSubsamplingAllowed}} and {{isSubsamplingAllowed}}, disabled by 
default in 2.* and enabled in the trunk, and advertising the feature in the 
FAQ. I will also look on Stack Overflow for posts that complain about 
memory.
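
A rough sketch of what such an opt-in switch could look like. The accessor names are the ones suggested above. The class {{RendererSettingsSketch}} and the helper {{effectiveSubsampling}} are hypothetical and only stand in for wherever the renderer would consult the flag.

```java
// Hypothetical stand-in for the renderer-side option; not PDFBox code.
final class RendererSettingsSketch
{
    // Off by default, as proposed for the 2.* branch.
    private boolean subsamplingAllowed = false;

    boolean isSubsamplingAllowed()
    {
        return subsamplingAllowed;
    }

    void setSubsamplingAllowed(boolean subsamplingAllowed)
    {
        this.subsamplingAllowed = subsamplingAllowed;
    }

    // Hypothetical helper: returns the factor the image decoder should use.
    // A factor of 1 means full-resolution decoding.
    int effectiveSubsampling(int requested)
    {
        if (!subsamplingAllowed || requested < 1)
        {
            return 1;
        }
        return requested;
    }
}
```

With this shape, existing callers keep full-resolution decoding unless they call {{setSubsamplingAllowed(true)}}, so the default behaviour in 2.* would not change.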

Yes, caching makes sense. We could remember the subsampling value and compare 
with the current one.
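
One possible shape for that caching idea, as a sketch only. The class {{SubsampledImageCache}} and its {{get}} method are hypothetical, and the decoder is passed in as a plain function so the example stays independent of PDFBox types.

```java
import java.util.function.IntFunction;

// Hypothetical single-entry cache; not PDFBox code.
final class SubsampledImageCache<T>
{
    private T cachedImage;
    private int cachedSubsampling = -1;

    // Returns the cached image if it was decoded with the same subsampling
    // value; otherwise decodes again and remembers the new value.
    T get(int subsampling, IntFunction<T> decoder)
    {
        if (cachedImage == null || cachedSubsampling != subsampling)
        {
            cachedImage = decoder.apply(subsampling);
            cachedSubsampling = subsampling;
        }
        return cachedImage;
    }
}
```

The sketch compares for strict equality. Reusing an image that was decoded at a finer subsampling for a coarser request would also be possible, but it trades memory for fewer decodes, so it is left out here.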


was (Author: tilman):
Thanks; I ran my tests.

gs-bugzilla692689.pdf and several other files throw an 
ArrayIndexOutOfBoundsException at 50% and 100% zoom in PDFDebugger, 
but not at 25%, 200% or 400%.

I suspect that you should check for {{(x >= startx + scanWidth)}} below this 
code and then break if needed:
{code}
if (x < startx || (x % subsampling) > 0)
{
    x++;
{code}

PDFJS-4575-noimagedim.pdf throws a NegativeArraySizeException. There is a check 
in getRGBImage() which would throw an IOException, but that check has been 
disabled by the change.

The subsampling feature should be optional. I suggest an option in PDFRenderer, 
e.g. {{setSubsamplingAllowed}} and {{isSubsamplingAllowed}}, disabled by 
default in 2.* and enabled in the trunk, and advertising the feature in the 
FAQ. I will also look on Stack Overflow for posts that complain about 
memory.

Yes, caching makes sense. We could remember the subsampling value and compare 
with the current one.

> Allow subsampled/downscaled rendering of images, and rendering subimages 
> -------------------------------------------------------------------------
>
>                 Key: PDFBOX-4137
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4137
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Rendering
>    Affects Versions: 2.0.8
>            Reporter: Itai Shaked
>            Priority: Minor
>         Attachments: 0001-Image-render-subsample.patch, 
> 0001-Image-rendering-subsampling-removed-metadata-options.patch, 067445.pdf, 
> PDFBOX-1841.pdf, PDFJS-4575-noimagedim.pdf, 
> image_rendering_subsampling_hack.patch, large-jpeg.pdf
>
>
> Suggested/contributed change to allow subsampling of images and rendering 
> sub-regions of images.  
> The need arises from having very large images which are highly compressed 
> (usually JPEG or JBIG2). The current implementation decodes the entire image 
> into memory at full resolution, even if rendering is done at a much lower 
> resolution. 
> Since the change required augmenting the way Filters work (to allow 
> partial/subsampled decoding), it also includes a partial fix for PDFBOX-3340. 
>  
>  
> This change introduces "DecodeOptions" which are currently only applicable 
> for images. They include requesting only metadata (for PDImageXObject's 
> repair method), subsampling and sub-region (similar to 
> javax.imageio.ImageReadParam). 
> Since not all filters can or do honor (use) the options, the DecodeOptions 
> class contains a flag. Filters that honor the options (subsample / decode 
> only requested region) set it to true. If the flag is false, the subsampling 
> or cropping should be done after decoding, to ensure consistency. 
> PageDrawer was modified so it uses subsampling based on the ratio of the 
> desired output to the original image. 
>  


