Itai Shaked commented on PDFBOX-4137:

Thanks for the tests. I have attached a revised patch. (On a technical note: 
in such cases, is it best to attach a full patch every time, or an incremental 
one? This time I created an incremental patch so it's easier to see what 
changed, but if a full patch is preferred, let me know.) 

I have fixed both problems, and added isSubsamplingAllowed to PDFRenderer and 
PageDrawerParameters. I have also made it so that the image may be cached for 
any subsampling value, as long as "region" is null. Higher quality (i.e. lower 
subsampling value) renders may replace lower-quality (higher subsampling) 
renders, but there is no handling of the hypothetical case of a high-quality 
cached version being discarded by the GC, so once a request is made with e.g. 
subsampling=1, the PDImageXObject instance will never cache versions with 
higher subsampling. I assume this won't hurt performance too much (the current 
implementation only saves the one version anyway), but it may be worth noting 
for the future. 
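
To make the caching rule above concrete, here is a minimal sketch of it. This is not the code from the patch: the class name SubsampledImageCache and the methods offer/get are hypothetical, and returning a cached render only for an exactly matching subsampling value is my assumption for the sketch.

```java
import java.lang.ref.SoftReference;

// Hypothetical stand-in for the caching part of PDImageXObject (not PDFBox code).
final class SubsampledImageCache<T> {
    // Soft reference, so the GC may discard the cached render under memory pressure.
    private SoftReference<T> cachedImage;
    // Lowest (best quality) subsampling value cached so far.
    private int cachedSubsampling = Integer.MAX_VALUE;

    // Stores the render if allowed; returns true when it was cached.
    synchronized boolean offer(Object region, int subsampling, T image) {
        if (region != null) {
            return false; // partial renders are never cached
        }
        if (subsampling > cachedSubsampling) {
            // Lower quality than something cached earlier. Note that this also
            // holds if that earlier render was since discarded by the GC.
            return false;
        }
        cachedImage = new SoftReference<>(image);
        cachedSubsampling = subsampling;
        return true;
    }

    // Assumption: a cached render is reused only for the same subsampling value.
    synchronized T get(Object region, int subsampling) {
        if (region != null || cachedImage == null || subsampling != cachedSubsampling) {
            return null;
        }
        return cachedImage.get(); // null if the GC cleared it
    }
}
```

With this rule, offering subsampling=4 and then subsampling=1 replaces the cached render, while a later offer with subsampling=4 is rejected.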

I have modified the problematic code in from1Bit so now it is: 
for (int i = 0; i < 8; i++) {
    if (x >= startx + scanWidth)
        break;
    int bit = value & mask;
    mask >>= 1;
    if (x >= startx && x % subsampling == 0)
        output[idx++] = bit == 0 ? value0 : value1;
    x++;
}
I believe it is clearer (only one place where x is incremented, no `continue`), 
but looking at it again, perhaps the break condition should be at the end of the 
loop - algorithmically it makes no difference, but it could be more readable. 
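
For comparison, here is a self-contained sketch of the variant with the break condition at the end of the loop. It is only an illustration of the idea, not the from1Bit method itself: the class OneBitRowSketch, the method unpackRow and its parameter list are hypothetical, and it handles a single row of packed bits.

```java
import java.util.Arrays;

// Hypothetical helper, not the PDFBox from1Bit method.
final class OneBitRowSketch {
    // Unpacks one row of 1-bit samples (most significant bit first), keeping only
    // pixels in [startx, startx + scanWidth) whose x is a multiple of subsampling.
    static byte[] unpackRow(byte[] packed, int startx, int scanWidth,
                            int subsampling, byte value0, byte value1) {
        int endx = startx + scanWidth;
        // Upper bound on the number of kept samples; trimmed before returning.
        byte[] output = new byte[(scanWidth + subsampling - 1) / subsampling];
        int idx = 0;
        int x = 0;
        for (int b = 0; b < packed.length && x < endx; b++) {
            int value = packed[b] & 0xFF;
            int mask = 0x80;
            for (int i = 0; i < 8; i++) {
                int bit = value & mask;
                mask >>= 1;
                if (x >= startx && x % subsampling == 0) {
                    output[idx++] = bit == 0 ? value0 : value1;
                }
                x++; // the only place where x is incremented
                if (x >= endx) {
                    break; // break condition at the end of the loop
                }
            }
        }
        return Arrays.copyOf(output, idx);
    }
}
```

The outer loop only starts a byte while x is still inside the scan range, so checking the break condition after the increment visits the same pixels as checking it first.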

> Allow subsampled/downscaled rendering of images, and rendering subimages 
> -------------------------------------------------------------------------
>                 Key: PDFBOX-4137
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4137
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Rendering
>    Affects Versions: 2.0.8
>            Reporter: Itai Shaked
>            Priority: Minor
>         Attachments: 0001-Image-render-subsample.patch, 
> 0001-Image-rendering-subsampling-removed-metadata-options.patch, 
> 0001-Image-subsampling-revision-2.patch, 067445.pdf, PDFBOX-1841.pdf, 
> PDFJS-4575-noimagedim.pdf, image_rendering_subsampling_hack.patch, 
> large-jpeg.pdf
>
> Suggested/contributed change to allow subsampling of images and rendering 
> sub-regions of images.  
> The need arises from having very large images which are highly compressed 
> (usually JPEG or JBIG2). The current implementation decodes the entire image 
> into memory at full resolution, even if rendering is done at a much lower 
> resolution. 
> Since the change required augmenting the way Filters work (to allow 
> partial/subsampled decoding), it also includes a partial fix for PDFBOX-3340. 
> This change introduces "DecodeOptions" which are currently only applicable 
> for images. They include requesting only metadata (for PDImageXObject's 
> repair method), subsampling and sub-region (similar to 
> javax.imageio.ImageReadParam). 
> Since not all filters can or do honor (use) the options, the DecodeOptions 
> class contains a flag. Filters that honor the options (subsample / decode 
> only requested region) set it to true. If the flag is false, the subsampling 
> or cropping should be done after decoding, to ensure consistency. 
> PageDrawer was modified so it uses subsampling based on the ratio of the 
> desired output to the original image. 
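
As a rough illustration of the last two points in the description (choosing a subsampling value from the size ratio, and falling back to subsampling after decoding when a filter did not honor the options), here is a minimal sketch. All names in it (SubsamplingSketch, chooseSubsampling, finish, filterHonoredOptions) are hypothetical and do not come from the patch, and the rounding rule is an assumption.

```java
import java.awt.image.BufferedImage;

// Hypothetical sketch; names and rounding rule are assumptions, not PDFBox code.
final class SubsamplingSketch {
    // Largest integer step that still leaves at least the desired output size.
    static int chooseSubsampling(int imageWidth, int imageHeight,
                                 int targetWidth, int targetHeight) {
        int ratioX = imageWidth / Math.max(1, targetWidth);
        int ratioY = imageHeight / Math.max(1, targetHeight);
        return Math.max(1, Math.min(ratioX, ratioY));
    }

    // If the filter already honored the options, the decoded image is used as is;
    // otherwise every subsampling-th pixel is kept, for a consistent result.
    static BufferedImage finish(BufferedImage decoded, boolean filterHonoredOptions,
                                int subsampling) {
        if (filterHonoredOptions || subsampling <= 1) {
            return decoded;
        }
        int w = (decoded.getWidth() + subsampling - 1) / subsampling;
        int h = (decoded.getHeight() + subsampling - 1) / subsampling;
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out.setRGB(x, y, decoded.getRGB(x * subsampling, y * subsampling));
            }
        }
        return out;
    }
}
```

For example, a 4000x3000 image drawn into a 1000x750 area would get a subsampling value of 4 under this rule, and an image that is drawn larger than its native size would get 1.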

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
