[ 
https://issues.apache.org/jira/browse/PDFBOX-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385051#comment-16385051
 ] 

Itai Shaked commented on PDFBOX-4137:
-------------------------------------

I am attaching a revised patch. I have removed the "metadata" portions from 
DecodeOptions (maybe it should now be called "ImageDecodeOptions"?). I think it 
is best if your fix goes in the PDFBOX-3340 discussion for clarity. Since I 
believe it can be implemented without changes to the public API, maybe it can 
be integrated sooner than the other features? 

This resolves both the "decodeMeta" naming issue (the option is no longer 
needed) and the issue with Flate-encoded streams (there is no longer an option 
to skip decoding). 

I have renamed "subsample" to "subsampling", and renamed "honored" to 
"filterSubsampled" (if the filter subsampled the image all is well, but if it 
didn't, SampledImageReader would need to do the subsampling). 
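
To make the fallback concrete, here is a minimal sketch of subsampling after 
decoding when the filter did not do it. It is not taken from the patch: the 
class and method names (SubsampleFallback, subsample, finish) are hypothetical, 
and it works on a plain BufferedImage rather than on the raw sample data that 
SampledImageReader handles. 

```java
import java.awt.image.BufferedImage;

// Hypothetical illustration only; not the code in the attached patch.
final class SubsampleFallback
{
    /** Keeps every n-th pixel in both directions (n = subsampling). */
    public static BufferedImage subsample(BufferedImage full, int subsampling)
    {
        if (subsampling <= 1)
        {
            return full;
        }
        // Round up so a partial last block still contributes one pixel.
        int w = (full.getWidth() + subsampling - 1) / subsampling;
        int h = (full.getHeight() + subsampling - 1) / subsampling;
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                out.setRGB(x, y, full.getRGB(x * subsampling, y * subsampling));
            }
        }
        return out;
    }

    /** Applies the fallback only when the filter did not already subsample. */
    public static BufferedImage finish(BufferedImage decoded, int subsampling,
                                       boolean filterSubsampled)
    {
        return filterSubsampled ? decoded : subsample(decoded, subsampling);
    }
}
```

Either way the caller ends up with an image at the requested subsampling, 
which is the consistency the flag is meant to guarantee. 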

I have also factored the subsampling frequency calculation in PageDrawer out 
into a single method. If a need arises to disable subsampling, this method 
could be amended to always return 1, e.g.: 
{code:java}
private int getSubsampling(PDImage pdImage, AffineTransform at) 
{
    if (noSubsampling) 
    {
        return 1;
    }
    ...
}{code}
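
For readers without the patch at hand, the elided part could look roughly like 
the sketch below. This is an assumption about the general approach (ratio of 
image size to output size), not the patch's actual code. The class name 
SubsamplingEstimate and the plain width/height parameters are hypothetical, and 
the transform is assumed to map the unit square onto the device-space 
rectangle where the image is drawn. 

```java
import java.awt.geom.AffineTransform;

// Hypothetical illustration only; not the code in the attached patch.
final class SubsamplingEstimate
{
    public static int estimate(int imageWidth, int imageHeight, AffineTransform at)
    {
        // Lengths of the transformed unit vectors, i.e. the size in device
        // pixels of the area the image is drawn into (rotation-safe).
        double deviceWidth = Math.hypot(at.getScaleX(), at.getShearY());
        double deviceHeight = Math.hypot(at.getShearX(), at.getScaleY());
        if (deviceWidth <= 0 || deviceHeight <= 0)
        {
            return 1;
        }
        // Use the smaller ratio so neither axis drops below output resolution.
        double ratio = Math.min(imageWidth / deviceWidth, imageHeight / deviceHeight);
        // Never upsample: a ratio below 1 means the image is being enlarged.
        return Math.max(1, (int) Math.floor(ratio));
    }
}
```

For example, a 2000x1000 image drawn into a 500x250 pixel area yields a 
subsampling of 4, while the same image enlarged to 3000x3000 yields 1. 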
 

Another thought that came to mind: maybe PDImageXObject should be allowed 
to cache subsampled versions of the full image (i.e. subsampling==1,2,3,..., 
region==null). Since the size of a subsampled version shrinks with the square 
of the subsampling frequency, caching all of them shouldn't add too much 
memory: the total stays at about ~1.6x the original size, because 
1 + 1/4 + 1/9 + ... converges to roughly 1.64. I have not implemented it for 
now so that the code is easier to follow and reason about, but it is a 
possibility. 
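
The ~1.6x figure can be checked with a few lines of arithmetic. The sketch 
below is purely illustrative (the class name SubsampleCacheCost is 
hypothetical) and assumes that a version subsampled by n takes 1/n^2 of the 
full image's memory and that every level from 1 to maxLevel is cached. 

```java
// Hypothetical illustration only; nothing like this exists in the patch.
final class SubsampleCacheCost
{
    /**
     * Total memory held, relative to the full-size image, when the versions
     * for subsampling 1..maxLevel are all cached at once.
     */
    public static double relativeSize(int maxLevel)
    {
        double total = 0;
        for (int n = 1; n <= maxLevel; n++)
        {
            // A version subsampled by n has 1/n^2 of the original pixels.
            total += 1.0 / ((double) n * n);
        }
        return total;
    }
}
```

Caching levels 1 and 2 gives 1.25x, and the sum never exceeds pi^2/6 (about 
1.645) however many levels are kept. 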

> Allow subsampled/downscaled rendering of images, and rendering subimages 
> -------------------------------------------------------------------------
>
>                 Key: PDFBOX-4137
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4137
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Rendering
>    Affects Versions: 2.0.8
>            Reporter: Itai Shaked
>            Priority: Minor
>         Attachments: 0001-Image-render-subsample.patch, 
> 0001-Image-rendering-subsampling-removed-metadata-options.patch, 067445.pdf, 
> image_rendering_subsampling_hack.patch, large-jpeg.pdf
>
>
> Suggested/contributed change to allow subsampling of images and rendering 
> sub-regions of images.  
> The need arises from having very large images which are highly compressed 
> (usually JPEG or JBIG2). The current implementation decodes the entire image 
> into memory at full resolution, even if rendering is done at a much lower 
> resolution. 
> Since the change required augmenting the way Filters work (to allow 
> partial/subsampled decoding), it also includes a partial fix for PDFBOX-3340. 
>  
>  
> This change introduces "DecodeOptions" which are currently only applicable 
> for images. They include requesting only metadata (for PDImageXObject's 
> repair method), subsampling and sub-region (similar to 
> javax.imageio.ImageReadParam). 
> Since not all filters can or do honor (use) the options, the DecodeOptions 
> class contains a flag. Filters that honor the options (subsample / decode 
> only requested region) set it to true. If the flag is false, the subsampling 
> or cropping should be done after decoding, to ensure consistency. 
> PageDrawer was modified so it uses subsampling based on the ratio of the 
> desired output to the original image. 
>  


