[ https://issues.apache.org/jira/browse/PDFBOX-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385051#comment-16385051 ]
Itai Shaked commented on PDFBOX-4137:
-------------------------------------

I am attaching a revised patch. I have removed the "metadata" portions from the DecodeOptions (maybe it should now be "ImageDecodeOptions"?). I guess it is best if your fix goes in the PDFBOX-3340 discussion for clarity. As I believe it can be implemented without changes to the public API, maybe it can be integrated sooner than the other features?

This solves the issues of the "decodeMeta" name (it is no longer needed) and of Flate-encoded streams (there is no longer an option to not decode).

I have renamed "subsample" to "subsampling", and renamed "honored" to "filterSubsampled" (if the filter subsampled the image all is well, but if it didn't, SampledImageReader would need to do the subsampling).

I have also factored out the subsampling frequency calculation in PageDrawer, so there is now a single method which calculates the subsampling frequency. If a need arises to disable subsampling, this method could be amended to always return 1, i.e.:

{code:java}
private int getSubsampling(PDImage pdImage, AffineTransform at) {
    if (noSubsampling) {
        return 1;
    }
    ...
}
{code}

Another thought that came to mind: maybe PDImageXObject should be allowed to cache subsampled versions of the full image (i.e. subsampling==1,2,3,..., region==null). As subsampled versions go down in size as the square of the subsampling frequency, this shouldn't add too much memory overhead (~1.6x the original size). I have not implemented it for now, so that the code is easier to follow and reason about, but it is a possibility.
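To make the ~1.6x estimate concrete, here is a minimal sketch of the arithmetic. It is not part of the patch: the class and method names are hypothetical, and it assumes a version subsampled by s holds exactly 1/s^2 of the original pixels (in practice the dimensions round up, so real sizes would be slightly larger).

```java
public class SubsampleCacheOverhead {

    /**
     * Combined size of cached versions with subsampling 1..maxSubsampling,
     * expressed as a multiple of the full-resolution image size.
     * Assumes a version subsampled by s takes 1/(s*s) of the original.
     */
    static double relativeSize(int maxSubsampling) {
        double total = 0.0;
        for (int s = 1; s <= maxSubsampling; s++) {
            total += 1.0 / ((double) s * s);
        }
        return total;
    }

    public static void main(String[] args) {
        // Print the cumulative footprint for a few cache depths.
        for (int max : new int[] {1, 2, 4, 8, 64}) {
            System.out.printf("subsampling 1..%d -> %.3fx%n", max, relativeSize(max));
        }
    }
}
```

Under that assumption the sum 1 + 1/4 + 1/9 + ... stays below pi^2/6 (about 1.645) no matter how many subsampled versions are cached, which is where the ~1.6x figure comes from.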
> Allow subsampled/downscaled rendering of images, and rendering subimages
> -------------------------------------------------------------------------
>
>                 Key: PDFBOX-4137
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4137
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Rendering
>    Affects Versions: 2.0.8
>            Reporter: Itai Shaked
>            Priority: Minor
>         Attachments: 0001-Image-render-subsample.patch,
>                      0001-Image-rendering-subsampling-removed-metadata-options.patch,
>                      067445.pdf, image_rendering_subsampling_hack.patch,
>                      large-jpeg.pdf
>
>
> Suggested/contributed change to allow subsampling of images and rendering
> sub-regions of images.
> The need arises from having very large images which are highly compressed
> (usually JPEG or JBIG2). The current implementation decodes the entire image
> into memory at full resolution, even if rendering is done at a much lower
> resolution.
> Since the change required augmenting the way Filters work (to allow
> partial/subsampled decoding), it also includes a partial fix for PDFBOX-3340.
>
> This change introduces "DecodeOptions" which are currently only applicable
> for images. They include requesting only metadata (for PDImageXObject's
> repair method), subsampling and sub-region (similar to
> javax.imageio.ImageReadParam).
> Since not all filters can or do honor (use) the options, the DecodeOptions
> class contains a flag. Filters that honor the options (subsample / decode
> only requested region) set it to true. If the flag is false, the subsampling
> or cropping should be done after decoding, to ensure consistency.
> PageDrawer was modified so it uses subsampling based on the ratio of the
> desired output to the original image.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)