My apologies, it seems the last patch did not include the added file DecodeOptions.java. Attached is a (hopefully) fixed patch.
Itai.

On Thu, Mar 1, 2018 at 1:54 PM, Itai <itai...@gmail.com> wrote:

> Hello,
>
> Following a question asked on pdfbox-users [1], I set about trying to
> allow rendering images at lower resolutions, and additionally rendering
> only parts of images. The need arises from having very large images,
> usually JPEG or JBIG2, which are tens of megabytes in size when compressed,
> but may take up 8 or even more gigabytes when rendered as a BufferedImage
> at full resolution.
> I have come up with a solution that seems to work (passes all of the
> built-in PDFBox tests, and a few manual ones I tried), but since it
> includes some deep changes in the logic I understand if it won't find its
> way into PDFBox.
>
> While working on it, I also came across PDFBOX-3340 [2], and since my hack
> relies on making changes to the way filters work, it includes a (partial)
> fix for that bug too.
>
> Finally, since I'm not well versed in git/github, I'm not sure of the best
> way to share my work. I attach here a unified diff, but let me know if
> there is another preferred method (pull request? clone the repository?)
>
> Following is an explanation/description of my changes, for those
> interested. I would love to hear any feedback, especially for things which
> may increase the likelihood of such a feature being included in future
> versions of PDFBox.
>
> Thanks,
> Itai.
>
> --
>
> As stated, the issue pertains mainly to very large images (lots of pixels)
> which are highly compressed. Since DCTFilter, JBIG2Filter etc. render the
> entire image, I had to augment the way Filter works, to allow it to accept
> options.
> This is where the class DecodeOptions comes in. It has sub-region and
> subsampling options (mirroring those of ImageReadParam), as well as a
> "metadata only" param. When decoding, you may pass DecodeOptions, such that
> image-related filters can downscale or only render a part of the image.
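For readers who have not opened the attached patch, the options object described above can be pictured roughly as follows. This is a minimal sketch and not the DecodeOptions.java from the patch: the class name DecodeOptionsSketch, its constructor, fullImage() and toReadParam() are made up for illustration. Only ImageReadParam and its setters are real javax.imageio API.

```java
import java.awt.Rectangle;
import javax.imageio.ImageReadParam;

// Hypothetical stand-in for the patch's DecodeOptions; all names here are illustrative.
final class DecodeOptionsSketch {
    private final Rectangle sourceRegion; // null means "the whole image"
    private final int subsamplingX;       // keep every n-th column
    private final int subsamplingY;       // keep every n-th row
    private final boolean metadataOnly;   // true: caller only wants the decode parameters
    private boolean honored;              // set by a filter that actually applied the options

    DecodeOptionsSketch(Rectangle sourceRegion, int subsamplingX, int subsamplingY,
                        boolean metadataOnly) {
        if (subsamplingX < 1 || subsamplingY < 1) {
            throw new IllegalArgumentException("subsampling must be >= 1");
        }
        // Defensive copy, because java.awt.Rectangle is mutable.
        this.sourceRegion = sourceRegion == null ? null : new Rectangle(sourceRegion);
        this.subsamplingX = subsamplingX;
        this.subsamplingY = subsamplingY;
        this.metadataOnly = metadataOnly;
    }

    // Returns a fresh "decode everything" instance on every call,
    // so the mutable honored flag is never shared between decodes.
    static DecodeOptionsSketch fullImage() {
        return new DecodeOptionsSketch(null, 1, 1, false);
    }

    // Translates the options into the parameter object accepted by
    // ImageReader#read(int, ImageReadParam); subsampling offsets are fixed at 0 here.
    ImageReadParam toReadParam() {
        ImageReadParam param = new ImageReadParam();
        param.setSourceRegion(sourceRegion);
        param.setSourceSubsampling(subsamplingX, subsamplingY, 0, 0);
        return param;
    }

    boolean isMetadataOnly() { return metadataOnly; }

    boolean isHonored() { return honored; }

    void setHonored(boolean honored) { this.honored = honored; }
}
```

The sketch hands out a fresh instance from fullImage() rather than a shared constant, since a mutable "honored" flag on a shared instance could carry state from one decode to the next. Whether that matters for the patch depends on how DecodeOptions.DEFAULT is implemented there.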
> The "metadata-only" option is used for the `repair` method of
> PDImageXObject, as it only really needs the DecodeResult - where applicable
> and possible, a filter encountering this option will not decode the stream,
> only set the DecodeResult parameters (this is not always possible, e.g. for
> JPXFilter, which must decode the image to get the parameters).
>
> The DecodeOptions also has an "honored" flag, which the filter sets to
> true if it honored the options - this is needed because when decoding an
> image stored in a Flate or LZW stream, the filter doesn't know the image
> format (or does it? I couldn't find a simple way of telling), so it can't
> make sense of subsampling or partial render options. SampledImageReader
> checks this flag, and if it is not set to true it does the subsampling by
> itself.
>
> This allows the addition of a method in PDImage
>
>     BufferedImage getImage(Rectangle region, int subsample) throws IOException;
>
> The result of which is not cached, as it is not "canonical".
> When drawing an image, PDPageDrawer calculates a subsampling factor based
> on the desired size:
>
>     int subsample = (int)Math.floor(pdImage.getWidth()/at.getScaleX());
>     if (subsample<1) subsample = 1;
>     if (subsample>8) subsample = 8;
>     drawBufferedImage(pdImage.getImage(null, subsample), at);
>
> Such that if e.g. the image should be drawn at 0.5 times its pixel-size,
> it will be subsampled at 2-pixel intervals.
>
> SampledImageReader issues the corresponding DecodeOptions to
> PDImage#createInputStream when rendering, and if the "honored" flag is not
> set, it does its own sub-sampling and partial rendering.
>
> I realize most/all of those optimizations won't work for raw, Flate or LZW
> encoded images, but presumably those won't be too large in the first place.
> Also, this has little to no benefit for PDInlineImage, but as it already
> holds all of its raw data I assume little optimization is possible.
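The clamping rule and the fallback path for filters that leave the "honored" flag unset can be sketched as below. This is only an illustration under stated assumptions: SubsampleSketch, subsampleFactor() and subsample() are hypothetical names, at.getScaleX() is taken to be the drawn width in device pixels (as the quoted snippet implies), and the nearest-pixel loop merely stands in for whatever SampledImageReader does in the patch.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper; not part of PDFBox or of the attached patch.
final class SubsampleSketch {

    // Same arithmetic as the quoted PDPageDrawer snippet: the ratio of stored
    // width to drawn width, rounded down and clamped to the range 1..8.
    static int subsampleFactor(int imageWidthPx, double drawnWidthPx) {
        int factor = (int) Math.floor(imageWidthPx / drawnWidthPx);
        return Math.max(1, Math.min(8, factor));
    }

    // Fallback for an already fully decoded image: crop to the region
    // (null means the whole image) and keep every factor-th pixel in x and y.
    static BufferedImage subsample(BufferedImage src, Rectangle region, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        Rectangle full = new Rectangle(0, 0, src.getWidth(), src.getHeight());
        Rectangle clip = region == null ? full : region.intersection(full);
        if (clip.isEmpty()) {
            throw new IllegalArgumentException("region lies outside the image");
        }
        // Round up so a partial last step still yields a pixel.
        int w = (clip.width + factor - 1) / factor;
        int h = (clip.height + factor - 1) / factor;
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                dst.setRGB(x, y, src.getRGB(clip.x + x * factor, clip.y + y * factor));
            }
        }
        return dst;
    }
}
```

For example, subsampleFactor(4000, 2000.0) gives 2, matching the "0.5 times its pixel-size" case in the quoted text. A fallback of this kind only shrinks the result: the full image has already been decoded by the time it runs, so the memory savings come from the filters that honor the options themselves.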
>
> In general, this hack allowed me to speed up rendering of some files by
> significant margins (20%-80%, depending on size and desired DPI), and
> significantly lower the memory footprint if only a lower-res render is
> required, or rendering of small regions of the image.
>
> --
>
> [1]: https://lists.apache.org/thread.html/6b396e3d8bfc4ed44bcadf37881035d7447fb711253ef962f187455c@%3Cusers.pdfbox.apache.org%3E
> [2]: https://issues.apache.org/jira/browse/PDFBOX-3340
>
Index: pdfbox/src/main/java/org/apache/pdfbox/filter/JPXFilter.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/filter/JPXFilter.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/filter/JPXFilter.java (revision ) @@ -24,9 +24,11 @@ import java.io.IOException; import java.io.InputStream; import java.io.OutputStream; +import javax.imageio.ImageReadParam; import javax.imageio.ImageReader; import javax.imageio.stream.ImageInputStream; import javax.imageio.stream.MemoryCacheImageInputStream; + import org.apache.pdfbox.cos.COSDictionary; import org.apache.pdfbox.cos.COSName; import org.apache.pdfbox.pdmodel.graphics.color.PDJPXColorSpace; @@ -34,12 +36,12 @@ /** * Decompress data encoded using the wavelet-based JPEG 2000 standard, * reproducing the original data. - * + * <p> * Requires the Java Advanced Imaging (JAI) Image I/O Tools to be installed from java.net, see * <a href="http://download.java.net/media/jai-imageio/builds/release/1.1/">jai-imageio</a>. * Alternatively you can build from the source available in the * <a href="https://java.net/projects/jai-imageio-core/">jai-imageio-core svn repo</a>. - * + * <p> * Mac OS X users should download the tar.gz file for linux and unpack it to obtain the * required jar files. The .so file can be safely ignored. 
* @@ -49,12 +51,17 @@ public final class JPXFilter extends Filter { @Override - public DecodeResult decode(InputStream encoded, OutputStream decoded, - COSDictionary parameters, int index) throws IOException + public DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary + parameters, int index, DecodeOptions options) throws IOException { DecodeResult result = new DecodeResult(new COSDictionary()); result.getParameters().addAll(parameters); - BufferedImage image = readJPX(encoded, result); + BufferedImage image = readJPX(encoded, options, result); + + if (options.isMetadataOnly()) + { + return result; + } WritableRaster raster = image.getRaster(); switch (raster.getDataBuffer().getDataType()) @@ -74,25 +81,39 @@ return result; default: - throw new IOException("Data type " + raster.getDataBuffer().getDataType() + " not implemented"); - } + throw new IOException("Data type " + raster.getDataBuffer().getDataType() + " not" + + " implemented"); + } } + + @Override + public DecodeResult decode(InputStream encoded, OutputStream decoded, + COSDictionary parameters, int index) throws IOException + { + return decode(encoded, decoded, parameters, index, DecodeOptions.DEFAULT); + } // try to read using JAI Image I/O - private BufferedImage readJPX(InputStream input, DecodeResult result) throws IOException + private BufferedImage readJPX(InputStream input, DecodeOptions options, DecodeResult result) + throws IOException { - ImageReader reader = findImageReader("JPEG2000", "Java Advanced Imaging (JAI) Image I/O Tools are not installed"); + ImageReader reader = findImageReader("JPEG2000", "Java Advanced Imaging (JAI) Image I/O " + + "Tools are not installed"); // PDFBOX-4121: ImageIO.createImageInputStream() is much slower try (ImageInputStream iis = new MemoryCacheImageInputStream(input)) { reader.setInput(iis, true, true); + ImageReadParam irp = reader.getDefaultReadParam(); + irp.setSourceRegion(options.getSourceRegion()); + 
irp.setSourceSubsampling(options.getSubsamplingX(), options.getSubsamplingY(), + options.getSubsamplingOffsetX(), options.getSubsamplingOffsetY()); + options.setHonored(true); BufferedImage image; try { - image = reader.read(0); - } - catch (Exception e) + image = reader.read(0, irp); + } catch (Exception e) { // wrap and rethrow any exceptions throw new IOException("Could not read JPEG 2000 (JPX) image", e); @@ -114,8 +135,8 @@ } // override dimensions, see PDFBOX-1735 - parameters.setInt(COSName.WIDTH, image.getWidth()); - parameters.setInt(COSName.HEIGHT, image.getHeight()); + parameters.setInt(COSName.WIDTH, reader.getWidth(0)); + parameters.setInt(COSName.HEIGHT, reader.getHeight(0)); // extract embedded color space if (!parameters.containsKey(COSName.COLORSPACE)) @@ -124,8 +145,7 @@ } return image; - } - finally + } finally { reader.dispose(); } Index: pdfbox/src/main/java/org/apache/pdfbox/filter/DCTFilter.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/filter/DCTFilter.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/filter/DCTFilter.java (revision ) @@ -26,6 +26,7 @@ import javax.imageio.IIOException; import javax.imageio.ImageIO; +import javax.imageio.ImageReadParam; import javax.imageio.ImageReader; import javax.imageio.metadata.IIOMetadata; import javax.imageio.metadata.IIOMetadataNode; @@ -51,10 +52,15 @@ private static final String ADOBE = "Adobe"; @Override - public DecodeResult decode(InputStream encoded, OutputStream decoded, - COSDictionary parameters, int index) throws IOException + public DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary + parameters, int index, DecodeOptions options) throws IOException { - ImageReader reader = findImageReader("JPEG", "a suitable JAI I/O image filter is not installed"); + if 
(options.isMetadataOnly()) + { + return new DecodeResult(parameters); + } + ImageReader reader = findImageReader("JPEG", "a suitable JAI I/O image filter is not " + + "installed"); try (ImageInputStream iis = ImageIO.createImageInputStream(encoded)) { @@ -63,9 +69,15 @@ { iis.seek(0); } - + reader.setInput(iis); - + ImageReadParam irp = reader.getDefaultReadParam(); + irp.setSourceSubsampling(options.getSubsamplingX(), options.getSubsamplingY(), + options.getSubsamplingOffsetX(), options.getSubsamplingOffsetY()); + irp.setSourceRegion(options.getSourceRegion()); + options.setHonored(true); + + String numChannels = getNumChannels(reader); // get the raster using horrible JAI workarounds @@ -73,29 +85,29 @@ Raster raster; // Strategy: use read() for RGB or "can't get metadata" - // use readRaster() for CMYK and gray and as fallback if read() fails + // use readRaster() for CMYK and gray and as fallback if read() fails // after "can't get metadata" because "no meta" file was CMYK if ("3".equals(numChannels) || numChannels.isEmpty()) { try { - // I'd like to use ImageReader#readRaster but it is buggy and can't read RGB correctly - BufferedImage image = reader.read(0); + // I'd like to use ImageReader#readRaster but it is buggy and can't read RGB + // correctly + BufferedImage image = reader.read(0, irp); raster = image.getRaster(); - } - catch (IIOException e) + } catch (IIOException e) { // JAI can't read CMYK JPEGs using ImageReader#read or ImageIO.read but // fortunately ImageReader#readRaster isn't buggy when reading 4-channel files - LOG.debug("Couldn't read use read() for RGB image - using readRaster() as fallback", e); - raster = reader.readRaster(0, null); + LOG.debug("Couldn't read use read() for RGB image - using readRaster() as " + + "fallback", e); + raster = reader.readRaster(0, irp); } - } - else + } else { // JAI can't read CMYK JPEGs using ImageReader#read or ImageIO.read but // fortunately ImageReader#readRaster isn't buggy when reading 4-channel files 
- raster = reader.readRaster(0, null); + raster = reader.readRaster(0, irp); } // special handling for 4-component images @@ -106,11 +118,11 @@ try { transform = getAdobeTransform(reader.getImageMetadata(0)); - } - catch (IIOException | NegativeArraySizeException e) + } catch (IIOException | NegativeArraySizeException e) { // we really tried asking nicely, now we're using brute force. - LOG.debug("Couldn't read usíng getAdobeTransform() - using getAdobeTransformByBruteForce() as fallback", e); + LOG.debug("Couldn't read usíng getAdobeTransform() - using " + + "getAdobeTransformByBruteForce() as fallback", e); transform = getAdobeTransformByBruteForce(iis); } int colorTransform = transform != null ? transform : 0; @@ -130,28 +142,33 @@ default: throw new IllegalArgumentException("Unknown colorTransform"); } - } - else if (raster.getNumBands() == 3) + } else if (raster.getNumBands() == 3) { // BGR to RGB raster = fromBGRtoRGB(raster); } - DataBufferByte dataBuffer = (DataBufferByte)raster.getDataBuffer(); + DataBufferByte dataBuffer = (DataBufferByte) raster.getDataBuffer(); decoded.write(dataBuffer.getData()); - } - finally + } finally { reader.dispose(); } return new DecodeResult(parameters); } + @Override + public DecodeResult decode(InputStream encoded, OutputStream decoded, + COSDictionary parameters, int index) throws IOException + { + return decode(encoded, decoded, parameters, index, DecodeOptions.DEFAULT); + } + // reads the APP14 Adobe transform tag and returns its value, or 0 if unknown private Integer getAdobeTransform(IIOMetadata metadata) { - Element tree = (Element)metadata.getAsTree("javax_imageio_jpeg_image_1.0"); - Element markerSequence = (Element)tree.getElementsByTagName("markerSequence").item(0); + Element tree = (Element) metadata.getAsTree("javax_imageio_jpeg_image_1.0"); + Element markerSequence = (Element) tree.getElementsByTagName("markerSequence").item(0); NodeList app14AdobeNodeList = markerSequence.getElementsByTagName("app14Adobe"); if 
(app14AdobeNodeList != null && app14AdobeNodeList.getLength() > 0) { @@ -160,7 +177,7 @@ } return 0; } - + // See in https://github.com/haraldk/TwelveMonkeys // com.twelvemonkeys.imageio.plugins.jpeg.AdobeDCT class for structure of APP14 segment private int getAdobeTransformByBruteForce(ImageInputStream iis) throws IOException @@ -196,8 +213,7 @@ return app14[POS_TRANSFORM]; } } - } - else + } else { a = 0; } @@ -239,7 +255,7 @@ value[0] = cyan; value[1] = magenta; value[2] = yellow; - value[3] = (int)K; + value[3] = (int) K; writableRaster.setPixel(x, y, value); } } @@ -264,9 +280,10 @@ float K = value[3]; // YCbCr to RGB, see http://www.equasys.de/colorconversion.html - int r = clamp( (1.164f * (Y-16)) + (1.596f * (Cr - 128)) ); - int g = clamp( (1.164f * (Y-16)) + (-0.392f * (Cb-128)) + (-0.813f * (Cr-128))); - int b = clamp( (1.164f * (Y-16)) + (2.017f * (Cb-128))); + int r = clamp((1.164f * (Y - 16)) + (1.596f * (Cr - 128))); + int g = clamp((1.164f * (Y - 16)) + (-0.392f * (Cb - 128)) + (-0.813f * (Cr - + 128))); + int b = clamp((1.164f * (Y - 16)) + (2.017f * (Cb - 128))); // naive RGB to CMYK int cyan = 255 - r; @@ -277,7 +294,7 @@ value[0] = cyan; value[1] = magenta; value[2] = yellow; - value[3] = (int)K; + value[3] = (int) K; writableRaster.setPixel(x, y, value); } } @@ -307,8 +324,9 @@ } return writableRaster; } - - // returns the number of channels as a string, or an empty string if there is an error getting the meta data + + // returns the number of channels as a string, or an empty string if there is an error + // getting the meta data private String getNumChannels(ImageReader reader) { try @@ -318,25 +336,26 @@ { return ""; } - IIOMetadataNode metaTree = (IIOMetadataNode) imageMetadata.getAsTree("javax_imageio_1.0"); - Element numChannelsItem = (Element) metaTree.getElementsByTagName("NumChannels").item(0); + IIOMetadataNode metaTree = (IIOMetadataNode) imageMetadata.getAsTree + ("javax_imageio_1.0"); + Element numChannelsItem = (Element) 
metaTree.getElementsByTagName("NumChannels").item + (0); if (numChannelsItem == null) { return ""; } return numChannelsItem.getAttribute("value"); - } - catch (IOException | NegativeArraySizeException e) + } catch (IOException | NegativeArraySizeException e) { LOG.debug("Couldn't read metadata - returning empty string", e); return ""; } - } + } // clamps value to 0-255 range private int clamp(float value) { - return (int)((value < 0) ? 0 : ((value > 255) ? 255 : value)); + return (int) ((value < 0) ? 0 : ((value > 255) ? 255 : value)); } @Override Index: pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImageXObject.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImageXObject.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImageXObject.java (revision ) @@ -16,9 +16,7 @@ */ package org.apache.pdfbox.pdmodel.graphics.image; -import java.awt.Graphics2D; -import java.awt.Paint; -import java.awt.RenderingHints; +import java.awt.*; import java.awt.image.BufferedImage; import java.awt.image.WritableRaster; import java.io.BufferedInputStream; @@ -31,6 +29,7 @@ import java.lang.ref.SoftReference; import java.util.List; import javax.imageio.ImageIO; + import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; import org.apache.pdfbox.cos.COSArray; @@ -39,6 +38,8 @@ import org.apache.pdfbox.cos.COSName; import org.apache.pdfbox.cos.COSObject; import org.apache.pdfbox.cos.COSStream; +import org.apache.pdfbox.filter.DecodeOptions; +import org.apache.pdfbox.filter.DecodeResult; import org.apache.pdfbox.io.IOUtils; import org.apache.pdfbox.pdmodel.PDDocument; import org.apache.pdfbox.pdmodel.PDResources; @@ -73,7 +74,8 @@ /** * Creates an Image XObject in the given document. 
This constructor is for internal PDFBox use - * and is not for PDF generation. Users who want to create images should look at {@link #createFromFileByExtension(File, PDDocument) + * and is not for PDF generation. Users who want to create images should look at {@link + * #createFromFileByExtension(File, PDDocument) * }. * * @param document the current document @@ -89,18 +91,18 @@ * constructor is for internal PDFBox use and is not for PDF generation. Users who want to * create images should look at {@link #createFromFileByExtension(File, PDDocument) }. * - * @param document the current document - * @param encodedStream an encoded stream of image data - * @param cosFilter the filter or a COSArray of filters - * @param width the image width - * @param height the image height + * @param document the current document + * @param encodedStream an encoded stream of image data + * @param cosFilter the filter or a COSArray of filters + * @param width the image width + * @param height the image height * @param bitsPerComponent the bits per component - * @param initColorSpace the color space + * @param initColorSpace the color space * @throws IOException if there is an error creating the XObject. */ - public PDImageXObject(PDDocument document, InputStream encodedStream, - COSBase cosFilter, int width, int height, int bitsPerComponent, - PDColorSpace initColorSpace) throws IOException + public PDImageXObject(PDDocument document, InputStream encodedStream, + COSBase cosFilter, int width, int height, int bitsPerComponent, + PDColorSpace initColorSpace) throws IOException { super(createRawStream(document, encodedStream), COSName.IMAGE); getCOSObject().setItem(COSName.FILTER, cosFilter); @@ -117,25 +119,26 @@ * constructor is for internal PDFBox use and is not for PDF generation. Users who want to * create images should look at {@link #createFromFileByExtension(File, PDDocument) }. 
* - * @param stream the XObject stream to read + * @param stream the XObject stream to read * @param resources the current resources * @throws java.io.IOException if there is an error creating the XObject. */ public PDImageXObject(PDStream stream, PDResources resources) throws IOException { - this(stream, resources, stream.createInputStream()); + this(stream, resources, stream.decode()); } - + // repairs parameters using decode result - private PDImageXObject(PDStream stream, PDResources resources, COSInputStream input) + private PDImageXObject(PDStream stream, PDResources resources, DecodeResult decodeResult) { - super(repair(stream, input), COSName.IMAGE); + super(repair(stream, decodeResult), COSName.IMAGE); this.resources = resources; - this.colorSpace = input.getDecodeResult().getJPXColorSpace(); + this.colorSpace = decodeResult.getJPXColorSpace(); } /** * Creates a thumbnail Image XObject from the given COSBase and name. + * * @param cosStream the COS stream * @return an XObject * @throws IOException if there is an error creating the XObject. @@ -162,14 +165,15 @@ } /** - * Create a PDImageXObject from an image file, see {@link #createFromFileByExtension(File, PDDocument)} for + * Create a PDImageXObject from an image file, see + * {@link #createFromFileByExtension(File, PDDocument)} for * more details. * * @param imagePath the image file path. - * @param doc the document that shall use this PDImageXObject. + * @param doc the document that shall use this PDImageXObject. * @return a PDImageXObject. * @throws IOException if there is an error when reading the file or creating the - * PDImageXObject, or if the image type is not supported. + * PDImageXObject, or if the image type is not supported. */ public static PDImageXObject createFromFile(String imagePath, PDDocument doc) throws IOException { @@ -185,13 +189,14 @@ * PDImageXObject from a BufferedImage). * * @param file the image file. - * @param doc the document that shall use this PDImageXObject. 
+ * @param doc the document that shall use this PDImageXObject. * @return a PDImageXObject. - * @throws IOException if there is an error when reading the file or creating the - * PDImageXObject. + * @throws IOException if there is an error when reading the file or creating the + * PDImageXObject. * @throws IllegalArgumentException if the image type is not supported. */ - public static PDImageXObject createFromFileByExtension(File file, PDDocument doc) throws IOException + public static PDImageXObject createFromFileByExtension(File file, PDDocument doc) throws + IOException { String name = file.getName(); int dot = file.getName().lastIndexOf('.'); @@ -228,20 +233,21 @@ * PDImageXObject from a BufferedImage). * * @param file the image file. - * @param doc the document that shall use this PDImageXObject. + * @param doc the document that shall use this PDImageXObject. * @return a PDImageXObject. - * @throws IOException if there is an error when reading the file or creating the - * PDImageXObject. + * @throws IOException if there is an error when reading the file or creating the + * PDImageXObject. * @throws IllegalArgumentException if the image type is not supported. 
*/ - public static PDImageXObject createFromFileByContent(File file, PDDocument doc) throws IOException + public static PDImageXObject createFromFileByContent(File file, PDDocument doc) throws + IOException { FileType fileType = null; - try (BufferedInputStream bufferedInputStream = new BufferedInputStream(new FileInputStream(file))) + try (BufferedInputStream bufferedInputStream = new BufferedInputStream(new + FileInputStream(file))) { fileType = FileTypeDetector.detectFileType(bufferedInputStream); - } - catch (IOException e) + } catch (IOException e) { throw new IOException("Could not determine file type: " + file.getName(), e); } @@ -261,7 +267,8 @@ { return CCITTFactory.createFromFile(doc, file); } - if (fileType.equals(FileType.BMP) || fileType.equals(FileType.GIF) || fileType.equals(FileType.PNG)) + if (fileType.equals(FileType.BMP) || fileType.equals(FileType.GIF) || fileType.equals + (FileType.PNG)) { BufferedImage bim = ImageIO.read(file); return LosslessFactory.createFromImage(doc, bim); @@ -278,21 +285,21 @@ * PDImageXObject from a BufferedImage). * * @param byteArray bytes from an image file. - * @param document the document that shall use this PDImageXObject. - * @param name name of image file for exception messages, can be null. + * @param document the document that shall use this PDImageXObject. + * @param name name of image file for exception messages, can be null. * @return a PDImageXObject. - * @throws IOException if there is an error when reading the file or creating the - * PDImageXObject. + * @throws IOException if there is an error when reading the file or creating the + * PDImageXObject. * @throws IllegalArgumentException if the image type is not supported. 
*/ - public static PDImageXObject createFromByteArray(PDDocument document, byte[] byteArray, String name) throws IOException + public static PDImageXObject createFromByteArray(PDDocument document, byte[] byteArray, + String name) throws IOException { FileType fileType; try { fileType = FileTypeDetector.detectFileType(byteArray); - } - catch (IOException e) + } catch (IOException e) { throw new IOException("Could not determine file type: " + name, e); } @@ -309,7 +316,8 @@ { return CCITTFactory.createFromByteArray(document, byteArray); } - if (fileType.equals(FileType.BMP) || fileType.equals(FileType.GIF) || fileType.equals(FileType.PNG)) + if (fileType.equals(FileType.BMP) || fileType.equals(FileType.GIF) || fileType.equals + (FileType.PNG)) { ByteArrayInputStream bais = new ByteArrayInputStream(byteArray); BufferedImage bim = ImageIO.read(bais); @@ -319,14 +327,15 @@ } // repairs parameters using decode result - private static PDStream repair(PDStream stream, COSInputStream input) + private static PDStream repair(PDStream stream, DecodeResult decodeResult) { - stream.getCOSObject().addAll(input.getDecodeResult().getParameters()); + stream.getCOSObject().addAll(decodeResult.getParameters()); return stream; } /** * Returns the metadata associated with this XObject, or null if there is none. + * * @return the metadata associated with this object. */ public PDMetadata getMetadata() @@ -341,6 +350,7 @@ /** * Sets the metadata associated with this XObject, or null if there is none. + * * @param meta the metadata associated with this object */ public void setMetadata(PDMetadata meta) @@ -350,6 +360,7 @@ /** * Returns the key of this XObject in the structural parent tree. + * * @return this object's key the structural parent tree */ public int getStructParent() @@ -359,6 +370,7 @@ /** * Sets the key of this XObject in the structural parent tree. 
+ * * @param key the new key for this XObject */ public void setStructParent(int key) @@ -381,17 +393,25 @@ return cached; } } + BufferedImage image = getImage(null, 1); + cachedImage = new SoftReference<>(image); + return image; + } + @Override + public BufferedImage getImage(Rectangle region, int subsample) throws IOException + { // get image as RGB - BufferedImage image = SampledImageReader.getRGBImage(this, getColorKeyMask()); + BufferedImage image = SampledImageReader.getRGBImage(this, region, subsample, + getColorKeyMask()); + // soft mask (overrides explicit mask) PDImageXObject softMask = getSoftMask(); if (softMask != null) { image = applyMask(image, softMask.getOpaqueImage(), true); - } - else + } else { // explicit mask - to be applied only if /ImageMask true PDImageXObject mask = getMask(); @@ -401,9 +421,10 @@ } } - cachedImage = new SoftReference<>(image); return image; + } + /** * {@inheritDoc} @@ -422,6 +443,7 @@ /** * Returns an RGB buffered image containing the opaque image stream without any masks applied. * If this Image XObject is a mask then the buffered image will contain the raw mask. 
+ * * @return the image without any masks applied * @throws IOException if the image cannot be read */ @@ -447,8 +469,7 @@ if (mask.getWidth() < width || mask.getHeight() < height) { mask = scaleImage(mask, width, height); - } - else if (mask.getWidth() > width || mask.getHeight() > height) + } else if (mask.getWidth() > width || mask.getHeight() > height) { width = mask.getWidth(); height = mask.getHeight(); @@ -473,13 +494,12 @@ rgba[0] = rgb[0]; rgba[1] = rgb[1]; rgba[2] = rgb[2]; - + alphaPixel = alpha.getPixel(x, y, alphaPixel); if (isSoft) { rgba[3] = alphaPixel[0]; - } - else + } else { rgba[3] = 255 - alphaPixel[0]; } @@ -499,9 +519,9 @@ BufferedImage image2 = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB); Graphics2D g = image2.createGraphics(); g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, - RenderingHints.VALUE_INTERPOLATION_BICUBIC); + RenderingHints.VALUE_INTERPOLATION_BICUBIC); g.setRenderingHint(RenderingHints.KEY_RENDERING, - RenderingHints.VALUE_RENDER_QUALITY); + RenderingHints.VALUE_RENDER_QUALITY); g.drawImage(image, 0, 0, width, height, 0, 0, image.getWidth(), image.getHeight(), null); g.dispose(); return image2; @@ -509,6 +529,7 @@ /** * Returns the Mask Image XObject associated with this image, or null if there is none. + * * @return Mask Image XObject * @throws java.io.IOException */ @@ -519,8 +540,7 @@ { // color key mask, no explicit mask to return return null; - } - else + } else { COSStream cosStream = (COSStream) getCOSObject().getDictionaryObject(COSName.MASK); if (cosStream != null) @@ -534,6 +554,7 @@ /** * Returns the color key mask array associated with this image, or null if there is none. 
+ * * @return Mask Image XObject */ public COSArray getColorKeyMask() @@ -541,13 +562,14 @@ COSBase mask = getCOSObject().getDictionaryObject(COSName.MASK); if (mask instanceof COSArray) { - return (COSArray)mask; + return (COSArray) mask; } return null; } /** * Returns the Soft Mask Image XObject associated with this image, or null if there is none. + * * @return the SMask Image XObject, or null. * @throws java.io.IOException */ @@ -568,8 +590,7 @@ if (isStencil()) { return 1; - } - else + } else { return getCOSObject().getInt(COSName.BITS_PER_COMPONENT, COSName.BPC); } @@ -607,13 +628,11 @@ { resources.getResourceCache().put(indirect, colorSpace); } - } - else if (isStencil()) + } else if (isStencil()) { // stencil mask color space must be gray, it is often missing return PDDeviceGray.INSTANCE; - } - else + } else { // an image without a color space is always broken throw new IOException("could not determine color space"); @@ -628,6 +647,12 @@ return getStream().createInputStream(); } + @Override + public InputStream createInputStream(DecodeOptions options) throws IOException + { + return getStream().createInputStream(options); + } + @Override public InputStream createInputStream(List<String> stopFilters) throws IOException { @@ -713,6 +738,7 @@ /** * This will get the suffix for this image type, e.g. jpg/png. + * * @return The image suffix or null if not available. 
*/ @Override @@ -723,30 +749,24 @@ if (filters == null) { return "png"; - } - else if (filters.contains(COSName.DCT_DECODE)) + } else if (filters.contains(COSName.DCT_DECODE)) { return "jpg"; - } - else if (filters.contains(COSName.JPX_DECODE)) + } else if (filters.contains(COSName.JPX_DECODE)) { return "jpx"; - } - else if (filters.contains(COSName.CCITTFAX_DECODE)) + } else if (filters.contains(COSName.CCITTFAX_DECODE)) { return "tiff"; - } - else if (filters.contains(COSName.FLATE_DECODE) + } else if (filters.contains(COSName.FLATE_DECODE) || filters.contains(COSName.LZW_DECODE) || filters.contains(COSName.RUN_LENGTH_DECODE)) { return "png"; - } - else if (filters.contains(COSName.JBIG2_DECODE)) + } else if (filters.contains(COSName.JBIG2_DECODE)) { return "jb2"; - } - else + } else { LOG.warn("getSuffix() returns null, filters: " + filters); return null; Index: pdfbox/src/test/java/org/apache/pdfbox/pdmodel/common/PDStreamTest.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/test/java/org/apache/pdfbox/pdmodel/common/PDStreamTest.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/test/java/org/apache/pdfbox/pdmodel/common/PDStreamTest.java (revision ) @@ -91,7 +91,7 @@ PDStream pdStream = new PDStream(doc, is, new COSArray()); Assert.assertEquals(0, pdStream.getFilters().size()); - is = pdStream.createInputStream(null); + is = pdStream.createInputStream((List<String>)null); Assert.assertEquals(12, is.read()); Assert.assertEquals(34, is.read()); Assert.assertEquals(56, is.read()); Index: pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java (revision 
fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java (revision ) @@ -32,6 +32,8 @@ import org.apache.pdfbox.cos.COSName; import org.apache.pdfbox.cos.COSNull; import org.apache.pdfbox.cos.COSStream; +import org.apache.pdfbox.filter.DecodeOptions; +import org.apache.pdfbox.filter.DecodeResult; import org.apache.pdfbox.filter.Filter; import org.apache.pdfbox.filter.FilterFactory; import org.apache.pdfbox.io.IOUtils; @@ -229,6 +231,15 @@ return stream.createInputStream(); } + public COSInputStream createInputStream(DecodeOptions options) throws IOException + { + return stream.createInputStream(options); + } + + public DecodeResult decode() throws IOException { + return stream.decode(); + } + /** * This will get a stream with some filters applied but not others. This is * useful when doing images, ie filters = [flate,dct], we want to remove Index: pdfbox/src/main/java/org/apache/pdfbox/filter/FlateFilter.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/filter/FlateFilter.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/filter/FlateFilter.java (revision ) @@ -25,6 +25,7 @@ import java.util.zip.Deflater; import java.util.zip.DeflaterOutputStream; import java.util.zip.Inflater; + import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; import org.apache.pdfbox.cos.COSDictionary; @@ -43,9 +44,13 @@ private static final int BUFFER_SIZE = 16348; @Override - public DecodeResult decode(InputStream encoded, OutputStream decoded, - COSDictionary parameters, int index) throws IOException + public DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary + parameters, int index, DecodeOptions options) throws IOException { + if (options.isMetadataOnly()) + { + return new 
DecodeResult(parameters); + } final COSDictionary decodeParams = getDecodeParams(parameters, index); int predictor = decodeParams.getInt(COSName.PREDICTOR); @@ -63,13 +68,11 @@ decoded.flush(); baos.reset(); bais.reset(); - } - else + } else { decompress(encoded, decoded); } - } - catch (DataFormatException e) + } catch (DataFormatException e) { // if the stream is corrupt a DataFormatException may occur LOG.error("FlateFilter: stop reading corrupt stream due to a DataFormatException"); @@ -80,60 +83,67 @@ return new DecodeResult(parameters); } + @Override + public DecodeResult decode(InputStream encoded, OutputStream decoded, + COSDictionary parameters, int index) throws IOException + { + return decode(encoded, decoded, parameters, index, DecodeOptions.DEFAULT); + } + // Use Inflater instead of InflateInputStream to avoid an EOFException due to a probably // missing Z_STREAM_END, see PDFBOX-1232 for details - private void decompress(InputStream in, OutputStream out) throws IOException, DataFormatException - { + private void decompress(InputStream in, OutputStream out) throws IOException, + DataFormatException + { byte[] buf = new byte[2048]; // skip zlib header - in.read(buf,0,2); - int read = in.read(buf); - if (read > 0) - { + in.read(buf, 0, 2); + int read = in.read(buf); + if (read > 0) + { // use nowrap mode to bypass zlib-header and checksum to avoid a DataFormatException - Inflater inflater = new Inflater(true); - inflater.setInput(buf,0,read); - byte[] res = new byte[1024]; + Inflater inflater = new Inflater(true); + inflater.setInput(buf, 0, read); + byte[] res = new byte[1024]; boolean dataWritten = false; - while (true) - { + while (true) + { int resRead = 0; try { resRead = inflater.inflate(res); - } - catch(DataFormatException exception) + } catch (DataFormatException exception) { if (dataWritten) { // some data could be read -> don't throw an exception - LOG.warn("FlateFilter: premature end of stream due to a DataFormatException"); + 
LOG.warn("FlateFilter: premature end of stream due to a " + + "DataFormatException"); break; - } - else + } else { // nothing could be read -> re-throw exception throw exception; } } - if (resRead != 0) - { - out.write(res,0,resRead); + if (resRead != 0) + { + out.write(res, 0, resRead); dataWritten = true; - continue; - } - if (inflater.finished() || inflater.needsDictionary() || in.available() == 0) + continue; + } + if (inflater.finished() || inflater.needsDictionary() || in.available() == 0) { break; - } - read = in.read(buf); - inflater.setInput(buf,0,read); + } + read = in.read(buf); + inflater.setInput(buf, 0, read); } inflater.end(); } out.flush(); } - + @Override protected void encode(InputStream input, OutputStream encoded, COSDictionary parameters) throws IOException @@ -141,22 +151,22 @@ int compressionLevel = Deflater.DEFAULT_COMPRESSION; try { - compressionLevel = Integer.parseInt(System.getProperty(Filter.SYSPROP_DEFLATELEVEL, "-1")); - } - catch (NumberFormatException ex) + compressionLevel = Integer.parseInt(System.getProperty(Filter.SYSPROP_DEFLATELEVEL, + "-1")); + } catch (NumberFormatException ex) { LOG.warn(ex.getMessage(), ex); } compressionLevel = Math.max(-1, Math.min(Deflater.BEST_COMPRESSION, compressionLevel)); Deflater deflater = new Deflater(compressionLevel); - try (DeflaterOutputStream out = new DeflaterOutputStream(encoded,deflater)) + try (DeflaterOutputStream out = new DeflaterOutputStream(encoded, deflater)) { int amountRead; int mayRead = input.available(); if (mayRead > 0) { - byte[] buffer = new byte[Math.min(mayRead,BUFFER_SIZE)]; - while ((amountRead = input.read(buffer, 0, Math.min(mayRead,BUFFER_SIZE))) != -1) + byte[] buffer = new byte[Math.min(mayRead, BUFFER_SIZE)]; + while ((amountRead = input.read(buffer, 0, Math.min(mayRead, BUFFER_SIZE))) != -1) { out.write(buffer, 0, amountRead); } Index: pdfbox/src/main/java/org/apache/pdfbox/filter/LZWFilter.java IDEA additional info: Subsystem: 
com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/filter/LZWFilter.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/filter/LZWFilter.java (revision ) @@ -34,7 +34,6 @@ import org.apache.pdfbox.cos.COSName; /** - * * This is the filter used for the LZWDecode filter. * * @author Ben Litchfield @@ -56,17 +55,19 @@ * The LZW end of data code. */ public static final long EOD = 257; - + //BEWARE: codeTable must be local to each method, because there is only // one instance of each filter - /** - * {@inheritDoc} - */ + @Override - public DecodeResult decode(InputStream encoded, OutputStream decoded, - COSDictionary parameters, int index) throws IOException + public DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary + parameters, int index, DecodeOptions options) throws IOException { + if (options.isMetadataOnly()) + { + return new DecodeResult(parameters); + } COSDictionary decodeParams = getDecodeParams(parameters, index); int predictor = decodeParams.getInt(COSName.PREDICTOR); int earlyChange = decodeParams.getInt(COSName.EARLY_CHANGE, 1); @@ -88,15 +89,25 @@ decoded.flush(); baos.reset(); bais.reset(); - } - else + } else { doLZWDecode(encoded, decoded, earlyChange); } return new DecodeResult(parameters); } - private void doLZWDecode(InputStream encoded, OutputStream decoded, int earlyChange) throws IOException + /** + * {@inheritDoc} + */ + @Override + public DecodeResult decode(InputStream encoded, OutputStream decoded, + COSDictionary parameters, int index) throws IOException + { + return decode(encoded, decoded, parameters, index, DecodeOptions.DEFAULT); + } + + private void doLZWDecode(InputStream encoded, OutputStream decoded, int earlyChange) throws + IOException { List<byte[]> codeTable = new ArrayList<>(); int chunk = 9; @@ -113,8 +124,7 @@ chunk = 9; codeTable = 
createCodeTable(); prevCommand = -1; - } - else + } else { if (nextCommand < codeTable.size()) { @@ -129,8 +139,7 @@ newData[data.length] = firstByte; codeTable.add(newData); } - } - else + } else { checkIndexBounds(codeTable, prevCommand, in); byte[] data = codeTable.get((int) prevCommand); @@ -139,20 +148,20 @@ decoded.write(newData); codeTable.add(newData); } - + chunk = calculateChunk(codeTable.size(), earlyChange); prevCommand = nextCommand; } } - } - catch (EOFException ex) + } catch (EOFException ex) { LOG.warn("Premature EOF in LZW stream, EOD code missing", ex); } decoded.flush(); } - private void checkIndexBounds(List<byte[]> codeTable, long index, MemoryCacheImageInputStream in) + private void checkIndexBounds(List<byte[]> codeTable, long index, MemoryCacheImageInputStream + in) throws IOException { if (index < 0) @@ -189,10 +198,9 @@ byte by = (byte) r; if (inputPattern == null) { - inputPattern = new byte[] { by }; + inputPattern = new byte[]{by}; foundCode = by & 0xff; - } - else + } else { inputPattern = Arrays.copyOf(inputPattern, inputPattern.length + 1); inputPattern[inputPattern.length - 1] = by; @@ -204,18 +212,17 @@ out.writeBits(foundCode, chunk); // create new table entry codeTable.add(inputPattern); - + if (codeTable.size() == 4096) { // code table is full out.writeBits(CLEAR_TABLE, chunk); codeTable = createCodeTable(); } - - inputPattern = new byte[] { by }; + + inputPattern = new byte[]{by}; foundCode = by & 0xff; - } - else + } else { foundCode = newFoundCode; } @@ -226,19 +233,19 @@ chunk = calculateChunk(codeTable.size() - 1, 1); out.writeBits(foundCode, chunk); } - + // PPDFBOX-1977: the decoder wouldn't know that the encoder would output // an EOD as code, so he would have increased his own code table and // possibly adjusted the chunk. 
Therefore, the encoder must behave as // if the code table had just grown and thus it must be checked it is // needed to adjust the chunk, based on an increased table size parameter chunk = calculateChunk(codeTable.size(), 1); - + out.writeBits(EOD, chunk); - + // pad with 0 out.writeBits(0, 7); - + // must do or file will be empty :-( out.flush(); } @@ -248,7 +255,7 @@ * Find the longest matching pattern in the code table. * * @param codeTable The LZW code table. - * @param pattern The pattern to be searched for. + * @param pattern The pattern to be searched for. * @return The index of the longest matching pattern or -1 if nothing is * found. */ @@ -264,16 +271,16 @@ if (foundCode != -1) { // we already found pattern with size > 1 - return foundCode; - } - else if (pattern.length > 1) + return foundCode; + } else if (pattern.length > 1) { // we won't find anything here anyway return -1; } } byte[] tryPattern = codeTable.get(i); - if ((foundCode != -1 || tryPattern.length > foundLen) && Arrays.equals(tryPattern, pattern)) + if ((foundCode != -1 || tryPattern.length > foundLen) && Arrays.equals(tryPattern, + pattern)) { foundCode = i; foundLen = tryPattern.length; @@ -291,7 +298,7 @@ List<byte[]> codeTable = new ArrayList<>(4096); for (int i = 0; i < 256; ++i) { - codeTable.add(new byte[] { (byte) (i & 0xFF) }); + codeTable.add(new byte[]{(byte) (i & 0xFF)}); } codeTable.add(null); // 256 EOD codeTable.add(null); // 257 CLEAR_TABLE @@ -301,9 +308,8 @@ /** * Calculate the appropriate chunk size * - * @param tabSize the size of the code table + * @param tabSize the size of the code table * @param earlyChange 0 or 1 for early chunk increase - * * @return a value between 9 and 12 */ private int calculateChunk(int tabSize, int earlyChange) Index: pdfbox/src/main/java/org/apache/pdfbox/rendering/PageDrawer.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- 
pdfbox/src/main/java/org/apache/pdfbox/rendering/PageDrawer.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/rendering/PageDrawer.java (revision ) @@ -955,7 +955,10 @@ else { // draw the image - drawBufferedImage(pdImage.getImage(), at); + int subsample = (int)Math.floor(pdImage.getWidth()/at.getScaleX()); + if (subsample<1) subsample = 1; + if (subsample>8) subsample = 8; + drawBufferedImage(pdImage.getImage(null, subsample), at); } if (!pdImage.getInterpolate()) Index: pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImage.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImage.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImage.java (revision ) @@ -16,12 +16,14 @@ */ package org.apache.pdfbox.pdmodel.graphics.image; -import java.awt.Paint; +import java.awt.*; import java.awt.image.BufferedImage; import java.io.IOException; import java.io.InputStream; import java.util.List; + import org.apache.pdfbox.cos.COSArray; +import org.apache.pdfbox.filter.DecodeOptions; import org.apache.pdfbox.pdmodel.common.COSObjectable; import org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace; @@ -34,24 +36,29 @@ { /** * Returns the content of this image as an AWT buffered image with an (A)RGB color space. - * The size of the returned image is the larger of the size of the image itself or its mask. + * The size of the returned image is the larger of the size of the image itself or its mask. + * * @return content of this image as a buffered image. 
* @throws IOException */ BufferedImage getImage() throws IOException; + BufferedImage getImage(Rectangle region, int subsample) throws IOException; + /** * Returns an ARGB image filled with the given paint and using this image as a mask. + * * @param paint the paint to fill the visible portions of the image with * @return a masked image filled with the given paint - * @throws IOException if the image cannot be read + * @throws IOException if the image cannot be read * @throws IllegalStateException if the image is not a stencil. */ BufferedImage getStencilImage(Paint paint) throws IOException; - + /** * Returns an InputStream containing the image data, irrespective of whether this is an * inline image or an image XObject. + * * @return Decoded stream * @throws IOException if the data could not be read. */ @@ -60,12 +67,15 @@ /** * Returns an InputStream containing the image data, irrespective of whether this is an * inline image or an image XObject. The given filters will not be decoded. + * * @param stopFilters A list of filters to stop decoding at. * @return Decoded stream * @throws IOException if the data could not be read. */ InputStream createInputStream(List<String> stopFilters) throws IOException; + public InputStream createInputStream(DecodeOptions options) throws IOException; + /** * Returns true if the image has no data. */ @@ -79,6 +89,7 @@ /** * Sets whether or not the image is a stencil. * This corresponds to the {@code ImageMask} entry in the image stream's dictionary. + * * @param isStencil True to make the image a stencil. */ void setStencil(boolean isStencil); @@ -90,18 +101,21 @@ /** * Set the number of bits per component. + * * @param bitsPerComponent The number of bits per component. */ void setBitsPerComponent(int bitsPerComponent); /** * Returns the image's color space. + * * @throws IOException If there is an error getting the color space. */ PDColorSpace getColorSpace() throws IOException; /** * Sets the color space for this image. 
+ * * @param colorSpace The color space for this image. */ void setColorSpace(PDColorSpace colorSpace); @@ -113,6 +127,7 @@ /** * Sets the height of the image. + * * @param height The height of the image. */ void setHeight(int height); @@ -124,13 +139,15 @@ /** * Sets the width of the image. + * * @param width The width of the image. */ void setWidth(int width); /** * Sets the decode array. - * @param decode the new decode array. + * + * @param decode the new decode array. */ void setDecode(COSArray decode); Index: pdfbox/src/main/java/org/apache/pdfbox/cos/COSInputStream.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/cos/COSInputStream.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/cos/COSInputStream.java (revision ) @@ -24,6 +24,8 @@ import java.io.InputStream; import java.util.ArrayList; import java.util.List; + +import org.apache.pdfbox.filter.DecodeOptions; import org.apache.pdfbox.filter.DecodeResult; import org.apache.pdfbox.filter.Filter; import org.apache.pdfbox.io.RandomAccess; @@ -51,6 +53,12 @@ static COSInputStream create(List<Filter> filters, COSDictionary parameters, InputStream in, ScratchFile scratchFile) throws IOException { + return create(filters, parameters, in, scratchFile, DecodeOptions.DEFAULT); + } + + static COSInputStream create(List<Filter> filters, COSDictionary parameters, InputStream in, + ScratchFile scratchFile, DecodeOptions options) throws IOException + { List<DecodeResult> results = new ArrayList<>(); InputStream input = in; if (filters.isEmpty()) @@ -66,7 +74,7 @@ { // scratch file final RandomAccess buffer = scratchFile.createBuffer(); - DecodeResult result = filters.get(i).decode(input, new RandomAccessOutputStream(buffer), parameters, i); + DecodeResult result = filters.get(i).decode(input, new 
RandomAccessOutputStream(buffer), parameters, i, options); results.add(result); input = new RandomAccessInputStream(buffer) { @@ -81,7 +89,7 @@ { // in-memory ByteArrayOutputStream output = new ByteArrayOutputStream(); - DecodeResult result = filters.get(i).decode(input, output, parameters, i); + DecodeResult result = filters.get(i).decode(input, output, parameters, i, options); results.add(result); input = new ByteArrayInputStream(output.toByteArray()); } @@ -90,6 +98,46 @@ return new COSInputStream(input, results); } + public static DecodeResult decode(List<Filter> filters, COSDictionary parameters, InputStream in, + ScratchFile scratchFile) throws IOException { + DecodeResult result = DecodeResult.DEFAULT; + InputStream input = in; + if (filters.isEmpty()) + { + input = in; + } + else + { + // apply filters + for (int i = 0; i < filters.size(); i++) + { + if (scratchFile != null) + { + // scratch file + final RandomAccess buffer = scratchFile.createBuffer(); + result = filters.get(i).decode(input, new RandomAccessOutputStream(buffer), parameters, i, DecodeOptions.METADATA_ONLY); + input = new RandomAccessInputStream(buffer) + { + @Override + public void close() throws IOException + { + buffer.close(); + } + }; + } + else + { + // in-memory + ByteArrayOutputStream output = new ByteArrayOutputStream(); + result = filters.get(i).decode(input, output, parameters, i, DecodeOptions.METADATA_ONLY); + input = new ByteArrayInputStream(output.toByteArray()); + } + } + } + return result; + + } + private final List<DecodeResult> decodeResults; /** Index: pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDInlineImage.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDInlineImage.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ 
pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDInlineImage.java (revision ) @@ -16,17 +16,19 @@ */ package org.apache.pdfbox.pdmodel.graphics.image; -import java.awt.Paint; +import java.awt.*; import java.awt.image.BufferedImage; import java.io.ByteArrayInputStream; import java.io.ByteArrayOutputStream; import java.io.IOException; import java.io.InputStream; import java.util.List; + import org.apache.pdfbox.cos.COSArray; import org.apache.pdfbox.cos.COSBase; import org.apache.pdfbox.cos.COSDictionary; import org.apache.pdfbox.cos.COSName; +import org.apache.pdfbox.filter.DecodeOptions; import org.apache.pdfbox.filter.DecodeResult; import org.apache.pdfbox.filter.Filter; import org.apache.pdfbox.filter.FilterFactory; @@ -58,8 +60,8 @@ * Creates an inline image from the given parameters and data. * * @param parameters the image parameters - * @param data the image data - * @param resources the current resources + * @param data the image data + * @param resources the current resources * @throws IOException if the stream cannot be decoded */ public PDInlineImage(COSDictionary parameters, byte[] data, PDResources resources) @@ -74,8 +76,7 @@ if (filters == null || filters.isEmpty()) { this.decodedData = data; - } - else + } else { ByteArrayInputStream in = new ByteArrayInputStream(data); ByteArrayOutputStream out = new ByteArrayOutputStream(data.length); @@ -109,8 +110,7 @@ if (isStencil()) { return 1; - } - else + } else { return parameters.getInt(COSName.BPC, COSName.BITS_PER_COMPONENT, -1); } @@ -129,19 +129,17 @@ if (cs != null) { return createColorSpace(cs); - } - else if (isStencil()) + } else if (isStencil()) { // stencil mask color space must be gray, it is often missing return PDDeviceGray.INSTANCE; - } - else + } else { // an image without a color space is always broken throw new IOException("could not determine inline image color space"); } } - + // deliver the long name of a device colorspace, or the parameter private COSBase 
toLongName(COSBase cs) { @@ -159,7 +157,7 @@ } return cs; } - + private PDColorSpace createColorSpace(COSBase cs) throws IOException { if (cs instanceof COSName) @@ -247,8 +245,7 @@ { COSName name = (COSName) filters; names = new COSArrayList<>(name.getName(), name, parameters, COSName.FILTER); - } - else if (filters instanceof COSArray) + } else if (filters instanceof COSArray) { names = COSArrayList.convertCOSNameCOSArrayToList((COSArray) filters); } @@ -296,6 +293,12 @@ return new ByteArrayInputStream(decodedData); } + @Override + public InputStream createInputStream(DecodeOptions options) throws IOException + { + return createInputStream(); + } + @Override public InputStream createInputStream(List<String> stopFilters) throws IOException { @@ -309,8 +312,7 @@ if (stopFilters.contains(filters.get(i))) { break; - } - else + } else { Filter filter = FilterFactory.INSTANCE.getFilter(filters.get(i)); filter.decode(in, out, parameters, i); @@ -333,13 +335,19 @@ { return decodedData; } - + @Override public BufferedImage getImage() throws IOException { return SampledImageReader.getRGBImage(this, getColorKeyMask()); } + @Override + public BufferedImage getImage(Rectangle region, int subsample) throws IOException + { + return getImage(); + } + @Override public BufferedImage getStencilImage(Paint paint) throws IOException { Index: pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java (revision ) @@ -26,6 +26,8 @@ import java.util.List; import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; +import org.apache.pdfbox.filter.DecodeOptions; +import org.apache.pdfbox.filter.DecodeResult; import 
org.apache.pdfbox.filter.Filter; import org.apache.pdfbox.filter.FilterFactory; import org.apache.pdfbox.io.IOUtils; @@ -159,6 +161,11 @@ */ public COSInputStream createInputStream() throws IOException { + return createInputStream(DecodeOptions.DEFAULT); + } + + public COSInputStream createInputStream(DecodeOptions options) throws IOException + { checkClosed(); if (isWriting) { @@ -166,7 +173,18 @@ } ensureRandomAccessExists(true); InputStream input = new RandomAccessInputStream(randomAccess); - return COSInputStream.create(getFilterList(), this, input, scratchFile); + return COSInputStream.create(getFilterList(), this, input, scratchFile, options); + } + + public DecodeResult decode() throws IOException { + checkClosed(); + if (isWriting) + { + throw new IllegalStateException("Cannot read while there is an open stream writer"); + } + ensureRandomAccessExists(true); + InputStream input = new RandomAccessInputStream(randomAccess); + return COSInputStream.decode(getFilterList(), this, input, scratchFile); } /** Index: pdfbox/src/main/java/org/apache/pdfbox/filter/DecodeOptions.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/filter/DecodeOptions.java (revision ) +++ pdfbox/src/main/java/org/apache/pdfbox/filter/DecodeOptions.java (revision ) @@ -0,0 +1,109 @@ +package org.apache.pdfbox.filter; + +import java.awt.Rectangle; + +public class DecodeOptions +{ + private boolean metadataOnly = false; + private Rectangle sourceRegion = null; + private int subsampleX = 1, subsampleY = 1, subsampleOffsetX = 0, subsampleOffsetY = 0; + private boolean honored = false; + + public static final DecodeOptions METADATA_ONLY = new DecodeOptions(true); + public static final DecodeOptions DEFAULT = new DecodeOptions(); + + public DecodeOptions() + { + } + + public DecodeOptions(boolean metadataOnly) + { + this.metadataOnly = 
metadataOnly; + } + + public DecodeOptions(Rectangle sourceRegion) + { + this.sourceRegion = sourceRegion; + } + + public DecodeOptions(int x, int y, int width, int height) + { + this(new Rectangle(x, y, width, height)); + } + + public DecodeOptions(int subsampling) + { + subsampleX = subsampling; + subsampleY = subsampling; + } + + public boolean isMetadataOnly() + { + return metadataOnly; + } + + public void setMetadataOnly(boolean metadataOnly) + { + this.metadataOnly = metadataOnly; + } + + public Rectangle getSourceRegion() + { + return sourceRegion; + } + + public void setSourceRegion(Rectangle sourceRegion) + { + this.sourceRegion = sourceRegion; + } + + public int getSubsamplingX() + { + return subsampleX; + } + + public void setSubsamplingX(int ssX) + { + this.subsampleX = ssX; + } + + public int getSubsamplingY() + { + return subsampleY; + } + + public void setSubsamplingY(int ssY) + { + this.subsampleY = ssY; + } + + public int getSubsamplingOffsetX() + { + return subsampleOffsetX; + } + + public void setSubsamplingOffsetX(int ssOffsetX) + { + this.subsampleOffsetX = ssOffsetX; + } + + public int getSubsamplingOffsetY() + { + return subsampleOffsetY; + } + + public void setSubsamplingOffsetY(int ssOffsetY) + { + this.subsampleOffsetY = ssOffsetY; + } + + public boolean isHonored() + { + return honored; + } + + public void setHonored(boolean honored) + { + this.honored = honored; + } +} Index: pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/SampledImageReader.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/SampledImageReader.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/SampledImageReader.java (revision ) @@ -16,9 +16,7 @@ */ package org.apache.pdfbox.pdmodel.graphics.image; -import 
java.awt.Graphics2D; -import java.awt.Paint; -import java.awt.Point; +import java.awt.*; import java.awt.image.BufferedImage; import java.awt.image.DataBuffer; import java.awt.image.DataBufferByte; @@ -29,31 +27,35 @@ import java.util.Arrays; import javax.imageio.stream.ImageInputStream; import javax.imageio.stream.MemoryCacheImageInputStream; + import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; import org.apache.pdfbox.cos.COSArray; import org.apache.pdfbox.cos.COSNumber; +import org.apache.pdfbox.filter.DecodeOptions; import org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace; import org.apache.pdfbox.pdmodel.graphics.color.PDDeviceGray; import org.apache.pdfbox.pdmodel.graphics.color.PDIndexed; /** * Reads a sampled image from a PDF file. + * * @author John Hewson */ final class SampledImageReader { private static final Log LOG = LogFactory.getLog(SampledImageReader.class); - + private SampledImageReader() { } /** * Returns an ARGB image filled with the given paint and using the given image as a mask. + * * @param paint the paint to fill the visible portions of the image with * @return a masked image filled with the given paint - * @throws IOException if the image cannot be read + * @throws IOException if the image cannot be read * @throws IllegalStateException if the image is not a stencil. */ public static BufferedImage getStencilImage(PDImage pdImage, Paint paint) throws IOException @@ -122,7 +124,7 @@ LOG.warn("premature EOF, image will be incomplete"); break; } - } + } } return masked; @@ -132,23 +134,46 @@ * Returns the content of the given image as an AWT buffered image with an RGB color space. * If a color key mask is provided then an ARGB image is returned instead. * This method never returns null. 
- * @param pdImage the image to read + * + * @param pdImage the image to read * @param colorKey an optional color key mask * @return content of this image as an RGB buffered image * @throws IOException if the image cannot be read */ public static BufferedImage getRGBImage(PDImage pdImage, COSArray colorKey) throws IOException { + return getRGBImage(pdImage, null, 1, colorKey); + } + + private static Rectangle clipRegion(PDImage pdImage, Rectangle region) + { + if (region == null) + { + return new Rectangle(0, 0, pdImage.getWidth(), pdImage.getHeight()); + } else + { + int x = Math.max(0, region.x); + int y = Math.max(0, region.y); + int width = Math.min(region.width, pdImage.getWidth() - x); + int height = Math.min(region.height, pdImage.getHeight() - y); + return new Rectangle(x, y, width, height); + } + } + + public static BufferedImage getRGBImage(PDImage pdImage, Rectangle region, int subsample, + COSArray colorKey) throws IOException + { if (pdImage.isEmpty()) { throw new IOException("Image stream is empty"); } + Rectangle clipped = clipRegion(pdImage, region); // get parameters, they must be valid or have been repaired final PDColorSpace colorSpace = pdImage.getColorSpace(); final int numComponents = colorSpace.getNumberOfComponents(); - final int width = pdImage.getWidth(); - final int height = pdImage.getHeight(); + final int width = (int) Math.round(clipped.getWidth() / subsample); + final int height = (int) Math.round(clipped.getHeight() / subsample); final int bitsPerComponent = pdImage.getBitsPerComponent(); final float[] decode = getDecodeArray(pdImage); @@ -159,7 +184,7 @@ if (bitsPerComponent == 1 && colorKey == null && numComponents == 1) { - return from1Bit(pdImage); + return from1Bit(pdImage, clipped, subsample, width, height); } // @@ -168,47 +193,65 @@ // in depth to 8bpc as they will be drawn to TYPE_INT_RGB images anyway. All code // in PDColorSpace#toRGBImage expects an 8-bit range, i.e. 0-255. 
// - WritableRaster raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height, - numComponents, new Point(0, 0)); final float[] defaultDecode = pdImage.getColorSpace().getDefaultDecode(8); if (bitsPerComponent == 8 && Arrays.equals(decode, defaultDecode) && colorKey == null) { // convert image, faster path for non-decoded, non-colormasked 8-bit images - return from8bit(pdImage, raster); + return from8bit(pdImage, clipped, subsample, width, height); } - return fromAny(pdImage, raster, colorKey); + return fromAny(pdImage, colorKey, clipped, subsample, width, height); } - private static BufferedImage from1Bit(PDImage pdImage) throws IOException + private static BufferedImage from1Bit(PDImage pdImage, Rectangle clipped, int subsample, + final int width, final int height) throws IOException { final PDColorSpace colorSpace = pdImage.getColorSpace(); - final int width = pdImage.getWidth(); - final int height = pdImage.getHeight(); final float[] decode = getDecodeArray(pdImage); BufferedImage bim = null; WritableRaster raster; byte[] output; - if (colorSpace instanceof PDDeviceGray) - { - // TYPE_BYTE_GRAY and not TYPE_BYTE_BINARY because this one is handled - // without conversion to RGB by Graphics.drawImage - // this reduces the memory footprint, only one byte per pixel instead of three. 
- bim = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY); - raster = bim.getRaster(); - } - else - { - raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height, 1, new Point(0, 0)); - } - output = ((DataBufferByte) raster.getDataBuffer()).getData(); - - // read bit stream - try (InputStream iis = pdImage.createInputStream()) - { + + // read bit stream + DecodeOptions options = new DecodeOptions(subsample); + options.setSourceRegion(clipped); + try (InputStream iis = pdImage.createInputStream(options)) + { + final int inputWidth, inputHeight, startx, starty, scanWidth, scanHeight; + if (options.isHonored()) + { + inputWidth = width; + inputHeight = height; + startx = 0; + starty = 0; + scanWidth = width; + scanHeight = height; + subsample = 1; + } else + { + inputWidth = pdImage.getWidth(); + inputHeight = pdImage.getHeight(); + startx = clipped.x; + starty = clipped.y; + scanWidth = clipped.width; + scanHeight = clipped.height; + } + if (colorSpace instanceof PDDeviceGray) + { + // TYPE_BYTE_GRAY and not TYPE_BYTE_BINARY because this one is handled + // without conversion to RGB by Graphics.drawImage + // this reduces the memory footprint, only one byte per pixel instead of three. 
+ bim = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY); + raster = bim.getRaster(); + } else + { + raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height, 1, new + Point(0, 0)); + } + output = ((DataBufferByte) raster.getDataBuffer()).getData(); final boolean isIndexed = colorSpace instanceof PDIndexed; - int rowLen = width / 8; - if (width % 8 > 0) + int rowLen = inputWidth / 8; + if (inputWidth % 8 > 0) { rowLen++; } @@ -220,18 +263,21 @@ { value0 = 0; value1 = (byte) 255; - } - else + } else { value0 = (byte) 255; value1 = 0; } byte[] buff = new byte[rowLen]; int idx = 0; - for (int y = 0; y < height; y++) + for (int y = 0; y < starty + scanHeight; y++) { int x = 0; int readLen = iis.read(buff); + if (y < starty || y % subsample > 0) + { + continue; + } for (int r = 0; r < rowLen && r < readLen; r++) { int value = buff[r]; @@ -240,9 +286,14 @@ { int bit = value & mask; mask >>= 1; + if (x < startx || x % subsample > 0) + { + x++; + continue; + } output[idx++] = bit == 0 ? 
value0 : value1; x++; - if (x == width) + if (x >= startx + scanWidth) { break; } @@ -266,31 +317,58 @@ } // faster, 8-bit non-decoded, non-colormasked image conversion - private static BufferedImage from8bit(PDImage pdImage, WritableRaster raster) - throws IOException + private static BufferedImage from8bit(PDImage pdImage, Rectangle clipped, int subsample, + final int width, final int height) throws IOException { - try (InputStream input = pdImage.createInputStream()) + DecodeOptions options = new DecodeOptions(subsample); + options.setSourceRegion(clipped); + try (InputStream input = pdImage.createInputStream(options)) { + final int inputWidth, inputHeight, startx, starty, scanWidth, scanHeight; + if (options.isHonored()) + { + inputWidth = width; + inputHeight = height; + startx = 0; + starty = 0; + scanWidth = width; + scanHeight = height; + subsample = 1; + } else + { + inputWidth = pdImage.getWidth(); + inputHeight = pdImage.getHeight(); + startx = clipped.x; + starty = clipped.y; + scanWidth = clipped.width; + scanHeight = clipped.height; + } + final int numComponents = pdImage.getColorSpace().getNumberOfComponents(); + WritableRaster raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height, + numComponents, new Point(0, 0)); // get the raster's underlying byte buffer byte[][] banks = ((DataBufferByte) raster.getDataBuffer()).getBankData(); - final int width = pdImage.getWidth(); - final int height = pdImage.getHeight(); - final int numComponents = pdImage.getColorSpace().getNumberOfComponents(); - byte[] tempBytes = new byte[numComponents * width]; + byte[] tempBytes = new byte[numComponents * inputWidth]; // compromise between memory and time usage: // reading the whole image consumes too much memory // reading one pixel at a time makes it slow in our buffering infrastructure int i = 0; - for (int y = 0; y < height; ++y) + for (int y = 0; y < starty + scanHeight; ++y) { long inputResult = input.read(tempBytes); if (Long.compare(inputResult, 
tempBytes.length) != 0) { - LOG.debug("Tried reading " + tempBytes.length + " bytes but only " + inputResult + " bytes read"); + LOG.debug("Tried reading " + tempBytes.length + " bytes but only " + + inputResult + " bytes read"); + } + // + if (y < starty || y % subsample > 0) + { + continue; } - for (int x = 0; x < width; ++x) + for (int x = startx; x < startx + scanWidth; x += subsample) { for (int c = 0; c < numComponents; c++) { @@ -305,19 +383,42 @@ } // slower, general-purpose image conversion from any image format - private static BufferedImage fromAny(PDImage pdImage, WritableRaster raster, COSArray colorKey) + private static BufferedImage fromAny(PDImage pdImage, COSArray colorKey, Rectangle clipped, + int subsample, final int width, final int height) throws IOException { final PDColorSpace colorSpace = pdImage.getColorSpace(); final int numComponents = colorSpace.getNumberOfComponents(); - final int width = pdImage.getWidth(); - final int height = pdImage.getHeight(); final int bitsPerComponent = pdImage.getBitsPerComponent(); final float[] decode = getDecodeArray(pdImage); + DecodeOptions options = new DecodeOptions(subsample); + options.setSourceRegion(clipped); // read bit stream - try (ImageInputStream iis = new MemoryCacheImageInputStream(pdImage.createInputStream())) + try (ImageInputStream iis = new MemoryCacheImageInputStream(pdImage.createInputStream + (options))) { + final int inputWidth, inputHeight, startx, starty, scanWidth, scanHeight; + if (options.isHonored()) + { + inputWidth = width; + inputHeight = height; + startx = 0; + starty = 0; + scanWidth = width; + scanHeight = height; + subsample = 1; + } else + { + inputWidth = pdImage.getWidth(); + inputHeight = pdImage.getHeight(); + startx = clipped.x; + starty = clipped.y; + scanWidth = clipped.width; + scanHeight = clipped.height; + } + WritableRaster raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height, + numComponents, new Point(0, 0)); final float sampleMax = (float) 
Math.pow(2, bitsPerComponent) - 1f; final boolean isIndexed = colorSpace instanceof PDIndexed; @@ -332,28 +433,28 @@ // calculate row padding int padding = 0; - if (width * numComponents * bitsPerComponent % 8 > 0) + if (inputWidth * numComponents * bitsPerComponent % 8 > 0) { - padding = 8 - (width * numComponents * bitsPerComponent % 8); + padding = 8 - (inputWidth * numComponents * bitsPerComponent % 8); } // read stream byte[] srcColorValues = new byte[numComponents]; byte[] alpha = new byte[1]; - for (int y = 0; y < height; y++) + for (int y = 0; y < starty + scanHeight; y++) { - for (int x = 0; x < width; x++) + for (int x = 0; x < startx + scanWidth; x++) { boolean isMasked = true; for (int c = 0; c < numComponents; c++) { - int value = (int)iis.readBits(bitsPerComponent); + int value = (int) iis.readBits(bitsPerComponent); // color key mask requires values before they are decoded if (colorKeyRanges != null) { isMasked &= value >= colorKeyRanges[c * 2] && - value <= colorKeyRanges[c * 2 + 1]; + value <= colorKeyRanges[c * 2 + 1]; } // decode array @@ -368,23 +469,26 @@ // indexed color spaces get the raw value, because the TYPE_BYTE // below cannot be reversed by the color space without it having // knowledge of the number of bits per component - srcColorValues[c] = (byte)Math.round(output); - } - else + srcColorValues[c] = (byte) Math.round(output); + } else { // interpolate to TYPE_BYTE int outputByte = Math.round(((output - Math.min(dMin, dMax)) / Math.abs(dMax - dMin)) * 255f); - srcColorValues[c] = (byte)outputByte; + srcColorValues[c] = (byte) outputByte; } } - raster.setDataElements(x, y, srcColorValues); + if (x >= startx && y >= starty && x % subsample == 0 && y % subsample == 0) + { + raster.setDataElements((x - startx) / subsample, (y - starty) / subsample, + srcColorValues); + } // set alpha channel in color key mask, if any if (colorKeyMask != null) { - alpha[0] = (byte)(isMasked ? 255 : 0); + alpha[0] = (byte) (isMasked ? 
255 : 0); colorKeyMask.getRaster().setDataElements(x, y, alpha); } } @@ -400,8 +504,7 @@ if (colorKeyMask != null) { return applyColorKeyMask(rgbImage, colorKeyMask); - } - else + } else { return rgbImage; } @@ -466,15 +569,14 @@ LOG.warn("decode array " + cosDecode + " not compatible with color space, using the first two entries"); return new float[] - { - decode0, decode1 - }; + { + decode0, decode1 + }; } } LOG.error("decode array " + cosDecode + " not compatible with color space, using default"); - } - else + } else { decode = cosDecode.toFloatArray(); } Index: pdfbox/src/main/java/org/apache/pdfbox/filter/Filter.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/filter/Filter.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/filter/Filter.java (revision ) @@ -59,26 +59,35 @@ /** * Decodes data, producing the original non-encoded data. 
- * @param encoded the encoded byte stream - * @param decoded the stream where decoded data will be written + * + * @param encoded the encoded byte stream + * @param decoded the stream where decoded data will be written * @param parameters the parameters used for decoding - * @param index the index to the filter being decoded + * @param index the index to the filter being decoded * @return repaired parameters dictionary, or the original parameters dictionary * @throws IOException if the stream cannot be decoded */ - public abstract DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary parameters, - int index) throws IOException; + public abstract DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary + parameters, + int index) throws IOException; + public DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary parameters, + int index, DecodeOptions options) throws IOException + { + return decode(encoded, decoded, parameters, index); + } + /** * Encodes data. 
- * @param input the byte stream to encode - * @param encoded the stream where encoded data will be written + * + * @param input the byte stream to encode + * @param encoded the stream where encoded data will be written * @param parameters the parameters used for encoding - * @param index the index to the filter being encoded + * @param index the index to the filter being encoded * @throws IOException if the stream cannot be encoded */ public final void encode(InputStream input, OutputStream encoded, COSDictionary parameters, - int index) throws IOException + int index) throws IOException { encode(input, encoded, parameters.asUnmodifiableDictionary()); } @@ -96,26 +105,25 @@ if (filter instanceof COSName && obj instanceof COSDictionary) { // PDFBOX-3932: The PDF specification requires "If there is only one filter and that - // filter has parameters, DecodeParms shall be set to the filter’s parameter dictionary" + // filter has parameters, DecodeParms shall be set to the filter’s parameter + // dictionary" // but tests show that Adobe means "one filter name object". - return (COSDictionary)obj; - } - else if (filter instanceof COSArray && obj instanceof COSArray) + return (COSDictionary) obj; + } else if (filter instanceof COSArray && obj instanceof COSArray) { - COSArray array = (COSArray)obj; + COSArray array = (COSArray) obj; if (index < array.size()) { COSBase objAtIndex = array.getObject(index); if (objAtIndex instanceof COSDictionary) { - return (COSDictionary)array.getObject(index); + return (COSDictionary) array.getObject(index); } } - } - else if (obj != null && !(filter instanceof COSArray || obj instanceof COSArray)) + } else if (obj != null && !(filter instanceof COSArray || obj instanceof COSArray)) { LOG.error("Expected DecodeParams to be an Array or Dictionary but found " + - obj.getClass().getName()); + obj.getClass().getName()); } return new COSDictionary(); } @@ -128,7 +136,8 @@ * @return The image reader for the format. 
* @throws MissingImageReaderException if no image reader is found. */ - protected static ImageReader findImageReader(String formatName, String errorCause) throws MissingImageReaderException + protected static ImageReader findImageReader(String formatName, String errorCause) throws + MissingImageReaderException { Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName); ImageReader reader = null; @@ -142,7 +151,8 @@ } if (reader == null) { - throw new MissingImageReaderException("Cannot read " + formatName + " image: " + errorCause); + throw new MissingImageReaderException("Cannot read " + formatName + " image: " + + errorCause); } return reader; } Index: pdfbox/src/main/java/org/apache/pdfbox/filter/JBIG2Filter.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- pdfbox/src/main/java/org/apache/pdfbox/filter/JBIG2Filter.java (revision fadc0aff4bc2c52f6849c78d878888c1c0f390eb) +++ pdfbox/src/main/java/org/apache/pdfbox/filter/JBIG2Filter.java (revision ) @@ -25,6 +25,7 @@ import java.io.OutputStream; import java.io.SequenceInputStream; import javax.imageio.ImageIO; +import javax.imageio.ImageReadParam; import javax.imageio.ImageReader; import javax.imageio.stream.ImageInputStream; import org.apache.commons.logging.Log; @@ -61,8 +62,8 @@ } @Override - public DecodeResult decode(InputStream encoded, OutputStream decoded, - COSDictionary parameters, int index) throws IOException + public DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary + parameters, int index, DecodeOptions options) throws IOException { ImageReader reader = findImageReader("JBIG2", "jbig2-imageio is not installed"); if (reader.getClass().getName().contains("levigo")) @@ -73,6 +74,17 @@ int bits = parameters.getInt(COSName.BITS_PER_COMPONENT, 1); COSDictionary params = getDecodeParams(parameters, index); + if (options.isMetadataOnly()) + { + 
return new DecodeResult(parameters); + } + + ImageReadParam irp = reader.getDefaultReadParam(); + irp.setSourceSubsampling(options.getSubsamplingX(), options.getSubsamplingY(), + options.getSubsamplingOffsetX(), options.getSubsamplingOffsetY()); + irp.setSourceRegion(options.getSourceRegion()); + options.setHonored(true); + InputStream source = encoded; if (params != null) { @@ -90,9 +102,8 @@ BufferedImage image; try { - image = reader.read(0, reader.getDefaultReadParam()); - } - catch (Exception e) + image = reader.read(0, irp); + } catch (Exception e) { // wrap and rethrow any exceptions throw new IOException("Could not read JBIG2 image", e); @@ -128,9 +139,17 @@ { reader.dispose(); } + return new DecodeResult(parameters); } + @Override + public DecodeResult decode(InputStream encoded, OutputStream decoded, + COSDictionary parameters, int index) throws IOException + { + return decode(encoded, decoded, parameters, index, DecodeOptions.DEFAULT); + } + @Override protected void encode(InputStream input, OutputStream encoded, COSDictionary parameters) throws IOException
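For readers who want to try the mechanism outside PDFBox: the JBIG2Filter change above hands the requested region and subsampling to the ImageIO reader through ImageReadParam. The following is only a minimal sketch of that idea, using the JDK's built-in PNG reader in place of a PDF filter. The class SubsampledReadSketch and its helpers are hypothetical names and are not part of the patch. The clipRegion variant here uses Rectangle.intersection rather than the patch's Math.max/Math.min arithmetic, so it also shrinks the width when the requested origin is negative.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper class, for illustration only (not part of the patch).
final class SubsampledReadSketch
{
    private SubsampledReadSketch()
    {
    }

    // Clamps the requested region to the image bounds; null means the whole image.
    static Rectangle clipRegion(int imageWidth, int imageHeight, Rectangle region)
    {
        Rectangle full = new Rectangle(0, 0, imageWidth, imageHeight);
        return region == null ? full : full.intersection(region);
    }

    // Decodes only the clipped region, keeping every subsample-th pixel in x and y.
    // IOExceptions are rethrown unchecked to keep the sketch easy to call.
    static BufferedImage readRegion(byte[] encoded, String formatName, Rectangle region,
                                    int subsample)
    {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        if (!readers.hasNext())
        {
            throw new IllegalStateException("No ImageIO reader for " + formatName);
        }
        ImageReader reader = readers.next();
        try (ImageInputStream iis =
                     new MemoryCacheImageInputStream(new ByteArrayInputStream(encoded)))
        {
            reader.setInput(iis);
            ImageReadParam param = reader.getDefaultReadParam();
            param.setSourceRegion(clipRegion(reader.getWidth(0), reader.getHeight(0), region));
            param.setSourceSubsampling(subsample, subsample, 0, 0);
            return reader.read(0, param);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        finally
        {
            reader.dispose();
        }
    }

    // Builds an in-memory PNG whose pixel at (x, y) has red = x and green = y.
    static byte[] encodeTestPng(int width, int height)
    {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                image.setRGB(x, y, (x << 16) | (y << 8));
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ImageOutputStream ios = new MemoryCacheImageOutputStream(out))
        {
            ImageIO.write(image, "png", ios);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args)
    {
        byte[] png = encodeTestPng(100, 80);
        BufferedImage part = readRegion(png, "png", new Rectangle(10, 20, 49, 30), 4);
        System.out.println(part.getWidth() + "x" + part.getHeight());
    }
}
```

One detail that may be worth double-checking against the patch: the ImageReadParam Javadoc rounds the subsampled size up, so a 49-pixel-wide region with a subsampling factor of 4 yields 13 columns, whereas Math.round(49 / 4.0), the form used in getRGBImage above, yields 12. When a filter honors the options, the decoded row length and the computed width could then disagree by one pixel.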