Hello,
Following a question asked on pdfbox-users [1], I set about trying to
allow rendering images at lower resolutions, and additionally rendering
only parts of images. The need arises from having very large images,
usually JPEG or JBIG2, which are tens of megabytes in size when compressed
but may take up 8 gigabytes or more when rendered as a BufferedImage
at full resolution.
I have come up with a solution that seems to work (passes all of the
built-in PDFBox tests, and a few manual ones I tried), but since it
includes some deep changes in the logic I understand if it won't find its
way into PDFBox.
While working on it, I also came across PDFBOX-3340 [2], and since my hack
relies on making changes to the way filters work, it includes a (partial)
fix for that bug too.
Finally, since I'm not well versed in git/github, I'm not sure of the best
way to share my work. I attach here a unified diff, but let me know if
there is another preferred method (pull request? clone the repository?)
Following is an explanation/description of my changes, for those
interested. I would love to hear any feedback, especially for things which
may increase the likelihood of such a feature being included in future
versions of PDFBox.
Thanks,
Itai.
--
As stated, the issue pertains mainly to very large images (lots of pixels)
which are highly compressed. Since DCTFilter, JBIG2Filter etc. always decode
the entire image, I had to extend the way Filter works so that it can accept
options.
This is where the class DecodeOptions comes in. It has sub-region and
subsampling options (mirroring those of ImageReadParam), as well as a
"metadata only" param. When decoding, you may pass DecodeOptions, such that
image-related filters can downscale or only render a part of the image.
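For readers unfamiliar with ImageReadParam, here is a minimal, self-contained sketch of the region and subsampling settings that DecodeOptions mirrors. It is not code from the patch: the class name ReadParamSketch and its two helper methods are made up for illustration, and it uses the JDK's built-in PNG reader only so that it runs without any extra plugins.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Illustrative only: the class and method names are invented for this sketch.
final class ReadParamSketch
{
    // Encodes a blank width x height image as PNG, so no external file is needed.
    static byte[] blankPng(int width, int height)
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ImageOutputStream ios = new MemoryCacheImageOutputStream(out))
        {
            BufferedImage blank = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            ImageIO.write(blank, "png", ios);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decodes only 'region' (null = whole image), keeping every 'step'-th pixel
    // in both directions, so the full-size image is never materialized.
    static BufferedImage readReduced(byte[] encoded, Rectangle region, int step)
    {
        ImageReader reader = ImageIO.getImageReadersByFormatName("png").next();
        try (ImageInputStream iis =
                     new MemoryCacheImageInputStream(new ByteArrayInputStream(encoded)))
        {
            reader.setInput(iis);
            ImageReadParam irp = reader.getDefaultReadParam();
            irp.setSourceRegion(region);                 // null removes any region limit
            irp.setSourceSubsampling(step, step, 0, 0);  // period x/y, offset x/y
            return reader.read(0, irp);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        finally
        {
            reader.dispose();
        }
    }
}
```

Reading a 100x80 image with a step of 4 yields a 25x20 BufferedImage, and restricting the read to a 40x40 region first yields 10x10.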
The "metadata-only" option is used for the `repair` method of
PDImageXObject, as it only really needs the DecodeResult - where applicable
and possible, a filter encountering this option will not decode the stream,
only set the DecodeResult parameters (this is not always possible, e.g. for
JPXFilter, which must decode the image to get the parameters).
The DecodeOptions also has an "honored" flag, which a filter sets to true
if it honored the options. This is needed because when decoding an image
stored in a Flate or LZW stream, the filter doesn't know the image format
(or does it? I couldn't find a simple way of telling), so it can't make
sense of subsampling or partial-render options. SampledImageReader checks
this flag, and if it is not set to true it does the subsampling itself.
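To make the "honored" handshake concrete, here is a simplified stand-in for the options object. It is only a sketch: DecodeOptionsSketch, its single subsampling factor, and the needsFallback() helper are assumptions made for illustration, and the actual DecodeOptions class in the patch may be shaped differently.

```java
import java.awt.Rectangle;

// Hypothetical stand-in for the DecodeOptions class described above; the real
// class in the patch may differ in names and details.
final class DecodeOptionsSketch
{
    private final Rectangle sourceRegion;  // null means "the whole image"
    private final int subsampling;         // 1 means "keep every pixel"
    private final boolean metadataOnly;    // true: only the DecodeResult is wanted
    private boolean honored;               // set by a filter that applied the options

    DecodeOptionsSketch(Rectangle sourceRegion, int subsampling, boolean metadataOnly)
    {
        this.sourceRegion = sourceRegion;
        this.subsampling = subsampling;
        this.metadataOnly = metadataOnly;
    }

    Rectangle getSourceRegion() { return sourceRegion; }
    int getSubsampling()        { return subsampling; }
    boolean isMetadataOnly()    { return metadataOnly; }
    boolean isHonored()         { return honored; }

    // Called by an image-aware filter once it has cropped/subsampled the data.
    void setHonored(boolean honored) { this.honored = honored; }

    // True when the caller still has to crop/subsample the decoded data itself,
    // e.g. after a Flate or LZW filter that ignored the options.
    boolean needsFallback()
    {
        boolean nothingRequested = sourceRegion == null && subsampling == 1;
        return !metadataOnly && !nothingRequested && !honored;
    }
}
```

One thing such a design has to watch: because the flag is mutable, a shared instance (for example a DEFAULT constant) could carry a stale "honored" value from one decode to the next, so a fresh options object per decode is probably the safer pattern.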
This allows the addition of a method to PDImage:
    BufferedImage getImage(Rectangle region, int subsample) throws IOException;
Its result is not cached, as it is not "canonical".
When drawing an image, PDPageDrawer calculates a subsampling factor based
on the desired size:
    int subsample = (int) Math.floor(pdImage.getWidth() / at.getScaleX());
    if (subsample < 1) subsample = 1;
    if (subsample > 8) subsample = 8;
    drawBufferedImage(pdImage.getImage(null, subsample), at);
So if, e.g., the image should be drawn at 0.5 times its pixel size, it
will be subsampled at 2-pixel intervals.
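The factor calculation can be isolated into a small helper, which makes the clamping easy to check. The sketch below mirrors the snippet above; the class name SubsampleFactorSketch, the method name, and the MAX_SUBSAMPLE constant are invented here, and drawnWidth stands for the value of at.getScaleX().

```java
// Illustrative helper; the names are not taken from the patch.
final class SubsampleFactorSketch
{
    // Upper bound used in the snippet above.
    static final int MAX_SUBSAMPLE = 8;

    // imageWidth: width of the stored image in pixels
    // drawnWidth: width at which the image is drawn, in device pixels
    static int subsampleFor(int imageWidth, double drawnWidth)
    {
        int subsample = (int) Math.floor(imageWidth / drawnWidth);
        // Clamp to [1, MAX_SUBSAMPLE]; upscaled images are read at full resolution.
        return Math.max(1, Math.min(MAX_SUBSAMPLE, subsample));
    }
}
```

For a 1000-pixel-wide image drawn 500 pixels wide this gives 2; drawn 4000 pixels wide it gives 1; and a 100000-pixel-wide image drawn 100 pixels wide is capped at 8. A mirrored image has a negative scale, which under this formula also ends up at 1, so taking the absolute value of the scale first may be worth considering.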
SampledImageReader issues the corresponding DecodeOptions to
PDImage#createInputStream when rendering, and if the "honored" flag is not
set, it does its own sub-sampling and partial rendering.
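The fallback path amounts to plain nearest-pixel picking over the already-decoded data. The sketch below shows the idea on a BufferedImage for simplicity; the real SampledImageReader works on the raw sample stream instead, and FallbackSubsamplerSketch and its method are hypothetical names. It assumes the region, if given, lies inside the image.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Illustrative fallback: crop and subsample an image that was decoded in full.
final class FallbackSubsamplerSketch
{
    // region: part of the image to keep (null = whole image, must lie inside it)
    // step:   keep every 'step'-th pixel in x and y (must be >= 1)
    static BufferedImage subsample(BufferedImage full, Rectangle region, int step)
    {
        Rectangle r = region != null
                ? region
                : new Rectangle(0, 0, full.getWidth(), full.getHeight());
        // Same rounding as ImageReadParam with zero offsets: ceil(size / step).
        int w = (r.width + step - 1) / step;
        int h = (r.height + step - 1) / step;
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                out.setRGB(x, y, full.getRGB(r.x + x * step, r.y + y * step));
            }
        }
        return out;
    }
}
```

This saves nothing during decoding, since the full data still has to be produced first, but everything downstream works with the smaller result.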
I realize most/all of those optimizations won't work for raw, Flate or LZW
encoded images, but presumably those won't be too large in the first place.
Also, this has little to no benefit for PDInlineImage, but as it already
holds all of its raw data I assume little optimization is possible.
In general, this hack allowed me to speed up rendering of some files by
significant margins (20%-80%, depending on size and desired DPI), and to
significantly lower the memory footprint when only a lower-resolution render
or a small region of the image is required.
--
[1]: https://lists.apache.org/thread.html/6b396e3d8bfc4ed44bcadf37881035d7447fb711253ef962f187455c@%3Cusers.pdfbox.apache.org%3E
[2]: https://issues.apache.org/jira/browse/PDFBOX-3340
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/cos/COSInputStream.java
b/pdfbox/src/main/java/org/apache/pdfbox/cos/COSInputStream.java
index a11445131..058ed5e81 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/cos/COSInputStream.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/cos/COSInputStream.java
@@ -24,6 +24,8 @@ import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
+
+import org.apache.pdfbox.filter.DecodeOptions;
import org.apache.pdfbox.filter.DecodeResult;
import org.apache.pdfbox.filter.Filter;
import org.apache.pdfbox.io.RandomAccess;
@@ -50,6 +52,12 @@ public final class COSInputStream extends FilterInputStream
*/
static COSInputStream create(List<Filter> filters, COSDictionary
parameters, InputStream in,
ScratchFile scratchFile) throws IOException
+ {
+ return create(filters, parameters, in, scratchFile,
DecodeOptions.DEFAULT);
+ }
+
+ static COSInputStream create(List<Filter> filters, COSDictionary
parameters, InputStream in,
+ ScratchFile scratchFile, DecodeOptions
options) throws IOException
{
List<DecodeResult> results = new ArrayList<>();
InputStream input = in;
@@ -66,7 +74,7 @@ public final class COSInputStream extends FilterInputStream
{
// scratch file
final RandomAccess buffer = scratchFile.createBuffer();
- DecodeResult result = filters.get(i).decode(input, new
RandomAccessOutputStream(buffer), parameters, i);
+ DecodeResult result = filters.get(i).decode(input, new
RandomAccessOutputStream(buffer), parameters, i, options);
results.add(result);
input = new RandomAccessInputStream(buffer)
{
@@ -81,7 +89,7 @@ public final class COSInputStream extends FilterInputStream
{
// in-memory
ByteArrayOutputStream output = new ByteArrayOutputStream();
- DecodeResult result = filters.get(i).decode(input, output,
parameters, i);
+ DecodeResult result = filters.get(i).decode(input, output,
parameters, i, options);
results.add(result);
input = new ByteArrayInputStream(output.toByteArray());
}
@@ -90,6 +98,46 @@ public final class COSInputStream extends FilterInputStream
return new COSInputStream(input, results);
}
+ public static DecodeResult decode(List<Filter> filters, COSDictionary
parameters, InputStream in,
+ ScratchFile scratchFile) throws
IOException {
+ DecodeResult result = DecodeResult.DEFAULT;
+ InputStream input = in;
+ if (filters.isEmpty())
+ {
+ input = in;
+ }
+ else
+ {
+ // apply filters
+ for (int i = 0; i < filters.size(); i++)
+ {
+ if (scratchFile != null)
+ {
+ // scratch file
+ final RandomAccess buffer = scratchFile.createBuffer();
+ result = filters.get(i).decode(input, new
RandomAccessOutputStream(buffer), parameters, i, DecodeOptions.METADATA_ONLY);
+ input = new RandomAccessInputStream(buffer)
+ {
+ @Override
+ public void close() throws IOException
+ {
+ buffer.close();
+ }
+ };
+ }
+ else
+ {
+ // in-memory
+ ByteArrayOutputStream output = new ByteArrayOutputStream();
+ result = filters.get(i).decode(input, output, parameters,
i, DecodeOptions.METADATA_ONLY);
+ input = new ByteArrayInputStream(output.toByteArray());
+ }
+ }
+ }
+ return result;
+
+ }
+
private final List<DecodeResult> decodeResults;
/**
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java
b/pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java
index c3f3ddb5a..a8c8e22c8 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java
@@ -26,6 +26,8 @@ import java.util.ArrayList;
import java.util.List;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
+import org.apache.pdfbox.filter.DecodeOptions;
+import org.apache.pdfbox.filter.DecodeResult;
import org.apache.pdfbox.filter.Filter;
import org.apache.pdfbox.filter.FilterFactory;
import org.apache.pdfbox.io.IOUtils;
@@ -159,6 +161,22 @@ public class COSStream extends COSDictionary implements
Closeable
*/
public COSInputStream createInputStream() throws IOException
{
+ return createInputStream(DecodeOptions.DEFAULT);
+ }
+
+ public COSInputStream createInputStream(DecodeOptions options) throws
IOException
+ {
+ checkClosed();
+ if (isWriting)
+ {
+ throw new IllegalStateException("Cannot read while there is an
open stream writer");
+ }
+ ensureRandomAccessExists(true);
+ InputStream input = new RandomAccessInputStream(randomAccess);
+ return COSInputStream.create(getFilterList(), this, input,
scratchFile, options);
+ }
+
+ public DecodeResult decode() throws IOException {
checkClosed();
if (isWriting)
{
@@ -166,7 +184,7 @@ public class COSStream extends COSDictionary implements
Closeable
}
ensureRandomAccessExists(true);
InputStream input = new RandomAccessInputStream(randomAccess);
- return COSInputStream.create(getFilterList(), this, input,
scratchFile);
+ return COSInputStream.decode(getFilterList(), this, input,
scratchFile);
}
/**
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/filter/DCTFilter.java
b/pdfbox/src/main/java/org/apache/pdfbox/filter/DCTFilter.java
index eff70a428..efa70cb1f 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/filter/DCTFilter.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/filter/DCTFilter.java
@@ -26,6 +26,7 @@ import java.io.OutputStream;
import javax.imageio.IIOException;
import javax.imageio.ImageIO;
+import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.metadata.IIOMetadataNode;
@@ -51,10 +52,15 @@ final class DCTFilter extends Filter
private static final String ADOBE = "Adobe";
@Override
- public DecodeResult decode(InputStream encoded, OutputStream decoded,
- COSDictionary parameters, int index)
throws IOException
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
COSDictionary
+ parameters, int index, DecodeOptions options) throws IOException
{
- ImageReader reader = findImageReader("JPEG", "a suitable JAI I/O image
filter is not installed");
+ if (options.isMetadataOnly())
+ {
+ return new DecodeResult(parameters);
+ }
+ ImageReader reader = findImageReader("JPEG", "a suitable JAI I/O image
filter is not " +
+ "installed");
try (ImageInputStream iis = ImageIO.createImageInputStream(encoded))
{
@@ -63,9 +69,15 @@ final class DCTFilter extends Filter
{
iis.seek(0);
}
-
+
reader.setInput(iis);
-
+ ImageReadParam irp = reader.getDefaultReadParam();
+ irp.setSourceSubsampling(options.getSubsamplingX(),
options.getSubsamplingY(),
+ options.getSubsamplingOffsetX(),
options.getSubsamplingOffsetY());
+ irp.setSourceRegion(options.getSourceRegion());
+ options.setHonored(true);
+
+
String numChannels = getNumChannels(reader);
// get the raster using horrible JAI workarounds
@@ -73,29 +85,29 @@ final class DCTFilter extends Filter
Raster raster;
// Strategy: use read() for RGB or "can't get metadata"
- // use readRaster() for CMYK and gray and as fallback if read()
fails
+ // use readRaster() for CMYK and gray and as fallback if read()
fails
// after "can't get metadata" because "no meta" file was CMYK
if ("3".equals(numChannels) || numChannels.isEmpty())
{
try
{
- // I'd like to use ImageReader#readRaster but it is buggy
and can't read RGB correctly
- BufferedImage image = reader.read(0);
+ // I'd like to use ImageReader#readRaster but it is buggy
and can't read RGB
+ // correctly
+ BufferedImage image = reader.read(0, irp);
raster = image.getRaster();
- }
- catch (IIOException e)
+ } catch (IIOException e)
{
// JAI can't read CMYK JPEGs using ImageReader#read or
ImageIO.read but
// fortunately ImageReader#readRaster isn't buggy when
reading 4-channel files
- LOG.debug("Couldn't read use read() for RGB image - using
readRaster() as fallback", e);
- raster = reader.readRaster(0, null);
+ LOG.debug("Couldn't read use read() for RGB image - using
readRaster() as " +
+ "fallback", e);
+ raster = reader.readRaster(0, irp);
}
- }
- else
+ } else
{
// JAI can't read CMYK JPEGs using ImageReader#read or
ImageIO.read but
// fortunately ImageReader#readRaster isn't buggy when reading
4-channel files
- raster = reader.readRaster(0, null);
+ raster = reader.readRaster(0, irp);
}
// special handling for 4-component images
@@ -106,11 +118,11 @@ final class DCTFilter extends Filter
try
{
transform = getAdobeTransform(reader.getImageMetadata(0));
- }
- catch (IIOException | NegativeArraySizeException e)
+ } catch (IIOException | NegativeArraySizeException e)
{
// we really tried asking nicely, now we're using brute
force.
- LOG.debug("Couldn't read usíng getAdobeTransform() - using
getAdobeTransformByBruteForce() as fallback", e);
+ LOG.debug("Couldn't read usíng getAdobeTransform() - using
" +
+ "getAdobeTransformByBruteForce() as fallback", e);
transform = getAdobeTransformByBruteForce(iis);
}
int colorTransform = transform != null ? transform : 0;
@@ -130,28 +142,33 @@ final class DCTFilter extends Filter
default:
throw new IllegalArgumentException("Unknown
colorTransform");
}
- }
- else if (raster.getNumBands() == 3)
+ } else if (raster.getNumBands() == 3)
{
// BGR to RGB
raster = fromBGRtoRGB(raster);
}
- DataBufferByte dataBuffer = (DataBufferByte)raster.getDataBuffer();
+ DataBufferByte dataBuffer = (DataBufferByte)
raster.getDataBuffer();
decoded.write(dataBuffer.getData());
- }
- finally
+ } finally
{
reader.dispose();
}
return new DecodeResult(parameters);
}
+ @Override
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
+ COSDictionary parameters, int index) throws
IOException
+ {
+ return decode(encoded, decoded, parameters, index,
DecodeOptions.DEFAULT);
+ }
+
// reads the APP14 Adobe transform tag and returns its value, or 0 if
unknown
private Integer getAdobeTransform(IIOMetadata metadata)
{
- Element tree =
(Element)metadata.getAsTree("javax_imageio_jpeg_image_1.0");
- Element markerSequence =
(Element)tree.getElementsByTagName("markerSequence").item(0);
+ Element tree = (Element)
metadata.getAsTree("javax_imageio_jpeg_image_1.0");
+ Element markerSequence = (Element)
tree.getElementsByTagName("markerSequence").item(0);
NodeList app14AdobeNodeList =
markerSequence.getElementsByTagName("app14Adobe");
if (app14AdobeNodeList != null && app14AdobeNodeList.getLength() > 0)
{
@@ -160,7 +177,7 @@ final class DCTFilter extends Filter
}
return 0;
}
-
+
// See in https://github.com/haraldk/TwelveMonkeys
// com.twelvemonkeys.imageio.plugins.jpeg.AdobeDCT class for structure of
APP14 segment
private int getAdobeTransformByBruteForce(ImageInputStream iis) throws
IOException
@@ -196,8 +213,7 @@ final class DCTFilter extends Filter
return app14[POS_TRANSFORM];
}
}
- }
- else
+ } else
{
a = 0;
}
@@ -239,7 +255,7 @@ final class DCTFilter extends Filter
value[0] = cyan;
value[1] = magenta;
value[2] = yellow;
- value[3] = (int)K;
+ value[3] = (int) K;
writableRaster.setPixel(x, y, value);
}
}
@@ -264,9 +280,10 @@ final class DCTFilter extends Filter
float K = value[3];
// YCbCr to RGB, see http://www.equasys.de/colorconversion.html
- int r = clamp( (1.164f * (Y-16)) + (1.596f * (Cr - 128)) );
- int g = clamp( (1.164f * (Y-16)) + (-0.392f * (Cb-128)) +
(-0.813f * (Cr-128)));
- int b = clamp( (1.164f * (Y-16)) + (2.017f * (Cb-128)));
+ int r = clamp((1.164f * (Y - 16)) + (1.596f * (Cr - 128)));
+ int g = clamp((1.164f * (Y - 16)) + (-0.392f * (Cb - 128)) +
(-0.813f * (Cr -
+ 128)));
+ int b = clamp((1.164f * (Y - 16)) + (2.017f * (Cb - 128)));
// naive RGB to CMYK
int cyan = 255 - r;
@@ -277,7 +294,7 @@ final class DCTFilter extends Filter
value[0] = cyan;
value[1] = magenta;
value[2] = yellow;
- value[3] = (int)K;
+ value[3] = (int) K;
writableRaster.setPixel(x, y, value);
}
}
@@ -307,8 +324,9 @@ final class DCTFilter extends Filter
}
return writableRaster;
}
-
- // returns the number of channels as a string, or an empty string if there
is an error getting the meta data
+
+ // returns the number of channels as a string, or an empty string if there
is an error
+ // getting the meta data
private String getNumChannels(ImageReader reader)
{
try
@@ -318,25 +336,26 @@ final class DCTFilter extends Filter
{
return "";
}
- IIOMetadataNode metaTree = (IIOMetadataNode)
imageMetadata.getAsTree("javax_imageio_1.0");
- Element numChannelsItem = (Element)
metaTree.getElementsByTagName("NumChannels").item(0);
+ IIOMetadataNode metaTree = (IIOMetadataNode)
imageMetadata.getAsTree
+ ("javax_imageio_1.0");
+ Element numChannelsItem = (Element)
metaTree.getElementsByTagName("NumChannels").item
+ (0);
if (numChannelsItem == null)
{
return "";
}
return numChannelsItem.getAttribute("value");
- }
- catch (IOException | NegativeArraySizeException e)
+ } catch (IOException | NegativeArraySizeException e)
{
LOG.debug("Couldn't read metadata - returning empty string", e);
return "";
}
- }
+ }
// clamps value to 0-255 range
private int clamp(float value)
{
- return (int)((value < 0) ? 0 : ((value > 255) ? 255 : value));
+ return (int) ((value < 0) ? 0 : ((value > 255) ? 255 : value));
}
@Override
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/filter/Filter.java
b/pdfbox/src/main/java/org/apache/pdfbox/filter/Filter.java
index 4fcaf43c6..0b06a305b 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/filter/Filter.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/filter/Filter.java
@@ -59,26 +59,35 @@ public abstract class Filter
/**
* Decodes data, producing the original non-encoded data.
- * @param encoded the encoded byte stream
- * @param decoded the stream where decoded data will be written
+ *
+ * @param encoded the encoded byte stream
+ * @param decoded the stream where decoded data will be written
* @param parameters the parameters used for decoding
- * @param index the index to the filter being decoded
+ * @param index the index to the filter being decoded
* @return repaired parameters dictionary, or the original parameters
dictionary
* @throws IOException if the stream cannot be decoded
*/
- public abstract DecodeResult decode(InputStream encoded, OutputStream
decoded, COSDictionary parameters,
- int index) throws IOException;
+ public abstract DecodeResult decode(InputStream encoded, OutputStream
decoded, COSDictionary
+ parameters,
+ int index) throws IOException;
+
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
COSDictionary parameters,
+ int index, DecodeOptions options) throws
IOException
+ {
+ return decode(encoded, decoded, parameters, index);
+ }
/**
* Encodes data.
- * @param input the byte stream to encode
- * @param encoded the stream where encoded data will be written
+ *
+ * @param input the byte stream to encode
+ * @param encoded the stream where encoded data will be written
* @param parameters the parameters used for encoding
- * @param index the index to the filter being encoded
+ * @param index the index to the filter being encoded
* @throws IOException if the stream cannot be encoded
*/
public final void encode(InputStream input, OutputStream encoded,
COSDictionary parameters,
- int index) throws IOException
+ int index) throws IOException
{
encode(input, encoded, parameters.asUnmodifiableDictionary());
}
@@ -96,26 +105,25 @@ public abstract class Filter
if (filter instanceof COSName && obj instanceof COSDictionary)
{
// PDFBOX-3932: The PDF specification requires "If there is only
one filter and that
- // filter has parameters, DecodeParms shall be set to the filter’s
parameter dictionary"
+ // filter has parameters, DecodeParms shall be set to the filter’s
parameter
+ // dictionary"
// but tests show that Adobe means "one filter name object".
- return (COSDictionary)obj;
- }
- else if (filter instanceof COSArray && obj instanceof COSArray)
+ return (COSDictionary) obj;
+ } else if (filter instanceof COSArray && obj instanceof COSArray)
{
- COSArray array = (COSArray)obj;
+ COSArray array = (COSArray) obj;
if (index < array.size())
{
COSBase objAtIndex = array.getObject(index);
if (objAtIndex instanceof COSDictionary)
{
- return (COSDictionary)array.getObject(index);
+ return (COSDictionary) array.getObject(index);
}
}
- }
- else if (obj != null && !(filter instanceof COSArray || obj instanceof
COSArray))
+ } else if (obj != null && !(filter instanceof COSArray || obj
instanceof COSArray))
{
LOG.error("Expected DecodeParams to be an Array or Dictionary but
found " +
- obj.getClass().getName());
+ obj.getClass().getName());
}
return new COSDictionary();
}
@@ -128,7 +136,8 @@ public abstract class Filter
* @return The image reader for the format.
* @throws MissingImageReaderException if no image reader is found.
*/
- protected static ImageReader findImageReader(String formatName, String
errorCause) throws MissingImageReaderException
+ protected static ImageReader findImageReader(String formatName, String
errorCause) throws
+ MissingImageReaderException
{
Iterator<ImageReader> readers =
ImageIO.getImageReadersByFormatName(formatName);
ImageReader reader = null;
@@ -142,7 +151,8 @@ public abstract class Filter
}
if (reader == null)
{
- throw new MissingImageReaderException("Cannot read " + formatName
+ " image: " + errorCause);
+ throw new MissingImageReaderException("Cannot read " + formatName
+ " image: " +
+ errorCause);
}
return reader;
}
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/filter/FlateFilter.java
b/pdfbox/src/main/java/org/apache/pdfbox/filter/FlateFilter.java
index 341413385..879b814fd 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/filter/FlateFilter.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/filter/FlateFilter.java
@@ -25,6 +25,7 @@ import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
+
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.pdfbox.cos.COSDictionary;
@@ -43,9 +44,13 @@ final class FlateFilter extends Filter
private static final int BUFFER_SIZE = 16348;
@Override
- public DecodeResult decode(InputStream encoded, OutputStream decoded,
- COSDictionary parameters, int index)
throws IOException
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
COSDictionary
+ parameters, int index, DecodeOptions options) throws IOException
{
+ if (options.isMetadataOnly())
+ {
+ return new DecodeResult(parameters);
+ }
final COSDictionary decodeParams = getDecodeParams(parameters, index);
int predictor = decodeParams.getInt(COSName.PREDICTOR);
@@ -63,13 +68,11 @@ final class FlateFilter extends Filter
decoded.flush();
baos.reset();
bais.reset();
- }
- else
+ } else
{
decompress(encoded, decoded);
}
- }
- catch (DataFormatException e)
+ } catch (DataFormatException e)
{
// if the stream is corrupt a DataFormatException may occur
LOG.error("FlateFilter: stop reading corrupt stream due to a
DataFormatException");
@@ -80,60 +83,67 @@ final class FlateFilter extends Filter
return new DecodeResult(parameters);
}
+ @Override
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
+ COSDictionary parameters, int index) throws
IOException
+ {
+ return decode(encoded, decoded, parameters, index,
DecodeOptions.DEFAULT);
+ }
+
// Use Inflater instead of InflateInputStream to avoid an EOFException due
to a probably
// missing Z_STREAM_END, see PDFBOX-1232 for details
- private void decompress(InputStream in, OutputStream out) throws
IOException, DataFormatException
- {
+ private void decompress(InputStream in, OutputStream out) throws
IOException,
+ DataFormatException
+ {
byte[] buf = new byte[2048];
// skip zlib header
- in.read(buf,0,2);
- int read = in.read(buf);
- if (read > 0)
- {
+ in.read(buf, 0, 2);
+ int read = in.read(buf);
+ if (read > 0)
+ {
// use nowrap mode to bypass zlib-header and checksum to avoid a
DataFormatException
- Inflater inflater = new Inflater(true);
- inflater.setInput(buf,0,read);
- byte[] res = new byte[1024];
+ Inflater inflater = new Inflater(true);
+ inflater.setInput(buf, 0, read);
+ byte[] res = new byte[1024];
boolean dataWritten = false;
- while (true)
- {
+ while (true)
+ {
int resRead = 0;
try
{
resRead = inflater.inflate(res);
- }
- catch(DataFormatException exception)
+ } catch (DataFormatException exception)
{
if (dataWritten)
{
// some data could be read -> don't throw an exception
- LOG.warn("FlateFilter: premature end of stream due to
a DataFormatException");
+ LOG.warn("FlateFilter: premature end of stream due to
a " +
+ "DataFormatException");
break;
- }
- else
+ } else
{
// nothing could be read -> re-throw exception
throw exception;
}
}
- if (resRead != 0)
- {
- out.write(res,0,resRead);
+ if (resRead != 0)
+ {
+ out.write(res, 0, resRead);
dataWritten = true;
- continue;
- }
- if (inflater.finished() || inflater.needsDictionary() ||
in.available() == 0)
+ continue;
+ }
+ if (inflater.finished() || inflater.needsDictionary() ||
in.available() == 0)
{
break;
- }
- read = in.read(buf);
- inflater.setInput(buf,0,read);
+ }
+ read = in.read(buf);
+ inflater.setInput(buf, 0, read);
}
inflater.end();
}
out.flush();
}
-
+
@Override
protected void encode(InputStream input, OutputStream encoded,
COSDictionary parameters)
throws IOException
@@ -141,22 +151,22 @@ final class FlateFilter extends Filter
int compressionLevel = Deflater.DEFAULT_COMPRESSION;
try
{
- compressionLevel =
Integer.parseInt(System.getProperty(Filter.SYSPROP_DEFLATELEVEL, "-1"));
- }
- catch (NumberFormatException ex)
+ compressionLevel =
Integer.parseInt(System.getProperty(Filter.SYSPROP_DEFLATELEVEL,
+ "-1"));
+ } catch (NumberFormatException ex)
{
LOG.warn(ex.getMessage(), ex);
}
compressionLevel = Math.max(-1, Math.min(Deflater.BEST_COMPRESSION,
compressionLevel));
Deflater deflater = new Deflater(compressionLevel);
- try (DeflaterOutputStream out = new
DeflaterOutputStream(encoded,deflater))
+ try (DeflaterOutputStream out = new DeflaterOutputStream(encoded,
deflater))
{
int amountRead;
int mayRead = input.available();
if (mayRead > 0)
{
- byte[] buffer = new byte[Math.min(mayRead,BUFFER_SIZE)];
- while ((amountRead = input.read(buffer, 0,
Math.min(mayRead,BUFFER_SIZE))) != -1)
+ byte[] buffer = new byte[Math.min(mayRead, BUFFER_SIZE)];
+ while ((amountRead = input.read(buffer, 0, Math.min(mayRead,
BUFFER_SIZE))) != -1)
{
out.write(buffer, 0, amountRead);
}
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/filter/JBIG2Filter.java
b/pdfbox/src/main/java/org/apache/pdfbox/filter/JBIG2Filter.java
index 756d47237..7f0fb4d8c 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/filter/JBIG2Filter.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/filter/JBIG2Filter.java
@@ -25,6 +25,7 @@ import java.io.InputStream;
import java.io.OutputStream;
import java.io.SequenceInputStream;
import javax.imageio.ImageIO;
+import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import org.apache.commons.logging.Log;
@@ -61,8 +62,8 @@ final class JBIG2Filter extends Filter
}
@Override
- public DecodeResult decode(InputStream encoded, OutputStream decoded,
- COSDictionary parameters, int index)
throws IOException
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
COSDictionary
+ parameters, int index, DecodeOptions options) throws IOException
{
ImageReader reader = findImageReader("JBIG2", "jbig2-imageio is not
installed");
if (reader.getClass().getName().contains("levigo"))
@@ -73,6 +74,17 @@ final class JBIG2Filter extends Filter
int bits = parameters.getInt(COSName.BITS_PER_COMPONENT, 1);
COSDictionary params = getDecodeParams(parameters, index);
+ if (options.isMetadataOnly())
+ {
+ return new DecodeResult(parameters);
+ }
+
+ ImageReadParam irp = reader.getDefaultReadParam();
+ irp.setSourceSubsampling(options.getSubsamplingX(),
options.getSubsamplingY(),
+ options.getSubsamplingOffsetX(),
options.getSubsamplingOffsetY());
+ irp.setSourceRegion(options.getSourceRegion());
+ options.setHonored(true);
+
InputStream source = encoded;
if (params != null)
{
@@ -90,9 +102,8 @@ final class JBIG2Filter extends Filter
BufferedImage image;
try
{
- image = reader.read(0, reader.getDefaultReadParam());
- }
- catch (Exception e)
+ image = reader.read(0, irp);
+ } catch (Exception e)
{
// wrap and rethrow any exceptions
throw new IOException("Could not read JBIG2 image", e);
@@ -128,9 +139,17 @@ final class JBIG2Filter extends Filter
{
reader.dispose();
}
+
return new DecodeResult(parameters);
}
+ @Override
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
+ COSDictionary parameters, int index) throws
IOException
+ {
+ return decode(encoded, decoded, parameters, index,
DecodeOptions.DEFAULT);
+ }
+
@Override
protected void encode(InputStream input, OutputStream encoded,
COSDictionary parameters)
throws IOException
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/filter/JPXFilter.java
b/pdfbox/src/main/java/org/apache/pdfbox/filter/JPXFilter.java
index c9f91cfbe..0a706b0c3 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/filter/JPXFilter.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/filter/JPXFilter.java
@@ -24,9 +24,11 @@ import java.awt.image.WritableRaster;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
+import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;
+
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.graphics.color.PDJPXColorSpace;
@@ -34,12 +36,12 @@ import
org.apache.pdfbox.pdmodel.graphics.color.PDJPXColorSpace;
/**
* Decompress data encoded using the wavelet-based JPEG 2000 standard,
* reproducing the original data.
- *
+ * <p>
* Requires the Java Advanced Imaging (JAI) Image I/O Tools to be installed
from java.net, see
* <a
href="http://download.java.net/media/jai-imageio/builds/release/1.1/">jai-imageio</a>.
* Alternatively you can build from the source available in the
* <a href="https://java.net/projects/jai-imageio-core/">jai-imageio-core svn
repo</a>.
- *
+ * <p>
* Mac OS X users should download the tar.gz file for linux and unpack it to
obtain the
* required jar files. The .so file can be safely ignored.
*
@@ -49,12 +51,17 @@ import
org.apache.pdfbox.pdmodel.graphics.color.PDJPXColorSpace;
public final class JPXFilter extends Filter
{
@Override
- public DecodeResult decode(InputStream encoded, OutputStream decoded,
- COSDictionary parameters, int index)
throws IOException
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
COSDictionary
+ parameters, int index, DecodeOptions options) throws IOException
{
DecodeResult result = new DecodeResult(new COSDictionary());
result.getParameters().addAll(parameters);
- BufferedImage image = readJPX(encoded, result);
+ BufferedImage image = readJPX(encoded, options, result);
+
+ if (options.isMetadataOnly())
+ {
+ return result;
+ }
WritableRaster raster = image.getRaster();
switch (raster.getDataBuffer().getDataType())
@@ -74,25 +81,39 @@ public final class JPXFilter extends Filter
return result;
default:
- throw new IOException("Data type " +
raster.getDataBuffer().getDataType() + " not implemented");
- }
+ throw new IOException("Data type " +
raster.getDataBuffer().getDataType() + " not" +
+ " implemented");
+ }
+ }
+
+ @Override
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
+ COSDictionary parameters, int index) throws
IOException
+ {
+ return decode(encoded, decoded, parameters, index,
DecodeOptions.DEFAULT);
}
// try to read using JAI Image I/O
- private BufferedImage readJPX(InputStream input, DecodeResult result)
throws IOException
+ private BufferedImage readJPX(InputStream input, DecodeOptions options,
DecodeResult result)
+ throws IOException
{
- ImageReader reader = findImageReader("JPEG2000", "Java Advanced
Imaging (JAI) Image I/O Tools are not installed");
+ ImageReader reader = findImageReader("JPEG2000", "Java Advanced
Imaging (JAI) Image I/O " +
+ "Tools are not installed");
// PDFBOX-4121: ImageIO.createImageInputStream() is much slower
try (ImageInputStream iis = new MemoryCacheImageInputStream(input))
{
reader.setInput(iis, true, true);
+ ImageReadParam irp = reader.getDefaultReadParam();
+ irp.setSourceRegion(options.getSourceRegion());
+ irp.setSourceSubsampling(options.getSubsamplingX(), options.getSubsamplingY(),
+ options.getSubsamplingOffsetX(), options.getSubsamplingOffsetY());
+ options.setHonored(true);
BufferedImage image;
try
{
- image = reader.read(0);
- }
- catch (Exception e)
+ image = reader.read(0, irp);
+ } catch (Exception e)
{
// wrap and rethrow any exceptions
throw new IOException("Could not read JPEG 2000 (JPX) image", e);
@@ -114,8 +135,8 @@ public final class JPXFilter extends Filter
}
// override dimensions, see PDFBOX-1735
- parameters.setInt(COSName.WIDTH, image.getWidth());
- parameters.setInt(COSName.HEIGHT, image.getHeight());
+ parameters.setInt(COSName.WIDTH, reader.getWidth(0));
+ parameters.setInt(COSName.HEIGHT, reader.getHeight(0));
// extract embedded color space
if (!parameters.containsKey(COSName.COLORSPACE))
@@ -124,8 +145,7 @@ public final class JPXFilter extends Filter
}
return image;
- }
- finally
+ } finally
{
reader.dispose();
}
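
For anyone who wants to see the ImageReadParam mechanics in isolation: the sketch below applies the same two calls (setSourceRegion and setSourceSubsampling) that the readJPX change above passes through from DecodeOptions. It is only an illustration under assumptions. It uses the JDK's built-in PNG reader rather than the JAI JPEG 2000 plugin, and the names RegionReadSketch, makeTestPng and readRegion are made up for this example and are not part of the patch.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;

public class RegionReadSketch
{
    // Hypothetical helper: encode a blank RGB image as PNG so there is something to read.
    static byte[] makeTestPng(int width, int height) throws IOException
    {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(img, "png", out);
        return out.toByteArray();
    }

    // Hypothetical helper: decode only `region`, keeping every `subsample`-th pixel.
    static BufferedImage readRegion(byte[] data, Rectangle region, int subsample)
            throws IOException
    {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("png");
        if (!readers.hasNext())
        {
            throw new IOException("No PNG reader available");
        }
        ImageReader reader = readers.next();
        try (ImageInputStream iis =
                     new MemoryCacheImageInputStream(new ByteArrayInputStream(data)))
        {
            reader.setInput(iis, true, true);
            ImageReadParam param = reader.getDefaultReadParam();
            param.setSourceRegion(region); // null would mean the whole image
            param.setSourceSubsampling(subsample, subsample, 0, 0);
            return reader.read(0, param);
        }
        finally
        {
            reader.dispose();
        }
    }

    public static void main(String[] args) throws IOException
    {
        byte[] png = makeTestPng(200, 160);
        BufferedImage part = readRegion(png, new Rectangle(10, 10, 100, 80), 4);
        System.out.println(part.getWidth() + "x" + part.getHeight());
    }
}
```

A 100x80 region read with a subsampling factor of 4 comes back as a 25x20 BufferedImage, while reader.getWidth(0) would still report the full 200 pixels. That difference is why the patch writes reader.getWidth(0) and reader.getHeight(0) into the parameters instead of the dimensions of the returned image.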
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/filter/LZWFilter.java b/pdfbox/src/main/java/org/apache/pdfbox/filter/LZWFilter.java
index a67d1c67b..8443e7ffc 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/filter/LZWFilter.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/filter/LZWFilter.java
@@ -34,7 +34,6 @@ import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
/**
- *
* This is the filter used for the LZWDecode filter.
*
* @author Ben Litchfield
@@ -56,17 +55,19 @@ public class LZWFilter extends Filter
* The LZW end of data code.
*/
public static final long EOD = 257;
-
+
//BEWARE: codeTable must be local to each method, because there is only
// one instance of each filter
- /**
- * {@inheritDoc}
- */
+
@Override
- public DecodeResult decode(InputStream encoded, OutputStream decoded,
- COSDictionary parameters, int index) throws IOException
+ public DecodeResult decode(InputStream encoded, OutputStream decoded, COSDictionary
+ parameters, int index, DecodeOptions options) throws IOException
{
+ if (options.isMetadataOnly())
+ {
+ return new DecodeResult(parameters);
+ }
COSDictionary decodeParams = getDecodeParams(parameters, index);
int predictor = decodeParams.getInt(COSName.PREDICTOR);
int earlyChange = decodeParams.getInt(COSName.EARLY_CHANGE, 1);
@@ -88,15 +89,25 @@ public class LZWFilter extends Filter
decoded.flush();
baos.reset();
bais.reset();
- }
- else
+ } else
{
doLZWDecode(encoded, decoded, earlyChange);
}
return new DecodeResult(parameters);
}
- private void doLZWDecode(InputStream encoded, OutputStream decoded, int earlyChange) throws IOException
+ /**
+ * {@inheritDoc}
+ */
+ @Override
+ public DecodeResult decode(InputStream encoded, OutputStream decoded,
+ COSDictionary parameters, int index) throws IOException
+ {
+ return decode(encoded, decoded, parameters, index, DecodeOptions.DEFAULT);
+ }
+
+ private void doLZWDecode(InputStream encoded, OutputStream decoded, int earlyChange) throws
+ IOException
{
List<byte[]> codeTable = new ArrayList<>();
int chunk = 9;
@@ -113,8 +124,7 @@ public class LZWFilter extends Filter
chunk = 9;
codeTable = createCodeTable();
prevCommand = -1;
- }
- else
+ } else
{
if (nextCommand < codeTable.size())
{
@@ -129,8 +139,7 @@ public class LZWFilter extends Filter
newData[data.length] = firstByte;
codeTable.add(newData);
}
- }
- else
+ } else
{
checkIndexBounds(codeTable, prevCommand, in);
byte[] data = codeTable.get((int) prevCommand);
@@ -139,20 +148,20 @@ public class LZWFilter extends Filter
decoded.write(newData);
codeTable.add(newData);
}
-
+
chunk = calculateChunk(codeTable.size(), earlyChange);
prevCommand = nextCommand;
}
}
- }
- catch (EOFException ex)
+ } catch (EOFException ex)
{
LOG.warn("Premature EOF in LZW stream, EOD code missing", ex);
}
decoded.flush();
}
- private void checkIndexBounds(List<byte[]> codeTable, long index, MemoryCacheImageInputStream in)
+ private void checkIndexBounds(List<byte[]> codeTable, long index, MemoryCacheImageInputStream
+ in)
throws IOException
{
if (index < 0)
@@ -189,10 +198,9 @@ public class LZWFilter extends Filter
byte by = (byte) r;
if (inputPattern == null)
{
- inputPattern = new byte[] { by };
+ inputPattern = new byte[]{by};
foundCode = by & 0xff;
- }
- else
+ } else
{
inputPattern = Arrays.copyOf(inputPattern, inputPattern.length + 1);
inputPattern[inputPattern.length - 1] = by;
@@ -204,18 +212,17 @@ public class LZWFilter extends Filter
out.writeBits(foundCode, chunk);
// create new table entry
codeTable.add(inputPattern);
-
+
if (codeTable.size() == 4096)
{
// code table is full
out.writeBits(CLEAR_TABLE, chunk);
codeTable = createCodeTable();
}
-
- inputPattern = new byte[] { by };
+
+ inputPattern = new byte[]{by};
foundCode = by & 0xff;
- }
- else
+ } else
{
foundCode = newFoundCode;
}
@@ -226,19 +233,19 @@ public class LZWFilter extends Filter
chunk = calculateChunk(codeTable.size() - 1, 1);
out.writeBits(foundCode, chunk);
}
-
+
// PPDFBOX-1977: the decoder wouldn't know that the encoder would output
// an EOD as code, so he would have increased his own code table and
// possibly adjusted the chunk. Therefore, the encoder must behave as
// if the code table had just grown and thus it must be checked it is
// needed to adjust the chunk, based on an increased table size parameter
chunk = calculateChunk(codeTable.size(), 1);
-
+
out.writeBits(EOD, chunk);
-
+
// pad with 0
out.writeBits(0, 7);
-
+
// must do or file will be empty :-(
out.flush();
}
@@ -248,7 +255,7 @@ public class LZWFilter extends Filter
* Find the longest matching pattern in the code table.
*
* @param codeTable The LZW code table.
- * @param pattern The pattern to be searched for.
+ * @param pattern The pattern to be searched for.
* @return The index of the longest matching pattern or -1 if nothing is
* found.
*/
@@ -264,16 +271,16 @@ public class LZWFilter extends Filter
if (foundCode != -1)
{
// we already found pattern with size > 1
- return foundCode;
- }
- else if (pattern.length > 1)
+ return foundCode;
+ } else if (pattern.length > 1)
{
// we won't find anything here anyway
return -1;
}
}
byte[] tryPattern = codeTable.get(i);
- if ((foundCode != -1 || tryPattern.length > foundLen) && Arrays.equals(tryPattern, pattern))
+ if ((foundCode != -1 || tryPattern.length > foundLen) && Arrays.equals(tryPattern,
+ pattern))
{
foundCode = i;
foundLen = tryPattern.length;
@@ -291,7 +298,7 @@ public class LZWFilter extends Filter
List<byte[]> codeTable = new ArrayList<>(4096);
for (int i = 0; i < 256; ++i)
{
- codeTable.add(new byte[] { (byte) (i & 0xFF) });
+ codeTable.add(new byte[]{(byte) (i & 0xFF)});
}
codeTable.add(null); // 256 EOD
codeTable.add(null); // 257 CLEAR_TABLE
@@ -301,9 +308,8 @@ public class LZWFilter extends Filter
/**
* Calculate the appropriate chunk size
*
- * @param tabSize the size of the code table
+ * @param tabSize the size of the code table
* @param earlyChange 0 or 1 for early chunk increase
- *
* @return a value between 9 and 12
*/
private int calculateChunk(int tabSize, int earlyChange)
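
The pattern used in the filter changes above (a new decode overload that takes options, the old signature delegating to it with a shared default, and an early return when only metadata is wanted) can be sketched without any PDFBox classes. Everything in the sketch (DecodeOverloadSketch, OptionsSketch, UpperCaseFilterSketch, the metadataOnly field) is a made-up stand-in for illustration; the real DecodeOptions and Filter in the patch carry more state and return a DecodeResult rather than a byte count.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class DecodeOverloadSketch
{
    // Stand-in for DecodeOptions: only the metadata-only flag is modelled.
    static final class OptionsSketch
    {
        static final OptionsSketch DEFAULT = new OptionsSketch(false);
        final boolean metadataOnly;

        OptionsSketch(boolean metadataOnly)
        {
            this.metadataOnly = metadataOnly;
        }
    }

    // Stand-in for a Filter: "decoding" here just upper-cases ASCII bytes.
    static final class UpperCaseFilterSketch
    {
        // New overload; returns the number of bytes written.
        int decode(InputStream encoded, OutputStream decoded, OptionsSketch options)
                throws IOException
        {
            if (options.metadataOnly)
            {
                return 0; // caller only wants metadata, so the stream is not touched
            }
            int count = 0;
            int b;
            while ((b = encoded.read()) != -1)
            {
                decoded.write(Character.toUpperCase(b));
                count++;
            }
            return count;
        }

        // Old signature kept for existing callers; delegates with the default options.
        int decode(InputStream encoded, OutputStream decoded) throws IOException
        {
            return decode(encoded, decoded, OptionsSketch.DEFAULT);
        }
    }

    // Runs the stand-in filter over an ASCII string and returns what it wrote.
    static String run(String input, OptionsSketch options) throws IOException
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        InputStream in = new ByteArrayInputStream(input.getBytes(StandardCharsets.US_ASCII));
        new UpperCaseFilterSketch().decode(in, out, options);
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println(run("lzw", OptionsSketch.DEFAULT));
        System.out.println(run("lzw", new OptionsSketch(true)).isEmpty());
    }
}
```

Keeping the old signature as a thin delegate means existing callers do not need to change, and a filter that cannot do anything useful with the options (LZW, for instance) only needs the metadata-only check.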
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java
index 8f520f981..30a430ae8 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java
@@ -32,6 +32,8 @@ import org.apache.pdfbox.cos.COSInputStream;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.cos.COSNull;
import org.apache.pdfbox.cos.COSStream;
+import org.apache.pdfbox.filter.DecodeOptions;
+import org.apache.pdfbox.filter.DecodeResult;
import org.apache.pdfbox.filter.Filter;
import org.apache.pdfbox.filter.FilterFactory;
import org.apache.pdfbox.io.IOUtils;
@@ -229,6 +231,15 @@ public class PDStream implements COSObjectable
return stream.createInputStream();
}
+ public COSInputStream createInputStream(DecodeOptions options) throws IOException
+ {
+ return stream.createInputStream(options);
+ }
+
+ public DecodeResult decode() throws IOException {
+ return stream.decode();
+ }
+
/**
* This will get a stream with some filters applied but not others. This is
* useful when doing images, ie filters = [flate,dct], we want to remove
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImage.java b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImage.java
index 891544beb..cf154808b 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImage.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImage.java
@@ -16,12 +16,14 @@
*/
package org.apache.pdfbox.pdmodel.graphics.image;
-import java.awt.Paint;
+import java.awt.*;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
+
import org.apache.pdfbox.cos.COSArray;
+import org.apache.pdfbox.filter.DecodeOptions;
import org.apache.pdfbox.pdmodel.common.COSObjectable;
import org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace;
@@ -34,24 +36,29 @@ public interface PDImage extends COSObjectable
{
/**
* Returns the content of this image as an AWT buffered image with an (A)RGB color space.
- * The size of the returned image is the larger of the size of the image itself or its mask.
+ * The size of the returned image is the larger of the size of the image itself or its mask.
+ *
* @return content of this image as a buffered image.
* @throws IOException
*/
BufferedImage getImage() throws IOException;
+ BufferedImage getImage(Rectangle region, int subsample) throws IOException;
+
/**
* Returns an ARGB image filled with the given paint and using this image as a mask.
+ *
* @param paint the paint to fill the visible portions of the image with
* @return a masked image filled with the given paint
- * @throws IOException if the image cannot be read
+ * @throws IOException if the image cannot be read
* @throws IllegalStateException if the image is not a stencil.
*/
BufferedImage getStencilImage(Paint paint) throws IOException;
-
+
/**
* Returns an InputStream containing the image data, irrespective of whether this is an
* inline image or an image XObject.
+ *
* @return Decoded stream
* @throws IOException if the data could not be read.
*/
@@ -60,12 +67,15 @@ public interface PDImage extends COSObjectable
/**
* Returns an InputStream containing the image data, irrespective of whether this is an
* inline image or an image XObject. The given filters will not be decoded.
+ *
* @param stopFilters A list of filters to stop decoding at.
* @return Decoded stream
* @throws IOException if the data could not be read.
*/
InputStream createInputStream(List<String> stopFilters) throws IOException;
+ public InputStream createInputStream(DecodeOptions options) throws IOException;
+
/**
* Returns true if the image has no data.
*/
@@ -79,6 +89,7 @@ public interface PDImage extends COSObjectable
/**
* Sets whether or not the image is a stencil.
* This corresponds to the {@code ImageMask} entry in the image stream's dictionary.
+ *
* @param isStencil True to make the image a stencil.
*/
void setStencil(boolean isStencil);
@@ -90,18 +101,21 @@ public interface PDImage extends COSObjectable
/**
* Set the number of bits per component.
+ *
* @param bitsPerComponent The number of bits per component.
*/
void setBitsPerComponent(int bitsPerComponent);
/**
* Returns the image's color space.
+ *
* @throws IOException If there is an error getting the color space.
*/
PDColorSpace getColorSpace() throws IOException;
/**
* Sets the color space for this image.
+ *
* @param colorSpace The color space for this image.
*/
void setColorSpace(PDColorSpace colorSpace);
@@ -113,6 +127,7 @@ public interface PDImage extends COSObjectable
/**
* Sets the height of the image.
+ *
* @param height The height of the image.
*/
void setHeight(int height);
@@ -124,13 +139,15 @@ public interface PDImage extends COSObjectable
/**
* Sets the width of the image.
+ *
* @param width The width of the image.
*/
void setWidth(int width);
/**
* Sets the decode array.
- * @param decode the new decode array.
+ *
+ * @param decode the new decode array.
*/
void setDecode(COSArray decode);
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImageXObject.java b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImageXObject.java
index 1f8727364..8a5476d19 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImageXObject.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDImageXObject.java
@@ -16,9 +16,7 @@
*/
package org.apache.pdfbox.pdmodel.graphics.image;
-import java.awt.Graphics2D;
-import java.awt.Paint;
-import java.awt.RenderingHints;
+import java.awt.*;
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;
import java.io.BufferedInputStream;
@@ -31,6 +29,7 @@ import java.io.OutputStream;
import java.lang.ref.SoftReference;
import java.util.List;
import javax.imageio.ImageIO;
+
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.pdfbox.cos.COSArray;
@@ -39,6 +38,8 @@ import org.apache.pdfbox.cos.COSInputStream;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.cos.COSObject;
import org.apache.pdfbox.cos.COSStream;
+import org.apache.pdfbox.filter.DecodeOptions;
+import org.apache.pdfbox.filter.DecodeResult;
import org.apache.pdfbox.io.IOUtils;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDResources;
@@ -73,7 +74,8 @@ public final class PDImageXObject extends PDXObject implements PDImage
/**
* Creates an Image XObject in the given document. This constructor is for internal PDFBox use
- * and is not for PDF generation. Users who want to create images should look at {@link #createFromFileByExtension(File, PDDocument)
+ * and is not for PDF generation. Users who want to create images should look at {@link
+ * #createFromFileByExtension(File, PDDocument)
* }.
*
* @param document the current document
@@ -89,18 +91,18 @@ public final class PDImageXObject extends PDXObject implements PDImage
* constructor is for internal PDFBox use and is not for PDF generation. Users who want to
* create images should look at {@link #createFromFileByExtension(File, PDDocument) }.
*
- * @param document the current document
- * @param encodedStream an encoded stream of image data
- * @param cosFilter the filter or a COSArray of filters
- * @param width the image width
- * @param height the image height
+ * @param document the current document
+ * @param encodedStream an encoded stream of image data
+ * @param cosFilter the filter or a COSArray of filters
+ * @param width the image width
+ * @param height the image height
* @param bitsPerComponent the bits per component
- * @param initColorSpace the color space
+ * @param initColorSpace the color space
* @throws IOException if there is an error creating the XObject.
*/
- public PDImageXObject(PDDocument document, InputStream encodedStream,
- COSBase cosFilter, int width, int height, int bitsPerComponent,
- PDColorSpace initColorSpace) throws IOException
+ public PDImageXObject(PDDocument document, InputStream encodedStream,
+ COSBase cosFilter, int width, int height, int bitsPerComponent,
+ PDColorSpace initColorSpace) throws IOException
{
super(createRawStream(document, encodedStream), COSName.IMAGE);
getCOSObject().setItem(COSName.FILTER, cosFilter);
@@ -117,25 +119,26 @@ public final class PDImageXObject extends PDXObject implements PDImage
* constructor is for internal PDFBox use and is not for PDF generation. Users who want to
* create images should look at {@link #createFromFileByExtension(File, PDDocument) }.
*
- * @param stream the XObject stream to read
+ * @param stream the XObject stream to read
* @param resources the current resources
* @throws java.io.IOException if there is an error creating the XObject.
*/
public PDImageXObject(PDStream stream, PDResources resources) throws IOException
{
- this(stream, resources, stream.createInputStream());
+ this(stream, resources, stream.decode());
}
-
+
// repairs parameters using decode result
- private PDImageXObject(PDStream stream, PDResources resources, COSInputStream input)
+ private PDImageXObject(PDStream stream, PDResources resources, DecodeResult decodeResult)
{
- super(repair(stream, input), COSName.IMAGE);
+ super(repair(stream, decodeResult), COSName.IMAGE);
this.resources = resources;
- this.colorSpace = input.getDecodeResult().getJPXColorSpace();
+ this.colorSpace = decodeResult.getJPXColorSpace();
}
/**
* Creates a thumbnail Image XObject from the given COSBase and name.
+ *
* @param cosStream the COS stream
* @return an XObject
* @throws IOException if there is an error creating the XObject.
@@ -162,14 +165,15 @@ public final class PDImageXObject extends PDXObject implements PDImage
}
/**
- * Create a PDImageXObject from an image file, see {@link #createFromFileByExtension(File, PDDocument)} for
+ * Create a PDImageXObject from an image file, see
+ * {@link #createFromFileByExtension(File, PDDocument)} for
* more details.
*
* @param imagePath the image file path.
- * @param doc the document that shall use this PDImageXObject.
+ * @param doc the document that shall use this PDImageXObject.
* @return a PDImageXObject.
* @throws IOException if there is an error when reading the file or creating the
- * PDImageXObject, or if the image type is not supported.
+ * PDImageXObject, or if the image type is not supported.
*/
public static PDImageXObject createFromFile(String imagePath, PDDocument doc) throws IOException
{
@@ -185,13 +189,14 @@ public final class PDImageXObject extends PDXObject implements PDImage
* PDImageXObject from a BufferedImage).
*
* @param file the image file.
- * @param doc the document that shall use this PDImageXObject.
+ * @param doc the document that shall use this PDImageXObject.
* @return a PDImageXObject.
- * @throws IOException if there is an error when reading the file or creating the
- * PDImageXObject.
+ * @throws IOException if there is an error when reading the file or creating the
+ * PDImageXObject.
* @throws IllegalArgumentException if the image type is not supported.
*/
- public static PDImageXObject createFromFileByExtension(File file, PDDocument doc) throws IOException
+ public static PDImageXObject createFromFileByExtension(File file, PDDocument doc) throws
+ IOException
{
String name = file.getName();
int dot = file.getName().lastIndexOf('.');
@@ -228,20 +233,21 @@ public final class PDImageXObject extends PDXObject implements PDImage
* PDImageXObject from a BufferedImage).
*
* @param file the image file.
- * @param doc the document that shall use this PDImageXObject.
+ * @param doc the document that shall use this PDImageXObject.
* @return a PDImageXObject.
- * @throws IOException if there is an error when reading the file or creating the
- * PDImageXObject.
+ * @throws IOException if there is an error when reading the file or creating the
+ * PDImageXObject.
* @throws IllegalArgumentException if the image type is not supported.
*/
- public static PDImageXObject createFromFileByContent(File file, PDDocument doc) throws IOException
+ public static PDImageXObject createFromFileByContent(File file, PDDocument doc) throws
+ IOException
{
FileType fileType = null;
- try (BufferedInputStream bufferedInputStream = new BufferedInputStream(new FileInputStream(file)))
+ try (BufferedInputStream bufferedInputStream = new BufferedInputStream(new
+ FileInputStream(file)))
{
fileType = FileTypeDetector.detectFileType(bufferedInputStream);
- }
- catch (IOException e)
+ } catch (IOException e)
{
throw new IOException("Could not determine file type: " + file.getName(), e);
}
@@ -261,7 +267,8 @@ public final class PDImageXObject extends PDXObject implements PDImage
{
return CCITTFactory.createFromFile(doc, file);
}
- if (fileType.equals(FileType.BMP) || fileType.equals(FileType.GIF) || fileType.equals(FileType.PNG))
+ if (fileType.equals(FileType.BMP) || fileType.equals(FileType.GIF) || fileType.equals
+ (FileType.PNG))
{
BufferedImage bim = ImageIO.read(file);
return LosslessFactory.createFromImage(doc, bim);
@@ -278,21 +285,21 @@ public final class PDImageXObject extends PDXObject implements PDImage
* PDImageXObject from a BufferedImage).
*
* @param byteArray bytes from an image file.
- * @param document the document that shall use this PDImageXObject.
- * @param name name of image file for exception messages, can be null.
+ * @param document the document that shall use this PDImageXObject.
+ * @param name name of image file for exception messages, can be null.
* @return a PDImageXObject.
- * @throws IOException if there is an error when reading the file or creating the
- * PDImageXObject.
+ * @throws IOException if there is an error when reading the file or creating the
+ * PDImageXObject.
* @throws IllegalArgumentException if the image type is not supported.
*/
- public static PDImageXObject createFromByteArray(PDDocument document, byte[] byteArray, String name) throws IOException
+ public static PDImageXObject createFromByteArray(PDDocument document, byte[] byteArray,
+ String name) throws IOException
{
FileType fileType;
try
{
fileType = FileTypeDetector.detectFileType(byteArray);
- }
- catch (IOException e)
+ } catch (IOException e)
{
throw new IOException("Could not determine file type: " + name, e);
}
@@ -309,7 +316,8 @@ public final class PDImageXObject extends PDXObject implements PDImage
{
return CCITTFactory.createFromByteArray(document, byteArray);
}
- if (fileType.equals(FileType.BMP) || fileType.equals(FileType.GIF) || fileType.equals(FileType.PNG))
+ if (fileType.equals(FileType.BMP) || fileType.equals(FileType.GIF) || fileType.equals
+ (FileType.PNG))
{
ByteArrayInputStream bais = new ByteArrayInputStream(byteArray);
BufferedImage bim = ImageIO.read(bais);
@@ -319,14 +327,15 @@ public final class PDImageXObject extends PDXObject implements PDImage
}
// repairs parameters using decode result
- private static PDStream repair(PDStream stream, COSInputStream input)
+ private static PDStream repair(PDStream stream, DecodeResult decodeResult)
{
- stream.getCOSObject().addAll(input.getDecodeResult().getParameters());
+ stream.getCOSObject().addAll(decodeResult.getParameters());
return stream;
}
/**
* Returns the metadata associated with this XObject, or null if there is none.
+ *
* @return the metadata associated with this object.
*/
public PDMetadata getMetadata()
@@ -341,6 +350,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
/**
* Sets the metadata associated with this XObject, or null if there is none.
+ *
* @param meta the metadata associated with this object
*/
public void setMetadata(PDMetadata meta)
@@ -350,6 +360,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
/**
* Returns the key of this XObject in the structural parent tree.
+ *
* @return this object's key the structural parent tree
*/
public int getStructParent()
@@ -359,6 +370,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
/**
* Sets the key of this XObject in the structural parent tree.
+ *
* @param key the new key for this XObject
*/
public void setStructParent(int key)
@@ -381,17 +393,25 @@ public final class PDImageXObject extends PDXObject implements PDImage
return cached;
}
}
+ BufferedImage image = getImage(null, 1);
+ cachedImage = new SoftReference<>(image);
+ return image;
+ }
+ @Override
+ public BufferedImage getImage(Rectangle region, int subsample) throws IOException
+ {
// get image as RGB
- BufferedImage image = SampledImageReader.getRGBImage(this, getColorKeyMask());
+ BufferedImage image = SampledImageReader.getRGBImage(this, region, subsample,
+ getColorKeyMask());
+
// soft mask (overrides explicit mask)
PDImageXObject softMask = getSoftMask();
if (softMask != null)
{
image = applyMask(image, softMask.getOpaqueImage(), true);
- }
- else
+ } else
{
// explicit mask - to be applied only if /ImageMask true
PDImageXObject mask = getMask();
@@ -401,10 +421,11 @@ public final class PDImageXObject extends PDXObject implements PDImage
}
}
- cachedImage = new SoftReference<>(image);
return image;
+
}
+
/**
* {@inheritDoc}
* The returned images are not cached.
@@ -422,6 +443,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
/**
* Returns an RGB buffered image containing the opaque image stream without any masks applied.
* If this Image XObject is a mask then the buffered image will contain the raw mask.
+ *
* @return the image without any masks applied
* @throws IOException if the image cannot be read
*/
@@ -447,8 +469,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
if (mask.getWidth() < width || mask.getHeight() < height)
{
mask = scaleImage(mask, width, height);
- }
- else if (mask.getWidth() > width || mask.getHeight() > height)
+ } else if (mask.getWidth() > width || mask.getHeight() > height)
{
width = mask.getWidth();
height = mask.getHeight();
@@ -473,13 +494,12 @@ public final class PDImageXObject extends PDXObject implements PDImage
rgba[0] = rgb[0];
rgba[1] = rgb[1];
rgba[2] = rgb[2];
-
+
alphaPixel = alpha.getPixel(x, y, alphaPixel);
if (isSoft)
{
rgba[3] = alphaPixel[0];
- }
- else
+ } else
{
rgba[3] = 255 - alphaPixel[0];
}
@@ -499,9 +519,9 @@ public final class PDImageXObject extends PDXObject implements PDImage
BufferedImage image2 = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
Graphics2D g = image2.createGraphics();
g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
- RenderingHints.VALUE_INTERPOLATION_BICUBIC);
+ RenderingHints.VALUE_INTERPOLATION_BICUBIC);
g.setRenderingHint(RenderingHints.KEY_RENDERING,
- RenderingHints.VALUE_RENDER_QUALITY);
+ RenderingHints.VALUE_RENDER_QUALITY);
g.drawImage(image, 0, 0, width, height, 0, 0, image.getWidth(), image.getHeight(), null);
g.dispose();
return image2;
@@ -509,6 +529,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
/**
* Returns the Mask Image XObject associated with this image, or null if there is none.
+ *
* @return Mask Image XObject
* @throws java.io.IOException
*/
@@ -519,8 +540,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
{
// color key mask, no explicit mask to return
return null;
- }
- else
+ } else
{
COSStream cosStream = (COSStream) getCOSObject().getDictionaryObject(COSName.MASK);
if (cosStream != null)
@@ -534,6 +554,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
/**
* Returns the color key mask array associated with this image, or null if there is none.
+ *
* @return Mask Image XObject
*/
public COSArray getColorKeyMask()
@@ -541,13 +562,14 @@ public final class PDImageXObject extends PDXObject implements PDImage
COSBase mask = getCOSObject().getDictionaryObject(COSName.MASK);
if (mask instanceof COSArray)
{
- return (COSArray)mask;
+ return (COSArray) mask;
}
return null;
}
/**
* Returns the Soft Mask Image XObject associated with this image, or null if there is none.
+ *
* @return the SMask Image XObject, or null.
* @throws java.io.IOException
*/
@@ -568,8 +590,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
if (isStencil())
{
return 1;
- }
- else
+ } else
{
return getCOSObject().getInt(COSName.BITS_PER_COMPONENT, COSName.BPC);
}
@@ -607,13 +628,11 @@ public final class PDImageXObject extends PDXObject implements PDImage
{
resources.getResourceCache().put(indirect, colorSpace);
}
- }
- else if (isStencil())
+ } else if (isStencil())
{
// stencil mask color space must be gray, it is often missing
return PDDeviceGray.INSTANCE;
- }
- else
+ } else
{
// an image without a color space is always broken
throw new IOException("could not determine color space");
@@ -628,6 +647,12 @@ public final class PDImageXObject extends PDXObject implements PDImage
return getStream().createInputStream();
}
+ @Override
+ public InputStream createInputStream(DecodeOptions options) throws IOException
+ {
+ return getStream().createInputStream(options);
+ }
+
@Override
public InputStream createInputStream(List<String> stopFilters) throws IOException
{
@@ -713,6 +738,7 @@ public final class PDImageXObject extends PDXObject implements PDImage
/**
* This will get the suffix for this image type, e.g. jpg/png.
+ *
* @return The image suffix or null if not available.
*/
@Override
@@ -723,30 +749,24 @@ public final class PDImageXObject extends PDXObject implements PDImage
if (filters == null)
{
return "png";
- }
- else if (filters.contains(COSName.DCT_DECODE))
+ } else if (filters.contains(COSName.DCT_DECODE))
{
return "jpg";
- }
- else if (filters.contains(COSName.JPX_DECODE))
+ } else if (filters.contains(COSName.JPX_DECODE))
{
return "jpx";
- }
- else if (filters.contains(COSName.CCITTFAX_DECODE))
+ } else if (filters.contains(COSName.CCITTFAX_DECODE))
{
return "tiff";
- }
- else if (filters.contains(COSName.FLATE_DECODE)
+ } else if (filters.contains(COSName.FLATE_DECODE)
|| filters.contains(COSName.LZW_DECODE)
|| filters.contains(COSName.RUN_LENGTH_DECODE))
{
return "png";
- }
- else if (filters.contains(COSName.JBIG2_DECODE))
+ } else if (filters.contains(COSName.JBIG2_DECODE))
{
return "jb2";
- }
- else
+ } else
{
LOG.warn("getSuffix() returns null, filters: " + filters);
return null;
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDInlineImage.java b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDInlineImage.java
index dbdfba837..32f233dd2 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDInlineImage.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/PDInlineImage.java
@@ -16,17 +16,19 @@
*/
package org.apache.pdfbox.pdmodel.graphics.image;
-import java.awt.Paint;
+import java.awt.*;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
+
import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
+import org.apache.pdfbox.filter.DecodeOptions;
import org.apache.pdfbox.filter.DecodeResult;
import org.apache.pdfbox.filter.Filter;
import org.apache.pdfbox.filter.FilterFactory;
@@ -58,8 +60,8 @@ public final class PDInlineImage implements PDImage
* Creates an inline image from the given parameters and data.
*
* @param parameters the image parameters
- * @param data the image data
- * @param resources the current resources
+ * @param data the image data
+ * @param resources the current resources
* @throws IOException if the stream cannot be decoded
*/
public PDInlineImage(COSDictionary parameters, byte[] data, PDResources resources)
@@ -74,8 +76,7 @@ public final class PDInlineImage implements PDImage
if (filters == null || filters.isEmpty())
{
this.decodedData = data;
- }
- else
+ } else
{
ByteArrayInputStream in = new ByteArrayInputStream(data);
ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
@@ -109,8 +110,7 @@ public final class PDInlineImage implements PDImage
if (isStencil())
{
return 1;
- }
- else
+ } else
{
return parameters.getInt(COSName.BPC, COSName.BITS_PER_COMPONENT, -1);
}
@@ -129,19 +129,17 @@ public final class PDInlineImage implements PDImage
if (cs != null)
{
return createColorSpace(cs);
- }
- else if (isStencil())
+ } else if (isStencil())
{
// stencil mask color space must be gray, it is often missing
return PDDeviceGray.INSTANCE;
- }
- else
+ } else
{
// an image without a color space is always broken
throw new IOException("could not determine inline image color space");
}
}
-
+
// deliver the long name of a device colorspace, or the parameter
private COSBase toLongName(COSBase cs)
{
@@ -159,7 +157,7 @@ public final class PDInlineImage implements PDImage
}
return cs;
}
-
+
private PDColorSpace createColorSpace(COSBase cs) throws IOException
{
if (cs instanceof COSName)
@@ -247,8 +245,7 @@ public final class PDInlineImage implements PDImage
{
COSName name = (COSName) filters;
names = new COSArrayList<>(name.getName(), name, parameters, COSName.FILTER);
- }
- else if (filters instanceof COSArray)
+ } else if (filters instanceof COSArray)
{
names = COSArrayList.convertCOSNameCOSArrayToList((COSArray) filters);
}
@@ -296,6 +293,12 @@ public final class PDInlineImage implements PDImage
return new ByteArrayInputStream(decodedData);
}
+ @Override
+ public InputStream createInputStream(DecodeOptions options) throws IOException
+ {
+ return createInputStream();
+ }
+
@Override
public InputStream createInputStream(List<String> stopFilters) throws IOException
{
@@ -309,8 +312,7 @@ public final class PDInlineImage implements PDImage
if (stopFilters.contains(filters.get(i)))
{
break;
- }
- else
+ } else
{
Filter filter = FilterFactory.INSTANCE.getFilter(filters.get(i));
filter.decode(in, out, parameters, i);
@@ -333,13 +335,19 @@ public final class PDInlineImage implements PDImage
{
return decodedData;
}
-
+
@Override
public BufferedImage getImage() throws IOException
{
return SampledImageReader.getRGBImage(this, getColorKeyMask());
}
+ @Override
+ public BufferedImage getImage(Rectangle region, int subsample) throws IOException
+ {
+ return getImage();
+ }
+
@Override
public BufferedImage getStencilImage(Paint paint) throws IOException
{
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/SampledImageReader.java b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/SampledImageReader.java
index 0f60b8819..3fefd095b 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/SampledImageReader.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/image/SampledImageReader.java
@@ -16,9 +16,7 @@
*/
package org.apache.pdfbox.pdmodel.graphics.image;
-import java.awt.Graphics2D;
-import java.awt.Paint;
-import java.awt.Point;
+import java.awt.*;
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferByte;
@@ -29,31 +27,35 @@ import java.io.InputStream;
import java.util.Arrays;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;
+
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSNumber;
+import org.apache.pdfbox.filter.DecodeOptions;
import org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace;
import org.apache.pdfbox.pdmodel.graphics.color.PDDeviceGray;
import org.apache.pdfbox.pdmodel.graphics.color.PDIndexed;
/**
* Reads a sampled image from a PDF file.
+ *
* @author John Hewson
*/
final class SampledImageReader
{
private static final Log LOG = LogFactory.getLog(SampledImageReader.class);
-
+
private SampledImageReader()
{
}
/**
* Returns an ARGB image filled with the given paint and using the given image as a mask.
+ *
* @param paint the paint to fill the visible portions of the image with
* @return a masked image filled with the given paint
- * @throws IOException if the image cannot be read
+ * @throws IOException if the image cannot be read
* @throws IllegalStateException if the image is not a stencil.
*/
public static BufferedImage getStencilImage(PDImage pdImage, Paint paint) throws IOException
@@ -122,7 +124,7 @@ final class SampledImageReader
LOG.warn("premature EOF, image will be incomplete");
break;
}
- }
+ }
}
return masked;
@@ -132,23 +134,46 @@ final class SampledImageReader
* Returns the content of the given image as an AWT buffered image with an RGB color space.
* If a color key mask is provided then an ARGB image is returned instead.
* This method never returns null.
- * @param pdImage the image to read
+ *
+ * @param pdImage the image to read
* @param colorKey an optional color key mask
* @return content of this image as an RGB buffered image
* @throws IOException if the image cannot be read
*/
public static BufferedImage getRGBImage(PDImage pdImage, COSArray colorKey) throws IOException
+ {
+ return getRGBImage(pdImage, null, 1, colorKey);
+ }
+
+ private static Rectangle clipRegion(PDImage pdImage, Rectangle region)
+ {
+ if (region == null)
+ {
+ return new Rectangle(0, 0, pdImage.getWidth(), pdImage.getHeight());
+ } else
+ {
+ int x = Math.max(0, region.x);
+ int y = Math.max(0, region.y);
+ int width = Math.min(region.width, pdImage.getWidth() - x);
+ int height = Math.min(region.height, pdImage.getHeight() - y);
+ return new Rectangle(x, y, width, height);
+ }
+ }
+
+ public static BufferedImage getRGBImage(PDImage pdImage, Rectangle region, int subsample,
+ COSArray colorKey) throws IOException
{
if (pdImage.isEmpty())
{
throw new IOException("Image stream is empty");
}
+ Rectangle clipped = clipRegion(pdImage, region);
// get parameters, they must be valid or have been repaired
final PDColorSpace colorSpace = pdImage.getColorSpace();
final int numComponents = colorSpace.getNumberOfComponents();
- final int width = pdImage.getWidth();
- final int height = pdImage.getHeight();
+ final int width = (int) Math.round(clipped.getWidth() / subsample);
+ final int height = (int) Math.round(clipped.getHeight() / subsample);
final int bitsPerComponent = pdImage.getBitsPerComponent();
final float[] decode = getDecodeArray(pdImage);
@@ -159,7 +184,7 @@ final class SampledImageReader
if (bitsPerComponent == 1 && colorKey == null && numComponents == 1)
{
- return from1Bit(pdImage);
+ return from1Bit(pdImage, clipped, subsample, width, height);
}
//
@@ -168,47 +193,65 @@ final class SampledImageReader
// in depth to 8bpc as they will be drawn to TYPE_INT_RGB images anyway. All code
// in PDColorSpace#toRGBImage expects an 8-bit range, i.e. 0-255.
//
- WritableRaster raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height,
- numComponents, new Point(0, 0));
final float[] defaultDecode = pdImage.getColorSpace().getDefaultDecode(8);
if (bitsPerComponent == 8 && Arrays.equals(decode, defaultDecode) && colorKey == null)
{
// convert image, faster path for non-decoded, non-colormasked 8-bit images
- return from8bit(pdImage, raster);
+ return from8bit(pdImage, clipped, subsample, width, height);
}
- return fromAny(pdImage, raster, colorKey);
+ return fromAny(pdImage, colorKey, clipped, subsample, width, height);
}
- private static BufferedImage from1Bit(PDImage pdImage) throws IOException
+ private static BufferedImage from1Bit(PDImage pdImage, Rectangle clipped, int subsample,
+ final int width, final int height) throws IOException
{
final PDColorSpace colorSpace = pdImage.getColorSpace();
- final int width = pdImage.getWidth();
- final int height = pdImage.getHeight();
final float[] decode = getDecodeArray(pdImage);
BufferedImage bim = null;
WritableRaster raster;
byte[] output;
- if (colorSpace instanceof PDDeviceGray)
- {
- // TYPE_BYTE_GRAY and not TYPE_BYTE_BINARY because this one is handled
- // without conversion to RGB by Graphics.drawImage
- // this reduces the memory footprint, only one byte per pixel instead of three.
- bim = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
- raster = bim.getRaster();
- }
- else
- {
- raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height, 1, new Point(0, 0));
- }
- output = ((DataBufferByte) raster.getDataBuffer()).getData();
// read bit stream
- try (InputStream iis = pdImage.createInputStream())
+ DecodeOptions options = new DecodeOptions(subsample);
+ options.setSourceRegion(clipped);
+ try (InputStream iis = pdImage.createInputStream(options))
{
+ final int inputWidth, inputHeight, startx, starty, scanWidth, scanHeight;
+ if (options.isHonored())
+ {
+ inputWidth = width;
+ inputHeight = height;
+ startx = 0;
+ starty = 0;
+ scanWidth = width;
+ scanHeight = height;
+ subsample = 1;
+ } else
+ {
+ inputWidth = pdImage.getWidth();
+ inputHeight = pdImage.getHeight();
+ startx = clipped.x;
+ starty = clipped.y;
+ scanWidth = clipped.width;
+ scanHeight = clipped.height;
+ }
+ if (colorSpace instanceof PDDeviceGray)
+ {
+ // TYPE_BYTE_GRAY and not TYPE_BYTE_BINARY because this one is handled
+ // without conversion to RGB by Graphics.drawImage
+ // this reduces the memory footprint, only one byte per pixel instead of three.
+ bim = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
+ raster = bim.getRaster();
+ } else
+ {
+ raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height, 1, new
+ Point(0, 0));
+ }
+ output = ((DataBufferByte) raster.getDataBuffer()).getData();
final boolean isIndexed = colorSpace instanceof PDIndexed;
- int rowLen = width / 8;
- if (width % 8 > 0)
+ int rowLen = inputWidth / 8;
+ if (inputWidth % 8 > 0)
{
rowLen++;
}
@@ -220,18 +263,21 @@ final class SampledImageReader
{
value0 = 0;
value1 = (byte) 255;
- }
- else
+ } else
{
value0 = (byte) 255;
value1 = 0;
}
byte[] buff = new byte[rowLen];
int idx = 0;
- for (int y = 0; y < height; y++)
+ for (int y = 0; y < starty + scanHeight; y++)
{
int x = 0;
int readLen = iis.read(buff);
+ if (y < starty || y % subsample > 0)
+ {
+ continue;
+ }
for (int r = 0; r < rowLen && r < readLen; r++)
{
int value = buff[r];
@@ -240,9 +286,14 @@ final class SampledImageReader
{
int bit = value & mask;
mask >>= 1;
+ if (x < startx || x % subsample > 0)
+ {
+ x++;
+ continue;
+ }
output[idx++] = bit == 0 ? value0 : value1;
x++;
- if (x == width)
+ if (x >= startx + scanWidth)
{
break;
}
@@ -266,31 +317,58 @@ final class SampledImageReader
}
// faster, 8-bit non-decoded, non-colormasked image conversion
- private static BufferedImage from8bit(PDImage pdImage, WritableRaster raster)
- throws IOException
+ private static BufferedImage from8bit(PDImage pdImage, Rectangle clipped, int subsample,
+ final int width, final int height) throws IOException
{
- try (InputStream input = pdImage.createInputStream())
+ DecodeOptions options = new DecodeOptions(subsample);
+ options.setSourceRegion(clipped);
+ try (InputStream input = pdImage.createInputStream(options))
{
+ final int inputWidth, inputHeight, startx, starty, scanWidth, scanHeight;
+ if (options.isHonored())
+ {
+ inputWidth = width;
+ inputHeight = height;
+ startx = 0;
+ starty = 0;
+ scanWidth = width;
+ scanHeight = height;
+ subsample = 1;
+ } else
+ {
+ inputWidth = pdImage.getWidth();
+ inputHeight = pdImage.getHeight();
+ startx = clipped.x;
+ starty = clipped.y;
+ scanWidth = clipped.width;
+ scanHeight = clipped.height;
+ }
+ final int numComponents = pdImage.getColorSpace().getNumberOfComponents();
+ WritableRaster raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height,
+ numComponents, new Point(0, 0));
// get the raster's underlying byte buffer
byte[][] banks = ((DataBufferByte) raster.getDataBuffer()).getBankData();
- final int width = pdImage.getWidth();
- final int height = pdImage.getHeight();
- final int numComponents = pdImage.getColorSpace().getNumberOfComponents();
- byte[] tempBytes = new byte[numComponents * width];
+ byte[] tempBytes = new byte[numComponents * inputWidth];
// compromise between memory and time usage:
// reading the whole image consumes too much memory
// reading one pixel at a time makes it slow in our buffering infrastructure
int i = 0;
- for (int y = 0; y < height; ++y)
+ for (int y = 0; y < starty + scanHeight; ++y)
{
long inputResult = input.read(tempBytes);
if (Long.compare(inputResult, tempBytes.length) != 0)
{
- LOG.debug("Tried reading " + tempBytes.length + " bytes but only " + inputResult + " bytes read");
+ LOG.debug("Tried reading " + tempBytes.length + " bytes but only " +
+ inputResult + " bytes read");
+ }
+ //
+ if (y < starty || y % subsample > 0)
+ {
+ continue;
}
- for (int x = 0; x < width; ++x)
+ for (int x = startx; x < startx + scanWidth; x += subsample)
{
for (int c = 0; c < numComponents; c++)
{
@@ -305,19 +383,42 @@ final class SampledImageReader
}
// slower, general-purpose image conversion from any image format
- private static BufferedImage fromAny(PDImage pdImage, WritableRaster raster, COSArray colorKey)
+ private static BufferedImage fromAny(PDImage pdImage, COSArray colorKey, Rectangle clipped,
+ int subsample, final int width, final int height)
throws IOException
{
final PDColorSpace colorSpace = pdImage.getColorSpace();
final int numComponents = colorSpace.getNumberOfComponents();
- final int width = pdImage.getWidth();
- final int height = pdImage.getHeight();
final int bitsPerComponent = pdImage.getBitsPerComponent();
final float[] decode = getDecodeArray(pdImage);
+ DecodeOptions options = new DecodeOptions(subsample);
+ options.setSourceRegion(clipped);
// read bit stream
- try (ImageInputStream iis = new MemoryCacheImageInputStream(pdImage.createInputStream()))
+ try (ImageInputStream iis = new MemoryCacheImageInputStream(pdImage.createInputStream
+ (options)))
{
+ final int inputWidth, inputHeight, startx, starty, scanWidth, scanHeight;
+ if (options.isHonored())
+ {
+ inputWidth = width;
+ inputHeight = height;
+ startx = 0;
+ starty = 0;
+ scanWidth = width;
+ scanHeight = height;
+ subsample = 1;
+ } else
+ {
+ inputWidth = pdImage.getWidth();
+ inputHeight = pdImage.getHeight();
+ startx = clipped.x;
+ starty = clipped.y;
+ scanWidth = clipped.width;
+ scanHeight = clipped.height;
+ }
+ WritableRaster raster = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height,
+ numComponents, new Point(0, 0));
final float sampleMax = (float) Math.pow(2, bitsPerComponent) - 1f;
final boolean isIndexed = colorSpace instanceof PDIndexed;
@@ -332,28 +433,28 @@ final class SampledImageReader
// calculate row padding
int padding = 0;
- if (width * numComponents * bitsPerComponent % 8 > 0)
+ if (inputWidth * numComponents * bitsPerComponent % 8 > 0)
{
- padding = 8 - (width * numComponents * bitsPerComponent % 8);
+ padding = 8 - (inputWidth * numComponents * bitsPerComponent % 8);
}
// read stream
byte[] srcColorValues = new byte[numComponents];
byte[] alpha = new byte[1];
- for (int y = 0; y < height; y++)
+ for (int y = 0; y < starty + scanHeight; y++)
{
- for (int x = 0; x < width; x++)
+ for (int x = 0; x < startx + scanWidth; x++)
{
boolean isMasked = true;
for (int c = 0; c < numComponents; c++)
{
- int value = (int)iis.readBits(bitsPerComponent);
+ int value = (int) iis.readBits(bitsPerComponent);
// color key mask requires values before they are decoded
if (colorKeyRanges != null)
{
isMasked &= value >= colorKeyRanges[c * 2] &&
- value <= colorKeyRanges[c * 2 + 1];
+ value <= colorKeyRanges[c * 2 + 1];
}
// decode array
@@ -368,23 +469,26 @@ final class SampledImageReader
// indexed color spaces get the raw value, because the TYPE_BYTE
// below cannot be reversed by the color space without it having
// knowledge of the number of bits per component
- srcColorValues[c] = (byte)Math.round(output);
- }
- else
+ srcColorValues[c] = (byte) Math.round(output);
+ } else
{
// interpolate to TYPE_BYTE
int outputByte = Math.round(((output - Math.min(dMin, dMax)) /
Math.abs(dMax - dMin)) * 255f);
- srcColorValues[c] = (byte)outputByte;
+ srcColorValues[c] = (byte) outputByte;
}
}
- raster.setDataElements(x, y, srcColorValues);
+ if (x >= startx && y >= starty && x % subsample == 0 && y % subsample == 0)
+ {
+ raster.setDataElements((x - startx) / subsample, (y - starty) / subsample,
+ srcColorValues);
+ }
// set alpha channel in color key mask, if any
if (colorKeyMask != null)
{
- alpha[0] = (byte)(isMasked ? 255 : 0);
+ alpha[0] = (byte) (isMasked ? 255 : 0);
colorKeyMask.getRaster().setDataElements(x, y, alpha);
}
}
@@ -400,8 +504,7 @@ final class SampledImageReader
if (colorKeyMask != null)
{
return applyColorKeyMask(rgbImage, colorKeyMask);
- }
- else
+ } else
{
return rgbImage;
}
@@ -466,15 +569,14 @@ final class SampledImageReader
LOG.warn("decode array " + cosDecode
+ " not compatible with color space, using the first two entries");
return new float[]
- {
- decode0, decode1
- };
+ {
+ decode0, decode1
+ };
}
}
LOG.error("decode array " + cosDecode
+ " not compatible with color space, using default");
- }
- else
+ } else
{
decode = cosDecode.toFloatArray();
}
diff --git a/pdfbox/src/main/java/org/apache/pdfbox/rendering/PageDrawer.java b/pdfbox/src/main/java/org/apache/pdfbox/rendering/PageDrawer.java
index 052ea1223..42c0b5a45 100644
--- a/pdfbox/src/main/java/org/apache/pdfbox/rendering/PageDrawer.java
+++ b/pdfbox/src/main/java/org/apache/pdfbox/rendering/PageDrawer.java
@@ -955,7 +955,10 @@ public class PageDrawer extends PDFGraphicsStreamEngine
else
{
// draw the image
- drawBufferedImage(pdImage.getImage(), at);
+ int subsample = (int)Math.floor(pdImage.getWidth()/at.getScaleX());
+ if (subsample<1) subsample = 1;
+ if (subsample>8) subsample = 8;
+ drawBufferedImage(pdImage.getImage(null, subsample), at);
}
if (!pdImage.getInterpolate())
diff --git a/pdfbox/src/test/java/org/apache/pdfbox/pdmodel/common/PDStreamTest.java b/pdfbox/src/test/java/org/apache/pdfbox/pdmodel/common/PDStreamTest.java
index de0f63ee6..4062c8be3 100644
--- a/pdfbox/src/test/java/org/apache/pdfbox/pdmodel/common/PDStreamTest.java
+++ b/pdfbox/src/test/java/org/apache/pdfbox/pdmodel/common/PDStreamTest.java
@@ -91,7 +91,7 @@ public class PDStreamTest
PDStream pdStream = new PDStream(doc, is, new COSArray());
Assert.assertEquals(0, pdStream.getFilters().size());
- is = pdStream.createInputStream(null);
+ is = pdStream.createInputStream((List<String>)null);
Assert.assertEquals(12, is.read());
Assert.assertEquals(34, is.read());
Assert.assertEquals(56, is.read());
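
For reference, the size arithmetic that the SampledImageReader and PageDrawer hunks depend on can be sketched in isolation. This is a minimal standalone illustration, not code from the patch. The class name SubsampleSketch and its method names are hypothetical. It clips with java.awt.Rectangle.intersection rather than the patch's clipRegion, and it sizes the output with ceiling division (the number of indices a loop stepping by `subsample` visits) rather than Math.round. clampSubsample mirrors the 1..8 clamp in the PageDrawer hunk and assumes the drawn width is positive.

```java
import java.awt.Rectangle;

// Hypothetical helper class, for illustration only; not part of PDFBox or the patch.
public class SubsampleSketch
{
    // Intersect the requested region with the image bounds.
    // A null region is taken to mean "the whole image", as in the patch.
    static Rectangle clip(Rectangle region, int imageWidth, int imageHeight)
    {
        Rectangle bounds = new Rectangle(0, 0, imageWidth, imageHeight);
        return region == null ? bounds : bounds.intersection(region);
    }

    // Number of indices visited by: for (i = 0; i < length; i += subsample).
    // This is ceiling division, assuming length >= 0 and subsample >= 1.
    static int sampledLength(int length, int subsample)
    {
        return (length + subsample - 1) / subsample;
    }

    // Pick a subsampling factor from the ratio of source width to drawn width,
    // clamped to the range 1..8. Assumes drawnWidth > 0.
    static int clampSubsample(int imageWidth, double drawnWidth)
    {
        int factor = (int) Math.floor(imageWidth / drawnWidth);
        return Math.max(1, Math.min(8, factor));
    }

    public static void main(String[] args)
    {
        Rectangle clipped = clip(new Rectangle(-5, -5, 8, 8), 10, 10);
        int subsample = clampSubsample(4000, 1000.0);
        System.out.println("clipped region: " + clipped);
        System.out.println("subsample factor: " + subsample);
        System.out.println("output width for 9 source pixels at step 4: "
                + sampledLength(9, 4));
    }
}
```

With a 9-pixel-wide region and a step of 4, the loop visits x = 0, 4, 8, so three output pixels are needed; sizing the raster by ceiling division keeps the buffer in step with such a loop.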