[ https://issues.apache.org/jira/browse/PDFBOX-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040698#comment-13040698 ]
Roland Quast commented on PDFBOX-1018:
--------------------------------------
You are correct that Sanselan lacks a fax decode codec; I just checked its
supported formats page. I had to fix the same problem in a TIFF file reader
before, and looking back at that code I see that I used ImageJ to do it. I
don't know whether the code is of any use, but ImageJ is in the public domain.

public static BufferedImage convertRenderedImage(RenderedImage img,
        String[] decoders) throws UnsupportedImageModelException {
    if (img instanceof BufferedImage) {
        return (BufferedImage) img;
    }
    WritableRaster wr = ImagePlusCreator.forceTileUpdate(img);
    ImagePlus im;
    if (decoders[0].equalsIgnoreCase("GIF")
            || decoders[0].equalsIgnoreCase("JPEG")) {
        // Convert the way ImageJ does (ij.io.Opener.openJpegOrGif())
        BufferedImage bi = new BufferedImage(img.getColorModel(), wr,
                false, null);
        im = ImagePlusCreator.create(wr, img.getColorModel());
        im.setImage(bi);
    } else {
        im = ImagePlusCreator.create(wr, img.getColorModel());
        if (img instanceof TIFFImage) {
            TIFFImage ti = (TIFFImage) img;
            try {
                Object o = ti.getProperty("tiff_directory");
                if (o instanceof TIFFDirectory) {
                    TIFFDirectory dir = (TIFFDirectory) o;
                    BufferedImage preimg = im.getBufferedImage();
                    int compression = (int) dir.getFieldAsLong(
                            TIFFImageDecoder.TIFF_COMPRESSION);
                    switch (compression) {
                        case TIFFImage.COMP_FAX_G3_1D:
                        case TIFFImage.COMP_FAX_G3_2D:
                        case TIFFImage.COMP_FAX_G4_2D:
                            // resize image to the same scale
                            if (dir.isTagPresent(TIFFImageDecoder.TIFF_X_RESOLUTION)
                                    && dir.isTagPresent(TIFFImageDecoder.TIFF_Y_RESOLUTION)) {
                                double x_res = dir.getFieldAsDouble(
                                        TIFFImageDecoder.TIFF_X_RESOLUTION);
                                double y_res = dir.getFieldAsDouble(
                                        TIFFImageDecoder.TIFF_Y_RESOLUTION);
                                double x_scale = 1.0d;
                                double y_scale = 1.0d;
                                if (x_res != y_res) {
                                    if (x_res > y_res) {
                                        y_scale = x_res / y_res;
                                    } else if (y_res > x_res) {
                                        x_scale = y_res / x_res;
                                    }
                                    BufferedImageOp op = new AffineTransformOp(
                                            AffineTransform.getScaleInstance(x_scale, y_scale),
                                            new RenderingHints(
                                                    RenderingHints.KEY_INTERPOLATION,
                                                    RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR));
                                    preimg = op.filter(preimg, null);
                                }
                            }
                            // invert an image that has the wrong photometric interpretation
                            if (dir.isTagPresent(
                                    TIFFImageDecoder.TIFF_PHOTOMETRIC_INTERPRETATION)) {
                                long photo = dir.getFieldAsLong(
                                        TIFFImageDecoder.TIFF_PHOTOMETRIC_INTERPRETATION);
                                if (photo == 1) {
                                    preimg = binarizeImageAndInvert(preimg, 165);
                                }
                            }
                            return preimg;
                        default:
                            return im.getBufferedImage();
                    }
                }
            } catch (Exception ex) {
                printStackTrace(ex);
            }
        }
    }
    return im.getBufferedImage();
}
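
The resize step in the method above exists because fax-compressed TIFFs often
store non-square pixels (204 x 98 dpi is a common example), so the axis with
the lower resolution has to be stretched until both match. A minimal,
stand-alone sketch of just that step follows. The class and method names
(FaxAspect, scaleFactors, squarePixels) are made up for illustration and are
not part of ImageJ, JAI or PDFBox; only java.awt classes are used.

```java
import java.awt.geom.AffineTransform;
import java.awt.image.AffineTransformOp;
import java.awt.image.BufferedImage;

// Hypothetical helper class, for illustration only.
class FaxAspect {

    /**
     * Returns {xScale, yScale}. The axis with the lower resolution is
     * stretched by the ratio of the two resolutions; the other stays at 1.0.
     */
    static double[] scaleFactors(double xRes, double yRes) {
        double xScale = 1.0;
        double yScale = 1.0;
        // Assumption: non-positive resolutions are treated as "unknown"
        // and left unscaled.
        if (xRes > 0 && yRes > 0) {
            if (xRes > yRes) {
                yScale = xRes / yRes;
            } else if (yRes > xRes) {
                xScale = yRes / xRes;
            }
        }
        return new double[] { xScale, yScale };
    }

    /** Rescales src so that its pixels become square. */
    static BufferedImage squarePixels(BufferedImage src, double xRes, double yRes) {
        double[] s = scaleFactors(xRes, yRes);
        if (s[0] == 1.0 && s[1] == 1.0) {
            return src; // already square, nothing to do
        }
        // Nearest neighbour keeps a bilevel image bilevel (no grey edges).
        AffineTransformOp op = new AffineTransformOp(
                AffineTransform.getScaleInstance(s[0], s[1]),
                AffineTransformOp.TYPE_NEAREST_NEIGHBOR);
        return op.filter(src, null);
    }
}
```

With a 200 x 100 dpi page, for instance, scaleFactors gives an x scale of 1.0
and a y scale of 2.0, so the image keeps its width and doubles in height.
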
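
// The signature line was lost when pasting; it is reconstructed here from
// the call binarizeImageAndInvert(preimg, 165) in the method above.
static BufferedImage binarizeImageAndInvert(BufferedImage image,
        int luminanceCutOff) {
    BufferedImage newImg = new BufferedImage(image.getWidth(), image.getHeight(),
            BufferedImage.TYPE_BYTE_BINARY);
    WritableRaster raster = newImg.getRaster();
    int imageSize = image.getWidth() * image.getHeight();
    int imageWidth = image.getWidth();
    for (int i = 0; i < imageSize; i++) {
        int y = i / imageWidth;
        int x = i - (y * imageWidth);
        // Dark pixels get sample 1, which is white in the default
        // TYPE_BYTE_BINARY palette, so the result is inverted.
        if (isBlack(image, x, y, luminanceCutOff)) {
            raster.setSample(x, y, 0, 1);
        } else {
            raster.setSample(x, y, 0, 0);
        }
    }
    newImg.flush();
    return newImg;
}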
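
The isBlack helper used above is not included in this comment. As a rough
sketch of what such a test could look like, the version below compares a
luminance value against the cut-off. The Rec. 601 weights, the strict "<"
comparison and the class name BinarizeSketch are my assumptions here, not the
original code; the binarize method is repeated in compact form only so that
the example compiles on its own.

```java
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;

// Hypothetical stand-alone sketch, for illustration only.
class BinarizeSketch {

    /** Assumed definition: a pixel is "black" if its luminance is below the cut-off. */
    static boolean isBlack(BufferedImage image, int x, int y, int luminanceCutOff) {
        int rgb = image.getRGB(x, y);
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        // Rec. 601 luma weights (an assumption; any grey conversion would do).
        double luminance = 0.299 * r + 0.587 * g + 0.114 * b;
        return luminance < luminanceCutOff;
    }

    /** Thresholds the image to 1 bit per pixel and inverts it in one pass. */
    static BufferedImage binarizeImageAndInvert(BufferedImage image, int luminanceCutOff) {
        BufferedImage out = new BufferedImage(image.getWidth(), image.getHeight(),
                BufferedImage.TYPE_BYTE_BINARY);
        WritableRaster raster = out.getRaster();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                // Sample 1 is white in the default two-colour palette,
                // so dark input pixels come out white and vice versa.
                raster.setSample(x, y, 0, isBlack(image, x, y, luminanceCutOff) ? 1 : 0);
            }
        }
        return out;
    }
}
```

With the cut-off of 165 used above, a pure black input pixel comes out white
and a pure white input pixel comes out black.
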
> PDPage convertToImage bug creates white images from black and white pdf files.
> ------------------------------------------------------------------------------
>
> Key: PDFBOX-1018
> URL: https://issues.apache.org/jira/browse/PDFBOX-1018
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 1.2.0, 1.2.1, 1.3.1, 1.4.0, 1.5.0
> Environment: JDK 1.6.0_22
> Reporter: Roland Quast
> Assignee: Andreas Lehmkühler
> Priority: Critical
> Labels: pdfbox
> Attachments: BlackAndWhiteBug.java, ColorWorks.java,
> PDFBOX1018-black_and_white1.png, black_and_white.pdf, color.pdf
>
>
> This bug has been reported in various other tickets submitted before. I am
> attempting to conclusively prove that this is an issue, and it needs to be
> attended to since all past tickets regarding this bug have been marked
> invalid.
> I have attached a video showing very basic code that will reproduce the
> issue. I have also attached the code that causes the issue, as well as a PDF
> file that works (a color one), and a black and white PDF file that doesn't.
> The main issue is that when reading a black and white PDF file (see attached
> black and white pdf file), the following message is displayed, and the
> contents of the output image are completely white.
> 26/05/2011 3:20:14 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
> WARNING: getRGBImage returned NULL
> We use PDFBox in our program for reading PDF files, and at least 50 percent
> of our customers' PDF files (from different scanners) cannot be read because
> of this issue. This is a complete show stopper, and I'd be more than happy to
> help in any way I can to resolve it.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira