[
https://issues.apache.org/jira/browse/PDFBOX-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026677#comment-14026677
]
Tilman Hausherr commented on PDFBOX-2128:
-----------------------------------------
You misunderstood the discussion; I took that out. However, you can use it for
the 1.8 version too: instead of image.write2file(), do this for JPEGs (I didn't
test it, so there may be syntax errors; I hope it works):
{code}
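import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.io.IOUtils;
import org.apache.pdfbox.pdmodel.graphics.xobject.PDJpeg;

// Class members: place the field, the static block and the method inside your own class.
// Decoding stops at these filters, so the DCT (JPEG) data is written unchanged.
private static final List<String> DCT_FILTERS = new ArrayList<String>();
static
{
    DCT_FILTERS.add( COSName.DCT_DECODE.getName() );
    DCT_FILTERS.add( COSName.DCT_DECODE_ABBREVIATION.getName() );
}

public void writeJpeg2file(PDJpeg image, String filename) throws IOException
{
    FileOutputStream out = null;
    InputStream data = null;
    try
    {
        // getSuffix() replaces the undefined "suffix" variable; it is "jpg" for PDJpeg
        out = new FileOutputStream(filename + "." + image.getSuffix());
        data = image.getPDStream().getPartiallyFilteredStream( DCT_FILTERS );
        byte[] buf = new byte[1024];
        int amountRead;
        while( (amountRead = data.read( buf )) != -1 )
        {
            out.write( buf, 0, amountRead );
        }
        out.flush();
    }
    finally
    {
        // close both streams even if the copy fails
        IOUtils.closeQuietly(data);
        if( out != null )
        {
            out.close();
        }
    }
}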
{code}
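To check that the raw stream really came out as a JPEG file, something like the following might help. It is only a sketch and not part of PDFBox: the class name JpegCheck and the method looksLikeJpeg are made up for illustration. It relies only on the fact that a JPEG file starts with the start-of-image marker, the bytes 0xFF 0xD8.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of PDFBox.
class JpegCheck
{
    // True if the bytes begin with the JPEG start-of-image marker (0xFF 0xD8).
    static boolean looksLikeJpeg(byte[] header)
    {
        return header != null
            && header.length >= 2
            && (header[0] & 0xFF) == 0xFF
            && (header[1] & 0xFF) == 0xD8;
    }

    // Reads the first two bytes of a file and checks them for the marker.
    static boolean looksLikeJpeg(File file) throws IOException
    {
        InputStream in = new FileInputStream(file);
        try
        {
            int first = in.read();
            int second = in.read();
            if (first == -1 || second == -1)
            {
                // file is shorter than two bytes
                return false;
            }
            return looksLikeJpeg(new byte[] { (byte) first, (byte) second });
        }
        finally
        {
            in.close();
        }
    }
}
```

After calling writeJpeg2file(image, filename), JpegCheck.looksLikeJpeg(new File(filename + ".jpg")) should return true if the DCT data was copied through unchanged. This only checks the file signature; it says nothing about whether the colors are right.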
> CMYK images are not supported correctly in the PDJpeg class
> -----------------------------------------------------------
>
> Key: PDFBOX-2128
> URL: https://issues.apache.org/jira/browse/PDFBOX-2128
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 1.8.5, 1.8.6, 2.0.0
> Environment: Windows 7 Professional
> Running jvm: Java HotSpot(TM) 64-Bit Server VM - 1.6.0_26-b03 - 20.1-b02 -
> Sun Microsystems Inc
> Reporter: Ludovic Davoine
> Labels: PDJpeg, cmyk, images
> Attachments: porsche_cmyk.pdf-2.png
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> I have a PDF with CMYK images inside and I need to extract the images in
> RGB format. But the PDJpeg class does not seem to work correctly; the colors
> are wrong. Example:
> - Original image in the PDF : http://ludoda.free.fr/IMAGE_IN_PDF.jpg
> - Extracted image: http://ludoda.free.fr/IMAGE_EXTRACTED.jpg
> You can download the PDF : http://ludoda.free.fr/PORSCHE_CMYK.PDF
> and try my simple Test Case (I'm using PDFbox 1.8.5):
> {code}
> import java.awt.image.BufferedImage;
> import java.io.File;
> import java.io.IOException;
> import java.util.Iterator;
> import java.util.List;
> import java.util.Map;
> import javax.imageio.ImageIO;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDPage;
> import org.apache.pdfbox.pdmodel.PDResources;
> import org.apache.pdfbox.pdmodel.graphics.xobject.PDJpeg;
> import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObject;
> import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;
> public class TestCase {
>
>     public static void main(String[] args)
>     {
>         try
>         {
>             System.out.println("START EXTRACTING IMAGES...");
>             read_pdf();
>             System.out.println("COMPLETE");
>         }
>         catch (IOException ex)
>         {
>             System.out.println("" + ex);
>         }
>     }
>
>     public static void read_pdf() throws IOException
>     {
>         PDDocument document = null;
>         document = PDDocument.load("C:\\temp\\PORSCHE_CMYK.pdf");
>         @SuppressWarnings("unchecked")
>         List<PDPage> pages = document.getDocumentCatalog().getAllPages();
>         Iterator<PDPage> iter = pages.iterator();
>         int i = 1;
>         while (iter.hasNext())
>         {
>             PDPage page = (PDPage) iter.next();
>             PDResources resources = page.getResources();
>             Map<String, PDXObject> pageImages = resources.getXObjects();
>             if (pageImages != null)
>             {
>                 Iterator<String> imageIter = pageImages.keySet().iterator();
>                 while (imageIter.hasNext())
>                 {
>                     String key = (String) imageIter.next();
>                     if (pageImages.get(key) instanceof PDXObjectImage)
>                     {
>                         PDJpeg image = (PDJpeg) pageImages.get(key);
>
>                         // Test 1 : write2file
>                         image.write2file("C:\\workspace\\JAVA_PDFTools\\temp\\image" + i);
>
>                         // Test 2: getRGBImage
>                         BufferedImage bimage = image.getRGBImage();
>                         File outputfile = new File("C:\\workspace\\JAVA_PDFTools\\temp\\image" + i + "_buffered.jpg");
>                         ImageIO.write(bimage, "jpg", outputfile);
>                         i++;
>                     }
>                 }
>             }
>         }
>     }
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.2#6252)