[ https://issues.apache.org/jira/browse/PDFBOX-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859096#action_12859096 ]
Radim Hatlapatka commented on PDFBOX-698:
-----------------------------------------
This is the essential part of the code I use to extract images from a PDF
(I also catch exceptions and close streams, but I have left that out here).
I think this code is correct, but it does not find the images in the PDF
documents described in this thread.
// loading pdfFile as PDDocument
PDDocument document = null;
try {
    document = PDDocument.load(inputStream);
    AccessPermission accessPermissions =
            document.getCurrentAccessPermission();
    if (!accessPermissions.canExtractContent()) {
        throw new PdfRecompressionException(
                "Error: You do not have permission to extract images.");
    }
    // going page by page
    List pages = document.getDocumentCatalog().getAllPages();
    for (int pageNumber = 0; pageNumber < pages.size(); pageNumber++) {
        PDPage page = (PDPage) pages.get(pageNumber);
        PDResources resources = page.getResources();
        // reading images from each page and saving them to file
        // (name of file is saved in list namesOfImages)
        Map images = resources.getImages();
        System.err.println(images);
        if (images != null) {
            Iterator imageIter = images.keySet().iterator();
            while (imageIter.hasNext()) {
                String key = (String) imageIter.next();
                PDXObjectImage image = (PDXObjectImage) images.get(key);
                String name = getUniqueFileName(prefix + key,
                        image.getSuffix());
                System.out.println("Writing image:" + name);
                image.write2file(name);
            }
        }
    }
} finally {
    // closing the document (exception handling is left out here)
    if (document != null) {
        document.close();
    }
}
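
The snippet above calls getUniqueFileName, which is not shown in this comment.
A minimal sketch of what such a helper might look like follows. It assumes the
helper returns a base name without the extension and that the caller adds the
suffix when writing the file. The class name UniqueNames and the "-N" numbering
scheme are hypothetical and are not taken from the original code.

```java
import java.io.File;

// Hypothetical helper, not the author's actual implementation.
final class UniqueNames {

    private UniqueNames() {
    }

    /**
     * Returns prefix + "-N" for the smallest N >= 1 such that the file
     * prefix + "-N." + suffix does not exist yet. The returned name does
     * not include the suffix (assumption: the caller appends it).
     */
    static String getUniqueFileName(String prefix, String suffix) {
        int counter = 1;
        String uniqueName = prefix + "-" + counter;
        // keep counting up until no file with this name and suffix exists
        while (new File(uniqueName + "." + suffix).exists()) {
            counter++;
            uniqueName = prefix + "-" + counter;
        }
        return uniqueName;
    }
}
```

With this sketch, two images that share the same resource key on different
pages would get different names, as long as the first file has been written
before the second name is requested.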
> Unable parse images from PDF documents concated by tex
> ------------------------------------------------------
>
> Key: PDFBOX-698
> URL: https://issues.apache.org/jira/browse/PDFBOX-698
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 0.8.0-incubator, 1.0.0, 1.1.0
> Environment: Using jdk 1.6 in Ubuntu 8.10 (using IDE netbeans 6.5)
> Reporter: Radim Hatlapatka
> Attachments: item.pdf
>
>
> Unable to extract images from a PDF document created by concatenating other
> PDF documents with tex (if they are concatenated with pdftk it works fine,
> but if they are concatenated with tex no images are found).