Tilman,

The ExtractImages sample code is a 1.8 artifact (I believe). It produces a lot of errors when compiled against the 2.0.5 libraries.
1) Two imports are no longer in the 2.0.5 library:

   import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectForm;
   import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;

2) Missing methods, or methods with different signatures:

   PDDocument.loadNonSeq(
       ** method not defined
   PDDocument.load(
       ** load now requires a File, not a String
   document.openProtection (
   document.getDocumentCatalog().getAllPages()
       ** getAllPages is missing from PDDocumentCatalog
   resources.getXObjects()
       ** where resources is a PDResources object
   if (xobject instanceof PDXObjectImage)
       ** PDXObjectImage is not defined
   else if (xobject instanceof PDXObjectForm)
       ** same with PDXObjectForm

Maybe a new ExtractImages2 program needs to be developed for the PDFBox 2 era.

Dave Patterson

On Thu, Apr 6, 2017 at 5:02 PM, Tilman Hausherr wrote:

> On 06.04.2017 at 21:22, David Patterson wrote:
>
>> I've got some PDFs to try to read. Many of them have images in them. I'd
>> like to be able to iterate over the images and determine their encoding
>> (png vs. jpeg vs. ?) and size.
>>
>> I've found a sample that lets me iterate over the PDXObject entities, but
>> I'm missing a key piece to determine the size and format of the objects.
>>
>> a) Is a PDXObject always an image, or could it be something else?
>>
>
> Yes, it could be a form. That's why all examples (e.g. ExtractImages.java)
> always check the type and then cast to the image xobject type. That type
> will give you the size and the filters.
>
> Tilman
>
>
>> Here is the code I've got so far.
>>
>> for ( PDPage aPage : pdfDocument.getPages() ) {
>>     PDResources pdResources = aPage.getResources();
>>     for ( COSName cosObject : pdResources.getXObjectNames() ) {
>>         PDXObject xObj = pdResources.getXObject( cosObject);
>>         System.out.println( "got an image maybe" );
>>
>> This is where I've gotten stumped. I've looked at lots of lists of
>> COS-whatever things, but it has not led me to "the answer."
>>
>> Thanks for any guidance you can provide.
>>
>> Dave Patterson
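For what it's worth, here is a rough sketch of how the quoted loop might be finished against the 2.0.x libraries. It is not the official ExtractImages code. It assumes the 2.0 replacements for the missing 1.8 classes are PDImageXObject (package graphics.image) and PDFormXObject (package graphics.form), and that getSuffix() is good enough as a format hint. The class name ListImages and the method describeImages are made up for this example, and the recursion into forms does not guard against circular references.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

// Hypothetical helper class, not part of PDFBox.
public class ListImages {

    // Collects one "name: WxH suffix" line per image found in the resources,
    // descending into form XObjects, which can carry their own images.
    public static List<String> describeImages(PDResources resources) throws IOException {
        List<String> found = new ArrayList<>();
        if (resources == null) {
            return found; // a page or form may have no resources at all
        }
        for (COSName name : resources.getXObjectNames()) {
            PDXObject xobj = resources.getXObject(name);
            if (xobj instanceof PDImageXObject) {
                PDImageXObject image = (PDImageXObject) xobj;
                found.add(name.getName() + ": "
                        + image.getWidth() + "x" + image.getHeight()
                        + " " + image.getSuffix());
            } else if (xobj instanceof PDFormXObject) {
                // Assumption: no circular form references in the document.
                found.addAll(describeImages(((PDFormXObject) xobj).getResources()));
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // In 2.0.x, load takes a File (among other overloads), not a String.
        try (PDDocument document = PDDocument.load(new File(args[0]))) {
            for (PDPage page : document.getPages()) {
                for (String line : describeImages(page.getResources())) {
                    System.out.println(line);
                }
            }
        }
    }
}
```

Run it with the PDF path as the only argument. The suffix is only a hint about the stream's compression, so treat it as a starting point rather than a definitive file type.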

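On the "png vs. jpeg vs. ?" part of the question: as far as I understand it, a PDF image stream is never a PNG file as such. What Tilman calls "the filters" are the stream's filter names from the PDF specification, and they are what hints at the encoding. Below is a small library-free sketch of how those standard names might be turned into a description. FilterNames and describe are hypothetical names for this example, not PDFBox API, and the filter list is assumed to have been read from the image stream already.

```java
import java.util.List;

// Hypothetical helper, independent of PDFBox.
public class FilterNames {

    // Describes an image stream from its list of PDF filter names.
    public static String describe(List<String> filters) {
        if (filters == null || filters.isEmpty()) {
            return "uncompressed raw samples";
        }
        // Image-specific filters decide the format, wherever they sit in the chain.
        for (String f : filters) {
            switch (f) {
                case "DCTDecode":
                    return "JPEG";
                case "JPXDecode":
                    return "JPEG 2000";
                case "CCITTFaxDecode":
                    return "CCITT fax (bilevel)";
                case "JBIG2Decode":
                    return "JBIG2 (bilevel)";
                default:
                    break; // keep looking at the remaining filters
            }
        }
        // General-purpose lossless filters: the payload is raw pixel data.
        for (String f : filters) {
            if (f.equals("FlateDecode") || f.equals("LZWDecode") || f.equals("RunLengthDecode")) {
                return "raw samples, losslessly compressed (" + f + ")";
            }
        }
        return "unknown";
    }
}
```

A Flate-compressed image is the case that tools usually export as PNG, which is probably where the "png" label comes from.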
