[jira] [Comment Edited] (PDFBOX-3904) Command Line ExtractImages with pageNumbers

JIRA Wed, 23 Aug 2017 02:16:38 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138119#comment-16138119
 ]


Hasan Karaoğlu edited comment on PDFBOX-3904 at 8/23/17 9:15 AM:
-----------------------------------------------------------------

OK. I worked around this programmatically. The code below helped me.


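{code:java}
import java.io.File;
import javax.imageio.ImageIO;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageTree;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

// Fragment: "document" is an already loaded PDDocument, and the
// enclosing method must handle or declare IOException.
PDPageTree list = document.getPages();
int pageNumber = 1;
int imageId = 1; // counts images across the whole document, not per page
for (PDPage page : list) {
    PDResources pdResources = page.getResources();
    for (COSName c : pdResources.getXObjectNames()) {
        PDXObject o = pdResources.getXObject(c);
        if (o instanceof PDImageXObject) {
            // Resulting name: LOCAL_PATH<pageNumber>_<imageId>.png
            File file = new File("LOCAL_PATH" + pageNumber + "_" + imageId + ".png");
            ImageIO.write(((PDImageXObject) o).getImage(), "png", file);
            imageId++;
        }
    }
    pageNumber++;
}
{code}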

Reference link: https://stackoverflow.com/a/37176699/3371708
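
The issue asks for names of the form fileName_pageNumber_imageNumber, while the snippet above hard-codes a "LOCAL_PATH" prefix. As a minimal sketch of how that name could be built from the input PDF's file name, here is a small helper. ImageFileNames, baseName and forImage are hypothetical names chosen for illustration; they are not part of PDFBox, and only the JDK is used.

```java
import java.io.File;

// Hypothetical helper (not part of PDFBox): builds names such as
// report_3_2.png from the input PDF name, page number and image number.
final class ImageFileNames {
    private ImageFileNames() {}

    // Returns the file name without directory and without its last extension.
    static String baseName(String pdfFileName) {
        String name = new File(pdfFileName).getName();
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    // Page and image numbers are 1-based, matching the counters in the loop above.
    static String forImage(String pdfFileName, int pageNumber, int imageNumber,
                           String extension) {
        if (pageNumber < 1 || imageNumber < 1) {
            throw new IllegalArgumentException("page and image numbers start at 1");
        }
        return baseName(pdfFileName) + "_" + pageNumber + "_" + imageNumber
                + "." + extension;
    }

    public static void main(String[] args) {
        System.out.println(forImage("report.pdf", 3, 2, "png"));
    }
}
```

In the loop above, the File could then be created as new File(outputDir, ImageFileNames.forImage(inputName, pageNumber, imageId, "png")), where outputDir and inputName are assumed variables supplied by the caller.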




> Command Line ExtractImages with pageNumbers
> -------------------------------------------
>
>                 Key: PDFBOX-3904
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3904
>             Project: PDFBox
>          Issue Type: Wish
>            Reporter: Hasan Karaoğlu
>
> When we use this command: 
> {noformat}
> java -jar pdfbox-app-2.y.z.jar ExtractImages [OPTIONS] <inputfile>
> {noformat}
> We get the images, but how do we know the page number of each image? They are 
> extracted sequentially and named in numeric order, without regard to their page numbers. 
> Perhaps the naming convention should be like this:  
> fileName_pageNumber_imageNumber.jpg


