[jira] [Comment Edited] (PDFBOX-1618) Split PDF file to single page files, some files are inflated in size

JIRA Sun, 21 Dec 2014 08:47:51 -0800

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739714#comment-13739714
 ]


Andreas Lehmkühler edited comment on PDFBOX-1618 at 12/21/14 4:47 PM:
----------------------------------------------------------------------

I have the same problem. I'm splitting PDF documents into chapters, and each 
chapter file ends up as big as the original file, which is not acceptable.

My code looks like this:
{code}
        public void split(OutlineItems outline) throws IOException, COSVisitorException {
                document = PDDocument.load(getInputPath() + getFileName());
                List<PDPage> pages = document.getDocumentCatalog().getAllPages();
                List<OutlineItem> outlineItems = outline.getItems();

                for (int i = 0; i < outlineItems.size(); i++) {
                        OutlineItem outlineItem = outlineItems.get(i);
                        int chapterStart = outlineItem.getPage();
                        int chapterEnd;
                        if (outlineItems.size() > i + 1) {
                                chapterEnd = outlineItems.get(i + 1).getPage() - 1;
                        } else {
                                chapterEnd = document.getNumberOfPages();
                        }

                        PDDocument chapter = new PDDocument();
                        int currentPage = chapterStart;
                        do {
                                chapter.addPage(pages.get(currentPage));
                                currentPage++;
                        } while (currentPage < chapterEnd);

                        File file = new File(getOutputPath());
                        file.mkdirs();
                        chapter.setDocumentInformation(new PDDocumentInformation());
                        chapter.save(getOutputPath() + File.separator + outlineItem.getId() + ".pdf");
                        chapter.close();

                        Logger.getLogger(Task.class.getName()).log(Level.INFO,
                                "Chapter: {0} - Start:{1} <-> End: {2}",
                                new Object[]{outlineItem.getTitle(), chapterStart, chapterEnd});
                }

        }
{code}

[~michael.kuss]: Could you share your solution?
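
Separately from the file size question, the page range in the loop above may be worth a second look. With chapterEnd set to the next chapter's start minus one and a loop condition of currentPage < chapterEnd, the page just before the next chapter is never added, whereas the last chapter runs up to the final page. The following is a minimal sketch of the range arithmetic only, using plain Java and no PDFBox calls. It assumes that getPage() returns a zero-based index into the page list, which the snippet does not confirm; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChapterRanges {

    /**
     * Returns one {start, endExclusive} pair per chapter.
     * Assumption: starts holds zero-based page indices in ascending order.
     */
    public static List<int[]> ranges(List<Integer> starts, int pageCount) {
        List<int[]> result = new ArrayList<int[]>();
        for (int i = 0; i < starts.size(); i++) {
            int start = starts.get(i);
            // A chapter ends where the next one begins; the last chapter
            // ends after the final page of the document.
            int endExclusive = (i + 1 < starts.size()) ? starts.get(i + 1) : pageCount;
            result.add(new int[] {start, endExclusive});
        }
        return result;
    }

    public static void main(String[] args) {
        // Three chapters starting at indices 0, 3 and 7 in a 10-page document.
        for (int[] r : ranges(Arrays.asList(0, 3, 7), 10)) {
            System.out.println("pages " + r[0] + " to " + (r[1] - 1));
        }
    }
}
```

With half-open ranges like these, a plain loop of the form for (int p = start; p < endExclusive; p++) treats every chapter the same way, including the last one.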


was (Author: geschan):
(Same text and code as above, without the {code} markup.)

> Split PDF file to single page files, some files are inflated in size
> --------------------------------------------------------------------
>
>                 Key: PDFBOX-1618
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1618
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.8.1
>         Environment: Windows 7, JVM 1.6.0_29
>            Reporter: Tom Taylor
>             Fix For: 2.0.0
>
>         Attachments: 112080-TECHNICAL MANUAL FOR GENERATOR NIR 7194 A-10LW OF 
> 4038 KVA.pdf, Test_PDFs.zip, internalstructure.png
>
>
> A PDF file is split into single pages for inclusion within another document 
> (pdfbox.utils.Splitter within our code but same phenomenon observed when 
> splitting using command line PDFSplit tool). Some of the pages are almost as 
> large as the original file which causes performance problems for our 
> customers.
> Again, I have a sample pdf to attach.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
