[
https://issues.apache.org/jira/browse/PDFBOX-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739714#comment-13739714
]
Andreas Lehmkühler edited comment on PDFBOX-1618 at 12/21/14 4:47 PM:
----------------------------------------------------------------------
I have the same problem. I'm splitting PDF documents into chapters, and each
chapter file ends up as big as the original file, which is not acceptable.
My code looks like this:
{code}
public void split(OutlineItems outline) throws IOException, COSVisitorException {
    document = PDDocument.load(getInputPath() + getFileName());
    List<PDPage> pages = document.getDocumentCatalog().getAllPages();
    List<OutlineItem> outlineItems = outline.getItems();
    for (int i = 0; i < outlineItems.size(); i++) {
        OutlineItem outlineItem = outlineItems.get(i);
        int chapterStart = outlineItem.getPage();
        int chapterEnd;
        if (outlineItems.size() > i + 1) {
            chapterEnd = outlineItems.get(i + 1).getPage() - 1;
        } else {
            chapterEnd = document.getNumberOfPages();
        }
        PDDocument chapter = new PDDocument();
        int currentPage = chapterStart;
        do {
            chapter.addPage(pages.get(currentPage));
            currentPage++;
        } while (currentPage < chapterEnd);
        File file = new File(getOutputPath());
        file.mkdirs();
        chapter.setDocumentInformation(new PDDocumentInformation());
        chapter.save(getOutputPath() + File.separator + outlineItem.getId() + ".pdf");
        chapter.close();
        Logger.getLogger(Task.class.getName()).log(Level.INFO,
                "Chapter: {0} - Start:{1} <-> End: {2}",
                new Object[]{outlineItem.getTitle(), chapterStart, chapterEnd});
    }
}
{code}
[~michael.kuss]: Could you share your solution?
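A side note on the page arithmetic in the loop above, separate from the file-size problem: chapterEnd is set to the next chapter's start minus one, and the loop then stops at currentPage < chapterEnd, so depending on whether getPage() is 0- or 1-based a chapter may lose a page at one end. Below is a minimal sketch of one way to compute the ranges, assuming 0-based start pages and half-open [start, end) ranges. ChapterRanges and Range are hypothetical names for illustration, not PDFBox classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of PDFBox): turns chapter start pages into page ranges. */
final class ChapterRanges {

    /** Half-open range of 0-based page indices: start is included, end is excluded. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        int pageCount() {
            return end - start;
        }
    }

    /**
     * @param chapterStarts 0-based index of each chapter's first page,
     *                      assumed to be strictly ascending
     * @param totalPages    number of pages in the source document
     */
    static List<Range> compute(int[] chapterStarts, int totalPages) {
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < chapterStarts.length; i++) {
            int start = chapterStarts[i];
            // A chapter ends where the next one begins; the last one runs to the end.
            int end = (i + 1 < chapterStarts.length) ? chapterStarts[i + 1] : totalPages;
            if (start < 0 || start >= end || end > totalPages) {
                throw new IllegalArgumentException("Invalid chapter start at position " + i);
            }
            ranges.add(new Range(start, end));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Three chapters in a ten-page document.
        for (Range r : compute(new int[] {0, 3, 7}, 10)) {
            System.out.println("[" + r.start + ", " + r.end + ") -> " + r.pageCount() + " pages");
        }
    }
}
```

With ranges in this form, the copy loop could be a plain for loop from start (inclusive) to end (exclusive), which also avoids the do/while adding a page when a range is empty.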
was (Author: geschan):
The same comment as above, without the {code} markup around the code.
> Split PDF file to single page files, some files are inflated in size
> --------------------------------------------------------------------
>
> Key: PDFBOX-1618
> URL: https://issues.apache.org/jira/browse/PDFBOX-1618
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 1.8.1
> Environment: Windows 7, JVM 1.6.0_29
> Reporter: Tom Taylor
> Fix For: 2.0.0
>
> Attachments: 112080-TECHNICAL MANUAL FOR GENERATOR NIR 7194 A-10LW OF
> 4038 KVA.pdf, Test_PDFs.zip, internalstructure.png
>
>
> A PDF file is split into single pages for inclusion within another document
> (pdfbox.utils.Splitter within our code, but the same phenomenon is observed when
> splitting with the command-line PDFSplit tool). Some of the pages are almost as
> large as the original file, which causes performance problems for our
> customers.
> Again, I have a sample PDF to attach.
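To see which split files are affected, the sizes of the output files can be compared with the size of the original. The sketch below uses only java.nio.file and makes no PDFBox calls. SplitSizeCheck, findInflated and the threshold parameter are hypothetical names chosen for illustration, and the split files are assumed to sit in a directory of their own, separate from the original.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic (not part of PDFBox): lists split files that are suspiciously large. */
final class SplitSizeCheck {

    /**
     * @param original  the PDF that was split
     * @param splitDir  directory holding only the split output files (assumption)
     * @param threshold fraction of the original size at or above which a part is reported,
     *                  e.g. 0.8 for "at least 80% of the original"
     */
    static List<Path> findInflated(Path original, Path splitDir, double threshold)
            throws IOException {
        long originalSize = Files.size(original);
        List<Path> inflated = new ArrayList<>();
        // Only look at *.pdf entries directly inside splitDir.
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(splitDir, "*.pdf")) {
            for (Path part : parts) {
                if (Files.size(part) >= threshold * originalSize) {
                    inflated.add(part);
                }
            }
        }
        return inflated;
    }
}
```

The threshold is arbitrary; for a single-page split of a long document, even a much lower fraction than 0.8 would probably be worth a look.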
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)