[ https://issues.apache.org/jira/browse/PDFBOX-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Lehmkühler resolved PDFBOX-1508.
----------------------------------------
Resolution: Not A Problem
Assignee: Andreas Lehmkühler
You have to copy a few more values after importing the page. Have a look at
org.apache.pdfbox.util.Splitter#processNextPage; that approach works fine with
the provided PDFs.
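
A rough sketch of why copying extra values after an import can matter. It is
a library-free model, not PDFBox code: the class InheritedPageAttributes and
its Node, importPage and importPageWithInherited members are all hypothetical.
It assumes the missing value is resolved from a parent node of the page tree
rather than stored on the page itself, which the issue does not confirm. The
PDF specification lists Resources, MediaBox, CropBox and Rotate as attributes
a page may inherit this way.

```java
import java.util.HashMap;
import java.util.Map;

/** Library-free model of a PDF page tree; all names here are hypothetical. */
public class InheritedPageAttributes {

    /** A page-tree node: its own dictionary entries plus a link to its parent. */
    static final class Node {
        final Map<String, Object> dict = new HashMap<>();
        final Node parent;

        Node(Node parent) {
            this.parent = parent;
        }

        /** Looks a key up on this node, then on each ancestor in turn. */
        Object find(String key) {
            for (Node n = this; n != null; n = n.parent) {
                if (n.dict.containsKey(key)) {
                    return n.dict.get(key);
                }
            }
            return null;
        }
    }

    /** Keys that a page may inherit from an ancestor node. */
    static final String[] INHERITABLE = {"MediaBox", "CropBox", "Resources", "Rotate"};

    /** Naive import: copies only the page's own entries into the new tree. */
    static Node importPage(Node page, Node newRoot) {
        Node copy = new Node(newRoot);
        copy.dict.putAll(page.dict);
        return copy;
    }

    /** Import, then copy the resolved (possibly inherited) values explicitly. */
    static Node importPageWithInherited(Node page, Node newRoot) {
        Node copy = importPage(page, newRoot);
        for (String key : INHERITABLE) {
            Object value = page.find(key);
            if (value != null) {
                copy.dict.put(key, value);
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        Node oldRoot = new Node(null);
        oldRoot.dict.put("MediaBox", "[0 0 595 842]"); // defined once on the parent
        Node page = new Node(oldRoot);
        page.dict.put("Contents", "stream 1");

        Node newRoot = new Node(null);
        // The naive copy no longer sees the parent's MediaBox.
        System.out.println(importPage(page, newRoot).find("MediaBox"));
        // The second variant carries the resolved value over to the copy.
        System.out.println(importPageWithInherited(page, newRoot).find("MediaBox"));
    }
}
```

In this model the first import loses the box because the copy hangs under a
new root, while the second keeps it because the value was resolved on the
source page before the copy was detached from the original tree.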
> Extracting page causes incorrect clipping
> -----------------------------------------
>
> Key: PDFBOX-1508
> URL: https://issues.apache.org/jira/browse/PDFBOX-1508
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDFReader
> Affects Versions: 1.7.1
> Environment: Windows 7, Windows XP, Windows Server 2008
> Reporter: Adina Toma
> Assignee: Andreas Lehmkühler
> Attachments: files.zip
>
>
> I have a compressed PDF from which I extract pages (each page becomes an
> individual PDF file). The extracted pages are clipped incorrectly (text is
> cut off), whereas the original PDF is not clipped. I traced it down to a
> missing MediaBox attribute in the extracted pages; in the original file it
> exists as an attribute on all pages. Using the same file uncompressed, the
> extracted pages are not cut off and the MediaBox attribute is present.
> The main code (without initializations and checks) used to load and extract
> pages is the following:
>
>     temp = new File("e:/temp.tmp");
>     rand = new RandomAccessFile(temp, "rw");
>     doc = PDDocument.loadNonSeq(file, rand);
>     PDPage page = (PDPage) doc.getPrintable(pageIndex);
>     PDDocument newDoc = new PDDocument();
>     newDoc.importPage(page);
>     newDoc.close();
>     doc.close();
>     rand.close();
>     temp.delete();