Adina Toma created PDFBOX-1508:
----------------------------------
Summary: Extracting page causes incorrect clipping
Key: PDFBOX-1508
URL: https://issues.apache.org/jira/browse/PDFBOX-1508
Project: PDFBox
Issue Type: Bug
Components: Parsing, PDFReader
Affects Versions: 1.7.1
Environment: Windows 7, Windows XP, Windows Server 2008
Reporter: Adina Toma
I have a compressed PDF from which I extract pages (each page becomes an
individual PDF file). The extracted pages are clipped incorrectly (text is
cut off), whereas the original PDF is not clipped. I traced it down to a
missing MediaBox attribute in the extracted pages; in the original file the
attribute is present on all pages. Using the same file uncompressed, the
extracted pages are not cut and the MediaBox attribute is present.
The main code (without initializations and checks) used to load and extract
pages is the following:
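// Scratch file for the non-sequential parser
temp = new File("e:/temp.tmp");
rand = new RandomAccessFile(temp, "rw");
doc = PDDocument.loadNonSeq(file, rand);
// Fetch the page to extract and import it into a new, empty document
PDPage page = (PDPage) doc.getPrintable(pageIndex);
PDDocument newDoc = new PDDocument();
newDoc.importPage(page);
// Release both documents and the scratch file
newDoc.close();
doc.close();
rand.close();
temp.delete();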
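
One way to confirm the missing MediaBox described above is to look for the
/MediaBox key in the extracted files directly. The following is only a rough
sketch using plain JDK classes (Java 7 or later), not PDFBox: the class
MediaBoxCheck and its method countMediaBoxKeys are hypothetical names
introduced here for illustration. It assumes the extracted files store their
page dictionaries uncompressed; keys inside compressed object streams (as in
the compressed source file) are not visible to a plain byte scan.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; not part of PDFBox.
final class MediaBoxCheck {

    private MediaBoxCheck() {
    }

    // Counts how often the literal name "/MediaBox" occurs in the raw bytes.
    // Assumption: the dictionaries of interest are not stored in compressed
    // streams, otherwise the key cannot be found by a plain scan.
    static int countMediaBoxKeys(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so nothing is lost.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        String key = "/MediaBox";
        int count = 0;
        int from = 0;
        while (true) {
            int idx = text.indexOf(key, from);
            if (idx < 0) {
                break;
            }
            count++;
            from = idx + key.length();
        }
        return count;
    }

    // Prints the number of /MediaBox keys found in each file given as argument.
    public static void main(String[] args) throws IOException {
        for (String path : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(path));
            System.out.println(path + ": " + countMediaBoxKeys(bytes)
                    + " /MediaBox key(s) found in plain text");
        }
    }
}
```

Running it on a page extracted from the compressed file and on the same page
extracted from the uncompressed file should show whether the key is absent
only in the first case.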