[ https://issues.apache.org/jira/browse/PDFBOX-6209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr closed PDFBOX-6209.
-----------------------------------
Resolution: Duplicate
> Regression in v3.0.7 causes Splitter to extract pages with text converted to
> symbols
> ------------------------------------------------------------------------------------
>
> Key: PDFBOX-6209
> URL: https://issues.apache.org/jira/browse/PDFBOX-6209
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 3.0.7 PDFBox
> Reporter: Edward Ashley
> Priority: Minor
> Fix For: 3.0.8 PDFBox
>
> Attachments: Screenshot 2026-06-10 at 14.03.59.png
>
>
> When splitting certain PDFs, the Splitter corrupts some of the extracted pages:
> their text is converted to symbols. This works in version 3.0.6 but not in 3.0.7.
> Example Code:
> {code:java}
> @Test
> public void testSplitPage() {
>     try {
>         var inputFile = new ClassPathResource("/letter-redacted.pdf").getContentAsByteArray();
>         try (PDDocument doc = Loader.loadPDF(inputFile)) {
>             var pages = new Splitter().split(doc);
>             int count = 0;
>             for (var page : pages) {
>                 page.save(
>                         FileSystemView.getFileSystemView().getHomeDirectory()
>                                 + File.separator
>                                 + "Downloads/output-" + count++ + ".pdf");
>             }
>         }
>     } catch (Exception ex) {
>         log.error("Error splitting PDF: {}", ex.getMessage(), ex);
>     }
> }
> {code}
> I have attached an example PDF that exhibits the problem, and a screenshot of the
> corrupt output.