[ https://issues.apache.org/jira/browse/PDFBOX-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Lehmkühler resolved PDFBOX-5752.
----------------------------------------
    Resolution: Fixed

The fix detects object key numbers that are already in use and replaces them with new ones.

> Font errors after copying a page to another document
> ----------------------------------------------------
>
>                 Key: PDFBOX-5752
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5752
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Writing
>    Affects Versions: 3.0.1 PDFBox
>            Reporter: Christian Haegele
>            Assignee: Andreas Lehmkühler
>            Priority: Critical
>             Fix For: 3.0.2 PDFBox, 4.0.0
>
>         Attachments: empty.pdf, image-2024-01-16-07-41-16-462.png, image-2024-01-16-07-46-04-195.png, image-2024-01-16-07-47-05-883.png, roboto-14.pdf, target-merged882552058302116763.pdf
>
>
> I am trying to import a page into a PDF document and copy its font resources.
> With PDFBox 2.0 the code worked as expected: the result document included the required embedded fonts.
> Essentially, the code performs the steps below. The first document is a PDF/A with one empty page; the second, also a PDF/A, contains the Roboto font. All fonts are embedded.
>
> {code:java}
> PDDocument targetDoc = Loader.loadPDF(targetDocBytes);
> PDPage sourcePage = Loader.loadPDF(data).getPage(0);
> final var copiedPage = targetDoc.importPage(sourcePage);
> copiedPage.setResources(sourcePage.getResources());
> {code}
> In PDFBox 3.0 it doesn't seem to work any more: the document is corrupted when opened in Adobe Acrobat.
> The PDFBox PreflightParser also reports a lot of errors for it.
> Here are the error messages from the preflight parser:
> {{1.4 Trailer Syntax error, /XRef cross reference streams are not allowed}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.1.3 Invalid Font definition, BCDFEE+Roboto-Regular: FontFile entry is missing from FontDescriptor}}
> {{3.1.14 Invalid Font definition, Unknown font type: XML}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.1.8 Invalid Font definition}}
> {{3.1.2 Invalid Font definition, BCDGEE+TimesNewRomanPS-BoldMT: some mandatory fields are missing from the FontDescriptor: Type, ItalicAngle, FontBBox, Ascent, FontName, StemV, Flags, CapHeight, Descent.}}
> {{3.1.3 Invalid Font definition, null: FontFile entry is missing from FontDescriptor}}
> {{3.3.2 Glyph error, invalid font dictionary ==> }}
> And here is the complete test case. I used PDFBox 3.0.1 and the newest snapshot version from 15.01.2024.
>
> {code:java}
> @Test
> void importPageWithFonts_validateFontInfo() throws IOException {
>     // given
>     final var targetDocBytes = IOUtils.toByteArray(PdfUtilitiesTest.class.getClassLoader().getResourceAsStream("empty.pdf"));
>     String[] additionalFiles = new String[]{
>         "roboto-14.pdf",
>     };
>     PDDocument targetDoc = Loader.loadPDF(targetDocBytes);
>     // when
>     for (String fileName : Arrays.asList(additionalFiles)) {
>         byte[] data = IOUtils.toByteArray(PdfUtilitiesTest.class.getClassLoader().getResourceAsStream(fileName));
>         // verify source is valid
>         PDPage sourcePage = Loader.loadPDF(data).getPage(0);
>         final var copiedPage = targetDoc.importPage(sourcePage);
>         copiedPage.setResources(sourcePage.getResources());
>         targetDoc.save(Files.createTempFile("merged-fonts", ".pdf").toFile());
>     }
>     Path tmpFile = Files.createTempFile("fscd-merged", ".pdf");
>     targetDoc.save(tmpFile.toFile(), CompressParameters.DEFAULT_COMPRESSION);
>     // then
>     // font errors, e.g. Invalid Font definition, BCDFEE+Roboto-Regular: FontFile entry is missing from FontDescriptor
>     assertFontsAreValid(tmpFile);
> }
>
> private static void assertFontsAreValid(Path tmpFile) throws IOException {
>     PreflightParser parser = new PreflightParser(tmpFile.toFile());
>     final var documentToVerify = (PreflightDocument) parser.parse();
>     // Get validation result
>     final var result = documentToVerify.validate();
>     final var resultString = result.getErrorsList().stream()
>         .filter(err -> !err.getErrorCode()
>             .matches("7\\.11\\.2|3\\.1\\.11|2\\.1\\.2|2\\.2\\.1|2\\.4\\.3")) // filter findings from the source documents
>         .map(err -> err.getErrorCode() + " " + err.getDetails())
>         .collect(Collectors.joining("\n"));
>     assertTrue(resultString.isBlank(), resultString);
> }
> {code}
>
> The problem is still present with the snapshot version 3.0.2-2024-0115.083906-63.
>
> Here is the preflight parser output for the snapshot version:
> {{1.4 Trailer Syntax error, /XRef cross reference streams are not allowed}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
> {{3.3.1 Glyph error, The character code 0 in the font program "BCDEEE+Calibri" is missing from the Character Encoding}}
>
> The input displays correctly:
> !image-2024-01-16-07-47-05-883.png!
> The output file doesn't display the font correctly:
> !image-2024-01-16-07-46-04-195.png!
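
The resolution note says the fix detects object key numbers that are already in use and replaces them with new ones. The sketch below only illustrates that general idea in plain Java; it is not the PDFBox change itself. The class ObjectKeyRemapSketch and its remap method are hypothetical names, and keys are reduced to bare object numbers (generation numbers are ignored) to keep the example small.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only: hypothetical helper, not part of PDFBox. */
final class ObjectKeyRemapSketch {

    /**
     * Maps each imported object number to the number it should be written under.
     * A number already used by the target document is replaced with a fresh one;
     * a number that is still free is kept unchanged.
     */
    static Map<Long, Long> remap(Set<Long> usedByTarget, List<Long> imported) {
        Set<Long> taken = new HashSet<>(usedByTarget);

        // Start fresh numbers above everything seen in either document,
        // so a replacement can never collide with a later imported key.
        long highest = 0;
        for (long n : usedByTarget) {
            highest = Math.max(highest, n);
        }
        for (long n : imported) {
            highest = Math.max(highest, n);
        }

        Map<Long, Long> result = new LinkedHashMap<>();
        for (long n : imported) {
            if (result.containsKey(n)) {
                continue; // same imported object referenced twice: keep its first mapping
            }
            if (taken.add(n)) {
                result.put(n, n);          // number was free, keep it
            } else {
                long fresh = ++highest;    // collision: assign a new number
                taken.add(fresh);
                result.put(n, fresh);
            }
        }
        return result;
    }
}
```

For example, with a target that already uses object numbers 1, 2 and 3, importing objects 2, 5 and 3 keeps 5 and moves the two colliding objects to 6 and 7. Without such a step, two different objects could end up under the same key when the document is saved, which would be consistent with the font dictionaries in the report resolving to unrelated objects.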