[
https://issues.apache.org/jira/browse/PDFBOX-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737842#comment-16737842
]
Tilman Hausherr commented on PDFBOX-4424:
-----------------------------------------
Adobe may be able to show the file because it parses on demand and we don't.
Edge is not able to show the file. Chrome can show it.
I don't know if it has to be URL encoded - the specification only says "A
uniform resource identifier (URI) is a string that identifies (resolves to) a
resource on the Internet — typically a file that is the destination of a
hypertext link, although it may also resolve to a query or other entity. (URIs
are described in Internet RFC 3986, Uniform Resource Identifiers (URI): Generic
Syntax.)"
The file was generated with Apache Fop 2.2. Their current version is 2.3. If
that one doesn't do it properly, create an issue with them.
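
To illustrate the two options, here is a minimal sketch. It assumes the URI is
written into the PDF as a literal string, i.e. between "(" and ")". In that
syntax an unbalanced parenthesis has to be preceded by a backslash, otherwise
the parser sees the string end early. The class and method names are
hypothetical; this is not code from PDFBox or FOP.

```java
// Hypothetical helper, for illustration only (not PDFBox or FOP code).
final class PdfUriEscaping {

    // Option 1: escape for a PDF literal string. Escaping every parenthesis
    // and backslash is always valid, even where the parentheses are balanced.
    static String escapeLiteral(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(' || c == ')' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Option 2: percent-encode the parentheses in the URI itself, so that
    // none are left in the string. Other characters are left untouched.
    static String percentEncodeParens(String uri) {
        return uri.replace("(", "%28").replace(")", "%29");
    }

    public static void main(String[] args) {
        String uri = "http://example.com/page_(draft"; // made-up URI
        System.out.println("(" + escapeLiteral(uri) + ")");
        // prints (http://example.com/page_\(draft)
        System.out.println("(" + percentEncodeParens(uri) + ")");
        // prints (http://example.com/page_%28draft)
    }
}
```

Either form keeps the closing ")" of the string unambiguous. Whether a viewer
treats the percent-encoded URI as the same link is a separate question.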
> IOException when merging PDF documents containing URLs with unmatched brackets.
> -------------------------------------------------------------------------------
>
> Key: PDFBOX-4424
> URL: https://issues.apache.org/jira/browse/PDFBOX-4424
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.9, 2.0.13
> Reporter: Paul Slauenwhite
> Priority: Major
> Attachments: PdfBoxDefect.java, Test_PDF_Not_Working.pdf,
> Test_PDF_Working.pdf
>
>
> Steps to reproduce:
> 1. Download the attached files to a directory.
> 2. Refactor the constants in PdfBoxDefect.java to reference the downloaded
> files.
> 3. Run PdfBoxDefect.java.
> 4. Note the merging error:
> Merging /Users/paulslauenwhite/Downloads/Test_PDF_Working.pdf to /Users/paulslauenwhite/Downloads/Test_PDF_Working_MERGED.pdf.
> Merged /Users/paulslauenwhite/Downloads/Test_PDF_Working.pdf to /Users/paulslauenwhite/Downloads/Test_PDF_Working_MERGED.pdf.
> Merging /Users/paulslauenwhite/Downloads/Test_PDF_Not_Working.pdf to /Users/paulslauenwhite/Downloads/Test_PDF_Not_Working_MERGED.pdf.
> 15:59:32,120 [main] WARN org.apache.pdfbox.pdfparser.BaseParser - Invalid dictionary, found: 'Æ' but expected: '/' at offset 13199
> 15:59:32,120 [main] WARN org.apache.pdfbox.pdfparser.BaseParser - Bad dictionary declaration at offset 13224
> 15:59:32,120 [main] WARN org.apache.pdfbox.pdfparser.BaseParser - Invalid dictionary, found: 'R' but expected: '/' at offset 13224
> 15:59:32,120 [main] WARN org.apache.pdfbox.pdfparser.BaseParser - Bad dictionary declaration at offset 13314
> 15:59:32,120 [main] WARN org.apache.pdfbox.pdfparser.BaseParser - Invalid dictionary, found: '_' but expected: '/' at offset 13314
> 15:59:32,120 [main] WARN org.apache.pdfbox.pdfparser.BaseParser - Corrupt object reference at offset 13574
> 15:59:32,120 [main] WARN org.apache.pdfbox.pdfparser.BaseParser - Corrupt object reference at offset 13588
> java.io.IOException: Unknown dir object c=')' cInt=41 peek=')' peekInt=41 at offset 13588
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:961)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSArray(BaseParser.java:631)
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:874)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:152)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:279)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:212)
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:862)
> at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:852)
> at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:821)
> at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:741)
> at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:701)
> at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:205)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:240)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1144)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1060)
> at org.apache.pdfbox.multipdf.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:261)
> at org.apache.pdfbox.multipdf.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:231)
> at PdfBoxDefect.merge(PdfBoxDefect.java:73)
> at PdfBoxDefect.main(PdfBoxDefect.java:48)
>