[ 
https://issues.apache.org/jira/browse/PDFBOX-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449579#comment-16449579
 ] 

Luca Pino commented on PDFBOX-4206:
-----------------------------------

The code was quite straightforward (i.e. {{new PreflightParser(file).parse()}}) 
and I did not have library duplicates. 

Still, my bad: I discovered that the files I was trying to open 
(resources on the classpath) were corrupted by Maven during the resource copy, as 
the project had a custom {{project.build.sourceEncoding}}.

Even though I tried the simplest example I could think of, Maven is still code 
that gets executed, and I should have ruled that out as well.
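
For anyone who hits the same error, here is a minimal sketch of a sanity check that can be run on a classpath resource before handing it to the parser. It is only an illustration: the class name {{PdfResourceCheck}} and its methods are hypothetical helpers and not part of PDFBox. The check for UTF-8 replacement characters is a heuristic that assumes the build decoded the binary file as text and wrote it back as UTF-8; other kinds of corruption will not show up this way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox: quick checks on the raw bytes of a
// resource to see whether the build may have damaged it before parsing.
class PdfResourceCheck {

    /** Reads a stream fully into a byte array (manual loop, works on old JDKs). */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** True if the data begins with the "%PDF-" marker expected at the start of a PDF. */
    static boolean hasPdfHeader(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Counts occurrences of the byte sequence EF BF BD, the UTF-8 encoding of
     * U+FFFD. Many of these in a binary file hint (heuristically) that it was
     * decoded as text and re-encoded somewhere along the way.
     */
    static int countReplacementChars(byte[] data) {
        int count = 0;
        for (int i = 0; i + 2 < data.length; i++) {
            if ((data[i] & 0xFF) == 0xEF
                    && (data[i + 1] & 0xFF) == 0xBF
                    && (data[i + 2] & 0xFF) == 0xBD) {
                count++;
                i += 2; // skip the rest of the matched sequence
            }
        }
        return count;
    }
}
```

A possible use is to load the resource with {{getResourceAsStream}}, pass the result of {{readAll}} to both checks, and also compare its length with the original file under the source tree. If the sizes differ, the copy step changed the file.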

Thanks a lot for the quick support, and sorry about this.

> Number XXX is getting too long, stop reading at offset YY
> ---------------------------------------------------------
>
>                 Key: PDFBOX-4206
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4206
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, Preflight
>    Affects Versions: 2.0.9
>         Environment: Windows 10
>            Reporter: Luca Pino
>            Priority: Major
>         Attachments: norma.pdf, playright.pdf, spedidam.pdf
>
>
> Parsing PDFs fails:
> {code:java}
> java.io.IOException: Number 'p*�    ��Q�B' is getting too long, stop 
> reading at offset 219498
> at org.apache.pdfbox.pdfparser.BaseParser.readStringNumber(BaseParser.java:1413)
> at org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1375)
> at org.apache.pdfbox.pdfparser.BaseParser.readObjectNumber(BaseParser.java:1312)
> at org.apache.pdfbox.pdfparser.COSParser.parseXrefObjStream(COSParser.java:330)
> at org.apache.pdfbox.pdfparser.COSParser.parseXref(COSParser.java:291)
> at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:197)
> at org.apache.pdfbox.preflight.parser.PreflightParser.initialParse(PreflightParser.java:310)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:276)
> at org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:252)
> ... 30 more
> {code}


