[ 
https://issues.apache.org/jira/browse/PDFBOX-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996236#comment-13996236
 ] 

Andreas Lehmkühler commented on PDFBOX-2074:
--------------------------------------------

We should be more lenient and skip the problematic CMap entry (I don't like 
the idea, but as long as Adobe accepts such PDFs, we have to do so as well).
IMHO we should skip the whole entry instead of reading only its first or last 
two bytes, as both variants would most likely yield an invalid mapping.
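
A minimal sketch of the "skip the whole entry" idea, not the actual PDFBox 
parser code. It assumes the entries have already been read as raw byte arrays, 
that the character code is the field with the unexpected length, and that the 
expected code length is 2 bytes. The names LenientCMapSketch, RawEntry and 
buildMapping are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; this is not PDFBox's CMap parser. */
final class LenientCMapSketch {

    /** Assumed code length for this CMap; the other entries in the report use 2 bytes. */
    static final int EXPECTED_CODE_LENGTH = 2;

    /** One raw entry: the character code bytes and the UTF-16BE bytes it maps to. */
    static final class RawEntry {
        final byte[] code;
        final byte[] unicode;

        RawEntry(byte[] code, byte[] unicode) {
            this.code = code;
            this.unicode = unicode;
        }
    }

    /**
     * Builds a code-to-Unicode mapping. An entry whose code does not have the
     * expected length is dropped as a whole and reported in the warnings list
     * instead of causing an exception or being truncated.
     */
    static Map<Integer, String> buildMapping(List<RawEntry> entries, List<String> warnings) {
        Map<Integer, String> mapping = new LinkedHashMap<>();
        for (RawEntry entry : entries) {
            if (entry.code.length != EXPECTED_CODE_LENGTH) {
                warnings.add("Skipping CMap entry with " + entry.code.length
                        + "-byte code, expected " + EXPECTED_CODE_LENGTH + " bytes");
                continue; // skip the whole entry, do not guess which two bytes are meant
            }
            // Combine the two code bytes into one big-endian integer.
            int code = ((entry.code[0] & 0xFF) << 8) | (entry.code[1] & 0xFF);
            mapping.put(code, new String(entry.unicode, StandardCharsets.UTF_16BE));
        }
        return mapping;
    }
}
```

A real fix would log the warning through the project's logger rather than 
collect it in a list; the list is used here only to keep the sketch 
self-contained.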

> 4-bytes CMap entry causes exception
> -----------------------------------
>
>                 Key: PDFBOX-2074
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2074
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Juraj Lonc
>         Attachments: PDFBOX-2074_CMap.diff, pdf_with_4B_cmap_entry.pdf
>
>
> I have a PDF that has a CMap entry consisting of 4 bytes. It is the only 
> entry of that size; the other entries have 2 bytes.
> Adobe Reader has no problems with that, but PDFBox throws an exception.
> I think this exception should not be thrown. The entry should be skipped or 
> truncated to 2 bytes, and a warning written to the log.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
