[ 
https://issues.apache.org/jira/browse/PDFBOX-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ladislav Dudáš updated PDFBOX-4032:
-----------------------------------
    Attachment: Contains_tab_bad.pdf
                Contains_tab_ok.pdf

The problem is not in reading characters but in writing them. If a COSString 
contains the byte 0x09, PDFBox writes it as a single raw byte, which can cause 
some readers to read the string incorrectly.

I prepared an example. The attached files are identical except for one 
character in a bookmark. In [^Contains_tab_ok.pdf] the TAB is written as "\t", 
which is correct, and all readers handle it. In [^Contains_tab_bad.pdf] the 
same position holds a single raw byte 0x09, which some readers do not handle 
correctly (for example, Nitro reports the file as corrupted and repairs it). 
Many PDF syntax checkers will probably flag this as a problem as well.
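
To illustrate the difference between the two files, here is a minimal sketch of 
escaping a literal string with the sequences listed in 7.3.4.2. The class and 
method names (LiteralStringEscaper, escapeLiteral) are hypothetical and chosen 
only for this example; this is not the code in COSWriter or in the PR, and it 
assumes the string is already available as raw bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the PDFBox API.
final class LiteralStringEscaper {

    // Wraps the raw bytes in parentheses and replaces the characters that
    // have a named escape sequence in a PDF literal string.
    static byte[] escapeLiteral(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(raw.length + 2);
        out.write('(');
        for (byte b : raw) {
            switch (b) {
                case '\n': out.write('\\'); out.write('n'); break; // LF  0x0a
                case '\r': out.write('\\'); out.write('r'); break; // CR  0x0d
                case '\t': out.write('\\'); out.write('t'); break; // TAB 0x09
                case '\b': out.write('\\'); out.write('b'); break; // BS  0x08
                case '\f': out.write('\\'); out.write('f'); break; // FF  0x0c
                case '(':
                case ')':
                case '\\':
                    // Always escaping parentheses avoids tracking balance.
                    out.write('\\');
                    out.write(b);
                    break;
                default:
                    // Every other byte is copied through unchanged.
                    out.write(b);
            }
        }
        out.write(')');
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "A\tB (x)".getBytes(StandardCharsets.ISO_8859_1);
        byte[] escaped = escapeLiteral(raw);
        // Prints: (A\tB \(x\))
        System.out.println(new String(escaped, StandardCharsets.ISO_8859_1));
    }
}
```

With this approach the TAB in the bookmark would be emitted as the two bytes 
"\t", as in [^Contains_tab_ok.pdf], instead of the single byte 0x09.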

> Handle correctly special characters while writing COSString
> -----------------------------------------------------------
>
>                 Key: PDFBOX-4032
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4032
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Writing
>    Affects Versions: 2.0.8
>            Reporter: Ladislav Dudáš
>             Fix For: 2.0.9
>
>         Attachments: Contains_tab_bad.pdf, Contains_tab_ok.pdf
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Regarding PDFBOX-3107: COSWriter.java was changed so that if a string 
> contains the characters CR (0x0d) or LF (0x0a), the string is written in hex 
> format. This may be OK, but the PDF specification (7.3.4.2 Literal Strings) 
> explicitly defines more characters that should be handled specially.
> I'm providing another version of the code that handles all special characters 
> without converting the string to hex format.
> PR [#41|https://github.com/apache/pdfbox/pull/41]



