[ https://issues.apache.org/jira/browse/PDFBOX-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Hewson resolved PDFBOX-904.
--------------------------------
    Resolution: Fixed

> Potential issue with COSString and UTF-16-encoded Strings.
> ----------------------------------------------------------
>
>                 Key: PDFBOX-904
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-904
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 1.4.0, 2.0.0
>            Reporter: Neil McErlean
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: PDFBOX-904.patch
>
>
> I've been looking into PDFBOX-903 and I came across a potential issue with
> the COSString class.
> The issue occurs when you construct an instance of COSString and pass a
> UTF-16-encoded String.
> The current code (trunk) checks the passed String parameter in the
> constructor to see if it is UTF-16. It does this by looking for char values
> above 255.
> Whilst a String that contains char values greater than 255 is likely to be
> UTF-16, it is possible to have UTF-16-encoded Strings whose characters do not
> exceed this limit.
> These Strings would be incorrectly marked as being not unicode16. An example
> (from the upcoming patch)
>     /**してく */
>     String textHighBits = "\u3057\u3066\u304f";
> Furthermore, if you construct a COSString using the COSString(byte[])
> constructor, then the COSString class cannot know what the encoding is.
> I will attach a patch in a moment which includes a test case to reproduce the
> issue and a fix for the product code.
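To make the reported behaviour concrete, here is a minimal sketch in plain Java. It does not use the PDFBox API and is not the attached patch. The class name Utf16Heuristic and all of its methods are hypothetical names chosen for illustration. The "above 255" check follows the description in the report. The byte-order-mark check relies on the PDF convention that a UTF-16BE text string starts with the bytes FE FF. ISO-8859-1 is used here only as a stand-in for PDFDocEncoding, which is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of PDFBox.
final class Utf16Heuristic {

    // The heuristic described in the report: a String is treated as
    // UTF-16 only if at least one char value is above 255.
    static boolean hasCharAbove255(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 255) {
                return true;
            }
        }
        return false;
    }

    // Encode as UTF-16BE and prepend the byte order mark FE FF, so the
    // encoding can be recognised from the bytes alone.
    static byte[] encodeWithBom(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_16BE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFE;
        out[1] = (byte) 0xFF;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    static boolean startsWithUtf16BeBom(byte[] bytes) {
        return bytes.length >= 2
                && (bytes[0] & 0xFF) == 0xFE
                && (bytes[1] & 0xFF) == 0xFF;
    }

    // Decode using the marker instead of guessing from char values.
    // ISO-8859-1 is only a stand-in for PDFDocEncoding (assumption).
    static String decode(byte[] bytes) {
        if (startsWithUtf16BeBom(bytes)) {
            return new String(bytes, 2, bytes.length - 2, StandardCharsets.UTF_16BE);
        }
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String lowOnly = "abc";

        // The heuristic sees no char above 255 in this text.
        System.out.println(hasCharAbove255(lowOnly));

        // UTF-16BE bytes without a marker are indistinguishable from
        // single-byte text, so they do not decode back to the original.
        byte[] unmarked = lowOnly.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(decode(unmarked).equals(lowOnly));

        // With the marker, the round trip is unambiguous.
        System.out.println(decode(encodeWithBom(lowOnly)).equals(lowOnly));
    }
}
```

The point of the sketch is that the value range of the chars says nothing reliable about how a byte sequence was encoded, whereas an explicit marker (or an explicit encoding flag carried alongside the bytes) removes the guesswork for the byte[] case as well.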