Philip Helger created PDFBOX-2009:
-------------------------------------
Summary: PDFStreamEngine.processEncodedText incorrectly handles UTF-16 text with BOM FEFF
Key: PDFBOX-2009
URL: https://issues.apache.org/jira/browse/PDFBOX-2009
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 2.0.0
Reporter: Philip Helger
Fix For: 2.0.0
When a content stream contains a text-showing operation such as
<FEFF21222193219103B103A003A6> Tj
then PDFStreamEngine.processEncodedText does not handle it correctly.
Am I correct that once a BOM has been detected, the code length should be set to 2
(and not changed afterwards)? Or, alternatively, should the BOM simply be skipped?
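
To make the question concrete, here is a minimal standalone sketch of the behaviour I would expect. It is not PDFBox code: the class BomDecodeSketch and its methods are hypothetical names used only for illustration. It assumes the operand really is UTF-16BE when it starts with FE FF (in a real content stream that also depends on the font's encoding), skips the BOM, and then treats every code as exactly 2 bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of PDFBox.
final class BomDecodeSketch {

    // Convert a hex string operand such as "FEFF2122" into raw bytes.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // True if the data starts with the UTF-16 big-endian byte order mark FE FF.
    static boolean hasUtf16BeBom(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0xFE
                && (data[1] & 0xFF) == 0xFF;
    }

    // Assumption: with a BOM, skip it and decode the rest with a fixed
    // code length of 2 bytes. Without a BOM, fall back to a single-byte
    // decoding, which here is only a placeholder for the font-based path.
    static String decode(byte[] data) {
        if (!hasUtf16BeBom(data)) {
            return new String(data, StandardCharsets.ISO_8859_1);
        }
        return new String(data, 2, data.length - 2, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String text = decode(hexToBytes("FEFF21222193219103B103A003A6"));
        // Print each decoded UTF-16 code unit as a code point label.
        for (int i = 0; i < text.length(); i++) {
            System.out.printf("U+%04X%n", (int) text.charAt(i));
        }
    }
}
```

For the operand above, this yields six characters (U+2122, U+2193, U+2191, U+03B1, U+03A0, U+03A6) and the BOM itself does not appear in the output. Both alternatives from my question lead to the same result here, as long as the code length stays at 2 after the BOM.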