Some characters from TeX-created files are mapped into ASCII range 1-31
-----------------------------------------------------------------------
Key: PDFBOX-756
URL: https://issues.apache.org/jira/browse/PDFBOX-756
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 1.2.0
Environment: Mac OS X 10.6.4
Reporter: Thomas Fischer
Priority: Minor
For some TeX-created files, some characters are mapped to low ASCII values.
Example:
fx 2y − fx − 2y
instead of
f(x + 2y) − f(x − 2y) =
With the non-printable characters denoted by \xN, PDFBox's result is
f\x3x\x4 2y\x5 − f\x3x − 2y\x5 \x6
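To make such characters visible when inspecting extracted text, a small helper along these lines may be useful. This is only a sketch: ControlCharEscaper is a hypothetical class, not part of PDFBox, and it assumes that N in \xN is the decimal character code (for the values 3 to 7 seen here, decimal and hexadecimal coincide).

```java
// Hypothetical helper, not part of PDFBox: makes low control characters visible.
final class ControlCharEscaper {
    private ControlCharEscaper() {}

    /**
     * Replaces every character in the range 1-31, except tab, line feed and
     * carriage return, with \xN, where N is the decimal code (an assumption).
     */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean lowControl = c >= 1 && c <= 31
                    && c != '\t' && c != '\n' && c != '\r';
            if (lowControl) {
                out.append("\\x").append((int) c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for text returned by the extractor.
        String extracted = "f\u0003x\u0004 2y\u0005";
        System.out.println(escape(extracted));
    }
}
```

Ordinary whitespace is left untouched so that only the suspicious characters stand out.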
This probably cannot be fixed with a single fixed mapping, since in another file
the same numbers represent different characters:
Za {a, a 1, . . .}
instead of
Z(a) = {a, a + 1,...}
With the same \xN notation, PDFBox's result for that file is
Z\x4a\x5 \x6 {a, a \x7 1, . . .}
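The clash between the two files can be made concrete with a short sketch. The two tables below are not taken from the PDFs themselves; they are inferred by aligning each garbled output above with its expected text, and MappingConflict is a hypothetical class name.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: compares the code-to-character tables inferred
// from the two examples in this report.
final class MappingConflict {
    private MappingConflict() {}

    /** Inferred from the first example: \x3 "(", \x4 "+", \x5 ")", \x6 "=". */
    static Map<Integer, Character> firstFile() {
        Map<Integer, Character> m = new TreeMap<>();
        m.put(3, '(');
        m.put(4, '+');
        m.put(5, ')');
        m.put(6, '=');
        return m;
    }

    /** Inferred from the second example: \x4 "(", \x5 ")", \x6 "=", \x7 "+". */
    static Map<Integer, Character> secondFile() {
        Map<Integer, Character> m = new TreeMap<>();
        m.put(4, '(');
        m.put(5, ')');
        m.put(6, '=');
        m.put(7, '+');
        return m;
    }

    /** Returns the codes that both tables contain but map to different characters. */
    static Map<Integer, String> conflicts(Map<Integer, Character> a,
                                          Map<Integer, Character> b) {
        Map<Integer, String> result = new TreeMap<>();
        for (Map.Entry<Integer, Character> e : a.entrySet()) {
            Character other = b.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                result.put(e.getKey(), e.getValue() + " vs " + other);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(conflicts(firstFile(), secondFile()));
    }
}
```

Under these inferred tables, code 4 stands for "+" in the first file and "(" in the second, so no single substitution table can repair both outputs.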