The extracted text of a PDF document created using TeX contains remnants of TeX
commands like \parenleftbig etc.
----------------------------------------------------------------------------------------------------------------
Key: PDFBOX-727
URL: https://issues.apache.org/jira/browse/PDFBOX-727
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 1.1.0
Environment: Mac OS X 10.6.3, using org.apache.pdfbox.ExtractText
-encoding UTF-8
Reporter: Thomas Fischer
In otherwise (more or less) correctly recognised text there appear remnants of
TeX commands, e.g.
parenleftbigparenleftbigparenleftbig
Aσ̇,τ − σ
parenrightbigparenrightbigparenrightbig
+
parenleftbigparenleftbigparenleftbig
ξ̇,η − ξ
parenrightbigparenrightbigparenrightbig
These remnants are not visible in a PDF viewer, nor are they present when text
is copied from one (tested: Acrobat Reader 9.3.2 and Preview 5.0.2).
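
Until the extraction itself handles these glyphs, the remnants could be cleaned
up after the fact. The following is only a workaround sketch, not part of
PDFBox: the class GlyphNameFilter, its method clean, and the mapping of
parenleftbig / parenrightbig to "(" and ")" are assumptions made for
illustration, and only the two names seen in the sample above are covered. It
also assumes that a run of repeated names stands for a single character.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical post-processing helper; not part of PDFBox.
final class GlyphNameFilter {

    // Assumed mapping, limited to the two glyph names seen in the report.
    private static final Map<String, String> REPLACEMENTS = new LinkedHashMap<>();
    static {
        REPLACEMENTS.put("parenleftbig", "(");
        REPLACEMENTS.put("parenrightbig", ")");
    }

    // Replaces each run of a known glyph name with one replacement character.
    static String clean(String extracted) {
        String result = extracted;
        for (Map.Entry<String, String> entry : REPLACEMENTS.entrySet()) {
            // "(?:name)+" matches one or more consecutive copies of the name.
            Pattern run = Pattern.compile("(?:" + Pattern.quote(entry.getKey()) + ")+");
            result = run.matcher(result)
                        .replaceAll(Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }
}
```

Calling GlyphNameFilter.clean(text) on the extracted text would turn each run
of parenleftbig or parenrightbig into a single parenthesis. Note that it would
also alter any genuine occurrence of those words in the document text.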