Konstantin created TIKA-1552:
--------------------------------
Summary: Pdf document parser
Key: TIKA-1552
URL: https://issues.apache.org/jira/browse/TIKA-1552
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.7
Reporter: Konstantin
Hello,
We found that when a PDF document has marked text inside a frame (table),
Tika inserts tabs between words in the parsed output.
Original text:
Provides $17.7 billion in discretionary funding for the National Aeronautics
and Space
Parsed text:
• Provides $17.7 billion in discretionary funding for
the National Aeronautics and Space
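A minimal sketch of how the symptom could be checked for, and worked around, on text that has already been extracted. It uses only the Java standard library and does not call Tika. The class name TabCheck, its method names, and the tab positions in the sample string are assumptions made for illustration; they are not taken from the actual parsed output.

```java
import java.util.regex.Pattern;

public class TabCheck {
    // Hypothetical helper, not part of Tika.
    // Matches a run containing at least one tab (optionally mixed with spaces)
    // that sits between two non-whitespace characters on the same line.
    private static final Pattern INTER_WORD_TABS =
            Pattern.compile("(?<=\\S)[ ]*\\t[ \\t]*(?=\\S)");

    // Returns true if the text contains a tab between two words.
    public static boolean hasInterWordTabs(String text) {
        return INTER_WORD_TABS.matcher(text).find();
    }

    // Replaces each inter-word tab run with a single space.
    // Leading tabs (indentation) and trailing tabs are left untouched.
    public static String collapseInterWordTabs(String text) {
        return INTER_WORD_TABS.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Assumed sample: where exactly the tabs appear is a guess.
        String parsed = "Provides\t$17.7\tbillion in discretionary funding";
        System.out.println(hasInterWordTabs(parsed));
        System.out.println(collapseInterWordTabs(parsed));
    }
}
```

This only post-processes the extracted string; it does not address why the tabs are emitted in the first place.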
Thank you.