Konstantin created TIKA-1552:
--------------------------------

             Summary: Pdf document parser
                 Key: TIKA-1552
                 URL: https://issues.apache.org/jira/browse/TIKA-1552
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.7
            Reporter: Konstantin


Hello,
We found that when a PDF document has marked text inside a frame (table),
Tika inserts tabs between the words of the parsed text.
Original text:
Provides $17.7 billion in discretionary funding for the National Aeronautics 
and Space

Parsed text:
•        Provides       $17.7   billion in      discretionary   funding for     
the     National        Aeronautics     and     Space
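
Until the parser behaviour is addressed, one possible stopgap is to collapse
the tab runs after extraction. The sketch below assumes the parsed text is
already available as a String. TabNormalizer and collapseTabs are hypothetical
names used for illustration only and are not part of Tika. Note that it would
also flatten tabs that legitimately separate table cells, so it may not suit
every document.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: post-processes already-extracted text.
final class TabNormalizer {
    // One or more consecutive spaces or tabs (newlines are deliberately excluded).
    private static final Pattern SPACE_OR_TAB_RUN = Pattern.compile("[ \\t]+");

    private TabNormalizer() {
    }

    // Replaces every run of spaces and tabs with a single space.
    // Line breaks are left untouched.
    static String collapseTabs(String extracted) {
        return SPACE_OR_TAB_RUN.matcher(extracted).replaceAll(" ");
    }
}
```

Applied to the parsed text above, this turns the tab-separated words back into
single-space-separated words while keeping the original line breaks.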

Thank you.


