Konstantin created TIKA-1552:
--------------------------------

             Summary: Pdf document parser
                 Key: TIKA-1552
                 URL: https://issues.apache.org/jira/browse/TIKA-1552
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.7
            Reporter: Konstantin

Hello,

We found that when a PDF document has marked text inside a frame (table), Tika inserts tabs between the words when parsing it.

Original text:
Provides $17.7 billion in discretionary funding for the National Aeronautics and Space

Parsed text:
• Provides $17.7 billion in discretionary funding for the National Aeronautics and Space

Thank you.
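
A possible stopgap on the caller's side, until the parser behaviour is addressed, is to collapse the unwanted tabs in the extracted text. The sketch below is only an illustration that uses the plain JDK: `TabNormalizer` and `collapseTabs` are hypothetical names and not part of Tika, and the sample string with tabs is an assumption about where the tabs end up, since the exact positions are not visible in the report. It would also flatten legitimate tab-separated table cells, so it only suits text where tabs carry no meaning.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: post-processes already extracted text.
final class TabNormalizer {
    // A tab, together with any spaces before it and any spaces or tabs after it.
    private static final Pattern TAB_RUN = Pattern.compile(" *\\t[ \\t]*");

    // Replaces every run of whitespace that contains a tab with a single space.
    static String collapseTabs(String extracted) {
        return TAB_RUN.matcher(extracted).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Assumed shape of the parsed output; the real tab positions may differ.
        String parsed = "Provides\t$17.7\tbillion\tin discretionary funding";
        System.out.println(collapseTabs(parsed));
        // prints: Provides $17.7 billion in discretionary funding
    }
}
```

Line breaks are left untouched, so paragraph structure in the extracted text survives the clean-up.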