Saurabh Patil created TIKA-2650:
-----------------------------------
Summary: Soft-hyphen is not extracted properly
Key: TIKA-2650
URL: https://issues.apache.org/jira/browse/TIKA-2650
Project: Tika
Issue Type: Bug
Components: app
Affects Versions: 1.18
Reporter: Saurabh Patil
Attachments: Peter Rabbit.pdf
We are trying to extract text from a PDF. If a long word falls at the end of a
line, the PDF breaks it with a soft hyphen after the first part and the rest of
the word continues on the next line. When extracting this text, Tika
automatically replaces the soft hyphen with a space.
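
To make the symptom easier to check, here is a minimal sketch in plain Java
(it does not call Tika). It assumes the soft hyphen is the Unicode character
U+00AD. The class name, method names and sample strings are hypothetical and
are not taken from the attached PDF.

```java
public class SoftHyphenCheck {
    // Assumption: the soft hyphen in question is Unicode U+00AD.
    static final char SOFT_HYPHEN = '\u00AD';

    // Returns true if the extracted text still contains a soft hyphen.
    static boolean containsSoftHyphen(String text) {
        return text.indexOf(SOFT_HYPHEN) >= 0;
    }

    // Removes each soft hyphen, together with a line break that directly
    // follows it, so the two halves of the word are joined again.
    static String joinSoftHyphenBreaks(String text) {
        return text.replaceAll("\u00AD(\\r?\\n)?", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample: a word split by a soft hyphen at a line end.
        String inPdf = "gar\u00AD\nden";
        // Hypothetical shape of the reported output: a space instead.
        String observed = "gar den";

        System.out.println(containsSoftHyphen(inPdf));
        System.out.println(containsSoftHyphen(observed));
        System.out.println(joinSoftHyphenBreaks(inPdf).equals("garden"));
    }
}
```

If the soft hyphen were preserved in the extracted text, a caller could join
the word as in joinSoftHyphenBreaks. Once it has been replaced with a space,
the split word can no longer be told apart from two separate words.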
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)