Saurabh Patil created TIKA-2650:
-----------------------------------

             Summary: Soft-hyphen is not extracted properly
                 Key: TIKA-2650
                 URL: https://issues.apache.org/jira/browse/TIKA-2650
             Project: Tika
          Issue Type: Bug
          Components: app
    Affects Versions: 1.18
            Reporter: Saurabh Patil
         Attachments: Peter Rabbit.pdf

We are trying to extract text from a PDF. When a long word falls at the end of
a line, the PDF breaks it with a soft hyphen after the first part, and the rest
of the word continues on the next line. When extracting this text, Tika
automatically replaces the hyphen with a space.
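
To illustrate the difference, here is a minimal sketch in plain Java. It makes
no Tika calls and only models the behaviour described above. It assumes the
break is encoded as U+00AD (SOFT HYPHEN) followed by a newline; the class name,
method names, and sample string are hypothetical and are not taken from the
attached PDF or from Tika's code.

```java
public class SoftHyphenDemo {
    // U+00AD SOFT HYPHEN; assumed to be the character placed at the line break.
    static final char SOFT_HYPHEN = '\u00AD';

    // Models the reported output: the soft hyphen and line break become a space,
    // so the word is split in two.
    static String asReported(String raw) {
        return raw.replace(SOFT_HYPHEN + "\n", " ");
    }

    // Models the output one would hope for: the soft hyphen and line break are
    // dropped, so the two halves are rejoined into one word.
    static String rejoined(String raw) {
        return raw.replace(SOFT_HYPHEN + "\n", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample: "hyphenated" broken across two lines.
        String raw = "a long hyphen\u00AD\nated word";
        System.out.println(asReported(raw));
        System.out.println(rejoined(raw));
    }
}
```

Whether the hyphen should be dropped or kept as a visible "-" is a separate
choice; the point here is only that replacing it with a space splits the word.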

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
