Markus Jelsma created TIKA-975:
----------------------------------

             Summary: LinkBuilder to optionally collapse anchor whitespace
                 Key: TIKA-975
                 URL: https://issues.apache.org/jira/browse/TIKA-975
             Project: Tika
          Issue Type: Improvement
          Components: parser
    Affects Versions: 1.2
            Reporter: Markus Jelsma
            Priority: Minor
             Fix For: 1.3


Links extracted by the LinkContentHandler contain the verbatim anchor text. 
This is usually fine, but many websites spread the anchor text over 
multiple lines or indent it with tabs or spaces.

This patch adds a boolean option to LinkContentHandler with which whitespace 
collapsing can be toggled on or off. Default behaviour remains as-is and the 
API remains backward compatible.
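
As a rough illustration of the behaviour described above, here is a minimal 
standalone sketch. It is not the attached patch and does not use the Tika API: 
the class name AnchorText, the method collapse and the flag name 
collapseWhitespace are hypothetical. It assumes that "collapsing" means 
replacing each run of whitespace with a single space and trimming the ends.

```java
// Hypothetical helper for illustration only; not part of Tika.
final class AnchorText {
    private AnchorText() {}

    /**
     * Returns the anchor text, optionally with whitespace collapsed.
     * When collapseWhitespace is false the text is returned verbatim,
     * which mirrors the "default behaviour remains as-is" requirement.
     */
    static String collapse(String anchor, boolean collapseWhitespace) {
        if (!collapseWhitespace || anchor == null) {
            return anchor;
        }
        // Replace every run of spaces, tabs and line breaks with one space,
        // then strip leading and trailing whitespace.
        return anchor.replaceAll("\\s+", " ").trim();
    }
}
```

With the flag off, an anchor such as "  Read\n\t more  " is returned unchanged; 
with the flag on, it becomes "Read more".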

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
