[jira] [Commented] (TIKA-2054) Problem with ligatures converting from PDF to HTML with Tika

2016-08-12 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418760#comment-15418760 ]
Tim Allison commented on TIKA-2054:
You might try subclassing the XHTMLHandler/SafeContentHandler and

[jira] [Commented] (TIKA-2054) Problem with ligatures converting from PDF to HTML with Tika

2016-08-12 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418755#comment-15418755 ]
Tim Allison commented on TIKA-2054:
I don't think we want to modify our SafeContentHandler to stop