[
https://issues.apache.org/jira/browse/NUTCH-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680411#action_12680411
]
Sami Siren commented on NUTCH-705:
--
I think we should start looking at Apache Tika for most
[
https://issues.apache.org/jira/browse/NUTCH-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677878#action_12677878
]
Dmitry Lihachev commented on NUTCH-705:
---
Yes, it looks a bit like a problem... How can
[
https://issues.apache.org/jira/browse/NUTCH-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677508#action_12677508
]
Sami Siren commented on NUTCH-705:
--
I think that the patch contains some LGPL code that we
[
https://issues.apache.org/jira/browse/NUTCH-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677242#action_12677242
]
Dmitry Lihachev commented on NUTCH-705:
---
This parser correctly handles non-ASCII input