What bothers me here is not the time to commit, although I agree it
probably should have been longer than 1 day, but that AFAIK there is
very little documentation about Tika. That being said, both Chris and
Sami are committers for Tika. So if they both feel that Tika is mature
enough to use, and can help answer the inevitable questions on the Nutch
list about it, then I feel it would be okay to keep the changes.
Dennis Kubes
Andrzej Bialecki wrote:
Chris A. Mattmann (JIRA) wrote:
[ https://issues.apache.org/jira/browse/NUTCH-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann closed NUTCH-562.
-----------------------------------
- Patch applied to trunk in r583016
I think this issue didn't get enough attention before it was committed.
I agree with the direction of this patch - functionality-wise the mime
type detector in Tika is clearly superior to the one that we have now in
Nutch - but I feel that the use of an external framework, which is not
yet released, should be discussed first, and the proper working of the
patch should be confirmed by other users. There was too little time to
do this before the commit.
I vote for reverting this patch, unless there is an overall consensus
among Nutch developers that it's ok to keep it as it is - on one hand
considering the added functionality and simplification of Nutch code,
and on the other hand considering the (lack of) maturity of Tika.