[jira] Commented: (NUTCH-789) Improvements to Tika parser
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12853251#action_12853251 ] Julien Nioche commented on NUTCH-789: - Will upgrade as soon as 0.7 is available from http://repo1.maven.org/maven2/org/apache/tika/ - which is not the case yet. I will leave this issue open but unmark it as 1.1 Improvements to Tika parser --- Key: NUTCH-789 URL: https://issues.apache.org/jira/browse/NUTCH-789 Project: Nutch Issue Type: Improvement Components: fetcher Environment: reported by Sami, in NUTCH-766 Reporter: Chris A. Mattmann Assignee: Chris A. Mattmann Priority: Minor Fix For: 1.1 Attachments: NutchTikaConfig.java, TikaParser.java As reported by Sami in NUTCH-766, Sami has a few improvements he made to the Tika parser. We'll track that progress here. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (NUTCH-789) Improvements to Tika parser
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12853285#action_12853285 ] Chris A. Mattmann commented on NUTCH-789: - Hey Julien, Tika 0.7 is available from Maven central: http://repo1.maven.org/maven2/org/apache/tika/tika-parsers/ Cheers, Chris Improvements to Tika parser --- Key: NUTCH-789 URL: https://issues.apache.org/jira/browse/NUTCH-789 Project: Nutch Issue Type: Improvement Components: fetcher Environment: reported by Sami, in NUTCH-766 Reporter: Chris A. Mattmann Assignee: Chris A. Mattmann Priority: Minor Fix For: 1.1 Attachments: NutchTikaConfig.java, TikaParser.java As reported by Sami in NUTCH-766, Sami has a few improvements he made to the Tika parser. We'll track that progress here. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (NUTCH-789) Improvements to Tika parser
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12853212#action_12853212 ] Chris A. Mattmann commented on NUTCH-789: - Hey Julien -- okey dok, Tika 0.7 has been released. Feel free to upgrade, and close this one out...after that, I'll cut the Nutch 1.1 RC. Thanks! Cheers, Chris Improvements to Tika parser --- Key: NUTCH-789 URL: https://issues.apache.org/jira/browse/NUTCH-789 Project: Nutch Issue Type: Improvement Components: fetcher Environment: reported by Sami, in NUTCH-766 Reporter: Chris A. Mattmann Assignee: Chris A. Mattmann Priority: Minor Fix For: 1.1 Attachments: NutchTikaConfig.java, TikaParser.java As reported by Sami in NUTCH-766, Sami has a few improvements he made to the Tika parser. We'll track that progress here. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (NUTCH-789) Improvements to Tika parser
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852048#action_12852048 ] Chris A. Mattmann commented on NUTCH-789: - Folks, I'm going to put together an RC for Tika 0.7 and take care of JIRA now. Once I do that, we can try and close out this issue for 1.1. I should be able to do this before the 48 hr deadline I threw up for Nutch 1.1... Improvements to Tika parser --- Key: NUTCH-789 URL: https://issues.apache.org/jira/browse/NUTCH-789 Project: Nutch Issue Type: Improvement Components: fetcher Environment: reported by Sami, in NUTCH-766 Reporter: Chris A. Mattmann Assignee: Chris A. Mattmann Priority: Minor Fix For: 1.1 Attachments: NutchTikaConfig.java, TikaParser.java As reported by Sami in NUTCH-766, Sami has a few improvements he made to the Tika parser. We'll track that progress here. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (NUTCH-789) Improvements to Tika parser
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851316#action_12851316 ] Julien Nioche commented on NUTCH-789: - Shall we postpone the work on this issue to after 1.1? Improvements to Tika parser --- Key: NUTCH-789 URL: https://issues.apache.org/jira/browse/NUTCH-789 Project: Nutch Issue Type: Improvement Components: fetcher Environment: reported by Sami, in NUTCH-766 Reporter: Chris A. Mattmann Assignee: Chris A. Mattmann Priority: Minor Fix For: 1.1 Attachments: NutchTikaConfig.java, TikaParser.java As reported by Sami in NUTCH-766, Sami has a few improvements he made to the Tika parser. We'll track that progress here. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (NUTCH-789) Improvements to Tika parser
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851331#action_12851331 ] Andrzej Bialecki commented on NUTCH-789: - There are no diffs, so it's difficult to figure out what's changed ... I think that Tika will soon release v. 0.7 which may also impact this patch if we decide to upgrade before our release. I asked the Tika guys about their release, let's wait a couple days more. Improvements to Tika parser --- Key: NUTCH-789 URL: https://issues.apache.org/jira/browse/NUTCH-789 Project: Nutch Issue Type: Improvement Components: fetcher Environment: reported by Sami, in NUTCH-766 Reporter: Chris A. Mattmann Assignee: Chris A. Mattmann Priority: Minor Fix For: 1.1 Attachments: NutchTikaConfig.java, TikaParser.java As reported by Sami in NUTCH-766, Sami has a few improvements he made to the Tika parser. We'll track that progress here. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (NUTCH-789) Improvements to Tika parser
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833714#action_12833714 ] Sami Siren commented on NUTCH-789: -- It would be really useful to include the improvements in the functionality since that way almost all (-flash ?) parsers would be covered. Improvements to Tika parser --- Key: NUTCH-789 URL: https://issues.apache.org/jira/browse/NUTCH-789 Project: Nutch Issue Type: Improvement Components: fetcher Environment: reported by Sami, in NUTCH-766 Reporter: Chris A. Mattmann Assignee: Chris A. Mattmann Priority: Minor Fix For: 1.1 Attachments: NutchTikaConfig.java, TikaParser.java As reported by Sami in NUTCH-766, Sami has a few improvements he made to the Tika parser. We'll track that progress here. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.