[jira] [Commented] (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora
[ https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590246#comment-13590246 ]

kiran commented on NUTCH-874:

The following plugins need to be ported for compatibility in 2.x:
i) Feed
ii) parse-swf
iii) parse-ext
iv) parse-zip
v) parse-metatags (I wrote a patch for this earlier, NUTCH-1478)

Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora

Key: NUTCH-874
URL: https://issues.apache.org/jira/browse/NUTCH-874
Project: Nutch
Issue Type: Bug
Components: parser
Affects Versions: nutchgora
Environment: Nutch 2.0
Reporter: Chris A. Mattmann
Assignee: Chris A. Mattmann
Priority: Critical
Fix For: 2.2
Attachments: NUTCH-874.patch

I just noticed while fixing NUTCH-564 that the ExtParser hasn't been brought up to date with Nutch 2.0 trunk. We should review the plugins in src/plugin to make sure they all work with Gora/Nutchbase now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora
[ https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473654#comment-13473654 ]

Lewis John McGibbney commented on NUTCH-874:

Part 1 (e.g. removal of unused imports) committed at revision 1396850 in 2.x head.
[jira] [Commented] (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora
[ https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473827#comment-13473827 ]

Hudson commented on NUTCH-874:

Integrated in Nutch-nutchgora #375 (See [https://builds.apache.org/job/Nutch-nutchgora/375/])
NUTCH-874 Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora (part 1) (Revision 1396850)

Result = SUCCESS
lewismc :
Files :
* /nutch/branches/2.x/CHANGES.txt
* /nutch/branches/2.x/src/plugin/feed/src/java/org/apache/nutch/indexer/feed/FeedIndexingFilter.java
* /nutch/branches/2.x/src/plugin/feed/src/java/org/apache/nutch/parse/feed/FeedParser.java
* /nutch/branches/2.x/src/plugin/feed/src/test/org/apache/nutch/parse/feed/TestFeedParser.java
* /nutch/branches/2.x/src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext/ExtParser.java
* /nutch/branches/2.x/src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext/TestExtParser.java
* /nutch/branches/2.x/src/plugin/parse-swf/src/test/org/apache/nutch/parse/swf/TestSWFParser.java
* /nutch/branches/2.x/src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/TikaParser.java
* /nutch/branches/2.x/src/plugin/parse-zip/src/java/org/apache/nutch/parse/zip/ZipParser.java
* /nutch/branches/2.x/src/plugin/parse-zip/src/java/org/apache/nutch/parse/zip/ZipTextExtractor.java
* /nutch/branches/2.x/src/plugin/parse-zip/src/test/org/apache/nutch/parse/zip/TestZipParser.java
[jira] [Commented] (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora
[ https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180562#comment-13180562 ]

Lewis John McGibbney commented on NUTCH-874:

I know the heat has kind of shifted away from Nutchgora, but it would be great to clarify what this issue actually encapsulates. Was/is it the case that some plugins in Nutchgora are not actually working with the Nutchgora API? I'm kinda confused with this one!

Issue fields at the time of this comment:
Components: parser
Environment: Nutch 2.0
Reporter: Chris A. Mattmann
Assignee: Chris A. Mattmann
Priority: Critical
Fix For: nutchgora
[jira] Commented: (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora
[ https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896479#action_12896479 ]

Julien Nioche commented on NUTCH-874:

Some plugins have not been ported to the new API because it does not provide multi-valued parse results. See http://search.lucidimagination.com/search/document/844c48289f2d07db/nutchbase_multi_value_parseresult_missing#4ed6f352ebcce8ef

This is probably not the case for the ExtParser though. We could rely on Tika's mechanism for external parsing instead of maintaining ours. WDYT?

Issue fields at the time of this comment:
Components: parser
Environment: Nutch 2.0
Reporter: Chris A. Mattmann
Assignee: Chris A. Mattmann
Priority: Critical
Fix For: 2.0
[jira] Commented: (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora
[ https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896552#action_12896552 ]

Chris A. Mattmann commented on NUTCH-874:

Hey Julien,

I think Jukka already worked on something really similar to the ExtParser in Tika. See:
http://tika.apache.org/0.7/api/org/apache/tika/parser/ExternalParser.html

If we go that route here in Nutch, then I think we should add an encoding attribute similar to NUTCH-564 and flow it through in parse-tika then. If we can do that, I think we're good!

Cheers,
Chris
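
To illustrate the idea discussed in this thread (handing a document to an external command and decoding its output with an explicitly supplied encoding), here is a minimal sketch using only the JDK. It is not the Nutch ExtParser and not Tika's ExternalParser: the class name ExternalTextExtractor and the method extract are hypothetical, and the sketch assumes the command reads the document on stdin and writes the extracted text to stdout.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.util.List;

// Hypothetical helper for illustration only; not part of Nutch or Tika.
final class ExternalTextExtractor {

    // Runs `command`, pipes `content` to its stdin, and decodes its stdout
    // using the caller-supplied `encoding` (the "encoding attribute").
    static String extract(List<String> command, byte[] content, Charset encoding)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // keep stderr out of the result
                .start();

        // Write stdin on a separate thread so a full pipe buffer cannot block
        // this thread while it is reading stdout.
        Thread writer = new Thread(() -> {
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(content);
            } catch (IOException e) {
                // The command may exit before consuming all input; ignored in this sketch.
            }
        });
        writer.start();

        // Collect the raw bytes first; decode only once, with the given encoding.
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try (InputStream stdout = process.getInputStream()) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = stdout.read(buffer)) != -1) {
                collected.write(buffer, 0, read);
            }
        }

        writer.join();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("External command exited with code " + exitCode);
        }
        return new String(collected.toByteArray(), encoding);
    }
}
```

The point of the sketch is that the encoding is passed in by the caller instead of falling back to the platform default, so the same value can be carried from configuration through to wherever the bytes are decoded.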