Re: [VOTE] Apache Nutch 1.5.1 Release Candidate

2012-06-26 Thread Lewis John Mcgibbney
Hi Guys, The RC was pulled from the most recent commit to ./src/bin/nutch which I believe was to remove the -core developer CLI options which are now deprecated. This was a much cleaner option for providing a 1.5.1 branch and generating the relevant 1.5.1 tag. If it's required I can run a 1.5.

Build failed in Jenkins: Nutch-trunk #1881

2012-06-26 Thread Apache Jenkins Server
See -- Started by timer Building remotely on solaris1 in workspace hudson.util.IOException2: remote file operation failed:

Build failed in Jenkins: Nutch-nutchgora #293

2012-06-26 Thread Apache Jenkins Server
See -- Started by timer Building remotely on solaris1 in workspace hudson.util.IOException2: remote file operation failed:

ant build: central list of plugins

2012-06-26 Thread Sebastian Nagel
Plugins are "registered" multiple times, in build.xml, src/plugins/build.xml, and default.properties. This is error-prone and there are already some inconsistencies (trunk): build.xml: lib-http (given twice in target "release"), urlfilter-prefix (given twice in target "release"); default.proper

[jira] [Commented] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries

2012-06-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401489#comment-13401489 ] Markus Jelsma commented on NUTCH-1405: -- I don't know what the indexer entry is doing t

[jira] [Commented] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries

2012-06-26 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401465#comment-13401465 ] Julien Nioche commented on NUTCH-1405: -- Correct me if I'm wrong but doesn't this rep

[jira] [Commented] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries

2012-06-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401457#comment-13401457 ] Lewis John McGibbney commented on NUTCH-1405: - Looks good apart from the index

[jira] [Commented] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries

2012-06-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401445#comment-13401445 ] Markus Jelsma commented on NUTCH-1405: -- Any comments? > Allow to ov

[jira] [Commented] (NUTCH-1319) HostNormalizer

2012-06-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401275#comment-13401275 ] Hudson commented on NUTCH-1319: --- Integrated in nutch-trunk-maven #331 (See [https://builds.

[jira] [Commented] (NUTCH-1251) SolrDedup to use proper Lucene catch-all query

2012-06-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401269#comment-13401269 ] Hudson commented on NUTCH-1251: --- Integrated in nutch-trunk-maven #330 (See [https://builds.

[jira] [Updated] (NUTCH-1233) Rely on Tika for outlink extraction

2012-06-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1233: - Attachment: NUTCH-1233-1.6-1.patch Here's a new patch without garbage and it actually compiles an

[jira] [Updated] (NUTCH-1100) SolrDedup broken

2012-06-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1100: - Attachment: NUTCH-1100-1.6-1.patch I finally got around to this again and it is indeed a problem wit

[jira] [Resolved] (NUTCH-1251) SolrDedup to use proper Lucene catch-all query

2012-06-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma resolved NUTCH-1251. -- Resolution: Fixed Committed for 1.6 in rev. 1353857. Thanks Arkadi! > SolrDedu