[jira] [Updated] (NUTCH-1081) ant tests fail

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1081: Fix Version/s: (was: nutchgora) 2.1 Set and classify

[jira] [Updated] (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-874: Affects Version/s: nutchgora Fix Version/s: (was: nutchgora)

[jira] [Updated] (NUTCH-1167) Write JUnit tests for scoring-opic

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1167: Fix Version/s: (was: nutchgora) 2.1 Set and classify

[jira] [Updated] (NUTCH-1159) Write JUnit tests for index-anchor

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1159: Fix Version/s: (was: nutchgora) 2.1 Set and classify

[jira] [Updated] (NUTCH-896) Gora-based tests need to have their own config files

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-896: Fix Version/s: (was: nutchgora) 2.1 Set and classify

[jira] [Updated] (NUTCH-1162) Write JUnit tests for parse-js

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1162: Fix Version/s: (was: nutchgora) 2.1 Set and classify

[jira] [Resolved] (NUTCH-946) cache.jsp does not recognize encoding conversion from content different to UTF-8

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-946. Resolution: Won't Fix This issue is now deprecated and can't be fixed in current

Re: Suitable naming for Nutchgora branch?

2012-04-25 Thread Ferdy Galema
Hi Lewis, 2.1 is fine with me. This is assuming 2.x is a good naming scheme in the first place. I must say that since the move of Nutchgora from trunk to branch it's kind of odd that it's still referred to as 2.x. (For now that's okay I guess). Ferdy On Wed, Apr 25, 2012 at 10:46 AM, Lewis John

Re: Suitable naming for Nutchgora branch?

2012-04-25 Thread Julien Nioche
"I must say that since the move of Nutchgora from trunk to branch it's kind of odd that it's still referred to as 2.x. (For now that's okay I guess)." Moving it from the trunk made a lot of sense and has been abundantly discussed on this list. We had one stable version which is actively

[jira] [Updated] (NUTCH-1170) Write JUnit tests for urlfilter-validator

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1170: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1158) Write JUnit tests for all nutchgora plugins

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1158: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1163) Write JUnit tests for protocol-ftp

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1163: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1165) Write JUnit tests for protocol-sftp

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1165: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1160) Write JUnit tests for index-basic

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1160: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1161) Write JUnit tests for microformats-reltag plugin

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1161: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1166) Write JUnit tests for scoring-link

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1166: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1164) Write JUnit tests for protocol-http

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1164: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1168) Write JUnit tests for tld

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1168: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1277) Fix [fallthrough] javac warnings

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1277: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1104) Port issues from trunk NutchGora branch

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1104: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Commented] (NUTCH-1340) Increase scalability by only removing markers when they actually exist for DbUpdaterReducer

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262121#comment-13262121 ] Lewis John McGibbney commented on NUTCH-1340: Hi Ferdy. I am +1 for this

[jira] [Updated] (NUTCH-1283) Radically update all Solr configuration in Nutchgora

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1283: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1038) Port IndexingFiltersChecker to 2.0

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1038: Affects Version/s: nutchgora Fix Version/s: (was: nutchgora)

[jira] [Updated] (NUTCH-887) Delegate parsing of feeds to Tika

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-887: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-842) AutoGenerate WebPage code

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-842: Affects Version/s: nutchgora Fix Version/s: (was: nutchgora)

[jira] [Updated] (NUTCH-840) Port tests from parse-html to parse-tika

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-840: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Commented] (NUTCH-902) Add all necessary files and configuration so that nutch can be used with different backends out-of-the-box

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262160#comment-13262160 ] Lewis John McGibbney commented on NUTCH-902: I made some commits on this to in

[jira] [Updated] (NUTCH-956) solrindex issues

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-956: Fix Version/s: (was: nutchgora) 2.1 Set and Classify more

[jira] [Updated] (NUTCH-992) SolrDedup is broken in trunk

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-992: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Commented] (NUTCH-879) URL-s getting lost

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262171#comment-13262171 ] Lewis John McGibbney commented on NUTCH-879: This looks hellishly serious and

[jira] [Updated] (NUTCH-1026) Strip UTF-8 non-character codepoints

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1026: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1094) create comprehensive documentation for Nutchgora branch

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1094: Fix Version/s: (was: nutchgora) 2.1 create

[jira] [Updated] (NUTCH-970) Injector job crashes with MySQL with table collation set to utf8_general_ci

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-970: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-864) Fetcher generates entries with status 0

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-864: Affects Version/s: nutchgora Fix Version/s: (was: nutchgora)

[jira] [Updated] (NUTCH-841) Nutch 2.0 webapp

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-841: Affects Version/s: nutchgora Fix Version/s: (was: nutchgora)

[jira] [Updated] (NUTCH-1249) Resolve all issues flagged up by adding javac -Xlint argument

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1249: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-978) A Plugin for extracting certain element of a web page on html page parsing.

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-978: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1285) Debian Packaging for Nutch

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1285: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-1025) Add option not to commit to Solr

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1025: Fix Version/s: (was: nutchgora) 2.1 Set and Classify

[jira] [Updated] (NUTCH-944) Increase the number of elements to look for URLs and add the ability to specify multiple attributes by elements

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-944: Fix Version/s: (was: nutchgora) 2.1 1.6 Set

[jira] [Updated] (NUTCH-1290) crawlId not supported by all Tools

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1290: Patch Info: Patch Available crawlId not supported by all Tools

[jira] [Updated] (NUTCH-710) Support for rel=canonical attribute

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-710: Fix Version/s: (was: nutchgora) 2.1 1.6 Set

[jira] [Updated] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a ?

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-797: Affects Version/s: nutchgora Fix Version/s: (was: nutchgora)

[jira] [Updated] (NUTCH-979) Add support for deleting Solr documents with ProtocolStatusCodes.NOTFOUND

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-979: Patch Info: Patch Available Add support for deleting Solr documents with

[jira] [Updated] (NUTCH-979) Add support for deleting Solr documents with ProtocolStatusCodes.NOTFOUND

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-979: Fix Version/s: (was: nutchgora) 2.1 Some work to be done Set

[jira] [Updated] (NUTCH-849) different versions of the same library in nutch-2.0-dev.job and local\lib directory

2012-04-25 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-849: Affects Version/s: nutchgora 1.4 Fix Version/s:

Re: Suitable naming for Nutchgora branch?

2012-04-25 Thread Lewis John Mcgibbney
Hi Everyone, As you guys will have seen I've quickly polluted our dev list again (sorry!!!) with set and classify for 2.1. The open issues for 2.0 are ones which I think we could address within the 2.0 release. This is merely my opinion, based upon the assertion that they all contain patches

Re: Suitable naming for Nutchgora branch?

2012-04-25 Thread Mattmann, Chris A (388J)
Great work Lewis, thanks! Cheers, Chris On Apr 25, 2012, at 4:01 PM, Lewis John Mcgibbney wrote: Hi Everyone, As you guys will have seen I've quickly polluted our dev list again (sorry!!!) with set and classify for 2.1. The open issues for 2.0 are ones which I think we could address