Running individual test classes from nutch script cont'd

2011-07-17 Thread lewis john mcgibbney
Hi, OK this stems from discussion on the user@ list a while ago [1] and my discovery of NUTCH-672 yesterday. I attached a patch, which fails completely, as I hadn't uncovered things I now know. The original patch submitted for the issue would have been fine for <= Nutch 1.2 but now as the file

[jira] [Updated] (NUTCH-1057) Make fetcher thread time out configurable

2011-07-17 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1057: Attachment: NUTCH-1057-1.4-1.patch Patch for 1.4. There's also a diff for NUTCH-1037 in the

[jira] [Updated] (NUTCH-1043) Add pattern for filtering .js in default url filters

2011-07-17 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1043: Patch Info: [Patch Available] Add pattern for filtering .js in default url filters

[jira] [Updated] (NUTCH-1023) Trivial error in error message for org.apache.nutch.crawl.LinkDbReader

2011-07-17 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1023: Patch Info: [Patch Available] Trivial error in error message for

[jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support

2011-07-17 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-961: Attachment: NUTCH-961-1.4-dombuilder-1.patch With BP enabled you can get an

[jira] [Updated] (NUTCH-965) Skip parsing for truncated documents

2011-07-17 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-965: Patch Info: [Patch Available] Skip parsing for truncated documents

[jira] [Commented] (NUTCH-1044) Redirected URLs and possibly all of their outlinked URLs have invalid scores.

2011-07-17 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066649#comment-13066649 ] Markus Jelsma commented on NUTCH-1044: Can you provide a patch? Redirected URLs

adding details to mvn.template?

2011-07-17 Thread lewis john mcgibbney
Hi, Quick question, I've been looking at various issues dealt with prior to Nutch 1.3 release in particular NUTCH-995. Please excuse (and correct) my ignorance, but I need to clear this one up so I understand correctly. The purpose the mvn.template file serves is so we can specify exactly who

Re: adding details to mvn.template?

2011-07-17 Thread Julien Nioche
Please excuse (and correct) my ignorance, but I need to clear this one up so I understand correctly. The purpose the mvn.template file serves is so we can specify exactly who can commit a Nutch maven pom. The pom in turn specifies the build dirs e.g. source dir as well as test dir. Then finally

[jira] [Commented] (NUTCH-1019) Edit comment in org.apache.nutch.crawl.Crawl to reflect removal of legacy

2011-07-17 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066731#comment-13066731 ] Lewis John McGibbney commented on NUTCH-1019: Committed at revision 1147712.

[jira] [Updated] (NUTCH-1059) Remove convdb command from /bin/nutch

2011-07-17 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1059: Attachment: NUTCH-1059-remove-convdb.patch The patch simply removes both the

Build failed in Jenkins: Nutch-trunk #1549

2011-07-17 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-trunk/1549/ -- [...truncated 985 lines...] A src/plugin/subcollection/src/java/org/apache/nutch/collection A src/plugin/subcollection/src/java/org/apache/nutch/collection/Subcollection.java A