[jira] [Updated] (NUTCH-1304) GeneratorMapper.java dosen't return when skipping and already generated mark

2012-03-08 Thread Dan Rosher (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Rosher updated NUTCH-1304: -- Attachment: NUTCH-1304.patch GeneratorMapper.java dosen't return when skipping and already

[jira] [Created] (NUTCH-1304) GeneratorMapper.java dosen't return when skipping and already generated mark

2012-03-08 Thread Dan Rosher (Created) (JIRA)
GeneratorMapper.java dosen't return when skipping and already generated mark Key: NUTCH-1304 URL: https://issues.apache.org/jira/browse/NUTCH-1304 Project: Nutch

[jira] [Commented] (NUTCH-1304) GeneratorMapper.java dosen't return when skipping and already generated mark

2012-03-08 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225159#comment-13225159 ] Lewis John McGibbney commented on NUTCH-1304: - +1 for commit. I'll wait until

[jira] [Commented] (NUTCH-1300) Indexer to normalize URL's

2012-03-08 Thread Sebastian Nagel (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225174#comment-13225174 ] Sebastian Nagel commented on NUTCH-1300: +1 * effective fix for a serious problem:

[jira] [Updated] (NUTCH-1305) Domain(blacklist)URLFilter to trim entries

2012-03-08 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1305: - Attachment: NUTCH-1305-1.5-1.patch Patch for 1.5. Fixes the issue.

[jira] [Created] (NUTCH-1305) Domain(blacklist)URLFilter to trim entries

2012-03-08 Thread Markus Jelsma (Created) (JIRA)
Domain(blacklist)URLFilter to trim entries -- Key: NUTCH-1305 URL: https://issues.apache.org/jira/browse/NUTCH-1305 Project: Nutch Issue Type: Bug Affects Versions: 1.4 Reporter:

[jira] [Commented] (NUTCH-1305) Domain(blacklist)URLFilter to trim entries

2012-03-08 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225206#comment-13225206 ] Lewis John McGibbney commented on NUTCH-1305: - +1

[jira] [Resolved] (NUTCH-1305) Domain(blacklist)URLFilter to trim entries

2012-03-08 Thread Markus Jelsma (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma resolved NUTCH-1305. -- Resolution: Fixed Committed for 1.5 in rev. 1298394.

[jira] [Commented] (NUTCH-1305) Domain(blacklist)URLFilter to trim entries

2012-03-08 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225209#comment-13225209 ] Markus Jelsma commented on NUTCH-1305: -- Thanks Lewis.

[jira] [Created] (NUTCH-1306) Commit after finished writing to solr index

2012-03-08 Thread Dan Rosher (Created) (JIRA)
Commit after finished writing to solr index --- Key: NUTCH-1306 URL: https://issues.apache.org/jira/browse/NUTCH-1306 Project: Nutch Issue Type: Improvement Components: indexer Affects

[jira] [Updated] (NUTCH-1306) Commit after finished writing to solr index

2012-03-08 Thread Dan Rosher (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Rosher updated NUTCH-1306: -- Attachment: NUTCH-1306.patch Commit after finished writing to solr index

NutchGora release, and Nutch 1.x trunk release

2012-03-08 Thread Mattmann, Chris A (388J)
Hey Guys, I've got some cycles this weekend -- anyone up for a 1.5 release off trunk (stable), and a NutchGora branch release? I suggested this before [1] regarding NutchGora. I'm inclined to say let's do the following: 1. NutchGora: apache-nutch-2.0 - release 2.x series based on this branch 2.

[jira] [Commented] (NUTCH-1305) Domain(blacklist)URLFilter to trim entries

2012-03-08 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225232#comment-13225232 ] Hudson commented on NUTCH-1305: --- Integrated in nutch-trunk-maven #187 (See

Re: NutchGora release, and Nutch 1.x trunk release

2012-03-08 Thread Markus Jelsma
+1 1.5 has, again, many fixes and improvements, just as 1.4 had over 1.3. But I'd like to integrate Tika 1.1 after its pending release. Cheers On Thursday 08 March 2012 15:38:15 Mattmann, Chris A (388J) wrote: Hey Guys, I've got some cycles this weekend -- anyone up for a 1.5 release off

Re: NutchGora release, and Nutch 1.x trunk release

2012-03-08 Thread Lewis John Mcgibbney
Yeah I agree Chris Markus. On the Nutchgora note, I would like to see Gora 0.2 released beforehand, as we have a blocking issue NUTCH-1205 with Ivy retrieving alien Gora 0.2-SNAPSHOT dependencies from repository.apache.org. We should be able to overcome this issue by releasing Gora 0.2 to

[jira] [Created] (NUTCH-1307) Improve formatting of ant targets for clearer project help

2012-03-08 Thread Lewis John McGibbney (Created) (JIRA)
Improve formatting of ant targets for clearer project help -- Key: NUTCH-1307 URL: https://issues.apache.org/jira/browse/NUTCH-1307 Project: Nutch Issue Type: New Feature

Re: NutchGora release, and Nutch 1.x trunk release

2012-03-08 Thread Mattmann, Chris A (388J)
Hey Guys, OK, sounds good. Looks like we need to wait for the Tika 1.1 release (seems to be going well so far), and then try and push Gora 0.2 (which I know Lewis is pushing, and which I'm happy to RM once we're ready there). So, maybe I'll shoot for next weekend or the weekend after to push

Re: NutchGora release, and Nutch 1.x trunk release

2012-03-08 Thread Ferdy Galema
+1 for pushing Gora 0.2 prior to the Nutchgora 2.0 RC. For Nutchgora, besides Nutch-1205 the only thing I'm a bit concerned about is Nutch-1253. This seems like a blocker to me, and I think it only affects Nutch trunk. (Though I'm not sure). On Thu, Mar 8, 2012 at 4:32 PM, Mattmann, Chris A

[jira] [Updated] (NUTCH-1307) Improve formatting of ant targets for clearer project help

2012-03-08 Thread Lewis John McGibbney (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1307: Attachment: NUTCH-1307-trunk.patch NUTCH-1307-nutchgora.patch

[jira] [Closed] (NUTCH-1307) Improve formatting of ant targets for clearer project help

2012-03-08 Thread Lewis John McGibbney (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-1307. --- Improve formatting of ant targets for clearer project help

[jira] [Resolved] (NUTCH-1307) Improve formatting of ant targets for clearer project help

2012-03-08 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1307. - Resolution: Fixed Committed @ revision 1298437 in Nutchgora branch Committed @

[jira] [Resolved] (NUTCH-1304) GeneratorMapper.java dosen't return when skipping and already generated mark

2012-03-08 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1304. - Resolution: Fixed Committed @ revision 1298444 in Nutchgora branch Thank you

[jira] [Commented] (NUTCH-1304) GeneratorMapper.java dosen't return when skipping and already generated mark

2012-03-08 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225270#comment-13225270 ] Lewis John McGibbney commented on NUTCH-1304: - Please close this one off when

[jira] [Commented] (NUTCH-1307) Improve formatting of ant targets for clearer project help

2012-03-08 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225299#comment-13225299 ] Hudson commented on NUTCH-1307: --- Integrated in nutch-trunk-maven #188 (See

[jira] [Commented] (NUTCH-728) Improve nutch release packaging

2012-03-08 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225389#comment-13225389 ] Lewis John McGibbney commented on NUTCH-728: Looking at this, then at what we

[jira] [Commented] (NUTCH-882) Design a Host table in GORA

2012-03-08 Thread Mathijs Homminga (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225486#comment-13225486 ] Mathijs Homminga commented on NUTCH-882: Status: I have updated the patches to

[jira] [Commented] (NUTCH-1278) Fetch Improvement in threads per host

2012-03-08 Thread Ferdy Galema (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225545#comment-13225545 ] Ferdy Galema commented on NUTCH-1278: - I noticed you used the diff command this time,

[jira] [Updated] (NUTCH-841) Nutch 2.0 webapp

2012-03-08 Thread Ferdy Galema (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema updated NUTCH-841: --- Priority: Major (was: Blocker) Nutch 2.0 webapp Key:

[jira] [Commented] (NUTCH-841) Nutch 2.0 webapp

2012-03-08 Thread Chris A. Mattmann (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225551#comment-13225551 ] Chris A. Mattmann commented on NUTCH-841: - Yep not a blocker!