[jira] Updated: (NUTCH-339) Refactor nutch to allow fetcher improvements

2006-11-25 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-339?page=all ] Andrzej Bialecki updated NUTCH-339: Attachment: patch4-fixed.txt Sorry, the patch was incomplete - please try patch4-fixed.txt instead. Refactor nutch to allow fetcher improvements

RE: [jira] Created: (NUTCH-408) Plugin development documentation

2006-11-25 Thread Armel T. Nene
I agree with you that documentation is vital, not just for extending the current version but also for any plugins and patches created. I have been spending almost two weeks trying to adapt nutch to my project, but I spend more time reading the code and trying to understand what it does before I can

Re: [jira] Created: (NUTCH-408) Plugin development documentation

2006-11-25 Thread Stefan Groschupf
Did you ever browse this: http://wiki.media-style.com/display/nutchDocu/Home Nothing big, but it will give you some ideas, also about plugins. On 25.11.2006, at 06:32, Armel T. Nene wrote: I agree with you that documentation is vital, not just for extending the current version but also for

[jira] Commented: (NUTCH-408) Plugin development documentation

2006-11-25 Thread nutch.newbie (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-408?page=comments#action_12452610 ] nutch.newbie commented on NUTCH-408: Yes, I have gone through the media style documentation and it is a good start, and there are also some very good

[jira] Updated: (NUTCH-409) Add short circuit notion to filters to speedup mixed site/subsite crawling

2006-11-25 Thread Doug Cook (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-409?page=all ] Doug Cook updated NUTCH-409: Attachment: shortcircuit.patch Add short circuit notion to filters to speedup mixed site/subsite crawling

Re: More fetcher speed increases

2006-11-25 Thread Doug Cook
Done. See http://issues.apache.org/jira/browse/NUTCH-409 This is my first Nutch contribution, so hopefully I've got it right ;-) Any suggestions/questions/feedback welcome. Hope this is useful to others. D scott green wrote: Hi Doug, Your idea about PrefixURLFilter and

[jira] Commented: (NUTCH-409) Add short circuit notion to filters to speedup mixed site/subsite crawling

2006-11-25 Thread Doug Cook (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-409?page=comments#action_12452617 ] Doug Cook commented on NUTCH-409: I should also note that this approach is still not optimal (though it is faster for my usage pattern). I'm still running the