Query strings can be tricky. The best approach is to use the regex URL filter to avoid such pages.
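
As a rough illustration of the idea, here is a minimal Python sketch of a first-match-wins filter in the "+regex / -regex" style. The rule list, the `accept` function and the example URLs are all made up for this illustration and are not taken from any shipped Nutch configuration; treat it as a sketch under those assumptions only.

```python
import re

# Hypothetical rules in "+regex / -regex" style (assumption, not a real config).
# The first rule whose pattern matches decides the outcome.
RULES = [
    ("-", re.compile(r"[?&=]")),  # reject URLs that appear to carry a query string
    ("+", re.compile(r".")),      # accept any other non-empty URL
]


def accept(url):
    """Return True if the first matching rule is a '+' rule, else False."""
    for sign, pattern in RULES:
        if pattern.search(url):
            return sign == "+"
    return False  # no rule matched: drop the URL


if __name__ == "__main__":
    for candidate in ("http://example.com/page.html",
                      "http://example.com/search?q=nutch"):
        print(candidate, accept(candidate))
```

The order of the rules matters here: the reject rule has to come before the catch-all accept rule, otherwise every URL would be accepted.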

- Grouping by same host (already posted on this issue, and it looks like you are working towards a solution. I'm excited to try this out once it is implemented)

It's the weekend, so it's open source hacking time. ;-) I hope to deliver a patch within the next 2 days.

We should start a Wiki page listing such folks, like the Lucene Support page (http://wiki.apache.org/jakarta-lucene/Support). Can someone please add such a page to the Nutch wiki?

B-] I will do it!

Cheers,
Stefan

---------------------------------------------------------------
enterprise information technology consulting
open technology:   http://www.media-style.com
open source:           http://www.weta-group.net
open discussion:    http://www.text-mining.org



_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
