uery strings can be tricky. The best approach is to use the regex url
filter to avoid such pages.
- Grouping by same host (I already posted on this issue, and it
looks like you are working towards a solution. I'm excited to try
this out once it is implemented)
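
On the query-string point above, here is a minimal sketch of how a regex-based URL filter can reject such pages. It assumes rules of the form "sign plus regex" ('-' rejects, '+' accepts) where the first matching rule decides. The class name QueryUrlFilter and the example URLs are hypothetical, and this is not Nutch's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only, not part of Nutch.
public class QueryUrlFilter {
    // One rule: a sign ('+' accept, '-' reject) and a regex searched for anywhere in the URL.
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, String regex) {
            this.accept = accept;
            this.pattern = Pattern.compile(regex);
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    public QueryUrlFilter(String... lines) {
        for (String line : lines) {
            if (line.isEmpty()) {
                throw new IllegalArgumentException("Rule must not be empty");
            }
            char sign = line.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            rules.add(new Rule(sign == '+', line.substring(1)));
        }
    }

    // The first matching rule decides; a URL that matches no rule is rejected.
    public boolean accepts(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Reject URLs containing characters typical of query strings, accept the rest.
        QueryUrlFilter filter = new QueryUrlFilter("-[?*!@=]", "+.");
        System.out.println(filter.accepts("http://example.com/page.html"));
        System.out.println(filter.accepts("http://example.com/list?page=2"));
    }
}
```

Rule order matters in this sketch: the reject rule has to come before the catch-all accept rule, otherwise every URL would be accepted.
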
It's the weekend, so it's open source hacking time. ;-) I hope to
deliver a patch within the next 2 days.
We should start a Wiki page listing such folks, like the Lucene
Support page (http://wiki.apache.org/jakarta-lucene/Support). Can
someone please add such a page to the Nutch wiki?
B-]
I will do it!
Cheers,
Stefan
---------------------------------------------------------------
enterprise information technology consulting
open technology: http://www.media-style.com
open source: http://www.weta-group.net
open discussion: http://www.text-mining.org
_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers