On Fri, 2006-12-08 at 12:54 +0100, Andrzej Bialecki wrote:
> Prefix filter to cut off anything without "http://". And then a
> (non-existent) domain-suffix filter, which considers only domain
> suffixes - this is easy to implement based on the suffix filter that
> ships with Nutch.
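
For illustration, the two checks described above could be sketched in plain Java roughly as follows. This is only a standalone sketch under stated assumptions: it does not use Nutch's plugin API or the suffix filter that ships with Nutch, and the class name DomainSuffixSketch, the method filter, and the suffix list are all hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Standalone sketch; NOT a Nutch plugin and not wired into Nutch's filter chain.
final class DomainSuffixSketch {

    // Hypothetical allow-list; a real filter would load this from configuration.
    private static final List<String> ALLOWED_SUFFIXES =
            Arrays.asList("org", "edu", "example.com");

    /** Returns the URL unchanged if it passes both checks, otherwise null. */
    static String filter(String url) {
        // Step 1: prefix check, drop anything that does not start with "http://".
        if (url == null || !url.startsWith("http://")) {
            return null;
        }

        // Step 2: extract the host; unparsable URLs are rejected.
        String host;
        try {
            host = new URI(url).getHost();
        } catch (URISyntaxException e) {
            return null;
        }
        if (host == null) {
            return null;
        }
        host = host.toLowerCase(Locale.ROOT);

        // Step 3: domain-suffix check, looking only at the host name and
        // matching on label boundaries so "notexample.com" does not pass
        // for the suffix "example.com".
        for (String suffix : ALLOWED_SUFFIXES) {
            if (host.equals(suffix) || host.endsWith("." + suffix)) {
                return url;
            }
        }
        return null;
    }
}
```

With the suffix list assumed above, filter("http://www.apache.org/nutch") returns the URL unchanged, while filter("ftp://ftp.apache.org/") and filter("http://notexample.com/") both return null.
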
Right.. I don't know Java but I'll give it a shot. Thanks :-)

-Rob

_______________________________________________
Nutch-general mailing list
Nutch-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-general