Like Shane, I have also considered developing a filtered search engine, one that is child-safe.
Please let me know if the following is possible:

1) Add all sites appearing in the Open Directory adult categories to a "do not index" list.
2) Use filter/stop words to remove most profanity from the index. (I think there is a workaround: people can put quotes around words to search past filter words in Nutch.)

One final question: is stemming available in Nutch? There are instances where it can be either a good thing or a problem. An example is the common last name "Sexton": if "sex" were a filter word, would that name be filtered out of the index? Just curious.

I would rather develop an algorithm for scoring the content of a webpage. I know that not every use of the word "sex" is pornographic.

Thanks,
Barry Bowen
580-916-0339

_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
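To make the "Sexton" question concrete, here is a minimal sketch in plain Python. It does not use Nutch or any of its APIs; the filter list and every function name below are made up for illustration. It contrasts a naive substring check with a whole-word check, and adds a very crude page score of the kind mentioned above (the fraction of words on a page that are filter words).

```python
import re

# Hypothetical filter list, for illustration only.
FILTER_WORDS = {"sex"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def substring_match(text, filter_words=FILTER_WORDS):
    """Naive check: flag the text if a filter word appears anywhere,
    even inside a longer word."""
    lowered = text.lower()
    return any(word in lowered for word in filter_words)


def token_match(text, filter_words=FILTER_WORDS):
    """Whole-word check: flag the text only if a complete token
    is a filter word."""
    return any(token in filter_words for token in tokenize(text))


def filter_word_score(text, filter_words=FILTER_WORDS):
    """Crude score: fraction of tokens that are filter words.
    Returns 0.0 for text with no tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in filter_words)
    return hits / len(tokens)


if __name__ == "__main__":
    name = "John Sexton"
    print(substring_match(name))  # substring check flags the name
    print(token_match(name))      # whole-word check does not
```

Under these assumptions, the name is only caught when matching is done on substrings rather than on whole words. Whether a stemmer would reintroduce the problem depends on the stemming rules used, which this sketch does not model. A real scoring approach would also need weights and a threshold tuned on actual pages.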
