Hi everyone,
Sorry for the confusion in the project description.

Anansi is a distributed web crawler built purely for research purposes. It behaves politely, complying with the rules in the "robots.txt" file defined by each URI host it visits. The goal of the project is to explore as many URIs as possible that can be directly reached by the public. The URIs returned to the server are used only by a scheduling algorithm implemented on a Map-Reduce framework, so *NO harm* is done to any organization or individual.

The information collected by Anansi is nothing more than the URIs themselves:
*NO page content* will be returned;
*NO e-mail addresses* will be returned;
*NO user names or passwords* will be returned.

Each task contains one URI at a time, in order to reduce the load on volunteers.

The project has been running internally on a couple of machines for a while, and we are now looking for help from the public.

The project link: http://canis.csc.ncsu.edu:8005/anansi

Thanks for your participation,

-Kunsheng

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
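P.S. For anyone curious what "complying with robots.txt" means in practice, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The function name and the sample rules are illustrative assumptions only, not code from Anansi itself:

```python
from urllib import robotparser

def allowed_to_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    # Hypothetical helper: parse a host's robots.txt and ask whether
    # the given user agent may fetch the given URL. A polite crawler
    # performs a check like this before requesting any URI.
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)

# Example robots.txt that forbids /private/ to all agents.
rules = "User-agent: *\nDisallow: /private/\n"
print(allowed_to_fetch(rules, "Anansi", "http://example.com/public/page"))   # True
print(allowed_to_fetch(rules, "Anansi", "http://example.com/private/page"))  # False
```

A production crawler would additionally fetch and cache each host's robots.txt and honor any crawl-delay hints, but the allow/deny decision reduces to a check like the one above.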
