Hi everyone,

Sorry for the confusion in the project description.


Anansi is a distributed web crawler intended only for research purposes. It 
politely complies with the rules in the "robots.txt" file published by each 
URI host it visits.
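
For anyone curious what "complying with robots.txt" looks like in practice, 
here is a minimal sketch in Python using the standard library's 
urllib.robotparser. The user-agent string "AnansiBot", the helper name, and 
the example URL are placeholders for illustration only; the actual crawler's 
implementation may differ.

    from urllib.robotparser import RobotFileParser
    from urllib.parse import urljoin, urlparse

    USER_AGENT = "AnansiBot"  # placeholder user-agent string, not the real one

    def allowed_to_crawl(url: str) -> bool:
        """Return True only if the host's robots.txt permits fetching `url`."""
        root = "{0.scheme}://{0.netloc}".format(urlparse(url))
        parser = RobotFileParser()
        parser.set_url(urljoin(root, "/robots.txt"))
        try:
            parser.read()  # fetch and parse the host's robots.txt
        except OSError:
            return False   # be conservative if robots.txt cannot be fetched
        return parser.can_fetch(USER_AGENT, url)

    # Example: only fetch the page if the host allows it
    if allowed_to_crawl("http://example.com/some/page.html"):
        pass  # fetch the page and extract new URIs here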


The goal of the project is to discover as many URIs as possible that can be 
directly reached by the public. The URIs returned to the server are used only 
by a scheduling algorithm implemented with a Map-Reduce framework, so *NO 
harm* is done to any organization or individual.



Information collected by Anansi is nothing more than the URIs themselves:

*NO page content* will be returned;

*NO e-mail addresses* will be returned;

*NO usernames/passwords* will be returned.


Each task contains a single URI, in order to reduce the load on volunteers.

The project has been running internally on a couple of machines for a while, 
and we are now looking for help from the public.

The project link is http://canis.csc.ncsu.edu:8005/anansi



Thanks for your participation,

-Kunsheng

