I am sorry. Where you mentioned the "clear web", I mistook hidden services to be the Deep Web [1]. That's what I meant by "Dark Web".
[1]: http://en.wikipedia.org/wiki/Deep_Web

On Tue, Feb 25, 2014 at 9:41 PM, George Kadianakis <[email protected]> wrote:

> Vighnesh Birodkar <[email protected]> writes:
>
> > Hello
> >
> > I found a couple of ideas from the Ideas Page interesting. I was a
> > GSoC student for SimpleCV last year. In the past I've programmed in
> > C, C++, Java and Python.
> >
> > Following are my queries.
> >
> > 1. Search for Hidden Services.
> >
> > I apologize in advance if there is something obviously wrong with my
> > idea. The Dark Web consists of information that cannot be crawled
> > because it doesn't appear as hyperlinks in other pages. But someone
> > somewhere will always have access to this information, either by
> > entering search queries, through subscriptions or by logging in.
> > What if we can index all the pages a browser visits? Users can
> > voluntarily install and enable or disable a plugin in their
> > browsers. This plugin will process (and maybe index) pages locally
> > and upload its data to servers which will hold the global index.
>
> I'm not sure what you mean by 'Dark Web', but if you mean 'Tor Hidden
> Services' it _is_ possible to crawl and index onion addresses. For
> example, if you google for ".onion" and check through the first few
> result pages you can find dozens of onion addresses. If you then crawl
> those pages you will get even more onion addresses.
>
> Then the question is how you present those onion addresses to the user
> of the search engine. Users should be able to search for terms and get
> accurate results (popularity tracking, backlinks, etc. should be used
> to reduce phishing). The search engine should also be able to give a
> short description of each hidden service (e.g. by scraping its
> contents, or by the community editing the description, or by using
> official descriptions [0], or ...).
>
> Assuming that all of the above are solved, we might get to the point
> where we have indexed all the potentially visible onion addresses, and
> that's where your browser extension idea might be useful. However, we
> are currently quite far away from that situation. I also doubt that
> many users of hidden services would install a browser extension to
> index Hidden Services that have been intentionally kept secret (and
> hence not found by conventional crawling).
>
> [0]: https://ahmia.fi/documentation/descriptionProposal/
>
> _______________________________________________
> tor-dev mailing list
> [email protected]
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
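To make the seed-and-expand crawl you describe concrete, here is a rough Python sketch. It is not from this thread: the seed URLs, the Tor SOCKS proxy settings and the crawl_onions helper are my own assumptions, and it needs the requests library installed with SOCKS support (requests[socks]).

    # Sketch of the crawl described above: start from a few pages known to
    # list .onion addresses, extract every onion address found, and keep
    # crawling the newly discovered services.
    import re
    from collections import deque

    import requests  # assumed dependency; needs requests[socks] for the proxy below

    # 16-character base32 v2 onion addresses (the format in use in 2014).
    ONION_RE = re.compile(r"\b[a-z2-7]{16}\.onion\b")

    # Route requests through a local Tor SOCKS proxy (assumed to run on 9050).
    PROXIES = {"http": "socks5h://127.0.0.1:9050",
               "https": "socks5h://127.0.0.1:9050"}

    def crawl_onions(seed_urls, max_pages=100):
        """Breadth-first crawl that collects every .onion address it sees."""
        seen_pages, found_onions = set(), set()
        frontier = deque(seed_urls)
        while frontier and len(seen_pages) < max_pages:
            url = frontier.popleft()
            if url in seen_pages:
                continue
            seen_pages.add(url)
            try:
                body = requests.get(url, proxies=PROXIES, timeout=30).text
            except requests.RequestException:
                continue  # skip unreachable pages and move on
            for onion in ONION_RE.findall(body):
                if onion not in found_onions:
                    found_onions.add(onion)
                    frontier.append("http://" + onion + "/")  # crawl the new service too
        return found_onions

    # Hypothetical usage; the seed URL is only a placeholder.
    # print(crawl_onions({"https://example.org/onion-link-list"}))

This only finds addresses that are already linked or listed somewhere public, which is exactly the limitation discussed above for services that are kept secret on purpose.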
