I have come up with a scheme that would allow searching of the Freenet network and, as an added benefit, works entirely within the network. No external daemons or schemes are required to support this search method. It does have the drawback of falling victim to DSB and garbage collection, just like all data on Freenet. I believe the benefits will outweigh the drawbacks, however.
The scheme is actually quite simple and I am very interested in working on this project. I am hoping to get some other people who might have more development experience than me to help out too.

The system is broken up into 3 distinct parts - the spider, the indexer, and the indexes. The spider is obvious in its implementation - walk all the pages it can and feed them to the indexer. The indexer is also fairly trivial in its operation. I will demonstrate how it operates and what the indexes look like in one example.

Let's say we have 2 pages, pageA and pageB.

pageA contains:

    hello this is a test

and pageB contains:

    hello this is not a test

The first step is to rip apart each page into distinct words. We are going to use each word as a Freenet key, and the data associated with each key is the list of locations that contained that word. In our simple example we would have the following key/value pairs:

    hello: pageA pageB
    this: pageA pageB
    is: pageA pageB
    not: pageB
    a: pageA pageB
    test: pageA pageB

Of course, in real-world use the data would be complete Freenet URIs. Once our lists are complete we compress the text files and insert them under an SSK to avoid tampering. A DBR could also be used to allow us to update the indexes on a set interval - probably on the order of a week or so, to avoid killing the network with traffic.

When a client wants to perform a search, they simply request Freenet keys. Let's say Frank the Freenet user wants to search for "not test". He is going to request the following keys: SSK@abc/index/not and then SSK@abc/index/test - all of the URIs that are common between them are the results of his search. This offloads all boolean operations and complex search operations onto the client and uses Freenet only as a storage medium for the indexes - very favorable.

This search scheme could be implemented easily as external applications that request information from the node, or could even be integrated into fproxy.
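To make the indexer and the client-side lookup concrete, here is a rough Python sketch of the pageA/pageB example. It is only an illustration under some assumptions of mine: the names build_index and search are made up, words are split on anything that is not a letter or digit, and the fetch argument is a stand-in for requesting a key such as SSK@abc/index/not from a node (here it just reads from an in-memory dictionary, so no Freenet API is involved). Compression and insertion under the SSK are left out.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Indexer: map each distinct word to the sorted list of locations containing it."""
    index = defaultdict(set)
    for location, text in pages.items():
        # Assumption: a "word" is any run of letters or digits, lowercased.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(location)
    return {word: sorted(locations) for word, locations in index.items()}


def search(query, fetch):
    """Client: fetch one word list per query term and keep the common locations.

    fetch is a hypothetical callable standing in for a key request; it takes a
    word and returns its list of locations, or None if the key is not found.
    """
    results = None
    for word in re.findall(r"[a-z0-9]+", query.lower()):
        locations = set(fetch(word) or [])
        results = locations if results is None else results & locations
    return sorted(results or [])


pages = {
    "pageA": "hello this is a test",
    "pageB": "hello this is not a test",
}
index = build_index(pages)

# Frank's search for "not test": only pageB contains both words.
print(search("not test", index.get))  # ['pageB']
```

The intersection happens entirely on the client, so the stored indexes stay plain word lists and other boolean operators (OR, NOT) could be added in search without changing what is inserted.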
The scheme requires no modification to the node or existing infrastructure and is fairly simple to implement. Best of all, it fills a need that has existed on Freenet since its creation. I hope there is sufficient interest to pull this project off.

Tyler

AIM: rllybites
Y! Messenger: triddle_1999

Tech mailing list
http://hawk.freenetproject.org/cgi-bin/mailman/listinfo/tech
