I have come up with a scheme that would allow
searching of the freenet network and, as an added
benefit, works entirely within the network. No
external daemons or schemes are required to support
this search method. It does have the drawback of
falling victim to DSB and garbage collection just like
all data on freenet. I believe the benefits will
outweigh the drawbacks, however.

The scheme is actually quite simple and I am very
interested in working on this project. I am hoping to
get some other people who might have more development
experience than me to help out too.

The system is broken up into 3 distinct parts - the
spider, the indexer, and the indexes. The spider is
obvious in its implementation - walk all the pages it
can and feed them to the indexer. The indexer is also
fairly trivial in its operation. I will demonstrate
how it operates and what the indexes look like in one
example:

Let's say we have 2 pages: pageA and pageB.
pageA contains:
hello this is a test
and pageB contains:
hello this is not a test

The first step is to rip apart each page into distinct
words. We are going to use each word as a freenet key,
and the data associated with each key is the list of
locations that contain that word. In our simple
example we would have the following key/value pairs:

hello: pageA pageB
this: pageA pageB
is: pageA pageB
not: pageB
a: pageA pageB
test: pageA pageB
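
The indexing step above could be sketched roughly as
follows. This is only an illustration, not a finished
indexer: the build_index name, the lowercasing, and the
regular expression used to split words are my own
assumptions, and the page names stand in for full
freenet URIs.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each distinct word to the sorted list of locations containing it.

    `pages` maps a location (a short name here, a full freenet URI in
    practice) to the text of that page.
    """
    index = defaultdict(set)
    for location, text in pages.items():
        # Rip the page apart into lowercase words (assumed tokenization).
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(location)
    # Sort each set so the stored lists are stable between runs.
    return {word: sorted(locations) for word, locations in index.items()}


pages = {
    "pageA": "hello this is a test",
    "pageB": "hello this is not a test",
}
index = build_index(pages)
for word, locations in index.items():
    print(f"{word}: {' '.join(locations)}")
```

Using a set per word means a page is listed only once
even if the word appears on it several times.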

Of course, in a real-world index the data would be
complete freenet URIs. Once our lists are complete we
compress the text files and insert them under an SSK to
avoid tampering. A DBR could also be used to allow us
to update the indexes on a set interval - probably on
the order of a week or so to avoid killing the network
with traffic.
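
As a rough sketch of the packaging step, each word's
list could be written one location per line, compressed,
and paired with the key it will be inserted under. The
function names, the use of zlib, and the one-per-line
layout are assumptions on my part; the SSK@abc/index/
prefix is just a placeholder, and the actual insert into
the node is deliberately left out.

```python
import zlib


def prepare_inserts(index, prefix="SSK@abc/index/"):
    """Return {key: compressed list of locations}, one entry per word.

    `prefix` is a placeholder; a real index would use the indexer's
    own SSK.
    """
    inserts = {}
    for word, locations in index.items():
        body = "\n".join(locations).encode("utf-8")
        inserts[prefix + word] = zlib.compress(body)
    return inserts


def decode_posting_list(blob):
    """Reverse of the packaging above: compressed bytes -> locations."""
    text = zlib.decompress(blob).decode("utf-8")
    return text.split("\n") if text else []


inserts = prepare_inserts({"not": ["pageB"], "test": ["pageA", "pageB"]})
for key in inserts:
    print(key, decode_posting_list(inserts[key]))
```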

When a client wants to perform a search they simply
request freenet keys. Let's say Frank the freenet user
wants to search for "not test". He is going to request
the following keys:

SSK@abc/index/not and then SSK@abc/index/test - all of
the URIs that are common to both are the results
of his search. This offloads all boolean operations
and complex search operations onto the client and uses
freenet only as a storage medium for the indexes - very
favorable.
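
On the client side, Frank's search is just a set
intersection. A minimal sketch, assuming a hypothetical
fetch function that takes a key and returns the list of
locations stored under it (or None if the key cannot be
retrieved); a plain dictionary stands in for the network
here:

```python
def search(query, fetch):
    """Return the locations that contain every word of `query`.

    `fetch` is a hypothetical callable: key -> list of locations,
    or None when the key cannot be retrieved.
    """
    results = None
    for word in query.lower().split():
        found = set(fetch("SSK@abc/index/" + word) or [])
        # Keep only the locations common to every word seen so far.
        results = found if results is None else results & found
    return sorted(results or [])


# Stand-in for the network: the two index files Frank would request.
fake_store = {
    "SSK@abc/index/not": ["pageB"],
    "SSK@abc/index/test": ["pageA", "pageB"],
}
print(search("not test", fake_store.get))  # prints ['pageB']
```

Other boolean operations (OR, NOT) would be done the
same way on the client with set union and difference.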

This search scheme could be implemented easily as
external applications that request information from
the node or could even be integrated into fproxy. It
requires no modification to the node or existing
infrastructure and is fairly simple to implement. Best
of all, it fills a need that has existed on freenet
since its creation. I hope there is sufficient
interest to pull this project off.

Tyler



=====
AIM:rllybites    Y! Messenger:triddle_1999
