[Tech] Keyword searching on Freenet (reprise?)

Jeremy Smith Sun, 08 Sep 2002 14:12:40 -0700

Hi!

Does the Freenet architecture allow for keyword searches of the text bodies of the
documents stored on it? I think this would really popularise anonymous publishing
networks by letting people search by something other than URLs.


I am working on a system which will allow for total anonymity and keyword searching.
However, before I start work on writing this system, I would like to know if the
alternatives could feasibly offer this.

Quick background on searching by keyword: to make it less processor-intensive than
loading up all the files and searching through them one at a time, you first make an
index of all those files, which merely indicates that the word "rhubarb" is in file 10
("rhubarb.txt"). The index is small and quick to access.
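
A minimal sketch of the index idea described above, in Python purely for illustration. The names build_index and search and the sample documents are hypothetical; nothing here is part of Freenet or of the system proposed in this post.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data: file 10 is "rhubarb.txt".
documents = {
    10: "Rhubarb crumble recipe",
    11: "Notes on anonymous publishing",
}
index = build_index(documents)
matches = search(index, "rhubarb")
```

The lookup touches only the index, never the documents themselves, which is what makes it cheap. It also shows that whoever holds the index can tell which documents contain a given word.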

Okay, so you publish the index on Freenet. Then what? The only way to use it is to get
it and look at it, and it must point to documents on Freenet or on a particular server
(usually it is tied up with a document). Then you can censor those documents as the
index will tell you what's in them, merely by going on keyword. I cannot see a way round
this, and my system stores unencrypted documents and is self-censoring by whoever stores
the information.

However, if a document is sufficiently censored, you can publish off your own machine,
anonymously, but really that's not so hot if the bad guys knock on the door (although
how the bad guys would *know* to knock on your door, I don't know). I'm not giving too
much away at this stage because it's not all worked out yet, but my basic anonymising
idea is randomising the initial number of hops: a random initial count of 2-3 hops would
make the chance that the host before is the originator about 25%, which would make it
impractical to check whether a certain host is a publisher. The other part is passing
along packets that have already been read, to give any omnipotent network overseer the
impression that the packet is not for you. The packet eventually times out in the
network when the number of hops hits a certain number.
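
A rough sketch of the hop handling described above, in Python purely for illustration. The post gives no packet format or timeout value, so MAX_HOPS, the dict-based packet and the function names are all assumptions, and encryption and routing are left out entirely.

```python
import random

MAX_HOPS = 10  # assumed timeout threshold; the post does not give a value


def new_packet(payload, rng=random):
    """Create a packet whose hop counter starts at a random value of 2 or 3.

    Because the counter never starts at zero, a neighbour cannot tell from
    the counter alone whether the sender originated the packet.
    """
    return {"payload": payload, "hops": rng.randint(2, 3)}


def forward(packet):
    """Return the packet to hand to the next host, or None once it times out.

    A host calls this even after reading the packet itself, so an observer
    cannot distinguish the intended recipient from a relay.
    """
    hops = packet["hops"] + 1
    if hops >= MAX_HOPS:
        return None  # the packet has timed out and is dropped
    return {"payload": packet["payload"], "hops": hops}


def relays_until_timeout(packet):
    """Count how many times a packet is passed on before it is dropped."""
    count = 0
    while (packet := forward(packet)) is not None:
        count += 1
    return count


packet = new_packet("hello")
remaining = relays_until_timeout(packet)
```

With these assumed values a packet is relayed 6 or 7 more times before it is dropped, depending on its random starting count.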

In a nutshell, it's like Gnutella with encryption and where the document is passed along
like search results, except with random initial hops.

Again, Freenet has its purpose, as does Publius, but mine in theory covers other goals
and aims. It shouldn't be seen as a replacement, but it is (will be!) a simple protocol.
No politics here, Freenet is great, but what I want out of such a system is keyword
searching of documents.

Anyway, I'm just posting here to ask about the keyword searching on Freenet. Maybe
there's some ultra-clever way of doing it without giving away the contents of the
documents, but I can't see it myself. Search engines like Altavista contributed to
making the web a huge hit; searching is an important aspect of a publishing system.

Jeremy.

PS. I have checked the archives and haven't found anything on keyword searches. I wish
those archives were searchable!

PPS. I will of course be writing this program myself, although maybe with some help on
the underlying crypto to prevent a rubbish system.

_______________________________________________
Tech mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org/cgi-bin/mailman/listinfo/tech
