On Thursday 14 August 2003 22:44, Newsbite wrote:

> Anyway, what I was thinking was that there are javascripts (and probably
> other stuff as well :-) that can emulate a search engine. The database is
> stored as part of the javascript on a webpage, and is thus readily (and
> very fast) at showing the index of the links for the word(s) that were
> requested.
Actually, you would want to keep the data and the code completely separate.
The index would be enormous. You would also have to implement some very
unorthodox indexing methods to create an index that would scale to any
extent with an underlying storage medium such as Freenet. Most standard
indexing mechanisms used today fall apart when applied to Freenet because
of the nature of access. When you try to make things future-proof and
scalable up to, say, 3bn pages, it all becomes infeasible.

> This is not ideal, of course, but it would be an improvement to the
> current system.

It would be - if it could be done efficiently.

> Once again, I've told my idea on IRC, and it was met rather positively,
> but with the remarks (which had occurred to me also ;-):
>
> It still needs someone to insert/retrieve the database.

That is not a big problem. Any fairly standard web crawler would work for
indexing the pages. Uploading the database is also not an issue. The
problem is the database storage format: it is difficult to come up with a
method that would yield good results and acceptable response times on a
high-latency network.

> Which is true, but that could be said of the current TFE system too.
> Besides, it can't be that difficult to largely automate the process.

Automating the process would be dead easy. Coming up with an efficient
storage format is difficult, as is implementing an index format that is
compact yet useful. There is no point in creating an index that would take
up as much space as all the data it is trying to index; that would
effectively double the required storage capacity of the network.

> The advantages are legio:
>
> 1) a real search-like mechanism

Sort of. There would be some additional limitations.

> 2) more user friendly

Maybe. You could do it all in JavaScript, as you said. This would,
however, put most people off because of the filter warnings.
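To make the storage-format problem concrete, here is a minimal sketch of one approach that tends to suit high-latency stores: split the inverted index into one small shard per term, so a query retrieves only the shards it names rather than the whole database. All names, the key format and the in-memory `fetch` stand-in are hypothetical; this is not a real Freenet API.

```python
from collections import defaultdict

def build_inverted_index(pages):
    """Map each term to the set of page keys whose text contains it."""
    index = defaultdict(set)
    for key, text in pages.items():
        for term in set(text.lower().split()):
            index[term].add(key)
    return index

def shard_by_term(index):
    """Emit one small shard per term (keyed, say, 'index/<term>'),
    so a client fetches only the shards its query names instead of
    one monolithic database."""
    return {"index/" + term: sorted(keys) for term, keys in index.items()}

def search(fetch, terms):
    """Intersect the postings lists for all query terms. `fetch`
    stands in for a (slow) per-shard network retrieval."""
    postings = [set(fetch("index/" + t.lower())) for t in terms]
    return sorted(set.intersection(*postings)) if postings else []

# Toy corpus; the key names merely imitate Freenet-style keys.
pages = {
    "SSK@alice/page1": "freenet search engine design",
    "SSK@bob/page2":   "search index storage format",
}
shards = shard_by_term(build_inverted_index(pages))
print(search(lambda k: shards.get(k, []), ["search", "freenet"]))
# -> ['SSK@alice/page1']
```

The point of the per-term layout is that a k-term query costs k shard retrievals regardless of how large the index grows, which is what keeps response times tolerable on a high-latency network; the hard part glossed over here is keeping each shard itself small once a term appears on millions of pages.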
A better way to do it would be to create a Fred plug-in applet that would
perform this function. It would probably be faster, and it would work
around the problem of filter warnings. It would also be "easier" to trust
if it were distributed with the node library, rather than being just a
random page from an inherently untrustworthy medium.

> 3) no more scrolling and manually searching for stuff (TFE is beginning
> to become TOO large to easily navigate through)

True to some extent. IIRC, YoYo handles this reasonably sensibly, but in
the long term any manually created index will become implausible. We
haven't reached that amount of content in Freenet yet.

> 4) the moral issue is greatly reduced, because (links to) 'illegal'
> things such as copyrighted material (or worse) would only be visible
> when you actively seek/request it

That is not necessarily strictly true; it depends on how much of a
specific type of content there is. Any automated search engine has such
issues. For example, how many times have you entered a completely normal,
mundane and geeky search string into Google/AltaVista/some other search
engine and found totally unrelated porn pages cropping up even on the
first results page, because some porn site webmaster put the terms on his
page so that it would come up for pretty much ANY query you typed in?

> It would require, however, that at least for this particular script (or
> for some particular page), the java-script filter would have to let it
> pass without much fuss.

Not really. Just leave it to the user to decide whether they trust the
page. If they do, they can click the "proceed anyway" button. The correct
way around this would be the plug-in applet.

> Not ideal, perhaps, but until a true, well-working, scalable, anonymous
> search engine is created to work in Freenet, it would beat everything
> that is currently available on Freenet.
There are many, many more technical difficulties involved in that than you
may realize, especially in coming up with a good, scalable index format.

Gordan

_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
