On Friday 15 Aug 2003 11:13, Some Guy wrote:
> Some of you may want to see the previous discussion we
> had along these lines:
> http://hawk.freenetproject.org:8080/pipermail/devl/2003-June/006607.html
Yes, I agree with what was said there. One thing that gets me, though, is
that people keep comparing Freenet to networks such as Kazaa. If people
just want a file sharing tool of that sort, why not just use Frost, or the
replacement Frost front end that is being worked on to make it look more
like Kazaa? What we are talking about here (if I am understanding this all
correctly) is a Google-type search engine for Freesite content. The two
concepts are quite different.

> I think the idea that was most liked was that the user
> downloads a few index files from freesites he chooses
> and then uses them in some local search engine.
>
> Indexes could be built by hand, crawler, or people
> might somehow recomend thier site for an index.

Interesting idea. So, a site author would insert an additional file,
called, say, //index.txt, which would contain a compact index of all their
pages? That would certainly make the crawling process faster, as only one
file per Freesite would need to be retrieved.

There are reasons, however, why this could be bad. The big one is that the
link structure of the network would not be known (unless it, too, is added
to this compact index). Link structure is what defines the "page rank", in
Google terms, and is useful in ranking the search results.

Depending on site size, having a single, compact file containing all the
word and link information may be a good thing for automated indexing
robots. This, however, still doesn't solve the big problem: specifying
sites to search is not really an effective way to perform a good search.
You would have to have a good idea of where to look for things first,
which is exactly what we are trying to avoid.

The big problem is coming up with an efficient, compact and scalable index
format that can be heavily segmented and searched sparsely, without a very
deep search tree.
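To make the idea concrete, here is a minimal sketch of what a per-site
index plus link-aware local search could look like. Everything here is an
assumption for illustration only: the //index.txt name, the word/link
fields, the SSK-style keys, and the "inbound-link count" ranking (a crude
stand-in for PageRank-style ranking) are not an agreed Freenet format.

```python
# Hypothetical sketch: each freesite publishes one compact index file
# listing (a) the words it contains and (b) the freesites it links to.
# Keeping the links lets a local search engine rank results by link
# structure, which is the point made above about "page rank".
from collections import defaultdict

# A few downloaded index files, keyed by (made-up) freesite key.
site_indexes = {
    "SSK@siteA/index.txt": {
        "words": {"freenet", "search", "crawler"},
        "links": {"SSK@siteB/index.txt"},
    },
    "SSK@siteB/index.txt": {
        "words": {"freenet", "frost"},
        "links": set(),
    },
}

def search(query_word, indexes):
    """Return sites containing query_word, ranked by how many other
    indexed sites link to them (more inbound links ranks higher)."""
    inbound = defaultdict(int)
    for idx in indexes.values():
        for target in idx["links"]:
            inbound[target] += 1
    hits = [site for site, idx in indexes.items()
            if query_word in idx["words"]]
    return sorted(hits, key=lambda s: inbound[s], reverse=True)

# siteB is linked to by siteA, so it ranks first for a shared word.
print(search("freenet", site_indexes))
```

Note that this only searches the handful of indexes the user chose to
download, which is exactly the limitation described above: it does nothing
for content you did not already know where to look for.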
> While it might be posible to build a distributed
> database that routes peices of the query around to
> indexes out in freenet or something, that seems hard
> and beyond our means.

Agreed. Searching ALL content in Freenet should be impossible by design.
However, content that is linked/advertised should be indexable by
automated crawlers.

Gordan

_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
