Alternatively, we could modify the Freenet program so that, if the user enables the feature, Freenet generates an index.dbf file listing all of the files the user is sharing. The user's node would then query all immediately adjacent nodes and retrieve their index.dbf files. In this manner, entries in the merged index can also carry weights, based on their availability relative to the node that actually wants to download the file.
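A minimal sketch of what that merge might look like. Everything here is illustrative: `merge_indexes`, the weight values, and the URI strings are assumptions, not real Freenet APIs; the only idea taken from the proposal is that local files outweigh files that are one hop away.

```python
# Hypothetical sketch of merging a local index.dbf listing with the
# listings fetched from immediately adjacent nodes. Weights are made up:
# 1.0 for a locally shared file, 0.5 for a file held by a direct neighbor.
from collections import defaultdict

def merge_indexes(local_index, neighbor_indexes):
    """Return {uri: weight}, keeping the best (closest) weight per URI."""
    merged = defaultdict(float)
    for uri in local_index:
        merged[uri] = max(merged[uri], 1.0)      # hop 0: full weight
    for index in neighbor_indexes:
        for uri in index:
            merged[uri] = max(merged[uri], 0.5)  # hop 1: reduced weight
    return dict(merged)

local = ["SSK@abc/music.mp3"]
neighbors = [["SSK@def/paper.pdf", "SSK@abc/music.mp3"]]
print(merge_indexes(local, neighbors))
# → {'SSK@abc/music.mp3': 1.0, 'SSK@def/paper.pdf': 0.5}
```

Extending this to nodes two or more hops away would just mean decaying the weight further per hop.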
Best Regards,
Drew
http://www.drewbradford.com/

----- Original Message -----
From: Tom Kaitchuck <[EMAIL PROTECTED]>
Date: Friday, August 15, 2003 12:16 pm
Subject: Re: [freenet-dev] freenet (pre-)searchengine

> On Friday 15 August 2003 01:22 am, Gordan wrote:
> > OK, let's say that the index would take up 100 MB. If you think that
> > downloading a 100 MB HTML file (or XML, or CSV if they are separate
> > files) into a browser using JavaScript will work, then you have some
> > interesting misconceptions about what modern browsers can handle
> > sensibly.
> >
> > 1) If you give IE6 or Mozilla (I'm guessing that you are aiming for
> > DOM-ish browsers only) a 100 MB file to process with JavaScript, it is
> > going to go away for a very long time.
> >
> > 2) If you make it in such a way that you have to download a 100 MB file
> > to perform a query, then that's a non-starter anyway, as that can take
> > hours and has to deal with redundant FEC - again, it could be difficult.
> >
> > Therefore, you would need a way of segmenting the index so that you
> > could search it sparsely, and only download a very small fraction of
> > it, based on the search terms.
>
> Here is how you can do it.
>
> Have a bot that spiders Freenet and grabs the URI, the title, a one-line
> description, and the META keywords in the HTML. Create an SSK that has a
> list of all the keywords and a page for each keyword that had enough
> content to be included in the index. Each of the keyword indexes contains
> just a list of URIs, titles, and descriptions. Each of these indexes is
> compressed. Update each index when it has enough new content to go up to
> the next size level (you want to avoid padding), or if it has not been
> updated in a long time. Clients fetch only the keywords they want, and
> they hold on to the index for, say, 1 month.
> If any of the indexes ever gets too big, label it a 'popular' index and
> have it link only to index.htmls and sites with very large numbers of
> links.
>
> Since this would have to be implemented in a client-side app, you could
> add all sorts of features, like allowing anyone to generate their own
> content-specific index, keep a site blacklist, or show only DBRs or
> one-shot sites. Then, when the user finds what they want, the app
> requests it and opens their web browser to the right URI. Easy to rank
> too: % of keywords contained * % those keywords make up out of the total
> keywords.
>
> This would scale pretty well, because it would only use a few hundred
> bytes (after compression) for each site. So, you could have thousands of
> separate sites for each category with no problem.

_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
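The ranking heuristic in the quoted message (% of query keywords contained, multiplied by the % those keywords make up of the site's total keywords) could be sketched like this; `rank` and its inputs are illustrative names, not anything from an actual client:

```python
# Hypothetical sketch of the quoted ranking heuristic:
# score = (fraction of query keywords the site contains)
#       * (fraction of the site's keywords that match the query).

def rank(query_keywords, site_keywords):
    query = set(query_keywords)
    site = set(site_keywords)
    if not query or not site:
        return 0.0
    matched = query & site
    contained = len(matched) / len(query)   # % of query keywords present
    proportion = len(matched) / len(site)   # % they make up of site keywords
    return contained * proportion

# A site matching both query terms, with two unrelated keywords besides:
print(rank(["freenet", "search"], ["freenet", "search", "index", "ssk"]))
# → 0.5
```

The second factor penalizes keyword-stuffed sites: a site listing every popular keyword matches many queries but scores low on each.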
