Michael Rogers wrote:
> On 28/07/10 18:29, Matthew Toseland wrote:
>>> Maybe we could do something even simpler than that. Each indexer
>>> publishes her index in the form USK@blah,blah/index/123/keyword.
>>> Retrieve only the keywords you're interested in, from only the indexers
>>> you trust.
>>
>> That's what we do now. It doesn't scale.
>
> I thought the current scheme used one file per indexer, containing all
> the indexer's words, whereas I'm suggesting one file per word per
> indexer. Or does that come to the same thing due to the use of containers?
>
> Cheers,
> Michael

The site could be constructed to insert the large keyword indexes separately, using redirects, and to consolidate the small indexes (for uncommon keywords) inside a common container. Whether this would be more effective or efficient obviously depends on how large your indexes are and how much data could be shared between the indexes for different keywords.

I can imagine a search for 'techno music' being relatively fast, as it would only have to load the /techno and /music indexes (relatively small files, for relatively common search terms, as search terms go) from each index site, rather than the entire index. Of course this only makes sense if you're using a small number of index sites, if the bottleneck when running searches is the time it takes to download large index files, and if the time it takes to download the index for one word is a very small fraction of the time it takes to download the entire index. If the indexes are small, you'd probably lose more performance this way than you gain (relative to the current scheme), because the indexes for uncommonly searched words would not be dispersed throughout the network.
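To make the redirect-vs-container split concrete, here's a rough sketch of the layout decision in Python. It's purely illustrative: the 32KB threshold and all of the names are my own assumptions, it doesn't touch Freenet/FCP at all, and it isn't based on any existing index-site code; it just shows where the size cut-off would sit when building the site.

    # Sketch of the keyword-index layout decision described above.
    # Purely illustrative: no real Freenet/FCP calls, just the packing
    # logic; the threshold and names are assumptions, not anything the
    # current index code actually does.

    REDIRECT_THRESHOLD = 32 * 1024  # assumed cut-off, roughly one block's worth


    def plan_index_layout(keyword_indexes, threshold=REDIRECT_THRESHOLD):
        """Split per-keyword indexes into 'large' ones (inserted separately
        and referenced from the site manifest via redirects) and 'small'
        ones (bundled together into one shared container).

        keyword_indexes: dict mapping keyword -> serialized index bytes.
        Returns (redirects, container), both dicts of keyword -> bytes.
        """
        redirects = {}
        container = {}
        for keyword, data in keyword_indexes.items():
            if len(data) > threshold:
                # Common keyword, large index: worth its own insert, so a
                # search for this word fetches only this file.
                redirects[keyword] = data
            else:
                # Uncommon keyword, tiny index: bundle it, so we don't pay
                # a full fetch round-trip for a few hundred bytes.
                container[keyword] = data
        return redirects, container


    if __name__ == "__main__":
        fake_indexes = {
            "techno": b"x" * 120_000,   # popular term, big posting list
            "music": b"x" * 80_000,
            "zeugma": b"x" * 300,       # rare term, tiny posting list
        }
        redirects, container = plan_index_layout(fake_indexes)
        print("separate inserts:", sorted(redirects))
        print("shared container:", sorted(container),
              "(total %d bytes)" % sum(map(len, container.values())))

A search for 'techno music' would then fetch just the /techno and /music files (plus, at worst, the shared container) from each index site instead of the whole index; the actual inserts and manifest redirects would still go through whatever site-insertion machinery the indexer already uses.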
Disclaimer: I haven't been paying much attention to the Freenet search effort, so I may be totally off about what it is in the current scheme that doesn't scale well. :)