On Thu, Apr 20, 2000 at 01:43:39PM +0100, Theodore Hong wrote:
> Michael Wiktowy <spam at mindless.com> wrote:
> > > FROM: finney.org
> > > I want to reiterate a comment I made earlier, with regard to storing
> > > things into the Freenet under a "searchkey" like mp3. This is not
> > > going to work, because too many documents will use that keyword, and
> > > they will all try to go onto that one node (even if the "documents" are
> > > just index or metadata entries there are too many).
> >
> > I read your concerns before and can totally see where you are coming
> > from. There certainly will be an increased load on IPs that the smart
> > routing thinks should be the home for hashes of popular keywords. There
> > are other things to consider though. I don't know how the routing
> > algorithm works exactly but it seems to me that its focus can be
> > adjusted. What I mean is the "best" IP for a particular hash may not
> > strictly be one single IP but rather a group of IPs. By adjusting the
> > fuzziness of the targeting you might reduce the efficiency of the routing
> > mechanism by a hop or two but the load on the targeted server will be
> > dropped by a lot more.
>
> I don't think this is really a problem. The thing is, the routing is not
> absolute -- it's not the case that globally, 123.56.78.* might have a
> really big affinity for the hash of the keyword mp3. Each node decides for
> itself which target it thinks might have an affinity for mp3, based on
> "past experience." If we draw all those associations as a graph, it's
> possible all the arrows would go towards a single node, but more likely
> they would swirl into a stream that loops around and doesn't go anywhere in
> particular.
>
> theo
>
After thinking about it, I have to agree. A 'tag', or a 'tag' with some
meta-data attached, is going to behave very much like ordinary popular
data.

As a thought experiment, I pictured a network where one node had some
-very- popular data in it, as an initial condition. What would happen as
requests were made? The data would disperse.

If there is a problem with a tag / meta-data pair routed by tag, it would
be that searches for a particular pair would have to be *deeper* than for
normal data. In this case you would be trying to route to a specific
instance of popular data.

As to key clustering in general, there is a randomizing factor involved:
user interaction with the node. An 'undisturbed' node might have a
tendency to attract close keys, ending up with a datastore of closely
packed key values, but any amount of user interaction involving inserts
and requests is going to stir the pot. This mixing also has to be
considered from a network view: even if a single node is undisturbed,
the nodes that reference it will not be, so the pattern of requests
reaching it will change.

Unfortunately, I don't have the math to demonstrate this conclusively. In
fact, I don't think -anyone- has the math to model the network as a
whole, which makes things pretty tricky.

I'm going to revise the meta-data spiel I wrote up; I think it's worth it
even if it's just for future reference. I'll make a note on the chat list
when it's available, if anyone is interested.

David Schutt
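
For what it's worth, the thought experiment above (one node starts out
holding some very popular data, then requests arrive) can be played with
as a toy simulation. This is only a sketch under loud assumptions, not
Freenet's routing: the ring-plus-random-links topology, the random walk
used in place of key-closeness routing, the hops-to-live of 10, the
absence of any datastore eviction, and every name in the code are made up
for illustration. All it shows is that caching along the request path is
enough to spread a popular item away from its starting node.

```python
import random


def simulate(num_nodes=50, extra_links=2, requests=500, hops_to_live=10, seed=1):
    """Toy model: one popular item starts on node 0; each successful
    request caches it on every node along the request path.

    Returns the number of nodes holding the item after each request.
    """
    rng = random.Random(seed)

    # Assumed topology: a ring (so the graph is connected) plus random links.
    neighbors = {
        n: {(n - 1) % num_nodes, (n + 1) % num_nodes} for n in range(num_nodes)
    }
    for n in range(num_nodes):
        for _ in range(extra_links):
            m = rng.randrange(num_nodes)
            if m != n:
                neighbors[n].add(m)
                neighbors[m].add(n)

    holders = {0}                 # initial condition: only node 0 has the data
    history = [len(holders)]

    for _ in range(requests):
        node = rng.randrange(num_nodes)   # a user somewhere makes a request
        path = [node]
        for _ in range(hops_to_live):
            if node in holders:
                break
            # Stand-in for real routing: pick a random unvisited neighbor.
            choices = sorted(neighbors[node] - set(path))
            if not choices:
                break
            node = rng.choice(choices)
            path.append(node)
        if node in holders:
            holders.update(path)  # data is cached on the way back
        history.append(len(holders))

    return history


if __name__ == "__main__":
    history = simulate()
    for i in (0, 10, 50, 100, 500):
        print(f"after {i:3d} requests: {history[i]} of 50 nodes hold the data")
```

Because nothing is ever evicted here, the holder count can only grow; a
real datastore with limited space would also push unpopular data out,
which this sketch does not attempt to model.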
