On Fri, 04 Aug 2000, Travis Bemann wrote:
> On Fri, Aug 04, 2000 at 02:08:53PM +0700, Oskar Sandberg wrote:
> > The easy part of the DataStore is the lookups. Each key (20 bytes) entry
> > needs to have an address (4 bytes) of another node and a reference to
> > where the actual data is stored (not sure, but to be safe say 16 bytes).
> > That is 40 bytes per entry. In other words, an index for a currently
> > standard node holding 500 entries is 20 kB. An index for an extreme node
> > holding a million entries is only 40 MB, still small enough to keep in
> > memory, and certainly not an issue on disk. Your half-gigabyte index
> > file that "will happen" contains 12.5 million entries. I would like to
> > contend that that will _not_ happen, and that is not because I think
> > "Freenet will never get that big", but for the same reason you don't see
> > that many ADT tree structures with 12.5 million branches per level.
>
> The half-gigabyte index file was just hypothetical. I never said that
> it would really happen.
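(As a quick sanity check on the sizing figures quoted above, here is a back-of-the-envelope sketch. The 40-byte entry layout comes straight from the mail; the function name is just for illustration.)

```python
# Index sizing check, assuming the entry layout from the mail:
# 20-byte key + 4-byte node address + 16-byte data reference.
ENTRY_BYTES = 20 + 4 + 16  # 40 bytes per index entry

def index_size(entries: int) -> int:
    """Total index size in bytes for a given number of entries."""
    return entries * ENTRY_BYTES

print(index_size(500))               # typical node: 20,000 bytes (20 kB)
print(index_size(1_000_000))         # extreme node: 40,000,000 bytes (40 MB)
print(500_000_000 // ENTRY_BYTES)    # half-gigabyte index: 12,500,000 entries
```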
Umm, I hate to nitpick, but yes you did. The quote is: "yes, this is an
extreme example, but it *will* happen."

> > However, that is NOT the problem with the DataStore design. Looking up
> > data in a list is probably the oldest problem in CS, and we could
> > probably find a million examples of people having done it before. The
> > hard part in the DataStore is the routing: being able to find the
> > "closest" key and its reference, as well as the nth-closest should that
> > fail. This is not a case where I am aware of much prior art (I've
> > wondered a bit about spell checkers; if anybody knows anything else I
> > would be very interested).
>
> As for closeness, is the purpose of this to choose which node to send a
> request to when it could not be served locally? I looked in the source
> code, and it appears that "closeness" is literally how close the keys
> are to each other, not some algorithm to determine which nodes are most
> likely to have the data available on them. Why not just keep track of
> nodes and their likelihood of having data available on or through them,
> and try nodes in descending order of reliability? That is what I was
> planning to do with nfreenetd. Of course, such a node-choosing algorithm
> would result in nodes with large datastores and many connections to
> other nodes getting much heavier use than other nodes.

Do you have a clue what you are trying to implement? Have you even read
Theodore's or Ian's original paper?

> --
> Travis Bemann
> Sendmail is still screwed up on my box.
> My email address is really bemann at execpc.com.

--
\oskar

_______________________________________________
Freenet-dev mailing list
Freenet-dev at lists.sourceforge.net
http://lists.sourceforge.net/mailman/listinfo/freenet-dev
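[Editor's note: to illustrate the "closest key, then nth-closest" routing problem discussed in the thread above, here is a minimal sketch. This is NOT Freenet's actual algorithm; the `closest_keys` function and the use of plain integers in place of 20-byte keys are purely illustrative assumptions: keep the index keys sorted, binary-search for the requested key, and walk outward to enumerate keys in order of distance.]

```python
import bisect

def closest_keys(sorted_keys, target):
    """Yield index keys in increasing order of distance from target.

    sorted_keys must be sorted ascending. The first key yielded is the
    closest; subsequent keys are the 2nd-closest, 3rd-closest, and so on,
    which is exactly the fallback order routing needs if a request fails.
    """
    i = bisect.bisect_left(sorted_keys, target)
    lo, hi = i - 1, i  # candidates on either side of the insertion point
    while lo >= 0 or hi < len(sorted_keys):
        if lo < 0:
            yield sorted_keys[hi]; hi += 1
        elif hi >= len(sorted_keys):
            yield sorted_keys[lo]; lo -= 1
        elif target - sorted_keys[lo] <= sorted_keys[hi] - target:
            yield sorted_keys[lo]; lo -= 1
        else:
            yield sorted_keys[hi]; hi += 1

keys = [10, 20, 40, 80]
print(list(closest_keys(keys, 35)))  # [40, 20, 10, 80]
```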
